XMPP (Extensible Messaging and Presence Protocol) is a protocol based on Extensible Markup Language (XML) and intended for instant messaging (IM) and online presence detection. It functions between or among servers, and facilitates near-real-time operation. The protocol may eventually allow Internet users to send instant messages to anyone else on the Internet, regardless of differences in operating systems and browsers.
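To make the XML basis concrete, the sketch below builds and reads a minimal chat message stanza with Python's standard library. The addresses are made up for illustration, and the sketch leaves out the XML stream negotiation and namespaces that a real XMPP session requires, so treat it as a picture of the message shape rather than a working client.

```python
import xml.etree.ElementTree as ET


def build_message(sender: str, recipient: str, text: str) -> str:
    """Serialize a minimal chat message stanza as an XML string."""
    message = ET.Element(
        "message",
        {"from": sender, "to": recipient, "type": "chat"},
    )
    body = ET.SubElement(message, "body")
    body.text = text
    return ET.tostring(message, encoding="unicode")


def read_body(stanza: str) -> str:
    """Parse a stanza and return the text of its <body> element."""
    root = ET.fromstring(stanza)
    return root.findtext("body", default="")


# Hypothetical addresses, used only for illustration.
stanza = build_message("alice@example.com", "bob@example.net", "Hello, Bob")
print(stanza)
print(read_body(stanza))
```

Because the payload is plain XML, any platform with an XML parser can produce or consume it, which is what makes the protocol independent of operating system and client software.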
XMPP is sometimes called the Jabber protocol, but this is a technical misnomer. Jabber, an IM application similar to ICQ (I Seek You) and others, is based on XMPP, but there are many applications besides Jabber that are supported by XMPP. The XMPP working group of the Internet Engineering Task Force (IETF), a consortium of engineers and programmers, is adapting XMPP for use as an IETF technology. In addition, the Messaging and Presence Interoperability Consortium (MPIC) is considering XMPP as an important interoperability technology. Eventually, XMPP is expected to support IM applications with authentication, access control, a high measure of privacy, hop-by-hop encryption, end-to-end encryption, and compatibility with other protocols.
IBM and Microsoft are working on a similar standard called SIP for Instant Messaging and Presence Leveraging Extensions (SIMPLE) based on Session Initiation Protocol (SIP).
Unstructured data is information, in many different forms, that doesn't hew to conventional data models and thus typically isn't a good fit for a mainstream relational database. Thanks to the emergence of alternative platforms for storing and managing such data, it is increasingly prevalent in IT systems and is used by organizations in a variety of business intelligence and analytics applications.
Traditional structured data, such as the transaction data in financial systems and other business applications, conforms to a rigid format to ensure consistency in processing and analyzing it. Sets of unstructured data, on the other hand, can be maintained in formats that aren't uniform, freeing analytics teams to work with all of the available data without necessarily having to consolidate and standardize it first. That enables more comprehensive analyses than would otherwise be possible.
Types of unstructured data
One of the most common types of unstructured data is text. Unstructured text is generated and collected in a wide range of forms, including Word documents, email messages, PowerPoint presentations, survey responses, transcripts of call center interactions, and posts from blogs and social media sites.
Other types of unstructured data include images, audio and video files. Machine data is another category, one that's growing quickly in many organizations. For example, log files from websites, servers, networks and applications -- particularly mobile ones -- yield a trove of activity and performance data. In addition, companies increasingly capture and analyze data from sensors on manufacturing equipment and other internet of things (IoT) connected devices.
In some cases, such data may be considered to be semi-structured -- for example, if metadata tags are added to provide information and context about the content of the data. The line between unstructured and semi-structured data isn't absolute, though; some data management consultants contend that all data, even the unstructured kind, has some level of structure.
Unstructured data analytics
Because of its nature, unstructured data isn't suited to transaction processing applications, which are the province of structured data. Instead, it's primarily used for BI and analytics. One popular application is customer analytics. Retailers, manufacturers and other companies analyze unstructured data to improve customer relationship management processes and enable more-targeted marketing; they also do sentiment analysis to identify both positive and negative views of products, customer service and corporate entities, as expressed by customers on social networks and in other forums.
Predictive maintenance is an emerging analytics use case for unstructured data. For example, manufacturers can analyze sensor data to try to detect equipment failures before they occur in plant-floor systems or finished products in the field. Energy pipelines can also be monitored and checked for potential problems using unstructured data collected from IoT sensors.
Analyzing log data from IT systems highlights usage trends, identifies capacity limitations and pinpoints the cause of application errors, system crashes, performance bottlenecks and other issues. Unstructured data analytics also aids regulatory compliance efforts, particularly in helping organizations understand what corporate documents and records contain.
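As a rough illustration of log analysis, the sketch below parses a handful of log lines and counts entries by severity and errors by component. The log format and the sample lines are hypothetical; real logs vary widely and usually need more forgiving parsing.

```python
from collections import Counter

# Hypothetical log lines: timestamp, severity level, component, message.
LOG_LINES = [
    "2024-03-01T10:00:01 INFO web request served",
    "2024-03-01T10:00:02 ERROR db connection timed out",
    "2024-03-01T10:00:05 WARN web slow response",
    "2024-03-01T10:00:09 ERROR db connection timed out",
    "2024-03-01T10:00:12 ERROR auth token expired",
]


def parse_line(line: str) -> dict:
    """Split one log line into its four fields (assumed format)."""
    timestamp, level, component, message = line.split(" ", 3)
    return {
        "timestamp": timestamp,
        "level": level,
        "component": component,
        "message": message,
    }


records = [parse_line(line) for line in LOG_LINES]

# How many entries of each severity were logged?
level_counts = Counter(r["level"] for r in records)

# Which components produce the most errors?
error_sources = Counter(r["component"] for r in records if r["level"] == "ERROR")

print(level_counts)
print(error_sources.most_common(1))
```

Even this simple tally points to the component that deserves attention first; production tools apply the same idea across far larger volumes of log data.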
Unstructured data techniques and platforms
Analyst firms report that the vast majority of new data being generated is unstructured. In the past, that type of information often was locked away in siloed document management systems, individual manufacturing devices and the like -- making it what's known as dark data, unavailable for analysis.
But things changed with the development of big data platforms, primarily Hadoop clusters, NoSQL databases and the Amazon Simple Storage Service (S3). They provide the required infrastructure for processing, storing and managing large volumes of unstructured data without the imposition of a common data model and a single database schema, as in relational databases and data warehouses.
A variety of analytics techniques and tools are used to analyze unstructured data in big data environments. Text analytics tools look for patterns, keywords and sentiment in textual data; at a more advanced level, natural language processing technology is a form of artificial intelligence that seeks to understand meaning and context in text and human speech, increasingly with the aid of deep learning algorithms that use neural networks to analyze data. Other techniques that play roles in unstructured data analytics include data mining, machine learning and predictive analytics.
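The most basic form of text analytics can be sketched in a few lines: count keywords and score sentiment against word lists. The word lists and reviews below are invented for illustration, and this lexicon approach is a deliberately simple stand-in for the natural language processing and deep learning techniques described above.

```python
import re
from collections import Counter

# Tiny, made-up word lists; real tools use far larger lexicons or trained models.
POSITIVE = {"great", "love", "fast", "helpful"}
NEGATIVE = {"slow", "broken", "rude", "late"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text: str) -> int:
    """Return positive-word count minus negative-word count."""
    tokens = tokenize(text)
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


# Hypothetical customer comments.
reviews = [
    "Love the product, support was fast and helpful",
    "Delivery was late and the box arrived broken",
    "The app is slow but the support team was great",
]

keyword_counts = Counter(token for review in reviews for token in tokenize(review))
scores = [sentiment_score(review) for review in reviews]

print(keyword_counts.most_common(5))
print(scores)
```

A score above zero suggests a positive comment and a score below zero a negative one. The third review, which mixes praise and complaint, nets out to zero, which shows why word counting alone misses context.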
Data science teams need the right skills and solid processes
For data scientists, big data systems and AI-enabled advanced analytics technologies open up new possibilities to help drive better business decision-making. "Like never before, we have access to data, computing power and rapidly evolving tools," Forrester Research analyst Kjell Carlsson wrote in a July 2017 blog post.
The downside, Carlsson added, is that many organizations "are only just beginning to crack the code on how to unleash this potential." Often, that isn't due to a lack of internal data science skills, he said in a June 2018 blog; it's because companies treat data science as "an artisanal craft" instead of a well-coordinated process that involves analytics teams, IT and business units.
Of course, possessing the right data science skills is a predicate to making such processes work. The list of skills that LinkedIn's analytics and data science team wants in job candidates includes the ability to manipulate data, design experiments with it and build statistical and machine learning models, according to Michael Li, who heads the team.
But softer skills are equally important, Li said in an April 2018 blog. He cited communication, project management, critical thinking and problem-solving skills as key attributes. Being able to influence decision-makers is also an important part of "the art of being a data scientist," he wrote.
The problem is that such skills requirements are often "completely out of reach for a single person," Miriam Friedel wrote in a September 2017 blog when she was director and senior scientist at consulting services provider Elder Research. Friedel, who has since moved on to software vendor Metis Machine as data science director, suggested in the blog that instead of looking for the proverbial individual unicorn, companies should build "a team unicorn."
This handbook more closely examines that team-building approach as well as critical data science skills for the big data and AI era.
Reskilling the analytics team: Math, science and creativity
Technical skills are a must for data scientists. But to make analytics teams successful, they also need to think creatively, work in harmony and be good communicators.
In a 2009 study of its employee data, Google discovered that the top seven characteristics of a successful manager at the company didn't involve technical expertise. For example, they included being a good coach and an effective communicator, having a clear vision and strategy, and empowering teams without micromanaging them. Technical skills were No. 8.
Google's list, which was updated this year to add collaboration and strong decision-making capabilities as two more key traits, applies specifically to its managers, not to technical workers. But the findings from the study, known as Project Oxygen, are also relevant to building an effective analytics team.
Obviously, STEM skills are incredibly important in analytics. But as Google's initial and subsequent studies have shown, they aren't the whole or even the most important part of the story. As an analytics leader, I'm very glad that someone has put numbers to all this, but I've always known that the best data scientists are also empathetic and creative storytellers.
According to the latest employment projections report by the U.S. Bureau of Labor Statistics, statisticians are in high demand. Among occupations that currently employ at least 25,000 people, statistician ranks fifth in projected growth rate; it's expected to grow by 33.8% from 2016 to 2026. For context, the average rate of growth that the statistics bureau forecasts for all occupations is 7.4%. And with application software developers as the only other exception, all of the other occupations in the top 10 are in the healthcare or senior care verticals, which is consistent with an aging U.S. population.
Statistician is fifth among occupations with at least 25,000 workers projected to grow at the fastest rates.
Thanks to groundbreaking innovations in technology and computing power, the world is producing more data than ever before. Businesses are using actionable analytics to improve their day-to-day processes and drive diverse functions like sales, marketing, capital investment, HR and operations. Statisticians and data scientists are making that possible, using not only their mathematical and scientific skills, but also creativity and effective communication to extract and convey insights from the new data resources.
In 2017, IBM partnered with job market analytics software vendor Burning Glass Technologies and the Business-Higher Education Forum on a study that showed how the democratization of data is forcing change in the workforce. Without diving into the minutiae, I gathered from the study that with more and more data now available to more and more people, the insights garnered from the data set you apart as an employee -- or as a company.
Developing and encouraging our analytics team
The need to find and communicate these insights influences how we hire and train our up-and-coming analytics employees at Dun & Bradstreet. Our focus is still primarily on mathematics, but we also consider other characteristics like critical- and innovative-thinking abilities as well as personality traits, so our statisticians and data scientists are effective in their roles.
Our employees have the advantage of working for a business-to-business company that has incredibly large and varied data sets -- containing more than 300 million business records -- and a wide variety of customers that are interested in our analytics services and applications. They get to work on a very diverse set of business challenges, share cutting-edge concepts with data scientists in other companies and develop creative solutions to unique problems.
Our associates are encouraged to pursue new analytical models and data analyses, and we have special five-day sprints where we augment and enhance some of the team's more creative suggestions. These sprints not only challenge the creativity of our data analysts, but also require them to work on their interpersonal and communication skills while developing these applications as a group.
Socializing the new, creative data analyst
It's very important to realize that some business users aren't yet completely comfortable with a well-rounded analytics team. For the most part, when bringing in an analyst, they're looking for confirmation of a hypothesis rather than a full analysis of the data at hand.
If that's the case in your organization, then be persistent. As your team continues to present valuable insights and creative solutions, your peers and business leaders across the company will start to seek guidance from data analysts as partners in problem-solving much more frequently and much earlier in their decision-making processes.
As companies and other institutions continue to amass data exponentially and rapid technological changes continue to affect the landscape of our businesses and lives, growing pains will inevitably follow. Exceptional employees who have creativity and empathy, in addition to mathematical skills, will help your company thrive through innovation. Hopefully, you have more than a few analysts who possess those capabilities. Identify and encourage them -- and give permission to the rest of your analytics team to think outside the box and rise to the occasion.
Data scientist vs. business analyst: What's the difference?
Data scientist and business analyst roles differ in that data scientists must dive deep into data and come up with unique business solutions -- but the distinctions don't end there.
What is the difference between data science and business analyst jobs? And what kind of training or education is required to become a data scientist?
There are a number of differences between data scientists and business analysts, the two most common business analytics roles, but at a high level, you can think about the distinction as similar to that between a medical researcher and a lab technician. One uses experimentation and the scientific method to search out new, potentially groundbreaking discoveries, while the other applies existing knowledge in an operational context.
Data scientist vs. business analyst comes down to the realms they inhabit. Data scientists delve into big data sets and use experimentation to discover new insights in data. Business analysts, on the other hand, typically use self-service analytics tools to review curated data sets, build reports and data visualizations, and report targeted findings -- things like revenue by quarter or sales needed to hit targets.
What does a data scientist do?
A data scientist takes analytics and data warehousing programs to the next level: What does the data really say about the company, and is the company able to distinguish relevant data from irrelevant data?
A data scientist should be able to leverage the enterprise data warehouse to dive deeper into the data that comes out or to analyze new types of data stored in Hadoop clusters and other big data systems. A data scientist doesn't just report on data like a classic business analyst does; they also deliver business insights based on the data.
A data scientist job also requires a strong business sense and the ability to communicate data-driven conclusions to business stakeholders. Strong data scientists don't just address business problems; they also pinpoint the problems that have the most value to the organization. In that sense, a data scientist plays a more strategic role within an organization.
Data scientist education, skills and personality traits
Data scientists look through all the available data with the goal of discovering a previously hidden insight that, in turn, can provide a competitive advantage or address a pressing business problem. Data scientists do not simply collect and report on data -- they also look at it from many angles, determine what it means and then recommend ways to apply the data. These insights could lead to a new product or even an entirely new business model.
Data scientists apply advanced machine learning models to automate processes that previously took too long or were inefficient. They use data processing and programming tools -- often open source, like Python, R and TensorFlow -- to develop new applications that take advantage of advances in artificial intelligence. These applications may perform a task such as transcribing calls to a customer service line using natural language processing or automatically generating text for email campaigns.
What does a business analyst do?
A business analyst -- a title often used interchangeably with data analyst -- focuses more on delivering operational insights to lines of business using smaller, more targeted data sets. For example, a business analyst tied to a sales team will work primarily with sales data to see how individual team members are performing, to identify members who might need extra coaching and to search for other areas where the team can improve on its performance.
Business analysts typically use self-service analytics and data visualization tools. Using these tools, business analysts can build reports and dashboards that team members can use to track their performance. Typically, the information contained in these reports is retrospective rather than predictive.
Data scientist vs. business analyst training, tools and trends
To become a business analyst, you need a familiarity with statistics and the fundamentals of data analysis, but there are many self-service analytics tools that do the mathematical heavy lifting for you. Of course, you have to know if it's statistically meaningful to join two separate data sets, and you have to understand the distinction between correlation and causation. But, on the whole, a deep background in mathematics is unnecessary.
To become a data scientist, on the other hand, you need a strong background in math. This is one of the primary differences in the question of data scientists vs. business analysts.
Many data scientists have doctorates in some field of math. Many have backgrounds in physics or other advanced sciences that lean heavily on statistical inference.
Business analysts can generally pick up the technical skills they need on the job. Whether an enterprise uses Tableau, Qlik or Power BI -- the three most common self-service analytics platforms -- or another tool, most use graphical user interfaces that are designed to be intuitive and easy to pick up.
Data science jobs require more specific technical training. In addition to advanced mathematical education, data scientists need deep technical skills. They must be proficient in several common coding languages -- including Python, SQL and Java -- which enable them to run complex machine learning models against big data stored in Hadoop or other distributed data management platforms. Most often, data scientists pick up these skills from a college-level computer science curriculum.
However, trends in data analytics are beginning to collapse the line between data science and data analysis. Increasingly, software companies are introducing platforms that can automate complex tasks using machine learning. At the same time, self-service software supports deeper analytical functionality, meaning data scientists are increasingly using tools that were once solely for business analysts.
Companies often report the highest analytics success when blending teams, so data scientists working alongside business analysts can produce operational benefits. This means that the data scientist vs. business analyst distinctions could become less important as time goes on -- a trend that may pay off for enterprises.
Hiring vs. training data scientists: The case for each approach
Hiring data scientists is easier said than done -- so should you try to train current employees in data science skills? That depends on your company's needs, writes one analytics expert.
Companies are faced with a dilemma on big data analytics initiatives: whether to hire data scientists from outside or train current employees to meet new demands. In many cases, realizing big data's enormous untapped potential brings the accompanying need to increase data science skills -- but building up your capacity can be tricky, especially in a crowded market of businesses looking for analytics talent.
Even with a shortage of available data scientists, screening and interviewing for quality hires is time- and resource-intensive. Alternatively, training data scientists from within may be futile if internal candidates don't have the fundamental aptitude.
At The Data Incubator, we've helped hundreds of companies train employees on data science and hire new talent -- and, often, we've aided organizations in handling the tradeoffs between the two approaches. Based on the experiences we've had with our corporate clients, you should consider the following factors when deciding which way to go.
New hires bring in new thinking
The main benefit of hiring rather than training data scientists comes from introducing new ideas and capabilities into your organization. What you add may be technical in nature: For example, are you looking to adopt advanced machine learning techniques, such as neural networks, or to develop real-time customer insights by using Spark Streaming? It may be cultural, too: Do you want an agile data science team that can iterate rapidly -- even at the expense of "breaking things," in Facebook's famous parlance? Or one that can think about data creatively and find novel approaches to using both internal and external information?
At other times, it's about having a fresh set of eyes looking at the same problems. Many quant hedge funds intentionally hire newly minted STEM Ph.D. holders -- people with degrees in science, technology, engineering or math -- instead of industry veterans precisely to get a fresh take on financial markets. And it isn't just Wall Street; in other highly competitive industries, too, new ideas are the most important currency, and companies fight for them to remain competitive.
How a company sources new talent can also require some innovation, given the scarcity of skilled data scientists. Kaggle and other competition platforms can be great places to find burgeoning data science talent. The public competitions on Kaggle are famous for bringing unconventional stars and unknown whiz kids into the spotlight and demonstrating that the best analytics performance may come from out of left field.
Similarly, we've found that economists and other social scientists often possess the same strong quantitative skill sets as their traditional STEM peers, but are overlooked by HR departments and hiring managers alike.
Training adds to existing expertise
In other cases, employers may value industry experience first and foremost. Domain expertise is complex, intricate and difficult to acquire in some industries. Such industries often already have another science at their core. Rocketry, mining, chemicals, oil and gas -- these are all businesses in which knowledge of the underlying science is more important than data science know-how.
Highly regulated industries are another case in point. Companies facing complex regulatory burdens must often meet very specific, and frequently longstanding, requirements. Banks must comply with financial risk testing and with statutes that were often written decades ago. Similarly, the drug approval process in healthcare is governed by a complex set of immutable rules. While there is certainly room for innovation via data science and big data in these fields, it is constrained by regulations.
Companies in this position often find training data scientists internally to be a better option for developing big data analytics capabilities than hiring new talent. For example, at The Data Incubator, we work with a large consumer finance institution that was looking for data science capabilities to help enhance its credit modeling. But its ideal candidate profile for that job was very different from the ones sought by organizations looking for new ideas on business operations or products and services.
The relevant credit data comes in slowly: Borrowers who are initially reliable could become insolvent months or years after the initial credit decision, which makes it difficult to predict defaults without a strong credit model. And wrong decisions are very expensive: Loan defaults result in direct hits to the company's profitability. In this case, we worked with the company to train existing statisticians and underwriters on complementary data science skills around big data.
Of course, companies must be targeted in selecting training candidates. They often start by identifying employees who possess strong foundational skills for data science -- things like programming and statistics experience. Suitable candidates go by many titles, including statisticians, actuaries and quantitative analysts, more popularly known as quants.
Find the right balance for your needs
For many companies, weighing the options for hiring or training data scientists comes down to understanding their specific business needs, which can vary even in different parts of an organization. It's worth noting that the same financial institution that trained its staffers to do analytics for credit modeling also hired data scientists for its digital marketing team.
Without the complex regulatory requirements imposed on the underwriting side, the digital marketing team felt it could more freely innovate -- and hence decided to bring in new blood with new ideas. These new hires are now building analytical models that leverage hundreds of data signals and use advanced AI and machine learning techniques to more precisely target marketing campaigns at customers and better understand the purchase journeys people take.
Ultimately, the decision of whether to hire or train data scientists must make sense for an organization. Companies must balance the desire to innovate with the need to incorporate existing expertise and satisfy regulatory requirements. Getting that balance right is a key step in a successful data science talent strategy.
A help desk management system has immeasurable value regardless of the size of an organization, whether it's a global enterprise or an SMB.
Good help desk software can enable organizations to streamline customer support, reduce manual intervention in the resolution process, manage costs and gather information from the data to drive product improvements. It is the key component that can get a critical system back online because it provides a link between the person having the problem on site and the support technician.
Help desk software manages this comprehensive process while providing important business benefits in communication, ticket management, service-level agreement (SLA) measurement, reporting and analytics, and knowledge management.
In the service ticket lifecycle, the user who logs the service request is the person most familiar with the issue. The support technician is the expert -- internal or external -- who works to resolve the issue. The support technician may need to engage other resources, including third-party personnel, or they may escalate the problem to a higher support level internally.
When the case closes, the system creates a report in a problem/resolution format and files it in a knowledge database. A help desk management system manages this interaction to ensure a smooth resolution.
Oversee ticket management
Help desk software is a valuable resource to manage incident tickets. Incidents can come from various sources, such as by phone, web service and even text. A help desk management system can receive input from various sources, as well as store information.
A help desk management system can also log and track incident progress. IT pros may have trouble resolving problems quickly if they don't have access to the software or hardware parts they need to address the issue. Help desk software must track the status of management updates, especially for critical systems.
It can also manage software patches, which take on a life of their own because they may involve a third party and all the details that come with that, including a support contract, PIN and account number.
Help desk software also helps IT administrators oversee the end-to-end nature of incident management. It would be extremely difficult to provide this service manually or through an ad hoc set of tools, especially while servicing several calls at once.
Provide SLA performance measurement
SLAs define the criteria for servicing desktop computers. Vendors often define SLAs for performance reasons.
A support vendor is contractually liable to provide incident resolution in a certain period of time. Help desk software tracks the notes and communication, as well as the elapsed time of a call.
Desktop computers typically will not have an SLA metric attached unless the device is business-critical, such as hardware that is part of laboratory equipment or endpoints that belong to executives.
Typical SLAs include:
9/5 -- next day
24/7 -- four-hour response
24/7 -- eight-hour response
24/7 -- next-day response
24/7 -- six-hour call to repair
Response SLAs typically indicate the time it takes between the service provider receiving the call and when the provider makes contact with the customer. A repair option indicates that the system must return to service within a specific time period.
Help desk software, using data input by the support technician, will track important data to evaluate SLA compliance. That data includes:
when the provider receives the call;
when the support technician is contacted;
when the problem is resolved;
when the case closes; and
the technician's notes.
When defining problem resolution time for an SLA in a help desk application, make sure the help desk software tracks the time between when the case is opened and when it is resolved. The support engineer is responsible for these entries, and they must be accurate to provide correct performance data. There is often lag time between when engineers solve an issue, enter the report and close the call, which can result in inaccurate data.
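The timestamps listed above are enough to check SLA compliance. The sketch below assumes a simple ticket record with hypothetical field names and treats the moment the call is received as the moment the case is opened; it measures repair time to the resolved timestamp rather than the closed one, so the lag before a technician closes the call does not distort the result.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Ticket:
    received: datetime   # when the provider receives the call
    contacted: datetime  # when the technician makes contact with the customer
    resolved: datetime   # when the problem is resolved
    closed: datetime     # when the case is closed in the system


def meets_response_sla(ticket: Ticket, limit: timedelta) -> bool:
    """True if first contact happened within the response window."""
    return ticket.contacted - ticket.received <= limit


def meets_repair_sla(ticket: Ticket, limit: timedelta) -> bool:
    """True if the problem was resolved within the call-to-repair window."""
    return ticket.resolved - ticket.received <= limit


# Hypothetical ticket: resolved the same afternoon, closed the next morning.
ticket = Ticket(
    received=datetime(2024, 3, 1, 9, 0),
    contacted=datetime(2024, 3, 1, 12, 30),
    resolved=datetime(2024, 3, 1, 14, 45),
    closed=datetime(2024, 3, 2, 8, 0),
)

print(meets_response_sla(ticket, timedelta(hours=4)))  # four-hour response
print(meets_repair_sla(ticket, timedelta(hours=6)))    # six-hour call to repair
print(ticket.closed - ticket.received)                 # what a close-based measure would report
```

In this example, both SLAs are met when measured against the resolved time, yet a calculation based on the close time would report a 23-hour repair, which is the kind of inaccuracy late data entry can introduce.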
Provide reporting and analytics
Without software applications, providing reporting and analytics is prohibitively difficult. With the data fields the software defines, IT admins can produce custom reports to provide information specific to various requirements, such as:
When contact with the caller was made
When the incident was resolved
How long it took to resolve
What patch was used
Cause and resolution
Whether the SLA was met or not
By recording this type of data in a simple spreadsheet, IT admins can easily manipulate the data to provide specific reports and can even define custom fields for specific reporting needs. They can then use these reports to support analytics to determine:
what percentage of problems patches resolved and which patches resolved these problems; and
which devices are logging frequent calls and may require replacement.
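The two analytics questions above reduce to simple counting once incidents are recorded consistently. The sketch below uses made-up incident records and field names, plus an arbitrary threshold for what counts as "frequent" calls; all three are assumptions for illustration.

```python
from collections import Counter

# Hypothetical incident records; field names are illustrative only.
incidents = [
    {"device": "LAB-PC-01", "patch": "patch-A", "sla_met": True},
    {"device": "LAB-PC-01", "patch": None, "sla_met": False},
    {"device": "EXEC-LT-07", "patch": "patch-B", "sla_met": True},
    {"device": "LAB-PC-01", "patch": "patch-A", "sla_met": True},
    {"device": "HR-PC-12", "patch": None, "sla_met": True},
]

# What percentage of problems did patches resolve, and which patches?
patched = [i for i in incidents if i["patch"] is not None]
patch_share = 100 * len(patched) / len(incidents)
patch_counts = Counter(i["patch"] for i in patched)

# Which devices log frequent calls and may require replacement?
FREQUENT_CALL_THRESHOLD = 3  # assumed cutoff; tune to the organization
calls_per_device = Counter(i["device"] for i in incidents)
replacement_candidates = [
    device
    for device, calls in calls_per_device.items()
    if calls >= FREQUENT_CALL_THRESHOLD
]

# How often was the SLA met?
sla_rate = 100 * sum(i["sla_met"] for i in incidents) / len(incidents)

print(f"{patch_share:.0f}% of incidents resolved by a patch: {dict(patch_counts)}")
print(f"Replacement candidates: {replacement_candidates}")
print(f"SLA met on {sla_rate:.0f}% of incidents")
```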
For accurate reporting, a help desk management system should also include the ability to manage many sources of input, including email, web requests, tweets and texts, as well as the ability to store photos and videos. These capabilities will require personnel to manage the systems that receive this input.
Create knowledge management
Perhaps the most important advantage a help desk management system provides is the ability to develop and store a customized knowledge base. Buyers should note that this requires far more than simply storing data.
A knowledge base is a collection of user-supplied data. One notable example is Microsoft's Knowledge Base, where Microsoft engineers, customers and the general public can submit entries when they discover a way to fix a problem that may benefit others. IT admins using the knowledge base can search for specific keywords, and senior-level technicians can review articles for accuracy. This is a classic example of how preventing repetitive problems can, over time, dramatically reduce call resolution time and downtime and improve user satisfaction.
Many organizations provide incentives for employees who contribute to a knowledge base. The more information the database contains, the more value the organization gets from it.
Self-service business intelligence (SSBI) is an approach to data analytics that enables business users to access and work with corporate data even though they do not have a background in statistical analysis, business intelligence (BI) or data mining. Allowing end users to make decisions based on their own queries and analyses frees up the organization's business intelligence and information technology (IT) teams from creating the majority of reports and allows those teams to focus on other tasks that will help the organization reach its goals.
Because self-service BI software is used by people who may not be tech-savvy, it is imperative that the user interface (UI) for BI software be intuitive, with a dashboard and navigation that is user-friendly. Ideally, training should be provided to help users understand what data is available and how that information can be queried to make data-driven decisions to solve business problems. Once the IT department has set up the data warehouse and data marts that support the business intelligence system, however, business users should be able to query the data and create personalized reports with very little effort.
While self-service BI encourages users to base decisions on data instead of intuition, the flexibility it provides can cause unnecessary confusion if there is not a data governance policy in place. Among other things, the policy should define what the key metrics for determining success are, what processes should be followed to create and share reports, what privileges are necessary for accessing confidential data and how data quality, security and privacy will be maintained.
Explore the data discovery software market, including the products and vendors helping enterprises glean insights using data visualization and self-service BI.
Turning data into business insight is the ultimate goal. It's not about gathering as much data as possible, it's about applying tools and making discoveries that help a business succeed. The data discovery software market includes a range of software and cloud-based services that can help organizations gain value from their constantly growing information resources.
These products fall within the broad BI category, and at their most basic, they search for patterns within data and data sets. Many of these tools use visual presentation mechanisms, such as maps and models, to highlight patterns or specific items of relevance. The tools deliver visualizations to users, including nontechnical workers, such as business analysts, via dashboards, reports, charts and tables.
The big benefit here: data discovery tools provide detailed insights gleaned from data to better inform business decisions. In many cases, the tools accomplish this with limited IT involvement because the products offer self-service features.
Using extensive research into the data discovery software market, TechTarget editors focused on the data discovery software vendors that lead in market share, plus those that offer traditional and advanced functionality. Our research included data from TechTarget surveys, as well as reports from respected research firms, including Gartner and Forrester.
Alteryx Inc.'s Connect markets itself as a collaborative data exploration and data cataloging platform for the enterprise that changes how information workers discover, prioritize and analyze all the relevant information within an organization.
Alteryx Connect key features include:
Data Asset Catalog, which collects metadata from information systems, enabling better relevant data organization;
Business Glossary, which defines standard business terms in a data dictionary and links them to assets in the catalog; and
Data Discovery, which lets users discover the information they need through search capabilities.
Other features include:
Data Enrichment and Collaboration, which allows users to annotate, discuss and rate information to offer business context and provide an organization with relevant data; and
Certification and Trust, which provides insights into information asset trustworthiness through certification, lineage and versioning.
Alteryx touts these features as decreasing the time necessary to gain insight and supporting faster, data-driven decisions by improving collaboration, enhancing analytic productivity and ensuring data governance.
Domo Inc. provides a single-source system for end-to-end data integration and preparation, data discovery, and sharing in the cloud. It's mobile-focused, and it doesn't need you to integrate desktop software, third-party tools or on-premises servers.
With more than 500 native connectors, Domo designed the platform for quick and easy access to data from across the business, according to the company. It contains a central repository that ingests the data and aids version and access control.
Domo also provides one workspace from which people can choose and explore all the data sets available to them in the platform.
Data discovery capabilities include Data Lineage, a path-based view that clarifies data sources. This feature also enables simultaneous display of data tables alongside visualizations, aiding insight discovery, as well as card-based publishing and sharing.
GoodData Enterprise Insights Platform
GoodData Corp.'s cloud-based Enterprise Insights Platform is an end-to-end data discovery software platform that gathers data and user decisions, transforming them into actionable insights for line-of-business users.
The platform provides insights in the form of recommendations and predictive analytics with the goal of delivering the analytics that matter most for real-time decision-making. Customers, partners and employees see information that is relevant to the decision at hand, presented in what GoodData claims is a personalized, contextual, intuitive and actionable form. Users can also integrate these insights directly into applications.
IBM Watson Explorer
IBM has a host of data discovery products, and one of the key offerings is IBM Watson Explorer. It's a cognitive exploration and content analysis platform that enables business users to easily explore and analyze structured, unstructured, internal, external and public data for trends and patterns.
Organizations have used Watson Explorer to understand 100% of incoming calls and emails, to improve the quality of information, and to enhance their ability to use that information, according to IBM.
Machine learning models, natural language processing and next-generation APIs combine to help organizations unlock value from all of their data and gain a secure, 360-degree view of their customers, in context, according to the company.
The platform also enables users to classify and score structured and unstructured data with machine learning to reach the most relevant information. And a new mining application gives users deep insights into structured and unstructured data.
Informatica LLC offers multiple data management products powered by its Claire engine as part of its Intelligent Data Platform. The Claire engine is a metadata-driven AI technology that automatically scans enterprise data sets and exploits machine learning algorithms to infer relationships about the data structure and provide recommendations and insights. By augmenting end users' individual knowledge with AI, organizations can discover more data from more users in the enterprise, according to the company.
Another component, Informatica Enterprise Data Catalog, scans and catalogs data assets across the enterprise to deliver recommendations, suggestions and data management task automation. Semantic search and dynamic facet capabilities allow users to filter search results and get data lineage, profiling statistics and holistic relationship views.
Informatica Enterprise Data Lake enables data analysts to quickly find data using semantic and faceted search and to collaborate with one another in shared project workspaces. Machine learning algorithms recommend alternative data sets. Analysts can sample and prepare data sets in an Excel-like data preparation interface and then operationalize that preparation work as reusable workflows.
Information Builders WebFocus
Information Builders claims its WebFocus data discovery software platform helps companies use BI and analytics strategically across and beyond the enterprise.
The platform includes a self-service visual discovery tool that enables nontechnical business users to conduct data preparation; visually analyze complex data sets; generate sophisticated data visualizations, dashboards, and reports; and share content with other users. Its extensive visualization and charting capabilities provide an approach to self-service discovery that supports any type of user, Information Builders claims.
Information Builders offers a number of tools related to the WebFocus BI and analytics platform that provide enterprise-grade analytics and data discovery. One is WebFocus InfoApps, which can take advantage of custom information applications designed to enable nontechnical users to rapidly gather insights and explore specific business contexts. InfoApps can include parameterized dashboards, reports, charts and visualizations.
Another tool, WebFocus InfoAssist, provides governed self-service reporting, analysis and discovery capabilities to nontechnical users. The product offers a self-service BI capability for immediate data access and analysis.
Microsoft Power BI
Microsoft Power BI is a cloud-based business analytics service that enables users to visualize and analyze data. The same users can distribute data insights anytime, anywhere, on any device in just a few clicks, according to the company.
As a BI and analytics SaaS tool, Power BI equips users across an organization to build reports with colleagues and share insights. It connects to a broad range of live data through dashboards, provides interactive reports and delivers visualizations that include KPIs from data on premises and in the cloud.
Organizations can use machine learning to automatically scan data and gain insights, ask questions of the data using natural language queries, and take advantage of more than 140 free custom visuals created by the user community.
Power BI applications include dashboards with prebuilt content for cloud services, including Salesforce, Google Analytics and Dynamics 365. It also integrates seamlessly with Microsoft products, such as Office 365, SharePoint, Excel and Teams.
Organizations can start by downloading Power BI Desktop for free, while Power BI Pro and Premium offer several licensing options for companies that want to deploy Power BI across their organization.
MicroStrategy Desktop Client
MicroStrategy Inc. designed its Desktop client to deliver self-service BI and help business users or departmental analysts analyze data with out-of-the-box visualizations. Data discovery capabilities are available via Mac or Windows PC web browsers and native mobile apps for iOS and Android.
All the interfaces are consistent and users can promote content between the interfaces. With the MicroStrategy Desktop client, business users can visualize data on any chart or graph, including natural language generation narratives, Google Charts, geospatial maps and data-driven documents visualizations.
They can access data from more than 100 data sources, including spreadsheets, RDBMS, cloud systems, and more; prepare, blend, and profile data with graphical interfaces; share data as a static PDF or as an interactive dashboard file; and promote offline content to a server and publish governed and certified dashboards.
OpenText EnCase Risk Manager
OpenText EnCase Risk Manager enables organizations to understand the sensitive data they have in their environment, where the data exists and its value.
The data discovery software platform helps organizations identify, categorize and remediate sensitive information across the enterprise, whether that information exists in the form of personally identifiable customer data, financial records or intellectual property. EnCase Risk Manager provides the ability to search for standard patterns, such as national identification numbers and credit card data, with the ability to discover entirely unique or proprietary information specific to a business or industry.
Risk Manager is platform-agnostic and able to identify this information throughout the enterprise wherever structured or unstructured data is stored, be that on endpoints, servers, cloud repositories, SharePoint or Exchange. Pricing starts at $60,000.
Oracle Big Data Discovery
Oracle Big Data Discovery enables users to find, explore and analyze big data. They can use the platform to discover new insights from data and share results with other tools and resources in a big data ecosystem, according to the company.
The platform uses Apache Spark, and Oracle claims it's designed to speed time to completion, make big data more accessible to business users across an organization and decrease the risks associated with big data projects.
Big Data Discovery provides rapid visual access to data through an interactive catalog of the data; loads local data from Excel and CSV files through self-service wizards; provides data set summaries, annotations from other users, and recommendations for related data sets; and enables search and guided navigation.
Together with statistics about each individual attribute in any data set, these capabilities expose the shape of the data, according to Oracle, enabling users to understand data quality, detect anomalies, uncover outliers and ultimately determine potential. Organizations can use the platform to visualize attributes by data type; glean which are the most relevant; sort attributes by potential, so the most meaningful information displays first; and use a scratchpad to uncover potential patterns and correlations between attributes.
Qlik Sense
Qlik Sense is Qlik's next-generation data discovery software platform for self-service BI. It supports a full range of analytics use cases including self-service visualization and exploration, guided analytics applications and dashboards, custom and embedded analytics, mobile analytics, and reporting, all within a governed, multi-cloud architecture.
It offers analytics capabilities for all types of users, including associative exploration and search, smart visualizations, self-service creation and data preparation, geographic analysis, collaboration, storytelling, and reporting. The platform also offers fully interactive online and offline mobility and an insight advisor that generates relevant charts and insights using AI.
The product can readily integrate streaming data sources from IoT, social media and messaging with at-rest data for real-time contextual analysis.
Freely distributed accelerators include product templates to help users get to production quickly.
Tibco's Insight Platform combines live streaming data with queries on large at-rest volumes. Historical patterns are interactively identified with Spotfire, running directly against Hadoop and Spark. The Insight Platform can then apply these patterns to streaming data for predictive and operational insights.
For the enterprise, Qlik Sense provides a platform that includes open and standard APIs for customization and extension, data integration scripting, broad data connectivity and data-as-a-service, centralized management and governance, and a multi-cloud architecture for scalability across on-premises environments, as well as private and public cloud environments.
Qlik Sense runs on the patented Qlik Associative Engine, which allows users to explore information without query-based tools. And the new Qlik cognitive engine works with the associative engine to augment the user, offering insight suggestions and automation in context with user behavior.
Qlik Sense is available in cloud and enterprise editions.
Salesforce Einstein Discovery
Salesforce's Einstein Discovery, an AI-powered feature within the Einstein Analytics portfolio, allows business users to automatically analyze millions of data points to understand their current business, explore historical trends, and automatically receive guided recommendations on what they can do to expand deals or resolve customer service cases faster.
Einstein Discovery for Analysts lets users analyze data in Salesforce CRM, CSV files or data from external data sources. In addition, users can take advantage of smart data preparation capabilities to make data improvements, run analyses to create stories, further explore these stories in Einstein Analytics for advanced visualization capabilities, and push insights into Salesforce objects for all business users.
Einstein Discovery for Business Users provides access to insights in natural language and into Salesforce -- within Sales Cloud or Service Cloud, for example. Einstein Discovery for Analysts is available for $2,000 per user, per month. Einstein Discovery for Business Users is $75 per user, per month.
SAS Visual Analytics
SAS Institute Inc.'s Visual Analytics on SAS Viya provides interactive data visualizations to help users explore and better understand data.
The product provides a scalable, in-memory engine along with a user-friendly interface, SAS claims. The combination of interactive data exploration, dashboards, reporting and analytics is designed to help business users find valuable insights without coding. Any user can assess probable outcomes and make more informed, data-driven decisions.
SAS Visual Analytics capabilities include:
automated forecasting, so users can select the most appropriate forecasting method to suit the data;
scenario analysis, which identifies important variables and how changes to them can influence forecasts;
goal-seeking to determine the values of underlying factors that would be required to achieve the target forecast; and
decision trees, allowing users to create a hierarchical segmentation of the data based on a series of rules applied to each observation.
Other features include network diagrams so users can see how complex data is interconnected; path analysis, which displays the flow of data from one event to another as a series of paths; and text analysis, which applies sentiment analysis to video, social media streams or customer comments to provide quick insights into what's being discussed online.
SAP Analytics Cloud
SAP's Analytics Cloud service offers analytics capabilities for all users in one data discovery software product, including discovery, analysis, planning, predicting and collaborating, in one integrated cloud platform, according to SAP.
The service gives users business insights based on its ability to turn embedded data analytics into business applications, the company claims.
Among the potential benefits:
enhanced user experience with the service's visualization and role-based personalization features;
better business results from deep collaboration and informed decisions due to SAP's ability to integrate with existing on-premises applications; and
simplified data across an organization to ensure faster, fact-based decision-making.
In addition, Analytics Cloud is free from operating system constraints, download requirements and setup tasks. It provides real-time analytics and extensibility using SAP Cloud Platform, which can reduce the total cost of ownership because all the features are offered in one SaaS product for all users.
Sisense Ltd. offers an end-to-end platform that ingests data from a variety of sources before analyzing, mashing and visualizing it. Its open API framework also enables a high degree of customization without the input of designers, data scientists or IT specialists, according to Sisense.
The Sisense analytics engine runs 10 to 100 times faster than in-memory platforms, according to the company, dealing with terabytes of data and potentially eliminating onerous data preparation work. The platform provides business insights augmented by machine learning and anomaly detection. In addition, the analytics tool offers the delivery of insights beyond the dashboard, offering new forms of BI access, including chatbots and autonomous alerts.
Tableau Software Inc.'s Desktop is a visual analytics and data discovery software platform that lets users see and understand their data with drag-and-drop simplicity, according to the company. Users can create interactive visualizations and dashboards to gain immediate insights without the need for any programming. They can then share their findings with colleagues.
Tableau Desktop can connect to an organization's data in the cloud, on premises or both using one of 75 native data connectors or Tableau's Web Data Connector. This includes connectors to data sources such as Amazon Redshift, Google BigQuery, SQL Server, SAP and Oracle, plus applications such as Salesforce and ServiceNow.
Tibco Software Inc.'s Spotfire is an enterprise analytics platform that connects to and blends data from files, relational and NoSQL databases, OLAP, Hadoop and web services, as well as to cloud applications such as Google Analytics and Salesforce.
A threat hunter, also called a cybersecurity threat analyst, is a security professional or managed service provider (MSP) that proactively uses manual or machine-assisted techniques to detect security incidents that may elude the grasp of automated systems. Threat hunters aim to uncover incidents that an enterprise would otherwise not find out about, providing chief information security officers (CISOs) and chief information officers (CIOs) with an additional line of defense against advanced persistent threats (APTs).
In order to detect a security incident an automated system might miss, a threat hunter uses critical-thinking skills and creativity to look at patterns of normal behavior and be able to identify network behavior anomalies. A threat hunter must have considerable business knowledge and an understanding of normal enterprise operations in order to avoid false positives and have good communication skills to share the results of the hunt. It is especially important for the threat hunter to keep current on the latest security research.
The job of the threat hunter is to both supplement and reinforce automated systems. As the review process uncovers patterns for initiating attacks, the security organization can use that information to improve its automated threat detection software.
A 2017 SANS Institute report found more organizations are pursuing threat hunting initiatives, but noted the bulk of the growth is confined to vertical markets such as financial services, high tech, military and government, and telecommunications. As of 2017, the field of threat hunting was still new for the majority of IT security organizations. The SANS Institute report noted 45% of the respondents to its threat hunting survey do their hunting on an ad hoc basis.
Threat hunters typically work within a security operations center (SOC) and take the lead role in an enterprise's threat detection and incident response activities. Threat hunting may be assigned as an additional duty to one or more security engineers within a SOC, or a SOC may dedicate security engineers to full-time threat hunting duties.
Additional options for creating a threat hunting team include rotating security engineers into the threat hunting role on a temporary basis and then having them return to their usual jobs within the SOC. Internally, threat hunters are often managed by an organization's CISO, who works with the CIO to coordinate enterprise security.
COBOL forms the core of many legacy systems, and virtualizing it can be challenging. Organizations can use migration tools and Visual COBOL to modernize it.
The risks of maintaining an out-of-date COBOL system are significant, so organizations must either modernize a legacy COBOL system or transform it from the ground up. COBOL forms the core of many important systems. The organizations that use them are rightfully wary of modernization, because the costs and risks involved in transforming a central application are immense. Failure to act -- either by virtualizing COBOL or migrating it to the cloud -- could be detrimental to the business.
Many systems are rooted in legacy COBOL
COBOL is at the core of many banking and business systems; governments worldwide are heavy users. Much of COBOL -- there are some 200 billion lines of legacy COBOL code still in use -- runs on old-style mainframes, despite their high operational expenses and limited performance.
For COBOL's base of conservative users, those mainframes are a significant barrier to migration. They tend to see mainframe replacement with client-server or commercial off-the-shelf systems as too risky, at least until the new technology matures. In today's mobile-focused, graphical environment, however, the need for a bridge out of legacy COBOL has become critical, so code modules have been written to interface to the outside universe using C, .NET and other modern languages and environments.
Risks are high for legacy systems
Current IT tasks demand agility, and businesses need to evolve to match the changing market and get ahead of competitors. In the past, this seemed to require a migration from legacy COBOL to a modern language. The problem with this approach is that the COBOL app is often the backbone for company operations. The huge costs involved in the rewrite process to prove new code before it's released into production add to this risk.
There are plenty of examples of huge failures in the rewrite process. A number of government software projects around the world have crashed and burned spectacularly. This possibility makes many CIOs grit their teeth and buy another mainframe every 10 years or so. These captive customers can expect to pay through the nose, first for capital equipment and then for support and operating costs.
New tools provide virtualization options
The maturation of the cloud, coupled with migration tools, offers solutions to the captive hardware dilemma. We now have tools that transition core legacy COBOL apps from the mainframe into VMs without any major rewrite. You'll need to recompile to target the VM, but that's not a big deal compared to mass code rewrites.
Almost all of the code can move without change, though anything written in assembler will need to be recoded. With the right compiler, I/O and comms will also be identical to the old system. Things might get a bit more complicated if you're going to use RESTful I/O instead of block I/O.
Taking advantage of the parallel nature of the cloud is also a potential challenge. There are several companies, such as Micro Focus, that offer consultation on this topic, as well as with the general issue of tuning legacy COBOL in the cloud.
Making migration happen
The trickiest part of a COBOL migration to the private cloud lies in building out a virtual infrastructure that properly supports the apps. Both semantics and structure differ between private clouds and, of course, are all different from the mainframe experience. A one-to-one match won’t happen, so it's crucial to understand the differences between the original and what is possible in the private cloud. Don't expect to get it right the first time. It's critical to deploy test and QA processes to succeed with the migration.
Performance is another issue. Private clouds aren't mainframes and have much more scalability. It's critical that your app is set up to run in a multicore environment that will allow it to take advantage of the private cloud's inherent parallelism.
If the app isn't multithreaded, all might not be lost on the performance side. Some of the largest private cloud instances are powerful computers in their own right and might match or exceed a mainframe in compute power, especially if given a large dynamic RAM space and local solid-state drive instance storage. Of course, the most recent mainframes are faster too, but you should compare the 12-year-old mainframe you want to lose with the latest private cloud instances.
Legacy COBOL can be rejuvenated
All of this sounds pretty easy, but we need to step back and look at some collateral issues involved in a cloud transition. The programming staff for COBOL typically consists of long-term employees nearing retirement. They can apply changes with pinpoint accuracy because they have enormous experience with the app in use and in the company's field of business. The problem is that the industry isn't generating replacements. There are no universities or trade schools teaching COBOL; it has become the Latin of the IT industry, a dead language only good for ancient IT tasks.
The answer is to rejuvenate COBOL itself, which is the aim of Visual COBOL. In the 1990s, Visual Basic revived Basic in a similar way through a design that worked with the coding practices of the time. As with Basic, this doesn't initially mean changing the code itself, but Visual COBOL does make COBOL easier for new programmers to learn and speeds up the process.
Visual COBOL isn't the complete answer. Organizations need to upgrade the IT processes that relate to changes, which is a considerable effort. Many legacy COBOL shops have change queues longer than a year, a timeline that doesn't meet the modern standards of agile operation.
Competitive pressure might force transformation
Even a cloud-based COBOL approach will face the competitive pressure of new technologies. Technology built around in-memory databases can easily outpace a legacy COBOL app, while GPU acceleration and massively parallel processing on the software side and much faster server engines on the cloud infrastructure side can increase the agility gap. In other words, the decision to move to cloud COBOL might only delay an inevitable move to a modern app.
The challenge for the CIO is to decide whether to modernize by moving legacy COBOL apps to VMs or the cloud, or to do a complete makeover. With today's Visual COBOL as a tool, that's at least a realistic choice.
Operational intelligence (OI) is an approach to data analysis that enables decisions and actions in business operations to be based on real-time data as it's generated or collected by companies. Typically, the data analysis process is automated, and the resulting information is integrated into operational systems for immediate use by business managers and workers.
OI applications are primarily targeted at front-line workers who, hopefully, can make better-informed business decisions or take faster action on issues if they have access to timely business intelligence (BI) and analytics data. Examples include call-center agents, sales representatives, online marketing teams, logistics planners, manufacturing managers and medical professionals. In addition, operational intelligence can be used to automatically trigger responses to specified events or conditions.
What is now known as OI evolved from operational business intelligence, an initial step focused more on applying traditional BI querying and reporting. OI takes the concept to a higher analytics level, but operational BI is sometimes still used interchangeably with operational intelligence as a term.
How operational intelligence works
In most OI initiatives, data analysis is done in tandem with data processing or shortly thereafter, so workers can quickly identify and act on problems and opportunities in business operations. Deployments often include real-time business intelligence systems set up to analyze incoming data, plus real-time data integration tools to pull together different sets of relevant data for analysis.
Stream processing systems and big data platforms, such as Hadoop and Spark, can also be part of the OI picture, particularly in applications that involve large amounts of data and require advanced analytics capabilities. In addition, various IT vendors have combined data streaming, real-time monitoring and data analytics tools to create specialized operational intelligence platforms.
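As a rough illustration of analyzing data while it is still arriving, the sketch below keeps a rolling average over the most recent readings in a stream. The class name, window size and sample readings are assumptions made for the example; a production deployment would more likely rely on a stream processing platform than on hand-written code.

```python
from collections import deque


class RollingAverage:
    """Hypothetical KPI: average of the most recent `window_size` readings."""

    def __init__(self, window_size: int):
        # A bounded deque drops the oldest reading automatically.
        self.values = deque(maxlen=window_size)

    def update(self, value: float) -> float:
        """Add one incoming reading and return the refreshed average."""
        self.values.append(value)
        return sum(self.values) / len(self.values)


kpi = RollingAverage(window_size=3)
readings = [10.0, 20.0, 30.0, 40.0]  # illustrative data, not real measurements
averages = [kpi.update(r) for r in readings]
print(averages)
```

Each call to update() returns a figure that reflects only recent data, which is the property a real-time dashboard needs.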
As data is analyzed, organizations often present operational metrics, key performance indicators (KPIs) and business insights to managers and other workers in interactive dashboards that are embedded in the systems they use as part of their jobs; data visualizations are usually included to help make the information easy to understand. Alerts can also be sent to notify users of developments and data points that require their attention, and automated processes can be kicked off if predefined thresholds or other metrics are exceeded, such as stock trades being spurred by prices hitting particular levels.
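The threshold-based alerting described above can be sketched in a few lines. Everything here is hypothetical: the ThresholdRule class, the notify function, the metric names and the limit of 100.0 are placeholders chosen for illustration, not part of any particular OI product.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class ThresholdRule:
    """Hypothetical rule: run an action when a metric exceeds a limit."""
    metric: str
    limit: float
    action: Callable[[str, float], str]


def notify(metric: str, value: float) -> str:
    # Stand-in for sending an alert to a dashboard or kicking off a process.
    return f"ALERT: {metric} reached {value}"


def evaluate(events: Iterable[dict], rules: List[ThresholdRule]) -> List[str]:
    """Check every incoming event against every rule as it arrives."""
    triggered = []
    for event in events:
        for rule in rules:
            if event["metric"] == rule.metric and event["value"] > rule.limit:
                triggered.append(rule.action(event["metric"], event["value"]))
    return triggered


rules = [ThresholdRule(metric="stock_price", limit=100.0, action=notify)]
stream = [
    {"metric": "stock_price", "value": 98.5},
    {"metric": "stock_price", "value": 101.2},
    {"metric": "queue_length", "value": 500},
]
alerts = evaluate(stream, rules)
print(alerts)
```

Only the second event crosses the limit, so only one alert is produced; events for metrics without a rule are ignored.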
Operational intelligence uses and examples
Stock trading and other types of investment management are prime candidates for operational intelligence initiatives because of the need to monitor huge volumes of data in real time and respond rapidly to events and market trends. Customer analytics is another area that's ripe for OI. For example, online marketers use real-time tools to analyze internet clickstream data, so they can better target marketing campaigns to consumers. And cable TV companies track data from set-top boxes in real time to analyze the viewing activities of customers and how the boxes are functioning.
The growth of the internet of things has sparked operational intelligence applications for analyzing sensor data being captured from manufacturing machines, pipelines, elevators and other equipment; that enables predictive maintenance efforts designed to detect potential equipment failures before they occur. Other types of machine data also fuel OI applications, including server, network and website logs that are analyzed in real time to look for security threats and IT operations issues.
There are less grandiose operational intelligence use cases as well. They include call-center applications that provide operators with up-to-date customer records and recommend promotional offers while they're on the phone with customers, as well as logistics applications that help calculate the most efficient driving routes for fleets of delivery vehicles.
OI benefits and challenges
The primary benefit of OI implementations is the ability to address operational issues and opportunities as they arise -- or even before they do, as in the case of predictive maintenance. Operational intelligence also empowers business managers and workers to make more informed -- and hopefully better -- decisions on a day-to-day basis. Ultimately, if managed successfully, the increased visibility and insight into business operations can lead to higher revenue and competitive advantages over rivals.
But there are challenges. Building an operational intelligence architecture typically involves piecing together different technologies, and there are numerous data processing platforms and analytics tools to choose from, some of which may require new skills in organizations. High performance and sufficient scalability are also needed to handle the real-time workloads and large volumes of data common in OI applications without choking the system.
Also, most business processes at a typical company don't require real-time data analysis. With that in mind, a key part of operational intelligence projects involves determining which end users need up-to-the-minute data and then training them to handle the information once it starts being delivered to them in that fashion.
Operational intelligence vs. business intelligence
Conventional BI systems support the analysis of historical data that has been cleansed and consolidated in a data warehouse or data mart before being made available for business analytics uses. BI applications generally aim to tell corporate executives and business managers what happened in the past on revenues, profits and other KPIs to aid in budgeting and strategic planning.
Early on, BI data was primarily distributed to users in static operational reports. That's still the case in some organizations, although many have shifted to dashboards with the ability to drill down into data for further analysis. In addition, self-service BI tools let users run their own queries and create data visualizations on their own, but the focus is still mostly on analyzing data from the past.
Operational intelligence systems let business managers and front-line workers see what's currently happening in operational processes and then immediately act upon the findings, either on their own or through automated means. The purpose is not to facilitate planning, but to drive operational decisions and actions in the moment.
Virtual private clouds and private clouds differ in terms of architecture, the provider and tenants, and resource delivery. Decide between the two models based on these distinctions.
Organizations trying to decide between a virtual private cloud and a private cloud must first define what they want to accomplish. A private cloud gives individual business units more control over the IT resources allocated to them, whereas a virtual private cloud offers organizations a different level of isolation.
Virtual private clouds are typically layers of isolation within public clouds, but they might lack the self-service portal that enables IT to provide individual business units with DIY IT environments. Private clouds are generally on-premises environments with self-service portals that designated employees can use to deploy resources without intervention from IT.
But interest in the private cloud is about much more than just technology; private clouds represent a fundamental shift in the way organizations deliver IT resources.
In the past, corporate IT acted as a gatekeeper for all things tech. If a business unit within an organization needed to deploy a new application or a new service, it went through IT.
This way of doing things was problematic for both the business units and for IT. Whenever a department had to seek IT approval for a tech project, it ran the risk of IT denying the project or modifying its scope beyond recognition. Even if it was approved, the business unit might have to wait weeks or even months for IT to implement it.
The old way of doing things was also problematic for the IT department because it often put IT in the awkward position of having to say no to someone else's ideas. On the other hand, if IT did approve the project, it meant an increased workload for the IT staff that had to deploy, maintain and support the new application.
Moving away from traditional virtual infrastructures
Private cloud environments represent a shift away from the rigid administrative model that organizations have used for so long. Rather than the IT department acting as the sole governing body for all the organization's tech resources, it instead takes on the role of a service provider.
In a private cloud, the IT infrastructure is carved up into a series of private areas, and each area is assigned to a specific business unit. One or more designated employees within the department take on the role of tenant administrators for the available resources. These administrators are free to use the resources as they see fit without first seeking IT approval.
This doesn't mean that tenant administrators have total autonomy, nor does it mean that they require specialized IT skills. Every organization sets up its private cloud differently, but IT usually provides tenant administrators with a self-service portal that is designed to simplify tasks, such as deploying and managing VMs. Furthermore, IT usually creates VM templates that tenant administrators can use any time they create a new VM.
In other words, tenant administrators can create VMs on an as-needed basis, but must do so within the limits IT has put in place. These limits ensure that tenant administrators don't deplete the underlying infrastructure of hardware resources. Additionally, the use of templates guarantees that admins create VMs in accordance with the organization's security policies.
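One way to picture these guardrails is a self-service request that succeeds only if it uses an IT-approved template and stays within the tenant's quota. The sketch below is a simplified model under stated assumptions: the template names, sizes, quota figures and the Tenant class are all invented for illustration and do not correspond to any specific private cloud platform.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical templates published by IT; names and sizes are illustrative.
TEMPLATES: Dict[str, Dict[str, int]] = {
    "small": {"vcpus": 2, "memory_gb": 4},
    "large": {"vcpus": 8, "memory_gb": 32},
}


class QuotaExceeded(Exception):
    """Raised when a request would exceed the limits IT has set."""


@dataclass
class Tenant:
    name: str
    vcpu_quota: int
    memory_gb_quota: int
    vms: List[str] = field(default_factory=list)
    vcpus_used: int = 0
    memory_gb_used: int = 0

    def create_vm(self, vm_name: str, template: str) -> None:
        # Only approved templates are allowed, which keeps VMs policy-compliant.
        if template not in TEMPLATES:
            raise ValueError(f"unknown template: {template}")
        spec = TEMPLATES[template]
        # Reject the request if it would deplete the tenant's allocation.
        if (self.vcpus_used + spec["vcpus"] > self.vcpu_quota
                or self.memory_gb_used + spec["memory_gb"] > self.memory_gb_quota):
            raise QuotaExceeded(f"{self.name} has no capacity for {vm_name}")
        self.vcpus_used += spec["vcpus"]
        self.memory_gb_used += spec["memory_gb"]
        self.vms.append(vm_name)


marketing = Tenant(name="marketing", vcpu_quota=8, memory_gb_quota=32)
marketing.create_vm("web-01", "small")
try:
    marketing.create_vm("analytics-01", "large")
    rejected = False
except QuotaExceeded:
    rejected = True
print(marketing.vms, rejected)
```

The first request fits within the quota and needs no IT approval; the second is refused automatically because it would push the tenant past its vCPU limit.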
Virtual private cloud vs. private cloud
When it comes to virtual private cloud vs. private cloud, the terms are sometimes used interchangeably. In most cases, however, a virtual private cloud is different from a private cloud.
In a private cloud model, the IT department acts as a service provider and the individual business units act as tenants. In a virtual private cloud model, a public cloud provider acts as the service provider and the cloud's subscribers are the tenants. Just as the tenant administrators in a private cloud are free to create resources within the limits that have been set up for them, a public cloud's subscribers are also free to create resources within the public cloud.
When public cloud subscribers create resources, such as VM instances, databases or gateways, those resources are created within a virtual private cloud. Think of the virtual private cloud as an isolation boundary that keeps subscribers from being able to access -- or interfere with -- each other's resources.
Each public cloud provider has its own way of doing things, but some providers allow tenants to define additional virtual private clouds. For example, Amazon allows AWS subscribers to create multiple virtual private clouds, subject to per-region quotas.
Each virtual private cloud acts as an isolated environment. Organizations sometimes use virtual private clouds to isolate web servers from other cloud-hosted resources, or to create an isolation boundary around the virtual servers that make up a multi-tier application.
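The isolation boundary can be modeled very simply: two resources can communicate only if they sit in the same virtual private cloud. This is a conceptual sketch, not a provider API; the Resource class, the can_reach function and the identifiers are hypothetical, and real clouds add options such as peering and gateways that are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    vpc_id: str  # hypothetical identifier of the enclosing virtual private cloud


def can_reach(source: Resource, target: Resource) -> bool:
    """Allow traffic only inside one VPC (no peering or gateways modeled)."""
    return source.vpc_id == target.vpc_id


web = Resource("web-server", vpc_id="vpc-web")
app = Resource("app-server", vpc_id="vpc-app")
db = Resource("database", vpc_id="vpc-app")

print(can_reach(app, db))  # same VPC
print(can_reach(web, db))  # different VPCs
```

In this model the application server and database share a boundary, while the web server is kept apart from both.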
The new norm: Organizations don't have to choose
In spite of virtual private cloud vs. private cloud distinctions, the lines between them are blurring more than ever. Rather than choosing between a private cloud and a public cloud, most organizations opt for a hybrid cloud.
Admins can construct hybrid clouds in many different ways, but one option is to create a self-service environment similar to that of a typical private cloud, but to configure it so some resources reside on premises, while others reside in the public cloud.
Startups will almost always benefit from operating entirely in the public cloud because doing so enables them to avoid a large upfront investment in IT infrastructure. For organizations that already have an on-premises IT infrastructure in place, however, a hybrid cloud usually offers the best of both worlds.