Data science skills spawn success in AI, big data analytics
Posted by Thang Le Toan on 19 July 2018 11:53 PM
Data science teams need the right skills and solid processes
For data scientists, big data systems and AI-enabled advanced analytics technologies open up new possibilities to help drive better business decision-making. "Like never before, we have access to data, computing power and rapidly evolving tools," Forrester Research analyst Kjell Carlsson wrote in a July 2017 blog post.
The downside, Carlsson added, is that many organizations "are only just beginning to crack the code on how to unleash this potential." Often, that isn't due to a lack of internal data science skills, he said in a June 2018 blog; it's because companies treat data science as "an artisanal craft" instead of a well-coordinated process that involves analytics teams, IT and business units.
Of course, possessing the right data science skills is a prerequisite to making such processes work. The list of skills that LinkedIn's analytics and data science team wants in job candidates includes the ability to manipulate data, design experiments with it and build statistical and machine learning models, according to Michael Li, who heads the team.
But softer skills are equally important, Li said in an April 2018 blog. He cited communication, project management, critical thinking and problem-solving skills as key attributes. Being able to influence decision-makers is also an important part of "the art of being a data scientist," he wrote.
The problem is that such skills requirements are often "completely out of reach for a single person," Miriam Friedel wrote in a September 2017 blog when she was director and senior scientist at consulting services provider Elder Research. Friedel, who has since moved on to software vendor Metis Machine as data science director, suggested in the blog that instead of looking for the proverbial individual unicorn, companies should build "a team unicorn."
This handbook more closely examines that team-building approach as well as critical data science skills for the big data and AI era.
Reskilling the analytics team: Math, science and creativity
Technical skills are a must for data scientists. But to make analytics teams successful, they also need to think creatively, work in harmony and be good communicators.
In a 2009 study of its employee data, Google discovered that the top seven characteristics of a successful manager at the company didn't involve technical expertise. For example, they included being a good coach and an effective communicator, having a clear vision and strategy, and empowering teams without micromanaging them. Technical skills were No. 8.
Google's list, which was updated in 2018 to add collaboration and strong decision-making capabilities as two more key traits, applies specifically to its managers, not to technical workers. But the findings from the study, known as Project Oxygen, are also relevant to building an effective analytics team.
Obviously, STEM skills are incredibly important in analytics. But as Google's initial and subsequent studies have shown, they aren't the whole or even the most important part of the story. As an analytics leader, I'm very glad that someone has put numbers to all this, but I've always known that the best data scientists are also empathetic and creative storytellers.
What's churning in the job market?
The roles of statisticians and data scientists are changing -- and becoming more and more important in today's dynamic business world.
According to the latest employment projections report by the U.S. Bureau of Labor Statistics, statisticians are in high demand. Among occupations that currently employ at least 25,000 people, statistician ranks fifth in projected growth rate; it's expected to grow by 33.8% from 2016 to 2026. For context, the average rate of growth that the statistics bureau forecasts for all occupations is 7.4%. And with application software developers as the only other exception, all of the other occupations in the top 10 are in the healthcare or senior care verticals, which is consistent with an aging U.S. population.
Statistician is fifth among occupations with at least 25,000 workers projected to grow at the fastest rates.
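To put the two growth figures side by side, here is a minimal Python sketch of the arithmetic. The 33.8% and 7.4% rates come from the projections cited above. The base of 25,000 workers is only the size threshold mentioned in the text, used here as a hypothetical starting point rather than the actual number of statisticians. Treating the 2016-to-2026 figure as a 10-year compound rate is likewise an assumption made for illustration.

```python
# Growth rates cited in the article (percent, 2016 to 2026).
STATISTICIAN_GROWTH_PCT = 33.8
ALL_OCCUPATIONS_GROWTH_PCT = 7.4
YEARS = 10

# Hypothetical base: the 25,000-worker threshold, not a real headcount.
HYPOTHETICAL_BASE = 25_000


def projected_employment(base: float, growth_pct: float) -> float:
    """Employment after applying a total percentage growth to a base."""
    return base * (1 + growth_pct / 100)


def annualized_rate(growth_pct: float, years: int) -> float:
    """Compound annual rate that yields the total growth over `years`."""
    return (1 + growth_pct / 100) ** (1 / years) - 1


# How many times faster than the all-occupations average?
ratio = STATISTICIAN_GROWTH_PCT / ALL_OCCUPATIONS_GROWTH_PCT

print(f"Growth relative to average: {ratio:.1f}x")
print(f"Annualized statistician growth: "
      f"{annualized_rate(STATISTICIAN_GROWTH_PCT, YEARS):.2%}")
print(f"Hypothetical base of {HYPOTHETICAL_BASE:,} grows to "
      f"{projected_employment(HYPOTHETICAL_BASE, STATISTICIAN_GROWTH_PCT):,.0f}")
```

Under these assumptions, the projected statistician growth rate is roughly four and a half times the average for all occupations.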
Thanks to groundbreaking innovations in technology and computing power, the world is producing more data than ever before. Businesses are using actionable analytics to improve their day-to-day processes and drive diverse functions like sales, marketing, capital investment, HR and operations. Statisticians and data scientists are making that possible, using not only their mathematical and scientific skills, but also creativity and effective communication to extract and convey insights from the new data resources.
In 2017, IBM partnered with job market analytics software vendor Burning Glass Technologies and the Business-Higher Education Forum on a study that showed how the democratization of data is forcing change in the workforce. Without diving into the minutiae, I gathered from the study that with more and more data now available to more and more people, the insights garnered from the data set you apart as an employee -- or as a company.
Developing and encouraging our analytics team
The need to find and communicate these insights influences how we hire and train our up-and-coming analytics employees at Dun & Bradstreet. Our focus is still primarily on mathematics, but we also consider other characteristics like critical- and innovative-thinking abilities as well as personality traits, so our statisticians and data scientists are effective in their roles.
Our employees have the advantage of working for a business-to-business company that has incredibly large and varied data sets -- containing more than 300 million business records -- and a wide variety of customers that are interested in our analytics services and applications. They get to work on a very diverse set of business challenges, share cutting-edge concepts with data scientists in other companies and develop creative solutions to unique problems.
Our associates are encouraged to pursue new analytical models and data analyses, and we have special five-day sprints where we augment and enhance some of the team's more creative suggestions. These sprints not only challenge the creativity of our data analysts, but also require them to work on their interpersonal and communication skills while developing these applications as a group.
Socializing the new, creative data analyst
It's very important to realize that some business users aren't yet completely comfortable with a well-rounded analytics team. For the most part, when bringing in an analyst, they're looking for confirmation of a hypothesis rather than a full analysis of the data at hand.
If that's the case in your organization, then be persistent. As your team continues to present valuable insights and creative solutions, your peers and business leaders across the company will start to seek guidance from data analysts as partners in problem-solving much more frequently and much earlier in their decision-making processes.
As companies and other institutions continue to amass data exponentially and rapid technological changes continue to affect the landscape of our businesses and lives, growing pains will inevitably follow. Exceptional employees who have creativity and empathy, in addition to mathematical skills, will help your company thrive through innovation. Hopefully, you have more than a few analysts who possess those capabilities. Identify and encourage them -- and give permission to the rest of your analytics team to think outside the box and rise to the occasion.
Data scientist vs. business analyst: What's the difference?
Data science and business analyst roles differ in that data scientists must dive deep into data and come up with unique business solutions -- but the distinctions don't end there.
What is the difference between data science and business analyst jobs? And what kind of training or education is required to become a data scientist?
There are a number of differences between data scientists and business analysts, the two most common business analytics roles, but at a high level, you can think about the distinction as similar to the one between a medical researcher and a lab technician. One uses experimentation and the scientific method to search out new, potentially groundbreaking discoveries, while the other applies existing knowledge in an operational context.
The data scientist vs. business analyst distinction comes down to the realms they inhabit. Data scientists delve into big data sets and use experimentation to discover new insights in data. Business analysts, on the other hand, typically use self-service analytics tools to review curated data sets, build reports and data visualizations, and report targeted findings -- things like revenue by quarter or sales needed to hit targets.
What does a data scientist do?
A data scientist takes analytics and data warehousing programs to the next level: What does the data really say about the company, and is the company able to distinguish relevant data from irrelevant data?
A data scientist should be able to leverage the enterprise data warehouse to dive deeper into the data that comes out of it or to analyze new types of data stored in Hadoop clusters and other big data systems. A data scientist doesn't just report on data like a classic business analyst does; a data scientist also delivers business insights based on the data.
A data scientist job also requires a strong business sense and the ability to communicate data-driven conclusions to business stakeholders. Strong data scientists don't just address business problems; they also pinpoint the problems that have the most value to the organization. A data scientist plays a more strategic role within an organization.
Data scientist education, skills and personality traits
Data scientists look through all the available data with the goal of discovering a previously hidden insight that, in turn, can provide a competitive advantage or address a pressing business problem. Data scientists do not simply collect and report on data -- they also look at it from many angles, determine what it means and then recommend ways to apply the data. These insights could lead to a new product or even an entirely new business model.
Data scientists apply advanced machine learning models to automate processes that previously took too long or were inefficient. They use data processing and programming tools -- often open source, like Python, R and TensorFlow -- to develop new applications that take advantage of advances in artificial intelligence. These applications may perform a task such as transcribing calls to a customer service line using natural language processing or automatically generating text for email campaigns.
What does a business analyst do?
A business analyst -- a title often used interchangeably with data analyst -- focuses more on delivering operational insights to lines of business using smaller, more targeted data sets. For example, a business analyst tied to a sales team will work primarily with sales data to see how individual team members are performing, to identify members who might need extra coaching and to search for other areas where the team can improve on its performance.
Business analysts typically use self-service analytics and data visualization tools. Using these tools, business analysts can build reports and dashboards that team members can use to track their performance. Typically, the information contained in these reports is retrospective rather than predictive.
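As a rough illustration of the kind of retrospective report described above, the following Python sketch totals revenue by quarter and compares the year-to-date figure with a target. In practice a business analyst would more likely build this in a self-service tool. The sales records, the `ANNUAL_TARGET` value and the function names are all hypothetical and exist only for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sales records: (sale date, revenue in dollars).
sales = [
    (date(2018, 1, 15), 1200.0),
    (date(2018, 2, 3), 800.0),
    (date(2018, 4, 20), 1500.0),
    (date(2018, 5, 9), 700.0),
    (date(2018, 8, 30), 2100.0),
    (date(2018, 11, 11), 950.0),
]

# Hypothetical annual revenue target.
ANNUAL_TARGET = 8000.0


def quarter_label(d: date) -> str:
    """Map a date to a label such as '2018-Q1'."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def revenue_by_quarter(records):
    """Sum revenue per quarter, returned in chronological order."""
    totals = defaultdict(float)
    for sold_on, revenue in records:
        totals[quarter_label(sold_on)] += revenue
    return dict(sorted(totals.items()))


report = revenue_by_quarter(sales)
for quarter, total in report.items():
    print(f"{quarter}: ${total:,.2f}")

# Sales still needed to hit the target, never less than zero.
shortfall = max(ANNUAL_TARGET - sum(report.values()), 0.0)
print(f"Sales needed to hit target: ${shortfall:,.2f}")
```

Everything in this report looks backward at what has already been sold; nothing in it forecasts future sales, which is the distinction the paragraph above draws.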
Data scientist vs. business analyst training, tools and trends
To become a business analyst, you need a familiarity with statistics and the fundamentals of data analysis, but there are many self-service analytics tools that do the mathematical heavy lifting for you. Of course, you have to know if it's statistically meaningful to join two separate data sets, and you have to understand the distinction between correlation and causation. But, on the whole, a deep background in mathematics is unnecessary.
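The correlation-versus-causation point can be made concrete with a small Python sketch. The data is entirely hypothetical and constructed so that two series (ice cream sales and sunburn cases) both depend on a third (temperature) and not on each other. The sketch computes the plain Pearson correlation, then a partial correlation: it regresses each series on temperature and correlates what is left over. The function names are illustrative, not taken from any particular tool.

```python
from math import sqrt

# Hypothetical weekly observations, built so both outcomes track temperature.
temperature_c = [15, 18, 21, 24, 27, 30, 33]
ice_cream_sales = [151, 178, 211, 240, 271, 298, 331]
sunburn_cases = [31, 36, 41, 48, 53, 60, 67]


def mean(xs):
    return sum(xs) / len(xs)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def residuals(ys, xs):
    """What remains of ys after a least-squares straight-line fit on xs."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]


# Raw correlation: very strong, although neither series drives the other.
raw = pearson(ice_cream_sales, sunburn_cases)

# Partial correlation: remove the temperature effect from both series first.
controlled = pearson(residuals(ice_cream_sales, temperature_c),
                     residuals(sunburn_cases, temperature_c))

print(f"Raw correlation: {raw:.3f}")
print(f"Correlation after controlling for temperature: {controlled:.3f}")
```

In this constructed example, the raw correlation is close to 1, while the correlation after controlling for temperature is essentially zero. Real data is rarely this clean, but the pattern shows why a strong correlation alone does not establish that one measure causes the other.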
To become a data scientist, on the other hand, you need a strong background in math. This is one of the primary differences in the question of data scientists vs. business analysts.
Many data scientists have doctorates in some field of math. Many have backgrounds in physics or other advanced sciences that lean heavily on statistical inference.
Business analysts can generally pick up the technical skills they need on the job. Whether an enterprise uses Tableau, Qlik or Power BI -- the three most common self-service analytics platforms -- or another tool, most use graphical user interfaces that are designed to be intuitive and easy to pick up.
Data science jobs require more specific technical training. In addition to advanced mathematical education, data scientists need deep technical skills. They must be proficient in several common coding languages -- including Python, SQL and Java -- which enable them to run complex machine learning models against big data stored in Hadoop or other distributed data management platforms. Most often, data scientists pick up these skills from a college-level computer science curriculum.
However, trends in data analytics are beginning to collapse the line between data science and data analysis. Increasingly, software companies are introducing platforms that can automate complex tasks using machine learning. At the same time, self-service software supports deeper analytical functionality, meaning data scientists are increasingly using tools that were once solely for business analysts.
Companies often report the highest analytics success when blending teams, so data scientists working alongside business analysts can produce operational benefits. This means that the data scientist vs. business analyst distinctions could become less important as time goes on -- a trend that may pay off for enterprises.
Hiring vs. training data scientists: The case for each approach
Hiring data scientists is easier said than done -- so should you try to train current employees in data science skills? That depends on your company's needs, writes one analytics expert.
Companies are faced with a dilemma on big data analytics initiatives: whether to hire data scientists from outside or train current employees to meet new demands. In many cases, realizing big data's enormous untapped potential brings the accompanying need to increase data science skills -- but building up your capacity can be tricky, especially in a crowded market of businesses looking for analytics talent.
Even with a shortage of available data scientists, screening and interviewing for quality hires is time- and resource-intensive. Alternatively, training data scientists from within may be futile if internal candidates don't have the fundamental aptitude.
At The Data Incubator, we've helped hundreds of companies train employees on data science and hire new talent -- and, often, we've aided organizations in handling the tradeoffs between the two approaches. Based on the experiences we've had with our corporate clients, you should consider the following factors when deciding which way to go.
New hires bring in new thinking
The main benefit of hiring rather than training data scientists comes from introducing new ideas and capabilities into your organization. What you add may be technical in nature: For example, are you looking to adopt advanced machine learning techniques, such as neural networks, or to develop real-time customer insights by using Spark Streaming? It may be cultural, too: Do you want an agile data science team that can iterate rapidly -- even at the expense of "breaking things," in Facebook's famous parlance? Or one that can think about data creatively and find novel approaches to using both internal and external information?
At other times, it's about having a fresh set of eyes looking at the same problems. Many quant hedge funds intentionally hire newly minted STEM Ph.D. holders -- people with degrees in science, technology, engineering or math -- instead of industry veterans precisely to get a fresh take on financial markets. And it isn't just Wall Street; in other highly competitive industries, too, new ideas are the most important currency, and companies fight for them to remain competitive.
How a company sources new talent can also require some innovation, given the scarcity of skilled data scientists. Kaggle and other competition platforms can be great places to find burgeoning data science talent. The public competitions on Kaggle are famous for bringing unconventional stars and unknown whiz kids into the spotlight and demonstrating that the best analytics performance may come from out of left field.
Similarly, we've found that economists and other social scientists often possess the same strong quantitative skill sets as their traditional STEM peers, but are overlooked by HR departments and hiring managers alike.
Training adds to existing expertise
In other cases, employers may value industry experience first and foremost. Domain expertise is complex, intricate and difficult to acquire in some industries. Such industries often already have another science at their core. Rocketry, mining, chemicals, oil and gas -- these are all businesses in which knowledge of the underlying science is more important than data science know-how.
Highly regulated industries are another case in point. Companies facing complex regulatory burdens must often meet very specific, and frequently longstanding, requirements. Banks must comply with financial risk testing and with statutes that were often written decades ago. Similarly, the drug approval process in healthcare is governed by a complex set of immutable rules. While there is certainly room for innovation via data science and big data in these fields, it is constrained by regulations.
Companies in this position often find training data scientists internally to be a better option for developing big data analytics capabilities than hiring new talent. For example, at The Data Incubator, we work with a large consumer finance institution that was looking for data science capabilities to help enhance its credit modeling. But its ideal candidate profile for that job was very different from the ones sought by organizations looking for new ideas on business operations or products and services.
The relevant credit data comes in slowly: Borrowers who are initially reliable could become insolvent months or years after the initial credit decision, which makes it difficult to predict defaults without a strong credit model. And wrong decisions are very expensive: Loan defaults result in direct hits to the company's profitability. In this case, we worked with the company to train existing statisticians and underwriters on complementary data science skills around big data.
Of course, companies must be targeted in selecting training candidates. They often start by identifying employees who possess strong foundational skills for data science -- things like programming and statistics experience. Suitable candidates go by many titles, including statisticians, actuaries and quantitative analysts, more popularly known as quants.
Find the right balance for your needs
For many companies, weighing the options for hiring or training data scientists comes down to understanding their specific business needs, which can vary even in different parts of an organization. It's worth noting that the same financial institution that trained its staffers to do analytics for credit modeling also hired data scientists for its digital marketing team.
Without the complex regulatory requirements imposed on the underwriting side, the digital marketing team felt it could more freely innovate -- and hence decided to bring in new blood with new ideas. These new hires are now building analytical models that leverage hundreds of data signals and use advanced AI and machine learning techniques to more precisely target marketing campaigns at customers and better understand the purchase journeys people take.
Ultimately, the decision of whether to hire or train data scientists must make sense for an organization. Companies must balance the desire to innovate with the need to incorporate existing expertise and satisfy regulatory requirements. Getting that balance right is a key step in a successful data science talent strategy.