
From AI to Teamwork: 7 Key Skills for Data Scientists


Jun 18, 2021

Today’s data scientists need more than proficiency in AI and Python. Organizations are looking for specialists who also feel at home in the C-suite.

Credit: Maksym Yemelyanov via Adobe Stock

The Bureau of Labor Statistics lists data science jobs among the top 15 fastest-growing occupations, with projected job growth of 31 percent over the next 10 years. With data increasingly becoming the lifeblood of all organizations, data scientists need to be equipped not only with the right technical skills but also with a robust dose of business acumen.
Machine Learning/Neural Networks
In 2021, machine learning methods like transfer learning and transformers are drawing a lot of attention because they are rapidly driving innovation in a number of different spaces. For building and training neural networks, PyTorch has a lot of momentum behind it, and Keras and TensorFlow are also commonly used.
There is also a rich ecosystem of software libraries, many open source, that can help accelerate machine learning and data science applications. 
“Data scientists can make themselves attractive by demonstrating deep intuition into why and how machine learning algorithms work, which is important for working through challenges that inevitably arise during training and testing,” said Matthew Silver, senior director of data science at Vectra, an AI threat detection and response specialist company. “ONNX, a neural networks standard that facilitates platform, library, and language independent model deployment, helped us streamline our use of AI in production and accelerate our modeling work.”
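To make the point about intuition concrete, here is a minimal from-scratch sketch of logistic regression trained by batch gradient descent, using only the Python standard library. The toy data, learning rate, and function names are illustrative assumptions and are not drawn from Vectra's workflow; production work would normally rely on a library such as PyTorch.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean log loss with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        # Step against the gradient to reduce the loss.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Made-up one-feature data; the labels overlap on purpose.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]

w, b = train_logistic(xs, ys)
print("weight:", w, "bias:", b)
print("p(y=1 | x=2):", sigmoid(w * 2.0 + b))
```

Because the toy labels overlap, the weight settles at a finite value. On perfectly separable data the same unregularized loop would keep growing `w`, which is the kind of training behavior that a working intuition for the algorithm helps diagnose.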
Programming
It’s important for data scientists to write high-quality, maintainable code for exploratory analysis, data preprocessing, and algorithm training, and in some cases for deploying models in production. Python, JavaScript, R, and Scala are the top languages for this work. Another helpful skill is knowing how to wrap your models in a web API that others can deploy.
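As a rough illustration of wrapping a model in a web API, the sketch below serves predictions over HTTP using only Python's standard library. The `predict` function is a hypothetical stand-in with hard-coded weights, and the `/predict` route, the port, and the JSON shape are assumptions chosen for this example; a real service would load a trained model and would typically use a web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model: a fixed linear score."""
    weights = [0.5, -0.25]  # illustrative values, not learned from data
    bias = 1.0
    return bias + sum(w * x for w, x in zip(weights, features))


class PredictHandler(BaseHTTPRequestHandler):
    """Answer POST /predict with a JSON prediction."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404, explain="Unknown endpoint")
            return
        try:
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length))
            score = predict(payload["features"])
        except (ValueError, KeyError, TypeError):
            self.send_error(400, explain="Expected a JSON object with a 'features' list")
            return
        body = json.dumps({"prediction": score}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the API on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the server locally; a client could then POST JSON such as `{"features": [2.0, 4.0]}` to `/predict` and receive the score back as JSON.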
“Data scientists who are able to walk on the job and start using common software libraries to build models right away are the most competitive, and strong software development skills are a plus in almost all cases,” Silver said.
Cloud Infrastructure
Data scientists with an understanding of cloud engineering principles and cloud infrastructure are attractive to many employers. That means getting comfortable with at least one of the big three public cloud providers: Microsoft Azure, Amazon Web Services, or Google Cloud. Each offers a comprehensive set of tools for data extraction, data cleansing, visualization, and machine learning.
“I personally look for data scientists familiar with cloud infrastructure, CI/CD pipelines, and automation,” said Phillip Gates-Idem, chief architect at JupiterOne, a provider of cyber asset management and governance solutions. “Data scientists need to have a firm understanding of how to build and utilize tools with cloud infrastructure.”
Statistics
Statistics, the branch of mathematics concerned with collecting and interpreting quantitative data through models and representations, is at the core of data science and includes concepts like probability, variability, regression, and central tendency.
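The concepts named above can be sketched in a few lines with Python's standard `statistics` module. The numbers below are made-up illustrative data (hours of training against model accuracy), and `fit_line` is a hypothetical helper written for this example rather than a library function.

```python
import statistics

# Made-up illustrative data: hours of training vs. model accuracy (%).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
accuracy = [52.0, 55.0, 61.0, 64.0, 68.0]

# Central tendency: where the values cluster.
mean_acc = statistics.mean(accuracy)
median_acc = statistics.median(accuracy)

# Variability: how widely the values spread (sample standard deviation).
stdev_acc = statistics.stdev(accuracy)

# Probability: empirical share of runs that beat 60% accuracy.
p_above_60 = sum(a > 60 for a in accuracy) / len(accuracy)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Regression: how accuracy changes per extra hour of training.
slope, intercept = fit_line(hours, accuracy)

print("mean:", mean_acc, "median:", median_acc, "stdev:", stdev_acc)
print("share above 60:", p_above_60)
print("slope:", slope, "intercept:", intercept)
```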
“If you don’t have an in-depth knowledge of statistics — the heart of data science — and how to apply sound mathematical reasoning to the problems you’re working on, then I don’t care how many platforms or languages you can list on your resume,” said Lars Kemmann, principal architect at IT consulting firm Netrix. “I think that’s a challenge in the industry right now — we get lots of resumes from people who haven’t done the hard work to internalize the scientific method.”
Project Management
Because data science projects can involve long exploration phases, as well as multiple unknowns even late into the game, project management is another key skill for data scientists to have. Adopting an agile methodology, for example, allows data scientists to prioritize and create roadmaps based on requirements and goals.
“It’s often very difficult to predict how long it will take to develop and train a machine learning model, and businesses waiting on updated models or results will often have timelines and planning that suffer due to this unpredictability,” Silver explained. “Data scientists who are able to take ownership over major modeling efforts by understanding limitations from the outset, conveying project status as efforts progress, and predicting when they’ll be able to offer the next meaningful readout, play an important role in our team.”
Data Storytelling/Visualization
While an organization’s data may hold a remarkable amount of potential value, none of it is realized unless you can uncover insights in that data and then translate them into actions or business outcomes. Plotly, Tableau, and D3 are among the top data science visualization and storytelling tools in demand today.
“When your client doesn’t understand what you are doing, it’s easy for them to undervalue the work you are putting in, especially in the data prep phase,” Kemmann said. “Clearly explaining the process and the benefits of each step, in a language that your audience can relate to, and supported where possible by appropriate data visualizations, is a key part of your role.”
Business Acumen
Data scientists now have more opportunities than ever before to be “hands on” with the data, but that requires a strong understanding of business objectives and the ability to explain technical concepts clearly. The data scientists who can translate the data into useful terms are the ones who will be able to add that extra value.
“Being able to translate that data into clean, digestible business information is going to be a huge skill, and data scientists don’t always have those soft skills, or the experience of sitting in a room of executives and be able to clarify their decision-making process,” said Joshua Drew, regional manager at IT staffing firm Robert Half Technology.
Nathan Eddy is a freelance writer for InformationWeek. He has written for Popular Mechanics, Sales & Marketing Management Magazine, FierceMarkets, and CRN, among others. In 2012 he made his first documentary film, The Absent Column. He currently lives in Berlin.
