My job: Data Scientist
We at Cybernews Academy spoke with Vinamra Mathur, a data scientist with seven years of experience, to discuss the vital components of data science, how to become a data scientist, and what skills are needed to break into the industry.
What is a Data Scientist?
A data scientist is responsible for analyzing and studying large amounts of data and extracting meaningful information that can be used in the industry. This field of science takes a multidisciplinary approach that combines various disciplines, including but not limited to mathematics, artificial intelligence, machine learning, software, and computer engineering. Our expert Vinamra Mathur explained that data scientists can make the life of a business or enterprise easier by using data to their advantage.
How are Data Scientists unique?
Data science is a unique facet of the IT sector and is said to be one of the most coveted jobs of the 21st century. It is a multidisciplinary field that combines various facets of computer science to solve some of the industry's most complex data problems. According to IBM, “Data science combines math and statistics, specialized programming, advanced analytics, AI, and machine learning with specific subject matter expertise to uncover actionable insights hidden in an organization’s data. These insights can be used to guide decision-making and strategic planning.” Our expert Vinamra shared some good insights into the strategies and tools data scientists use when investigating large data sets.
Vinamra’s data journey
Vinamra described some of the responsibilities he had when he started working as a data scientist. “I was engineering and dealing with the data of regulatory reports and regulatory filings,” said Vinamra, who worked for a Fortune 500 insurance company in Canada. His task was to build data science models so the company could make decisions on its capital risk portfolio and profiling. “I developed software applications and interfaces. I also deployed data science algorithms over the top to generate sharp financial numbers, which aided the business's decision-making process.” Throughout the interview, Vinamra discussed how he became a data scientist. “My role was purely software engineering at the beginning, but when my manager asked me to go further and work on these data-related endeavors, I realized how important data is to the industry.” Vinamra assessed his skills and saw what his analysis could bring to his company. That is when his analytical and technical skills came together to forge his data science career.
Data science skills
Vinamra explained that to become a successful data scientist, he needed to upskill and develop other skills that the field demanded. “I did some reskilling from several online platforms and applied them to real-life data science problems.” Vinamra told Cybernews Academy, “Software engineering is the first and foremost essential part of data science, as it is essential when trying to deal with large-scale data sets.” This is backed up by IBM, which suggests that “while data scientists can build machine learning models, scaling these efforts at a larger level requires more software engineering skills to optimize a program to run more quickly.” Although data scientists need various skills, such as communication, collaboration, programming, and other technical skills, Vinamra strongly believes that business knowledge is the key to success as a data scientist. “First, we need to understand the business because we have sophisticated data science algorithms available to do all the heavy lifting and build probabilistic models, but it is essential to understand the business first, understand the pain points and why I’m using this data. You can work on this data from there to draw meaningful insights.”
The three steps to success
As Vinamra gradually matured as a professional data scientist, he became a veteran of the industry, and his role slowly transitioned from dealing with large data sets into a people management position. Vinamra explained that there are three facets of professional data science that one must consider if they plan on leveling up their career. The first facet is people management, which becomes very important when moving into a senior role. “I work with more junior data scientists in the industry. I mentor, coach, and give them nuggets of information to help them succeed in the industry.” The second facet is ‘individual contribution,’ “so you don’t forget the technical knowledge and experience you’ve acquired in the past.” Vinamra told Cybernews Academy that senior data scientists still have to get their hands dirty to stay relevant in the industry. The third facet is understanding the stakeholder: “you should have excellent communication skills, you should have empathy, and have an understanding of the business.”
The first thing Vinamra does, before analyzing or drawing any data from the set, is to understand the business. “I prepare documentation around the business and then map out data sources. From there, I begin data engineering and the extract transform load processes.” These extract, transform, load (ETL) processes are performed in data engineering to locate and process the data. “We receive the data in raw format, so it needs to be processed and run through many quality checks before we run sophisticated algorithms.” Once the algorithms have run over the processed data, they produce distilled or filtered information that is ready to present. “We need to present that information in the form of appealing visuals as visuals create a long-lasting impact on the audience.”
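To make the extract, transform, load idea more concrete, here is a minimal Python sketch of the kind of pipeline described above. It is an illustration only, not Vinamra's actual workflow: the sample data, the column names (`policy_id`, `premium`, `region`), the quality rules, and the in-memory `warehouse` dictionary are all hypothetical.

```python
import csv
import io

# Hypothetical raw extract; a real pipeline would read from files or a database.
RAW_CSV = """policy_id,premium,region
P001,1200.50,ON
P002,,QC
P003,-50,BC
P004,980,ON
P004,980,ON
"""


def extract(text):
    """Extract: parse raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def passes_quality_checks(row):
    """Quality check (assumed rule): premium must be a non-negative number."""
    try:
        return float(row["premium"]) >= 0
    except (TypeError, ValueError):
        return False


def transform(rows):
    """Transform: drop invalid and duplicate rows, convert premium to float."""
    seen = set()
    clean = []
    for row in rows:
        if not passes_quality_checks(row) or row["policy_id"] in seen:
            continue
        seen.add(row["policy_id"])
        clean.append({**row, "premium": float(row["premium"])})
    return clean


def load(rows, warehouse):
    """Load: store clean rows in a target keyed by policy_id."""
    for row in rows:
        warehouse[row["policy_id"]] = row
    return warehouse


clean_rows = transform(extract(RAW_CSV))
warehouse = load(clean_rows, {})
total_premium = sum(row["premium"] for row in warehouse.values())
```

Only the rows that survive the quality checks reach the target store, so any algorithm run afterwards works on processed rather than raw data.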
Data science life cycle
In our previous ‘My job’ article, we looked at the DevOps lifecycle, which outlines how DevOps teams function in their working environment. Similarly, a data science lifecycle demonstrates how data scientists assess issues, process data, and reach their final results. The lifecycle aims to take raw data and turn it into something enterprises can use.
- Business understanding - this first step identifies relevant details and data sources the business needs. If you don’t recognize the problem that must be resolved, you will ultimately shoot blindly into a mass of data. By outlining the business objective, you will have the full scope of what you are looking for and how to obtain the data needed to resolve this problem.
- Data acquisition and understanding - this step aims to acquire data through surveys, web scraping, or extracting/copying data directly from a website. Once this data is collected, the next step is to clean it into a high-quality data set and ensure that the relationship between the data set and the business understanding is established.
- Modeling - this stage involves the development of predictive or descriptive models using machine learning algorithms and other analytical models to gain insights from the data. This model must be trained and accurately reflect the pre-defined problem waiting to be solved. These models are then evaluated to see how well they resolve this issue.
- Deployment - once an appropriate model has been made, it can be deployed in real-time. This could involve integrating this model into a web application, support system, or automated process.
- Customer acceptance - this stage is about confirmation. Proving that the deployed model and its components meet the customer's needs is an essential part of the customer acceptance process. Once confirmed, the project is handed off to the organization running the system.
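The modeling and acceptance steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a production workflow: the data points, the straight-line model, and the `ACCEPTABLE_ERROR` threshold are all hypothetical, and a real project would more likely use a dedicated machine learning library and a randomized split.

```python
from statistics import mean

# Hypothetical toy data: (feature, target) pairs.
DATA = [(1, 3.1), (2, 4.9), (3, 7.2), (4, 9.0),
        (5, 10.8), (6, 13.1), (7, 15.0), (8, 16.9)]

# Hypothetical acceptance threshold agreed with the customer.
ACCEPTABLE_ERROR = 0.5


def split(data, train_fraction=0.75):
    """Hold out the last part of the data for evaluation (sequential split)."""
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]


def fit_line(points):
    """Train: fit y = slope * x + intercept by ordinary least squares."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in points)
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


def mean_absolute_error(model, points):
    """Evaluate: average absolute gap between predictions and actual values."""
    slope, intercept = model
    return mean(abs(y - (slope * x + intercept)) for x, y in points)


train, test = split(DATA)
model = fit_line(train)
test_error = mean_absolute_error(model, test)

# Acceptance gate: only hand the model over if it meets the agreed threshold.
accepted = test_error <= ACCEPTABLE_ERROR
```

Evaluating on data the model has not seen during training is what gives the acceptance check its meaning; a model judged only on its own training data can look better than it really is.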
What industries need data scientists?
According to the World Economic Forum, data is the new oil in the 21st century. This shows just how vital data is to the transactions and exchanges we have within the world every day. Data is essentially money, and Vinamra suggests that data scientists are the backbone of businesses and their infrastructure, as data is utilized in every industry. “Whether in retail, finance, or banking, data is used daily.” Vinamra gave an example of where data is used: “When we are traveling from point A to point B, we get a lot of route options in Google Maps. These routes are optimized, which means that data science algorithms are running in the backend, which gives us these different recommendation options. In today’s world, all organizations are incomplete without the knowledge of data.” Vinamra expressed that data science is a coveted role in the industry as more and more companies are looking for qualified data scientists. Our expert provided some tips on how to pursue data science and what skills are essential when embarking on your data science career.
Data science at university
Our expert stressed that education is essential when pursuing a career in data science. “My education laid the foundation for understanding how the data science process works theoretically. The first part is understanding the theory and then applying that theory to the real world.” Vinamra studied for degrees in computer science and software engineering, which laid the foundations for him to understand the technical aspects of the discipline and apply that knowledge to real-life situations. Our expert said that one facet of computer science and information technology, software engineering, is the very foundation of data science. “If your software engineering skills are very sound and solid, you will surely be successful in the data science industry.” In his experience, software engineering was the first thing that helped Vinamra translate his business knowledge into technical specifications.
Our expert explained some necessary skills that are taught at university that can positively impact your success as a data scientist:
- Programming skills
- Time management
- Business understanding
If you plan on becoming a data scientist, you can pursue specific bachelor’s degrees that will help lay the foundations for a solid career.
- Computer Science
- Software Engineering
- Information Technology
You can pursue a master’s in data science or computer science to further specialize in the discipline. Vinamra said, “Once you know how to design an interface, how to put the data as an input and get the maximum return on investment as output, that is when you are called a complete data scientist.” Our expert explained that there are layers to data science that you can learn while at university. “The first layer of data science is software engineering, the second is data engineering, the third is machine learning engineering, and the fourth is data visualization and information insight analysis.”
What employers value
Having been in both the interviewer and interviewee position, Vinamra explained what he looks for in a candidate. “In my experience, I only select individuals with data science projects or portfolios as this validates their skills and knowledge.” If you want to become a data scientist, employers value practical, hands-on experience in the field. You can gain this valuable experience and build a portfolio by going to Kaggle or KDnuggets and practicing on small data sets. Vinamra told Cybernews Academy that with a degree in the relevant field, a polished portfolio, excellent communication skills, and a “problem-solving attitude,” you should succeed as a data scientist.
Data science is fast becoming one of the most coveted and lucrative fields in the information technology sector. You can break into the industry with hard and soft skills, a solid educational background, and hands-on experience. Vinamra provided some valuable guidelines for those wishing to pursue a career in data science.
- Formal education - pursue a computer science or information technology degree, as a formal education provides a solid foundation for professional development. Utilize online learning platforms that will help you develop critical tools and practices.
- Upskill - ensure you pursue other educational avenues like independent certifications to improve and adapt your skills.
- Hands-on experience - those who want to pursue this path will need to engage in personal projects that help them understand tools and develop the skills required for a career in data science.
- Networking - network with other professionals through conferences, online, or at your university. Ensure you build a solid online presence and establish yourself as a data scientist.
- Personal branding - having the tools, knowledge, and experience will only get you so far. Vinamra highlights the importance of charisma, tenacity, and communication skills, as data science is about effective communication.