By Mikhail Mew, Researcher, Investor, Data Scientist
As artificial intelligence continues to advance by leaps and bounds, access to data science at the grassroots level has become increasingly democratized. Traditional barriers to entry, such as a lack of data and computing power, have been removed: new data start-ups keep appearing (some offering access for as little as the price of a cup of coffee a day), and efficient cloud services eliminate the need for expensive on-site hardware. Rounding out the trinity of prerequisites is the skill and know-how to implement, which has arguably become the most ubiquitous part of data science. You don’t have to look far to find online tutorials touting taglines like “implement X model in seconds” or “apply the Z method to your data in just a few lines of code”. In the digital world, instant gratification has become the name of the game. While improved accessibility is not detrimental at face value, beneath the dazzling array of software libraries and shiny new models, the real purpose of data science becomes blurred and sometimes even forgotten. That purpose is not to run complex models for their own sake or to optimize an arbitrary performance metric, but to serve as a tool for solving real-world problems.
A simple but relatable example is the Iris dataset. How many people have used it to demonstrate an algorithm without sparing a thought for what a sepal is, let alone why its length was measured? These may seem trivial considerations to a novice practitioner who is more interested in adding a new model to their repertoire, but they were anything but trivial to botanist Edgar Anderson, who catalogued those attributes to understand the variation in Iris flowers. While this is a contrived example, it demonstrates a simple point: the mainstream has become more focused on “doing” data science than on “applying” it. However, this misguided trend is not the cause of the data scientist’s decline, but a symptom. To understand the origins of the problem, we need to step back and take a bird’s-eye view.
Data science has the curious distinction of being one of the few fields of study that leaves the practitioner without a domain. Pharmacy students become pharmacists, law students become lawyers, accounting students become accountants. So, by the same logic, data science students become data scientists? But scientists of what? The broad applicability of data science proves to be a double-edged sword. On the one hand, it is a powerful toolkit that can be used in any industry where data is generated and captured. On the other hand, the general applicability of these tools means that the user rarely has real knowledge of the industry in question beforehand. Nevertheless, the problem was insignificant during the rise of data science, as employers rushed to take advantage of this emerging technology without fully understanding what it was or how it could be integrated into their company.
After almost a decade, however, both companies and the environment they operate in have evolved. They now strive for data science maturity, with large, entrenched teams measured against established industry standards. The pressing hiring demand has shifted to problem solvers and critical thinkers who understand the company, the industry concerned, and its stakeholders. The ability to navigate a few software packages or regurgitate a few lines of code is no longer sufficient, nor is a data science practitioner defined by the ability to code. This is evidenced by the growing popularity of no-code autoML solutions such as DataRobot, RapidMiner and Alteryx.
What does this mean?
Data scientists will be extinct in 10 years (give or take), or at least the role title will be. Going forward, the skill set collectively known as data science will be carried by a new generation of data-savvy business specialists and subject matter experts who can infuse analysis with their deep domain knowledge, whether they can code or not. Their titles will reflect their expertise rather than the means by which they demonstrate it, be they compliance specialists, product managers, or investment analysts. We don’t have to look far back to find historical precedents. When the spreadsheet was first introduced, data entry specialists were highly sought after, but today, as Cole Nussbaumer Knaflic (author of “Storytelling With Data”) aptly points out, proficiency with the Microsoft Office suite is a bare minimum. Before that, the ability to touch type on a typewriter was considered a specialist skill, but with the accessibility of personal computing it too has come to be assumed.
Finally, for those considering a career in data science or just starting their studies, it may be helpful to keep returning to the Venn diagram you will no doubt encounter. It describes data science as the intersection of statistics, programming, and domain knowledge. Although each occupies an equal share of the intersecting area, some may warrant a higher weighting than others.
Disclaimer: The views expressed are my own, based on my observations and experiences. It’s okay if you disagree; productive discussion is welcome.
Bio: Mikhail Mew is a researcher, investor, and data scientist as well as a curious observer who offers ideas and insights on the intersection of investment and machine learning.
Original. Re-posted with permission.