data scientist

Extracting knowledge or insights from data has suddenly become the new trend. Everyone is talking about the potential of data science in all aspects of life. Universities are gearing up to introduce new curricula and degrees related to data science, and companies are building new data science competencies. New job titles such as data scientist, data engineer and data journalist are being introduced. There is a level of excitement about these things that did not exist earlier.

What has changed?

Data has been used for making decisions for a long time. There are many seasoned data analysts and engineers who have been working for decades to manage data and gain insights. There are many mature tools and techniques available. So what has changed, or has anything really changed at all? If something has really changed, is it a natural progression of technological advancements or something that will fundamentally disrupt the existing market?

Some of the factors that are contributing to this change are listed below:


Enhanced computing capabilities – Traditional offline companies processed huge amounts of data, but this wasn’t always time critical. End-of-day or even weekly data refreshes were very common, and there was little urgency to process huge amounts of data in real time. The new breed of online companies, however, collected huge amounts of data, and processing it as quickly as possible was key to their business model. This led to the emergence of a new set of technologies, popularly termed big data technologies.

Availability of new data to analyze – Computing devices embedded in everyday objects enable us to collect all sorts of data that can be leveraged for analysis. Traditionally, most of the data that companies analysed was collected through their existing information technology systems. Now, a lot of curated data is either available on the internet or can be generated and uploaded to it. People are getting comfortable sharing their personal data on the internet. Combining internal data with external data can lead to insights that were not possible before.

Advances in analytics – New technical advancements in analytics, such as machine learning methods and deep learning algorithms, allow us to get insights that were not possible earlier. Advancements in computer vision, speech recognition etc. have opened up many new application areas that did not exist a few years back. The newfound ability of machines to learn as things change has opened up possibilities to build predictive models that were previously too costly to maintain.

Open source technologies – Prominent traditional tools and technologies in this space charged organizations either by the number of users or by the amount of data processed. Open source technologies are free to use. They make it possible to process vast amounts of data at minimal cost, and to make the data available to an unlimited number of users without paying for each user license.

Cloud computing & outsourcing options – Cloud computing offers the flexibility to expand (or reduce) infrastructure capabilities as needed, without having to build dedicated infrastructure upfront. Modern ‘pay as you use’ analytical platforms make it possible for companies to leverage data for decisions without having to invest significantly in their own IT setups.

Globalization & changing markets – New markets are emerging, and they are quite different from the existing markets. We cannot afford merely to speculate about the differences, especially when we have methods available to gain insights into new markets through data. As changes are happening at a rapid pace, the version of truth is changing, forcing companies to learn from new data instead of relying on previous knowledge.

New application areas – Traditionally, data was collected, organized and analysed within company headquarters for taking key business decisions. A few years after the dotcom bust, strong new online giants like Google, Facebook, Amazon, LinkedIn, Netflix etc. emerged. These companies extended sophisticated data capabilities to their users. Modern features like recommendation engines, search engines etc. that use deep analytical techniques behind the scenes are becoming common.

On its own, each of these factors would have been termed a mere advancement in technology. But because all of these changes are happening at the same time, give or take a couple of years, together they could potentially disrupt the existing market.