By Gaurav Vohra, CEO & Co-founder, Jigsaw Academy
The science of identifying, gathering, and analysing data, as well as the technologies used to perform these functions, depends on the type of data available and its characteristics. Volume is one defining characteristic, as evidenced by the expansion of data measurement units (kilobytes giving way to megabytes, megabytes to gigabytes, and gigabytes to terabytes and petabytes). Data has also become extraordinarily varied in terms of output, point of origin, and applicability (from diagnosing and treating medical conditions to business intelligence).
Given such seismic shifts in the production and circulation of information in the digital realm, it is no surprise that reskilling is now part and parcel of a career in data science. Unlike upskilling, which involves strengthening or upgrading existing skillsets, reskilling refers to the process of acquiring a whole new set of skills. Reskilling is needed because, as data and data science technologies evolve, so do the demands they make on us and the kind of mastery they require.
CrowdFlower, a data science crowdsourcing platform, analysed 3500 job listings on LinkedIn to see what the most in-demand data analytics skills of 2016 were likely to be (experts at other tech publications echoed its findings). Here is the list.
1. SQL: This was listed in more than half of all the job advertisements that CrowdFlower reviewed.
2. Hadoop: The surge in big data generation and analytics has positioned this powerful, open-source software at the top of companies’ ‘must-have’ lists. Hadoop stores and processes sprawling data sets in a distributed fashion, spreading the work across many machines. This, coupled with its customisability, allows it to accommodate and manipulate large, varied stores of data with ease.
3. Python: This programming language is snaking its way into the data analytics mainstream. It is characterised by its ease of programming (a given task typically takes fewer lines of code) and demonstrates an R-like brilliance with data mining. It is also very easy to learn.
4. Java: Continues to be popular with program developers because of its flexibility (attributable to its Write Once Run Anywhere or WORA set-up). It is also the language Hadoop is written in, making knowledge of it mandatory for anyone intending to work with the Hadoop suite. It comes as no surprise then that the TIOBE index named Java its “Programming Language of the Year” for 2015.
5. R: A great programming language that enables the statistical exploration of data sets, and the creative visualisation of the output.
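A tally of this kind can be sketched in a few lines of Python. This is only an illustration of the general idea, not CrowdFlower's actual method, which the source does not describe: the keyword list, the whole-word matching rule, the function name `count_skill_mentions`, and the sample listings are all assumptions made up for the example.

```python
import re
from collections import Counter

# Assumed keyword list, taken from the five skills named above.
SKILLS = ["SQL", "Hadoop", "Python", "Java", "R"]


def count_skill_mentions(listings, skills=SKILLS):
    """Count how many listings mention each skill at least once."""
    counts = Counter()
    for text in listings:
        for skill in skills:
            # Match the skill only as a standalone token, ignoring case,
            # so "Java" is not counted inside "JavaScript" and "SQL" is
            # not counted inside "MySQL".
            pattern = r"(?<![\w+#])" + re.escape(skill) + r"(?![\w+#])"
            if re.search(pattern, text, flags=re.IGNORECASE):
                counts[skill] += 1
    return counts


# Hypothetical listings, invented purely for illustration.
listings = [
    "Data analyst with strong SQL and Python skills",
    "Big data engineer: Hadoop, Java, SQL",
    "Statistician fluent in R and SQL",
    "Front-end developer (JavaScript)",
]

counts = count_skill_mentions(listings)
for skill, n in counts.most_common():
    # Share of listings that mention the skill
    print(f"{skill}: {n} of {len(listings)} listings ({n / len(listings):.0%})")
```

Counting each listing at most once per skill keeps the result comparable to a statement such as "listed in more than half of all the job advertisements". A real analysis would also need to handle synonyms and variant spellings, which this sketch ignores.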
Adtech’s David Ramel made some interesting observations while reviewing this list. He noted that the dominance of SQL and Hadoop in the data analytics job market indicates that data storage technologies continue to be in great demand. Explaining the importance of such technologies, Contel Bradford from Storagecraft.com writes that “data must be kept somewhere before it’s even sorted, processed, and analyzed, making storage critical to even getting these huge data-driven initiatives off the ground.”
As for Python’s popularity, Ramel surmises that it “reflects the fact that data scientists are spending more and more of their time enriching their data rather than analyzing it.”
There you have it: a list of the most in-demand data analytics skills of 2016, the skills that employers most want their analysts to have. If you have yet to learn them, keep an eye out for courses that teach these skills so as to stay comfortably ahead of the curve and retain the upper hand.