Big Data Analytics – The current scenario

By: Neeraj Sabharwal, Director of Cloud and Big Data Solutions, Xavient Information Systems

Big Data has moved beyond being just a buzzword, as businesses have started using Big Data technologies to extract more value from their data.

Let me share an interesting scenario about an acquaintance. John is obsessed with collecting promotional codes, as he is addicted to online shopping. The last thing he wants to do is physically visit a store to buy anything. He is aware that e-retailers are tracking his online activities and therefore know his shopping patterns. He also knows that a single click on any product on display will lead the retailer to build a recommendation list of similar products, and that is what makes his life easier. The challenge for retailers is to bring people like John into the store to drive in-store traffic, while not losing customers like him just because he is getting a better offer from competitors. Later in this article, we will explore how retailers are addressing this and how the same solution applies to almost every vertical.

Google has transformed the way computing works, and there is no doubt that, as a company, Google has launched many products and delivered many successful open source projects. In the same way, the white paper on MapReduce published by Google introduced us to a brand new way of doing computing, data storage and processing.

Let’s go back 10-15 years and think about the way we used to store and process data. Developers built an application or front end on a web server, with a backend database to store the data written by the application. We then copied the data from this live system into reporting databases to run business reports and analysis. Within a couple of years, data volumes grew and reporting tasks slowed down. We added more hardware to increase computational power, and when we reached the maximum limit in memory and CPU, we started buying more servers. Now the challenge was to move data into these new systems. This whole exercise of vertical scaling ended up creating several silos of data, and maintenance costs rose along with license costs. Thus, vertical scaling failed because it could not keep up with data growth and was expensive too. It also became impractical to store all the new data types in traditional relational databases.

In parallel, Google’s MapReduce paper started a new revolution in the technology space, and a new technology was born: Hadoop, based on horizontal (scale-out) computing. In addition to Hadoop, there was another important addition: NoSQL databases.
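To make the MapReduce idea concrete, here is a minimal single-machine sketch of the classic word count, written in plain Python. It is not Hadoop code and does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration. In a real cluster, the map and reduce steps would run in parallel on many machines, which is what makes the model scale horizontally.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    A real framework does this across the network between mappers and reducers.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce in sequence on a list of text records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big value", "data in motion"]))
```

Because each record is mapped independently and each key is reduced independently, the work can be split across as many servers as needed instead of one ever-larger machine.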

There is no doubt that smartphones have changed the way we do business. For many tasks there is no need for a browser anymore, and for businesses it is now easier to track customers because of location services. For instance, Starbucks knows the moment you step into one of their cafés, and they don’t miss the chance to remind you about their special offers for the day. Similarly, retailers know when John is in the mall or near their store, and he starts receiving in-store purchase offers. We already know how obsessed he is with promotional offers. This event, generated by the retailer in real time, lures him into the store.

Now, the question is, how does this happen?
A perishable insight is an insight that provides incredible value while an event is happening, because its value expires once the moment has passed. In our example, the location services on John’s smartphone transmit data in real time, and an action is generated while the data is in motion. In this case, the action is the promotional offer sent to him to get him into the store. Then, once the data is at rest, in-depth analysis is done to generate more value from John’s behavior.
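The following is a rough sketch of that idea, not a description of how any particular retailer does it. It assumes a hypothetical store location, a 200-metre radius and a 60-second freshness window (`STORE`, `RADIUS_M` and `MAX_AGE_S` are all invented for illustration). Each location event is acted on only while it is fresh (data in motion), and every event is also kept for later analysis (data at rest).

```python
import math
from dataclasses import dataclass


@dataclass
class LocationEvent:
    customer: str
    lat: float
    lon: float
    timestamp: int  # seconds since epoch


# All three values below are assumptions made up for this example.
STORE = (40.7580, -73.9855)  # hypothetical store latitude and longitude
RADIUS_M = 200.0             # how close the customer must be, in metres
MAX_AGE_S = 60               # after this many seconds the insight has "perished"


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in metres."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def process(events, now):
    """Act on fresh, nearby events and keep every event for later analysis."""
    offers = []   # actions taken while the data is in motion
    history = []  # data at rest, available for in-depth analysis later
    for event in events:
        history.append(event)
        fresh = now - event.timestamp <= MAX_AGE_S
        near = distance_m(event.lat, event.lon, *STORE) <= RADIUS_M
        if fresh and near:
            offers.append(f"In-store offer for {event.customer}")
    return offers, history
```

In production, the loop would typically be replaced by a streaming engine consuming events from a message queue, but the decision logic stays the same: act only if the event is both recent and relevant.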

Therefore, businesses today want to generate the right information at the right time for the right people. This is called the 3R principle. The growing demand for data scientists proves that businesses want to move beyond merely running data analysis or reporting. There is high demand for predictive analysis and for recommendation engines built on data collected from various sources, such as wireless sensors, smartphones, social media and clickstream data. From a technology standpoint, the cloud (public, private and hybrid) and technologies like Hadoop, Cassandra, Elasticsearch, Apache Kafka, Apache Spark, Apache Storm and many other streaming engines, in combination with other open source technologies, play a significant role in gathering perishable insights.

2016 is the year of Artificial Intelligence. If you have played video games or received an email from your credit card company asking you to verify a purchase, then you are part of the AI revolution. Companies are working on running smart and accurate analysis on data with the help of fast computers. The convergence of Big Data with AI is certain, as the automation of smart decision-making is the next evolution of Big Data. It is being said that there will be no Big Data without AI. Come to think of it, given the pace at which data is growing, we will definitely need to add this intelligence layer to Big Data in order to perform complex tasks at a much higher pace than humans can imagine.

There are tons of startups, as well as established brands like Google and Facebook (FAIR), that have invested in home-grown projects to make their systems smarter and to deliver better value from data through better intelligence and analytics. Companies are also implementing cloud solutions alongside Big Data technologies as an efficient way to scale resources to store and process data. Many tech organizations are gathering and analyzing perishable insights for firms, taking them further in the race for personalization and customer satisfaction.

It’s still a humble beginning for Big Data, and it is said that by 2020 there will be only data businesses or no businesses at all. Now that will be an interesting trend to follow, won’t it?
