By: Andy Sen, CTO, Edureka
Earlier we looked at Big Data (collections of large and complex data sets that are hard to process using traditional data processing applications) and some of its applications, such as Predictive Analytics. More often than not, Big Data applications are powered by the Cloud.
What is Cloud Computing?
As popular as the term Cloud Computing may be, the answer depends on who you ask!
“The interesting thing about cloud computing is that we’ve redefined cloud computing to include everything that we already do. We can’t think of anything that isn’t cloud computing with all of these announcements. The computer industry is the only industry that is more fashion-driven than women’s fashion” (Larry Ellison, Oracle)
Forrester says Cloud Computing is “A form of standardized IT-based capability — such as Internet-based services, software, or IT infrastructure — offered by a service provider that is accessible via Internet protocols from any computer, is always available and scales automatically to adjust to demand.”
Essentially, we use the generic term Cloud to cover IaaS, PaaS and SaaS providers that offer some degree of virtualization, scalability and redundancy. These providers have an inherent ability to tolerate low-level failures, but they are not necessarily immune to failures arising from natural disasters (Hurricane Sandy flooded several data centers in New York) or more mundane human errors (a configuration change caused a massive Amazon EC2 outage in 2011).
Corporations currently leverage the cloud ecosystem through some variant of the social model, in which the company provides the platform and the users provide the content. Users play a pivotal role in driving interactivity and sustaining the content ecosystem. This ecosystem is set to change drastically as billions of semi-autonomous devices come online.
Cue Internet of Things (IoT)
Did you know that the traffic congestion overlay on Google Maps is essentially crowdsourced movement data from Android phones? (Yes! Your Android phone is transmitting this data even if Google Maps is off.) Now imagine if your fridge placed an order for milk when it ran low, or your microwave could scan a food item and know exactly the duration and power level needed to cook it. IoT refers to this network of devices or “things”, embedded with electronics, software, sensors and connectivity, which enables them to deliver greater value and service by exchanging data with the service provider or other connected devices.
Having devices that are typically not networked talk to other devices (without a human in between) is what makes IoT so trendy. Coupled with a renewed focus on network-enabled consumer devices (from electricity meters to video doorbells), IoT opens up several avenues for businesses to expand their product and service offerings.
IoT allows companies to harvest data on an unimaginable scale. You no longer need to use a laptop or tablet to be tracked. Your smartphone knows where you are and the locations you visit. Your smart set-top box knows what you watch and when. Your smart fridge knows what food you buy and when you eat. And cloud-based PaaS/SaaS providers, with their elastic storage and computing capabilities, provide the perfect platform on which all these streams of unorganized data can be loaded, crunched and analyzed. Privacy advocates may be up in arms about this, but the average consumer will gladly trade off their privacy (suitably anonymized) for a new platform (e.g. Facebook) or a subsidized service (e.g. ad-supported Gmail with gigabytes of storage, when most email providers gave megabytes).
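To make the crowdsourcing idea concrete, here is a minimal Python sketch of how location pings from many phones could be rolled up into a congestion view. It is only an illustration and not how Google Maps actually works: the segment names, the sample speeds, the `congestion_by_segment` function and the 20 km/h threshold are all assumptions made up for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical pings reported by phones: (road_segment_id, speed_kmh).
pings = [
    ("main_st", 12.0),
    ("main_st", 8.0),
    ("main_st", 10.0),
    ("highway_1", 95.0),
    ("highway_1", 105.0),
]


def congestion_by_segment(pings, slow_threshold_kmh=20.0):
    """Average the reported speeds per segment and flag the slow ones."""
    speeds = defaultdict(list)
    for segment, speed in pings:
        speeds[segment].append(speed)

    report = {}
    for segment, values in speeds.items():
        avg = mean(values)
        report[segment] = {
            "avg_speed_kmh": avg,
            # Assumed rule of thumb: below the threshold counts as congested.
            "congested": avg < slow_threshold_kmh,
        }
    return report


report = congestion_by_segment(pings)
for segment, stats in report.items():
    print(segment, stats)
```

No single phone says much here; the value comes from aggregating many devices reporting on the same stretch of road.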
Big Data & Analytics and IoT: Tying it all together
IoT devices are a continuous source of rich streams and snippets of data. The rapid growth in the number of connected devices will give rise to an exponential increase in the data that an enterprise is required to manage. The elastic platforms of cloud providers will need to continue to scale to handle such streams.
Tech analysts opine that Software Defined Storage (SDS), with its ability to be deployed on different hardware and its support for rich automation, will extend its reach into cloud deployments and build a data fabric that spans on-premises systems and public clouds. With data spread across so many devices and locations, the more we try to marry IoT and Big Data to derive business intelligence, the more focused we need to be on improving compliance, security and interoperability.
Storing this huge volume of data in a well-managed cloud is only the first step. Big Data and its associated techniques, such as Predictive Analytics, provide the analytical engines to process the data and extract meaningful insight, but we also need newer algorithms and open-source software frameworks to harness data that is more complicated. Apache Mahout, a machine learning library built to run on Apache Hadoop (an open-source and very popular Big Data framework), allows us to utilise Machine Learning algorithms such as Recommendation Mining (taking users’ behaviour and trying to find items users might like). Similar frameworks need to be designed and learnt to extract the maximum business information from this accumulated data.
Without the Cloud, analyzing massive amounts of data and connecting devices at this scale would have remained a distant dream. With every individual embracing smart devices as a part of their daily life, new data will be piped in at a tremendous rate. We cannot afford to ignore this data, which can yield important business insights. Cloud-based IoT platforms are an essential element in storing such data, and they will play a key role in the initial phase of IoT and Big Data adoption by enterprises.
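For readers who have not seen Recommendation Mining before, the idea can be sketched in a few lines of plain Python. This is not Mahout's API, only a small illustration of the "people who bought X also bought Y" approach; the purchase histories and the `build_cooccurrence` and `recommend` functions are hypothetical names invented for this example.

```python
from collections import Counter
from itertools import permutations

# Hypothetical purchase histories: user -> set of items bought.
histories = {
    "alice": {"milk", "bread", "eggs"},
    "bob": {"milk", "bread"},
    "carol": {"milk", "eggs", "butter"},
    "dave": {"bread", "butter"},
}


def build_cooccurrence(histories):
    """Count how often each pair of items shows up in the same history."""
    co = {}
    for items in histories.values():
        for a, b in permutations(items, 2):
            co.setdefault(a, Counter())[b] += 1
    return co


def recommend(user, histories, co, top_n=2):
    """Score items the user lacks by co-occurrence with items they have."""
    owned = histories[user]
    scores = Counter()
    for item in owned:
        for other, count in co.get(item, {}).items():
            if other not in owned:
                scores[other] += count
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


co = build_cooccurrence(histories)
print(recommend("bob", histories, co))
```

A production framework does the same kind of counting, but distributed across a cluster and over millions of users, which is where a Hadoop-style platform earns its keep.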