Cloudera, a global provider of a secure data management and analytics platform built on Apache Hadoop and the latest open source technologies, announced new technology enhancements to its core platform that will make it easier for companies to use elastic, on-demand cloud infrastructure to gain significant business value from all their data. The company also announced it is generating significant momentum with customers running production environments on public cloud infrastructure.
The move to the cloud is a top priority for CIOs across the globe in 2016. According to the Gartner report “Survey Analysis: Cloud Adoption Across Vertical Industries Exhibits More Similarities Than Differences” (February 2015), IT spending on public cloud is growing at a five-year CAGR of 18% through 2018, further evidence that cloud spend is far outpacing overall IT spend with no signs of slowing down.
Another industry analyst, Tony Baer of Ovum, is on record saying that the cloud is where the next wave of Hadoop take-up is going to happen. More specifically: “Ovum believes that appliances and cloud deployment will drive the next major adoption wave of Hadoop and big data analytics.”
“We are using Cloudera Enterprise on AWS to capture and process thousands of critical events at scale to provide our banking clients with unique insights about their customers that drive revenue growth,” said Kaushik Deka, chief technology officer and director of engineering for Novantas. “In addition to their leadership on Hadoop and Spark, working with Cloudera also gives us the flexibility to deploy the application in the environment of our choice. We use Cloudera Director to launch on AWS, but we also maintain a hybrid environment for clients who don’t want their data in the public cloud.”
A significant number of enterprise companies, including Adecco, Airbnb, GoPro, Nielsen, Novantas, and others, are running Cloudera Enterprise on public cloud infrastructure. Reasons for deploying in a hybrid, multi-cloud, or single-cloud environment often include the desire to do the following:
- Reduce the cost associated with purchasing, configuring, and maintaining on-premises hardware required to run big data applications
- Increase the ability for data engineers and data analysts to respond to business problems through self-service provisioning
- Meet strategic objectives to “move to the cloud” to reduce a company’s owned data center footprint
“Helping our customers win in the cloud is a key strategic objective for Cloudera. Today our enterprise platform is uniquely positioned to support any kind of big data workload in the cloud, whether transient or long-lived, handling batch jobs in support of building data ingest pipelines or supporting advanced SQL analytics and complex event processing. We deliver true elasticity, scaling to handle workloads on demand, and offering consumption-based pricing that users expect in the cloud,” said Mike Olson, co-founder and chief strategy officer of Cloudera. “Delivering this customer success requires providing companies with choice in where they run their workloads. They need the ability to react quickly to changing business demands on a platform in a manner that’s secure and meets strict guidelines for data governance.”
Cloudera continues to drive customer success in the cloud by enabling production-ready big data analytics optimized to run across modern IT environments. Cloudera Enterprise 5.8 enables customers to run Apache Impala (incubating) against popular cloud-native object stores including Amazon S3. This means customers can now run high-performance SQL analytics and BI workloads on data in Amazon S3 without having to transform or move that data to another location on Amazon Web Services (AWS). Today, Cloudera customers can also use processing and query engines Apache Hive, Apache Spark, and Hive-on-Spark (typically 3x faster than Hive on MapReduce) directly against data in Amazon S3.
For Microsoft Azure customers who use Power BI Desktop to build reports, Microsoft recently announced a new preview connector for Impala. This allows customers to use the speed of Impala to pull large volumes of data, of varying types and sizes, into their analytic dashboards and make that data accessible to any number of users.
“We continue to see sizeable demand for Cloudera Enterprise, as Azure customers realize the impact that big data analytics in the cloud can have on their business,” said Jeana Jorgensen, general manager at Microsoft. “The Microsoft and Cloudera engineering teams worked closely to develop the new Director plugin for Azure, and we’re excited to provide our customers with a fast, easy way to deploy and manage the lifecycle of Cloudera Enterprise on Azure.”
Another growth area for cloud is the interest in multi-cloud and hybrid architectures. Increasingly, companies want to be able to run certain workloads on-premises and other workloads in the cloud, whether for additional scale, for development and testing, or to comply with service-level agreements or industry regulations. At the same time, customers want to reduce their risk by not locking their data into a specific cloud service offering.
Cloudera Director makes it easier for customers to deploy and manage the lifecycle of Cloudera Enterprise clusters across cloud environments. Customers can select from templates for AWS, Google Cloud Platform, and now Microsoft Azure to rapidly provision, grow, shrink, and terminate clusters, and they can monitor and manage all clusters from a single unified interface. Additional features of Cloudera Director now include:
- Integrated usage meter with automated billing for a pay-as-you-go computing experience, alongside node-based pricing in the cloud
- Ability to deploy into multiple regions and availability zones from a single Cloudera Director instance
- Deployment of Cloudera Director via the Azure Marketplace (coming soon)
- Support for spot instance and preemptible instance provisioning