IBM and Hortonworks announced an expansion of their relationship focused on extending data science and machine learning to more developers and across the Apache Hadoop ecosystem. The companies are combining Hortonworks Data Platform (HDP) with IBM Data Science Experience and IBM Big SQL into new integrated solutions designed to help everyone from data scientists to business leaders better analyze and manage their mounting data volumes and accelerate data-driven decision-making.
The news builds on the long-standing relationship between the companies and includes the following:
• Hortonworks will resell the IBM Data Science Experience with HDP, a leading Hadoop distribution, and adopt it as its strategic data science platform, giving developers a fast on-ramp to data science capabilities including machine learning, advanced analytics and statistics. Also, Hortonworks and IBM will create new solution bundles that integrate HDP with IBM Big SQL, IBM’s SQL engine for Hadoop, giving Hortonworks’ legions of clients and users a familiar method of managing their data.
• IBM is adopting HDP for its Hadoop distribution and will fully integrate it with Data Science Experience and Machine Learning. As a result, the combined solution will give users the rich data security, governance and operations functionality provided by HDP together with the advanced analytics and management of the Data Science Experience. IBM will migrate existing IBM BigInsights users to HDP.
IBM Data Science Experience provides a set of critical tools and a collaborative environment through which analysts and developers can create new analytic models quickly and easily. For example, IBM Machine Learning, found in the Data Science Experience, can make building and deploying analytic models for application development twice as fast, according to IBM testing.
Hortonworks announced Hortonworks DataFlow (HDF™) for IBM Power Systems. HDF, the industry’s only data ingest, stream processing and streaming analytics platform built entirely on open source software, is designed to enable customers to collect, curate, analyze and act on all data in real-time, across the data center and cloud. Combined with IBM Power Systems, customers can gain access to industry-leading performance and efficiency for streaming analytics. HDF is complementary to HDP and is designed to accelerate the flow of data in motion into HDP to support full fidelity analytics.
Partnering On Apache
As part of their wide-ranging partnership, the companies will also team to advance the development of Unified Governance (IBM BigIntegrate, IBM BigQuality and IBM Information Governance Catalog) on the Apache Atlas open platform. Atlas provides a scalable governance platform for Enterprise Hadoop which is designed to help developers model new business processes and data assets quickly and easily. Through their work, both companies plan to help advance Atlas from its current Incubator status to Apache Top Level Project status, where projects are typically released for open development and deployment.
In addition to Atlas, the companies will also partner on the advancement of Apache Spark, the open source framework for processing and analyzing large data sets across clustered environments. The companies will also collaborate to advance the Apache Hadoop framework itself, working to unify access to multi-vendor, heterogeneous data environments across data warehouses and databases – ultimately aiming to simplify the environment for better value from all data.