With the real-time pace of business and the growth of connected devices, data is becoming a strategic asset to drive organizational decision making, says Murthy Mathiprakasam, Principal Product Marketing Manager, Big Data, Informatica. In an interaction with Dataquest, he shares how organizations can foster a data-driven decision-making culture while ensuring data security. Excerpts:
How is the richness of big data disrupting the analytics infrastructure and elevating the importance of next-generation data integration and data preparation?
As the pace of business increases and organizations face more competitive pressure than ever, data becomes a strategic asset to drive organizational decision making. In a world that is increasingly interconnected, with powerful devices and nearly free storage, the opportunity to leverage data to achieve greater analytical aspirations has never been so compelling.
Organizations are using data to grow by gathering insights to drive profitability, innovation, customer satisfaction, and competitiveness. Data is also being used to mitigate risks from fraud, crime, system downtime, or security breaches. With new data platforms such as SAP HANA and Hadoop emerging, organizations have the opportunity to become successful with analytics in a new big data world.
According to you, is data trustworthiness emerging as an area of concern in light of the massive volume of new data being generated today? How can organizations foster a data-driven decision-making culture?
With the growing velocity of data and the near real-time pace of business, organizations are struggling to deliver data fast enough for analytical users to make effective decisions. Even with many storage and processing options now available, organizations still need to prioritize what data they want to capture. They need to look at their business goals and map those priorities to the data they may need. Data from social media and sensors often contains time-sensitive insights that must be acted on within short time frames. Data consumption by data scientists, and by new event-driven systems such as next-best-action systems, is placing a greater demand on data management organizations. Timely delivery of data to analytical users is another critical aspect of becoming a decision-ready organization.
Can you share your views on how the perimeter-less world of pervasive computing is disrupting the security infrastructure?
There is clear value in using technologies such as Hadoop to build next-generation ‘data lakes’ for collecting, preparing, and analyzing greater volumes and types of data. Enterprises are augmenting their traditional data warehousing architectures to include Hadoop, both as a more efficient and scalable preparation stage and as a place to offload less frequently used data into easily accessed archives. As more enterprise data moves into Hadoop for collection, preparation, and analysis, concerns around security and governance can, and will, arise. Compliance-sensitive industries, such as healthcare or financial services, and other consumer-driven industries, such as retail or CPG, are legally obligated to ensure strict controls on the use of data. These controls must also apply to new data platforms, such as Hadoop.
Built-in data security and data governance ensure that the data ingested into Hadoop can be prepared into trusted data for big data analytics. It’s not enough to just block off access to data; you need to build security into the data itself through data masking and data encryption. Once the data is anonymized, you can gather numerous insights at the aggregated level, such as the patterns and trends that you see within certain demographic groups.
How are the economics of cloud computing disrupting the computing infrastructure and driving demand for cloud integration?
With the growing volume and variety of data, organizations are struggling to deliver complete and trustworthy data to analytical users. Newer cloud-based applications such as Salesforce and Workday are shifting data integration requirements to be more native to the cloud. Newer sources of data, such as social media or sensors, are also particularly suspect due to their inherently unstructured nature. These raw datasets can often be incomplete, inconsistent, or insecure, leading to poorly informed decisions or a risk of compliance failure.