Datalaxy: A Constellation, That Can be Chaos, Unless…



As utopian as a world built on the fabric of data sounds, we cannot ignore its underbelly of privacy worries, security dangers and interoperability predicaments. Are we overthinking some Black Holes?


Scenario 1: Enter the NASA observatory that Greg Robinson walked into after being appointed director of the Webb program. After many giant struggles, the ambitious dream of sending the telescope into orbit around the sun had been ignited again, and this time it rested on his shoulders. It was a $10 billion project with a team of 10,000 people. There was no room for error or oversight now. And that is what Robinson proved when he attacked problems like the one buried deep in a bunch of loose screws and washers. Webb’s sunshield, a crucial structure that protects the observatory from extreme temperatures, was supposed to unfold by itself in outer space. So if the fasteners on the sunshield come loose, it is not just one screw gone wrong. It is a scary domino effect, delaying the launch date by months and ballooning the costs by millions. The answer: a blanket audit of the entire system, with no angle left unchecked. The paperwork was checked against the hardware at every level. So were the specifications, drawings, purchase requests and other data entries. And thus the problems were fixed, leading all the way to the big Christmas launch of Webb on Ariane 5.

Scenario 2: In the recent Uber Files Pandora’s box, one can notice, among other surprising emails, a conversation about a ‘kill switch’. It looks like the ride-hailing app came up with this switch in an attempt to throw transport officials, authorities and cops off the scent during investigations. As per the leaked documents revealed to the media, it worked whenever a raid on any of the app’s offices (in France, Belgium, the Netherlands, India, Hungary, etc.) was about to happen. Apparently, instructions were sent to IT staff to cut off access to the company’s main data systems so that evidence could be swept under the carpet. What’s even more surprising is that a spokesperson explained, as per media reports, that the switch software was discontinued in 2017 and should never have been used to thwart regulatory action.

Scenario 3: In a 2020 study, data activist Anja Kovacs and researcher Tripti Jain of the Internet Democracy Project identified the current perception of data as a resource as one of the crucial problems plaguing existing consent regimes. They also demonstrated that data increasingly functions as an extension of, or is even integral to, our bodies. A new revolution is brewing, one that brings to the fore the privacy abuse and safety issues that women, in particular, face as they see the social media or other tech-use data they shared for a service being exploited against them.


“When you know to use data for customization and speed, it works wonders and helps to improve profitability even with a relatively smaller number of screens compared to the biggies in some cities.”

—  Dhawal Mehta, GM-IT, Miraj Cinemas


Can you see the common thread stitching all these scenarios together? Whether it’s a telescope, a corporate misdemeanor or a women’s safety issue, it all starts, and ends, with one word: Data.

Goodbye Fossils. Hello Data

Yes, whether or not we are able to see and decipher it, data is going to rule everything about our imminent future. Collecting, sharing, collaborating and building on data will be the new blue that covers our planet soon. Data used to be the ‘new fuel’. It is turning out to be the ‘new water’. It will soon be hard to conceive of a world where we can live without it.


Consider some glimpses.

Miraj Cinemas is able to offer personalized services and unprecedented ease and speed in F&B offerings (right down to the seat level) by using data from QR codes and ticketing. Dhawal Mehta, GM-IT, Miraj Cinemas, is upbeat about using technology to gain an edge over industry players with bigger footprints in scale and number of screens. “When you know to use data for customization and speed, it works wonders and helps to improve profitability even with a relatively smaller number of screens compared to the biggies in some cities.”

Dr. Sumit D Chowdhury, CTO, Wadhwani Institute of Technology and Policy and Founder of Gaia, has a lot to share from his hands-on experience with smart cities. He cites examples like offline Data Repositories, which help in understanding our environment. “The IUDX is a great example of an Open Data platform where cities are collaborating with their data to be analysed by anyone with the right use-case. There are other use cases available with Public Health, MNREGA and many other data-sets. But in the true sense these are not real-world models. They are just aggregated data from specific sources open for collaboration.”


In a smart project, the role of data becomes even more crucial. As Dr. Chowdhury explains, “Anything ‘Smart’ is, by my definition, a learning and evolving thing. A Smart City is a learning and evolving city; a Smart Project, by extension, is a project/infrastructure that is designed to be flexible, to have a learning mechanism that will allow its rules to evolve. The only way to do it is to collect data before, and during, the project; and create inbuilt mechanisms of data collection in the infrastructure. The Data is then used to build a robust model that represents the universe. This model is then rolled out during the Smart Project to allow it to evolve over time with constant flow of information between the Data-verse Digital Twin and the Real World. Without data, all this is just not possible.”

Look at Mahindra Finance. The organization trained 3,000 employees across departments such as human resources and marketing in data-based decision making, to meet data governance needs and accelerate business growth. It is now on a mission to bring all 16,000 of its employees onto Tableau, an analytics solution.


“We need to know that today our place in the World-Wide-Web is very public. Where we are going, what we are searching, what we are buying is already public and is being monetized in very interesting ways. The onus of ensuring privacy is therefore with the individual and not the systems.”

— Dr. Sumit D Chowdhury, CTO, Wadhwani Institute of Technology and Policy and Founder, Gaia

Similarly, look at Jaguar Land Rover (JLR), automobile brands with a long history of success, which has driven business transformation by scaling analytics across the entire enterprise with a strong data culture. By democratizing access to data and Tableau, JLR has been able to navigate supply chain issues and proactively manage risk. It recently tripled the number of Creator licenses and currently uses Tableau in every function across the organization.


Or look at BHFL (Bajaj Housing Finance Limited), a home mortgage company based in India, which faced data challenges across three key divisions. HR needed to automate key processes around headcount, turnover and learning & development. Its Builder Group required a better monitoring system for properties and customers, and Operations lacked clear identification of the key metrics to use in the business. Once it invested in technology to automate manual tasks and visualize static reports, it could create dynamic dashboards with evaluation metrics for immediate analysis.

Achim Granzen, Principal Analyst at Forrester, drills deep into the role of data for enterprises in the future, when we enter a data universe scenario. “Enterprises today are still hard at work realizing and accelerating the data to insights to action cycle, focusing on leveraging primarily their internal data (with external augmentations). To be successful with this, they need to mature capabilities around data integrity, data democratization, data literacy, data collaboration, and data governance, risk and compliance. In my view, over the next few years enterprises will be focusing on getting the data to insights to action cycle right. Digital natives, who have data at the center of their existence (think about super-apps such as Grab or Gojek), will expand their data assets by continuing to add services and apps to their portfolio.”

Data, in all probability, could keep the world hydrated and fluid in a new way.

But would everyone get access to this water? And how safe would it be? It’s not easy to toss away some realities as we paint this new picture. As companies, and people, share more and more data—at never-before speed and ease—would we completely throw privacy out of the window? And would we, at the same time, open a bigger window than ever for attackers and security breaches? Also, how can a data-powered world work when collaboration is the key but interoperability is still hard at the level of intention, practice and execution?

From Data Ponds to Data Swamps

Let’s talk about interoperability challenges first. Because if interoperability does not happen, everything about this new world would, eventually, be a mess, and not a beautiful one.

Laura Petrone, Principal Analyst in the Thematic team at GlobalData, opines that interoperability of data, consumer data in particular, has long been advocated as a way to give users more control over their data across digital platforms, for example by requiring companies to hand over personal data to competitors so that users do not lose what they have built up on a platform. “This scenario would be especially desirable from a data privacy perspective as it would allow users to manage, through a single service, the personal data they hold and share, or try out an innovative digital service that uses their information in a new way. However, a standard on interoperability has never taken off and I don’t see how this will be achieved with enterprise data anytime soon.”


“Over the next few years enterprises will be focusing on getting the data to insights to action cycle right. Digital natives, who have data at the center of their existence (think about super-apps such as Grab or Gojek), will expand their data assets by continuing to add services and apps to their portfolio.”

—  Achim Granzen, Principal Analyst, Forrester

There are many other struggles with Data.

Like Privacy.

Data protection and privacy regulations such as the EU’s General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), and Singapore’s Personal Data Protection Act (PDPA) have been major drivers for data governance initiatives, Granzen reflects. “But organizations have an ever-increasing appetite to leverage their data for business advantage, either through internal collaboration, data sharing across ecosystems, direct commercialization, or as the basis for AI-driven business decision-making. While doing so, organizations must take care to maintain employee, partner, and customer trust in their approach of leveraging data (and technology fueled by data). This requires data governance and data governance solutions to step up and enable data-driven businesses to leverage their data responsibly, ethically, compliantly, and accountably.”

The problem we face is that not all Data is good Data (accurate, complete, timely and obtained without manipulation), as Dr. Chowdhury points out. “So, the design of a Smart Project is very important where these aspects are taken into account during the conceptualization itself, otherwise we will end up with a lot of data but something that is unusable.”

He explains the concept of Digital Twins as a good way to understand the Data universe of the future. “It has been around for a long time and people have made several attempts to create the meta model of any infrastructure so that real events can be simulated and their impact understood BEFORE the actual rollout of any change in the environment. The real world has people and it is important to model the impact of people as well. However, any such smart data project does NOT need specific information about any person.”

How do we navigate privacy in this world where everyone and anyone can access, collect, and act on data, with the right stamp in one’s pocket? With zero-knowledge-proof models? With more decentralization of data and the advent of user power over how people share, and monetize, their own data?

According to Dr. Chowdhury, there should technically be no conflict with privacy if identifiable data is not used. However, there is a class of projects designed to improve specific experiences of people. Those projects need specific information about people and, therefore, they cross over into the realm of conflict between public good and private lives. We, as architects of the future data universe, should always ask the question about privacy and the real need for such information in modeling a project. Most Data, even without identifiable characteristics, is very important for taking decisions at a macro level. Policy decisions can be evaluated with such data.

Even if privacy is taken care of, the big sword of security-dangers hangs quite close to the new power of data. It’s a big attack surface and open to many actors now, isn’t it?

Dr. Chowdhury recommends that non-specific data should be made universally available for researchers to understand the impact of policies and then model the universe in interesting ways. “Such public data is not a security threat at all. Collaborative datasets have been created in many ministries and governments across the world and this approach works well. It is generally aggregate transaction data shared in structured formats. However, identifiable transaction data is something that causes major concerns and has created havoc in the Data-universe. Banking, Health, Financial Identifier data is something that is universally accepted to be private and there are rules to protect this.”


“I believe in ‘ands.’ You can empower people while still providing trust and security that organizations demand.”

—  Andrew Beers, Chief Technology Officer, Tableau

Would such a universal and collaborative data-verse concept be extra-easy/attractive for security threat actors? Absolutely, says Granzen. “And it will not be threat actors from the outside. Take the Facebook/Cambridge Analytica scandal as an example: Cambridge Analytica was part of the Facebook data and platform ecosystem. They did not have to hack themselves in—they abused and misused data from within. Any collaborative or open system or platform must strictly protect itself from rogue actors within. Such internal rogue actors are, in fact, much more damaging to trust than an external attacker.”

And if bad actors can gain from sharing more data, why should good actors stay behind? Can’t we have whitelists and blacklists that are populated by security players and attack victims as well as survivors, so that everyone is better prepared as a village?

Ask Balaji Rao, Country Manager, India & SAARC, Mandiant, what he thinks of a collaborative data bank for cybersecurity, and he avers that such a repository can help massively on the skill-sets front. “Sharing data is tricky, but it can be done.”

However, as Dr. Chowdhury cautions, in the rush towards digitization, many systems expose partial information to other connected systems so that end-to-end processes can be completed outside the boundaries of any individual system. “It goes to external contractors and resides in their systems for years without the necessary checks and balances. These external and loosely protected sensitive data pools become super easy for hackers to get into and monetize. In the big bad world out there where hackers are smarter than the protectors (always assume that), it is always possible (however small the probability is) for a really determined hacker to find the information they are seeking (or find a surrogate).”

If we’ve learned anything during this new dynamic, it’s that predicting what’s next can be incredibly difficult, remarks Andrew Beers, Chief Technology Officer, Tableau. “But we know that the one constant, reliable way forward is with data—and democratizing it for all. This involves bringing analytics to everyone across organizations—making analytics easier to use, more powerful, more actionable, more predictive and integrated into the flow of business. A key piece facilitating all this is data management. I’m talking about data preparation, cataloging, governance, security, storage, quality monitoring and more.”

His favourite ingredient in this formula is that of governance. “The democratization of data often presents businesses with a conundrum: provide strong governance and security or provide agile self-service analytics. It shouldn’t be an ‘or’ proposition — empowerment or control. I believe in ‘ands.’ You can empower people while still providing trust and security that organizations demand.”

“Our point of view addresses the last mile of data management, bringing it to business people in the context of their work. In Tableau, everyone can see where data comes from, increasing trust. This ensures that the right people have access to the right data,” he spells it out.

Well, a lot can go wrong before it can go right in the newly-minted universe. Data, as the new liquid, can be pouring out from the shiny new tap as easily as it can turn into a flood or a bad mocktail. For now, the ice has begun to melt. And it’s up for grabs for telescopes and dark tunnels alike.

By Pratima H