Interview

Is third-party data a good fit for your purpose?

DQINDIA Online

16 Aug 2022 11:36 IST

An analytics industry veteran unfolds many layers of data as it moves into a new world. We get to know the ‘in’ and ‘next’ of synthetic data, no-code analytics and quantum tech’s effect on data analytics

Who better to ask about synthetic data, third-party data, the scalability hurdles of analytics and the next face of data collaboration than Andrew Beers? As Tableau’s Chief Technology Officer, Beers has been responsible for the company’s long-term technology roadmap and emerging technologies. He has led many engineering teams, created new products, and been at the heart of Tableau’s engineering for most of the company’s existence. With a master’s degree in computer science from Stanford University, Beers brings a fresh and humble perspective to the power of data. Here are some snippets from his recipe for data analytics.

What is your vision for Tableau for the next 3-4 years? Do you see any help/competition coming from breakthroughs in quantum computing and fog computing when we think of analytics?

Quantum and fog computing are really exciting for the future of work—but it’s still too early for many organizations. Many of the executives we talk to right now are thinking about Artificial Intelligence, Data Ethics, Workforce Development, Flexible Governance and Data Equity.

We go into each of these categories in our Data Trends report. And we believe these trends will be a priority for years to come. All these trends are interconnected; and for organizations to be successful in the future, they absolutely must excel in each of these areas.

Now data-driven organizations will have the greatest advantage. And in the future, Tableau will continue to focus on how to help all organizations become data-driven. It’s our goal to help everyone inside an organization to make better decisions faster. Because we’ve seen what happens when people can make smarter decisions with data—not only do they transform business, but they also transform the world.

Of course, we’re going to tap various technologies and techniques to do that—all while keeping people at the center of our innovation.

How significant are integrations for the overall portfolio—as seen recently with Slack, Looker, Einstein Discovery etc.?

Over the past few years, organizations everywhere have forever changed how, and where, work gets done.

Instead of conference rooms, people are now making decisions in collaboration tools, on their phones, and in digital or mobile applications. We all now work in so many different places. Yet when people have questions about their data, they need to leave these applications and navigate to an analytics dashboard, because their work and their data are in two different places.

Consider the possibilities of having data and analytics infused in the collaboration tools, applications, and experiences that everyone uses. With Tableau, people can ask questions about the data whenever and wherever they want—because every business app is now an analytics app. For instance, if they are using a collaboration tool like Slack, the data can tell them ahead of time when there might be a problem and what’s the next best action to take. Slack becomes your digital HQ—one that empowers everyone inside your organization to make data-driven decisions every day. And with Einstein Discovery in Tableau, we are bringing even more AI capabilities to our platform to help business teams build, and consume, powerful predictive models.

Einstein Discovery in Tableau will provide a new collaborative environment for your model-building projects that cover everything from data ingestion and preparation through building, deploying, and managing models. And, we’ll deliver a scenario planning experience to use what-if analysis and simulation to explore a variety of possible future scenarios.

And in this digital world, Cloud is the status quo. Our Tableau and Google Cloud partnership includes plans for deeper integration between Tableau and Looker. Leveraging Looker’s semantic layer will provide Tableau customers with trusted, governed data at every stage of their analytics journey.

Any views on how adequate the data science skills pool is, especially in India? What implications would the no-code data analytics trend have here?

We have barely scratched the surface of how data science can help businesses day-to-day. At best, organizations have a small number of highly technical data experts supporting large teams across the business. Oftentimes these data experts are maxed out working on the highest-priority issues, yet people all across the business still need help to solve business problems that require some data science but don’t justify, or can’t wait for, a data scientist. That means critical decisions are being made daily without the support of the centralized data science team.

Now there are two ways to solve this problem:

The first is to train and hire more data scientists. To help address the need for skilled labor, Tableau recently cemented a partnership with the All India Council for Technical Education (AICTE), Ministry of Education, Government of India, to bring Tableau data analytics skills to all students attending one of AICTE’s 10,500 associated institutions. Still, organizations typically want their data science teams to tackle the really big, mission-critical problems. And traditional data science processes might be overkill for most business questions.

So another way to solve this problem is to bring data science capabilities to more people—regardless of their technical expertise. We call this emerging trend Business Science.

Technologies like AI and Machine Learning are expanding data science capabilities to more people so they can make better decisions faster. They give people with the right domain expertise and business context the ability to build predictive models, plan simulations and scenarios, and cluster data, all with little to no coding.

This means more people across the business are better equipped to tackle tough questions like resource allocation, prioritization, staffing, and logistics. For instance, an account executive on the sales team can tap low-code or no-code data science capabilities, or what we call Business Science, to explore various possibilities and see which scenarios will best help their customers achieve their goals.

Tell us more about the role of analytics in today’s economy. Is it cost-effective to scale?

It’s not only cost-effective to scale, but it’s also imperative to our global economy. By 2030, PwC expects AI to add around US$15 trillion to the world economy. That’s $15 trillion in new companies, jobs, products and services in the next seven years. Not to mention a 26 per cent increase in global GDP.

AI is fundamentally a set of deeply data-based techniques. Historical data about recruiting may fuel a predictive model that helps HR retain and nurture top talent. Customer data can fuel a model that helps sales improve forecasting, or even offer suggestions for how or when to best engage and what products your customers might be interested in.

It all starts with your data. And if you want to drive transformation across your organization, if you want to thrive now and in the future, applying these new AI technologies and techniques to the data can modernize your business. It can help your organization gain speed and agility—by automating tasks, enhancing your thinking, or recommending the next best action to take.

Are we getting past hurdles to get closer to clean data, enough data, affordable data, and well-owned data? Would better APIs, open metadata standards, and broad data ontologies etc. help?

Yes, we’re getting past some major data hurdles thanks to advancements in technology for collecting, cataloging, and organizing data better than ever before. Organizations also have more dedicated data people and centers of excellence to help with business problems. And we’re also seeing better models out there for how to own and maintain data inside large companies (e.g. data mesh approaches). That’s leading to more clean sources being generated, shared, and curated.

However, data is more dispersed than ever and it’s coming in from everywhere: cloud services, data inside applications, new database technologies, and non-tabular data like text, image, audio, and video. It’s no longer just about moving data between databases, but between different kinds of technologies. Lakes and warehouses mean lots of sources of data at varying levels of quality. And data silos still happen. Getting through these hurdles still takes data work, literacy, and culture change.

Better APIs would definitely help too, because data resides not just in databases but inside other tech. Open standards are always great for interoperability, and technology from MuleSoft helps glue together systems with API-based approaches.

What should enterprises be aware of when it comes to new forces like synthetic data? Is analytics all set to leverage it/make sense of it?

Synthetic data can be really useful for training models on events that don’t occur frequently, or for which you don’t have enough real-world data. These methods, however, require good labeled data, and real-world collected data can be hard to label.

Fortunately, synthetic data is generated with labels, but that doesn’t make it immune to errors or biases, especially since you’re the one generating the data. You really need to be fluent in data to properly understand how the synthetic data represents the real world.
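As a rough illustration of the point above (an editorial sketch, not something Beers or Tableau described), the Python below fits a simple Gaussian to a handful of hypothetical real rare events, generates pre-labeled synthetic rows from it, and then checks how far the synthetic mean drifts from the real one. The sample values, the `rare_event` label, and the helper names `fit_gaussian`, `generate_synthetic` and `mean_shift` are all assumptions made for illustration; a single Gaussian is a deliberately naive generator.

```python
import random
import statistics


def fit_gaussian(values):
    """Estimate mean and standard deviation from a small real-world sample."""
    return statistics.mean(values), statistics.stdev(values)


def generate_synthetic(mean, stdev, n, label, rng):
    """Draw n synthetic values from a Gaussian; each row arrives already labeled."""
    return [(rng.gauss(mean, stdev), label) for _ in range(n)]


def mean_shift(real_values, synthetic_rows):
    """Gap between real and synthetic means, in units of the real stdev."""
    synthetic_values = [value for value, _ in synthetic_rows]
    gap = statistics.mean(synthetic_values) - statistics.mean(real_values)
    return abs(gap) / statistics.stdev(real_values)


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical real rare events (for example, amounts of flagged transactions).
real_rare = [912.0, 1040.5, 870.2, 1215.9, 990.0, 1103.3]

mean, stdev = fit_gaussian(real_rare)
synthetic = generate_synthetic(mean, stdev, 500, "rare_event", rng)

shift = mean_shift(real_rare, synthetic)
print(f"synthetic rows: {len(synthetic)}, mean shift: {shift:.2f} stdevs")
```

A small mean shift only says the generator reproduces what it was fitted on. If the six real samples are themselves unrepresentative, the synthetic rows inherit that bias, which is why the labels being free does not make the data trustworthy.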

How does analytics address the questions that enterprises face on the pendulum between first-party and third-party data? Do blockchain-based user-data models and interoperable data exchanges like Gaia-X look scalable enough to solve this gap?

Third-party data like weather data or financial data can often enhance what you have in your own data. But still, you need strong data skills to understand what value this third-party data can add to your analysis or model. Is it a fit for your purpose? Does it add enough to your models? Are there any privacy or other regulations you need to consider for your first- and third-party data?

I’m no expert in blockchain systems, but they do have a role in “sovereignty”, distributing trust and providing immutability. This can be important when multiple entities don’t trust each other or a third party, and when distributed transactions are required.

Andrew Beers, Chief Technology Officer, Tableau

By Pratima H

pratimah@cybermedia.co.in