Successful artificial intelligence requires the right data architecture. Here’s how to approach it



For companies that can master it, Artificial Intelligence (AI) promises to deliver cost savings, a competitive edge, and a foothold in the future of business. But while the rate of AI adoption continues to rise, the level of investment is often out of kilter with monetary returns. Currently, only 26% of AI initiatives are put into widespread production within an organization, which means many companies are spending a lot of time on AI deployments without seeing tangible ROI.


Meanwhile, in a world where every company must perform like a tech company to stay ahead, there’s increasing pressure on technical teams and Engineering and IT leaders to harness data for commercial growth. As spending on cloud storage increases, businesses are especially keen to improve efficiency and maximize ROI from data that’s costly to store. They don’t have the luxury of time.

To meet this demand for rapid results, mapping data architecture can no longer stretch on for months with no defined goal. At the same time, it’s regressive to focus on standard data cleaning or Business Intelligence (BI) reporting. 

Rather, tech leaders must build data architecture with AI at the forefront of their objectives. Do otherwise, and they’ll find themselves retrofitting it later. In today’s businesses, data architecture should drive toward a defined outcome—and that outcome should include AI applications with clear benefits for end users. This is key to setting your business up for future success, even if you’re not (yet) ready for AI. 


Starting From Scratch? Begin With Best Practices

Data Architecture requires knowledge. There are a lot of tools out there, and how you stitch them together is governed by your business and what you need to achieve. The starting point is always a literature review to understand what has worked for similar enterprises before, as well as a deep dive into the tools you’re considering and their use cases. 

Microsoft has a good repository for data models, plus a lot of literature on best data practices. There are also some great books out there that can help you develop a more strategic, business-minded approach to data architecture. 


Prediction Machines by Ajay Agrawal, Joshua Gans and Avi Goldfarb is ideal for understanding AI at a more foundational level, with functional insights into how to use AI and data to run efficiently. For more seasoned engineers and technical experts, I recommend Designing Data-Intensive Applications by Martin Kleppmann. This book covers the field in depth, with actionable guidance on how to build data applications, architecture and strategy.

Three Fundamentals For A Successful Data Architecture 

There are several core principles that will help you design a data architecture capable of powering AI applications that deliver ROI. Think of the following as compass points to check yourself against whenever you’re building, formatting and organizing data:

  • The cardinal rule is to keep your eye on the business outcome you’re working toward as you build your data architecture. In particular, I recommend looking at your company’s near-term goals and aligning your data strategy accordingly. For example, if your business strategy is to achieve $30M in revenues by year-end, figure out how you can use data to drive this. It doesn’t have to be daunting: break the bigger goal down into smaller objectives, and work toward those.
  • While setting a clear objective is key, the end solution must always be agile enough to adapt to changing business needs. Small-scale projects might grow to become multi-channel, and you need to build with that in mind. Fixed modeling and fixed rules will only create more work down the line. Any architecture you design should be capable of accommodating more data as it becomes available, and leveraging that data toward your company’s latest goals. I also recommend automating as much as you can. This will help you make a valuable business impact with your data strategy quickly and repeatedly over time. For example, if you know you need to deliver reporting every month, automate this process from the start. That way, you’ll only spend time building it in the first month, and every later report is produced with little extra effort.

  • To keep yourself on the right track, it’s important to know how to tell if your data architecture is performing effectively. Data architecture is working when it’s able to (1) support AI, and (2) deliver usable, relevant data to every employee in the business. Keeping close to these guardrails will help ensure your data strategy is fit for purpose, and fit for the future.
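
The monthly reporting example above can be sketched in a few lines of Python. This is a minimal illustration, not a recommended implementation: the ORDERS records, the function names and the plain-text report format are all hypothetical, and a real pipeline would read from the company's own data store.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample records: (order date, revenue in dollars).
# In practice these rows would come from a warehouse or database query.
ORDERS = [
    (date(2024, 1, 5), 1200.0),
    (date(2024, 1, 20), 800.0),
    (date(2024, 2, 3), 1500.0),
]


def monthly_revenue(orders):
    """Sum revenue per (year, month) so nobody has to tally it by hand."""
    totals = defaultdict(float)
    for order_date, revenue in orders:
        totals[(order_date.year, order_date.month)] += revenue
    return dict(totals)


def render_report(totals):
    """Format the totals as one plain-text line per month, in date order."""
    lines = []
    for (year, month), total in sorted(totals.items()):
        lines.append(f"{year}-{month:02d}: ${total:,.2f}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_report(monthly_revenue(ORDERS)))
```

Once a script like this exists, it can be run on a schedule (for example by a job scheduler the team already uses), so the effort is spent once rather than every month.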

The Future of Data Architecture: Innovations to Know About

While these key principles are a great starting place for technical leaders and teams, it’s also important not to get stuck in one way of doing things. Otherwise, businesses risk missing opportunities that could deliver even greater value in the long term. Tech leaders must stay plugged into the new technologies coming to market that can enhance their work and deliver better outcomes for their business:

  • We’re already seeing innovations that are making processing more cost efficient. This is critical because many of the advanced technologies being developed require so much computing power that they remain impractical for most businesses. The largest neural networks are a prime example. But as the required level of computing power becomes more feasible, we’ll have access to more sophisticated ways of solving problems.
  • Currently, most machine learning models have to be trained by a data scientist. But in the future, there’s potential to build models that can train other models. This is still at an early stage, but we’ll likely see innovation like this accelerate as processing power becomes more accessible.
  • When it comes to apps or software that can decrease time to value for AI, we’re in a phase now where most technology available can only do one thing well. The tools needed to productionize AI (like storage, machine learning providers, API deployment and quality control) are unbundled. Currently, businesses risk wasting precious time simply figuring out which tools they need, and how to integrate them. But technology is gradually emerging that can address multiple data architecture use cases, as well as databases that are specialized for powering AI applications. These more bundled offerings will help businesses put AI into production faster. It’s a similar trajectory to what we’ve seen in the fintech space, with companies initially focused on being the best in one core competency before eventually merging to create bundled solutions.
  • Looking further into the future, it seems safe to predict that data lakes will become the most important AI and data stack investment for all organizations. Data lakes will help organizations to understand predictions and how best to execute on those insights.

I also see data marts becoming increasingly useful in the future. Marts deliver the same data to every team in a business in a format they can understand. For example, Marketing and Finance teams see the same data represented in metrics that are familiar and – most importantly – in a format they can use. The new generation of data marts will have more than dimensions, facts and hierarchies. They won’t just be slicing and dicing information, but will support decision making within specific departments.
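
To make the data mart idea concrete, here is a minimal Python sketch in which one shared set of records is presented to Marketing and Finance in the metrics each team uses. Everything in it is an assumption for illustration: the ORDERS rows, the field names, and the choice of return on ad spend for Marketing and net revenue per month for Finance.

```python
from collections import defaultdict

# Hypothetical shared fact table: one row per order, used by every team.
ORDERS = [
    {"channel": "email", "month": "2024-01", "revenue": 1200.0, "ad_spend": 300.0},
    {"channel": "search", "month": "2024-01", "revenue": 800.0, "ad_spend": 400.0},
    {"channel": "email", "month": "2024-02", "revenue": 1500.0, "ad_spend": 300.0},
]


def marketing_mart(orders):
    """Marketing view: return on ad spend (revenue / ad spend) per channel.

    Assumes every channel has non-zero ad spend, as in the sample data.
    """
    revenue = defaultdict(float)
    spend = defaultdict(float)
    for row in orders:
        revenue[row["channel"]] += row["revenue"]
        spend[row["channel"]] += row["ad_spend"]
    return {channel: revenue[channel] / spend[channel] for channel in revenue}


def finance_mart(orders):
    """Finance view: revenue net of ad spend, per month."""
    net = defaultdict(float)
    for row in orders:
        net[row["month"]] += row["revenue"] - row["ad_spend"]
    return dict(net)


if __name__ == "__main__":
    print("Marketing:", marketing_mart(ORDERS))
    print("Finance:", finance_mart(ORDERS))
```

Both views are derived from the same rows, so the two teams cannot drift apart on the underlying numbers even though each sees a different metric.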

As the technology continues to develop, it’s critical that businesses stay up to speed, or they’ll get left behind. That means tech leaders staying connected to their teams, and giving them the opportunity to bring new innovations to the table. Even as a company’s data architecture and AI applications grow more robust, it’s important to make time to experiment, learn and (ultimately) innovate. 

The article has been written by Atul Sharma, CTO and Co-founder, Peak