IT-led Big Data Projects Almost Always Fail: Teradata CTO

Big data is not what most CIOs assume, says Stephen Brobst, CTO of Teradata Corporation, who was in India recently. Brobst sheds light on the different aspects of big data and how it can be used in different environments. Excerpts:

On Teradata’s India presence….

We have multiple development labs in India. The largest one is in Hyderabad. That is a hard-core R&D lab working on the core of the engine. We started our operations in India about 15 years ago. At that time, it was more testing than high-intellect work, but now we are doing core development in our Indian R&D environment. We cannot quote the exact number of employees, but it is in the high hundreds. We also have development going on in Pune, Mumbai, Chennai, and Bengaluru. But Hyderabad is the most important one.

What are the focus areas in R&D? Is it offshore?

What I was talking about was hard-core R&D. We are doing product development in that group. In core areas of the engine, i.e., the optimizer and the file system, we have implemented some advanced features, such as temporal data management, and that work was done mainly in the Indian part of our organization.

Since Teradata is into big data and analytics, what sort of challenges do you see in this space? How is your development team working on solving these problems?

You call it a challenge; I call it an opportunity. The industry moves very quickly. If you stand still for any amount of time, you will be passed by. We have to be much better than our competition, otherwise people won’t buy our products. And yet we dominate in the big data and data warehousing space. The reason we dominate is that we are always developing the next generation while the other guys are still thinking about the previous generation. We do a lot of work with universities. We have very smart customers and we listen to them. We work very closely with our customers, and they help us understand not just what they want now but where they are going in the future. Our job is to get there before them. Right now the area where we are having a lot of success is the big data space.

Has the big data landscape changed?

Big data is more about non-traditional data. There is a lot of confusion over what big data is. True uses of big data were basically dominated by the dotcoms, mainly in Silicon Valley, and the ones that were doing it well were almost always dotcoms: eBay, Facebook, LinkedIn, Netflix, Google and these kinds of players.

The volume is not really the important part. That’s where people get confused. If volume were the issue, India has been doing big data for 20 years: simply because the marketplace is so large, everything is high-volume data. Big data is not about high volume. The Indian marketplace is particularly confused about that at this point.

How relevant is big data for BFSI, healthcare, or retail?

Let’s talk about things that we can do today. If we look at traditional analytics, most companies, whether a bank, a telecommunications company or a retailer, analyze transactions at the lowest level of detail. And transactions in the Indian marketplace, with a billion people, are high volume. That is not big data but high-volume data. Big data means that I am less interested in the bigness of the data and more interested in the diversity of the data. So I don’t want just the transactions; I want the interactions. If you are a bank, you see that I make a deposit online. But do you know all the clicks and all the searches that led up to that transaction? That’s the interaction. That is an order of magnitude bigger, probably 200 times more data than the transactions. So yes, it is high volume, but that’s not the interesting thing. The interesting thing now is the customer behavior, not just the transaction.

The web log data is not as structured as a transaction that you get in a banking system or a retail system. There are new kinds of data types, such as JSON. In Silicon Valley, anyone who uses XML is a dinosaur. JSON (JavaScript Object Notation) is a self-describing data type that allows you to do late binding. It is very flexible and has roughly the same type of functionality as XML. JSON is being used to capture sensor data, log data and social interaction data. If we talk about interactions, then there is clickstream data. I could be tweeting about my experience on your retail site, or posting on Facebook about my experience with your bank. Then there is the voice call interaction with a call center. Most banks record those calls and derive no analytical value from them. There is a ton of content in that interaction. The technologies now exist to capture the voice and do voice-to-text translation, not to mention sentiment scoring on the text they capture. This is big data. Now we are getting into very different kinds of data. It is not some record-oriented thing you put in an Excel spreadsheet. It is a voice file, a text file, web log data or a social media interaction: very different kinds of data.
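
To make the idea of self-describing data and late binding concrete, here is a minimal Python sketch. It is only an illustration, not something from Teradata: the event records, the field names (`type`, `user`, `query`, `page`, `amount`, `channel`) and the helper functions are all hypothetical. Each JSON record carries its own field names, so no schema is declared up front and the reader decides at query time which fields matter.

```python
import json

# Hypothetical clickstream log, one JSON record per line.
# The field names are illustrative assumptions, not a real schema.
raw_events = [
    '{"type": "search", "user": "u1", "query": "fixed deposit rates"}',
    '{"type": "click", "user": "u1", "page": "/deposits/apply"}',
    '{"type": "deposit", "user": "u1", "amount": 5000, "channel": "online"}',
]


def load_events(lines):
    """Parse each line into a dict; no schema is declared in advance."""
    return [json.loads(line) for line in lines]


def field_names(events):
    """Discover, at read time, every field that appears in the records."""
    names = set()
    for event in events:
        names.update(event.keys())
    return sorted(names)


def interactions_before(events, user, transaction_type):
    """Return the user's events that precede their first transaction."""
    path = []
    for event in events:
        if event.get("user") != user:
            continue
        if event.get("type") == transaction_type:
            return path
        path.append(event)
    return []  # the user never completed that kind of transaction


events = load_events(raw_events)
print(field_names(events))
print(interactions_before(events, "u1", "deposit"))
```

In this sketch the deposit is the transaction, while the search and the click returned by `interactions_before` are the interactions that led up to it.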

 

How can businesses adopt big data and make it work for them? How can they get true insights into their business? Can they use it to prevent a disaster or to move forward with new business goals?

I think there are a couple of critical success factors. One of them, especially in this marketplace, is that you need to think big but start small. Don’t try to solve all the problems at once. Take one problem; particularly at this scale of marketplace, you can’t do it all at once. So pick a problem that has high volume and high business priority, focus on that problem, create value and then move to a self-sustaining model. This is particularly true with big data. IT-led big data projects almost always fail. It is very important in these kinds of initiatives to have a business stakeholder as part of the team, so that there is collaboration between business and IT. IT by itself will not be successful and business by itself will not be successful, so you really need to get that joint team in place. So the thing is to look at what your business goal is and what you are trying to do. Big data is not always the answer, but sometimes it is.