Data Engineering and Privacy in a Smart World

Whether you like it or not, data engineering is making it easier than ever before to capture and analyze personal data, and the trend will continue

AI, ML and Big Data are the hot tech buzzwords today, thanks to the hype around their use by leading companies such as Google and Facebook. As a result, just about every organization is now adopting or testing them to explore how they could improve operations and sales and, of course, cut costs while achieving growth.


While these technologies can help organizations gain better insight into customer behavior and predict customers’ next moves, they also raise security and privacy concerns.

Tech Evolution Has Made Everyone a Data Scientist

Gone are the days when you needed a master’s degree or PhD in statistics and had to spend years learning sophisticated tools like SAS to get the hang of data science. You don’t need that anymore to run your models and make something out of them. Advances in computing, a dramatic reduction in the cost of owning data, and the availability of Big Data on cheap hardware are allowing pretty much every small to mid-size firm to have its own data lake and run easy-to-use tools like Trifacta and DataRobot to do advanced data analytics without even hiring a data scientist.


Having that level of data power opens the door to some major data security concerns. For instance, what if the data falls into the wrong hands?

The Snowden Effect

When Snowden blew the whistle on alleged mass surveillance by the NSA, powered by data from Internet and telecom companies, it created a tectonic shift in how consumers view the way these companies manage their data. After that incident, almost every company revamped its privacy policy, and you probably received multiple notifications asking you to read and acknowledge the new terms and conditions. We would bet that 99% of people didn’t bother to read them; they simply accepted them to get rid of the annoying messages.


Although the incident raised awareness among users and changed the way companies store and use customer data, it hardly changed anything in terms of government policies or laws. A few eyewash laws may have been passed across the Western world, but they address only the tip of the iceberg.

There is still a growing number of “concerned” users who don’t trust organizations and governments with their data. We specifically use the word “concerned” because not everybody cares about their data. Most people have, knowingly or unknowingly, made their whole lives public by sharing everything on Facebook, Instagram, etc., letting the advertising vultures do engineering on their data.

These days, every company will tell you that it doesn’t share or sell your data to anybody, and that you can choose whether to keep your data private or share it. Under the new privacy policies, companies are not supposed to share or sell data without user consent.


No Need to Store or Share Data, Thanks to Metadata Engineering

Internet companies don’t need your actual data to analyze and predict your behavior, thanks to metadata engineering. Take the example of the new photo applications in iOS and Android, which automatically detect faces, pets, locations, etc. They can do this because the new camera applications add a lot of metadata to the image file, enough to tell the whole story behind your photo.
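To make the point concrete, here is a minimal Python sketch of how much a handful of photo metadata records can reveal without anyone ever looking at the pictures. Everything in it is an assumption made for illustration: the field names, the sample coordinates and the `likely_place` helper are hypothetical and do not reflect how any real photo application stores or processes metadata.

```python
from collections import Counter
from datetime import datetime

# Hypothetical metadata records of the kind a camera app might attach to photos.
# No image pixels are involved; the field names and values are illustrative only.
PHOTOS = [
    {"taken": "2017-03-01T22:15:00", "lat": 28.6139, "lon": 77.2090},
    {"taken": "2017-03-02T13:05:00", "lat": 28.4595, "lon": 77.0266},
    {"taken": "2017-03-02T23:40:00", "lat": 28.6141, "lon": 77.2088},
    {"taken": "2017-03-03T12:50:00", "lat": 28.4597, "lon": 77.0264},
    {"taken": "2017-03-04T06:30:00", "lat": 28.6138, "lon": 77.2091},
]

# Hours of the day we (arbitrarily) treat as "at home" and "at work".
NIGHT = set(range(21, 24)) | set(range(0, 7))
MIDDAY = set(range(11, 15))


def likely_place(photos, hours):
    """Return the most common coarse (lat, lon) cell among photos taken in `hours`."""
    cells = Counter(
        (round(p["lat"], 2), round(p["lon"], 2))  # about 1 km of precision
        for p in photos
        if datetime.fromisoformat(p["taken"]).hour in hours
    )
    return cells.most_common(1)[0][0] if cells else None


home_guess = likely_place(PHOTOS, NIGHT)   # where the late-night photos cluster
work_guess = likely_place(PHOTOS, MIDDAY)  # where the lunchtime photos cluster
print("Probable home:", home_guess)
print("Probable workplace:", work_guess)
```

With the sample records above, the sketch guesses (28.61, 77.21) as home and (28.46, 77.03) as the workplace, using nothing more than timestamps and coordinates.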

Smartphones Getting Smarter


Lately you must be very happy with the new iOS or Android update, which “learns from you” and suggests the best option for you. Many people find it a very cool feature that lets Siri recommend your schedule, your next shopping item or your next holiday destination. Industry leaders may tag these features with cool marketing names, but deep inside, old statistical analysis algorithms are powering the machine learning tools.
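As a rough illustration of how old that statistics can be, the sketch below suggests a user’s next action with nothing more than transition counts, which is essentially a first-order Markov chain. It is not how Siri or any real assistant works; the usage history and the function names `fit_transitions` and `suggest_next` are made up for this example.

```python
from collections import Counter, defaultdict


def fit_transitions(history):
    """Count how often each action is immediately followed by each other action."""
    counts = defaultdict(Counter)
    for current, following in zip(history, history[1:]):
        counts[current][following] += 1
    return counts


def suggest_next(counts, current):
    """Suggest the action that most often followed `current`, or None if unseen."""
    if current not in counts:
        return None
    return counts[current].most_common(1)[0][0]


# Hypothetical log of apps opened on a phone, in order.
HISTORY = [
    "alarm", "news", "maps",
    "alarm", "news", "maps",
    "alarm", "email", "maps",
]

model = fit_transitions(HISTORY)
print(suggest_next(model, "alarm"))   # most frequent follower of "alarm"
print(suggest_next(model, "camera"))  # never seen, so no suggestion
```

In this toy history, “news” followed “alarm” twice and “email” only once, so the phone would suggest “news” after the alarm. A real product would use far richer signals, but the counting principle is the same.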

From a technology standpoint, it is very interesting to learn and use these new machine learning tools to analyze behavior and predict the next action. From a privacy standpoint, however, it’s a whole different story. Smartphones with a “learning feature” may upload all of this metadata to a server, where it can be further refined and enriched for targeted marketing of products, and the same data could be used by governments or criminal organizations for a totally different purpose.

The New Normal


Whether you call it technological advancement or evolution, this kind of data engineering is here to stay. We are more connected than ever in today’s fast-moving world, and you need to stay up to date to compete in the market. You can’t just switch off your smartphone and go to the jungle. It’s just not possible. Google turns up dozens of articles on how to protect your privacy, but most of them either suggest switching off your smartphone or simply offer misinformed and ineffective solutions.

There couldn’t be a more exciting time than this for data engineering enthusiasts like us. We now have access to so many tools to explore and play with data and make predictions about customer behavior. On one side, it won’t be fun for consumers to learn that their personal data is being engineered for various purposes. On the other, companies are also putting this data to good use for the benefit of consumers.

For instance, on a lighter note, with continued advances in automation and machine learning, you will soon have milk delivered to your doorstep before you run out of it, thanks to IoT sensors in your fridge and Amazon Echo.

By Jaskarn Singh (AI and Data Sciences), Chandan Kumar (Founder and CEO of, and Jitin Khanna (Data and Analytics Architect