Imagine yourself in the middle of a typical Indian wedding. There is an overwhelming melee of sensory stimuli of all kinds, with sounds, sights and smells coming at you from every direction: the cacophony of children playing and running around, the ear-splitting sounds from the overly enthusiastic Mridangam and Nadhaswaram players, the constant chatter, the loud chanting of the mantras, and rich visual stimuli from the floral arrangements and people flaunting their newest clothes or jewelry. The aromas wafting from the kitchen titillate your sense of smell and fan your slowly building hunger pangs. Imagining this veritable melting pot of sensations, you would rightfully wonder how anyone could focus on any one thing in such a setting. Yet, if you remember the last time you attended such an event, you can recall sitting in the middle of it all and somehow being able to have a great conversation with that much-loved cousin you ran into after all those years.
The brain has this incredible ability to push the sensory stimuli you don’t need at that moment into the background, helping you to focus on that conversation. This phenomenon, called the “cocktail party effect,” is well documented and has been studied by cognitive science and neuroscience researchers. Selective attention is arguably an essential skill that conferred a survival advantage on our ancestors and was bred into the species through natural selection.
You can, at a high level, think of Personalization technology as playing the role of this “brain filter,” which focuses your attention on the things that interest you, in the middle of the greatest information overload of them all — the World Wide Web.
The World Wide Web came into existence 25 years ago. Today it is living up to its ambitious name, serving over three billion people across the globe. Search engines typically contain a few tens to hundreds of billions of pages in their index. Even that is only a small slice of the Web. Estimates of the number of unique Web pages put it at upwards of a trillion pages. Every day, the Web grows by more than a billion new pages. In fact, during 2013 alone, the Web grew from about 630 million websites to over 850 million. These are mind-boggling numbers. When you are looking for something very specific, search engines index the Web and return the content that may best satisfy your information need. But people don’t go online just to search for specific information; they also spend their time finding something interesting to read or watch and generally exploring their interests.
This is where Personalization technology comes in. In its idealized form, Personalization acts as a personal valet who knows you well, understands the topics that interest you, is contextually aware of which content on the Web may be interesting or engaging to you personally at that given moment, and presents that content to you. Without Personalization, content discovery on the Web would be tremendously difficult due to the sheer volume of available content.
Let us look at how Personalization technologies work. Ideally, you would like to have a detailed “interest profile” of each of your users. Then, if you also have a detailed characterization of the content that you are trying to rank for the user, a simple matching of the content to the user is all that’s needed. But in the real world, you typically have neither of these. Users don’t provide detailed profile information, either because it is too tedious or impractical to do so or because of privacy concerns. On the content side, the sheer volume of content and the variety of sources you get it from may preclude having a uniformly high level of understanding of the content that you are trying to rank and serve to your users. So, while using the information that is available, real-world Personalization systems rely heavily on analyzing how the community of users interacts with the content pool, to learn more about both the user preferences and the content.
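The idealized "simple matching" case can be illustrated with a minimal sketch. It assumes, purely for illustration, that both the user's interest profile and each piece of content are described as topic weights, and it scores the match with cosine similarity. The topic names, article ids and function names are hypothetical, and real systems rarely have data this clean.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse topic-weight dicts."""
    dot = sum(w * b.get(topic, 0.0) for topic, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_content(profile, articles):
    """Return article ids ordered from best to worst match for the profile."""
    return sorted(articles,
                  key=lambda aid: cosine(profile, articles[aid]),
                  reverse=True)

# Hypothetical interest profile and content descriptions (assumed data).
profile = {"sports": 0.7, "finance": 0.3}
articles = {
    "film_review": {"entertainment": 1.0},
    "market_wrap": {"finance": 1.0},
    "cricket_final": {"sports": 1.0},
}

ranking = rank_content(profile, articles)
print(ranking)
```

With this made-up profile, the sports article ranks first and the entertainment article last, since the profile carries no weight for entertainment.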
Machine learning algorithms that use large-scale data mining techniques, broadly classified under the heading of “recommender systems,” analyze these data sets and create content recommendations for users. Two broad classes of algorithms are typically used. “Collaborative filtering” relies primarily on finding correlations between the consumption patterns of various users. Say that another user of the service (or a group of users) has given like/dislike signals very similar to yours on the content they have consumed. The system extrapolates from that: if there is a piece of content that is highly rated by the users in this “cohort group” of yours and you have not yet seen it, the system is likely to serve that content to you. “Content filtering” (often called content-based filtering), on the other hand, tries to learn your innate preferences by understanding the characteristics of the content you have liked before, say, an artist or a composer that you seem to like more than others in the case of music, and uses this information to predict the likelihood of you liking some new content that you have not yet seen. Real-world systems typically use a mix of both approaches. It is important to note that all these algorithms have to work at scale, and hence they are completely machine driven, without any human intervention or input in the actual recommendation process.
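To make the collaborative filtering idea concrete, here is a minimal sketch of a user-based variant. It assumes like/dislike signals are stored as +1 and -1, measures how closely two users agree on the items both have rated, and then recommends unseen items that similar users liked. The user names, item names and functions are hypothetical, and production systems use far more elaborate models.

```python
import math

def similarity(r1, r2):
    """Cosine similarity over the items both users have rated."""
    shared = set(r1) & set(r2)
    if not shared:
        return 0.0
    dot = sum(r1[i] * r2[i] for i in shared)
    n1 = math.sqrt(sum(r1[i] ** 2 for i in shared))
    n2 = math.sqrt(sum(r2[i] ** 2 for i in shared))
    if n1 == 0 or n2 == 0:
        return 0.0
    return dot / (n1 * n2)

def recommend(target, ratings, top_n=3):
    """Suggest unseen items liked by users whose tastes resemble the target's."""
    seen = ratings[target]
    scores = {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = similarity(seen, other_ratings)
        if sim <= 0:
            continue  # skip users with opposite or unrelated tastes
        for item, rating in other_ratings.items():
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + sim * rating
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Keep only items with a positive aggregate score.
    return [item for item in ranked if scores[item] > 0][:top_n]

# Hypothetical signals: +1 = like, -1 = dislike (assumed encoding).
ratings = {
    "you": {"a": 1, "b": 1, "c": -1},
    "u1": {"a": 1, "b": 1, "c": -1, "d": 1, "e": -1},
    "u2": {"a": -1, "b": -1, "c": 1, "f": 1},
}

suggestions = recommend("you", ratings)
print(suggestions)
```

Here "u1" plays the role of the cohort: that user agrees with "you" on every shared item, so the item "u1" liked and "you" have not seen is suggested, while the item "u1" disliked and everything from the dissimilar "u2" are left out.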
For example, the stream of articles on the Yahoo homepage is picked and ranked for you specifically, based on what the algorithm has learnt about your preferences and on matching those preferences to the large pool of available content. One key feature of these algorithms is that they learn over time, and as the volume of data increases, they perform better. You, as a specific user, will notice that as you interact with the stream more, the system adapts itself further to your preferences in content. Given enough interactions over time, the system can really “learn” very fine-grained information about your interests: not only that you prefer sports and finance to entertainment or politics, but also that within sports you have a specific liking for cricket over tennis, and that within cricket you are likely to engage better with content about specific teams or sportspersons.
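The way a system can adapt as interactions accumulate may be sketched as follows. This is an illustrative assumption, not a description of how the Yahoo stream actually works: each article carries coarse and fine-grained topic labels, and every click or skip nudges the corresponding weights in the user's profile. The labels, the learning rate and the neutral starting weight of 0.5 are all hypothetical choices.

```python
def update_profile(profile, topics, clicked, rate=0.2):
    """Move each topic weight toward 1.0 on a click and toward 0.0 on a skip."""
    target = 1.0 if clicked else 0.0
    for topic in topics:
        old = profile.get(topic, 0.5)  # 0.5 = no opinion yet (assumed prior)
        profile[topic] = old + rate * (target - old)
    return profile

# Hypothetical interaction log: (topic labels of the article, was it clicked?)
interactions = [
    (["sports", "sports/cricket"], True),
    (["sports", "sports/tennis"], False),
    (["sports", "sports/cricket"], True),
    (["entertainment"], False),
]

profile = {}
for topics, clicked in interactions:
    update_profile(profile, topics, clicked)

# Print the learned weights from strongest to weakest interest.
for topic, weight in sorted(profile.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{topic}: {weight:.2f}")
```

After only four interactions, the profile already separates cricket from tennis within sports, and sports from entertainment. More interactions would sharpen these weights further, which mirrors the point that such systems perform better as data accumulates.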
Similarly, movie recommendations on Netflix and buying recommendations on most e-commerce sites work on the same principles. So, in closing, we can see how Personalization technologies have become an essential part of how we as users navigate the Web, and how they give us a truly immersive experience by simplifying how we discover and engage with content.