Hike to leverage natural language processing, AI and ML for improving multi-lingual chat experience

Nimish Sawant

Among the many popular messaging services out there, Hike is probably the only one which has been made in India.

After going live in 2012, the cross-platform messaging service has acquired over 100 million users. Hike messenger started off with the objective of creating a Super App, much along the lines of Tencent's WeChat, offering multiple services on one platform. In 2019, however, Hike CEO and founder Kavin Bharti Mittal revealed plans to double down on the social and content space, but across multiple apps instead of a single Super App.

One of the key features of Hike is its focus on Stickers. The messenger has over 1 million stickers in its portfolio, with more being added. Hike even has a separate app just for communicating with stickers, called Hike Sticker Chat. Hike also supports more than 40 Indian languages, and stickers are available for all of them. In addition, Hike has emerged as one of the top three patent-filing companies, with 66 patents in 2017-18. This puts it alongside IT conglomerates such as Wipro and TCS when it comes to patents filed.

Dr Ankur Narang heads the AI and data science division at Hike and looks after the research and development projects on natural language processing, chatbots, computer vision, speech recognition and related artificial intelligence (AI) and machine learning (ML) areas. At the TensorFlow World 2019 conference, which took place in California last month, Dr Narang spoke about how Hike is using AI-driven innovations for sticker recommendations on the Hike platform.

In an email interaction after the TensorFlow World address, Dr Narang spoke to tech2 about the AI innovations happening on the Hike messenger platform.

Edited excerpts from the interaction follow.

Hike logo

tech2: Among the Indian messaging services, Hike is synonymous with Stickers; you even have a Hike Sticker Chat app. Even for WhatsApp sticker recommendations, Stickers made by Hike show up. How did Hike come to the realisation that this is an inherent need among Indian users?

Dr Ankur Narang: When Hike started in 2012, the landscape was very different compared to what it is today. Mobile data was expensive, smartphones were low end and it was difficult to use the internet. Hence we started by building a Super App. Today the world has evolved tremendously. Data is cheap, smartphone storage is not a problem, and the smartphone itself has become a Super App. What's enabling us to innovate for this new landscape is our vision of building a new social future. One of the key things driving our team to enable this vision is to put the customer at the centre. Everything we do is inspired by how we can solve better for the customer and invent on their behalf. Stickers are a great example of that.

If you think about it, the keyboard, our usual medium of expression in the digital world, hasn't really evolved since the 1800s. In a diverse landscape like India, where dialects change every few kilometres, keyboards aren't well equipped. The new-age Indian internet users are looking for a seamless way to express themselves. That's where stickers were able to elevate expression amongst users and their close friends rather effectively.

What's unique about us is that we are also an art company and stickers are a perfect example of how we are innovating for consumers at the intersection of art and AI. What makes Hike stickers so popular is our unique ability to deliver hyper-local and hyper-personal elements. It's also interesting to note that Hike is the only player that is using Natural Language Processing (NLP) to solve for local languages at a mass scale.

Stickers have been at the core of Hike for many years now, and just like the landscape we operate in, they have evolved rapidly and will continue to do so. The evolution has been incredible (and fun!). We now have Stickers in 40+ languages and dialects. Built with Machine Learning (ML) at its core, the goal is to empower our consumers with social products that are on par with advancements in tech and AI and that enable a deeply customised experience.

tech2: What are the challenges faced when recommending multilingual stickers on the Hike platform? Do you only rely on mapping what's being typed in the chat window to stickers on your platform or is there something more to it?

Dr Ankur Narang: One of the challenges we faced was that we wanted the sticker recommendation model to be able to capture the nitty-gritty of the chat input. Unlike formal text, many orthographic variations of a phrase were observed in our anonymised chat data. For example, we figured that there can be more than 350 variants of "kya kar raha hai?". It was a challenge to ensure that the phrase variations used by our users fetched the same results as the original phrases.
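To make the orthographic-variation problem concrete, here is a minimal Python sketch that collapses several spellings of "kya kar raha hai?" into one key. The function name `variant_key` and the rule it applies (collapse repeated letters, then drop non-leading vowels) are assumptions chosen for illustration. This is a crude hand-written stand-in, not the method Hike describes, and it would miss many of the variants seen in real chat data.

```python
import re

VOWELS = set("aeiou")

def variant_key(message):
    """Reduce a romanised chat message to a rough key shared by its spelling variants.

    Hypothetical helper: a hand-written rule, not a learned model.
    """
    # Lowercase and keep only letters and whitespace.
    text = re.sub(r"[^a-z\s]", "", message.lower())
    # Collapse runs of the same character ("haiii" -> "hai").
    text = re.sub(r"(.)\1+", r"\1", text)
    keys = []
    for word in text.split():
        # Keep the first letter, then drop the remaining vowels ("raha" -> "rh").
        keys.append(word[0] + "".join(ch for ch in word[1:] if ch not in VOWELS))
    return " ".join(keys)

variants = ["kya kar raha hai?", "Kya kr rha h", "kyaa karr rahaa haiii"]
print({variant_key(v) for v in variants})
```

All three spellings reduce to the same key, which is the property a recommendation system needs so that each variant fetches the same stickers as the original phrase.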

Our product is evolving very fast and so are our stickers. Therefore, another challenge that we faced was to train an end-to-end sticker prediction model directly from the skewed and dynamic data. As the recommendation has to be updated for each letter typed, the latency of the model should be minimal to avoid any lag while the user is typing. This is possible only if the recommendation is computed on the device. That puts strict constraints on the runtime memory, CPU and storage that the model can use, which was the third major challenge we faced.

To solve the above-mentioned challenges, we decomposed the sticker recommendation model into two steps:

Message prediction: Depending upon the input (the last message received from the other user and the text being typed), we predict a message the user is likely to type. In order to make the message prediction more robust to orthographic variations of a message, we predict a cluster corresponding to the message rather than the message itself.

Sticker mapping: After predicting the relevant message cluster, we show the user the stickers that are mapped to it. We mapped stickers to each message cluster with the help of the tags associated with each sticker and the sticker's historical usage.
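The two steps can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the lookup tables, cluster ids, sticker ids and function names are all hypothetical, and prefix matching over a fixed table stands in for the trained prediction model, which also takes the last received message into account.

```python
from collections import Counter

# Hypothetical toy data; every name and value here is an illustrative assumption.
# Known message variants -> message cluster.
VARIANT_TO_CLUSTER = {
    "kya kar raha hai": "what_are_you_doing",
    "kya kr rha h": "what_are_you_doing",
    "kya kar rha hai": "what_are_you_doing",
    "good morning": "good_morning",
    "gud mrng": "good_morning",
}

# Message cluster -> stickers mapped to it (via tags and past usage in the article).
CLUSTER_TO_STICKERS = {
    "what_are_you_doing": ["sticker_kya_kar_raha_hai", "sticker_whats_up"],
    "good_morning": ["sticker_good_morning_sun"],
}

def predict_clusters(typed_text, top_k=2):
    """Step 1: rank clusters by how many known variants start with the typed prefix."""
    prefix = typed_text.strip().lower()
    if not prefix:
        return []
    votes = Counter(
        cluster
        for variant, cluster in VARIANT_TO_CLUSTER.items()
        if variant.startswith(prefix)
    )
    return [cluster for cluster, _ in votes.most_common(top_k)]

def recommend_stickers(typed_text, top_k=2):
    """Step 2: return the stickers mapped to the predicted clusters."""
    stickers = []
    for cluster in predict_clusters(typed_text, top_k):
        stickers.extend(CLUSTER_TO_STICKERS.get(cluster, []))
    return stickers

print(recommend_stickers("kya k"))
print(recommend_stickers("gud"))
```

Because the prediction targets a cluster rather than an exact message, a partially typed "kya k" and a differently spelled "kya kr" both lead to the same stickers.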

Sticker recommendation on Hike. Image: Hike

tech2: What if a particular text string isn't mapped to a sticker? How does machine learning happen then?

Dr Ankur Narang: Stickers themselves have click patterns that can be used to score each sticker on the basis of its usage on the platform. This score can help in ordering the stickers when recommending them.
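One simple way to turn click patterns into an ordering is a smoothed click-through rate. The Python sketch below is an assumption about how such a score could look, not Hike's actual formula; the function names, the prior values and the sample counts are all hypothetical.

```python
def usage_score(clicks, impressions, prior_clicks=1.0, prior_impressions=20.0):
    """Smoothed click-through rate.

    The prior (an assumed value) keeps a sticker with very few impressions
    from jumping straight to the top or bottom of the list.
    """
    return (clicks + prior_clicks) / (impressions + prior_impressions)

def rank_stickers(stats):
    """Order sticker ids by usage score, highest first.

    `stats` maps a sticker id to a (clicks, impressions) pair.
    """
    return sorted(stats, key=lambda sid: usage_score(*stats[sid]), reverse=True)

# Hypothetical counts for three stickers.
stats = {
    "sticker_a": (50, 1000),
    "sticker_b": (2, 2),
    "sticker_c": (300, 2000),
}
print(rank_stickers(stats))
```

With these numbers, "sticker_b" has a perfect raw click rate from only two impressions, yet the smoothing places it below "sticker_c", which has far more evidence behind it.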

tech2: How do you solve the issue of too many stickers? Many times, I don't want to go through the hassle of searching for stickers, but just want it to be an intuitive experience.

Dr Ankur Narang: With stickers in over 40 languages and dialects, discovery has become a big challenge for our users. Naturally, as the content grew, it has become more difficult to find the right Sticker at the right time, in real time, to have a truly meaningful conversation.

There are four things we're doing in Hike Sticker Chat that lay the foundation for delivering the right sticker at the right time for the right relationship:

  • Sticker Suggestions: The most obvious way to begin solving this problem was to go after the most used funnel, i.e. piggyback on users typing. Sticker Chat has something we call Sticker Suggestions as you type. This is where the new ML changes really kick in. You don't even have to type the entire word. Based on our understanding of many variables such as time of the day, your personal tastes, the relationship of a chat (and so much more), we help you find the Right Sticker at the Right Time.
  • Quick Follow: With Quick Follow, once you've sent a Sticker, you can tap on it to find Stickers that you can 'follow' the previous one with.
  • Quick Reply: What if you received a Sticker? Same applies. You can just tap on the Sticker and voila, you'll find incredibly personalised Sticker recommendations to Reply with. It's lightning-fast.
  • Text to Stickers: Often we just don't have the right recommendation or the right content available for what you may want to say. But you may still want to be expressive. What about those scenarios? For that, we have Text to Stickers! Just type something and, in case you don't find a Sticker you like, you get a beautifully crafted Text to Sticker based on a combination of a few fonts and colours that we automatically put together.

tech2: With Hike Sticker Chat, do you have to download the sticker packs for the stickers to show up in context to the text being typed? Or does Hike ping its central sticker servers every time there is a context for a sticker?

Dr Ankur Narang: Hike Sticker Chat shows stickers in the context of the text being typed, even if the packs aren't downloaded.

tech2: Which markets outside India are you planning to expand to and any particular non-Indian language that is being trained on Sticker packs?

Dr Ankur Narang: We believe that what we're solving could be relevant across similar emerging markets where users are looking for richer experiences and social niches. With TensorFlow and TensorFlow Lite, the kind of work Hike has accomplished is pretty unique in the country. We have brought in the hyper-personal and hyper-local elements by addressing multiple local languages and by doing sticker recommendations in a very effective fashion at a low latency of a few milliseconds.

tech2: Apart from sticker recommendations, what other AI innovations are happening on the Hike messenger?

Dr Ankur Narang: Hike today is an AI-led unicorn. As we operate at the intersection of AI and art, we are deeply committed to enabling not just our product but the ecosystem further.

The areas of AI and ML research at Hike include:

  • Natural Language Processing (NLP): Creating contextual experiences by making sense of chat in different Indian languages
  • Orthographic variations of chat messages, and computational limitations on low-end smartphones
  • Computer Vision: Reinventing audio and visual communication
  • Social Network Analysis (SNA): Mining large scale networks for connecting people and driving business growth.

We are building radically unique products at the intersection of Product, Design, Engineering and Art, and doing cutting-edge work across NLP, computer vision and more. We have been able to make some significant leaps here, from showcasing at renowned global platforms such as TensorFlow World, IJCAI and ECIR to the live application of research through our powerful sticker recommendation and discovery features.

With a huge focus on research and innovation, we have also led partnerships with local academia such as Indraprastha Institute of Information Technology Delhi (IIIT-D). Our unique advantage of enabling locally relevant research, especially for a diverse market like India, has been key for these partnerships. Cultivating a culture of AI innovation has immense benefits both for Hike and for the ecosystem. As an initiative in this direction, we have recently launched the Hike Patent Programme, which not only incentivises Hike employees with rewards and grants but also lends legal and market guidance to prospective patent filers. Hike was also recently recognised by the Government of India as one of the top three patent filers in the country.
