Sunday, April 29, 2018

Award Winning Social Bot 'Sounding Board'




Five research students at the University of Washington have built a social bot named ‘Sounding Board’ that tries to engage users in a meaningful voice conversation on any topic. They used Amazon’s Alexa SDK for speech-to-text and text-to-speech conversion. Once the bot understands the user’s intent, it searches the web for relevant content and tries to keep a reasonably meaningful conversation going. The average conversation with ‘Sounding Board’ lasts 10 minutes and 22 seconds before the user gives up on it because of an inappropriate response. It is an AWS cloud-based solution.

Sample demo video


People are trying to move bots from their current status of ‘menial job doer’ to that of an ‘intelligent companion’.

Saturday, February 3, 2018

An interesting developer survey


Some interesting facts…


  • 25% of the developers surveyed started programming before they were 16 years old
  • On average, a developer knows 4 languages/technologies (choices for enterprise developers: .NET, Java, JS, TS, Angular, React, NodeJS; one may need to venture into Python as well because of its support for data science/machine learning)
  • Stack Overflow and YouTube are the top developer resources
  • JavaScript is the language employers want most (they want it in the stack: JS, Angular and/or React, NodeJS, TypeScript)
  • Python is the language developers like most
  • When it comes to the most loved frameworks, JS-based frameworks take all the top positions (as expected), with NodeJS being number one
  • Employers look at GitHub contributions (basically a developer's portfolio) more than the resume or experience. A developer should have a personal project or contribute to an open-source project on GitHub in addition to company-assigned work

Wednesday, January 3, 2018

Building Chat-Bots using Google's Dialogflow

My very first encounter with a chat-bot happened a couple of years back when I was trying to lodge a complaint with Amazon customer service. I felt that I had been given a poor delivery experience by the logistics company shipping on behalf of Amazon, and I was miffed. I was trying to find out where to register complaints on the Amazon site when a pop-up appeared in the lower right corner of the web page offering a conversation with an executive. I wasted no time in accepting the offer. After the initial pleasantries, I took time to write the best possible statement that captured my issue along with my emotions. By any instant messaging standard, it was a big paragraph rather than the usual bits-and-pieces chat message. The moment I hit 'Send', I was greeted with an even bigger response in good English explaining why and how it could have happened and promising to fix the problem. Only two things were running through my mind at that time:

  • The executive was extremely fast and smart
  • My problem was not that unique; Amazon faces these issues day in and day out and has standard response templates for them

Even then, I was slightly perplexed: could someone read my long statement, grasp the issue, select the best response template, customize it as needed and send it to me in a fraction of a second? I continued the conversation. All my questions and messages were answered in rapid-fire mode, in good English and with NO chatty, informal replies, even though I felt the executive's responses drifted slightly off track at times. By the time I ended the conversation, my mind was swamped with one thought: was I talking to a human executive?

We got serious about building a bot ourselves a year ago when one of our flagship customers challenged us to develop a virtual conversational agent that could take the load off his customer services team and relieve them of regular, mundane queries and service requests. Since we had been working with the customer for a very long time, we were able to choose the use case that we thought was right for chat-bots and took the plunge. Even by the second half of 2016, we were able to locate some 20 bot frameworks; 4 or 5 were really good in vision, even though the vision was not matched by a good implementation. We explored Microsoft Bot Framework, Facebook's Wit.ai, Google's Dialogflow (formerly API.ai) and some nameless frameworks that eventually faded away and can no longer be found even with Google.

I took a special interest in Dialogflow because I thought it had low entry barriers (I am not responsible if you understand 'low entry barriers' as 'credit card number not required' :)). I was stunned by its beautiful, elegant UI/UX that abstracted away complex natural language processing (NLP) algorithms. For a curious developer without any background in advanced mathematics, statistics or computational algorithms, Dialogflow offered a great launching pad for NLP/ML-based conversational apps.

Creating a Conversational Model in Dialogflow

How do we interpret what others say? We need to infer:

  • What's being spoken about? a topic, a movie, a personality, a product, a selfie or any 'thing'?
  • And what they are saying about the 'thing'? good, bad, great, hopeless, want, don't want and so on

Dialogflow calls the first one 'entities' and the second one 'intents' (most NLP technologies use the same terms). Modeling a conversational solution involves identifying the entities and intents of the problem domain. A complex flow may include hundreds of entities and intents. If you are modeling a 'pizza order' bot, then the different pizza varieties, sizes, toppings, thin or deep crust and desserts are all essentially entities. As the modeler, you need to define all the different ways in which a user may refer to an entity (in Dialogflow terminology, these are called 'synonyms').
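To make the idea of entities and synonyms concrete, here is a minimal sketch in Python. It is not how Dialogflow works internally (Dialogflow uses ML rather than plain substring matching), and the entity names, values and the extract_entities helper are all hypothetical, made up for the pizza example.

```python
# Hypothetical entity definitions for a pizza-order bot: each reference value
# maps to the synonyms a user might say. All names here are illustrative only.
ENTITIES = {
    "pizza_type": {
        "margherita": ["margherita", "margarita", "plain cheese"],
        "veggie garden": ["veggie garden", "garden veggie", "vegetarian"],
    },
    "crust": {
        "thin": ["thin", "thin crust", "crispy"],
        "deep": ["deep", "deep dish", "deep crust"],
    },
}


def extract_entities(utterance: str) -> dict:
    """Return {entity_name: reference_value} for synonyms found in the utterance.

    This is naive substring matching, used only to illustrate the mapping
    from many synonyms to one reference value.
    """
    text = utterance.lower()
    found = {}
    for entity_name, values in ENTITIES.items():
        for reference_value, synonyms in values.items():
            if any(synonym in text for synonym in synonyms):
                found[entity_name] = reference_value
                break  # keep the first matching value for this entity
    return found
```

Whatever synonym the user picks, the rest of the bot only ever sees the reference value, for example 'veggie garden' for "garden veggie".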

Intents could be asking about the menu, wanting to know the special offers and discounts, ordering dinner, asking for the bill etc. An intent may need to be aware of the context in which it can occur. For example, the 'ask for bill' intent can occur only after the 'order pizza' intent. Dialogflow lets you define input contexts (when can this intent possibly occur?) and output contexts (announcing to other members of the model that something has already happened). Continuing with the same example, the output context set by the 'order pizza' intent serves as the input context for the 'ask for bill' intent.
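The input/output context idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the Intent class, the intent names and the 'pizza_ordered' context are hypothetical, and details of real Dialogflow contexts such as their lifespan are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    name: str
    input_contexts: set = field(default_factory=set)   # must all be active
    output_contexts: set = field(default_factory=set)  # activated on match


# Hypothetical model: 'ask_for_bill' is only possible once a pizza is ordered.
INTENTS = [
    Intent("show_menu"),
    Intent("order_pizza", output_contexts={"pizza_ordered"}),
    Intent("ask_for_bill", input_contexts={"pizza_ordered"}),
]


def eligible_intents(active_contexts: set) -> list:
    """Names of intents whose input contexts are all currently active."""
    return [i.name for i in INTENTS if i.input_contexts <= active_contexts]


def fire(intent_name: str, active_contexts: set) -> set:
    """Match an intent and return the updated set of active contexts."""
    intent = next(i for i in INTENTS if i.name == intent_name)
    if not intent.input_contexts <= active_contexts:
        raise ValueError(f"{intent_name} is not allowed in this context")
    return active_contexts | intent.output_contexts
```

At the start of a conversation only 'show_menu' and 'order_pizza' are eligible; once 'order_pizza' fires, 'pizza_ordered' becomes active and 'ask_for_bill' becomes eligible too.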

Dialogflow learns an intent from a few samples of 'User says' (i.e. example user utterances) supplied by the modeler. These can be considered the initial training data. For example, someone who wants to know the menu may express it in different ways ('what do you have for lunch today?', 'menu card please' or 'do you have veggie garden?'). The modeler is expected to train the model with these sample utterances. The quality of this sample data determines the accuracy of the bot itself, as Dialogflow uses machine learning (ML). In the 'User says' samples, you can optionally mark the entities: 'lunch' could be an entity, 'menu card' could be an entity, 'veggie garden' could be an entity. You can make some entities mandatory in some intents. For example, the user must state what (the 'entity') they are ordering in the 'order pizza' intent. When you make the entity mandatory, the bot will prompt the user to specify which pizza they want if the order does not mention it.
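The behaviour of mandatory entities can be sketched roughly as follows. The REQUIRED table, the parameter names and the prompt texts are hypothetical; in Dialogflow itself this is configured in the console rather than written as code.

```python
# Hypothetical required parameters per intent, each with the question the bot
# asks when the user has not supplied that value yet.
REQUIRED = {
    "order_pizza": {
        "pizza_type": "Which pizza would you like?",
        "size": "What size should it be?",
    },
}


def next_prompt(intent: str, parameters: dict):
    """Return the prompt for the first missing required parameter, or None
    when the intent has everything it needs and can be fulfilled."""
    for name, prompt in REQUIRED.get(intent, {}).items():
        if not parameters.get(name):
            return prompt
    return None
```

The bot keeps asking until next_prompt returns None, at which point the order is complete enough to act on.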

Confidence Score

The bot is trained with the 'User says' utterances and entities so that it can identify the user's intent. During the initial stages, your training data may not be adequate and the bot will have difficulty identifying the intent. Hence it reports a 'Confidence Score' whenever it matches a user's utterance with a specific intent. As the modeler, you can configure the threshold for the confidence score. If the confidence score is below the threshold, the bot will not match the user's utterance with the intent.
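The thresholding step described above amounts to something like the following sketch. The scores are assumed to come from the trained model; the threshold value of 0.5 and the 'fallback' intent name are illustrative assumptions, not Dialogflow defaults.

```python
def match_intent(scores: dict, threshold: float = 0.5, fallback: str = "fallback") -> str:
    """Pick the highest-scoring intent, or the fallback when even the best
    candidate scores below the configured threshold.

    `scores` maps intent names to confidence values between 0 and 1.
    """
    if not scores:
        return fallback
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else fallback
```

A lower threshold makes the bot answer more often but raises the risk of a wrong match; a higher one makes it more cautious and sends more utterances to the fallback.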

Intent Response 

Dialogflow lets the modeler configure the intent responses. When the bot matches a user's utterance with a specific intent, it delivers the response configured for that intent to the user.

Integration with Databases and 3rd Party Services

You might be wondering whether you need to be a developer to model a domain. Mostly you do not; it will suffice if you know the domain that you are modeling. But most real-world scenarios require some additional business logic in the form of data external to the bot, or 3rd party web services such as a news feed from a media house or weather information from the meteorological department. When you are faced with such a scenario, you may need to write code to connect to the external databases and services. The bot can connect to any RESTful service (created using technologies like NodeJS/Express, ASP.NET MVC Web API, Spring MVC etc.). For example, if the user asks for the menu, the bot may need to get the available menu items from the restaurant chain's product catalog, which might be defined globally. GitHub has sample code for these webhook services.
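As a rough sketch of what such a webhook does, the function below takes the JSON body the bot posts to your service and returns the JSON reply. The field names (queryResult, intent.displayName, parameters, fulfillmentText) assume the Dialogflow v2 webhook format, so check them against the current documentation; the MENU catalog, the 'show_menu' intent and the 'meal' parameter are hypothetical. In a real service this function would sit behind an HTTP POST endpoint.

```python
# Hypothetical product catalog; a real bot would query a database or a
# RESTful service owned by the restaurant chain instead.
MENU = {
    "lunch": ["margherita", "veggie garden"],
    "dinner": ["pepperoni", "farmhouse"],
}


def handle_webhook(request_body: dict) -> dict:
    """Build a webhook response for a parsed webhook request body.

    Field names are assumed to follow the Dialogflow v2 webhook format.
    """
    query_result = request_body.get("queryResult", {})
    intent_name = query_result.get("intent", {}).get("displayName", "")
    parameters = query_result.get("parameters", {})

    if intent_name == "show_menu":
        meal = parameters.get("meal") or "lunch"  # default when not supplied
        items = MENU.get(meal, [])
        if items:
            text = f"For {meal} we have: {', '.join(items)}."
        else:
            text = f"Sorry, nothing is available for {meal}."
    else:
        text = "Sorry, I cannot help with that yet."

    return {"fulfillmentText": text}
```

Keeping the handler a plain function of the request body makes it easy to test without running a web server.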

Channels

The model that you define is mostly channel agnostic, so the same model can be used across multiple channels with minimal or no change. Dialogflow natively supports popular channels like web, Skype, Slack, FB Messenger, Kik, Alexa etc. You need to register your bot with the respective channel providers. For example, if you want to create a bot that works in Skype, you need to register and integrate the bot with Microsoft Bot Framework. Once done, your bot will sit alongside your Skype contacts and you can start a conversation with it just as you would with one of your contacts.

Training the Bot

Since Dialogflow uses ML algorithms, the efficiency and accuracy of the bot increase as the days go by and you continue to amass real user conversations. Dialogflow provides a rich interface for customized training of the bot, where you can identify additional entities and intents that you might not have thought of before but that users rightfully expect your bot to respond to. Bots may not be accurate all the time, but they improve with valid training data, and the modeler may need to adjust the model based on that data.

Small-Talk

Dialogflow supports a concept called 'Small-Talk' that takes care of normal utterances like 'how are you', 'hi', 'you are welcome' etc. You don't need to do anything other than just enabling the feature in the console


To host the chat-bot for your users, you will need to create it as a Google Firebase project, and you may need to pay depending on different factors. There is a vibrant community that operates on both GitHub and Dialogflow. Both Google and the community have built a lot of pre-built agents. You can refer to them for best practices and some unique ideas. You can export/import bots and invite other users to develop or review a bot using the Dialogflow console and Google accounts. Google has extensive technical documentation for Dialogflow. Having talked about Dialogflow, I would also encourage readers to take a look at Facebook's bot toolkit and Microsoft Bot Framework/LUIS (Language Understanding Intelligent Service).