Sunday, August 19, 2018

Intent Prediction

Intent prediction is an optimization technique similar to web navigation prediction on websites, and many of the same machine learning algorithms can be applied to it.

When a user is chatting with a bot, the user's requests are run through an intent classifier to figure out the intent. The model also gives us the entities captured from the request. The goal is to be able to predict the next intent (and entities) of the user so that we can prefetch the data in order to reduce latency.

This can be achieved with many types of sequence prediction algorithms, such as variable-order Markov models, CPT+, LSTMs, etc.

Before we can use any sequence prediction algorithm, we need a way to create sequences from the chat history of a user and a bot. The chat history is essentially a sequence of intent+entities and responses+parameters.

User: What's my vacation balance?
Bot: you have 10 days left
User: Do I have any shift on Friday?
Bot: yes, you are working Friday at 9am
User: Give my shift to John
Bot: ok, done
User: Submit a vacation request for Friday
Bot: you have a meeting on Friday
User: Cancel that
Bot: ok, cancelled
User: Submit my vacation request
Bot: ok

In the above conversation sequence, we want to be able to predict that the user is going to submit a vacation request, so that we can pre-fetch the meeting information to immediately notify the user of it (instead of doing a validation during the vacation submission).

We can create a sequence of hashes as follows:

User: hash(GetBalance_vacation_req)
Bot: hash(ResponseCode1_10_res)
User: hash(GetSchedule_friday_req)
Bot: hash(ResponseCode2_friday_morning_res)
User: hash(TradeShift_req)
Bot: hash(ResponseCode3_res)
User: hash(TimeAwayFromWork_vacation_friday_req)
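A minimal sketch of turning classified conversation turns into such a hash sequence. The turn format, the `item_hash` helper, and the use of a truncated MD5 digest are my own illustrative assumptions (Python's built-in `hash()` is salted per process, so a stable digest is used instead):

```python
import hashlib

def item_hash(*parts):
    # Stable hash of an intent/response code plus its entities.
    # (md5 truncated to 8 hex chars is an arbitrary illustrative choice.)
    key = "_".join(parts)
    return hashlib.md5(key.encode("utf-8")).hexdigest()[:8]

def conversation_to_sequence(turns):
    # turns: list of (speaker, code, entities) tuples, as produced by
    # a hypothetical intent classifier / response generator.
    return [item_hash(code, *entities, "req" if speaker == "user" else "res")
            for speaker, code, entities in turns]

turns = [
    ("user", "GetBalance", ["vacation"]),
    ("bot",  "ResponseCode1", ["10"]),
    ("user", "GetSchedule", ["friday"]),
    ("bot",  "ResponseCode2", ["friday", "morning"]),
    ("user", "TradeShift", []),
    ("bot",  "ResponseCode3", []),
    ("user", "TimeAwayFromWork", ["vacation", "friday"]),
]
sequence = conversation_to_sequence(turns)
```

Because the entities are part of the hashed key, GetBalance for vacation and GetBalance for sick days map to different items, which is exactly the distinction we need below.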

Note that it's not enough to know that the user asked for GetBalance (ignoring the vacation type): if the user asked for their sick day balance, they are not necessarily looking to take a sick day off on Friday (you don't take sick days in advance).


This sequence of hashes uniquely identifies a conversation. We can then look at the history of conversation sequences and try to predict the next request.

The CPT+ algorithm can be used to train a model to predict the next item in a sequence. In our case, each item will be the hash code of a request/response.
The items in a CPT+ model are stored as nodes in a trie data structure.
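To show the trie idea concretely, here is a much-simplified sketch: training sequences are stored as paths in a trie, and prediction walks the trie along the observed prefix. The real CPT+ algorithm adds an inverted index, a lookup table, and compression strategies on top of this, so treat the class below as an illustration of the data structure only, not as CPT+ itself:

```python
class PredictionTrie:
    """Toy prediction tree: nodes are plain dicts keyed by sequence items."""

    def __init__(self):
        self.root = {}

    def insert(self, sequence):
        # Store a training sequence as a path of nodes in the trie.
        node = self.root
        for item in sequence:
            node = node.setdefault(item, {})

    def predict_next(self, prefix):
        # Walk the trie along the prefix; the children of the final node
        # are the items that have followed this prefix in training data.
        node = self.root
        for item in prefix:
            if item not in node:
                return set()
            node = node[item]
        return set(node.keys())

trie = PredictionTrie()
trie.insert(["GetBalance", "GetSchedule", "TradeShift", "TimeAwayFromWork"])
trie.insert(["GetBalance", "GetSchedule", "SubmitTimesheet"])
candidates = trie.predict_next(["GetBalance", "GetSchedule"])
# -> {"TradeShift", "SubmitTimesheet"}
```

In practice the items would be the request/response hash codes from the previous section rather than raw intent names.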


Our vocabulary will be the set of all hash codes generated from all combinations of useful requests and responses.


Work in progress on GitHub: Intent Prediction



References:

CPT (Compact Prediction Tree)
An Introduction to Sequence Prediction
A Sequence Prediction Framework




Sunday, August 5, 2018

Semantic Paraphrasing

Semantic paraphrasing is not easy!

I am doing an experiment on how to generate semantically similar phrases given an input phrase. So, for example, given "I have a meeting tomorrow", a semantically similar paraphrase would be "I have a meeting scheduled for tomorrow".
A sophisticated way of doing this is to use a neural network, specifically a seq2seq generator. But even after spending a tremendous amount of time and effort training a seq2seq model, the results will still not be perfect.

So, I tried a hack. No neural network of my own. Just good old Google Translate, which uses neural networks behind the scenes anyway.

My experiment is as follows. Given an English phrase, I use the Google Cloud Translation API to convert it into two foreign languages and then back to English. The order is something like...
en -> fr -> es -> en

1. "yes, you are meeting someone tomorrow"
2. "oui, vous rencontrez quelqu'un demain"
3. "sí, te encuentras con alguien mañana"
4. "yes, you meet someone tomorrow"

I am using 2 intermediate translation steps in order to get some variation in the final output. With just one step there is hardly any variation. Of course, it adds quite a bit of latency since I am making multiple cloud calls.
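The round-trip chain itself is simple to sketch. In the snippet below the translator is injected as a callable so the chain logic stands on its own; in practice that callable would wrap the Google Cloud Translation API client (a paid, per-character cloud call), and the `fake_translate` table here just replays the example above for illustration:

```python
def round_trip(text, translate, chain=("fr", "es", "en"), source="en"):
    # Feed the phrase through each language in the chain,
    # ending back at the source language (English).
    src = source
    for target in chain:
        text = translate(text, source=src, target=target)
        src = target
    return text

# Stand-in translator replaying the example from the post; a real
# implementation would call the Google Cloud Translation API here.
fake_table = {
    ("en", "fr", "yes, you are meeting someone tomorrow"):
        "oui, vous rencontrez quelqu'un demain",
    ("fr", "es", "oui, vous rencontrez quelqu'un demain"):
        "sí, te encuentras con alguien mañana",
    ("es", "en", "sí, te encuentras con alguien mañana"):
        "yes, you meet someone tomorrow",
}

def fake_translate(text, source, target):
    return fake_table[(source, target, text)]

paraphrase = round_trip("yes, you are meeting someone tomorrow", fake_translate)
# -> "yes, you meet someone tomorrow"
```

Adding more intermediate languages to `chain` gives more variation in the output, at the cost of one extra cloud call (and more latency) per hop.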

But it works! Sort of. It gives me just the right amount of semantic paraphrasing that I wanted. Not perfect, but good enough. It will need some tweaking to correct for weird mistakes.

The use case for this thing is in building a domain-specific chatbot. When my chatbot responds to a question from the user, it picks a random response from a pool of semi-hard-coded responses.
The first problem to solve is how to pick the response from the pool that is closest to the question asked. For that I can use the edit distance between the question and each response in the pool.

Now the bigger question is, how do I generate this pool given a few seed responses. This is where the semantic paraphrasing comes into play.

Unfortunately, Google (or Microsoft) doesn't give you a downloadable translator. There is no good way to work around the cloud calls. So, to optimize the process, all the paraphrases must be generated as a preprocessing step.

All this trouble just to give the bot a human touch!


The code is on GitHub: Semantic Paraphrasing