Key Principles of Building Machine Learning (ML) Virtual Assistants

Peiru Teo
March 7, 2017
Avoid a “clueless virtual assistant” situation by equipping your virtual assistant with past data from your business before it goes live.

For most of us, building a virtual assistant serves a real business purpose and isn’t just an amusing project. Since a virtual assistant is meant to serve or interact with customers in some way, it should be given as much information and intelligence as possible to interface properly with customers. The sooner the virtual assistant learns from this data, the better, and it’s best when this happens before the virtual assistant even goes live.

There are a few reasons why this makes absolute sense:

  1. You already have data sitting around that can be used to avoid a “cold start”, which adds friction to the user experience and causes big drop-offs in use when the virtual assistant fails to reply effectively.
  2. You are investing a lot of time and effort into building a virtual assistant, and expect it to serve a strategic purpose in the grand scheme of things.
  3. Nobody likes a virtual assistant that keeps saying “Sorry, I didn’t understand your question. Can you rephrase?”. Nobody.

Raiding your support archives

A customer-facing virtual assistant naturally lends itself to using information from customer support tickets. Support tickets are a treasure trove that illuminates the concerns and issues your customers already have; they can inform your decisions on which features to build into the virtual assistant and help you anticipate the kinds of questions it should be able to answer. Since these are actual, concrete data points, you don’t run the risk of imagining problems and solutions unrelated to your customers. From our experience, this is the best way.

Frequently asked questions are another great resource for provisioning your virtual assistant with a first cut of replies. Since these are distilled from repetitive queries (hopefully, and not just imagined), they can be the baseline from which your virtual assistant draws responses. However, not all FAQs are updated regularly or relevant to customers, so be sure to vet these questions before popping them into your virtual assistant wholesale.

Training the virtual assistant

While it is ideal to use past tickets to train your virtual assistant, it can be challenging if you or your virtual assistant provider lack the requisite expertise to analyze and extract the important information at scale. If you have a large archive that you believe will be important to aggregate, you will need some knowledge of text mining and natural language processing to classify, topic-model and summarize this data into usable nuggets for your virtual assistant development process.

You can, of course, take the time to manually go through the tickets and categorize them yourself, or get someone on your team to do so. Regardless, be sure to select a representative sample of tickets across these parameters (a sampling sketch follows this list):

  • Products — to represent all product types that you carry
  • Time frame — to account for events or seasonal fluctuations
  • Account type — to capture the variety of customer accounts
  • Priority — to understand differences in urgent and normal issues
  • Resolution time — to sample how easily/quickly different issues are resolved (the easier or faster, the more likely your virtual assistant can handle them automatically)
  • Tags or categories — to make use of the human-classified data for training

There are many other factors you can use to sample your customer support tickets for representativeness, depending on your business and product structures.
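For illustration, here is a minimal sketch of how such a stratified sample could be pulled with pandas. The file name and column names (product, account_type, priority, created_at, resolution_hours) are hypothetical; map them to whatever your helpdesk actually exports.

```python
# A minimal sketch of stratified ticket sampling with pandas. The file and
# column names below are hypothetical; adapt them to your helpdesk export.
import pandas as pd

tickets = pd.read_csv("support_tickets.csv")  # hypothetical export

# Sample up to 50 tickets per (product, account_type, priority) combination
# so every segment of your support traffic is represented.
sample = (
    tickets
    .groupby(["product", "account_type", "priority"], group_keys=False)
    .apply(lambda g: g.sample(n=min(len(g), 50), random_state=42))
)

# Sanity-check that the sample also spans the full time frame and a spread
# of resolution times before handing it over for labeling.
print(sample["created_at"].min(), sample["created_at"].max())
print(sample["resolution_hours"].describe())
sample.to_csv("tickets_for_labeling.csv", index=False)
```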

Without machine learning and NLP expertise, the hard part is classifying and tallying a large number of tickets by hand. Classifying even 1,500 to 2,000 tickets would be overwhelming for the average person and could take anywhere from 3 days to 2 weeks, depending on the complexity of the issues and the time availability of your “human labor”.

Assuming you do have some expertise at hand, you can consider semi-supervised training: start with the labeled instances for classification, run some topic modeling (for example, through a hybrid vector-based LDA), and have some fun experimenting with different models and hyperparameters to see what does well on your dataset.
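As a rough illustration, the sketch below uses scikit-learn for both steps: an ordinary bag-of-words LDA to surface recurring topics, and a TF-IDF plus logistic-regression classifier trained on the labeled tickets. It is a simplified stand-in, not the hybrid vector-based LDA itself, and labeled_texts, unlabeled_texts and labels are hypothetical lists of ticket bodies and intent labels.

```python
# A rough sketch of the training step with scikit-learn. Plain LDA plus a
# TF-IDF + logistic-regression intent classifier; labeled_texts,
# unlabeled_texts and labels are hypothetical ticket bodies and intents.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = labeled_texts + unlabeled_texts  # all ticket bodies, labeled or not

# 1. Topic modeling over all tickets to surface recurring themes.
counts = CountVectorizer(max_features=5000, stop_words="english")
X_counts = counts.fit_transform(texts)
lda = LatentDirichletAllocation(n_components=20, random_state=0)
lda.fit(X_counts)

# Print the top words per topic to help you name candidate intents.
terms = counts.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top_words = [terms[j] for j in topic.argsort()[-8:]]
    print(f"topic {i}: {top_words}")

# 2. Supervised intent classifier trained on the tickets you labeled.
#    (sklearn.semi_supervised.SelfTrainingClassifier can extend this to
#    the unlabeled tickets if you want a semi-supervised loop.)
tfidf = TfidfVectorizer(max_features=20000, ngram_range=(1, 2))
X_labeled = tfidf.fit_transform(labeled_texts)
clf = LogisticRegression(max_iter=1000).fit(X_labeled, labels)
```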

At the end of your training phase, you should have a relatively stable and production-ready model that can classify incoming queries that your virtual assistant receives from users. To maximize the effort you put into training this model, you can package it behind an API and let your virtual assistant call that API whenever it needs to classify new queries from customers.
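A minimal sketch of that packaging step is shown below, assuming the vectorizer and classifier were pickled at the end of training and using Flask purely as an example framework; the endpoint name, payload shape and file names are made up for illustration.

```python
# A minimal sketch of serving the trained classifier over HTTP so the
# virtual assistant can call it. Endpoint name, payload shape and pickle
# file names are made up for illustration.
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical artifacts saved at the end of the training phase.
with open("vectorizer.pkl", "rb") as f:
    vectorizer = pickle.load(f)
with open("intent_clf.pkl", "rb") as f:
    clf = pickle.load(f)

@app.route("/classify", methods=["POST"])
def classify():
    query = request.get_json()["query"]
    features = vectorizer.transform([query])
    intent = clf.predict(features)[0]
    confidence = clf.predict_proba(features).max()
    return jsonify({"intent": str(intent), "confidence": float(confidence)})

if __name__ == "__main__":
    app.run(port=5000)
```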

Testing the results of training

It’s really important to test and retest the results of your trained model with real users. Sometimes (actually, often) your model will overfit the data you trained it on and turn out to be useless in the face of new data. Hence, beyond the held-out dev/test splits you use to evaluate the model during training, you should put the first iteration of the virtual assistant equipped with this model in front of users.
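For the offline part of that evaluation, a quick sketch like the following (reusing the hypothetical X_labeled features and labels from the training sketch above) makes the overfitting check concrete: a large gap between train and test scores is the warning sign.

```python
# A quick sketch of the offline check: hold out a test split and compare
# train vs. test scores. X_labeled and labels are the hypothetical features
# and intent labels from the training sketch above.
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_labeled, labels, test_size=0.2, stratify=labels, random_state=0
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("train accuracy:", clf.score(X_train, y_train))
print("test accuracy: ", clf.score(X_test, y_test))  # a big gap hints at overfitting
print(classification_report(y_test, clf.predict(X_test)))
```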

This is usually achieved during the trial phase or pilot/beta launch of your virtual assistant, when you are testing the experience with a group of customers or users. Since you probably won’t have that many users at this stage, take the time to look through each response and figure out whether there are systematic patterns of examples the model ignored or misclassified.

Also, take advantage of the virtual assistant’s ability to ask users for feedback, such as a “Did I answer your question? [Yes/No]” prompt appended to the end of every response, as a way to collect structured data for continued training of the supervised classification model you have set up.
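One way to capture that structured feedback is sketched below; the log format and field names are invented for illustration, and confirmed rows can later be folded back into the supervised training set.

```python
# A small sketch of logging the [Yes/No] feedback as structured data. The
# log format and field names are invented for illustration.
import csv
from datetime import datetime, timezone

def log_feedback(query: str, predicted_intent: str, helpful: bool,
                 path: str = "feedback_log.csv") -> None:
    """Append one feedback event; confirmed rows can later be folded back
    into the supervised training set."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([
            datetime.now(timezone.utc).isoformat(),
            query,
            predicted_intent,
            int(helpful),
        ])

# Example: the user clicked "Yes" after an answer classified as "billing".
log_feedback("why was I charged twice this month?", "billing", True)
```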

Launch and optimize

After the trials, it’s time to launch your virtual assistant — and really see what happens to your virtual assistant’s response-abilities in the field. If all goes well, you may be able to answer 50–80% of questions properly at launch. In this launch period, continue to observe and note what kind of questions people are asking the virtual assistant, and factor that into your next version for optimization.

Realistically, you will have a lot more tuning to do before the virtual assistant works at the level you expect. Be patient: every 2–4 weeks, expect to update your model with more information if you can afford to. The more data you have amassed, the better; there are almost no exceptions to this rule (except perhaps when the data is of low quality or improperly handled).

Want to go deeper?

So you have heard about the wonders of advanced techniques in deep learning and you feel that you can beat the performance of your vanilla machine learning model with them. Yes, you’re probably right — especially if you want your virtual assistant to be able to interpret the context of sentences or turns. Deep learning in the context of virtual assistants is probably most useful with LSTMs (or GRUs), and can improve the experience over time.
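To make the idea concrete, here is an illustrative sketch of a small LSTM intent classifier in Keras; the vocabulary size, sequence length, layer sizes and number of intents are placeholder values, and you could swap the LSTM layer for a GRU.

```python
# An illustrative sketch of a small LSTM intent classifier in Keras.
# Vocabulary size, sequence length, layer sizes and the number of intents
# are placeholder values.
from tensorflow.keras import layers, models

VOCAB_SIZE = 10000   # assumed tokenizer vocabulary size
MAX_LEN = 50         # assumed padded query length
NUM_INTENTS = 20     # assumed number of intent classes

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(VOCAB_SIZE, 128),
    layers.LSTM(64),                      # or layers.GRU(64)
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_INTENTS, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(padded_queries, intent_ids, validation_split=0.1, epochs=10)
```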

However, don’t jump straight into this option until you have tested the simpler machine learning models first: it’s important to understand your baseline performance, and it’s also hard to find vendors or engineers who can tune these models well enough to be useful in production. Many of the techniques (at the time of writing) are either still in, or fresh out of, research labs, and may not yet be robust enough for mass production in front of your customers. Additionally, you’ll need a significantly bigger dataset, likely orders of magnitude bigger than what you may have, to build a meaningful deep learning model.

Hopefully, this was a useful primer to get you thinking about and planning the use of data in your virtual assistant’s dialog capabilities. We’re always open to chatting about the newest research or ideas you have; you can find us at KeyReply digging into and testing new research models for our own virtual assistants!