You (probably) don't need a chatbot

There has been a great hullabaloo in the last few months about the rise of chatbots, and discussions of “conversational UIs” or, even more radically, the concept of “no UI”—the idea that services might not need a UI at all.

This latter concept is quite interesting: I’ve written in the past about one-shot interactions. For these one-shot interactions, UI is clutter. But chatbots aren’t the answer to that problem, because chatbots are UI, just a different sort of UI. Compare…

Scenario 1: App

  1. Alice hears an amazing song playing in a club.
  2. Alice pulls out her iPhone and unlocks it by placing her finger on the TouchID sensor.
  3. Alice searches on the homescreen for the Shazam app.
  4. Alice opens Shazam, then presses the button to start the process of Shazam identifying the song that is currently playing.
  5. Alice waits.
  6. Alice is told what the song is and offered links to stream it or download it from a variety of streaming and download services that vary depending on the day of the week, the cycle of the moon, and how Shazam’s business development team are feeling this week.

Scenario 2: Chat

Someone at Shazam decides that apps are a bit old-fashioned. They have read an article that tells them chatbots are better, and decide to build one based solely on this advice rather than on any empirical evidence.

  1. Alice hears an amazing song playing in a club.
  2. Alice pulls out her iPhone and unlocks it by placing her finger on the TouchID sensor.
  3. Alice searches on the homescreen for the Facebook Messenger app.
  4. Alice opens Facebook Messenger, then locates the existing chat session with the Shazam bot.
  5. Alice scrolls back up the chat to work out what magic phrase she needs to type to trigger the chatbot into listening to the music, then types it.
  6. Alice waits.
  7. Alice is told what the song is and offered whatever extra rich data the chat UI is allowed to show.

As you can see, this is a vast improvement, not because it makes the process less involved or elaborate, but because someone told them that it is new and exciting.

Scenario 3: Idealised One-Shot Interaction

  1. Alice hears an amazing song playing in a club.
  2. Alice taps a button on her smartwatch. Everything else happens in the background. Alice continues partying and enjoying herself rather than being the saddo staring at her phone all night.

For those without a smartwatch, a lockscreen button on the phone could be substituted.

Anyway, this is a slight distraction from the broader point: chatbots are a bit of a silly fad, and they seem to be adopted based on fashion rather than on any actual utility.

But, but, there’s this awesome chatbot I use, and I really like it!

Great. I’m not saying that they have no purpose, but that chatbots are being adopted even though they often are worse at what they do than the alternative. They also come with considerable downsides.

First of all, chatbot UIs are poor at letting a user compare things. When someone browses, say, Amazon or eBay or another e-commerce service, they will often wish to compare products. They’ll open up competing products in different tabs, read reviews, check up on things on third-party sites, ask questions of their friends via messaging apps and social media sites like Facebook. Chatbot UIs remove this complexity and replace it with a linear stream.

Removing complexity sounds good, but when someone is ordering something, researching something or in any way committing to something, navigating through the complexity is a key part of what they are doing.

Imagine this scenario. Apple have 500 different iPhones to choose from. And instead of calling them iPhones, they give them memorable names like UN40FH5303FXZP (Samsung!) or BDP-BX110 (Sony!). Some marketing manager realises the product line is too complex and so suggests that there ought to be a way to help consumers find the product they want. I mean, how is the Average Joe going to know the difference between a BDP-BX110, a BDP-BX210, and a BDP-BX110 Plus Extra? You could build a chatbot. Or, you know, you could reduce the complexity of your product line. The chatbot is just a sticking plaster for a broader business failure (namely, that you have a process whereby you end up creating 17 Blu-Ray players and calling them things like BDP-BX110 rather than calling them something like “iPhone 7” or whatever).

Chatbots aren’t removing complexity as much as recreating it in another form. I called my bank recently because I wanted to enquire about a direct debit that I’d cancelled but that I needed to “uncancel” (rather than set up again). I was presented with an interactive voice response system which asked me to press 1 for payments, 2 for account queries, 3 for something else, and then each of those things had another layer of options underneath. Of course, I then had to spend five minutes listening to the options, waiting for my magic lucky number to come up.

Here’s another problem: the chatbot platforms aren’t necessarily the chat services people use. I’m currently in Brazil, where WhatsApp is everywhere. You see signs at the side of the road for small businesses and they usually have a WhatsApp logo. WhatsApp is the de facto communication system for Brazilians. The pre-pay SIM card I have has unlimited WhatsApp (and Facebook and Twitter) as part of the 9.99 BRL (about USD 3) weekly package. (Net neutrality? Not here.) The country runs on WhatsApp: the courts have blocked WhatsApp three times this year, each time bringing both business and personal interactions to a grinding halt. Hell, during Operação Lava Jato, the ongoing investigation into political corruption, many of the leaks from judges and politicians have been of WhatsApp messages. Who needs Hillary Clinton’s private email servers when you have WhatsApp?

WhatsApp is not far off being part of the critical national telecoms infrastructure of Brazil at this point. Network effects will continue to place WhatsApp at the top, at least here in Brazil (as well as most of the Spanish-speaking world).

And, yet, WhatsApp does not have a bot platform like Facebook Messenger or Telegram. To get those users to use your chatbot, you need to convince them to set up an account on a chat network that supports your bot. For a lot of users, they’ll be stuck with WhatsApp, the app they use to talk to their friends, and Telegram, the app they use to talk to weird, slightly badly programmed robots. Why bother? Just build a website.

Now, in fairness, WhatsApp are planning to change this situation at some point, but you still have an issue to deal with: what if your users don’t have an account on the messaging service used by the bot platform?

One of the places chatbots are being touted for use is in customer service. “They’ll reduce customer service costs”, say proponents, because instead of customers talking to an expensive human you have to employ (and pay, and give breaks and holidays and parental leave and sick days and all that stuff), they just talk to a chatbot which will answer their questions.

It won’t though. Voice recognition is still in its infancy, and natural language parsing is still fairly primitive keyword matching. If your query is simple enough that it can be answered by an automated chatbot, it’s simple enough for you to just put the information on your website, which means you can find it with your favourite search engine. If it is more complicated than that, your customer will very quickly get frustrated and need to talk to a human. The chatbot serves only as a sticking plaster for lack of customer service, or business processes that are so complicated that the user needs to talk to customer service rather than simply being able to complete the task themselves.
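To make “primitive keyword matching” concrete, here is a minimal sketch of the sort of thing I mean. The intent names and keyword lists are made up for illustration, and this is not how any particular bot platform works; it just picks whichever intent shares the most words with the message.

```python
import re

# Hypothetical intents and keywords, invented purely for illustration.
INTENTS = {
    "opening_hours": {"open", "hours", "closing"},
    "cancel_direct_debit": {"cancel", "direct", "debit"},
}


def match_intent(message):
    """Return the intent whose keywords overlap most with the message, or None."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    best, best_score = None, 0
    for intent, keywords in INTENTS.items():
        score = len(words & keywords)
        if score > best_score:
            best, best_score = intent, score
    return best


if __name__ == "__main__":
    # A simple query works, but the answer could just as well live on a web page.
    print(match_intent("What are your opening hours?"))
    # "Uncancelling" a direct debit gets routed to the cancellation intent.
    print(match_intent("I cancelled a direct debit by mistake and want it back"))
    # Anything outside the keyword lists matches nothing at all.
    print(match_intent("Can you help me with my mortgage?"))
```

The first query is the website-FAQ case. The second is the one where the customer gets frustrated and asks for a human.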

You know what else would suffer if there were a widespread move to chatbots? Internationalisation. Currently, the process of internationalising and localising an app or website is reasonably understandable. In terms of language, the process isn’t complex: you just replace your strings with calls to gettext or a locale file, and then you have someone translate all the strings. There’s sometimes a bit of back and forth because there’s something that doesn’t really make sense in a language so you have to refactor a bit. There are a few other fiddly things like address formats (no, I don’t have a fucking ZIP code) and currency, as well as following local laws and social taboos.
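For anyone who hasn’t done it, the “replace your strings with calls to gettext” step looks roughly like this in Python, using the standard library’s gettext module. The “messages” domain, the “locale” directory and the greeting function are assumptions for the sake of the example; a real project would ship compiled translation catalogues in that directory.

```python
import gettext


def get_translator(lang, localedir="locale", domain="messages"):
    # "locale" and "messages" are assumed names, not taken from any real project.
    # fallback=True means that if no catalogue exists for the language,
    # the original (English) strings pass through unchanged.
    return gettext.translation(
        domain, localedir=localedir, languages=[lang], fallback=True
    )


def greeting(name, lang):
    _ = get_translator(lang).gettext
    # The translator sees the whole sentence and can reorder the placeholder.
    return _("Welcome back, %(name)s!") % {"name": name}


if __name__ == "__main__":
    # Without a pt_BR catalogue on disk, this falls back to the English string.
    print(greeting("Alice", "pt_BR"))
```

That is the whole trick: every string goes through one function, and translators fill in the catalogue. There is no equivalent single choke point for understanding what a user types at a bot.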

In chatbot land, you have the overhead of parsing the natural language that the user presents. It’s hard enough to parse English. Where are the engineering resources (not to mention linguistic expertise) going to come from to make it so that the 390 million Spanish speakers can use your app? Or the Hindi speakers or the Russian speakers? If your chatbot is voice-triggered rather than text-triggered, are you going to properly handle the differences between, say, American English and British English? European Portuguese, Brazilian Portuguese and Angolan Portuguese? European Spanish and Latin American Spanish? Français en France versus Québécois? When your chatbot fucks up (and it will), you get to enjoy a social media storm in a language you don’t speak. Have fun with that.

And you can’t use the user’s location to determine how to parse their language. What language should you expect from a Belgian user: French, Dutch or German?

If you tell a user “here’s our website, it’s in English, but we’ve got a rough German translation”, that’s… okay. I use a website that is primarily in German every day, and the English translation is incomplete. But I can still get the information I need. If, instead, your service promised to understand everything I say, then completely failed to speak my language, that’d be a bit of a fuck you to the user.

In the chatbot future, the engineering resources go into making it work in English, and then we just ignore anyone who speaks anything that isn’t English. World Wide Web? Well, if we’re getting rid of the ‘web’ bit, we may as well get rid of the ‘world’ and ‘wide’ while we’re at it.

Siri and Cortana are still a bit crap at language parsing, even with the Herculean engineering efforts of Apple and Microsoft behind them. An individual developer isn’t going to do much better. Why bother? There’s a web there and it works.

There’s far more to “no UI” or one-shot interactions than chat. But I’m cynical as to whether we’re ever going to reach the point of having “no UI”. We measure our success based on “engagement” (i.e. how much time people spend staring at the stuff we built). But the success criterion for the user isn’t how much time they spend “engaging” with our app, but how much value they get out of it divided by the amount of time they spend using it. The less time I spend using your goddamn app, the more time I get to spend, oh, I dunno, looking at cat pictures or snuggling with my partner while rewatching Buffy or writing snarky blog posts about chatbots.

But so long as we measure engagement by how many “sticky eyeballs” there are staring at digital stuff, we won’t end up building these light touch “no UIs”, the interaction models of set-it-and-forget-it, “push a button and the device does the rest”. Because a manager won’t be able to stand up and show a PowerPoint of how many of their KPIs they met. Because “not using your app” isn’t a KPI.

Don’t not build a chatbot because of my snarkiness. They may solve a problem that your users have. They probably don’t but they might. But please don’t just build a chatbot because someone on a tech blog or a Medium post told you to. That’s just a damn cargo cult. Build something that delivers value to your users. That may be a chatbot, but most likely, it’s something as simple as making your website/app better.