Is There A Downside To Conversational Interfaces?
I’ve recently been involved in a few initiatives around building so-called conversational commerce. I was also invited earlier this year (February 2016) to deliver a talk on the topic.
After the talk, some people approached me with questions, which I was more than happy to answer. There was one tough question that I found difficult to answer on the spot. That’s why I promised to deal with it in a separate article. Here is the tricky question: is there any downside to conversational interfaces?
Before answering the question, let’s review some of the main benefits of conversational interfaces.
Benefits for the Users
Users are the most important part of the equation. That’s why we’ll start by looking into the benefits users can enjoy.
User base has already been conditioned to conversational interface
We have reached a critical mass of users who spend most of their online time in chat channels. Over a billion users today spend most of their ‘digital minutes’ chatting. Facebook Messenger alone boasts 900+ million monthly active users. People prefer to communicate by sending short text messages. They are used to obtaining critical information and getting things done by texting. Today, people are mostly texting their friends, coworkers and family.
It would be an easy shift to add online businesses and services to their customers’ daily chat routine. In the same way someone chats with their friends, they can also chat with their bank.
Moving to text-based interaction would feel natural to almost everyone.
Users will eagerly embrace personalized services
Relying on graphical user interfaces (GUIs) tends to feel unnatural. This is because GUIs impose a much maligned ‘one size fits all’ approach. Every user has an identical experience when using a particular web site or a smartphone app. This rigid approach is not pleasant for the consumers. Users are thus definitely going to welcome a personalized experience that text-based interaction brings. Each user will experience a completely unique, one-of-a-kind interaction. That unique experience is possible only when interacting with a text-based service. Offering such personalized interaction should finally enable online businesses to delight their customers.
Users will appreciate that they finally own their data
When we’re interacting with web sites or smartphone apps, we tend to go through many steps. We may wish, later on, to review what transpired during the process of booking a trip, for example. Often we’re not sure of all the details, and would like to retrace our steps. Unfortunately, we can’t do that. Still, we, as consumers, should have access to all our information.
In the world of graphical user interfaces, all those steps are not available to the users. While the steps may have been logged on the back-end, there is no way for the users to get the logs. The service provider owns those logs, but customers have no access to them.
The conversational user interface (CUI) turns the tables on that arrangement. All text messages exchanged during our conversation with an online service remain available. For example, a doctor’s appointment booking service replies to our text enquiries. The conversation thread keeps all those replies. We own that record, and we can review all the steps at any time. This is a huge advantage that makes non-conversational services appear quite crude and careless.
Users will welcome the ease of onboarding
The legacy of PCs is weighing user experiences down. This degradation is not necessary. Yes, there was a time when personal computers were an awe-inspiring novelty. Back then most people were more than glad to put up with annoyances. The annoying process of downloading, installing, configuring and upgrading/upkeeping the apps was tolerable then. But those days are long gone. There is no need to keep perpetuating that experience today. A service that users may find desirable should be available on demand. And that’s exactly how a conversational interface feels. As a user, I don’t have to be aware of the fact that I am using a computer that is connected to the network. I should request a service, and then enjoy the privilege of receiving that service.
In short, people want to get things done and then move on with their lives. They don’t care how services are delivered. So long as users receive quality services, that’s all that matters. And quality services are those that are accurate, pleasant, entertaining. Also, quality services arrive at the right moment and at the right price point. There should be nothing to download, nothing to install. Also, nothing to configure, nothing to keep upgrading, nothing to babysit.
The benefits of not having to learn any new skills
GUIs always force users to learn new and unexpected ways of accomplishing their tasks. Each aspect of a GUI-based service offers unique, non-standardized ways of getting things done. The users are thus forced to learn awkward ways of accomplishing their goals. Many users end up enrolling in intense training courses because of this awkwardness. All that despite gargantuan and protracted efforts to standardize GUIs.
Users feel relieved upon realizing that there is nothing to learn when using CUI. The conversational interface is generic across the board.
It’s the exact same experience regardless of the participants. Also, the experience remains identical across a wide variety of devices. Whether using laptops, smartphones, tablets, or gaming consoles, the conversational interface remains identical. No matter what kind of future devices may enter the market, user experience will remain the same. Now that’s the pinnacle of what we call ‘future proof’.
Benefits for the Businesses
Let’s now look at some of the benefits that businesses may reap.
Lowering the barriers to user onboarding
Since there isn’t anything to download when on the channel, users can continue chatting. There is no need to leave the discussion thread, so onboarding becomes a no-brainer. Recent studies have shown that people are abstaining from downloading apps. The app marketplace has reached a saturation point. Most people nowadays tend to use only about 10% of their installed apps.
This is problematic for services that depend on users downloading and upgrading their apps. To remove those onboarding barriers, services should switch to conversation-centric interfaces.
Lowering/removing the barriers to user learning/training
Most people have already spent a few years chatting with their friends, colleagues, family members. Extending that behaviour to online services should not be a big stretch. There isn’t anything to learn when sending a chat to your online airline service to enquire if your flight is on time. That experience is almost like texting a friend who happens to be at the airport. Your friend can look at the flight schedule board on your behalf and reply back to you.
Simple and convenient wins every time.
There is also no need to spend any time or money on training courses. No need for learning anything when enjoying online conversational services — quite a relief.
Improving the ease of access to the service
Users often need to interrupt the task they’re working on and switch to another task. If using a GUI service, the interruption implies exiting the current platform. The user exits a GUI app and then hunts for another app. Once the user locates the app, the user opens it and works on the new task. After finishing that task, the user exits the new app and looks for the previous app. And so on, it can get quite complicated.
There is no need to do this song and dance when using a conversational interface. Conversation is all about the flow. When switching the context, the user will only have to mention the service they need to access. As soon as the context switches, the user can start chatting with the new service.
This feature presents huge benefits, not only to the users, but also to businesses providing the services. The risk of losing the customer is lower when everything is at the customer’s fingertips.
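The context switch described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `@name` mention syntax, the `ConversationRouter` class, and the handler functions are all hypothetical and do not describe how any particular chat platform works.

```python
from typing import Callable, Dict, Optional

# A handler takes the user's message and returns the service's reply.
Handler = Callable[[str], str]


class ConversationRouter:
    """Keeps track of which service the user is currently chatting with."""

    def __init__(self, services: Dict[str, Handler]) -> None:
        # Hypothetical registry: service name -> handler function.
        self.services = {name.lower(): handler for name, handler in services.items()}
        self.active: Optional[str] = None

    def handle(self, message: str) -> str:
        # Assumption: a leading "@name" mention switches the context.
        first, _, rest = message.partition(" ")
        if first.startswith("@"):
            name = first[1:].lower()
            if name not in self.services:
                return f"Unknown service: {name}"
            self.active = name
            if not rest:
                return f"Now chatting with {name}."
            message = rest
        if self.active is None:
            return "Mention a service, for example @airline."
        # Every later message goes to the active service until the next mention.
        return self.services[self.active](message)
```

With two registered handlers, say `airline` and `bank`, the message `@airline is my flight on time?` goes to the airline handler, and every following message stays with the airline until the user types `@bank`. There is no app to exit and no other app to hunt for.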
Saving untold hours/money on not having to fiddle with layouts/decorations
Front-end client development has turned into a huge industry. Programming languages, dialects, frameworks and various other gimmicks are emerging afresh almost every week. People have now started making fun of this flurry of busybody activities. The relentless onslaught of newfangled ways of tackling the age-old problem seems counterproductive.
It gets quite complicated to develop sophisticated responsive front end applications. Seems like endless hours get poured into choosing the right approach. So many languages, libraries, build tools, testing frameworks etc. to choose from. On top of that, countless hours get spent on tweaking the app to normalize it across various browsers. Not to mention concerns driven by various screen sizes, resolutions, and on and on. A veritable time and money pit.
As soon as we abandon GUI, we realize that all those herculean efforts aren’t needed anymore. Switching over to CUI allows businesses to save large quantities of hours and dollars. Work on layouts, logos, fonts, background colors, and other decorative concerns is tedious. Removing this work from the product line is a big win for businesses.
Saving untold hours/money on not having to fiddle with performance and response times
When people are using GUI apps they expect snappy, responsive behaviour. This expectation places enormous pressures on the app development teams. The challenge is to deliver complex graphical experience with as little latency as possible. Users are becoming conditioned to expect rich graphical experience. And if that experience is sluggish, they don’t hesitate to abandon the non-responsive app.
The goal is to deliver sophisticated graphical apps that operate at sub-second response times. Considering how unreliable networks are, this is a monumental technological challenge.
Well, when switching to a conversational interface, all those concerns just melt away. Snappy response? Users will get startled if, the moment they send their text message, the response arrives back! That’s not how they are interacting in a chat channel. Their experience is that a message gets sent, and then there is a bit of latency on the other end.
Why the latency during conversational experience? The recipient needs to read the message they receive. Then, they may do some thinking/researching, and only then type their response.
It is actually okay if the response arrives with a bit of a delay. As a matter of fact, it’s the normal thing to expect.
Making online services beefy and super reactive/responsive is a tall order. Businesses can save a lot of money and time by abandoning that goal. These savings can be better invested into the quality of the core business competence. Increased competence will improve the quality of the actual service.
What Are the Downsides of Conversational Interfaces?
Everything has some kind of a downside. The trick is to discern imaginary downsides from the real ones. This is especially important when evaluating emerging, nascent technologies.
There is a lot of buzz in the media nowadays about the downsides of conversational interfaces. Many of those criticisms appear to be the result of not understanding this technology. Let’s first mention some of the more prominent misconceptions about CUI.
Imaginary Downsides
Conversational Interfaces are Only Good for Contrived Scenarios
Any time a disruptive technology enters the mainstream, we keep hearing the same complaint. But let’s be realistic here. Any new technology hasn’t been around long enough to tackle sizeable problems. Hence the problems it solves at first are toy sized. We cannot blame it for the lack of demonstrable proof of being able to solve larger problems. It takes time to implement elaborate solutions to complex problems.
When GUIs made their foray into mainstream computing, they were also ridiculed. People were claiming back then that GUIs were only good for solving contrived scenarios. Web sites had to go through the exact same pattern of wholesale ridicule. And after web sites managed to prove their worth, mobile apps got the same derisive treatment.
Now it’s the CUIs, together with the accompanying bots, that are on the receiving end of derision. People seem too quick to dismiss the value of technological progress. They seem eager to throw the baby out with the bathwater.
It is Easier to Click on a Button than it is to Send a Text Message
Another fallacy is that it is way easier to click/tap on a button than it is to send a text message. While it may appear so, it is taking into consideration only the tip of the iceberg.
The real question is not “how hard is it to click on a button you see on the screen?” The real question is “how long did it take you to get to that button displayed on the screen?”
The problem with GUIs is that we often know there is a button that is easy to click on, but we don’t know how to find it. Same goes for links, or any other clickable/tappable graphical component. The ‘easy to click’ fallacy ignores the intricate complexity required to navigate the GUI.
In contrast, sending a text message is always the same. There is no need to navigate a confusing maze of menus and tabs. Just type the text (or speak into the microphone) and then send it.
GUI is Superior Because a Picture is Worth a Thousand Words
This fallacy has deep roots. I have debunked it in a separate article.
Real Downsides
There are more imaginary downsides to conversational interfaces than the three mentioned above. Still, it’s now time to move on to the real downsides.
Loss of Context
Conversational interfaces are gaining momentum because they offer the illusion of awareness. With GUIs, any illusion of awareness that might get perceived is difficult to sustain. Operating a GUI is almost identical to operating a physical, mechanical object. There are buttons installed on the surface, inviting us to push them. There are also other controls (sliders and toggles and such). Everything looks and behaves like a mechanical contraption.
Unlike GUIs, there isn’t anything mechanical about the conversation that occurs during the chat. Whether we are chatting with other humans or with bots, the experience is identical.
Because of that, chat bots tend to project an illusion of awareness. Despite the fact that we build bots as mechanical devices, they appear as if they’re aware. When people interact with a bot, they often feel that they are interacting with another mind.
Matt Schlicht just posted a brilliant expose on this topic. It is undeniable that the privacy and the intimacy of a discussion thread will open people up. People will no doubt begin sharing their deepest concerns with bots. They will do that more often than they’d do it with other humans.
The downside of that lies in the possibility of losing the context. For example, a user may be chatting with a bot specializing in nutritional counselling. The conversation will revolve around questions about the user’s symptoms and other issues. The user will likely be asking some detailed questions. The bot will likely be able to provide some answers to the user’s questions. But there is a problem lurking underneath the surface of that conversation.
Experience shows us that people tend to slip into casual chat as the discussion unfolds. This casual tone may not be appropriate when interacting with a service representative. If we’re interacting with bots, it gets easier to start meandering. A human representative will have resources to reel us back into the context. But the bots lack those resources (i.e. common-sense). The unwanted result is that our chat with a bot may fizzle out. And we may leave the chat feeling frustrated.
Inability to Handle Crisis Calls
Statistics show that an increasing number of people feel lonely, isolated, even depressed. Some of them feel reluctant to reach out for help. Having an opportunity to chat with someone may feel like a big relief to them.
While bots can project the illusion of awareness, they’re lousy at handling crisis calls. Bots may not even recognize that the person on the other end is experiencing a serious crisis. There is a real danger that bots may respond to a cry for help in a lackadaisical manner. No fault of their own, to be sure, but the outcome may be disastrous.
Is There a Remedy?
So is there a remedy to the loss of context and the inability to handle crisis calls? My hope is that there is a remedy, but we’re a long way off before we get there.
We could deal with the loss of context by implementing a banter budget. I have described that concept in my article How to Design a Bot Protocol. It is my hope that bot designers will take heed and consider adding circuit breakers. The idea is to keep track of the number of frivolous messages during the conversation thread. Once the number exceeds the banter budget, short-circuit the chat. We do that by asking the user to consider some topics that are more on point.
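To make the banter budget idea concrete, here is a minimal Python sketch. It is not the design from How to Design a Bot Protocol. The `BanterBudget` class, the keyword-based `mentions_nutrition` check, the default budget of 3, and the reset after tripping are all assumptions made for illustration. A real bot would need a far better way to decide which messages are frivolous.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BanterBudget:
    """Counts frivolous messages and trips once the budget is exceeded."""

    is_on_topic: Callable[[str], bool]  # hypothetical classifier supplied by the bot
    budget: int = 3                     # assumed allowance of frivolous messages
    spent: int = 0

    def record(self, message: str) -> bool:
        """Record one user message; return True when the circuit breaker trips."""
        if self.is_on_topic(message):
            return False
        self.spent += 1
        return self.spent > self.budget

    def reset(self) -> None:
        """Start counting afresh (an assumption: the source does not say when)."""
        self.spent = 0


# Hypothetical stand-in for topic detection in a nutrition counselling bot.
NUTRITION_KEYWORDS = {"diet", "protein", "vitamin", "calories", "symptom"}


def mentions_nutrition(message: str) -> bool:
    words = {word.strip(".,!?").lower() for word in message.split()}
    return bool(words & NUTRITION_KEYWORDS)


def respond(breaker: BanterBudget, message: str) -> str:
    # Short-circuit the chat by steering the user back to on-point topics.
    if breaker.record(message):
        breaker.reset()
        return "We have drifted a bit. Shall we get back to your diet or symptoms?"
    return "Got it."
```

With a budget of 2, the first two frivolous messages pass through, and the third one trips the breaker and triggers the steering prompt. On-topic messages never count against the budget.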
As for the inability to handle crisis calls, that’s a serious issue. We could attempt to deal with it by working on a general purpose support bot. But that bot would have to be community owned and trained to deal with serious crises. Something akin to Wikipedia, where experts curate the content that bots respond with when fielding crisis calls.
Intrigued? Want to learn more about the bot revolution? Read more detailed explanations here:
The Age of Self-Serve is Coming to an End
Only No Ux Is Good UX
Stop Building Lame Bots!
Four Types Of Bots
Are Bots just a Fad? Are GUIs really Superior?
How to Design a Bot Protocol
Breaking The Fourth Wall In Software
Bots Are The Anti-Apps
How Much NLP Do Bots Need?
Screens Are For Consumption, Not For Interaction