Some updates from my last blog on Alexa, now that I’m AWS Certified and actually know what I’m doing (sort of).
A couple of years ago, I wrote a blog about an emerging technology called Amazon Echo. By now, unless you’ve been living under a rock, you’re probably familiar with “Alexa.”
Even if you don’t own one of these smart devices, chances are, you have seen one at a friend’s house or on the numerous television ads in circulation. To put it mildly, these voice assistants have done a lot of growing up in a short amount of time.
So, I thought it would be good to revisit my little Alexa Skill that drove my family crazy with my incessant testing. “Alexa, Ask SurfCheck How’s the Surf?” is practically a four-letter word around my house. At the very least, the phrase elicits a groan or a dirty look.
After a brief overview of the Amazon Echo platform, I’ll introduce the Amazon Lex service, one of the core technologies that Echo runs atop. Since my last blog, I also became an Amazon Web Services (AWS) Certified Solutions Architect and Certified Developer, which enlightened me to all of the mistakes I made when I developed my first Alexa skill. With that in mind, I’ll review how Alexa and Lex fit into the broader AWS ecosystem.
Finally, I’ll focus on one of the key services that I used to update my Alexa Skill, Lambda. Lambda represents Amazon’s entry into the fascinating world of “Serverless Computing.”
In short, this blog is a mix of things:
- Overview of Amazon Echo
- Introduction to Amazon Lex
- Review of how Amazon Echo and Lex fit into the broader AWS ecosystem
- Introduction to Serverless Computing
- Bringing it all together
Overview of Amazon Echo
If you read my previous blog about Alexa, you’ll recall that Echo is a family of devices made by Amazon, all of which are powered by the Alexa voice assistant. The image below shows how the Echo family has been growing over the past couple of years, including a couple of newer siblings that add a visual aspect to the user interface (the Echo Show and Echo Spot).
At its core, however, Echo is still a wireless speaker and voice assistant. Users still interact with Echo primarily using their voice. The new visual aspects of the Echo platform are meant to complement the voice user interface or VUI for short.
Amazon and other developers create Alexa Skills – Amazon’s technology for building a voice user interface – that enable the Echo to carry on specific conversations with a user. For example, I created a skill called Surf Check that enables a user to, not surprisingly, check the surf conditions at his or her local surf spot.
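To make the anatomy of a phrase like “Alexa, ask Surf Check how’s the surf?” concrete, here is a minimal sketch of how such an utterance breaks down into a wake word, an invocation name, and the remaining utterance that gets mapped to an intent. The function and field names are mine for illustration, not Amazon’s actual API; Alexa does this with speech recognition, not string slicing.

```python
# Illustrative decomposition of "Alexa, ask Surf Check how's the surf?"
# Names below are hypothetical; Alexa's real pipeline uses ASR + NLU,
# not string manipulation.

def decompose_utterance(phrase):
    """Split a spoken phrase into its wake word, invocation name,
    and the utterance that gets routed to an intent."""
    wake_word = "alexa"
    invocation = "surf check"
    text = phrase.lower().rstrip("?").strip()
    # Strip the wake word and the connecting word ("ask")
    text = text.removeprefix(wake_word).strip(", ").removeprefix("ask").strip()
    # Whatever follows the invocation name is matched against an intent
    utterance = text.removeprefix(invocation).strip()
    return {"invocation": invocation, "utterance": utterance}

print(decompose_utterance("Alexa, ask Surf Check how's the surf?"))
```

The utterance (“how’s the surf”) is what the skill’s interaction model matches against sample phrases to pick an intent.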
At the time of writing this blog, there are more than 20,000 Alexa Skills published in Amazon’s marketplace. This number will undoubtedly continue to rise as more and more Echo devices are sold.
Introducing Amazon Lex
With the success of Echo, Amazon realized that the underlying technology that powers Echo could be used for other applications. That led Amazon to introduce Lex, a service for building conversational interfaces into any application using voice and text.
According to Amazon, “…Lex provides the advanced deep learning functionalities of automatic speech recognition (ASR) for converting speech to text, and natural language understanding (NLU) to recognize the intent of the text, to enable you to build applications with highly engaging user experiences and lifelike conversational interactions…”
Of particular interest to me was the ability to re-use the same conversational interface I had for my voice-driven Surf Check Alexa Skill, but in a text-driven interface, also known as a chatbot. What is a chatbot? Derived from the term “chat robot,” a chatbot allows for a conversational interaction using either text or voice. Chatbots have become quite popular on mobile and web platforms such as Facebook Messenger and Slack, where they enable the automation of tasks such as customer support.
Using Surf Check as an example, let’s say I am not at home, but I still want to check the surf. Since my Echo devices are all at home, I can’t say “Alexa, ask Surf Check how’s the surf?” But, I can open up Facebook Messenger on my mobile phone and send a text message (“how’s the surf?”) to a Surf Check chatbot, thanks to Lex.
In fact, Amazon has made developing a Lex interaction nearly identical to developing an Alexa Skill. Lex uses Intents and Slots, just like Alexa. So in theory, I could write a new conversational application that supports both Lex and Alexa interactions using the same code; the only difference would be that one interaction outputs text and the other voice. This is obviously a powerful capability for developers to have in their toolbox, as it allows consumers to connect to a virtual assistant from anywhere.
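The “same code, two channels” idea can be sketched as one shared fulfillment function with a thin adapter per channel. The intent name and the surf-report text below are made up; a real skill would call a surf-forecast API. The response envelopes are simplified versions of the JSON shapes Alexa and Lex expect.

```python
# One fulfillment function, two thin adapters: a sketch of serving the
# same logic to both an Alexa Skill and a Lex chatbot. The intent name
# and report text are hypothetical.

def fulfill_check_surf(spot="my local break"):
    """Shared business logic for a CheckSurf intent (hypothetical name)."""
    return f"The surf at {spot} is three to four feet and clean."

def alexa_response(event):
    """Wrap the answer in a simplified Alexa response envelope (voice)."""
    speech = fulfill_check_surf()
    return {
        "version": "1.0",
        "response": {"outputSpeech": {"type": "PlainText", "text": speech}},
    }

def lex_response(event):
    """Wrap the same answer in a simplified Lex dialogAction envelope (text)."""
    message = fulfill_check_surf()
    return {
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled",
            "message": {"contentType": "PlainText", "content": message},
        }
    }
```

Only the envelope differs; the conversational logic lives in one place.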
The Bigger Picture (aka AWS)
Until now, this blog has focused on the voice user interface that Alexa provides and the foundational technology behind Alexa called Lex. But as compelling as both of those are, conversational interfaces by themselves are not what make the platform so powerful. Combining an Alexa Skill or voice user interface with one or more cloud services is the secret sauce that takes this technology to another level.
(Note: a detailed description of Amazon Web Services (AWS) is beyond the scope of this blog. What follows is a brief description of “the cloud”; but look for a follow up blog where I’ll take a deep dive on AWS and Serverless computing.)
Ever since the World Wide Web came into being, the model of stitching together disparate services to appear as one has been the domain of server developers. In addition, the hardware required to support this effort had to be provisioned, typically in a data center at a different location (“in the cloud”) from where the developers worked.
The costs of maintaining the hardware and data center usually made this a headache for large companies and prohibitive for small companies. However, several companies including Amazon, Google, and Microsoft realized that “the cloud” could lower the cost and complexity behind provisioning these services.
Nowadays, when people refer to “the cloud”, they are referring to three primary resources:
- Compute (servers)
- Storage
- Databases
As I mentioned in the introduction, I am partial to Amazon’s cloud offering, Amazon Web Services (AWS), and have attained two AWS certifications. With AWS, a developer like me can roll out an application using cloud storage, a server, and a database without ever buying or touching a physical piece of hardware. AWS lets a developer provision all of those resources, and pay for them based on usage, directly from the AWS web console.
Using AWS, I was able to update my Surf Check Alexa skill to include a database for storing information about users and move web services to a serverless component called Lambda. More details on that are provided in the following sections. The diagram below provides an example of how to create a weather application using various services available in AWS.
Introduction to Serverless Computing
The other big change I made to my Alexa Skill was to use Lambda to process requests from Alexa. Amazon describes Lambda’s serverless model this way: “Run code without thinking about servers.”
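In Lambda, “processing requests from Alexa” boils down to writing a single handler function that receives the request as a JSON event and returns a JSON response. Here is a minimal sketch; the intent name and speech text are illustrative, and a real handler would also cover LaunchRequest, SessionEndedRequest, and error cases.

```python
# Minimal sketch of a Lambda handler for an Alexa Skill request.
# "CheckSurfIntent" and the speech strings are hypothetical.

def lambda_handler(event, context):
    """Entry point Lambda invokes: event is the Alexa request JSON."""
    request = event.get("request", {})
    if request.get("type") == "IntentRequest":
        intent = request["intent"]["name"]
        if intent == "CheckSurfIntent":
            speech = "The surf is three to four feet and clean."
        else:
            speech = "Sorry, I didn't catch that."
    else:
        speech = "Welcome to Surf Check."
    return {
        "version": "1.0",
        "response": {"outputSpeech": {"type": "PlainText", "text": speech}},
    }
```

There is no web server, routing, or TLS setup in that code; Lambda handles all of it and simply calls `lambda_handler` per request.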
The name “serverless” itself is a bit of a misnomer, as servers are a critical component of any cloud application and aren’t going away anytime soon. However, the ways a server is deployed can vary and along with those variations come a wide range of costs.
Serverless computing aims to tackle two primary issues with cloud computing: streamlining the tasks associated with deploying a server and reducing the costs associated with running a server. As is typical in technology, a crisp definition of a buzzword is often hard to come by. Serverless Computing is no different.
In fact, a lot of people drop the term “computing” from Serverless Computing and just call it Serverless, causing even more confusion. My definition of serverless may be slightly different from the one provided by most of my colleagues. But all of the definitions should have some of the following traits in common.
Serverless is characterized by:
- Dynamically managed resources: you don’t provision the server yourself. You simply write the code, and the resources are automatically provisioned to accommodate it.
- Pricing based on the actual amount of resources used: you only pay for what you use, not for a server that’s idle 99% of the time.
- Automatic scaling: in addition to automatic provisioning, serverless platforms will also scale resources up or down according to usage.
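The pay-per-use trait is easy to see with some back-of-the-envelope arithmetic. Lambda bills per request and per GB-second of compute; the rates below are AWS’s published pay-per-use prices at the time of writing ($0.20 per million requests, $0.0000166667 per GB-second), so check current pricing before relying on them. The free tier is ignored here.

```python
# Rough monthly Lambda cost under assumed published rates
# ($0.20 per 1M requests, $0.0000166667 per GB-second); free tier ignored.

PRICE_PER_REQUEST = 0.20 / 1_000_000
PRICE_PER_GB_SECOND = 0.0000166667

def monthly_cost(invocations, avg_ms, memory_mb):
    """Estimate a month of Lambda charges for one function."""
    gb_seconds = invocations * (avg_ms / 1000) * (memory_mb / 1024)
    return invocations * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND

# A skill invoked 10,000 times a month, 200 ms per run, at 128 MB:
print(round(monthly_cost(10_000, 200, 128), 4))  # prints 0.0062
```

Less than a penny a month, versus paying around the clock for an idle server; that is the cost argument for serverless in a nutshell.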
Both Microsoft and Google have competing serverless offerings that work in much the same way as Lambda. Microsoft’s offering is called Azure Functions, while Google’s version is called Cloud Functions. Not very creative, but I’ll take function over form every time!
Here are the languages currently supported on each platform:
- AWS Lambda
- Microsoft Azure Functions
- Google Cloud Functions
Bringing it All Together
The diagram above illustrates how all of the AWS components are brought together to provide a seamless Alexa and Lex experience.
Not surprisingly, a lot has changed in the technology world in only a couple of years. Hopefully, this blog has shown the improvements made around Voice Assistants, Chatbots, the Cloud, and Serverless Computing.
Bringing all of those pieces together, I was able to:
- Re-use my existing Alexa Skill to create a Chatbot hosted in Facebook Messenger.
- Re-architect my Alexa Skill handler to use Lambda, AWS’s serverless technology, thereby ridding me of the cost and complexity of maintaining a web server.
- Utilize cloud storage to store all the data my Alexa Skill uses.
I did all of the above without needing a data center or a lot of money. My AWS bill for the month of December was less than $1 – including all of the components described in this blog and my entire personal web site!