From the January 2017 issue of LUI News (on the Language User Interface):
Google lets developers write apps for the Assistant on Google Home
“Conversation actions” connect with outside companies
Google announced that all developers (not just those in its private preview program) can start bringing their applications and services to the Google Assistant, starting with what the company calls “conversation actions” on Google Home, similar to Amazon Alexa “skills.” This allows developers to create conversations with users through the Assistant. Users can start these conversations by using a phrase like “OK Google, talk to Samantha.” Google has also allowed a small number of partners to enable their apps on Google Home already.
The Assistant also runs on Google’s Pixel phones and inside Google’s Allo messaging app. Google says it plans to bring actions to these other “Assistant surfaces” in the future.
Users do not need to enable an external Conversation Action before invoking it. When a user asks for a Conversation Action by name, e.g., “OK Google, talk to Samantha,” Google immediately connects with the outside action. From there, the outside action manages the rest, including how users are greeted, how the user's request is fulfilled, and how the conversation ends.
Google indicates that the system can identify when an outside action is appropriate without the user knowing the name of the Conversation Action. When users make a request, the Google Assistant processes the request, determines the best action to invoke, and invokes an outside Conversation Action if relevant.
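The two invocation paths described above (by name, or by matching the request to a suitable action) can be sketched roughly as follows. This is a toy illustration only, not Google's matching logic: the `ACTIONS` registry, the `route` function, the keyword sets, and the second action name are all hypothetical.

```python
import re

# Hypothetical registry: action name -> keywords the action claims to handle.
ACTIONS = {
    "samantha": {"recipe", "cook", "dinner"},
    "daily headlines": {"news", "headlines"},
}

# Matches an explicit request such as "OK Google, talk to Samantha".
EXPLICIT = re.compile(
    r"^(?:ok google,?\s+)?talk to (?P<name>.+?)[.!?]?$", re.IGNORECASE
)


def route(utterance):
    """Return the name of the outside action to invoke, or None."""
    text = utterance.strip()
    match = EXPLICIT.match(text)
    if match:
        # Explicit path: the user named the action.
        name = match.group("name").strip().lower()
        return name if name in ACTIONS else None
    # Implicit path: pick the action whose keywords overlap the request most.
    words = set(re.findall(r"[a-z']+", text.lower()))
    best, best_score = None, 0
    for name, keywords in ACTIONS.items():
        score = len(words & keywords)
        if score > best_score:
            best, best_score = name, score
    return best
```

Keyword overlap stands in here for whatever ranking the Assistant actually performs; the point is only that the explicit path needs the action's name and the implicit path does not.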
Developers can use Google’s Conversation Application Programming Interface (API) and its Actions Software Development Kit (SDK) to create a Conversation Action. The API.AI toolkit and other tools can be used to provide natural language processing. (Google previously purchased API.AI.)
When you build conversation actions, you define these major components:
Invocation triggers define how users invoke and discover your actions. Once triggered, your action carries out a conversation with users, which is defined by dialogs (see figure next page).
Dialogs define how users converse with your actions and act as the user interface for your actions. They rely on fulfillment code to move the conversation forward.
Fulfillment is the code that processes user input and returns responses. The developer exposes it as a REST endpoint. Fulfillment also typically has the logic that carries out the actual action, e.g., retrieving recipes or news to read aloud.
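As a rough sketch of how dialogs and fulfillment fit together, the code below processes one turn of user input and returns a response, the way a fulfillment endpoint would. It is written in Python for illustration rather than with Google's Node.js client library. The field names (`user_input`, `speech`, `expect_reply`) are assumptions made for this example, not the Conversation API's actual schema, and the recipe data is invented.

```python
import json

# Invented sample data standing in for a real recipe service.
RECIPES = {"pancakes": "Mix flour, milk, and eggs, then fry in butter."}


def fulfill(request):
    """Process one turn of user input and return a response dict."""
    user_input = request.get("user_input", "").strip().lower()
    if not user_input:
        # First turn: the action greets the user and waits for a reply.
        return {"speech": "Hi, which recipe would you like?", "expect_reply": True}
    if user_input in ("bye", "stop"):
        # The action decides how the conversation ends.
        return {"speech": "Goodbye.", "expect_reply": False}
    recipe = RECIPES.get(user_input)
    if recipe is None:
        return {
            "speech": f"Sorry, I have no recipe for {user_input}. Try another?",
            "expect_reply": True,
        }
    return {"speech": recipe, "expect_reply": False}


def handle_post(body):
    """What a REST endpoint would do with a JSON request body (hypothetical)."""
    return json.dumps(fulfill(json.loads(body)))
```

In a deployed action, `handle_post` would sit behind an HTTPS route in whatever web framework the developer prefers; it is kept as a plain function here so the dialog logic can be tested without a server.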
The Conversation API defines a request and response format that you must adhere to when communicating with the Google Assistant. Even though you control the user experience when your action is invoked, the Google Assistant still brokers and validates the communication between your action and the user.
The Actions SDK provides the following tools and libraries to build actions:
NodeJS client library - This library provides convenient methods to parse requests and construct responses that adhere to the Conversation API.
Action Package definition - A JSON-based definition that describes a group of actions, how they are invoked, what fulfillment endpoints to call, and other metadata. You provide this file, along with other assets, when you submit your action for approval.
gactions CLI - A Command Line Interface to test actions and deploy action packages.
Web Simulator - A web-based, virtual Google Home device for testing your actions without the hardware.
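To give a feel for what an Action Package describes, the sketch below builds a JSON document with the general shape outlined above (a group of actions, their invocation phrases, and their fulfillment endpoints) and runs a simple sanity check on it. Every key name, the example URL, and the `check_package` helper are assumptions made for illustration; the real schema is defined by the Actions SDK documentation.

```python
import json

# Hypothetical structure: every key name here is an assumption,
# not the official Action Package schema.
action_package = {
    "actions": [
        {
            "name": "read_recipe",
            "invocation": {"phrases": ["talk to Samantha"]},
            "fulfillment": {"endpoint": "https://example.com/fulfillment"},
        }
    ],
    "metadata": {"language": "en-US"},
}


def check_package(package):
    """Return a list of problems found in the package (empty if none)."""
    problems = []
    actions = package.get("actions", [])
    if not actions:
        problems.append("package defines no actions")
    for action in actions:
        name = action.get("name", "<unnamed>")
        if not action.get("invocation", {}).get("phrases"):
            problems.append(f"{name}: no invocation phrases")
        endpoint = action.get("fulfillment", {}).get("endpoint", "")
        # Requiring https is a rule of this sketch, not a documented requirement.
        if not endpoint.startswith("https://"):
            problems.append(f"{name}: expected an https fulfillment endpoint")
    return problems


# Serialize to the JSON text that would be saved as the package file.
package_json = json.dumps(action_package, indent=2)
```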
Figure: Google Conversation Actions
Natural language development tools
When using API.AI to build actions, the developer can use:
API.AI NLU - API.AI’s built-in Natural Language Understanding (NLU) does in-dialog processing of input. It offers developers aids such as phrase expansion and built-in features that distinguish and parse arguments from user input.
A graphical interface for development - Developers can define and configure action components, such as dialogs and invocation, in an easy-to-use user interface.
Conversation building features - API.AI has advanced features such as contexts, which let you easily maintain state and contextualize user requests, a built-in simulator, and machine learning algorithms to better understand user input.
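The idea behind contexts, carrying state from one turn to the next so that a follow-up request can be understood, can be sketched as below. This is a toy model, not API.AI's implementation: the `ContextStore` class, the turn-based lifespan, and the `respond` function are all hypothetical.

```python
class ContextStore:
    """Keeps named contexts alive for a limited number of turns."""

    def __init__(self):
        self._contexts = {}  # name -> [remaining_turns, parameters]

    def set(self, name, parameters, lifespan=2):
        self._contexts[name] = [lifespan, dict(parameters)]

    def get(self, name):
        entry = self._contexts.get(name)
        return entry[1] if entry else None

    def end_turn(self):
        """Age every context by one turn and drop expired ones."""
        for name in list(self._contexts):
            self._contexts[name][0] -= 1
            if self._contexts[name][0] <= 0:
                del self._contexts[name]


def respond(store, user_input):
    """Answer one turn, using stored context to interpret follow-ups."""
    text = user_input.lower()
    if "weather in" in text:
        city = text.split("weather in", 1)[1].strip(" ?.").title()
        store.set("weather", {"city": city})
        reply = f"Here is today's weather for {city}."
    elif "tomorrow" in text and store.get("weather"):
        # The follow-up names no city; the context supplies it.
        city = store.get("weather")["city"]
        reply = f"Here is tomorrow's weather for {city}."
    else:
        reply = "Sorry, I did not understand that."
    store.end_turn()
    return reply
```

With a fresh store, asking "What's the weather in paris?" and then "And tomorrow?" resolves the second request to Paris; once the context expires, the same follow-up is no longer understood.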
Development tools other than API.AI for conversational endpoints are also fully integrated with Actions on Google. These include tools from Gupshup (LUI News, August 2016, p. 7), DashBot (p. 28), VoiceLabs (p. 14), Assist, Notify.IO (conversational interfaces), Witlingo (conversational interfaces), and SpokenLayer (audio creation).