Ever wonder what it takes to add the power of Alexa to your own products? Are you curious about what Alexa partners have learned on their way to a successful product launch? In this session you will learn about the top tips and tricks on how to go from VUI newbie to an Alexa-enabled product launch. Key concepts around hardware selection, enabling far field voice interaction, building a robust Alexa Voice Service (AVS) client and more will be discussed along with customer and partner examples on how to plan for and avoid common challenges in product design, development and delivery.
2. What to Expect from the Session
• Key concepts for using the Alexa Voice Service
• Tips and Tricks for implementing an AVS client
• Considerations for evolving your solution
• Key components of a hands-free solution
3. Amazon Alexa Enabled
Alexa Voice Service (AVS): Open and extensible solution to add Alexa to any connected product, for free
Alexa Skills Kit (ASK): Open APIs and tools that make it fast & easy to build skills for Alexa products.
Lives In The Cloud
• Automated Speech Recognition (ASR)
• Natural Language Understanding (NLU)
• Always Getting Smarter (AI)
Works With Alexa
The Alexa Ecosystem: Supported by two powerful frameworks that leverage open APIs
Devices
4. Intelligent Cloud Service
Optimized suite of on-device + cloud-based technologies and services that power a wide array of connected devices
ON-DEVICE COMPONENTS
HW: Mic Arrays, Speaker, Notification LEDs, Mute Button, SoC/DSP
SW: Audio Player, AEC, Beamforming, State Machine, HTTP Manager, LWA Auth
DEVICE TYPES
AMAZON SPEECH OS
Speech Primitives, Product Platform, Platform Services
ASR, TTS, NLU, State Mgr, Dialog Mgr, Speech orchestrator
Knowledge (Evi), Model Training, Analytics, Data Ingestion, Auth, Tools, Personalization
GUI Cards, Domains, Services, VUI UX
3P CONTENT
3P Skills, Smart Home
SmartThings, Wink, Insteon, SmartHome API, Uber, Dominos + 3000 more
6. Recent AVS Announcements
“Omate Rise 3G smartwatch slaps Amazon Alexa on your wrist” – Engadget 9/1/16
“Adding Alexa to the already-intriguing Pebble Core takes it from ‘Huh, that’s interesting’ to ‘Did we just catch up to Star Trek?’” – Forbes 6/3/16
“This smart watch puts Alexa on your wrist” – The Verge 4/20/16
“Amazon Alexa is now available on first device not made by Amazon” – TechCrunch 4/28/16
“Nucleus debuts first Alexa-enabled touchscreen video device” – Mashable 8/4/16
“Amazon Alexa support coming to LG's SmartThinQ hub” – Engadget 9/2/16
“Sonos Bringing Voice Control To Its Speakers With Amazon Partnership” – Forbes 8/30/16
“Beam me up, Alexa! Onyx communicator gets voice assistant integration” – CNET 9/14/16
7. Tip #1: Follow the Sample brick road
• Get started working almost anywhere
• PC, Linux, Mac, Raspberry Pi, CHIP, …
• NEW! – Includes hands-free implementation
• Application.log shows the proper message flow
• 3 Sample companion apps for linking and tokens
• Android, iOS, Web app
• First stop for all debugging!
https://github.com/alexa/alexa-avs-sample-app
8. Example AVS Client Architecture
AVS Client: Connection Management, Messaging Layer, Controller, Audio Input (Mic), Audio Player, Alert Management, Wake Word Engine
Companion Apps: Web App, iOS, Android
Native Media Player
Native Timers and Alarms
Wake Word Process
Alexa model
GUI / Attention System
State Mgmt
Directive Queues
Event Dispatch
Audio Output
HTTP/2
AVS
Control Logic
Legend: 3rd Party / Built-in; Custom dev / Sample
9. Interacting with AVS Cloud Service
• AVS is Amazon’s intelligent cloud service that allows you as a developer to voice-enable any connected product with a microphone and speaker
• API endpoint (https://avs-alexa-na.amazon.com)
• /events – for all speech, playback and alert events
• /directives – the source path of AVS directives (read-only)
• /ping – to keep connection open
• Message bus for all Events and Directives
• Response messages and a down channel
• State machine to determine how to handle messages
• Pause playback? Duck audio? Alert versus music?
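The three paths above can be captured in a small routing table. This is a minimal sketch, not the sample app's code: the HTTP methods are an assumption inferred from the slide (events are sent, /directives is read-only, /ping keeps the connection open), and a real client may need a version prefix on each path, so check the AVS documentation before relying on it.

```python
from typing import Dict, NamedTuple, Tuple

AVS_BASE_URL = "https://avs-alexa-na.amazon.com"  # endpoint shown on the slide


class Route(NamedTuple):
    method: str  # assumed HTTP method, not stated on the slide
    path: str    # path exactly as shown on the slide


# Hypothetical routing table for the three paths named on the slide.
ROUTES: Dict[str, Route] = {
    "events": Route("POST", "/events"),         # speech, playback and alert events
    "directives": Route("GET", "/directives"),  # read-only down channel
    "ping": Route("GET", "/ping"),              # keeps the connection open
}


def request_target(name: str, base_url: str = AVS_BASE_URL) -> Tuple[str, str]:
    """Return (method, absolute URL) for one of the named routes."""
    route = ROUTES[name]
    return route.method, base_url.rstrip("/") + route.path
```

For example, `request_target("ping")` yields the method and URL a keep-alive timer would use.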
10. Tip #2: Take a Phased Approach
Port sample
• Re-platform (e.g., Java to C++)
• Swap 3rd party components (e.g., Jetty to OkHttp)
• Integrate with native components (e.g., Android MediaPlayer, local buttons)
Harden tap-to-talk solution
• Implement AVS functional design guidelines
• Define device monitoring and management
• Design an update and deployment process
• Perform functional validation of core features and music
Integrate hands-free
• Integrate hands-free components
• Test and tune hands-free performance
• Responsiveness
• Distance testing
• Testing with audio output
• Testing with ambient noise
AVS functional design guidelines:
https://developer.amazon.com/public/solutions/alexa/alexa-voice-service/content/alexa-voice-service-functional-design-guide
11. Tip #3: Love the logs – Events
21:22:55.064 [AWT-EventQueue-0] INFO com.amazon.alexa.avs.http.AVSClient
- Request metadata:
{
"event" : {
"header" : {
"namespace" : "SpeechRecognizer",
"name" : "Recognize",
"messageId" : "b15376c6-6265-451c-acee-bc5b9168af8e",
"dialogRequestId" : "919336ea-25d5-43d9-8af8-61d1344fbcb5"
},
"payload" : {
"profile" : "CLOSE_TALK",
"format" : "AUDIO_L16_RATE_16000_CHANNELS_1"
}
…
}
Thread Id
Event Name
Message Id
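The request metadata in this log can be built with a few lines of code. The sketch below only reproduces the fields visible in the log (whatever the "…" elides is left out), and the function names are hypothetical, not part of the sample app.

```python
import json
import uuid
from typing import Optional


def build_recognize_event(dialog_request_id: Optional[str] = None,
                          profile: str = "CLOSE_TALK",
                          audio_format: str = "AUDIO_L16_RATE_16000_CHANNELS_1") -> dict:
    """Build SpeechRecognizer.Recognize metadata with the fields seen in the log."""
    return {
        "event": {
            "header": {
                "namespace": "SpeechRecognizer",
                "name": "Recognize",
                # A fresh messageId per message makes log lines easy to trace.
                "messageId": str(uuid.uuid4()),
                # Reuse the dialogRequestId to tie later directives to this turn.
                "dialogRequestId": dialog_request_id or str(uuid.uuid4()),
            },
            "payload": {"profile": profile, "format": audio_format},
        }
    }


def to_log_json(message: dict) -> str:
    """Pretty-print a message the way the log excerpt lays it out."""
    return json.dumps(message, indent=2)
```

Logging `to_log_json(build_recognize_event())` before sending gives you the same Message Id to search for when the response arrives.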
12. Tip #3: Love the logs - Directives
21:23:00.827 [RequestThread] INFO com.amazon.alexa.avs.http.AVSClient -
x-amzn-requestid: 0e8aaffffee24de5-000017a1-0008f272-94b39f8f1fc8f82d-
50d324c7-5-
21:23:00.926 [RequestThread] INFO
com.amazon.alexa.avs.http.MessageParser - Response metadata:
{
"directive" : {
"header" : {
"namespace" : "SpeechSynthesizer",
"name" : "Speak",
"messageId" : "65106c28-f005-4f5a-87d5-f38ccaa58e0a",
"dialogRequestId" : "919336ea-25d5-43d9-8af8-61d1344fbcb5"
},
…
}
Request Id
Directive Name
Message Id
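Note that the Speak directive carries the same dialogRequestId as the Recognize event on the previous slide. A debugging helper can group logged messages by that id. This is a minimal sketch with hypothetical function names, assuming each message has already been parsed from the log into a dict.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def header_of(message: dict) -> dict:
    """Return the header of an event or directive, or {} if neither is present."""
    body = message.get("event") or message.get("directive") or {}
    return body.get("header", {})


def group_by_dialog(messages: List[dict]) -> Dict[Optional[str], List[str]]:
    """Group messages by dialogRequestId, preserving their logged order."""
    groups: Dict[Optional[str], List[str]] = defaultdict(list)
    for message in messages:
        header = header_of(message)
        kind = "event" if "event" in message else "directive"
        label = f"{kind}:{header.get('namespace')}.{header.get('name')}"
        groups[header.get("dialogRequestId")].append(label)
    return dict(groups)
```

Messages without a dialogRequestId end up under the key `None`, which keeps them visible instead of silently dropping them.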
13. Complex Sequences - Multi-turn
Participants: Microphone, AVS Controller, AudioPlayer
User: “Alexa, set a timer.” (PCM)
Recognize event
Speak directive: “For how long?” (PCM)
ExpectSpeech directive
SpeechStarted event
SpeechFinished event
User: “10 minutes.”
Recognize event
Alexa: “10 minutes starting now.”
…
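One way to read this sequence: the controller works through the directives in order, reporting speech events around the Speak directive and opening the microphone again for ExpectSpeech. The sketch below is an illustration under simplifying assumptions. Directives are plain dicts with a hypothetical "text" field standing in for the audio attachment, and `speak` and `listen` are caller-supplied callbacks.

```python
from typing import Callable, Dict, List


def run_turn(directives: List[Dict[str, str]],
             speak: Callable[[str], None],
             listen: Callable[[], None]) -> List[str]:
    """Process one response's directives in order; return the events to send."""
    events: List[str] = []
    for directive in directives:
        name = directive.get("name")
        if name == "Speak":
            events.append("SpeechStarted")
            speak(directive.get("text", ""))  # blocks until playback ends
            events.append("SpeechFinished")
        elif name == "ExpectSpeech":
            listen()                          # reopen the mic without a wake word
            events.append("Recognize")
        # Anything else is ignored in this sketch.
    return events
```

Running it on `[{"name": "Speak", "text": "For how long?"}, {"name": "ExpectSpeech"}]` produces the event order shown on the slide.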
14. Complex Sequences - Setting an Alarm
Participants: AVS Controller, AudioPlayer, Alert Manager (Alert Store, local management)
User: “Alexa, set a timer for 10 minutes.”
Recognize event
Speak directive: “10 minutes starting now.” (PCM)
SpeechStarted event
SpeechFinished event
SetAlert directive
SetAlertSucceeded event
Time passes…
AlertStarted event
AlertEnteredForeground event
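The alert flow relies on local management: once SetAlert is stored, the device fires the alert itself. A minimal sketch of such an alert manager follows. The class, its in-memory store, and the `tick` method are hypothetical, and a real device would persist the store so alerts survive a reboot.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AlertManager:
    """Toy alert store: token -> due time in seconds (hypothetical design)."""
    store: Dict[str, float] = field(default_factory=dict)
    events: List[str] = field(default_factory=list)

    def set_alert(self, token: str, due: float) -> None:
        """Handle a SetAlert directive: store it, then report success."""
        self.store[token] = due
        self.events.append("SetAlertSucceeded")

    def tick(self, now: float) -> List[str]:
        """Fire every alert that is due at `now`; return the fired tokens."""
        fired = sorted(t for t, when in self.store.items() if when <= now)
        for token in fired:
            del self.store[token]
            self.events.append("AlertStarted")
            self.events.append("AlertEnteredForeground")
        return fired
```

Calling `tick` from a local timer means the alert still sounds if the network is down when the timer expires; the events can be sent once the connection is back.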
15. Complex Sequences – Music Playback
Participants: AVS Controller, AudioPlayer
User: “Alexa, play classical music.”
Alexa: “Playing classical music from Amazon Music.”
Play directive
PlaybackStarted event
ProgressReportDelayElapsed event
ProgressReportIntervalElapsed event
PlaybackNearlyFinished event
ProgressReportIntervalElapsed event
…
PlaybackFinished event
Play directive
…
16. Tip #4: Music comes in many formats
- Common formats
- Need support for all current codecs
- Need to handle playlists as well
Format: Services
AAC/MP4: Amazon Music, iHeartRadio, TuneIn
MP3: Amazon Music, TuneIn
HLS: Amazon Music, iHeartRadio, TuneIn, Audible
PLS: iHeartRadio, TuneIn
M3U: TuneIn, Amazon Music
Shoutcast / ICY: iHeartRadio, TuneIn
ID3 Tags: iHeartRadio, TuneIn
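Handling playlists means the URL in a Play directive may point at a list of stream URLs rather than at audio. As a rough sketch of the two simplest playlist formats in the table, the parsers below extract stream URLs from M3U and PLS text. The function names are hypothetical, the playlist text is assumed to be already downloaded, and HLS needs a proper media library rather than this approach.

```python
import configparser
from typing import List


def parse_m3u(text: str) -> List[str]:
    """Return entries from M3U text, skipping blank lines and '#' comments."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def parse_pls(text: str) -> List[str]:
    """Return FileN entries from PLS text, ordered by N."""
    # PLS is INI-like; disable interpolation so '%' in URLs is left alone.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    if not parser.has_section("playlist"):
        return []
    numbered = [
        (int(key[4:]), value)
        for key, value in parser["playlist"].items()  # keys arrive lower-cased
        if key.startswith("file") and key[4:].isdigit()
    ]
    return [url for _, url in sorted(numbered)]
```

A player would typically try the first URL and fall back to the next one if the stream fails to open.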
17. Audio Player State Machine
States: Playing, Stopped, Idle, Buffer Underrun, Paused, Finished
18. Audio Player State Machine
States: Playing, Stopped, Idle, Buffer Underrun, Paused, Finished
Playback initiated via voice or companion app.
- Directive: Play
- Events: PlaybackStarted, Progress events
Superseded by other channels:
1. Dialog
2. Alerts
3. Content
Next Play directive comes after PlaybackNearlyFinished event.
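The channel ordering above (Dialog, then Alerts, then Content) can be expressed as a small arbiter that decides which active channel owns the audio output. This is a sketch with hypothetical names. Whether a background channel is paused or ducked is a product decision it does not make.

```python
from typing import Iterable, Optional

# Highest priority first, as listed on the slide.
CHANNEL_PRIORITY = ("Dialog", "Alerts", "Content")


def foreground_channel(active: Iterable[str]) -> Optional[str]:
    """Return the highest-priority active channel, or None if all are idle."""
    active_set = set(active)
    for channel in CHANNEL_PRIORITY:
        if channel in active_set:
            return channel
    return None


def is_superseded(channel: str, active: Iterable[str]) -> bool:
    """True if some other active channel outranks `channel`."""
    top = foreground_channel(active)
    return top is not None and top != channel
```

With music playing and a timer going off, `foreground_channel({"Content", "Alerts"})` picks Alerts, so the player should yield until the alert is dismissed.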
19. Audio Player State Machine
States: Playing, Stopped, Idle, Buffer Underrun, Paused, Finished
Playback paused by user action or other channels.
- Directive: none
- Events: PlaybackPaused, PlaybackResumed (back to Playing)
20. Audio Player State Machine
States: Playing, Stopped, Idle, Buffer Underrun, Paused, Finished
Playback stopped via voice command or companion app.
- Directive: Stop or ClearQueue.CLEAR_ALL
- Events: PlaybackStopped
Playback continues with a Play directive.
21. Audio Player State Machine
States: Playing, Stopped, Idle, Buffer Underrun, Paused, Finished
Playback reaches end of content.
- Directive: none
- Events: PlaybackFinished
Playback ends when no Play directives follow PlaybackNearlyFinished/PlaybackFinished events.
Playback continues with a new Play directive.
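The state slides can be summarized as a transition table. The sketch below is one possible encoding, not the sample app's implementation. The trigger names ("play", "pause", and so on) are hypothetical, and only events named on the slides are emitted, so the buffer underrun transitions emit nothing here.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class State(Enum):
    IDLE = "Idle"
    PLAYING = "Playing"
    PAUSED = "Paused"
    BUFFER_UNDERRUN = "Buffer Underrun"
    STOPPED = "Stopped"
    FINISHED = "Finished"


# (current state, trigger) -> (next state, event to send or None)
TRANSITIONS: Dict[Tuple[State, str], Tuple[State, Optional[str]]] = {
    (State.IDLE, "play"): (State.PLAYING, "PlaybackStarted"),
    (State.STOPPED, "play"): (State.PLAYING, "PlaybackStarted"),
    (State.FINISHED, "play"): (State.PLAYING, "PlaybackStarted"),
    (State.PLAYING, "pause"): (State.PAUSED, "PlaybackPaused"),
    (State.PAUSED, "resume"): (State.PLAYING, "PlaybackResumed"),
    (State.PLAYING, "underrun"): (State.BUFFER_UNDERRUN, None),
    (State.BUFFER_UNDERRUN, "refilled"): (State.PLAYING, None),
    (State.PLAYING, "stop"): (State.STOPPED, "PlaybackStopped"),
    (State.PAUSED, "stop"): (State.STOPPED, "PlaybackStopped"),
    (State.PLAYING, "end"): (State.FINISHED, "PlaybackFinished"),
}


class AudioPlayer:
    def __init__(self) -> None:
        self.state = State.IDLE
        self.events: List[str] = []

    def handle(self, trigger: str) -> bool:
        """Apply a trigger; return False (and change nothing) if it is invalid."""
        key = (self.state, trigger)
        if key not in TRANSITIONS:
            return False
        self.state, event = TRANSITIONS[key]
        if event is not None:
            self.events.append(event)
        return True
```

Keeping the table in one place makes it easy to check every state against the AVS functional design guidelines before testing on hardware.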
22. Tip #5: Design for the Future
• Events and Directives
• Directives can come in at any time – don’t assume order
• New directives and events can be added at any time – drop unknown directives on the floor
• Message Formats
• New elements may be added to JSON formats at any time
• Software Updating
• All AVS devices should have an OTA update mechanism
• Updates should not “brick” the device and should support fallback
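The first two points translate directly into how a client dispatches directives: look up a handler by namespace and name, drop anything unknown, and read only the JSON fields you need so that new elements are harmless. A minimal sketch, with a hypothetical handler registry:

```python
import json
from typing import Any, Callable, Dict, Optional, Tuple

Handler = Callable[[dict], Any]
HANDLERS: Dict[Tuple[str, str], Handler] = {}


def handles(namespace: str, name: str) -> Callable[[Handler], Handler]:
    """Register a handler for one directive (hypothetical registry)."""
    def register(func: Handler) -> Handler:
        HANDLERS[(namespace, name)] = func
        return func
    return register


@handles("SpeechSynthesizer", "Speak")
def on_speak(payload: dict) -> str:
    # Read only the fields needed; extra fields in the payload are ignored.
    return "speak"


def dispatch(raw: str) -> Optional[Any]:
    """Route a directive to its handler; drop malformed or unknown ones."""
    try:
        message = json.loads(raw)
    except ValueError:
        return None
    if not isinstance(message, dict) or not isinstance(message.get("directive"), dict):
        return None
    directive = message["directive"]
    header = directive.get("header") or {}
    handler = HANDLERS.get((header.get("namespace"), header.get("name")))
    if handler is None:
        return None  # unknown directive: drop it on the floor
    return handler(directive.get("payload") or {})
```

Logging the dropped directive's namespace and name before returning makes it easy to spot new capabilities worth supporting in a later update.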
23. Hands-free Requires Hands-on
• Building a hands-free experience requires sourcing multiple components and libraries
• Plan months (>3) in advance for tuning of a hands-free solution
• No all-in-one offerings today but multiple solutions to consider
• Wake word spotter:
• Front-end hardware:
• Audio libraries:
24. Hands-free Front End Architecture
• Mic Array: One or more input microphones (SNR >= 65dB, Sensitivity: -38dB ±1dB @ 94dB SPL)
• Echo Cancellation: Hardware (DSP) or software solution to subtract device audio output from mic input
• Wake Word Spotter: Software process and library trained to “spot” the Alexa wake word from an audio buffer
• Beamforming (only for multiple mics): Decision making library to pick the best quality mic for capturing user utterance
• Noise Reduction: Optional component to further reduce ambient noise and tune audio for an ASR
All of these components need to be sourced or developed for your solution from 3rd party offerings or by hand.
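To make the "pick the best quality mic" idea concrete, here is a deliberately naive stand-in: estimate each microphone's signal-to-noise ratio from one frame of samples and a measured noise floor, then choose the highest. This is an illustration only. Production beamformers combine channels with delays and filtering, and all names below are hypothetical.

```python
import math
from typing import List, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one frame of PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(signal_rms: float, noise_rms: float) -> float:
    """SNR in dB from two RMS levels; -inf when there is no signal."""
    if signal_rms <= 0.0:
        return float("-inf")
    if noise_rms <= 0.0:
        return float("inf")
    return 20.0 * math.log10(signal_rms / noise_rms)


def pick_best_mic(frames: List[Sequence[float]], noise_floors: Sequence[float]) -> int:
    """Return the index of the mic with the highest estimated SNR."""
    if not frames or len(frames) != len(noise_floors):
        raise ValueError("need one noise floor per microphone frame")
    scores = [snr_db(rms(f), n) for f, n in zip(frames, noise_floors)]
    return max(range(len(scores)), key=scores.__getitem__)
```

The noise floors would come from frames captured while nobody is speaking and the device is silent, which is one reason tuning takes time in the target enclosure.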
26. Amazon + Intel
CLOUD & DATA CENTER: Amazon EC2, Amazon S3
THINGS & DEVICES: AWS IoT, Alexa Voice Service
• 10+ year partnership
• Joint development
• Shared customer passion
• High performance + low costs
• World class supply chain
27. Did You Know?
• Collateral & SW/HW Dev Kits
• Standards Influence
• Form Factor Reference Design
• Innovation Excellence Program
• ODM Reference System
• Ethnographic Research
30. Intel’s Solid Voice & Speech Expertise
• Support for multiple designs and form factors
• Broad set of voice processing components
• Low power, highly optimized noise reduction
• High quality tuning & configuration tools
• Audio labs fully synchronized with leading partners
31. Enriching Daily Life with the Personal Experience and Simple, Natural Interaction of Voice
Intel and Amazon are Collaborating to Extend Natural Voice Interaction For Consumers
32. Call to Action
• Download the Sample from GitHub – build out a Raspberry Pi! ~ 2 hours
• Start your new product today…
https://github.com/alexa/alexa-avs-sample-app
Port sample
Harden tap-to-talk solution
Integrate hands-free
33. Other Alexa Sessions
Thursday
11:30am ALX202: How Amazon is enabling the future of Automotive (Venetian, Level 3, Lido 3003)
1pm ALX303: Building a Smarter Home with Alexa (Venetian, Level 3, Murano 3203)
3:30 ALX307: Voice-enabling Your Home and Devices with Amazon Alexa and AWS IoT (Venetian, Level 2, Opaline Theatre)
5pm ALX302: Build a Serverless Back End for Your Alexa-Based Voice Interactions (Venetian, Level 2, Opaline Theatre)
9:30am ALX304: Tips and Tricks on Bringing Alexa to Your Products (Venetian, Level 1, Marco Polo 806)
11am ALX305: From VUI to QA: Building a Voice-Based Adventure Game for Alexa (Venetian, Level 1, Marco Polo 806)
Friday 11am ALX203: Workshop: Creating Voice Experiences with Alexa Skills: From Idea to Testing in Two Hours (Mirage, Jamaica B)
1pm ALX306: State of the Union: Amazon Alexa and Recent Advances in Conversational AI (Venetian, Level 2, Sands Showroom)
11:30am and 2:30pm ALX204: Workshop: Build an Alexa-Enabled Product with Raspberry Pi (Mirage, Antigua B)
5pm ALX301: Alexa in the Enterprise: How JPL Leverages Alexa to Further Space Exploration with Internet of Things (Venetian, Level 2, Venetian B)
Wednesday