2. The Evolution of Google Voice Actions
[Timeline figure, 2010 to 2016. Legend: end products vs. developer APIs.]
Dates on the time axis:
– 2010.8: Android 2.2
– 2011.10: iPhone 4s Siri
– 2012.6: Android 4.1
– 2013.10: Android 4.4
– 2014.10: Android 5.0
– 2015.4: Android M (Preview)
– 2015.9: Android 6.0
– 2016.3: Android N (Preview)
– 2016.5: Google Assistant, with Google Home, Google Allo, Android Wear, and Android 7 listed alongside
– 2016.9: iOS 10 SiriKit (3rd party app control)
Products and APIs shown:
– Voice Search app with Voice Actions (English only)
– Google Search app; Google App (formerly Google Search) with “Google Now” and with “Now Cards”
– Google Now Launcher; say “OK Google” to launch Google Now
– Google Glass, Android Wear
– System Voice Actions
– Custom Voice Actions (with selected partner apps only)
– Google Voice Interaction API (API level > 23, two-way dialog)
– Google Voice Access (Beta, for people with disabilities)
– Microsoft Cortana
– An additional “API Lv. > 21” label
4. Google Now
• Google Now is about getting you just the right information, at just the right time.
– To open Google Now on Android 4.4 and above, touch and hold the Home button.
5. Voice Search
• Voice Search allows you to speak into your device to issue web searches.
• You can start it either by saying “Ok Google” or by touching the microphone icon.
– Supported queries include weather conditions, stock prices, flight status, sports scores, currency conversion, mathematical calculations, traffic, movie listings, and more.
6. Turn Hotword Detection On
• The ability to trigger a search or action by saying “Ok Google” is called hotword detection. To
turn it off or on, open Google Now or the Google app and touch Menu > Settings > Voice >
Hotword detection.
• Google analyzes sound picked up by the device’s microphone in intervals of a few seconds or less.
7. Voice Search & Actions
Voice Actions:
1. OK Google, open facebook (Open App)
2. OK Google, set an alarm for 6 AM (Alarm)
3. OK Google, call Obama (Phone)
4. OK Google, send a WhatsApp to Ken (Text Message)
5. OK Google, turn off Wifi (Device)
6. OK Google, turn on Flashlight (Device)
7. OK Google, take a photo (Camera)
8. OK Google, remind me to buy milk at 9 pm (Google reminder)
Voice Search:
1. OK Google, define accentuate
2. OK Google, convert 250 USD to Euro
3. OK Google, what is 18 times 22
4. OK Google, say good morning in Spanish
5. OK Google, directions to Taipei Station
6. OK Google, search the flights to London
7. OK Google, is it going to rain tomorrow
8. OK Google, stock price of Apple
Legend: each voice action is marked as either “can be supported from your app” or “only handled by Google”.
8. Voice Actions
• Voice Actions let you issue commands such as making phone calls, sending messages, launching apps, and many more.
• We can set up our app to receive these commands.
– System command types: Alarm, Timer, Calendar, Camera, Contacts, Email, Sports, Music, Phone, Open URL, Text Message…
– https://developer.android.com/guide/components/intents-common.html
– https://developers.google.com/voice-actions/
9. Voice Actions
• How does it work?
– Speech → Intent → Action
– Vocabulary (Google owns) → App Logic
– Example: “OK Google, send a WhatsApp message to Brenda.” → Intent: ACTION_SEND
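The flow on this slide (speech, then intent, then action) can be modeled as a toy in plain Java to make the division of labor concrete. This is only a sketch under loud assumptions: the keyword table stands in for Google's vocabulary, which is not public, and the names `VoiceActionFlow`, `toIntentAction`, `handlersFor`, and the `com.example.*` apps are hypothetical. Only the action strings are real Android constants (the values of `Intent.ACTION_SEND`, `AlarmClock.ACTION_SET_ALARM`, and `MediaStore.INTENT_ACTION_STILL_IMAGE_CAMERA`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

class VoiceActionFlow {
    // Hypothetical stand-in for the Google-owned vocabulary: keyword -> intent action.
    private static final Map<String, String> VOCABULARY = new LinkedHashMap<>();
    static {
        VOCABULARY.put("send", "android.intent.action.SEND");
        VOCABULARY.put("set an alarm", "android.intent.action.SET_ALARM");
        VOCABULARY.put("take a photo", "android.media.action.STILL_IMAGE_CAMERA");
    }

    // Step 1 (Google's side): map a recognized phrase to an intent action, if any keyword matches.
    static Optional<String> toIntentAction(String phrase) {
        String normalized = phrase.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> entry : VOCABULARY.entrySet()) {
            if (normalized.contains(entry.getKey())) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }

    // Step 2 (app side): list the apps whose declared actions include the given one.
    static List<String> handlersFor(String action, Map<String, List<String>> registry) {
        List<String> apps = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : registry.entrySet()) {
            if (entry.getValue().contains(action)) {
                apps.add(entry.getKey());
            }
        }
        return apps;
    }

    public static void main(String[] args) {
        // Hypothetical apps and the actions they declare in their intent filters.
        Map<String, List<String>> registry = new LinkedHashMap<>();
        registry.put("com.example.chat", Arrays.asList("android.intent.action.SEND"));
        registry.put("com.example.clock", Arrays.asList("android.intent.action.SET_ALARM"));

        Optional<String> action = toIntentAction("OK Google, send a WhatsApp message to Brenda.");
        if (action.isPresent()) {
            System.out.println(action.get() + " -> " + handlersFor(action.get(), registry));
        } else {
            System.out.println("No voice action matched.");
        }
    }
}
```

In this model the app never sees the spoken sentence; it only declares which actions it handles and runs its own logic once the intent arrives.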
10. Voice Interactions API
• This API allows a two-way voice dialog between the user and your app, for example:
– “OK Google, play some music” → “What genre?”
– “OK Google, turn on the lights” → “Which room?”
• It is only available from API level 23 (Android 6.0 Marshmallow).
– Add the android.intent.category.VOICE category to the intent filter in the Android manifest.
– It still seems very experimental: a lot of what the documentation describes doesn’t really work.
https://www.youtube.com/watch?v=OW1A4XFRuyc
11. Voice Interactions API
• Demo “Take a selfie”
Voice Action (Intent)
Voice Interactions API
http://io2015codelabs.appspot.com/codelabs/voice-interaction
12. Voice Interactions API
• Many intents don’t support voice interactions (as reported by someone who tested them):
– Even the ones Google says do
– Most of them were tested; only the IMAGE_CAPTURE intent worked
13. Custom Voice Action
• None of the demos are working so far.
– NPR One: "Listen to NPR."
– Realtor.com: "Show rentals near me on Realtor."
– Shazam: "Shazam this song."
– TripAdvisor: "Show attractions near me on TripAdvisor."
– Trulia: "Show homes for sale in Boston on Trulia."
– TuneIn Radio: "Open TuneIn in car mode."
– Walmart: "Scan my receipt on Walmart."
– Wink: "Activate home mode on Wink."
– Zillow: "Show me open houses nearby on Zillow."
– Flixster: "Show me Inception on Flixster."
– Instacart: "Show instacart availability."
– Lincoln: "Start my Lincoln MKZ."
14. Why Custom Voice Actions?
1. To ensure that certain keywords launch your app directly from the first sentence.
– If many apps handle the same intent, the Google App shows the available apps and asks the user to choose.
2. You are not limited to System Voice Actions, which gives you more scenarios.
3. Your app can get more information from the first sentence.
– Example: “Show restaurants in Taipei on TripAdvisor.”
15. iOS 10 SiriKit
• Siri in iOS 10 will only work with 6 types of third-party apps.
1. Audio/video calling
2. Messaging
3. Sending and receiving payments
4. Searching photos
5. Starting workouts
6. Booking rides
• The all-new SiriKit is pretty limited for now, and the
only streaming service it supports is Apple Music.
– No Spotify
16. The Evolution of Google Voice Actions
[Same timeline figure as slide 2, extended with speech APIs. Legend: end products, voice APIs, speech APIs.]
Speech APIs added in this version:
– Android Speech API (with Internet): API level > 8
– Android Speech API with offline recognition engine (English only): API level > 16
– Android Speech API with offline recognition engine: API level > 23
– Note: a language pack must be downloaded for offline recognition
– Cloud Speech API (Beta): https://cloud.google.com/speech/
1. Supports any system (Android, iOS, JavaScript, ...)
2. US$0.006 per 15 seconds (about US$1 per 40 minutes)
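The Cloud Speech API price quoted on this slide (US$0.006 per 15 seconds, roughly US$1 per 40 minutes) can be checked with a short calculation. The sketch below is not an official pricing tool: it assumes that audio is billed in whole 15-second increments, rounded up, and the class and method names (`CloudSpeechCost`, `costMills`, `formatUsd`) are made up for illustration. Amounts are kept in thousandths of a dollar to avoid floating-point rounding.

```java
import java.util.Locale;

class CloudSpeechCost {
    // US$0.006 per increment, expressed in thousandths of a dollar (figure taken from the slide).
    static final long MILLS_PER_INCREMENT = 6;
    static final long SECONDS_PER_INCREMENT = 15;

    // Cost of `seconds` of audio, assuming billing rounds up to whole 15-second increments.
    static long costMills(long seconds) {
        if (seconds < 0) {
            throw new IllegalArgumentException("seconds must be non-negative");
        }
        long increments = (seconds + SECONDS_PER_INCREMENT - 1) / SECONDS_PER_INCREMENT;
        return increments * MILLS_PER_INCREMENT;
    }

    // Formats thousandths of a dollar as a dollar amount with three decimals.
    static String formatUsd(long mills) {
        return String.format(Locale.ROOT, "US$%d.%03d", mills / 1000, mills % 1000);
    }

    public static void main(String[] args) {
        // 40 minutes = 2400 seconds = 160 increments of 15 seconds.
        System.out.println(formatUsd(costMills(40 * 60))); // prints US$0.960
    }
}
```

Forty minutes of audio comes to 160 increments at US$0.006 each, which is US$0.96 and matches the rounded "US$1/40 min" figure.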
17. Android Speech API
• It launches the Google voice recognition dialog through an intent.
• It returns a list of possible results (5 by default).
http://developer.android.com/reference/android/speech/RecognizerIntent.html
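Because the recognizer returns several candidate transcriptions rather than one, the app has to decide which one to act on. In an Activity the dialog is started with an intent whose action is `RecognizerIntent.ACTION_RECOGNIZE_SPEECH`, and the candidates come back as a string list under `RecognizerIntent.EXTRA_RESULTS`. The sketch below covers only the selection step in plain Java so that it runs without the Android SDK; the class name `RecognitionResults`, the method `firstKnownCommand`, and the sample phrases are hypothetical, and the list is assumed to be ordered best guess first.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

class RecognitionResults {
    // Returns the first known command that matches a candidate, scanning
    // candidates in the order given (assumed: best guess first).
    // Matching ignores case and surrounding whitespace.
    static Optional<String> firstKnownCommand(List<String> candidates, List<String> knownCommands) {
        for (String candidate : candidates) {
            String normalized = candidate.trim().toLowerCase(Locale.ROOT);
            for (String command : knownCommands) {
                if (normalized.equals(command.toLowerCase(Locale.ROOT))) {
                    return Optional.of(command);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for the list an Activity would read from RecognizerIntent.EXTRA_RESULTS.
        List<String> candidates = Arrays.asList(
                "turn on the lites", "Turn on the lights", "turn on the light");
        List<String> commands = Arrays.asList("turn on the lights", "play some music");
        System.out.println(firstKnownCommand(candidates, commands).orElse("(no match)"));
    }
}
```

Checking every candidate instead of only the top one makes the app more tolerant of near-miss transcriptions, at the cost of a slightly higher chance of acting on a lower-ranked guess.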
18. Summary
Google App / Google Voice Actions:
– The Google App launches your app after recognizing the speech.
– The first sentence is spoken to the Google App.
– So far, you can’t customize the first sentence that launches your app; it’s limited.
Android Speech API:
– Pure speech-to-text by Google Voice.
– The first sentence is spoken in your app.
– The cloud version (for any platform) costs US$0.006 per 15 seconds.