Comparisons of Computer
Input Modalities and
Methods
Yoshiharu Sato, http://yo-sato.com/
Input methods
• Mechanical Movement
• Audio
• Gaze
• Brain
• Multimodal Fusion
Mechanical Movement
• Advantages
• Easy to control
• Disadvantages
• Speed is limited by the mechanical movement.
• Hand/Finger methods
• Body Gestures
• Muscle sensing
Hand/Finger methods
• Advantages
• Disadvantages
• The artifact (input device) must be within reach of the
user, so it does not suit remote-control or mobile
scenarios.
• Hands are busy typing characters or holding a device, and
eyes are busy looking at the keyboard or touch panel, so it
does not suit mobile scenarios.
• Keyboard
• Handwriting
Keyboard
• Advantages
• Keys map directly to characters, so recognition accuracy is less
of a problem than with handwriting or voice recognition (note:
East Asian ideogram input does involve recognition). This is one
of the reasons why the keyboard is hard to beat as the mainstream
input means.
• Keys can represent any function, so the input language is rich.
• The device is cheap.
• It requires less computation than the other methods.
• Disadvantages
• Keyboard input operation is not natural.
• Inputting text relies on knowledge of key positions (“memory in
the world”, Norman 1988).
• Hardware keyboard
• Software keyboard
Hardware keyboard
• Advantages
• Keys are fixed, so users can rely on knowledge in the
world [Norman, 1988], which makes the keyboard easy to
operate.
• Disadvantages
• Keys are fixed and limited in number, so functions bound
to a key are sometimes overloaded, and the modes this
introduces confuse users.
Software keyboard
• Advantages
• Keys are configurable by software, so there is no need to
overload keys.
• The touch language is richer than key presses.
• Disadvantages
• Touch is less accurate than a mouse, and requires more
correction effort than a hardware keyboard.
• The keyboard occludes screen real estate and distracts the
user’s thinking.
• Keys are part of the touch monitor, which is hard to use under
extreme lighting conditions.
• Keys are configurable by software, so users need to look for
key positions, and a new key layout requires re-learning.
Handwriting
• State of the art
• Online handwriting technology was established in the 1990s.
• A typical recognition engine goes through normalization of
the input data (e.g., baseline, slant/slope), segmentation,
feature extraction, and classification (dynamic programming,
neural network, or similar, plus a language model).
• In the 1990s, commercial engines had about a 10% character
error rate for isolated characters in boxes and about 20% for
run-on mode. They were close to practical accuracy.
• Handwriting is already integrated into most retail devices
such as PCs and smartphones.
• Offline handwriting technology has not reached
practical use.
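The normalization and classification steps above can be illustrated with a small sketch. It uses dynamic time warping (DTW), one common dynamic-programming matcher, chosen here only as an example; the slides do not say which method commercial engines used. The function names, template strokes, and labels are hypothetical, and segmentation, slant correction, and the language model are left out.

```python
import math

def normalize(stroke):
    """Center a stroke on its centroid and scale it to unit size."""
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    # Guard against a zero-size stroke (a single dot).
    scale = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    return [((x - cx) / scale, (y - cy) / scale) for x, y in stroke]

def dtw_distance(a, b):
    """Dynamic-programming alignment cost between two point sequences."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]

def classify(stroke, templates):
    """Return the label of the template closest to the stroke under DTW."""
    query = normalize(stroke)
    return min(templates, key=lambda label: dtw_distance(query, normalize(templates[label])))

# Hypothetical single-stroke templates: a vertical bar and a horizontal dash.
TEMPLATES = {
    "I": [(0, 0), (0, 1), (0, 2)],
    "-": [(0, 0), (1, 0), (2, 0)],
}

if __name__ == "__main__":
    # A vertical stroke drawn elsewhere on the tablet, unevenly sampled.
    print(classify([(5, 5), (5, 6), (5, 7), (5, 9)], TEMPLATES))
```

Because DTW tolerates uneven sampling along the stroke, the query does not need the same number of points as the template.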
Handwriting (Cont’d)
• Advantages
• Ink is the character, so it is direct and intuitive. People have
been familiar with it for a long time.
• The pen can play the role of a mouse.
• Silent. It can protect privacy.
• Not subject to environmental noise.
• Disadvantages
• Ink needs conversion to character codes; recognition cannot
be 100% accurate, and recognition results require correction.
• Hand-busy
• The finger movement needed to write a character is complex
and time-consuming.
Body Gestures
• State of the art
• Body gestures are recognized by computer vision or
motion sensors.
• Microsoft Kinect
• Leap Motion
• NTT DoCoMo “UbiButton”
• “Ring”
• Shiseido
• There is research on using tongue gestures sensed by
magnetic sensors.
Body Gestures (Cont’d)
• Advantages
• Gesture language can be richer than mouse/keys and touch
by virtue of 3D.
• It does not occlude screen real-estate.
• Disadvantages
• Computer vision is subject to occlusion and lighting conditions.
• 3D freehand pointing precision may be lower than that with a
2D surface.
• Freehand gestures involve more muscles than
keyboard/mouse interaction, and large/frequent arm
movements cause fatigue over time.
• It’s socially awkward: making gestures at a machine looks
strange in a crowded environment.
Muscle sensing
• State of the art
• EMG (electromyography) with a muscle-sensing band on
the forearm can classify finger movements.
• There is no commercial system yet for commanding
computers.
• There are several EMG vendors, and a low-end device
costs less than $1,000.
Muscle sensing (Cont’d)
• Advantages
• Muscles can be sensed non-obtrusively, without an
artifact that must be within reach of the user.
• It allows hands-free operation.
• It doesn’t require observable interaction that can be
socially awkward. It protects privacy.
• Not affected by the environment, unlike voice
recognition or computer vision.
• Fatigue free.
• Disadvantages
• It is limited by mechanical movement speed.
• An input language must be designed.
Audio
• Advantages
• Speaking is direct, intuitive, and natural. People have been familiar
with it for a long time and do not have to learn it, so consumers
do not perceive a speech interface as an input task.
• Hands-free and eyes-free, so it suits mobile scenarios.
• Speaking is about 5 times faster than writing/typing.
• Disadvantages
• Voice needs conversion to character codes, which requires
recognition and correction.
• There is the problem of segmenting conversation, commands, and
text for recognition.
• Voice recognition
• Silent speech recognition
• Lip reading
Voice recognition
• State of the art
• Voice recognition technology has been investigated since the
1960s and was established in the 1990s.
• Voice recognition is already in practical use in call centers,
medical jobs, and other time-critical jobs where documentation
is required. Remote hands-free control by speech in a car is
also in practical use, and remote control of home equipment is
starting up.
• There has been research on using speech as the primary method
and another method for confirmation, selection, or correction.
One study showed double the productivity of T9. Another
combined speech with gaze and Dasher and doubled productivity
compared with Dasher alone.
Voice recognition (Cont’d)
• Advantages
• Voice can communicate emotions.
• Disadvantages
• It is subject to environmental noise. Recognition
accuracy drops drastically, by 20-50%, in noisy
environments. Accuracy also degrades with natural
spontaneous interaction and diverse speakers.
• It’s socially awkward in two ways:
• Speaking is loud and disturbs others.
• It does not keep privacy, so it does not suit crowded
environments.
• See http://yoshiharusato.wordpress.com/2014/05/29/why-speech-recognition-do-not-work/.
Silent speech recognition
• State of the art
• Research on non-voiced speech recognition has emerged
recently. Alternatives to the air microphone include the
throat microphone, surface EMG (electromyography),
ultrasound imaging of the tongue and lips, and a type of
stethoscope microphone.
• There is no commercial system yet.
Silent speech recognition (Cont’d)
• Advantages
• Silent speech solves the most critical defects of voiced
speech recognition.
• It is robust against environmental noise.
• It protects privacy.
• Disadvantages
• The practicality of the technology is yet to be proved.
• The quality of body-conducted speech is degraded compared
with normal speech.
• NAM (non-audible murmur) cannot capture pitch (e.g., the
tones of Chinese).
Lip reading
• State of the art
• Lip reading is approached through pattern recognition by
computer vision or muscle-movement recognition by EMG
(electromyography). The computer vision approach is still at
the level of limited vocabulary (Takeshi Saitoh, 2009), with a
word recognition rate of about 80-90%. The EMG approach can
distinguish only vowels.
• According to Rosenblum (2010), human lip-reading experts
can read tongue positions, air flows, and tones by observing
subtle movements of the chin, cheeks, and face. In theory, the
technology should be able to overcome its current
limitations.
• There are a number of studies that use lip reading to
supplement speech recognition or combine it with keys.
• There is no commercial system yet.
Lip reading (Cont’d)
• Advantages
• Lip reading solves the most critical defects of voiced
speech recognition.
• It is robust against environmental noise.
• It protects privacy.
• Disadvantages
• Lip reading is not yet mature as a standalone
technology.
• The computer vision approach is subject to occlusion and
lighting conditions.
Gaze
• State of the art
• It is approached by computer vision. There are already some
commercial systems from a number of vendors. Most commercial
systems measure the point of regard by the “corneal-reflection
and pupil-center” method with an infrared camera. Gaze tracking
is applied in a digital camera called “Iris” to sense focus.
• There are remote-sensor and head-mounted types. A head-mounted
eye tracker has the advantages of higher accuracy and simplified
geometry, and is robust against head movement.
• Current eye-tracking systems achieve an accuracy of 0.5 degrees
(equivalent to a region of approximately 15 pixels on a 17” display
with a resolution of 1024x768 pixels viewed from a distance of 70 cm).
• There have been a number of studies of eye typing for disabled
people, using a software keyboard or Dasher with gaze. One study
applied gaze tracking to replace candidate selection in a
document-authoring scenario, having observed that with a
traditional IME more than half the time was spent looking at and
selecting the right choice from the candidate list.
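The pixel estimate above can be checked with a short calculation. The sketch below is an illustration, not the derivation behind the slide. It assumes a flat panel with square pixels whose full 17-inch diagonal is viewable, and it treats the 0.5 degree accuracy as an angular offset from the true gaze point. The function name `gaze_error_pixels` is hypothetical. Under these assumptions the result comes out at roughly 18 pixels, the same order as the approximately 15 pixels quoted; the exact figure depends on the geometry assumed.

```python
import math

def gaze_error_pixels(accuracy_deg, distance_cm, diagonal_in, res_x, res_y):
    """Convert an angular gaze accuracy into an on-screen offset in pixels."""
    # Assumption: square pixels, so the aspect ratio follows the resolution.
    aspect = res_x / res_y
    diagonal_cm = diagonal_in * 2.54
    # Screen width from the diagonal and aspect ratio (Pythagoras).
    width_cm = diagonal_cm * aspect / math.hypot(aspect, 1.0)
    pixel_pitch_cm = width_cm / res_x
    # On-screen offset subtended by the angular error at the viewing distance.
    error_cm = distance_cm * math.tan(math.radians(accuracy_deg))
    return error_cm / pixel_pitch_cm

if __name__ == "__main__":
    px = gaze_error_pixels(0.5, 70.0, 17.0, 1024, 768)
    print(f"{px:.1f} pixels")
```

An offset of this size is larger than many on-screen widgets, which is one reason gaze interfaces tend to use large targets.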
Gaze (Cont’d)
• Advantages
• Eye gaze moves quicker than hand/finger/body. Simple target selection and cursor
positioning operations were performed approximately twice as fast with an eye tracker
as with any of the conventional cursor positioning devices. When all is performing well,
eye gaze interaction can give a subjective feeling of a highly responsive system, almost as
though the system is executing the user’s intentions before he or she expresses them
(Karn, 2003).
• The eyes can move without fatigue.
• The time required to move the eye is not related to the distance to be moved, unlike most
other input methods.
• Operating the eye requires no training or particular coordination for normal users.
• Disadvantages
• It is difficult to interpret the point of regard without other means of control.
Moving one’s eyes is often an almost subconscious act, and eye movement is always “on”;
this is called the “Midas Touch” problem (Karn, 2003).
• Dwell time (which hampers speed and is fatiguing), “gaze-and-touch”, or eye gestures have been used to solve this.
• Eyes basically provide only positional information.
• Computer vision is subject to occlusion and lighting conditions.
• It requires calibration before use.
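Dwell time, mentioned above as one answer to the “Midas Touch” problem, can be sketched as follows. This is a minimal illustration under stated assumptions, not any vendor's implementation: gaze samples arrive as hypothetical `(timestamp_ms, x, y)` tuples, targets are axis-aligned rectangles, and the 500 ms threshold is an arbitrary example value.

```python
def dwell_select(samples, targets, dwell_ms=500):
    """Return (timestamp_ms, target_name) for each dwell-based selection.

    samples: iterable of (timestamp_ms, x, y) gaze points in screen pixels.
    targets: dict mapping a name to a rectangle (x0, y0, x1, y1).
    A target is selected once per visit, after the gaze has stayed on it
    for at least dwell_ms; brief glances select nothing.
    """
    selections = []
    current = None      # target currently under the gaze, or None
    entered_at = None   # timestamp at which the gaze entered it
    fired = False       # whether this visit has already produced a selection
    for t, x, y in samples:
        hit = None
        for name, (x0, y0, x1, y1) in targets.items():
            if x0 <= x <= x1 and y0 <= y <= y1:
                hit = name
                break
        if hit != current:
            # The gaze moved to another target or to empty space: restart the timer.
            current, entered_at, fired = hit, t, False
        elif hit is not None and not fired and t - entered_at >= dwell_ms:
            selections.append((t, hit))
            fired = True
    return selections

if __name__ == "__main__":
    targets = {"A": (0, 0, 100, 100), "B": (200, 0, 300, 100)}
    # A 200 ms glance at A, then a longer look at B.
    samples = [(0, 50, 50), (100, 50, 50), (200, 50, 50),
               (300, 250, 50), (400, 250, 50), (600, 250, 50),
               (800, 250, 50), (900, 250, 50)]
    print(dwell_select(samples, targets))
```

The threshold is the trade-off noted above: a longer dwell filters out more accidental glances but slows every selection.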
Brain
• State of the art
• The brain-machine interface may someday replace all human-computer
interaction, but it is not certain when brain-machine interfaces will be able to
deal with text or symbol sequences.
• It uses
• expensive high-end sensors such as
• fMRI (functional magnetic resonance imaging), which observes brain blood-flow patterns,
• or MEG (magnetoencephalography),
• or low-end sensors such as
• NIRS (near-infrared spectroscopy)
• or EEG (electroencephalography).
• MSR showed that an off-the-shelf EEG ($1,500) can classify several brain states
[Tan, 2005]. Hitachi offers “Kokorogatari” (2005), which tells yes/no by NIRS.
Honda Research succeeded in 2006 in distinguishing the 3 symbols of
rock-paper-scissors by fMRI. Honda Research also showed in 2009 the robot
ASIMO moving its arm and foot as commanded by an EEG and NIRS system.
• There are ventures that offer solutions: NeuroSky, Inc., BrainGate, and
Emotiv Systems.
Brain (Cont’d)
• Advantages
• Eyes-free, hands-free.
• Disadvantages
• The technology is not yet mature. EEG requires intense
focus at present.
Multimodal Fusion
• Advantages
• Users are free to choose a modality, which
contributes to reliability (error correction).
• Can support more users.
• Modality fusion usually outperforms uni-modal
recognition.
• Disadvantages
• Processing (either early fusion or late fusion) can
become more complex than with uni-modal methods.
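Late fusion, named above, combines the outputs of separate recognizers rather than their raw signals. The sketch below shows one simple form, a weighted sum of per-modality class scores. The modality names, scores, and weights are hypothetical illustration values, and real systems may fuse in other ways.

```python
def late_fusion(scores_by_modality, weights):
    """Fuse per-modality class scores by weighted sum.

    scores_by_modality: dict of modality -> {label: score}.
    weights: dict of modality -> weight (a missing modality counts as 0).
    Returns (best_label, fused_scores).
    """
    fused = {}
    for modality, scores in scores_by_modality.items():
        w = weights.get(modality, 0.0)
        for label, score in scores.items():
            fused[label] = fused.get(label, 0.0) + w * score
    best = max(fused, key=fused.get)
    return best, fused

if __name__ == "__main__":
    # Hypothetical case: audio is unsure in noise, lip reading is more confident.
    scores = {
        "speech": {"bat": 0.55, "pat": 0.45},
        "lips": {"bat": 0.20, "pat": 0.80},
    }
    best, fused = late_fusion(scores, {"speech": 0.5, "lips": 0.5})
    print(best, fused)
```

In this example the speech recognizer alone would pick "bat", while the fused result picks "pat", which illustrates how a second modality can correct the first.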
Input methods
• Mechanical Movement – slow but reliable
• Audio – fast for text input
• Gaze – fast for pointing
• Brain
• Multimodal Fusion
Summary
• Silent Speech (including Lip Reading) is the preferred
technology for text input.
• Gaze is a fast pointing method and provides
information about the user’s intention.
• Finger dexterity is reliable for controlling and commanding
machines.

 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 

Comparisons of input modalities and methods

  • 1. Comparisons of Computer Input Modalities and Methods Yoshiharu Sato, http://yo-sato.com/
  • 2. Input methods • Mechanical Movement • Audio • Gaze • Brain • Multimodal Fusion
  • 3. Input methods • Mechanical Movement • Audio • Gaze • Brain • Multimodal Fusion
  • 4. Mechanical Movement • Advantages • Easy to control • Disadvantages • Speed is limited by the mechanical movement. • Hand/Finger methods • Body Gestures • Muscle sensing
  • 5. Hand/Finger methods • Advantages • Disadvantages • The artifact (input device) must be within reach of the user, so these methods do not suit remote-control or mobile scenarios. • The hands are busy typing characters or holding a device, and the eyes are busy looking at the keyboard or touch panel, so these methods do not suit mobile scenarios. • Keyboard • Handwriting
  • 6. Keyboard • Advantages • Keys map directly to characters, so recognition accuracy is less of a problem than with handwriting or voice recognition (note: East Asian ideographic input does involve recognition). This is one of the reasons why the keyboard is hard to beat as the mainstream input method. • Keys can represent any function, and the input language is rich. • The device is cheap. • It requires less computation than the other methods. • Disadvantages • Keyboard input is not a natural operation. • Entering text relies on knowledge of key positions (“memory in the world”, Norman 1988). • Hardware keyboard • Software keyboard
  • 7. Hardware keyboard • Advantages • Keys are fixed, so users can rely on knowledge in the world [Norman, 1988], and the keyboard is easy to operate. • Disadvantages • Keys are fixed and limited in number, so the functions bound to a key are sometimes overloaded, and the modes introduced to handle this confuse users.
  • 8. Software keyboard • Advantages • Keys are configurable by software, so there is no need to overload keys. • The touch language is richer than key presses. • Disadvantages • Touch is less accurate than a mouse and requires more correction effort than a hardware keyboard. • The keyboard occludes screen real estate and distracts the user’s thinking. • Keys are part of the touch monitor, which is hard to use under extreme lighting conditions. • Because keys are configurable by software, users need to look for key positions, and a new key layout requires re-learning.
  • 9. Handwriting • State of the art • Online handwriting recognition technology was established in the 1990s. • A typical recognition engine normalizes the input data (e.g., baseline, slant/slope), then performs segmentation, feature extraction, and classification (dynamic programming, a neural network, or similar, plus a language model). • In the 1990s, commercial engines had a character error rate of about 10% for isolated characters in boxes and about 20% in run-on mode, which was close to practical accuracy. • Handwriting input is already integrated into most retail devices such as PCs and smartphones. • Offline handwriting recognition has not yet reached practical use.
  • 10. Handwriting (Cont’d) • Advantages • Ink is the character itself, so input is direct and intuitive. People have been familiar with it for a long time. • The pen can play the role of a mouse. • It is silent, so it can protect privacy. • It is not subject to environmental noise. • Disadvantages • Ink must be converted to character codes, recognition cannot be 100% accurate, and recognition results require correction. • It keeps the hands busy. • The finger movement needed to write a character is complex and time-consuming.
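The character error rates quoted for handwriting engines are conventionally measured as the edit distance between the recognized text and the reference text, divided by the reference length. The slides do not say how the commercial engines were scored, so the following Python sketch assumes that standard definition; the function names are illustrative.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: fewest insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis takes i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalized by the reference length (assumed CER definition)."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

Under this definition, a 10% character error rate means roughly one correction for every ten characters written, which is why recognition results still require a correction step.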
  • 11. Body Gestures • State of the art • Body gestures are recognized by computer vision or motion sensors. • Microsoft Kinect • Leap Motion • NTT DoCoMo “UbiButton” • “Ring” • Shiseido • There is research on using tongue gestures sensed by magnetic sensors.
  • 12. Body Gestures (Cont’d) • Advantages • The gesture language can be richer than mouse/keys and touch by virtue of 3D. • It does not occlude screen real estate. • Disadvantages • Computer vision is subject to occlusion and lighting conditions. • 3D freehand pointing precision may be lower than pointing on a 2D surface. • Freehand gestures involve more muscles than keyboard/mouse interaction, and large or frequent arm movements cause fatigue over time. • It is socially awkward: making gestures at a machine in a crowded environment looks strange.
  • 13. Muscle sensing • State of the art • A forearm muscle-sensing band using EMG (electromyography) can classify finger movements. • There is no commercial system yet for commanding computers. • There are several EMG vendors, and a low-end device costs less than $1,000.
  • 14. Muscle sensing (Cont’d) • Advantages • Muscle activity can be sensed unobtrusively, without an artifact that must be within reach of the user. • It allows hands-free operation. • It does not require observable interaction that can be socially awkward, and it protects privacy. • It is not affected by the environment, unlike voice recognition or computer vision. • It is fatigue-free. • Disadvantages • It is limited by the speed of mechanical movement. • An input language must be designed.
  • 15. Input methods • Mechanical Movement • Audio • Gaze • Brain • Multimodal Fusion
  • 16. Audio • Advantages • Speaking is direct, intuitive, and natural. People have been familiar with it for a long time and do not have to learn it, so consumers do not perceive a speech interface as an input task. • It is hands-free and eyes-free, and suits mobile scenarios. • Speaking is 5 times faster than writing/typing. • Disadvantages • Voice must be converted to character codes, which requires recognition and correction. • There is a segmentation problem in distinguishing conversation, commands, and text to be recognized. • Voice recognition • Silent speech recognition • Lip reading
  • 17. Voice recognition • State of the art • Voice recognition technology has been investigated since the 1960s and was established in the 1990s. • Voice recognition is already in practical use in call centers, medical work, and other time-critical jobs where documentation is required. Remote hands-free control by speech in cars is also in practical use, and remote control of home equipment is starting up. • There has been research on using speech as the primary input and another method for confirmation, selection, or correction. One study showed double the productivity of T9. Another combined speech with gaze and Dasher and achieved twice the productivity of Dasher alone.
  • 18. Voice recognition (Cont’d) • Advantages • Voice can communicate emotions. • Disadvantages • It is subject to environmental noise: recognition accuracy drops drastically, by 20-50%, in noisy environments. Natural spontaneous interaction and diverse speakers also degrade accuracy. • It is socially awkward in two ways: • Speaking is loud and disturbs others. • It does not preserve privacy, so it does not suit crowded environments. • See http://yoshiharusato.wordpress.com/2014/05/29/why-speech-recognition-do-not-work/.
  • 19. Silent speech recognition • State of the art • Research on non-voiced speech recognition has emerged recently. Alternatives to the air microphone include the throat microphone, surface EMG (electromyography), ultrasound imaging of the tongue and lips, and a type of stethoscope microphone. • There is no commercial system yet.
  • 20. Silent speech recognition • Advantages • Silent speech solves the most critical defects of voiced speech recognition. • It is robust against environmental noise. • It protects privacy. • Disadvantages • The practicality of the technology is yet to be proven. • The quality of body-conducted speech is degraded compared with normal speech. • NAM (non-audible murmur) recognition cannot capture pitch (e.g., the tones of Chinese).
  • 21. Lip reading • State of the art • Lip reading is approached through pattern recognition by computer vision or through muscle movement recognition by EMG (electromyography). The computer vision approach is still at the level of limited vocabularies (Takeshi Saitoh, 2009), with a word recognition rate of about 80-90%. The EMG approach can distinguish only vowels. • According to (Rosenblum, 2010), human lip-reading experts can read tongue positions, air flows, and tones by observing subtle movements of the chin, cheeks, and face. In theory, the technology should be able to overcome its current limitations. • There are a number of studies on using lip reading to supplement speech recognition or on combining it with keys. • There is no commercial system yet.
  • 22. Lip reading (Cont’d) • Advantages • Lip reading solves the most critical defects of voiced speech recognition. • It is robust against environmental noise. • It protects privacy. • Disadvantages • Lip reading is not yet mature as a standalone technology. • The computer vision approach is subject to occlusion and lighting conditions.
  • 23. Input methods • Mechanical Movement • Audio • Gaze • Brain • Multimodal Fusion
  • 24. Gaze • State of the art • Gaze is tracked by computer vision, and there are already commercial systems from a number of vendors. Most commercial systems measure the Point-Of-Regard by the “corneal-reflection and pupil-center” method with an infrared camera. Gaze tracking is applied in a digital camera called “Iris” to sense focus. • There are remote-sensor and head-mounted types. A head-mounted eye tracker offers higher accuracy and simplified geometry, and is robust against head movement. • Current eye-tracking systems achieve an accuracy of 0.5 degrees (equivalent to a region of approximately 15 pixels on a 17” display with a resolution of 1024x768 pixels viewed from a distance of 70 cm). • There have been a number of studies of eye typing for people with disabilities, using a software keyboard or Dasher with gaze. One study applied gaze tracking to candidate selection in a document-authoring scenario, after observing that with a traditional IME more than half the time was spent looking for and selecting the right choice from the candidate list.
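The conversion from angular accuracy to on-screen pixels can be reproduced with basic trigonometry. The Python sketch below is an illustration under stated assumptions, not the calculation behind the slide's figure: it assumes square pixels, that the full 17-inch diagonal is viewable, and that the user looks at the screen head-on. The function and parameter names are hypothetical.

```python
import math


def gaze_error_pixels(accuracy_deg: float, distance_cm: float,
                      diagonal_in: float, res_x: int, res_y: int) -> float:
    """Width in pixels of the screen region subtended by a visual angle."""
    # Physical size on the screen covered by the angle at this viewing distance.
    size_cm = 2 * distance_cm * math.tan(math.radians(accuracy_deg) / 2)
    # Pixel pitch: physical diagonal divided by the diagonal pixel count
    # (valid only under the square-pixel assumption).
    pixel_pitch_cm = diagonal_in * 2.54 / math.hypot(res_x, res_y)
    return size_cm / pixel_pitch_cm


region = gaze_error_pixels(0.5, 70, 17, 1024, 768)
print(f"{region:.1f} px")
```

With these assumptions the result is about 18 pixels, the same order as the approximately 15 pixels quoted on the slide. The exact value depends on the assumed viewable area and geometry.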
  • 25. Gaze (Cont’d) • Advantages • Eye gaze moves more quickly than the hand, fingers, or body. Simple target selection and cursor positioning were performed approximately twice as fast with an eye tracker as with any of the conventional cursor positioning devices. When all is performing well, eye gaze interaction can give a subjective feeling of a highly responsive system, almost as though the system is executing the user’s intentions before he or she expresses them (Karn, 2003). • The eyes can move without fatigue. • Unlike most other input methods, the time required to move the eye is not related to the distance to be moved. • Operating the eyes requires no training or particular coordination for normal users. • Disadvantages • It is difficult to interpret the Point-Of-Regard without other means of control. Moving one’s eyes is often an almost subconscious act, and eye movement is always “on”; this is called the “Midas Touch” problem (Karn, 2003). • Dwell time (which hampers speed and is fatiguing), “gaze-and-touch”, and eye gestures have been used to solve this. • Eyes basically provide only positional information. • Computer vision is subject to occlusion and lighting conditions. • It requires calibration before use.
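Dwell time, one of the workarounds for the “Midas Touch” problem listed above, can be sketched as a filter over a stream of gaze samples: a target is selected only after the gaze has rested on it continuously for a threshold duration. The Python sketch below is a minimal illustration. The sample format, the 0.5-second threshold, and the function name are assumptions and are not taken from the slides or from any particular eye tracker.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical sample format: (timestamp in seconds, id of the target under
# the gaze, or None when the gaze is not on any target).
Sample = Tuple[float, Optional[str]]


def dwell_selections(samples: Iterable[Sample], dwell_s: float = 0.5) -> List[str]:
    """Return targets the gaze rested on for at least dwell_s seconds."""
    selections: List[str] = []
    current: Optional[str] = None  # target currently under the gaze
    start = 0.0                    # time the gaze arrived on that target
    fired = False                  # whether this visit already triggered a selection
    for t, target in samples:
        if target != current:
            # Gaze moved to a different target (or off all targets): restart timer.
            current, start, fired = target, t, False
        elif target is not None and not fired and t - start >= dwell_s:
            selections.append(target)
            fired = True           # select at most once per continuous visit
    return selections
```

A brief glance at a target never reaches the threshold and is ignored, which is the point of the technique. The cost is the added delay on every selection, the speed penalty the slide mentions.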
  • 26. Input methods • Mechanical Movement • Audio • Gaze • Brain • Multimodal Fusion
  • 27. Brain • State of the art • The brain-machine interface may someday replace all human-computer interaction, but it is not certain when brain-machine interfaces will be able to deal with text or symbol sequences. • It uses • expensive high-end sensors such as • fMRI (functional magnetic resonance imaging) • or brain blood patterns measured by fMRI • or MEG (magnetoencephalography), • or low-end sensors such as • NIRS (near-infrared spectroscopy) • or EEG (electroencephalogram). • MSR showed that an off-the-shelf EEG device ($1500) can classify several brain states [Tan, 2005]. Hitachi offers “Kokorogatari” (2005), which tells Yes/No by NIRS. Honda Research succeeded in 2006 in distinguishing the 3 symbols ‘paper, stone and scissors’ by fMRI. Honda Research also showed in 2009 the robot ASIMO moving its arm and foot as commanded by an EEG and NIRS system. • There are ventures that offer solutions: NeuroSky, Inc., BrainGate, and Emotiv Systems.
  • 28. Brain (Cont’d) • Advantages • Eyes-free and hands-free. • Disadvantages • The technology is not yet mature. EEG requires intense focus at present.
  • 29. Input methods • Mechanical Movement • Audio • Gaze • Brain • Multimodal Fusion
  • 30. Multimodal Fusion • Advantages • Users are free to choose a modality, which contributes to reliability (error correction). • It can support more users. • Modality fusion usually outperforms uni-modal recognition. • Disadvantages • Processing (either early fusion or late fusion) can become more complex than in mono-modal methods.
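Late fusion, one of the two processing styles named above, combines the outputs of independent per-modality recognizers instead of their raw features. The Python sketch below shows one simple variant, a weighted average of per-label scores. The modality names, the weights, and the scores are made up for illustration, and real systems may use other combination rules.

```python
from typing import Dict, Tuple

Scores = Dict[str, float]  # label -> score from one recognizer


def late_fusion(scores_by_modality: Dict[str, Scores],
                weights: Dict[str, float]) -> Tuple[str, Scores]:
    """Fuse per-modality label scores with a normalized weighted sum."""
    total_weight = sum(weights[m] for m in scores_by_modality)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    fused: Scores = {}
    for modality, scores in scores_by_modality.items():
        w = weights[modality] / total_weight
        for label, score in scores.items():
            fused[label] = fused.get(label, 0.0) + w * score
    best = max(fused, key=fused.get)  # label with the highest fused score
    return best, fused


# Hypothetical case: speech alone slightly prefers "b", lip reading prefers "p".
best, fused = late_fusion(
    {"speech": {"b": 0.55, "p": 0.45}, "lips": {"b": 0.20, "p": 0.80}},
    {"speech": 0.5, "lips": 0.5},
)
print(best, fused)
```

In this made-up case the second modality overturns the first recognizer's marginal decision, which illustrates how fusion can act as error correction. The extra bookkeeping (weights, score alignment across recognizers) is the added complexity listed as the disadvantage.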
  • 31. Input methods • Mechanical Movement – slow but reliable • Audio – fast for text input • Gaze – fast for pointing • Brain • Multimodal Fusion
  • 32. Summary • Silent speech (including lip reading) is the preferred technology for text input. • Gaze is a fast pointing method and provides information about the user’s intention. • Finger dexterity is reliable for controlling and commanding machines.