Comparisons of input modalities and methods

A survey of input modalities and methods.


  1. Comparisons of Computer Input Modalities and Methods • Yoshiharu Sato, http://yo-sato.com/
  2. Input methods • Mechanical Movement • Audio • Gaze • Brain • Multimodal Fusion
  3. Input methods • Mechanical Movement • Audio • Gaze • Brain • Multimodal Fusion
  4. Mechanical Movement • Advantages: • Easy to control. • Disadvantages: • Speed is limited by the mechanical movement. • Methods: Hand/Finger methods, Body Gestures, Muscle sensing
  5. Hand/Finger methods • Advantages: (none listed) • Disadvantages: • The input device must be within the user's reach, so these methods do not suit remote-control or mobile scenarios. • The hands are busy typing characters or holding a device, and the eyes are busy looking at the keyboard or touch panel, which also makes these methods a poor fit for mobile scenarios. • Methods: Keyboard, Handwriting
  6. Keyboard • Advantages: • Keys map directly to characters, so recognition accuracy is less of a problem than with handwriting or voice recognition (note: East Asian ideographic input does involve recognition). This is one reason the keyboard is hard to beat as the mainstream input method. • Keys can represent any function, so the input language is rich. • The device is cheap. • It requires less computation than the other methods. • Disadvantages: • Keyboard input is not a natural operation. • Entering text relies on knowledge of key positions (“knowledge in the world”, Norman 1988). • Types: Hardware keyboard, Software keyboard
  7. Hardware keyboard • Advantages: • Keys are fixed, so users can rely on knowledge in the world [Norman, 1988], which makes the keyboard easy to operate. • Disadvantages: • Keys are fixed and limited in number, so a single key is sometimes overloaded with several functions, and the modes introduced to handle this confuse users.
  8. Software keyboard • Advantages: • Keys are configurable in software, so there is no need to overload keys. • The touch language is richer than key presses. • Disadvantages: • Touch is less accurate than a mouse and requires more correction effort than a hardware keyboard. • The keyboard occludes screen real estate and distracts the user’s thinking. • The keys are part of the touch screen, which is hard to use under extreme lighting conditions. • Because keys are configurable in software, users need to look for key positions, and a new key layout requires relearning.
  9. Handwriting • State of the art: • Online handwriting recognition technology was established in the 1990s. • A typical recognition engine normalizes the input data (e.g., baseline, slant/slope), then performs segmentation, feature extraction, and classification (dynamic programming, a neural network, or similar, plus a language model). • In the 1990s, commercial engines had about a 10% character error rate for isolated characters written in boxes and about a 20% character error rate in run-on mode, which was close to practical accuracy. • Handwriting input is already integrated into most retail devices such as PCs and smartphones. • Offline handwriting recognition has not yet reached practical use.
  10. Handwriting (Cont’d) • Advantages: • Ink is the character itself, so input is direct and intuitive. People have long been familiar with it. • The pen can also play the role of a mouse. • It is silent, so it can protect privacy. • It is not subject to environmental noise. • Disadvantages: • Ink must be converted to character codes, recognition cannot be 100% accurate, and recognition results require correction. • The hands are busy. • The finger movement needed to write a character is complex and time-consuming.
  11. Body Gestures • State of the art: • Body gestures are recognized by computer vision or motion sensors. • Examples: Microsoft Kinect, Leap Motion, NTT DoCoMo “UbiButton”, “Ring”, Shiseido • There is research on using tongue gestures sensed by magnetic sensors.
  12. Body Gestures (Cont’d) • Advantages: • Thanks to 3D, the gesture language can be richer than mouse/keys and touch. • It does not occlude screen real estate. • Disadvantages: • Computer vision is subject to occlusion and lighting conditions. • 3D freehand pointing precision may be lower than on a 2D surface. • Freehand gestures involve more muscles than keyboard/mouse interaction, and large or frequent arm movements cause fatigue over time. • It is socially awkward: gesturing at a machine in a crowded environment looks strange.
  13. Muscle sensing • State of the art: • EMG (electromyography) with a muscle-sensing band on the forearm can classify finger movements. • There is no commercial system yet for commanding a computer. • There are several EMG vendors, and a low-end device costs less than $1,000.
  14. Muscle sensing (Cont’d) • Advantages: • Muscles can be sensed unobtrusively, without any device having to be within the user's reach. • It allows hands-free operation. • It does not require observable interaction that could be socially awkward, and it protects privacy. • It is not affected by environmental conditions the way voice recognition and computer vision are. • It is fatigue-free. • Disadvantages: • It is limited by the speed of mechanical movement. • An input language must be designed.
  15. Input methods • Mechanical Movement • Audio • Gaze • Brain • Multimodal Fusion
  16. Audio • Advantages: • Speaking is direct, intuitive, and natural. People have long been familiar with it and do not have to learn it, so consumers do not perceive a speech interface as an input task. • It is hands-free and eyes-free, and suits mobile scenarios. • Speaking is 5 times faster than writing or typing. • Disadvantages: • Voice must be converted to character codes, which requires recognition and correction. • There is a segmentation problem: conversation, commands, and text to be recognized must be told apart. • Methods: Voice recognition, Silent speech recognition, Lip reading
  17. Voice recognition • State of the art: • Voice recognition technology has been investigated since the 1960s and was established in the 1990s. • Voice recognition is already in practical use in call centers, medical work, and other time-critical jobs that still require documentation. Hands-free remote control by speech in cars is also in practical use, and remote control of home equipment is starting to appear. • There has been research on using speech as the primary method with another method for confirmation, selection, or correction. One study showed double the productivity of T9. Another combined speech with gaze and Dasher and achieved twice the productivity of Dasher alone.
  18. Voice recognition (Cont’d) • Advantages: • Voice can communicate emotion. • Disadvantages: • It is subject to environmental noise: recognition accuracy drops drastically, by 20-50%, in noisy environments. Accuracy also degrades with natural spontaneous interaction and with diverse speakers. • It is socially awkward in two ways: • Speaking is loud and disturbs others. • It does not keep privacy, so it does not suit crowded environments. • See http://yoshiharusato.wordpress.com/2014/05/29/why-speech-recognition-do-not-work/.
  19. Silent speech recognition • State of the art: • Research on non-voiced speech recognition has emerged recently. Alternatives to the air microphone include the throat microphone, surface EMG (electromyography), ultrasound imaging of the tongue and lips, and a type of stethoscope microphone. • There is no commercial system yet.
  20. Silent speech recognition (Cont’d) • Advantages: • Silent speech addresses the most critical defects of voiced speech recognition. • It is robust against environmental noise. • It protects privacy. • Disadvantages: • The practicality of the technology is yet to be proven. • The quality of body-conducted speech is degraded compared with normal speech. • NAM (non-audible murmur) is not able to recognize pitch (e.g., the tones of Chinese).
  21. Lip reading • State of the art: • Lip reading is approached either through pattern recognition by computer vision or through muscle-movement recognition by EMG (electromyography). The computer vision approach is still at the level of limited vocabulary (Takeshi Saitoh, 2009), with a word recognition rate of about 80-90%. The EMG approach can distinguish only vowels. • According to (Rosenblum, 2010), human lip-reading experts can read tongue positions, air flows, and tones by observing subtle movements of the chin, cheeks, and face. In theory, the technology should be able to overcome its current limitations. • A number of studies use lip reading to supplement speech recognition or combine it with keys. • There is no commercial system yet.
  22. Lip reading (Cont’d) • Advantages: • Lip reading addresses the most critical defects of voiced speech recognition. • It is robust against environmental noise. • It protects privacy. • Disadvantages: • Lip reading is not yet mature as a standalone technology. • The computer vision approach is subject to occlusion and lighting conditions.
  23. Input methods • Mechanical Movement • Audio • Gaze • Brain • Multimodal Fusion
  24. Gaze • State of the art: • Gaze tracking is approached through computer vision, and there are already commercial systems from a number of vendors. Most commercial systems measure the point of regard by the “corneal-reflection and pupil-center” method with an infrared camera. Gaze tracking is also applied in a digital camera called “Iris” to sense focus. • There are remote-sensor and head-mounted types. A head-mounted eye tracker offers higher accuracy and simpler geometry, and is robust against head movement. • Current eye-tracking systems achieve an accuracy of 0.5 degrees (equivalent to a region of approximately 15 pixels on a 17” display with a resolution of 1024x768 pixels viewed from a distance of 70 cm). • There has been a good deal of research on eye typing for people with disabilities, using a software keyboard or Dasher with gaze. One study applied gaze tracking to candidate selection in a document authoring scenario, having observed that with a traditional IME more than half the time was spent looking for and selecting the right choice from the candidate list.
  25. Gaze (Cont’d) • Advantages: • Eye gaze moves more quickly than the hand, fingers, or body. Simple target selection and cursor positioning operations were performed approximately twice as fast with an eye tracker as with any of the conventional cursor positioning devices. When all is performing well, eye gaze interaction can give a subjective feeling of a highly responsive system, almost as though the system is executing the user’s intentions before he or she expresses them (Karn, 2003). • The eyes can move without fatigue. • The time required to move the eye is not related to the distance to be moved, unlike most other input methods. • Operating the eye requires no training or particular coordination for normal users. • Disadvantages: • The point of regard is difficult to interpret without some other means of control. Moving one’s eyes is often an almost subconscious act, and eye movement is always “on”; this is called the “Midas Touch” problem (Karn, 2003). • Dwell time (which hampers speed and is fatiguing), “gaze-and-touch”, or eye gestures have been used to address this. • The eyes basically provide only positional information. • Computer vision is subject to occlusion and lighting conditions. • It requires calibration before use.
  26. Input methods • Mechanical Movement • Audio • Gaze • Brain • Multimodal Fusion
  27. Brain • State of the art: • The brain-machine interface may someday replace all human-computer interaction, but it is not certain when brain-machine interfaces will be able to handle text or symbol sequences. • It uses expensive high-end sensors such as fMRI (functional magnetic resonance imaging, which measures blood-flow patterns in the brain) or MEG (magnetoencephalography), or low-end sensors such as NIRS (near-infrared spectroscopy) or EEG (electroencephalogram). • MSR showed that an off-the-shelf EEG device ($1,500) can classify several brain states [Tan, 2005]. Hitachi offers “Kokorogatari” (2005), which tells Yes/No by NIRS. Honda Research succeeded in 2006 in distinguishing 3 symbols, ‘paper, stone and scissors’, by fMRI. Honda Research also showed in 2009 that the robot ASIMO moves its arm and foot as commanded by an EEG and NIRS system. • Some ventures offer solutions: NeuroSky, Inc., BrainGate, and Emotiv Systems.
  28. Brain (Cont’d) • Advantages: • Eyes-free and hands-free. • Disadvantages: • The technology is not yet mature. EEG currently requires intense focus.
  29. Input methods • Mechanical Movement • Audio • Gaze • Brain • Multimodal Fusion
  30. Multimodal Fusion • Advantages: • Users are free to choose a modality, which contributes to reliability (error correction). • It can support more users. • Modality fusion usually outperforms unimodal recognition. • Disadvantages: • Processing (either early fusion or late fusion) can become more complex than in unimodal methods.
  31. Input methods • Mechanical Movement – slow but reliable • Audio – fast for text input • Gaze – fast for pointing • Brain • Multimodal Fusion
  32. Summary • Silent speech (including lip reading) is the preferred technology for text input. • Gaze is the fast pointing method and provides information about the user’s intention. • Finger dexterity is reliable for controlling and commanding machines.
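
The late fusion mentioned on the Multimodal Fusion slide can be illustrated with a small sketch. It is not taken from the slides: the weighted log-linear combination rule, the function name `late_fusion`, the `FLOOR` constant, and the example scores are all assumptions chosen for illustration. Each recognizer is assumed to output a probability per candidate word, and the combiner sums weighted log-probabilities across modalities.

```python
import math

# Assumed floor probability for labels a recognizer did not score at all.
FLOOR = 1e-6


def late_fusion(scores_by_modality, weights):
    """Combine per-modality class probabilities after each recognizer has run.

    scores_by_modality: {modality: {label: probability}}
    weights:            {modality: non-negative weight}
    Returns (best_label, {label: fused_log_score}).
    """
    # Collect every label proposed by any modality.
    labels = set()
    for scores in scores_by_modality.values():
        labels.update(scores)

    fused = {}
    for label in labels:
        total = 0.0
        for modality, scores in scores_by_modality.items():
            p = max(scores.get(label, 0.0), FLOOR)
            total += weights[modality] * math.log(p)
        fused[label] = total

    best = max(fused, key=fused.get)
    return best, fused


# Hypothetical scores: speech in noise slightly prefers "cat",
# while lip reading clearly sees a closed-lip onset and prefers "bat".
speech_scores = {"bat": 0.48, "cat": 0.52}
lip_scores = {"bat": 0.80, "cat": 0.20}

best, fused = late_fusion(
    {"speech": speech_scores, "lips": lip_scores},
    {"speech": 1.0, "lips": 1.0},
)
print(best)
```

In this made-up example the speech recognizer alone would pick "cat", while the fused score picks "bat", which is the kind of error correction the slide attributes to combining modalities. Early fusion would instead merge features from both modalities before a single classifier runs.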
  • sabrinarahmawati

    Aug. 28, 2014
