The document discusses techniques for hand-based augmented reality interaction. It outlines several prior approaches, including fiducial markers or sensors on the hand, fixed camera setups, predefined hand postures, and image-based detection. The core technique uses head-worn RGB-D cameras to capture color and depth images and to detect the 3D position of the hand, first in camera coordinates and then in real-world coordinates. A virtual hand is then mapped from the real hand position so that bare-hand gestures can generate manipulation commands in augmented reality.
3. Prior approaches to hand input for AR interaction:
   • Fiducial markers on hand: B. Thomas et al., VR 02; V. Buchmann et al., CG&IT 04
   • Sensors on wrist: D. Kim et al., UIST 12; G. Park et al., HCII 14
   • Fixed camera setting: S. Corbett-Davies et al., VR 13; T. Piumsomboon et al., IVC 11
   • Predefined hand posture: T. Lee et al., ISWC 07
   • Image/camera coordinate-based: M. Tosas et al., ECCV Workshop 04; P. Mistry et al., SIGGRAPH ASIA 09; Leap Motion VR Development, 14
4. Occlusion problems in hand-based AR:
   • An object behind a hand appears to be in front of it
   • An object behind a hand cannot be seen
5. Goal: interaction with distant objects without tethered tracking devices
   • Depth perception is supported by a semi-transparent proxy hand
   • The virtual (proxy) hand is rendered in local reference coordinates
6. System overview (from start to end), sketched in code below:
   • The user wears an HWD with near- and far-range RGB-D cameras
   • The RGB-D cameras capture color and depth-map images (color image point clouds)
   • Hand segmentation and detection yield the 3D hand position
   • Local reference coordinates are set in real space using the hand and camera pose
   • Hand gestures generate manipulation commands
   • Visual feedback for the hands and environment follows the user's hand movement on the display
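A minimal per-frame sketch of this flow, in Python. The helper names (capture_rgbd, segment_hand) are hypothetical stand-ins, since the slides do not name an implementation; they only mark where the capture and segmentation stages would plug in.

```python
import numpy as np

def capture_rgbd():
    """Hypothetical stand-in: grab one color image and one depth map (in meters)
    from the head-worn near-range RGB-D camera."""
    color = np.zeros((480, 640, 3), dtype=np.uint8)
    depth = np.full((480, 640), 1.0, dtype=np.float32)
    return color, depth

def segment_hand(color, depth):
    """Hypothetical stand-in: segment the hand and return its pixel position (x, y)."""
    return 320, 240

def process_frame():
    # 1. RGB-D cameras capture color and depth-map images
    color, depth = capture_rgbd()
    # 2. Hand segmentation and detection
    x, y = segment_hand(color, depth)
    # 3. 3D hand position: back-project the pixel and its depth to camera
    #    coordinates, then to real-world coordinates (slides 10 and 11)
    # 4. Hand gestures generate manipulation commands; visual feedback for the
    #    hand and environment is rendered on the display (slides 12 to 14)
    return x, y, float(depth[y, x])
```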
7. Two head-worn RGB-D cameras:
   • Near-range RGB-D camera: detects the hand and handles close-up occlusion
   • Long-range RGB-D camera: supports occlusion and shadowing of more distant objects
   Physical world ≠ virtual world: the correct scale relationship between them must be determined.
9. Pipeline for placing the virtual hand (figure overview):
   1. Back-projection: observed image coordinates (x, y) are corrected to ideal image coordinates and back-projected to camera coordinates
   2. The 3D hand position in camera coordinates (x_c, y_c, z_c) is transformed to the 3D hand position in real-world coordinates (x_w, y_w, z_w)
   3. Virtual hand mapping from the real hand position
   4. Rendering of the virtual hand
10. 1. Image to camera coordinate system

$$ s \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = K \begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & c_x & 0 \\ 0 & f_y & c_y & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix} = \begin{bmatrix} f_x x_c + c_x z_c \\ f_y y_c + c_y z_c \\ z_c \end{bmatrix} $$

$$ x_c = \frac{(x - c_x)\, z_c}{f_x}, \qquad y_c = \frac{(y - c_y)\, z_c}{f_y}, \qquad z_c = \text{depth map value} $$
(Figure: back-projection from image coordinates (x, y) to the 3D hand position (x_c, y_c, z_c) in camera coordinates)
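A minimal sketch of this back-projection, assuming the depth map is indexed as depth_map[row, col] and already holds metric depth; the intrinsic values in the usage lines are illustrative, not taken from the slides.

```python
import numpy as np

def back_project(x, y, depth_map, fx, fy, cx, cy):
    """Step 1: back-project an observed pixel (x, y) into camera coordinates
    using the pinhole intrinsics (fx, fy, cx, cy)."""
    z_c = depth_map[y, x]              # z_c = depth map value
    x_c = (x - cx) * z_c / fx          # x_c = (x - c_x) z_c / f_x
    y_c = (y - cy) * z_c / fy          # y_c = (y - c_y) z_c / f_y
    return np.array([x_c, y_c, z_c])

# Example with illustrative intrinsics for a 640x480 camera
depth_map = np.full((480, 640), 0.8, dtype=np.float32)
p_c = back_project(320, 240, depth_map, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
```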
11. 2. Camera to world coordinate system

$$ P_W = \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix} = \lambda \, T_{W \to C}^{-1} \begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix} $$

The scale factor corrects the relationship between the physical and virtual worlds (virtual world scale ≠ physical world scale):

$$ \lambda = \frac{D_{virtual}}{D_{real}}, \qquad D_{virtual} = \text{camera-to-origin distance in virtual scale units}, \quad D_{real} = \text{camera-to-origin distance in real scale units} $$

(Figure: the 3D hand position (x_c, y_c, z_c) in camera coordinates maps to (x_w, y_w, z_w) in real-world coordinates)
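A minimal sketch of step 2, assuming T_world_to_cam is the 4x4 world-to-camera rigid transform obtained from the camera pose, and applying the scale λ to the resulting Cartesian position (the slide writes λ as a factor on the homogeneous product).

```python
import numpy as np

def camera_to_world(p_c, T_world_to_cam, d_virtual, d_real):
    """Step 2: transform a camera-coordinate point to real-world coordinates,
    rescaled so the virtual world matches the physical world."""
    lam = d_virtual / d_real                        # lambda = D_virtual / D_real
    p_c_h = np.append(p_c, 1.0)                     # homogeneous [x_c, y_c, z_c, 1]
    p_w_h = np.linalg.inv(T_world_to_cam) @ p_c_h   # camera -> world
    return lam * p_w_h[:3]                          # scaled (x_w, y_w, z_w)
```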
12. 3. Virtual hand mapping

AR version of the VR hand interaction technique HOMER*:

$$ P_V = C + \omega \cdot D_W \cdot \frac{D_{Obj}}{D_{Init}} \cdot \frac{P_W - C}{\lVert P_W - C \rVert} $$

The scalar factor ω · D_W · D_Obj / D_Init gives the distance, and (P_W − C) / ||P_W − C|| gives the direction from the camera position C toward the real hand position P_W, so the virtual hand P_V can reach a distant object.

Gestures generate input commands with the user's bare hand:
   • Grasping state (visible hand, but no visible finger)
   • Releasing state (visible finger)

* D. Bowman et al., "HOMER: Hand-centered Object Manipulation Extending Ray-casting technique," I3D 1997
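A minimal sketch of this mapping, with assumptions the slide leaves implicit: D_W is taken as the real hand's current distance from the camera position C, D_Obj as the selected object's distance, D_Init as the hand's distance when grasping started, and ω as a tunable scaling coefficient.

```python
import numpy as np

def map_virtual_hand(p_w, c, d_obj, d_init, omega=1.0):
    """Step 3: HOMER-style virtual hand mapping.
    p_w: real hand position in world coordinates; c: camera position."""
    offset = p_w - c
    d_w = np.linalg.norm(offset)                 # assumed: D_W = ||P_W - C||
    direction = offset / d_w                     # direction term (P_W - C) / ||P_W - C||
    distance = omega * d_w * (d_obj / d_init)    # distance term
    return c + distance * direction              # P_V
```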
13. • Semi-transparent visualization for the hand*
   (Figure: grasping a distant object through the HWD, with environmental occlusion and shadow cues at the correct distance)

* S. Zhai et al., "Investigating The 'Silk Cursor': Transparency for 3D Target Acquisition," CHI 1994.
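One simple way to realize the semi-transparent hand (a sketch, not the renderer from the slides) is alpha compositing of the segmented hand pixels over the rendered view, in the spirit of the Silk Cursor.

```python
import numpy as np

def composite_semitransparent_hand(scene_rgb, hand_rgb, hand_mask, alpha=0.5):
    """Blend the segmented real hand over the rendered scene with partial
    transparency, so virtual objects behind the hand remain partly visible."""
    out = scene_rgb.astype(np.float32)           # working copy of the rendered view
    m = hand_mask.astype(bool)                   # hand segmentation mask (H, W)
    out[m] = alpha * hand_rgb[m] + (1.0 - alpha) * out[m]
    return out.astype(np.uint8)
```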
14. • Occlusion, semi-transparent grey shadows, and guidelines for the dynamic environment
   • Voxels of distant physical objects in the environment are rendered transparently with relatively larger voxels (see the sketch below)
   • Semi-transparent grey shadows
   • Horizontal and vertical virtual guidelines
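A minimal sketch of coarse voxelization for the distant environment, assuming points is an (N, 3) point cloud from the long-range camera in camera coordinates; the voxel sizes and distance threshold are illustrative only.

```python
import numpy as np

def voxel_centers(points, voxel_size):
    """Quantize an (N, 3) point cloud into unique voxel centers of the given size."""
    keys = np.unique(np.floor(points / voxel_size).astype(np.int64), axis=0)
    return (keys + 0.5) * voxel_size

def environment_voxels(points, near_size=0.02, far_size=0.08, far_dist=1.5):
    """Represent nearby geometry with small voxels and more distant physical
    objects with relatively larger voxels (to be rendered transparently)."""
    dist = np.linalg.norm(points, axis=1)        # distance of each point from the camera
    near = voxel_centers(points[dist <= far_dist], near_size)
    far = voxel_centers(points[dist > far_dist], far_size)
    return near, far
```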