A comparison of stereo correspondence algorithms can be conducted through a quantitative evaluation of
disparity maps. Among the existing evaluation methodologies, the Middlebury methodology is the most
commonly used. However, the Middlebury methodology has shortcomings in its evaluation model and its
error measure. These shortcomings may bias evaluation results and make a fair judgement of algorithm
accuracy difficult. An alternative, the A* methodology, is based on a multiobjective optimisation model,
but it identifies only a subset of algorithms with comparable accuracy. In this paper, a quantitative
evaluation of disparity maps is proposed that performs an exhaustive assessment of the entire set of
algorithms. As an innovative aspect, evaluation results are presented and analysed as disjoint groups of
stereo correspondence algorithms with comparable accuracy. This is achieved by a partitioning-and-grouping
algorithm. Furthermore, the error measure used offers advantages over the error measure used in the
Middlebury methodology. The experimental validation is based on the Middlebury test-bed and algorithm
repository. The obtained results show seven groups with different accuracies. Moreover, the top-ranked
stereo correspondence algorithms under the Middlebury methodology are not necessarily the most
accurate under the proposed methodology.
1. An Evaluation Methodology for
Stereo Correspondence Algorithms
Ivan Cabezas, Maria Trujillo and Margaret Florian
ivan.cabezas@correounivalle.edu.co
February 25th 2012
International Conference on Computer Vision Theory and Applications, VISAPP 2012, Rome, Italy
2. Multimedia and Vision Laboratory
MMV is a research group of the Universidad del Valle in Cali, Colombia
3. Ayax Inc.
Ayax Inc. offers informatics solutions for decision analysis
4. Content
Stereo Vision
Canonical Stereo Geometry and Disparity
Ground-truth Based Evaluation
Quantitative Evaluation Methodologies
Middlebury’s Methodology
A* Methodology
A* Groups Methodology
Experimental Results
Final Remarks
5. Stereo Vision
The stereo vision problem is to recover the 3D structure of a scene using two or more images.
[Diagram: the optics of the camera system project the 3D world onto 2D left and right images (the direct problem); a correspondence algorithm estimates a disparity map from the stereo images, and a reconstruction algorithm recovers a 3D model (the inverse problem).]
Yang Q. et al., Stereo Matching with Colour-Weighted Correlation, Hierarchical Belief Propagation, and Occlusion Handling, IEEE PAMI 2009
6. Canonical Stereo Geometry and Disparity
Disparity is the distance between corresponding image points.
[Diagram: two canonical stereo configurations with image planes πl and πr, camera centres Cl and Cr separated by the baseline B, and focal length f. Left panel, accurate estimation: the corresponding points pl and pr triangulate the scene point P at depth Z. Right panel, inaccurate estimation: a mismatched point pr’ yields an erroneous point P’ at depth Z’.]
Trucco, E. and Verri A., Introductory Techniques for 3D Computer Vision, Prentice Hall 1998
7. Ground-truth Based Evaluation
Ground-truth based evaluation compares estimated disparity maps against disparity ground-truth data
Scharstein, D. and Szeliski, R., High-accuracy Stereo Depth Maps using Structured Light, CVPR 2003
Tola, E., Lepetit, V. and Fua, P., A Fast Local Descriptor for Dense Matching, CVPR 2008
Strecha, C., et al. On Benchmarking Camera Calibration and Multi-View Stereo for High Resolution Imagery, CVPR 2008
http://www.zf-usa.com/products/3d-laser-scanners/
8. Quantitative Evaluation Methodologies
The use of an evaluation methodology makes it possible to:
Assess specific components and procedures
Tune algorithms' parameters
Support decisions by researchers and practitioners
Measure the progress of the field
Szeliski, R., Prediction Error as a Quality Metric for Motion and Stereo, ICCV 2000
Kostliva, J., Cech, J., and Sara, R., Feasibility Boundary in Dense and Semi-Dense Stereo Matching, CVPR 2007
Tombari, F., Mattoccia, S., and Di Stefano, L., Stereo for Robots: Quantitative Evaluation of Efficient and Low-memory Dense Stereo Algorithms, ICARCV 2010
Cabezas, I. and Trujillo M., A Non-Linear Quantitative Evaluation Approach for Disparity Estimation, VISAPP 2011
9. Middlebury’s Methodology
The evaluation process comprises the following steps:
Select test-bed images
Select error criteria: nonocc, all, disc
Select and apply stereo algorithms: ObjectStereo, GC+SegmBorder, PUTv3, PatchMatch, ImproveSubPix, OverSegmBP
Select error measures
Compute error measures
Scharstein, D. and Szeliski, R., High-accuracy Stereo Depth Maps using Structured Light, CVPR 2003
Scharstein, D. and Szeliski, R., http://vision.middlebury.edu/stereo/eval/, 2012
11. Middlebury’s Methodology (iii)
Apply Evaluation Model → Interpret Results: the ObjectStereo algorithm produces accurate results.

Middlebury’s evaluation model:

Algorithm        Average Rank   Final Ranking
ObjectStereo         1.33             1
PatchMatch           3.00             2
PUTv3                3.33             3
GC+SegmBorder        4.00             4
ImproveSubPix        4.00             5
OverSegmBP           5.33             6
Scharstein, D. and Szeliski, R., A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms, IJCV 2002
Scharstein, D. and Szeliski, R., http://vision.middlebury.edu/stereo/eval/, 2012
12. Middlebury’s Methodology (iv): Weaknesses
The Middlebury evaluation model has some shortcomings:
In some cases, ranks are assigned arbitrarily
The same average ranking does not imply the same performance (and vice versa)
The cardinality of the set of top-performing algorithms is a free parameter
It operates on values derived from incommensurable measures
13. Middlebury’s Methodology (v): Weaknesses
The BMP measure quantifies the percentage of disparity estimation errors exceeding a threshold.
The BMP measure has some shortcomings (a computational sketch follows the list):
It is sensitive to the threshold selection
It ignores the error magnitude
It ignores the inverse relation between depth and disparity
It may conceal estimation errors of large magnitude, and it may also penalise errors with a small impact on the final 3D reconstruction
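As an illustration, a minimal sketch of how BMP is typically computed, assuming dense ground truth and a boolean evaluation mask (function and variable names are illustrative, not from the paper):

import numpy as np

def bmp(d_est, d_gt, mask, delta=1.0):
    # Percentage of evaluated pixels whose absolute disparity error exceeds delta.
    # mask selects the evaluated region (e.g., non-occluded pixels for 'nonocc');
    # delta is the error threshold in disparity units (commonly 1.0).
    err = np.abs(d_est[mask] - d_gt[mask])
    return 100.0 * np.count_nonzero(err > delta) / err.size

Note that the result depends only on whether each error exceeds delta, not on its magnitude: this is precisely the weakness listed above.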
Cabezas, I., Padilla, V., and Trujillo M., A Measure for Accuracy Disparity Maps Evaluation, CIARP 2011
Gallup, D., et al. Variable Baseline/Resolution Stereo, CVPR, 2008
14. A* Methodology
The A* evaluation methodology brings a theoretical background to the comparison of stereo correspondence algorithms. Its formulation involves (a reconstruction of the notation follows the list):
The set of algorithms under evaluation
The set of estimated disparity maps to be compared
The function that produces a vector of error measures
The set of vectors of error measures
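The formulas on this slide did not survive extraction; a plausible reconstruction, with symbol names assumed rather than taken from the paper, is:

A = { a1, …, an }        (the algorithms under evaluation)
D = { d1, …, dn }        (the estimated disparity maps)
f : D → ℝ^m              (the error-measure vector function; here m = 3: nonocc, all, disc)
V = { f(di) : di ∈ D }   (the set of error vectors, i.e. the objective space)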
Cabezas, I. and Trujillo M., A Non-Linear Quantitative Evaluation Approach for Disparity Estimation, VISAPP 2011
15. A* Methodology (ii)
The evaluation model of the A* methodology addresses the comparison of stereo correspondence algorithms as a multi-objective optimisation problem.
It defines a partition of the set A (the decision space) into the non-dominated subset A* and the remainder A’, where ≺ denotes the Pareto dominance relation.
Let p and q be two algorithms, and let Vp and Vq be the corresponding vectors in the objective space. Three possible relations are considered: Vp ≺ Vq, Vq ≺ Vp, or neither vector dominates the other (p and q are incomparable). The definitions are reconstructed below.
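The slide's equations were lost in extraction; the standard definitions they correspond to, stated in the notation above, are:

A* = { p ∈ A : there is no q ∈ A with Vq ≺ Vp },   A’ = A \ A*
Vq ≺ Vp  ⇔  Vq is no worse than Vp in every error measure and strictly better in at least one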
Cabezas, I. and Trujillo M., A Non-Linear Quantitative Evaluation Approach for Disparity Estimation, VISAPP 2011
16. A* Methodology (iii): Pareto Dominance
The Pareto dominance defines a partial-order relation.

VGC+SegmBorder = < 4.99, 5.78, 8.66 >
VPatchMatch = < 2.47, 7.80, 7.11 >
VImproveSubPix = < 2.96, 8.22, 8.55 >

Comparing VGC+SegmBorder = < 4.99, 5.78, 8.66 > against VPatchMatch = < 2.47, 7.80, 7.11 >: each vector is better in some component, so neither dominates the other:
GC+SegmBorder ~ PatchMatch

Comparing VPatchMatch = < 2.47, 7.80, 7.11 > against VImproveSubPix = < 2.96, 8.22, 8.55 >: the first vector is better in every component, so:
PatchMatch ≺ ImproveSubPix
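A minimal sketch of the dominance test in Python (illustrative code, not from the paper), reproducing the slide's example:

def dominates(vp, vq):
    # vp Pareto-dominates vq (lower error is better) when vp is no worse
    # in every component and strictly better in at least one.
    return all(p <= q for p, q in zip(vp, vq)) and any(p < q for p, q in zip(vp, vq))

v_gc  = (4.99, 5.78, 8.66)   # GC+SegmBorder
v_pm  = (2.47, 7.80, 7.11)   # PatchMatch
v_isp = (2.96, 8.22, 8.55)   # ImproveSubPix

assert not dominates(v_gc, v_pm) and not dominates(v_pm, v_gc)  # incomparable (~)
assert dominates(v_pm, v_isp)                                   # PatchMatch ≺ ImproveSubPix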
Van Veldhuizen, D., et al., Considerations in Engineering Parallel Multi-objective Evolutionary Algorithms, IEEE Transactions on Evolutionary Computation, 2003
17. A* Methodology (iv): Illustration
The evaluation process comprises the following steps:
Select test-bed images
Select error criteria: nonocc, all, disc
Select and apply stereo algorithms: ObjectStereo, GC+SegmBorder, PUTv3, PatchMatch, ImproveSubPix, OverSegmBP
Select error measures
Compute error measures
Scharstein, D. and Szeliski, R., High-accuracy Stereo Depth Maps using Structured Light, CVPR 2003
Scharstein, D. and Szeliski, R., http://vision.middlebury.edu/stereo/eval/, 2012
18. A* Methodology (v): Illustration
The evaluation model performs the partitioning and the grouping of the stereo algorithms under evaluation, based on the Pareto dominance relation.

Compute Error Measures:

Algorithm        nonocc   all    disc
ObjectStereo      2.20    6.99   6.36
GC+SegmBorder     4.99    5.78   8.66
PUTv3             2.40    9.11   6.56
PatchMatch        2.47    7.80   7.11
ImproveSubPix     2.96    8.22   8.55
OverSegmBP        3.19    8.81   8.89

Apply Evaluation Model: A* = { ObjectStereo, GC+SegmBorder }; A’ = { PatchMatch, PUTv3, ImproveSubPix, OverSegmBP }

Algorithm        nonocc   all    disc   Set
ObjectStereo      2.20    6.99   6.36   A*
GC+SegmBorder     4.99    5.78   8.66   A*
PUTv3             2.40    9.11   6.56   A’
PatchMatch        2.47    7.80   7.11   A’
ImproveSubPix     2.96    8.22   8.55   A’
OverSegmBP        3.19    8.81   8.89   A’
19. A* Methodology (vi): Illustration
Interpretation of results is based on the cardinality of the set A*.

Apply Evaluation Model → Interpret Results (A* evaluation model): the ObjectStereo and GC+SegmBorder algorithms are comparable to each other and have superior performance to the rest of the algorithms.

Algorithm        nonocc   all    disc   Set
ObjectStereo      2.20    6.99   6.36   A*
GC+SegmBorder     4.99    5.78   8.66   A*
PUTv3             2.40    9.11   6.56   A’
PatchMatch        2.47    7.80   7.11   A’
ImproveSubPix     2.96    8.22   8.55   A’
OverSegmBP        3.19    8.81   8.89   A’
20. A* Methodology (vii): Strength and Weakness
Strength: it allows a formal interpretation of results, based on the cardinality of the set A*, with regard to the considered test-bed imagery.
Weakness: it does not allow an exhaustive evaluation of the entire set of algorithms under evaluation, since it computes the set A* just once and provides no information about the algorithms in A’.
Cabezas, I. and Trujillo M., A Non-Linear Quantitative Evaluation Approach for Disparity Estimation, VISAPP 2011
21. A* Groups Methodology
It extends the evaluation model of the A* methodology, incorporating the capability of performing an exhaustive evaluation: the Pareto partition is applied iteratively, so every algorithm is eventually assigned to a group.
It introduces the partitioningAndGrouping algorithm:
A = Set ( { } );
A.load( “Algorithms.dat” );              // algorithms with their error vectors
A* = Set ( { } );
A’ = Set ( { } );
group = 1;
do {
    computePartition( A, A*, A’, g, ≺ ); // A* = non-dominated subset; A’ = A \ A*
    A*.save ( “A*_group_” + group );     // the A* of iteration i becomes group i
    group++;
    A.update ( A’ );                     // A = A \ A*
    A*.removeAll ( );                    // A* = { }
    A’.removeAll ( );                    // A’ = { }
} while ( ! A.isEmpty ( ) );
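A runnable sketch of the same loop in Python, assuming the dominates helper sketched on the Pareto-dominance slide (names are illustrative, not from the paper):

def partitioning_and_grouping(errors):
    # errors: dict mapping algorithm name -> error vector, e.g. (nonocc, all, disc).
    # Returns disjoint groups; group i is the non-dominated set that remains
    # after removing groups 1 .. i-1.
    remaining = dict(errors)
    groups = []
    while remaining:
        a_star = [p for p, vp in remaining.items()
                  if not any(dominates(vq, vp)
                             for q, vq in remaining.items() if q != p)]
        groups.append(a_star)
        for p in a_star:
            del remaining[p]   # A = A \ A*
    return groups

Applied to the SZE values shown on the following slides, this loop yields the five groups reported there.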
22. A* Groups Methodology (ii): Sigma-Z-Error
The A* Groups methodology uses the Sigma-Z-Error (SZE) measure.
The SZE measure has the following properties (a sketch follows the list):
It is inherently related to depth reconstruction in a stereo system
It is based on the inverse relation between depth and disparity
It considers the magnitude of the estimation error
It is threshold free
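A hedged sketch of SZE under the pinhole depth-from-disparity model Z = f·B/d, consistent with the properties above (the exact formulation and normalisation are given in the cited CIARP 2011 paper and may differ; names are illustrative):

import numpy as np

def sze(d_est, d_gt, mask, f=1.0, B=1.0, eps=1e-6):
    # Depth from disparity: Z = f*B/d (the inverse relation between depth and disparity).
    z_est = (f * B) / np.maximum(d_est[mask], eps)   # eps guards against zero disparity
    z_gt  = (f * B) / np.maximum(d_gt[mask], eps)
    # Accumulate the magnitude of the depth-reconstruction error; no threshold is involved.
    return float(np.sum(np.abs(z_est - z_gt)))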
Cabezas, I., Padilla, V., and Trujillo M., A Measure for Accuracy Disparity Maps Evaluation, CIARP 2011
23. A* Groups Methodology (iii): Illustration
The evaluation process for the selected algorithms using the proposed methodology:
Select test-bed images
Select error criteria: nonocc, all, disc
Select and apply stereo algorithms: ObjectStereo, GC+SegmBorder, PUTv3, PatchMatch, ImproveSubPix, OverSegmBP
Select error measures
Compute error measures
24. A* Groups Methodology (iv): Illustration
The evaluation model performs the partitioning and the grouping of the stereo algorithms under evaluation, based on the Pareto dominance relation.

Compute Error Measures (SZE):

Algorithm        nonocc    all     disc
ObjectStereo      73.88   117.90   36.25
GC+SegmBorder     50.48    64.90   24.33
PUTv3             99.67   333.37   53.79
PatchMatch        49.95   261.84   32.85
ImproveSubPix     50.66    97.94   32.01
OverSegmBP        58.65   108.60   34.58

Apply Evaluation Model, iteration 1: group 1 = { GC+SegmBorder, PatchMatch }; remainder = { ObjectStereo, PUTv3, ImproveSubPix, OverSegmBP }

Algorithm        nonocc    all     disc   Group
GC+SegmBorder     50.48    64.90   24.33  1
PatchMatch        49.95   261.84   32.85  1
PUTv3             99.67   333.37   53.79
ImproveSubPix     50.66    97.94   32.01
OverSegmBP        58.65   108.60   34.58
ObjectStereo      73.88   117.90   36.25
25. A* Groups Methodology (v): Illustration
Apply Evaluation Model (continued).

Iteration 2, over the remainder { ObjectStereo, PUTv3, ImproveSubPix, OverSegmBP }:

Algorithm        nonocc    all     disc   Group
ImproveSubPix     50.66    97.94   32.01  2
PUTv3             99.67   333.37   53.79
OverSegmBP        58.65   108.60   34.58
ObjectStereo      73.88   117.90   36.25

Iteration 3, over the remainder { ObjectStereo, PUTv3, OverSegmBP }:

Algorithm        nonocc    all     disc   Group
OverSegmBP        58.65   108.60   34.58  3
PUTv3             99.67   333.37   53.79
ObjectStereo      73.88   117.90   36.25

And so on …
26. A* Groups Methodology (vi): Illustration
Interpretation of results is based on the cardinality of each group.

Apply Evaluation Model → Interpret Results (A* Groups evaluation model):

Algorithm        nonocc    all     disc   Group
GC+SegmBorder     50.48    64.90   24.33  1
PatchMatch        49.95   261.84   32.85  1
ImproveSubPix     50.66    97.94   32.01  2
OverSegmBP        58.65   108.60   34.58  3
ObjectStereo      73.88   117.90   36.25  4
PUTv3             99.67   333.37   53.79  5

There are 5 groups of different performance:
The GC+SegmBorder and PatchMatch algorithms are comparable to each other and have superior performance to the rest of the algorithms
The ImproveSubPix algorithm is superior to the OverSegmBP, ObjectStereo, and PUTv3 algorithms
…
The PUTv3 algorithm has the lowest performance
27. Experimental Results
The conducted evaluation involves the following elements:
Test-bed images
Error criteria: nonocc, all, disc
Error measures: SZE, BMP
Stereo algorithms: 112 algorithms from the Middlebury repository
Evaluation models: A* Groups, Middlebury
Scharstein, D. and Szeliski, R., http://vision.middlebury.edu/stereo/eval/, 2012
28. Experimental Results (ii)
Group-1 algorithms under the proposed methodology, with their matching strategies and Middlebury rankings:

Algorithm        Strategy   Group   Middlebury’s Ranking
DoubleBP         Global     1        4
PatchMatch       Local      1       11
GC+SegmBorder    Global     1       13
FeatureGC        Global     1       18
Segm+Visib       Global     1       29
MultiresGC       Global     1       30
DistinctSM       Local      1       34
GC+occ           Global     1       67
MultiCamGC       Global     1       68

The top-15 algorithms of the Middlebury ranking, with their groups under the proposed methodology:

Algorithm        Group   Middlebury’s Ranking
ADCensus         2        1
AdaptingBP       2        2
CoopRegion       2        3
DoubleBP         1        4
RDP              2        5
OutlierConf      2        6
SubPixDoubleBP   2        7
SurfaceStereo    2        8
WarpMat          2        9
ObjectStereo     2       10
PatchMatch       1       11
Undr+OverSeg     2       12
GC+SegmBorder    1       13
InfoPermeable    2       14
CostFilter       2       15
29. Final Remarks
The use of the A* Groups methodology allows an exhaustive evaluation to be performed, as well as an objective interpretation of results.
Innovative results regarding the comparison of stereo correspondence algorithms were obtained using the proposed methodology and the SZE error measure.
The introduced methodology offers advantages over conventional approaches to comparing stereo correspondence algorithms.
The authors are already working on providing the research community with an accessible way to use the introduced methodology.
30. An Evaluation Methodology for
Stereo Correspondence Algorithms
Ivan Cabezas, Maria Trujillo and Margaret Florian
ivan.cabezas@correounivalle.edu.co
February 25th 2012
International Conference on Computer Vision Theory and Applications, VISAPP 2012, Rome, Italy