PR-272: Accelerating Large-Scale Inference with Anisotropic Vector Quantization
[Guo et al., ICML 2020]
Paper link: https://arxiv.org/abs/1908.10396
Video presentation link: https://youtu.be/cU46yR-A0cs
reviewed by Sunghoon Joo
Our implementation is open-source and available at
https://github.com/google-research/google-research/tree/master/scann
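A minimal usage sketch of the released ScaNN library, adapted from the examples in that repository (the builder chain and the anisotropic_quantization_threshold parameter follow the repository's documentation, but the exact API may differ across versions):

import numpy as np
import scann  # pip install scann

# Toy data: (n, d) float32 database vectors and a few queries.
dataset = np.random.rand(10000, 100).astype(np.float32)
queries = np.random.rand(5, 100).astype(np.float32)

# score_ah scores candidates with asymmetric hashing trained under the
# anisotropic (score-aware) loss; T = 0.2 is the threshold used in the paper.
searcher = (scann.scann_ops_pybind.builder(dataset, 10, "dot_product")
            .tree(num_leaves=1000, num_leaves_to_search=100,
                  training_sample_size=10000)
            .score_ah(2, anisotropic_quantization_threshold=0.2)
            .reorder(100)
            .build())

neighbors, distances = searcher.search_batched(queries)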
1. Research Background
Objective & Approach: a new quantization loss function
• We propose the score-aware quantization loss function. The proposed loss can
work under any weighting function of the inner product and regardless of whether
the datapoints vary in norm.
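Written out, the loss and the decomposition the paper proves for it are:

\ell(x, \tilde{x}, w) = \mathbb{E}_{q}\big[ w(\langle q, x \rangle)\, \langle q, x - \tilde{x} \rangle^{2} \big]
                      = h_{\parallel}(w, \|x\|)\, \|r_{\parallel}(x, \tilde{x})\|^{2} + h_{\perp}(w, \|x\|)\, \|r_{\perp}(x, \tilde{x})\|^{2}

r_{\parallel}(x, \tilde{x}) = \frac{\langle x - \tilde{x},\, x \rangle\, x}{\|x\|^{2}}, \qquad
r_{\perp}(x, \tilde{x}) = (x - \tilde{x}) - r_{\parallel}(x, \tilde{x})

Here \tilde{x} denotes the quantized datapoint, r_parallel the residual component along x, and r_orthogonal the remainder.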
1) Which existing systems address the MIPS task?
• SPTAG (Chen et al., 2018), FAISS (Johnson et al., 2017), hnswlib (Malkov & Yashunin, 2016)
2) Where can the proposed loss be applied?
• Product quantization: the codebook is learned with the score-aware quantization loss.
• Binary quantization: the reconstruction loss of Stochastic Generative Hashing (Dai et al., 2017) is replaced with the proposed loss.
2. Methods
Quantization Technique
• The dataset X is quantized into a set of quantization points (the codebook) C.
• Each datapoint is mapped to one of the k quantized points, and the dot product with a query vector q is computed against that quantized point rather than the original datapoint.
• The dictionary C is learned with a Lloyd's-algorithm-style alternation (see the sketch below):
(Initialization) Initialize the codewords c, e.g., from randomly chosen datapoints.
(Partition assignment step) Assign each datapoint to whichever of the k codewords minimizes its quantization loss.
(Codebook update step) Update each codeword to minimize the total loss over the datapoints assigned to it.
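A minimal NumPy sketch of this alternation, under the simplifying assumption of constant weights h_perp = 1 and h_parallel = eta for every datapoint (function names are illustrative, not the paper's code; the codebook update is the closed form obtained by setting the gradient of the summed loss to zero):

import numpy as np

def anisotropic_loss(x, c, eta):
    # Split the residual x - c into the component parallel to x and the
    # orthogonal remainder, then weight the parallel part eta times more.
    r = x - c
    r_par = ((r @ x) / (x @ x)) * x
    r_orth = r - r_par
    return eta * (r_par @ r_par) + r_orth @ r_orth

def anisotropic_kmeans(X, k, eta, iters=20, seed=0):
    # Lloyd-style alternation with the score-aware (anisotropic) loss.
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    C = X[rng.choice(n, k, replace=False)].copy()  # init: random datapoints
    for _ in range(iters):
        # Partition assignment: each point goes to the codeword with the
        # smallest anisotropic loss (not simply the nearest in l2).
        losses = np.array([[anisotropic_loss(x, c, eta) for c in C] for x in X])
        assign = losses.argmin(axis=1)
        # Codebook update: closed-form minimizer of the summed loss over
        # each partition, solving A c = eta * sum(x_i).
        for j in range(k):
            P = X[assign == j]
            if len(P) == 0:
                continue
            norms = (P ** 2).sum(axis=1)                  # ||x_i||^2 per point
            A = len(P) * np.eye(d) + (eta - 1.0) * (P.T / norms) @ P
            C[j] = np.linalg.solve(A, eta * P.sum(axis=0))
    return C, assign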
3. Experimental Results
Experiment 1: general l2-reconstruction loss vs. score-aware loss
• With the weight function w(t) = I(t >= T), only query-datapoint pairs whose dot product exceeds the level T contribute to the loss, which then penalizes parallel error more heavily than orthogonal error.
• The threshold is set to T = 0.2.
• The baseline is quantization that minimizes the plain reconstruction loss.
[Figure: quantization of a datapoint under the two objectives (η = 4.125). Traditional: minimizing reconstruction loss. Proposed: minimizing score-aware loss.]
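The η in the figure is the ratio of the two weights; for the indicator weight w(t) = I(t >= T) the paper derives a closed form, which for unit-norm datapoints works out to

\eta = \frac{h_{\parallel}}{h_{\perp}} = \frac{(d - 1)\, T^{2}}{1 - T^{2}}, \qquad d = 100,\; T = 0.2 \;\Rightarrow\; \eta = \frac{99 \times 0.04}{0.96} = 4.125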
3. Experimental Results
Experiment 1 (continued): general l2-reconstruction loss vs. score-aware loss
• For softmax approximation, the logits are the inner products <q, x>, so a MIPS index can stand in for the full softmax over a huge label set (sketched below).
• The score-aware loss function puts more of the quantization budget on the (query, datapoint) pairs that matter: loss on top-ranking pairs is weighted more heavily than on low-scoring pairs.
• Evaluated on the Amazon-670k extreme classification dataset.
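For intuition, an illustrative sketch (all names hypothetical) of softmax approximation backed by MIPS: retrieve the classes whose embeddings score highest against the query, compute exact logits only for those candidates, and normalize over them:

import numpy as np

def approx_softmax(q, W, retrieve_top_k):
    # W: (num_classes, d) class embedding matrix; q: (d,) feature vector.
    # retrieve_top_k: a (possibly approximate) MIPS routine returning class ids.
    ids = retrieve_top_k(q)              # in practice backed by a quantized index
    logits = W[ids] @ q                  # exact logits for the candidates only
    p = np.exp(logits - logits.max())    # numerically stable softmax
    return ids, p / p.sum()              # probability mass outside ids is dropped

# Brute-force stand-in for the retrieval step, for testing the sketch:
def exact_top_k(W, k):
    return lambda q: np.argsort(-(W @ q))[:k]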
3. Experimental Results
Experiment 2: recall at a fixed bitrate, compared with prior quantization methods
Recall 1@N: over many queries, the fraction of queries for which the true top-1 datapoint is included in the top N retrieved results (a computation sketch follows the citations below).
• At every bitrate, the proposed method achieves higher Recall 1@N across the range of N.
QUIPS: Guo et al., Quantization based fast inner product search, 2016.
LSQ++: Martinez et al., LSQ++: Lower running time and higher recall in multi-codebook quantization, 2018.
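A minimal sketch of how this metric can be computed (names are illustrative):

import numpy as np

def recall_1_at_n(true_top1, retrieved_top_n):
    # true_top1: (num_queries,) index of the exact MIPS top-1 for each query.
    # retrieved_top_n: (num_queries, N) indices returned by approximate search.
    hits = [t in set(row) for t, row in zip(true_top1, retrieved_top_n)]
    return float(np.mean(hits))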
3. Experimental Results
Experiment 3: end-to-end speed/recall benchmark against other methods
• End-to-end MIPS benchmark on ann-benchmarks (glove-100-angular dataset)
• (http://ann-benchmarks.com/glove-100-angular_10_angular.html)
Benchmarks are all conducted on an Intel Xeon W-2135 with a single CPU thread.