PR-232: AutoML-Zero: Evolving Machine Learning Algorithms From Scratch
Paper link: https://arxiv.org/abs/2003.03384
Video presentation link: https://youtu.be/J__uJ79m01Q
3. 1. Research Background
Introduction
• AutoML: automating the search over ML design choices such as the architecture and hyperparameters.
• Architecture (NAS, Neural Architecture Search)
• Hyperparameters
• Learning rule (activation function, full forward pass, data augmentation, weight optimization, layer and weight pruning)
AutoML survey:
https://arxiv.org/pdf/1810.13306.pdf
4. 1. Research Background
Architecture search
• Limitation: constrained search spaces
• NAS combines hand-designed building blocks with a search algorithm.
• Results are therefore biased toward the constrained search space.
Search space examples:
Saining Xie et al. (2019) https://arxiv.org/pdf/1904.01569.pdf
PR-155
Golnaz Ghiasi et al. (2019) https://arxiv.org/pdf/1904.07392.pdf
PR-166
Yanan Sun et al. (2019) https://arxiv.org/pdf/1710.10741.pdf
5. 1. Research Background
AutoML-Zero
• We propose to automatically search for whole ML algorithms using little restriction
on form and only simple mathematical operations as building blocks.
The building blocks purposely exclude higher-level concepts such as matrix decompositions and derivatives.
6. 1. Research Background
AutoML-Zero
• We propose to automatically search for whole ML algorithms using little restriction
on form and only simple mathematical operations as building blocks.
From a blank slate all the way to a final algorithm.
A truly enormous search space…
4 days
8. 2. Methods
Type (i)
Insert or remove a random instruction.
Removal is twice as likely as insertion.
Type (ii)
Replace all instructions within a component function.
Type (iii)
Replace a single argument.
When modifying a real-valued constant,
multiply it by a value drawn uniformly from [0.5, 2.0],
then flip its sign with 10% probability.
Evolutionary method
With P = 5 and T = 3: each cycle, randomly sample T individuals (tournament selection).
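The mutation types and tournament loop above can be sketched as follows. This is a hedged toy version of regularized evolution, not the paper's implementation: an "algorithm" is just a list of instruction ids plus one real-valued constant, and the instruction vocabulary is a placeholder.

```python
import random

# Toy instruction vocabulary (ids only, for illustration).
OPS = list(range(58))

def mutate(algo, rng):
    """Apply one of the three mutation types to (instructions, constant)."""
    instrs, const = list(algo[0]), algo[1]
    kind = rng.choice(["insert_remove", "randomize_all", "modify_arg"])
    if kind == "insert_remove":
        # Type (i): removal is made twice as likely as insertion.
        if instrs and rng.random() < 2 / 3:
            instrs.pop(rng.randrange(len(instrs)))
        else:
            instrs.insert(rng.randrange(len(instrs) + 1), rng.choice(OPS))
    elif kind == "randomize_all":
        # Type (ii): replace every instruction in the function.
        instrs = [rng.choice(OPS) for _ in instrs]
    else:
        # Type (iii): modify one argument; for a real-valued constant,
        # scale by U[0.5, 2.0] and flip the sign with 10% probability.
        const *= rng.uniform(0.5, 2.0)
        if rng.random() < 0.1:
            const = -const
    return (instrs, const)

def evolve(fitness, P=5, T=3, cycles=100, seed=0):
    """Regularized evolution: tournament of size T; the oldest individual dies."""
    rng = random.Random(seed)
    pop = [([rng.choice(OPS) for _ in range(4)], 1.0) for _ in range(P)]
    for _ in range(cycles):
        sample = rng.sample(pop, T)       # randomly pick T individuals
        parent = max(sample, key=fitness) # tournament winner
        pop.pop(0)                        # oldest dies (regularization)
        pop.append(mutate(parent, rng))   # mutated child enters population
    return max(pop, key=fitness)
```

For example, `evolve(lambda a: -abs(a[1] - 3.0), P=5, T=3)` evolves the constant toward 3.0 while the instructions drift randomly.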
12. 3. Experimental Results
AutoML-Zero vs. hand-designed reference (2-layer FC NN)
• Tasks are built from CIFAR-10 and MNIST.
• The 10 classes yield binary classification tasks: 10C2 = 45 pairs.
• Each pair has 8000 train / 2000 validation examples.
• 36 of the 45 pairs – Tsearch
(search tasks; 1–10 of them per evolution cycle)
• 9 of the 45 pairs – Tselect (select the algorithm with the best accuracy)
• Final evaluation on the CIFAR-10 test set.
• Number of possible operations: 7/58/58 for Setup/Predict/Learn
Figure 6 is an illustration of a single run, using (5, 20).
• Training Epoch : 1 or 10; evolution parameter: P=100, T=10
• Maximum num. instructions for Setup/Predict/Learn: 21/21/45.
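The task split above can be reproduced in a few lines. A minimal sketch, assuming a random 36/9 split (the paper's actual assignment of pairs to Tsearch/Tselect may differ):

```python
import random
from itertools import combinations

# 10 classes -> 10C2 = 45 binary classification pairs.
pairs = list(combinations(range(10), 2))
assert len(pairs) == 45

# Illustrative split: 36 pairs for search (Tsearch), 9 for selection (Tselect).
rng = random.Random(0)
rng.shuffle(pairs)
t_search, t_select = pairs[:36], pairs[36:]
```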
13. 3. Experimental Results
• The best model's parameters (e.g., learning rate, mean of the uniform init distribution) are tuned by random search on the Tselect datasets.
Likewise, the linear/nonlinear baselines' hyperparameters are tuned by random search.
• [CIFAR-10] best algorithm accuracy over 5 trials: 84.06 ± 0.10%
Linear baseline (logistic regression): 77.65 ± 0.22%
Nonlinear baseline (2-layer fully connected neural network): 82.22 ± 0.17%
• Transfer to other binary classification tasks:
1) SVHN (32 x 32 x 3): 88.12% AutoML-Zero vs. 59.58% linear baseline vs. 85.14% nonlinear baseline
2) down-sampled ImageNet (128 x 128 x 3): 80.78% vs. 76.44% vs. 78.44%
3) Fashion MNIST (28 x 28 x 1): 98.60% vs. 97.90% vs. 98.21%
• Note that the search space design excludes operations such as convolution and batch normalization.
AutoML-Zero vs. hand-designed reference (2-layer FC NN):
AutoML-Zero outperforms the 2-layer FC NN.
14. 3. Experimental Results
AutoML-Zero on challenging tasks
1) Few training examples
• With the training set reduced to 80 examples and trained for 100 epochs,
AutoML-Zero evolves a noisy ReLU (a dropout-like regularizer).
• Is this adaptation real?
Comparing (80 examples) vs. (800 examples) over 30 runs each,
the noisy ReLU emerges significantly more often with few examples (p < 0.0005).
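A "noisy ReLU" can be sketched as below. This is a hedged illustration of the idea (injecting noise before the ReLU acts as a dropout-like regularizer); the exact expression AutoML-Zero evolved differs, and the noise scale here is an arbitrary choice.

```python
import random

def relu(x):
    """Standard ReLU: max(0, x)."""
    return max(0.0, x)

def noisy_relu(x, noise_scale=0.5, rng=random):
    """ReLU on a noise-perturbed input; the noise regularizes training
    when only a few examples are available (apply at train time only)."""
    return max(0.0, x + rng.uniform(-noise_scale, noise_scale))
```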
15. 3. Experimental Results
AutoML-Zero on challenging tasks
2) Fast training
• With 800 training examples but only 10 epochs,
AutoML-Zero evolves learning-rate decay.
• Is this adaptation real?
Comparing 10 epochs vs. 100 epochs over 30 runs each,
learning-rate decay appears in 30/30 runs for the 10-epoch case
but only 3/30 for the 100-epoch case.
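Learning-rate decay of the kind evolution rediscovers here can be sketched as a simple schedule. The 1/sqrt(t) form below is a common choice used purely for illustration, not the exact expression AutoML-Zero evolved:

```python
import math

def decayed_lr(base_lr, step):
    """Shrink the learning rate as training progresses (1/sqrt(t) schedule)."""
    return base_lr / math.sqrt(1 + step)
```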