Evolutionary Deep Neural Network
(or NeuroEvolution)
신수용
2017. 8. 29
@SNU TensorFlow Study
2
https://www.youtube.com/watch?v=aeWmdojEJf0
https://github.com/ssusnic/Machine-Learning-Flappy-Bird
3
Evolutionary DNN
• Usually used to determine the DNN structure
– number of layers, number of nodes, …
• Can also be used to determine the weight values
– Flappy Bird example
Evolutionary Computation
5
Biological Basis
• Biological systems adapt themselves to a
new environment by evolution.
• Biological evolution
– Production of descendants changed from
their parents
– Selective survival of some of these
descendants to produce more descendants
Survival of the Fittest
6
Evolutionary Computation
• Stochastic search (or problem solving)
techniques that mimic the metaphor of
natural biological evolution.
7
General Framework
Generate an initial set of candidate solutions
→ Evaluate fitness (using the fitness function)
→ Terminate? Yes: output the optimal (best-found) solution
No: select parent individuals (selection operator)
→ Create offspring (crossover operator, mutation operator) and loop back to evaluation
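This loop can be sketched in Python. The operators and parameters below (a OneMax fitness, tournament selection, one-point crossover, bit-flip mutation, population size 20) are illustrative choices, not part of the slides:

```python
import random

def genetic_algorithm(fitness, init, select, crossover, mutate,
                      pop_size=20, generations=50):
    # Generate the initial population.
    population = [init() for _ in range(pop_size)]
    for _ in range(generations):            # termination: fixed generation budget
        offspring = []
        while len(offspring) < pop_size:
            p1 = select(population, fitness)  # parent selection
            p2 = select(population, fitness)
            offspring.append(mutate(crossover(p1, p2)))
        population = offspring
    return max(population, key=fitness)     # best solution found

# Toy instance: maximise the number of 1-bits in an 8-bit string (OneMax).
fitness = lambda ind: sum(ind)
init = lambda: [random.randint(0, 1) for _ in range(8)]

def select(pop, fit):                       # binary tournament selection
    a, b = random.sample(pop, 2)
    return a if fit(a) >= fit(b) else b

def crossover(p1, p2):                      # one-point crossover
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:]

def mutate(ind, rate=0.05):                 # bit-flip mutation
    return [1 - g if random.random() < rate else g for g in ind]

best = genetic_algorithm(fitness, init, select, crossover, mutate)
```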
8
Paradigms in EC
• Genetic Algorithm (GA)
– [J. Holland, 1975]
– Bitstrings, mainly crossover, proportionate selection
• Genetic Programming (GP)
– [J. Koza, 1992]
– Trees, mainly crossover, proportionate selection
• Evolutionary Programming (EP)
– [L. Fogel et al., 1966]
– FSMs, mutation only, tournament selection
• Evolution Strategy (ES)
– [I. Rechenberg, 1973]
– Real values, mainly mutation, ranking selection
Genetic Algorithms
10
GA (Genetic Algorithms)
• A method that obtains fit hypotheses by mimicking heredity in nature
• Characteristics
– Evolution is a successful and sophisticated adaptation mechanism in nature
– Applicable even to complex problems that are hard to model
– Parallelisable, so it can benefit from hardware performance
• A general optimization procedure that searches a large space for the solution with the best fitness
• Not guaranteed to find the optimal solution, but can obtain solutions of high fitness
11
Basic GA terminology (1/2)
• Chromosome ↔ Individual
– A candidate solution (or hypothesis) for the given problem
– Usually represented as a string
– The string's elements (integers, reals, etc.) are chosen as the problem requires
• Population
– The set of individuals (hypotheses)
1 1 0 1 0 0 1 1
12
Basic GA terminology (2/2)
• Fitness
– Expresses a hypothesis's quality as a numeric value
– A value evaluating how well each individual fits its environment
– For optimization problems, typically the objective-function value, or a penalty-function value when constraints are taken into account
• Fitness function
– The criterion used to compute fitness
13
GA operators (1/5)
• Selection operator
– Chooses individuals to become parents
– To generate many good offspring (and so find a solution), individuals with better fitness are given a relatively higher probability of being selected
– Proportional (roulette-wheel) selection
– Tournament selection
– Ranking-based selection
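Roulette-wheel (proportional) selection can be sketched as follows; the population and fitness values are made-up examples:

```python
import random

def roulette_wheel_select(population, fitnesses):
    """Pick one individual with probability proportional to its fitness.

    Assumes all fitness values are non-negative.
    """
    total = sum(fitnesses)
    spin = random.uniform(0, total)        # position of the wheel pointer
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running >= spin:
            return individual
    return population[-1]                  # guard against rounding error

pop = ["A", "B", "C"]
fits = [1.0, 1.0, 8.0]                     # "C" should be chosen ~80% of the time
counts = {ind: 0 for ind in pop}
for _ in range(10000):
    counts[roulette_wheel_select(pop, fits)] += 1
```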
14
GA operators (2/5)
• Crossover operator
– Creates offspring by crossing the parents' chromosomes, as organisms do in reproduction
– Whether crossover is performed is decided by a probability called the crossover rate
15
GA operators (3/5)
– One-point crossover
1 1 0 1 0 0 1 1
0 1 1 1 0 1 1 0
Crossover point
1 1 0 1 0 1 1 0
0 1 1 1 0 0 1 1
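The one-point crossover shown above can be sketched as:

```python
import random

def one_point_crossover(parent1, parent2):
    """Exchange the tails of two equal-length bitstrings at a random point."""
    point = random.randint(1, len(parent1) - 1)   # crossover point
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

# The slide's parents; with the crossover point after position 5 this
# reproduces the two children shown above.
p1 = [1, 1, 0, 1, 0, 0, 1, 1]
p2 = [0, 1, 1, 1, 0, 1, 1, 0]
c1, c2 = one_point_crossover(p1, p2)
```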
16
GA operators (4/5)
• Mutation operator
– Flips a single bit with a probability called the mutation rate
– Applied with a very small probability, e.g. 0.001
1 1 0 1 0 0 1 1
1 1 0 1 1 0 1 1
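A bit-flip mutation sketch (the high rate in the demo call is only for illustration; in practice it would be small, e.g. 0.001):

```python
import random

def bit_flip_mutation(chromosome, rate=0.001):
    """Flip each bit independently with a small probability (the mutation rate)."""
    return [1 - gene if random.random() < rate else gene
            for gene in chromosome]

original = [1, 1, 0, 1, 0, 0, 1, 1]
mutant = bit_flip_mutation(original, rate=0.5)   # high rate just for demonstration
```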
17
Example of Genetic Algorithm
18
Hypothesis space search
• Comparison with other search methods
– Less likely to get stuck in local minima (large jumps are possible)
• Crowding
– Similar individuals come to occupy most of the population
– Reduces diversity
19
Crowding
• Remedies for crowding
– Change the selection method
• Tournament selection, ranking selection
– “Fitness sharing”
• Reduce an individual's fitness when many similar individuals exist
– Restrict which individuals may mate
• Mating only the most similar individuals forms clusters or multiple subspecies
• Distribute individuals spatially and allow mating only between neighbours
20
Typical behavior of an EA
• Phases in optimizing on a 1-dimensional fitness
landscape
Early phase:
quasi-random population distribution
Mid-phase:
population arranged around/on hills
Late phase:
population concentrated on high hills
21
Geometric Analogy - Mathematical Landscape
22
Typical run: progression of fitness
Typical run of an EA shows so-called “anytime behavior”
Best fitness in population
Time (number of generations)
23
Best fitness in population
Time (number of generations)
Progress in 1st half
Progress in 2nd half
Are long runs beneficial?
• Answer:
- it depends how much you want the last bit of progress
- it may be better to do more shorter runs
24
Scale of “all” problems
Performance of methods on problems
Random search
Special, problem tailored method
Evolutionary algorithm
ECs as problem solvers: Goldberg’s 1989 view
25
Advantages of EC
• No presumptions w.r.t. problem space
• Widely applicable
• Low development & application costs
• Easy to incorporate other methods
• Solutions are interpretable (unlike NN)
• Can be run interactively, accommodate
user proposed solutions
• Provide many alternative solutions
26
Disadvantages of EC
• No guarantee for optimal solution within
finite time
• Weak theoretical basis
• May need parameter tuning
• Often computationally expensive, i.e.
slow
Genetic Programming
28
Genetic Programming
• Genetic programming uses variable-size
tree representations rather than fixed-length
strings of binary values.
• Program tree
= S-expression
= LISP parse tree
• Tree = Functions (Nonterminals) +
Terminals
29
GP Tree: An Example
• Function set: internal nodes
– Functions, predicates, or actions which
take one or more arguments
• Terminal set: leaf nodes
– Program constants, actions, or functions
which take no arguments
S-expression: (+ 3 (/ (− 5 4) 7))
Terminals = {3, 4, 5, 7}
Functions = {+, −, /}
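A minimal evaluator for such program trees, representing each node as a nested tuple (an illustrative encoding, not from the slides):

```python
def eval_sexpr(expr):
    """Evaluate a program tree given as nested tuples: (function, arg, arg)."""
    if not isinstance(expr, tuple):        # terminal: a constant
        return expr
    op, left, right = expr
    a, b = eval_sexpr(left), eval_sexpr(right)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "/":
        return a / b
    raise ValueError(f"unknown function {op!r}")

# The slide's tree: (+ 3 (/ (- 5 4) 7))
tree = ("+", 3, ("/", ("-", 5, 4), 7))
value = eval_sexpr(tree)                   # 3 + (5 - 4) / 7
```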
30
Tree based representation
• Trees are a universal form, e.g. consider
• Arithmetic formula: 2·π + ((x + 3) − y / (5 + 1))
• Logical formula: (x ∧ true) → ((x ∨ y) ∨ (z ↔ (x ∧ y)))
• Program:
i = 1;
while (i < 20)
{
i = i + 1
}
31
Tree based representation
• In GA, ES, EP chromosomes are linear
structures (bit strings, integer strings,
real-valued vectors, permutations)
• Tree shaped chromosomes are non-linear
structures.
• In GA, ES, EP the size of the
chromosomes is fixed.
• Trees in GP may vary in depth and width.
32
Crossover: Subtree Exchange
[Figure: in each parent tree a subtree is chosen at random; the two subtrees are exchanged, producing two offspring trees.]
33
Mutation
[Figure: a randomly chosen subtree of the parent is replaced by a newly generated random subtree.]
Evolution strategies
35
ES quick overview
• Developed: Germany in the 1970’s
• Early names: I. Rechenberg, H.-P. Schwefel
• Typically applied to:
– numerical optimisation
• Attributed features:
– fast
– good optimizer for real-valued optimisation
– relatively much theory
• Special:
– self-adaptation of (mutation) parameters standard
36
ES technical summary
Representation: Real-valued vectors
Recombination: Discrete or intermediary
Mutation: Gaussian perturbation
Parent selection: Uniform random
Survivor selection: (μ, λ) or (μ + λ)
Specialty: Self-adaptation of mutation step sizes
37
Introductory example
• Task: minimise f : Rⁿ → R
• Algorithm: “two-membered ES” using
– Vectors from Rⁿ directly as chromosomes
– Population size 1
– Only mutation creating one child
– Greedy selection
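A sketch of this two-membered (1+1)-ES on a toy sphere function (the fixed step size and iteration budget are illustrative; a real ES would self-adapt σ):

```python
import random

def two_membered_es(f, x0, sigma=0.5, iterations=2000):
    """(1+1)-ES: mutate the single parent with Gaussian noise, keep the better."""
    x, fx = list(x0), f(x0)
    for _ in range(iterations):
        child = [xi + random.gauss(0.0, sigma) for xi in x]   # Gaussian mutation
        fc = f(child)
        if fc <= fx:                       # greedy (elitist) selection
            x, fx = child, fc
    return x, fx

sphere = lambda v: sum(xi * xi for xi in v)    # toy objective: minimum at the origin
best, best_f = two_membered_es(sphere, [5.0, -3.0])
```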
38
Parent selection
• Parents are selected by uniform random
distribution whenever an operator needs
one/some
• Thus: ES parent selection is unbiased -
every individual has the same probability
to be selected
• Note that in ES “parent” means a
population member (in GA’s: a population
member selected to undergo variation)
39
Survivor selection
• Applied after creating λ children from the
μ parents by mutation and recombination
• Deterministically chops off the “bad stuff”
• Basis of selection is either:
– The set of children only: (μ, λ)-selection
– The set of parents and children: (μ + λ)-selection
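The two survivor-selection schemes can be sketched as follows (toy scalar individuals and fitness, minimisation assumed):

```python
def comma_selection(parents, children, fitness, mu):
    """(mu, lambda)-selection: only the mu best of the children survive."""
    return sorted(children, key=fitness)[:mu]

def plus_selection(parents, children, fitness, mu):
    """(mu + lambda)-selection: the mu best of parents and children survive."""
    return sorted(parents + children, key=fitness)[:mu]

f = lambda x: x * x                        # toy fitness, to be minimised
parents, children = [1, -2], [3, -1, 4, 0]
survivors_comma = comma_selection(parents, children, f, 2)
survivors_plus = plus_selection(parents, children, f, 2)
```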
40
Survivor selection cont’d
• (μ + λ)-selection is an elitist strategy
• (μ, λ)-selection can “forget”
• Often (μ, λ)-selection is preferred:
– Better at leaving local optima
– Better at following moving optima
– With the + strategy, bad σ values can survive in ⟨x, σ⟩ too long if
their host x is very fit
• Selective pressure in ES is very high (λ ≈ 7·μ is the
common setting)
Evolutionary Programming
42
EP quick overview
• Developed: USA in the 1960’s
• Early names: D. Fogel
• Typically applied to:
– traditional EP: machine learning tasks by finite state machines
– contemporary EP: (numerical) optimization
• Attributed features:
– very open framework: any representation and mutation op’s OK
– crossbred with ES (contemporary EP)
– consequently: hard to say what “standard” EP is
• Special:
– no recombination
– self-adaptation of parameters standard (contemporary EP)
43
EP technical summary tableau
Representation: Real-valued vectors
Recombination: None
Mutation: Gaussian perturbation
Parent selection: Deterministic
Survivor selection: Probabilistic (μ + λ)
Specialty: Self-adaptation of mutation step sizes (in meta-EP)
Evolutionary Neural Networks
(or Neuro-evolution)
45
ENN
• The back-propagation learning algorithm
cannot guarantee an optimal solution.
• In real-world applications, the
back-propagation algorithm might converge to
a set of sub-optimal weights from which
it cannot escape.
• As a result, the neural network is often
unable to find a desirable solution to the
problem at hand.
46
ENN
• Another difficulty is related to selecting an
optimal topology for the neural network.
– The “right” network architecture for a particular
problem is often chosen by means of heuristics,
and designing a neural network topology is still
more art than engineering.
• Genetic algorithms are an effective
optimization technique that can guide both
weight optimization and topology selection.
47
Encoding a set of weights in a chromosome
[Figure: a feed-forward network with inputs x1, x2, x3 (neurons 1–3), hidden neurons 4–7, and output neuron 8 producing y. The weight matrix (rows = to-neuron, columns = from-neuron) is read row by row; rows for the input neurons contain only zeros and contribute nothing.]
Chromosome: 0.9 -0.3 -0.7 -0.8 0.6 0.3 0.1 -0.2 0.2 0.4 0.5 0.8 -0.6 0.1 -0.2 0.9
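A sketch of this row-by-row encoding and its inverse; the neuron layout is taken from the slide's example, while the function names are illustrative:

```python
# The slide's 8-neuron network: inputs 1-3, hidden neurons 4-7, output neuron 8.
# Each non-input neuron maps to the list of its incoming connection weights.
connections = {
    4: [0.9, -0.3, -0.7],                  # from input neurons 1-3
    5: [-0.8, 0.6, 0.3],
    6: [0.1, -0.2, 0.2],
    7: [0.4, 0.5, 0.8],
    8: [-0.6, 0.1, -0.2, 0.9],             # from hidden neurons 4-7
}

def encode(conn):
    """Flatten the per-neuron weight lists into a single chromosome."""
    return [w for neuron in sorted(conn) for w in conn[neuron]]

def decode(chromosome, layout):
    """Rebuild the per-neuron weight lists from a chromosome and a layout
    (a mapping neuron -> number of incoming weights)."""
    conn, i = {}, 0
    for neuron in sorted(layout):
        n = layout[neuron]
        conn[neuron] = chromosome[i:i + n]
        i += n
    return conn

chromosome = encode(connections)
layout = {n: len(ws) for n, ws in connections.items()}
```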
48
Fitness function
• The second step is to define a fitness
function for evaluating the chromosome’s
performance.
– This function must estimate the performance
of a given neural network.
– A simple fitness function is defined from the
sum of squared errors.
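A sum-of-squared-errors fitness sketch. The tiny data set and candidate models are made up for illustration; in the slides' setting, `predict` would be the neural network decoded from a chromosome:

```python
def sse_fitness(predict, data):
    """Sum of squared errors of a model on a data set; lower is fitter."""
    return sum((predict(x) - y) ** 2 for x, y in data)

# Hypothetical example: compare two candidate models on a tiny data set.
data = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
good = lambda x: 2.0 * x                   # fits the data exactly
bad = lambda x: x + 1.0

good_err = sse_fitness(good, data)
bad_err = sse_fitness(bad, data)
```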
49
Crossover
[Figure: crossover example — a child network is created by combining parts of the two parents' weight chromosomes, inheriting some connection weights from Parent 1 and the rest from Parent 2.]
50
Mutation
[Figure: mutation example — small random perturbations replace selected weights of the original network; here the weights 0.4 and -0.3 are mutated to -0.1 and 0.2.]
51
Architecture Selection
• The architecture of the network (i.e. the
number of neurons and their
interconnections) often determines the
success or failure of the application.
• Usually the network architecture is decided
by trial and error; there is a great need for a
method of automatically designing the
architecture for a particular application.
– Genetic algorithms may well be suited for this
task.
52
Encoding
[Figure: a 6-neuron network (inputs x1, x2 = neurons 1–2, output neuron 6 producing y) encoded as a 6×6 binary connectivity matrix (rows = to-neuron, columns = from-neuron); the matrix is read row by row into the chromosome.]
Chromosome:
0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 0
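Decoding this connectivity chromosome back into a matrix can be sketched as follows (helper names are illustrative):

```python
# The slide's 36-bit connectivity chromosome, read as a 6x6 matrix
# (rows = to-neuron, columns = from-neuron).
chromosome = [0,0,0,0,0,0, 0,0,0,0,0,0, 1,1,0,0,0,0,
              1,0,0,0,0,0, 0,1,0,0,0,0, 0,1,1,1,1,0]

def to_matrix(bits, n):
    """Reshape a flat connectivity chromosome into an n x n matrix."""
    assert len(bits) == n * n
    return [bits[row * n:(row + 1) * n] for row in range(n)]

conn = to_matrix(chromosome, 6)
# Incoming connections of each neuron, numbered 1-6.
incoming = {to + 1: [frm + 1 for frm, bit in enumerate(row) if bit]
            for to, row in enumerate(conn)}
```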
53
Process
[Figure: in generation i, each network in the population is trained on the training data set (a table of numeric input/target samples) and assigned a fitness (e.g. Neural Network j: fitness = 117); parents are selected, and crossover and mutation produce the children that form generation i + 1.]
Evolutionary DNN
55
Good reference blog
• https://medium.com/@stathis/design-by-evolution-393e41863f98
56
Evolving Deep Neural Networks
• https://arxiv.org/pdf/1703.00548.pdf
• CoDeepNEAT
– for optimizing deep learning architectures
through evolution
– Evolving DNNs for CIFAR-10
– Evolving LSTM architectures
– The experimental comparison is not entirely clear, however.
57
Large-Scale Evolution of Image
Classifiers
• https://arxiv.org/abs/1703.01041
• Individual
– a trained architecture
• Fitness
– Individual’s accuracy on a validation set
• Selection (tournament selection)
– Randomly choose two individuals
– The better one is selected as the parent
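The tournament step described above can be sketched as follows (the architecture names and accuracies are made-up placeholders):

```python
import random

def tournament_step(population, fitness):
    """One selection step as described on the slide: sample two individuals
    at random and return the fitter one as the parent."""
    a, b = random.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b

# Hypothetical population of architecture ids with validation accuracies.
population = ["arch-a", "arch-b", "arch-c"]
accuracy = {"arch-a": 0.71, "arch-b": 0.85, "arch-c": 0.62}.__getitem__
parent = tournament_step(population, accuracy)
```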
58
Large-Scale Evolution of Image
Classifiers
• Mutation
– Pick a mutation from a predetermined set
• Train the child
• Repeat.
59
Large-Scale Evolution of Image
Classifiers
60
Convolution by Evolution
• https://arxiv.org/pdf/1606.02580.pdf
• GECCO16 paper
• Differentiable version of the Compositional
Pattern Producing Network (DPPN)
– The topology is evolved, but the weights are
learned
– Compressed the weights of a denoising
autoencoder from 157,684 to roughly 200
parameters with comparable image
reconstruction accuracy
61
62
sooyong.shin@khu.ac.kr
@likesky3

Contenu connexe

Similaire à Evolutionary (deep) neural network

CSA 3702 machine learning module 4
CSA 3702 machine learning module 4CSA 3702 machine learning module 4
CSA 3702 machine learning module 4Nandhini S
 
WIX3001 Lecture 6 Principles of GA.pptx
WIX3001 Lecture 6 Principles of GA.pptxWIX3001 Lecture 6 Principles of GA.pptx
WIX3001 Lecture 6 Principles of GA.pptxKelvinCheah4
 
AI.3-Evolutionary Computation [15-18].pdf
AI.3-Evolutionary Computation [15-18].pdfAI.3-Evolutionary Computation [15-18].pdf
AI.3-Evolutionary Computation [15-18].pdfThninh2
 
Genetic Algorithms - Artificial Intelligence
Genetic Algorithms - Artificial IntelligenceGenetic Algorithms - Artificial Intelligence
Genetic Algorithms - Artificial IntelligenceSahil Kumar
 
Introduction to genetic algorithms
Introduction to genetic algorithmsIntroduction to genetic algorithms
Introduction to genetic algorithmsshadanalam
 
Flowchart of ga
Flowchart of gaFlowchart of ga
Flowchart of gaDEEPIKA T
 
evolutionary algo's.ppt
evolutionary algo's.pptevolutionary algo's.ppt
evolutionary algo's.pptSherazAhmed103
 
introduction of genetic algorithm
introduction of genetic algorithmintroduction of genetic algorithm
introduction of genetic algorithmritambharaaatre
 
Evolutionary Algorithms
Evolutionary AlgorithmsEvolutionary Algorithms
Evolutionary AlgorithmsReem Alattas
 
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...Madhav Mishra
 
Introduction to Genetic Algorithms 2014
Introduction to Genetic Algorithms 2014Introduction to Genetic Algorithms 2014
Introduction to Genetic Algorithms 2014Aleksander Stensby
 

Similaire à Evolutionary (deep) neural network (20)

CSA 3702 machine learning module 4
CSA 3702 machine learning module 4CSA 3702 machine learning module 4
CSA 3702 machine learning module 4
 
WIX3001 Lecture 6 Principles of GA.pptx
WIX3001 Lecture 6 Principles of GA.pptxWIX3001 Lecture 6 Principles of GA.pptx
WIX3001 Lecture 6 Principles of GA.pptx
 
AI.3-Evolutionary Computation [15-18].pdf
AI.3-Evolutionary Computation [15-18].pdfAI.3-Evolutionary Computation [15-18].pdf
AI.3-Evolutionary Computation [15-18].pdf
 
Genetic Algorithms - Artificial Intelligence
Genetic Algorithms - Artificial IntelligenceGenetic Algorithms - Artificial Intelligence
Genetic Algorithms - Artificial Intelligence
 
Introduction to genetic algorithms
Introduction to genetic algorithmsIntroduction to genetic algorithms
Introduction to genetic algorithms
 
Genetic Algorithm
Genetic AlgorithmGenetic Algorithm
Genetic Algorithm
 
Metaheuristics
MetaheuristicsMetaheuristics
Metaheuristics
 
CI_L02_Optimization_ag2_eng.pdf
CI_L02_Optimization_ag2_eng.pdfCI_L02_Optimization_ag2_eng.pdf
CI_L02_Optimization_ag2_eng.pdf
 
Flowchart of ga
Flowchart of gaFlowchart of ga
Flowchart of ga
 
0101.genetic algorithm
0101.genetic algorithm0101.genetic algorithm
0101.genetic algorithm
 
Genetic algorithm
Genetic algorithmGenetic algorithm
Genetic algorithm
 
evolutionary algo's.ppt
evolutionary algo's.pptevolutionary algo's.ppt
evolutionary algo's.ppt
 
introduction of genetic algorithm
introduction of genetic algorithmintroduction of genetic algorithm
introduction of genetic algorithm
 
BGA.pptx
BGA.pptxBGA.pptx
BGA.pptx
 
Evolutionary Algorithms
Evolutionary AlgorithmsEvolutionary Algorithms
Evolutionary Algorithms
 
CI_L11_Optimization_ag2_eng.pptx
CI_L11_Optimization_ag2_eng.pptxCI_L11_Optimization_ag2_eng.pptx
CI_L11_Optimization_ag2_eng.pptx
 
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
 
Machine learning
Machine learningMachine learning
Machine learning
 
Introduction to Genetic Algorithms 2014
Introduction to Genetic Algorithms 2014Introduction to Genetic Algorithms 2014
Introduction to Genetic Algorithms 2014
 
04 1 evolution
04 1 evolution04 1 evolution
04 1 evolution
 

Dernier

Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgUnit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgsaravananr517913
 
Ch10-Global Supply Chain - Cadena de Suministro.pdf
Ch10-Global Supply Chain - Cadena de Suministro.pdfCh10-Global Supply Chain - Cadena de Suministro.pdf
Ch10-Global Supply Chain - Cadena de Suministro.pdfChristianCDAM
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating SystemRashmi Bhat
 
home automation using Arduino by Aditya Prasad
home automation using Arduino by Aditya Prasadhome automation using Arduino by Aditya Prasad
home automation using Arduino by Aditya Prasadaditya806802
 
Engineering Drawing section of solid
Engineering Drawing     section of solidEngineering Drawing     section of solid
Engineering Drawing section of solidnamansinghjarodiya
 
System Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingSystem Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingBootNeck1
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvLewisJB
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
multiple access in wireless communication
multiple access in wireless communicationmultiple access in wireless communication
multiple access in wireless communicationpanditadesh123
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 
11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdfHafizMudaserAhmad
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Crushers to screens in aggregate production
Crushers to screens in aggregate productionCrushers to screens in aggregate production
Crushers to screens in aggregate productionChinnuNinan
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptMadan Karki
 

Dernier (20)

Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgUnit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
 
Ch10-Global Supply Chain - Cadena de Suministro.pdf
Ch10-Global Supply Chain - Cadena de Suministro.pdfCh10-Global Supply Chain - Cadena de Suministro.pdf
Ch10-Global Supply Chain - Cadena de Suministro.pdf
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating System
 
home automation using Arduino by Aditya Prasad
home automation using Arduino by Aditya Prasadhome automation using Arduino by Aditya Prasad
home automation using Arduino by Aditya Prasad
 
Engineering Drawing section of solid
Engineering Drawing     section of solidEngineering Drawing     section of solid
Engineering Drawing section of solid
 
System Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingSystem Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event Scheduling
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
multiple access in wireless communication
multiple access in wireless communicationmultiple access in wireless communication
multiple access in wireless communication
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Crushers to screens in aggregate production
Crushers to screens in aggregate productionCrushers to screens in aggregate production
Crushers to screens in aggregate production
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.ppt
 
Designing pile caps according to ACI 318-19.pptx
Designing pile caps according to ACI 318-19.pptxDesigning pile caps according to ACI 318-19.pptx
Designing pile caps according to ACI 318-19.pptx
 

Evolutionary (deep) neural network

  • 1. Evolutionary Deep Neural Network (or NeuroEvolution) 신수용 2017. 8. 29 @SNU TensorFlow Study
  • 3. 3 Evolutionary DNN • Usually, used to decide DNN structure – Number of layers, number of nodes.. • Can be used to decide weight values – Flappy bird example
  • 5. 5 Biological Basis • Biological systems adapt themselves to a new environment by evolution. • Biological evolution – Production of descendants changed from their parents – Selective survival of some of these descendants to produce more descendants Survival of the Fittest
  • 6. 6 Evolutionary Computation • Stochastic search (or problem solving) techniques that mimic the metaphor of natural biological evolution.
  • 7. 7 7 General Framework 초기해집합 생성 적합도 평가 종료? 부모 개체 선택 자손 생성 적합도 함수 최적해 Yes No 교차 연산 돌연변이 연산 선택 연산
  • 8. 8 Paradigms in EC • Genetic Algorithm (GA) – [J. Holland, 1975] – Bitstrings, mainly crossover, proportionate selection • Genetic Programming (GP) – [J. Koza, 1992] – Trees, mainly crossover, proportionate selection • Evolutionary Programming (EP) – [L. Fogel et al., 1966] – FSMs, mutation only, tournament selection • Evolution Strategy (ES) – [I. Rechenberg, 1973] – Real values, mainly mutation, ranking selection
  • 10. 10 GA(Genetic Algorithms) • 자연계의 유전 현상을 모방하여 적합한 가설을 얻 어내는 방법 • 특성 – 진화는 자연계에서 성공적이고 정교한 적응 방법 – 모델링하기 힘든 복잡한 문제에도 적용 가능 – 병렬화가 가능, H/W 성능의 도움을 받을 수 있음 • 대규모 탐색공간에서 최선의 fitness의 해를 찾는 일반적인 최적화 과정 • 최적의 해를 찾는다고 보장할 수는 없지만 높은 fitness의 해를 얻을 수 있음
  • 11. 11 GA의 기본용어 (1/2) • 염색체 (Chromosome)  개체 (Individual) – 주어진 문제에 대한 가능한 해 또는 가설 – 대부분 string으로 표현됨 – string의 원소는 정수, 실수 등 필요에 의해 결정됨 • 개체군 (population) – 개체(가설)들의 집합 1 1 0 1 0 0 1 1
  • 12. 12 GA의 기본용어 (2/2) • 적합도 (fitness) – 산술적인 단위로 가설의 적합도를 표시한다. – 유전자의 각 개체의 환경에 대한 적합의 비율을 평가 하는 값 – 평가치로 최적화 문제를 대상으로 하는 경우 목적함 수 값이나 제약조건을 고려하여 페널티 함수 값 • 적합도 함수 (fitness function) – 적합도를 구하기 위해서 사용되는 기준방법
  • 13. 13 GA의 연산자 (1/5) • 선택 연산자 (Selection Operator) – 개체를 선택하여 부모들로 선정 – 우수한 자손들이 많이 생성되도록 하기 위해서(해답 을 발견하기 위해서) 좀더 우수한 적합도를 가진 개 체들이 선택될 확률이 비교적 높도록 함. – Proportional (Roulette wheel) selection – Tournament selection – Ranking-based selection
  • 14. 14 GA의 연산자 (2/5) • 교차 연산자 (Crossover Operator) – 생물들이 생식을 하는 것처럼 부모들의 염색체를 서 로 교차시켜서 자손을 만드는 연산자. – Crossover rate라고 불리는 임의의 확률에 의해서 교차연산의 수행여부가 결정된다.
  • 15. 15 GA의 연산자 (3/5) – One-point crossover 1 1 0 1 0 0 1 1 0 1 1 1 0 1 1 0 Crossover point 1 1 0 1 0 1 1 0 0 1 1 1 0 0 1 1
  • 16. 16 GA의 연산자 (4/4) • 돌연변이 연산자 (Mutation Operator) – 한 bit를 mutation rate라는 임의의 확률로 변화 (flip)시키는 연산자 – 아주 작은 확률로 적용된다. (ex) 0.001 1 1 0 1 0 0 1 1 1 1 0 1 1 0 1 1
  • 18. 18 가설 공간 탐색 • 다른 탐색 방법과의 비교 – local minima에 빠질 확률이 적다(급격한 움직임 가 능) • Crowding – 유사한 개체들이 개체군의 다수를 점유하는 현상 – 다양성을 감소시킨다.
  • 19. 19 Crowding • Crowding의 해결법 – 선택방법을 바꾼다. • Tournament selection, ranking selection – “fitness sharing” • 유사한 개체가 많으면 fitness를 감소시킨다. – 결합하는 개체들을 제한 • 가장 비슷한 개체끼리 결합하게 함으로써 cluster or multiple subspecies 형성 • 개체들을 공간적으로 분포시키고 근처의 것끼리만 결합 가 능하게 함
  • 20. 20 Typical behavior of an EA • Phases in optimizing on a 1-dimensional fitness landscape Early phase: quasi-random population distribution Mid-phase: population arranged around/on hills Late phase: population concentrated on high hills
  • 21. 21 Geometric Analogy - Mathematical Landscape
  • 22. 22 Typical run: progression of fitness Typical run of an EA shows so-called “anytime behavior” Bestfitnessinpopulation Time (number of generations)
  • 23. 23 Bestfitnessinpopulation Time (number of generations) Progress in 1st half Progress in 2nd half Are long runs beneficial? • Answer: - it depends how much you want the last bit of progress - it may be better to do more shorter runs
  • 24. 24 Scale of “all” problems Performanceofmethodsonproblems Random search Special, problem tailored method Evolutionary algorithm ECs as problem solvers: Goldberg’s 1989 view
  • 25. 25 Advantages of EC • No presumptions w.r.t. problem space • Widely applicable • Low development & application costs • Easy to incorporate other methods • Solutions are interpretable (unlike NN) • Can be run interactively, accommodate user proposed solutions • Provide many alternative solutions
  • 26. 26 Disadvantages of EC • No guarantee for optimal solution within finite time • Weak theoretical basis • May need parameter tuning • Often computationally expensive, i.e. slow
  • 28. 28 Genetic Programming • Genetic programming uses variable-size tree-representations rather than fixed- length strings of binary values. • Program tree = S-expression = LISP parse tree • Tree = Functions (Nonterminals) + Terminals
  • 29. 29 GP Tree: An Example • Function set: internal nodes – Functions, predicates, or actions which take one or more arguments • Terminal set: leaf nodes – Program constants, actions, or functions which take no arguments S-expression: (+ 3 (/ ( 5 4) 7)) Terminals = {3, 4, 5, 7} Functions = {+, , /}
  • 30. 30 Tree based representation • Trees are a universal form, e.g. consider • Arithmetic formula • Logical formula • Program         15 )3(2 y x (x  true)  (( x  y )  (z  (x  y))) i =1; while (i < 20) { i = i +1 }
  • 31. 31 Tree based representation • In GA, ES, EP chromosomes are linear structures (bit strings, integer string, real- valued vectors, permutations) • Tree shaped chromosomes are non-linear structures. • In GA, ES, EP the size of the chromosomes is fixed. • Trees in GP may vary in depth and width.
  • 32. 32 Crossover: Subtree Exchange + b   a b + b +    a a b +  a b    a b + b +  a b 
  • 35. 35 ES quick overview • Developed: Germany in the 1970’s • Early names: I. Rechenberg, H.-P. Schwefel • Typically applied to: – numerical optimisation • Attributed features: – fast – good optimizer for real-valued optimisation – relatively much theory • Special: – self-adaptation of (mutation) parameters standard
  • 36. 36 ES technical summary Representation Real-valued vectors Recombination Discrete or intermediary Mutation Gaussian perturbation Parent selection Uniform random Survivor selection (,) or (+) Specialty Self-adaptation of mutation step sizes
  • 37. 37 Introductory example • Task: minimimise f : Rn  R • Algorithm: “two-membered ES” using – Vectors from R n directly as chromosomes – Population size 1 – Only mutation creating one child – Greedy selection
  • 38. 38 Parent selection • Parents are selected by uniform random distribution whenever an operator needs one/some • Thus: ES parent selection is unbiased - every individual has the same probability to be selected • Note that in ES “parent” means a population member (in GA’s: a population member selected to undergo variation)
  • 39. 39 Survivor selection • Applied after creating  children from the  parents by mutation and recombination • Deterministically chops off the “bad stuff” • Basis of selection is either: – The set of children only: (,)-selection – The set of parents and children: (+)- selection
  • 40. 40 Survivor selection cont’d • (+)-selection is an elitist strategy • (,)-selection can “forget” • Often (,)-selection is preferred for: – Better in leaving local optima – Better in following moving optima – Using the + strategy bad  values can survive in x, too long if their host x is very fit • Selective pressure in ES is very high (  7 •  is the common setting)
• 42. 42 EP quick overview • Developed: USA in the 1960’s • Early names: D. Fogel • Typically applied to: – traditional EP: machine learning tasks by finite state machines – contemporary EP: (numerical) optimization • Attributed features: – very open framework: any representation and mutation operators are OK – crossbred with ES (contemporary EP) – consequently: hard to say what “standard” EP is • Special: – no recombination – self-adaptation of parameters is standard (contemporary EP)
• 43. 43 EP technical summary tableau • Representation: real-valued vectors • Recombination: none • Mutation: Gaussian perturbation • Parent selection: deterministic • Survivor selection: probabilistic (μ+μ) • Specialty: self-adaptation of mutation step sizes (in meta-EP)
• 45. 45 ENN • The back-propagation learning algorithm cannot guarantee an optimal solution. • In real-world applications, the back-propagation algorithm might converge to a set of sub-optimal weights from which it cannot escape. • As a result, the neural network is often unable to find a desirable solution to the problem at hand.
  • 46. 46 ENN • Another difficulty is related to selecting an optimal topology for the neural network. – The “right” network architecture for a particular problem is often chosen by means of heuristics, and designing a neural network topology is still more art than engineering. • Genetic algorithms are an effective optimization technique that can guide both weight optimization and topology selection.
• 47. 47 Encoding a set of weights in a chromosome – [figure: a small feed-forward network with inputs x1, x2, x3 and output y; its from-neuron/to-neuron weight matrix is read row by row to form the chromosome] – Chromosome: 0.9 -0.3 -0.7 -0.8 0.6 0.3 0.1 -0.2 0.2 0.4 0.5 0.8 -0.6 0.1 -0.2 0.9
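The row-by-row encoding of the slide can be sketched as a pair of flatten/rebuild helpers; the helper names and the layer shapes in the example are illustrative, not from the figure:

```python
def encode(weights):
    """Flatten a list of weight matrices into one real-valued chromosome."""
    return [w for matrix in weights for row in matrix for w in row]

def decode(chromosome, shapes):
    """Rebuild weight matrices of the given (rows, cols) shapes from the genes."""
    weights, i = [], 0
    for rows, cols in shapes:
        matrix = [chromosome[i + r * cols : i + (r + 1) * cols]
                  for r in range(rows)]
        weights.append(matrix)
        i += rows * cols
    return weights

# A toy 2-2-1 net: one 2x2 hidden matrix, one 2x1 output matrix.
net = [[[0.9, -0.3], [0.1, 0.4]], [[0.5], [-0.8]]]
chrom = encode(net)            # [0.9, -0.3, 0.1, 0.4, 0.5, -0.8]
```

With this representation, standard GA operators on real-valued strings (crossover, Gaussian mutation) act directly on the network's weights.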
• 48. 48 Fitness function • The second step is to define a fitness function for evaluating the chromosome’s performance. – This function must estimate the performance of the neural network that the chromosome encodes. – A simple choice is based on the sum of squared errors over a training set.
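A minimal sketch of such a fitness function for a chromosome encoding a single linear neuron (the gene layout `[w1..wn, b]` and the toy data are assumptions for illustration); the sum of squared errors is negated so that higher fitness means a better network:

```python
def predict(chromosome, x):
    """One linear neuron: y = w . x + b, with genes laid out [w1..wn, b]."""
    *w, b = chromosome
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def fitness(chromosome, data):
    """Negated sum of squared errors over the training set."""
    sse = sum((predict(chromosome, x) - t) ** 2 for x, t in data)
    return -sse

# Toy training set following t = 2x + 1
data = [([0.0], 1.0), ([1.0], 3.0), ([2.0], 5.0)]
```

A real ENN would run the full (possibly non-linear) network forward pass here; only the scoring idea is shown.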
• 49. 49 Crossover – [figure: two parent networks with identical topology over inputs x1, x2 and output y; the child inherits each weight gene from one parent or the other]
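Because both parents share the same topology, their chromosomes have equal length and crossover reduces to recombining two real-valued strings. A one-point variant, sketched here as one plausible choice (the slide's figure suggests exchanging weights per neuron, which works the same way on gene blocks):

```python
import random

def one_point_crossover(p1, p2, rng):
    """Swap the gene tails of two equal-length weight chromosomes."""
    cut = rng.randrange(1, len(p1))     # cut strictly inside the string
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

rng = random.Random(0)
parent1 = [0.9, -0.3, -0.7, 0.5, -0.8]
parent2 = [-0.1, -0.5, 0.2, -0.9, 0.6]
child1, child2 = one_point_crossover(parent1, parent2, rng)
```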
• 50. 50 Mutation – [figure: original vs. mutated network; a few randomly selected weight genes are perturbed, e.g. 0.4 changes to -0.1 and -0.3 changes to 0.2, while the rest stay unchanged]
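The mutation on the slide perturbs a few randomly chosen weights. A sketch with Gaussian noise (the mutation rate and noise scale are illustrative defaults):

```python
import random

def mutate(chromosome, rate=0.1, scale=0.2, rng=None):
    """Add Gaussian noise to each weight gene with probability `rate`."""
    rng = rng or random.Random()
    return [w + rng.gauss(0.0, scale) if rng.random() < rate else w
            for w in chromosome]

chrom = [0.4, -0.3, 0.9, -0.7]
mutated = mutate(chrom, rate=1.0, rng=random.Random(0))
```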
  • 51. 51 Architecture Selection • The architecture of the network (i.e. the number of neurons and their interconnections) often determines the success or failure of the application. • Usually the network architecture is decided by trial and error; there is a great need for a method of automatically designing the architecture for a particular application. – Genetic algorithms may well be suited for this task.
• 52. 52 Encoding – [figure: a 6-neuron network and its binary from-neuron/to-neuron connectivity matrix; entry (i, j) = 1 means neuron i feeds neuron j, and the matrix is read row by row] – Chromosome: 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 0
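The architecture encoding is just the binary connectivity matrix flattened row by row. A sketch (helper names and the 4-neuron example are illustrative, not the slide's 6-neuron figure):

```python
def matrix_to_chromosome(m):
    """Read the binary from/to connectivity matrix row by row."""
    return [bit for row in m for bit in row]

def chromosome_to_matrix(chromosome, n):
    """Rebuild the n x n connectivity matrix from the bit string."""
    return [chromosome[i * n:(i + 1) * n] for i in range(n)]

def connections(m):
    """List the (from, to) links encoded by the matrix."""
    return [(i, j) for i, row in enumerate(m)
                   for j, bit in enumerate(row) if bit]

# Toy 4-neuron feed-forward architecture (upper-triangular: no cycles).
arch = [[0, 1, 1, 0],
        [0, 0, 1, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 0]]
chrom = matrix_to_chromosome(arch)
```

Standard bit-string crossover and bit-flip mutation then add or remove connections, letting the GA search over topologies.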
• 53. 53 Process – [figure: in generation i each candidate network is trained on the training data set and assigned a fitness (e.g. “Fitness = 117”); parent selection, crossover, and mutation then produce the children of generation i + 1]
  • 55. 55 Good reference blog • https://medium.com/@stathis/design- by-evolution-393e41863f98
• 56. 56 Evolving Deep Neural Networks • https://arxiv.org/pdf/1703.00548.pdf • CoDeepNEAT – for optimizing deep learning architectures through evolution – Evolving DNNs for CIFAR-10 – Evolving LSTM architectures – Experimental comparisons are not entirely clear..
  • 57. 57 Large-Scale Evolution of Image Classifiers • https://arxiv.org/abs/1703.01041 • Individual – a trained architecture • Fitness – Individual’s accuracy on a validation set • Selection (tournament selection) – Randomly choose two individuals – Select better one (parent)
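The binary tournament selection described above fits in a few lines; a sketch (the function name is illustrative, and `fitness` stands in for the paper's validation-set accuracy of a trained architecture):

```python
import random

def tournament_select(population, fitness, rng):
    """Binary tournament: sample two individuals, keep the fitter one."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b

rng = random.Random(0)
pop = [1, 2, 3, 4, 5]                       # toy individuals
picks = [tournament_select(pop, lambda x: x, rng) for _ in range(100)]
```

Tournament selection needs no global fitness ranking, which is what makes it practical for the paper's massively parallel, asynchronous workers.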
  • 58. 58 Large-Scale Evolution of Image Classifiers • Mutation – Pick a mutation from a predetermined set • Train child • Repeat.
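The steps above (tournament of two, mutate the winner, train the child, repeat) can be sketched as a single loop. This is a heavily simplified sketch: `fitness` here scores an individual directly, whereas in the paper each child is a trained network evaluated on a validation set, and individuals are architectures rather than numbers:

```python
import random

def evolve(init_pop, fitness, mutate, steps, seed=0):
    """Repeated binary tournaments: the winner's mutated copy
    replaces the loser; the winner itself stays in the population."""
    rng = random.Random(seed)
    pop = list(init_pop)
    for _ in range(steps):
        i, j = rng.sample(range(len(pop)), 2)
        if fitness(pop[i]) < fitness(pop[j]):
            i, j = j, i                    # i is now the tournament winner
        pop[j] = mutate(pop[i], rng)       # child replaces the loser
    return max(pop, key=fitness)

# Toy run: individuals are numbers, optimum at x = 3.
fit = lambda x: -(x - 3.0) ** 2
pop0 = [0.0, 1.0, -2.0]
best = evolve(pop0, fit, lambda x, rng: x + rng.gauss(0.0, 0.5), steps=200)
```

Because the winner is never removed, the best fitness in the population never decreases, which is why the scheme works without any explicit elitism.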
  • 59. 59 Large-Scale Evolution of Image Classifiers
  • 60. 60 Convolution by Evolution • https://arxiv.org/pdf/1606.02580.pdf • GECCO16 paper • Differential version of the Compositional Pattern Producing Network (DPPN) – Topology is evolved but the weights are learned – Compressed the weights of a denoising autoencoder from 157684 to roughly 200 parameters with comparable image reconstruction accuracy
  • 61. 61
  • 62. 62