Graphs, Environments, and Machine Learning for Materials Science
1. Graphs, Environments, and Machine Learning for Materials Science
Shyue Ping Ong, Chi Chen, Xiangguo Li, Yunxing Zuo, Zhi Deng, Weike Ye, Zhenbin Wang
Aug 1 2019
NIST Workshop
2. High-throughput computation is not enough: A statistical history of the Materials Project
[Figure annotations: Reasonable ML; Deep learning]
(AA’)0.5(BB’)0.5O3 perovskite, 2 × 2 × 2 supercell, 10 A and 10 B species
= (10C2 × 8C4)^2 ≈ 10^7
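The count on this slide can be checked directly. A minimal sketch, assuming the 2 × 2 × 2 supercell offers 8 A sites and 8 B sites, that two species are picked from ten for each sublattice, and that each pair splits its eight sites 4/4 (symmetry-equivalent arrangements are not removed):

```python
from math import comb

species_pairs = comb(10, 2)   # choose 2 of 10 candidate species: 45
site_orderings = comb(8, 4)   # put one species on 4 of the 8 sites: 70
per_sublattice = species_pairs * site_orderings  # 3150
total = per_sublattice ** 2   # A and B sublattices are chosen independently
print(f"{total:,}")           # 9,922,500, i.e. about 10^7
```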
[Excerpt from Jiang et al.] … ratio of (634 + 34)/485 ≈ 1.38 (Supplementary Table S-II) with <5% difference in the experimental and theoretical values. This again agrees well with those calculated from the rule of mixture (Supplementary Table-III). The experimental XRD patterns also agree well with …
Fig. 2. Atomic-resolution STEM ABF and HAADF images of a representative high-entropy perovskite oxide, Sr(Zr0.2Sn0.2Ti0.2Hf0.2Mn0.2)O3. (a, c) ABF and (b, d) HAADF images at (a, b) low and (c, d) high magnifications showing nanoscale compositional homogeneity and atomic structure. The [001] zone axis and two perpendicular atomic planes (110) and (1̄10) are marked. Insets are averaged STEM images.
Jiang et al. A New Class of High-Entropy Perovskite Oxides. Scripta Materialia 2018, 142, 116–120.
Materials design is combinatorial
3. Combinatorial generalization, i.e., making infinite use of finite means, is how humans learn…
Graph Networks as a Universal Machine Learning Framework for
Molecules and Crystals
Chi Chen, Weike Ye, Yunxing Zuo, Chen Zheng, and Shyue Ping Ong*
Department of NanoEngineering, University of California San Diego, 9500 Gilman Dr, Mail Code 0448, La Jolla, California
92093-0448, United States
ABSTRACT: Graph networks are a new machine learning (ML)
paradigm that supports both relational reasoning and combinatorial
generalization. Here, we develop universal MatErials Graph Network
(MEGNet) models for accurate property prediction in both
molecules and crystals. We demonstrate that the MEGNet models
outperform prior ML models such as the SchNet in 11 out of 13
properties of the QM9 molecule data set. Similarly, we show that
MEGNet models trained on ∼60 000 crystals in the Materials Project
substantially outperform prior ML models in the prediction of the
formation energies, band gaps, and elastic moduli of crystals,
achieving better than density functional theory accuracy over a
much larger data set. We present two new strategies to address data
limitations common in materials science and chemistry. First, we
demonstrate a physically intuitive approach to unify four separate
molecular MEGNet models for the internal energy at 0 K and room temperature, enthalpy, and Gibbs free energy into a single
free energy MEGNet model by incorporating the temperature, pressure, and entropy as global state inputs. Second, we show
that the learned element embeddings in MEGNet models encode periodic chemical trends and can be transfer-learned from a
property model trained on a larger data set (formation energies) to improve property models with smaller amounts of data
(band gaps and elastic moduli).
■ INTRODUCTION
Machine learning (ML)1,2 has emerged as a powerful new tool in materials science,3−14 driven in part by the advent of large materials data sets from high-throughput electronic structure calculations15−18 and/or combinatorial experiments.19,20 Among its many applications, the development of fast, surrogate ML models for property prediction has arguably received the most interest for its potential in accelerating materials design21,22 as well as accessing larger length/time scales at near-quantum accuracy.11,23−28
The key input to any ML model is a description of the material, which must satisfy the necessary rotational, translational […]
[…] neural network model. Gilmer et al.37 later proposed the message passing neural network (MPNN) framework that includes the existing graph models with differences only in their update functions.
Unlike molecules, descriptions of crystals must account for lattice periodicity and additional space group symmetries. In the crystal graph convolutional neural networks (CGCNNs) proposed by Xie and Grossman,9 each crystal is represented by a crystal graph, and invariance with respect to permutation of atomic indices and unit cell choice are achieved through convolution and pooling layers. They demonstrated excellent prediction performance on a broad array of properties, […]
Chen, C.; Ye, W.; Zuo, Y.; Zheng, C.; Ong, S. P. Graph Networks as a Universal
Machine Learning Framework for Molecules and Crystals. Chem. Mater. 2019, 31 (9),
3564–3572. https://doi.org/10.1021/acs.chemmater.9b01294.
Shall I compare thee
to a summer’s day?
Thou art more lovely
and more temperate.
- William Shakespeare
4. But learning also takes place in the context of a hierarchy of structured knowledge…
Thermodynamics
• Extensive (additive) versus intensive properties
“Locality”
• Interactions between nearby atoms are stronger than between atoms far away, ~ 1/r^n
Symmetry
• Translation
• Rotation
• Reflection
• Permutation of identical atoms
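6. Graphs as a natural representation for materials, i.e., molecules and crystals
[Figure: a material as a graph. Atoms vsk (sender) and vrk (receiver) are nodes, each described by a vector looked up from its atomic number (Zs, Zr); the bond ek1 between them is an edge with feature $e^{-(r - r_0)^2/\sigma^2}$; the global state u holds T, p, S, …]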
Chen, C.; Ye, W.; Zuo, Y.; Zheng, C.; Ong, S. P. Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals. Chem.
Mater. 2019, 31 (9), 3564–3572. doi: 10.1021/acs.chemmater.9b01294.
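The bond feature on this slide is a Gaussian of the interatomic distance, exp(-(r - r0)^2/σ^2). A minimal sketch of expanding one bond length on a grid of centers r0 follows; the grid range, the number of centers, and the width are illustrative assumptions, not values from the talk.

```python
import numpy as np

def gaussian_expand(r, centers, sigma):
    """Expand distance(s) r on Gaussians exp(-(r - r0)^2 / sigma^2), one per center r0."""
    r = np.atleast_1d(np.asarray(r, dtype=float))
    return np.exp(-((r[:, None] - centers[None, :]) ** 2) / sigma**2)

centers = np.linspace(0.0, 5.0, 11)   # hypothetical grid of r0 values (angstrom)
features = gaussian_expand(1.5, centers, sigma=0.5)
print(features.shape)                 # (1, 11)
```

The largest entry sits at the center closest to the bond length, so the vector acts as a smooth, fixed-length encoding of a single distance.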
7. Information flow between elements in a graph network
[Figure: graph with global state u = (T, p, S, …); bond ek1 connects atoms vsk and vrk]
Bond update: $\mathbf{e}'_{k1} = \phi_e(\mathbf{v}_{rk} \oplus \mathbf{v}_{sk} \oplus \mathbf{e}_{k1} \oplus \mathbf{u})$
8. Information flow between elements in a graph network
[Figure: graph with global state u = (T, p, S, …); bonds ek1, ek2, ek3 attached to atom vrk]
Bond update: $\mathbf{e}'_{k1} = \phi_e(\mathbf{v}_{rk} \oplus \mathbf{v}_{sk} \oplus \mathbf{e}_{k1} \oplus \mathbf{u})$
Atom update: $\mathbf{v}'_{rk} = \phi_v(\mathbf{v}_{rk} \oplus \mathbf{e}_{kr} \oplus \mathbf{u})$
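9. Information flow between elements in a graph network
[Figure: graph with global state u = (T, p, S, …); bonds ek1, ek2, ek3 attached to atom vrk]
Bond update: $\mathbf{e}'_{k1} = \phi_e(\mathbf{v}_{rk} \oplus \mathbf{v}_{sk} \oplus \mathbf{e}_{k1} \oplus \mathbf{u})$
Atom update: $\mathbf{v}'_{rk} = \phi_v(\mathbf{v}_{rk} \oplus \mathbf{e}_{kr} \oplus \mathbf{u})$
State update: $\mathbf{u}' = \phi_u(\mathbf{V} \oplus \mathbf{E} \oplus \mathbf{u})$
The functions 𝜙 are approximated using neural networks (universal approximation theorem).
Cybenko. Math. Control Signals Systems 1989, 2 (4), 303–314.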
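The three updates can be sketched as a single pass in NumPy. This is a rough illustration and not the MEGNet implementation: the update functions 𝜙 are tiny random perceptrons, ⊕ is taken to be concatenation, pooling of bonds and atoms uses a plain mean (an assumption), each stage consumes the output of the previous one (also an assumption), and all sizes are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(n_in, n_out, n_hidden=16):
    """Tiny random two-layer perceptron standing in for an update function phi."""
    w1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
    w2 = rng.normal(scale=0.1, size=(n_hidden, n_out))
    return lambda x: np.tanh(x @ w1) @ w2

def graph_network_step(V, E, senders, receivers, u, phi_e, phi_v, phi_u):
    """One bond -> atom -> state update. V: (atoms, dv), E: (bonds, de), u: (du,)."""
    n_atoms, n_bonds = len(V), len(E)
    # Bond update: e' = phi_e(v_r (+) v_s (+) e (+) u)
    u_bonds = np.tile(u, (n_bonds, 1))
    E_new = phi_e(np.concatenate([V[receivers], V[senders], E, u_bonds], axis=1))
    # Atom update: average the updated bonds arriving at each atom (assumed pooling)
    pooled = np.zeros((n_atoms, E_new.shape[1]))
    counts = np.zeros(n_atoms)
    np.add.at(pooled, receivers, E_new)
    np.add.at(counts, receivers, 1.0)
    pooled /= np.maximum(counts, 1.0)[:, None]
    u_atoms = np.tile(u, (n_atoms, 1))
    V_new = phi_v(np.concatenate([V, pooled, u_atoms], axis=1))
    # State update: u' = phi_u(mean over atoms (+) mean over bonds (+) u)
    u_new = phi_u(np.concatenate([V_new.mean(axis=0), E_new.mean(axis=0), u]))
    return V_new, E_new, u_new

# Toy graph: 3 atoms, 4 directed bonds, made-up feature sizes.
dv, de, du = 4, 3, 2
V = rng.normal(size=(3, dv))
E = rng.normal(size=(4, de))
u = np.zeros(du)
senders = np.array([0, 1, 1, 2])
receivers = np.array([1, 0, 2, 1])
phi_e = make_mlp(2 * dv + de + du, de)
phi_v = make_mlp(dv + de + du, dv)
phi_u = make_mlp(dv + de + du, du)
V_new, E_new, u_new = graph_network_step(V, E, senders, receivers, u, phi_e, phi_v, phi_u)
print(V_new.shape, E_new.shape, u_new.shape)  # (3, 4) (4, 3) (2,)
```

Because the output shapes match the input shapes, the step can be applied repeatedly; each additional pass lets information travel one bond further.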
10. MatErials Graph Networks (MEGNet)
Z mapping to vector (remember this!)
Implementation is open source at https://github.com/materialsvirtuallab/megnet.
Modular blocks can be stacked to generate models of arbitrary complexity and “locality” of interactions.
11. Performance on 130,462 QM9 molecules
80%-10%-10% train-validation-test split
Only Z as atomic feature, i.e., feature selection helps the model learn, but is not critical!

Property         MEGNet1   MEGNet-Simple1   SchNet2   “Chemical Accuracy”
U0 (meV)         9         12               14        43
G (meV)          10        12               14        43
εHOMO (eV)       0.038     0.043            0.041     0.043
εLUMO (eV)       0.031     0.044            0.034     0.043
Cv (cal/molK)    0.030     0.029            0.033     0.05

1 Chen et al. Chem. Mater. 2019, 31 (9), 3564–3572. doi: 10.1021/acs.chemmater.9b01294.
2 Schütt et al. J. Chem. Phys. 148, 241722 (2018)
State-of-the-art performance surpassing chemical accuracy in 11 of 13 properties!
12. MEGNet Unified Free Energy Model
Training data: U, H, G at 298 K; U at 0 K (U0)
ϕ = f(E, T, P, S)
Failure in H and G is due to the lack of more varied pressure and entropy values in the training data.
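One way to read the unified model is that the same molecule appears several times in the training set, each time paired with a different global state (T, P, S) and a different target. The sketch below uses a hypothetical encoding (T in kelvin, P and S as on/off flags) and made-up target values. Both are assumptions for illustration only and are not stated on the slide.

```python
# Hypothetical global-state encoding u = (T, P, S) for each target (assumption).
STATE_INPUTS = {
    "U0": (0.0, 0, 0),    # internal energy at 0 K
    "U": (298.0, 0, 0),   # internal energy at 298 K
    "H": (298.0, 1, 0),   # enthalpy: pressure flag on
    "G": (298.0, 1, 1),   # Gibbs free energy: pressure and entropy flags on
}

def training_rows(molecule_id, targets):
    """Turn one molecule with several targets into (molecule, state, value) rows."""
    return [(molecule_id, STATE_INPUTS[name], value) for name, value in targets.items()]

# Placeholder target values, not real data.
rows = training_rows("mol-0", {"U0": -1.0, "U": -0.9, "H": -0.8, "G": -1.1})
for row in rows:
    print(row)
```

Under an encoding like this, P and S each take only two distinct values across the whole data set, which is one way to picture the limitation noted for H and G.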
13. Performance on Materials Project Crystals
Values in parentheses are the number of data points.

Property                         MEGNet            SchNet1   CGCNN2
Formation energy Ef (meV/atom)   28 (60,000)       35        39 (28,046)
Band gap Eg (eV)                 0.330 (36,720)    -         0.388 (16,485)
log10 KVRH (GPa)                 0.050 (4,664)     -         0.054 (2,041)
log10 GVRH (GPa)                 0.079 (4,664)     -         0.087 (2,041)
Metal classifier                 78.9% (55,391)    -         80% (28,046)
Non-metal classifier             90.6% (55,391)    -         95% (28,046)

1 Schütt et al. J. Chem. Phys. 148, 241722 (2018)
2 Xie et al. PRL. 120.14 (2018): 145301.
Annotations: “Noisy”; Dataset too small
14. Transfer learning for improved convergence and speed
Ef model (60,000 data points) → Eg model (36,000 data points)
✓ MAE decreases from 0.38 eV to 0.32 eV
✓ Convergence speed ×2
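The idea of reusing what the larger model learned can be sketched with a toy model: element vectors from a "large-data" model are copied into a new model and kept frozen, so only the remaining weights are fitted on the smaller data set. Everything here is a hypothetical stand-in (the `PropertyModel` class, the 95 × 16 embedding table, the linear readout), not the MEGNet code.

```python
import numpy as np

rng = np.random.default_rng(0)
n_elements, dim = 95, 16

# Stand-in for element embeddings learned by a formation-energy model.
pretrained_embedding = rng.normal(size=(n_elements, dim))

class PropertyModel:
    """Toy model: mean of element embeddings followed by a linear readout."""

    def __init__(self, embedding, trainable_embedding):
        self.embedding = embedding.copy()
        self.trainable_embedding = trainable_embedding
        self.w = np.zeros(embedding.shape[1])  # readout weights, always trained

    def predict(self, Z):
        return self.embedding[Z].mean(axis=0) @ self.w

    def sgd_step(self, Z, y, lr=0.01):
        x = self.embedding[Z].mean(axis=0)
        err = x @ self.w - y
        if self.trainable_embedding:
            # Gradient of the squared error with respect to each embedding row used
            np.subtract.at(self.embedding, Z, lr * err * self.w / len(Z))
        self.w -= lr * err * x

# Transfer: start the band-gap model from the pretrained vectors and freeze them.
band_gap_model = PropertyModel(pretrained_embedding, trainable_embedding=False)
Z = np.array([8, 22, 38])            # hypothetical element indices of one compound
band_gap_model.sgd_step(Z, y=3.2)    # placeholder target value
```

With the embeddings fixed, far fewer parameters have to be learned from the smaller data set, which is the intuition behind faster convergence.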
15. CGCNN vs MEGNet
Message passing
  CGCNN: Node to node only
  MEGNet: Node, edge, and global
Input atomic features
  CGCNN: Group number, period number, electronegativity, covalent radius, valence electrons, first ionization energy, electron affinity, block, atomic volume
  MEGNet: Atomic number
Global state
  CGCNN: No
  MEGNet: Yes
Transferable components
  CGCNN: Possible, but not demonstrated
  MEGNet: Yes
Composability
  CGCNN: New models require network optimization
  MEGNet: New models can be formed by stacking modular blocks
16. Extracting chemistry from machine-learned models
[Figures: Pearson correlation between elements, sorted by Mendeleev number; t-SNE projection]
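A Pearson correlation matrix between element vectors takes one call in NumPy. The sketch below uses random placeholder vectors for a handful of elements; with embeddings from a trained model, chemically similar elements would be expected to show higher correlations.

```python
import numpy as np

rng = np.random.default_rng(0)
elements = ["Li", "Na", "K", "F", "Cl"]             # illustrative subset
embeddings = rng.normal(size=(len(elements), 16))   # placeholder for learned vectors

# np.corrcoef treats each row as one variable, so corr[i, j] is the
# Pearson correlation between the vectors of elements i and j.
corr = np.corrcoef(embeddings)
print(corr.shape)  # (5, 5)
```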
18. Practical considerations
• Training of deep learning models is fairly expensive; dedicated GPU resources are recommended.
• But prediction is cheap: https://megnet.crystals.ai runs on a single dyno on Heroku!
Pythia @ MaterialsVirtualLab
19. Representations for a collection of atoms
Graphs
Local environments
20. The scale problem in computational materials science
Many real-world materials problems are not related to bulk crystals.
[Figure panels: electrode-electrolyte interfaces; catalysis; microstructure and segregation]
Huang et al. ACS Energy Lett. 2018, 3 (12), 2983–2988.
Tang et al. Chem. Mater. 2018, 30 (1), 163–173.
Need linear scaling with ab initio accuracy.
21. Machine learning the potential energy surface
Local environment descriptors ML approach
A separate neural network is used for each atom. The neural network is defined by the number of hidden layers and the nodes in each layer, while the descriptor space is given by the following symmetry functions:
$$G_i^{\mathrm{atom,rad}} = \sum_{j \neq i}^{N_{\mathrm{atom}}} e^{-\eta (R_{ij} - R_s)^2} \cdot f_c(R_{ij}),$$
$$G_i^{\mathrm{atom,ang}} = 2^{1-\zeta} \sum_{j,k \neq i}^{N_{\mathrm{atom}}} (1 + \cos\theta_{ijk})^{\zeta} \cdot e^{-\eta' (R_{ij}^2 + R_{ik}^2 + R_{jk}^2)} \cdot f_c(R_{ij}) \cdot f_c(R_{ik}) \cdot f_c(R_{jk}),$$
where $R_{ij}$ is the distance between atom $i$ and neighbor atom $j$, $\eta$ is the width of the Gaussian and $R_s$ is the position shift over all neighboring atoms within the cutoff radius $R_c$, $\eta'$ is the width of the Gaussian basis and $\zeta$ controls the angular resolution. $f_c(R_{ij})$ is a cutoff function, defined as follows:
$$f_c(R_{ij}) = \begin{cases} 0.5 \cdot [\cos(\pi R_{ij}/R_c) + 1], & \text{for } R_{ij} \le R_c \\ 0.0, & \text{for } R_{ij} > R_c. \end{cases}$$
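The symmetry functions above translate almost line by line into code. The sketch below is a plain, unoptimized reading of the formulas for a small non-periodic cluster; it assumes that the double sum skips j = k, and the positions and hyperparameter values are arbitrary.

```python
import numpy as np

def cutoff(r, r_c):
    """Cosine cutoff: 0.5 * [cos(pi r / r_c) + 1] for r <= r_c, else 0."""
    return np.where(r <= r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def radial_sf(positions, i, eta, r_s, r_c):
    """Radial symmetry function of atom i."""
    r_ij = np.delete(np.linalg.norm(positions - positions[i], axis=1), i)
    return float(np.sum(np.exp(-eta * (r_ij - r_s) ** 2) * cutoff(r_ij, r_c)))

def angular_sf(positions, i, eta_p, zeta, r_c):
    """Angular symmetry function of atom i (pairs with j == k skipped, an assumption)."""
    n, total = len(positions), 0.0
    for j in range(n):
        for k in range(n):
            if j == i or k == i or j == k:
                continue
            v_ij = positions[j] - positions[i]
            v_ik = positions[k] - positions[i]
            R_ij, R_ik = np.linalg.norm(v_ij), np.linalg.norm(v_ik)
            R_jk = np.linalg.norm(positions[k] - positions[j])
            cos_theta = v_ij @ v_ik / (R_ij * R_ik)
            total += ((1.0 + cos_theta) ** zeta
                      * np.exp(-eta_p * (R_ij**2 + R_ik**2 + R_jk**2))
                      * cutoff(R_ij, r_c) * cutoff(R_ik, r_c) * cutoff(R_jk, r_c))
    return float(2.0 ** (1 - zeta) * total)

# Arbitrary four-atom cluster (angstrom) and arbitrary hyperparameters.
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.2, 0.0], [0.5, 0.5, 1.0]])
g_rad = radial_sf(pos, 0, eta=1.0, r_s=0.5, r_c=4.0)
g_ang = angular_sf(pos, 0, eta_p=0.1, zeta=2, r_c=4.0)
```

Both functions depend only on distances and angles, so rotating or translating the whole cluster leaves the descriptor values unchanged.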
These hyperparameters were optimized to minimize the mean absolute errors of energies and forces for each chemistry. The NNP model has shown great performance for Si,11 TiO2,40 water41 and solid-liquid interfaces,42 metal-organic frameworks,43 and has been extended to incorporate long-range electrostatics for ionic systems such as ZnO44 and Li3PO4.45
Local environment descriptors: atom-centered symmetry functions (ACSF); moment tensors; smooth overlap of atomic positions (SOAP); SO4 bispectrum
ML approach: polynomial / linear regression; kernel regression; neural networks
2. Gaussian Approximation Potential (GAP). The GAP calculates the similarity between atomic configurations based on a smooth-overlap of atomic positions (SOAP)10,46 kernel, which is then used in a Gaussian process model. In SOAP, the Gaussian-smeared atomic neighbor densities $\rho_i(\mathbf{R})$ are expanded in spherical harmonics as follows:
$$\rho_i(\mathbf{R}) = \sum_j f_c(R_{ij}) \cdot \exp\left(-\frac{|\mathbf{R} - \mathbf{R}_{ij}|^2}{2\sigma_{\mathrm{atom}}^2}\right) = \sum_{nlm} c_{nlm}\, g_n(R)\, Y_{lm}(\hat{\mathbf{R}}),$$
The spherical power spectrum vector, which is in turn the square of expansion coefficients,
$$p_{n_1 n_2 l}(\mathbf{R}_i) = \sum_{m=-l}^{l} c^{*}_{n_1 l m}\, c_{n_2 l m},$$
can be used to construct the SOAP kernel while raised to a positive integer power $\zeta$ (which is 4 in present case) to accentuate the sensitivity of the kernel,10
$$K(\mathbf{R}, \mathbf{R}') = \sum_{n_1 n_2 l} \left(p_{n_1 n_2 l}(\mathbf{R})\, p_{n_1 n_2 l}(\mathbf{R}')\right)^{\zeta},$$
In the above equations, $\sigma_{\mathrm{atom}}$ is a smoothness controlling the Gaussian smearing, and $n_{\max}$ and $l_{\max}$ determine the maximum powers for radial components and angular components in spherical harmonics expansion, respectively.10 These hyperparameters, as well as the number of reference atomic configurations used in Gaussian process, are […]
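Given expansion coefficients, the power spectrum and kernel above are short array operations. The sketch below assumes real-valued coefficients (for example from a real spherical-harmonics basis), stored as `c[n, l, m]` and zero-padded where |m| > l. The coefficients are random placeholders; computing them from atomic positions is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
n_max, l_max = 3, 2

# Mask that zeroes the entries with |m| > l; m runs from -l_max to l_max.
m = np.arange(-l_max, l_max + 1)
mask = np.abs(m)[None, :] <= np.arange(l_max + 1)[:, None]

def random_coefficients():
    """Placeholder c[n, l, m]; real values would come from the density expansion."""
    return rng.normal(size=(n_max, l_max + 1, 2 * l_max + 1)) * mask

def power_spectrum(c):
    """p[n1, n2, l] = sum_m c[n1, l, m] * c[n2, l, m] (real coefficients assumed)."""
    return np.einsum("alm,blm->abl", c, c)

def soap_kernel(p_a, p_b, zeta=4):
    """K = sum over (n1, n2, l) of (p_a * p_b) ** zeta, as written above."""
    return float(np.sum((p_a * p_b) ** zeta))

p_a = power_spectrum(random_coefficients())
p_b = power_spectrum(random_coefficients())
k_ab = soap_kernel(p_a, p_b)
```

Summing over m is what removes the dependence on orientation, and the even power ζ = 4 makes every term of the kernel non-negative.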
Interatomic potentials:
Behler-Parrinello Neural Network Potential (NNP)1
Moment Tensor Potential (MTP)2
Gaussian Approximation Potential (GAP)3
Spectral Neighbor Analysis Potential (SNAP)4
ACSF/MT encode distances and angles.
SOAP/bispectrum encode neighbor density.
1 Behler et al. PRL. 98.14 (2007): 146401.
2 Shapeev. Multiscale Modeling and Simulation 14 (2016).
3 Bartók et al. PRL. 104.13 (2010): 136403.
4 Thompson et al. J. Comput. Phys. 285, 316–330 (2015).
22. Evaluation criteria
• Accuracy (energies, forces and properties)
• Computational cost
• Training data requirements
• Extrapolability
Machine learning Interatomic Potentials (ML-IAPs)
Materials Virtual Lab
23. Standardized workflow for ML-IAP construction and evaluation
[Figure: workflow.
Data generation (Pymatgen, Fireworks + VASP): crystal structure → elastic deformation → distorted structures; surface generation → surface structures; vacancy + AIMD → trajectory snapshots; AIMD (low T, high T) → trajectory snapshots; DFT static calculations → dataset.
Descriptors: local environment sites S1, S2, …, Sn within cutoff radius rc → atomic descriptors X1(r1j … r1n), X2(r2k … r2m), …, Xn(rnj … rnm).
Machine learning: Y = f(X; ω), with Y (energy, force, stress) fitted to DFT properties.
Hyperparameters (energy weights, degrees of freedom, cutoff radius, expansion width, …) are tuned by grid search or an evolutionary algorithm; property fitting, e.g. elastic, phonon.]
Available open source on GitHub: https://github.com/materialsvirtuallab/mlearn
Test systems:
• Fcc Ni
• Fcc Cu
• Bcc Li
• Bcc Mo
• Diamond Ge
• Diamond Si
Zuo, Y.; Chen, C.; Li, X.; Deng, Z.; Chen, Y.; Behler, J.; Csányi, G.; Shapeev, A. V.; Thompson, A. P.; Wood, M. A.; et al. A Performance and Cost Assessment
of Machine Learning Interatomic Potentials. arXiv:1906.08888 2019.
24. ML-IAP: Accuracy vs Cost
[Figure: test error (meV/atom) vs computational cost (s/(MD step·atom)) for the Mo dataset, panels a and b. Annotated settings: Jmax = 3; 2000 kernels; 20 polynomial powers; hidden layers [16, 16].]
GAP reaches the best accuracy, but is the most expensive by O(10^2–10^3).
MTP, NNP, and qSNAP all lie quite close to the Pareto frontier.
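A Pareto frontier in the accuracy-cost plane is the set of models that no other model beats on both axes at once. A minimal sketch follows; the (cost, error) pairs are hypothetical placeholders with neutral labels and are not values from the talk.

```python
def pareto_front(points):
    """Return names of points not dominated in (cost, error); lower is better for both."""
    front = []
    for name, (cost, err) in points.items():
        dominated = any(
            c <= cost and e <= err and (c < cost or e < err)
            for other, (c, e) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical (cost in s/(MD step atom), test error in meV/atom) pairs.
models = {"A": (1e-5, 12.0), "B": (1e-4, 6.0), "C": (1e-2, 4.0), "D": (1e-3, 9.0)}
print(pareto_front(models))  # ['A', 'B', 'C']
```

Model D is dropped because B is both cheaper and more accurate; A, B, and C each represent a different trade-off.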
25. ML-IAP: Training Data Requirements
[Figure panels: (a) Energies, (b) Forces]
• Data quality is more important than data quantity: ~O(10^2) structures are sufficient to converge energies and forces for most ML-IAPs.
NNP and qSNAP require much more training data.
26. ML-IAP: Extrapolability
• The greater the ML complexity (e.g., NNP and GAP), the greater the issues with extrapolation.
• Linear SNAP performs surprisingly well on EOS and polymorph energy differences.
[Figure: panels for Ni, Li, Si, Cu, Mo, Ge comparing DFT, GAP, NNP, MTP, SNAP, and qSNAP; polymorphs bcc Ni, bcc Cu, fcc Mo, fcc Li, wurtzite Si, wurtzite Ge. Annotation: GAP performs poorly!]
27. Applications: Ni-Mo phase diagram and mechanical behavior
[Figure panels: solid-liquid equilibrium; Hall-Petch strengthening]
[1] Hu et al. Science, 2017, 355, 1292
28. Conclusions
Graph / local env. descriptors → Machine learning → “Instant”, linear-scaling property predictions
Transfer learning → Property with smaller data
ML-IAPs: Reproducible, near-DFT accuracy + linear scaling -> New science
http://crystals.ai & mlearn
Open source software and standardized datasets for materials ML
29. Acknowledgements
MAVRL
Creating It from Bit
Contract #N000141612621
GRO Program
Chi Chen (MEGNet)
Yunxing Zuo (ML-IAP)
Xiangguo Li (Ni-Mo and MPE)