Low area ANN architecture with Stochastic Computing and a Simplified Sigmoid Function
1.
VIETNAM NATIONAL UNIVERSITY HANOI (VNU)
VNU UNIVERSITY OF ENGINEERING AND TECHNOLOGY
Low Area ANN Architecture
with Stochastic Computing
and a Simplified Sigmoid Function
Huy-Hung Ho, Xuan-Thuan Nguyen, Van-Thuat Nguyen, Van-Dung Nguyen
Key Laboratory for Smart Integrated Systems (SISLAB),
VNU University of Engineering and Technology (VNU-UET)
4.
Proposed Design
• Reference model of LSI Contest
– Fixed size 2-3-2 architecture
– Sigmoid function LUT: high area cost
– Complex operators
4/20/2018 4
[Figure: each neuron computes a weighted input (additions and multiplications) followed by the sigmoid function 1 / (1 + e^(−x)).]
• Proposed design
– Parameterized ANN architecture (Backpropagation)
– Optimized Sigmoid Function
• Lower LUT memory
– Forward ANN uses stochastic computing
• Replaces additions and multiplications with logic gates
• Result: low area cost and high frequency
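For reference, the weighted-input-plus-sigmoid computation of a 2-3-2 network can be sketched in floating point as follows. This is only an illustration of the arithmetic the hardware replaces; the weights, biases, and inputs are made-up values, not the contest's data, and the function names are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic function 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected layer: z_j = sum_i w[j][i] * a[i] + b[j], a_j = sigmoid(z_j)."""
    return [sigmoid(sum(w * a for w, a in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward_2_3_2(x, w_hidden, b_hidden, w_out, b_out):
    hidden = layer(x, w_hidden, b_hidden)   # 2 inputs -> 3 hidden neurons
    return layer(hidden, w_out, b_out)      # 3 hidden neurons -> 2 outputs

# Hypothetical parameters, chosen only for illustration.
W_HIDDEN = [[0.5, -0.25], [0.125, 0.75], [-0.5, 0.5]]
B_HIDDEN = [0.0, 0.1, -0.1]
W_OUT = [[0.25, -0.5, 0.75], [-0.25, 0.5, 0.125]]
B_OUT = [0.0, 0.0]

outputs = forward_2_3_2([0.2, 0.8], W_HIDDEN, B_HIDDEN, W_OUT, B_OUT)
print(outputs)
```

Every neuron here needs one multiplication per input plus an addition tree, which is the cost the proposed design targets.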
5.
Optimized Sigmoid function
                 Conventional LUT sigmoid   Proposed sigmoid
MSE              3.12 × 10^-7               1.79 × 10^-5
Area (cell)      85                         67 (-21.18%)
• Conventional methods
– PWL (piecewise linear approximation)
– Lookup table (8-bit LUT)
– Separating different regions
• Only for the tanh() function
• Proposed sigmoid function
– 3 different regions
• Constant region
• Linear region
• Non-linear region
– 5-bit LUT and 3-bit LUT
• Evaluation (MSE: mean square error)
$$\mathrm{MSE} = \frac{\sum_{i=0}^{N-1} \left(O_{\text{floating\_point}} - O_{\text{proposed}}\right)^2}{N}$$
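The MSE figure of merit can be computed in a few lines. The sketch below compares a floating-point sigmoid against a uniform lookup table; the LUT layout (equal bins over [-8, 8), one stored value per bin) is an assumption made for illustration and is not the LUT organisation used in the slides, so the printed numbers will not match the table above.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mse(reference, approx):
    """Mean square error between two equally long lists of outputs."""
    return sum((r - a) ** 2 for r, a in zip(reference, approx)) / len(reference)

def lut_sigmoid(x, addr_bits, lo=-8.0, hi=8.0):
    """Hypothetical uniform LUT: return sigmoid at the centre of the bin containing x."""
    size = 1 << addr_bits
    step = (hi - lo) / size
    k = max(0, min(size - 1, int((x - lo) / step)))
    return sigmoid(lo + (k + 0.5) * step)

def lut_mse(addr_bits):
    """MSE of the LUT approximation over a sweep of inputs in [-8, 8)."""
    xs = [i / 100.0 for i in range(-800, 800)]
    return mse([sigmoid(x) for x in xs], [lut_sigmoid(x, addr_bits) for x in xs])

for bits in (5, 8):
    print(f"{bits}-bit LUT: MSE = {lut_mse(bits):.3e}")
```

A smaller table raises the MSE, which is the trade-off the area and MSE rows quantify.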
6.
Forward ANN Using Stochastic Computing
• Stochastic computing (SC)
– Processes data in the form of digitized probabilities
• Stochastic number generation (SNG)
• SC multiplication
• SC scaled addition
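[Figure: the weighted input (additions and multiplications) that precedes the sigmoid function 1 / (1 + e^(−x)) is moved into the stochastic computing domain, where it is built from logic gates and a MUX; the MUX performs the scaled addition z = (x + y)/2.]
[Figure: an SNG built from an LFSR and a comparator (>). Example: the value 2/8 becomes the bit stream 1,0,0,1,0,0,0,0, in which the probability of a '1' bit is 2/8.]
LFSR: Linear Feedback Shift Register [Lee2017eeh]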
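A comparator-based SNG can be modelled in software as below. This is a sketch under assumptions: the slides do not give the LFSR width, polynomial, or seed, so an 8-bit LFSR with feedback polynomial x^8 + x^4 + x^3 + x^2 + 1 and an arbitrary seed are used here, and the names `lfsr8`, `sng`, and `decode` are hypothetical.

```python
def lfsr8(seed=0xA5):
    """8-bit Fibonacci LFSR (assumed polynomial x^8 + x^4 + x^3 + x^2 + 1, period 255)."""
    state = seed & 0xFF
    if state == 0:
        raise ValueError("an all-zero seed locks the LFSR")
    while True:
        yield state
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
        state = (state >> 1) | (feedback << 7)

def sng(probability, length=256, seed=0xA5):
    """Emit a '1' whenever the scaled input value is greater than the LFSR state."""
    threshold = round(probability * 256)
    source = lfsr8(seed)
    return [1 if threshold > next(source) else 0 for _ in range(length)]

def decode(stream):
    """Recover the encoded value as the fraction of '1' bits."""
    return sum(stream) / len(stream)

stream = sng(2 / 8)
print(decode(stream))  # close to 2/8
```

Because the LFSR never reaches the all-zero state, the decoded value can differ from the target by a count or two out of 256.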
7.
Forward ANN Using Stochastic Computing (cont.)
[Figure: conventional architecture versus the proposed SC architecture. In the SC neuron, SNGs convert the activations a^2_1, a^2_2, a^2_3, the weights w^3_1, w^3_2, w^3_3, and the bias b^3 into bit streams; T flip-flops and MUXes combine them into the weighted input z^3.]

Forward ANN simulation with SC length (MSE: mean square error)
                 8 bits          10 bits
Latency          2 × 2^8 + 2     2 × 2^10 + 2
MSE              2.03 × 10^-4    7.70 × 10^-5
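The effect of stream length on accuracy can be explored with a small software model. The sketch below makes several assumptions that the slides do not confirm: unipolar encoding (values in [0, 1]), multiplication by an AND gate, scaled addition by a MUX with a select stream of probability 1/2, and Python's `random` module as an idealised stand-in for independent hardware SNGs. A bipolar design would typically use XNOR gates for multiplication instead.

```python
import random

def sng(p, length, rng):
    """Idealised SNG: each bit is 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def sc_multiply(xs, ys):
    """Unipolar SC multiplication: bitwise AND of two independent streams."""
    return [x & y for x, y in zip(xs, ys)]

def sc_scaled_add(xs, ys, select):
    """MUX-based scaled addition: z = (x + y) / 2 when select is 1 half the time."""
    return [y if s else x for x, y, s in zip(xs, ys, select)]

def decode(stream):
    return sum(stream) / len(stream)

def sc_two_input_sum(a1, w1, a2, w2, bits, seed=0):
    """Estimate (a1*w1 + a2*w2) / 2 using streams of length 2**bits."""
    rng = random.Random(seed)
    n = 1 << bits
    p1 = sc_multiply(sng(a1, n, rng), sng(w1, n, rng))
    p2 = sc_multiply(sng(a2, n, rng), sng(w2, n, rng))
    return decode(sc_scaled_add(p1, p2, sng(0.5, n, rng)))

exact = (0.5 * 0.75 + 0.25 * 0.5) / 2
for bits in (8, 10):
    error = abs(sc_two_input_sum(0.5, 0.75, 0.25, 0.5, bits) - exact)
    print(f"{bits}-bit SC length: |error| = {error:.4f}")
```

Each extra pair of bits quadruples the stream length and, on average, halves the random error, at the price of a proportionally longer latency.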
9.
Evaluation
2-3-2 forward ANN architecture evaluation
                   LSI Contest   Our proposed (8-bit SC length)
Frequency (MHz)    182           357
Slice LUTs         814           738
Slice Registers    528           450
Mux                60            75
DSP                12            0
Area (cell)        2059          1627 (-20.98%)
Latency            4             2 × 2^8 + 2
• Environment
– VHDL
– Xilinx Vivado
– Xilinx VC707 FPGA (28 nm)
– 2-3-2 forward model
+ Low area cost
+ High frequency
+ No DSP blocks used
- High latency
• Verification implementation
[Figure: the real-valued input is fed to a reference math equation and, after real-to-fixed-point conversion, to the proposed design, whose output is converted from fixed point back to real; the two results, a and b, are compared by MSE.]
$$\mathrm{MSE} = \frac{\sum_{i=0}^{N-1} (a - b)^2}{N}$$
MSE: mean square error
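The verification flow (real to fixed point, design, fixed point to real, then MSE against the math equation) can be mimicked in software. The sketch below is only a model: the 16-bit word with 10 fractional bits is an assumed format, and `design_under_test` is a hypothetical stand-in that evaluates an exact sigmoid on quantised data rather than the actual VHDL design.

```python
import math

WORD_BITS = 16   # assumed word width
FRAC_BITS = 10   # assumed number of fractional bits

def to_fixed(x):
    """Real -> signed fixed point, with rounding and saturation."""
    raw = round(x * (1 << FRAC_BITS))
    lo, hi = -(1 << (WORD_BITS - 1)), (1 << (WORD_BITS - 1)) - 1
    return max(lo, min(hi, raw))

def to_real(raw):
    """Fixed point -> real."""
    return raw / (1 << FRAC_BITS)

def reference(x):
    """The 'math equation' branch: floating-point sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

def design_under_test(raw_x):
    """Stand-in for the hardware: fixed-point input, fixed-point output."""
    return to_fixed(reference(to_real(raw_x)))

def verify(inputs):
    """MSE between the reference outputs a and the design outputs b."""
    a = [reference(x) for x in inputs]
    b = [to_real(design_under_test(to_fixed(x))) for x in inputs]
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(inputs)

print(verify([i / 10.0 for i in range(-50, 51)]))
```

With this stand-in the reported MSE reflects quantisation error only; replacing `design_under_test` with simulation results would give the figure for a real design.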
10.
Conclusion
• Contest Requirement
– 2-3-2 ANN architecture with 5 sets of inputs
• Proposed Design
– Parameterized 2-3-2 ANN architecture
– Simplified sigmoid architecture
• 21.18% lower area cost (compared with the conventional LUT sigmoid)
– Forward architecture uses stochastic computing
• 20.98% lower area cost (compared with the contest reference model)
– Limitation: high latency
• Future work
– Apply stochastic computing to the backward module
– Further reduce area cost
13.
Simplified Sigmoid Function Architecture
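[Figure: block diagram of the simplified sigmoid. A decoder on the 16-bit input (sign-related bits Input(0) and Input(1) shown) drives a multiplexer that selects the output among the constants 1.0, 0.984375, 0.015625, and 0.0, the linear path 0.245x + 0.5, and four LUTs (two with 5-bit and two with 3-bit addresses). All data paths are 16 bits wide.]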
• A multiplexer divides the input range into regions and selects among them
• Two 32-entry LUTs and two 8-entry LUTs cover the ranges [−3.5; −1) and (1; 3.5]
• The multiplication is replaced by an approximation based on a 2-bit right shift
Weighted input range   Decode selector   Output
(4.5; +∞)              000000            0.999023
(3.5; 4.5]             100000            0.984375
(1; 3.5]               110000            LUT
(−1; 1)                111000            y = 0.245x + 0.5
[−3.5; −1)             111100            LUT
[−4.5; −3.5)           111110            0.015625
(−∞; −4.5)             111111            0.0
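The region table translates directly into a piecewise function. In the sketch below the constants and the linear segment come from the table, while the LUT contents are an assumption: each non-linear region is modelled as a single table of equal-width bins holding the exact sigmoid at the bin centre, since the slides do not list the stored values or how the 5-bit and 3-bit LUTs share a region. Assigning the endpoints x = ±1 to the linear region is also an assumption.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lut_value(x, lo, hi, addr_bits):
    """Hypothetical LUT: sigmoid at the centre of one of 2**addr_bits equal bins in [lo, hi]."""
    size = 1 << addr_bits
    step = (hi - lo) / size
    k = max(0, min(size - 1, int((x - lo) / step)))
    return sigmoid(lo + (k + 0.5) * step)

def simplified_sigmoid(x, addr_bits=5):
    """Region-based sigmoid approximation following the table above."""
    if x > 4.5:
        return 0.999023                       # upper constant region
    if x > 3.5:
        return 0.984375                       # constant region (3.5; 4.5]
    if x > 1.0:
        return lut_value(x, 1.0, 3.5, addr_bits)     # non-linear region
    if x >= -1.0:
        return 0.245 * x + 0.5                # linear region
    if x >= -3.5:
        return lut_value(x, -3.5, -1.0, addr_bits)   # non-linear region
    if x >= -4.5:
        return 0.015625                       # constant region [-4.5; -3.5)
    return 0.0                                # lower constant region

for x in (-5.0, -2.0, 0.0, 0.5, 2.0, 4.0, 6.0):
    print(x, simplified_sigmoid(x), round(sigmoid(x), 6))
```

In hardware, the factor 0.245 in the linear region would be approximated by a 2-bit right shift (a multiplication by 0.25), which this floating-point model does not reproduce.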
14.
2-3-2 Forward Architecture Using Stochastic Computing
[Figure: reference architecture of the LSI Contest. A controller (start, finish) sequences weight and bias storage, weighted-input blocks, and activation-function blocks.]
[Figure: our proposed architecture using SC. Inputs clk, areset, start, and input feed three "SC weighted in hidden" blocks (with hidden weights and biases), each followed by a sigmoid function LUT; their activation results feed two "SC weighted in output" blocks, each followed by a sigmoid function, producing valid and output.]