CMSIS-NN, INTRO
新竹碼農
Anthony Liu, 2018/03/08
RESOURCES
• Source: https://github.com/ARM-software/CMSIS_5
• Web 1: https://developer.arm.com/embedded/cmsis
• Web 2: http://www2.keil.com/mdk5/cmsis/
• Paper: https://arxiv.org/abs/1801.06601
• Manual: http://arm-software.github.io/CMSIS_5/NN/html/index.html
CMSIS 5.3.0
• http://www2.keil.com/mdk5/cmsis/
• https://developer.arm.com/embedded/cmsis
• Cortex Microcontroller Software Interface Standard
• CMSIS-NN first appeared in 5.2.1 dev 3
• Components: CMSIS-CORE, CMSIS-RTOS, CMSIS-DSP, CMSIS-Driver, CMSIS-SVD, CMSIS-DAP, CMSIS-Pack, CMSIS-NN, CMSIS-Zone (planned)
CMSIS
https://developer.arm.com/embedded/cmsis
CMSIS-NN
• DSP extension: Cortex-M0 (No) / Cortex-M3 (No) / Cortex-M4 (Yes) / Cortex-M7 (Yes) / Cortex-M33 (Optional)
• Inference only, for targets with limited compute power
• CPU: tens of MHz up to 192 MHz (Cortex-M4) or 400 MHz (Cortex-M7)
• Memory: tens of KB to a few MB
• Kernel support: q7_t and q15_t fractional data types, range [-1.0, 1.0)
• Functions
• Neural Network Convolution Functions
• Neural Network Activation Functions
• Fully-connected Layer Functions
• Neural Network Pooling Functions
• Softmax Functions
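To make the fractional data types concrete, here is a minimal sketch of converting between real values and q7/q15. It assumes the usual fixed-point convention (scale by 2^7 for q7 and 2^15 for q15, then saturate). The helper names `float_to_q7`, `float_to_q15` and `q7_to_float` are illustrative and are not part of CMSIS-NN.

```c
#include <stdint.h>

/* Illustrative helpers (hypothetical names, not CMSIS-NN API). */

/* Map a real value in [-1.0, 1.0) to q7: scale by 2^7, round, saturate. */
int8_t float_to_q7(float x)
{
    float scaled = x * 128.0f;
    if (scaled >= 127.0f)  return 127;    /* largest q7 value, just below +1.0 */
    if (scaled <= -128.0f) return -128;   /* exactly -1.0 */
    return (int8_t)(scaled < 0.0f ? scaled - 0.5f : scaled + 0.5f);
}

/* Same idea for q15: scale by 2^15. */
int16_t float_to_q15(float x)
{
    float scaled = x * 32768.0f;
    if (scaled >= 32767.0f)  return 32767;
    if (scaled <= -32768.0f) return -32768;
    return (int16_t)(scaled < 0.0f ? scaled - 0.5f : scaled + 0.5f);
}

/* Interpret a q7 integer as a fraction again. */
float q7_to_float(int8_t q)
{
    return (float)q / 128.0f;
}
```

With these helpers, 0.5 becomes 64 in q7 and 16384 in q15, while +1.0 cannot be represented and saturates to 127, which is why the range is half-open.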
SUPPORT
• Data conversion
• arm_q7_to_q15_no_shift
• arm_q7_to_q15_reordered_no_shift
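As a rough illustration of what a "no shift" conversion means, the plain-C sketch below widens q7 values to 16 bits by sign extension only, so the integer value is unchanged. The function name `widen_q7_to_q15` is hypothetical. The library routines are optimized for the DSP extension, and the reordered variant additionally changes the element order, which this sketch does not attempt to reproduce.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only (hypothetical name, not the CMSIS-NN routine):
 * sign-extend each 8-bit value to 16 bits without any left shift.
 * The integer value is preserved, e.g. q7 64 stays 64. */
void widen_q7_to_q15(const int8_t *src, int16_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = (int16_t)src[i];
    }
}
```

Because nothing is shifted, a q7 operand widened this way can be multiplied with 16-bit arithmetic while keeping its original 7 fractional bits.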
CONVOLUTION
• arm_convolve_HWC_q7_basic
• arm_convolve_HWC_q15_basic
• arm_convolve_HWC_q7_fast
• arm_convolve_HWC_q7_fast_nonsquare
• arm_convolve_HWC_q7_RGB
• arm_convolve_HWC_q15_fast
• arm_convolve_1x1_HWC_q7_fast_nonsquare
• arm_depthwise_separable_conv_HWC_q7
• arm_depthwise_separable_conv_HWC_q7_nonsquare
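The HWC in these function names refers to a height-width-channel tensor layout, where the channel index varies fastest in memory. A minimal sketch of the corresponding index arithmetic follows; `hwc_index` is an illustrative helper, not a library function.

```c
#include <stddef.h>

/* Illustrative helper (hypothetical name): flat offset of element
 * (y, x, c) in a tensor stored in height-width-channel order.
 * All channels of one pixel are adjacent in memory. */
size_t hwc_index(size_t y, size_t x, size_t c,
                 size_t width, size_t channels)
{
    return (y * width + x) * channels + c;
}
```

For a 32x32 RGB image, pixel (row 1, column 0), channel 2 sits at offset `hwc_index(1, 0, 2, 32, 3)`, i.e. 98.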
ACTIVATION
• ReLU
• arm_relu_q7
• arm_relu_q15
• Sigmoid / Tanh
• arm_nn_activations_direct_q7
• arm_nn_activations_direct_q15
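For reference, ReLU on q7 data amounts to clamping negative values to zero in place. The sketch below is a plain-C illustration under that assumption; `relu_q7_sketch` is a hypothetical name, and the library versions are written to take advantage of the DSP extension where available.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative in-place ReLU on q7 data (hypothetical name):
 * negative values become 0, non-negative values are left unchanged. */
void relu_q7_sketch(int8_t *data, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (data[i] < 0) {
            data[i] = 0;
        }
    }
}
```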
POOLING
• Supports max-pooling and average-pooling in 1.7 (q7_t) format
• arm_maxpool_q7_HWC
• arm_avepool_q7_HWC
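A simplified picture of max-pooling on q7 data in HWC layout is sketched below. It assumes a fixed 2x2 window with stride 2 and no padding, which is narrower than what the library functions accept, and `maxpool2x2_q7_hwc` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative 2x2, stride-2 max-pooling over an HWC q7 tensor
 * (hypothetical name; window, stride and padding are fixed here
 * as simplifying assumptions). Output size is (h/2) x (w/2) x ch. */
void maxpool2x2_q7_hwc(const int8_t *in, size_t h, size_t w, size_t ch,
                       int8_t *out)
{
    size_t out_h = h / 2;
    size_t out_w = w / 2;

    for (size_t oy = 0; oy < out_h; oy++) {
        for (size_t ox = 0; ox < out_w; ox++) {
            for (size_t c = 0; c < ch; c++) {
                int8_t best = INT8_MIN;
                /* Scan the 2x2 window that maps to this output pixel. */
                for (size_t ky = 0; ky < 2; ky++) {
                    for (size_t kx = 0; kx < 2; kx++) {
                        size_t y = 2 * oy + ky;
                        size_t x = 2 * ox + kx;
                        int8_t v = in[(y * w + x) * ch + c];
                        if (v > best) {
                            best = v;
                        }
                    }
                }
                out[(oy * out_w + ox) * ch + c] = best;
            }
        }
    }
}
```

Average-pooling follows the same loop structure, with the comparison replaced by a sum that is divided by the window size.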
SOFTMAX
• Base-2 softmax: uses 2^x in place of e^x
• arm_softmax_q7
• arm_softmax_q15
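The appeal of a base-2 softmax on a microcontroller is that 2^x for integer x is a bit shift, so no exponential function is needed. The sketch below illustrates the idea in integer arithmetic. It is not the library's implementation: the name `softmax_base2_q7`, the 20-bit headroom and the scaling of the result to 127 are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative base-2 softmax on q7 inputs (hypothetical name).
 * out[i] is approximately 127 * 2^in[i] / sum_j 2^in[j].
 * Inputs more than HEADROOM below the maximum contribute 0. */
#define HEADROOM 20

void softmax_base2_q7(const int8_t *in, size_t n, int8_t *out)
{
    assert(n > 0);

    /* Subtracting the maximum keeps every exponent <= 0. */
    int8_t max = in[0];
    for (size_t i = 1; i < n; i++) {
        if (in[i] > max) {
            max = in[i];
        }
    }

    /* 2^(in[i] - max), scaled by 2^HEADROOM, is a single shift. */
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int diff = max - in[i];
        if (diff < HEADROOM) {
            sum += (uint64_t)1 << (HEADROOM - diff);
        }
    }

    for (size_t i = 0; i < n; i++) {
        int diff = max - in[i];
        uint64_t num = (diff < HEADROOM)
                           ? (uint64_t)1 << (HEADROOM - diff)
                           : 0;
        out[i] = (int8_t)(num * 127 / sum);  /* always in 0..127 */
    }
}
```

For inputs {1, 0} the weights are 2:1, so the outputs are 84 and 42, roughly two thirds and one third of 127.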
FULLY-CONNECTED LAYER
• arm_fully_connected_q7
• arm_fully_connected_q7_opt
• arm_fully_connected_q15
• arm_fully_connected_q15_opt
• arm_fully_connected_mat_q7_vec_q15
• arm_fully_connected_mat_q7_vec_q15_opt
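A fully-connected layer in q7 is a matrix-vector product accumulated in a wider integer, followed by a shift back to q7 and saturation. The plain-C sketch below shows that arithmetic under stated assumptions: row-major weights, a left shift to align the bias with the accumulator, and a rounding right shift on output. The names and parameter order are illustrative and should not be read as the library's signature. The `_opt` variants in the list expect a reordered weight layout, which is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit value into the q7 range. */
int8_t saturate_q7(int32_t v)
{
    if (v > 127)  return 127;
    if (v < -128) return -128;
    return (int8_t)v;
}

/* Illustrative q7 fully-connected layer (hypothetical name and
 * parameter order). weights is row-major: num_rows x dim_vec.
 * Assumes >> on a negative int32_t is an arithmetic shift, as on
 * common compilers for Cortex-M targets. */
void fully_connected_q7_sketch(const int8_t *vec, const int8_t *weights,
                               size_t dim_vec, size_t num_rows,
                               unsigned bias_shift, unsigned out_shift,
                               const int8_t *bias, int8_t *out)
{
    for (size_t r = 0; r < num_rows; r++) {
        /* Align the q7 bias with the q7 x q7 products. */
        int32_t acc = (int32_t)bias[r] * ((int32_t)1 << bias_shift);

        /* Add half an output LSB so the final shift rounds. */
        if (out_shift > 0) {
            acc += (int32_t)1 << (out_shift - 1);
        }

        for (size_t j = 0; j < dim_vec; j++) {
            acc += (int32_t)vec[j] * (int32_t)weights[r * dim_vec + j];
        }

        out[r] = saturate_q7(acc >> out_shift);
    }
}
```

With both shifts set to 7, the input {0.5, 0.5} against a row of {0.5, 0.5} yields 64 (0.5) before the bias, matching the real-valued dot product.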
FOOTPRINT - 9,306 BYTES TOTAL
text data bss  dec hex filename
 132    0   0  132  84 ./SoftmaxFunctions/arm_softmax_q15.o
 154    0   0  154  9a ./SoftmaxFunctions/arm_softmax_q7.o
 544    0   0  544 220 ./PoolingFunctions/arm_pool_q7_HWC.o
2816    0   0 2816 b00 ./NNSupportFunctions/arm_nntables.o
  84    0   0   84  54 ./NNSupportFunctions/arm_q7_to_q15_no_shift.o
  72    0   0   72  48 ./NNSupportFunctions/arm_q7_to_q15_reordered_no_shift.o
 102    0   0  102  66 ./FullyConnectedFunctions/arm_fully_connected_q15.o
  88    0   0   88  58 ./FullyConnectedFunctions/arm_fully_connected_mat_q7_vec_q15.o
 476    0   0  476 1dc ./FullyConnectedFunctions/arm_fully_connected_mat_q7_vec_q15_opt.o
 486    0   0  486 1e6 ./FullyConnectedFunctions/arm_fully_connected_q15_opt.o
  86    0   0   86  56 ./FullyConnectedFunctions/arm_fully_connected_q7.o
 532    0   0  532 214 ./FullyConnectedFunctions/arm_fully_connected_q7_opt.o
 266    0   0  266 10a ./ConvolutionFunctions/arm_convolve_1x1_HWC_q7_fast_nonsquare.o
 404    0   0  404 194 ./ConvolutionFunctions/arm_convolve_HWC_q15_basic.o
 450    0   0  450 1c2 ./ConvolutionFunctions/arm_convolve_HWC_q15_fast.o
 426    0   0  426 1aa ./ConvolutionFunctions/arm_convolve_HWC_q7_basic.o
 434    0   0  434 1b2 ./ConvolutionFunctions/arm_convolve_HWC_q7_fast.o
 434    0   0  434 1b2 ./ConvolutionFunctions/arm_convolve_HWC_q7_fast_nonsquare.o
 428    0   0  428 1ac ./ConvolutionFunctions/arm_convolve_HWC_q7_RGB.o
 298    0   0  298 12a ./ConvolutionFunctions/arm_depthwise_separable_conv_HWC_q7.o
 378    0   0  378 17a ./ConvolutionFunctions/arm_depthwise_separable_conv_HWC_q7_nonsquare.o
   4    0   0    4   4 ./ConvolutionFunctions/arm_nn_mat_mult_kernel_q7_q15.o
   4    0   0    4   4 ./ConvolutionFunctions/arm_nn_mat_mult_kernel_q7_q15_reordered.o
 104    0   0  104  68 ./ActivationFunctions/arm_nn_activations_q15.o
  48    0   0   48  30 ./ActivationFunctions/arm_nn_activations_q7.o
  28    0   0   28  1c ./ActivationFunctions/arm_relu_q15.o
  28    0   0   28  1c ./ActivationFunctions/arm_relu_q7.o
EXAMPLE - CIFAR-10
• Layer sequence
• arm_convolve_HWC_q7_RGB()
• arm_relu_q7()
• arm_maxpool_q7_HWC()
• arm_convolve_HWC_q7_fast()
• arm_relu_q7()
• arm_avepool_q7_HWC()
• arm_convolve_HWC_q7_fast()
• arm_relu_q7()
• arm_avepool_q7_HWC()
• arm_fully_connected_q7()
• arm_softmax_q7()
• Weights and buffers
• conv1_wt: 2,400
• conv1_bias: 32
• conv2_wt: 12,800
• conv2_bias: 16
• conv3_wt: 12,800
• conv3_bias: 32
• ip1_wt: 10
• ip1_bias: 10
• input_data: 3K
• output_data: 10
• col_buffer: 3,200
• scratch_buffer: 40K
PERFORMANCE
• CIFAR-10: speed showcase
• GRU: power-saving showcase

