SlideShare une entreprise Scribd logo
1  sur  33
Boosting HTML Apps with WebCL

Janakiram Raghumandala
December 10, 2013

DevCon 2013: Dec 9th–11th 2013,
Monte Carlo
What is WebCL?
• WebCL is an open API to achieve heterogeneous parallel
computing in HTML5 web browsers.
• It is a JavaScript binding to OpenCL
• Generally used with Canvas element, SIMD computing,
particle simulation
• WebCL enables
of computeintensive web applications.
Source Name Placement
2

| © 2013 Cisco and/or its affiliates. All rights reserved.
Contents
– Motivation
– OpenCL, The Underlying Concept
– WebCL
– Performance Comparison & Demos
– Summary, current status and References

3

| © 2013 Cisco and/or its affiliates. All rights reserved.
Contents
– Motivation
– OpenCL, The Underlying Concept
– WebCL
– Performance Comparison & Demos
– Summary, current status and References

4

| © 2013 Cisco and/or its affiliates. All rights reserved.
Motivation
• New mobile apps with high-computing demands
– Augmented Reality
– Video Processing
– 3D Games – lots of calculations and physics
– Computational photography

5

| © 2013 Cisco and/or its affiliates. All rights reserved.
Motivation…
• GPU delivers
improved FLOPS
• 100 times faster

6

| © 2013 Cisco and/or its affiliates. All rights reserved.
Motivation…
• Higher FLOPS per WATT
•
•
•
•
•

RED – CPUs
ORANGE – APUs
Light BLUE – ARM
GREEN – Grid Processors
YELLO – GPUs

• Top-left is the preferred zone

7

| © 2013 Cisco and/or its affiliates. All rights reserved.
Webcentric Applications

Webcentric Applications
8

| © 2013 Cisco and/or its affiliates. All rights reserved.
Motivation… growing web centric applications

9

| © 2013 Cisco and/or its affiliates. All rights reserved.
Contents
– Motivation
– OpenCL, The Underlying Concept
– WebCL
– Performance Comparison & Demos
– Summary, current status and References

10

| © 2013 Cisco and/or its affiliates. All rights reserved.
OpenCL
•
•
•
•

11

C-based cross-platform programming interface
Kernels
Run-time or build-time compilation
Rich set of built-in functions, cross, dot, sin, cos, log …

| © 2013 Cisco and/or its affiliates. All rights reserved.
OpenCL…
• Host contains one or
more OpenCL Devices
(aka Compute Devices)
• A Computer Device
contains one or more
Compute Units ~ Cores

12

| © 2013 Cisco and/or its affiliates. All rights reserved.
OpenCL…
• A Compute Unit
contains one or more
processing elements
~ Intel’s hyper threads
• Processing elements
execute code as SIMD

13

| © 2013 Cisco and/or its affiliates. All rights reserved.
OpenCL… Work Items & Work Groups

14

| © 2013 Cisco and/or its affiliates. All rights reserved.
OpenCL… Execution Model
• Kernel
– Basic unit of executable code, similar to a C function
• Program
– Collection of kernels and functions called by kernels
– Analogous to a dynamic library (run-time linking)
• Command Queue
– Applications queue kernels and data transfers
– Performed in-order or out-of-order
• Work-item ~ Processing Element ~ Thread
– An execution of a kernel by a processing element (~ thread)
• Work-group ~ Compute Unit ~ Core
– A collection of related work-items that execute on a single compute unit (~ core)

15

| © 2013 Cisco and/or its affiliates. All rights reserved.
OpenCL… Execution Model
• Kernel
– Basic program element, similar to a C function
• Program
– Collection of kernels and functions called by kernels
– Analogous to a dynamic library (run-time linking)
• Command Queue
– Applications queue kernels and data transfers
– Performed in-order or out-of-order
• Work-item ~ Processing Element ~ Thread
– An execution of a kernel by a processing element (~ thread)
• Work-group ~ Compute Unit ~ Core
– A collection of related work-items that execute on a single compute unit (~ core)

16

| © 2013 Cisco and/or its affiliates. All rights reserved.
OpenCL… Memory Model
•
•
•
•

17

| © 2013 Cisco and/or its affiliates. All rights reserved.

Private Memory per PE
Local Memory per Compute Unit
Global Memory per Device
Host Memory per Host System
OpenCL:Kernels
OpenCL: Kernels
• OpenCL kernel
– Defined on an N-dimensional computation domain
– Execute a kernel at each point of the computation domain
Traditional Loop
void vectorMult(
const float* a,
const float* b,
float* c,
const unsigned int count)
{
for(int i=0; i<count; i++)
c[i] = a[i] * b[i];
}

18

Data Parallel OpenCL Kernel
kernel void vectorMult(
global const float* a,
global const float* b,
global float* c)

| © 2013 Cisco and/or its affiliates. All rights reserved.

{
int id = get_global_id(0);
c[id] = a[id] * b[id];
}
OpenCL:Execution of Programs - Steps
1. Query host for OpenCL devices.
2. Create a context to associate OpenCL devices.
3. Create programs for execution on one or more
associated devices.
4. From the programs, select kernels to execute.
5. Create memory objects accessible from the host
and/or the device.
6. Copy memory data to the device as needed.
7. Provide kernels to the command queue for
execution.
8. Copy results from the device to the host.

19

| © 2013 Cisco and/or its affiliates. All rights reserved.
Contents
– Motivation
– OpenCL, The Underlying Concept
– WebCL
– Performance Comparison & Demos
– Summary, current status and References

20

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL
• WebCL concepts were deliberately made similar to the
OpenCL model
– Programs
– Kernels
– Linking
– Memory Transfers

21

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL:Programming
•
•
•
•

Initialization
Create kernel
Run kernel
WebCL image object creation
– From Uint8Array
– From <img>, <canvas>, or <video>
– From WebGL vertex buffer
– From WebGL texture
• WebGL vertex animation
• WebGL texture animation
22

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL:Initialization
<script>
 var platforms = WebCL.getPlatforms();
var devices = platforms[0].getDevices(WebCL.DEVICE_TYPE_GPU);
var context = WebCL.createContext({ WebCLDevice: devices[0] } );
var queue = context.createCommandQueue();
</script>

23

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL:Creating a kernel
<script id="squareProgram" type="x-kernel">
__kernel square(
__global float* input,
__global float* output,
const unsigned int count)
{
int i = get_global_id(0);
if(i < count)
output[i] = input[i] * input[i];
}
</script>
<script>
 var programSource = getProgramSource("squareProgram");
var program = context.createProgram(programSource);
program.build();
var kernel = program.createKernel("square");
</script>
24

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL:Running WebCL Kernes
<script> ...

var inputBuf context.createBuffer(WebCL.MEM_READ_ONLY, Float32Array.BYTES_PER_ELEMENT * count);
var outputBuf =
context.createBuffer(WebCL.MEM_WRITE_ONLY, Float32Array.BYTES_PER_ELEMENT * count);
var data = new Float32Array(count); // populate data ...
queue.enqueueWriteBuffer(inputBuf, data, true); //Last arg indicates blocking API
kernel.setKernelArg(0, inputBuf);

kernel.setKernelArg(1, outputBuf);
kernel.setKernelArg(2, count, WebCL.KERNEL_ARG_INT);
var workGroupSize = kernel.getWorkGroupInfo(devices[0], WebCL.KERNEL_WORK_GROUP_SIZE);
queue.enqueueNDRangeKernel(kernel, [count], [workGroupSize]);
queue.finish(); //This API blocks
queue.enqueueReadBuffer(outputBuf, data, true); //Last arg indicates blocking API

</script>
25

| © 2013 Cisco and/or its affiliates. All rights reserved.
Contents
– Motivation
– OpenCL, The Underlying Concept
– WebCL
– Performance Comparison & Demos
– Summary, current status and References

26

| © 2013 Cisco and/or its affiliates. All rights reserved.
DevCon 2013: Dec 9th–11th 2013,
Monte Carlo
Contents
– Motivation
– OpenCL, The Underlying Concept
– WebCL
– Performance Comparison & Demos
– Summary, current status and References

28

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL:Summary
JavaScript binding for OpenCL
Provides high performance parallel processing on multicore CPU & GPGPU
– Portable and efficient access to heterogeneous multicore devices
– Provides a single coherent standard across desktop and mobile devices
WebCL HW and SW requirements
– Need a modified browser with WebCL support
– Hardware, driver, and runtime support for OpenCL
WebCL stays close to the OpenCL standard
– Preserves developer familiarity and facilitates adoption
– Allows developers to translate their OpenCL knowledge to the web environment
– Easier to keep OpenCL and WebCL in sync, as the two evolve
Intended to be an interface above OpenCL
Facilitates layering of higher level abstractions on top of WebCL API
29

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL:Goals
• Compute intensive web applications
– High Performance
– Platform independent – Standards-compliant
– Ease of application development

30

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL Standardization – Current Status
• Work in progress
• Working-draft is available

31

| © 2013 Cisco and/or its affiliates. All rights reserved.
References
•
•
•
•

32

OpenCL 2.0 Tutorial
https://cvs.khronos.org/svn/repos/registry/trunk/public/webcl/spec/latest/index.html - WebCL Working draft
http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf- OpenCL 1.1 Specification
http://streamcomputing.eu/blog/2012-08-27/processors-that-can-do-20-gflops-watt/

| © 2013 Cisco and/or its affiliates. All rights reserved.
Thank you.

Contenu connexe

Tendances

Native OSGi, Modular Software Development in a Native World - Alexander Broek...
Native OSGi, Modular Software Development in a Native World - Alexander Broek...Native OSGi, Modular Software Development in a Native World - Alexander Broek...
Native OSGi, Modular Software Development in a Native World - Alexander Broek...
mfrancis
 
Calling All Modularity Solutions: A Comparative Study from eBay
Calling All Modularity Solutions: A Comparative Study from eBayCalling All Modularity Solutions: A Comparative Study from eBay
Calling All Modularity Solutions: A Comparative Study from eBay
Tony Ng
 

Tendances (19)

Projects Valhalla, Loom and GraalVM at virtual JavaFest 2020 in Kiev, Ukraine...
Projects Valhalla, Loom and GraalVM at virtual JavaFest 2020 in Kiev, Ukraine...Projects Valhalla, Loom and GraalVM at virtual JavaFest 2020 in Kiev, Ukraine...
Projects Valhalla, Loom and GraalVM at virtual JavaFest 2020 in Kiev, Ukraine...
 
The Eclipse Transformer Project
The Eclipse Transformer Project The Eclipse Transformer Project
The Eclipse Transformer Project
 
IPT Reactive Java IoT Demo - BGOUG 2018
IPT Reactive Java IoT Demo - BGOUG 2018IPT Reactive Java IoT Demo - BGOUG 2018
IPT Reactive Java IoT Demo - BGOUG 2018
 
JavaOne 2015: 12 Factor App
JavaOne 2015: 12 Factor AppJavaOne 2015: 12 Factor App
JavaOne 2015: 12 Factor App
 
The Making of the Oracle R2DBC Driver and How to Take Your Code from Synchron...
The Making of the Oracle R2DBC Driver and How to Take Your Code from Synchron...The Making of the Oracle R2DBC Driver and How to Take Your Code from Synchron...
The Making of the Oracle R2DBC Driver and How to Take Your Code from Synchron...
 
Shorten All URLs
Shorten All URLsShorten All URLs
Shorten All URLs
 
Java 9 and Project Jigsaw
Java 9 and Project JigsawJava 9 and Project Jigsaw
Java 9 and Project Jigsaw
 
Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...
Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...
Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...
 
MySQL 8 High Availability with InnoDB Clusters
MySQL 8 High Availability with InnoDB ClustersMySQL 8 High Availability with InnoDB Clusters
MySQL 8 High Availability with InnoDB Clusters
 
Polygot Java EE on the GraalVM
Polygot Java EE on the GraalVMPolygot Java EE on the GraalVM
Polygot Java EE on the GraalVM
 
Exploring Java Heap Dumps (Oracle Code One 2018)
Exploring Java Heap Dumps (Oracle Code One 2018)Exploring Java Heap Dumps (Oracle Code One 2018)
Exploring Java Heap Dumps (Oracle Code One 2018)
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
 
DEVNET-1152 OpenDaylight YANG Model Overview and Tools
DEVNET-1152	OpenDaylight YANG Model Overview and ToolsDEVNET-1152	OpenDaylight YANG Model Overview and Tools
DEVNET-1152 OpenDaylight YANG Model Overview and Tools
 
Native OSGi, Modular Software Development in a Native World - Alexander Broek...
Native OSGi, Modular Software Development in a Native World - Alexander Broek...Native OSGi, Modular Software Development in a Native World - Alexander Broek...
Native OSGi, Modular Software Development in a Native World - Alexander Broek...
 
OS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLOS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of ML
 
ADBA (Asynchronous Database Access)
ADBA (Asynchronous Database Access)ADBA (Asynchronous Database Access)
ADBA (Asynchronous Database Access)
 
MuleSoft Meetup Warsaw Group DataWeave 2.0
MuleSoft Meetup Warsaw Group DataWeave 2.0MuleSoft Meetup Warsaw Group DataWeave 2.0
MuleSoft Meetup Warsaw Group DataWeave 2.0
 
Building a Brain with Raspberry Pi and Zulu Embedded JVM
Building a Brain with Raspberry Pi and Zulu Embedded JVMBuilding a Brain with Raspberry Pi and Zulu Embedded JVM
Building a Brain with Raspberry Pi and Zulu Embedded JVM
 
Calling All Modularity Solutions: A Comparative Study from eBay
Calling All Modularity Solutions: A Comparative Study from eBayCalling All Modularity Solutions: A Comparative Study from eBay
Calling All Modularity Solutions: A Comparative Study from eBay
 

Similaire à Boosting your HTML Apps – Overview of OpenCL and Hello World of WebCL

TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...
TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...
TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...
chiportal
 
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and KubeflowKostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
IT Arena
 
TECNIRIS@: OpenNebula Tutorial
TECNIRIS@: OpenNebula TutorialTECNIRIS@: OpenNebula Tutorial
TECNIRIS@: OpenNebula Tutorial
OpenNebula Project
 
Automatic generation of platform architectures using open cl and fpga roadmap
Automatic generation of platform architectures using open cl and fpga roadmapAutomatic generation of platform architectures using open cl and fpga roadmap
Automatic generation of platform architectures using open cl and fpga roadmap
Manolis Vavalis
 

Similaire à Boosting your HTML Apps – Overview of OpenCL and Hello World of WebCL (20)

WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
 
Database As A Service: OEM + ODA (OOW 15 Presentation)
Database As A Service: OEM + ODA (OOW 15 Presentation)Database As A Service: OEM + ODA (OOW 15 Presentation)
Database As A Service: OEM + ODA (OOW 15 Presentation)
 
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAn Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
 
TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...
TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...
TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...
 
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and KubeflowKostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
 
TECNIRIS@: OpenNebula Tutorial
TECNIRIS@: OpenNebula TutorialTECNIRIS@: OpenNebula Tutorial
TECNIRIS@: OpenNebula Tutorial
 
OpenStack for devops environment
OpenStack for devops environment OpenStack for devops environment
OpenStack for devops environment
 
Show and Tell: Building Applications on Cisco Open SDN Controller
Show and Tell: Building Applications on Cisco Open SDN Controller Show and Tell: Building Applications on Cisco Open SDN Controller
Show and Tell: Building Applications on Cisco Open SDN Controller
 
Automatic generation of platform architectures using open cl and fpga roadmap
Automatic generation of platform architectures using open cl and fpga roadmapAutomatic generation of platform architectures using open cl and fpga roadmap
Automatic generation of platform architectures using open cl and fpga roadmap
 
4th SDN Interest Group Seminar-Session 2-2(130313)
4th SDN Interest Group Seminar-Session 2-2(130313)4th SDN Interest Group Seminar-Session 2-2(130313)
4th SDN Interest Group Seminar-Session 2-2(130313)
 
Red Hat Java Update and Quarkus Introduction
Red Hat Java Update and Quarkus IntroductionRed Hat Java Update and Quarkus Introduction
Red Hat Java Update and Quarkus Introduction
 
Development with JavaFX 9 in JDK 9.0.1
Development with JavaFX 9 in JDK 9.0.1Development with JavaFX 9 in JDK 9.0.1
Development with JavaFX 9 in JDK 9.0.1
 
Current & Future Use-Cases of OpenDaylight
Current & Future Use-Cases of OpenDaylightCurrent & Future Use-Cases of OpenDaylight
Current & Future Use-Cases of OpenDaylight
 
DEVNET-1006 Getting Started with OpenDayLight
DEVNET-1006	Getting Started with OpenDayLightDEVNET-1006	Getting Started with OpenDayLight
DEVNET-1006 Getting Started with OpenDayLight
 
OpenStack with OpenDaylight
OpenStack with OpenDaylightOpenStack with OpenDaylight
OpenStack with OpenDaylight
 
Cisco Cloupia uic product overview and demo presentation
Cisco Cloupia uic product overview and demo presentationCisco Cloupia uic product overview and demo presentation
Cisco Cloupia uic product overview and demo presentation
 
5 cisco open_stack
5 cisco open_stack5 cisco open_stack
5 cisco open_stack
 
Warsaw MuleSoft Meetup - Runtime Fabric
Warsaw MuleSoft Meetup - Runtime FabricWarsaw MuleSoft Meetup - Runtime Fabric
Warsaw MuleSoft Meetup - Runtime Fabric
 
Akka.Net and .Net Core - The Developer Conference 2018 Florianopolis
Akka.Net and .Net Core - The Developer Conference 2018 FlorianopolisAkka.Net and .Net Core - The Developer Conference 2018 Florianopolis
Akka.Net and .Net Core - The Developer Conference 2018 Florianopolis
 
Speeding up Programs with OpenACC in GCC
Speeding up Programs with OpenACC in GCCSpeeding up Programs with OpenACC in GCC
Speeding up Programs with OpenACC in GCC
 

Dernier

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Dernier (20)

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 

Boosting your HTML Apps – Overview of OpenCL and Hello World of WebCL

  • 1. Boosting HTML Apps with WebCL Janakiram Raghumandala December 10, 2013 DevCon 2013: Dec 9th–11th 2013, Monte Carlo
  • 2. What is WebCL? • WebCL is an open API to achieve heterogeneous parallel computing in HTML5 web browsers. • It is a JavaScript binding to OpenCL • Generally used with Canvas element, SIMD computing, particle simulation • WebCL enables of computeintensive web applications. Source Name Placement 2 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 3. Contents – Motivation – OpenCL, The Underlying Concept – WebCL – Performance Comparison & Demos – Summary, current status and References 3 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 4. Contents – Motivation – OpenCL, The Underlying Concept – WebCL – Performance Comparison & Demos – Summary, current status and References 4 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 5. Motivation • New mobile apps with high-computing demands – Augmented Reality – Video Processing – 3D Games – lots of calculations and physics – Computational photography 5 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 6. Motivation… • GPU delivers improved FLOPS • 100 times faster 6 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 7. Motivation… • Higher FLOPS per WATT • • • • • RED – CPUs ORANGE – APUs Light BLUE – ARM GREEN – Grid Processors YELLO – GPUs • Top-left is the preferred zone 7 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 8. Webcentric Applications Webcentric Applications 8 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 9. Motivation… growing web centric applications 9 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 10. Contents – Motivation – OpenCL, The Underlying Concept – WebCL – Performance Comparison & Demos – Summary, current status and References 10 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 11. OpenCL • • • • 11 C-based cross-platform programming interface Kernels Run-time or build-time compilation Rich set of built-in functions, cross, dot, sin, cos, log … | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 12. OpenCL… • Host contains one or more OpenCL Devices (aka Compute Devices) • A Computer Device contains one or more Compute Units ~ Cores 12 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 13. OpenCL… • A Compute Unit contains one or more processing elements ~ Intel’s hyper threads • Processing elements execute code as SIMD 13 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 14. OpenCL… Work Items & Work Groups 14 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 15. OpenCL… Execution Model • Kernel – Basic unit of executable code, similar to a C function • Program – Collection of kernels and functions called by kernels – Analogous to a dynamic library (run-time linking) • Command Queue – Applications queue kernels and data transfers – Performed in-order or out-of-order • Work-item ~ Processing Element ~ Thread – An execution of a kernel by a processing element (~ thread) • Work-group ~ Compute Unit ~ Core – A collection of related work-items that execute on a single compute unit (~ core) 15 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 16. OpenCL… Execution Model • Kernel – Basic program element, similar to a C function • Program – Collection of kernels and functions called by kernels – Analogous to a dynamic library (run-time linking) • Command Queue – Applications queue kernels and data transfers – Performed in-order or out-of-order • Work-item ~ Processing Element ~ Thread – An execution of a kernel by a processing element (~ thread) • Work-group ~ Compute Unit ~ Core – A collection of related work-items that execute on a single compute unit (~ core) 16 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 17. OpenCL… Memory Model • • • • 17 | © 2013 Cisco and/or its affiliates. All rights reserved. Private Memory per PE Local Memory per Compute Unit Global Memory per Device Host Memory per Host System
  • 18. OpenCL:Kernels OpenCL: Kernels • OpenCL kernel – Defined on an N-dimensional computation domain – Execute a kernel at each point of the computation domain Traditional Loop void vectorMult( const float* a, const float* b, float* c, const unsigned int count) { for(int i=0; i<count; i++) c[i] = a[i] * b[i]; } 18 Data Parallel OpenCL Kernel kernel void vectorMult( global const float* a, global const float* b, global float* c) | © 2013 Cisco and/or its affiliates. All rights reserved. { int id = get_global_id(0); c[id] = a[id] * b[id]; }
  • 19. OpenCL:Execution of Programs - Steps 1. Query host for OpenCL devices. 2. Create a context to associate OpenCL devices. 3. Create programs for execution on one or more associated devices. 4. From the programs, select kernels to execute. 5. Create memory objects accessible from the host and/or the device. 6. Copy memory data to the device as needed. 7. Provide kernels to the command queue for execution. 8. Copy results from the device to the host. 19 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 20. Contents – Motivation – OpenCL, The Underlying Concept – WebCL – Performance Comparison & Demos – Summary, current status and References 20 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 21. WebCL • WebCL concepts were deliberately made similar to the OpenCL model – Programs – Kernels – Linking – Memory Transfers 21 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 22. WebCL:Programming • • • • Initialization Create kernel Run kernel WebCL image object creation – From Uint8Array – From <img>, <canvas>, or <video> – From WebGL vertex buffer – From WebGL texture • WebGL vertex animation • WebGL texture animation 22 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 23. WebCL:Initialization <script>
 var platforms = WebCL.getPlatforms(); var devices = platforms[0].getDevices(WebCL.DEVICE_TYPE_GPU); var context = WebCL.createContext({ WebCLDevice: devices[0] } ); var queue = context.createCommandQueue(); </script> 23 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 24. WebCL:Creating a kernel <script id="squareProgram" type="x-kernel"> __kernel square( __global float* input, __global float* output, const unsigned int count) { int i = get_global_id(0); if(i < count) output[i] = input[i] * input[i]; } </script> <script>
 var programSource = getProgramSource("squareProgram"); var program = context.createProgram(programSource); program.build(); var kernel = program.createKernel("square"); </script> 24 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 25. WebCL:Running WebCL Kernes <script> ... var inputBuf context.createBuffer(WebCL.MEM_READ_ONLY, Float32Array.BYTES_PER_ELEMENT * count);
var outputBuf = context.createBuffer(WebCL.MEM_WRITE_ONLY, Float32Array.BYTES_PER_ELEMENT * count); var data = new Float32Array(count); // populate data ... queue.enqueueWriteBuffer(inputBuf, data, true); //Last arg indicates blocking API kernel.setKernelArg(0, inputBuf); kernel.setKernelArg(1, outputBuf); kernel.setKernelArg(2, count, WebCL.KERNEL_ARG_INT); var workGroupSize = kernel.getWorkGroupInfo(devices[0], WebCL.KERNEL_WORK_GROUP_SIZE); queue.enqueueNDRangeKernel(kernel, [count], [workGroupSize]); queue.finish(); //This API blocks queue.enqueueReadBuffer(outputBuf, data, true); //Last arg indicates blocking API </script> 25 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 26. Contents – Motivation – OpenCL, The Underlying Concept – WebCL – Performance Comparison & Demos – Summary, current status and References 26 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 27. DevCon 2013: Dec 9th–11th 2013, Monte Carlo
  • 28. Contents – Motivation – OpenCL, The Underlying Concept – WebCL – Performance Comparison & Demos – Summary, current status and References 28 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 29. WebCL:Summary JavaScript binding for OpenCL Provides high performance parallel processing on multicore CPU & GPGPU – Portable and efficient access to heterogeneous multicore devices – Provides a single coherent standard across desktop and mobile devices WebCL HW and SW requirements – Need a modified browser with WebCL support – Hardware, driver, and runtime support for OpenCL WebCL stays close to the OpenCL standard – Preserves developer familiarity and facilitates adoption – Allows developers to translate their OpenCL knowledge to the web environment – Easier to keep OpenCL and WebCL in sync, as the two evolve Intended to be an interface above OpenCL Facilitates layering of higher level abstractions on top of WebCL API 29 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 30. WebCL:Goals • Compute intensive web applications – High Performance – Platform independent – Standards-compliant – Ease of application development 30 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 31. WebCL Standardization – Current Status • Work in progress • Working-draft is available 31 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 32. References • • • • 32 OpenCL 2.0 Tutorial https://cvs.khronos.org/svn/repos/registry/trunk/public/webcl/spec/latest/index.html - WebCL Working draft http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf- OpenCL 1.1 Specification http://streamcomputing.eu/blog/2012-08-27/processors-that-can-do-20-gflops-watt/ | © 2013 Cisco and/or its affiliates. All rights reserved.

Notes de l'éditeur

  1. Typically there will be hundres of Processing elements
  2. Square function is anOpenCL C based on C99 (ie., ISO/IEC 9899:1999)
  3. ----- Meeting Notes (29/11/13 16:11) -----Slow - - lively interactive, explain challenges,show that you are confident,