Jupyter notebooks are transforming the way we look at computing, coding, and science. But is this the only "data scientist experience" that this technology can provide? In this presentation we will look at how to create interactive web applications for data exploration and machine learning. In the background this code is still powered by the well-understood and well-documented Jupyter Notebooks.
Code on GitHub: https://github.com/natbusa/kernelgateway_demos
2. 2 Natalino Busa - @natbusa
LinkedIn + Twitter + GitHub:
@natbusa
DBS
Teradata
Cognitive Finance
ING Group
O’Reilly
Philips
3.
Icons made by Gregor Cresnar from www.flaticon.com, licensed under CC
Learning: The Scientific Method
Ørsted's "First Introduction to General Physics" (1811)
https://en.m.wikipedia.org/wiki/History_of_scientific_method
Hans Christian Ørsted
Diagram labels: observation, hypothesis, deduction, synthesis, experiment
6.
The Jupyter Project
http://jupyter.org
7.
Jupyter notebook: what is it?
The Jupyter Notebook
The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.
Uses include: data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more.
Credit: Jupyter project, extracted from http://jupyter.org/index.html
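On disk, such a document is a JSON file. The sketch below builds a minimal one with the Python standard library only. The cell contents are illustrative assumptions, and the field names follow the version 4 notebook format as commonly documented; treat it as an outline of the structure rather than a complete specification.

```python
import json


def markdown_cell(text):
    """Return a Markdown cell: explanatory text and equations live here."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """Return a code cell that has not been executed yet (no outputs)."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


# A notebook is a list of cells plus format and metadata fields.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        markdown_cell("# Explanatory text\nAn equation: $e^{i\\pi} + 1 = 0$"),
        code_cell("print(1 + 1)"),
    ],
}

# This string is what would be saved as an .ipynb file.
serialized = json.dumps(notebook, indent=1)
print(serialized)
```

Saving `serialized` to a file with an `.ipynb` extension should give a document that the Notebook web application can open.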
8.
Jupyter notebook: why?
Language of choice
The Notebook has support for over 40 programming languages, including those popular in Data Science such as Python, R, Julia and Scala.
Share notebooks
Notebooks can be shared with others using email, Dropbox, GitHub and the Jupyter Notebook Viewer.
Interactive widgets
Code can produce rich output such as images, videos, LaTeX, and JavaScript. Interactive widgets can be used to manipulate and visualize data in realtime.
Big data integration
Leverage big data tools, such as Apache Spark, from Python, R and Scala. Explore that same data with pandas, scikit-learn, ggplot2, dplyr, etc.
Credit: Jupyter project, extracted from http://jupyter.org/index.html
10.
Architecture of a Jupyter Notebook
[Diagram: Web Browser (Jupyter Notebook Web App) <-> HTTP and WebSockets <-> Jupyter Notebook Server <-> ØMQ <-> Kernel. The Notebook Server also reads and writes Notebook files.]
https://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html
11.
Architecture of a Jupyter Notebook
• Modular architecture:
Web App, Server, Kernel
• Kernels:
Python, R, Scala, Bash, SQL
• Web App:
Asynchronous, rich editing, syntax highlighting, export and share
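To make the server-to-kernel link more concrete, here is a minimal sketch of how a single "execute this code" message could be framed before it is sent over ØMQ. It follows the Jupyter messaging protocol as publicly documented (a delimiter frame, an HMAC-SHA256 signature, then four JSON frames), but it is only an illustration: the function name `build_execute_request`, the `demo` username, and the key `b"secret"` are assumptions made for this example, and no socket is opened.

```python
import hashlib
import hmac
import json
import uuid
from datetime import datetime, timezone

# Frame that separates routing identities from the message itself.
DELIMITER = b"<IDS|MSG>"


def build_execute_request(code, session_id, key):
    """Return the list of byte frames for one execute_request message."""
    header = {
        "msg_id": uuid.uuid4().hex,
        "session": session_id,
        "username": "demo",  # hypothetical user name
        "date": datetime.now(timezone.utc).isoformat(),
        "msg_type": "execute_request",
        "version": "5.3",
    }
    parent_header = {}  # empty: this message is not a reply
    metadata = {}
    content = {
        "code": code,
        "silent": False,
        "store_history": True,
        "user_expressions": {},
        "allow_stdin": False,
        "stop_on_error": True,
    }
    parts = [
        json.dumps(part).encode("utf-8")
        for part in (header, parent_header, metadata, content)
    ]
    # The signature covers the four JSON frames, in order.
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for part in parts:
        mac.update(part)
    signature = mac.hexdigest().encode("ascii")
    return [DELIMITER, signature] + parts


frames = build_execute_request("1 + 1", session_id="s-1", key=b"secret")
print(len(frames), "frames; message type:",
      json.loads(frames[2])["msg_type"])
```

In practice the Notebook Server handles this framing, which is why a web app only needs to speak HTTP or WebSockets to it.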
12.
Jupyter Notebook
● Narratives and Use Cases
Narratives are collaborative, shareable, publishable, and reproducible. We believe that Narratives help both yourself and other researchers by sharing your use of Jupyter projects, technical specifics of your deployment, and installation and configuration tips so that others can learn from your experiences.
From https://jupyter.readthedocs.io/en/latest/use-cases/content-user.html
13.
Jupyter is more than Notebooks
“What if I told you that the notebook is NOT the only sort of narrative that you can create with the Jupyter project?”
14.
Examples of Jupyter powered narratives
● O’Reilly Orioles
● Examples - build your own!
16.
Geolocated clustering and prediction services with scikit-learn
Learn how to build a venue recommender and a geofencing alerting engine using geolocated data, ML clustering algorithms, and scikit-learn.
17.
Build your own narrative!
What do you need?
1. Understand how to communicate with the Jupyter server
Two ways: WebSockets or HTTP API endpoints
2. Build your own web application
Many ways: e.g. Angular, Polymer, Dart, etc.
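As a rough illustration of the HTTP route, the sketch below shows a small Python client for an endpoint served by the gateway. The base URL `http://localhost:8888`, the path `/hello/ada`, and the `lang` query parameter are hypothetical; the real paths depend on what the seed notebook defines. A browser-side app would make the same request from JavaScript.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumed address of a locally running gateway (hypothetical).
GATEWAY = "http://localhost:8888"


def endpoint_url(base, path, **query):
    """Join base URL, path and optional query parameters into one URL."""
    url = base.rstrip("/") + "/" + quote(path.lstrip("/"))
    if query:
        url += "?" + urlencode(query)
    return url


def call_endpoint(base, path, **query):
    """GET the endpoint and decode its JSON response body."""
    with urlopen(endpoint_url(base, path, **query), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the URL needs no running server; call_endpoint does.
print(endpoint_url(GATEWAY, "/hello/ada", lang="en"))
```

`call_endpoint(GATEWAY, "/hello/ada")` would only succeed once a gateway is running and the notebook behind it exposes that path.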
18.
Demos: kernel gateway
Purpose:
- Understand how to expose API endpoints
- Build your own narrative!
- Productivity gain: faster app prototyping
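To give a feel for what exposing an endpoint looks like, here is a sketch of a notebook cell in the kernel gateway's notebook-http style: a comment names the HTTP verb and path, the request arrives as a JSON string in a variable called `REQUEST`, and whatever the cell prints becomes the response body. The path `/hello/:name`, the `greeting` argument, and the stand-in `REQUEST` value are assumptions for this example, so that the code also runs outside the gateway.

```python
import json

# Stand-in for the JSON string the gateway is expected to inject before
# running the cell; the exact keys shown here are an assumption.
REQUEST = json.dumps({
    "body": "",
    "args": {"greeting": ["hello"]},  # query arguments, as lists
    "path": {"name": "ada"},          # values bound from the URL path
    "headers": {},
})

# GET /hello/:name
req = json.loads(REQUEST)
name = req["path"]["name"]
greeting = req["args"].get("greeting", ["hello"])[0]

# The printed JSON is what a client would receive as the response.
response_body = json.dumps({"message": f"{greeting} {name}"})
print(response_body)
```

In a real notebook the annotation comment is the first line of its own cell and the `REQUEST` assignment is left out, since the gateway supplies it.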
22.
Dockerize your Jupyter gateway API
# Build the image and run it, publishing the gateway port
IMAGE=demos/kernel_gateway_demo
docker build -t "$IMAGE" .
docker run -p 8888:8888 "$IMAGE"
# Command run inside the container: serve the notebook as an HTTP API
jupyter kernelgateway \
  --KernelGatewayApp.ip=0.0.0.0 \
  --KernelGatewayApp.port=8888 \
  --KernelGatewayApp.api=notebook-http \
  --KernelGatewayApp.seed_uri=/srv/notebooks/autoscience.ipynb
23.
Big Data apps:
Dockerize your Jupyter gateway API with Toree
[Diagram: Web Browser (your own Web App) <-> HTTP REST API <-> Jupyter Kernel Gateway <-> ØMQ <-> Toree Kernel. Notebook files feed the Kernel Gateway. Label: Docker Containers.]
one web session = one server on a cloud
24.
Summary
• Jupyter notebook is a great way to create and share data-driven use cases and projects
• Jupyter is more than notebooks
– gateway, kernels, hub, etc.
• Narratives powered by Jupyter
– O’Reilly Orioles
– build your own narrative
25.
Resources
Jupyter
http://jupyter.org/index.html
https://jupyter.readthedocs.io/en/latest/index.html#
Jupyter Kernel Gateway
https://github.com/jupyter/kernel_gateway
http://jupyter-kernel-gateway.readthedocs.io/en/latest/
JupyterCon (first of its kind!)
https://conferences.oreilly.com/jupyter/jup-ny
Apache Toree (Spark Kernel)
https://toree.apache.org/
Web application dev
https://angular.io/
https://www.polymer-project.org/1.0/
Docker
https://github.com/jupyter/docker-stacks
https://www.docker.com/