Data Analysis with Python
Happy Pi Day!
>>> sum(4./x if i%2 == 0 else -4./x for
...     i, x in enumerate(range(1, 10000000, 2)))
About Me
Co-chair of Utah Python. Consultant with 14 years of
Python experience across Data Science, BI, Web,
Open Source Stack Management, and Search.
Why Python?
● Simple
● One-stop shop
● Ubiquitous
More About Me
Analysis of Utah Avalanches
● Frequency
● Location
● Causes
Github Repo
Available at
● Acquire Data
● Clean Data
● Visualize Data
● Analyze Data
● Present
● Automate
Acquire Data
● Inspect HTML
● Program
<div class="content">
<div class="view view-avalanches view-id-avalanches view-display-id-page_1>
<div class="view-content">
<table class="views-table cols-7" >
<th class="views-field views-field-field-occurrence-date" >Date</th>
<th class="views-field views-field-field-region-forecaster" >Region</th>
<th class="views-field views-field-field-region-forecaster-1" >Place</th>
<th class="views-field views-field-field-trigger" >Trigger</th>
<th class="views-field views-field-field-killed" >Number Killed</th>
<th class="views-field views-field-view-node" ></th>
<th class="views-field views-field-field-coordinates" >Coordinates</th>
<tr class="odd views-row-first">
<td class="views-field views-field-field-occurrence-date" >
<span class="date-display-single" property="dc:date" datatype="xsd:dateTime" content="2015-03-04T00:00:00-
<td class="views-field views-field-field-region-forecaster" >Ogden</td>
<td class="views-field views-field-field-region-forecaster-1" >Hells Canyon</td>
<td class="views-field views-field-field-trigger" >Snowboarder</td>
<td class="views-field views-field-field-killed" >1</td>
<td class="views-field views-field-view-node" >
<a href="/avalanches/23779">Details</a></td>
● Requests
● BeautifulSoup4
$ pip install requests beautifulsoup4
import requests as r
from bs4 import BeautifulSoup
url = ''
req = r.get(url)
data = req.text
Scraping items
● Find <div class="content">
● Find <tr>'s
● Find <td>'s
– Find Names - end of class attribute views-field-
– Find Values - string of <td>
● Also get details URL from <td class='views-
Code to Scrape
def get_info(data):
    soup = BeautifulSoup(data)
    content = soup.find(id="content")
    trs = content.find_all('tr')
    res = []
    for tr in trs:
        tds = tr.find_all('td')
        row = {}
        for td in tds:
            name, value = get_field_name_value(td)
            if not name:
                continue  # cell has no recognizable field class
            row[name] = value
        if row:
            res.append(row)  # header rows have no <td> cells and are skipped
    return res
Code to Scrape
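def get_field_name_value(elem):
    tags = elem.get('class')
    start = 'views-field-field-'
    for t in tags:
        if t.startswith(start):
            # The value expression is cut off in the slide; joining the
            # cell's strings is an assumed reconstruction ("string of <td>").
            return t[len(start):], ''.join(elem.stripped_strings)
        elif t == 'views-field-view-node':
            return 'url', elem.a['href']
    return None, None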
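The extraction steps above can be tried offline. The following is a minimal sketch that swaps BeautifulSoup for the standard library's html.parser so it runs with no extra installs. The RowParser class and the SAMPLE markup are illustrative assumptions modeled on the table excerpt shown earlier; they are not the code from the talk.

```python
from html.parser import HTMLParser

PREFIX = 'views-field-field-'


class RowParser(HTMLParser):
    """Collect one {field_name: text} dict per <tr> that contains named <td> cells."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # row being built, or None outside a <tr>
        self._field = None  # current field name, or None outside a named <td>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'tr':
            self._row = {}
        elif tag == 'td' and self._row is not None:
            self._field = None
            for cls in (attrs.get('class') or '').split():
                if cls.startswith(PREFIX):
                    # Field name is the tail of the class attribute.
                    self._field = cls[len(PREFIX):]
                    self._row[self._field] = ''
                elif cls == 'views-field-view-node':
                    self._field = 'url'
        elif tag == 'a' and self._field == 'url' and self._row is not None:
            self._row['url'] = attrs.get('href')

    def handle_data(self, data):
        # Accumulate cell text, ignoring the link label in the url cell.
        if self._row is not None and self._field not in (None, 'url'):
            self._row[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag == 'td':
            self._field = None
        elif tag == 'tr' and self._row is not None:
            if self._row:  # header rows produce an empty dict and are dropped
                self.rows.append(self._row)
            self._row = None


# Made-up sample shaped like the listing page excerpt.
SAMPLE = '''
<table>
<tr><th class="views-field views-field-field-trigger">Trigger</th></tr>
<tr class="odd">
<td class="views-field views-field-field-region-forecaster">Ogden</td>
<td class="views-field views-field-field-region-forecaster-1">Hells Canyon</td>
<td class="views-field views-field-field-trigger">Snowboarder</td>
<td class="views-field views-field-field-killed">1</td>
<td class="views-field views-field-view-node"><a href="/avalanches/23779">Details</a></td>
</tr>
</table>
'''

parser = RowParser()
parser.feed(SAMPLE)
rows = parser.rows
print(rows)
```

BeautifulSoup remains the more convenient choice for real pages; the sketch only shows that the field name comes from the class attribute and the value from the cell text.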
Scraping Details
<div id="content" class="column"><div class="section">
<a id="main-content"></a>
<span class="title"><h1>Avalanche: East Kessler</h1></span>
<div class="region region-content">
<div id="block-system-main" class="block block-system">
<div class="content">
<div id="node-23838" class="node node-avalanche node-full node-full clearfix"
about="/avalanches/23838" typeof="sioc:Item foaf:Document">
<span property="dc:title" content="Avalanche: East Kessler" class="rdf-meta element-
hidden"></span><span property="sioc:num_replies" content="0" datatype="xsd:integer" class="rdf-meta
<div class="field field-name-field-observation-date field-type-datetime field-label-above">
<div class="field-label">Observation Date<span class="field-label-colon">:&nbsp;</span></div>
<div class="field-items">
<div class="field-item even"><span class="date-display-single" property="dc:date"
datatype="xsd:dateTime" content="2015-03-05T00:00:00-07:00">Thursday, March 5, 2015</span></div>
Scraping Details
● Field class==field (need to use class_ in
BeautifulSoup because class is a Python keyword)
● Find field-label for key
● Find field-item for value
Scraping Details
def get_avalanche_details(url, rows):
    res = []
    for item in rows:
        req = r.get(url + item['url'])
        data = req.text
        soup = BeautifulSoup(data)
        content = soup.find(id='content')
        field_divs = content.find_all(class_='field')
        for div in field_divs:
            key_elem = div.find(class_='field-label')
            if key_elem is None:
                print("NONE!!!", div)
                continue
            key = ''.join(key_elem.stripped_strings)
            try:
                value_elem = div.find(class_='field-item')
                value = ''.join(value_elem.stripped_strings).replace(u'\xa0', u' ')
            except AttributeError as e:
                print(e, div)
                continue
            if key in item:
                continue  # keep the value already scraped from the listing
            item[key] = value
        res.append(item)
    return res
BS Notes
Can be annoying to find strings:
>>> from bs4 import BeautifulSoup
>>> s = BeautifulSoup('<div>foo<div>bar</div></div>')
>>> s
>>> s.string # This bothers me! None!
>>> s.strings
<generator object _all_strings at 0x...>
>>> list(s.strings)
[u'foo', u'bar']
BS Notes
Might need to deal with Unicode (\xa0 is the Latin-1
non-breaking space):
value = ''.join(value_elem.stripped_strings).replace(u'\xa0', u' ')
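As a small illustration of that cleanup, here is a sketch of a helper that normalizes scraped text. The name clean_text and the sample string are assumptions made for this example; it also strips the zero-width space (U+200B), the character that shows up in the CSV traceback later in the talk.

```python
def clean_text(text):
    """Replace non-breaking spaces with plain spaces and drop zero-width spaces."""
    return text.replace('\xa0', ' ').replace('\u200b', '').strip()


# Made-up sample resembling a scraped field label and value.
raw = 'Observation Date:\xa0Thursday, March 5, 2015\u200b'
cleaned = clean_text(raw)
print(cleaned)
```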
Other Tools
Scrapy - Framework for crawling the
web using Python
Convert to csv
Use pandas:
details = get_avalanche_details(base, items[:size])
df = pd.DataFrame(details)
Unicode bytes!
Traceback (most recent call last):
File "", line 73, in <module>
crawl('/tmp/ava.csv', 2)
File "", line 69, in crawl
lib.write_csv_rows(, ix, self.nlevels,
self.cols, self.writer)
File "pandas/lib.pyx", line 978, in
pandas.lib.write_csv_rows (pandas/lib.c:16858)
UnicodeEncodeError: 'ascii' codec can't encode character
u'\u200b' in position 70: ordinal not in range(128)
Use pandas to encode as UTF-8:
details = get_avalanche_details(
base, items[:size])
df = pd.DataFrame(details)
df.to_csv(outname, encoding='utf-8')
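To see why the explicit encoding matters, the sketch below reproduces the failure with the standard library alone: a zero-width space cannot be encoded as ASCII, but it writes cleanly once the output stream is UTF-8. The row contents are made up for illustration, and the csv module stands in for pandas here.

```python
import csv
import io

# One made-up row containing a zero-width space (U+200B).
rows = [{'Comments': 'Deep slab\u200b release', 'Killed': 1}]

# ASCII cannot represent U+200B, which is what the traceback reports.
try:
    rows[0]['Comments'].encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Writing through a UTF-8 text wrapper succeeds.
buf = io.BytesIO()
text = io.TextIOWrapper(buf, encoding='utf-8', newline='')
writer = csv.DictWriter(text, fieldnames=['Comments', 'Killed'])
writer.writeheader()
writer.writerows(rows)
text.flush()
encoded = buf.getvalue()

print(ascii_ok)
print(encoded)
```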
Clean Data
Inspecting Data
● Spreadsheet
● pandas
● pandas + iPython Notebook
Pandas Interlude
Table with columns as Series:
df = {
    'cols': [
        {'name': 'growth',
         'data': [.5, .7, 1.2]},
        {'name': 'Name',
         'data': ['Paul', 'George', 'Ringo']},
    ]
}
In Pandas:
>>> df = pd.DataFrame({
... 'growth':[.5, .7, 1.2],
... 'Name':['Paul', 'George', 'Ringo'] })
>>> df
Name growth
0 Paul 0.5
1 George 0.7
2 Ringo 1.2
[3 rows x 2 columns]
Can create from:
● rows (list of dicts)
● columns (dicts of lists)
● csv file (pd.read_csv)
● from NumPy ndarray
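The four construction routes listed above can be sketched as follows. The sample values reuse the names from the previous slide; the in-memory CSV text and the 'rank' column are assumptions added only to keep the example self-contained.

```python
import io

import numpy as np
import pandas as pd

# Rows: a list of dicts, one dict per record.
from_rows = pd.DataFrame([
    {'Name': 'Paul', 'growth': 0.5},
    {'Name': 'George', 'growth': 0.7},
])

# Columns: a dict of lists, one list per column.
from_cols = pd.DataFrame({
    'Name': ['Paul', 'George'],
    'growth': [0.5, 0.7],
})

# CSV: read_csv accepts a path or any file-like object.
from_csv = pd.read_csv(io.StringIO('Name,growth\nPaul,0.5\nGeorge,0.7\n'))

# NumPy: a 2-D ndarray plus explicit column names.
from_array = pd.DataFrame(np.array([[0.5, 1.0], [0.7, 2.0]]),
                          columns=['growth', 'rank'])

print(from_rows)
print(from_array)
```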
Data for slides:
>>> df = pd.DataFrame({
... 'fname': list('ABCDEF'),
... 'lname': list('MNOPQR'),
... 'test1': list(range(80,85)) + [None],
... 'test2': range(80,92,2)})
>>> df
fname lname test1 test2
0 A M 80 80
1 B N 81 82
2 C O 82 84
3 D P 83 86
4 E Q 84 88
5 F R NaN 90
Two Axes:
● Axis 0 - Index
● Axis 1 - Columns
>>> df.axes[0] # the index
Int64Index([0, 1, 2, 3, 4, 5], dtype='int64')
>>> df.axes[1] # columns
Index([u'fname', u'lname', u'test1', u'test2'], dtype='object')
Index & Columns
>>> df.index
Int64Index([0, 1, 2, 3, 4, 5], dtype='int64')
>>> df.columns
Index([u'fname', u'lname', u'test1', u'test2'], dtype='object')
Examining Data
Listing columns:
>>> df.columns
Index([u'fname', u'lname', u'test1', u'test2'], dtype='object')
Describing data:
>>> df.describe()
test1 test2
count 5.000000 6.000000
mean 82.000000 85.000000
std 1.581139 3.741657
min 80.000000 80.000000
25% 81.000000 82.500000
50% 82.000000 85.000000
75% 83.000000 87.500000
max 84.000000 90.000000
[8 rows x 2 columns]
Viewing the data (use .to_string() if needed):
>>> df
fname lname test1 test2
0 A M 80 80
1 B N 81 82
2 C O 82 84
3 D P 83 86
4 E Q 84 88
5 F R NaN 90
Pull out a column (Series):
>>> df.test1 # or df['test1']
0 80
1 81
2 82
3 83
4 84
5 NaN
Name: test1, dtype: float64
Median of a column (Series):
>>> df.test1.median()
Quick correlation:
>>> df.test1.corr(df.test2)
(Thus concludes our 
interlude) Back to 
>>> import pandas as pd
>>> df = pd.read_csv('/tmp/ava.csv')
>>> df.describe()
Unnamed: 0 Buried - Fully: Buried - Partly: Carried: Caught: 
count 20.00000 17.000000 3 20.000000 20.000000
mean 9.50000 1.117647 1 1.200000 1.300000
std 5.91608 0.332106 0 0.523148 0.656947
min 0.00000 1.000000 1 1.000000 1.000000
25% 4.75000 1.000000 1 1.000000 1.000000
50% 9.50000 1.000000 1 1.000000 1.000000
75% 14.25000 1.000000 1 1.000000 1.000000
max 19.00000 2.000000 1 3.000000 3.000000
Elevation: Injured: Killed: Slope Angle: Video: killed
count 20.000000 3 20.000000 15.000000 0 20.000000
mean 9520.000000 1 1.100000 36.200000 NaN 1.100000
std 1022.689951 0 0.307794 7.692297 NaN 0.307794
min 6400.000000 1 1.000000 10.000000 NaN 1.000000
25% 8925.000000 1 1.000000 36.000000 NaN 1.000000
50% 9800.000000 1 1.000000 38.000000 NaN 1.000000
75% 10200.000000 1 1.000000 39.500000 NaN 1.000000
max 10900.000000 1 2.000000 45.000000 NaN 2.000000
● Look at column types .dtypes
● Inspect columns col.describe(), 
● Tweak/create columns ...
Inspect Column types
>>> df.dtypes
Unnamed: 0 int64
Accident and Rescue Summary: object
Aspect: object
Avalanche Problem: object
Avalanche Type: object
Buried - Fully: float64
Buried - Partly: float64
Carried: int64
Caught: int64
Comments: object
Coordinates: object
Depth: object
Elevation: int64
Injured: float64
Killed: int64
Location Name or Route: object
Observation Date: object
Observer Name: object
Occurence Time: object
Occurrence Date: object
Region: object
Slope Angle: float64
Snow Profile Comments: object
Terrain Summary: object
Trigger: object
Trigger: additional info: object
Vertical: object
Video: float64
Weak Layer: object
Weather Conditions and History: object
Width: object
coordinates object
killed int64
occurrence-date object
region-forecaster object
region-forecaster-1 object
trigger object
url object
dtype: object
In Data Science, 80% of time spent prepare data, 
20% of time spent complain about need for 
prepare data
Some of the object (string, date, non-numeric)
columns need to be converted to numeric or other
types. Some are free-form, others are categorical.
Column Names
Get rid of those pesky colons:
>>> df2 = df.rename(columns={x:x.replace(
... ':', '')
... for x in df.columns})
>>> df2['Aspect'].value_counts()
Northeast 6
North 5
East 5
Southeast 2
West 1
Northwest 1
dtype: int64
>>> df2['Avalanche Problem'].value_counts()
Persistent Slab 4
Storm Slab 1
Deep Slab 1
Wind Slab 1
dtype: int64
>>> df2['Avalanche Type'].value_counts()
Hard Slab 12
Soft Slab 7
Cornice Fall 1
dtype: int64
Adjust Depth
>>> df2.Depth
0 3'
1 4'
2 4'
3 18"
4 8"
5 2'
6 3'
7 2'
8 16"
9 3'
10 2.5'
11 16"
12 NaN
13 3.5'
14 8'
15 3.5'
16 3'
17 2'
18 4'
19 4.5'
Name: Depth, dtype: object
Adjust Depth
>>> import re
>>> def to_inches(orig):
...     """
...     >>> to_inches("3'")
...     36.0
...     """
...     r = r'''(((\d*\.)?\d*)')?(((\d*\.)?\d*)")?'''
...     regex = re.compile(r)
...     txt = str(orig)
...     if txt == 'nan':
...         return orig
...     match = regex.match(txt)
...     groups = match.groups()
...     feet = groups[1] or 0
...     inches = groups[4] or 0
...     return float(feet) * 12 + float(inches)
>>> df2['depth_inches'] = df2.Depth.apply(to_inches)
Some values are missing
.describe() only works for numeric columns
Can use .interpolate, .fillna, .dropna
Some values are missing
>>> df2.depth_inches
0 36
1 48
2 48
3 18
4 8
5 24
6 36
7 24
8 16
9 36
10 30
11 16
12 NaN
13 42
14 96
15 42
16 36
17 24
18 48
19 54
Name: depth_inches, dtype: float64
Some values are missing
>>> df2.depth_inches.loc[12]
>>> df2.depth_inches.interpolate().loc[12]
>>> df2.depth_inches.mean()
>>> df2.depth_inches.median()
>>> df2.depth_inches.dropna().loc[12]
Traceback (most recent call last):
KeyError: 12
.interpolate() is linear by default but supports other methods
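The three options for missing values can be compared on a tiny Series. This is a sketch with four made-up entries; the numbers mirror the depths around the missing row (16, NaN, 42, 96), but the index labels here are 0 to 3 rather than the real ones.

```python
import pandas as pd

depth = pd.Series([16.0, None, 42.0, 96.0])

# Linear interpolation fills the gap with the midpoint of its neighbours.
interpolated = depth.interpolate()

# fillna with the median of the known values (16, 42, 96).
filled = depth.fillna(depth.median())

# dropna removes the row, so its index label no longer exists.
dropped = depth.dropna()

print(interpolated[1])      # 29.0
print(filled[1])            # 42.0
print(list(dropped.index))  # [0, 2, 3]
```

Which option is appropriate depends on the column: interpolation assumes neighbouring rows are related, which may not hold for independent avalanche reports.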
Replace NaN with Median
df2['depth_inches'] = df2.depth_inches.fillna(
    df2.depth_inches.median())
Date Munging
>>> df2['Occurrence Date']
0 Wednesday, March 4, 2015
1 Friday, March 7, 2014
2 Sunday, February 9, 2014
3 Saturday, February 8, 2014
4 Thursday, April 11, 2013
5 Friday, March 1, 2013
6 Friday, January 18, 2013
7 Saturday, March 3, 2012
8 Thursday, February 23, 2012
9 Sunday, February 5, 2012
10 Saturday, January 28, 2012
11 Sunday, November 13, 2011
12 Saturday, March 26, 2011
13 Friday, November 26, 2010
14 Sunday, April 4, 2010
15 Friday, January 29, 2010
16 Wednesday, January 27, 2010
17 Sunday, January 24, 2010
18 Tuesday, December 30, 2008
19 Wednesday, December 24, 2008
Name: Occurrence Date, dtype: object
Date Munging
>>> pd.to_datetime(df2['Occurrence Date'])
0 2015-03-04
1 2014-03-07
2 2014-02-09
3 2014-02-08
4 2013-04-11
5 2013-03-01
6 2013-01-18
7 2012-03-03
8 2012-02-23
9 2012-02-05
10 2012-01-28
11 2011-11-13
12 2011-03-26
13 2010-11-26
14 2010-04-04
15 2010-01-29
16 2010-01-27
17 2010-01-24
18 2008-12-30
19 2008-12-24
Name: Occurrence Date, dtype: datetime64[ns]
Date Munging
Might be useful to have day of week as well (Monday is the day to
>>> df2['dow'] = df2['Occurrence Date'].apply(lambda x:
>>> df2.dow.value_counts()
Friday 5
Sunday 5
Saturday 4
Wednesday 3
Thursday 2
Tuesday 1
dtype: int64
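The day-of-week step can be sketched with the standard library. The format string below is an assumption that matches dates written like "Wednesday, March 4, 2015", and it expects English day and month names (the default C locale); Counter stands in for value_counts. The four sample dates are taken from the listing above.

```python
from collections import Counter
from datetime import datetime

dates = [
    'Wednesday, March 4, 2015',
    'Friday, March 7, 2014',
    'Sunday, February 9, 2014',
    'Friday, March 1, 2013',
]

# Parse the long-form date strings into datetime objects.
parsed = [datetime.strptime(d, '%A, %B %d, %Y') for d in dates]

# Derive the weekday name from each parsed date and tally them.
day_names = [p.strftime('%A') for p in parsed]
counts = Counter(day_names)

print(parsed[0].date())
print(counts.most_common())
```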
Fill Vertical
Replace 'Unknown' with median:
df2['vert'] = df2.Vertical.str.replace('Unknown',
df2['vert'] = df2.vert.fillna(df2.vert.median())
0 NaN
1 40.812120000000, -110.906296000000
2 39.585986000000, -111.270003000000
3 40.482366000000, -111.648088000000
4 40.629000000000, -111.666412000000
5 39.043600000000, -111.519000000000
6 NaN
7 38.539320000000, -109.209852000000
8 40.653034000000, -111.592255000000
9 38.716456000000, -111.721988000000
10 40.624442000000, -111.669588000000
11 40.568491000000, -111.652937000000
12 39.372824000000, -111.422482000000
13 40.847320000000, -111.015129000000
14 41.050424000000, -111.844082000000
15 40.856199868806, -111.754991041400
16 40.617112000000, -111.623840000000
17 41.215563000000, -111.873307000000
18 40.871988000000, -110.974016000000
19 41.711752000000, -111.717181000000
Name: coordinates, dtype: object
df2['lat'] = df2.coordinates.apply(
lambda x: float(x.split(',')[0]) 
if str(x) != 'nan' else float('nan'))
df2['lon'] = df2.coordinates.apply(
lambda x: float(x.split(',')[1]) 
if str(x) != 'nan' else float('nan'))
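The two lambdas can also be folded into one helper that parses the pair once. This is a sketch; split_coords is a hypothetical name, and it treats any non-string value (such as NaN) as missing.

```python
import math


def split_coords(value):
    """Return (lat, lon) as floats, or (nan, nan) when the value is missing."""
    if not isinstance(value, str):
        return float('nan'), float('nan')
    lat, lon = (float(part) for part in value.split(','))
    return lat, lon


print(split_coords('40.812120000000, -110.906296000000'))
print(split_coords(float('nan')))
```

With pandas, one possible use is `df2['lat'], df2['lon'] = zip(*df2.coordinates.map(split_coords))`.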
Missing Data
We don't have:
● Temperature
● Weather (current, previous day)
Visualize Data
Use iPython Notebook (v3 is called jupyter) for 
$ pip install "ipython[notebook]"
$ ipython notebook
iPython Notebook 
Install cartopy, pip fails, github checkout
$ sudo apt-get install libgeos-dev libproj-dev
$ pip install shapely pyshp
$ git clone
$ cd cartopy
$ python setup.py install
Hard to get contour....
Enter gmaps
$ pip install gmaps
notebook code:
import gmaps
d2 = [x for x in zip(df.lat, df.lon) if
      str(x[0]) != 'nan']
Enter Folium
Wraps leaflet.js
from IPython.display import HTML
import folium

def inline_map(map):
    """
    Embeds the HTML source of the map directly into the IPython notebook.
    This method will not work if the map depends on any files (json data). Also this uses
    the HTML5 srcdoc attribute, which may not be supported in all browsers.
    """
    return HTML('<iframe srcdoc="{srcdoc}" style="width: 100%; height: 510px; border: none"></iframe>'
                .format(srcdoc=map.HTML.replace('"', '&quot;')))

def summary(i, row):
    return "<b>{} {} {} {}</b> <p>{}</p>".format(i, row['year'],
        row['Trigger'], row['Location Name or Route'],
        row['Accident and Rescue Summary'])

map = folium.Map(location=d2[4], zoom_start=10, tiles='Stamen Terrain', height=700)
for i, row in df2.iterrows():
    # print(row.lat, row.lon)
    if str(row.lat) == 'nan' or row.lat == 0:
        continue  # skip rows without usable coordinates
    map.simple_marker([row.lat, row.lon], popup=summary(i, row))
Go-to visualization to see the lay of the land
Easy with matplotlib (or pandas integration):
import matplotlib.pyplot as plt
df2.vert.hist() # using pandas
Recommended wrapper on top of matplotlib. 
Violin plots, faceted plots, + more
● Statsmodels
● scipy (includes scipy.stats)
● scikit-learn
● gensim
● SpaCy
>>> df2.Killed.sum()
>>> len(df2)
df3 = df2.groupby('year').sum().
sb.regplot(x='year', y="count",
data=df3, lowess=0, marker='x',
See previous folium code
>>> df2.Trigger.value_counts()
Skier 40
Snowmobiler 25
Snowboarder 13
Natural 6
Unknown 3
Hiker 3
Snowshoer 1
dtype: int64
import math
import random

def to_rad(d):
    return d * math.pi / 180

ax = plt.subplot(111)
for i, row in df2.iterrows():
    jitter = (random.random() - .5) * .2
    plt.plot([0, 1], [0, math.tan(to_rad(row.slope + jitter))],
             alpha=.3, color='b', linewidth=1)
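The geometry behind that plot is that a slope of angle θ rises tan(θ) units for every unit of horizontal run. A minimal sketch, using the standard library's math.radians in place of to_rad; slope_rise is a name made up for this example.

```python
import math


def slope_rise(angle_degrees, run=1.0):
    """Vertical rise over a horizontal run for a slope at the given angle."""
    return run * math.tan(math.radians(angle_degrees))


# 38 degrees is the median slope angle in the describe() output.
print(round(slope_rise(38), 2))
print(round(slope_rise(45), 2))
```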
>>> df2.Aspect.value_counts()
Northeast 24
North 14
East 9
Northwest 9
West 3
Southeast 3
South 1
dtype: int64
Crawling all avalanches takes ~2.5 min.
Manage workers to perform tasks in parallel
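One way to run the detail requests in parallel is a thread pool. The sketch below uses the standard library's concurrent.futures, which is an assumption and may not be the tool the talk had in mind; fetch_details is a hypothetical stand-in that sleeps instead of making an HTTP request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_details(path):
    """Hypothetical stand-in for one detail-page request (no network access)."""
    time.sleep(0.05)  # simulate network latency
    return {'url': path, 'status': 'ok'}


paths = ['/avalanches/%d' % i for i in range(20)]

start = time.time()
with ThreadPoolExecutor(max_workers=10) as pool:
    # map() preserves input order even though the calls overlap in time.
    results = list(pool.map(fetch_details, paths))
elapsed = time.time() - start

print(len(results), 'pages in', round(elapsed, 2), 'seconds')
```

Threads suit this workload because the time is spent waiting on the network. Keep max_workers modest so the crawl stays polite to the server.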
● ML (scikit-learn)
● Database (sqlalchemy)
● Web (django, flask, ...)

Analysis of Fatal Utah Avalanches with Python. From Scraping, Analysis, to Infographic