SlideShare une entreprise Scribd logo
1  sur  54
Télécharger pour lire hors ligne
Next generation programming in R
Florian Uhlitz
uhlitz@hu-berlin.de
uhlitz.github.io
%>%
magrittr
readr
tidyr
dplyr
%>%
load data
reshape data
manipulate data
Stefan Milton Bache,
University of Southern Denmark
Hadley Wickham,
Rice University, RStudio
Recent developments in the R environment
magrittr
readr tidyr dplyr
%>%
load reshape manipulate%>% %>%
Toolbox for data wrangling in R
data wrangling
adapted from H. Wickham
magrittr
readr tidyr dplyr
%>%
load reshape manipulate%>% %>%
Toolbox for data wrangling in R
data wrangling
model
visualise
adapted from H. Wickham
report
magrittr
readr tidyr dplyr
%>%
load reshape manipulate%>% %>%
Toolbox for data wrangling in R
data wrangling
model
visualise
adapted from H. Wickham
report
magrittr
readr tidyr dplyr
%>%
load reshape manipulate%>% %>%
Toolbox for data wrangling in R
data wrangling
model
visualise
base
ggplot2
rmarkdown
broom
adapted from H. Wickham
data analysis
report
magrittr
readr tidyr dplyr
%>%
load reshape manipulate%>% %>%
Toolbox for data wrangling in R
data wrangling
model
visualise
base
ggplot2
rmarkdown
broom
adapted from H. Wickham
magrittr
In a pipe, the result of the left hand statement is handed
over to the function on the right hand side:
…similar to Unix pipe operator |
f(x, y)
x %>% f(y)
f(x, y, z)
x %>% f(y, z)
f2(f1(x), y)
f1(x) %>% f2(y)
magrittr
nested 

functions
magrittr
nested 

functions
chain of

functions
readr, readxl, haven
readr::read_csv()
readr::read_tsv()
readr::read_log()
readr::read_delim()
readr::read_fwf()
readr::read_table()
readxl::read_excel()
haven::read_sas()
haven::read_spss()
haven::read_stata()
tidyr
gather() spread()
Reshaping
adapted from rstudio.com/resources/cheatsheets/
tidyr
gather() spread()
separate() unite()
Reshaping
adapted from rstudio.com/resources/cheatsheets/
dplyr
filter(x > 1) select(B, C, E)
A B C D E B C Ex
1
2
3
1
x
2
3
Subsetting
adapted from rstudio.com/resources/cheatsheets/
dplyr
Transforming Summarising
1
2
3
x
4
5
6
y
1
2
3
x
4
5
6
y
5
7
9
z
mutate(z = x + y) summarise(A = sum(x), B = sum(y))
1
2
3
x
4
5
6
y
6
A
15
B
adapted from rstudio.com/resources/cheatsheets/
dplyr
Transforming Summarising
1
2
3
x
4
5
6
y
1
2
3
x
4
5
6
y
5
7
9
z
mutate(z = x + y) summarise(A = sum(x), B = sum(y))
1
2
3
x
4
5
6
y
6
A
15
B
group_by() %>% mutate() group_by() %>% summarise()
adapted from rstudio.com/resources/cheatsheets/
What`s tidy data?
KEEP

CALMAND
TIDY

UP
»Happy families are all alike; every unhappy
family is unhappy in its own way.«




Leo Tolstoy
Anna Karenina principle
»Tidy data sets are all alike; every messy
data set is messy in its own way.«




Hadley Wickham
Tidy data principle
Tidy data definition
Wickham, H. (2014). Tidy Data. Journal of Statistical Software
read_excel(“untidy_data.xlsx”) %>%
set_colnames(mynames) %>%
slice(1:36) %>%
fill(group, condition) %>%
separate(group, into = c(“Gene”, “Mutation”, “clone”), sep = “_”) %>%
write_tsv(“tidy_data.tsv”)
read_excel(“untidy_data.xlsx”) %>%
set_colnames(mynames) %>%
slice(1:36) %>%
fill(group, condition) %>%
separate(group, into = c(“Gene”, “Mutation”, “clone”), sep = “_”) %>%
write_tsv(“tidy_data.tsv”)
read_excel
read_excel %>% set_colnames
read_excel %>% set_colnames %>% tail
read_excel %>% set_colnames
read_excel %>% set_colnames %>% slice
read_excel %>% set_colnames %>% slice %>% fill
read_excel %>% set_colnames %>% slice %>% fill %>% select
read_excel %>% set_colnames %>% slice %>% fill %>% select %>% distinct
read_excel %>% set_colnames %>% slice %>% fill %>% select %>% distinct %>%

separate
read_excel %>% set_colnames %>% slice %>% fill %>% select %>% distinct %>%

separate
Caution!

readr, tidy & dplyr do “clever” stuff.
(heuristics like predicting a column class by
looking at the first 1000 entries)
read_excel %>% set_colnames %>% slice %>% fill %>% select %>% distinct

separate
read_excel %>% set_colnames %>% slice %>% fill %>% select %>% distinct

separate %>% unite
read_excel %>% set_colnames %>% slice %>% fill %>% select %>% distinct

separate %>% unite
Tidy data definition
Wickham, H. (2014). Tidy Data. Journal of Statistical Software
read_tsv
read_tsv %>% gather(key, value, -variable)
read_tsv %>% gather %>% spread(key, value)
read_tsv %>% gather
read_tsv %>% gather %>% filter
read_tsv %>% gather %>% filter %>% group_by
read_tsv %>% gather %>% filter %>% group_by %>% summarise %>% arrange
read_tsv %>% gather %>% filter %>% group_by %>% summarise %>% arrange
read_tsv %>% gather %>% filter %>% group_by %>% summarise %>% arrange
Data Wrangling
with dplyr and tidyr
Cheat Sheet
RStudio® is a trademark of RStudio, Inc. • CC BY RStudio • info@rstudio.com • 844-448-1212 • rstudio.com
Syntax - Helpful conventions for wrangling
dplyr::tbl_df(iris)
Converts data to tbl class. tbl’s are easier to examine than
data frames. R displays only the data that fits onscreen:
dplyr::glimpse(iris)
Information dense summary of tbl data.
utils::View(iris)
View data set in spreadsheet-like display (note capital V).
Source: local data frame [150 x 5]
Sepal.Length Sepal.Width Petal.Length
1 5.1 3.5 1.4
2 4.9 3.0 1.4
3 4.7 3.2 1.3
4 4.6 3.1 1.5
5 5.0 3.6 1.4
.. ... ... ...
Variables not shown: Petal.Width (dbl),
Species (fctr)
dplyr::%>%
Passes object on left hand side as first argument (or .
argument) of function on righthand side.
"Piping" with %>% makes code more readable, e.g.
iris %>%
group_by(Species) %>%
summarise(avg = mean(Sepal.Width)) %>%
arrange(avg)
x %>% f(y) is the same as f(x, y)
y %>% f(x, ., z) is the same as f(x, y, z )
Reshaping Data - Change the layout of a data set
Subset Observations (Rows) Subset Variables (Columns)
F M A
Each variable is saved
in its own column
F M A
Each observation is
saved in its own row
In a tidy
data set: &
Tidy Data - A foundation for wrangling in R
Tidy data complements R’s vectorized
operations. R will automatically preserve
observations as you manipulate variables.
No other format works as intuitively with R.
FAM
M * A
*
tidyr::gather(cases, "year", "n", 2:4)
Gather columns into rows.
tidyr::unite(data, col, ..., sep)
Unite several columns into one.
dplyr::data_frame(a = 1:3, b = 4:6)
Combine vectors into data frame
(optimized).
dplyr::arrange(mtcars, mpg)
Order rows by values of a column
(low to high).
dplyr::arrange(mtcars, desc(mpg))
Order rows by values of a column
(high to low).
dplyr::rename(tb, y = year)
Rename the columns of a data
frame.
tidyr::spread(pollution, size, amount)
Spread rows into columns.
tidyr::separate(storms, date, c("y", "m", "d"))
Separate one column into several.
wwwwwwA1005A1013A1010A1010
wwp110110100745451009
wwp110110100745451009 wwp110110100745451009wwp110110100745451009
wppw11010071007110451009100945
wwwww110110110110110 wwww
dplyr::filter(iris, Sepal.Length > 7)
Extract rows that meet logical criteria.
dplyr::distinct(iris)
Remove duplicate rows.
dplyr::sample_frac(iris, 0.5, replace = TRUE)
Randomly select fraction of rows.
dplyr::sample_n(iris, 10, replace = TRUE)
Randomly select n rows.
dplyr::slice(iris, 10:15)
Select rows by position.
dplyr::top_n(storms, 2, date)
Select and order top n entries (by group if grouped data).
< Less than != Not equal to
> Greater than %in% Group membership
== Equal to is.na Is NA
<= Less than or equal to !is.na Is not NA
>= Greater than or equal to &,|,!,xor,any,all Boolean operators
Logic in R - ?Comparison, ?base::Logic
dplyr::select(iris, Sepal.Width, Petal.Length, Species)
Select columns by name or helper function.
Helper functions for select - ?select
select(iris, contains("."))
Select columns whose name contains a character string.
select(iris, ends_with("Length"))
Select columns whose name ends with a character string.
select(iris, everything())
Select every column.
select(iris, matches(".t."))
Select columns whose name matches a regular expression.
select(iris, num_range("x", 1:5))
Select columns named x1, x2, x3, x4, x5.
select(iris, one_of(c("Species", "Genus")))
Select columns whose names are in a group of names.
select(iris, starts_with("Sepal"))
Select columns whose name starts with a character string.
select(iris, Sepal.Length:Petal.Width)
Select all columns between Sepal.Length and Petal.Width (inclusive).
select(iris, -Species)
Select all columns except Species.
Learn more with browseVignettes(package = c("dplyr", "tidyr")) • dplyr 0.4.0• tidyr 0.2.0 • Updated: 1/15
wwwwwwA1005A1013A1010A1010
devtools::install_github("rstudio/EDAWR") for data sets
rstudio.com/resources/cheatsheets/
Next Generation Programming in R

Contenu connexe

Tendances

R Programming: Importing Data In R
R Programming: Importing Data In RR Programming: Importing Data In R
R Programming: Importing Data In RRsquared Academy
 
5 R Tutorial Data Visualization
5 R Tutorial Data Visualization5 R Tutorial Data Visualization
5 R Tutorial Data VisualizationSakthi Dasans
 
Merge Multiple CSV in single data frame using R
Merge Multiple CSV in single data frame using RMerge Multiple CSV in single data frame using R
Merge Multiple CSV in single data frame using RYogesh Khandelwal
 
Presentation R basic teaching module
Presentation R basic teaching modulePresentation R basic teaching module
Presentation R basic teaching moduleSander Timmer
 
3 R Tutorial Data Structure
3 R Tutorial Data Structure3 R Tutorial Data Structure
3 R Tutorial Data StructureSakthi Dasans
 
2. R-basics, Vectors, Arrays, Matrices, Factors
2. R-basics, Vectors, Arrays, Matrices, Factors2. R-basics, Vectors, Arrays, Matrices, Factors
2. R-basics, Vectors, Arrays, Matrices, Factorskrishna singh
 
Introduction to pandas
Introduction to pandasIntroduction to pandas
Introduction to pandasPiyush rai
 
Python for R Users
Python for R UsersPython for R Users
Python for R UsersAjay Ohri
 
Introduction to Pandas and Time Series Analysis [PyCon DE]
Introduction to Pandas and Time Series Analysis [PyCon DE]Introduction to Pandas and Time Series Analysis [PyCon DE]
Introduction to Pandas and Time Series Analysis [PyCon DE]Alexander Hendorf
 
Spark 4th Meetup Londond - Building a Product with Spark
Spark 4th Meetup Londond - Building a Product with SparkSpark 4th Meetup Londond - Building a Product with Spark
Spark 4th Meetup Londond - Building a Product with Sparksamthemonad
 
Morel, a Functional Query Language
Morel, a Functional Query LanguageMorel, a Functional Query Language
Morel, a Functional Query LanguageJulian Hyde
 
R data-import, data-export
R data-import, data-exportR data-import, data-export
R data-import, data-exportFAO
 
Introduction to data.table in R
Introduction to data.table in RIntroduction to data.table in R
Introduction to data.table in RPaul Richards
 
Stata Programming Cheat Sheet
Stata Programming Cheat SheetStata Programming Cheat Sheet
Stata Programming Cheat SheetLaura Hughes
 
R Workshop for Beginners
R Workshop for BeginnersR Workshop for Beginners
R Workshop for BeginnersMetamarkets
 

Tendances (20)

R Programming: Importing Data In R
R Programming: Importing Data In RR Programming: Importing Data In R
R Programming: Importing Data In R
 
R seminar dplyr package
R seminar dplyr packageR seminar dplyr package
R seminar dplyr package
 
R Language Introduction
R Language IntroductionR Language Introduction
R Language Introduction
 
R programming language
R programming languageR programming language
R programming language
 
5 R Tutorial Data Visualization
5 R Tutorial Data Visualization5 R Tutorial Data Visualization
5 R Tutorial Data Visualization
 
Merge Multiple CSV in single data frame using R
Merge Multiple CSV in single data frame using RMerge Multiple CSV in single data frame using R
Merge Multiple CSV in single data frame using R
 
Presentation R basic teaching module
Presentation R basic teaching modulePresentation R basic teaching module
Presentation R basic teaching module
 
3 R Tutorial Data Structure
3 R Tutorial Data Structure3 R Tutorial Data Structure
3 R Tutorial Data Structure
 
2. R-basics, Vectors, Arrays, Matrices, Factors
2. R-basics, Vectors, Arrays, Matrices, Factors2. R-basics, Vectors, Arrays, Matrices, Factors
2. R-basics, Vectors, Arrays, Matrices, Factors
 
Introduction to pandas
Introduction to pandasIntroduction to pandas
Introduction to pandas
 
Python for R Users
Python for R UsersPython for R Users
Python for R Users
 
Introduction to Pandas and Time Series Analysis [PyCon DE]
Introduction to Pandas and Time Series Analysis [PyCon DE]Introduction to Pandas and Time Series Analysis [PyCon DE]
Introduction to Pandas and Time Series Analysis [PyCon DE]
 
Spark 4th Meetup Londond - Building a Product with Spark
Spark 4th Meetup Londond - Building a Product with SparkSpark 4th Meetup Londond - Building a Product with Spark
Spark 4th Meetup Londond - Building a Product with Spark
 
Morel, a Functional Query Language
Morel, a Functional Query LanguageMorel, a Functional Query Language
Morel, a Functional Query Language
 
Data Analysis in Python
Data Analysis in PythonData Analysis in Python
Data Analysis in Python
 
R data-import, data-export
R data-import, data-exportR data-import, data-export
R data-import, data-export
 
Introduction to data.table in R
Introduction to data.table in RIntroduction to data.table in R
Introduction to data.table in R
 
Stata Programming Cheat Sheet
Stata Programming Cheat SheetStata Programming Cheat Sheet
Stata Programming Cheat Sheet
 
R Workshop for Beginners
R Workshop for BeginnersR Workshop for Beginners
R Workshop for Beginners
 
R code for data manipulation
R code for data manipulationR code for data manipulation
R code for data manipulation
 

En vedette

Data and donuts: Data Visualization using R
Data and donuts: Data Visualization using RData and donuts: Data Visualization using R
Data and donuts: Data Visualization using RC. Tobin Magle
 
Self Learning Credit Scoring Model Presentation
Self Learning Credit Scoring Model PresentationSelf Learning Credit Scoring Model Presentation
Self Learning Credit Scoring Model PresentationSwitchPitch
 
Aire - Alternative Credit Scoring (TechStars DemoDay - Sep 2014)
Aire - Alternative Credit Scoring (TechStars DemoDay - Sep 2014)Aire - Alternative Credit Scoring (TechStars DemoDay - Sep 2014)
Aire - Alternative Credit Scoring (TechStars DemoDay - Sep 2014)Aire
 
20160611 kintone Café 高知 Vol.3 LT資料
20160611 kintone Café 高知 Vol.3 LT資料20160611 kintone Café 高知 Vol.3 LT資料
20160611 kintone Café 高知 Vol.3 LT資料安隆 沖
 
WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016
WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016
WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016Penn State University
 
R Brown-bag seminars : Seminar-8
R Brown-bag seminars : Seminar-8R Brown-bag seminars : Seminar-8
R Brown-bag seminars : Seminar-8Muhammad Nabi Ahmad
 
Análisis espacial con R (asignatura de Master - UPM)
Análisis espacial con R (asignatura de Master - UPM)Análisis espacial con R (asignatura de Master - UPM)
Análisis espacial con R (asignatura de Master - UPM)Vladimir Gutierrez, PhD
 
Paquete ggplot - Potencia y facilidad para generar gráficos en R
Paquete ggplot - Potencia y facilidad para generar gráficos en RPaquete ggplot - Potencia y facilidad para generar gráficos en R
Paquete ggplot - Potencia y facilidad para generar gráficos en RNestor Montaño
 
Presentation DataScoring: Big Data and credit score
Presentation DataScoring: Big Data and credit scorePresentation DataScoring: Big Data and credit score
Presentation DataScoring: Big Data and credit scoreAnton Vokrug
 
Learn to use dplyr (Feb 2015 Philly R User Meetup)
Learn to use dplyr (Feb 2015 Philly R User Meetup)Learn to use dplyr (Feb 2015 Philly R User Meetup)
Learn to use dplyr (Feb 2015 Philly R User Meetup)Fan Li
 
WF ED 540, Class Meeting 3 - select, filter, arrange, 2016
WF ED 540, Class Meeting 3 - select, filter, arrange, 2016WF ED 540, Class Meeting 3 - select, filter, arrange, 2016
WF ED 540, Class Meeting 3 - select, filter, arrange, 2016Penn State University
 
WF ED 540, Class Meeting 3 - mutate and summarise, 2016
WF ED 540, Class Meeting 3 - mutate and summarise, 2016WF ED 540, Class Meeting 3 - mutate and summarise, 2016
WF ED 540, Class Meeting 3 - mutate and summarise, 2016Penn State University
 
R Programming: Learn To Manipulate Strings In R
R Programming: Learn To Manipulate Strings In RR Programming: Learn To Manipulate Strings In R
R Programming: Learn To Manipulate Strings In RRsquared Academy
 
Reproducible Research in R and R Studio
Reproducible Research in R and R StudioReproducible Research in R and R Studio
Reproducible Research in R and R StudioSusan Johnston
 

En vedette (18)

Data and donuts: Data Visualization using R
Data and donuts: Data Visualization using RData and donuts: Data Visualization using R
Data and donuts: Data Visualization using R
 
Fast data munging in R
Fast data munging in RFast data munging in R
Fast data munging in R
 
Self Learning Credit Scoring Model Presentation
Self Learning Credit Scoring Model PresentationSelf Learning Credit Scoring Model Presentation
Self Learning Credit Scoring Model Presentation
 
Generating random primes
Generating random primesGenerating random primes
Generating random primes
 
Aire - Alternative Credit Scoring (TechStars DemoDay - Sep 2014)
Aire - Alternative Credit Scoring (TechStars DemoDay - Sep 2014)Aire - Alternative Credit Scoring (TechStars DemoDay - Sep 2014)
Aire - Alternative Credit Scoring (TechStars DemoDay - Sep 2014)
 
20160611 kintone Café 高知 Vol.3 LT資料
20160611 kintone Café 高知 Vol.3 LT資料20160611 kintone Café 高知 Vol.3 LT資料
20160611 kintone Café 高知 Vol.3 LT資料
 
WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016
WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016
WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016
 
Rlecturenotes
RlecturenotesRlecturenotes
Rlecturenotes
 
R Brown-bag seminars : Seminar-8
R Brown-bag seminars : Seminar-8R Brown-bag seminars : Seminar-8
R Brown-bag seminars : Seminar-8
 
Análisis espacial con R (asignatura de Master - UPM)
Análisis espacial con R (asignatura de Master - UPM)Análisis espacial con R (asignatura de Master - UPM)
Análisis espacial con R (asignatura de Master - UPM)
 
Paquete ggplot - Potencia y facilidad para generar gráficos en R
Paquete ggplot - Potencia y facilidad para generar gráficos en RPaquete ggplot - Potencia y facilidad para generar gráficos en R
Paquete ggplot - Potencia y facilidad para generar gráficos en R
 
Presentation DataScoring: Big Data and credit score
Presentation DataScoring: Big Data and credit scorePresentation DataScoring: Big Data and credit score
Presentation DataScoring: Big Data and credit score
 
Learn to use dplyr (Feb 2015 Philly R User Meetup)
Learn to use dplyr (Feb 2015 Philly R User Meetup)Learn to use dplyr (Feb 2015 Philly R User Meetup)
Learn to use dplyr (Feb 2015 Philly R User Meetup)
 
WF ED 540, Class Meeting 3 - select, filter, arrange, 2016
WF ED 540, Class Meeting 3 - select, filter, arrange, 2016WF ED 540, Class Meeting 3 - select, filter, arrange, 2016
WF ED 540, Class Meeting 3 - select, filter, arrange, 2016
 
WF ED 540, Class Meeting 3 - mutate and summarise, 2016
WF ED 540, Class Meeting 3 - mutate and summarise, 2016WF ED 540, Class Meeting 3 - mutate and summarise, 2016
WF ED 540, Class Meeting 3 - mutate and summarise, 2016
 
R Programming: Learn To Manipulate Strings In R
R Programming: Learn To Manipulate Strings In RR Programming: Learn To Manipulate Strings In R
R Programming: Learn To Manipulate Strings In R
 
Reproducible Research in R and R Studio
Reproducible Research in R and R StudioReproducible Research in R and R Studio
Reproducible Research in R and R Studio
 
Dplyr and Plyr
Dplyr and PlyrDplyr and Plyr
Dplyr and Plyr
 

Similaire à Next Generation Programming in R

Data Wrangling with dplyr and tidyr Cheat Sheet
Data Wrangling with dplyr and tidyr Cheat SheetData Wrangling with dplyr and tidyr Cheat Sheet
Data Wrangling with dplyr and tidyr Cheat SheetDr. Volkan OBAN
 
R programming & Machine Learning
R programming & Machine LearningR programming & Machine Learning
R programming & Machine LearningAmanBhalla14
 
Broom: Converting Statistical Models to Tidy Data Frames
Broom: Converting Statistical Models to Tidy Data FramesBroom: Converting Statistical Models to Tidy Data Frames
Broom: Converting Statistical Models to Tidy Data FramesWork-Bench
 
Practical data science_public
Practical data science_publicPractical data science_public
Practical data science_publicLong Nguyen
 
fINAL Lesson_5_Data_Manipulation_using_R_v1.pptx
fINAL Lesson_5_Data_Manipulation_using_R_v1.pptxfINAL Lesson_5_Data_Manipulation_using_R_v1.pptx
fINAL Lesson_5_Data_Manipulation_using_R_v1.pptxdataKarthik
 
Lab 2: Classification and Regression Prediction Models, training and testing ...
Lab 2: Classification and Regression Prediction Models, training and testing ...Lab 2: Classification and Regression Prediction Models, training and testing ...
Lab 2: Classification and Regression Prediction Models, training and testing ...Yao Yao
 
Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016Spencer Fox
 
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Yao Yao
 
India software developers conference 2013 Bangalore
India software developers conference 2013 BangaloreIndia software developers conference 2013 Bangalore
India software developers conference 2013 BangaloreSatnam Singh
 
ComputeFest 2012: Intro To R for Physical Sciences
ComputeFest 2012: Intro To R for Physical SciencesComputeFest 2012: Intro To R for Physical Sciences
ComputeFest 2012: Intro To R for Physical Sciencesalexstorer
 
PPT ON MACHINE LEARNING by Ragini Ratre
PPT ON MACHINE LEARNING by Ragini RatrePPT ON MACHINE LEARNING by Ragini Ratre
PPT ON MACHINE LEARNING by Ragini RatreRaginiRatre
 
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)Serban Tanasa
 
Big Data Mining in Indian Economic Survey 2017
Big Data Mining in Indian Economic Survey 2017Big Data Mining in Indian Economic Survey 2017
Big Data Mining in Indian Economic Survey 2017Parth Khare
 

Similaire à Next Generation Programming in R (20)

Data Wrangling with dplyr and tidyr Cheat Sheet
Data Wrangling with dplyr and tidyr Cheat SheetData Wrangling with dplyr and tidyr Cheat Sheet
Data Wrangling with dplyr and tidyr Cheat Sheet
 
R programming & Machine Learning
R programming & Machine LearningR programming & Machine Learning
R programming & Machine Learning
 
Broom: Converting Statistical Models to Tidy Data Frames
Broom: Converting Statistical Models to Tidy Data FramesBroom: Converting Statistical Models to Tidy Data Frames
Broom: Converting Statistical Models to Tidy Data Frames
 
Practical data science_public
Practical data science_publicPractical data science_public
Practical data science_public
 
fINAL Lesson_5_Data_Manipulation_using_R_v1.pptx
fINAL Lesson_5_Data_Manipulation_using_R_v1.pptxfINAL Lesson_5_Data_Manipulation_using_R_v1.pptx
fINAL Lesson_5_Data_Manipulation_using_R_v1.pptx
 
R Cheat Sheet
R Cheat SheetR Cheat Sheet
R Cheat Sheet
 
Lab 2: Classification and Regression Prediction Models, training and testing ...
Lab 2: Classification and Regression Prediction Models, training and testing ...Lab 2: Classification and Regression Prediction Models, training and testing ...
Lab 2: Classification and Regression Prediction Models, training and testing ...
 
Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016
 
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
 
India software developers conference 2013 Bangalore
India software developers conference 2013 BangaloreIndia software developers conference 2013 Bangalore
India software developers conference 2013 Bangalore
 
R Basics
R BasicsR Basics
R Basics
 
Introduction to r
Introduction to rIntroduction to r
Introduction to r
 
ComputeFest 2012: Intro To R for Physical Sciences
ComputeFest 2012: Intro To R for Physical SciencesComputeFest 2012: Intro To R for Physical Sciences
ComputeFest 2012: Intro To R for Physical Sciences
 
PPT ON MACHINE LEARNING by Ragini Ratre
PPT ON MACHINE LEARNING by Ragini RatrePPT ON MACHINE LEARNING by Ragini Ratre
PPT ON MACHINE LEARNING by Ragini Ratre
 
User biglm
User biglmUser biglm
User biglm
 
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)
 
Big Data Mining in Indian Economic Survey 2017
Big Data Mining in Indian Economic Survey 2017Big Data Mining in Indian Economic Survey 2017
Big Data Mining in Indian Economic Survey 2017
 
Introduction to R
Introduction to RIntroduction to R
Introduction to R
 
NCCU: Statistics in the Criminal Justice System, R basics and Simulation - Pr...
NCCU: Statistics in the Criminal Justice System, R basics and Simulation - Pr...NCCU: Statistics in the Criminal Justice System, R basics and Simulation - Pr...
NCCU: Statistics in the Criminal Justice System, R basics and Simulation - Pr...
 
R for Statistical Computing
R for Statistical ComputingR for Statistical Computing
R for Statistical Computing
 

Dernier

ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
Machine learning classification ppt.ppt
Machine learning classification  ppt.pptMachine learning classification  ppt.ppt
Machine learning classification ppt.pptamreenkhanum0307
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 

Dernier (20)

ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Machine learning classification ppt.ppt
Machine learning classification  ppt.pptMachine learning classification  ppt.ppt
Machine learning classification ppt.ppt
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 

Next Generation Programming in R

  • 1. Next generation programming in R Florian Uhlitz uhlitz@hu-berlin.de uhlitz.github.io %>%
  • 2. magrittr readr tidyr dplyr %>% load data reshape data manipulate data Stefan Milton Bache, University of Southern Denmark Hadley Wickham, Rice University, RStudio Recent developments in the R environment
  • 3. magrittr readr tidyr dplyr %>% load reshape manipulate%>% %>% Toolbox for data wrangling in R data wrangling adapted from H. Wickham
  • 4. magrittr readr tidyr dplyr %>% load reshape manipulate%>% %>% Toolbox for data wrangling in R data wrangling model visualise adapted from H. Wickham
  • 5. report magrittr readr tidyr dplyr %>% load reshape manipulate%>% %>% Toolbox for data wrangling in R data wrangling model visualise adapted from H. Wickham
  • 6. report magrittr readr tidyr dplyr %>% load reshape manipulate%>% %>% Toolbox for data wrangling in R data wrangling model visualise base ggplot2 rmarkdown broom adapted from H. Wickham
  • 7. data analysis report magrittr readr tidyr dplyr %>% load reshape manipulate%>% %>% Toolbox for data wrangling in R data wrangling model visualise base ggplot2 rmarkdown broom adapted from H. Wickham
  • 8. magrittr In a pipe, the result of the left hand statement is handed over to the function on the right hand side: …similar to Unix pipe operator | f(x, y) x %>% f(y) f(x, y, z) x %>% f(y, z) f2(f1(x), y) f1(x) %>% f2(y)
  • 12. tidyr gather() spread() Reshaping adapted from rstudio.com/resources/cheatsheets/
  • 13. tidyr gather() spread() separate() unite() Reshaping adapted from rstudio.com/resources/cheatsheets/
  • 14. dplyr filter(x > 1) select(B, C, E) A B C D E B C Ex 1 2 3 1 x 2 3 Subsetting adapted from rstudio.com/resources/cheatsheets/
  • 15. dplyr Transforming Summarising 1 2 3 x 4 5 6 y 1 2 3 x 4 5 6 y 5 7 9 z mutate(z = x + y) summarise(A = sum(x), B = sum(y)) 1 2 3 x 4 5 6 y 6 A 15 B adapted from rstudio.com/resources/cheatsheets/
  • 16. dplyr Transforming Summarising 1 2 3 x 4 5 6 y 1 2 3 x 4 5 6 y 5 7 9 z mutate(z = x + y) summarise(A = sum(x), B = sum(y)) 1 2 3 x 4 5 6 y 6 A 15 B group_by() %>% mutate() group_by() %>% summarise() adapted from rstudio.com/resources/cheatsheets/
  • 18. »Happy families are all alike; every unhappy family is unhappy in its own way.« 
 
 Leo Tolstoy Anna Karenina principle
  • 19. »Tidy data sets are all alike; every messy data set is messy in its own way.« 
 
 Hadley Wickham Tidy data principle
  • 20. Tidy data definition Wickham, H. (2014). Tidy Data. Journal of Statistical Software
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27. read_excel(“untidy_data.xlsx”) %>% set_colnames(mynames) %>% slice(1:36) %>% fill(group, condition) %>% separate(group, into = c(“Gene”, “Mutation”, “clone”), sep = “_”) %>% write_tsv(“tidy_data.tsv”)
  • 28. read_excel(“untidy_data.xlsx”) %>% set_colnames(mynames) %>% slice(1:36) %>% fill(group, condition) %>% separate(group, into = c(“Gene”, “Mutation”, “clone”), sep = “_”) %>% write_tsv(“tidy_data.tsv”)
  • 34. read_excel %>% set_colnames %>% slice %>% fill
  • 35. read_excel %>% set_colnames %>% slice %>% fill %>% select
  • 36. read_excel %>% set_colnames %>% slice %>% fill %>% select %>% distinct
  • 37. read_excel %>% set_colnames %>% slice %>% fill %>% select %>% distinct %>%
 separate
  • 38. read_excel %>% set_colnames %>% slice %>% fill %>% select %>% distinct %>%
 separate Caution!
 readr, tidy & dplyr do “clever” stuff. (heuristics like predicting a column class by looking at the first 1000 entries)
  • 39. read_excel %>% set_colnames %>% slice %>% fill %>% select %>% distinct
 separate
  • 40. read_excel %>% set_colnames %>% slice %>% fill %>% select %>% distinct
 separate %>% unite
  • 41. read_excel %>% set_colnames %>% slice %>% fill %>% select %>% distinct
 separate %>% unite
  • 42.
  • 43. Tidy data definition Wickham, H. (2014). Tidy Data. Journal of Statistical Software
  • 45. read_tsv %>% gather(key, value, -variable)
  • 46. read_tsv %>% gather %>% spread(key, value)
  • 48. read_tsv %>% gather %>% filter
  • 49. read_tsv %>% gather %>% filter %>% group_by
  • 50. read_tsv %>% gather %>% filter %>% group_by %>% summarise %>% arrange
  • 51. read_tsv %>% gather %>% filter %>% group_by %>% summarise %>% arrange
  • 52. read_tsv %>% gather %>% filter %>% group_by %>% summarise %>% arrange
  • 53. Data Wrangling with dplyr and tidyr Cheat Sheet RStudio® is a trademark of RStudio, Inc. • CC BY RStudio • info@rstudio.com • 844-448-1212 • rstudio.com Syntax - Helpful conventions for wrangling dplyr::tbl_df(iris) Converts data to tbl class. tbl’s are easier to examine than data frames. R displays only the data that fits onscreen: dplyr::glimpse(iris) Information dense summary of tbl data. utils::View(iris) View data set in spreadsheet-like display (note capital V). Source: local data frame [150 x 5] Sepal.Length Sepal.Width Petal.Length 1 5.1 3.5 1.4 2 4.9 3.0 1.4 3 4.7 3.2 1.3 4 4.6 3.1 1.5 5 5.0 3.6 1.4 .. ... ... ... Variables not shown: Petal.Width (dbl), Species (fctr) dplyr::%>% Passes object on left hand side as first argument (or . argument) of function on righthand side. "Piping" with %>% makes code more readable, e.g. iris %>% group_by(Species) %>% summarise(avg = mean(Sepal.Width)) %>% arrange(avg) x %>% f(y) is the same as f(x, y) y %>% f(x, ., z) is the same as f(x, y, z ) Reshaping Data - Change the layout of a data set Subset Observations (Rows) Subset Variables (Columns) F M A Each variable is saved in its own column F M A Each observation is saved in its own row In a tidy data set: & Tidy Data - A foundation for wrangling in R Tidy data complements R’s vectorized operations. R will automatically preserve observations as you manipulate variables. No other format works as intuitively with R. FAM M * A * tidyr::gather(cases, "year", "n", 2:4) Gather columns into rows. tidyr::unite(data, col, ..., sep) Unite several columns into one. dplyr::data_frame(a = 1:3, b = 4:6) Combine vectors into data frame (optimized). dplyr::arrange(mtcars, mpg) Order rows by values of a column (low to high). dplyr::arrange(mtcars, desc(mpg)) Order rows by values of a column (high to low). dplyr::rename(tb, y = year) Rename the columns of a data frame. tidyr::spread(pollution, size, amount) Spread rows into columns. tidyr::separate(storms, date, c("y", "m", "d")) Separate one column into several. wwwwwwA1005A1013A1010A1010 wwp110110100745451009 wwp110110100745451009 wwp110110100745451009wwp110110100745451009 wppw11010071007110451009100945 wwwww110110110110110 wwww dplyr::filter(iris, Sepal.Length > 7) Extract rows that meet logical criteria. dplyr::distinct(iris) Remove duplicate rows. dplyr::sample_frac(iris, 0.5, replace = TRUE) Randomly select fraction of rows. dplyr::sample_n(iris, 10, replace = TRUE) Randomly select n rows. dplyr::slice(iris, 10:15) Select rows by position. dplyr::top_n(storms, 2, date) Select and order top n entries (by group if grouped data). < Less than != Not equal to > Greater than %in% Group membership == Equal to is.na Is NA <= Less than or equal to !is.na Is not NA >= Greater than or equal to &,|,!,xor,any,all Boolean operators Logic in R - ?Comparison, ?base::Logic dplyr::select(iris, Sepal.Width, Petal.Length, Species) Select columns by name or helper function. Helper functions for select - ?select select(iris, contains(".")) Select columns whose name contains a character string. select(iris, ends_with("Length")) Select columns whose name ends with a character string. select(iris, everything()) Select every column. select(iris, matches(".t.")) Select columns whose name matches a regular expression. select(iris, num_range("x", 1:5)) Select columns named x1, x2, x3, x4, x5. select(iris, one_of(c("Species", "Genus"))) Select columns whose names are in a group of names. select(iris, starts_with("Sepal")) Select columns whose name starts with a character string. select(iris, Sepal.Length:Petal.Width) Select all columns between Sepal.Length and Petal.Width (inclusive). select(iris, -Species) Select all columns except Species. Learn more with browseVignettes(package = c("dplyr", "tidyr")) • dplyr 0.4.0• tidyr 0.2.0 • Updated: 1/15 wwwwwwA1005A1013A1010A1010 devtools::install_github("rstudio/EDAWR") for data sets rstudio.com/resources/cheatsheets/