Programming in R
(and some other stuff)
y.wurm@qmul.ac.uk
https://wurmlab.github.io
© Alex Wild & others
© National Geographic
Atta leaf-cutter ants
Oecophylla Weaver ants
© ameisenforum.de
© forestryimages.org © wynnie@flickr
Tofilski et al 2008
Forelius pusillus
Tofilski et al 2008
Forelius pusillus hides the nest entrance at night
Before
Workers staying outside die
“preventive self-sacrifice”
Dorylus driver ants: ants with no home
© BBC
Animal biomass (Brazilian rainforest), from Fittkau & Klinge 1973
[Figure: biomass by group: other insects; amphibians; reptiles; birds; mammals; earthworms; spiders; soil fauna excluding earthworms, ants & termites; ants & termites]
We use modern technologies to
understand insect societies.
• evolution of social behaviour
• molecules involved in social behaviour
• consequences of environmental change
Big data is invading biology
This changes everything.
Any lab can sequence anything!
http://gregoryzynda.com/ncbi/genome/python/2014/03/31/ncbi-genome.html
BIG
Big data is invading biology
• Genomics
• Cancer genomics
• Biodiversity assessments
• Stool microbiome sequencing
• Personalized medicine
• Sensor networks - e.g. tracking microclimates, recording sounds
• Huge medical studies
• Aerial surveys (Drones) - e.g. crop productivity; rainforest cover
• Camera traps
Learning to deal with big data takes time
Practicals
• Aim: get relevant data handling skills
• Doing things by hand:
• impossible?
• slow,
• error-prone,
• Automate!
• Basic programming
• in R
• no stats!
Why R?
😳😟
😴😡
😥
Practicals: contents
• Done:
• data accessing/subsetting
• New:
• search/replace
• regular expressions: text search on steroids
• New:
• functions: reusable pieces of work
• loops: repeating the same thing many times
• Friday: (Introduction to Unix & High performance computing)
• create a variable that contains the number 35
• create a variable that contains the string “I love tofu”
• give me a vector containing the sequence of numbers
from 5 to 11
• access the second number
• replace the second number with 42
• add 5 to the second number
• now add 5 to all numbers
• now add an extra number: 1999
• can you sum all the numbers?
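One possible way to work through the warm-up list above, as a minimal sketch. The variable names (`my_number`, `my_string`, `my_vector`, `total`) are illustrative choices rather than names prescribed by the practical.

```r
# A variable that contains the number 35
my_number <- 35

# A variable that contains a string
my_string <- "I love tofu"

# A vector containing the sequence of numbers from 5 to 11
my_vector <- 5:11

# Access the second number
second_number <- my_vector[2]

# Replace the second number with 42
my_vector[2] <- 42

# Add 5 to the second number only
my_vector[2] <- my_vector[2] + 5

# Add 5 to all numbers (arithmetic on vectors is element-wise)
my_vector <- my_vector + 5

# Add an extra number at the end
my_vector <- c(my_vector, 1999)

# Sum all the numbers
total <- sum(my_vector)
print(total)
```

Each step reassigns `my_vector`, so the final sum depends on running the steps in order.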
• give me a vector containing numbers from 5 to 11 (3 variants)
• creating a vector
> my_vector <- c(5, 6, 7, 8, 9, 10, 11)
> my_vector <- 5:11
> my_vector <- seq(from=5, to=11, by=1)
> my_vector
[1]  5  6  7  8  9 10 11
> length(my_vector)
[1] 7
> (10 > 30)
[1] FALSE
> my_vector > 8
[1] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE
> my_vector[my_vector > 8]
[1]  9 10 11
> other_vector <- my_vector[my_vector > 8]
> other_vector
[1]  9 10 11
> other_vector + 3
[1] 12 13 14
• accessing a subset of a vector
> big_vector <- 150:100
> big_vector
 [1] 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132
[20] 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113
[39] 112 111 110 109 108 107 106 105 104 103 102 101 100
> big_vector[5]
[1] 146
> mysubset <- big_vector[my_vector]
> mysubset
[1] 146 145 144 143 142 141 140
> big_vector > 130
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[49] FALSE FALSE FALSE
> subset(x = big_vector, subset = big_vector > 140)
 [1] 150 149 148 147 146 145 144 143 142 141
> big_vector[big_vector >= 140]
 [1] 150 149 148 147 146 145 144 143 142 141 140
> my_vector
[1]  5  6  7  8  9 10 11
Regular expressions (regex):
Text search on steroids.
who dat?
Regular expression                        Finds
David                                     David
Dav(e|(id))                               David, Dave
Dav(e|(id)|(ide)|o)                       David, Dave, Davide, Davo
At{1,2}enborough                          Attenborough, Atenborough
Atte[nm]borough                           Attenborough, Attemborough
At{1,2}[ei][nm]bo{0,1}ro((ugh)|w){0,1}    Atimbro, attenbrough, ateinborow
Easy counting, replacing all with “Sir David Attenborough”
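A small sketch of the counting and replacing idea using base R's `grepl()` and `gsub()`. The `mentions` vector is made-up example text, and `name_pattern` is an assumed combination of the first-name and surname patterns listed above, not a pattern taken from the slides.

```r
# Hypothetical example text containing several spellings of the name
mentions <- c("David Attenborough narrates",
              "Dave Atenborough was here",
              "Davide Attemborough",
              "no naturalist in this line")

# Which elements contain some spelling of the first name?
has_first_name <- grepl(pattern = "Dav(e|(id))", x = mentions)

# Counting: TRUE counts as 1 when summed
n_mentions <- sum(has_first_name)
print(n_mentions)

# Replacing: first name variant, a space, then a surname variant
name_pattern <- "Dav(e|(id)|(ide)|o) At{1,2}e[nm]borough"
cleaned <- gsub(pattern = name_pattern,
                replacement = "Sir David Attenborough",
                x = mentions)
print(cleaned)
```

Elements that do not match the pattern are returned unchanged by `gsub()`.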
Yes: "HATSOMIKTIP"
Yes: "HAVSONYYIKTIP"
Not: "HAVSQMIKTIP"
Regex special symbols
Regular expression    Finds                            Example
[aeiou]               any single vowel                 “e”
[aeiou]*              between 0 and infinity vowels    “eeooouuu”
[aeoiu]{1,3}          between 1 and 3 vowels           “oui”
a|i                   one of the 2 characters          “a”
((win)|(fail))        one of the two words in ()       “fail”
More Regex Special symbols
• Google “Regular expression cheat sheet”
• ?regexp
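Regular expression    Meaning (or synonym)
[:digit:]             [0-9]
[A-z]                 letters, i.e. [A-Za-z] (in ASCII order A-z also spans [ \ ] ^ _ ` so [A-Za-z] is the safer spelling)
\s                    whitespace
.                     any single character
.+                    one to many of anything
b*                    between 0 and infinity letter ‘b’
[^abc]                any character other than a, b or c
\(                    (
[:punct:]             any of these: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
Inside a pattern, classes such as [:digit:] and [:punct:] go within a bracket expression: [[:digit:]], [[:punct:]].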
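A short sketch of how a few of these symbols are written in R. The `samples` vector is invented for illustration. Two details are easy to miss: character classes such as `[:digit:]` go inside an outer pair of brackets, and a backslash in a pattern is doubled inside an R string.

```r
# Hypothetical example strings
samples <- c("ant 42", "bee", "wasp (queen)", "x")

# Character classes sit inside a bracket expression: [[:digit:]]
has_digit <- grepl(pattern = "[[:digit:]]", x = samples)

# Whitespace: the regex \s is typed "\\s" in an R string
has_space <- grepl(pattern = "\\s", x = samples)

# A literal opening parenthesis: the regex \( is typed "\\("
has_paren <- grepl(pattern = "\\(", x = samples)

# [^abc] matches any character other than a, b or c;
# deleting those leaves only the a, b and c characters
only_abc <- gsub(pattern = "[^abc]", replacement = "", x = samples)

print(has_digit)
print(has_space)
print(has_paren)
print(only_abc)
```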
You want to scan a protein sequence database for a
particular binding site. Type a single regular expression that
will match the first two of the following peptide sequences,
but NOT the last one:
"HATSOMIKTIP"
"HAVSONYYIKTIP"
"HAVSQMIKTIP"
(rubular)
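One possible answer, checked in R with `grepl()`. Many other patterns would work equally well; `binding_site` is just an illustrative name, and the pattern only encodes the differences visible in these three peptides.

```r
peptides <- c("HATSOMIKTIP", "HAVSONYYIKTIP", "HAVSQMIKTIP")

# T or V in third position, then "SO", then either M or NYY, then "IKTIP".
# The third peptide has SQ instead of SO, so it is rejected.
binding_site <- "HA[TV]SO(M|(NYY))IKTIP"

is_match <- grepl(pattern = binding_site, x = peptides)
print(is_match)
```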
Variants of a microsatellite sequence are responsible for
differential expression of vasopressin receptor, and in turn for
differences in social behaviour in voles & others. Create a regular
expression that finds AGAGAGAGAGAGAGAG dinucleotide
microsatellite repeats with lengths of 5 to 500
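A sketch of one way to express this, assuming "lengths of 5 to 500" means 5 to 500 consecutive copies of the AG unit. The two test sequences are made up. `perl = TRUE` is used here because PCRE is known to accept repeat counts this large; whether R's default regex engine does is not assumed.

```r
# Hypothetical sequences: the first has 5 AG repeats, the second only 4
sequences <- c("TTAGAGAGAGAGTT", "TTAGAGAGAGTT")

# (AG) groups the dinucleotide; {5,500} asks for 5 to 500 copies in a row
microsatellite <- "(AG){5,500}"

has_repeat <- grepl(pattern = microsatellite, x = sequences, perl = TRUE)
print(has_repeat)

# Extract the matched stretch from the first sequence and count its AG units
hit <- regmatches(sequences[1],
                  regexpr(microsatellite, sequences[1], perl = TRUE))
repeat_units <- nchar(hit) / 2
print(repeat_units)
```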
Again
Make a regular expression
• matching “LMTSOMIKTIP” and “LMVSONYYIKTIP” but not
“LMVSQMIKTIP”
• matching all variants of “ok” (e.g., “O.K.”, “Okay”…)
Ok… so how do we use this?
• ?grep
• ?gsub
Which species names include ‘y’?
Create a vector with only species names, but replace all ‘y’
with ‘Y’!
ants <- read.table("https://goo.gl/3Ek1dL")
colnames(ants) <- c("genus", "species")
Remove all vowels
Replace all vowels with ‘o’
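A sketch of how `grep()` and `gsub()` apply to this kind of task. So that it runs without the download, it uses a small stand-in data frame built from the six rows shown by `head(ants)` later in the slides. None of those six species names contains a ‘y’, so the search below looks for ‘u’ instead.

```r
# Stand-in for the downloaded table: only the first six rows
ants <- data.frame(
  genus   = c("Anergates", "Camponotus", "Crematogaster",
              "Formica", "Formica", "Formica"),
  species = c("atratulus", "sp.", "scutellaris",
              "aquilonia", "cunicularia", "exsecta"),
  stringsAsFactors = FALSE
)

# Which species names include 'u'? value = TRUE returns the names,
# not their positions
with_u <- grep(pattern = "u", x = ants$species, value = TRUE)

# Replace every 'u' with 'U'
shouted <- gsub(pattern = "u", replacement = "U", x = ants$species)

# Remove all vowels
no_vowels <- gsub(pattern = "[aeiou]", replacement = "", x = ants$species)

# Replace all vowels with 'o'
all_o <- gsub(pattern = "[aeiou]", replacement = "o", x = ants$species)

print(with_u)
print(no_vowels)
print(all_o)
```

The same calls work on the full table once `ants` has been read with `read.table()`.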
Functions
• R has many. e.g.: plot(), t.test()
• Making your own:
# growth_rates is assumed to be a named vector of growth rates, indexed by species name
tree_age_estimate <- function(diameter, species) {
  growth_rate <- growth_rates[ species ]
  age_estimate <- diameter / growth_rate
  return(age_estimate)
}
> tree_age_estimate(25, "White Oak")
+ 66
> tree_age_estimate(60, "Carya ovata")
+ 190
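The function relies on a `growth_rates` object that is not shown in the slides. A self-contained sketch, assuming `growth_rates` is a named numeric vector, could look like this. The two rates are placeholder values chosen only so the example runs; they are not real growth rates and will not reproduce the ages shown above.

```r
# Placeholder growth rates (diameter units per year); illustrative values only
growth_rates <- c("White Oak" = 0.5, "Carya ovata" = 0.25)

tree_age_estimate <- function(diameter, species) {
  # [[ ]] looks up one rate by species name and drops the name
  growth_rate <- growth_rates[[species]]
  age_estimate <- diameter / growth_rate
  return(age_estimate)
}

white_oak_age <- tree_age_estimate(25, "White Oak")
hickory_age <- tree_age_estimate(60, "Carya ovata")
print(white_oak_age)
print(hickory_age)
```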
Make a function
• That converts fahrenheit to celsius
(subtract 32 then divide the result by 1.8)
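One possible solution, following the recipe in the brackets. The function name `fahrenheit_to_celsius` is an illustrative choice.

```r
fahrenheit_to_celsius <- function(fahrenheit) {
  # Subtract 32, then divide the result by 1.8
  celsius <- (fahrenheit - 32) / 1.8
  return(celsius)
}

# Works on a single number or on a whole vector of temperatures
converted <- fahrenheit_to_celsius(c(32, 212))
print(converted)
```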
Loops
“for” Loop
> possible_colours <- c('blue', 'cyan', 'sky-blue', 'navy blue',
+     'steel blue', 'royal blue', 'slate blue', 'light blue',
+     'dark blue', 'prussian blue', 'indigo', 'baby blue', 'electric blue')
> possible_colours
 [1] "blue"          "cyan"          "sky-blue"      "navy blue"
 [5] "steel blue"    "royal blue"    "slate blue"    "light blue"
 [9] "dark blue"     "prussian blue" "indigo"        "baby blue"
[13] "electric blue"
> for (colour in possible_colours) {
+   print(paste("The sky is so, oh so", colour))
+ }
[1] "The sky is so, oh so blue"
[1] "The sky is so, oh so cyan"
[1] "The sky is so, oh so sky-blue"
[1] "The sky is so, oh so navy blue"
[1] "The sky is so, oh so steel blue"
[1] "The sky is so, oh so royal blue"
[1] "The sky is so, oh so slate blue"
[1] "The sky is so, oh so light blue"
[1] "The sky is so, oh so dark blue"
[1] "The sky is so, oh so prussian blue"
[1] "The sky is so, oh so indigo"
[1] "The sky is so, oh so baby blue"
[1] "The sky is so, oh so electric blue"
What does this loop do?
for (index in 10:1) {
  print(paste(index, "mins befo lunch"))
}
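To check a loop like this, it can help to collect what it produces rather than only printing it. The sketch below stores each message in a vector (the name `messages` is illustrative) and compares the result with a loop-free version, since `paste()` also works on a whole vector at once.

```r
# Collect each message instead of printing it
messages <- character(0)
for (index in 10:1) {
  messages <- c(messages, paste(index, "mins befo lunch"))
}

# paste() is vectorised, so the same countdown can be built without a loop
vectorised <- paste(10:1, "mins befo lunch")

print(messages)
print(identical(messages, vectorised))
```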
Again
• What does the following code do? (Decompose it on pen and paper.)
for (letter in LETTERS) {
  begins_with <- paste("^", letter, sep="")
  matches <- grep(pattern = begins_with,
                  x = ants$genus)
  print(paste(length(matches), "begin with", letter))
}
> LETTERS
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
[20] "T" "U" "V" "W" "X" "Y" "Z"
> ants <- read.table("https://goo.gl/3Ek1dL")
> colnames(ants) <- c("genus", "species")
> head(ants)
          genus     species
1     Anergates   atratulus
2    Camponotus         sp.
3 Crematogaster scutellaris
4       Formica   aquilonia
5       Formica cunicularia
6       Formica     exsecta
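A sketch for checking a pen-and-paper answer. To stay runnable offline it uses only the six rows shown by `head(ants)` rather than the full downloaded table, and it stores the counts in a named vector (`counts`, an illustrative name) instead of printing them.

```r
# Stand-in for the downloaded table: only the first six rows
ants <- data.frame(
  genus   = c("Anergates", "Camponotus", "Crematogaster",
              "Formica", "Formica", "Formica"),
  species = c("atratulus", "sp.", "scutellaris",
              "aquilonia", "cunicularia", "exsecta"),
  stringsAsFactors = FALSE
)

counts <- integer(0)
for (letter in LETTERS) {
  # "^" anchors the pattern to the start of the string, e.g. "^F"
  begins_with <- paste("^", letter, sep = "")
  # grep() returns the positions of the genus names that match
  matches <- grep(pattern = begins_with, x = ants$genus)
  # Store how many genus names begin with this letter
  counts[letter] <- length(matches)
}

# Show only the letters that occur at least once
print(counts[counts > 0])

# Cross-check without a loop: tabulate the first character of each genus
first_letters <- table(substr(ants$genus, start = 1, stop = 1))
print(first_letters)
```

On the full table the loop runs unchanged; only the counts differ.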
Jasmin Zohren
Bruno Vieira
Rodrigo Pracana
James Wright
Programming in R
?
If/else
Logical Operators
going further

Contenu connexe

Tendances

The Chain Rule, Part 1
The Chain Rule, Part 1The Chain Rule, Part 1
The Chain Rule, Part 1Pablo Antuna
 
Useful javascript
Useful javascriptUseful javascript
Useful javascriptLei Kang
 
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"epamspb
 
Math 7 lesson 6 subtraction of integers (without number lines)
Math 7   lesson 6 subtraction of integers (without number lines)Math 7   lesson 6 subtraction of integers (without number lines)
Math 7 lesson 6 subtraction of integers (without number lines)Ariel Gilbuena
 
Digital signal processing (2nd ed) (mitra) solution manual
Digital signal processing (2nd ed) (mitra) solution manualDigital signal processing (2nd ed) (mitra) solution manual
Digital signal processing (2nd ed) (mitra) solution manualRamesh Sundar
 
Introduction to jRuby
Introduction to jRubyIntroduction to jRuby
Introduction to jRubyAdam Kalsey
 
Let’s Talk About Ruby
Let’s Talk About RubyLet’s Talk About Ruby
Let’s Talk About RubyIan Bishop
 

Tendances (11)

The Chain Rule, Part 1
The Chain Rule, Part 1The Chain Rule, Part 1
The Chain Rule, Part 1
 
Python idioms
Python idiomsPython idioms
Python idioms
 
Useful javascript
Useful javascriptUseful javascript
Useful javascript
 
FNT 2015 PDIS CodeEU - Zanimljiva informatika - 02 Djordje Pavlovic - Live_ch...
FNT 2015 PDIS CodeEU - Zanimljiva informatika - 02 Djordje Pavlovic - Live_ch...FNT 2015 PDIS CodeEU - Zanimljiva informatika - 02 Djordje Pavlovic - Live_ch...
FNT 2015 PDIS CodeEU - Zanimljiva informatika - 02 Djordje Pavlovic - Live_ch...
 
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
 
Math 7 lesson 6 subtraction of integers (without number lines)
Math 7   lesson 6 subtraction of integers (without number lines)Math 7   lesson 6 subtraction of integers (without number lines)
Math 7 lesson 6 subtraction of integers (without number lines)
 
Hermes
HermesHermes
Hermes
 
Digital signal processing (2nd ed) (mitra) solution manual
Digital signal processing (2nd ed) (mitra) solution manualDigital signal processing (2nd ed) (mitra) solution manual
Digital signal processing (2nd ed) (mitra) solution manual
 
Introduction to jRuby
Introduction to jRubyIntroduction to jRuby
Introduction to jRuby
 
LCA change
LCA changeLCA change
LCA change
 
Let’s Talk About Ruby
Let’s Talk About RubyLet’s Talk About Ruby
Let’s Talk About Ruby
 

Similaire à Here are some suggestions for going further with R programming:- Learn data visualization with ggplot2 - This is a very powerful and flexible package for creating publication-quality graphs.- Work with real datasets - Find open datasets online and practice reading, wrangling, analyzing and visualizing real data. This is a great way to learn.- Learn data wrangling with dplyr and tidyr - These packages make it easy to manipulate and reshape data frames.- Learn modeling techniques - Explore packages for linear/logistic regression, time series analysis, machine learning etc. - Learn Shiny for building interactive web apps - Build apps to explore and visualize data.- Learn R Markdown

2014 11-12 sbsm032rstatsprogramming.key
2014 11-12 sbsm032rstatsprogramming.key2014 11-12 sbsm032rstatsprogramming.key
2014 11-12 sbsm032rstatsprogramming.keyYannick Wurm
 
2015 9-30-sbc361-research methcomm
2015 9-30-sbc361-research methcomm2015 9-30-sbc361-research methcomm
2015 9-30-sbc361-research methcommYannick Wurm
 
2014-9-24-SBC361-ResearchMethComm
2014-9-24-SBC361-ResearchMethComm2014-9-24-SBC361-ResearchMethComm
2014-9-24-SBC361-ResearchMethCommYannick Wurm
 
Python for High School Programmers
Python for High School ProgrammersPython for High School Programmers
Python for High School ProgrammersSiva Arunachalam
 
2013 10-16-sbc3610-research methcomm
2013 10-16-sbc3610-research methcomm2013 10-16-sbc3610-research methcomm
2013 10-16-sbc3610-research methcommYannick Wurm
 
A Taste of Python - Devdays Toronto 2009
A Taste of Python - Devdays Toronto 2009A Taste of Python - Devdays Toronto 2009
A Taste of Python - Devdays Toronto 2009Jordan Baker
 
Class 6: Lists & dictionaries
Class 6: Lists & dictionariesClass 6: Lists & dictionaries
Class 6: Lists & dictionariesMarc Gouw
 
Spock Framework - Slidecast
Spock Framework - SlidecastSpock Framework - Slidecast
Spock Framework - SlidecastDaniel Kolman
 
Clojure for Java developers - Stockholm
Clojure for Java developers - StockholmClojure for Java developers - Stockholm
Clojure for Java developers - StockholmJan Kronquist
 
re3 - modern regex syntax with a focus on adoption
re3 - modern regex syntax with a focus on adoptionre3 - modern regex syntax with a focus on adoption
re3 - modern regex syntax with a focus on adoptionAur Saraf
 
第二讲 Python基礎
第二讲 Python基礎第二讲 Python基礎
第二讲 Python基礎juzihua1102
 
第二讲 预备-Python基礎
第二讲 预备-Python基礎第二讲 预备-Python基礎
第二讲 预备-Python基礎anzhong70
 
NUS iOS Swift Talk
NUS iOS Swift TalkNUS iOS Swift Talk
NUS iOS Swift TalkGabriel Lim
 
Thinking Functionally In Ruby
Thinking Functionally In RubyThinking Functionally In Ruby
Thinking Functionally In RubyRoss Lawley
 

Similaire à Here are some suggestions for going further with R programming:- Learn data visualization with ggplot2 - This is a very powerful and flexible package for creating publication-quality graphs.- Work with real datasets - Find open datasets online and practice reading, wrangling, analyzing and visualizing real data. This is a great way to learn.- Learn data wrangling with dplyr and tidyr - These packages make it easy to manipulate and reshape data frames.- Learn modeling techniques - Explore packages for linear/logistic regression, time series analysis, machine learning etc. - Learn Shiny for building interactive web apps - Build apps to explore and visualize data.- Learn R Markdown (20)

2014 11-12 sbsm032rstatsprogramming.key
2014 11-12 sbsm032rstatsprogramming.key2014 11-12 sbsm032rstatsprogramming.key
2014 11-12 sbsm032rstatsprogramming.key
 
2015 9-30-sbc361-research methcomm
2015 9-30-sbc361-research methcomm2015 9-30-sbc361-research methcomm
2015 9-30-sbc361-research methcomm
 
2014-9-24-SBC361-ResearchMethComm
2014-9-24-SBC361-ResearchMethComm2014-9-24-SBC361-ResearchMethComm
2014-9-24-SBC361-ResearchMethComm
 
Python Basic
Python BasicPython Basic
Python Basic
 
Python for High School Programmers
Python for High School ProgrammersPython for High School Programmers
Python for High School Programmers
 
2013 10-16-sbc3610-research methcomm
2013 10-16-sbc3610-research methcomm2013 10-16-sbc3610-research methcomm
2013 10-16-sbc3610-research methcomm
 
A Taste of Python - Devdays Toronto 2009
A Taste of Python - Devdays Toronto 2009A Taste of Python - Devdays Toronto 2009
A Taste of Python - Devdays Toronto 2009
 
Class 6: Lists & dictionaries
Class 6: Lists & dictionariesClass 6: Lists & dictionaries
Class 6: Lists & dictionaries
 
Python tutorial
Python tutorialPython tutorial
Python tutorial
 
Spock Framework - Slidecast
Spock Framework - SlidecastSpock Framework - Slidecast
Spock Framework - Slidecast
 
Spock Framework
Spock FrameworkSpock Framework
Spock Framework
 
Clojure for Java developers - Stockholm
Clojure for Java developers - StockholmClojure for Java developers - Stockholm
Clojure for Java developers - Stockholm
 
Python 101 1
Python 101   1Python 101   1
Python 101 1
 
Go Java, Go!
Go Java, Go!Go Java, Go!
Go Java, Go!
 
re3 - modern regex syntax with a focus on adoption
re3 - modern regex syntax with a focus on adoptionre3 - modern regex syntax with a focus on adoption
re3 - modern regex syntax with a focus on adoption
 
第二讲 Python基礎
第二讲 Python基礎第二讲 Python基礎
第二讲 Python基礎
 
第二讲 预备-Python基礎
第二讲 预备-Python基礎第二讲 预备-Python基礎
第二讲 预备-Python基礎
 
NUS iOS Swift Talk
NUS iOS Swift TalkNUS iOS Swift Talk
NUS iOS Swift Talk
 
Music as data
Music as dataMusic as data
Music as data
 
Thinking Functionally In Ruby
Thinking Functionally In RubyThinking Functionally In Ruby
Thinking Functionally In Ruby
 

Plus de Yannick Wurm

2018 09-03-ses open-fair_practices_in_evolutionary_genomics
2018 09-03-ses open-fair_practices_in_evolutionary_genomics2018 09-03-ses open-fair_practices_in_evolutionary_genomics
2018 09-03-ses open-fair_practices_in_evolutionary_genomicsYannick Wurm
 
2018 08-reduce risks of genomics research
2018 08-reduce risks of genomics research2018 08-reduce risks of genomics research
2018 08-reduce risks of genomics researchYannick Wurm
 
2017 11-15-reproducible research
2017 11-15-reproducible research2017 11-15-reproducible research
2017 11-15-reproducible researchYannick Wurm
 
2016 09-16-fairdom
2016 09-16-fairdom2016 09-16-fairdom
2016 09-16-fairdomYannick Wurm
 
2016 05-31-wurm-social-chromosome
2016 05-31-wurm-social-chromosome2016 05-31-wurm-social-chromosome
2016 05-31-wurm-social-chromosomeYannick Wurm
 
2016 05-30-monday-assembly
2016 05-30-monday-assembly2016 05-30-monday-assembly
2016 05-30-monday-assemblyYannick Wurm
 
2016 05-29-intro-sib-springschool-leuker bad
2016 05-29-intro-sib-springschool-leuker bad2016 05-29-intro-sib-springschool-leuker bad
2016 05-29-intro-sib-springschool-leuker badYannick Wurm
 
2015 12-18- Avoid having to retract your genomics analysis - Popgroup Reprodu...
2015 12-18- Avoid having to retract your genomics analysis - Popgroup Reprodu...2015 12-18- Avoid having to retract your genomics analysis - Popgroup Reprodu...
2015 12-18- Avoid having to retract your genomics analysis - Popgroup Reprodu...Yannick Wurm
 
2015 11-10-bio-in-docker-oswitch
2015 11-10-bio-in-docker-oswitch2015 11-10-bio-in-docker-oswitch
2015 11-10-bio-in-docker-oswitchYannick Wurm
 
Week 5 genetic basis of evolution
Week 5   genetic basis of evolutionWeek 5   genetic basis of evolution
Week 5 genetic basis of evolutionYannick Wurm
 
Biol113 week4 evolution
Biol113 week4 evolutionBiol113 week4 evolution
Biol113 week4 evolutionYannick Wurm
 
2015 10-7-11am-reproducible research
2015 10-7-11am-reproducible research2015 10-7-11am-reproducible research
2015 10-7-11am-reproducible researchYannick Wurm
 
2015 09-29-sbc322-methods.key
2015 09-29-sbc322-methods.key2015 09-29-sbc322-methods.key
2015 09-29-sbc322-methods.keyYannick Wurm
 
2015 09-28 bio721 intro
2015 09-28 bio721 intro2015 09-28 bio721 intro
2015 09-28 bio721 introYannick Wurm
 
Sustainable software institute Collaboration workshop
Sustainable software institute Collaboration workshopSustainable software institute Collaboration workshop
Sustainable software institute Collaboration workshopYannick Wurm
 
2014 10-15-Nextbug edinburgh
2014 10-15-Nextbug edinburgh2014 10-15-Nextbug edinburgh
2014 10-15-Nextbug edinburghYannick Wurm
 

Plus de Yannick Wurm (20)

2018 09-03-ses open-fair_practices_in_evolutionary_genomics
2018 09-03-ses open-fair_practices_in_evolutionary_genomics2018 09-03-ses open-fair_practices_in_evolutionary_genomics
2018 09-03-ses open-fair_practices_in_evolutionary_genomics
 
2018 08-reduce risks of genomics research
2018 08-reduce risks of genomics research2018 08-reduce risks of genomics research
2018 08-reduce risks of genomics research
 
2017 11-15-reproducible research
2017 11-15-reproducible research2017 11-15-reproducible research
2017 11-15-reproducible research
 
2016 09-16-fairdom
2016 09-16-fairdom2016 09-16-fairdom
2016 09-16-fairdom
 
2016 05-31-wurm-social-chromosome
2016 05-31-wurm-social-chromosome2016 05-31-wurm-social-chromosome
2016 05-31-wurm-social-chromosome
 
2016 05-30-monday-assembly
2016 05-30-monday-assembly2016 05-30-monday-assembly
2016 05-30-monday-assembly
 
2016 05-29-intro-sib-springschool-leuker bad
2016 05-29-intro-sib-springschool-leuker bad2016 05-29-intro-sib-springschool-leuker bad
2016 05-29-intro-sib-springschool-leuker bad
 
2015 12-18- Avoid having to retract your genomics analysis - Popgroup Reprodu...
2015 12-18- Avoid having to retract your genomics analysis - Popgroup Reprodu...2015 12-18- Avoid having to retract your genomics analysis - Popgroup Reprodu...
2015 12-18- Avoid having to retract your genomics analysis - Popgroup Reprodu...
 
2015 11-10-bio-in-docker-oswitch
2015 11-10-bio-in-docker-oswitch2015 11-10-bio-in-docker-oswitch
2015 11-10-bio-in-docker-oswitch
 
Week 5 genetic basis of evolution
Week 5   genetic basis of evolutionWeek 5   genetic basis of evolution
Week 5 genetic basis of evolution
 
Biol113 week4 evolution
Biol113 week4 evolutionBiol113 week4 evolution
Biol113 week4 evolution
 
Evolution week3
Evolution week3Evolution week3
Evolution week3
 
2015 10-7-11am-reproducible research
2015 10-7-11am-reproducible research2015 10-7-11am-reproducible research
2015 10-7-11am-reproducible research
 
Evolution week2
Evolution week2Evolution week2
Evolution week2
 
2015 09-29-sbc322-methods.key
2015 09-29-sbc322-methods.key2015 09-29-sbc322-methods.key
2015 09-29-sbc322-methods.key
 
Sbc322 intro.key
Sbc322 intro.keySbc322 intro.key
Sbc322 intro.key
 
2015 09-28 bio721 intro
2015 09-28 bio721 intro2015 09-28 bio721 intro
2015 09-28 bio721 intro
 
Sustainable software institute Collaboration workshop
Sustainable software institute Collaboration workshopSustainable software institute Collaboration workshop
Sustainable software institute Collaboration workshop
 
2014 10-15-Nextbug edinburgh
2014 10-15-Nextbug edinburgh2014 10-15-Nextbug edinburgh
2014 10-15-Nextbug edinburgh
 
2014 12-09-oulu
2014 12-09-oulu2014 12-09-oulu
2014 12-09-oulu
 

Dernier

Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 

Dernier (20)

Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 

Here are some suggestions for going further with R programming:- Learn data visualization with ggplot2 - This is a very powerful and flexible package for creating publication-quality graphs.- Work with real datasets - Find open datasets online and practice reading, wrangling, analyzing and visualizing real data. This is a great way to learn.- Learn data wrangling with dplyr and tidyr - These packages make it easy to manipulate and reshape data frames.- Learn modeling techniques - Explore packages for linear/logistic regression, time series analysis, machine learning etc. - Learn Shiny for building interactive web apps - Build apps to explore and visualize data.- Learn R Markdown

  • 1. Programming in R (and some other stuff) y.wurm@qmul.ac.uk https://wurmlab.github.io
  • 2. © Alex Wild & others
  • 3.
  • 4. © National Geographic Atta leaf-cutter ants
  • 5. © National Geographic Atta leaf-cutter ants
  • 6. © National Geographic Atta leaf-cutter ants
  • 7.
  • 8. Oecophylla Weaver ants © ameisenforum.de
  • 12. Tofilski et al 2008 Forelius pusillus
  • 13. Tofilski et al 2008 Forelius pusillus hides the nest entrance at night
  • 14. Tofilski et al 2008 Forelius pusillus hides the nest entrance at night
  • 15. Tofilski et al 2008 Forelius pusillus hides the nest entrance at night
  • 16. Tofilski et al 2008 Forelius pusillus hides the nest entrance at night
  • 17. Avant Workers staying outside die « preventive self-sacrifice » Tofilski et al 2008 Forelius pusillus hides the nest entrance at night
  • 18. Dorylus driver ants: ants with no home © BBC
  • 19. Animal biomass (Brazilian rainforest) from Fittkau & Klinge 1973 Other insects Amphibians Reptiles Birds Mammals Earthworms Spiders Soil fauna excluding earthworms, ants & termites Ants & termites
  • 20. We use modern technologies to understand insect societies. • evolution of social behaviour • molecules involved in social behaviour • consequences of environmental change
  • 23. Big data is invading biology
  • 24. This changes everything. Any lab can sequence anything!
  • 26. BIG
  • 27. Big data is invading biology • Genomics • Cancer genomics • Biodiversity assessments • Stool microbiome sequencing • Personalized medicine • Sensor networks - e.g. tracking microclimates, recording sounds • Huge medical studies • Aerial surveys (Drones) - e.g. crop productivity; rainforest cover • Camera traps
  • 29. Learning to deal with big data takes time
  • 31. Practicals • Aim: get relevant data handling skills • Doing things by hand: • impossible? • slow, • error-prone, • Automate! • Basic programming • in R • no stats!
  • 33. Practicals: contents • Done: • data accessing/subsetting • New: • search/replace • regular expressions (text search on steroids) • New: • functions (reusable pieces of work) • loops (repeating the same thing many times) • Friday: Introduction to Unix & High performance computing
  • 35. • create a variable that contains the number 35 • create a variable that contains the string “I love tofu” • give me a vector containing the sequence of numbers from 5 to 11 • access the second number • replace the second number with 42 • add 5 to the second number • now add 5 to all numbers • now add an extra number: 1999 • can you sum all the numbers?
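One possible way through the warm-up exercise above, as a minimal sketch. The variable names `my_number`, `my_string`, `my_vector` and `total` are just illustrative choices; other names and other ways of building the vector would work equally well.

```r
# Create a variable that contains the number 35
my_number <- 35

# Create a variable that contains a string
my_string <- "I love tofu"

# A vector containing the sequence of numbers from 5 to 11
my_vector <- 5:11

# Access the second number (R counts from 1, not 0)
second_number <- my_vector[2]

# Replace the second number with 42
my_vector[2] <- 42

# Add 5 to the second number only
my_vector[2] <- my_vector[2] + 5

# Add 5 to all numbers (arithmetic on a vector applies to every element)
my_vector <- my_vector + 5

# Add an extra number at the end
my_vector <- c(my_vector, 1999)

# Sum all the numbers
total <- sum(my_vector)
```

Because each step overwrites `my_vector`, the order of the steps matters: running a line twice changes the result.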
  • 36. • creating a vector • give me a vector containing numbers from 5 to 11 (3 variants)
    > my_vector <- c(5, 6, 7, 8, 9, 10, 11)
    > my_vector <- 5:11
    > my_vector <- seq(from=5, to=11, by=1)
    > my_vector
    [1]  5  6  7  8  9 10 11
    > length(my_vector)
    [1] 7
    > (10 > 30)
    [1] FALSE
    > my_vector > 8
    [1] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE
    > my_vector[my_vector > 8]
    [1]  9 10 11
    > other_vector <- my_vector[my_vector > 8]
    > other_vector
    [1]  9 10 11
    > other_vector + 3
  • 37. • accessing a subset • of a vector
    > big_vector <- 150:100
    > big_vector
     [1] 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132
    [20] 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113
    [39] 112 111 110 109 108 107 106 105 104 103 102 101 100
    > big_vector[5]
    [1] 146
    > mysubset <- big_vector[my_vector]
    > mysubset
    [1] 146 145 144 143 142 141 140
    > big_vector > 130
     [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
    [13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
    [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    [49] FALSE FALSE FALSE
    > subset(x = big_vector, subset = big_vector > 140)
     [1] 150 149 148 147 146 145 144 143 142 141
    > big_vector[big_vector >= 140]
     [1] 150 149 148 147 146 145 144 143 142 141 140
    > my_vector
    [1]  5  6  7  8  9 10 11
  • 38. Regular expressions (regex): Text search on steroids.
  • 42. Regular expressions (regex): Text search on steroids.
    Regular expression                        Finds
    David                                     David
    Dav(e|(id))                               David, Dave
    Dav(e|(id)|(ide)|o)                       David, Dave, Davide, Davo
    At{1,2}enborough                          Attenborough, Atenborough
    Atte[nm]borough                           Attenborough, Attemborough
    At{1,2}[ei][nm]bo{0,1}ro((ugh)|w){0,1}    Atimbro, attenbrough, ateinborow
    Easy counting, replacing all with "Sir David Attenborough"
    Yes: "HATSOMIKTIP" yes: "HAVSONYYIKTIP" not: "HAVSQMIKTIP"
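To see a few of the patterns from this slide in action, here is a small sketch using base R's `grepl()` and `gsub()`. The anchors `^` and `$`, the combined pattern `At{1,2}e[nm]borough`, and the extra test strings "Daniel", "Attenburg" and the two sentences in `mentions` are assumptions added for illustration; they are not on the slide.

```r
# First names from the slide, plus one name that should not match
first_names <- c("David", "Dave", "Davide", "Davo", "Daniel")

# Alternation: "Dav" followed by either "e" or "id".
# ^ and $ force the whole string to match, not just a part of it.
dave_or_david <- grepl("^Dav(e|(id))$", first_names)

# Quantifier plus character class: one or two "t", then "e", then "n" or "m"
surnames <- c("Attenborough", "Atenborough", "Attemborough", "Attenburg")
surname_found <- grepl("^At{1,2}e[nm]borough$", surnames)

# Easy counting: TRUE counts as 1 when summed
n_found <- sum(surname_found)

# Replacing every spelling variant with one canonical form
mentions <- c("Atenborough narrates", "I met Attemborough")
cleaned <- gsub("At{1,2}e[nm]borough", "Sir David Attenborough", mentions)
```

Without the anchors, `Dav(e|(id))` would also report a match inside longer strings such as "Davide", because a regular expression only needs to match somewhere in the text.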
  • 43. Regex special symbols
    Regular expression   Finds                            Example
    [aeiou]              any single vowel                 "e"
    [aeiou]*             between 0 and infinity vowels    e.g. "eeooouuu"
    [aeoiu]{1,3}         between 1 and 3 vowels           "oui"
    a|i                  one of the 2 characters
    ((win)|(fail))       one of the two words in ()       fail
    Yes: "HATSOMIKTIP" yes: "HAVSONYYIKTIP" not: "HAVSQMIKTIP"
  • 44. More Regex Special symbols • Google "Regular expression cheat sheet" • ?regexp
    Symbol       Synonymous with / meaning
    [:digit:]    [0-9]
    [A-z]        [A-z], i.e. [A-Za-z] (in ASCII order this range also includes [ \ ] ^ _ `)
    \s           whitespace
    .            any single character
    .+           one to many of anything
    b*           between 0 and infinity letter 'b'
    [^abc]       any character other than a, b or c
    \(           (
    [:punct:]    any of these: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
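Two details tend to trip people up when typing these symbols into R, so here is a short sketch. The test strings in `samples` are made up for illustration. In R, a class such as `[:digit:]` has to sit inside another pair of brackets, and every backslash has to be doubled inside a quoted string.

```r
samples <- c("ant 42", "queen", "worker!", "two  spaces")

# POSIX classes like [:digit:] only work inside a bracket expression,
# hence the double brackets
has_digit <- grepl("[[:digit:]]", samples)

# "\\s" in an R string is the regex \s (whitespace)
has_whitespace <- grepl("\\s", samples)

# [^abc] means "anything except a, b or c"; deleting those leaves only a, b, c
only_abc <- gsub("[^abc]", "", "cabbage")

# Remove all punctuation characters
no_punct <- gsub("[[:punct:]]", "", samples)

# A literal parenthesis must be escaped, again with a doubled backslash
has_paren <- grepl("\\(", "f(x)")
```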
  • 46. You want to scan a protein sequence database for a particular binding site. Type a single regular expression that will match the first two of the following peptide sequences, but NOT the last one: "HATSOMIKTIP" "HAVSONYYIKTIP" "HAVSQMIKTIP"
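One possible answer, sketched in R; many other patterns would also satisfy the exercise. The pattern name `binding_site` is an illustrative choice. The idea is to allow T or V in the third position, insist on O in the fifth (which rules out the Q variant), allow M or N next, and make the two Y residues optional.

```r
peptides <- c("HATSOMIKTIP", "HAVSONYYIKTIP", "HAVSQMIKTIP")

# [TV]   : T or V
# O      : must be O, so "Q" is rejected
# [MN]   : M or N
# Y{0,2} : zero, one or two Y
binding_site <- "HA[TV]SO[MN]Y{0,2}IKTIP"

is_match <- grepl(binding_site, peptides)
matching_peptides <- peptides[is_match]
```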
  • 48. Variants of a microsatellite sequence are responsible for differential expression of vasopressin receptor, and in turn for differences in social behaviour in voles & others. Create a regular expression that finds AGAGAGAGAGAGAGAG dinucleotide microsatellite repeats with lengths of 5 to 500
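A minimal sketch of one way to write this pattern, assuming "lengths of 5 to 500" means 5 to 500 copies of the AG unit (if it means nucleotides, the bounds would need adjusting). The test sequences are generated with `strrep()` purely for illustration. `perl = TRUE` is used here on the assumption that R's default regex engine may reject repeat counts as large as 500.

```r
# ^ and $ anchor the pattern so the whole sequence must be AG repeats
ag_repeat <- "^(AG){5,500}$"

# Illustrative sequences with 4, 5, 8, 500 and 501 copies of "AG"
candidates <- c(
  strrep("AG", 4),
  strrep("AG", 5),
  strrep("AG", 8),
  strrep("AG", 500),
  strrep("AG", 501)
)

is_microsatellite <- grepl(ag_repeat, candidates, perl = TRUE)
```

To find such repeats inside a longer sequence rather than test a whole string, the anchors would be dropped. Note that an unanchored pattern also matches within runs longer than 500 copies.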
  • 49. Again Make a regular expression • matching “LMTSOMIKTIP” and “LMVSONYYIKTIP” but not “LMVSQMIKTIP” • matching all variants of “ok” (e.g., “O.K.”,“Okay”…)
  • 51. Ok… so how do we use this? • ?grep • ?gsub
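As a quick orientation before reading the help pages, here is a sketch of the most common calls. The vector `genera` holds a few genus names that appear in the ant dataset used in these slides; the patterns are illustrative.

```r
genera <- c("Anergates", "Camponotus", "Crematogaster", "Formica")

# grep() returns the positions of the elements that match
positions <- grep(pattern = "^C", x = genera)

# value = TRUE returns the matching elements themselves
starts_with_c <- grep(pattern = "^C", x = genera, value = TRUE)

# grepl() returns one TRUE or FALSE per element
contains_er <- grepl(pattern = "er", x = genera)

# gsub() replaces every match; sub() would replace only the first one
# in each element
shouted <- gsub(pattern = "a", replacement = "A", x = genera)
```

`grepl()` is convenient for subsetting, for example `genera[grepl("er", genera)]`.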
  • 52. ants <- read.table("https://goo.gl/3Ek1dL") colnames(ants) <- c("genus", "species") • Which species names include 'y'? • Create a vector with only species names, but replace all 'y' with 'Y' • Remove all vowels • Replace all vowels with 'o'
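A sketch of how these four tasks could be approached. To keep it runnable offline, it uses a short stand-in vector instead of downloading the file: the first three species names appear in the dataset preview in these slides, while "polyctena" and "platythorax" are added here only so that something contains a 'y'. With the real data, `species` would be `ants$species`.

```r
# Stand-in for ants$species (see the assumptions above)
species <- c("atratulus", "scutellaris", "aquilonia", "polyctena", "platythorax")

# Which species names include 'y'?
with_y <- grep("y", species, value = TRUE)

# Replace all 'y' with 'Y'
capital_y <- gsub("y", "Y", species)

# Remove all vowels: replace them with an empty string
no_vowels <- gsub("[aeiou]", "", species)

# Replace all vowels with 'o'
all_o <- gsub("[aeiou]", "o", species)
```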
  • 55. Functions • R has many. e.g.: plot(), t.test() • Making your own:
    tree_age_estimate <- function(diameter, species) {
      growth_rate <- growth_rates[ species ]
      age_estimate <- diameter / growth_rate
      return(age_estimate)
    }
    > tree_age_estimate(25, "White Oak")
    + 66
    > tree_age_estimate(60, "Carya ovata")
    + 190
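The function on this slide relies on a `growth_rates` object that is not shown. Below is a runnable sketch that supplies one as a named vector. The two rates are hypothetical values chosen to make the arithmetic easy, so the results differ from the numbers on the slide.

```r
# Hypothetical growth rates (diameter gained per year), for illustration only
growth_rates <- c("White Oak" = 0.5, "Carya ovata" = 0.25)

tree_age_estimate <- function(diameter, species) {
  # Look up the rate by name; [[ ]] returns the bare number without its name
  growth_rate <- growth_rates[[species]]
  age_estimate <- diameter / growth_rate
  return(age_estimate)
}

oak_age <- tree_age_estimate(25, "White Oak")        # 25 / 0.5
hickory_age <- tree_age_estimate(60, "Carya ovata")  # 60 / 0.25
```

Passing `growth_rates` in as a third argument, rather than reading it from outside the function, would make the function easier to reuse.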
  • 56. Make a function • That converts Fahrenheit to Celsius (subtract 32, then divide the result by 1.8)
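One way to write this function, as a minimal sketch; the name `fahrenheit_to_celsius` and the example temperatures are illustrative choices.

```r
fahrenheit_to_celsius <- function(fahrenheit) {
  # Subtract 32, then divide the result by 1.8
  celsius <- (fahrenheit - 32) / 1.8
  return(celsius)
}

# Works on a single number...
boiling <- fahrenheit_to_celsius(212)

# ...and on a whole vector, because R arithmetic is vectorised
several <- fahrenheit_to_celsius(c(32, 50, 98.6))
```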
  • 57. Loops
  • 58. "for" Loop
    > possible_colours <- c('blue', 'cyan', 'sky-blue', 'navy blue', 'steel blue', 'royal blue', 'slate blue', 'light blue', 'dark blue', 'prussian blue', 'indigo', 'baby blue', 'electric blue')
    > possible_colours
     [1] "blue"          "cyan"          "sky-blue"      "navy blue"
     [5] "steel blue"    "royal blue"    "slate blue"    "light blue"
     [9] "dark blue"     "prussian blue" "indigo"        "baby blue"
    [13] "electric blue"
    > for (colour in possible_colours) {
    +   print(paste("The sky is so, oh so", colour))
    + }
    [1] "The sky is so, oh so blue"
    [1] "The sky is so, oh so cyan"
    [1] "The sky is so, oh so sky-blue"
    [1] "The sky is so, oh so navy blue"
    [1] "The sky is so, oh so steel blue"
    [1] "The sky is so, oh so royal blue"
    [1] "The sky is so, oh so slate blue"
    [1] "The sky is so, oh so light blue"
    [1] "The sky is so, oh so dark blue"
    [1] "The sky is so, oh so prussian blue"
    [1] "The sky is so, oh so indigo"
    [1] "The sky is so, oh so baby blue"
    [1] "The sky is so, oh so electric blue"
  • 59. What does this loop do? for (index in 10:1) { print(paste(index, "mins before lunch")) }
  • 60. Again • What does the following code do (decompose on pen and paper)
  • 61. What does this loop do?
    for (letter in LETTERS) {
      begins_with <- paste("^", letter, sep="")
      matches <- grep(pattern = begins_with, x = ants$genus)
      print(paste(length(matches), "begin with", letter))
    }
    > LETTERS
     [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
    [20] "T" "U" "V" "W" "X" "Y" "Z"
    > ants <- read.table("https://goo.gl/3Ek1dL")
    > colnames(ants) <- c("genus", "species")
    > head(ants)
              genus     species
    1     Anergates   atratulus
    2    Camponotus         sp.
    3 Crematogaster scutellaris
    4       Formica   aquilonia
    5       Formica cunicularia
    6       Formica     exsecta
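To experiment with this loop without downloading the file, the sketch below rebuilds only the six rows shown by `head(ants)` as a stand-in data frame; the real dataset is assumed to contain more rows. Instead of printing, it stores each count in a named vector called `counts` (an illustrative name) so the result can be inspected afterwards.

```r
# Stand-in for the downloaded data: just the six rows shown by head(ants)
ants <- data.frame(
  genus = c("Anergates", "Camponotus", "Crematogaster",
            "Formica", "Formica", "Formica"),
  species = c("atratulus", "sp.", "scutellaris",
              "aquilonia", "cunicularia", "exsecta"),
  stringsAsFactors = FALSE
)

# One counter per letter, all starting at 0
counts <- setNames(integer(length(LETTERS)), LETTERS)

for (letter in LETTERS) {
  # Build a pattern such as "^A": genus names beginning with that letter
  begins_with <- paste0("^", letter)
  matches <- grep(pattern = begins_with, x = ants$genus)
  counts[letter] <- length(matches)
}

# Keep only the letters that occur at least once
nonzero_counts <- counts[counts > 0]
```

For this particular question, `table(substr(ants$genus, 1, 1))` gives the same counts without a loop. The loop version is still useful practice for building patterns programmatically.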