The volume of biomedical literature is massive: over one million new research articles are published every year (roughly one every thirty seconds). To make those articles more usable in research, Scripps Research Institute is exploring ways in which citizen scientists can perform "biocuration". This talk shares what was learned from one experiment to identify all mentions of diseases and disease concepts in the abstracts of 973 biomedical research articles. Scripps used Amazon Mechanical Turk as the platform for testing the citizen science concept and interface, and the talk discusses how researchers can use crowdsourcing to solve complex R&D compute problems that require a degree of human intelligence.
R&D Focus: Amazon Mechanical Turk as a platform for curating research articles
1. Amazon Mechanical Turk as a platform for curating research articles
Andrew Su, Ph.D.
@andrewsu
asu@scripps.edu
http://sulab.org
March 25, 2015
Cloud Computing & Life Sciences with AWS
Slides: slideshare.net/andrewsu
3. The biomedical literature is growing fast…
[Figure: number of new PubMed-indexed articles per year, 1983 to 2013; y-axis from 0 to 1,200,000]
5. … but it is very hard to query and compute
Imatinib
Crizotinib
Erlotinib
Gefitinib
Sorafenib
Lapatinib
Dasatinib
…
AND
Acute myeloid leukemia
Acute lymphoblastic leukemia
Chronic myelogenous leukemia
Chronic lymphocytic leukemia
Hodgkin lymphoma
Non-Hodgkin lymphoma
Myeloma
…
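As a rough illustration of why this kind of question is hard to ask, the sketch below spells out the slide's two lists as one Boolean keyword query. The helper names `or_group` and `build_query` are hypothetical, the query uses generic quoted-phrase `OR`/`AND` syntax rather than any particular search engine's API, and the term lists are only the entries visible on the slide (both are truncated there).

```python
# Term lists copied from the slide; both are truncated there with an ellipsis,
# so a real query would need many more entries and synonyms.
DRUGS = [
    "Imatinib", "Crizotinib", "Erlotinib", "Gefitinib",
    "Sorafenib", "Lapatinib", "Dasatinib",
]
DISEASES = [
    "Acute myeloid leukemia", "Acute lymphoblastic leukemia",
    "Chronic myelogenous leukemia", "Chronic lymphocytic leukemia",
    "Hodgkin lymphoma", "Non-Hodgkin lymphoma", "Myeloma",
]


def or_group(terms):
    """Join terms into one parenthesised OR clause, quoting each phrase."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(drugs, diseases):
    """Combine the two OR clauses with a single AND (hypothetical helper)."""
    return or_group(drugs) + " AND " + or_group(diseases)


query = build_query(DRUGS, DISEASES)
print(query)
```

Every drug and disease name has to be enumerated by hand, and the result still only matches literal strings, which is the limitation the slide points at.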
6. The Network of BioThings
1. Identify biomedical concepts in text
… We report a case of familial systemic mastocytosis with the rare KIT K509I germ line mutation. In vitro treatment with imatinib, dasatinib and PKC412 reduced cell viability of primary mast cells harboring KIT K509I mutation. Both patients with familial systemic mastocytosis had remarkable hematological and skin improvement after three months of imatinib treatment.
Leuk Res. 2014 Oct;38(10):1245-51. doi: 10.1016/j.leukres.
GENES
DISEASES
DRUGS
VARIANTS
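To make step 1 concrete, here is a minimal sketch of concept identification by dictionary lookup over part of the excerpt above. It is not the method used in the talk: `LEXICON` is a hypothetical toy dictionary covering only the terms in this excerpt, and `tag_concepts` is an illustrative helper that does exact, case-sensitive matching.

```python
import re

# Hypothetical toy lexicon: only the terms that occur in the excerpt,
# typed with the four categories shown on the slide.
LEXICON = {
    "familial systemic mastocytosis": "DISEASE",
    "imatinib": "DRUG",
    "dasatinib": "DRUG",
    "PKC412": "DRUG",
    "KIT": "GENE",
    "K509I": "VARIANT",
}

TEXT = (
    "We report a case of familial systemic mastocytosis with the rare "
    "KIT K509I germ line mutation. In vitro treatment with imatinib, "
    "dasatinib and PKC412 reduced cell viability of primary mast cells "
    "harboring KIT K509I mutation."
)


def tag_concepts(text, lexicon):
    """Return sorted (start, end, surface text, type) for each exact match."""
    hits = []
    for term, kind in lexicon.items():
        # Word boundaries keep a term from matching inside a longer token.
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text):
            hits.append((match.start(), match.end(), match.group(0), kind))
    return sorted(hits)


for start, end, surface, kind in tag_concepts(TEXT, LEXICON):
    print(f"{start:4d}-{end:<4d} {kind:8s} {surface}")
```

A fixed dictionary like this misses synonyms, abbreviations and new terms, which is one reason human readers remain useful for this task.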
7. The Network of BioThings
imatinib
dasatinib
PKC412
Familial systemic mastocytosis
KIT
K509I
1. Identify biomedical concepts in text
2. Identify relationships between concepts
Mutation
of
Mutation
causes
causes
treats
inhibits
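One common way to hold concepts and relationships like these is as subject-predicate-object triples that each carry their source. The sketch below assumes that representation; the slide does not say how the network is stored. The pairing of edge labels with concepts is a guess at the slide's diagram, and `Triple`, `build_index` and `objects_of` are hypothetical names.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    source: str  # citation kept with every statement, for traceability


SOURCE = "Leuk Res. 2014 Oct;38(10):1245-51"

# ASSUMPTION: these subject/object pairings are an illustrative reading of
# the slide's edge labels, not a transcription of the original diagram.
TRIPLES = [
    Triple("K509I", "mutation of", "KIT", SOURCE),
    Triple("imatinib", "treats", "familial systemic mastocytosis", SOURCE),
    Triple("imatinib", "inhibits", "KIT", SOURCE),
]


def build_index(triples):
    """Group triples by subject so lookups do not scan the whole list."""
    index = defaultdict(list)
    for triple in triples:
        index[triple.subject].append(triple)
    return index


def objects_of(index, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in index.get(subject, []) if t.predicate == predicate]


index = build_index(TRIPLES)
print(objects_of(index, "imatinib", "treats"))
```

Because each triple keeps its citation, any answer returned by a query can be traced back to the article it came from.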
8. Goal: Assemble a network of biomedical knowledge that is comprehensive, current, computable and traceable.
10. NCBI Disease Corpus as Gold Standard
593 PubMed abstracts
12 expert annotators (2 per document)
6,900 mentions of “disease concepts”
Doğan and Lu. Proceedings of the 2012 Workshop on BioNLP, 2012, 91-9.
11. Question: Can a group of non-scientists collectively replicate the expert-generated NCBI Disease Corpus?
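One way to put a number on this question is to merge the workers' highlights by vote count and score the result against the expert annotations. The transcript does not say how answers were combined or scored, so the sketch below is an assumption: it uses a simple vote threshold and exact-span precision, recall and F1, with made-up toy data and hypothetical function names.

```python
from collections import Counter


def aggregate(worker_annotations, min_votes):
    """Keep every span that at least `min_votes` workers marked."""
    votes = Counter(
        span for spans in worker_annotations for span in set(spans)
    )
    return {span for span, count in votes.items() if count >= min_votes}


def precision_recall_f1(predicted, gold):
    """Score predicted spans against gold spans using exact matches."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up toy data: each span is a (start, end) character offset pair
# in a single abstract. None of these numbers come from the study.
gold = {(20, 50), (120, 135), (200, 210)}
workers = [
    [(20, 50), (120, 135)],
    [(20, 50), (200, 210), (300, 310)],
    [(20, 50), (120, 135), (200, 210)],
]

for k in (1, 2, 3):
    crowd = aggregate(workers, min_votes=k)
    p, r, f = precision_recall_f1(crowd, gold)
    print(f"min_votes={k}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

Raising the threshold trades recall for precision, so the vote cutoff is a tunable parameter rather than a fixed rule.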
20.
Other group members
Cyrus Afrasiabi
Sebastian Burgstaller
Ramya Gamini
Louis Gioia
Toby Li
Salvatore Loguercio
Adam Mark
Erick Scott
Greg Stupp
Kevin Xin
Contact
http://sulab.org
asu@scripps.edu
@andrewsu
+Andrew Su
Mark2Cure
Ben Good
Max Nanis
Ginger Tsueng
Chunlei Wu
All Mark2Curators!
Funding and Support
BioGPS: R01 GM83924
Gene Wiki: R01 GM089820
BD2K Center of Excellence: U54 GM114833
STSI: UL1 TR001114
Icon credits (Noun Project, Wikimedia Commons): Zach VanDeHey, hunotika, Viktorvoigt, Alberto Rojas, Lloyd Humphreys
Matt and Cristina Might
NGLY1 community
21. Why do I Mark2Cure?
I am retired, have a doctorate in medical humanities, and have two children with Gaucher disease. I am just looking for some way to put my education to use. Sounds like a perfect situation for me.
My 4 year old daughter Phoebe is living with and battling rare disease.
I have Ehlers Danlos Syndrome. I hope to help people learn about this painful and debilitating disorder, so that others like me can receive more effective medical care.
Take part in something that helps humanity.
I Mark2Cure in memory of my son Mike who had type 1 diabetes.
Studied biology in college and I really miss it!
In memory of my daughter who had Cystic Fibrosis
Give back