6. Human Connectome Project
• > 500 subjects (will reach 1200)
– Young and healthy (22-35 years)
– 200 twins!
• 1 hour worth of MRI scanning:
– State of the art sequences – high temporal and spatial resolution
– Resting-state fMRI (R-fMRI)
– Task-evoked fMRI (T-fMRI)
• Working Memory
• Gambling
• Motor
• Language
• Social Cognition
• Relational Processing
• Emotion Processing
– Diffusion MRI (dMRI)
– MEG and EEG
– 7T coming soon
7. Human Connectome Project
• Rich phenotypical data
– Cognition, personality, substance abuse etc.
• Genotyping! (not yet available)
• Methodological developments
– Fine-tuned sequences
– Innovative field inhomogeneity corrections
– New preprocessing techniques
• Ready to use preprocessed data
9. FCP/INDI Usage Survey
Survey Courtesy of Stan Colcombe & Cameron Craddock
FCP/INDI Data Usage Description
Master's thesis research 11.94%
Doctoral dissertation research 38.81%
Teaching resource (projects or examples) 13.43%
Pilot data for grant applications 16.42%
Research intended for publication 76.12%
Independent study (e.g., teach self about analysis) 37.31%
FCP/INDI Users; 10% response rate
13. Data sharing saves money
$878,988
cost of reacquiring data for each of the
reuses of OpenfMRI datasets
14. Data sharing fears
• Fear of being scooped
• Fear of someone finding a mistake
• Misconceptions about the ownership of the
data
15. Studies sharing data have higher
statistical quality
Wicherts JM, Bakker M, Molenaar D (2011) Willingness to Share Research Data Is
Related to the Strength of the Evidence and the Quality of Reporting of
Statistical Results. PLoS ONE 6(11): e26828. doi: 10.1371/journal.pone.0026828
20. Baby steps
• Everything is a question of cost and benefit
– If we keep the cost low, even a small benefit (or
just the conviction that data sharing is GOOD) will
suffice
21. NeuroVault.org
simple data sharing
• Minimize the cost!
• We just want your statistical maps with a
minimal description (DOI)
– If you want, you can add more metadata, but
you don’t have to
• We streamline the login process (Google,
Facebook)
26. Benefits - other
• Private collections
• Multiple contributors to one collection
• Sharable persistent URLs
• Viewer embeddable on your lab's website
or your private blog
• Improved exposure of your research
• Improved reusability of your results
27. Using NeuroVault…
• Improves collaboration
• Makes your paper more attractive
• Shows you care about transparency
• Takes only five minutes
• Gives you a warm and fuzzy feeling that you
helped future meta-analyses
34. Share your stat maps!
How can we appropriately reward the extra effort and risk
associated with sharing data?
35. Solution – data papers
• Authors get recognizable credit for their
work.
– Even smaller contributors such as RAs can be
included.
• Acquisition methods are described in
detail.
• Quality of metadata is controlled by
peer review.
37. Where to publish data papers?
• Neuroinformatics (Springer)
• GigaScience (BGI, BioMed Central)
• Scientific Data (Nature Publishing Group)
• F1000Research (Faculty of 1000)
• Data in Brief (Elsevier)
• Journal of Open Psychology Data (Ubiquity
Press)
42. What makes a good data paper?
• Clear and accurate description of the
acquisition protocol.
• Good data organization.
• Ease of access to data.
• Data quality description.
• Fair credit attribution.
43. How to improve the impact of your
dataset?
• Provide preprocessed data.
• Reach out to your peers…
– …and people outside of your field (ML)
• Build a community around the data.
45. Repositories
• Field specific
– OpenfMRI.org (task-based fMRI)
– FCP/INDI (resting-state fMRI)
– COINS
• Field agnostic
– DataVerse (Harvard)
– Figshare (only small datasets)
– DataDryad (fees may apply)
46. OpenfMRI
• Will host any dataset that has a task-based
fMRI component
• No fees
• Curated and uncurated datasets
• Recommended by many journals (including
Scientific Data)
47. Prepare in advance
• Make sure your consent form includes data
sharing
• Decide which database you want to send
your data to in advance
– Organize your data according to their
requirements
• Work on anonymized data as much as you
can
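One way to follow the last point is to replace subject identifiers with pseudonyms and drop directly identifying fields before any analysis starts. The sketch below is only an illustration under stated assumptions: the field names (`name`, `date_of_birth`, `scan_date`), the `sub-` prefix, and the helper names `pseudonym` and `anonymize` are hypothetical and not taken from any repository's requirements. It also covers tabular metadata only; image files themselves may need separate treatment.

```python
import hashlib
import hmac

# Hypothetical field names; adjust to whatever your own records contain.
IDENTIFYING_FIELDS = ("name", "date_of_birth", "scan_date")


def pseudonym(subject_id: str, key: bytes, length: int = 8) -> str:
    """Derive a stable label from a subject ID using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, subject_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "sub-" + digest[:length]


def anonymize(record: dict, key: bytes) -> dict:
    """Return a copy of the record without identifying fields and with a pseudonymous ID."""
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    cleaned["subject_id"] = pseudonym(record["subject_id"], key)
    return cleaned


if __name__ == "__main__":
    # The key must stay private and must never be shipped with the shared data.
    key = b"keep-this-secret-and-out-of-the-shared-data"
    record = {
        "subject_id": "MR2014_017",  # made-up example ID
        "name": "Jane Doe",
        "date_of_birth": "1988-03-02",
        "age": 26,
        "handedness": "R",
    }
    print(anonymize(record, key))
```

Because the hash is keyed, the same subject always maps to the same label within a study, while someone without the key cannot recompute the mapping from a list of candidate IDs.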
48. If I haven’t convinced you yet
• Why share data:
– It’s the ethical thing to do (Brakewood and
Poldrack 2013)
– The journal might require it (PLoS).
– Your funders might require it (NIH).
– Track record of data sharing can improve your
chances of getting your next grant.
49. Sharing data is related to higher
citation rate
Piwowar, Day & Fridsma (2007); Piwowar & Vision (2013)
50. Acknowledgements
Russell A. Poldrack
Jean-Baptiste Poline
Tal Yarkoni
Michael Milham
Daniel Margulies
Yannick Schwartz
Gaël Varoquaux
Joseph Wexler
Gabriel Rivera
Camille Maumet
Vanessa Sochat
Thomas Nichols
MPI CBS Resting state group
Poldrack Lab
INCF Data Sharing Task Force