So Long Computer Overlords

1. So long, computer overlords How Cloud (and Grid) can liberate research IT – and transform discovery Ian Foster

4. The data deluge MACHO et al.: 1 TB Palomar: 3 TB 2MASS: 10 TB GALEX: 30 TB Sloan: 40 TB Pan-STARRS: 40,000 TB 100,000 TB Genomic sequencing output x2 every 9 months >300 public centers 1330 molec. bio databases Nucleic Acids Research (96 in Jan 2001) 2004: 36 TB 2012: 2,300 TB Climate model intercomparison project (CMIP) of the IPCC

5. Big science has achieved big successes OSG: 1.4M CPU-hours/day, >90 sites, >3000 users, >260 pubs in 2010 LIGO: 1 PB data in last science run, distributed worldwide Robust production solutions Substantial teams and expense Sustained, multi-year effort Application-specific solutions, built on common technology ESG: 1.2 PB climate data delivered to 23,000 users; 600+ pubs All build on NSF OCI (& DOE)-supported Globus Toolkit software

6. But small science is struggling More data, more complex data Ad-hoc solutions Inadequate software, hardware Data plan mandates

7. Medium-scale science struggles too! Blanco 4m on Cerro Tololo Image credit: Roger Smith/NOAO/AURA/NSF Dark Energy Survey receives 100,000 files each night in Illinois They transmit files to Texas for analysis … then move results back to Illinois Process must be reliable, routine, and efficient The cyberinfrastructure team is not large

8. The challenge of staying competitive "Well, in our country," said Alice … "you'd generally get to somewhere else — if you run very fast for a long time, as we've been doing.” "A slow sort of country!" said the Queen. "Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!"

9. Current approaches are unsustainable Small laboratories PI, postdoc, technician, grad students Estimate 5,000 across US university community Average ill-spent/unmet need of 0.5 FTE/lab? Medium-scale projects Multiple PIs, a few software engineers Estimate 500 across US university community Average ill-spent/unmet need of 3 FTE/project? Total 4000 FTE: at ~$100K/FTE => $400M/yr Plus computers, storage, opportunity costs, …
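The estimate on slide 9 can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures quoted on the slide; the dictionary layout and function name are illustrative assumptions, and the inputs are the slide's own rough guesses (note its question marks), not measurements.

```python
# Figures are the slide's own estimates; names and structure are illustrative.
FTE_COST_USD = 100_000  # ~$100K per FTE

groups = {
    "small laboratories": {"count": 5_000, "unmet_fte_each": 0.5},
    "medium-scale projects": {"count": 500, "unmet_fte_each": 3.0},
}


def total_unmet_fte(groups):
    """Sum the ill-spent/unmet FTE across all groups."""
    return sum(g["count"] * g["unmet_fte_each"] for g in groups.values())


total_fte = total_unmet_fte(groups)
annual_cost = total_fte * FTE_COST_USD

for name, g in groups.items():
    print(f"{name}: {g['count'] * g['unmet_fte_each']:.0f} FTE")
print(f"total: {total_fte:.0f} FTE, about ${annual_cost / 1e6:.0f}M/yr")
```

The total excludes the computers, storage, and opportunity costs that the slide lists as additional.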

10. And don’t forget administrative costs 42% of the time spent by an average PI on a federally funded research project was reported to be expended on administrative tasks related to that project rather than on research (Federal Demonstration Partnership faculty burden survey, 2007)

11. You can run a company from a coffee shop

12. Because businesses outsource their IT Web presence Email (hosted Exchange) Calendar Telephony (hosted VOIP) Human resources and payroll Accounting Customer relationship mgmt Software as a Service (SaaS)

13. And often their large-scale computing too Web presence Email (hosted Exchange) Calendar Telephony (hosted VOIP) Human resources and payroll Accounting Customer relationship mgmt Data analytics Content distribution Software as a Service (SaaS) Infrastructure as a Service(IaaS)

14. Let’s rethink how we provide research IT Accelerate discovery and innovation worldwide by providing research IT as a service Leverage software-as-a-service to provide millions of researchers with unprecedented access to powerful tools; enable a massive shortening of cycle times in time-consuming research processes; and reduce research IT costs dramatically via economies of scale so long, computer overlords

16. Publish papers

17. Find, configure, install relevant software

18. Find, access, analyze relevant data

19. Order supplies

20. Write proposals

21. Write reports

23. Publish papers

24. Find, configure, install relevant software

25. Find, access, analyze relevant data

26. Order supplies

27. Write proposals

28. Write reports

30. Data movement can be surprisingly difficult Discover endpoints, determine available protocols, negotiate firewalls, configure software, manage space, determine required credentials, configure protocols, detect and respond to failures, determine expected performance, determine actual performance, identify diagnose and correct network misconfigurations, integrate with file systems, … It took 2 weeks and much help from many people to move 10 TB between California and Tennessee. (2007 BES report) B A
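One item in the list above, "detect and respond to failures", is the kind of logic researchers often end up writing by hand. The sketch below shows a generic retry loop with exponential backoff. It is not how Globus Online implements fault recovery (the slides do not say); `transfer_with_retry` and `flaky_transfer` are hypothetical names used only for illustration.

```python
import time


def transfer_with_retry(transfer_once, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Call transfer_once() until it succeeds or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transfer_once()
        except OSError as err:  # ConnectionError and TimeoutError are OSError subclasses
            if attempt == max_attempts:
                raise  # give up and surface the last failure
            delay = base_delay * 2 ** (attempt - 1)  # double the wait each time
            print(f"attempt {attempt} failed ({err}); retrying in {delay:.2f}s")
            sleep(delay)


# Hypothetical stand-in for a real transfer: fails twice, then succeeds.
attempts = {"count": 0}


def flaky_transfer():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("simulated network fault")
    return "transfer complete"


result = transfer_with_retry(flaky_transfer)
print(result)
```

Even this small loop ignores most of the list (credentials, firewalls, space management, performance tuning), which is the slide's point about how much hidden work a "simple" transfer involves.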

31. Globus Online’s SaaS/Web 2.0 architecture Command line interface: ls alcf#dtn:/ and scp alcf#dtn:/myfile nersc#dtn:/myfile HTTP REST interface: POST https://transfer.api.globusonline.org/v0.10/transfer <transfer-doc> Web interface (Operate) Fire-and-forget data movement Automatic fault recovery High performance No client software install Across multiple security domains (Hosted on) GridFTP servers FTP servers Other protocols: HTTP, WebDAV, SRM, … Globus Connect on local computers
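To make the REST interface on this slide concrete, here is a hedged sketch that builds, but does not send, a POST request to the URL shown. The slide gives only the placeholder `<transfer-doc>`, so the JSON encoding and every field name in the document below are assumptions made for illustration, not the documented schema. Authentication is omitted entirely.

```python
import json
import urllib.request

# URL as shown on the slide.
API_URL = "https://transfer.api.globusonline.org/v0.10/transfer"


def build_transfer_request(source, destination, path_pairs):
    """Build (without sending) a POST request carrying a transfer document."""
    # Hypothetical field names: the slide does not show the real document format.
    transfer_doc = {
        "source_endpoint": source,
        "destination_endpoint": destination,
        "items": [
            {"source_path": src, "destination_path": dst}
            for src, dst in path_pairs
        ],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(transfer_doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Endpoint names taken from the slide's command-line example.
request = build_transfer_request("alcf#dtn", "nersc#dtn", [("/myfile", "/myfile")])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The request mirrors the command-line `scp` example: one source endpoint, one destination endpoint, and a list of path pairs, submitted once and then left to the service to complete.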

32. Example application: UC sequencing facility Mac using Globus Connect Delivery of data to customer iBi File Server Mount drive iBi general-purpose compute cluster Sequencing-specific compute cluster Sequencing instrument

33. Statistics and user feedback Launched November 2010 >1400 users registered >350 TB user data moved >28 million user files moved >140 endpoints registered Widely used on TeraGrid/XSEDE; other centers & facilities; internationally >20x faster than SCP Faster than hand-tuned “Last time I needed to fetch 100,000 files from NERSC, a graduate student babysat the process for a month.” “I expected to spend four weeks writing code to manage my data transfers; with Globus Online, I was up and running in five minutes.” “Globus Online’s speed has us planning experiments that we would never have considered previously.”

34. Moving 586 Terabytes in two weeks

35. Monitoring provides deep visibility

36. 20 Terabytes in less than one day 20 Gigabytes in more than two days (chart scale: Kilobyte, Megabyte, Gigabyte, Terabyte)
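The transfer figures quoted on slides 30, 34, and 36 are easier to compare as sustained rates. The sketch below converts each one to MB/s. It assumes decimal units (1 TB = 10^12 bytes), which the slides do not specify. It also treats "less than one day" as exactly one day and "more than two days" as exactly two, so those results are a lower and an upper bound respectively.

```python
TB = 10**12  # decimal terabyte (assumption; the slides do not say TB vs TiB)
GB = 10**9
DAY = 86_400  # seconds


def rate_mb_per_s(num_bytes, seconds):
    """Average sustained rate in megabytes per second."""
    return num_bytes / seconds / 10**6


transfers = {
    "10 TB in 2 weeks (slide 30)": (10 * TB, 14 * DAY),
    "586 TB in 2 weeks (slide 34)": (586 * TB, 14 * DAY),
    "20 TB in 1 day (slide 36)": (20 * TB, 1 * DAY),
    "20 GB in 2 days (slide 36)": (20 * GB, 2 * DAY),
}

for label, (num_bytes, seconds) in transfers.items():
    print(f"{label}: {rate_mb_per_s(num_bytes, seconds):.2f} MB/s")
```

Under these assumptions the hand-managed 10 TB transfer averages roughly 8 MB/s, while the 586 TB transfer averages several hundred MB/s over the same two-week span.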

37. Common research data management steps Dark Energy Survey Galaxy genomics LIGO observatory SBGrid structural biology consortium NCAR climate data applications Land use change; economics

38. We have choices of where to compute Campus systems First target for many researchers XSEDE supercomputers 220,000 cores, peer-reviewed awards Optimized for scientific computing Open Science Grid 60,000 cores; high throughput Commercial cloud providers Instant access for small tasks Expensive for big projects Users insist that they need everything connected

39. Towards “research IT as a service”

40. Research data management as a service GO-User Credentials and other profile information GO-Transfer Data movement GO-Team Group membership GO-Collaborate Connect to collaborative tools: Jira, Confluence, … GO-Store Access to campus, cloud, XSEDE storage GO-Catalog On-demand metadata catalogs GO-Compute Access to computers GO-Galaxy Share, create, run workflows Today Prototype Fall

41. SaaS services in action: The XSEDE vision XUAS

42. Data analysis as a service: Early steps Securely and reliably: Assemble code Find computers Deploy code Run program Access data Store data Record workflow Reuse workflow [7, 8] [1, 2] We have built such systems for biological, environmental, and economics researchers VM image App code Workflow Galaxy Condor [3, 4] [5, 6] Data store

43. SaaS economics: A quick tutorial Lower per-user cost (x10?) via aggregation onto common infrastructure: $400M/yr to $40M/yr? Initial “cost trough” due to fixed costs Per-user revenue permits positive return to scale Further reduce per-user cost over time (chart axes: $, Time) x10 reduction in per-user cost: $50K to $5K/yr per lab; $300K to $30K/yr per project
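The per-lab and per-project figures on this slide follow from slide 9's estimates (0.5 FTE per lab, 3 FTE per project, ~$100K per FTE). The sketch below applies the hypothesized x10 reduction to them. The factor is the slide's own guess (note the question mark), and the names are illustrative.

```python
FTE_COST_USD = 100_000
REDUCTION_FACTOR = 10  # the slide's "x10?": a hypothesis, not a measured value

# Counts and unmet FTE per unit come from slide 9.
units = {
    "lab": {"count": 5_000, "unmet_fte": 0.5},
    "project": {"count": 500, "unmet_fte": 3.0},
}


def per_unit_costs(unit, factor):
    """Return (current, reduced) annual cost in USD for one lab or project."""
    current = unit["unmet_fte"] * FTE_COST_USD
    return current, current / factor


total_before = 0.0
total_after = 0.0
for name, unit in units.items():
    before, after = per_unit_costs(unit, REDUCTION_FACTOR)
    total_before += unit["count"] * before
    total_after += unit["count"] * after
    print(f"per {name}: ${before / 1e3:.0f}K/yr, reduced to ${after / 1e3:.0f}K/yr")

print(f"national total: ${total_before / 1e6:.0f}M/yr, reduced to ${total_after / 1e6:.0f}M/yr")
```

This covers only the steady-state comparison; the initial "cost trough" from fixed costs is not modeled because the slide gives no numbers for it.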

44. A national cyberinfrastructure strategy? To provide more capability for more people at less cost … Create infrastructure Robust and universal Economies of scale Positive returns to scale Via the creative use of Aggregation (“cloud”) Federation (“grid”) (diagram: small and medium laboratories and projects served by SaaS) Research data management Collaboration, computation Research administration

45. Acknowledgments Colleagues at UChicago and Argonne Steve Tuecke, Ravi Madduri, Kyle Chard, Tanu Malik, Michael Russell, Paul Dave, Stuart Martin, Dan Katz, and many others Colleagues at other institutions Carl Kesselman, Miron Livny, John Towns, and others NSF OCI, MPS, and SBE; DOE ASCR; and NIH for support

46. For more information Foster, I. Globus Online: Accelerating and democratizing science through cloud-based services. IEEE Internet Computing (May/June):70-73, 2011. Allen, B., Bresnahan, J., Childers, L., Foster, I., Kandaswamy, G., Kettimuthu, R., Kordas, J., Link, M., Martin, S., Pickett, K. and Tuecke, S. Globus Online: Radical Simplification of Data Movement via SaaS. Communications of the ACM, 2011.

47. Thank you! foster@uchicago.edu
