Healthcare Innovation Replicathon 2017 and Data Carpentry Instructor Training

Tracy K Teal giving the keynote address at the Healthcare Innovation Replicathon.

Our Healthcare Innovation Replicathon and Data Carpentry Instructor Training events were a success! Students and faculty from many campuses and departments met on March 24-25, 2017 at the Engine-4 co-working space in Bayamón, Puerto Rico. Students took part in the 36-hour Healthcare Innovation Replicathon, led by Alejandro Reyes and Keegan Korthauer from Rafael Irizarry’s rafalab (Dana Farber Cancer Center & Harvard), Patricia Ordoñez (UPR RP Computer Science), and Phillip Brooks from Titus Brown’s Lab for Data Intensive Biology (UC Davis). Students took brief tutorials on R, reproducible research, and statistical analysis, then dove in to examine two studies on pharmacogenomics in cancer cell lines.

Keegan Korthauer describing the pharmacogenomics problem studied in the Healthcare Innovation Replicathon.

Interdisciplinary (and inter-campus) teams of students worked for 24 hours on a re-analysis of data from two large-scale studies of the effects of 15 drugs on 240 cancer cell lines. On Saturday they presented their findings, which included errors in the published figures, recommendations for improved measures of drug effects, and lists of drugs for follow-up studies.

The Healthcare Innovation Replicathon wasn’t just about data, statistics, programs and cancer. Students, mentors, sponsors, and faculty got a chance to interact in a welcoming environment, with good food, top-notch Internet, and even a guitar or two.

Phillip Brooks (standing, tweeting about the event) and Alejandro Reyes (seated, on guitar duty) entertaining the participants in the Healthcare Innovation Replicathon.
Students from competing teams set aside their differences to make some music.
Final presentations by students at the Healthcare Innovation Replicathon (photo credit K. Korthauer)

Meanwhile, faculty from the Interamerican University Bayamon Campus and the UPR Humacao, Mayaguez, and Rio Piedras campuses, along with participants from private industry, went through Data Carpentry Instructor Training led by Rayna Harris (UT Austin), Sue McClatchy (The Jackson Laboratory), and Tracy Teal (Data Carpentry). Data Carpentry Instructor Training presents instructors with research-based best practices for teaching data science to novices. Stay tuned for announcements of new Data Carpentry workshops taught by some of the new instructors.

A dozen instructors getting trained as part of Data Carpentry Instructor Training.

The IDI-BD2K project would like to thank all the sponsors that made this event a success: VarMed Management, PR-INBRE, the National Institutes of Health, the University of Puerto Rico, the Lab for Data Intensive Biology at the University of California, Davis, rafalab at Harvard University, AbartysHealth, e3 consulting, Data Carpentry, Engine-4, and the UPR Rio Piedras Department of Computer Science.

Patricia Ordoñez, one of the IDI-BD2K Principal Investigators, thanking some of our sponsors at the Healthcare Innovation Replicathon.

The Healthcare Innovation Replicathon Starts Tomorrow

What is a Replicathon?

A replicathon, similar to a hackathon, is 36 continuous hours of analytical and programming work to create real solutions using technology. Unlike a traditional hackathon, all teams receive the same challenge or problem: two scientific manuscripts that reached two different results using the same data. The goal is for each team of participants to interpret the data and present its conclusions. In a hackathon, the solution normally takes the form of an app (a mobile or web application). The replicathon requires interdisciplinary collaboration among experts in programming, data analysis, and the subject domain (genomics in this case).

Students can register to participate here:

Replicathon Problem

Rafael Irizarry, IDI-BD2K partner, has submitted this problem for the Healthcare Innovation Replicathon.

In 2012, two studies (Garnett et al. and Barretina et al.) attempted to correlate large numbers of gene expression, mutation, and copy number measurements in hundreds of cancer cell lines with sensitivities to hundreds of different drugs. The goal was to find genes or mutations that might indicate that certain kinds of cancer are vulnerable to specific drugs. However, a subsequent study (Haibe-Kains et al. 2013) that attempted to replicate the initial findings found major inconsistencies between the results of the two studies. We can review the papers, download the data, and analyze it ourselves to form our own conclusions.
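One simple way to start probing whether two studies agree is to compare their sensitivity measurements for a single drug across the cell lines both studies tested. The sketch below is only an illustration under stated assumptions: the cell line names and sensitivity scores are made up, the function names are hypothetical, and Spearman rank correlation is just one possible agreement measure, not necessarily the one used in the papers or by the Replicathon teams.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def cross_study_agreement(study_a, study_b):
    """Spearman correlation over the cell lines measured in both studies."""
    shared = sorted(set(study_a) & set(study_b))
    if len(shared) < 3:
        raise ValueError("need at least 3 shared cell lines")
    return spearman([study_a[c] for c in shared], [study_b[c] for c in shared])


# Made-up sensitivity scores for one hypothetical drug (not real data).
study_a = {"line1": 0.10, "line2": 0.35, "line3": 0.50, "line4": 0.80, "line5": 0.90}
study_b = {"line1": 0.20, "line2": 0.15, "line3": 0.60, "line4": 0.70, "line6": 0.40}

rho = cross_study_agreement(study_a, study_b)
print(f"Spearman rho across shared cell lines: {rho:.2f}")
```

Only cell lines present in both dictionaries enter the comparison, so lines tested by a single study are ignored. A rank-based measure is used here because it does not assume the two studies report sensitivity on the same scale.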

Readings for project:

  1. Haibe-Kains, B. et al. Inconsistency in large pharmacogenomic studies. Nature 504, 389–393 (2013).
  2. Barretina, J. et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603–607 (2012).
  3. Garnett, M. J. et al. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature 483, 570–575 (2012).
  4. Safikhani, Z. et al. Assessment of pharmacogenomic agreement. bioRxiv 48470 (2016). doi:10.1101/048470
  5. Smirnov, P. et al. PharmacoGx: an R package for analysis of large pharmacogenomic datasets. Bioinformatics 32, 1244–1246 (2016).

Replicathon registration open

The registration for the Varmed Management Group and IDI-BD2K Healthcare Innovation Replicathon is open. Join us March 24-25, 2017 for this event. Mentors from the University of Puerto Rico, Harvard, University of California Davis, Massachusetts Institute of Technology and more will guide groups of students to examine the issues of replicability in a set of experiments asking the same question and obtaining different answers.

