Informational Meeting

Dear student,

Today, as never before, biomedical research is generating massive amounts of data, whose analysis and interpretation have the potential to produce dramatic advances in our knowledge of human health and in our quality of life. Analyzing these massive data sets (“Big Data”) requires techniques that combine knowledge of Biology, Chemistry, Statistics, Computer Science, and other areas.

There is a possibility that NIH will approve a proposal submitted by a group of professors from the Faculty of Natural Sciences at UPR-Rio Piedras to train students from different majors in biomedical research using large amounts of data (“Big Data to Knowledge”, BD2K). These students would take a sequence of courses that depends on their home major, as well as courses focused on the management and analysis of “Biomedical Big Data”. The best students in this group will carry out internships in national laboratories funded by NIH.

If you are a student in the Faculty of Natural Sciences and this challenge interests you:

we invite you to an informational meeting on August 10 and 12, 2015, at noon in amphitheater A-211.

At this meeting we hope to form two groups of students. The first will consist of students beginning their 2nd, 3rd, or 4th year who are advanced in their studies and can join the program as a pilot group. The second will consist of 1st- to 3rd-year students who can take the necessary courses in order to join the Program next year.

Computer Science

Pérez-Hernández, María-Eglée – Bayesian Biostatistics and its Applications in Life Sciences

Dr. Pérez-Hernández is currently involved in the “Biostatistics, Epidemiology and Bioinformatics BEBiC Core” of the U54 Collaborative 5-year Grant between UPR and MD Anderson Cancer Center, where she is collaborating with Drs. Pericchi and Ortiz-Zuazaga. She is also collaborating with Dr. Acevedo on the development of Bayesian epidemiological models based on internet search information (Google Flu).

Dr. Pérez-Hernández has made contributions to Bayesian Statistics, especially to Bayesian Robustness and Objective Bayesian Methods. She has a long history of successful interdisciplinary work with researchers in biomedical sciences and ecology, including statistical support for the development of rotavirus vaccines and for studies on Helicobacter pylori.

Pericchi, Luis – Bayesian Statistics in Cancer, Cardiovascular Disease and Health Econometrics

Dr. Pericchi currently has three long-term projects that involve big data from Puerto Rico and that require exploratory data analysis, modeling, inference and prediction. He is the Co-PI of the “Biostatistics, Epidemiology and Bioinformatics BEBiC Core” of the U54 Collaborative 5-year Grant between the University of Puerto Rico and MD Anderson Cancer Center. He is collaborating with Drs. Pérez-Hernández and Ortiz-Zuazaga and directing students in the search for predictive models of prostate cancer severity that involve over 800 patients and around 600 potential explanatory variables. Another aspect of his cancer-related research deals with the design of multidimensional engineering experiments for cancer treatments that are alternatives to radiotherapy and chemotherapy; these experiments give rise to response surfaces in several dimensions. Regarding heart disease and stroke, he has worked with the School of Medicine Endowed Health Services Research Center, and a database of cardiovascular diseases in Puerto Rico was established with several possible explanatory variables, giving rise to several potential data science projects. Regarding health econometrics and related fields, Dr. Pericchi has been directing projects to capture large volumes of information on credit behavior in Puerto Rico, as well as its modeling.

Dr. Pericchi has a long track record in different aspects of Bayesian Statistics, especially in Foundations of Decision Theory, Model Selection, Bayesian Robustness, Bayesian Treatment of Conflicting Evidence, and Applications to Statistics of Extremes, Detection of Fraud, Medical Diagnoses and Clinical Trials. He is an elected member of the International Society for Bayesian Analysis (ISBA) and the current president of its Section of Objective Bayes.

Ortiz-Zuazaga, Humberto G. – Bioinformatics of Gene Expression

Dr. Ortiz-Zuazaga has developed novel methods for measuring gene expression from microarray and second-generation sequencing data, and for determining regulatory gene networks from these data. He has already established successful collaborations with scientists in biomedical research using Big Data. Under this award, he will continue to grow these research collaborations, bringing his quantitative and algorithmic skills to bear on novel biomedical problems. Thanks to his experience in multiple fields, Dr. Ortiz-Zuazaga is uniquely qualified to abstract the basic algorithmic challenges in many biological problems, and can help translate biological questions into data analysis algorithms. Students in his lab will adapt probabilistic data structures to the task of detecting differential gene expression in de novo RNA-seq experiments, and use these and other data sets to model gene regulatory networks using bioinformatic and statistical methods.

Dr. Ortiz-Zuazaga brings to the project extensive experience in computational biology, ranging from data analysis to modeling, simulation, and visualization.

Ordóñez, Patricia – Visualization, Machine Learning, and Biomedical Informatics Education

Professor Patricia Ordóñez has been developing a real-time visualization for intensive care unit data for over 7 years. She will be working with the MIT Laboratory of Computational Physiology during a summer sabbatical in 2015 to incorporate her visualization into their soon-to-be publicly available database of streaming physiological data. As part of this grant, she envisions working with Dr. Harry Hochheiser at the University of Pittsburgh on the development and assessment of this project. She would like to incorporate his research on time boxes for univariate time series into multivariate time series of vital sign data. He would serve as a mentor in this project to improve the user experience.

Patricia Ordóñez is the founder of the Symposium of Health Informatics in Latin America and the Caribbean (SHILAC), which began in 2013 with an emphasis on defining common health care problems in LAC and finding innovative informatics solutions. The second SHILAC, accompanied by the first Hacking Medicine in the Caribbean, will take place in November 2015 in San Juan. Her contacts among leaders in biomedical informatics in Latin America and the Caribbean will serve as mentors for faculty at UPR-RP. Her expertise in applying visualization and machine learning to multivariate time series to develop clinical decision systems makes her an ideal candidate for the program, since she is attempting to incorporate her research into streaming databases.

Massey, Steve E. – Meta-metabolomic Network Analysis of Metagenomic Data from Diverse Habitats from Around the World

Dr. Massey has been developing methods to assess metabolic flux through a microbial community from shotgun metagenomic data, by reconstructing ‘meta-metabolomic networks’ that show the relative abundance of genes encoding enzymes involved in the different metabolic pathways present. The approach involves large-scale BLAST searching of millions of individual sequences using grid computing, assignment of metabolic function to the identified sequence homologs, calculation of relative redundancy from the dataset, and calculation of overall flux using the kinetic rate constants of reference enzymes taken from the literature. The overall aim of this project is to assess differences in carbon flux among diverse habitats around the world, with an emphasis on methanogenesis. Data will be obtained from the MG-RAST database and selected for variation in latitude, temperature, and aerobicity. Students will learn a range of command-line-driven techniques for conducting both local and remote analyses, and will learn how to manage and parse very large data sets.

Dr. Massey is a bioinformatician with a wide range of interests, including genome evolution, metagenomics, organismal complexity, genetic code evolution, evolutionary medicine and ancient DNA.

Koutis, Yiannis – Algorithm Development for Image Segmentation

Dr. Koutis and his former MS student Richard Garcia-Lebron have developed new optimization-based methods for semi-automatic segmentation of neurons in EM images. These methods produce segmentations whose quality comes close to that of human experts. They require very little human intervention and complete the segmentation in a small fraction of the time needed for manual segmentation. At the heart of these algorithms are recently discovered solvers and optimization techniques to which Dr. Koutis has been a key contributor. This ongoing project offers many opportunities for undergraduate students with different sets of skills and interests and at various levels. Conversely, the contribution of undergraduate students benefits the project, as it can provide lower-level support for more advanced students and a stream of potential contributors to the larger field of Connectomics, under the auspices of NIH’s BRAIN initiative.

Dr. Koutis brings to the project knowledge in theoretical computer science, with expertise in spectral graph theory, numerical linear algebra and parameterized algorithms for hard combinatorial problems.

Godoy-Vitorino, Filipa – Metagenomics of Microbe-Human Interactions

Dr. Filipa Godoy-Vitorino is an Associate Professor at the Department of Natural Sciences, Interamerican University Metropolitan Campus, and heads the Laboratory of Microbial Ecology and Genomics (MEGL). Her lab uses microbiome data (16S and ITS profiles and shotgun metagenomics) to study ecosystem functions and microbe-host interactions in humans, plants and animals. She integrates DNA sequence data (high-throughput sequencing) with ecology, physiology and bioinformatics. Currently, having nearly exclusive research duties, she is developing different microbiome projects in natural environments, including a study of the association between microbiota and cervical HPV infections in Latinas.

Dr. Godoy-Vitorino brings to the project extensive expertise in microbial community analyses using state-of-the-art pipelines, as well as in assembly, annotation and binning of microbial metagenomic data for gene and enzymatic pathway inference.

Garcia-Arrarás, Jose E. – Gene Profiling of Regeneration Processes

Dr. Garcia-Arrarás has pioneered the use of the echinoderm Holothuria glaberrima to study the process of regeneration and organogenesis. His research focuses on the molecular aspects of organ regeneration, specifically on the genes that are important for intestinal and nervous system regeneration to occur. His lab has generated an expressed sequence tag (EST) database for H. glaberrima sequences obtained from various transcriptomic studies that include normal nervous tissue, normal intestine, and regenerating nervous tissue and intestine at different regenerative stages. Their work is aimed at finding different profiles of gene expression and at determining the function of specific genes during the process of regeneration. Students will be involved in bioinformatics analyses to determine gene sequences, structural domains and gene characterization. In addition, the database will be analyzed to characterize the genetic profiles of nervous-tissue-specific gene sequence expression, intestine-specific expression and/or stage-specific profiles.

In addition to the field of Regeneration, Dr. Garcia-Arrarás brings to the project extensive biomedical knowledge in various fields that include Developmental Biology, Neuroscience, Physiology, Immunology and Anatomy.

Conde, José G. – Population Studies Based on Publicly Available Data Sources

Dr. Conde is working with multiple-cause mortality files for the United States (about 2.5 million records per year for years 2005 to 2013) and its territories (about 30,000 records per year for years 2005 to 2013), which are available from the CDC’s National Center for Health Statistics (NCHS). His research focuses on premature mortality in various populations; on multiple-cause mortality analysis of multiple diseases, including systemic lupus erythematosus; and (in collaboration with Dr. Ortiz-Zuazaga) on applying new tools to visualize the association of comorbid conditions with underlying causes of death. Thus, he is familiar with the structure of NCHS mortality files, ICD-10 coding systems, and mortality data collection and recode procedures.

Dr. Conde brings to the project his expertise in Medicine, Public Health and Epidemiology, in addition to more than 20 years of experience in biomedical informatics projects and infrastructure.