I am a computational biologist dedicated to enabling biological discoveries through the development of computational tools across diverse scientific disciplines, with a primary focus on rare disease. In my graduate and postdoctoral training I studied the diverse landscape of gene, RNA and protein regulation in cells through the integration of high throughput datasets derived from proteomic, transcriptomic, epigenetic and genetic experiments. As the generation of large scientific datasets has expanded beyond the reach of a single scientist or lab, I have worked to bring scientists together across labs to form new partnerships and collaborations, primarily in cancer.
My current work focuses on developing novel integrative approaches to study diverse data modalities and on making these approaches accessible to the broader scientific public, with a specific focus on rare disease. This includes improving computational approaches (Gosline et al., 2016; Banerjee et al., 2020) and building software packages and data resources to share the results in a meaningful manner (Tuncbag et al., 2016; Allaway et al., 2019).
Cellular networks are composed of DNA, RNA, proteins and metabolites that cooperate to affect the biological function of a particular cell. However, any single biological experiment captures only one segment of this larger network, making it difficult to infer the broader biological context of a single gene or protein alteration. Network algorithms that explicitly model interactions between proteins, metabolites, DNA and RNA are therefore better able to identify relationships between these molecules and to support biological conclusions.
My computational research focuses on applying network algorithms and other approaches to analyze high throughput data in conditions where low sample size confounds more general statistical approaches. For example, network algorithms have been useful in characterizing biological activity in cases where many data modalities have been captured for very few samples. Specifically, I have used gene expression changes, measured by RNA-Seq, as a signal of upstream cellular perturbations, measured by genetic hits (Gosline et al., 2012; Gosline et al., 2015), proteomic data (Huang et al., 2013) or non-coding RNA alterations (Tuncbag, Gosline et al., 2016). These network algorithms have proven robust at small sample sizes, where correlative analyses are not feasible. More recently, I have employed dimensionality-reduction strategies to improve gene expression interpretation in rare nerve sheath tumors (Banerjee et al., 2020).
Systems biology approaches to rare disease
With fewer patients and model systems, rare diseases are not often studied in a systems biology context. This paucity of data in turn reinforces the absence of comprehensive research in these areas. My experience with integrative computational methods, systems approaches to cell biology, and scientific outreach equips me well to work in the rare disease space.
Specifically, I have worked with dozens of researchers in the field of neurofibromatosis to harmonize studies across the field, applying a systems approach to uncover the molecular etiology of NF1-related symptoms and to identify potential drug treatments. This work has culminated in the development of a disease-centric data resource (Allaway et al., 2019) that collates diverse streams of research into a single landing page, as well as in-depth resource generation (Gosline et al., 2017; Ferrer et al., 2018; Pollard et al., 2020). These data resources have fueled the application of the integrative algorithms described above.
Strengthening research communities
Research interests aside, one of the most exciting aspects of science is how rapidly the process and community change and adapt to new technologies, including our connectivity across the globe. I have been lucky to participate in many ‘experiments’ that push the limits of the scientific method in various ways:
1- Online course development: I developed content alongside the MITx and edX platform teams to teach R and Python to biologists. https://www.edx.org/course/quantitative-biology-workshop-mitx-7-qbwx.
2- Community building: I worked with the NCI through the Integrated Cancer Biology Program (ICBP) Junior Investigator program to plan annual Junior Investigator meetings from 2012-2014 to build a community of cancer systems biologists. https://icbp.nci.nih.gov/education-and-outreach/jr_investigators/junior-investigators-meetings. In addition to the meeting described below, this community also founded its own society and Twitter handle (@cancersysbio).
3- Scientific meetings: One of the primary mandates of the Association for Cancer Systems Biologists is to run the biennial ‘Systems Approaches to Cancer Biology’ meeting, co-sponsored by the NCI. http://www.sacbmeeting.org. I have been on the organizing committee for the previous two meetings and we are actively planning the third!
4- Open-source tool development and building data repositories: As a computational biologist it is common to develop tools and let them languish without being used. To counter this, I have worked to release an open-source tool to help others integrate diverse high throughput datasets (Tuncbag, Gosline et al., 2016) and to dockerize a tool that predicts cell type from single-cell RNA-seq data (Liu et al., 2020).
5- Hackathons: Most recently I worked with others to run the 2nd NF Hackathon focused on data from the NF Data Portal managed by the Rare Disease group at Sage Bionetworks. It was an amazing opportunity and one that will continue as the NF community grows!