Research Advisor - Computational Genomics

Eli Lilly · Pharma · Boston, MA +2

This role focuses on building and maintaining production genomics infrastructure for RNA drug discovery, including data pipelines that run from raw sequencer output to DataLake ingestion. It also involves contributing to ML models that predict pharmacology outcomes from sequence design, and collaborating with enterprise AI teams to integrate AI/ML capabilities into RNA-specific tools. The role requires expertise in bioinformatics, high-throughput NGS data analysis, and programming; exposure to AI/ML methods applied to biological sequence data is a plus.

What you'd actually do

  1. Build and maintain production NGS analysis pipelines scaled for high-throughput RNA drug discovery, including bulk RNA-seq, smRNA-seq, SHAPE-seq, and eCLIP-seq assays
  2. Own the full data path from sequencer output through QC to DataLake ingestion, ensuring data integrity, reproducibility, and accessibility for downstream analysis, and delivering ML-ready data
  3. Design and execute analytical strategies for spatial transcriptomics and single-cell sequencing to resolve cellular heterogeneity and tissue-level biology relevant to RNA therapeutics
  4. Support and contribute to ML model development connecting RNA sequence design and chemistry to pharmacology outcomes, including potency and off-target liability prediction
  5. Partner with enterprise AI, Statistics, and Tech@Lilly teams to translate platform-level AI/ML capabilities into RNA-specific tooling and reusable analytical assets

Skills

Required

  • PhD degree in bioinformatics, computational biology, computational genomics, integrated biomedical sciences, or a related field
  • Demonstrated expertise with standard bioinformatics tools, pipelines, and databases for genomic analysis
  • Strong programming skills in Python, R, or similar languages
  • Experience with high-throughput, high-dimensional NGS data analysis and interpretation
  • Excellent written and oral communication skills with ability to present complex data to diverse audiences
  • Demonstrated ability to work collaboratively in cross-functional team environments
  • Self-directed and highly motivated individual with strong learning agility
  • Extensive experience developing scientific solutions using Python, R, or similar languages
  • Experience working with High-Performance Computing and/or cloud environments
  • Experience with workflow orchestration platforms such as Seqera Platform (Nextflow Tower) and community-maintained pipelines (e.g., nf-core/rnaseq) for scalable, reproducible analysis
  • Hands-on experience with RNA-seq analysis tools for alignment, gene expression quantification, and pathway analysis
  • Hands-on experience with spatial transcriptomics (e.g., Visium, Xenium) and single-cell RNA-seq analysis (e.g., Seurat, Scanpy)

Nice to have

  • Exposure to AI/ML methods applied to biological sequence data or drug discovery (e.g., regression, deep learning, generative models for sequences)
  • Deep understanding of nucleic acid, cellular, and molecular biology; familiarity with RNA therapeutics biology is a strong plus
  • Demonstrated knowledge of genetics and molecular biology, particularly as they relate to RNA
  • Track record of peer-reviewed publications in bioinformatics, computational biology, or a closely related field
  • Experience in algorithm development, statistics, data management, data mining, data visualization, and/or analytics

What the JD emphasized

  • production genomics infrastructure
  • ML models
  • RNA drug discovery
  • AI/ML capabilities

Other signals

  • build, scale and maintain the production genomics infrastructure
  • contributing to ML models that link sequence design to pharmacology outcomes
  • Partner with enterprise AI, Statistics, and Tech@Lilly teams to translate platform-level AI/ML capabilities into RNA-specific tooling