EpiBench Project Initialization at Washington University

Project Kickoff Meeting

This note documents the initial setup and planning for the EpiBench project at Washington University. Last updated: 2024-09-10

Project Introduction

On September 3, 2024, I began my position as a Bioinformatics Research Assistant in the Division of Medical Oncology at Washington University in St. Louis. One of my primary responsibilities is to continue developing EpiBench, an integrated platform for epigenetic analysis that combines machine learning with biological interpretability.

Initial Meetings and Planning

During the first week, I had several key meetings:

  • Onboarding with HR (September 3)
  • Lab introduction with Dr. Spencer and team members (September 4)
  • Project planning meeting to discuss EpiBench goals and timeline (September 6)
  • Technical infrastructure discussion with the IT and computational resources teams (September 7)

Initial Project Goals

Based on discussions with the research team, we’ve established the following initial goals for EpiBench development at Washington University:

  1. Infrastructure Setup (September 2024)

    • Configure development environment on WashU compute cluster
    • Set up version control and documentation workflow
    • Define coding standards and collaboration protocols

  2. Data Integration (October 2024)

    • Integrate specific datasets available at WashU
    • Establish data pipelines for ongoing projects
    • Create data preprocessing workflows for lab-specific protocols

  3. Algorithm Enhancement (November 2024)

    • Review and optimize existing machine learning models
    • Implement new model architectures for specific epigenetic features
    • Develop specialized analysis for cancer-specific epigenetic patterns

Technical Environment

  • Computing Resources: Access to WashU Research Computing Cluster (RCC)
  • Storage: 10 TB allocated for project data and results
  • Development Tools: GitLab Enterprise, JupyterHub, and RStudio Server
  • Software Stack: Python 3.10, PyTorch 2.0, Bioconductor tools, custom WashU pipelines
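
As a starting point for the environment setup, the software stack listed above could be checked with a short script like the one below. This is only a sketch under assumptions: the function names (`version_matches`, `check_environment`) are hypothetical, it assumes PyTorch is installed under its usual distribution name `torch`, and it does not cover Bioconductor or the custom WashU pipelines, which would need their own checks.

```python
import sys
from importlib import metadata

# Expected versions, taken from the Software Stack line above.
EXPECTED_PYTHON = (3, 10)
EXPECTED_PACKAGES = {"torch": "2.0"}  # assumption: PyTorch is installed as "torch"


def version_matches(installed: str, expected_prefix: str) -> bool:
    """Return True if `installed` equals or extends `expected_prefix`.

    For example, "2.0.1+cu118" matches "2.0", while "2.1.0" does not.
    """
    # Drop any local build tag such as "+cu118" before comparing components.
    installed_parts = installed.split("+")[0].split(".")
    expected_parts = expected_prefix.split(".")
    return installed_parts[: len(expected_parts)] == expected_parts


def check_environment() -> list[str]:
    """Return a list of human-readable mismatches (empty if none were found)."""
    problems = []
    if sys.version_info[:2] != EXPECTED_PYTHON:
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in EXPECTED_PYTHON)
        problems.append(f"Python {found} found, expected {wanted}")
    for name, expected in EXPECTED_PACKAGES.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed (expected {expected}.x)")
            continue
        if not version_matches(installed, expected):
            problems.append(f"{name} {installed} found, expected {expected}.x")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["Environment matches the documented stack."]:
        print(line)
```

Running the script on a cluster node prints one line per mismatch, which could be pasted into the Notes and Observations section while the environment is being configured.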

Next Steps

[ ] Complete setup of development environment on RCC
[ ] Schedule regular progress meetings with team members
[ ] Identify key datasets for initial testing
[ ] Begin documentation of WashU-specific workflows
[ ] Design initial experiments for model training and evaluation

Research Team

  • Dr. David Spencer - Principal Investigator
  • [Team Member 1] - Postdoctoral Researcher, Genomics
  • [Team Member 2] - PhD Student, Machine Learning
  • [Team Member 3] - Staff Scientist, Bioinformatics
  • Andrew Bonney - Bioinformatics Research Assistant

Notes and Observations

This section will be filled in with ongoing observations, challenges, and insights throughout the project initialization phase.


This document will be updated regularly as the project develops. For background on EpiBench, see the Introduction to EpiBench.