BF528 - Applications in Translational Bioinformatics
Welcome to the homepage of BF528.
Semester: Spring 2025
Meeting time: Mon/Fri - 10:10-11:55am, Wed - 9:05-9:55am
Location:
Mon/Fri:CDS B62
Wed: CAS B20
Zoom: By request only
Office hours: By appointment
Wednesdays, 10-12pm LSEB 101
Course Objectives
Learn the molecular mechanisms and basic data analysis steps that underly common next-generation sequencing experiments
Develop proficiency in creating bioinformatics workflows with an emphasis on reproducibility and portability
Gain experience generating and interpreting bioinformatics analyses in a biological context
Below you will find a selection of some of the prominent biological and computational topics that will be covered in the course:
- High Throughput Sequencing Technologies (RNAseq, ChIPseq, scRNAseq) and various omics technologies (Proteomics, Metabolomics, etc)
- Computational Workflow Tools (snakemake, nextflow)
- Reproducibility and Replicability Tools (Git, Docker, Conda)
- Bioinformatics Databases and File Formats
Course Description
This course will expose students to modern bioinformatics studies with a specific focus on the analysis of next generation sequencing data. Lectures will cover a mix of both biological and computational topics necessary for the technical and conceptual understanding of current high-throughput genomics techniques. This will include brief discussions of the molecular mechanisms of the assays, basic data analysis workflows, and translating these results into biological conclusions.
Students will get hands-on experience developing computational workflows that perform an end-to-end analysis of sequencing data from ubiquitous NGS technologies including RNA-sequencing, ChIP-sequencing, and Single Cell RNA-sequencing. The course emphasizes the importance of reproducibility, and portability in modern bioinformatics.
Classes will be traditional lectures exploring the variety of topics regarding next generation sequencing and labs will focus on practical activities meant to develop experience working with the tools and technologies needed for the analysis and interpretation of sequencing data.
Course Values and Policies
Everyone is welcome. Every background, race, color, creed, religion, ethnic origin, age, sex, sexual orientation, gender identity, nationality is welcome and celebrated in this course. Everyone deserves respect, patience, and kindness. Disrespectful language, discrimination, or harassment of any kind are not tolerated, and may result in removal from class or the University. The instructors deem these principles to be inviolable human rights. Students should feel safe reporting any and all instances of discrimination or harassment to the instructor, to any of the Bioinformatics Program leadership, or the BU Equal Opportunity Office
Everyone brings value. Each of us brings unique experiences, skills, and creativity to this course. Our diversity is our greatest asset. Collaboration is highly encouraged. All students are encouraged to work together and seek out any and all available resources when completing projects in all aspects of the course, including sharing both ideas and code as well as those found on the internet. Any and all available resources may be brought to bear. However, consistent with BU policy, your reports should be written in your own words and represent your own work and understanding of the material.
Life happens. Your mental, physical and emotional health isfar more important than any class. Make sure to take care of yourself and reach out to someone you trust (mentor, family member, or friend) if you ever feel you need to talk to someone. BU offers a number of resources through Student Health Services for managing situations involving grief, anxiety and depression, stress, homesickness and other common issues. I am also always here to listen without judgment. On a related note, if you need to miss class because of private matters, you do not need to disclose anything you aren’t comfortable sharing, please just let me know and I will work with you to help you catch up when you return. Your family, friends, and health should always come first.
Prerequisites
Basic understanding of biology and genomics. Any of these courses are adequate prerequisites for this course: BF527, BE505/BE605. Students should have some experience programming in a modern programming language (R, python, C, Java, etc).
Working familiarity with Git and command line interfaces is also heavily recommended.
Instructor and TAs
Joey Orofino
Eric Palanques Tost
Xudong Han
Contact information available on Blackboard
My pledge to foster Diversity, Inclusion, Anti-racism
This course is a judgement free and anti-racist learning environment. Our cohort consists of students from a wide variety of social identities and life circumstances. Everyone will treat one another with respect and consideration at all times or be asked to leave the classroom.
As instructor, I pledge to:
- Learn and correctly pronounce everyone’s preferred name/nickname
- Use preferred pronouns for those who wish to indicate this to me/the class
- Work to accommodate/prevent language related challenges (for instance I will do my best to avoid the use of idioms and slang)
Projects Overview
- Project 0: Genome Analytics
- Project 1: RNAseq
- Project 2: ChIPseq
- Project 3: scRNAseq
Each project has been split into 4 weekly parts and by the end of each, you will have developed a complete pipeline that will process the data from end-to-end using state-of-the-art tools. In order to generate reproducible, and portable NGS analysis workflows, we will be employing a combination of technologies including Nextflow, git, Conda, Docker, and HPC.
Subsequent projects will gradually add more complexity and tasks once you’ve gained experience with the fundamentals. Simultaneously, the amount of scaffolding and direct instructions will also be reduced.
The data for each of the projects come from peer-reviewed published papers. Prior to your analysis, you will not be informed of the source and you will be asked to make some general conclusions and hypotheses from your results. In the 4th week of each project, we will reveal the original paper and you will compare how well you were able to reproduce the reported results and be asked to speculate on any observable differences. Please note that this is not intended to say one approach or analysis was “right” but to foster discussion on reproducibility in bioinformatics, why its challenging, and what we can do to ensure our own work is reproducible.
Project Grading
Project 0 will not be graded and is meant to serve as a gentle introduction to Nextflow, and the other technologies we will be utilizing throughout the semester.
Projects 1 and 2 will be graded based on your answers to weekly discussion questions as well as credit for completing the specified tasks on-time.
For project 3, you will be asked to develop your own pipeline to process scRNAseq data, using all of the principles and techniques we’ve utilized throughout the semester. We will provide minimal guidance and a list of analyses you will need to perform and questions you will need to address.
Generally speaking the grading breakdown will be as follows:
Project 1: 25% Project 2: 25% Project 3: 40% Lab Participation: 10%
Course Schedule
Day | Date | Week | Class | Topic | Project |
---|---|---|---|---|---|
Wed | 1/22 | 1 | Lecture | Introduction | |
Fri | 1/24 | 1 | Lecture + Lab |
Genomics, Genes, and Genomes Lab - Setup |
P0 - W1 |
Mon | 1/27 | 2 | Lecture |
Computational Pipeline Strategies SCC cluster usage |
|
Wed | 1/29 | 2 | Lab | Writing your own NF modules | |
Fri | 1/31 | 2 | Lecture |
Next Generation Sequencing Week 1 Review |
P0 - W2 |
Mon | 2/3 | 3 | Lecture | Sequence Analysis Fundamentals | |
Wed | 2/5 | 3 | Lab | HPC Cluster Usage | |
Fri | 2/7 | 3 | Lecture |
Genomic Variation and SNP Analysis Week 2 Review |
P0 - W3 |
Mon | 2/10 | 4 | Lecture | Genome Editing - CRISPR Cas9 | |
Wed | 2/12 | 4 | Lab | CRISPR Guide Selection | |
Fri | 2/14 | 4 | Lecture | Week3 Review | PO - W4 |
Mon | 2/17 | NO CLASS | |||
Tues | 2/18 | 5 | Lecture | Biological Databases | |
Wed | 2/19 | 5 | Lab | Project 0 Discussion Writing a methods section |
|
Fri | 2/21 | 5 | Lecture | Sequence Analysis - RNA-Seq 1 | P1 - W1 |
Mon | 2/24 | 6 | Lecture | Sequence Analysis - RNA-Seq 2 | |
Wed | 2/26 | 6 | Lab | Containers (Docker) | |
Fri | 2/28 | 6 | Lecture | Gene Sets and Enrichment | P1 - W2 |
Mon | 3/3 | 7 | Lecture |
Microbiome: 16s |
|
Wed | 3/5 | 7 | Lab | Sequencing QC Evaluation | |
Fri | 3/7 | 7 | Lecture | Microbiome: Metagenomics | P1 - W3 |
SPRING BREAK | |||||
Mon | 3/17 |
8 | Lecture | Proteomics | |
Wed | 3/18 | 8 | Lab | Splice-aware vs non-splice aware | |
Fri | 3/21 | 8 | Lecture | Metabolomics | P1 - W4 |
Mon | 3/24 | 9 | Lecture | Sequence Analysis - ChIP-Seq | |
Wed | 3/26 | 9 | Lab | Project 1 Discussion | |
Fri | 3/28 | 9 | Lecture | Sequence analysis - ATAC-Seq | P2 - W1 |
Mon | 3/31 | 10 | Lecture | Single Cell Analysis Part 1 | |
Wed | 4/2 | 10 | Lab | Using public databases | |
Fri | 4/4 | 10 | Lecture | Single Cell Analysis Part 2 | P2 - W2 |
Mon | 4/7 | 11 | Lecture | Single Cell Analysis Part 3 | |
Wed | 4/9 | 11 | Lab | Single Cell In-Class | |
Fri | 4/11 | 11 | Lecture | Single Cell In-Class | P2 - W3 |
Mon | 4/14 | 12 | Lecture | Single Cell In-Class | |
Wed | 4/16 | 12 | Lab | Single Cell In-Class | |
Fri | 4/18 | 12 | Lecture | Spatial Transcriptomics | P2 - W4 |
Mon | 4/21 | NO CLASS | |||
Wed | 4/23 | 13 | Project 2 Discussion | Final Project Assigned | |
Fri | 4/25 | 13 | Lab | Reproducibility Check | |
Mon | 4/28 | 14 | Lab | Using NF-Core pipelines | |
Wed | 4/30 | 14 | NO CLASS | Project Work | |
5/8 | Final Project Due |