Discovering Novel RNA Isoforms in Mouse Brown Fat

My bioinformatics pipeline and technical demo for uncovering new RNA transcript variants with long-read RNA sequencing, which also doubles as a learning tool.

Project Summary

This project analyzes a nanopore long-read RNA-seq dataset from murine brown adipose tissue using a reproducible FLAIR-based pipeline. I identify both known and previously unannotated (novel) isoforms and quantify their expression. The project also serves as my personal introduction to bioinformatics, focusing on basic data analysis and pipeline building.

Dataset

- Source: NCBI SRA
- ID and Link: SRR33470049
- Technology: Oxford Nanopore long-read sequencing (on GridION)
- Tissue: Mouse brown fat
- Reference/Annotations: Ensembl FTP

Pipeline Overview

Results

These results highlight previously unannotated RNA isoform diversity in metabolically active brown fat tissue, with some admitted limitations.

[Figure: Top Genes with Most Novel Isoforms (novel isoform counts by gene)]
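
A per-gene ranking like the one named in the figure title above could be computed roughly as follows. This is a minimal sketch, not the notebook's actual code. It assumes isoform IDs of the form `<transcript-or-read-id>_<gene-id>` and treats an isoform as novel when the first part does not start with an Ensembl mouse transcript prefix (`ENSMUST`). The function name `top_novel_isoform_genes`, its parameters, and the example IDs are all made up for illustration.

```python
from collections import Counter


def top_novel_isoform_genes(isoform_ids, known_prefix="ENSMUST", top_n=10):
    """Count novel isoforms per gene and return the top_n genes.

    Assumption: each ID looks like "<transcript-or-read-id>_<gene-id>",
    and annotated isoforms start with `known_prefix`.
    """
    counts = Counter()
    for isoform_id in isoform_ids:
        # Split on the LAST underscore so the gene ID is always the suffix.
        transcript_part, sep, gene_part = isoform_id.rpartition("_")
        if not sep:
            continue  # no gene ID attached; skip rather than guess
        if transcript_part.startswith(known_prefix):
            continue  # matches an annotated transcript, so not novel
        counts[gene_part] += 1
    return counts.most_common(top_n)


# Made-up IDs, purely for illustration (not taken from the dataset).
example_ids = [
    "ENSMUST99999999901_ENSMUSG99999999901",  # annotated isoform
    "read-a_ENSMUSG99999999901",              # novel
    "read-b_ENSMUSG99999999901",              # novel
    "read-c_ENSMUSG99999999902",              # novel
    "no-gene-part",                           # malformed, skipped
]

for gene, n_novel in top_novel_isoform_genes(example_ids, top_n=5):
    print(gene, n_novel)
```

If the real isoform IDs follow a different naming scheme, only the split and the prefix check would need to change; the counting step stays the same.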

Limitations

Why It Matters

Identifying novel isoforms can uncover new mechanisms of gene regulation, alternative splicing, and tissue-specific gene expression — especially in dynamic tissues like brown fat, which plays a role in energy metabolism.

Repository

Explore the code, data prep, and full Jupyter notebook for more in-depth information:

View on GitHub

About me!

I'm Thaddeus Lipke, a SWE and EMT-B certified grad from Columbia University who wanted to learn some bioinformatics and get more into Python.
Feel free to contact me if something needs clarification or correction (please do correct me), since I'd like to learn more! My contact info is on the GitHub repo for this project.
This project helped me learn genomics tools like FLAIR, minimap2, and samtools, and apply them to a real-world RNA sequencing problem with a mouse dataset.