Authors
Fairlie Reese, Brian Williams, Gabriela Balderrama-Gutierrez, Dana Wyman, Muhammed Hasan Çelik, Elisabeth Rebboah, Narges Rezaie, Diane Trout, Milad Razavi-Mohseni, Yunzhe Jiang, Beatrice Borsari, Samuel Morabito, Heidi Yahan Liang, Cassandra J McGill, Sorena Rahmanian, Jasmine Sakr, Shan Jiang, Weihua Zeng, Klebea Carvalho, Annika K Weimer, Louise A Dionne, Ariel McShane, Karan Bedi, Shaimae I Elhajjajy, Sean Upchurch, Jennifer Jou, Ingrid Youngworth, Idan Gabdank, Paul Sud, Otto Jolanki, J Seth Strattan, Meenakshi S Kagda, Michael P Snyder, Ben C Hitz, Jill E Moore, Zhiping Weng, David Bennett, Laura Reinholdt, Mats Ljungman, Michael A Beer, Mark B Gerstein, Lior Pachter, Roderic Guigó, Barbara J Wold, Ali Mortazavi
Publication date
2023/5/16
Journal
bioRxiv
Publisher
Cold Spring Harbor Laboratory Preprints
Description
The majority of mammalian genes encode multiple transcript isoforms that result from differential promoter use, changes in exonic splicing, and alternative 3′ end choice. Detecting and quantifying transcript isoforms across tissues, cell types, and species has been extremely challenging because transcripts are much longer than the short reads normally used for RNA-seq. By contrast, long-read RNA-seq (LR-RNA-seq) gives the complete structure of most transcripts. We sequenced 264 LR-RNA-seq PacBio libraries totaling over 1 billion circular consensus reads (CCS) for 81 unique human and mouse samples. We detect at least one full-length transcript from 87.7% of annotated human protein-coding genes and a total of 200,000 full-length transcripts, 40% of which have novel exon junction chains.