Abstract Cancer is a complex disease that comprises hundreds of distinct disease types. Identifying these disease types and their underlying causative alterations is important for guiding and tailoring treatment. Genomic technologies have been used in projects such as The Cancer Genome Atlas (TCGA) to characterize a large number of cancers. However, these early genomics projects mostly studied unselected cohorts with limited follow-up, and many clinically relevant datasets could not be analyzed because the available technologies performed poorly on formalin-fixed tissue and on small starting quantities of tissue. New initiatives of the Center for Cancer Genomics will help us address more clinically meaningful questions. We propose to use our expertise in gene expression and RNA-sequencing analysis to further characterize cancer and to help identify novel diagnostic markers, novel drug therapies, and clinical associations. We will approach this in three aims. For Aim 1, we will use RNA sequence information to identify somatic mutations; improve mapping, assembly, and quantification of B and T cells; identify structural variations; and perform high-level quality control, including genotype checks across sequence data for the same sample. For Aim 2, we will calculate gene and isoform expression levels, which will be used to identify tumor subtypes and alternative isoform usage and to apply previously defined gene signatures and tumor subtypes. For Aim 3, we will use supervised analyses to find genes significantly associated with molecular features, and we will model gene expression data to look for associations with clinical outcome or drug treatment response. We expect that our data, integrated with the data from other Genome Data Analysis Centers, will yield novel insights into cancer development, progression, and treatment.
We will also leverage the information we have learned from pan-cancer analyses to identify shared genomic alterations or pathway activity that may accelerate therapy development.