Data Releases

The first data release, projected for Tuesday, February 17, 2015, will include whole-transcriptome RNA-seq data from 697 single cells from human heart and brain across the three studies. Click here to get information on applying for data access.

Data dictionaries and variable summaries are available on the dbGaP FTP site. The public summary-level phenotype data were released on Tuesday, February 17, 2015. These data may be browsed at the dbGaP study home page.

Data are embargoed for a period of six months for each available wave of data; applicants for data are prohibited from publishing findings during this period. Along with the .bam files (in SRA format), qualified investigators will receive phenotypes and gene expression counts for these samples.

Below is the list of files available for download, based on Institutional Review Board (IRB) consent:

  1. Sequencing Data: sequencing reads and mapping information (BAM files), stored in Sequence Read Archive (SRA) format.
  2. Subject Phenotype Data: phenotype data from the consented study subjects being sequenced.
  3. Sample Attributes Data: detailed attributes of the study samples, including the body site where each sample was collected, the experimental protocol followed, and images when available.
  4. Next-Gen Sequencing Quality Metrics Data: detailed results of analysing the RNA-sequencing data, including various quality metrics.
  5. Gene Expression Count Files: the raw counts for exonic and intronic expression for each of the samples.
  6. Subject Consent: a list of all subjects being sequenced.
  7. Subject Sample Mapping: relates each Subject ID to its sequenced Sample ID.
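
As an illustration of how the Subject Sample Mapping ties the other files together, the sketch below attaches each subject's phenotype record to that subject's sequenced samples. It is only a minimal example under assumptions: the column names (SUBJECT_ID, SAMPLE_ID, AGE), the tab-delimited layout, and the values are invented for illustration; the actual names and formats are defined in each file's Data Dictionary.

```python
import csv
import io

# Hypothetical stand-ins for the Subject Sample Mapping and Subject Phenotype
# files. Column names and values are made up for illustration; the real ones
# are defined in each file's Data Dictionary.
MAPPING_TSV = "SUBJECT_ID\tSAMPLE_ID\nS1\tA1\nS1\tA2\nS2\tB1\n"
PHENOTYPE_TSV = "SUBJECT_ID\tAGE\nS1\t54\nS2\t61\n"


def read_tsv(text):
    """Parse tab-delimited text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def phenotypes_by_sample(mapping_rows, phenotype_rows):
    """Return {sample_id: phenotype_row} using the subject-to-sample mapping."""
    by_subject = {row["SUBJECT_ID"]: row for row in phenotype_rows}
    result = {}
    for row in mapping_rows:
        phenotype = by_subject.get(row["SUBJECT_ID"])
        if phenotype is not None:  # skip samples with no phenotype record
            result[row["SAMPLE_ID"]] = phenotype
    return result


sample_phenotypes = phenotypes_by_sample(
    read_tsv(MAPPING_TSV), read_tsv(PHENOTYPE_TSV)
)
```

Because one subject may contribute several samples, the lookup is keyed by Sample ID, so the same phenotype record can be reached from each of that subject's samples.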


Each file type also contains a Data Dictionary, whenever applicable. For questions about the study, please contact us.



© 2015 University of Pennsylvania