NGS Metrics Data Dictionary

VARNAME VARDESC EXTERNAL ONTOLOGY USED
SAMPLE_ID De-identified Sample ID  
NGS_Protocol_FileName Document that details the NGS processing pipeline applied to the SCAP-T samples  
HumanGenomeAssembly Human Genome assembly used  
GencodeVersion GENCODE version used  
PennSCAP-T_pipeline_version Version of the PennSCAP-T pipeline  
STAR_version STAR version used for carrying out the alignments  
HTSeq_version HTSeq version used for quantification of expression  
BLAST_version BLAST version used  
FastQC_version FastQC version used  
Samtools_version Samtools version used  
BLAST_TotalHits The number of reads out of 5000 randomly selected reads that map against the nt database using BLAST  
BLAST_HitsNotTarget_or_ERCC Percentage of hits that map neither to the human genome nor to ERCC sequences  
BLAST_Bacteria Number of BLAST matches to Bacteria  
BLAST_Fish Number of BLAST matches to Fish  
BLAST_Fly Number of BLAST matches to Fly  
BLAST_Human Number of BLAST matches to Human  
BLAST_Mouse Number of BLAST matches to Mouse  
BLAST_Rat Number of BLAST matches to Rat  
BLAST_Yeast Number of BLAST matches to Yeast  
BLAST_HitsNotCounted Number of BLAST matches to any other species  
BLAST_ERCC Number of BLAST matches to ERCC  
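
As a rough illustration of how the BLAST fields relate to one another, the sketch below derives the off-target percentage from per-category hit counts. It assumes the categories are mutually exclusive, that they sum to BLAST_TotalHits, and that the percentage is taken relative to the total number of hits; the source does not confirm any of this. The function name and the dictionary keys are hypothetical.

```python
def blast_summary(counts):
    """Summarise per-category BLAST hit counts for one sample.

    `counts` maps category names (hypothetical keys such as "Human",
    "ERCC", "Bacteria") to the number of BLAST hits in that category.
    Assumption: categories are mutually exclusive and cover every hit.
    """
    total_hits = sum(counts.values())
    # "Target" here means human genome or ERCC spike-in sequences.
    target_hits = counts.get("Human", 0) + counts.get("ERCC", 0)
    off_target = total_hits - target_hits
    pct_off_target = 100.0 * off_target / total_hits if total_hits else 0.0
    return {
        "BLAST_TotalHits": total_hits,
        "BLAST_HitsNotTarget_or_ERCC": pct_off_target,
    }
```

A high off-target percentage computed this way would suggest contamination from another species, which is presumably why the per-species counts are recorded alongside it.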
nTotalReadPairs Total number of read pairs from RNA sequencing  
nBothTrimmed Number of read pairs trimmed for contaminants  
nFirstTrimmed Number of first reads trimmed for contaminants  
nSecondTrimmed Number of second reads trimmed for contaminants  
nBothDiscarded Number of read pairs discarded after trimming for being too short (< 20 nt)  
nFirstDiscarded Number of first reads discarded after trimming for being too short (< 20 nt)  
nSecondDiscarded Number of second reads discarded after trimming for being too short (< 20 nt)  
nBelowPhredThreshold(first) Number of first reads with Phred Score less than the threshold for at least one position  
nBelowPhredThreshold(second) Number of second reads with Phred Score less than the threshold for at least one position  
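
The nBelowPhredThreshold counts can be pictured with the minimal sketch below, which counts reads that have at least one base call under a given quality threshold. It assumes Phred+33 encoded FASTQ quality strings; the threshold is left as a parameter because the dictionary does not state it. The function name is hypothetical and this is not the pipeline's own code.

```python
def count_below_phred(quality_strings, threshold, offset=33):
    """Count reads with at least one base call below `threshold`.

    `quality_strings` is an iterable of FASTQ quality lines, one per read.
    `offset` is the ASCII offset of the encoding (33 is an assumption).
    """
    n_below = 0
    for qual in quality_strings:
        # Decode each character to its Phred score and test it.
        if any(ord(ch) - offset < threshold for ch in qual):
            n_below += 1
    return n_below
```

Running it once on the first reads and once on the second reads would give the two counts separately.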
 
removed5N(first) Number of first reads trimmed for Ns from 5' end  
removed5N(second) Number of second reads trimmed for Ns from 5' end  
removed3N(first) Number of first reads trimmed for Ns from 3' end  
removed3N(second) Number of second reads trimmed for Ns from 3' end  
indexAdapter(first) Number of first reads trimmed for the Index adapter  
indexAdapter(second) Number of second reads trimmed for the Index adapter  
univAdapterRC(first) Number of first reads trimmed for the reverse complement of the Universal adapter  
univAdapterRC(second) Number of second reads trimmed for the reverse complement of the Universal adapter  
aRNAPrimer(first) Number of first reads trimmed for the aRNA primer  
aRNAPrimer(second) Number of second reads trimmed for the aRNA primer  
aRNAPrimerRC(first) Number of first reads trimmed for the reverse complement of aRNA primer  
aRNAPrimerRC(second) Number of second reads trimmed for the reverse complement of aRNA primer  
5polyT(first) Number of first reads trimmed for polyT from 5' end  
5polyT(second) Number of second reads trimmed for polyT from 5' end  
3polyA(first) Number of first reads trimmed for polyA from 3' end  
3polyA(second) Number of second reads trimmed for polyA from 3' end  
NexteraXT_TransposaseRC(first) Number of reads trimmed for 3' NexteraXT Transposase Reverse complement  
CDS(first) Number of reads trimmed for the CDS 3' Adapter  
CDS_RC(first) Number of reads trimmed for the CDS 3' Adapter Reverse complement  
nTSOFiltered(first) Number of reads filtered out as TSO primer concatemers  
AvgInpReadLen Average read length  
AvgUniqMapLen Average read length for uniquely mapped reads  
UniqMap Percentage of reads uniquely mapped by the STAR aligner  
Multimapped Percentage of multimapped reads as reported by the STAR aligner  
ReadsTooShort Percentage of reads left unmapped for being too short, as reported by the STAR aligner  
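
The STAR-derived fields above resemble entries in STAR's Log.final.out summary file. The sketch below shows one way such values could be pulled out of that file's text. The mapping from STAR's labels to the variable names in this dictionary is an assumption, not something the source states (for example, Multimapped may or may not include reads mapped to too many loci). The function name is hypothetical.

```python
def parse_star_log(text):
    """Extract selected fields from the text of a STAR Log.final.out file.

    The label-to-variable mapping is an assumption made for illustration.
    """
    labels = {
        "Average input read length": "AvgInpReadLen",
        "Average mapped length": "AvgUniqMapLen",
        "Uniquely mapped reads %": "UniqMap",
        "% of reads mapped to multiple loci": "Multimapped",
        "% of reads unmapped: too short": "ReadsTooShort",
    }
    metrics = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # section headings such as "UNIQUE READS:" have no value
        label, value = (part.strip() for part in line.split("|", 1))
        if label in labels:
            # Percentages carry a trailing "%" that float() cannot parse.
            metrics[labels[label]] = float(value.rstrip("%"))
    return metrics
```

Typical use would be `parse_star_log(open("Log.final.out").read())`, with the path adjusted to wherever the aligner wrote its output.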
ReadsCountedExonic Number of reads counted by HTSeq as exonic  
NumGenesExpressed Number of genes with at least one exonic read  
AvgReadPerGeneExonic Average number of reads that are exonic per gene  
MaxReadsPerGeneExonic Maximum number of reads that are exonic per gene  
NoExonicFeature Number of reads that did not map to any exon as reported by HTSeq  
AmbiguousMappedExonic Number of reads that mapped ambiguously to exons of more than one gene and were not counted  
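
The exonic summary fields can be illustrated with a minimal sketch that reads htseq-count style output: tab-separated gene and count lines, followed by special counters such as `__no_feature` and `__ambiguous`. It assumes the per-gene average is taken over genes with at least one read; the dictionary does not say which denominator the pipeline uses. The function name is hypothetical.

```python
def summarise_htseq_counts(lines):
    """Summarise htseq-count style output lines ("name<TAB>count")."""
    genes = {}
    special = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        name, count = line.split("\t")
        # htseq-count reports its special counters with a "__" prefix.
        if name.startswith("__"):
            special[name] = int(count)
        else:
            genes[name] = int(count)
    expressed = [c for c in genes.values() if c > 0]
    total = sum(expressed)
    return {
        "ReadsCountedExonic": total,
        "NumGenesExpressed": len(expressed),
        # Assumption: average over expressed genes only.
        "AvgReadPerGeneExonic": total / len(expressed) if expressed else 0.0,
        "MaxReadsPerGeneExonic": max(expressed, default=0),
        "NoExonicFeature": special.get("__no_feature", 0),
        "AmbiguousMappedExonic": special.get("__ambiguous", 0),
    }
```

The intronic fields could be computed the same way from a second counting run against intron features, with the output keys renamed accordingly.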
 
ReadsCountedIntronic Number of reads counted by HTSeq as intronic  
NumGenesWithIntronicExpression Number of genes with at least one intronic read  
AvgReadPerGeneIntronic Average number of reads that are intronic per gene  
MaxReadsPerGeneIntronic Maximum number of reads that are intronic per gene  
NoIntronicFeature Number of reads that did not map to any intron as reported by HTSeq  
AmbiguousMappedIntronic Number of reads that mapped ambiguously to introns of more than one gene and were not counted  