scaleTxCoverages.Rd
This function takes a list of predicted transcript coverage profiles (from
predictTxCoverage
) and corresponding transcript abundance
estimates (from, e.g., Salmon, kallisto, StringTie, RSEM, or any other
transcript abundance estimation software), and scales the coverage profiles
by the abundance estimates in order to determine the predicted number of
reads covering each position in the transcript. In addition, it extracts the
predicted number of reads spanning each junction in each transcript, and sums
up these for each annotated junction, across all transcripts containing the
junction.
scaleTxCoverages(txCoverageProfiles, txQuants, tx2gene, strandSpecific,
methodName, verbose = FALSE)
A list
of predicted transcript coverage
profiles (output from predictTxCoverage
).
A data.frame
with estimated transcript abundances.
Must contain columns named transcript
, TPM
and count
,
all other columns will be ignored.
A data.frame
with transcript-to-gene mapping. Must have
at least two columns, named tx
and gene
.
Logical, whether or not the data is strand-specific. Determines whether or not the strand of a junction should be taken into account when the coverages are summed up across transcripts.
Name to use for the abundance estimation method in the output table.
Logical, whether to print progress messages.
A list with two elements:
junctionPredCovs
:A tibble
with the predicted coverage
for each junction, scaled by the estimated transcript abundances and
summarized across all transcripts.
txQuants
:A tibble
with estimated transcript abundances,
including information about whether or not the coverage profile could be
predicted ('covOK') or if a uniform coverage was imposed ('covError' or
'covNA').
Soneson C, Love MI, Patro R, Hussain S, Malhotra D, Robinson MD: A junction coverage compatibility score to quantify the reliability of transcript abundance estimates and annotation catalogs. bioRxiv doi:10.1101/378539 (2018)
if (FALSE) { # \dontrun{
gtf <- system.file("extdata/Homo_sapiens.GRCh38.90.chr22.gtf.gz",
package = "jcc")
bam <- system.file("extdata/reads.chr22.bam", package = "jcc")
biasMod <- fitAlpineBiasModel(gtf = gtf, bam = bam,
organism = "Homo_sapiens",
genome = Hsapiens, genomeVersion = "GRCh38",
version = 90, minLength = 230,
maxLength = 7000, minCount = 10,
maxCount = 10000, subsample = TRUE,
nbrSubsample = 30, seed = 1, minSize = NULL,
maxSize = 220, verbose = TRUE)
tx2gene <- readRDS(system.file("extdata/tx2gene.sub.rds", package = "jcc"))
predCovProfiles <- predictTxCoverage(biasModel = biasMod$biasModel,
exonsByTx = biasMod$exonsByTx,
bam = bam, tx2gene = tx2gene,
genome = Hsapiens,
genes = c("ENSG00000070371",
"ENSG00000093010"),
nCores = 1, verbose = TRUE)
txQuants <- readRDS(system.file("extdata/quant.sub.rds", package = "jcc"))
txsc <- scaleTxCoverages(txCoverageProfiles = predCovProfiles,
txQuants = txQuants, tx2gene = tx2gene,
strandSpecific = TRUE, methodName = "Salmon",
verbose = TRUE)
} # }