This function wraps several functions from the alpine package. Given a gtf
file and a bam file with reads aligned to the genome, it finds
single-isoform genes (with lengths and expression levels within given
ranges) and uses the observed read coverage to fit a fragment bias model.

fitAlpineBiasModel(gtf, bam, organism, genome, genomeVersion, version,
  minLength = 600, maxLength = 7000, minCount = 500, maxCount = 10000,
  subsample = TRUE, nbrSubsample = 200, seed = 1, minSize = NULL,
  maxSize = NULL, verbose = FALSE)

gtf: Path to gtf file with genomic features. Preferably in Ensembl format.
bam: Path to bam file with read alignments to the genome.
organism: The organism (e.g., 'Homo_sapiens'). This argument will be
passed to ensembldb::ensDbFromGtf.
genome: A BSgenome object.
genomeVersion: Genome version (e.g., 'GRCh38'). This argument will be
passed to ensembldb::ensDbFromGtf.
version: The version of the reference annotation (e.g., 90). This
argument will be passed to ensembldb::ensDbFromGtf.
minLength, maxLength: Minimum and maximum length of single-isoform genes used to fit the fragment bias model.
minCount, maxCount: Minimum and maximum read coverage of single-isoform genes used to fit the fragment bias model.
subsample: Whether to subsample the set of single-isoform genes
satisfying the minLength, maxLength, minCount and
maxCount criteria before fitting the fragment bias model.
nbrSubsample: If subsample is TRUE, the number of genes
to subsample.
seed: If subsample is TRUE, the random seed to use to
ensure reproducibility.
minSize, maxSize: Smallest and largest fragment size to consider. One or
both of these can be NULL, in which case the missing bound is estimated
as the 2.5 or 97.5 percentile, respectively, of estimated fragment sizes
in the provided data.
verbose: Logical, whether to print progress messages.
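
The gene selection, subsampling and fragment size defaults described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the `genes` data frame and the `fragSizes` vector are simulated stand-ins for values that would be derived from the gtf and bam files, and the inclusive handling of the range boundaries is an assumption.

```r
# Simulated per-gene summary (hypothetical; real values come from the
# gtf annotation and the bam file).
set.seed(42)
genes <- data.frame(
  gene   = paste0("g", 1:1000),
  length = round(runif(1000, 200, 10000)),
  count  = round(runif(1000, 0, 20000))
)

minLength <- 600; maxLength <- 7000
minCount <- 500; maxCount <- 10000
nbrSubsample <- 200; seed <- 1

# Keep genes within the length and coverage ranges.
# Inclusive boundaries are an assumption of this sketch.
keep <- genes$length >= minLength & genes$length <= maxLength &
  genes$count >= minCount & genes$count <= maxCount
candidates <- genes[keep, ]

# Subsample reproducibly, never requesting more genes than are available.
set.seed(seed)
idx <- sample.int(nrow(candidates), min(nbrSubsample, nrow(candidates)))
selected <- candidates[idx, ]

# Simulated fragment size estimates (hypothetical).
fragSizes <- round(rnorm(5000, mean = 200, sd = 30))

# A NULL bound is replaced by the 2.5 or 97.5 percentile, respectively;
# a bound supplied by the user is left unchanged.
minSize <- NULL
maxSize <- 220
if (is.null(minSize)) minSize <- quantile(fragSizes, 0.025, names = FALSE)
if (is.null(maxSize)) maxSize <- quantile(fragSizes, 0.975, names = FALSE)
```

Fixing `seed` makes the subsample identical across runs, which matters because the fitted model depends on which genes are selected.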

A list with three elements:
biasModel: The fitted fragment bias model.
exonsByTx: A GRangesList object with exons grouped by transcript.
transcripts: A GRanges object with all the reference transcripts.

Soneson C, Love MI, Patro R, Hussain S, Malhotra D, Robinson MD: A junction coverage compatibility score to quantify the reliability of transcript abundance estimates and annotation catalogs. bioRxiv doi:10.1101/378539 (2018).
Love MI, Hogenesch JB, Irizarry RA: Modeling of RNA-seq fragment sequence bias reduces systematic errors in transcript abundance estimation. Nature Biotechnology 34(12):1287-1291 (2016).
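if (FALSE) { # \dontrun{
# Assumption: Hsapiens is the BSgenome object exported by
# BSgenome.Hsapiens.NCBI.GRCh38; any BSgenome object whose sequence
# names match the annotation can be used instead.
library(jcc)
library(BSgenome.Hsapiens.NCBI.GRCh38)

# Example annotation and alignments shipped with the jcc package
gtf <- system.file("extdata/Homo_sapiens.GRCh38.90.chr22.gtf.gz",
                   package = "jcc")
bam <- system.file("extdata/reads.chr22.bam", package = "jcc")

# Relaxed length/count thresholds to suit the small chr22 example data
biasMod <- fitAlpineBiasModel(gtf = gtf, bam = bam,
                              organism = "Homo_sapiens",
                              genome = Hsapiens, genomeVersion = "GRCh38",
                              version = 90, minLength = 230,
                              maxLength = 7000, minCount = 10,
                              maxCount = 10000, subsample = TRUE,
                              nbrSubsample = 30, seed = 1, minSize = NULL,
                              maxSize = 220, verbose = TRUE)
} # }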