fitAlpineBiasModel.Rd
This function provides a wrapper around some of the functions from the
alpine
package. Given a gtf file and a bam file with reads aligned to
the genome, it will find single-isoform genes (with lengths and expression
levels within given ranges) and use the observed read coverages to fit a
fragment bias model.
fitAlpineBiasModel(gtf, bam, organism, genome, genomeVersion, version,
minLength = 600, maxLength = 7000, minCount = 500, maxCount = 10000,
subsample = TRUE, nbrSubsample = 200, seed = 1, minSize = NULL,
maxSize = NULL, verbose = FALSE)
Path to gtf file with genomic features. Preferably in Ensembl format.
Path to bam file with read alignments to the genome.
The organism (e.g., 'Homo_sapiens'). This argument will be
passed to ensembldb::ensDbFromGtf
.
A BSgenome
object.
Genome version (e.g., 'GRCh38'). This argument will be
passed to ensembldb::ensDbFromGtf
.
The version of the reference annotation (e.g., 90). This
argument will be passed to ensembldb::ensDbFromGtf
.
Minimum and maximum length of single-isoform genes used to fit fragment bias model.
Minimum and maximum read coverage of single-isoform genes used to fit fragment bias model.
Whether to subsample the set of single-isoform genes
satisfying the minLength
, maxLength
, minCount
and
maxCount
criteria before fitting the fragment bias model.
If subsample
is TRUE
, the number of genes
to subsample.
If subsample
is TRUE
, the random seed to use to
ensure reproducibility.
Smallest and largest fragment size to consider. One or
both of these can be NULL
, in which case it is estimated as the 2.5
or 97.5 percentile, respectively, of estimated fragment sizes in the
provided data.
Logical, whether to print progress messages.
A list with three elements:
biasModel
:The fitted fragment bias model.
exonsByTx
:A GRangesList
object with exons grouped by transcript.
transcripts
:A
GRanges
object with all the reference transcripts.
Soneson C, Love MI, Patro R, Hussain S, Malhotra D, Robinson MD: A junction coverage compatibility score to quantify the reliability of transcript abundance estimates and annotation catalogs. bioRxiv doi:10.1101/378539 (2018).
Love MI, Hogenesch JB, Irizarry RA: Modeling of RNA-seq fragment sequence bias reduces systematic errors in transcript abundance estimation. Nature Biotechnology 34(12):1287-1291 (2016).
if (FALSE) { # \dontrun{
gtf <- system.file("extdata/Homo_sapiens.GRCh38.90.chr22.gtf.gz",
package = "jcc")
bam <- system.file("extdata/reads.chr22.bam", package = "jcc")
biasMod <- fitAlpineBiasModel(gtf = gtf, bam = bam,
organism = "Homo_sapiens",
genome = Hsapiens, genomeVersion = "GRCh38",
version = 90, minLength = 230,
maxLength = 7000, minCount = 10,
maxCount = 10000, subsample = TRUE,
nbrSubsample = 30, seed = 1, minSize = NULL,
maxSize = 220, verbose = TRUE)
} # }