bsxplorer.BAMReader

class BAMReader(bam_filename: str | Path, index_filename: str | Path, cytosine_file: str | Path | None = None, bamtype: Literal['bismark'] = 'bismark', regions: DataFrame | None = None, threads: int = 1, batch_num: int = 10000.0, min_qual: int | None = None, context: Literal['CG', 'CHG', 'CHH', 'all'] = 'all', keep_converted: bool = True, qc: bool = True, readahead=5, **pysam_kwargs)

Class for reading sorted and indexed BAM files.

Parameters:

bam_filename – Path to SORTED BAM file.
index_filename – Path to BAM index (.bai) file.
cytosine_file – Cytosine file preprocessed with the BinomialData class. Reads will be aligned to cytosine coordinates (None to disable).
bamtype – Aligner that was used to produce the BAM file.
regions – DataFrame from Genome with genomic regions to align to (None to disable).
threads – Number of threads used to decompress BAM.
batch_num – Number of reads per batch.
min_qual – Filter reads by quality (None to disable).
context – Filter reads by context. Possible options: “CG”, “CHG”, “CHH”, “all”. Set “all” for no filtering.
keep_converted – Keep converted reads. Default: True.
qc – Calculate QC data. Default: True.
readahead – Number of batches to read before calculation.
pysam_kwargs – Keyword arguments for pysam.AlignmentFile.
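A minimal construction sketch, assuming bsxplorer is installed and that `BAMReader` can be imported from the top-level `bsxplorer` package, as the page title suggests. The file paths, the helper names `reader_options` and `open_reader`, and the convention that the index sits next to the BAM file are hypothetical; only the keyword names come from the signature above.

```python
from pathlib import Path


def reader_options(bam: str, context: str = "all", min_qual=None) -> dict:
    """Collect keyword arguments for BAMReader (hypothetical helper)."""
    if context not in {"CG", "CHG", "CHH", "all"}:
        raise ValueError(f"unsupported context: {context!r}")
    bam_path = Path(bam)
    return {
        "bam_filename": bam_path,
        # Assumption: the index is stored beside the BAM file as <name>.bai.
        "index_filename": bam_path.with_name(bam_path.name + ".bai"),
        "context": context,
        "min_qual": min_qual,
        "threads": 1,
        "qc": True,
    }


def open_reader(options: dict):
    """Create the reader; bsxplorer is imported lazily so this file loads without it."""
    from bsxplorer import BAMReader  # assumed import path

    return BAMReader(**options)


if __name__ == "__main__":
    opts = reader_options("sample.sorted.bam", context="CG", min_qual=20)
    print(opts["index_filename"])
```

Validating `context` before the reader is created keeps a typo from surfacing only after the BAM file has been opened.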

Methods

`__init__`
`plot_qc`	Plot QC stats.
`qc_data`	Returns QC data from read fragments.
`report_iter`	Iterate over the BAM file, yielding `UniversalBatch` objects.
`stats_iter`	Iterate over the BAM file, yielding `PivotRegion` objects.

report_iter()

Iterate over the BAM file, yielding UniversalBatch objects.

Return type: iterator
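Because `report_iter` returns an iterator, batches can be consumed lazily. The sketch below assumes only that the reader exposes `report_iter()`; `take_batches` and the `FakeReader` stand-in are hypothetical, and no attributes of `UniversalBatch` are assumed.

```python
from itertools import islice


def take_batches(reader, limit: int) -> list:
    """Pull at most `limit` batches from reader.report_iter() without reading the rest."""
    return list(islice(reader.report_iter(), limit))


class FakeReader:
    """Stand-in for BAMReader so the sketch runs without a BAM file."""

    def __init__(self, n_batches: int):
        self.n_batches = n_batches

    def report_iter(self):
        for i in range(self.n_batches):
            yield f"batch-{i}"  # a real reader would yield UniversalBatch objects


if __name__ == "__main__":
    print(take_batches(FakeReader(5), 2))
```

Limiting the number of batches this way is a convenient check on a large BAM file before committing to a full pass.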

stats_iter()

Iterate over the BAM file, yielding PivotRegion objects.

Return type: iterator

qc_data() → tuple[bsxplorer.BamReader.QualsCounter, list[bsxplorer.BamReader.QualsCounter], list[tuple]]

Returns QC data from read fragments as a tuple of (Phred score counts, Phred score counts by position, average quality per region).

Return type: tuple
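To illustrate what the first two elements of that tuple represent, the following sketch counts Phred scores overall and per read position from ASCII quality strings. It is not the library's implementation: `collections.Counter` stands in for `QualsCounter`, whose interface is not documented here, and `phred_counts` is a hypothetical helper that assumes Phred+33 encoding.

```python
from collections import Counter


def phred_counts(quality_strings, offset: int = 33):
    """Count Phred scores overall and per read position from ASCII quality strings."""
    total = Counter()
    by_position = []
    for qual in quality_strings:
        for pos, char in enumerate(qual):
            score = ord(char) - offset  # Phred+33: 'I' -> 40, '5' -> 20
            total[score] += 1
            # Grow the per-position list the first time a position is seen.
            if pos == len(by_position):
                by_position.append(Counter())
            by_position[pos][score] += 1
    return total, by_position


if __name__ == "__main__":
    total, by_position = phred_counts(["II5", "I5"])
    print(total.most_common(1), len(by_position))
```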

plot_qc()

Plot QC stats.

Return type: plotly.graph_objects.Figure