bsxplorer.BinomialData.region_pvalue

BinomialData.region_pvalue(genome: DataFrame, methylation_pvalue: float = 0.05, use_threads: bool = True, save: str | Path | bool | None = None, dir: str | Path = PosixPath('/home/runner/work/BSXplorer/BSXplorer/docs'))

Map cytosines to the provided annotation and calculate a methylation P-value for each region (assuming a binomial distribution).
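To illustrate what a binomial P-value for a region can look like, here is a minimal sketch using only the standard library. It is not BSXplorer's implementation: the helper `binomial_sf`, the one-sided test, the counts, and the background probability are all assumptions chosen for illustration.

```python
from math import comb


def binomial_sf(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p) by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical region (values are assumptions, not library output):
# 60 cytosines in one context, 12 of them called methylated, and an
# assumed 0.05 chance that a single cytosine is called methylated by chance.
n_cytosines = 60
n_methylated = 12
background_p = 0.05

# One-sided P-value: probability of seeing at least this many methylated
# cytosines if each call were an independent chance event.
p_value = binomial_sf(n_methylated, n_cytosines, background_p)
print(f"region P-value: {p_value:.3g}")
```

Under these assumptions a small P-value means the region holds more methylated cytosines than chance alone would explain. Whether the method uses a one-sided test of this form is not stated in this reference.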

Parameters:
  • genome – DataFrame with annotation (e.g. from Genome class)

  • methylation_pvalue – P-value threshold at which a cytosine is considered methylated.

  • use_threads – Whether to read the data with multiple threads or a single thread. With multi-threaded reading, the number of threads is given by multiprocessing.cpu_count().

  • save – Name under which the preprocessed file will be saved. If not provided, the input file name is used.

  • dir – Path to the working directory where the file will be saved.
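The thread count described for use_threads can be sketched as follows. `resolve_thread_count` is a hypothetical helper written for this illustration and is not part of BSXplorer. It only mirrors the rule stated above: all available CPUs when threading is on, and (as an assumption) a single thread otherwise.

```python
import multiprocessing


def resolve_thread_count(use_threads: bool = True) -> int:
    """Hypothetical helper: number of reader threads implied by use_threads."""
    # multiprocessing.cpu_count() reports the number of CPUs in the system.
    return multiprocessing.cpu_count() if use_threads else 1


print(resolve_thread_count(True))   # depends on the machine
print(resolve_thread_count(False))
```

On shared machines it may be worth checking this number before enabling multi-threaded reading, since it reflects every CPU in the system, not only those available to the current process.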

Returns:

Instance of RegionStat class.

Return type:

RegionStat

Examples

If there is no preprocessed file:

>>> report_path = "/path/to/report.txt"
>>> genome_path = "/path/to/genome.gff"
>>> c_binom = bsxplorer.BinomialData.preprocess(report_path, report_type="bismark")
>>> genome = bsxplorer.Genome.from_gff(genome_path).gene_body()
>>> data = c_binom.region_pvalue(genome)
>>> data
shape: (3, 11)
┌─────────┬────────┬─────────┬───────┬───────┬────────┬────────┬────────┬────────┬────────┬────────┐
│ chr     ┆ strand ┆ id      ┆ start ┆ end   ┆ p_valu ┆ p_valu ┆ p_valu ┆ total  ┆ total  ┆ total  │
│ ---     ┆ ---    ┆ ---     ┆ ---   ┆ ---   ┆ e_cont ┆ e_cont ┆ e_cont ┆ contex ┆ contex ┆ contex │
│ cat     ┆ cat    ┆ str     ┆ u64   ┆ u64   ┆ ext_CG ┆ ext_CH ┆ ext_CH ┆ t_CG   ┆ t_CHG  ┆ t_CHH  │
│         ┆        ┆         ┆       ┆       ┆ ---    ┆ G      ┆ H      ┆ ---    ┆ ---    ┆ ---    │
│         ┆        ┆         ┆       ┆       ┆ f64    ┆ ---    ┆ ---    ┆ i64    ┆ i64    ┆ i64    │
│         ┆        ┆         ┆       ┆       ┆        ┆ f64    ┆ f64    ┆        ┆        ┆        │
╞═════════╪════════╪═════════╪═══════╪═══════╪════════╪════════╪════════╪════════╪════════╪════════╡
│ NC_0030 ┆ +      ┆ gene-AT ┆ 3631  ┆ 5899  ┆ 1.0    ┆ 1.0    ┆ 1.0    ┆ 60     ┆ 82     ┆ 251    │
│ 70.9    ┆        ┆ 1G01010 ┆       ┆       ┆        ┆        ┆        ┆        ┆        ┆        │
│ NC_0030 ┆ -      ┆ gene-AT ┆ 6788  ┆ 9130  ┆ 0.9992 ┆ 1.0    ┆ 1.0    ┆ 31     ┆ 55     ┆ 295    │
│ 70.9    ┆        ┆ 1G01020 ┆       ┆       ┆ 65     ┆        ┆        ┆        ┆        ┆        │
│ NC_0030 ┆ +      ┆ gene-AT ┆ 11101 ┆ 11372 ┆ 1.0    ┆ 1.0    ┆ 1.0    ┆ 1      ┆ 8      ┆ 43     │
│ 70.9    ┆        ┆ 1G03987 ┆       ┆       ┆        ┆        ┆        ┆        ┆        ┆        │
└─────────┴────────┴─────────┴───────┴───────┴────────┴────────┴────────┴────────┴────────┴────────┘

If a preprocessed file exists:

>>> preprocessed_path = "/path/to/preprocessed.binom.pq"
>>> c_binom = bsxplorer.BinomialData(preprocessed_path)
>>> data = c_binom.region_pvalue(genome)
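
Once region P-values are available, a typical next step is to pick out regions below a significance cutoff. The sketch below uses plain Python on rows copied by hand from the example output above (only the `id` and `p_value_context_CG` columns). It does not use the RegionStat API, and the 0.05 cutoff and the reading that smaller P-values mean stronger evidence are assumptions.

```python
# Rows transcribed from the example table above; not produced by running BSXplorer.
rows = [
    {"id": "gene-AT1G01010", "p_value_context_CG": 1.0},
    {"id": "gene-AT1G01020", "p_value_context_CG": 0.999265},
    {"id": "gene-AT1G03987", "p_value_context_CG": 1.0},
]

cutoff = 0.05  # assumed significance cutoff for this illustration

# Keep the regions whose CG-context P-value falls below the cutoff.
significant = [row["id"] for row in rows if row["p_value_context_CG"] < cutoff]

# Region with the smallest CG-context P-value, whether or not it passes the cutoff.
lowest = min(rows, key=lambda row: row["p_value_context_CG"])

print("significant regions:", significant)
print("lowest P-value region:", lowest["id"])
```

With these three rows no region passes the cutoff, which matches the near-1.0 P-values shown in the table.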