bsxplorer.Genome.other

Genome.other(region_type: str, min_length: int = 1000, flank_length: int = 100) → DataFrame

Filter the annotation by the selected region type and calculate the positions of the flanking regions.

Warning

This method returns an empty DataFrame if the requested region type does not occur in the input file.

If no flanking regions are needed, set flank_length=0.

Parameters:
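  • region_type – Region type to select, as given in the type column of the GFF file.

  • min_length – Region length threshold used to filter the annotation (default: 1000).

  • flank_length – Length of the flanking regions placed upstream and downstream of each region (default: 100).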

Return type:

Return polars.DataFrame for downstream usage.
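
The filtering and flank arithmetic described above can be illustrated with a small, library-independent sketch. This is not the bsxplorer implementation: the helper name select_regions, the record layout, the inclusive length comparison, and the clamping of upstream at zero are all assumptions made for illustration. It only shows how upstream = start - flank_length and downstream = end + flank_length relate to the columns of the returned table.

```python
def select_regions(records, region_type, min_length=1000, flank_length=100):
    """Hypothetical sketch: keep records of one type and add flank positions.

    Assumptions (not confirmed by the bsxplorer source):
    - a region is kept when end - start >= min_length;
    - upstream is clamped so that it never drops below 0.
    """
    selected = []
    for rec in records:
        if rec["type"] != region_type:
            continue  # different annotation type
        if rec["end"] - rec["start"] < min_length:
            continue  # shorter than the length threshold
        selected.append({
            "chr": rec["chr"],
            "strand": rec["strand"],
            "start": rec["start"],
            "end": rec["end"],
            "upstream": max(rec["start"] - flank_length, 0),
            "downstream": rec["end"] + flank_length,
            "id": rec["id"],
        })
    return selected


# Made-up annotation records, for illustration only.
records = [
    {"chr": "chr1", "type": "gene", "strand": "+", "start": 3631, "end": 5899, "id": "gene-a"},
    {"chr": "chr1", "type": "gene", "strand": "-", "start": 50, "end": 400, "id": "gene-b"},
    {"chr": "chr1", "type": "exon", "strand": "+", "start": 7000, "end": 9000, "id": "exon-a"},
]

rows = select_regions(records, "gene")
no_flanks = select_regions(records, "gene", flank_length=0)
missing = select_regions(records, "tRNA")
```

Here rows holds only gene-a (gene-b is below the length threshold and exon-a has a different type), no_flanks has upstream equal to start and downstream equal to end, and missing is empty because no record has the requested type, which mirrors the warning above.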

Examples

>>> from bsxplorer import Genome
>>> path = "/path/to/genome.gff"
>>> genome = Genome.from_gff(path)
>>> genome.other("region")
shape: (45_888, 7)
┌─────────────┬────────┬────────┬────────┬──────────┬────────────┬─────────────────┐
│ chr         ┆ strand ┆ start  ┆ end    ┆ upstream ┆ downstream ┆ id              │
│ ---         ┆ ---    ┆ ---    ┆ ---    ┆ ---      ┆ ---        ┆ ---             │
│ str         ┆ str    ┆ u64    ┆ u64    ┆ u64      ┆ u64        ┆ str             │
╞═════════════╪════════╪════════╪════════╪══════════╪════════════╪═════════════════╡
│ NC_003070.9 ┆ +      ┆ 3631   ┆ 5899   ┆ 3531     ┆ 5999       ┆ rna-NM_099983.2 │
│ …           ┆ …      ┆ …      ┆ …      ┆ …        ┆ …          ┆ …               │
│ NC_000932.1 ┆ +      ┆ 152806 ┆ 154312 ┆ 152706   ┆ 154412     ┆ rna-ArthCp085   │
│ NC_000932.1 ┆ ?      ┆ 69611  ┆ 140650 ┆ 69511    ┆ 140750     ┆ rna-ArthCp047   │
└─────────────┴────────┴────────┴────────┴──────────┴────────────┴─────────────────┘