bsxplorer.Genome.gene_body
- Genome.gene_body(min_length: int = 0, flank_length: int = 2000) → DataFrame
Filter the annotation to rows where type == gene and calculate the positions of their upstream and downstream flanking regions.
Warning
This method returns an empty DataFrame if type is not specified in the input file.
- Parameters:
min_length – Minimum region length; regions shorter than this threshold are filtered out.
flank_length – Length of each flanking region (upstream and downstream).
- Return type:
polars.DataFrame for downstream usage.
Examples
>>> from bsxplorer import Genome
>>> path = "/path/to/genome.gff"
>>> genome = Genome.from_gff(path)
>>> genome.gene_body(min_length=2000, flank_length=2000)
shape: (14_644, 7)
┌─────────────┬────────┬────────┬────────┬──────────┬────────────┬────────────────┐
│ chr         ┆ strand ┆ start  ┆ end    ┆ upstream ┆ downstream ┆ id             │
│ ---         ┆ ---    ┆ ---    ┆ ---    ┆ ---      ┆ ---        ┆ ---            │
│ str         ┆ str    ┆ u64    ┆ u64    ┆ u64      ┆ u64        ┆ str            │
╞═════════════╪════════╪════════╪════════╪══════════╪════════════╪════════════════╡
│ NC_003070.9 ┆ +      ┆ 3631   ┆ 5899   ┆ 1631     ┆ 7899       ┆ gene-AT1G01010 │
│ …           ┆ …      ┆ …      ┆ …      ┆ …        ┆ …          ┆ …              │
│ NC_000932.1 ┆ +      ┆ 104691 ┆ 107500 ┆ 102691   ┆ 109500     ┆ gene-ArthCr087 │
│ NC_000932.1 ┆ +      ┆ 141485 ┆ 143708 ┆ 139485   ┆ 145708     ┆ gene-ArthCp086 │
└─────────────┴────────┴────────┴────────┴──────────┴────────────┴────────────────┘
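The flank arithmetic visible in the output above (upstream = start - flank_length, downstream = end + flank_length) can be illustrated with a minimal plain-Python sketch. This is not bsxplorer's implementation: the function name gene_body_sketch, the use of dicts instead of a polars.DataFrame, the inclusive length threshold, and the clamping of upstream at 0 are all assumptions made for illustration. Only the first input row is taken from the example above; the other two rows are made-up placeholders.

```python
# Hypothetical stand-in for the filtering and flank arithmetic described above.
# Plain dicts replace the polars.DataFrame that the real method returns.

def gene_body_sketch(records, min_length=0, flank_length=2000):
    """Keep rows with type == "gene" and add flanking coordinates."""
    result = []
    for rec in records:
        # Rows whose type is missing never match, so the output may be empty.
        if rec.get("type") != "gene":
            continue
        # Assumption: regions exactly min_length long are kept.
        if rec["end"] - rec["start"] < min_length:
            continue
        result.append({
            "chr": rec["chr"],
            "strand": rec["strand"],
            "start": rec["start"],
            "end": rec["end"],
            # Assumption: clamp at 0 so the unsigned column cannot underflow.
            "upstream": max(rec["start"] - flank_length, 0),
            "downstream": rec["end"] + flank_length,
            "id": rec["id"],
        })
    return result


annotation = [
    # Row taken from the example output above.
    {"chr": "NC_003070.9", "type": "gene", "strand": "+",
     "start": 3631, "end": 5899, "id": "gene-AT1G01010"},
    # Made-up rows: a non-gene feature and a feature without a type.
    {"chr": "NC_003070.9", "type": "exon", "strand": "+",
     "start": 3631, "end": 3913, "id": "placeholder-exon"},
    {"chr": "NC_003070.9", "type": None, "strand": "+",
     "start": 100, "end": 5000, "id": "placeholder-untyped"},
]

rows = gene_body_sketch(annotation, min_length=2000, flank_length=2000)
print(rows[0]["upstream"], rows[0]["downstream"])  # 1631 7899
```

Only the gene row survives the filter, and its flanks match the first row of the table above. How the library treats genes on the "-" strand or genes closer than flank_length to the start of a chromosome is not stated in this page, so the sketch does not attempt to model it beyond the clamp.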