bsxplorer.Genome.cds
- Genome.cds(min_length: int = 100) → DataFrame
Filter annotation by type == CDS and length threshold.
Warning
This method returns an empty result if the type is not specified in the input file.
- Parameters:
min_length – Region length threshold.
- Return type:
polars.DataFrame for downstream usage.
Examples
>>> path = "/path/to/genome.gff"
>>> genome = Genome.from_gff(path)
>>> genome.cds(min_length=200)
shape: (81_837, 7)
┌─────────────┬────────┬────────┬────────┬──────────┬────────────┬─────────────────┐
│ chr         ┆ strand ┆ start  ┆ end    ┆ upstream ┆ downstream ┆ id              │
│ ---         ┆ ---    ┆ ---    ┆ ---    ┆ ---      ┆ ---        ┆ ---             │
│ str         ┆ str    ┆ u64    ┆ u64    ┆ u64      ┆ u64        ┆ str             │
╞═════════════╪════════╪════════╪════════╪══════════╪════════════╪═════════════════╡
│ NC_003070.9 ┆ +      ┆ 3996   ┆ 4276   ┆ 3996     ┆ 4276       ┆ cds-NP_171609.1 │
│ …           ┆ …      ┆ …      ┆ …      ┆ …        ┆ …          ┆ …               │
│ NC_000932.1 ┆ +      ┆ 152806 ┆ 153195 ┆ 152806   ┆ 153195     ┆ cds-NP_051123.1 │
│ NC_000932.1 ┆ +      ┆ 153878 ┆ 154312 ┆ 153878   ┆ 154312     ┆ cds-NP_051123.1 │
└─────────────┴────────┴────────┴────────┴──────────┴────────────┴─────────────────┘
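The filtering rule described above (type == CDS plus a length threshold) can be sketched in plain Python to show why a file without type information yields nothing. This is only an illustration, not bsxplorer's implementation: the helper name `filter_cds`, the record layout, and the coordinates are hypothetical, and it assumes that region length is measured as `end - start` and that the comparison is strict.

```python
# Illustrative only: a plain-Python stand-in for the CDS filter,
# not bsxplorer's actual implementation.
# Assumption: region length is end - start and must exceed min_length.

def filter_cds(records, min_length=100):
    """Keep records whose type is "CDS" and whose length exceeds min_length."""
    return [
        r for r in records
        if r.get("type") == "CDS" and (r["end"] - r["start"]) > min_length
    ]

# Hypothetical annotation rows; coordinates are made up for the example.
records = [
    {"chr": "chr1", "type": "CDS", "start": 1000, "end": 1300},   # length 300
    {"chr": "chr1", "type": "CDS", "start": 2000, "end": 2050},   # length 50
    {"chr": "chr1", "type": "gene", "start": 500, "end": 5000},   # not a CDS
    {"chr": "chr2", "type": None, "start": 100, "end": 900},      # type missing
]

# Only the first row is both a CDS and longer than the threshold.
kept = filter_cds(records, min_length=200)

# With no type information at all, no row can match "CDS",
# which mirrors the empty output described in the warning.
untyped = [{**r, "type": None} for r in records]
empty = filter_cds(untyped, min_length=200)
```

In this sketch the type check and the length check are combined in one pass, so a missing type removes a row regardless of its length.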