Console usage

All console scripts require a configuration file: a tab-separated file with the columns listed below.

Column names should NOT be included in the actual configuration file.

| <ins>Group name</ins> | <ins>Path to report</ins> | <ins>Path to annotation</ins> | Flank length | Min region length | Region type | Genome type | Report type |
|---|---|---|---|---|---|---|---|
| Mock | mock-1.CX_report.txt | annotation.gff | 2000 | 0 | gene | gff | bismark |
| Mock | mock-2.CX_report.txt | annotation.gff | 2000 | 0 | gene | gff | bismark |
| Infected | infected-1.CX_report.txt | annotation.gff | 2000 | 0 | gene | gff | bismark |
| Infected | infected-2.CX_report.txt | annotation.gff | 2000 | 0 | gene | gff | bismark |

<ins>Underlined</ins> columns are required.
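As a minimal sketch, a configuration file like the one above could be generated with Python's standard `csv` module. The helper `write_config` and the output file name are hypothetical; the rows reuse the sample values from the table, and no header row is written.

```python
import csv

# Each row follows the column order described above:
# group name, report path, annotation path, flank length,
# min region length, region type, genome type, report type.
ROWS = [
    ("Mock", "mock-1.CX_report.txt", "annotation.gff", 2000, 0, "gene", "gff", "bismark"),
    ("Mock", "mock-2.CX_report.txt", "annotation.gff", 2000, 0, "gene", "gff", "bismark"),
    ("Infected", "infected-1.CX_report.txt", "annotation.gff", 2000, 0, "gene", "gff", "bismark"),
    ("Infected", "infected-2.CX_report.txt", "annotation.gff", 2000, 0, "gene", "gff", "bismark"),
]


def write_config(path, rows):
    """Write rows as tab-separated values without a header line."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)
```

Calling `write_config("example.conf", ROWS)` produces a four-line file that can be passed as the `config` argument.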

bsxplorer-metagene

usage: bsxplorer-metagene [-h] [-o NAME] [--dir DIR] [-m BLOCK_MB] [-t] [-s {wmean,mean,median,min,max,1pgeom}] [-u UBIN] [-d DBIN] [-b BBIN] [-q QUANTILE] [-C CONFIDENCE] [-S SMOOTH] [-H VRESOLUTION] [-V HRESOLUTION]
                          [--separate_strands] [--export {pdf,svg,none}] [--ticks TICKS TICKS TICKS TICKS TICKS]
                          config

Metagene report creation tool

positional arguments:
  config                Path to config file

options:
  -h, --help            show this help message and exit
  -o NAME, --out NAME   Output filename (default: Metagene_Report_08-07-24_18-47-26)
  --dir DIR             Output and working dir (default: /Users/shitohana/Desktop/PycharmProjects/BSXplorer/tests)
  -m BLOCK_MB, --block_mb BLOCK_MB
                        Block size for reading. (Block size ≠ amount of RAM used. Reader allocates approx. Block size * 20 memory for reading.) (default: 50)
  -t, --threads         Do multi-threaded or single-threaded reading. If multi-threaded option is used, number of threads is defined by `multiprocessing.cpu_count()` (default: False)
  -s {wmean,mean,median,min,max,1pgeom}, --sumfunc {wmean,mean,median,min,max,1pgeom}
                        Summary function to calculate density for bin with. (default: wmean)
  -u UBIN, --ubin UBIN  Number of windows for upstream region (default: 50)
  -d DBIN, --dbin DBIN  Number of windows for downstream region (default: 50)
  -b BBIN, --bbin BBIN  Number of windows for body region (default: 100)
  -q QUANTILE, --quantile QUANTILE
                        Quantile of most varying genes to draw on clustermap (default: 0.75)
  -C CONFIDENCE, --confidence CONFIDENCE
                        Probability for confidence bands for line-plot. 0 if disabled (default: 0.95)
  -S SMOOTH, --smooth SMOOTH
                        Windows for SavGol function. (default: 10)
  -H VRESOLUTION        Vertical resolution for heat-map (default: 100)
  -V HRESOLUTION        Horizontal resolution for heat-map (default: 100)
  --separate_strands    Do strands need to be processed separately (default: False)
  --export {pdf,svg,none}
                        Export format for plots (set none to disable) (default: pdf)
  --ticks TICKS TICKS TICKS TICKS TICKS
                        Names of ticks (- character should be escaped with double backslash) (default: None)
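To illustrate how `--ubin`, `--bbin`, and `--dbin` relate to the flank length from the configuration file, here is a minimal sketch that assumes each region is split into equal-width windows. The function `window_index` and its arguments are hypothetical and are not part of BSXplorer; the tool's actual binning may differ in detail.

```python
def window_index(pos, start, end, flank=2000, ubin=50, bbin=100, dbin=50):
    """Map a genomic position to a window index on the metagene axis.

    Windows 0..ubin-1 cover the upstream flank, the next bbin windows
    cover the region body, and the last dbin windows cover the
    downstream flank. Returns None for positions outside the flanks.
    Assumes a forward-strand region with start < end.
    """
    if start - flank <= pos < start:
        # Upstream flank: fixed length, equal-width windows.
        return (pos - (start - flank)) * ubin // flank
    if start <= pos < end:
        # Body: windows scale with the region's own length.
        return ubin + (pos - start) * bbin // (end - start)
    if end <= pos < end + flank:
        # Downstream flank: fixed length, equal-width windows.
        return ubin + bbin + (pos - end) * dbin // flank
    return None
```

Under these assumptions, a 2000 bp flank with the default 50 windows gives 40 bp per flank window, while body windows vary in width with the length of each region.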

Example

Interspecies report.

Command:

bsxplorer-metagene --dir metagene -o metagene_intra -m 30 -u 100 -b 200 -d 100 -q 0.95 -C 0.95 -S 10 -H 50 -V 50 --export pdf --ticks \\-2000bp \\  Body \\  +2000bp metagene.conf

Example config file (metagene.conf)

| Sample name | Report path | Annot path | Flank length | Min length | Region type | Annot format | Report format |
|---|---|---|---|---|---|---|---|
| AraTh | A_thaliana.txt | A_thaliana_genomic.gff | 2000 | 0 | gene | gff | bismark |
| BraDi | Brachypodium_distachyon_leaf.txt | Brachypodium_distachyon_genomic.gff | 2000 | 0 | gene | gff | bismark |
| CucSa | C_sativus.txt | C_sativus_genomic.gff | 2000 | 0 | gene | gff | bismark |
| MusMu | SRR16815382_Mus_musculus.CX_report.gz | Mus_musculus_genomic.gff | 2000 | 0 | gene | gff | bismark |

Output HTML-report example

Same species report.

Command:

bsxplorer-metagene --dir metagene-brapa -o metagene_brapa -m 30 -u 100 -b 200 -d 100 -q 0.95 -C 0.95 -S 10 -H 50 -V 50 --export pdf --ticks \\-2000bp \\  Body \\  +2000bp brapa.conf

Example config file (brapa.conf)

| Sample name | Report path | Annot path | Flank length | Min length | Region type | Annot format | Report format |
|---|---|---|---|---|---|---|---|
| Misugi_mock | DRR336466.CX_report.txt | genomic.gff | 2000 | 0 | gene | gff | bismark |
| Misugi_mock | DRR336467.CX_report.txt | genomic.gff | 2000 | 0 | gene | gff | bismark |
| Misugi_infected | DRR336468.CX_report.txt | genomic.gff | 2000 | 0 | gene | gff | bismark |
| Misugi_infected | DRR336469.CX_report.txt | genomic.gff | 2000 | 0 | gene | gff | bismark |
| Nanane_mock | DRR336470.CX_report.txt | genomic.gff | 2000 | 0 | gene | gff | bismark |
| Nanane_mock | DRR336471.CX_report.txt | genomic.gff | 2000 | 0 | gene | gff | bismark |
| Nanane_infected | DRR336472.CX_report.txt | genomic.gff | 2000 | 0 | gene | gff | bismark |
| Nanane_infected | DRR336473.CX_report.txt | genomic.gff | 2000 | 0 | gene | gff | bismark |

Output HTML-report example

bsxplorer-category

usage: bsxplorer-categorise [-h] [-o NAME] [--dir DIR] [-m BLOCK_MB] [-t] [-s {wmean,mean,median,min,max,1pgeom}] [-u UBIN] [-d DBIN] [-b BBIN] [-q QUANTILE] [-C CONFIDENCE] [-S SMOOTH] [-H VRESOLUTION] [-V HRESOLUTION]
                            [--separate_strands] [--export {pdf,svg,none}] [--ticks TICKS TICKS TICKS TICKS TICKS] [--cytosine_p CYTOSINE_P] [--min_cov MIN_COV] [--region_p REGION_P] [--save_cat | --no-save_cat]
                            config

BM, UM categorisation tool

positional arguments:
  config                Path to config file

options:
  -h, --help            show this help message and exit
  -o NAME, --out NAME   Output filename (default: Metagene_Report_08-07-24_18-49-15)
  --dir DIR             Output and working dir (default: /Users/shitohana/Desktop/PycharmProjects/BSXplorer/tests)
  -m BLOCK_MB, --block_mb BLOCK_MB
                        Block size for reading. (Block size ≠ amount of RAM used. Reader allocates approx. Block size * 20 memory for reading.) (default: 50)
  -t, --threads         Do multi-threaded or single-threaded reading. If multi-threaded option is used, number of threads is defined by `multiprocessing.cpu_count()` (default: False)
  -s {wmean,mean,median,min,max,1pgeom}, --sumfunc {wmean,mean,median,min,max,1pgeom}
                        Summary function to calculate density for bin with. (default: wmean)
  -u UBIN, --ubin UBIN  Number of windows for upstream region (default: 50)
  -d DBIN, --dbin DBIN  Number of windows for downstream region (default: 50)
  -b BBIN, --bbin BBIN  Number of windows for body region (default: 100)
  -q QUANTILE, --quantile QUANTILE
                        Quantile of most varying genes to draw on clustermap (default: 0.75)
  -C CONFIDENCE, --confidence CONFIDENCE
                        Probability for confidence bands for line-plot. 0 if disabled (default: 0.95)
  -S SMOOTH, --smooth SMOOTH
                        Windows for SavGol function. (default: 10)
  -H VRESOLUTION        Vertical resolution for heat-map (default: 100)
  -V HRESOLUTION        Horizontal resolution for heat-map (default: 100)
  --separate_strands    Do strands need to be processed separately (default: False)
  --export {pdf,svg,none}
                        Export format for plots (set none to disable) (default: pdf)
  --ticks TICKS TICKS TICKS TICKS TICKS
                        Names of ticks (- character should be escaped with double backslash) (default: None)
  --cytosine_p CYTOSINE_P
                        P-value for binomial test to consider cytosine methylated (default: .05)
  --min_cov MIN_COV     Minimal coverage for cytosine to keep (default: 2)
  --region_p REGION_P   P-value for binomial test to consider region methylated (default: .05)
  --save_cat, --no-save_cat
                        Whether categories should be saved (default: True)
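The `--cytosine_p`, `--min_cov`, and `--region_p` options refer to binomial tests. The following is only a sketch of what a one-sided binomial test on a single cytosine could look like. The background error rate `error_rate`, the function names, and the exact test formulation are assumptions made for illustration and may not match what the tool computes.

```python
from math import comb


def binom_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_methylated(count_m, count_total, error_rate=0.01, p_value=0.05, min_cov=2):
    """Call a cytosine methylated if its methylated read count is
    unlikely under the assumed background error rate.

    Returns None when coverage is below min_cov, meaning the cytosine
    would be dropped rather than classified.
    """
    if count_total < min_cov:
        return None
    return binom_sf(count_m, count_total, error_rate) < p_value
```

With these assumed defaults, 5 methylated reads out of 10 pass the test, while 1 out of 10 does not, because a single methylated read is still plausible at a 1% error rate.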

bsxplorer-chr

usage: bsxplorer-chr [-h] [-o NAME] [--dir DIR] [-m BLOCK_MB] [-t THREADS] [-w WINDOW] [-l MIN_LENGTH] [-C CONFIDENCE] [-S SMOOTH] [--export {pdf,svg,none}] [--separate_strands] config

Chromosome methylation levels visualisation tool

positional arguments:
  config                Path to config file

options:
  -h, --help            show this help message and exit
  -o NAME, --out NAME   Output filename (default: Metagene_Report_08-07-24_18-47-14)
  --dir DIR             Output and working dir (default: /Users/shitohana/Desktop/PycharmProjects/BSXplorer/tests)
  -m BLOCK_MB, --block_mb BLOCK_MB
                        Block size for reading. (Block size ≠ amount of RAM used. Reader allocates approx. Block size * 20 memory for reading.) (default: 50)
  -t THREADS, --threads THREADS
                        Do multi-threaded or single-threaded reading. If multi-threaded option is used, number of threads is defined by `multiprocessing.cpu_count()` (default: True)
  -w WINDOW, --window WINDOW
                        Length of windows in bp (default: 1000000)
  -l MIN_LENGTH, --min_length MIN_LENGTH
                        Minimum length of chromosome to be analyzed (default: 1000000)
  -C CONFIDENCE, --confidence CONFIDENCE
                        Probability for confidence bands for line-plot. 0 if disabled (default: 0.95)
  -S SMOOTH, --smooth SMOOTH
                        Windows for SavGol function. (default: 100)
  --export {pdf,svg,none}
                        Export format for plots (set none to disable) (default: pdf)
  --separate_strands    Do strands need to be processed separately (default: False)
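The `--window` option sets the length of the windows along each chromosome. As a rough sketch of the idea, assuming 0-based positions and a read-weighted methylation level per window, cytosines can be grouped by integer division of their position. The function `window_levels` and its input layout are hypothetical and do not describe BSXplorer's internals.

```python
from collections import defaultdict


def window_levels(cytosines, window=1_000_000):
    """Aggregate cytosines into fixed-length windows.

    cytosines: iterable of (position, methylated_reads, total_reads).
    Returns {window_start: methylated / total} for windows with coverage.
    """
    sums = defaultdict(lambda: [0, 0])
    for pos, count_m, count_total in cytosines:
        bucket = sums[(pos // window) * window]
        bucket[0] += count_m
        bucket[1] += count_total
    return {start: m / t for start, (m, t) in sorted(sums.items()) if t > 0}
```

A smaller `window` gives a finer profile at the cost of noisier per-window estimates, which is where the smoothing option becomes relevant.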

Example

Command:

bsxplorer-metagene --dir metagene -o metagene_intra -m 30 -u 100 -b 200 -d 100 -q 0.95 -C 0.95 -S 10 -H 50 -V 50 --export pdf --ticks \\-2000bp \\  Body \\  +2000bp metagene.conf

Config file: Example config file (brapa.conf)

Output HTML-report example

bsxplorer-bam

usage: bsxplorer-bam [-h] --bam BAM --bai BAI [-f FASTA] [--bamtype {bismark}] [-m {report,stats}] [--to_type {bismark,cgmap,bedgraph,coverage,binom}] [--stat {ME,EPM,PDR}] [--stat_param STAT_PARAM] [--stat_md STAT_MD]
                     [-g GFF] [-c {CG,CHG,CHH,all}] [-q {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42}] [-s] [--no_qc] [-t THREADS] [-n BATCH_N]
                     [-a READAHEAD]
                     output

BAM to report reader converter tool.

positional arguments:
  output                Path to output file.

options:
  -h, --help            show this help message and exit
  --bam BAM             Path to SORTED .bam file with alignments (default: None)
  --bai BAI             Path to .bai index file (default: None)
  -f FASTA, --fasta FASTA
                        Path to .fasta file with reference sequence for full cytosine report. (default: None)
  --bamtype {bismark}   Type of aligner which was used for generating BAM. (default: bismark)
  -m {report,stats}, --mode {report,stats}
  --to_type {bismark,cgmap,bedgraph,coverage,binom}
                        Specifies the output file type if mode is set to 'report'. (default: bismark)
  --stat {ME,EPM,PDR}   Specifies the BAM stat type if mode is set to 'stats' (default: ME)
  --stat_param STAT_PARAM
                        See docs for stat-specific parameters. (default: 4)
  --stat_md STAT_MD     Minimum number of reads for cytosine to be analysed (if mode is 'stats') (default: 4)
  -g GFF, --gff GFF     Path to regions genome coordinates .gff file, if cytosines need to be filtered. (default: None)
  -c {CG,CHG,CHH,all}, --context {CG,CHG,CHH,all}
                        Filter cytosines by specific methylation context (default: all)
  -q {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42}, --min_qual {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42}
                        Filter cytosines by read Phred score quality (default: None)
  -s, --skip_converted  Skip reads aligned to converted sequence (default: False)
  --no_qc               Do not calculate QC stats (default: False)
  -t THREADS, --threads THREADS
                        How many threads will be used for reading the BAM file. (default: 1)
  -n BATCH_N, --batch_n BATCH_N
                        Number of reads per batch. (default: 10000.0)
  -a READAHEAD, --readahead READAHEAD
                        Number of batches to be read before processing. (default: 5)
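The values accepted by `--min_qual` are Phred quality scores. By the standard definition, a score Q corresponds to a base-call error probability of 10^(-Q/10). The short sketch below shows that conversion together with a hypothetical threshold check; whether the tool treats the threshold as inclusive is an assumption here.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)


def passes_min_qual(q, min_qual=None):
    """Hypothetical minimum-quality filter; None disables filtering."""
    return min_qual is None or q >= min_qual
```

For example, a threshold of 20 keeps bases with an estimated error probability of at most 1%, and a threshold of 30 keeps those at or below 0.1%.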