The pygtftk.bwig.bw_coverage module

A module to compute bigWig coverage over a set of regions (BED file).

pygtftk.bwig.bw_coverage.bw_cov_mp(bw_list=None, region_file=None, labels=None, bin_nb=None, nb_proc=None, n_highest=None, zero_to_na=False, pseudo_count=None, stat='mean', verbose=False)

Compute bigwig coverage (multi-processed) for a set of regions.

Parameters:
  • bw_list – the list of bigWig files to be processed.
  • region_file – the BED file containing the regions for which coverage is to be computed.
  • labels – short names for the bigWig files.
  • bin_nb – the number of bins into which each region should be split.
  • nb_proc – Number of threads to be used.
  • n_highest – compute the mean coverage based on the n highest values in the bins.
  • pseudo_count – The value for a pseudo-count.
  • verbose – run in verbose mode.
  • stat – mean (default) or sum.
  • zero_to_na – Convert missing values to NA, not zero.

Returns a file.
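
The parameters bin_nb, stat, pseudo_count, zero_to_na and n_highest interact, so a small illustration may help. The following is a minimal pure-Python sketch, not the pygtftk implementation: the helper names (split_into_bins, summarize_bin, mean_of_n_highest) are hypothetical, and the exact handling of missing values and of the pseudo-count is an assumption made for the example.

```python
import math


def split_into_bins(start, end, bin_nb):
    """Split the half-open region [start, end) into bin_nb contiguous bins."""
    length = end - start
    if not 1 <= bin_nb <= length:
        raise ValueError("bin_nb must be between 1 and the region length")
    edges = [start + (i * length) // bin_nb for i in range(bin_nb + 1)]
    return list(zip(edges[:-1], edges[1:]))


def summarize_bin(values, stat="mean", pseudo_count=0.0, zero_to_na=False):
    """Summarize the per-base values of one bin (None = no signal recorded)."""
    if all(v is None for v in values):
        # Assumption: a bin with no signal at all becomes NA or zero,
        # depending on zero_to_na.
        return math.nan if zero_to_na else 0.0 + pseudo_count
    # Assumption: isolated missing bases inside a covered bin count as zero.
    filled = [0.0 if v is None else v for v in values]
    total = sum(filled)
    result = total if stat == "sum" else total / len(filled)
    return result + pseudo_count


def mean_of_n_highest(bin_values, n_highest):
    """Average the n_highest largest bin values, ignoring NA bins."""
    usable = sorted((v for v in bin_values if not math.isnan(v)), reverse=True)
    top = usable[:n_highest]
    return sum(top) / len(top) if top else math.nan


# Toy per-base signal for a 10 bp region starting at position 100.
region_start, region_end = 100, 110
signal = [1.0, 3.0, None, None, None, 4.0, 6.0, 0.0, 0.0, 3.0]

bins = split_into_bins(region_start, region_end, bin_nb=4)
bin_values = [
    summarize_bin(signal[s - region_start:e - region_start], zero_to_na=True)
    for s, e in bins
]
score = mean_of_n_highest(bin_values, n_highest=2)
```

In this sketch the second bin contains no signal, so it is reported as NA because zero_to_na is set, and the region score is computed from the two highest remaining bins only.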

pygtftk.bwig.bw_coverage.bw_profile_mp(in_bed_file=None, nb_proc=None, big_wig=None, bin_nb=None, pseudo_count=0, stranded=True, type=None, labels=None, outputfile=None, zero_to_na=False, bed_format=False, add_score=False, stat='mean', verbose=False)

Compute bigwig profile for a set of regions.

Parameters:
  • in_bed_file – the BED file containing the regions for which coverage is to be computed.
  • nb_proc – Number of threads to be used.
  • big_wig – The bigWig files to be processed.
  • bin_nb – the number of bins into which each region should be split.
  • pseudo_count – The value for a pseudo-count.
  • stranded – controls whether the profile should be ordered based on strand.
  • type – this string will be added to the output to indicate the type of region (e.g. tss, promoter, …).
  • labels – short names for the bigWig files.
  • outputfile – output file name.
  • zero_to_na – Convert missing values to NA, not zero.
  • bed_format – Force Bed format. Default is to write columns in the following way: bwig, chrom, start, end, gene/feature, strand…
  • add_score – add a ‘score’ column (“.”), only for downstream compatibility.
  • stat – mean (default) or sum.
  • verbose – run in verbose mode.

Returns a file.
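
One plausible reading of the stranded parameter is that the bins of minus-strand regions are reversed, so that the first bin always corresponds to the 5' end of the feature. The sketch below illustrates that reading only; orient_profile is a hypothetical helper and the behaviour shown is an assumption, not taken from the pygtftk source.

```python
def orient_profile(bin_values, strand, stranded=True):
    """Return the binned profile in 5'-to-3' order when stranded is True.

    Assumption: for regions on the minus strand the bins are reversed;
    otherwise the genomic (left-to-right) order is kept.
    """
    if stranded and strand == "-":
        return list(reversed(bin_values))
    return list(bin_values)


# Toy profile with three bins, given in genomic order.
profile = [1.0, 2.0, 3.0]

plus_profile = orient_profile(profile, "+")
minus_profile = orient_profile(profile, "-")
unstranded_profile = orient_profile(profile, "-", stranded=False)
```

With this convention, profiles from genes on both strands can be stacked or averaged bin by bin without mixing upstream and downstream signal.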