mpa.SimulateLibrary

Overview

SimulateLibrary is a class within the mpathic package that creates a library of random mutants from an initial wildtype sequence and a mutation rate.

Usage

>>> import mpathic as mpa
>>> mpa.simulate_library_class(wtseq="TAATGTGAGTTAGCTCACTCAT")

Example Output Table:

ct            seq
100           ACAGGGTTAC
50            ACGGGGTTAC
...
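
The idea behind a ct/seq table like the one above can be illustrated with a small, self-contained sketch. This is not mpathic's implementation: the function name `simulate_library_sketch`, the uniform choice among non-wildtype characters, and the fixed random seed are all assumptions made for illustration. Each position is mutated independently with probability `mutrate`, and identical sequences are then collapsed into counts.

```python
import random
from collections import Counter

DNA_ALPHABET = "ACGT"  # assumed alphabet for dicttype='dna'


def simulate_library_sketch(wtseq, mutrate=0.1, numseq=10000, seed=0):
    """Return (ct, seq) rows, most frequent first. Illustrative only."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = Counter()
    for _ in range(numseq):
        chars = []
        for wt_char in wtseq:
            if rng.random() < mutrate:
                # Mutate: pick uniformly among the other characters (assumption).
                chars.append(rng.choice([c for c in DNA_ALPHABET if c != wt_char]))
            else:
                chars.append(wt_char)
        counts["".join(chars)] += 1
    # Collapse duplicates into a ct/seq table sorted by count, descending.
    return [(ct, seq) for seq, ct in counts.most_common()]


rows = simulate_library_sketch("TAATGTGAGTTAGCTCACTCAT", mutrate=0.1, numseq=1000)
print("ct\tseq")
for ct, seq in rows[:3]:
    print(f"{ct}\t{seq}")
```

With a low mutation rate the unmutated wildtype is expected to be the most frequent sequence, followed by a long tail of rare mutants.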

Class Details

class mpathic.src.simulate_library.SimulateLibrary(wtseq='ACGACGA', mutrate=0.1, numseq=10000, dicttype='dna', probarr=None, tags=False, tag_length=10)
Parameters:
wtseq : (string)

wildtype sequence. Must contain characters ‘A’, ‘C’, ‘G’, ‘T’ for dicttype = ‘dna’, or ‘A’, ‘C’, ‘G’, ‘U’ for dicttype = ‘rna’.

mutrate : (float)

mutation rate

numseq : (int)

number of sequences. Must be a positive integer.

dicttype : (string)

sequence dictionary: valid choices include ‘dna’, ‘rna’, ‘pro’

probarr : (np.ndarray)

probability matrix used to generate bases

tags : (boolean)

If True, tags are simulated and each generated sequence gets a unique tag.

tag_length : (int)

Length of tags. Should be >= 0

Attributes:
output_df : (pandas dataframe)

Contains the simulated library as a pandas DataFrame.
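
To make the probarr parameter more concrete, here is a minimal sketch of how a probability matrix could be derived from a wildtype sequence and a mutation rate. The layout (one row per character, one column per position), the row order "ACGT", and the helper name `build_probarr_sketch` are assumptions; the documentation above only states that probarr is a probability matrix used to generate bases. The sketch uses nested lists to stay dependency-free, and `numpy.array(probarr)` would turn the result into an ndarray.

```python
DNA_ALPHABET = "ACGT"  # assumed character order for the rows


def build_probarr_sketch(wtseq, mutrate=0.1, alphabet=DNA_ALPHABET):
    """Return a len(alphabet) x len(wtseq) nested list of probabilities."""
    if not 0 <= mutrate <= 1:
        raise ValueError("mutrate must lie in [0, 1]")
    # Probability mass for each non-wildtype character at a position.
    off_target = mutrate / (len(alphabet) - 1)
    return [
        [1 - mutrate if wt_char == row_char else off_target for wt_char in wtseq]
        for row_char in alphabet
    ]


probarr = build_probarr_sketch("ACGT", mutrate=0.1)
# Each column (one position) sums to 1 up to floating-point rounding.
for pos in range(4):
    assert abs(sum(row[pos] for row in probarr) - 1.0) < 1e-9
```

In this layout the wildtype character at each position receives probability 1 - mutrate, and the remaining mass is split evenly among the other characters.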

arr2seq(arr, inv_dict)

Convert an array of numbers back into a sequence of bases.

seq2arr(seq, seq_dict)

Convert a sequence of bases into an array of numbers.
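
The round trip performed by these two helpers can be sketched as follows. The mappings `seq_dict` and `inv_dict` shown here are hypothetical, as are the `_sketch` function names; the dictionaries mpathic actually passes in may use a different ordering or container type.

```python
# Hypothetical mappings; the real seq_dict and inv_dict may differ.
seq_dict = {"A": 0, "C": 1, "G": 2, "T": 3}
inv_dict = {index: char for char, index in seq_dict.items()}


def seq2arr_sketch(seq, seq_dict):
    """Map each character of seq to its integer code."""
    return [seq_dict[char] for char in seq]


def arr2seq_sketch(arr, inv_dict):
    """Map integer codes back to characters and join them into a string."""
    return "".join(inv_dict[index] for index in arr)


encoded = seq2arr_sketch("ACGT", seq_dict)
decoded = arr2seq_sketch(encoded, inv_dict)
print(encoded)  # [0, 1, 2, 3]
print(decoded)  # ACGT
```

Working on integer codes rather than characters makes it straightforward to index into a probability matrix by character.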