tidymut.utils.cleaner_workers module
- tidymut.utils.cleaner_workers.apply_single_mutation(row_data: Tuple, dataset_columns: Index, sequence_column: str, name_column: str, mutation_column: str, position_columns: Dict[str, str] | None, mutation_sep: str, is_zero_based: bool, sequence_class: Type[ProteinSequence | DNASequence | RNASequence]) → Tuple[str | None, str | None]
Apply mutations to a single sequence.
- Parameters:
row_data (Tuple) – Row data from the dataset
dataset_columns (Index) – Column names of the dataset
sequence_column (str) – Column name containing sequences
name_column (str) – Column name containing protein identifiers
mutation_column (str) – Column name containing mutation information
position_columns (Optional[Dict[str, str]]) – Position column mapping for sequence extraction
mutation_sep (str) – Separator for splitting multiple mutations
is_zero_based (bool) – Whether mutation positions are zero-based
sequence_class (Type[Union[ProteinSequence, DNASequence, RNASequence]]) – Sequence class to use for mutation application
- Returns:
(mutated_sequence, error_message). Exactly one element is set: the mutated sequence on success or the error message on failure; the other is None.
- Return type:
Tuple[Optional[str], Optional[str]]
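The (mutated_sequence, error_message) return convention can be illustrated with a small stand-in. The sketch below does not call tidymut: `apply_point_mutations` is a hypothetical helper, and the `A2G`-style notation (wild-type residue, position, mutant residue) is an assumption made for illustration. It only shows how `mutation_sep` and `is_zero_based` might shape the result, and how a caller can branch on which tuple element is None.

```python
from typing import Optional, Tuple


def apply_point_mutations(
    sequence: str,
    mutations: str,
    mutation_sep: str = ",",
    is_zero_based: bool = False,
) -> Tuple[Optional[str], Optional[str]]:
    """Hypothetical stand-in, NOT tidymut's implementation.

    Assumes each mutation looks like "A2G": wild-type residue,
    position, mutant residue. Returns (mutated_sequence, None) on
    success or (None, error_message) on failure.
    """
    residues = list(sequence)
    for token in mutations.split(mutation_sep):
        token = token.strip()
        if len(token) < 3 or not token[1:-1].isdecimal():
            return None, f"cannot parse mutation {token!r}"
        wild_type, mutant = token[0], token[-1]
        position = int(token[1:-1])
        # Convert to a zero-based index when positions are one-based.
        index = position if is_zero_based else position - 1
        if not 0 <= index < len(residues):
            return None, f"position {position} out of range in {token!r}"
        if residues[index] != wild_type:
            return None, (
                f"expected {wild_type!r} at position {position}, "
                f"found {residues[index]!r}"
            )
        residues[index] = mutant
    return "".join(residues), None


# Success: the sequence is set and the error slot is None.
mutated, error = apply_point_mutations("MKTAY", "K2R,A4G")
# Failure: the sequence slot is None and the error explains why.
failed, failure_reason = apply_point_mutations("MKTAY", "K3R")
```

Returning the error as data rather than raising lets a caller process every row and collect the failures afterwards, instead of stopping at the first bad record.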
- tidymut.utils.cleaner_workers.infer_wt_sequence_grouped(group_data: Tuple[Any, pd.DataFrame], name_column: str, mutation_column: str, sequence_column: str, label_columns: List[str], wt_label: float, mutation_sep: str, is_zero_based: bool, handle_multiple_wt: Literal['error', 'separate', 'first'], sequence_class: Type[ProteinSequence | DNASequence | RNASequence], alphabet_class: Type[ProteinAlphabet | DNAAlphabet | RNAAlphabet]) → Tuple[List[Dict[str, Any]], str]
Process a single protein group and return a list of rows (including WT).
This is a module-level function that processes protein groups independently.
- tidymut.utils.cleaner_workers.valid_single_mutation(args: Tuple) → Tuple[str | None, str | None]
Process a single mutation string.
- Parameters:
args (Tuple) – Packed arguments, in the order (mut_info, format_mutations, mutation_sep, is_zero_based, cache)
- Returns:
(formatted_mutation, error_message). One of the two elements is None.
- Return type:
Tuple[Optional[str], Optional[str]]
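A single packed tuple is a common shape for workers dispatched through a `map`-style call, since each work item then travels as one argument. The sketch below illustrates that pattern with a hypothetical stand-in, `validate_one`. It does not call tidymut, its tuple carries only three of the five documented fields, and its validation rule (uppercase letter, digits, uppercase letter) is an assumption.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumed notation for illustration: uppercase letter, digits, uppercase letter.
_MUTATION_PATTERN = re.compile(r"[A-Z]\d+[A-Z]")

WorkerResult = Tuple[Optional[str], Optional[str]]


def validate_one(args: Tuple[str, str, Dict[str, WorkerResult]]) -> WorkerResult:
    """Hypothetical stand-in worker, NOT tidymut's implementation."""
    mut_info, mutation_sep, cache = args
    # Reuse an earlier result for an identical mutation string.
    if mut_info in cache:
        return cache[mut_info]
    tokens = [token.strip() for token in mut_info.split(mutation_sep)]
    bad = [token for token in tokens if not _MUTATION_PATTERN.fullmatch(token)]
    if bad:
        result: WorkerResult = (None, f"invalid mutation(s): {', '.join(bad)}")
    else:
        result = (mutation_sep.join(tokens), None)
    cache[mut_info] = result
    return result


shared_cache: Dict[str, WorkerResult] = {}
raw_mutations = ["A2G, K10R", "a2g", "A2G, K10R"]
# Pack one tuple per work item so the worker takes a single argument.
packed = [(raw, ",", shared_cache) for raw in raw_mutations]
results: List[WorkerResult] = list(map(validate_one, packed))

# Split the (value, error) pairs into successes and failures.
formatted = [value for value, error in results if error is None]
errors = [error for value, error in results if error is not None]
```

The built-in `map` keeps the sketch in one process, so the plain dict really is shared. With a process pool, each worker would typically receive its own pickled copy of the dict, so a cache passed this way would not be shared across workers.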