presto.Sequence
Sequence processing functions
-
class presto.Sequence.AssemblyRecord(seq=None)
Bases: object
A class defining a paired-end assembly result
-
property overlap
-
property
-
class presto.Sequence.AssemblyStats(n)
Bases: object
Class containing p-value and z-score matrices for scoring assemblies
-
class presto.Sequence.PrimerAlignment(seq=None)
Bases: object
A class defining a primer alignment result
-
__bool__()
Boolean evaluation of the alignment
- Returns
evaluates to the value of the valid attribute
- Return type
-
-
presto.Sequence.alignAssembly(head_seq, tail_seq, alpha=1e-05, max_error=0.3, min_len=8, max_len=1000, scan_reverse=False, assembly_stats=None, score_dict=getDNAScoreDict(mask_score=(1, 1), gap_score=(0, 0)))
Stitches two sequences together by aligning the ends
- Parameters
head_seq – the head SeqRecord.
tail_seq – the tail SeqRecord.
alpha – the minimum p-value for a valid assembly.
max_error – the maximum error rate for a valid assembly.
min_len – minimum length of overlap to test.
max_len – maximum length of overlap to test.
scan_reverse – if True, allow the head sequence to overhang the end of the tail sequence; if False, end the alignment scan at the end of the tail sequence or the start of the head sequence.
assembly_stats – optional successes by trials numpy.array of p-values.
score_dict – optional dictionary of character scores in the form {(char1, char2): score}.
- Returns
assembled sequence object.
- Return type
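The overlap scan controlled by alpha, max_error, min_len and max_len can be illustrated with a minimal sketch. It is not pRESTO's implementation: align_ends_sketch and binom_sf are hypothetical names, sequences are plain strings rather than SeqRecord objects, scoring is exact identity rather than score_dict, and the p-value is a binomial tail that assumes a 0.25 chance of a random match.

```python
from math import comb


def binom_sf(k, n, p=0.25):
    """P(X >= k) for X ~ Binomial(n, p); p=0.25 is an assumed random match rate."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def align_ends_sketch(head, tail, alpha=1e-5, max_error=0.3, min_len=8, max_len=1000):
    """Hypothetical helper: find the most significant head-end/tail-start overlap."""
    best = None
    longest = min(len(head), len(tail), max_len)
    for n in range(min_len, longest + 1):
        # Compare the last n characters of head with the first n of tail
        matches = sum(a == b for a, b in zip(head[-n:], tail[:n]))
        error = 1 - matches / n
        pvalue = binom_sf(matches, n)
        # Keep only overlaps that satisfy both thresholds
        if error <= max_error and pvalue <= alpha:
            if best is None or pvalue < best[3]:
                best = (head + tail[n:], n, error, pvalue)
    return best  # None when no overlap is valid


shared = 'ACGGTCAAGCTTGACC'
result = align_ends_sketch('T' * 10 + shared, shared + 'G' * 10)
print(result[0], result[1])
```

The sketch keeps the overlap with the smallest p-value; the library's actual ranking of candidate overlaps may differ.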
-
presto.Sequence.calculateDiversity(seq_list, score_dict=getDNAScoreDict())
Determine the average pairwise error rate for a list of sequences
- Parameters
seq_list – List of SeqRecord objects to score
score_dict – Optional dictionary of alignment scores as {(char1, char2): score}
- Returns
Average pairwise error rate for the list of sequences
- Return type
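As a rough illustration of an average pairwise error rate, the sketch below works on plain strings and counts exact mismatches. The function names are hypothetical and the identity-based scoring is an assumption; the documented function takes SeqRecord objects and a score dictionary.

```python
from itertools import combinations


def pairwise_error_sketch(seq1, seq2):
    """Fraction of mismatched positions (assumes equal-length, non-empty strings)."""
    mismatches = sum(a != b for a, b in zip(seq1, seq2))
    return mismatches / min(len(seq1), len(seq2))


def calculate_diversity_sketch(seqs):
    """Mean error rate over all unordered pairs; 0.0 when fewer than two sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(pairwise_error_sketch(a, b) for a, b in pairs) / len(pairs)


print(calculate_diversity_sketch(['AAAA', 'AAAT', 'AATT']))
```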
-
presto.Sequence.calculateSetError(seq_list, ref_seq, ignore_chars=['n', 'N'], score_dict=getDNAScoreDict())
Counts the occurrence of nucleotide mismatches from a reference in a set of sequences
- Parameters
seq_list – list of SeqRecord objects with aligned sequences.
ref_seq – SeqRecord object containing the reference sequence to match against.
ignore_chars – list of characters to exclude from mismatch counts.
score_dict – optional dictionary of alignment scores as {(char1, char2): score}.
- Returns
error rate for the set.
- Return type
-
presto.Sequence.checkSeqEqual(seq1, seq2, ignore_chars={'-', '.', 'N', 'n'})
Determine if two sequences are equal, excluding missing positions
- Parameters
seq1 – SeqRecord object
seq2 – SeqRecord object
ignore_chars – Set of characters to ignore
- Returns
True if the sequences are equal
- Return type
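A minimal sketch of equality that skips missing positions, using plain strings. The name check_seq_equal_sketch is hypothetical, and treating sequences of different length as unequal is an assumption rather than documented behavior.

```python
def check_seq_equal_sketch(seq1, seq2, ignore_chars=frozenset('-.Nn')):
    """Compare two strings position by position, skipping ignored characters."""
    # Assumption: sequences of different length are never equal
    if len(seq1) != len(seq2):
        return False
    return all(
        a == b
        for a, b in zip(seq1, seq2)
        if a not in ignore_chars and b not in ignore_chars
    )


print(check_seq_equal_sketch('ACGTN', 'ACNTA'))
print(check_seq_equal_sketch('ACGT', 'ACCT'))
```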
-
presto.Sequence.compilePrimers(primers)
Translates IUPAC Ambiguous Nucleotide characters to regular expressions and compiles them
- Parameters
primers – Dictionary of sequences to translate
- Returns
Dictionary of compiled regular expressions
- Return type
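The translation from IUPAC codes to regular expressions might look like the sketch below. The mapping table and the name compile_primers_sketch are illustrative assumptions; in particular, how the library translates N is not stated here, so N is expanded to [ACGT].

```python
import re

# Assumed mapping from IUPAC ambiguous nucleotide codes to character classes
IUPAC_REGEX = {
    'A': 'A', 'C': 'C', 'G': 'G', 'T': 'T',
    'R': '[AG]', 'Y': '[CT]', 'S': '[CG]', 'W': '[AT]', 'K': '[GT]', 'M': '[AC]',
    'B': '[CGT]', 'D': '[AGT]', 'H': '[ACT]', 'V': '[ACG]', 'N': '[ACGT]',
}


def compile_primers_sketch(primers):
    """Return {name: compiled regex} for a {name: IUPAC sequence string} dictionary."""
    return {
        name: re.compile(''.join(IUPAC_REGEX[char] for char in seq.upper()))
        for name, seq in primers.items()
    }


regex = compile_primers_sketch({'P1': 'ACRYN'})
print(regex['P1'].pattern)
```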
-
presto.Sequence.consensusUnify(data, field, delimiter=('|', '=', ','))
Reassigns all annotations to the consensus annotation in group
- Parameters
data – SeqData object containing sequences to process.
field – field containing annotations to collapse.
delimiter – a tuple of delimiters for (annotations, field/values, value lists).
- Returns
modified sequences.
- Return type
-
presto.Sequence.deleteSeqPositions(seq, positions)
Deletes a list of positions from a SeqRecord
- Parameters
seq – SeqRecord object
positions – Set of positions (indices) to delete
- Returns
Modified SeqRecord with the specified positions removed
- Return type
SeqRecord
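Deleting positions by index can be sketched on a plain string as follows. The name delete_positions_sketch is hypothetical; the documented function operates on a SeqRecord, and this sketch does not handle the quality scores or annotations a record may carry.

```python
def delete_positions_sketch(seq, positions):
    """Return seq with the characters at the given 0-based indices removed."""
    drop = set(positions)
    return ''.join(char for i, char in enumerate(seq) if i not in drop)


print(delete_positions_sketch('AC-GT', {2}))
```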
-
presto.Sequence.deletionUnify(data, field, delimiter=('|', '=', ','))
Removes all sequences with differing annotations in a group
- Parameters
data – SeqData object containing sequences to process.
field – field containing annotations to collapse.
delimiter – a tuple of delimiters for (annotations, field/values, value lists).
- Returns
modified sequences.
- Return type
-
presto.Sequence.extractAlignment(seq_record, start, length, rev_primer=False)
Extracts a subsequence from a sequence
- Parameters
seq_record – SeqRecord to process.
start – position where subsequence starts.
length – the length of the subsequence to extract.
rev_primer – if True extract relative to the tail end of the sequence.
- Returns
extraction results as an alignment object
- Return type
-
presto.Sequence.filterLength(data, min_length=250, inner=True, missing_chars='.-Nn')
Filters sequences by length
- Parameters
- Returns
SeqResult object.
- Return type
-
presto.Sequence.filterMissing(data, max_missing=10, inner=True, missing_chars='.-Nn')
Filters sequences by number of missing nucleotides
- Parameters
- Returns
SeqResult object.
- Return type
-
presto.Sequence.filterQuality(data, min_qual=0, inner=True, missing_chars='.-Nn')
Filters sequences by quality score
- Parameters
- Returns
SeqResult object.
- Return type
-
presto.Sequence.filterRepeats(data, max_repeat=15, include_missing=False, inner=True, missing_chars='.-Nn')
Filters sequences by the number of repeating characters
- Parameters
data (SeqData) – a SeqData object with a single SeqRecord to process.
max_repeat (int) – the maximum number of allowed repeating characters.
include_missing (bool) – if True count ambiguous character repeats; if False do not consider ambiguous character repeats.
inner (bool) – if True exclude outer missing characters from calculation.
missing_chars (str) – a string of missing character values.
- Returns
SeqResult object.
- Return type
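One way to measure the longest run of a repeated character is sketched below. The function names are hypothetical, the sketch works on plain strings, and it ignores the inner option; it only illustrates how max_repeat and include_missing could interact.

```python
from itertools import groupby


def longest_repeat_sketch(seq, include_missing=False, missing_chars='.-Nn'):
    """Length of the longest run of one character, optionally counting missing characters."""
    run_lengths = (
        len(list(group))
        for char, group in groupby(seq)
        if include_missing or char not in missing_chars
    )
    return max(run_lengths, default=0)


def passes_repeat_filter_sketch(seq, max_repeat=15, include_missing=False):
    """True when no run is longer than max_repeat."""
    return longest_repeat_sketch(seq, include_missing) <= max_repeat


print(longest_repeat_sketch('ACGGGGTNNNNNNA'))
print(longest_repeat_sketch('ACGGGGTNNNNNNA', include_missing=True))
```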
-
presto.Sequence.findGapPositions(seq_list, max_gap, gap_chars={'-', '.'})
Finds positions in a set of aligned sequences with a high number of gap characters.
- Parameters
seq_list – List of SeqRecord objects with aligned sequences
max_gap – Float of the maximum gap frequency to consider a position as non-gapped
gap_chars – Set of characters to consider as gaps
- Returns
Positions (indices) with gap frequency greater than max_gap
- Return type
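The gap frequency test can be sketched on plain aligned strings as follows. The name find_gap_positions_sketch is hypothetical, and the sketch assumes all sequences have the same length.

```python
def find_gap_positions_sketch(seqs, max_gap, gap_chars=frozenset('-.')):
    """Return 0-based column indices whose gap frequency exceeds max_gap."""
    total = len(seqs)
    positions = []
    # zip(*seqs) iterates over alignment columns
    for i, column in enumerate(zip(*seqs)):
        gap_freq = sum(char in gap_chars for char in column) / total
        if gap_freq > max_gap:
            positions.append(i)
    return positions


aligned = ['AC-GT', 'A--GT', 'ACTG.']
print(find_gap_positions_sketch(aligned, max_gap=0.5))
```

The returned indices are the kind of input that deleteSeqPositions expects for its positions argument.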
-
presto.Sequence.frequencyConsensus(seq_list, min_freq=0.6, ignore_chars={'-', '.', 'N', 'n'})
Builds a consensus sequence from a set of sequences
- Parameters
seq_list – List of SeqRecord objects
min_freq – Frequency cutoff to assign a base
ignore_chars – Set of characters to exclude when building a consensus sequence
- Returns
Consensus SeqRecord object
- Return type
SeqRecord
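A frequency-based consensus could be sketched as below, using plain aligned strings. The name frequency_consensus_sketch is hypothetical. Two choices are assumptions rather than documented behavior: the frequency is computed over all sequences in the column (including ignored characters), and positions below the cutoff are written as N.

```python
from collections import Counter


def frequency_consensus_sketch(seqs, min_freq=0.6, ignore_chars=frozenset('-.Nn')):
    """Build a consensus string from equal-length aligned strings."""
    consensus = []
    for column in zip(*seqs):
        counts = Counter(char for char in column if char not in ignore_chars)
        if not counts:
            # Only ignored characters in this column
            consensus.append('N')
            continue
        char, count = counts.most_common(1)[0]
        # Assumption: frequency is relative to the full column depth
        consensus.append(char if count / len(column) >= min_freq else 'N')
    return ''.join(consensus)


reads = ['ACGT', 'ACGA', 'ACTA', 'NCGA']
print(frequency_consensus_sketch(reads))
```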
-
presto.Sequence.getAAScoreDict(mask_score=None, gap_score=None)
Generates a score dictionary
- Parameters
mask_score – Tuple of length two defining scores for all matches against an X character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
gap_score – Tuple of length two defining score for all matches against a [-, .] character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
- Returns
Score dictionary with keys (char1, char2) mapping to scores
- Return type
-
presto.Sequence.getDNAScoreDict(mask_score=None, gap_score=None)
Generates a score dictionary
- Parameters
mask_score – Tuple of length two defining scores for all matches against an N character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
gap_score – Tuple of length two defining score for all matches against a [-, .] character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
- Returns
Score dictionary with keys (char1, char2) mapping to scores
- Return type
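The precedence rule for mask_score and gap_score can be made concrete with a small sketch. This is an illustrative reconstruction from the parameter descriptions, not the library source: the function names are hypothetical, and scores are assumed to be 1 for a match and 0 for a mismatch.

```python
from itertools import product

# IUPAC ambiguous nucleotide codes and the bases each one represents
IUPAC_DNA = {
    'A': 'A', 'C': 'C', 'G': 'G', 'T': 'T',
    'R': 'AG', 'Y': 'CT', 'S': 'CG', 'W': 'AT', 'K': 'GT', 'M': 'AC',
    'B': 'CGT', 'D': 'AGT', 'H': 'ACT', 'V': 'ACG', 'N': 'ACGT',
}
GAPS = '-.'


def score_dna_sketch(a, b, mask_score=None, gap_score=None):
    """Score one character pair; rules for character a are checked before b."""
    if a == 'N' and mask_score is not None:
        return mask_score[0]
    if a in GAPS and gap_score is not None:
        return gap_score[0]
    if b == 'N' and mask_score is not None:
        return mask_score[1]
    if b in GAPS and gap_score is not None:
        return gap_score[1]
    # Symmetric scoring by character identity
    if a in GAPS or b in GAPS:
        return int(a in GAPS and b in GAPS)
    return int(bool(set(IUPAC_DNA[a]) & set(IUPAC_DNA[b])))


def dna_score_dict_sketch(mask_score=None, gap_score=None):
    """Build {(char1, char2): score} for every pair of supported characters."""
    chars = list(IUPAC_DNA) + list(GAPS)
    return {
        (a, b): score_dna_sketch(a, b, mask_score, gap_score)
        for a, b in product(chars, repeat=2)
    }


masked = dna_score_dict_sketch(mask_score=(1, 1), gap_score=(0, 0))
print(masked[('N', '-')], masked[('-', 'N')])
```

In the last line the two pairs score differently because the rule for the first character wins: N in position a yields the mask score, while a gap in position a yields the gap score.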
-
presto.Sequence.indexSeqSets(seq_dict, field='BARCODE', delimiter=('|', '=', ','))
Identifies sets of sequences with the same ID field
- Parameters
seq_dict – a dictionary index of sequences returned from SeqIO.index()
field – the annotation field containing set IDs
delimiter – a tuple of delimiters for (fields, values, value lists)
- Returns
Dictionary mapping set name to a list of record names
- Return type
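Grouping records by an annotation field might look like the following sketch. It assumes headers of the form ID|FIELD=VALUE, which matches the default delimiter tuple but is otherwise an assumption, and it takes a plain list of header strings instead of a SeqIO.index() dictionary. Both function names are hypothetical.

```python
from collections import defaultdict


def parse_annotation_sketch(header, delimiter=('|', '=', ',')):
    """Split 'ID|FIELD=VALUE|...' into a dictionary (assumed header layout)."""
    fields = header.split(delimiter[0])
    annotation = {'ID': fields[0]}
    for item in fields[1:]:
        key, _, value = item.partition(delimiter[1])
        annotation[key] = value
    return annotation


def index_seq_sets_sketch(headers, field='BARCODE', delimiter=('|', '=', ',')):
    """Map each value of `field` to the list of record names carrying it."""
    sets = defaultdict(list)
    for header in headers:
        annotation = parse_annotation_sketch(header, delimiter)
        # Records lacking the field are skipped in this sketch
        if field in annotation:
            sets[annotation[field]].append(header)
    return dict(sets)


names = ['seq1|BARCODE=AAAC', 'seq2|BARCODE=GGTT', 'seq3|BARCODE=AAAC']
print(index_seq_sets_sketch(names))
```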
-
presto.Sequence.joinAssembly(head_seq, tail_seq, gap=0, insert_seq=None)
Concatenates two sequences
- Parameters
head_seq – the head SeqRecord.
tail_seq – the tail SeqRecord.
gap – number of gap characters to insert between head and tail; ignored if insert_seq is not None.
insert_seq – a string or Bio.Seq.Seq object to insert between the head and tail; if None, fill the gap with N characters.
- Returns
assembled sequence object.
- Return type
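The interaction between gap and insert_seq can be sketched on plain strings. The name join_sketch is hypothetical, and the sketch covers only the spacer logic described in the parameters; any orientation or quality score handling done by the library is out of scope.

```python
def join_sketch(head, tail, gap=0, insert_seq=None):
    """Concatenate head and tail with either insert_seq or `gap` N characters between them."""
    # insert_seq takes priority; gap is ignored when it is provided
    spacer = insert_seq if insert_seq is not None else 'N' * gap
    return head + spacer + tail


print(join_sketch('ACGT', 'TTAA', gap=3))
print(join_sketch('ACGT', 'TTAA', gap=3, insert_seq='CC'))
```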
-
presto.Sequence.localAlignment(seq_record, primers, primers_regex=None, max_error=0.3, max_len=1000, rev_primer=False, skip_rc=False, gap_penalty=(1, 1), score_dict=getDNAScoreDict(mask_score=(0, 1), gap_score=(0, 0)))
Performs pairwise local alignment of a list of short sequences against a long sequence
- Parameters
seq_record – a SeqRecord object to align primers against
primers – dictionary of {names: short IUPAC ambiguous sequence strings}
primers_regex – optional dictionary of {names: compiled primer regular expressions}
max_error – maximum acceptable error rate before aligning reverse complement
max_len – maximum length of sample sequence to align
rev_primer – if True align with the tail end of the sequence
skip_rc – if True do not check reverse complement sequences
gap_penalty – a tuple of positive (gap open, gap extend) penalties
score_dict – optional dictionary of alignment scores as {(char1, char2): score}
- Returns
primer alignment result object
- Return type
-
presto.Sequence.maskQuality(data, min_qual=0)
Masks characters in a sequence by quality score
-
presto.Sequence.maskSeq(align, mode='mask', barcode=False, barcode_field='BARCODE', primer_field='PRIMER', delimiter=('|', '=', ','))
Create an output sequence with primers masked or cut
- Parameters
align – a PrimerAlignment object.
mode – defines the action taken; one of ‘cut’, ‘mask’, ‘tag’ or ‘trim’.
barcode – if True add sequence preceding primer to description.
barcode_field – name of the output barcode annotation.
primer_field – name of the output primer annotation.
delimiter – a tuple of delimiters for (annotations, field/values, value lists).
- Returns
masked sequence.
- Return type
Bio.SeqRecord.SeqRecord
-
presto.Sequence.overlapConsensus(head_seq, tail_seq, ignore_chars={'-', '.', 'N', 'n'})
Creates a consensus overlap sequence from two segments
- Parameters
head_seq – the overlap head SeqRecord.
tail_seq – the overlap tail SeqRecord.
ignore_chars – list of characters which do not contribute to consensus.
- Returns
A SeqRecord object with consensus characters and quality scores.
- Return type
SeqRecord
-
presto.Sequence.qualityConsensus(seq_list, min_qual=0, min_freq=0.6, dependent=False, ignore_chars={'-', '.', 'N', 'n'})
Builds a consensus sequence from a set of sequences
- Parameters
seq_list – List of SeqRecord objects
min_qual – Quality cutoff to assign a base
min_freq – Frequency cutoff to assign a base
dependent – If False assume sequences are independent for quality calculation
ignore_chars – Set of characters to exclude when building a consensus sequence
- Returns
Consensus SeqRecord object
- Return type
SeqRecord
-
presto.Sequence.referenceAssembly(head_seq, tail_seq, ref_dict, ref_db, min_ident=0.5, evalue=1e-05, max_hits=100, fill=False, aligner='usearch', aligner_exec='usearch', score_dict=getDNAScoreDict(mask_score=(1, 1), gap_score=(0, 0)))
Stitches two sequences together by aligning against a reference database
- Parameters
head_seq – the head SeqRecord.
tail_seq – the tail SeqRecord.
ref_dict – a dictionary of reference SeqRecord objects.
ref_db – the path and name of the reference database.
min_ident – the minimum identity for a valid assembly.
evalue – the E-value cut-off for ublast.
max_hits – the maxhits output limit for ublast.
fill – if False non-overlapping regions will be assigned Ns; if True non-overlapping regions will be filled with the reference sequence.
aligner – the alignment tool; one of ‘blastn’ or ‘usearch’.
aligner_exec – the path to the alignment tool executable.
score_dict – optional dictionary of character scores in the form {(char1, char2): score}.
- Returns
assembled sequence object.
- Return type
-
presto.Sequence.reverseComplement(seq)
Takes the reverse complement of a sequence
- Parameters
seq – a SeqRecord object, Seq object or string to reverse complement
- Returns
Object of the same type as the input with the reverse complement sequence
- Return type
Seq
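For plain strings, a reverse complement that also handles IUPAC ambiguity codes can be sketched with a translation table. The name reverse_complement_sketch is hypothetical; the documented function also accepts SeqRecord and Seq objects, which this sketch does not.

```python
# Complement table for IUPAC nucleotide codes (upper and lower case)
_COMPLEMENT = str.maketrans(
    'ACGTRYSWKMBDHVNacgtryswkmbdhvn',
    'TGCAYRSWMKVHDBNtgcayrswmkvhdbn',
)


def reverse_complement_sketch(seq):
    """Complement each character, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement_sketch('ATGCRN'))
```

Characters outside the table, such as gap characters, pass through unchanged.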
-
presto.Sequence.scoreAA(a, b, mask_score=None, gap_score=None)
Returns the score for a pair of IUPAC Extended Protein characters
- Parameters
a – First character
b – Second character
mask_score – Tuple of length two defining scores for all matches against an X character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
gap_score – Tuple of length two defining score for all matches against a gap (-, .) character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
- Returns
Score for the character pair
- Return type
-
presto.Sequence.scoreAlignment(seq_record, primers, start=0, rev_primer=False, score_dict=getDNAScoreDict(mask_score=(0, 1), gap_score=(0, 0)))
Performs a simple fixed position alignment of primers
- Parameters
seq_record – a SeqRecord object to align primers against
primers – dictionary of {names: short IUPAC ambiguous sequence strings}
start – position where primer alignment starts
rev_primer – if True align with the tail end of the sequence
score_dict – optional dictionary of {(char1, char2): score} alignment scores
- Returns
primer alignment result object
- Return type
-
presto.Sequence.scoreDNA(a, b, mask_score=None, gap_score=None)
Returns the score for a pair of IUPAC Ambiguous Nucleotide characters
- Parameters
a – First character
b – Second character
mask_score – Tuple of length two defining scores for all matches against an N character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
gap_score – Tuple of length two defining score for all matches against a gap (-, .) character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
- Returns
Score for the character pair
- Return type
-
presto.Sequence.scoreSeqPair(seq1, seq2, ignore_chars={}, score_dict=getDNAScoreDict())
Determine the error rate for a pair of sequences
- Parameters
seq1 – SeqRecord object
seq2 – SeqRecord object
ignore_chars – Set of characters to ignore when scoring and counting the weight
score_dict – Optional dictionary of alignment scores
- Returns
Tuple of the (score, minimum weight, error rate) for the pair of sequences
- Return type
Tuple
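The relationship between the three returned values can be sketched as follows. This is a minimal illustration on plain strings rather than SeqRecord objects, and it is not the library's code: `score_pair` and `toy_scores` are hypothetical names, and the sketch assumes that the weight is the smaller count of non-ignored characters and that the error rate is one minus score over weight.

```python
def score_pair(seq1, seq2, score_dict, ignore_chars=frozenset()):
    """Return (score, weight, error) for two position-matched strings."""
    # Weight (assumption): the smaller count of non-ignored characters
    weight = min(
        sum(1 for c in seq1 if c not in ignore_chars),
        sum(1 for c in seq2 if c not in ignore_chars),
    )
    # Score: sum of per-position scores, skipping any position
    # where either character is in ignore_chars
    score = sum(
        score_dict[(a, b)]
        for a, b in zip(seq1, seq2)
        if a not in ignore_chars and b not in ignore_chars
    )
    # Error rate (assumption): fraction of the weight that did not score
    error = 1.0 - score / weight if weight > 0 else 1.0
    return score, weight, error


# Toy score dictionary in the {(char1, char2): score} form
chars = "ACGTN"
toy_scores = {(a, b): int(a == b or "N" in (a, b)) for a in chars for b in chars}

result = score_pair("ACGTN", "ACCTA", toy_scores, ignore_chars={"N"})
print(result)
```

Here three of the four informative positions match, so the toy call yields a score of 3, a weight of 4, and an error rate of 0.25.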
-
presto.Sequence.
sequentialAssembly
(head_seq, tail_seq, ref_dict, ref_db, alpha=1e-05, max_error=0.3, min_len=8, max_len=1000, scan_reverse=False, min_ident=0.5, evalue=1e-05, max_hits=100, fill=False, aligner='usearch', aligner_exec='usearch', assembly_stats=None, score_dict={'-', '-': 0, '-', '.': 0, '-', 'A': 0, '-', 'B': 0, '-', 'C': 0, '-', 'D': 0, '-', 'G': 0, '-', 'H': 0, '-', 'K': 0, '-', 'M': 0, '-', 'N': 0, '-', 'R': 0, '-', 'S': 0, '-', 'T': 0, '-', 'V': 0, '-', 'W': 0, '-', 'Y': 0, '.', '-': 0, '.', '.': 0, '.', 'A': 0, '.', 'B': 0, '.', 'C': 0, '.', 'D': 0, '.', 'G': 0, '.', 'H': 0, '.', 'K': 0, '.', 'M': 0, '.', 'N': 0, '.', 'R': 0, '.', 'S': 0, '.', 'T': 0, '.', 'V': 0, '.', 'W': 0, '.', 'Y': 0, 'A', '-': 0, 'A', '.': 0, 'A', 'A': 1, 'A', 'B': 0, 'A', 'C': 0, 'A', 'D': 1, 'A', 'G': 0, 'A', 'H': 1, 'A', 'K': 0, 'A', 'M': 1, 'A', 'N': 1, 'A', 'R': 1, 'A', 'S': 0, 'A', 'T': 0, 'A', 'V': 1, 'A', 'W': 1, 'A', 'Y': 0, 'B', '-': 0, 'B', '.': 0, 'B', 'A': 0, 'B', 'B': 1, 'B', 'C': 1, 'B', 'D': 1, 'B', 'G': 1, 'B', 'H': 1, 'B', 'K': 1, 'B', 'M': 1, 'B', 'N': 1, 'B', 'R': 1, 'B', 'S': 1, 'B', 'T': 1, 'B', 'V': 1, 'B', 'W': 1, 'B', 'Y': 1, 'C', '-': 0, 'C', '.': 0, 'C', 'A': 0, 'C', 'B': 1, 'C', 'C': 1, 'C', 'D': 0, 'C', 'G': 0, 'C', 'H': 1, 'C', 'K': 0, 'C', 'M': 1, 'C', 'N': 1, 'C', 'R': 0, 'C', 'S': 1, 'C', 'T': 0, 'C', 'V': 1, 'C', 'W': 0, 'C', 'Y': 1, 'D', '-': 0, 'D', '.': 0, 'D', 'A': 1, 'D', 'B': 1, 'D', 'C': 0, 'D', 'D': 1, 'D', 'G': 1, 'D', 'H': 1, 'D', 'K': 1, 'D', 'M': 1, 'D', 'N': 1, 'D', 'R': 1, 'D', 'S': 1, 'D', 'T': 1, 'D', 'V': 1, 'D', 'W': 1, 'D', 'Y': 1, 'G', '-': 0, 'G', '.': 0, 'G', 'A': 0, 'G', 'B': 1, 'G', 'C': 0, 'G', 'D': 1, 'G', 'G': 1, 'G', 'H': 0, 'G', 'K': 1, 'G', 'M': 0, 'G', 'N': 1, 'G', 'R': 1, 'G', 'S': 1, 'G', 'T': 0, 'G', 'V': 1, 'G', 'W': 0, 'G', 'Y': 0, 'H', '-': 0, 'H', '.': 0, 'H', 'A': 1, 'H', 'B': 1, 'H', 'C': 1, 'H', 'D': 1, 'H', 'G': 0, 'H', 'H': 1, 'H', 'K': 1, 'H', 'M': 1, 'H', 'N': 1, 'H', 'R': 1, 'H', 'S': 1, 'H', 'T': 1, 'H', 'V': 1, 'H', 'W': 1, 
'H', 'Y': 1, 'K', '-': 0, 'K', '.': 0, 'K', 'A': 0, 'K', 'B': 1, 'K', 'C': 0, 'K', 'D': 1, 'K', 'G': 1, 'K', 'H': 1, 'K', 'K': 1, 'K', 'M': 0, 'K', 'N': 1, 'K', 'R': 1, 'K', 'S': 1, 'K', 'T': 1, 'K', 'V': 1, 'K', 'W': 1, 'K', 'Y': 1, 'M', '-': 0, 'M', '.': 0, 'M', 'A': 1, 'M', 'B': 1, 'M', 'C': 1, 'M', 'D': 1, 'M', 'G': 0, 'M', 'H': 1, 'M', 'K': 0, 'M', 'M': 1, 'M', 'N': 1, 'M', 'R': 1, 'M', 'S': 1, 'M', 'T': 0, 'M', 'V': 1, 'M', 'W': 1, 'M', 'Y': 1, 'N', '-': 1, 'N', '.': 1, 'N', 'A': 1, 'N', 'B': 1, 'N', 'C': 1, 'N', 'D': 1, 'N', 'G': 1, 'N', 'H': 1, 'N', 'K': 1, 'N', 'M': 1, 'N', 'N': 1, 'N', 'R': 1, 'N', 'S': 1, 'N', 'T': 1, 'N', 'V': 1, 'N', 'W': 1, 'N', 'Y': 1, 'R', '-': 0, 'R', '.': 0, 'R', 'A': 1, 'R', 'B': 1, 'R', 'C': 0, 'R', 'D': 1, 'R', 'G': 1, 'R', 'H': 1, 'R', 'K': 1, 'R', 'M': 1, 'R', 'N': 1, 'R', 'R': 1, 'R', 'S': 1, 'R', 'T': 0, 'R', 'V': 1, 'R', 'W': 1, 'R', 'Y': 0, 'S', '-': 0, 'S', '.': 0, 'S', 'A': 0, 'S', 'B': 1, 'S', 'C': 1, 'S', 'D': 1, 'S', 'G': 1, 'S', 'H': 1, 'S', 'K': 1, 'S', 'M': 1, 'S', 'N': 1, 'S', 'R': 1, 'S', 'S': 1, 'S', 'T': 0, 'S', 'V': 1, 'S', 'W': 0, 'S', 'Y': 1, 'T', '-': 0, 'T', '.': 0, 'T', 'A': 0, 'T', 'B': 1, 'T', 'C': 0, 'T', 'D': 1, 'T', 'G': 0, 'T', 'H': 1, 'T', 'K': 1, 'T', 'M': 0, 'T', 'N': 1, 'T', 'R': 0, 'T', 'S': 0, 'T', 'T': 1, 'T', 'V': 0, 'T', 'W': 1, 'T', 'Y': 1, 'V', '-': 0, 'V', '.': 0, 'V', 'A': 1, 'V', 'B': 1, 'V', 'C': 1, 'V', 'D': 1, 'V', 'G': 1, 'V', 'H': 1, 'V', 'K': 1, 'V', 'M': 1, 'V', 'N': 1, 'V', 'R': 1, 'V', 'S': 1, 'V', 'T': 0, 'V', 'V': 1, 'V', 'W': 1, 'V', 'Y': 1, 'W', '-': 0, 'W', '.': 0, 'W', 'A': 1, 'W', 'B': 1, 'W', 'C': 0, 'W', 'D': 1, 'W', 'G': 0, 'W', 'H': 1, 'W', 'K': 1, 'W', 'M': 1, 'W', 'N': 1, 'W', 'R': 1, 'W', 'S': 0, 'W', 'T': 1, 'W', 'V': 1, 'W', 'W': 1, 'W', 'Y': 1, 'Y', '-': 0, 'Y', '.': 0, 'Y', 'A': 0, 'Y', 'B': 1, 'Y', 'C': 1, 'Y', 'D': 1, 'Y', 'G': 0, 'Y', 'H': 1, 'Y', 'K': 1, 'Y', 'M': 1, 'Y', 'N': 1, 'Y', 'R': 0, 'Y', 'S': 1, 'Y', 'T': 1, 'Y', 'V': 1, 'Y', 'W': 1, 'Y', 'Y': 
1})¶ Stitches sequences together by first attempting de novo assembly, then falling back to reference-guided assembly if that fails
- Parameters
head_seq – the head SeqRecord
tail_seq – the tail SeqRecord
ref_dict – a dictionary of reference SeqRecord objects
ref_db – the path and name of the reference database
alpha – the significance threshold (p-value cut-off) for a valid de novo assembly
max_error – the maximum error rate for a valid de novo assembly
min_len – minimum length of overlap to test for de novo assembly
max_len – maximum length of overlap to test for de novo assembly
scan_reverse – if True, allow the head sequence to overhang the end of the tail sequence in de novo assembly; if False, end the alignment scan at the end of the tail sequence or the start of the head sequence
min_ident – the minimum identity for a valid reference guided assembly
evalue – the E-value cut-off for reference guided assembly
max_hits – the maxhits output limit for reference guided assembly
fill – if False non-overlapping regions will be assigned Ns in reference guided assembly; if True non-overlapping regions will be filled with the reference sequence.
aligner – the alignment tool; one of ‘blastn’ or ‘usearch’
aligner_exec – the path to the alignment tool executable
assembly_stats – optional successes-by-trials numpy.array of p-values
score_dict – optional dictionary of character scores in the form {(char1, char2): score}.
- Returns
assembled sequence object.
- Return type
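The fallback order described above can be sketched in a few lines. This is a hedged illustration of the control flow only: `Result`, `sequential`, `toy_de_novo` and `toy_reference` are hypothetical stand-ins, the de novo step is reduced to an exact suffix/prefix match, and the reference step simply joins the reads with a fixed run of Ns instead of aligning against a database.

```python
from typing import NamedTuple, Optional


class Result(NamedTuple):
    sequence: Optional[str]
    valid: bool
    method: str


def toy_de_novo(head, tail, min_len=4):
    # Stand-in for de novo assembly: longest exact overlap between the
    # end of head and the start of tail, at least min_len characters
    for n in range(min(len(head), len(tail)), min_len - 1, -1):
        if head[-n:] == tail[:n]:
            return Result(head + tail[n:], True, "de novo")
    return Result(None, False, "de novo")


def toy_reference(head, tail):
    # Stand-in for reference-guided assembly with fill=False:
    # the unknown gap is written as Ns (fixed length here, by assumption)
    return Result(head + "NNN" + tail, True, "reference")


def sequential(head, tail, de_novo=toy_de_novo, reference=toy_reference):
    # Try de novo first; only consult the reference if that is not valid
    first = de_novo(head, tail)
    if first.valid:
        return first
    return reference(head, tail)


overlapping = sequential("AAACGTA", "CGTAGGG")
disjoint = sequential("AAAA", "CCCC")
print(overlapping.method, overlapping.sequence)
print(disjoint.method, disjoint.sequence)
```

Keeping the two strategies as interchangeable callables makes the ordering explicit: the reference step never runs when the de novo result is already valid.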
-
presto.Sequence.
subsetSeqIndex
(seq_dict, field, values, delimiter=('|', '=', ','))¶ Subsets a sequence set by annotation value
- Parameters
seq_dict – Dictionary index of sequences returned from SeqIO.index()
field – Annotation field to select keys by
values – List of annotation values that define the retained keys
delimiter – Tuple of delimiters for (annotations, field/values, value lists)
- Returns
List of keys
- Return type
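As a rough illustration of how the three delimiters divide a header, the sketch below parses annotation strings of the form `ID|FIELD=VALUE|...` and keeps the keys whose field matches one of the requested values. `parse_header` and `subset_keys` are hypothetical helpers written for this example, the headers are made up, and matching on the whole field string is an assumption.

```python
def parse_header(header, delimiter=("|", "=", ",")):
    """Split a header into a dict of annotation fields."""
    ann_sep, field_sep, _list_sep = delimiter
    parts = header.split(ann_sep)
    # The first element is treated as the sequence identifier
    fields = {"ID": parts[0]}
    for part in parts[1:]:
        name, _, value = part.partition(field_sep)
        fields[name] = value
    return fields


def subset_keys(keys, field, values, delimiter=("|", "=", ",")):
    """Return the keys whose annotation field is one of values."""
    wanted = set(values)
    return [
        key
        for key in keys
        if parse_header(key, delimiter).get(field) in wanted
    ]


# Made-up headers standing in for the keys of a sequence index
headers = [
    "SEQ1|PRIMER=IGHG|DUPCOUNT=3",
    "SEQ2|PRIMER=IGHM|DUPCOUNT=1",
    "SEQ3|PRIMER=IGHG|DUPCOUNT=7",
]
kept = subset_keys(headers, "PRIMER", ["IGHG"])
print(kept)
```

The third delimiter (the value-list separator) is unpacked but not used here, because the sketch compares the field as a single string.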
-
presto.Sequence.
subsetSeqSet
(seq_iter, field, values, delimiter=('|', '=', ','))¶ Subsets a sequence set by annotation value
- Parameters
seq_iter – Iterator or list of SeqRecord objects
field – Annotation field to select by
values – List of annotation values that define the retained sequences
delimiter – Tuple of delimiters for (annotations, field/values, value lists)
- Returns
Modified list of SeqRecord objects
- Return type
-
presto.Sequence.
translateAmbigDNA
(key)¶ Translates IUPAC Ambiguous Nucleotide characters to or from character sets
- Parameters
key – String or re.search object containing the character set to translate
- Returns
Character translation
- Return type
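A two-way lookup of this kind can be sketched with the standard IUPAC nucleotide codes. The function below is a hypothetical `translate_ambig`, not the library function: it accepts either a string or a match object so that it can be passed directly to `re.sub`, and the exact character sets used by the library (for example for N) may differ from the ones assumed here.

```python
import re

# Standard IUPAC ambiguity codes and the bases they stand for
IUPAC = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Code -> regex character set, and the reverse lookup
TO_SET = {code: "[" + bases + "]" for code, bases in IUPAC.items()}
FROM_SET = {char_set: code for code, char_set in TO_SET.items()}


def translate_ambig(key):
    """Translate an ambiguity code to a character set, or back."""
    # re.sub passes a match object; a plain string is used as-is
    token = key if isinstance(key, str) else key.group(0)
    if token in TO_SET:
        return TO_SET[token]
    return FROM_SET[token]


# Expand an ambiguous primer into a regular expression, then collapse it again
pattern = re.sub(r"[RYSWKMBDHVN]", translate_ambig, "GGRTCY")
restored = re.sub(r"\[[ACGT]+\]", translate_ambig, pattern)
print(pattern, restored)
```

Accepting a match object is what allows the same function to serve as the replacement callback in both directions.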
-
presto.Sequence.
trimQuality
(data, min_qual=0, window=10, reverse=False)¶ Cuts sequences using a moving mean quality score
- Parameters
- Returns
SeqResult object.
- Return type
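The idea of cutting on a moving mean quality score can be sketched as below. This is an assumption-laden illustration rather than the library's algorithm: `trim_by_mean_quality` is a hypothetical function operating on a plain string and a list of Phred scores, the window advances one position at a time, and the sequence is cut at the start of the first window whose mean falls below `min_qual`.

```python
def trim_by_mean_quality(seq, quals, min_qual=20, window=10, reverse=False):
    """Cut seq at the first window whose mean quality is below min_qual."""
    if not quals:
        return seq
    # With reverse=True, scan from the other end of the read
    if reverse:
        seq, quals = seq[::-1], quals[::-1]
    end = len(seq)
    # Slide the window one position at a time (assumed step size)
    last_start = max(1, len(quals) - window + 1)
    for start in range(last_start):
        chunk = quals[start:start + window]
        if sum(chunk) / len(chunk) < min_qual:
            end = start
            break
    trimmed = seq[:end]
    # Restore the original orientation before returning
    return trimmed[::-1] if reverse else trimmed


read = "ACGTACGTAC"
tail_low = [30, 30, 30, 30, 30, 30, 10, 10, 10, 10]
head_low = tail_low[::-1]

forward = trim_by_mean_quality(read, tail_low, min_qual=20, window=4)
backward = trim_by_mean_quality(read, head_low, min_qual=20, window=4, reverse=True)
print(forward, backward)
```

In the forward call the first window with a mean below 20 starts at position 5, so the read is cut to its first five characters; the reversed call keeps the last five instead.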