presto.Sequence

Sequence processing functions

class presto.Sequence.PrimerAlignment(seq=None)

Bases: object

A class defining a primer alignment result

__bool__()

Boolean evaluation of the alignment

Returns: evaluates to the value of the valid attribute
Return type: int

__len__()

Length of alignment

Returns: length of align_seq attribute
Return type: int

presto.Sequence.alignPrimers(seq_record, primers, primers_regex=None, max_error=0.2, max_len=50, rev_primer=False, skip_rc=False, gap_penalty=(1, 1), score_dict=getDNAScoreDict(mask_score=(0, 1), gap_score=(0, 0)))

Performs pairwise local alignment of a list of short sequences against a long sequence

Parameters:
  • seq_record – a SeqRecord object to align primers against
  • primers – dictionary of {names: short IUPAC ambiguous sequence strings}
  • primers_regex – optional dictionary of {names: compiled primer regular expressions}
  • max_error – maximum acceptable error rate before aligning reverse complement
  • max_len – maximum length of sample sequence to align
  • rev_primer – if True align with the tail end of the sequence
  • skip_rc – if True do not check reverse complement sequences
  • gap_penalty – a tuple of positive (gap open, gap extend) penalties
  • score_dict – optional dictionary of alignment scores as {(char1, char2): score}
Returns:

primer alignment result object

Return type:

presto.Sequence.PrimerAlignment
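The primers_regex argument expects compiled patterns keyed by primer name. The following is a minimal sketch of how such a dictionary could be built from IUPAC primer strings using only the standard library. The helper name compile_primers_sketch and the choice to let N also match a literal N are assumptions made for illustration, not presto's own implementation.

```python
import re

# Standard IUPAC ambiguity codes mapped to regex character classes.
# Assumption: N matches any base and also a literal N in the read.
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGTN]",
}

def compile_primers_sketch(primers):
    """Hypothetical helper: turn {name: IUPAC string} into {name: compiled regex}."""
    return {
        name: re.compile("".join(IUPAC_REGEX[base] for base in seq.upper()))
        for name, seq in primers.items()
    }

primers = {"P1": "ACRT", "P2": "GGNC"}
primers_regex = compile_primers_sketch(primers)

# R stands for A or G, so ACGT and ACAT match P1 but ACCT does not.
print(bool(primers_regex["P1"].match("ACGT")))
print(bool(primers_regex["P1"].match("ACCT")))
```

A regular expression of this kind can only find exact (ambiguity-aware) matches; tolerating mismatches or gaps still requires an alignment step.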

presto.Sequence.calculateDiversity(seq_list, score_dict=getDNAScoreDict())

Determine the average pairwise error rate for a list of sequences

Parameters:
  • seq_list – List of SeqRecord objects to score
  • score_dict – Optional dictionary of alignment scores as {(char1, char2): score}
Returns:

Average pairwise error rate for the list of sequences

Return type:

float
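As an illustration of what an average pairwise error rate means, here is a minimal sketch on plain strings. It assumes simple character identity as the score (no IUPAC ambiguity handling) and uses hypothetical helper names; it is not the library's code.

```python
from itertools import combinations

def pair_error(seq1, seq2):
    """Fraction of mismatched positions over the shared length of two strings."""
    n = min(len(seq1), len(seq2))
    if n == 0:
        return 1.0  # assumption: an empty sequence counts as a complete mismatch
    matches = sum(a == b for a, b in zip(seq1, seq2))
    return 1.0 - matches / n

def average_pairwise_error(seqs):
    """Mean error rate over all unordered pairs; None with fewer than two sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return None
    return sum(pair_error(a, b) for a, b in pairs) / len(pairs)

seqs = ["ACGT", "ACGA", "ACGT"]
diversity = average_pairwise_error(seqs)
print(diversity)
```

With three sequences there are three pairs; two of them differ at one of four positions, so the mean is (0.25 + 0 + 0.25) / 3.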

presto.Sequence.calculateSetError(seq_list, ref_seq, ignore_chars=['n', 'N'], score_dict=getDNAScoreDict())

Counts the occurrence of nucleotide mismatches from a reference in a set of sequences

Parameters:
  • seq_list – list of SeqRecord objects with aligned sequences.
  • ref_seq – SeqRecord object containing the reference sequence to match against.
  • ignore_chars – list of characters to exclude from mismatch counts.
  • score_dict – optional dictionary of alignment scores as {(char1, char2): score}.
Returns:

error rate for the set.

Return type:

float
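The counting described above can be sketched on plain strings as follows. The helper name set_error, the use of exact character identity instead of a score dictionary, and the return value of 0.0 when no position can be scored are all assumptions for illustration.

```python
def set_error(seqs, ref, ignore_chars=("n", "N")):
    """Mismatch rate of aligned strings against a reference string."""
    scored = 0      # positions that were actually compared
    mismatches = 0  # compared positions that differ from the reference
    for seq in seqs:
        for a, b in zip(seq, ref):
            # Skip a position if either character is in the ignore list.
            if a in ignore_chars or b in ignore_chars:
                continue
            scored += 1
            mismatches += a != b
    # Assumption: report 0.0 when nothing could be compared.
    return mismatches / scored if scored else 0.0

seqs = ["ACGT", "ACGA", "NCGT"]
error = set_error(seqs, "ACGT")
print(error)
```

Here 11 positions are scored (the leading N is skipped) and one of them mismatches, giving an error rate of 1/11.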

presto.Sequence.checkSeqEqual(seq1, seq2, ignore_chars={'n', '-', 'N', '.'})

Determine if two sequences are equal, excluding missing positions

Parameters:
  • seq1 – SeqRecord object
  • seq2 – SeqRecord object
  • ignore_chars – Set of characters to ignore
Returns:

True if the sequences are equal

Return type:

bool
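A minimal sketch of equality that skips missing positions, written for plain strings rather than SeqRecord objects. The helper name seqs_equal is hypothetical, and comparing only over the shared length is an assumption.

```python
def seqs_equal(seq1, seq2, ignore_chars=frozenset("nN-.")):
    """True if every position matches or is missing in at least one sequence."""
    return all(
        a == b or a in ignore_chars or b in ignore_chars
        for a, b in zip(seq1, seq2)  # assumption: extra trailing characters are not compared
    )

print(seqs_equal("ACGT", "ACNT"))
print(seqs_equal("ACGT", "ACTT"))
```

The first call treats the N as uninformative and reports equality; the second finds a real G/T difference.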

presto.Sequence.compilePrimers(primers)

Translates IUPAC Ambiguous Nucleotide characters to regular expressions and compiles them

Parameters: primers – Dictionary of sequences to translate
Returns: Dictionary of compiled regular expressions
Return type: dict

presto.Sequence.deleteSeqPositions(seq, positions)

Deletes a list of positions from a SeqRecord

Parameters:
  • seq – SeqRecord object
  • positions – Set of positions (indices) to delete
Returns:

Modified SeqRecord with the specified positions removed

Return type:

SeqRecord

presto.Sequence.findGapPositions(seq_list, max_gap, gap_chars={'-', '.'})

Finds positions in a set of aligned sequences with a high number of gap characters.

Parameters:
  • seq_list – List of SeqRecord objects with aligned sequences
  • max_gap – Float of the maximum gap frequency to consider a position as non-gapped
  • gap_chars – Set of characters to consider as gaps
Returns:

Positions (indices) with gap frequency greater than max_gap

Return type:

list
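The column-wise gap test can be sketched on plain aligned strings as shown here, together with the removal of the flagged columns. Both helper names are hypothetical and the code is an illustration under those assumptions, not presto's implementation.

```python
def find_gap_positions(seqs, max_gap, gap_chars=frozenset("-.")):
    """Indices of alignment columns whose gap frequency exceeds max_gap."""
    positions = []
    for i, column in enumerate(zip(*seqs)):
        gap_freq = sum(c in gap_chars for c in column) / len(column)
        if gap_freq > max_gap:
            positions.append(i)
    return positions

def delete_positions(seq, positions):
    """Return seq with the given indices removed."""
    drop = set(positions)
    return "".join(c for i, c in enumerate(seq) if i not in drop)

aligned = ["AC-GT", "A--GT", "ACTGT"]
gapped = find_gap_positions(aligned, max_gap=0.5)
trimmed = [delete_positions(s, gapped) for s in aligned]
print(gapped, trimmed)
```

Only the third column is flagged: two of its three characters are gaps (2/3 > 0.5), whereas the second column has a gap frequency of 1/3.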

presto.Sequence.frequencyConsensus(seq_list, min_freq=0.6, ignore_chars={'n', '-', 'N', '.'})

Builds a consensus sequence from a set of sequences

Parameters:
  • seq_list – List of SeqRecord objects
  • min_freq – Frequency cutoff to assign a base
  • ignore_chars – Set of characters to exclude when building a consensus sequence
Returns:

Consensus SeqRecord object

Return type:

SeqRecord
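One plausible reading of a frequency-based consensus is sketched below for plain aligned strings. The helper name, the use of N for unassigned positions, and the choice to divide by the full column height (ignored characters included) are assumptions, not confirmed details of the library.

```python
from collections import Counter

def frequency_consensus(seqs, min_freq=0.6, ignore_chars=frozenset("nN-.")):
    """Per-column majority base, or N when no base reaches min_freq."""
    consensus = []
    for column in zip(*seqs):
        counts = Counter(c for c in column if c not in ignore_chars)
        if not counts:
            consensus.append("N")  # nothing informative in this column
            continue
        base, count = counts.most_common(1)[0]
        # Assumption: the denominator is the full column height.
        consensus.append(base if count / len(column) >= min_freq else "N")
    return "".join(consensus)

seqs = ["ACGT", "ACGA", "ACTA", "NCGT"]
print(frequency_consensus(seqs))
```

In the last column T and A each occur in half of the sequences, which is below the 0.6 cutoff, so that position is left as N.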

presto.Sequence.getAAScoreDict(mask_score=None, gap_score=None)

Generates a score dictionary for amino acid characters

Parameters:
  • mask_score – Tuple of length two defining scores for all matches against an X character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
  • gap_score – Tuple of length two defining score for all matches against a [-, .] character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
Returns:

Score dictionary with keys (char1, char2) mapping to scores

Return type:

dict

presto.Sequence.getDNAScoreDict(mask_score=None, gap_score=None)

Generates a score dictionary for DNA characters

Parameters:
  • mask_score – Tuple of length two defining scores for all matches against an N character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
  • gap_score – Tuple of length two defining score for all matches against a [-, .] character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
Returns:

Score dictionary with keys (char1, char2) mapping to scores

Return type:

dict
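To make the mask_score and gap_score semantics concrete, here is a hedged sketch that builds a {(char1, char2): score} dictionary for IUPAC nucleotide codes. It assumes that two codes match when their base sets overlap, that gap characters only match each other by default, and that overrides for character a are checked before those for character b. The names score_pair and dna_score_dict are hypothetical.

```python
from itertools import product

# Bases represented by each IUPAC code; both gap characters map to one gap symbol.
IUPAC_SETS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    "-": "-", ".": "-",
}

def score_pair(a, b, mask_score=None, gap_score=None):
    """Score one character pair; overrides for a take precedence over b."""
    if gap_score is not None and a in "-.":
        return gap_score[0]
    if mask_score is not None and a == "N":
        return mask_score[0]
    if gap_score is not None and b in "-.":
        return gap_score[1]
    if mask_score is not None and b == "N":
        return mask_score[1]
    # Symmetric default: 1 when the two codes share at least one base.
    return int(bool(set(IUPAC_SETS[a]) & set(IUPAC_SETS[b])))

def dna_score_dict(mask_score=None, gap_score=None):
    """Build the full {(a, b): score} table over all code pairs."""
    return {(a, b): score_pair(a, b, mask_score, gap_score)
            for a, b in product(IUPAC_SETS, repeat=2)}

default_scores = dna_score_dict()
primer_scores = dna_score_dict(mask_score=(0, 1), gap_score=(0, 0))
print(default_scores[("N", "A")], primer_scores[("N", "A")], primer_scores[("A", "N")])
```

With mask_score=(0, 1), an N in the first sequence never counts as a match, while an N in the second sequence always does; this asymmetry is useful when one side is a read and the other is a primer.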

presto.Sequence.indexSeqSets(seq_dict, field='BARCODE', delimiter=('|', '=', ', '))

Identifies sets of sequences with the same ID field

Parameters:
  • seq_dict – a dictionary index of sequences returned from SeqIO.index()
  • field – the annotation field containing set IDs
  • delimiter – a tuple of delimiters for (fields, values, value lists)
Returns:

Dictionary mapping set name to a list of record names

Return type:

dict
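Grouping records by an annotation field can be sketched as follows. The example assumes headers of the form ID|FIELD=value|FIELD=value and works on a plain dictionary of header strings rather than a SeqIO index; parse_header and index_sets are hypothetical names, and the third delimiter (value lists) is not used.

```python
from collections import defaultdict

def parse_header(header, delimiter=("|", "=", ",")):
    """Split 'ID|KEY=value|...' into a dictionary of annotations."""
    fields = header.split(delimiter[0])
    annotation = {"ID": fields[0]}
    for field in fields[1:]:
        key, _, value = field.partition(delimiter[1])
        annotation[key] = value
    return annotation

def index_sets(headers, field="BARCODE", delimiter=("|", "=", ",")):
    """Map each value of `field` to the list of record names carrying it."""
    sets = defaultdict(list)
    for name, header in headers.items():
        value = parse_header(header, delimiter).get(field)
        if value is not None:  # records without the field are left out
            sets[value].append(name)
    return dict(sets)

headers = {
    "r1": "r1|BARCODE=AAAA|PRIMER=P1",
    "r2": "r2|BARCODE=CCCC|PRIMER=P1",
    "r3": "r3|BARCODE=AAAA|PRIMER=P2",
}
barcode_sets = index_sets(headers)
print(barcode_sets)
```

Storing only record names per set keeps the result small; the sequences themselves can then be fetched from the index on demand.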

presto.Sequence.maskSeq(align, mode='mask', barcode=False, delimiter=('|', '=', ', '))

Create an output sequence with primers masked or cut

Parameters:
  • align – a PrimerAlignment object returned from alignPrimers or scorePrimers
  • mode – defines the action taken; one of ['cut', 'mask', 'tag', 'trim']
  • barcode – if True add sequence preceding primer to description
  • delimiter – a tuple of delimiters for (annotations, field/values, value lists)
Returns:

masked sequence.

Return type:

Bio.SeqRecord.SeqRecord

presto.Sequence.qualityConsensus(seq_list, min_qual=20, min_freq=0.6, dependent=False, ignore_chars={'n', '-', 'N', '.'})

Builds a consensus sequence from a set of sequences

Parameters:
  • seq_list – List of SeqRecord objects
  • min_qual – Quality cutoff to assign a base
  • min_freq – Frequency cutoff to assign a base
  • dependent – If False assume sequences are independent for quality calculation
  • ignore_chars – Set of characters to exclude when building a consensus sequence
Returns:

Consensus SeqRecord object

Return type:

SeqRecord

presto.Sequence.reverseComplement(seq)

Takes the reverse complement of a sequence

Parameters: seq – a SeqRecord object, Seq object or string to reverse complement
Returns: Object of the same type as the input with the reverse complement sequence
Return type: Seq

presto.Sequence.scoreAA(a, b, mask_score=None, gap_score=None)

Returns the score for a pair of IUPAC Extended Protein characters

Parameters:
  • a – First character
  • b – Second character
  • mask_score – Tuple of length two defining scores for all matches against an X character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
  • gap_score – Tuple of length two defining score for all matches against a gap (-, .) character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
Returns:

Score for the character pair

Return type:

int

presto.Sequence.scoreDNA(a, b, mask_score=None, gap_score=None)

Returns the score for a pair of IUPAC Ambiguous Nucleotide characters

Parameters:
  • a – First character
  • b – Second character
  • mask_score – Tuple of length two defining scores for all matches against an N character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
  • gap_score – Tuple of length two defining score for all matches against a gap (-, .) character for (a, b), with the score for character (a) taking precedence; if None score symmetrically according to IUPAC character identity
Returns:

Score for the character pair

Return type:

int

presto.Sequence.scorePrimers(seq_record, primers, start=0, rev_primer=False, score_dict=getDNAScoreDict(mask_score=(0, 1), gap_score=(0, 0)))

Performs a simple fixed position alignment of primers

Parameters:
  • seq_record – a SeqRecord object to align primers against
  • primers – dictionary of {names: short IUPAC ambiguous sequence strings}
  • start – position where primer alignment starts
  • rev_primer – if True align with the tail end of the sequence
  • score_dict – optional dictionary of {(char1, char2): score} alignment scores
Returns:

primer alignment result object

Return type:

presto.Sequence.PrimerAlignment
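A fixed position alignment compares each primer with the window of the sequence that begins at the start offset and keeps the best-scoring primer. The sketch below illustrates that idea on plain strings. The AlignmentSketch class, the max_error threshold used to set its valid flag, and the use of exact character matching are assumptions made for illustration only; they are not the library's PrimerAlignment class or scoring.

```python
from dataclasses import dataclass

@dataclass
class AlignmentSketch:
    """Hypothetical stand-in for a primer alignment result."""
    primer: str = None
    align_seq: str = ""
    error: float = 1.0
    valid: bool = False

    def __bool__(self):
        # Truthiness follows the valid flag.
        return self.valid

    def __len__(self):
        # Length of the aligned window.
        return len(self.align_seq)

def score_primers_sketch(seq, primers, start=0, max_error=0.2):
    """Pick the primer with the lowest mismatch rate at a fixed offset."""
    best = AlignmentSketch()
    for name, primer in primers.items():
        window = seq[start:start + len(primer)]
        if not primer or len(window) < len(primer):
            continue  # sequence too short for this primer at the given offset
        mismatches = sum(a != b for a, b in zip(window, primer))
        error = mismatches / len(primer)
        if error < best.error:
            best = AlignmentSketch(name, window, error, error <= max_error)
    return best

result = score_primers_sketch("GGACGTTTTT", {"P1": "ACGT", "P2": "ACGA"}, start=2)
print(result.primer, result.error, bool(result), len(result))
```

Because the result object defines both truthiness and length, calling code can write `if result:` to test for a usable alignment without inspecting individual attributes.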

presto.Sequence.scoreSeqPair(seq1, seq2, ignore_chars=set(), score_dict=getDNAScoreDict())

Determine the error rate for a pair of sequences

Parameters:
  • seq1 – SeqRecord object
  • seq2 – SeqRecord object
  • ignore_chars – Set of characters to ignore when scoring and counting the weight
  • score_dict – Optional dictionary of alignment scores
Returns:

Tuple of the (score, minimum weight, error rate) for the pair of sequences

Return type:

Tuple
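The returned triple can be illustrated with a short sketch on plain strings. It assumes that the weight of a sequence is its count of non-ignored characters, that the error rate is one minus the score divided by the smaller weight, and that character identity stands in for the score dictionary; the helper names are hypothetical.

```python
def weight(seq, ignore_chars=frozenset()):
    """Number of characters in seq that are not ignored."""
    return sum(c not in ignore_chars for c in seq)

def score_seq_pair(seq1, seq2, ignore_chars=frozenset(),
                   score=lambda a, b: int(a == b)):
    """Return (score, minimum weight, error rate) for two aligned strings."""
    min_weight = min(weight(seq1, ignore_chars), weight(seq2, ignore_chars))
    if min_weight == 0:
        return (0, 0, 1.0)  # assumption: nothing to compare counts as full error
    total = sum(
        score(a, b)
        for a, b in zip(seq1, seq2)
        if a not in ignore_chars and b not in ignore_chars
    )
    return (total, min_weight, 1.0 - total / min_weight)

print(score_seq_pair("ACGT", "ACGA"))
print(score_seq_pair("ACNT", "ACGT", ignore_chars={"N"}))
```

In the second call the N position is excluded from both the score and the weight, so the three remaining positions all match and the error rate is zero.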

presto.Sequence.subsetSeqIndex(seq_dict, field, values, delimiter=('|', '=', ', '))

Subsets a sequence set by annotation value

Parameters:
  • seq_dict – Dictionary index of sequences returned from SeqIO.index()
  • field – Annotation field to select keys by
  • values – List of annotation values that define the retained keys
  • delimiter – Tuple of delimiters for (annotations, field/values, value lists)
Returns:

List of keys

Return type:

list

presto.Sequence.subsetSeqSet(seq_iter, field, values, delimiter=('|', '=', ', '))

Subsets a sequence set by annotation value

Parameters:
  • seq_iter – Iterator or list of SeqRecord objects
  • field – Annotation field to select by
  • values – List of annotation values that define the retained sequences
  • delimiter – Tuple of delimiters for (annotations, field/values, value lists)
Returns:

Modified list of SeqRecord objects

Return type:

list

presto.Sequence.translateAmbigDNA(key)

Translates IUPAC Ambiguous Nucleotide characters to or from character sets

Parameters: key – String or re.search object containing the character set to translate
Returns: Character translation
Return type: str

presto.Sequence.weightSeq(seq, ignore_chars=set())

Returns the length of a sequence excluding ignored characters

Parameters:
  • seq – SeqRecord or Seq object
  • ignore_chars – Set of characters to ignore when counting sequence length
Returns:

Sum of the character scores for the sequence

Return type:

int