presto.IO

File I/O and logging functions

presto.IO.countSeqFile(seq_file)

Counts the records in FASTA/FASTQ files

Parameters: seq_file – FASTA or FASTQ file containing sample sequences
Returns: Count of records in the sequence file
Return type: int
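
As an illustration of what the returned count represents, the sketch below counts records in FASTA text using only the standard library. It is not pRESTO's implementation: the helper name count_fasta_records and the example data are hypothetical, and it covers FASTA only (FASTQ needs a real parser, because quality lines may start with the same character as a header).

```python
import io

def count_fasta_records(handle):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in handle if line.startswith(">"))

# In-memory stand-in for a FASTA file (hypothetical data).
example = io.StringIO(">SEQ1\nACGT\n>SEQ2\nGGCC\nTTAA\n>SEQ3\nACGT\n")
record_count = count_fasta_records(example)
print(record_count)
```
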
presto.IO.countSeqSets(seq_file, field='BARCODE', delimiter=('|', '=', ', '))

Identifies sets of sequences with the same ID field

Parameters:
  • seq_file – FASTA or FASTQ file containing sample sequences
  • field – Annotation field containing set IDs
  • delimiter – Tuple of delimiters for (fields, values, value lists)
Returns:

Count of unique set IDs in the sequence file

Return type:

int
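
The idea of counting sets can be sketched in plain Python. The example assumes sequence headers laid out as ID|FIELD=VALUE|FIELD=VALUE, which matches the (fields, values, value lists) delimiter roles described above but is otherwise an assumption; parse_header and count_sets are hypothetical helpers, not pRESTO functions, and the value-list delimiter is not used here.

```python
def parse_header(header, delimiter=("|", "=", ",")):
    """Split 'ID|FIELD=VALUE|...' into a dict of annotations (assumed layout)."""
    parts = header.split(delimiter[0])
    annotations = {"ID": parts[0]}
    for part in parts[1:]:
        name, _, value = part.partition(delimiter[1])
        annotations[name] = value
    return annotations

def count_sets(headers, field="BARCODE", delimiter=("|", "=", ",")):
    """Count the distinct values of `field` across all headers."""
    values = set()
    for header in headers:
        value = parse_header(header, delimiter).get(field)
        if value is not None:
            values.add(value)
    return len(values)

# Hypothetical headers: two barcodes shared among three sequences.
headers = [
    "SEQ1|BARCODE=AAAA|PRIMER=P1",
    "SEQ2|BARCODE=AAAA|PRIMER=P2",
    "SEQ3|BARCODE=CCCC|PRIMER=P1",
]
set_count = count_sets(headers)
print(set_count)
```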

presto.IO.getFileType(filename)

Determines the type of a file by file extension

Parameters: filename – Filename
Returns: String defining the sequence type for SeqIO operations
Return type: str
presto.IO.getOutputHandle(in_file, out_label=None, out_dir=None, out_name=None, out_type=None)

Opens an output file handle

Parameters:
  • in_file – Input filename
  • out_label – text to insert before the file extension; if None, do not add a label
  • out_dir – the output directory; if None, use the directory of the input file
  • out_name – the short filename to use for the output file; if None, use the input file's short name
  • out_type – the file extension of the output file; if None, use the input file's extension
Returns:

File handle

Return type:

file
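
How the None defaults combine can be sketched as a path-building helper. This is a minimal sketch under stated assumptions, not pRESTO's code: build_output_path is a hypothetical name, it returns a path instead of opening a handle, and the underscore placed before the label is an assumption (the documentation only says the label goes before the file extension).

```python
import os

def build_output_path(in_file, out_label=None, out_dir=None,
                      out_name=None, out_type=None):
    """Compose an output path, falling back to parts of the input path."""
    in_dir, in_base = os.path.split(in_file)
    in_name, in_ext = os.path.splitext(in_base)

    directory = out_dir if out_dir is not None else in_dir
    name = out_name if out_name is not None else in_name
    extension = out_type if out_type is not None else in_ext.lstrip(".")

    if out_label is not None:
        name = f"{name}_{out_label}"  # separator is an assumption

    return os.path.join(directory, f"{name}.{extension}")

# Only the label and type are overridden; directory and name come from in_file.
path = build_output_path("data/reads.fastq", out_label="pass", out_type="fasta")
print(path)
```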

presto.IO.printCount(current, step, start_time=None, task=None, end=False)

Prints a progress count to standard out

Parameters:
  • current (int) – count of completed tasks.
  • step (int) – the progress increment, in completed tasks, at which to print.
  • start_time (float) – task start time returned by time.time(); if None do not add run time to progress
  • task (str) – name of task to display.
  • end (bool) – if True print final log (add newline).
presto.IO.printError(message, exit=True)

Prints an error to standard error and, optionally, exits

Parameters:
  • message (str) – error message.
  • exit (bool) – if True exit after the message.
presto.IO.printLog(record, handle=sys.stdout, inset=None)

Formats a dictionary into a log string

Parameters:
  • record – a dict or OrderedDict of field names mapping to values.
  • handle – the file handle to write the log to; if None do not write to file.
  • inset – minimum field name inset; if None automatically space field names.
Returns:

Formatted multi-line string in the log format.

Return type:

str
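
To make the idea of a formatted multi-line log string concrete, the sketch below aligns field names in a column. The exact log layout is not specified in this section, so the right-aligned NAME> value form, the treatment of inset as a minimum name width, and the helper name format_log are all assumptions for illustration.

```python
from collections import OrderedDict

def format_log(record, inset=None):
    """Return one 'NAME> value' line per field, with names right-aligned."""
    width = max((len(str(key)) for key in record), default=0)
    if inset is not None:
        width = max(width, inset)  # assumed meaning of a minimum inset
    lines = [f"{str(key):>{width}}> {value}" for key, value in record.items()]
    return "\n".join(lines)

# Hypothetical log record; OrderedDict keeps the fields in insertion order.
record = OrderedDict([("ID", "SEQ1"), ("BARCODE", "AAAA"), ("PASS", True)])
log_text = format_log(record)
print(log_text)
```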

presto.IO.printMessage(message, start_time=None, end=False, width=25)

Prints a progress message to standard out

Parameters:
  • message – Current task message
  • start_time – task start time returned by time.time(); if None do not add run time to progress
  • end – If True print final message (add newline)
  • width – Maximum number of characters for messages
presto.IO.printProgress(current, total, step, start_time=None, task=None, end=False)

Prints a progress bar to standard out

Parameters:
  • current (int) – count of completed tasks.
  • total (int) – total task count.
  • step (float) – float defining the fractional progress increment to print at.
  • start_time (float) – task start time returned by time.time(); if None do not add run time to progress
  • task (str) – name of task to display.
  • end (bool) – if True print final log (add newline).
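
The relationship between current, total, and a fractional step can be sketched as follows. This is an illustrative stand-in, not the library's output format: progress_line, report_points, the bar width, and the bar characters are hypothetical choices.

```python
def progress_line(current, total, task=None, width=20):
    """Render a fixed-width text bar for current out of total tasks."""
    fraction = current / total if total > 0 else 1.0
    filled = int(fraction * width)
    bar = "#" * filled + " " * (width - filled)
    label = f"{task}> " if task else ""
    return f"{label}[{bar}] {int(fraction * 100):3d}%"

def report_points(total, step):
    """List the task counts at which a line would be printed for a fractional step."""
    interval = max(1, int(total * step))
    return [n for n in range(total + 1) if n % interval == 0]

# With 200 tasks and step=0.25, a line is printed every 50 tasks.
points = report_points(200, 0.25)
line = progress_line(50, 200, task="PROGRESS")
print(points)
print(line)
```
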
presto.IO.printWarning(message)

Prints a warning to standard error

Parameters: message (str) – warning message.
presto.IO.readPrimerFile(primer_file)

Processes primer sequences from file

Parameters: primer_file (str) – name of the FASTA file containing primer sequences.
Returns: Dictionary mapping primer ID to sequence.
Return type: dict
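
The shape of the returned dictionary can be illustrated with a small standard-library FASTA reader. This sketch is not pRESTO's parser: read_fasta_dict is a hypothetical helper, the primer names and sequences are made up, and upper-casing the sequences is an assumption.

```python
import io

def read_fasta_dict(handle):
    """Map each FASTA record ID (first word of the header) to its sequence."""
    primers = {}
    current_id = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current_id = line[1:].strip().partition(" ")[0]
            primers[current_id] = ""
        elif current_id is not None:
            primers[current_id] += line.upper()  # upper-casing is an assumption
    return primers

# In-memory stand-in for a primer FASTA file (hypothetical primers).
primer_text = ">P1 forward\nacgtac\n>P2\nGGCC\nTTAA\n"
primers = read_fasta_dict(io.StringIO(primer_text))
print(primers)
```
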
presto.IO.readReferenceFile(ref_file)

Create a dictionary of cleaned and ungapped reference sequences.

Parameters: ref_file – reference sequences in FASTA format.
Returns: Dictionary of cleaned and ungapped reference sequences, with the sequence ID as the key and a Bio.SeqRecord as the value for each reference sequence.
Return type: dict
presto.IO.readSeqFile(seq_file, index=False, key_func=None)

Reads FASTA/FASTQ files

Parameters:
  • seq_file – FASTA or FASTQ file containing sample sequences
  • index – If True return a dictionary from SeqIO.index(); if False return an iterator from SeqIO.parse()
  • key_func – the key_function argument to pass to SeqIO.index if index=True
Returns:

An iterator of SeqRecords if index=False; a dict if index=True.

Return type:

iter or dict
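
The difference between the two return modes can be sketched without Biopython. In this minimal sketch, (id, sequence) tuples stand in for SeqRecord objects, and iter_fasta and index_fasta are hypothetical helpers that only mimic the single-pass iterator versus keyed lookup distinction, including a key function applied to each record ID.

```python
import io

def iter_fasta(handle):
    """Yield (id, sequence) pairs one at a time, as a single-pass iterator."""
    seq_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            seq_id, chunks = line[1:], []
        elif line and seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)

def index_fasta(handle, key_func=None):
    """Build a dict for lookup by key; key_func maps a record ID to its key."""
    key_func = key_func or (lambda seq_id: seq_id)
    return {key_func(seq_id): seq for seq_id, seq in iter_fasta(handle)}

# Hypothetical FASTA text with annotated headers.
text = ">SEQ1|BARCODE=AAAA\nACGT\n>SEQ2|BARCODE=CCCC\nGG\nTT\n"
records = list(iter_fasta(io.StringIO(text)))
indexed = index_fasta(io.StringIO(text), key_func=lambda h: h.split("|")[0])
print(records)
print(indexed["SEQ2"])
```

An iterator suits one pass over a large file, while the dictionary form suits repeated lookups by ID.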