presto.Multiprocessing

Multiprocessing functions

class presto.Multiprocessing.SeqData(key, records)

Bases: object

A class defining sequence data objects for worker processes

class presto.Multiprocessing.SeqResult(key, records)

Bases: object

A class defining sequence result objects for collector processes

data_count
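The two container classes take the same (key, records) arguments. As a rough illustration of what such keyed containers might look like, the sketch below defines stand-in classes. ToyData and ToyResult and all of their attribute names are hypothetical and are not taken from pRESTO; in particular, treating data_count as "the number of records held" is an assumption about its meaning.

```python
class ToyData:
    """Hypothetical stand-in for a keyed data container sent to workers."""

    def __init__(self, key, records):
        self.key = key          # identifier for the record or record set
        self.records = records  # a single record or a list of records


class ToyResult:
    """Hypothetical stand-in for a keyed result container sent to a collector."""

    def __init__(self, key, records):
        self.key = key
        self.records = records
        self.results = None     # filled in by a worker
        self.log = {"ID": key}  # per-item log entries

    @property
    def data_count(self):
        # Assumption: count the input records, treating a non-list as one record
        if self.records is None:
            return 0
        return len(self.records) if isinstance(self.records, list) else 1


if __name__ == "__main__":
    result = ToyResult("set1", ["ACGT", "ACGA"])
    print(result.key, result.data_count)
```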
presto.Multiprocessing.collectSeqQueue(alive, result_queue, collect_queue, seq_file, task_label, out_args, index_field=None)

Pulls from results queue, assembles results and manages log and file IO

Parameters:
  • alive – multiprocessing.Value boolean controlling whether processing continues; when False, the function returns
  • result_queue – multiprocessing.Queue holding worker results
  • collect_queue – multiprocessing.Queue to store collector return values
  • seq_file – Sample sequence file name
  • task_label – Task label used to tag the output files
  • out_args – Common output argument dictionary from parseCommonArgs
  • index_field – Field defining set membership for sequence sets; if None, the data queue contained individual records
Returns:

None. Adds a dictionary to collect_queue containing the keys ‘log’ (a log object) and ‘out_files’ (the output file names)

Return type:

None
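The collector pattern described above (drain the result queue, tally a log, write output, then report through collect_queue rather than a return value) can be sketched with the standard library alone. This is a minimal illustration, not pRESTO's implementation: collect_results, the out_dir parameter, the (record, passed) result tuples, the "-pass.txt" file name, and the use of None as an end-of-results sentinel are all assumptions made for the example.

```python
import ctypes
import multiprocessing as mp
import os
import tempfile


def collect_results(alive, result_queue, collect_queue, task_label, out_dir):
    """Hypothetical collector: tally results and write passing records."""
    pass_path = os.path.join(out_dir, f"{task_label}-pass.txt")
    log = {"PASS": 0, "FAIL": 0}
    with open(pass_path, "w") as handle:
        while alive.value:
            result = result_queue.get()
            if result is None:
                # Assumed sentinel: no more results will arrive
                break
            record, passed = result
            if passed:
                handle.write(record + "\n")
                log["PASS"] += 1
            else:
                log["FAIL"] += 1
    # Report through the queue; the function itself returns None
    collect_queue.put({"log": log, "out_files": [pass_path]})


if __name__ == "__main__":
    alive = mp.Value(ctypes.c_bool, True)
    result_queue, collect_queue = mp.Queue(), mp.Queue()
    for item in [("ACGT", True), ("AC", False), None]:
        result_queue.put(item)
    with tempfile.TemporaryDirectory() as tmp_dir:
        collect_results(alive, result_queue, collect_queue, "filter", tmp_dir)
        print(collect_queue.get(timeout=5))
```

Passing the summary through a queue lets a parent process retrieve it even though the collector runs in a separate process, where a plain return value would be lost.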

presto.Multiprocessing.feedSeqQueue(alive, data_queue, seq_file, index_func=None, index_args={})

Feeds the data queue with SeqRecord objects

Parameters:
  • alive – multiprocessing.Value boolean controlling whether processing continues; when False, the function returns
  • data_queue – multiprocessing.Queue to hold data for processing
  • seq_file – Sequence file to read input from
  • index_func – Function to use to define sequence sets; if None, do not index sets and feed individual records
  • index_args – Dictionary of arguments to pass to index_func
Returns:

None
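The two feeding modes (individual records when no index function is given, grouped sets otherwise) can be illustrated with a small standard-library sketch. It is not pRESTO's implementation: feed_queue is a hypothetical name, it reads from an in-memory list instead of a sequence file, and its index_func maps one record to a set key, which is an assumed calling convention. Signalling the end of the data is left to the caller here.

```python
import ctypes
import multiprocessing as mp


def feed_queue(alive, data_queue, records, index_func=None):
    """Hypothetical feeder: put (key, data) items on data_queue."""
    if index_func is None:
        # No indexing: feed individual records, keyed by position
        items = enumerate(records)
    else:
        # Group records into sets that share the same index key
        sets = {}
        for record in records:
            sets.setdefault(index_func(record), []).append(record)
        items = sets.items()
    for key, data in items:
        # Stop early if another process has cleared the shared flag
        if not alive.value:
            break
        data_queue.put((key, data))


if __name__ == "__main__":
    alive = mp.Value(ctypes.c_bool, True)
    data_queue = mp.Queue()
    reads = ["ACGT", "TTGA", "ACCC"]
    feed_queue(alive, data_queue, reads, index_func=lambda seq: seq[0])
    print([data_queue.get(timeout=5) for _ in range(2)])
```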

presto.Multiprocessing.manageProcesses(feed_func, work_func, collect_func, feed_args={}, work_args={}, collect_args={}, nproc=None, queue_size=None)

Manages feeder, worker and collector processes

Parameters:
  • feed_func – Data queue feeder function
  • work_func – Worker function
  • collect_func – Result queue collector function
  • feed_args – Dictionary of arguments to pass to feed_func
  • work_args – Dictionary of arguments to pass to work_func
  • collect_args – Dictionary of arguments to pass to collect_func
  • nproc – Number of worker (work_func) processes; if None, defaults to the number of CPUs
  • queue_size – Maximum size of the argument queue; if None, defaults to 2*nproc
Returns:

Dictionary of collector results

Return type:

dict
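To show how one feeder, several workers, and one collector might be wired together, here is a minimal standard-library sketch. It is not pRESTO's implementation. The names manage, feed_items, apply_func, and gather are hypothetical, None is used as an assumed end-of-data sentinel, and the "fork" start method is requested explicitly, so the sketch assumes a POSIX system.

```python
import ctypes
import multiprocessing as mp


def feed_items(alive, data_queue, items):
    for item in items:
        if not alive.value:
            break
        data_queue.put(item)


def apply_func(alive, data_queue, result_queue, func):
    while alive.value:
        item = data_queue.get()
        if item is None:  # assumed sentinel: no more data
            break
        result_queue.put(func(item))


def gather(alive, result_queue, collect_queue):
    results = []
    while alive.value:
        result = result_queue.get()
        if result is None:  # assumed sentinel: no more results
            break
        results.append(result)
    collect_queue.put({"results": sorted(results)})


def square(x):
    return x * x


def manage(feed_func, work_func, collect_func, feed_args=None, work_args=None,
           collect_args=None, nproc=None, queue_size=None):
    ctx = mp.get_context("fork")  # assumption: POSIX only
    nproc = nproc or mp.cpu_count()
    queue_size = queue_size or 2 * nproc
    alive = ctx.Value(ctypes.c_bool, True)
    data_queue = ctx.Queue(queue_size)
    result_queue = ctx.Queue(queue_size)
    collect_queue = ctx.Queue(1)

    feeder = ctx.Process(target=feed_func, args=(alive, data_queue),
                         kwargs=feed_args or {})
    workers = [ctx.Process(target=work_func,
                           args=(alive, data_queue, result_queue),
                           kwargs=work_args or {}) for _ in range(nproc)]
    collector = ctx.Process(target=collect_func,
                            args=(alive, result_queue, collect_queue),
                            kwargs=collect_args or {})
    for proc in [feeder, *workers, collector]:
        proc.start()

    feeder.join()
    for _ in workers:          # one sentinel per worker
        data_queue.put(None)
    for proc in workers:
        proc.join()
    result_queue.put(None)     # all workers done: stop the collector
    summary = collect_queue.get()  # read before joining the collector
    collector.join()
    return summary


if __name__ == "__main__":
    print(manage(feed_items, apply_func, gather,
                 feed_args={"items": [1, 2, 3, 4]},
                 work_args={"func": square}, nproc=2))
```

Bounding the queues (here at 2*nproc by default) keeps the feeder from reading far ahead of the workers, which limits memory use on large inputs.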

presto.Multiprocessing.processSeqQueue(alive, data_queue, result_queue, process_func, process_args={})

Pulls from data queue, performs calculations, and feeds results queue

Parameters:
  • alive – multiprocessing.Value boolean controlling whether processing continues; when False, the function returns
  • data_queue – multiprocessing.Queue holding data to process
  • result_queue – multiprocessing.Queue to hold processed results
  • process_func – Function to use for filtering sequences
  • process_args – Dictionary of arguments to pass to process_func
Returns:

None
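A worker loop of this kind (pull an item, apply the processing function with its keyword arguments, push the result) might be sketched as follows using only the standard library. This is not pRESTO's implementation: process_queue and length_filter are hypothetical names, None is an assumed end-of-data sentinel, and clearing the alive flag on an exception is one plausible way to ask the other processes to stop.

```python
import ctypes
import multiprocessing as mp


def process_queue(alive, data_queue, result_queue, process_func, process_args=None):
    """Hypothetical worker loop: apply process_func to each queued item."""
    process_args = process_args or {}
    try:
        while alive.value:
            item = data_queue.get()
            if item is None:
                # Assumed sentinel: no more data will arrive
                break
            result_queue.put(process_func(item, **process_args))
    except Exception:
        # Clear the shared flag so cooperating processes can exit their loops
        alive.value = False
        raise


def length_filter(seq, min_len=4):
    """Toy filter: report whether a sequence meets a minimum length."""
    return seq, len(seq) >= min_len


if __name__ == "__main__":
    alive = mp.Value(ctypes.c_bool, True)
    data_queue, result_queue = mp.Queue(), mp.Queue()
    for item in ["ACGT", "AC", None]:
        data_queue.put(item)
    process_queue(alive, data_queue, result_queue, length_filter, {"min_len": 3})
    print([result_queue.get(timeout=5) for _ in range(2)])
```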