ParseHeaders

Parses pRESTO annotations in FASTA/FASTQ sequence headers

usage: ParseHeaders [--version] [-h]  ...
--version

show program’s version number and exit

-h, --help

show this help message and exit

output files:
reheader-pass

reads that passed the annotation operation, with headers modified accordingly.

reheader-fail

raw reads that failed the annotation operation.

headers

tab-delimited table of the selected annotations.

output annotation fields:
<user defined>

annotation fields specified by the -f argument.

ParseHeaders add

Adds field/value pairs to header annotations

usage: ParseHeaders add [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
                        [-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
                        [--outname OUT_NAME] [--failed] [--fasta]
                        [--delim DELIMITER DELIMITER DELIMITER] -f FIELDS
                        [FIELDS ...] -u VALUES [VALUES ...]
--version

show program’s version number and exit

-h, --help

show this help message and exit

-s <seq_files>

A list of FASTA/FASTQ files containing sequences to process.

-o <out_files>

Explicit output file name(s). Note, this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, then the output filename will be based on the input filename(s).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

--failed

If specified, create files containing records that fail processing.

--fasta

Specify to force output as FASTA rather than FASTQ.

--delim <delimiter>

A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.

-f <fields>

List of fields to add.

-u <values>

List of values to add for each field.
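
As a rough illustration of what the add operation does to a header, the sketch below splits a header into fields and appends field/value pairs. It assumes the three delimiters are "|", "=" and "," (the roles described under --delim), and that adding to an existing field appends a value rather than replacing it. The function names and the example header are hypothetical and are not part of the tool itself.

```python
from collections import OrderedDict

# Assumed delimiters: annotation blocks, field/value, values within a field.
DELIM = ("|", "=", ",")


def parse_header(header, delim=DELIM):
    """Split a header into an ordered mapping; the first block is stored as ID."""
    blocks = header.split(delim[0])
    fields = OrderedDict([("ID", blocks[0])])
    for block in blocks[1:]:
        name, value = block.split(delim[1], 1)
        fields[name] = value
    return fields


def flatten_header(fields, delim=DELIM):
    """Join the mapping back into a single header string."""
    rest = [name + delim[1] + value
            for name, value in fields.items() if name != "ID"]
    return delim[0].join([fields["ID"]] + rest)


def add_fields(header, names, values, delim=DELIM):
    """Append each (name, value) pair to the header annotations."""
    fields = parse_header(header, delim)
    for name, value in zip(names, values):
        if name in fields:
            # Assumption: an existing field gains an extra value.
            fields[name] = fields[name] + delim[2] + value
        else:
            fields[name] = value
    return flatten_header(fields, delim)


print(add_fields("SEQ1|PRIMER=IGHM", ["SAMPLE", "RUN"], ["S1", "R7"]))
```

The names passed as `names` and `values` play the roles of the -f and -u arguments: one value is paired with each field, in order.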

ParseHeaders collapse

Collapses header annotations with multiple entries

usage: ParseHeaders collapse [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
                             [-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
                             [--outname OUT_NAME] [--failed] [--fasta]
                             [--delim DELIMITER DELIMITER DELIMITER] -f FIELDS
                             [FIELDS ...] --act
                             {min,max,sum,first,last,set,cat}
                             [{min,max,sum,first,last,set,cat} ...]
--version

show program’s version number and exit

-h, --help

show this help message and exit

-s <seq_files>

A list of FASTA/FASTQ files containing sequences to process.

-o <out_files>

Explicit output file name(s). Note, this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, then the output filename will be based on the input filename(s).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

--failed

If specified, create files containing records that fail processing.

--fasta

Specify to force output as FASTA rather than FASTQ.

--delim <delimiter>

A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.

-f <fields>

List of fields to collapse.

--act {min,max,sum,first,last,set,cat}

List of actions to take for each field, defining how each annotation will be combined into a single value. The actions “min”, “max”, “sum” perform the corresponding mathematical operation on numeric annotations. The actions “first” and “last” choose the value from the corresponding position in the annotation. The action “set” collapses annotations into a comma-delimited list of unique values. The action “cat” concatenates the values together into a single string.
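
The seven actions can be sketched in a few lines of Python. This is only an illustration of the descriptions above, not the tool's implementation: the function names are hypothetical, values are assumed to be separated by ",", and “set” is assumed to keep values in first-seen order.

```python
def to_number(text):
    """Interpret an annotation as an int when possible, otherwise as a float."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def collapse_values(values, action, sep=","):
    """Combine a list of annotation strings into one string."""
    if action in ("min", "max", "sum"):
        func = {"min": min, "max": max, "sum": sum}[action]
        return str(func([to_number(v) for v in values]))
    if action == "first":
        return values[0]
    if action == "last":
        return values[-1]
    if action == "set":
        # dict.fromkeys drops duplicates while keeping first-seen order.
        return sep.join(dict.fromkeys(values))
    if action == "cat":
        return "".join(values)
    raise ValueError("unknown action: %s" % action)


# A field holding the multiple entries 12, 3, 12 and 7.
counts = "12,3,12,7".split(",")
for action in ("min", "max", "sum", "first", "last", "set", "cat"):
    print(action, collapse_values(counts, action))
```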

ParseHeaders copy

Copies header annotation fields

usage: ParseHeaders copy [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
                         [-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
                         [--outname OUT_NAME] [--failed] [--fasta]
                         [--delim DELIMITER DELIMITER DELIMITER] -f FIELDS
                         [FIELDS ...] -k NAMES [NAMES ...]
                         [--act {min,max,sum,first,last,set,cat} [{min,max,sum,first,last,set,cat} ...]]
--version

show program’s version number and exit

-h, --help

show this help message and exit

-s <seq_files>

A list of FASTA/FASTQ files containing sequences to process.

-o <out_files>

Explicit output file name(s). Note, this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, then the output filename will be based on the input filename(s).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

--failed

If specified, create files containing records that fail processing.

--fasta

Specify to force output as FASTA rather than FASTQ.

--delim <delimiter>

A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.

-f <fields>

List of fields to copy.

-k <names>

List of names for each copied field. If the new field is already present, the copied field will be merged into the existing field.

--act {min,max,sum,first,last,set,cat}

List of collapse actions to take on each new field following the copy operation, defining how each annotation will be combined into a single value. The actions “min”, “max”, “sum” perform the corresponding mathematical operation on numeric annotations. The actions “first” and “last” choose the value from the corresponding position in the annotation. The action “set” collapses annotations into a comma-delimited list of unique values. The action “cat” concatenates the values together into a single string.
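
The interaction between copying into an existing field and a follow-up collapse action can be sketched as below. The helper names are hypothetical, headers are represented as plain dictionaries for brevity, values are assumed to be separated by ",", and the copied value is assumed to be appended after the existing one.

```python
def copy_field(fields, source, target, sep=","):
    """Copy fields[source] into fields[target], merging if target exists."""
    out = dict(fields)
    if source not in out:
        return out
    if target in out:
        # Assumption: the copied value is appended to the existing value.
        out[target] = out[target] + sep + out[source]
    else:
        out[target] = out[source]
    return out


def unique_values(value, sep=","):
    """A 'set'-style collapse: keep each value once, in first-seen order."""
    return sep.join(dict.fromkeys(value.split(sep)))


header = {"ID": "SEQ1", "PRIMER": "IGHM", "CREGION": "IGHM"}
copied = copy_field(header, "PRIMER", "CREGION")
print(copied["CREGION"])                 # merged field with a repeated value
print(unique_values(copied["CREGION"]))  # after a 'set'-style collapse
```

Without a collapse step the merged field keeps both entries, which is the situation the --act argument is meant to resolve.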

ParseHeaders delete

Deletes fields from header annotations

usage: ParseHeaders delete [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
                           [-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
                           [--outname OUT_NAME] [--failed] [--fasta]
                           [--delim DELIMITER DELIMITER DELIMITER] -f FIELDS
                           [FIELDS ...]
--version

show program’s version number and exit

-h, --help

show this help message and exit

-s <seq_files>

A list of FASTA/FASTQ files containing sequences to process.

-o <out_files>

Explicit output file name(s). Note, this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, then the output filename will be based on the input filename(s).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

--failed

If specified, create files containing records that fail processing.

--fasta

Specify to force output as FASTA rather than FASTQ.

--delim <delimiter>

A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.

-f <fields>

List of fields to delete.
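
Deleting fields amounts to filtering annotation blocks by name. A minimal sketch, assuming "|" separates blocks and "=" separates field names from values; the function name and header are hypothetical:

```python
def delete_fields(header, names, delim=("|", "=", ",")):
    """Drop every annotation block whose field name is listed in names."""
    blocks = header.split(delim[0])
    kept = [block for block in blocks[1:]
            if block.split(delim[1], 1)[0] not in names]
    # The first block is the sequence identifier and is always kept.
    return delim[0].join([blocks[0]] + kept)


print(delete_fields("SEQ1|PRIMER=IGHM|COUNT=3|SAMPLE=S1", ["COUNT", "SAMPLE"]))
```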

ParseHeaders expand

Expands annotation fields with multiple values

usage: ParseHeaders expand [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
                           [-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
                           [--outname OUT_NAME] [--failed] [--fasta]
                           [--delim DELIMITER DELIMITER DELIMITER] -f FIELDS
                           [FIELDS ...] [--sep SEPARATOR]
--version

show program’s version number and exit

-h, --help

show this help message and exit

-s <seq_files>

A list of FASTA/FASTQ files containing sequences to process.

-o <out_files>

Explicit output file name(s). Note, this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, then the output filename will be based on the input filename(s).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

--failed

If specified, create files containing records that fail processing.

--fasta

Specify to force output as FASTA rather than FASTQ.

--delim <delimiter>

A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.

-f <fields>

List of fields to expand.

--sep <separator>

The character separating each value in the fields.
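
One way to picture the expand operation is shown below. The sketch assumes that each value is written to a numbered field (PRIMER1, PRIMER2, and so on) and that the original multi-valued field is dropped; the text above does not spell out the naming scheme, so treat both points as assumptions. The function name is hypothetical and headers are represented as plain dictionaries.

```python
def expand_field(fields, name, sep=","):
    """Replace a multi-valued field with numbered single-valued fields."""
    out = {}
    for key, value in fields.items():
        if key == name:
            # Assumption: numbering starts at 1 and is appended to the name.
            for index, part in enumerate(value.split(sep), start=1):
                out["%s%d" % (name, index)] = part
        else:
            out[key] = value
    return out


header = {"ID": "SEQ1", "PRIMER": "IGHV3,IGHM", "COUNT": "5"}
print(expand_field(header, "PRIMER"))
```

The `sep` parameter plays the role of the --sep argument.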

ParseHeaders merge

Merges multiple annotation fields into a single field

usage: ParseHeaders merge [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
                          [-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
                          [--outname OUT_NAME] [--failed] [--fasta]
                          [--delim DELIMITER DELIMITER DELIMITER] -f FIELDS
                          [FIELDS ...] -k NAME [--act {min,max,sum,set,cat}]
                          [--delete]
--version

show program’s version number and exit

-h, --help

show this help message and exit

-s <seq_files>

A list of FASTA/FASTQ files containing sequences to process.

-o <out_files>

Explicit output file name(s). Note, this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, then the output filename will be based on the input filename(s).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

--failed

If specified, create files containing records that fail processing.

--fasta

Specify to force output as FASTA rather than FASTQ.

--delim <delimiter>

A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.

-f <fields>

List of fields to merge.

-k <name>

Name for the merged field. If the new field is already present, the merged fields will be merged into the existing field.

--act {min,max,sum,set,cat}

Collapse action to take on the new field following the merge, defining how to combine the annotations into a single value. The actions “min”, “max”, “sum” perform the corresponding mathematical operation on numeric annotations. The action “set” collapses annotations into a comma-delimited list of unique values. The action “cat” concatenates the values together into a single string.

--delete

If specified, delete the fields that were merged from the output header.
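
The merge operation, including the effect of --delete, can be sketched as follows. This is an illustration under stated assumptions rather than the tool's code: the function name is hypothetical, headers are plain dictionaries, values are joined with ",", and any existing value of the target field is assumed to come first.

```python
def merge_fields(fields, names, target, delete=False, sep=","):
    """Combine the values of several fields into a single target field."""
    out = dict(fields)
    values = []
    if target in out:
        # Assumption: an existing target value is kept at the front.
        values.append(out[target])
    values.extend(out[name] for name in names
                  if name in out and name != target)
    out[target] = sep.join(values)
    if delete:
        for name in names:
            if name != target:
                out.pop(name, None)
    return out


header = {"ID": "SEQ1", "VPRIMER": "IGHV3", "CPRIMER": "IGHM"}
print(merge_fields(header, ["VPRIMER", "CPRIMER"], "PRIMER"))
print(merge_fields(header, ["VPRIMER", "CPRIMER"], "PRIMER", delete=True))
```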

ParseHeaders rename

Renames header annotation fields

usage: ParseHeaders rename [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
                           [-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
                           [--outname OUT_NAME] [--failed] [--fasta]
                           [--delim DELIMITER DELIMITER DELIMITER] -f FIELDS
                           [FIELDS ...] -k NAMES [NAMES ...]
                           [--act {min,max,sum,first,last,set,cat} [{min,max,sum,first,last,set,cat} ...]]
--version

show program’s version number and exit

-h, --help

show this help message and exit

-s <seq_files>

A list of FASTA/FASTQ files containing sequences to process.

-o <out_files>

Explicit output file name(s). Note, this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, then the output filename will be based on the input filename(s).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

--failed

If specified, create files containing records that fail processing.

--fasta

Specify to force output as FASTA rather than FASTQ.

--delim <delimiter>

A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.

-f <fields>

List of fields to rename.

-k <names>

List of new names for each field. If the new field is already present, the renamed field will be merged into the existing field and the old field will be deleted.

--act {min,max,sum,first,last,set,cat}

List of collapse actions to take on each new field following the rename operation, defining how each annotation will be combined into a single value. The actions “min”, “max”, “sum” perform the corresponding mathematical operation on numeric annotations. The actions “first” and “last” choose the value from the corresponding position in the annotation. The action “set” collapses annotations into a comma-delimited list of unique values. The action “cat” concatenates the values together into a single string.
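
Renaming differs from copying in that the old field is removed. A minimal sketch of that behavior, with a hypothetical function name, headers as plain dictionaries, and the assumption that the renamed value is appended after any existing value using ",":

```python
def rename_field(fields, old, new, sep=","):
    """Move the value of old to new, merging into new if it already exists."""
    out = dict(fields)
    if old not in out:
        return out
    value = out.pop(old)  # the old field is always removed
    if new in out:
        # Assumption: the renamed value is appended to the existing value.
        out[new] = out[new] + sep + value
    else:
        out[new] = value
    return out


print(rename_field({"ID": "SEQ1", "PRIMER1": "IGHM"}, "PRIMER1", "CREGION"))
print(rename_field({"ID": "SEQ1", "PRIMER1": "IGHM", "CREGION": "IGHG"},
                   "PRIMER1", "CREGION"))
```

In the second call the target already exists, so the result holds two entries; a collapse action such as “first” or “set” would then reduce it to a single value.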

ParseHeaders table

Writes sequence headers to a table

usage: ParseHeaders table [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
                          [-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
                          [--outname OUT_NAME] [--failed]
                          [--delim DELIMITER DELIMITER DELIMITER] -f FIELDS
                          [FIELDS ...]
--version

show program’s version number and exit

-h, --help

show this help message and exit

-s <seq_files>

A list of FASTA/FASTQ files containing sequences to process.

-o <out_files>

Explicit output file name(s). Note, this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, then the output filename will be based on the input filename(s).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

--failed

If specified, create files containing records that fail processing.

--delim <delimiter>

A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.

-f <fields>

List of fields to collect. The sequence identifier may be specified using the hidden field name “ID”.
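
The shape of the resulting table can be sketched with Python's csv module. The sketch assumes "|" and "=" as the block and field/value delimiters and writes an empty cell when a header lacks a requested field; the function name, example headers, and the empty-cell choice are assumptions made for illustration.

```python
import csv
import io


def headers_to_table(headers, columns, delim=("|", "=", ",")):
    """Return a tab-delimited table with one row per header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for header in headers:
        blocks = header.split(delim[0])
        # The first block is exposed under the field name ID.
        fields = {"ID": blocks[0]}
        fields.update(block.split(delim[1], 1) for block in blocks[1:])
        # Assumption: a missing field is written as an empty cell.
        writer.writerow([fields.get(column, "") for column in columns])
    return buffer.getvalue()


headers = ["SEQ1|PRIMER=IGHM|COUNT=3", "SEQ2|PRIMER=IGHG"]
print(headers_to_table(headers, ["ID", "PRIMER", "COUNT"]))
```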