MaskPrimers

Removes primers and annotates sequences with primer and barcode identifiers

usage: MaskPrimers [--version] [-h]  ...
--version

show program’s version number and exit

-h, --help

show this help message and exit

output files:
mask-pass
processed reads with successful primer matches.
mask-fail
raw reads failing primer identification.
output annotation fields:
SEQORIENT
the orientation of the output sequence. Either F (input) or RC (reverse complement of input).
PRIMER
name of the best primer match.
BARCODE
the sequence preceding the primer match. Only output when the --barcode flag is specified.

MaskPrimers align

Find primer matches using pairwise local alignment.

usage: MaskPrimers align [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
                             [--fasta] [--failed] [--log LOG_FILE]
                             [--delim DELIMITER DELIMITER DELIMITER]
                             [--nproc NPROC] [--outdir OUT_DIR]
                             [--outname OUT_NAME] -p PRIMER_FILE
                             [--mode {cut,mask,trim,tag}] [--revpr] [--barcode]
                             [--maxerror MAX_ERROR] [--maxlen MAX_LEN] [--skiprc]
                             [--gap GAP_PENALTY GAP_PENALTY]
--version

show program’s version number and exit

-h, --help

show this help message and exit

-s <seq_files>

A list of FASTA/FASTQ files containing sequences to process.

--fasta

Specify to force output as FASTA rather than FASTQ.

--failed

If specified, create files containing records that fail processing.

--log <log_file>

Specify to write verbose logging to a file. May not be specified with multiple input files.

--delim <delimiter>

A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.
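The role of the three delimiters can be illustrated with a minimal Python sketch. This is not pRESTO's parser: the function name parse_header, the example header, the SAMPLE field, and the delimiter values ("|", "=", ",") are assumptions made for illustration.

```python
def parse_header(header, delim=("|", "=", ",")):
    """Split an annotated header into a sequence ID and a dict of fields.

    delim holds, in order: the separator between annotation blocks,
    the separator between a field name and its values, and the
    separator between values within one field. The values used here
    are an assumption for this sketch.
    """
    block_sep, field_sep, value_sep = delim
    blocks = header.split(block_sep)
    seq_id = blocks[0]
    fields = {}
    for block in blocks[1:]:
        # partition splits on the first field separator only
        name, _, value = block.partition(field_sep)
        fields[name] = value.split(value_sep)
    return seq_id, fields


# Hypothetical header; SAMPLE is an invented multi-valued field.
seq_id, fields = parse_header("SEQ1|SEQORIENT=F|PRIMER=P1|SAMPLE=A,B")
```

Passing a different three-item tuple as delim mirrors what --delim changes on the command line.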

--nproc <nproc>

The number of simultaneous computational processes to execute (CPU cores to utilize).

--outdir <out_dir>

Specify to change the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

-p <primer_file>

A FASTA or REGEX file containing primer sequences.

--mode {cut,mask,trim,tag}

Specifies the action to take with the primer sequence. The “cut” mode will remove both the primer region and the preceding sequence. The “mask” mode will replace the primer region with Ns and remove the preceding sequence. The “trim” mode will remove the region preceding the primer, but leave the primer region intact. The “tag” mode will leave the input sequence unmodified.
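The four modes can be sketched in Python as slicing operations on the input sequence. This is an illustration of the behavior described above, not pRESTO's code: apply_mode is a hypothetical helper, the primer match is assumed to occupy seq[start:end], and only the forward orientation (without --revpr) is covered.

```python
def apply_mode(seq, start, end, mode):
    """Apply a primer action to seq, where seq[start:end] is the primer match."""
    if mode == "cut":
        # Remove the primer region and everything before it.
        return seq[end:]
    if mode == "mask":
        # Replace the primer region with Ns and remove everything before it.
        return "N" * (end - start) + seq[end:]
    if mode == "trim":
        # Remove the region before the primer but keep the primer itself.
        return seq[start:]
    if mode == "tag":
        # Leave the sequence unmodified.
        return seq
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical read: 3 leading bases, a 4-base primer (CCGG), then the insert.
read = "AAA" + "CCGG" + "TTTT"
results = {m: apply_mode(read, 3, 7, m) for m in ("cut", "mask", "trim", "tag")}
```

For this read, the modes yield TTTT (cut), NNNNTTTT (mask), CCGGTTTT (trim), and AAACCGGTTTT (tag).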

--revpr

Specify to match the tail-end of the sequence against the reverse complement of the primers. This also reverses the behavior of the --maxlen argument, such that the search window begins at the tail-end of the sequence.
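A minimal Python sketch of how the search window and primer orientation change under --revpr. The helpers reverse_complement and search_window are hypothetical and handle only the characters A, C, G, T, and N; they illustrate the idea and are not taken from pRESTO.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def search_window(seq, max_len, rev_primer=False):
    """Return the part of seq that would be scanned for primers.

    Without rev_primer the window covers the first max_len bases;
    with rev_primer it covers the last max_len bases.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return seq[-max_len:] if rev_primer else seq[:max_len]


read = "AAACCGGTTTT"
head_window = search_window(read, 4)                    # "AAAC"
tail_window = search_window(read, 4, rev_primer=True)   # "TTTT"
rc_primer = reverse_complement("AACG")                  # "CGTT"
```

With rev_primer set, the reverse-complemented primer would be compared against the tail window rather than the head window.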

--barcode

Specify to encode sequences with barcode sequences (unique molecular identifiers) found preceding the primer region.
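As a rough Python sketch, the barcode is the stretch of sequence before the primer match, written into the header next to the other annotation fields. The helper names, the field order, and the "|" and "=" delimiters are assumptions for illustration rather than a description of pRESTO's internals.

```python
def extract_barcode(seq, primer_start):
    """Return the bases preceding the primer match at seq[primer_start:]."""
    return seq[:primer_start]


def annotate_header(header, primer_name, barcode=None, orient="F",
                    delim=("|", "=", ",")):
    """Append SEQORIENT, PRIMER and (optionally) BARCODE fields to a header."""
    block_sep, field_sep, _ = delim
    fields = [("SEQORIENT", orient), ("PRIMER", primer_name)]
    if barcode is not None:
        # BARCODE is only written when barcode extraction was requested.
        fields.append(("BARCODE", barcode))
    return header + "".join(
        f"{block_sep}{name}{field_sep}{value}" for name, value in fields
    )


# Hypothetical read: 3-base barcode, 4-base primer named P1, then the insert.
read = "AAA" + "CCGG" + "TTTT"
barcode = extract_barcode(read, 3)
with_barcode = annotate_header("SEQ1", "P1", barcode=barcode)
without_barcode = annotate_header("SEQ1", "P1")
```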

--maxerror <max_error>

Maximum allowable error rate.

--maxlen <max_len>

Length of the sequence window to scan for primers.

--skiprc

Specify to prevent checking of sample reverse complement sequences.

--gap <gap_penalty>

A list of two positive values defining the gap open and gap extension penalties for aligning the primers. Note: the error rate is calculated as the percentage of mismatches from the primer sequence with gap penalties reducing the match count accordingly; this may lead to error rates that differ from strict mismatch percentage when gaps are present in the alignment.

MaskPrimers score

Find primer matches by scoring primers at a fixed position.

usage: MaskPrimers score [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
                             [--fasta] [--failed] [--log LOG_FILE]
                             [--delim DELIMITER DELIMITER DELIMITER]
                             [--nproc NPROC] [--outdir OUT_DIR]
                             [--outname OUT_NAME] -p PRIMER_FILE
                             [--mode {cut,mask,trim,tag}] [--revpr] [--barcode]
                             [--maxerror MAX_ERROR] [--start START]
--version

show program’s version number and exit

-h, --help

show this help message and exit

-s <seq_files>

A list of FASTA/FASTQ files containing sequences to process.

--fasta

Specify to force output as FASTA rather than FASTQ.

--failed

If specified, create files containing records that fail processing.

--log <log_file>

Specify to write verbose logging to a file. May not be specified with multiple input files.

--delim <delimiter>

A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.

--nproc <nproc>

The number of simultaneous computational processes to execute (CPU cores to utilize).

--outdir <out_dir>

Specify to change the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

-p <primer_file>

A FASTA or REGEX file containing primer sequences.

--mode {cut,mask,trim,tag}

Specifies the action to take with the primer sequence. The “cut” mode will remove both the primer region and the preceding sequence. The “mask” mode will replace the primer region with Ns and remove the preceding sequence. The “trim” mode will remove the region preceding the primer, but leave the primer region intact. The “tag” mode will leave the input sequence unmodified.

--revpr

Specify to match the tail-end of the sequence against the reverse complement of the primers. This also reverses the behavior of the --maxlen argument, such that the search window begins at the tail-end of the sequence.

--barcode

Specify to encode sequences with barcode sequences (unique molecular identifiers) found preceding the primer region.

--maxerror <max_error>

Maximum allowable error rate.

--start <start>

The starting position of the primer.
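Scoring at a fixed position can be sketched in Python as a direct comparison of each primer with the read at a known offset. This is an illustration under stated assumptions, not pRESTO's algorithm: score_primer and best_primer are hypothetical helpers, start is treated as a 0-based offset, characters are compared exactly (no ambiguous nucleotide handling), the error rate is taken as mismatches divided by primer length, and the max_error value of 0.2 is chosen only for the example.

```python
def score_primer(seq, primer, start=0):
    """Return the mismatch rate of primer against seq at a fixed offset."""
    target = seq[start:start + len(primer)]
    if len(target) < len(primer):
        # The read is too short to contain the primer at this offset.
        return 1.0
    mismatches = sum(a != b for a, b in zip(target, primer))
    return mismatches / len(primer)


def best_primer(seq, primers, start=0, max_error=0.2):
    """Return (name, error) for the lowest-error primer, or None if none pass."""
    best = None
    for name, primer in primers.items():
        error = score_primer(seq, primer, start)
        if error <= max_error and (best is None or error < best[1]):
            best = (name, error)
    return best


# Hypothetical read and primers; the primer is expected at offset 3.
read = "AAACCGGTTTT"
primers = {"P1": "CCGG", "P2": "CCTT"}
match = best_primer(read, primers, start=3)
no_match = best_primer(read, {"P2": "CCTT"}, start=3)
```

Here P1 matches exactly, while P2 has two mismatches out of four bases and exceeds the threshold, so a read carrying only P2-like sequence would fall into the failed output in this sketch.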