FilterSeq.py
Filters sequences in FASTA/FASTQ files
usage: FilterSeq.py [--version] [-h] ...
- --version
show program’s version number and exit
- -h, --help
show this help message and exit
- output files:
- <command>-pass
reads that pass the filtering operation, modified accordingly, where <command> is the name of the filtering operation that was run.
- <command>-fail
raw reads failing filtering criteria, where <command> is the name of the filtering operation.
- output annotation fields:
None
FilterSeq.py length
Filters reads by length.
usage: FilterSeq.py length [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
[-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
[--outname OUT_NAME] [--log LOG_FILE] [--failed]
[--fasta] [--nproc NPROC] [-n MIN_LENGTH] [--inner]
- --version
show program’s version number and exit
- -h, --help
show this help message and exit
- -s <seq_files>
A list of FASTA/FASTQ files containing sequences to process.
- -o <out_files>
Explicit output file name(s). Note that this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, the output filename will be based on the input filename(s).
- --outdir <out_dir>
Changes the output directory to the location specified. The input file directory is used if this is not specified.
- --outname <out_name>
Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.
- --log <log_file>
Specify to write verbose logging to a file. May not be specified with multiple input files.
- --failed
If specified, create files containing records that fail processing.
- --fasta
Specify to force output as FASTA rather than FASTQ.
- --nproc <nproc>
The number of simultaneous computational processes to execute (CPU cores to utilize).
- -n <min_length>
Minimum sequence length to retain.
- --inner
If specified, exclude consecutive missing characters at either end of the sequence.
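The interaction between -n and --inner can be illustrated with a minimal Python sketch. This is not the FilterSeq.py implementation: the function name `passes_length`, the set of characters treated as missing (N, - and .), and the inclusive comparison are all assumptions made for illustration.

```python
# Hypothetical sketch of a length filter; not taken from FilterSeq.py.
MISSING = "N-."  # assumed set of missing characters


def passes_length(seq, min_length, inner=False):
    """Return True if the sequence is at least min_length characters long.

    With inner=True, runs of missing characters at either end are
    ignored before the length is measured.
    """
    if inner:
        seq = seq.strip(MISSING)
    return len(seq) >= min_length


print(passes_length("NNACGTNN", 5))              # whole read is measured
print(passes_length("NNACGTNN", 5, inner=True))  # only ACGT is measured
```

In this sketch the same read passes without --inner and fails with it, because the flanking N characters no longer contribute to the length.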
FilterSeq.py maskqual
Masks low quality positions.
usage: FilterSeq.py maskqual [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
[-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
[--outname OUT_NAME] [--log LOG_FILE] [--failed]
[--fasta] [--nproc NPROC] [-q MIN_QUAL]
- --version
show program’s version number and exit
- -h, --help
show this help message and exit
- -s <seq_files>
A list of FASTA/FASTQ files containing sequences to process.
- -o <out_files>
Explicit output file name(s). Note that this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, the output filename will be based on the input filename(s).
- --outdir <out_dir>
Changes the output directory to the location specified. The input file directory is used if this is not specified.
- --outname <out_name>
Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.
- --log <log_file>
Specify to write verbose logging to a file. May not be specified with multiple input files.
- --failed
If specified, create files containing records that fail processing.
- --fasta
Specify to force output as FASTA rather than FASTQ.
- --nproc <nproc>
The number of simultaneous computational processes to execute (CPU cores to utilize).
- -q <min_qual>
Quality score threshold.
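As a rough illustration of what masking by quality might look like, the sketch below replaces every position whose score falls below the threshold. The function name `mask_low_quality`, the use of N as the mask character, and the inclusive threshold are assumptions for illustration, not details confirmed by this reference.

```python
# Hypothetical sketch of quality masking; not taken from FilterSeq.py.
def mask_low_quality(seq, quals, min_qual, mask_char="N"):
    """Replace positions with quality below min_qual by mask_char.

    seq is a string and quals is a sequence of integer Phred scores
    of the same length.
    """
    return "".join(
        base if qual >= min_qual else mask_char
        for base, qual in zip(seq, quals)
    )


print(mask_low_quality("ACGT", [30, 10, 25, 5], 20))
```

Note that the read length is unchanged in this sketch: only the base calls at low quality positions are altered.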
FilterSeq.py missing
Filters reads by N or gap character count.
usage: FilterSeq.py missing [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
[-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
[--outname OUT_NAME] [--log LOG_FILE] [--failed]
[--fasta] [--nproc NPROC] [-n MAX_MISSING]
[--inner]
- --version
show program’s version number and exit
- -h, --help
show this help message and exit
- -s <seq_files>
A list of FASTA/FASTQ files containing sequences to process.
- -o <out_files>
Explicit output file name(s). Note that this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, the output filename will be based on the input filename(s).
- --outdir <out_dir>
Changes the output directory to the location specified. The input file directory is used if this is not specified.
- --outname <out_name>
Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.
- --log <log_file>
Specify to write verbose logging to a file. May not be specified with multiple input files.
- --failed
If specified, create files containing records that fail processing.
- --fasta
Specify to force output as FASTA rather than FASTQ.
- --nproc <nproc>
The number of simultaneous computational processes to execute (CPU cores to utilize).
- -n <max_missing>
Threshold for fraction of gap or N nucleotides.
- --inner
If specified, exclude consecutive missing characters at either end of the sequence.
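The effect of --inner on the missing character tally can be sketched as follows. This is an illustrative assumption, not the FilterSeq.py code: the names `count_missing` and `passes_missing` are hypothetical, N, - and . are assumed to be the missing characters, and the threshold is treated here as a count even though the -n description above refers to a fraction.

```python
# Hypothetical sketch of a missing-character filter; not taken from FilterSeq.py.
MISSING = "N-."  # assumed set of missing characters


def count_missing(seq, inner=False):
    """Count missing characters, optionally ignoring runs at either end."""
    if inner:
        seq = seq.strip(MISSING)
    return sum(seq.count(char) for char in MISSING)


def passes_missing(seq, max_missing, inner=False):
    """Assumption: the read passes when the count does not exceed max_missing."""
    return count_missing(seq, inner) <= max_missing


print(count_missing("NNACNGT--"))              # all missing characters
print(count_missing("NNACNGT--", inner=True))  # only the internal N
```

To work with a fraction instead, the count could be divided by the length of the (optionally trimmed) sequence before comparing it with the threshold.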
FilterSeq.py quality
Filters reads by quality score.
usage: FilterSeq.py quality [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
[-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
[--outname OUT_NAME] [--log LOG_FILE] [--failed]
[--fasta] [--nproc NPROC] [-q MIN_QUAL] [--inner]
- --version
show program’s version number and exit
- -h, --help
show this help message and exit
- -s <seq_files>
A list of FASTA/FASTQ files containing sequences to process.
- -o <out_files>
Explicit output file name(s). Note that this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, the output filename will be based on the input filename(s).
- --outdir <out_dir>
Changes the output directory to the location specified. The input file directory is used if this is not specified.
- --outname <out_name>
Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.
- --log <log_file>
Specify to write verbose logging to a file. May not be specified with multiple input files.
- --failed
If specified, create files containing records that fail processing.
- --fasta
Specify to force output as FASTA rather than FASTQ.
- --nproc <nproc>
The number of simultaneous computational processes to execute (CPU cores to utilize).
- -q <min_qual>
Quality score threshold.
- --inner
If specified, exclude consecutive missing characters at either end of the sequence.
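A quality filter of this kind could be sketched as below. The reference does not state which statistic is compared with the threshold, so the use of the mean Phred score is an assumption, as are the function name `passes_quality`, the missing character set, and the inclusive comparison.

```python
# Hypothetical sketch of a read-level quality filter; not taken from FilterSeq.py.
MISSING = "N-."  # assumed set of missing characters


def passes_quality(seq, quals, min_qual, inner=False):
    """Return True if the mean quality score is at least min_qual.

    With inner=True, scores belonging to runs of missing characters
    at either end of the sequence are left out of the mean.
    """
    if inner:
        start = len(seq) - len(seq.lstrip(MISSING))
        end = len(seq.rstrip(MISSING))
        quals = quals[start:end]
    if not quals:
        return False  # nothing left to score
    return sum(quals) / len(quals) >= min_qual


seq, quals = "NNACGT", [0, 0, 30, 30, 30, 30]
print(passes_quality(seq, quals, 25))              # leading N scores lower the mean
print(passes_quality(seq, quals, 25, inner=True))  # leading N scores are ignored
```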
FilterSeq.py repeats
Filters reads by consecutive nucleotide repeats.
usage: FilterSeq.py repeats [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
[-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
[--outname OUT_NAME] [--log LOG_FILE] [--failed]
[--fasta] [--nproc NPROC] [-n MAX_REPEAT]
[--missing] [--inner]
- --version
show program’s version number and exit
- -h, --help
show this help message and exit
- -s <seq_files>
A list of FASTA/FASTQ files containing sequences to process.
- -o <out_files>
Explicit output file name(s). Note that this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, the output filename will be based on the input filename(s).
- --outdir <out_dir>
Changes the output directory to the location specified. The input file directory is used if this is not specified.
- --outname <out_name>
Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.
- --log <log_file>
Specify to write verbose logging to a file. May not be specified with multiple input files.
- --failed
If specified, create files containing records that fail processing.
- --fasta
Specify to force output as FASTA rather than FASTQ.
- --nproc <nproc>
The number of simultaneous computational processes to execute (CPU cores to utilize).
- -n <max_repeat>
Threshold for fraction of repeating nucleotides.
- --missing
If specified, count consecutive gap and N characters in addition to {A,C,G,T}.
- --inner
If specified, exclude consecutive missing characters at either end of the sequence.
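How --missing and --inner might change the repeat measurement is sketched below. The function name `longest_repeat`, the missing character set, and the choice to report the longest run as a length are assumptions for illustration. How that value is compared with the -n threshold is left to the caller, since the description above refers to a fraction.

```python
# Hypothetical sketch of repeat detection; not taken from FilterSeq.py.
from itertools import groupby

MISSING = "N-."  # assumed set of missing characters


def longest_repeat(seq, missing=False, inner=False):
    """Return the length of the longest run of one repeated character.

    Runs of A, C, G or T are always considered. Runs of missing
    characters are considered only when missing=True. With inner=True,
    missing characters at either end are removed first.
    """
    if inner:
        seq = seq.strip(MISSING)
    allowed = "ACGT" + (MISSING if missing else "")
    runs = [len(list(group)) for char, group in groupby(seq) if char in allowed]
    return max(runs, default=0)


print(longest_repeat("ACGGGGTNNNNNNA"))                # longest nucleotide run
print(longest_repeat("ACGGGGTNNNNNNA", missing=True))  # N run is now counted
```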
FilterSeq.py trimqual
Trims sequences by quality score decay.
usage: FilterSeq.py trimqual [--version] [-h] -s SEQ_FILES [SEQ_FILES ...]
[-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
[--outname OUT_NAME] [--log LOG_FILE] [--failed]
[--fasta] [--nproc NPROC] [-q MIN_QUAL]
[--win WINDOW] [--reverse]
- --version
show program’s version number and exit
- -h, --help
show this help message and exit
- -s <seq_files>
A list of FASTA/FASTQ files containing sequences to process.
- -o <out_files>
Explicit output file name(s). Note that this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, the output filename will be based on the input filename(s).
- --outdir <out_dir>
Changes the output directory to the location specified. The input file directory is used if this is not specified.
- --outname <out_name>
Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.
- --log <log_file>
Specify to write verbose logging to a file. May not be specified with multiple input files.
- --failed
If specified, create files containing records that fail processing.
- --fasta
Specify to force output as FASTA rather than FASTQ.
- --nproc <nproc>
The number of simultaneous computational processes to execute (CPU cores to utilize).
- -q <min_qual>
Quality score threshold.
- --win <window>
Nucleotide window size for moving average calculation.
- --reverse
Specify to trim the head of the sequence rather than the tail.
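One plausible reading of the -q, --win and --reverse options is sketched below: a window slides along the read one position at a time, and the read is cut at the start of the first window whose mean quality drops below the threshold. This is an assumption about the behavior, not the FilterSeq.py algorithm, and the function name `trim_by_quality` and the default window size are hypothetical.

```python
# Hypothetical sketch of moving-average quality trimming; not taken from FilterSeq.py.
def trim_by_quality(seq, quals, min_qual, window=10, reverse=False):
    """Trim a read where the windowed mean quality falls below min_qual.

    By default the tail is trimmed. With reverse=True the read is
    scanned from the other end, so the head is trimmed instead.
    """
    if reverse:
        seq, quals = seq[::-1], quals[::-1]
    n = len(quals)
    end = n
    # Slide the window one position at a time from the retained end.
    for start in range(max(n - window, 0) + 1):
        chunk = quals[start:start + window]
        if chunk and sum(chunk) / len(chunk) < min_qual:
            end = start
            break
    seq, quals = seq[:end], quals[:end]
    if reverse:
        seq, quals = seq[::-1], quals[::-1]
    return seq, quals


print(trim_by_quality("AAAACCCC", [30] * 4 + [10] * 4, 20, window=2))
print(trim_by_quality("AAAACCCC", [10] * 4 + [30] * 4, 20, window=2, reverse=True))
```

A larger window smooths over isolated low quality positions, while a smaller one cuts closer to the first drop in quality.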