hashFrag
Contents:
Installation
pip install
Clone repository
Installing dependencies
BLAST+ download
A minor note on the use of BLAST alignment scores in hashFrag
Basic usage
If you already have train/test data splits
If you want to create new splits
Example dataset
Defining homology
hashFrag syntax
hashFrag commands
Filter existing splits
Stratify test split
Creating orthogonal splits
Advanced usage
Direct module execution
Creating orthogonal data folds
hashFrag: High-Performance Computing (HPC) mode
Example: SLURM
Example: SGE
Tutorials
hashFrag tutorial: Creating orthogonal splits
A note on the selected parameters for this tutorial
Section 1 - Identifying candidate similar sequences
Section 1.1 - Processing raw blastn output file
Section 2: Filter false-positives based on a defined threshold
Section 2.1: hashFrag-pure mode
Section 3: Determine groups of homology
Section 4: Use case(s)
Creating homology-aware data splits
Creating homology-aware data folds
Further details
hashFrag tutorial: Filtering existing test set
A note on the selected parameters for this tutorial
Section 1 - Identifying candidate similar sequences
Section 1.1 - Processing the raw blastn output file
Section 2: Filter false-positives based on a defined threshold
Section 2.1: hashFrag-pure mode
Section 3: Use Case(s)
Filter test split sequences that exhibit homology with any sequences in the train split
Further details
hashFrag tutorial: stratify existing splits
A note on the selected parameters for this tutorial
Section 1 - Identifying candidate similar sequences
Section 1.1 - Processing raw blastn output file
Section 2: Use Case(s)
Stratify test split based on homology
Section 2.1 - hashFrag-pure mode
Further details