Installation

We recommend running hashFrag in a conda or virtualenv environment with Python 3.10.

pip install

We recommend installing hashFrag using the following pip command:

pip install hashFrag

Clone repository

Clone the repository using the following command:

git clone https://github.com/de-Boer-Lab/hashFrag.git

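Add the source directory to your PATH. Run this from the directory that contains the cloned hashFrag folder, so that an absolute path is recorded rather than one relative to the current working directory:

export PATH="$PATH:$(pwd)/hashFrag/src"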

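(Optional) To avoid setting PATH every time you open a terminal, append the export line to your shell configuration file (e.g., ~/.bashrc). Run the following command from the directory that contains the cloned hashFrag folder, so that the absolute path to hashFrag/src is written to the file:

echo "export PATH=\"\$PATH:$(pwd)/hashFrag/src\"" >> ~/.bashrc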

Installing dependencies

If you are managing your virtual environment with Anaconda or Miniconda, you can directly install dependencies upon creation of the conda environment using the command:

conda env create -n hashFrag -f environment.yml
  • This creates a conda environment named “hashFrag”

Alternatively, you can install the dependencies listed in the requirements.txt file with the following pip command:

pip install -r requirements.txt

BLAST+ download

To install the BLAST+ suite of applications, follow the instructions in the NCBI BLAST Command Line Applications User Manual.

Directly download the BLAST+ package executables for different operating systems at the following FTP page:

https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/

Follow the instructions to extract the downloaded file and export binaries to your PATH.

Verify that the blastn and makeblastdb commands were installed successfully:

blastn -version
makeblastdb -version
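
The same check can be done from Python, which may be convenient inside the environment where hashFrag will run. This is a minimal sketch and not part of hashFrag: the helper name missing_tools is hypothetical, and it only tests whether the executables are discoverable on PATH, not which BLAST+ version they are.

```python
import shutil

# The two BLAST+ executables named in this section.
REQUIRED_TOOLS = ("blastn", "makeblastdb")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on the current PATH."""
    return [name for name in tools if shutil.which(name) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("blastn and makeblastdb are both on PATH")
```

If either name is reported as missing, revisit the step that exports the BLAST+ binaries to your PATH.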

A minor note on the use of BLAST alignment scores in hashFrag

The lightning (default) mode of hashFrag uses pairwise local alignment scores derived from the BLAST algorithm. Rather than directly using the provided alignment scores, however, a corrected version of the alignment score is calculated.

Gap scoring is designed to reflect the biological occurrence of insertions and deletions in sequences. Typically, opening a gap incurs a larger penalty (gapopen), while subsequent extension of the same gap incurs smaller penalties (gapextend). Upon encountering a gap opening event, the BLAST algorithm applies the gapextend penalty in addition to a gapopen penalty. To conform to exact local alignment scoring conventions, we adjust the BLAST scores such that only the gapopen penalty is applied to gap opening events.
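
The adjustment described above can be sketched as follows. This is an illustration under stated assumptions, not hashFrag's actual implementation: the function name corrected_score is hypothetical, the penalties are taken as positive magnitudes, and the inputs are assumed to be the raw alignment score and the number of gap openings of a single alignment (for example, the score and gapopen fields of BLAST tabular output). For a gap of length L, BLAST charges gapopen + L * gapextend, whereas the convention described here charges gapopen + (L - 1) * gapextend, so each gap opening is refunded one gapextend.

```python
def corrected_score(blast_score, n_gap_openings, gapextend):
    """Refund the gapextend penalty that BLAST adds on top of gapopen
    at every gap opening event.

    blast_score    -- raw alignment score reported by BLAST (assumed input)
    n_gap_openings -- number of gap openings in the alignment
    gapextend      -- gap extension penalty, as a positive magnitude
    """
    if n_gap_openings < 0 or gapextend < 0:
        raise ValueError("n_gap_openings and gapextend must be non-negative")
    # One gapextend was charged in excess for each gap opening.
    return blast_score + n_gap_openings * gapextend


# Hypothetical alignment: raw score 90, two gap openings, gapextend = 2.
print(corrected_score(90, 2, 2))  # 94
```

An alignment without gaps is left unchanged, since no gap opening event was penalized.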