The update_blastdb.pl script is a Perl utility that automates fetching, checking, and decompressing preformatted databases from the NCBI FTP site.

1. List Available Databases
Common databases include nr (non-redundant protein), nt (nucleotide), and refseq_rna.

2. Download and Decompress a Database
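A download-and-decompress run might look like the sketch below. The database name swissprot and the target directory ./blastdb are illustrative choices, and the snippet assumes update_blastdb.pl is on your PATH and accepts --decompress; check update_blastdb.pl --help for your BLAST+ version.

```shell
DB_NAME="swissprot"   # illustrative choice; use any name from the database listing
DB_DIR="./blastdb"    # hypothetical target directory

mkdir -p "$DB_DIR"

if command -v update_blastdb.pl >/dev/null 2>&1; then
    # The script writes into the current directory, so run it from DB_DIR.
    (cd "$DB_DIR" && update_blastdb.pl --decompress "$DB_NAME")
else
    echo "update_blastdb.pl not found on PATH" >&2
fi
```

Running the same lines again later should only transfer data if NCBI has published newer files.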
: Use passive FTP mode, which is often required to bypass strict firewalls.
: Increase the connection timeout (default is 120 seconds) if you have a slow or unstable connection.

Manual Download via FTP
This command checks if you already have the latest version in your current directory; it will only download new data if the files at NCBI have a newer timestamp.

3. Key Command Flags
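As a rough illustration of combining options, the sketch below wraps one invocation that turns on passive FTP and raises the timeout. It assumes the options are spelled --passive, --timeout, and --decompress; confirm this with update_blastdb.pl --help. The wrapper download_db and the 300-second value are hypothetical, chosen only for the example.

```shell
TIMEOUT_SECONDS=300   # illustrative value, raised from the 120-second default

# download_db NAME: fetch and unpack one database using passive FTP and a
# longer timeout. Hypothetical wrapper, not part of BLAST+.
download_db() {
    update_blastdb.pl --passive --timeout "$TIMEOUT_SECONDS" --decompress "$1"
}

if command -v update_blastdb.pl >/dev/null 2>&1; then
    download_db swissprot
else
    echo "update_blastdb.pl not found on PATH" >&2
fi
```

Keeping the options in one small function means a cron job or pipeline script can reuse the same settings for every database it refreshes.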
Downloading a BLAST database via the command line is an essential skill for bioinformatics workflows, allowing you to run local searches without the sequence length or timeout limits of the web interface. The primary tool for this task is the update_blastdb.pl script, which is bundled with the NCBI BLAST+ command-line applications.

Using the update_blastdb.pl Script
If you cannot use the Perl script, you can manually download the compressed .tar.gz files using tools like wget or curl.
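A manual fetch with curl could be sketched as follows. The base URL is NCBI's public BLAST database directory; the helper names archive_url and fetch_archive are hypothetical. The snippet also assumes that each archive has a companion .md5 file in a format md5sum -c understands, and that GNU md5sum is installed.

```shell
BASE_URL="https://ftp.ncbi.nlm.nih.gov/blast/db"   # NCBI's public BLAST database directory

# archive_url FILE: print the full URL for one archive. Hypothetical helper.
archive_url() {
    printf '%s/%s\n' "$BASE_URL" "$1"
}

# fetch_archive FILE: download the archive and its assumed .md5 companion,
# then verify the checksum. Stops at the first step that fails.
fetch_archive() {
    curl --fail --location --remote-name "$(archive_url "$1")" &&
    curl --fail --location --remote-name "$(archive_url "$1.md5")" &&
    md5sum -c "$1.md5"
}

# Print the URL that would be fetched; nothing is downloaded at this point.
archive_url swissprot.tar.gz
```

Calling fetch_archive swissprot.tar.gz would then perform the transfer. Large databases such as nt are split into numbered volumes, so each volume needs its own call.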
To include scientific names and common names in your BLAST reports, you also need the taxonomy database. You can download it using the same script.
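A minimal sketch of that taxonomy download, assuming the archive is published under the name taxdb (check the database listing to confirm) and that update_blastdb.pl is on your PATH. The function name download_taxonomy is hypothetical.

```shell
TAXDB_NAME="taxdb"   # assumed name of the taxonomy archive; confirm in the listing

# download_taxonomy: fetch and unpack the taxonomy files into the current
# directory. Hypothetical wrapper, not part of BLAST+.
download_taxonomy() {
    update_blastdb.pl --decompress "$TAXDB_NAME"
}

if command -v update_blastdb.pl >/dev/null 2>&1; then
    download_taxonomy
else
    echo "update_blastdb.pl not found on PATH" >&2
fi
```

Running this from the same directory as your sequence databases is the simplest layout, since BLAST typically looks for the taxonomy files alongside them.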
Before downloading, you can see a list of all available NCBI databases by running:

    update_blastdb.pl --showall
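The full listing is long, so it can help to filter it. The sketch below assumes the listing prints one database name per line and that update_blastdb.pl is on your PATH; filter_db_names is a hypothetical helper, not part of BLAST+.

```shell
# filter_db_names PREFIX: keep only listing lines that start with PREFIX
# (case-insensitive). Hypothetical helper, not part of BLAST+.
filter_db_names() {
    grep -i "^$1"
}

# Only contact NCBI when the script is actually installed.
if command -v update_blastdb.pl >/dev/null 2>&1; then
    update_blastdb.pl --showall | filter_db_names refseq
else
    echo "update_blastdb.pl not found on PATH" >&2
fi
```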
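After downloading, you must manually decompress and extract the files using the tar command. tar reads one archive per invocation, so loop over the files rather than passing a wildcard directly:

    for f in swissprot*.tar.gz; do tar -zxvf "$f"; done

Essential Post-Download Step: The Taxonomy Database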