Run Vulture on local machines using docker
The following instructions are for running Vulture on local machines using docker. The instructions are tested on the following system:
Distributor ID: Ubuntu
Description: Ubuntu 22.04.2 LTS
Release: 22.04
Codename: jammy
Install Docker
Installation methods can be found exactly at https://docs.docker.com/engine/install/ubuntu/. You can install Docker Engine in different ways, depending on your needs:
We install using the apt repository. Before you install Docker Engine for the first time on a new host machine, you need to set up the Docker repository. Afterward, you can install and update Docker from the repository.
# Set up the repository
# Update the apt package index and install packages to allow apt to use a repository over HTTPS:
sudo apt-get update
sudo apt-get install ca-certificates curl gnupg
# Add Docker’s official GPG key:
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
# Use the following command to set up the repository:
echo \
"deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
"$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
Install Docker Engine
We install the latest version of Docker Engine and containerd. The we test that your installation is working correctly.
# Update the apt package index:
sudo apt-get update
# Install Docker Engine, containerd, and Docker Compose.
# To install the latest version, run:
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
#Verify that the Docker Engine installation is successful by running the hello-world image.
sudo docker run hello-world
# This command downloads a test image and runs it in a container. When the container runs, it prints a confirmation message and exits.
Specify docker container in Nextflow config file
Open the "nextflow.config" file in the vulture/nextflow directory. This following snippet shows how to specify the docker container for the pipeline. The docker container is hosted on Docker Hub and can be pulled by Nextflow automatically.
...
dockerlocal {
docker.enabled = true
process.container = 'junyichen6/vulture:0.0.1'
docker.fixOwnership = true
docker.containerOptions = "--user root"
}
...
Specify the configuration file for an analysis
Before we start our analysis, we need to creat a configuration file for the analysis. Here is a snippet of how the "params.yaml" file looks like:
...
soloStrand: "Forward"
alignment: "STAR"
technology: "10XV3"
virus_database: "viruSITE.NCBIprokaryotes"
soloMultiMappers: "EM"
soloFeatures: "GeneFull"
inputformat: "fastq"
sampleSubfix1: "_1"
sampleSubfix2: "_2"
ref: [The full path of your reference genome direcory, e.g. $HOME/data/references]
samplepath: [The full path of your fastq samples, e.g $HOME/data/fastq]
read2urls:
- [The full path of your _2.fastq.gz file, e.g. $HOME/data/fastq/SRR12570125_2.fastq.gz]
read1urls:
- [The full path of your _1.fastq.gz file, e.g.$HOME/fastq/SRR12570125_1.fastq.gz]
reads:
- [An unique ID of your sample, e.g SRR12570125]
...
Execute the command below to start the main analysis of Vulture.
cd $HOME/Vulture/nextflow
nextflow run scvh_docker_local.nf -profile dockerlocal -params-file params.yaml --outdir=your_output_directory -with-report nextflow_report_$(date +%s).html -bg &>> nextflow_log_$(date +%s).log
A successful run will generate the following files in the output directory:
...
nextflow_report_1628188800.html
nextflow_log_1628188800.log
...
The "nextflow_report_1628188800.html" file is a report of the analysis. The "nextflow_log_1628188800.log" file is the log file of the analysis. A successful run will also generate the following files in the nextflow_log_1628188800.log:
N E X T F L O W ~ version 21.10.6
Launching `scvh_docker_local.nf` [cheeky_brown] - revision: 59a6446081
S C V H - N F P I P E L I N E
===================================
transcriptome: /mnt/d/scvh_files/vmh_genome_dir/references
reads : [SRR12570125]
outdir : /mnt/d/output/
database: : viruSITE.NCBIprokaryotes
threads : 10
ram : 128
alignment : STAR
whitelist : 3M-february-2018.txt
soloCBlen : 16
soloCBstart : 1
soloUMIstart : 17
soloUMIlen : 12
soloStrand : Forward
soloMultiMappers: EM
soloFeature : GeneFull
outSAMtype : BAM SortedByCoordinate
technology : 10XV3
pseudoBAM :
inputformat : fastq
sampleSubfix1 : _1
sampleSubfix2 : _2
[SRR12570125, /mnt/d/scvh_files/EXAMPLES/SRR12570125_1.fastq.gz, /mnt/d/scvh_files/EXAMPLES/SRR12570125_2.fastq.gz]
[88/9dd405] Submitted process > Map (1)