
Issue: FASTQC Process Fails with Exit Code 140 in nf-core/sarek Pipeline Using Singularity #1683

Open
SirAymane opened this issue Oct 10, 2024 · 0 comments
Labels: bug (Something isn't working)

SirAymane commented Oct 10, 2024

Description of the bug

The nf-core/sarek pipeline is consistently failing during the FASTQC process with exit code 140 when executed on a Slurm-based HPC cluster using Singularity. The same failure occurs when Docker is used as the container runtime.
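
For reference, a minimal sketch of the general POSIX-shell convention for exit statuses above 128 (this is a generic convention, not something taken from the sarek logs): a status above 128 usually means the process was terminated by a signal, status − 128.

# Decode an exit status above 128 into a signal number (general shell convention, illustrative only).
status=140
echo "terminated by signal $((status - 128))"   # prints 12 (SIGUSR2 on typical x86-64 Linux)
kill -l 12                                      # prints the signal name for that number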


Additional context and observations:

  • The error occurs during the FASTQC process in both Singularity and Docker executions.
  • Other pipelines such as nf-core/rnaseq run without issues in the same environment.
  • Running the pipeline with root privileges also fails.
  • The warning Skipping mount /usr/local/var/singularity/mnt/session/etc/resolv.conf appears, but it may not be directly related to the issue.
  • The Java I/O error java.io.IOException: Bad file descriptor suggests possible file-handling issues within the container.
  • The error persists even when Singularity is correctly configured and verified with other workflows (a standalone sanity check of the container runtime is sketched after this list).
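
For reference, a minimal standalone sanity check of the kind mentioned above. This is a sketch only: the biocontainers FastQC image tag is an assumption, not the exact image pulled by sarek 3.2.3, and the file names are the symlinked names shown in the failing task.

# Hypothetical check of Singularity + FastQC outside of Nextflow (image tag is an assumption).
singularity pull fastqc_test.sif docker://quay.io/biocontainers/fastqc:0.11.9--0
singularity exec fastqc_test.sif fastqc --version
singularity exec --bind "$PWD" fastqc_test.sif fastqc --quiet --threads 2 Sample-1_1.gz Sample-1_2.gz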

Request for assistance:

I am seeking help to resolve this issue with the FASTQC process in the nf-core/sarek pipeline. Any guidance on addressing the exit code 140 error would be greatly appreciated, particularly:

  • Is this a known issue with FASTQC in nf-core/sarek?
  • Could the java.io.IOException: Bad file descriptor indicate an underlying issue in the pipeline or the environment?
  • Are there specific settings or configurations required for running this pipeline with Singularity on Slurm? (A sketch of the kind of per-process override such a configuration might contain follows this list.)
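
For illustration, a minimal sketch of a per-process override that a custom -c config could contain. The withName selector is copied from the failed task name in the log (NFCORE_SAREK:SAREK:FASTQC); the cpus/memory/time values are assumptions for illustration, not recommendations from the sarek documentation.

process {
    // Selector copied from the failing task name; resource values are illustrative only.
    withName: 'NFCORE_SAREK:SAREK:FASTQC' {
        cpus   = 4
        memory = 16.GB
        time   = 8.h
    }
}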

Also posted in the nf-core Slack channel: https://nfcore.slack.com/archives/CE6SDBX2A/p1728540986179659

The command used, terminal output, and relevant files are included below.

Command used and terminal output


Command Executed:

nextflow run nf-core/sarek -profile singularity --input samplesheet.csv --genome hg38 -r 3.2.3 -c nextflow.conf --outdir results_output --wes --known_indels Mills_and_1000G_gold_standard.indels.hg38.vcf.gz --tools mutect2,snpeff --resume
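
For context, a minimal sketch of the samplesheet passed via --input, following the sarek FASTQ-input column layout (patient, sample, lane, fastq_1, fastq_2); the patient name, lane value, and paths are placeholders, not the real data:

patient,sample,lane,fastq_1,fastq_2
Patient-1,Sample-1,lane_1,/path/to/DNA_Sample-1.R1.fastq.gz,/path/to/DNA_Sample-1.R2.fastq.gz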

Error Output:

Failed Process: NFCORE_SAREK:SAREK:FASTQC (Sample-1)

Process command executed (FASTQC module script):

printf "%s %s\n" DNA_Sample-1.R1.fastq.gz Sample-1_1.gz DNA_Sample-1.R2.fastq.gz Sample-1_2.gz | while read old_name new_name; do
    [ -f "${new_name}" ] || ln -s $old_name $new_name
done
fastqc --quiet --threads 8 Sample-1_1.gz Sample-1_2.gz

cat <<-END_VERSIONS > versions.yml
"NFCORE_SAREK:SAREK:FASTQC":
    fastqc: $( fastqc --version | sed -e "s/FastQC v//g" )
END_VERSIONS

Exit Code: 140

Command Error:
WARNING: Skipping mount /usr/local/var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container
java.io.IOException: Bad file descriptor
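
For completeness, the failing task can also be re-run by hand from its Nextflow work directory to separate a container problem from a scheduler problem. The work-directory path below is a placeholder; the .command.* and .exitcode files are the standard ones Nextflow writes for every task.

cd /path/to/work/ab/cdef1234567890   # task hash directory reported by Nextflow (placeholder path)
cat .exitcode                        # exit status recorded for the task (140 here)
bash .command.run                    # re-run the task exactly as submitted, including the Singularity wrapper
bash .command.sh                     # or run the bare process script without the container wrapper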

Relevant files



The following is the script used to launch the job (paths and personal information generalized for privacy):

#!/bin/bash
#SBATCH --job-name=CARIS_singularity # Job name
#SBATCH -p long
#SBATCH --mail-type=END,FAIL          # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --ntasks=1                    # Run on a single CPU
#SBATCH --mem=10G                     # Job memory request
#SBATCH --cpus-per-task=1
#SBATCH --output=%x_%j_nobed.log   # Standard output and error log
#SBATCH --error=%x_%j_nobed.err

samples="samplesheet.csv"
sarekoutput="results_$SLURM_JOB_NAME"
logdir="/path/to/log/"
logfile="$SLURM_JOB_NAME.txt"
pon="/path/to/required/1000g_pon.hg38.vcf.gz"
pon_tbi="/path/to/required/1000g_pon.hg38.vcf.gz.tbi"
known_indels="/path/to/required/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz"
other=" --germline_resource /path/to/required/af-only-gnomad.raw.sites_mod.vcf.gz --germline_resource_tbi /path/to/required/af-only-gnomad.raw.sites_mod.vcf.gz.tbi --pon $pon --pon_tbi $pon_tbi"
maxmem="256.GB"
igenomes="/path/to/required/"
max_cpu="48"
max_time="600.h"
tools="mutect2,snpeff"

cmd="nextflow run nf-core/sarek -profile singularity --input $samples --genome hg38 -r 3.2.3 -c nextflow.conf --outdir $sarekoutput --wes --known_indels $known_indels --trim_fastq --resume --tools $tools $other"

# Create cache directory if it doesn't exist
if [ ! -d "cache" ]; then
    mkdir cache
fi

# Create nextflow config file
read -r -d '' config <<- EOM
params {
  config_profile_description = 'bioinfo config'
  config_profile_contact = '$SLURM_JOB_USER $SLURM_JOB_USER@domain.org'
}

singularity {
  enabled = true
  autoMounts = true
  cacheDir ='./cache/'
}

executor {
  name = 'slurm'
  queueSize = 12
}

process {
  executor = 'slurm'
  queue  = { task.time <= 5.h && task.memory <= 10.GB ? 'short': (task.memory <= 95.GB ? 'long' : 'highmem')}
  queueSize = 12
}

params {
  max_memory = '$maxmem'
  max_cpus = $max_cpu
  max_time = '$max_time'
}
EOM

echo "$config" > nextflow.conf

# Create log file
message=$(date +"%D %T")"        "$(whoami)"     "$SLURM_JOB_NAME"       "$cmd
echo  $message >> $logdir$logfile

# Execute nextflow command
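# Note: the $cmd string assembled above uses --genome hg38 and the $other options,
# while the invocation below uses --genome GATK.GRCh38 and omits $cmd/$other.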
nextflow run nf-core/sarek -profile singularity --input $samples --genome GATK.GRCh38 -r 3.2.3 -c nextflow.conf --outdir $sarekoutput --wes --known_indels $known_indels --trim_fastq --resume --tools mutect2,snpeff

System information



  • Nextflow version: 24.04.3
  • Hardware: HPC
  • Executor: Slurm
  • Container engine: Singularity 3.11.0, Docker 24.0.7
  • OS: Ubuntu 22.04 (Jammy Jellyfish)
  • Version of nf-core/sarek: 3.2.3

I have attached the final stdout message and the stderr. The output logs show the message described earlier in the screenshot, and in the .err I am seeing a "missing txt file" error which I do not recognize.

The job ran for 24 hours with a couple of failed tasks, and it still produced about 2 TB of output in the work directories.

Script Location: The entire script and more details are available on GitHub at this issue link. (Two screenshot images were attached to the original issue.)
