Using samtools sort instead of Picard WGS #1119

Open
ekiernan opened this issue Nov 8, 2023 · 1 comment

Comments

@ekiernan
Contributor

ekiernan commented Nov 8, 2023

A researcher wrote in with the following:
I am using the WholeGenomeReprocessing pipeline to process some large BAM files (100 GB+ each), and the SortSam task in particular runs slowly. I noticed the pipeline uses Picard SortSam, which is not multithreaded, rather than something like samtools sort, which can use multiple cores/threads. Is there a reason for this choice, and would it be possible to change this step of the pipeline to use a multithreaded option?

@sdwang008

Hi, thanks for posting! I'm raising this question because, for large BAM files, sorting is currently by far the most time-consuming step. With 4-8 cores, the pipeline's wall time could probably be cut in half!
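
For illustration only, here is a rough sketch of what a multithreaded coordinate sort with samtools could look like. It is not taken from the pipeline and is not a proposed patch: the helper name `samtools_sort_command`, the file names, and the default thread and memory values are all assumptions. The flags themselves (`-@` for additional threads, `-m` for memory per thread, `-T` for the temporary file prefix, `-o` for the output file) are standard `samtools sort` options.

```python
import shlex


def samtools_sort_command(input_bam, output_bam, threads=8,
                          mem_per_thread="2G", tmp_prefix=None):
    """Build an argument list for a multithreaded `samtools sort`.

    Hypothetical helper: the name and defaults are illustrative and
    would need tuning for the machine the task actually runs on.
    """
    cmd = [
        "samtools", "sort",
        "-@", str(threads),        # number of sorting/compression threads
        "-m", mem_per_thread,      # approximate memory per thread
        "-o", output_bam,          # write the sorted BAM here
    ]
    if tmp_prefix is not None:
        cmd += ["-T", tmp_prefix]  # prefix for temporary chunk files
    cmd.append(input_bam)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so this works without samtools.
    print(shlex.join(samtools_sort_command("input.bam", "sorted.bam")))
```

Total memory use is roughly `threads` times `mem_per_thread`, so both values would have to fit the task's memory request. To actually run the sort, the returned list could be passed to `subprocess.run(cmd, check=True)`.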
