Chromap, Less than 5% barcodes can be found or corrected based on the barcode whitelist #152

Open
genecell opened this issue Jan 8, 2022 · 1 comment

genecell commented Jan 8, 2022

Hi,

I am using MAESTRO to analyze a scATAC-seq dataset downloaded from the SRA database (accession number: SRR10399252), but I ran into this error:

Output file: Result/Mapping/SRR10399252_epilepsy/fragments_pre_corrected_dedup_count.tsv
Loaded all sequences successfully in 12.35s, number of sequences: 195, number of bases: 3099922541.
Kmer size: 17, window size: 7.
Lookup table size: 393150044, occurrence table size: 444597151.
Loaded index successfully in 30.40s.
Loaded 737280 barcodes in 1.45s.
Loaded sequence batch successfully in 0.82s, number of sequences: 500000, number of bases: 8000000.
Less than 5% barcodes can be found or corrected based on the barcode whitelist.
Please check whether the barcode whitelist matches the data, e.g. length, reverse-complement. If this is a false positive warning, please run Chromap with the option --skip-barcode-check.
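
The warning above names two things to check: barcode length and reverse-complement orientation. The log reports 500000 sequences totalling 8000000 bases, i.e. 16 bases per barcode read, which can be compared against the whitelist entries. The following is a minimal sketch of such a check, not part of Chromap or MAESTRO; the function names and the file paths in the usage note are hypothetical, and it only counts exact matches (it does not attempt Chromap's barcode correction).

```python
from itertools import islice

# Complement table for reverse-complementing barcode sequences.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fastq_sequences(handle, limit=None):
    """Yield sequence lines (every 4th line, offset 1) from an open FASTQ handle."""
    sequences = islice(handle, 1, None, 4)
    if limit is not None:
        sequences = islice(sequences, limit)
    for line in sequences:
        yield line.strip().upper()


def barcode_match_report(barcodes, whitelist):
    """Count exact whitelist hits in forward and reverse-complement orientation.

    Hypothetical helper for illustration only: it reports how many barcodes
    match as-is, how many match after reverse-complementing, and how many
    have a length that no whitelist entry has.
    """
    whitelist = {b.strip().upper() for b in whitelist if b.strip()}
    whitelist_lengths = {len(b) for b in whitelist}
    total = forward = reverse = wrong_length = 0
    for bc in barcodes:
        total += 1
        if len(bc) not in whitelist_lengths:
            wrong_length += 1
        if bc in whitelist:
            forward += 1
        if reverse_complement(bc) in whitelist:
            reverse += 1
    denominator = total if total else 1  # avoid division by zero on empty input
    return {
        "total": total,
        "forward": forward,
        "reverse": reverse,
        "wrong_length": wrong_length,
        "forward_fraction": forward / denominator,
        "reverse_fraction": reverse / denominator,
    }
```

With real files (the paths here are placeholders), one could pass `read_fastq_sequences(open("barcode_reads.fastq"), limit=100000)` and `open("whitelist.txt")` to `barcode_match_report`. A high `reverse_fraction` with a low `forward_fraction` would suggest an orientation mismatch, while a high `wrong_length` count would suggest that the barcode read or the whitelist is not the one expected.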

I also tried minimap2 for mapping, but got an error there as well:

[Sat Jan  8 17:29:03 2022]
rule scatac_mergepeak:
    input: Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_all_peaks.narrowPeak
    output: Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_final_peaks.bed
    jobid: 7
    benchmark: Result/Benchmark/SRR12130207_Lega_42_PeakMerge.benchmark
    wildcards: sample=SRR12130207_Lega_42

[Sat Jan  8 17:29:04 2022]
Error in rule scatac_mergepeak:
    jobid: 7
    output: Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_final_peaks.bed
    shell:
        
            cat Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_all_peaks.narrowPeak             | sort -k1,1 -k2,2n | cut -f 1-4 > Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_cat_peaks.bed

            mergeBed -i Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_cat_peaks.bed | grep -v '_' | grep -v 'chrEBV' > Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_final_peaks.bed

            rm Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_cat_peaks.bed
            
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
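
The log does not say which command in the rule failed, and this thread does not establish the cause. One possibility worth ruling out, stated here as an assumption: `grep` exits with status 1 when it selects no lines, and under bash strict mode (`set -euo pipefail`) that makes the whole pipeline fail. That would happen if the `all_peaks.narrowPeak` file were empty, or if every peak sat on a contig whose name contains `_` or on `chrEBV`. The sketch below mimics only the two `grep -v` filters on the chromosome column (merged BED output has numeric start and end fields, so the chromosome name is the only place the patterns can match); the function names are hypothetical.

```python
def chromosomes_surviving_filter(narrowpeak_lines):
    """Return chromosome names that would pass `grep -v '_' | grep -v 'chrEBV'`.

    Illustration only: this looks at the first tab-separated column and
    does not merge intervals the way mergeBed does.
    """
    kept = set()
    for line in narrowpeak_lines:
        line = line.strip()
        if not line:
            continue
        chrom = line.split("\t")[0]
        if "_" in chrom or "chrEBV" in chrom:
            continue  # dropped by one of the two grep -v filters
        kept.add(chrom)
    return kept


def final_grep_would_select_nothing(narrowpeak_lines):
    """True if no line survives the filters, in which case a grep exits with 1."""
    return not chromosomes_surviving_filter(narrowpeak_lines)
```

Running `final_grep_would_select_nothing(open(path))` on the `all_peaks.narrowPeak` file named in the rule would show whether this explanation applies. If it returns `False`, the failure comes from somewhere else, for example from `mergeBed` itself.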

The test data provided by MAESTRO runs successfully, so I am not sure whether the problem is caused by this scATAC-seq data itself.
Thanks in advance!

Best regards,
Min

haowenz commented Feb 6, 2022

Did you use Chromap's custom format for the barcode? If so, this was just fixed here and should work in the next Chromap release.
