slaMEM

slaMEM is a tool for efficiently retrieving MEMs (Maximal Exact Matches) between a reference genome sequence and one or more query sequences, similar to other MEM-finding software tools.
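To make the definition concrete, here is a brute-force sketch of what a MEM is: an exact match between reference and query that cannot be extended to the left or to the right. The function name `find_mems`, the 0-based coordinates, and the tuple layout are assumptions made for illustration only; this is not how slaMEM computes MEMs, and its quadratic running time makes it suitable only for tiny inputs.

```python
def find_mems(ref: str, query: str, min_len: int = 20):
    """Return (ref_pos, query_pos, length) triples for all MEMs (0-based).

    Illustrative brute force only; real tools use an index instead.
    """
    mems = []
    for i in range(len(ref)):
        for j in range(len(query)):
            if ref[i] != query[j]:
                continue
            # Left-maximal: skip if the match could be extended one
            # character to the left in both sequences.
            if i > 0 and j > 0 and ref[i - 1] == query[j - 1]:
                continue
            # Extend to the right while characters agree (right-maximal).
            k = 0
            while (i + k < len(ref) and j + k < len(query)
                   and ref[i + k] == query[j + k]):
                k += 1
            if k >= min_len:
                mems.append((i, j, k))
    return mems


if __name__ == "__main__":
    print(find_mems("ACGTACGT", "TACGA", min_len=3))
```

With `min_len=3`, the toy input yields two matches: `ACG` at reference position 0 and `TACG` at reference position 3.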

slaMEM relies on an FM-Index together with a new data structure called SSILCP (Sampled Search Intervals from Longest Common Prefixes) to store information about parent intervals in a time- and space-efficient way.
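For readers unfamiliar with the FM-Index, the sketch below shows the textbook backward search it supports: a pattern is matched from its last character to its first, narrowing an interval of suffix-array rows at each step. This is a generic illustration, not slaMEM's code. The naive suffix sorting, the unsampled occurrence table, and the names `build_fm_index` and `backward_search` are all assumptions, and the SSILCP structure that slaMEM uses for parent intervals is not modelled here.

```python
def build_fm_index(text: str):
    """Build a toy FM-Index (suffix array, C table, occurrence table).

    Assumes the text does not contain '$' and that every character
    sorts after '$' (true for DNA letters).
    """
    text += "$"  # unique sentinel, smallest character
    sa = sorted(range(len(text)), key=lambda i: text[i:])  # naive suffix sort
    bwt = "".join(text[i - 1] for i in sa)  # text[-1] is '$' when i == 0
    alphabet = sorted(set(text))
    # C[c]: number of characters in the text strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return sa, C, occ


def backward_search(pattern: str, sa, C, occ):
    """Return the sorted 0-based text positions where pattern occurs."""
    lo, hi = 0, len(sa)  # half-open interval [lo, hi) of suffix-array rows
    for c in reversed(pattern):
        if c not in C:
            return []
        lo = C[c] + occ[c][lo]
        hi = C[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(sa[k] for k in range(lo, hi))


if __name__ == "__main__":
    index = build_fm_index("ACGTACGT")
    print(backward_search("ACG", *index))
```

Each backward step costs two table lookups regardless of the text length, which is why the interval can be narrowed quickly once the index is built.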

slaMEM also includes a useful feature to display the locations of the found MEMs, generating images like the one below.

[Image: MEMs of 57 E. coli strains]

Reference

If you use slaMEM, please cite:

Fernandes, Francisco and Ana T. Freitas. slaMEM: efficient retrieval of maximal exact matches using a sampled LCP array. Bioinformatics 30.4 (2014): 464-471.

Manual

Install

make

Usage

./slaMEM (<options>) <reference_file> <query_file(s)>
Options:
  • -mem : find MEMs: any number of occurrences in both ref and query (default)
  • -mam : find MAMs: unique in ref but any number in query
  • -l : minimum match length (default=20)
  • -o : output file name (default="*-mems.txt")
  • -b : process both forward and reverse strands
  • -n : discard 'N' characters in the sequences
  • -m : minimum sequence size (e.g. to ignore small scaffolds)
  • -r : load only the reference(s) whose name(s) contain(s) this string
Extra:
  • -v : generate MEMs map image from this MEMs file
Example:
./slaMEM -b -l 10 ./ref.fna ./query.fna
./slaMEM -v ./ref-mems.txt ./ref.fna ./query.fna
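When slaMEM is called from a pipeline, the first example command can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of slaMEM. It uses only the `-b` and `-l` flags exactly as they appear in the example, and it assumes the binary was built with `make` and sits at `./slaMEM`.

```python
def build_slamem_command(reference, queries, min_length=None,
                         both_strands=False, binary="./slaMEM"):
    """Assemble an argument list for a slaMEM run (hypothetical helper).

    Options come first, then the reference file, then the query file(s),
    following the usage line of the manual.
    """
    cmd = [binary]
    if both_strands:
        cmd.append("-b")  # process both forward and reverse strands
    if min_length is not None:
        cmd += ["-l", str(min_length)]  # minimum match length
    cmd.append(reference)
    cmd.extend(queries)
    return cmd


if __name__ == "__main__":
    cmd = build_slamem_command("./ref.fna", ["./query.fna"],
                               min_length=10, both_strands=True)
    print(" ".join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.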