http://sels.tecnico.ulisboa.pt/ep/
Entropic profiles of DNA sequences are local information plots of the relative over and under-expression of motifs per position. They are calculated based on Chaos Game Representation (CGR) using a recently proposed fractal kernel and Parzen's window density estimation method. They allow the visualization of motif densities for two different parameters: resolution L and smoothing parameter Φ. This method detects biological significant regions of DNA. An important simplification allows its calculation using segment counts, explored in this application through suffix trees.
Vinga, S. and Almeida, J.S.: Local Renyi entropic profiles of DNA sequences. BMC Bioinformatics 2007, 8:393. Fernandes F., Vinga S., Freitas A.T., Almeida J.S.: Entropic Profiler - detection of conservation in genomes using information theory. BMC Research Notes 2009, 2:72.
ep -t<type> -f<file> -l<l> -p<phi> -i<position> -m<findmax> -w<window>
t
: sequence type, should always be 'f', which stands for "file"f
: fasta file to load, should be less than 2 GB, only the 1st sequence inside is loaded, 'N's are converted to 'A'sl
: L value (resolution length), should be higher or equal to 3 and less or equal to 10p
: Phi value (smoothing parameter), should be less or equal to 10i
: position to studym
: automatically find the value of L that maximizes the function in that position 'i', should be '0' for false or '1' for truew
: window length to study around that position 'i'x
: (optional parameter) load previously saved project (".tree" file), should be '0' for false or '1' for truey
: (optional parameter) study by motif instead of position, should be a string of DNA charactersz
: (optional parameter) get EP values for entire sequence, does not create plots, should be '0' for false or '1' for true
ep -tf -fExample1.fasta -l8 -p10 -i35840 -m0 -w100