How to generate decycling set
java -jar decycling.jar <output file> <k> <Alphabet>
Example run:
java -jar decycling.jar decycling_5_ACGT.txt 5 ACGT
This command will generate a decycling set for k=5 over the ACGT alphabet. The output will be saved to decycling_5_ACGT.txt.
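As a sanity check on the resulting file, the sketch below tests the defining property of a decycling set: once its k-mers are removed, the remaining k-mers (linked whenever one overlaps the next by k-1 letters) contain no cycle. It is not part of decycling.jar. The function name is_decycling_set is hypothetical, the alphabet is assumed to contain distinct letters, and the check enumerates all |Alphabet|^k k-mers, so it only suits small k.

```python
from collections import deque
from itertools import product


def is_decycling_set(kmers, k, alphabet="ACGT"):
    """Return True if removing `kmers` leaves no cycle among the remaining k-mers.

    Hypothetical helper for illustration; not part of decycling.jar.
    """
    # k-mers that survive after the candidate set is removed
    alive = {"".join(p) for p in product(alphabet, repeat=k)} - set(kmers)

    # Number of surviving successors (x[1:] + c) of each surviving k-mer
    out_deg = {x: sum((x[1:] + c) in alive for c in alphabet) for x in alive}

    # Peel off k-mers with no surviving successor (Kahn-style);
    # k-mers that lie on a cycle are never peeled.
    queue = deque(x for x, d in out_deg.items() if d == 0)
    peeled = 0
    while queue:
        x = queue.popleft()
        peeled += 1
        for c in alphabet:
            pred = c + x[:-1]  # a k-mer whose successor is x
            if pred in alive:
                out_deg[pred] -= 1
                if out_deg[pred] == 0:
                    queue.append(pred)

    # Everything was peeled exactly when no cycle remains
    return peeled == len(alive)
```

For example, over the two-letter alphabet AB with k=2, the set AA, BB, AB is a decycling set, while AA, BB is not, because AB and BA still form a cycle.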
How to find additional k-mers to hit all L-long sequences
java -jar DOCKS.jar <output file> <input decycling file> <k> <L - sequence length> <Alphabet> <ILP time limit> <0/1/2/3 - random/greedyL/greedyAny/none> <x - optional with any>
Example DOCKS run:
java -jar DOCKS.jar res_5_20_ACGT_0_1.txt decycling_5_ACGT.txt 5 20 ACGT 0 1
Example DOCKSAny run:
java -jar DOCKS.jar res_5_20_ACGT_0_2.txt decycling_5_ACGT.txt 5 20 ACGT 0 2
Example DOCKSAnyX (X=125) run:
java -jar DOCKS.jar res_5_20_ACGT_0_2_125.txt decycling_5_ACGT.txt 5 20 ACGT 0 2 125
In all of the runs, the user needs to provide the decycling set computed by decycling.jar.
Example ILP run (limited to 1000s), with no DOCKS solution:
java -jar DOCKS.jar res_5_20_ACGT_1000_3.txt decycling_5_ACGT.txt 5 20 ACGT 1000 3
You may need more memory for higher values of k (e.g. k>=10); for example, add the -Xmx4096m option.
You may need to increase the thread stack size for the DFS in random mode; for example, add the -Xss515m option.
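Because DOCKS.jar takes many positional arguments, a small wrapper can help keep them in order. The sketch below only assembles the command line from the usage shown above; it does not run anything. The function name docks_command is hypothetical. The output file name pattern (res_<k>_<L>_<Alphabet>_<time limit>_<mode>[_<x>].txt) is inferred from the example runs and is not required by the tool. Restricting x to mode 2 is an assumption based on the "optional with any" note.

```python
def docks_command(decycling_file, k, L, alphabet="ACGT", ilp_time_limit=0,
                  mode=1, x=None, jvm_options=()):
    """Build the argument list for a DOCKS.jar run (hypothetical helper).

    mode: 0 = random, 1 = greedyL, 2 = greedyAny, 3 = none
    x:    optional extra parameter, assumed to apply to greedyAny only
    """
    if mode not in (0, 1, 2, 3):
        raise ValueError("mode must be 0, 1, 2 or 3")
    if x is not None and mode != 2:
        raise ValueError("x is assumed to apply only to mode 2 (greedyAny)")

    # Output name follows the pattern used in the example runs (an assumption)
    parts = ["res", k, L, alphabet, ilp_time_limit, mode]
    if x is not None:
        parts.append(x)
    output_file = "_".join(str(p) for p in parts) + ".txt"

    # JVM options such as -Xmx4096m or -Xss515m must come before -jar
    cmd = ["java", *jvm_options, "-jar", "DOCKS.jar",
           output_file, decycling_file,
           str(k), str(L), alphabet, str(ilp_time_limit), str(mode)]
    if x is not None:
        cmd.append(str(x))
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from Python's standard library, or joined with spaces to print the equivalent shell command.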
Interpreting the output
The output of both decycling.jar and DOCKS.jar should look like this:
AAAACAAA
AAAAGAAA
AAAATAAA
AAACCAAA
...
ATAACGAA
TCACCGAA
GCCTACTA
TCCTCCTA
Each line is a k-mer in the set.
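Since the output is one k-mer per line, it is easy to load and check independently. The sketch below reads such a file, validates every line, and then tests whether the set hits every L-long sequence, i.e. whether no walk of L-k+1 consecutive unhit k-mers exists. It is not part of DOCKS. The names load_kmers and hits_all are hypothetical, and the check enumerates all |Alphabet|^k k-mers, so it is meant for small k only.

```python
from itertools import product


def load_kmers(path, k, alphabet="ACGT"):
    """Read one k-mer per line, skipping blank lines (hypothetical helper)."""
    kmers = []
    with open(path) as handle:
        for line_no, line in enumerate(handle, start=1):
            kmer = line.strip()
            if not kmer:
                continue
            if len(kmer) != k or set(kmer) - set(alphabet):
                raise ValueError(
                    f"line {line_no}: {kmer!r} is not a {k}-mer over {alphabet}")
            kmers.append(kmer)
    return kmers


def hits_all(kmers, k, L, alphabet="ACGT"):
    """Return True if every L-long sequence contains a k-mer from `kmers`."""
    # k-mers outside the set; an unhit sequence uses only these
    alive = {"".join(p) for p in product(alphabet, repeat=k)} - set(kmers)

    # `current` holds the k-mers that can end a walk of t unhit k-mers (t = 1 here)
    current = set(alive)
    for _ in range(L - k):
        # Extend every walk by one letter and keep only unhit extensions
        current = {prev[1:] + c for prev in current for c in alphabet} & alive
        if not current:
            return True
    # A surviving walk of L-k+1 k-mers spells an unhit L-long sequence
    return not current
```

For the runs above, hits_all(load_kmers("res_5_20_ACGT_0_1.txt", 5), 5, 20) would be expected to return True if the file is a valid solution for k=5 and L=20.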