"Marker protein search failed!" error after execution #46

Open
alex-trist opened this issue Mar 14, 2024 · 9 comments
Labels
help wanted Extra attention is needed

Comments

@alex-trist

Describe the bug
Hi, I am using Platon version 1.7, installed through mamba. As input I am using draft assemblies obtained from bacterial whole-genome sequencing (enterobacteria, mainly Klebsiella) with Illumina. My issues are:

  1. What is the required contig length? One of the assemblies has 170 contigs; however, only 56 are analyzed, based on their size.
  2. With every assembly, the result is always "Marker protein search failed!" The error that appears in the log file is: ERROR - MAIN - diamond execution failed! diamond-error-code=-11.

Any hint on how to fix it?

Best regards.

Therefore, please provide us with at least the following information:

  • what exactly happened

"Marker protein search failed!" error after execution

  • what exact command was executed: just copy-paste the command line

platon --db ../Data_Bases/db_platon/ --prefix --output platon/ --verbose --threads 8 contigs/KP882418.fasta

  • what installation of Platon did you use: BioConda, GitHub, Pip

BioConda (mamba)

  • which version of Platon was used

1.7

@alex-trist alex-trist added the bug Something isn't working label Mar 14, 2024
@TranNhatTan14

Hi @alex-trist, did you find a solution for this problem?

@alex-trist
Author

> Hi @alex-trist, did you find a solution for this problem?

Hi @TranNhatTan14 , no, not yet.

I did some of the analysis with MOB, but I would like to try Platon.

Any suggestions?

@oschwengers
Owner

Hi and thanks for reaching out. Let me start with the question about contig lengths:
Platon detects and evaluates so-called marker proteins, from which a so-called replicon distribution score is computed. Because short contigs rarely encode any CDS, and therefore no marker proteins, they cannot be used by Platon's approach and are skipped up front. However, they also make up only a tiny fraction of potential plasmids, so in most cases this should be negligible.
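
A rough sketch of such a length filter is given here for illustration only; it is not Platon's actual code. The function name and the 1,000 bp / 500,000 bp cutoffs are assumptions, chosen only because they are consistent with the log posted further down in this thread, where an 866 bp contig is excluded as "too short" and a 546,324 bp contig as "too long".

```python
from typing import Dict, List, Tuple

# Assumed cutoffs (hypothetical, not taken from Platon's source code).
MIN_CONTIG_LENGTH = 1_000
MAX_CONTIG_LENGTH = 500_000


def filter_contigs_by_length(
    contigs: Dict[str, int],
    min_length: int = MIN_CONTIG_LENGTH,
    max_length: int = MAX_CONTIG_LENGTH,
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Split contigs into kept IDs and (ID, reason) pairs for excluded ones."""
    kept: List[str] = []
    excluded: List[Tuple[str, str]] = []
    for contig_id, length in contigs.items():
        if length < min_length:
            excluded.append((contig_id, "too short"))
        elif length > max_length:
            excluded.append((contig_id, "too long"))
        else:
            kept.append(contig_id)
    return kept, excluded


if __name__ == "__main__":
    example = {
        "NODE_2_length_546324_cov_286.703279": 546324,
        "NODE_X_length_50000": 50000,  # made-up contig, for illustration only
        "NODE_25_length_866_cov_1.737643": 866,
    }
    kept, excluded = filter_contigs_by_length(example)
    print("kept:", kept)
    for contig_id, reason in excluded:
        print(f"exclude contig: {reason}: id={contig_id}")
```

Under these assumed cutoffs, an assembly with many very small contigs would see most of them dropped before any protein search runs, which would explain why only a subset of contigs is analyzed.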

Regarding the Diamond error: this seems to be a Diamond-related bug. Very often, it is caused by too little memory. Could you try running Platon on a machine with ~16 GB of memory?
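
As a side note on reading the error code: assuming the logged diamond-error-code is the child process's return code as reported by Python's subprocess module (an assumption about Platon's internals, not confirmed in this thread), a negative value means the process was terminated by a signal. The helper below is a hypothetical sketch for decoding such a value; on Linux, signal 11 is SIGSEGV (a segmentation fault), whereas a kill by the kernel's out-of-memory handler usually shows up as SIGKILL (9).

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe a subprocess return code, decoding negative values as signals."""
    if code == 0:
        return "success"
    if code > 0:
        return f"exited with status {code}"
    # Negative return codes from subprocess mean "terminated by signal -code".
    try:
        name = signal.Signals(-code).name
    except ValueError:
        name = "unknown signal"
    return f"terminated by signal {-code} ({name})"


if __name__ == "__main__":
    print(describe_exit_code(-11))
```

If this reading is right, the code hints at a crash inside Diamond itself, so adding memory alone may not resolve it.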

@alex-trist
Author

Thanks! @oschwengers

Indeed, I ran Platon on a 16 GB machine; however, I've experienced memory-related issues with DIAMOND in the past, so I'll try a more powerful machine and get back to you.

Regards.

@1073501616

Hi @oschwengers! I have run into the same problem. Do you have any other solution?

@Longyulin22

Hi @oschwengers! I have run into the same problem. Do you have any other solution?
My log:
2024-05-08 15:48:34,121 - INFO - MAIN - version 1.7
2024-05-08 15:48:34,121 - INFO - MAIN - command line: /ifs1/User/longyulin/mambaforge-pypy3/envs/platon/bin/platon --db /ifs1/User/longyulin/mambaforge-pypy3/envs/platon/db --output 120 -v -t 24 /ifs1/User/longyulin/data/seqkit-m300-g/120.fasta
2024-05-08 15:48:34,121 - INFO - CONFIG - threads=24
2024-05-08 15:48:34,121 - INFO - CONFIG - verbose=True
2024-05-08 15:48:34,121 - DEBUG - CONFIG - test parameter db: db_tmp=/ifs1/User/longyulin/mambaforge-pypy3/envs/platon/db
2024-05-08 15:48:34,121 - INFO - CONFIG - database detected: type=parameter, path=/ifs1/User/longyulin/mambaforge-pypy3/envs/platon/db
2024-05-08 15:48:34,121 - INFO - CONFIG - genome-path=/ifs1/User/longyulin/data/seqkit-m300-g/120.fasta
2024-05-08 15:48:34,122 - INFO - CONFIG - tmp-path=/tmp/tmpzeir3tg3
2024-05-08 15:48:34,122 - INFO - CONFIG - output-path=/ifs1/User/longyulin/mambaforge-pypy3/envs/platon/120
2024-05-08 15:48:34,122 - INFO - CONFIG - mode=accuracy
2024-05-08 15:48:34,122 - INFO - CONFIG - characterize=False
2024-05-08 15:48:34,122 - INFO - CONFIG - metagenome=False
2024-05-08 15:48:34,125 - INFO - UTILS - dependency check: tool=prodigal, version=v2.6.3
2024-05-08 15:48:34,164 - INFO - UTILS - dependency check: tool=diamond, version=v2.1.9
2024-05-08 15:48:34,316 - INFO - UTILS - dependency check: tool=blastn, version=v2.15.0
2024-05-08 15:48:34,319 - INFO - UTILS - dependency check: tool=hmmsearch, version=v3.4.0
2024-05-08 15:48:34,323 - INFO - UTILS - dependency check: tool=nucmer, version=v4.0.0
2024-05-08 15:48:34,328 - INFO - UTILS - dependency check: tool=cmscan, version=v1.1.5
2024-05-08 15:48:34,345 - INFO - MAIN - exclude contig: too long: id=NODE_1_length_1550057_cov_277.070263, length=1550057
2024-05-08 15:48:34,349 - INFO - MAIN - exclude contig: too long: id=NODE_2_length_546324_cov_286.703279, length=546324
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_25_length_866_cov_1.737643, length=866
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_26_length_856_cov_1.206675, length=856
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_27_length_829_cov_1.444149, length=829
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_28_length_797_cov_914.684722, length=797
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_29_length_746_cov_2.539611, length=746
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_30_length_615_cov_0.912639, length=615
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_31_length_542_cov_1.208602, length=542
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_32_length_540_cov_1.114471, length=540
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_33_length_524_cov_2071.434004, length=524
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_34_length_516_cov_0.997722, length=516
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_35_length_491_cov_0.881643, length=491
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_36_length_488_cov_622.467153, length=488
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_37_length_477_cov_0.922500, length=477
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_38_length_475_cov_1.261307, length=475
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_39_length_472_cov_705.840506, length=472
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_40_length_467_cov_897.476923, length=467
2024-05-08 15:48:34,361 - INFO - MAIN - exclude contig: too short: id=NODE_41_length_431_cov_1.483051, length=431
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_42_length_412_cov_1.023881, length=412
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_43_length_410_cov_290.438438, length=410
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_44_length_402_cov_1934.320000, length=402
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_45_length_389_cov_302.602564, length=389
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_46_length_382_cov_0.865574, length=382
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_47_length_369_cov_0.660959, length=369
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_48_length_363_cov_0.583916, length=363
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_49_length_358_cov_0.779359, length=358
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_50_length_357_cov_1.042857, length=357
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_51_length_357_cov_1.042857, length=357
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_52_length_355_cov_1.435252, length=355
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_53_length_355_cov_1.255396, length=355
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_54_length_351_cov_1.065693, length=351
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_55_length_351_cov_0.791971, length=351
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_56_length_346_cov_1.085502, length=346
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_57_length_335_cov_0.751938, length=335
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_58_length_333_cov_0.968750, length=333
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_59_length_331_cov_0.763780, length=331
2024-05-08 15:48:34,362 - INFO - MAIN - exclude contig: too short: id=NODE_60_length_330_cov_0.837945, length=330
2024-05-08 15:48:34,363 - INFO - MAIN - exclude contig: too short: id=NODE_61_length_327_cov_0.812000, length=327
2024-05-08 15:48:34,363 - INFO - MAIN - exclude contig: too short: id=NODE_62_length_325_cov_14.423387, length=325
2024-05-08 15:48:34,363 - INFO - MAIN - exclude contig: too short: id=NODE_63_length_317_cov_1.300000, length=317
2024-05-08 15:48:34,363 - INFO - MAIN - exclude contig: too short: id=NODE_64_length_314_cov_1.232068, length=314
2024-05-08 15:48:34,363 - INFO - MAIN - exclude contig: too short: id=NODE_65_length_312_cov_1.076596, length=312
2024-05-08 15:48:34,363 - INFO - MAIN - exclude contig: too short: id=NODE_66_length_307_cov_1.386957, length=307
2024-05-08 15:48:34,363 - INFO - MAIN - exclude contig: too short: id=NODE_67_length_305_cov_204.105263, length=305
2024-05-08 15:48:34,363 - INFO - MAIN - exclude contig: too short: id=NODE_68_length_305_cov_1.811404, length=305
2024-05-08 15:48:34,363 - INFO - MAIN - exclude contig: too short: id=NODE_69_length_304_cov_1.092511, length=304
2024-05-08 15:48:34,363 - INFO - MAIN - length contig filter: # input=69, # discarded=47, # remaining=22
2024-05-08 15:48:41,802 - INFO - MAIN - ORF detection: # ORFs=2434
2024-05-08 15:48:41,802 - INFO - MAIN - ORF contig filter disabled! # passed contigs=22
2024-05-08 15:48:55,491 - ERROR - MAIN - diamond execution failed! diamond-error-code=-11
2024-05-08 15:48:55,491 - DEBUG - MAIN - diamond execution: cmd=['diamond', 'blastp', '--db', '/ifs1/User/longyulin/mambaforge-pypy3/envs/platon/db/mps.dmnd', '--query', '/tmp/tmpzeir3tg3/proteins.faa', '--out', '/tmp/tmpzeir3tg3/diamond.tsv', '--max-target-seqs', '1', '--id', '90', '--query-cover', '80', '--subject-cover', '80', '--threads', '24', '--tmpdir', '/tmp/tmpzeir3tg3'], stdout='', stderr='diamond v2.1.9.163 (C) Max Planck Society for the Advancement of Science, Benjamin Buchfink, University of Tuebingen
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

#CPU threads: 24
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: /tmp/tmpzeir3tg3
#Target sequences to report alignments for: 1
Opening the database... [0.266s]
Database: /ifs1/User/longyulin/mambaforge-pypy3/envs/platon/db/mps.dmnd (type: Diamond database, sequences: 4847438, letters: 1549533412)
Block size = 2000000000
Opening the input file... [0.001s]
Opening the output file... [0s]
Loading query sequences... [0.009s]
Length sorting queries... [0.002s]
Masking queries... [0.008s]
Building query seed set... [0.093s]
Algorithm: Query-indexed
Building query histograms... [0.008s]
Seeking in database... [0s]
Loading reference sequences... [5.432s]
Length sorting reference... [2.083s]
Initializing temporary storage... [0s]
Building reference histograms... [1.465s]
Allocating buffers... [0s]
Processing query block 1, reference block 1/1, shape 1/2.
Building reference seed array... [0.977s]
Building query seed array... [0.009s]
Computing hash join... [0.093s]
Searching alignments... [0.659s]
Deallocating memory... [0s]
Processing query block 1, reference block 1/1, shape 2/2.
Building reference seed array... [0.921s]
Building query seed array... [0.007s]
Computing hash join... [0.086s]
Searching alignments... [0.628s]
Deallocating memory... [0s]
Deallocating buffers... [0.32s]
Clearing query masking... [0s]
Computing alignments... Loading trace points... [0.186s]
Sorting trace points... [0.028s]
Computing alignments... '

@jpaganini

Hi,

I was running into the same issue, even when requesting 20 GB of memory for the run. For me, the fix was to install Diamond 2.0.6 via conda, using the following command: conda install bioconda::diamond=2.0.6 --yes.

Hope it helps.

Cheers,

@1073501616

Thank you! I used the command, but got the following error:
ERROR: Wrong diamond version installed. Please, install diamond version v2.0.14!
Then I tried this and it ran!
(platon) 23:16:19 /mnt/
$ conda install -c bioconda diamond=2.0.14

@oschwengers
Owner

Hi all and thanks for reporting!

There is a known bug in Diamond v2.1.9, which has been reported upstream: bbuchfink/diamond#785

For now, downgrading to v2.1.8 should do the trick until there is an official patch for Diamond.
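
To check quickly whether a given run used the affected version, one option is to read it from Platon's own log. The sketch below is hypothetical: the function names are made up for illustration, and it only relies on the "dependency check: tool=diamond, version=v2.1.9" line format visible in the log posted above.

```python
import re
from typing import Optional, Tuple

# Version named in this thread as having the bug.
AFFECTED_VERSION = (2, 1, 9)

# Matches the dependency-check line format seen in the Platon log above.
_DIAMOND_LINE = re.compile(r"tool=diamond, version=v(\d+)\.(\d+)\.(\d+)")


def parse_diamond_version(log_line: str) -> Optional[Tuple[int, int, int]]:
    """Return (major, minor, patch) from a dependency-check line, or None."""
    match = _DIAMOND_LINE.search(log_line)
    if match is None:
        return None
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def is_affected(log_line: str) -> bool:
    """True if the line reports the Diamond version named in this thread."""
    return parse_diamond_version(log_line) == AFFECTED_VERSION


if __name__ == "__main__":
    line = (
        "2024-05-08 15:48:34,164 - INFO - UTILS - "
        "dependency check: tool=diamond, version=v2.1.9"
    )
    print(parse_diamond_version(line), is_affected(line))
```

If the check reports the affected version, a downgrade in the same form as the conda command quoted earlier in the thread (with the version set to 2.1.8) would presumably apply, although that exact command has not been confirmed here.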

@oschwengers oschwengers added help wanted Extra attention is needed and removed bug Something isn't working labels May 13, 2024
6 participants