virusrecom v1.3.2

@ZhijianZhou01 ZhijianZhou01 released this 23 Jul 02:39
· 17 commits to main since this release
15de5b4

Compared to virusrecom v1.2.1

1. Optimize memory usage

  • Fix the excessive memory usage that occurred when plotting WIC or mWIC figures in batches.

  • Sites in the sequence alignment can now be read iteratively and loaded into memory as sub-blocks. The parameter --block sets the maximum number of sites per sub-block (default value: 40000), and the sub-blocks are loaded sequentially to calculate the WIC values. For example, --block 20000 means that no more than 20,000 sites are loaded per iteration. This optimization allows large numbers of sequences to be processed with less memory.

    Here is the run log from example 3.1 (1,000 sequences from 10 lineages; the alignment contains 29,172 sites):

>>> Treat query_recombinant as a potential recombination lineage...

>>> VirusRecom starts calculating weighted information content from each lineage...

    VirusRecom is importing data blocks 1

    Load sites: 1 - 20000

    VirusRecom is removing sites (columns) containing gap (-)...

    VirusRecom is extracting polymorphic sites...

    WIC for data_blocks 1 have been completed.

    VirusRecom is importing data blocks 2

    Load sites: 20001 - 29172

    VirusRecom is removing sites (columns) containing gap (-)...

    VirusRecom is extracting polymorphic sites...

    WIC for data_blocks 2 have been completed.

>>> The WIC calculations of 1015 sites have been completed.

>>> VirusRecom starts scanning using sliding window ...

    Possible major parent: reference_lineage_1 (global mWIC: 1.8976186779157704)

    Other possible parents and recombination region (map at the alignment):

    reference_lineage_2 [['7237 to 11539(mWIC: 1.9553354371515168)', 'p_value: 7.831109305531836e-06']]

>>> Take 0:00:18.073764 seconds in total.
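
The block ranges in the log above ("Load sites: 1 - 20000", then "20001 - 29172") can be reproduced with a short sketch. This is only an illustration of the sub-block idea, not VirusRecom's actual implementation: the function names `iter_site_blocks`, `slice_alignment`, and `keep_polymorphic_ungapped` are hypothetical, and the alignment is assumed to be a list of equal-length strings already held in memory.

```python
from typing import Iterator, List, Tuple


def iter_site_blocks(n_sites: int, block: int = 40000) -> Iterator[Tuple[int, int]]:
    """Yield 1-based, inclusive (start, end) site ranges of at most `block` sites."""
    if block <= 0:
        raise ValueError("block must be a positive integer")
    for start in range(1, n_sites + 1, block):
        yield start, min(start + block - 1, n_sites)


def slice_alignment(sequences: List[str], start: int, end: int) -> List[str]:
    """Return only columns start..end (1-based, inclusive) of each aligned sequence."""
    return [seq[start - 1:end] for seq in sequences]


def keep_polymorphic_ungapped(sequences: List[str]) -> List[str]:
    """Drop columns that contain a gap ('-') or show no variation (assumed filter)."""
    kept = [
        col for col in zip(*sequences)
        if "-" not in col and len(set(col)) > 1
    ]
    if not kept:
        return ["" for _ in sequences]
    # Transpose the surviving columns back into per-sequence strings.
    return ["".join(chars) for chars in zip(*kept)]


# With 29,172 sites and a block size of 20,000, two sub-blocks are produced.
blocks = list(iter_site_blocks(29172, block=20000))
print(blocks)  # [(1, 20000), (20001, 29172)]
```

Because each sub-block is filtered before the next one is loaded, peak memory in a scheme like this depends on the block size rather than on the full alignment length.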

2. Streamline the output

  • Reduce the amount of log output printed to the screen.
  • Rename the file Possible_recombination_event_detailed.txt to identify_logs_detailed.txt, because it does not contain the final identification of recombination.