MinHash and LSH

Locality Sensitive Hashing with MinHash

An implementation of the MinHash algorithm (Minhash.java) for two documents. The Jaccard similarity of the two documents is computed from the resulting MinHash signature matrix.
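To illustrate the idea, here is a minimal sketch of MinHash for two documents. It is not the code in Minhash.java: the class name `MinHashSketch`, the word-level tokenization, the seed, and the salted 64-bit mixing function used as the hash family are all assumptions chosen for illustration. The estimate is the fraction of signature positions on which the two documents agree, which approximates the exact Jaccard similarity.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical illustration; not the repository's Minhash.java.
class MinHashSketch {

    /** Splits text into a set of lowercase word tokens (an assumed tokenization). */
    static Set<String> tokens(String text) {
        Set<String> out = new HashSet<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    /** Exact Jaccard similarity: |A ∩ B| / |A ∪ B|. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) return 1.0;
        Set<String> inter = new HashSet<>(a);
        inter.retainAll(b);
        int union = a.size() + b.size() - inter.size();
        return (double) inter.size() / union;
    }

    /** 64-bit bit mixer (the SplitMix64 finalizer), used to scramble token hashes. */
    static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    /**
     * Builds a signature of numHashes values. Hash function i is mix(x ^ salt_i);
     * entry i is the minimum of that function over all tokens. Both documents
     * must use the same seed so that they share the same hash functions.
     */
    static long[] signature(Set<String> tokens, int numHashes, long seed) {
        Random rng = new Random(seed);
        long[] sig = new long[numHashes];
        Arrays.fill(sig, Long.MAX_VALUE);
        for (int i = 0; i < numHashes; i++) {
            long salt = rng.nextLong();
            for (String t : tokens) {
                long h = mix(t.hashCode() ^ salt);
                sig[i] = Math.min(sig[i], h);
            }
        }
        return sig;
    }

    /** Estimated Jaccard similarity: fraction of positions where the signatures agree. */
    static double estimate(long[] s1, long[] s2) {
        if (s1.length != s2.length) {
            throw new IllegalArgumentException("signatures must have equal length");
        }
        int matches = 0;
        for (int i = 0; i < s1.length; i++) {
            if (s1[i] == s2[i]) matches++;
        }
        return (double) matches / s1.length;
    }
}
```

Calling `MinHashSketch.signature(...)` on each document's token set with the same seed and passing both results to `MinHashSketch.estimate(...)` gives the approximation. More hash functions reduce the variance of the estimate at the cost of longer signatures.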

An LSH implementation (LSH.java) that applies banding to the MinHash signature matrix.
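The banding step can be sketched as follows. This is an assumption-labeled illustration rather than the code in LSH.java: the class name `LshBandingSketch` and its method names are hypothetical. Each signature is split into bands of consecutive rows, and the two documents become a candidate pair if at least one band is identical in both. For documents with Jaccard similarity s, b bands, and r rows per band, the standard probability of becoming a candidate pair is 1 - (1 - s^r)^b.

```java
import java.util.Arrays;

// Hypothetical illustration; not the repository's LSH.java.
class LshBandingSketch {

    /**
     * Returns true if any band is identical in both signatures.
     * The signature length must be divisible by the number of bands.
     */
    static boolean isCandidatePair(long[] s1, long[] s2, int bands) {
        if (s1.length != s2.length || bands <= 0 || s1.length % bands != 0) {
            throw new IllegalArgumentException("invalid signature length or band count");
        }
        int rows = s1.length / bands; // rows per band
        for (int b = 0; b < bands; b++) {
            int from = b * rows;
            int to = from + rows;
            long[] band1 = Arrays.copyOfRange(s1, from, to);
            long[] band2 = Arrays.copyOfRange(s2, from, to);
            if (Arrays.equals(band1, band2)) {
                return true; // one matching band is enough
            }
        }
        return false;
    }

    /** Probability that two documents with similarity s become candidates: 1 - (1 - s^r)^b. */
    static double candidateProbability(double s, int bands, int rows) {
        return 1.0 - Math.pow(1.0 - Math.pow(s, rows), bands);
    }
}
```

With only two documents a direct band-by-band comparison is enough. An N-document version would typically hash each band to a bucket and treat documents that share a bucket as candidates.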

The implementation currently handles exactly two documents. In the future it could be extended to compare N documents.