Phrase Seeker

Phrase Seeker is a Python library that searches for phrases in a text, regardless of their form or intervening words. It was developed in February 2019 to perform text analysis for a local scientific conference.

Features

  • Search texts for phrases.
  • Search for multiple phrases at once.
  • Find phrases even when they are not in their normalized form.
  • Find phrases even when extra words (e.g. adjectives) appear between their words.
  • Get the sentence where each phrase was found.
  • Get the location of that sentence in the text.
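
To make the idea of matching "regardless of form or intervening words" concrete, here is a toy, self-contained sketch. It is not Phrase Seeker's implementation: the names crude_stem, phrase_in_sentence and max_gap are hypothetical, and the suffix stripping is a deliberately crude stand-in for real normalization.

```python
import re


def crude_stem(word: str) -> str:
    """Very rough normalization: lowercase and strip one common English suffix."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words like "was" stay intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def phrase_in_sentence(phrase: str, sentence: str, max_gap: int = 2) -> bool:
    """Return True if the phrase's words occur in order in the sentence,
    compared by crude stem, with at most max_gap extra words between neighbours."""
    phrase_stems = [crude_stem(w) for w in re.findall(r"\w+", phrase)]
    sentence_stems = [crude_stem(w) for w in re.findall(r"\w+", sentence)]
    if not phrase_stems:
        return False

    def match_from(p_idx: int, s_idx: int) -> bool:
        # p_idx: next phrase word to match; s_idx: first sentence position to try.
        if p_idx == len(phrase_stems):
            return True
        limit = min(len(sentence_stems), s_idx + max_gap + 1)
        for i in range(s_idx, limit):
            if sentence_stems[i] == phrase_stems[p_idx] and match_from(p_idx + 1, i + 1):
                return True
        return False

    # Try every position where the first phrase word could start.
    return any(
        sentence_stems[start] == phrase_stems[0] and match_from(1, start + 1)
        for start in range(len(sentence_stems))
    )


found = phrase_in_sentence("inserted text", "Insert your awesome text here", max_gap=2)
```

Here "inserted" and "Insert" reduce to the same stem, and the two words between "Insert" and "text" fit within max_gap, so the phrase counts as found. A real solution would rely on proper morphological analysis rather than suffix stripping.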

Requirements

  • Python 3.7

Installation

$ git clone git@github.com:kirillgashkov/phrase-seeker.git
$ cd phrase-seeker
$ pip install -r requirements.txt

Usage

Note: by default, the seeking function deletes its cache when it finishes. To keep the cache, pass should_delete_cache=False as an additional argument to the function. If the phrases change, however, you must delete the cache before calling the function again (call phrase_seeker.delete_cache() to do so).

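from phrase_seeker import seek_phrases_in_text

text = "Insert your awesome text here"
phrases = ["inserted text"]

# Search the text for every phrase in the list.
matches = seek_phrases_in_text(phrases, text)

for match in matches:
    # The phrase that was found.
    print(match.phrase.text)
    # Where the containing sentence starts and ends in the text, then the sentence itself.
    print(match.sentence.start, match.sentence.end, "-", match.sentence.text)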
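
The cache note above can be sketched as a small helper that searches several texts for the same phrases, keeps the cache between calls, and deletes it once at the end. The helper name seek_in_many_texts is hypothetical. Only seek_phrases_in_text, should_delete_cache and delete_cache come from this README, and the sketch assumes should_delete_cache is accepted as a keyword argument.

```python
def seek_in_many_texts(phrases, texts):
    """Search each text for the same phrases, keeping the cache between calls."""
    # Imported lazily so this helper can be defined without the library installed.
    import phrase_seeker

    results = []
    try:
        for text in texts:
            matches = phrase_seeker.seek_phrases_in_text(
                phrases, text, should_delete_cache=False
            )
            results.append(list(matches))
    finally:
        # Delete the cache once, so a later call with different phrases starts clean.
        phrase_seeker.delete_cache()
    return results
```

The try/finally ensures the cache is removed even if one of the searches raises, so a stale cache cannot leak into a later run with a different phrase list.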

License

Distributed under the MIT License. See LICENSE.md for details.

Acknowledgments