Apache Search

RashiLaddha edited this page Jun 30, 2017 · 5 revisions

Search

The system lets the user search for a word or a group of words across all of the site's content, including PDF attachments. It also supports powerful matching capabilities such as phrases, wildcards, joins, and grouping.


Tagging

The system can also suggest tags for a post based on its content, using learning algorithms. An algorithm combined with a ranking mechanism gives the user a list of tags, from which they can select those that best describe the article. Their selections also train a user-content learning model in the background. The model trains on the user's past tagging patterns and posts on the website, which are then used to propose tags.

Users get two sets of tags to choose from:

1. Implicit Tags: derived only from the current content and ranked according to their chi-square value.
2. Learned Tags: derived using the trained model and the learning algorithm.

Research Background

Tag suggestion uses the "Word Co-occurrence Statistical Algorithm". The co-occurrence algorithm has been extended to incorporate learning as well.
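
The chi-square ranking of implicit tags can be illustrated with a minimal sketch of a word co-occurrence approach. This is not the project's actual code: the function name `implicit_tags`, the stopword list, the sentence splitting, and the `num_frequent` and `num_tags` parameters are all assumptions made for illustration, and the learning extension is not covered. The idea is that a term whose co-occurrence with the document's most frequent terms deviates strongly from what chance would predict gets a high chi-square score and is proposed as a tag.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "and", "for", "with", "that", "this", "are", "from", "was", "has"}


def tokenize(sentence):
    """Lowercase a sentence and keep alphabetic words that are not stopwords."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [w for w in words if len(w) > 2 and w not in STOPWORDS]


def implicit_tags(text, num_frequent=5, num_tags=5):
    """Rank terms by a chi-square score over co-occurrence with frequent terms."""
    # Treat each sentence as one co-occurrence window.
    sentences = [tokenize(s) for s in re.split(r"[.!?]+", text)]
    sentences = [s for s in sentences if s]

    term_freq = Counter(w for s in sentences for w in s)
    total_terms = sum(term_freq.values())
    if total_terms == 0:
        return []

    # The most frequent terms act as the reference set.
    frequent = [w for w, _ in term_freq.most_common(num_frequent)]

    cooc = Counter()         # (w, g) -> number of sentences containing both
    terms_with = Counter()   # w -> total number of terms in sentences containing w
    for s in sentences:
        unique = set(s)
        for w in unique:
            terms_with[w] += len(s)
            for g in frequent:
                if g != w and g in unique:
                    cooc[(w, g)] += 1

    scores = {}
    for w in term_freq:
        chi2 = 0.0
        for g in frequent:
            if g == w:
                continue
            # Expected co-occurrence if w appeared independently of g.
            expected = terms_with[w] * terms_with[g] / total_terms
            if expected > 0:
                chi2 += (cooc[(w, g)] - expected) ** 2 / expected
        scores[w] = chi2

    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:num_tags]
```

Calling `implicit_tags(post_text, num_tags=5)` on the body of a post returns up to five candidate terms, which could then be shown to the user as the implicit tag set.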