Ranked-Retrieval

Dataset

The data comes from Google Conceptual Captions, an image-captioning dataset that was also used in the recent Image2Tweet experiment. The dataset is included in the "dataset" folder as "ConceptualCaptionsDataset.xlsx". The full dataset is large, so we use 30,000 caption documents together with their Hindi translations. For each query, the system returns the retrieved caption documents along with their index and Hindi translation.

How to run the retrieval system:

Step 1: Go to https://colab.research.google.com/?utm_source=scs-index
Step 2: Go to Upload and select the "Retrieval_System.ipynb" file in the "code" folder.
Step 3: Once the notebook is uploaded, select Files in the left menu bar and upload the dataset.
Step 4: Either leave the default path, which works as is, or paste the path of the dataset into df = pd.read_excel('DATASET PATH SHOULD BE HERE',header = None)
Step 5: Once the dataset is loaded, the code returns the top K retrieved documents along with their Hindi translations, using multiple term-weighting methods and similarity measures.
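
What Step 5 produces can be illustrated with a minimal sketch. This README does not say which term-weighting and similarity methods the notebook uses, so the sketch assumes TF-IDF weighting with cosine similarity as one plausible combination. The function names (`tokenize`, `build_tfidf`, `cosine`, `top_k`) and the four sample rows are hypothetical and are not taken from "Retrieval_System.ipynb" or from the dataset. In the notebook, the captions and translations would come from the DataFrame loaded in Step 4.

```python
import math
from collections import Counter

# Illustrative rows only (not taken from ConceptualCaptionsDataset.xlsx).
CAPTIONS = [
    "a dog running on the beach",
    "a cat sleeping on the sofa",
    "children playing football in the park",
    "a dog playing in the park",
]
TRANSLATIONS = [
    "समुद्र तट पर दौड़ता हुआ कुत्ता",
    "सोफे पर सोती हुई बिल्ली",
    "पार्क में फुटबॉल खेलते बच्चे",
    "पार्क में खेलता हुआ कुत्ता",
]


def tokenize(text):
    """Lowercase and split on whitespace (assumed preprocessing)."""
    return text.lower().split()


def build_tfidf(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    # Smoothed IDF so that terms found in every document keep a non-zero weight.
    idf = {t: math.log(n_docs / c) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: f * idf[t] for t, f in tf.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, captions, translations, k=3):
    """Rank captions against the query; return (index, caption, translation, score)."""
    vectors, idf = build_tfidf(captions)
    q_tf = Counter(tokenize(query))
    # Query terms that never occur in the collection are ignored.
    q_vec = {t: f * idf[t] for t, f in q_tf.items() if t in idf}
    scored = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first
    return [(i, captions[i], translations[i], round(s, 3)) for s, i in scored[:k]]


if __name__ == "__main__":
    for index, caption, translation, score in top_k("dog park", CAPTIONS, TRANSLATIONS):
        print(index, caption, translation, score)
```

Each result carries the document index, the caption, its Hindi translation, and the similarity score, mirroring the output described in Step 5. Swapping in another weighting scheme or similarity measure only requires changing `build_tfidf` or `cosine`.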

JayanthSriram27/Ranked-Retrieval
