Personalising Goodreads with Elasticsearch

A personalised information retrieval prototype built with Elasticsearch, gRPC, Python and React.

N. Bosch, S. Lembeye, S. Lindstrand, S. Olander

Setup

The repository is laid out as follows:

WORKDIR (also called home, root-dir or git-dir): the directory where this README.md is located.
backend/: Python source for the search engine
frontend/: React source for the search GUI
protos/: Protobuf definitions for the gRPC service
simulation/: source for the simulation study and LDA pre-processing

Manual start

For more detailed installation instructions and prerequisites, see the README in each component's directory.

Start Elasticsearch on the default port (:9200). Once the cluster status is at least yellow, continue.
Run the backend/src/main.py script with python3.
- If the index goodreads does not exist in ES, backend/src/client.py (called by backend/src/main.py) creates it and indexes all files matching /backend/data/*.csv.
- If the index already exists, it is left unmodified by default.
- NB: main.py starts a gRPC server on port :5678.
Start the reverse proxy with the following command: grpcwebproxy --backend_addr=localhost:5678 --run_tls_server=false --allow_all_origins --server_http_max_write_timeout=30s --server_http_max_read_timeout=30s
Navigate to frontend/ and run npm start (or npm run start) to fire up the React search engine.
- The search GUI will be accessible at localhost:1234.
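The first two steps (wait for Elasticsearch, then index only if goodreads is missing) can be sketched as below. This is a minimal illustration using only the Python standard library, not the actual code in backend/src/client.py; the helper names (is_ready, fetch_health, index_exists, files_to_index) are hypothetical. It assumes Elasticsearch is reachable over plain HTTP without authentication.

```python
import json
import urllib.error
import urllib.request
from pathlib import Path

ES_URL = "http://localhost:9200"  # default Elasticsearch port from the steps above
INDEX_NAME = "goodreads"


def is_ready(health: dict) -> bool:
    """Yellow or green both mean that all primary shards are allocated."""
    return health.get("status") in ("yellow", "green")


def fetch_health(base_url: str = ES_URL) -> dict:
    """GET /_cluster/health and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=5) as resp:
        return json.load(resp)


def index_exists(name: str = INDEX_NAME, base_url: str = ES_URL) -> bool:
    """HEAD /<index> answers 200 if the index exists and 404 if it does not."""
    req = urllib.request.Request(f"{base_url}/{name}", method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=5):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


def files_to_index(exists: bool, data_dir: Path) -> list:
    """Existing index: leave it untouched. Missing index: pick up every CSV file."""
    if exists:
        return []
    return sorted(data_dir.glob("*.csv"))
```

With Elasticsearch running, `is_ready(fetch_health())` tells you whether to continue, and `files_to_index(index_exists(), Path("backend/data"))` lists what a fresh start would index.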

All in all, you should have four things running: ES (:9200), the Python gRPC server (:5678), grpcwebproxy (forwarding to :5678) and npm/node (:1234), which serves the GUI to the user.
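A quick way to confirm that the components are listening is to probe their TCP ports. The sketch below is an illustration, not part of the repository; the names SERVICES, port_open and status_report are hypothetical. The proxy is left out because :5678 is the backend address it forwards to (per --backend_addr), and its own listening port depends on how it is configured.

```python
import socket

# Ports as listed in this README (hypothetical mapping for illustration).
SERVICES = {
    "elasticsearch": 9200,
    "grpc-server": 5678,
    "frontend": 1234,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def status_report(services: dict, host: str = "localhost") -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in services.items()}
```

Calling `status_report(SERVICES)` returns a dictionary of booleans, one per service. An open port only shows that something is listening there, not that the service is healthy.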

(Experimental) Automatic start

NB: this is highly experimental; the script has had issues with unforeseen prior configuration on the host machine.

To make booting as easy as possible, the backend (ES server, proxy, ES client) can be started with a single script. All requirements remain prerequisites for it to work; see each component's requirements for details. The script follows the same steps as the Manual start above, so it also serves as an indication of what to do.

Starting the servers (PowerShell 7 is assumed on Windows)

$ cd backend
$ .\start.ps1     # Windows
$ sh start.sh     # macOS

Note that this method provides no logging, since each component runs as a background process.
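If logs are needed, one option is to start each component from a small Python launcher that redirects its output to a file. The sketch below is an assumption about how this could be done, not what start.sh or the PowerShell script does; start_logged and backend_command are hypothetical helpers.

```python
import subprocess
import sys
from pathlib import Path


def start_logged(name: str, cmd: list, log_dir: Path) -> subprocess.Popen:
    """Start cmd in the background, appending stdout and stderr to <log_dir>/<name>.log."""
    log_dir.mkdir(parents=True, exist_ok=True)
    log_file = open(log_dir / f"{name}.log", "ab")
    try:
        return subprocess.Popen(cmd, stdout=log_file, stderr=subprocess.STDOUT)
    finally:
        log_file.close()  # the child process keeps its own copy of the handle


def backend_command() -> list:
    """Command for the backend step of the Manual start, run from the repository root."""
    return [sys.executable, "backend/src/main.py"]
```

For example, `start_logged("backend", backend_command(), Path("logs"))` would run the backend in the background and collect its output in logs/backend.log. The same helper could wrap the proxy and Elasticsearch commands.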

olandr/inforet-elastic-grpc
