You can find the challenge requirements here.
In my solution, I chose to use Elasticsearch as the search engine because it provides features such as full-text and geolocation search.
In the Elasticsearch query, the term `q` is used for a full-text search to fetch the most similar suggestions. If geolocation is provided, the position is then used to adjust the score.
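For illustration, the query could be composed along these lines (a sketch, not the exact implementation; the field names `name` and `location` and the decay parameters are assumptions):

```js
// Illustrative query builder: full-text match on the city name,
// optionally rescored by distance from the provided geolocation.
// Field names and decay parameters are assumptions.
function buildSuggestionQuery(q, latitude, longitude) {
  const textQuery = { match: { name: q } };

  // Without a position, score by text relevance alone.
  if (latitude === undefined || longitude === undefined) {
    return textQuery;
  }

  // With a position, decay the score with distance from the user.
  return {
    function_score: {
      query: textQuery,
      functions: [{
        gauss: {
          location: {
            origin: { lat: latitude, lon: longitude },
            scale: '200km' // assumed decay scale
          }
        }
      }],
      boost_mode: 'multiply'
    }
  };
}
```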
In order to mitigate the expected level of traffic, I implemented a cache using Redis. Check the Benchmark section.
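A minimal sketch of how such a cache could work (assuming a cache-aside pattern with the `redis` Node client; the key scheme and TTL are illustrative, not the exact implementation):

```js
const { createClient } = require('redis');

const redis = createClient(); // redis.connect() must be called once at startup

// Cache-aside lookup: serve from Redis when possible; otherwise run
// the Elasticsearch search and store the result with a short TTL.
async function cachedSuggestions(q, latitude, longitude, search) {
  const key = `suggestions:${q}:${latitude ?? ''}:${longitude ?? ''}`;

  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  const result = await search(q, latitude, longitude);
  await redis.set(key, JSON.stringify(result), { EX: 60 }); // assumed 60s TTL
  return result;
}
```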
Solution hosted on Heroku: https://busbud-challenge-viniciusgava.herokuapp.com
Please check the Infrastructure section for further details.
The project was designed as a hexagonal architecture, separated into:
- command: Any operation that changes a suggestion.
- infrastructure: All data access, including Elasticsearch, file importing, countries, and administrative divisions.
- query: Any operation that fetches data.
- user: User interactions; in this project, the REST API layer.
It refers to the column `admin1` of the provided file, which is:
- For the USA, the state. I just used the provided value as-is.
- For Canada, the province or territory. I used the geonames FIPS code to figure out the correct value.
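For illustration, the Canadian lookup could be a small map from the geonames numeric admin1 (FIPS) code to the province/territory abbreviation. A sketch; the codes should be verified against the geonames reference data:

```js
// geonames admin1 codes for Canada -> province/territory abbreviations.
// Sketch only: confirm against the geonames admin1 reference data.
const CA_ADMIN1 = {
  '01': 'AB', '02': 'BC', '03': 'MB', '04': 'NB', '05': 'NL',
  '07': 'NS', '08': 'ON', '09': 'PE', '10': 'QC', '11': 'SK',
  '12': 'YT', '13': 'NT', '14': 'NU'
};

function admin1Code(countryCode, admin1) {
  // USA: the provided value is already the state abbreviation.
  if (countryCode === 'US') return admin1;
  // Canada: translate the numeric geonames code.
  if (countryCode === 'CA') return CA_ADMIN1[admin1];
  return admin1;
}
```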
The benchmark results with cache and without cache are available below.
The cache implementation reduces the response time by a factor of three AND triples the throughput.
With cache, the best benchmark results were:
- throughput of 768,000 RPM.
- average response time of 78 milliseconds.
Without cache, the best benchmark results were:
- throughput of 254,000 RPM.
- average response time of 236 milliseconds.
It's important to mention that I could get better throughput by increasing the number of concurrent connections, but it was not worth it, since it increases the error rate and degrades the response time.
The benchmark was executed locally, using a Docker container for Elasticsearch and the Node app running directly on the operating system (Ubuntu Linux).
- AMD Ryzen 5 2600, 6 cores / 12 threads, 3.4 GHz base clock
- 16 GB RAM @ 3466 MHz
- M.2 storage, 2000 MB/s read / 500 MB/s write
I used Heroku to host the application and bonsai.io for Elasticsearch in the cloud. Since the bonsai.io free tier plan is restricted to a maximum of 2 concurrent operations of each kind (2 for search + 2 for indexing), the application's performance in this environment is limited.
The maintenance endpoints are also not available on Heroku because of the bonsai.io free tier restrictions.
The cache solution is not deployed on Heroku.
Searches for suggestions using a term. If geolocation is provided, the closest suggestions are ranked higher.
URL : /suggestions
Method : GET
Query String:
- `q`: The searched term. Required.
- `latitude`: Latitude used to find the closest suggestions. Optional.
- `longitude`: Longitude used to find the closest suggestions. Optional.
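Example request (host and port taken from the Run section; the coordinates are illustrative):

GET http://127.0.0.1:2345/suggestions?q=Londo&latitude=43.70011&longitude=-79.4163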
Condition : Suggestions have been found.
Code : 200 OK
Content :
{
"suggestions": [
{
"name": "London, ON, Canada",
"latitude": "42.98339",
"longitude": "-81.23304",
"score": "1.0"
},
{
"name": "London, OH, USA",
"latitude": "39.88645",
"longitude": "-83.44825",
"score": "0.3"
},
{
"name": "Londontowne, MD, USA",
"latitude": "38.93345",
"longitude": "-76.54941",
"score": "0.2"
},
{
"name": "New London, CT, USA",
"latitude": "41.35565",
"longitude": "-72.09952",
"score": "0.0"
},
{
"name": "Londonderry, NH, USA",
"latitude": "42.86509",
"longitude": "-71.37395",
"score": "0.0"
}
]
}
Condition : Suggestions have not been found.
Code : 404 Not Found
Content :
{
"suggestions": []
}
Condition : A required field is missing.
Code : 400 Bad Request
Content :
{
"error": "Querystring 'q' must be informed."
}
Creates the Elasticsearch index used to persist suggestions.
URL : /maintenance/index
Method : POST
Inform an `Authorization` header with a Bearer token to perform maintenance operations.
e.g. header: `Authorization: Bearer dev`
For the development environment, just use the token `Bearer dev`.
Condition : Index created.
Code : 201 Created
Content :
{
"status": "created"
}
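For reference, the created index might define a mapping along these lines (a sketch; the index name, field names, and mapping details are assumptions, not the project's exact mapping):

```js
const { Client } = require('@elastic/elasticsearch');

const client = new Client({ node: 'http://localhost:9200' });

// Illustrative index creation; index name and fields are assumptions.
async function createIndex() {
  await client.indices.create({
    index: 'suggestions',
    body: {
      mappings: {
        properties: {
          name: { type: 'text' },         // full-text target for `q`
          location: { type: 'geo_point' } // used for geolocation scoring
        }
      }
    }
  });
}
```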
Populates the index with the file data.
URL : /maintenance/populate
Method : POST
Inform an `Authorization` header with a Bearer token to perform maintenance operations.
e.g. header: `Authorization: Bearer dev`
For the development environment, just use the token `Bearer dev`.
Condition : Data importing is in progress.
Code : 200 OK
Content :
{
"status": "running"
}
Condition : An import is already running.
Code : 409 Conflict
Content :
{
"status": "import is already running"
}
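Conceptually, the import streams the provided geonames TSV and bulk-indexes each row. A sketch, assuming the standard geonames dump column layout and the illustrative index name used above:

```js
const fs = require('fs');
const readline = require('readline');

// Sketch of the import: stream the geonames TSV line by line and
// bulk-index the rows. Column positions follow the geonames dump
// format (0 = id, 1 = name, 4 = latitude, 5 = longitude).
async function populate(client, filePath) {
  const lines = readline.createInterface({
    input: fs.createReadStream(filePath)
  });

  const body = [];
  for await (const line of lines) {
    const cols = line.split('\t');
    body.push({ index: { _index: 'suggestions', _id: cols[0] } });
    body.push({
      name: cols[1],
      location: { lat: Number(cols[4]), lon: Number(cols[5]) }
    });
  }

  // A real import would send the rows in batches; a single call
  // keeps the sketch short.
  await client.bulk({ body, refresh: true });
}
```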
Deletes the Elasticsearch index and all loaded data.
URL : /maintenance/index
Method : DELETE
Inform an `Authorization` header with a Bearer token to perform maintenance operations.
e.g. header: `Authorization: Bearer dev`
For the development environment, just use the token `Bearer dev`.
Condition : Index deleted.
Code : 200 OK
Content :
{
"status": "deleted"
}
In the project directory, run:
nvm use
npm install
docker-compose up
After that, let's create the Elasticsearch index and load the data:
The application must be running to call the endpoints.
Call the following endpoints (examples below):
- Create the index. API Reference: Maintenance - Create Index
- Populate the Elasticsearch documents. API Reference: Maintenance - Populate Index
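For example, with the development token:

curl -X POST -H "Authorization: Bearer dev" http://127.0.0.1:2345/maintenance/index
curl -X POST -H "Authorization: Bearer dev" http://127.0.0.1:2345/maintenance/populate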
Application configs are available in `config.js` in the project directory.
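An illustrative shape for that file (the keys below are assumptions; check `config.js` itself for the real ones):

```js
// Illustrative only: the real keys live in config.js.
module.exports = {
  server: { host: '127.0.0.1', port: 2345 },        // from the Run section
  elasticsearch: { node: 'http://localhost:9200' }, // assumed local ES
  redis: { url: 'redis://localhost:6379' }          // assumed local cache
};
```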
npm run test
npm run start
It should produce output similar to:
Server running at http://127.0.0.1:2345/suggestions
npm run lint
OR
npm run lintfix
to autofix the code.
npm run allchecks
All data was imported using the provided file. Endpoints to reimport all data can be found in the API Reference.