WalmartLabs-ML-CodeSprint

https://www.hackerrank.com/contests/walmart-codesprint-ml/challenges/products-shelves-tagging

A HackerRank contest in which participants develop a machine learning solution for assigning products to shelves, given characteristics of each product.

HackerRank provides two files (train.tsv and test.tsv) to the contestants. Looking at the files, I noticed a lot of missing data and many formatting errors in the train file, so I decided to cut down the number of features considered for each product. The features that I used are:

Seller, Actual Color, Artist ID, Genre ID, ISBN, Item Class ID, Literary Genre, MPAA Rating, Product Name, Publisher, Recommended Location, and Recommended Use (12 features in total).
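As a rough illustration of this feature selection step, the sketch below reads a tab-separated file and keeps only a chosen subset of columns, replacing missing values with empty strings. It is not the code from WalmartMLCodeSprint.py: the function name `load_rows`, the shortened `KEEP` list, the exact header spellings, and the in-memory sample rows are all assumptions made for the example.

```python
import csv
import io

# Hypothetical subset of column names; the real train.tsv headers may differ.
KEEP = ["Seller", "Product Name", "Publisher", "tags"]

def load_rows(handle, keep=KEEP):
    """Read a TSV stream and keep only the selected columns.

    Missing or absent values become empty strings so later steps
    never have to deal with None.
    """
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for record in reader:
        rows.append({name: (record.get(name) or "").strip() for name in keep})
    return rows

# Made-up sample data standing in for train.tsv.
sample = io.StringIO(
    "item_id\tSeller\tProduct Name\tPublisher\ttags\n"
    "1\tSellerA\tBlue Widget\t\t[4537]\n"
    "2\t\tRed Widget\tAcme\t[1180]\n"
)
rows = load_rows(sample)
```

With a real file, the same function could be called as `load_rows(open("train.tsv", newline="", encoding="utf-8"))`, assuming the file is UTF-8 encoded.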

The label for each product is the tags column, which tells you which shelf each training example (product) went on.
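A small sketch of how the label column might be parsed is shown below. It assumes each tags value is stored as a bracketed list of integer shelf IDs in string form, such as "[4537, 1180]"; that format and the helper name `parse_tags` are assumptions, not something stated in this README.

```python
import ast

def parse_tags(raw):
    """Turn a string like "[4537, 1180]" into a list of ints.

    Returns an empty list for empty or malformed values, which is one
    possible way to cope with formatting errors in the train file.
    """
    try:
        value = ast.literal_eval(raw.strip())
    except (ValueError, SyntaxError):
        return []
    if isinstance(value, int):
        return [value]
    if isinstance(value, (list, tuple)):
        return [v for v in value if isinstance(v, int)]
    return []
```

Rows whose tags come back empty could then be dropped from training rather than guessed at.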

I used a k-nearest neighbors (KNN) approach for the machine learning algorithm, but looking back on the competition, this was not a good choice: the most important feature in the data was the product name, and converting those strings into numeric values was not a good way to represent them.
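To make the weakness concrete, here is a minimal KNN sketch in plain Python. It is an illustration under assumptions, not the implementation in WalmartMLCodeSprint.py: the helpers `encode_columns` and `knn_predict`, the integer-code encoding, the Euclidean distance, and the toy rows and shelf labels are all hypothetical.

```python
from collections import Counter
import math

def encode_columns(rows):
    """Map each distinct string in each column to an integer code.

    Codes are assigned in order of first appearance, so they carry no
    meaning: two similar product names can end up far apart.
    """
    codebooks = [dict() for _ in rows[0]]
    encoded = []
    for row in rows:
        vec = []
        for col, value in enumerate(row):
            code = codebooks[col].setdefault(value, len(codebooks[col]))
            vec.append(code)
        encoded.append(vec)
    return encoded, codebooks

def knn_predict(train_vecs, train_labels, query, k=3):
    """Return the majority label among the k nearest training vectors."""
    nearest = sorted(
        range(len(train_vecs)),
        key=lambda i: math.dist(train_vecs[i], query),
    )[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Made-up (seller, product name) rows and shelf labels.
train_rows = [
    ("SellerA", "Blue Widget"),
    ("SellerA", "Blue Widget XL"),
    ("SellerB", "Cook Book"),
]
train_labels = [10, 10, 20]
train_vecs, codebooks = encode_columns(train_rows)
prediction = knn_predict(train_vecs, train_labels, train_vecs[2], k=1)
```

Because the integer codes are arbitrary, the distance between two product names reflects only the order in which they were first seen, not how similar the names are. A text representation based on the words in the name would likely suit this feature better.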
