I don't have the exact version that got the best score, because at the time I didn't expect it to be my final solution and so I didn't save the state of the process. I tried to make this solution as similar to it as possible. I also left in the code the later changes I tried, in case you would like to see them as well.
I have two approaches to the problem. Both of them would rank as second place according to Kaggle. The second solution had an MRE of 48K+ and the first one had 49K+, so I'm including both of them. Please note:
- the first method is more likely to be close to the winning score
- I don't remember the exact state of the second method. I tried so many combinations here that it's all a blur right now
Additionally, I added my own cross-validation method with a usage example. It wasn't perfect, but it allowed me to get results similar to those on the Kaggle test set. I implemented it later, just for fun and to test myself at a later stage (before we got validation from Vladimir). I didn't use it while implementing the winning models.
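For readers who want a rough idea of what such a check looks like, here is a minimal k-fold sketch. It is not the cross-validation code included with this solution. The names `k_fold_indices`, `cross_validate`, `fit_mean` and `predict_mean` are hypothetical, the model is a trivial mean predictor, and mean absolute error stands in for the competition metric.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the row indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(fit, predict, X, y, k=5, seed=0):
    """Return (mean score, per-fold scores) using mean absolute error.

    `fit(X_train, y_train)` must return a model object and
    `predict(model, X_test)` must return a list of predictions.
    Both callables are assumptions made for this illustration.
    """
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        # Train on everything except the held-out fold.
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict(model, [X[i] for i in held_out])
        # Score the held-out fold only.
        errors = [abs(p - y[i]) for p, i in zip(preds, held_out)]
        scores.append(sum(errors) / len(errors))
    return sum(scores) / len(scores), scores


# Hypothetical baseline model: always predict the training mean.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, X_test):
    return [model] * len(X_test)


# Toy data, invented purely for the usage example.
X = [[float(i)] for i in range(20)]
y = [100.0] * 20
mean_score, fold_scores = cross_validate(fit_mean, predict_mean, X, y, k=5)
print(mean_score, fold_scores)
```

Swapping in a real model only requires replacing the two callables. The error line would also need to change to whatever metric the leaderboard uses for the local score to be comparable.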