Integrated Helsinki's palvelukartta data to search index. #160

Open
JaniL wants to merge 3 commits into master

@JaniL JaniL commented Jan 23, 2013

#158

Hi,

I decided to do something about this. This pull request adds 10,844 places to the search index. The implementation uses my pk-extraction script, which is powered by my wrapper module for Palvelukartta's REST API.

pk-extraction needs to be installed globally to work with the current implementation.

npm install -g git://github.com/JaniL/pk-extraction.git

@teropa
Member

teropa commented Jan 27, 2013

Cool!
Will get back to this in a week or so, unless @sluukkonen has a chance to get into it before that.

@sluukkonen
Member

Nice work, and it's time to update the search index in any case.

Thanks a lot!

@sluukkonen
Member

One thing this needs to do is separate the results by city, as the search index currently extracts the city name from the filename (e.g. helsinki.txt -> Helsinki). So if you could group the results by city and add command-line parameters similar to those in kalkati-extraction, we could merge this right away.
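
A minimal sketch of the grouping described above, assuming each extracted unit is an object with a `city` field (a hypothetical shape, not the actual pk-extraction output) and that one file is written per city. `cityFromFilename` mirrors the helsinki.txt -> Helsinki convention mentioned in the comment. The function names are made up for illustration.

```javascript
// Assumed unit shape (hypothetical): { name: string, city: string, ... }

// Derive a display name from a filename, e.g. "helsinki.txt" -> "Helsinki".
function cityFromFilename(filename) {
  const base = filename.replace(/\.txt$/, "");
  return base.charAt(0).toUpperCase() + base.slice(1);
}

// Group units by lower-cased city so each group can go to "<city>.txt".
function groupByCity(units) {
  const groups = new Map();
  for (const unit of units) {
    const key = unit.city.toLowerCase();
    if (!groups.has(key)) {
      groups.set(key, []);
    }
    groups.get(key).push(unit);
  }
  return groups;
}

// Filename that the index would later map back to a city name.
function outputFilename(cityKey) {
  return cityKey + ".txt";
}
```

Lower-casing the key keeps "Helsinki" and "helsinki" in the same file, so the filename round-trips through `cityFromFilename`.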

The whole update process could use some rethinking, but my time and motivation to work on it are limited, so this is probably the least painful way to integrate the results.

@JaniL
Author

JaniL commented Jan 27, 2013

Done. The tool has been updated as well, so fetch the newest version.

Changelog of the extraction tool:

  • Units with no coordinates are ignored (some of them didn't have any, leading to "undefined,undefined" coordinates in the files)
  • Units named "Puisto, lähivirkistysalue tai vastaava" are ignored as well (there were at least ten of those)
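
The two rules in the changelog could look roughly like this. It is a sketch only: the `lat`, `lon`, and `name` field names are assumptions, and the real pk-extraction code may be structured differently.

```javascript
// Placeholder name that the changelog says is skipped.
const PLACEHOLDER_NAME = "Puisto, lähivirkistysalue tai vastaava";

// True only when both coordinates are real numbers (not undefined or NaN).
function hasCoordinates(unit) {
  return Number.isFinite(unit.lat) && Number.isFinite(unit.lon);
}

// Keep units that have coordinates and a non-placeholder name.
function filterUnits(units) {
  return units.filter(
    (unit) => hasCoordinates(unit) && unit.name !== PLACEHOLDER_NAME
  );
}
```

Checking with `Number.isFinite` rejects missing values before they can be written out as "undefined,undefined".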

@sluukkonen
Member

Thanks! Going to merge and deploy this later today.

@sluukkonen
Member

Ok, I took a look at this.

The data import works well, but I think we should do some additional filtering on the Palvelukartta data. For example, you get a lot of results like "Brahenpuiston koulu 2013-2014", "Iltapäivätoiminta / Brahenpuiston koulu, Opetustoimi" and "Brahenpuiston koulu, kouluterveydenhuolto" when we already have "Brahenpuiston koulu" indexed from the OpenStreetMap data.

I'm not really sure what the biggest pitfalls of our OSM data are. Are there specific categories where the Palvelukartta data would really help?
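
One crude way to express this kind of filtering is to drop any Palvelukartta name that contains a name already indexed from OSM. This is a sketch under that assumption, not something the thread settles on. The function names are hypothetical, and both inputs are assumed to be plain arrays of names.

```javascript
// Normalize names so comparison ignores case and surrounding whitespace.
function normalize(name) {
  return name.trim().toLowerCase();
}

// Drop Palvelukartta names that contain an already indexed OSM name.
function dropVariantsOfIndexedNames(pkNames, osmNames) {
  const indexed = osmNames.map(normalize).filter((n) => n.length > 0);
  return pkNames.filter((name) => {
    const candidate = normalize(name);
    return !indexed.some((osmName) => candidate.includes(osmName));
  });
}
```

Substring matching removes all three "Brahenpuiston koulu" variants quoted above, but it would also drop distinct places whose names happen to contain a short OSM name, so it would likely need a length threshold or a per-category allow-list.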

@JaniL
Author

JaniL commented Jan 29, 2013

That's hard to say, as I'm not a frequent user of either OSM or Palvelukartta.

If no one has any suggestions, this should be closed, as the data can't be imported without heavy filtering.

@sluukkonen
Member

Integrating Palvelukartta data was originally Tero's idea - let's see if he has some ideas about how we should use it.
