Text mining #15

Open
romainsacchi opened this issue Mar 13, 2019 · 2 comments

Comments

@romainsacchi

Estimated person-hours: unknown
Volunteer(s)/Candidate(s): unknown
Task Description: Text mining of specific data sources, such as corporate sustainability reports and academic journal articles in PDF format. Text mining could also help structure the raw text already contained in LCA data.

On another note, a game-based approach could be considered, in which a broader community manually extracts and parses data.

Technical specifications:

Opportunities for machine learning and the use of (semi-)automated procedures to replace activities that currently require human intervention. A practical example could be text mining of specific data sources, e.g. corporate sustainability reports.
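To make the idea a bit more concrete, here is a minimal sketch of what a first text-mining pass over report text might look like, assuming the PDF has already been converted to plain text. The pattern, the short list of units, the function name `extract_quantities`, and the sample sentence are all made up for illustration; a real report would need a far richer pattern and unit list.

```python
import re

# Hypothetical pattern: a number (with optional thousands separators or a
# decimal part) followed by one of a few illustrative units.
QUANTITY = re.compile(
    r"(?P<value>\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?)\s*"
    r"(?P<unit>kWh|MWh|GWh|kg|tonnes|m3)\b"
)


def extract_quantities(text):
    """Return (value, unit) pairs found in free text."""
    results = []
    for match in QUANTITY.finditer(text):
        # Drop thousands separators before converting to a float.
        value = float(match.group("value").replace(",", ""))
        results.append((value, match.group("unit")))
    return results


# Invented sentence in the style of a sustainability report.
sample = (
    "In 2018 our plants consumed 1,250 MWh of electricity and "
    "emitted 430.5 tonnes of CO2, while using 12000 m3 of water."
)

print(extract_quantities(sample))
```

Note that the year 2018 is skipped because it is not followed by a unit. A pass like this only gives candidate values; linking each value to the right flow or activity would still need human review or a more capable model.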

Data updates and adding new data points: Potential to assign tasks to Master's students, group work, classroom projects. One of the flow-property layers has to be defined as the natural unit for each product. The natural unit is the one that the product cannot lose without losing its meaning.

@romainsacchi

Can we make these two sentences clearer? "One of the flow-property layers has to be defined as the natural unit for each product. The natural unit is the one that the product cannot lose without losing its meaning."

@romainsacchi

Python libraries like beautifulsoup4 could be considered, for example. You will find here a script I wrote that scrapes data from globalenergyobservatory.org and maps all the coal power plants in the world, along with their capacity, the type of coal used, etc.
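For readers who have not done this kind of scraping, the sketch below shows the general shape of the approach on an inline HTML fragment. It is not the script mentioned above, and it swaps beautifulsoup4 for the standard-library `html.parser` so that it runs without any install. The table layout, the column names, and the plant entries are invented placeholders, not the real structure or data of globalenergyobservatory.org.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the cell text of every <tr> in a document, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            self._in_cell = False
            self._row[-1] = self._row[-1].strip()
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data


def parse_plants(html):
    """Turn a header row plus data rows into a list of dicts."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Invented fragment; names and values are placeholders, not real plant data.
SAMPLE_HTML = """
<table>
  <tr><th>name</th><th>capacity_mw</th><th>coal_type</th></tr>
  <tr><td>Example Plant A</td><td>600</td><td>lignite</td></tr>
  <tr><td>Example Plant B</td><td>1200</td><td>bituminous</td></tr>
</table>
"""

plants = parse_plants(SAMPLE_HTML)
for plant in plants:
    print(plant["name"], plant["capacity_mw"], plant["coal_type"])
```

With beautifulsoup4 the parsing class would shrink to a few lines, but the overall flow (fetch the page, pull out the rows, map them to named fields) would stay the same.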

romainsacchi removed this from the alpha release milestone Mar 13, 2019