Prompt #2 at Biodiversity Next workshop: Top 3-5 ways to improve citizen science data quality #5

eellwood opened this issue Nov 15, 2019 · 0 comments
[Context: At the Biodiversity_Next conference workshop, we asked participants to come up with a list of 3-5 ways we can improve citizen science data quality and/or perceptions of citizen science data quality. These are the responses, separated by group.]

(Summary: https://github.com/tdwg/citizen-science/projects)


Look at project design:
- Don't ask people to record everything; for example, start with a limited number of easily identifiable species
- Guides should be simple
- Allow for recording of uncertainty in identifications, or identification at a higher taxonomic level
- Good guides should indicate what is needed to identify a species, e.g., take photos of both the top and underside of the insect
- Include notes that ask "Why do you think it is that species?", e.g., did you go by its sound, which guides did you use, did you consider related or look-alike species?


User-assigned confidence, e.g., not at all, a bit, very certain, with allowed defaults. Greater usability but greater error; this doesn't improve quality, but it helps quantify it (see the sketch below).
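
A minimal sketch of what this could look like in practice, assuming a simple three-level scale mapped to numeric scores; the field names, scale, and default value are illustrative, not taken from any particular platform:

```python
from dataclasses import dataclass

# Illustrative three-level scale mapped to numeric scores (an assumption, not a standard).
CONFIDENCE_LEVELS = {"not at all": 0.25, "a bit": 0.5, "very certain": 0.9}

@dataclass
class Observation:
    species: str
    observer: str
    confidence_label: str = "a bit"  # allowed default, traded off against added error

    @property
    def confidence_score(self) -> float:
        return CONFIDENCE_LEVELS.get(self.confidence_label, 0.5)

def mean_confidence(observations):
    """Quantify (not improve) quality, e.g., per species or per project."""
    scores = [o.confidence_score for o in observations]
    return sum(scores) / len(scores) if scores else 0.0

records = [
    Observation("Bombus terrestris", "user1", "very certain"),
    Observation("Bombus lucorum", "user2", "not at all"),
    Observation("Bombus lucorum", "user3"),  # default used
]
print(round(mean_confidence(records), 2))  # 0.55
```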


- Reduce misidentifications
- Stimulate use of scientific names
- Make participants more aware of the importance of data accuracy
- Educate people before letting them participate, in a fun way, e.g., using a quiz where they have to validate species images. This way you also learn about their level and which mistakes they make most often

- Use picklists/controlled vocabularies
- A second pass for specimens recorded by one-off, infrequent users
- Vet training materials
- Use a multiple-pass data-cleaning workflow, e.g., a first pass with OpenRefine to catch typos, then pass the records along to an expert (see the sketch after this list)
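
One way such a two-pass workflow could be sketched, assuming a small controlled species list and a fuzzy-matching first pass in the spirit of OpenRefine's clustering; the species list and match threshold are illustrative assumptions:

```python
import difflib  # stdlib fuzzy matching, standing in for an OpenRefine-style first pass

# Assumed controlled vocabulary; in practice this would come from a taxonomic backbone.
CONTROLLED_VOCABULARY = ["Turdus merula", "Turdus philomelos", "Erithacus rubecula"]

def first_pass(name: str, cutoff: float = 0.85):
    """First automated pass: fix likely typos; flag the rest for expert review."""
    match = difflib.get_close_matches(name, CONTROLLED_VOCABULARY, n=1, cutoff=cutoff)
    if match:
        return match[0], False   # typo corrected automatically
    return name, True            # second pass: route to an expert

for raw in ["Turdus merula", "Turdos merula", "Parus sp."]:
    cleaned, to_expert = first_pass(raw)
    print(f"{raw} -> {cleaned} | expert review: {to_expert}")
```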

- Add an uncertainty measure (%, slider, etc.). Some users won't add an observation if they are not 100% sure; others will add it if they are 50% sure
- More feedback to users
- Cross-checks, random validation by many users (see the consensus sketch after this list)
- New ways of gamification, rewards
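
A minimal sketch of the cross-check idea, assuming each observation collects several independent identifications and a simple majority-vote rule; the 2/3 agreement threshold is an illustrative assumption:

```python
from collections import Counter

def consensus(identifications, min_agreement=2 / 3):
    """Combine independent user identifications; escalate if agreement is low."""
    if not identifications:
        return None, False
    species, votes = Counter(identifications).most_common(1)[0]
    return species, votes / len(identifications) >= min_agreement

print(consensus(["Pica pica", "Pica pica", "Garrulus glandarius"]))
# ('Pica pica', True) -- accepted; anything below the threshold goes to review
```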

- Bias in space: communicate the bias and suggest/nudge where people record (see the grid-cell sketch after this list)
- Bias in taxonomic identification: guides, e.g., in-app keys, maybe AI, expert review
- Bias in habitat classification (as above)
- Bias in effort: ask the user, e.g., for a complete list
- Who is advertising: data providers with a vested interest in reuse, or (in the future) those whose funders moderate it to ensure openness
- Who is using: any modeler or data user should use this, but maybe we need to raise awareness and use of quality metadata
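
A minimal sketch of how spatial bias could be surfaced, assuming records are binned into coarse latitude/longitude grid cells and under-recorded cells are suggested to recorders; the cell size and threshold are illustrative:

```python
from collections import Counter

def grid_cell(lat, lon, cell_deg=0.5):
    """Snap a coordinate to the lower-left corner of its grid cell."""
    return (lat // cell_deg * cell_deg, lon // cell_deg * cell_deg)

# Illustrative records as (lat, lon) pairs.
records = [(52.1, 5.2), (52.15, 5.3), (52.6, 5.2), (51.9, 4.8)]
counts = Counter(grid_cell(lat, lon) for lat, lon in records)

for cell, n in sorted(counts.items(), key=lambda kv: kv[1]):
    if n < 2:  # illustrative threshold for "under-recorded"
        print(f"Cell {cell} has only {n} record(s); consider nudging recorders here.")
```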

Collection manager --> advertise quality
Data user --> buy-in


- Species identification: citizens shouldn't be forced to choose if they don't know
- Understanding data use
- Form layout/instructions: need to understand how to communicate

- Create applications with constraints and guides for data collection; do quality assurance on the front end, but leave a free-text field for remarks, even if it is not scientifically useful
- Create age- and education-level-appropriate training materials that have been tested and vetted by the user community as easy to use, easy to understand, and interesting
- Introduce the concept of uncertainty: have users calibrate themselves/their data and report some level of uncertainty

- Software that insulates the citizen from the technical details
- Better matching of project managers with solid project design and the citizen scientists interested in and capable of doing it
- Personalization/ownership of the outcome

- Better match the protocol, the app, and the audience
- Better mechanisms to keep people motivated and provide good-quality data
- Better mechanisms to validate/curate data

- Reduce the cost of validation through automated task assignment to the best-suited citizen scientists
- Improve data quality on site through automated feedback to the participant (e.g., "This species is not frequent here"; see the sketch after this list)
- Evaluate participant profiles in order to detect wrong patterns in platform images
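
A minimal sketch of such on-the-spot feedback, assuming a precomputed table of how frequently each species is recorded in a region; the table, region codes, and rarity threshold are illustrative assumptions:

```python
# Assumed precomputed shares of local records per (species, region).
LOCAL_FREQUENCY = {
    ("Vanessa atalanta", "NL-UT"): 0.12,
    ("Danaus plexippus", "NL-UT"): 0.0001,
}

def feedback(species, region, rare_threshold=0.001):
    """Warn the participant when a species is locally unusual."""
    freq = LOCAL_FREQUENCY.get((species, region), 0.0)
    if freq < rare_threshold:
        return (f"'{species}' is not frequently recorded in {region}; "
                "please double-check the identification or add a photo.")
    return "Thanks! Your observation looks plausible for this area."

print(feedback("Danaus plexippus", "NL-UT"))
print(feedback("Vanessa atalanta", "NL-UT"))
```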