Subset of annotations for FSD 1st release #26
Comments
We could do this in two ways:
Prioritizing categories according to their number of ground-truth annotations should lead to a balanced dataset (#38). I think this is enough for getting a good subset.
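A minimal sketch of what such a prioritization could look like, assuming that categories with the fewest ground-truth annotations are validated first. The function name, the category names and the counts are hypothetical and only illustrate the idea; the actual rule is the one discussed in #38.

```python
def prioritize_categories(gt_counts):
    """Order category names from fewest to most ground-truth annotations.

    gt_counts maps a category name to its current number of ground-truth
    annotations. Ties are broken alphabetically so the order is stable.
    Assumption: under-represented categories should be validated first.
    """
    return sorted(gt_counts, key=lambda name: (gt_counts[name], name))


# Hypothetical counts, for illustration only.
example_counts = {"Bark": 120, "Meow": 15, "Siren": 60}
priority_order = prioritize_categories(example_counts)
print(priority_order)
```

Validating in this order pushes effort toward the categories that are furthest from the target size, which is what would drive the dataset toward balance.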
We decided not to create any subset, but rather to prioritize some categories and annotations.
After defining the mapping to create lists of candidate samples to fill the sound categories (#25), we should decide on a subset of all mapped annotations on which to concentrate validations in the initial validation task for the FSD 1st release. This is related to the design of the subsets (#24): the chosen subset of annotations should meet the requirements of the data subsets that we want to provide.
A simple example:
Suppose we want the Medium subset of the FSD 1st release to have ~100k annotations with rater agreement. When validating an annotation, raters can answer NP or U with some probability, so not every candidate annotation ends up validated. This means that we should select >100k annotations as a starting point.
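To make the example concrete, here is a rough sketch of how the starting size could be estimated. It assumes that each candidate annotation is independently rejected (answered NP or U) with a fixed probability; the function name and the probability values are hypothetical, and the real rates would have to be measured on actual validations.

```python
import math


def required_candidates(target, p_np, p_u):
    """Estimate how many candidate annotations to select.

    target: number of validated annotations wanted in the subset.
    p_np, p_u: assumed probabilities that a candidate is answered NP or U.
    Returns the smallest candidate count whose expected number of
    surviving annotations reaches the target.
    """
    p_keep = 1.0 - p_np - p_u
    if p_keep <= 0.0:
        raise ValueError("p_np + p_u must be below 1")
    return math.ceil(target / p_keep)


# Hypothetical rates: 20% NP and 5% U.
print(required_candidates(100_000, 0.20, 0.05))
```

With these made-up rates, a 100k target needs roughly a third more candidates than the target itself. This ignores the extra votes needed to reach rater agreement, so it should be read as a lower bound.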