[demo] Track accuracy over time. #92
Comments
Hi @juharris
Hey, thanks for reaching out! As people add data and train a model, the model's accuracy on some test set will change, and I would like to track how that accuracy changes over time. I think there's a lot to be done for this issue, but we can break it down into steps. Ideally, for the highest transparency, we would compute the test set evaluation on-chain, but that would be very expensive and arguably wasteful. So what steps can we take towards transparency? As a start, you can store the accuracy and timestamp in a table that you can set up in demo/server.js. Maybe you can also store a hash of the test set data that was used and some other metadata about the test set? I think that's a decent start, and once that is done, you can get an idea of other ways to store test set metrics. You can also get into zero-knowledge proofs or use hashes to prove that the right computation was done to perform the evaluation.
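To make the first step concrete, here is a minimal sketch of what one stored entry could look like. The helper names (`hashTestSet`, `makeAccuracyRecord`), the field names, and the choice of SHA-256 over a JSON serialization are assumptions for illustration, not anything that exists in demo/server.js.

```javascript
const crypto = require('crypto');

// Hypothetical helper: hash the test set so that a stored accuracy can be
// tied to the exact data it was measured on.
// Assumes the test set is an array of plain values in a stable order, so
// that JSON.stringify gives the same string for the same data.
function hashTestSet(testSet) {
  return crypto
    .createHash('sha256')
    .update(JSON.stringify(testSet))
    .digest('hex');
}

// Hypothetical helper: build one row for the accuracy table.
function makeAccuracyRecord(modelId, accuracy, testSet, nowMs = Date.now()) {
  if (!(accuracy >= 0 && accuracy <= 1)) {
    throw new RangeError('accuracy must be between 0 and 1');
  }
  return {
    modelId,
    accuracy,
    timestamp: Math.floor(nowMs / 1000), // seconds since the Unix epoch
    testSetHash: hashTestSet(testSet),
    numSamples: testSet.length, // example of extra test set metadata
  };
}
```

Anyone who holds the same test set can recompute the hash and check that a stored accuracy refers to that data. This does not yet prove that the evaluation itself was done correctly.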
I think I just need to maintain a table of accuracy, timestamp, test set hash, and other metadata for the time being, and then improve it later, either with hashing to prove that changes were made or with zero-knowledge proofs.
Whenever a new training sample is added (that is, the data for a particular model changes), the accuracy of the model changes. The new accuracy then needs to be recorded with a timestamp in an SQLite table. I think every model uses the same data table, but each model will have a different accuracy on the same dataset. Do I need to maintain a separate table for each model to track its accuracy over time? Please correct me if I am wrong.
Using a new table for each model will be hard to manage, so they should all use the same table. You can use a column with a dataset name to help keep track of which dataset the model was tested against.
So we can create a table with the following columns: transaction_hash, model id, accuracy, timestamp. We have the following APIs: When do I need to call the function to check accuracy, and with which API?
Thanks for the update! I don't think a transaction hash is appropriate for the location of the data. I'm not sure what we should use. You can just make a "data_location" column and we can figure out what to put in it later. It might vary: the data might be on-chain, at a URL, or somewhere else. I believe I answered the question about the functions in the PR: you should make 2 new functions.
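Putting the suggestions so far together, a single shared table might look like the sketch below. Only "data_location", the dataset name column, accuracy, timestamp, and the model id come from this thread. The table name, exact column names, column types, and the `toInsertParams` helper are assumptions, and the SQL is shown as plain strings because the thread does not say which SQLite driver the server uses.

```javascript
// Hypothetical schema: one table shared by all models, with a dataset name
// column so that rows for different test sets can be told apart.
const CREATE_ACCURACY_TABLE = `
CREATE TABLE IF NOT EXISTS accuracy (
  model_id      TEXT    NOT NULL,
  dataset_name  TEXT    NOT NULL,
  data_location TEXT,             -- on-chain reference, URL, ...; left open for now
  accuracy      REAL    NOT NULL,
  timestamp     INTEGER NOT NULL  -- seconds since the Unix epoch
);`;

const INSERT_ACCURACY =
  'INSERT INTO accuracy (model_id, dataset_name, data_location, accuracy, timestamp) ' +
  'VALUES (?, ?, ?, ?, ?);';

// Hypothetical helper: order a record's fields to match the placeholders above.
// data_location is optional and stored as NULL when it is not known yet.
function toInsertParams(record) {
  return [
    record.modelId,
    record.datasetName,
    record.dataLocation ?? null,
    record.accuracy,
    record.timestamp,
  ];
}
```

Keeping `data_location` nullable lets rows be written now and the column's meaning settled later, as suggested above.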
What is left in this issue?
In database? On blockchain?
We now track the accuracy in the database (managed by the server). This is okay but it's centralized so it would be good to add some proof to that database.
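One possible way to add such a proof, sketched here purely as an idea, is to chain the rows together with hashes so that editing an old row changes every later hash. The latest hash could then be published somewhere harder to alter, for example on-chain. The function names, the record fields, and the hash-chain approach itself are assumptions, not something decided in this issue.

```javascript
const crypto = require('crypto');

// Placeholder "previous hash" for the first entry in the chain.
const GENESIS_HASH = '0'.repeat(64);

// Hash of one entry: covers the previous entry's hash plus this entry's fields.
function entryHash(prevHash, record) {
  const payload = JSON.stringify([
    prevHash,
    record.modelId,
    record.accuracy,
    record.timestamp,
  ]);
  return crypto.createHash('sha256').update(payload).digest('hex');
}

// Return a new chain with the record appended and linked to the last entry.
function appendRecord(chain, record) {
  const prevHash = chain.length ? chain[chain.length - 1].hash : GENESIS_HASH;
  return [...chain, { ...record, prevHash, hash: entryHash(prevHash, record) }];
}

// Recompute every hash; any edited, removed, or reordered entry breaks a link.
function verifyChain(chain) {
  let prev = GENESIS_HASH;
  for (const entry of chain) {
    if (entry.prevHash !== prev || entry.hash !== entryHash(prev, entry)) {
      return false;
    }
    prev = entry.hash;
  }
  return true;
}
```

This only makes tampering detectable by someone who knows a trusted copy of the latest hash. It does not show that the accuracy values were computed correctly in the first place.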