adding pdq db creation and data ingestion features #100

Merged
merged 12 commits into from
May 12, 2024

Conversation

kopardev
Member

@kopardev kopardev commented May 12, 2024

Changes

This PR will include scripts to:

  • create db schema for pdq data ingestion
  • data ingestion
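As a rough illustration of the two steps above, here is a minimal Python sketch using the standard-library `sqlite3`, `csv`, and `gzip` modules. It is not the script added in this PR: the table name `pdq`, every column name, and the helper functions are hypothetical, and the sketch assumes header-less, tab-separated input with three fields per row.

```python
import csv
import gzip
import sqlite3

# Hypothetical schema: table and column names are illustrative only and are
# not taken from this PR.
SCHEMA = """
CREATE TABLE IF NOT EXISTS pdq (
    run_date  TEXT    NOT NULL,
    datamount TEXT    NOT NULL,
    username  TEXT    NOT NULL,
    ninodes   INTEGER NOT NULL,
    nbytes    INTEGER NOT NULL,
    PRIMARY KEY (run_date, datamount, username)
);
"""


def create_db(path):
    """Open (or create) the SQLite database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def open_tsv(path):
    """Open a plain or gzip-compressed TSV file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", newline="")
    return open(path, "rt", newline="")


def ingest_rows(conn, run_date, datamount, rows):
    """Insert (username, ninodes, nbytes) rows; re-ingesting a key replaces it."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO pdq VALUES (?, ?, ?, ?, ?)",
            ((run_date, datamount, u, int(i), int(b)) for u, i, b in rows),
        )


def ingest_tsv(conn, run_date, datamount, tsv_path):
    """Read one header-less TSV (assumed layout) and load it into the db."""
    with open_tsv(tsv_path) as fh:
        ingest_rows(conn, run_date, datamount, csv.reader(fh, delimiter="\t"))
```

The composite primary key combined with `INSERT OR REPLACE` is one way to make repeated ingestion of the same file idempotent; whether the actual scripts take that approach is not stated here.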

Issues

#99
#101

PR Checklist

(Strikethrough any points that are not applicable.)

  • This comment contains a description of changes with justifications, with any relevant issues linked.
  • Update docs if there are any API changes.
  • Update CHANGELOG.md with a short description of any user-facing changes and reference the PR number. Guidelines: https://keepachangelog.com/en/1.1.0/

@kopardev kopardev marked this pull request as draft May 12, 2024 16:12
@kopardev
Member Author

Added the create_and_append_db.sh script, which creates the db with all data up to date.

  • see file /data/CCBR_Pipeliner/userdata/spacesavers2_pdq
  • file /data/CCBR_Pipeliner/userdata/spacesavers2_pdq is only 104K, whereas all the TSV and TSV.gz files together account for 13M of disk space, so the sqlite3 db saves a lot of disk space and improves performance.

@kopardev kopardev marked this pull request as ready for review May 12, 2024 21:09
@kopardev kopardev merged commit ab322f0 into main May 12, 2024
1 check passed
@kopardev kopardev deleted the pdq_db branch May 12, 2024 21:10