-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
source-redshift-batch: Code health improvements #2432
Conversation
As with all the other batch SQL captures, omit the default query template and instead retcon that resource property as an "override", meanwhile adding new schema/table name properties and using those by default in discovered bindings.
Moves around and renames the test suite helper functions to match what source-postgres-batch has. This adds the table name generation functions but doesn't use them since that would alter the test snapshots too.
This changes the names of all test tables but means less manual fiddling around in the table setup, a net win.
Redshift is kind of PostgreSQL and kind of not (for instance, Redshift has a completely different columnar storage engine and doesn't expose an XMIN system column), but it's close enough that our test suite passes when run against Postgres. Since it's tricky to test against Redshift in CI builds, we might as well do the next-best thing and test against a local Postgres instance since practically speaking the alternative is no CI tests at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
We do use an actual redshift (serverless) application for CI tests wtih materialize-redshift
, see this config. Not sure if that's worth trying to work in for the redshift batch capture tests now or in the future.
Oh thanks, somehow I missed that. I'll see if I can get that working for source-redshift-batch as well, testing against the actual system is definitely preferable if it's not terribly complicated. |
Adds SOPS-encrypted configs and tweaks the test logic to use them for connecting to a test database. There are configs for testing against a real Redshift instance and against a local Postgres instance, but by default Redshift will be used. The credentials plumbing so that SOPS can be run within a test to decrypt the configs is nontrivial.
96a0b48
to
62bab10
Compare
PTAL. I've added another commit which gets the full connector test suite running against that same Redshift instance as part of CI builds. The setup for that felt a bit complicated. Adding encrypted config files and rewriting the test helper logic to sops-decrypt those encrypted configs was straightforward enough, but:
|
I'm also not precisely sure how the materialization connector integration tests get the GCP credentials where they need to be, which is here where the configs are decrypted using the |
Yeah, I just tried doing the dumb thing and leaving it unset and apparently that's all it takes, so I guess SOPS (or rather, the GCP client library it uses, because I checked and the SOPS repo has no mentions of that environment variable in the code) will indeed take it directly from I thought I had tried that already, but I guess maybe I screwed something up when I did that the first time (in the context of making it work via the Dockerfile before I gave up on that, so maybe that was part of the problem) because it seems like it's working now. |
Description:
As with source-postgres-batch and source-mysql-batch, this PR is mostly a bunch of test suite improvements along with the change to omit query templates from discovered bindings by default in favor of letting the connector pick the appropriate default query when unspecified.
I've also added CI tests for source-redshift-batch which simply launch a local Postgres instance to test against. While Redshift is different from Postgres in some important ways, it's still close enough that our test suite passes when run against Postgres and if it's a choice between "no tests in CI builds" and "somewhat unrepresentative tests in CI builds" the latter is definitely preferable.
This change isdata:image/s3,"s3://crabby-images/d0bb7/d0bb7f7625ca5bf5c3cf7a2b7a514cf841ab8395" alt="Reviewable"