-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat: Source Microsoft Onedrive #1199
Closed
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
LGTM |
Updates the connector to use `stateKey` for each binding to track its state. This includes a migration for existing captures. Once all existing captures have started up with this new connector and we are confident that no additional old captures will be re-enabled, we can remove the migration code.
The materializations for some fields may be ambiguous if the destination systems are for example case insensitive, and a field like "thisField" will be materialized as the same column as "thisfield". Presently this is a concern for `materialize-redshift`, which treats everything as lowercase, and `materialize-snowflake` because of the historical way we have quoted identifiers. Historically we have not handled these cases well, and would end up with a error during apply because of trying to create a duplicate column. More recently, changes were made to outright forbid these ambiguous fields, and instead require alternate projections of them in order to be materialized. Forbidding the fields was simple, but excessive since users generally want to use field selection (rather than projections on collections) to pick which ambiguous field they actually want as the quick and easy way to get something working. This change enables that by implementing the following behaviors: * For a new binding of a materialization, ambiguous fields will all be optional if they aren't otherwise forbidden. The constraint description will say that only one of them should be selected to materialize, since we don't have any kind of mutual exclusion constraint. If a user selects more than one of these fields for a new materialization, it will fail to apply. * Once an ambiguous field is selected for materialization, all others from that set of ambiguous fields will be forbidden. If a binding is published with no ambiguous field selected, they will all still be constrained as optional so that it is possible to select one for an existing binding. * In the event of a pre-existing table with no previously persisted spec for that binding, the ambiguous fields will still be set as optional, because we don't know which projection the existing column should correspond to without a field selection. This is different than other pre-existing columns which we will recommend by default, but seems to be necessary given the ambiguous nature of the columns.
… default value Materializations do not support collections with nullable keys, and will generally fail if a null value is every encountered for a collection key. This changes validation to enforce that a nullable collection key not be allowed, unless it has a default value annotation. If there is a default value annotation, the connector is assured that it will never actually see a null value, and it will not try to drop NOT NULL constraints from the materialized field in future `Apply` calls.
…or float64 We used to create number columns as "bignumeric", so allowing "bignumeric" for these columns allows for backward compatibility. See #859.
…keyword Users may wish to pre-create indices with specific mappings as keywords instead of text, or alter their mappings after the fact.
…ceStates` `updateResourceStates` may remove bindings from the checkpoint, and these removals won't be reflected in persisted merge patches in future document checkpoints. Because of this, we should always emit a full update to the state checkpoint in `Pull` after `updateResourceStates` has run to persist any changes in the state due to bindings being removed.
…numeric or float64" This reverts commit 24b4fc1.
We used to create number columns as "bignumeric" (see #859.), and now create strings formatted as integer columns as "bignumeric". Allowing "bignumeric" columns to be either integer or number compatible allows both new and old materializations to work.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description:
(Describe the high level scope of new or changed features)
Workflow steps:
(How does one use this feature, and how has it changed)
Documentation links affected:
(list any documentation links that you created, or existing ones that you've identified as needing updates, along with a brief description)
Notes for reviewers:
(anything that might help someone review this PR)
This change isdata:image/s3,"s3://crabby-images/d0bb7/d0bb7f7625ca5bf5c3cf7a2b7a514cf841ab8395" alt="Reviewable"