Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PECO-1387] Fix: mv/st network request authentication is being overridden by local .netrc file #555

Merged
merged 24 commits into from
Jan 18, 2024

Conversation

susodapop
Copy link

@susodapop susodapop commented Jan 17, 2024

Description

This pull request fixes an issue in the materialized view / streaming table code path where a local .netrc file in a user's home directory could override the authentication headers used for polling the MV/ST status. The fix is the same as the one in #338 and uses the same BearerAuth class.

The fix is to use a requests.Session with its .auth property set to an instance of our custom BearerAuth class and then replace every call to requests.get(...) with a call to self.session.get(...). For simplicity, I also set the User-Agent header at the Session level so that we don't need to pass a headers dict around between methods.

To test this, I ran the TestMaterializedView integration test with a bad .netrc file in my home directory. Before this change, the test failed with this error:

Error getting info for materialized view/streaming table

This indicates an authentication error.

After the change in this pull request, the materialized view / streaming table test still fails (because the test is broken - which I confirmed with @rcypher-databricks ) but it doesn't fail because of an authentication issue. This tells me that the fix is working as it should.

As for a direct test of this behaviour, we don't have a clean way to add a bad .netrc to our cloud testing environment so a manual test like I describe above is the only choice :/

Checklist

  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt-databricks next" section.

Jesse Whitehouse added 19 commits May 19, 2023 12:10
Update changelog.

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Don't use dbt-spark as a default.

This is a no-op because this is always overridden anyway.

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
setting the BearerAuth on the session rather than per-request

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
refactor to push the .auth into requests session. It's not required in
DBContext anymore.

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
.netrc fix.

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
This was leftover from moving the authentication class into connections.py
in 4f56a5c

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
This fixes a merge conflict in the changelog

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
…override

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Jesse Whitehouse added 2 commits January 17, 2024 18:10
it around between method calls.

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Jesse Whitehouse added 2 commits January 18, 2024 15:07
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
pipeline = _get_pipeline_state(host, headers, pipeline_id)
session = Session()
session.auth = BearerAuth(headers)
session.headers = {"User-Agent": self._user_agent}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I set the headers on the Session now so we don't need to pass headers to _get_pipeline_state and _get_update_error_message etc.

@@ -606,7 +611,7 @@ def pollRefreshPipeline(
# should we do exponential backoff?
time.sleep(polling_interval)

pipeline = _get_pipeline_state(host, headers, pipeline_id)
pipeline = _get_pipeline_state(session, host, pipeline_id)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you can see I replace headers with session.

@susodapop susodapop merged commit 319c427 into main Jan 18, 2024
21 checks passed
@susodapop susodapop deleted the peco-1387-mvst-netrc-fix branch January 18, 2024 21:26
benc-db added a commit that referenced this pull request Jan 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants