dbt snapshots 1.9 with empty source #906

Open
martinlarssonellevio opened this issue Jan 17, 2025 · 8 comments
Labels
bug Something isn't working

Comments

@martinlarssonellevio

Describe the bug

If the source table is empty we randomly get the following error message from Databricks:

12:29:33    Database Error in snapshot snapshot_catalogues_calculation_level (snapshots/bronze/pcs_pcsdbadm/bronze_snapshot_catalogues_calculation_level.yml)
  Character N is neither a decimal digit number, decimal point, nor "e" notation exponential mark.
  compiled code at target/run/leap/snapshots/bronze/pcs_pcsdbadm/bronze_snapshot_catalogues_calculation_level.yml

Steps To Reproduce

Create a source table and do not fill it with data.
Create a snapshot with a 1.9-style definition in a yml file and run it.

The error shows up seemingly at random, but on the same tables if you run the snapshot twice. We cannot see any pattern other than that the error goes away once the source table contains some data.

I should say that we are using an ephemeral view as "source" relation: ref('bronze_ephemeral_catalogues_calculation_level').
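
For reference, a YAML snapshot definition of the kind described in the steps above might look roughly like the sketch below. The snapshot name and `relation` come from this report; `unique_key`, `strategy`, `updated_at` and `hard_deletes` are placeholder values assumed for illustration, not the actual project settings.

```yaml
snapshots:
  - name: snapshot_catalogues_calculation_level
    # The "source" here is an ephemeral model, as noted above.
    relation: ref('bronze_ephemeral_catalogues_calculation_level')
    config:
      unique_key: id              # assumption: hypothetical key column
      strategy: timestamp         # assumption: could equally be `check`
      updated_at: updated_at      # assumption: hypothetical timestamp column
      hard_deletes: invalidate    # assumption: one of the 1.9 hard_deletes options
```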

Expected behavior

The snapshot should execute without error.

Screenshots and log output

System information

The output of dbt --version:

1.9.0

The operating system you're using:

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.5 LTS
Release:        22.04
Codename:       jammy

The output of python --version:

Python 3.12.7

Additional context

No

martinlarssonellevio added the bug label Jan 17, 2025
martinlarssonellevio changed the title from "dbt snapshots 1.9" to "dbt snapshots 1.9 with empty source" Jan 17, 2025
@benc-db
Collaborator

benc-db commented Jan 17, 2025

Does using PR #904 remedy the issue, or do I have another snapshot bug?

@martinlarssonellevio
Author

martinlarssonellevio commented Jan 20, 2025

I don't know. Will you be making a release with #904?

@benc-db
Collaborator

benc-db commented Jan 21, 2025

yes, today.

@benc-db
Collaborator

benc-db commented Jan 21, 2025

Please try with 1.9.2 and let me know

@martinlarssonellevio
Author

Sorry to say that it does not. But I managed to fix the error. I have a setup with a macro to update dbt_valid_to when recreating snapshots from raw data. In our snapshots I have a post_hook that runs the macro set_dbt_valid_to with a variable called snapshot_start_date and it looks like this:

{% macro set_dbt_valid_to(snapshot_start_date) %}
    {% if snapshot_start_date != '1900-01-01' and config.get('hard_deletes') == 'invalidate' %}
        {{ log('Updating dbt_valid_to: ' ~ snapshot_start_date ~ ' for table ' ~ this, True) }}
        UPDATE {{ this }}
        SET dbt_valid_to = '{{ snapshot_start_date }}'
        WHERE dbt_valid_to > '{{ snapshot_start_date }}'
    {% endif %}
{% endmacro %}

Randomly I got the same error here when the dbt_valid_to column was mainly NULLs. Once I changed WHERE dbt_valid_to > '{{ snapshot_start_date }}' to dbt_valid_to >= '{{ snapshot_start_date }}' (> to >=) the error went away. This feels a bit like a Databricks bug. Not sure I can reproduce this easily though.
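
For context, a hook like this could be wired into a YAML snapshot config roughly as follows. This is only a sketch under assumptions: the `var()` lookup and its `'1900-01-01'` default are hypothetical (chosen to match the sentinel value the macro checks for), and the remaining snapshot settings are left out.

```yaml
snapshots:
  - name: snapshot_catalogues_calculation_level
    relation: ref('bronze_ephemeral_catalogues_calculation_level')
    config:
      # other snapshot settings (unique_key, strategy, ...) omitted
      # assumption: the start date is passed in as a project variable
      post_hook: "{{ set_dbt_valid_to(var('snapshot_start_date', '1900-01-01')) }}"
```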

@martinlarssonellevio
Author

Closing, as I don't think dbt-databricks is causing this.

martinlarssonellevio closed this as not planned Jan 22, 2025
@martinlarssonellevio
Author

martinlarssonellevio commented Jan 24, 2025

We just had 16 new snapshot tables deployed. All source tables were empty and one of them failed with the above error message. Once I filled up all source tables the error was gone.
I don't know if this is a bug in dbt-databricks or in Databricks itself, but I would love it if you had the time to try to reproduce this on your side.

We use an ephemeral model as the source of our snapshots to be able to filter out duplicates.

@benc-db
Collaborator

benc-db commented Jan 24, 2025

BTW, I think the snapshot functionality is expecting 'valid to' to be a date in the far future, rather than null; at least that's how the functional tests are written
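
If a far-future end date is preferred over NULL for current rows, dbt 1.9 has a `dbt_valid_to_current` snapshot config intended for that. A minimal sketch, assuming the date expression below suits the warehouse and with the other snapshot settings left out; whether it avoids the error reported in this issue is untested.

```yaml
snapshots:
  - name: snapshot_catalogues_calculation_level
    relation: ref('bronze_ephemeral_catalogues_calculation_level')
    config:
      # other snapshot settings (unique_key, strategy, ...) omitted
      # assumption: a far-future date stands in for "still current"
      dbt_valid_to_current: "to_date('9999-12-31')"
```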
