[HUDI-8785] Missed set of POPULATE_META_FIELDS parameter during table initialization by Flink #12731

Merged 1 commit into apache:master on Jan 31, 2025

Conversation

Contributor

@geserdugarov geserdugarov commented Jan 29, 2025

Change Logs

Currently, running the following queries in Flink:

CREATE TABLE hudi_debug (
    id INT,
    part INT,
    desc STRING,
    PRIMARY KEY (id) NOT ENFORCED
) 
WITH (
    'connector' = 'hudi',
    'path' = '...',
    'table.type' = 'MERGE_ON_READ',
    'write.operation' = 'upsert',
    'hoodie.populate.meta.fields' = 'false'
); 

INSERT INTO hudi_debug VALUES 
    (1,100,'aaa'),
    (2,200,'bbb'); 

SELECT * FROM hudi_debug; 

results in

org.apache.hudi.exception.HoodieException: Exception when reading log file

The cause is that Flink does not set hoodie.populate.meta.fields when it initializes the table.
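The mismatch can be illustrated with a small self-contained model. This is only a sketch under stated assumptions: the class and method names are hypothetical, plain maps stand in for the Flink job options and the persisted Hudi table properties, and the reading of the failure (a reader falling back to the default of true while the data was written without meta fields) is an inference from the description, not taken from the Hudi code.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the bug. Illustrative stand-ins only, not Hudi or Flink APIs.
final class MetaFieldsInitSketch {
    static final String POPULATE_META_FIELDS = "hoodie.populate.meta.fields";
    // Assumption: the option is treated as "true" when it is absent.
    static final String DEFAULT_POPULATE_META_FIELDS = "true";

    private MetaFieldsInitSketch() {}

    // Before the fix: the option is never copied from the job options
    // into the table properties created at initialization.
    static Map<String, String> initTableBefore(Map<String, String> jobOptions) {
        Map<String, String> tableProps = new HashMap<>();
        tableProps.put("hoodie.table.type",
            jobOptions.getOrDefault("table.type", "COPY_ON_WRITE"));
        return tableProps;
    }

    // After the fix: the option is persisted with the other table properties.
    static Map<String, String> initTableAfter(Map<String, String> jobOptions) {
        Map<String, String> tableProps = initTableBefore(jobOptions);
        tableProps.put(POPULATE_META_FIELDS,
            jobOptions.getOrDefault(POPULATE_META_FIELDS, DEFAULT_POPULATE_META_FIELDS));
        return tableProps;
    }

    // What a reader would conclude from the persisted table properties alone.
    static boolean readerExpectsMetaFields(Map<String, String> tableProps) {
        return Boolean.parseBoolean(
            tableProps.getOrDefault(POPULATE_META_FIELDS, DEFAULT_POPULATE_META_FIELDS));
    }
}
```

With the job options from the CREATE TABLE statement above, readerExpectsMetaFields returns true for the table produced by initTableBefore and false for the one produced by initTableAfter, which matches how the data was actually written.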

With the proposed changes, the following query runs successfully:

SELECT * FROM hudi_debug

Impact

Fixes Flink reads of data written without Hudi meta columns.

Risk level (write none, low medium or high below)

Low

Documentation Update

Not needed

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@github-actions github-actions bot added the size:XS PR with lines of changes in <= 10 label Jan 29, 2025
@@ -270,6 +270,11 @@ public static HoodieTableMetaClient initTableIfNotExists(
.setUrlEncodePartitioning(conf.getBoolean(FlinkOptions.URL_ENCODE_PARTITIONING))
.setCDCEnabled(conf.getBoolean(FlinkOptions.CDC_ENABLED))
.setCDCSupplementalLoggingMode(conf.getString(FlinkOptions.SUPPLEMENTAL_LOGGING_MODE))
.setPopulateMetaFields(
Boolean.parseBoolean(
Contributor

Can we move the option resolution into OptionsResolver?

Contributor Author

I will check it.

Contributor Author

Done.
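The refactor requested in this thread might look roughly like the sketch below. The merged code is not shown on this page, so everything here is an assumption: the class name, the method name isPopulateMetaFields, and the use of a plain map in place of Flink's Configuration are all hypothetical. The point is only that parsing the option and applying its default live in one helper, so the table initialization code can call the helper instead of inlining Boolean.parseBoolean.

```java
import java.util.Map;

// Hypothetical stand-in for a central options resolver.
// The method name and the Map-based config are assumptions, not the Hudi API.
final class OptionsResolverSketch {
    static final String POPULATE_META_FIELDS = "hoodie.populate.meta.fields";

    private OptionsResolverSketch() {}

    // Resolves the option in one place; an absent option falls back to "true".
    static boolean isPopulateMetaFields(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(POPULATE_META_FIELDS, "true"));
    }
}
```

A caller would then pass the resolved value straight to the builder, for example setPopulateMetaFields(OptionsResolverSketch.isPopulateMetaFields(conf)) in this simplified model.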

@geserdugarov geserdugarov force-pushed the master-populate-flink-init branch from 02d223b to cd9462f Compare January 30, 2025 07:16
@github-actions github-actions bot added size:S PR with lines of changes in (10, 100] and removed size:XS PR with lines of changes in <= 10 labels Jan 30, 2025
@geserdugarov geserdugarov force-pushed the master-populate-flink-init branch from cd9462f to 772eacd Compare January 30, 2025 11:04
@hudi-bot

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@wombatu-kun wombatu-kun merged commit d87b859 into apache:master Jan 31, 2025
44 checks passed
@geserdugarov geserdugarov deleted the master-populate-flink-init branch January 31, 2025 03:54
Labels
size:S PR with lines of changes in (10, 100]
4 participants