Datalake 0.15.0 #11

Draft
wants to merge 729 commits into
base: master
d36e526
[HUDI-7464] Fix minor bugs in kafka post-processing related code (#10…
the-other-tim-brown Mar 1, 2024
0a4bed6
[MINOR] Fix violations of Sonarqube rule java:S2184 (#10444)
KUTEJiang Mar 1, 2024
d81f2bb
[HUDI-7447] Fix not bootstrap when subTask restart when OPCoordinator…
wenbingshen Mar 2, 2024
99d52b7
[HUDI-9424] Support using local timezone when writing flink TIMESTAMP…
cmmp6 Mar 2, 2024
68659a1
[HUDI-7465] Split tests in CI further to reduce total CI elapsed time…
yihua Mar 2, 2024
4bdb98b
[HUDI-6089] Handle default insert behaviour to ingest duplicates (#10…
wombatu-kun Mar 3, 2024
53725e8
[HUDI-7469] Reduce redundant tests with Hudi record types (#10800)
yihua Mar 3, 2024
7c3437f
[HUDI-6953] Adding test for composite keys with bulk insert row write…
nsivabalan Mar 3, 2024
e2ec366
[HUDI-3625] Update RFC-60 (#9462)
CTTY Mar 4, 2024
734e0cf
[MINOR] Clean code of FileSystemViewManager (#10797)
stayrascal Mar 4, 2024
05e16b7
[HUDI-7471] Use existing util method to get Spark conf in tests (#10802)
yihua Mar 4, 2024
e35fa8d
[MINOR] Add PR description validation on documentation updates (#10799)
yihua Mar 4, 2024
a4aa005
[HUDI-7479] SQL confs don't propagate to spark row writer properly (#…
jonvex Mar 5, 2024
5deb196
[HUDI-7337] Implement MetricsReporter that reports metrics to M3 (#10…
kbuci Mar 5, 2024
78bf676
[HUDI-7413] Fix schema exception types and error messages thrown with…
jonvex Mar 5, 2024
4538fb2
[HUDI-7418] Create a common method for filtering in S3 and GCS source…
rmahindra123 Mar 6, 2024
81fe5ad
[MINOR] Fix Azure publishing of JUnit results (#10817)
yihua Mar 6, 2024
111d138
[MINOR] Publish test results from the containerized job to Azure (#10…
yihua Mar 6, 2024
3d5d274
[HUDI-7473] Rebalance CI (#10805)
yihua May 14, 2024
45923f3
[HUDI-6947] Refactored HoodieSchemaUtils.deduceWriterSchema with many…
geserdugarov Mar 7, 2024
695577b
[HUDI-7356] Passing configs to file reader constructor for flexibilit…
wombatu-kun Mar 7, 2024
4680cb4
[HUDI-7197] Adding mis fixes related with table services testing (#10…
harsh1231 Mar 7, 2024
9f00f6d
[HUDI-5167] Reducing total test run time: reducing tests for virtual …
nsivabalan Mar 7, 2024
8ed8a20
[HUDI-7488] The BigQuerySyncTool can't work well when the hudi table …
steve-xi-awx Mar 8, 2024
06584c6
[MINOR] Separate HoodieSparkWriterTestBase to reduce duplication (#10…
geserdugarov Mar 8, 2024
8e6eff9
[HUDI-7491] Fixing handling null values of extra metadata in clean co…
nsivabalan Mar 8, 2024
dbe16f3
[HUDI-7411] Meta sync should consider cleaner commit (#10676)
codope Mar 8, 2024
866348a
[ENG-6316] Bump cleaner retention for MDT (#537) (#10655)
codope Mar 8, 2024
632e61f
[HUDI-6043] Metadata Table should use default values for Compaction p…
lokeshj1703 Mar 9, 2024
02ae11f
[HUDI-5101] Adding spark-structured streaming test support via spark-…
nsivabalan Mar 9, 2024
45a2e07
[HUDI-7495] Bump mysql-connector-java from 8.0.22 to 8.0.28 in /hudi-…
dependabot[bot] Mar 9, 2024
80990d4
[HUDI-7163] Fix not parsable text DateTimeParseException when compact…
wuzhenhua01 Mar 9, 2024
3f78130
[HUDI-7496] Bump mybatis from 3.4.6 to 3.5.6 in /hudi-platform-servic…
dependabot[bot] Mar 9, 2024
c2c7e05
[HUDI-1517] create marker file for every log file (#11187)
codope May 14, 2024
58ae418
[MINOR] Remove repetitive words in docs (#10844)
studystill Mar 11, 2024
7b734ac
[HUDI-7489] Avoid collecting WriteStatus to driver in row writer code…
jonvex Mar 12, 2024
6256035
add job context (#10848)
the-other-tim-brown Mar 12, 2024
0819a8b
[HUDI-7478] Fix max delta commits guard check w/ MDT (#10820)
wombatu-kun May 14, 2024
9ff708b
[MINOR] Fix and enable test TestHoodieDeltaStreamer.testJdbcSourceInc…
wombatu-kun Mar 15, 2024
3f8859a
[HUDI-7382] Get partitions from active timeline instead of listing wh…
fhan688 Mar 15, 2024
774b401
[MINOR] rename KeyGenUtils#enableAutoGenerateRecordKeys (#10871)
wombatu-kun Mar 15, 2024
d99bf04
[HUDI-7506] Compute offsetRanges based on eventsPerPartition allocate…
vinishjail97 Mar 15, 2024
41ba99d
[HUDI-7466] Add parallel listing of existing partitions in Glue Catal…
VitoMakarevich Mar 16, 2024
f061cbf
[HUDI-7421] Build HoodieDeltaWriteStat using HoodieDeltaWriteStat#cop…
wombatu-kun Mar 18, 2024
29b3ff9
[HUDI-7492] Fix the incorrect keygenerator specification for multi pa…
empcl Mar 18, 2024
1cd6900
[MINOR] Add Hudi icon for idea (#10880)
qidian99 Mar 19, 2024
30f6e83
[HUDI-7516] Put jdbc-h2 creds into static variables for hudi-utilitie…
wombatu-kun Mar 20, 2024
7571aa0
[MINOR] Remove redundant fileId from HoodieAppendHandle (#10901)
wombatu-kun May 14, 2024
d8cb589
[HUDI-7529] Resolve hotspots in stream read (#10911)
zhuanshenbsj1 Mar 23, 2024
84b85ee
[HUDI-7487] Fixed test with in-memory index by proper heap clearing (…
geserdugarov Mar 23, 2024
4966601
[MINOR] Refactored `@Before*` and `@After*` in `HoodieDeltaStreamerTe…
geserdugarov Mar 23, 2024
a119006
[HUDI-7530] Refactoring of handleUpdateInternal in CommitActionExecut…
wombatu-kun Mar 23, 2024
0a92b67
[HUDI-7499] Support FirstValueAvroPayload for Hudi (#10857)
xuzifu666 Mar 24, 2024
b8aa7d8
checkstyle (#10919)
zhuanshenbsj1 Mar 25, 2024
c6ad102
[HUDI-7513] Add jackson-module-scala to spark bundle (#10877)
xicm Mar 25, 2024
24f0b68
[MINOR] Restore the setMaxParallelism setting for HoodieTableSource.p…
zhuanshenbsj1 Mar 26, 2024
9de9cbb
[HUDI-7531] Consider pending clustering when scheduling a new cluster…
yihua May 14, 2024
4397202
[HUDI-7518] Fix HoodieMetadataPayload merging logic around repeated d…
yihua May 14, 2024
b16fe5d
[HUDI-7500] fix gaps with deduce schema and null schema (#10858)
jonvex Mar 27, 2024
3a2a123
[HUDI-7551] Avoid loading all partitions in CleanPlanner when MDT is …
the-other-tim-brown Mar 28, 2024
d8cccb2
[HUDI-6317] Streaming read should skip compaction and clustering inst…
SteNicholas Mar 28, 2024
f602eec
[MINOR} When M3 metrics reporter type is used HoodieMetricsConfig sho…
kbuci Mar 29, 2024
ed34f95
[HUDI-7187] Fix integ test props to honor new streamer properties (#1…
wombatu-kun Mar 31, 2024
58b0d24
[HUDI-6538] Refactor methods in TimelineDiffHelper class (#10938)
wombatu-kun Apr 1, 2024
2adac11
[HUDI-7557] Fix incremental cleaner when commit for savepoint removed…
codope Apr 1, 2024
0eaad07
[MINOR] Upgrade mockito to 3.12.4 (#10953)
jonvex Apr 2, 2024
f8de98a
[HUDI-7564] Fix HiveSyncConfig inconsistency (#10951)
voonhous Apr 3, 2024
71ea426
[HUDI-7569] [RLI] Fix wrong result generated by query (#10955)
bhat-vinay Apr 3, 2024
b6273b9
[HUDI-7486] Classify schema exceptions when converting from avro to s…
jonvex Apr 3, 2024
b633362
[HUDI-7564] Revert hive sync inconsistency and reason for it (#10959)
voonhous Apr 4, 2024
a3846f1
[HUDI-7556] Fixing MDT validator and adding tests (#10939)
nsivabalan Apr 5, 2024
8cdadad
[HUDI-7571] Add api to get exception details in HoodieMetadataTableVa…
lokeshj1703 Apr 5, 2024
2194bd4
[MINOR] Removed FSUtils.makeBaseFileName without fileExt param (#10963)
wombatu-kun Apr 5, 2024
e8e699a
[MINOR] Handle cases of malformed records when converting to json (#1…
the-other-tim-brown Apr 6, 2024
4ed94d3
[MINOR] use Temurin jdk (#10948)
sullis Apr 7, 2024
4c824b5
[MINOR] Removed FSUtils.makeBaseFileName without fileExt param (#10967)
wombatu-kun Apr 8, 2024
f2c1b4d
[HUDI-6854] Change default payload type to HOODIE_AVRO_DEFAULT (#10949)
wombatu-kun May 14, 2024
7c0f9ac
[HUDI-7572] Avoid to schedule empty compaction plan without log files…
danny0405 May 14, 2024
704527d
[HUDI-7559] [1/n] Fix RecordLevelIndexSupport::filterQueryWithRecordK…
bhat-vinay Apr 9, 2024
8bbfcee
[MINOR] Optimize print write error msg in StreamWriteOperatorCoordina…
zhuanshenbsj1 Apr 10, 2024
fad8ff0
[HUDI-7556] Fixing false positive validation with MDT validator (#10986)
nsivabalan May 15, 2024
53bdcb0
[HUDI-7583] Read log block header only for the schema and instant tim…
yihua May 14, 2024
e5054aa
[HUDI-7597] Add logs of Kafka offsets when the checkpoint is out of b…
yihua Apr 10, 2024
fa9cc9f
[MINOR] Fix BUG: HoodieLogFormatWriter: unable to close output stream…
silly-carbon Apr 10, 2024
f01c133
[HUDI-7600] Shutdown ExecutorService when HiveMetastoreBasedLockProvi…
Zouxxyy Apr 11, 2024
cb05c77
[HUDI-7391] HoodieMetadataMetrics should use Metrics instance for met…
lokeshj1703 May 14, 2024
741bd78
[HUDI-6441] Passing custom Headers with Hudi Callback URL (#10970)
wombatu-kun Apr 11, 2024
ebd8a7d
[HUDI-7605] Allow merger strategy to be set in spark sql writer (#10999)
jonvex Apr 12, 2024
5b37e84
[HUDI-7290] Don't assume ReplaceCommits are always Clustering (#10479)
jonvex Apr 12, 2024
04ec9f6
[HUDI-7601] Add heartbeat mechanism to refresh lock (#10994)
YannByron Apr 12, 2024
a92613a
[HUDI-7378] Fix Spark SQL DML with custom key generator (#10615)
yihua May 15, 2024
09dae35
[HUDI-7616] Avoid multiple cleaner plans and deprecate hoodie.clean.a…
yihua Apr 14, 2024
73a84d7
[HUDI-7606] Unpersist RDDs after table services, mainly compaction an…
rmahindra123 Apr 14, 2024
1117db6
[HUDI-7615] Mark a few write configs with the correct sinceVersion (#…
pt657407064 Apr 15, 2024
ab0e2cd
[HUDI-7584] Always read log block lazily and remove readBlockLazily a…
wombatu-kun Apr 15, 2024
ecb33e3
[HUDI-7619] Removed code duplicates in HoodieTableMetadataUtil (#11022)
wombatu-kun May 14, 2024
cd68706
[HUDI-6762] Removed usages of MetadataRecordsGenerationParams (#10962)
wombatu-kun May 14, 2024
7fe6acf
[MINOR] Remove redundant lines in StreamSync and TestStreamSyncUnitTe…
yihua Apr 16, 2024
87659d4
[MINOR] Rename location to path in `makeQualified` (#11037)
yihua Apr 17, 2024
34a1584
[HUDI-7578] Avoid unnecessary rewriting to improve performance (#11028)
danny0405 Apr 17, 2024
82bdc9c
[HUDI-7625] Avoid unnecessary rewrite for metadata table (#11038)
danny0405 Apr 17, 2024
e3ac75c
[HUDI-7626] Propagate UserGroupInformation from the main thread to th…
beyond1920 Apr 17, 2024
29b4a04
[HUDI-4228] Clean up literal usage in Hudi CLI argument check (#11042)
wombatu-kun Apr 18, 2024
a0a2c97
[HUDI-7633] Use try with resources for AutoCloseable (#11045)
yihua Apr 18, 2024
290f505
[MINOR] Remove redundant TestStringUtils in hudi-common (#11046)
yihua Apr 18, 2024
c9c1f75
[HUDI-7636] Make StoragePath Serializable (#11049)
yihua Apr 18, 2024
517f7d0
[HUDI-7635] Add default block size and openSeekable APIs to HoodieSto…
yihua May 15, 2024
8fff940
[HUDI-7637] Make StoragePathInfo Comparable (#11050)
yihua Apr 18, 2024
bce7199
[HUDI-6497] Replace FileSystem, Path, and FileStatus usage in hudi-co…
yihua May 15, 2024
349e083
[HUDI-7640] Uses UUID as temporary file suffix for HoodieStorage.crea…
danny0405 Apr 19, 2024
82c3209
[HUDI-7618] Add ability to ignore checkpoints in delta streamer (#11018)
sampan-s-nayak Apr 19, 2024
2dd563f
[HUDI-7643] Fix test by using the right StreamSync constructor (#11056)
codope Apr 19, 2024
071b26d
[HUDI-7515] Fix partition metadata write failure (#10886)
wecharyu Apr 20, 2024
36cf9bd
[MINOR] Added configurations of Hudi table, file-based SQL source, Hu…
geserdugarov Apr 20, 2024
66208b0
[HUDI-7628] Rename FSUtils.getPartitionPath to constructAbsolutePath …
wombatu-kun May 15, 2024
4f3952e
[HUDI-7631] Clean up usage of CachingPath outside hudi-common module …
wombatu-kun Apr 21, 2024
e7e77e5
[HUDI-7623] Refactoring of RemoteHoodieTableFileSystemView and Reques…
wombatu-kun May 15, 2024
aebf1ee
[HUDI-7655] Minor fix to rli validation with MDT validator (#11060)
nsivabalan Apr 21, 2024
44f8897
[MINOR] Reuse MetadataPartitionType enum to get all partition paths (…
codope Apr 22, 2024
02142e8
[HUDI-7608] Fix Flink table creation configuration not taking effect …
empcl May 15, 2024
cea3e43
[MINOR] Fix incorrect catch of ClassCastException using HoodieSparkKe…
Alowator Apr 23, 2024
514251d
[MINOR] Fixe naming of methods in HoodieMetadataConfig (#11076)
wombatu-kun Apr 24, 2024
d61673e
[HUDI-7647] READ_UTC_TIMEZONE doesn't affect log files for MOR tables…
Alowator Apr 24, 2024
5a79c26
[HUDI-6386] Enable testArchivalWithMultiWriters back as they are pass…
nsivabalan Apr 24, 2024
79df183
[MINOR] Fix LoggerName for JDBCExecutor (#11063)
chengabc930919 Apr 24, 2024
01e5240
[HUDI-7651] Add util methods for creating meta client (#11081)
yihua May 15, 2024
663ba26
[HUDI-7632] Remove FileSystem usage in HoodieLogFormatWriter (#11082)
wombatu-kun May 15, 2024
f66310a
[HUDI-7650] Remove FileSystem argument in TestHelpers methods (#11072)
yihua Apr 24, 2024
371fc73
[MINOR] Remove unused util methods in LogReaderUtils (#11086)
yihua Apr 24, 2024
d4ef0b6
[HUDI-7660] Fix excessive object creation in RowDataKeyGen (#11084)
wombatu-kun Apr 25, 2024
d42f399
[HUDI-7235] Fix checkpoint bug for S3/GCS Incremental Source (#10336)
vinishjail97 Apr 25, 2024
5007231
[HUDI-7645] Optimize BQ sync tool for MDT (#11065)
wombatu-kun Apr 25, 2024
b71e279
[HUDI-7666] Fix serializable implementation of StorageConfiguration c…
yihua Apr 25, 2024
03e21d0
[MINOR] Make KafkaSource abstraction public and more flexible (#11093)
the-other-tim-brown Apr 25, 2024
1d38ae5
[HUDI-7658] Add time to meta sync failure log (#11080)
jonvex Apr 25, 2024
45426de
[HUDI-7511] Fixing offset range calculation for kafka (#10875)
nsivabalan Apr 26, 2024
6ffdc5f
[HUDI-7672] Fix the Hive server scratch dir for tests in hudi-utiliti…
danny0405 Apr 26, 2024
348b6bb
[HUDI-7575] Avoid repeated fetching of pending replace instants (#10976)
the-other-tim-brown Apr 26, 2024
305bd7e
[HUDI-7676] Fix serialization in Spark DAG in HoodieBackedTableMetada…
yihua Apr 26, 2024
2960094
[HUDI-7664] Remove Hadoop dependency from hudi-io module (#11089)
yihua Apr 26, 2024
2b73ab4
[MINOR] Streamer test setup performance (#10806)
the-other-tim-brown Apr 26, 2024
e8368f2
[HUDI-7670] Return StorageConfiguration from getConf() in HoodieStora…
yihua May 15, 2024
1ba41a2
[HUDI-7668] Add and rename APIs in StorageConfiguration (#11102)
yihua May 15, 2024
ee974ec
[HUDI-7675] Don't set default value for primary key when get schema f…
hehuiyuan Apr 27, 2024
3754c8a
[HUDI-7674] Fix Hudi CLI Command "metadata validate-files" to use fil…
bvaradar May 15, 2024
13ae15c
[HUDI-7681] Remove Hadoop Path usage in a few classes in hudi-common …
yihua Apr 27, 2024
dd7e597
[HUDI-7683] Make HoodieMetadataMetrics log level debug ro reduce nois…
codope Apr 29, 2024
f7937d3
[HUDI-7682] Remove the files copy in Azure CI tests report (#11110)
danny0405 Apr 29, 2024
2bfe068
[MINOR] Remove the redundant log in HFileBootstrapIndex (#11115)
danny0405 Apr 29, 2024
e828a6d
[HUDI-7667] Created util method to get offset range for fetching new …
pkgajulapalli Apr 29, 2024
6e3b22e
[HUDI-7684] Sort the records for Flink metadata table bulk_insert (#1…
danny0405 Apr 30, 2024
4ddd99b
[HUDI-7588] Replace hadoop Configuration with StorageConfiguration in…
yihua May 15, 2024
fa9e489
[HUDI-7694] Unify bijection-avro dependency version (#11132)
yihua May 1, 2024
e99a2ee
[HUDI-7702] Remove unused method in ReflectUtil (#11135)
yihua May 2, 2024
47c57f8
[HUDI-6296] Add Scala 2.13 support for Spark 3.5 integration (#11130)
yihua May 15, 2024
581b881
[HUDI-7688] Stop retry inflate if encounter InterruptedIOException (#…
beyond1920 May 3, 2024
23bb9a0
[MINOR] remove unnecessary lines from java test (#11139)
jonvex May 3, 2024
b331120
[HUDI-7686] Add tests on the util methods for type cast of configurat…
yihua May 3, 2024
a05bfdc
[HUDI-7576] Improve efficiency of getRelativePartitionPath, reduce co…
the-other-tim-brown May 15, 2024
c31eab1
[HUDI-7710] Remove compaction.inflight from conflict resolution (#11148)
linliu-code May 4, 2024
da0eb16
[HUDI-7703] Clean plan to exclude partitions with no deleting file (#…
xushiyan May 6, 2024
3571370
[HUDI-7641] Adding metadata enablement metrics and index type metrics…
nsivabalan May 6, 2024
c38e952
Fixing deltastreamer tests for auto record key gen (#11099)
nsivabalan May 6, 2024
9e9e218
[HUDI-7710] Use compaction.requested during conflict resolution (#11151)
linliu-code May 15, 2024
53d1c1f
[HUDI-7721] Fix broken build on master (#11164)
jonvex May 7, 2024
fc91460
[HUDI-7720] Fix HoodieTableFileSystemView NPE in fetchAllStoredFileGr…
xuzifu666 May 7, 2024
0eda139
[MINOR] Do not force setting spark conf in UtilHelpers (#11166)
Zouxxyy May 7, 2024
fb4ac8d
[MINOR] Remove duplicate settings (#11167)
askwang May 7, 2024
faf953a
[MINOR] Use parent as the glob path when full file path specified (#1…
the-other-tim-brown May 8, 2024
63e8cd9
[HUDI-7727] Avoid constructAbsolutePathInHadoopPath in hudi-common mo…
yihua May 8, 2024
1b2f05f
[HUDI-7728] Use StorageConfiguration in LockProvider constructors (#1…
yihua May 8, 2024
e03b528
[HUDI-7699] Support STS external ids and configurable session names i…
istreeter May 8, 2024
b98bf58
[HUDI-7734] Remove unused FSPermissionDTO (#11176)
yihua May 9, 2024
7b923ec
[HUDI-7735] Remove usage of SerializableConfiguration (#11177)
yihua May 9, 2024
13fd77c
[MINOR] Cosmetic changes for names and log msgs (#11179)
danny0405 May 9, 2024
99ea8b6
[HUDI-7737] Bump Spark 3.4 version to Spark 3.4.3 (#11180)
geserdugarov May 9, 2024
8fb7f85
[HUDI-7587] Make hudi-hadoop-common module dependent on hudi-common m…
jonvex May 15, 2024
a5656a1
[HUDI-7350] Make Hudi reader and writer factory APIs Hadoop-independe…
jonvex May 15, 2024
7f11739
[HUDI-7725] Restructure HFileBootstrapIndex to separate Hadoop-depend…
jonvex May 10, 2024
d49bd43
[HUDI-7729] Move ParquetUtils to hudi-hadoop-common module (#11186)
yihua May 10, 2024
caec900
[HUDI-7738] Set FileStreamReader Charset as UTF-8 (#11181)
xuzifu666 May 10, 2024
68e3514
[HUDI-7654] Optimizing BQ sync for MDT (#11061)
nsivabalan May 10, 2024
f44c1c0
[HUDI-7726] Restructure TableSchemaResolver to separate Hadoop logic …
jonvex May 10, 2024
733728c
[HUDI-7742] Move Hadoop-dependent reader util classes to hudi-hadoop-…
yihua May 10, 2024
e530f38
[HUDI-7673] Fixing false positive validation failure for RLI with MDT…
nsivabalan May 15, 2024
4f243ef
[HUDI-7731] Fix usage of new Configuration() in production code (#11191)
jonvex May 11, 2024
c28e009
[HUDI-7739] Shudown asyncDetectorExecutor in AsyncTimelineServerBased…
Zouxxyy May 11, 2024
d6cc2c0
[HUDI-7508] Avoid collecting records in HoodieStreamerUtils.createHoo…
vinishjail97 May 11, 2024
c21e420
[HUDI-7745] Move Hadoop-dependent util methods to hudi-hadoop-common …
yihua May 15, 2024
2b2bba9
[HUDI-4732] Add support for confluent schema registry with proto (#11…
the-other-tim-brown May 12, 2024
8beaf31
[HUDI-7501] Use source profile for S3 and GCS sources (#10861)
vinishjail97 May 13, 2024
7907b99
[HUDI-7523] Add HOODIE_SPARK_DATASOURCE_OPTIONS to be used in HoodieI…
vinishjail97 May 15, 2024
04c275d
[HUDI-7743] Improve StoragePath usages (#11189)
jonvex May 15, 2024
aeb49aa
[HUDI-7744] Introduce IOFactory and a config to set the factory (#11192)
jonvex May 15, 2024
5150d1b
[HUDI-7750] Move HoodieLogFormatWriter class to hoodie-hadoop-common …
yihua May 15, 2024
8f2dba3
remove a few classes from hudi-common (#11209)
jonvex May 14, 2024
6e129de
[HUDI-7589] Add API to create HoodieStorage in HoodieIOFactory (#11208)
jonvex May 15, 2024
580bb1c
[HUDI-7549] Reverting spurious log block deduction with LogRecordRead…
nsivabalan May 14, 2024
25da2b0
[HUDI-7617] Fix issues for bulk insert user defined partitioner in St…
vinishjail97 May 14, 2024
90b0b5b
[HUDI-7535] Add metrics for sourceParallelism and Refresh profile in …
vinishjail97 May 14, 2024
0e5d6f9
[HUDI-7749] Bump Spark version 3.3.1 to 3.3.4 (#11198)
codope May 14, 2024
3c00124
[HUDI-7712] Fixing RLI initialization to account for file slices inst…
nsivabalan May 15, 2024
56d9fbe
[HUDI-7624] Fixing index tagging duration (#11035)
nsivabalan May 15, 2024
c047600
[HUDI-7752] Abstract serializeRecords for log writing (#11210)
yihua May 15, 2024
f746712
[HUDI-7429] Fixing average record size estimation for delta commits (…
nsivabalan May 15, 2024
4db72fd
[HUDI-7759] Remove Hadoop dependencies in hudi-common module (#11220)
yihua May 15, 2024
cc64cd8
[HUDI-7532] Include only compaction instants for lastCompaction in ge…
nsivabalan May 15, 2024
5f65aac
[HUDI-7768] Fixing failing tests of async compaction metadata for 0.1…
nsivabalan May 15, 2024
98e9cb1
[HUDI-7765] Turn off native HFile reader for 0.15.0 release (#11233)
yihua May 15, 2024
c4ca028
[HUDI-7767] Revert Spark 3.3 and 3.4 upgrades (#11235)
yihua May 15, 2024
2b81e6b
[HUDI-7771] Making OverwriteWithLatestPayload as default payload in 0…
nsivabalan May 16, 2024
2815aef
[HUDI-6386] Branch 0.x failing tests test multi writer archival (#11239)
nsivabalan May 16, 2024
a04390b
[HUDI-7770] Parse partition path from hudi directory for bootstrap ta…
May 15, 2024
c028842
[HUDI-7769] Fix Hudi CDC read on Spark 3.3.4 and 3.4.3 (#11242)
yihua May 16, 2024
1b97582
[HUDI-7766] Adding staging jar deployment command for Spark 3.5 and S…
yihua May 16, 2024
9c90a7b
Create release branch for version 0.15.0
yihua May 16, 2024
16ba7f2
Remove local change from 0.14.0
yihua May 16, 2024
f5a7c0f
[MINOR] [BRANCH-0.x] Added condition to check default value to fix ex…
ad1happy2go May 17, 2024
ed2dc91
[HUDI-7784] Fix serde of HoodieHadoopConfiguration in Spark (#11270)
yihua May 22, 2024
0de2e80
[HUDI-7786] Fix roaring bitmap dependency in hudi-integ-test-bundle (…
nsivabalan May 24, 2024
b71da93
[HUDI-7785] Keep public APIs in utilities module the same as before H…
yihua May 24, 2024
670f131
[HUDI-7775] Remove unused APIs in HoodieStorage (#11281)
yihua May 24, 2024
b350a26
[HUDI-7788] Fixing exception handling in AverageRecordSizeUtils (#11290)
yihua May 24, 2024
77b1ded
[HUDI-7776] Simplify HoodieStorage instance fetching (#11292)
yihua May 25, 2024
2e763c9
[HUDI-7792] Bump h2 from 1.4.200 to 2.2.220 (#11296)
yihua May 25, 2024
72dd518
[HUDI-7790] Revert changes in DFSPathSelector and UtilHelpers.readCon…
yihua May 25, 2024
b607899
[HUDI-7794] Bump org.apache.hive:hive-service from 2.3.1 to 2.3.4 (#1…
yihua May 25, 2024
f5b8088
[HUDI-7777] Allow HoodieTableMetaClient to take HoodieStorage instanc…
yihua May 26, 2024
b9ffa97
[HUDI-7796] Gracefully cast file system instance in Avro writers (#11…
yihua May 26, 2024
86552da
[HUDI-7778] Fixing global index for duplicate updates (#11305)
yihua May 26, 2024
b4d52c0
[HUDI-7798] Mark configs included in 0.15.0 release (#11307)
yihua May 26, 2024
b8796d0
[HUDI-7797] Use HoodieIOFactory to return pluggable FileFormatUtils i…
yihua May 26, 2024
bd4256b
[MINOR] Fix bundle validation script on branch-0.x (#11331)
yihua May 27, 2024
bbebda4
[HUDI-7707] Enable bundle validation on Java 8 and 11 (#11313)
yihua May 27, 2024
27e45ac
[HUDI-7802] Fix bundle validation scripts (#11332)
yihua May 27, 2024
7100227
Bumping release candidate number 2
yihua May 27, 2024
f80c416
[MINOR] Change release candidate validation target
yihua May 27, 2024
b9ae51e
[MINOR] Disable release candidate validation by default (#11339)
yihua May 27, 2024
27df817
[MINOR] Fix Flink version in release candidate validation (#11341)
yihua May 27, 2024
fd6c611
DENG-2598: adding support for select * except
kkalanda-score May 28, 2024
e88e079
Merge pull request #9 from scoremedia/DENG-2598/adding-support-for-ex…
kkalanda-score May 29, 2024
88d057f
[HUDI-7809] Use Spark SerializableConfiguration to avoid NPE in Kryo …
yihua May 29, 2024
fe08b6f
[HUDI-7807] Fixing spark-sql for pk less tables (#11354)
nsivabalan May 29, 2024
9e79996
[HUDI-7812] Disabling row writer for clustering (#11360)
nsivabalan May 29, 2024
c009895
[HUDI-7655] Ensuring clean action executor cleans up all intended fil…
nsivabalan May 30, 2024
d90c690
[MINOR] Remove thrift gen in staging deploy script
yihua May 30, 2024
d0df1d4
Bumping release candidate number 3
yihua May 30, 2024
3883285
[MINOR] Update release version to reflect published version 0.15.0
yihua Jun 4, 2024
aa46f77
sync to release 0.15.0
remeajayi2022 Nov 5, 2024
025976a
Timestamp changes to partition path
remeajayi2022 Nov 6, 2024
2 changes: 1 addition & 1 deletion .github/PULL_REQUEST_TEMPLATE.md
Original file line number Diff line number Diff line change
@@ -12,7 +12,7 @@ _If medium or high, explain what verification was done to mitigate the risks._

### Documentation Update

-_Describe any necessary documentation update if there is any new feature, config, or user-facing change_
+_Describe any necessary documentation update if there is any new feature, config, or user-facing change. If not, put "none"._

- _The config description must be updated if new configs are added or the default value of the configs are changed_
- _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
85 changes: 85 additions & 0 deletions .github/workflows/azure_ci.js
@@ -0,0 +1,85 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

async function checkAzureCiAndCreateCommitStatus({ github, context, prNumber, latestCommitHash }) {
  console.log(`- Checking Azure CI status of PR: ${prNumber} ${latestCommitHash}`);
  const botUsername = 'hudi-bot';

  const comments = await github.paginate(github.rest.issues.listComments, {
    owner: context.repo.owner,
    repo: context.repo.repo,
    issue_number: prNumber,
    sort: 'updated',
    direction: 'desc',
    per_page: 100
  });

  // Find the latest comment from hudi-bot containing the Azure CI report
  const botComments = comments.filter(comment => comment.user.login === botUsername);

  let status = 'pending';
  let message = 'In progress';
  let azureRunLink = '';

  if (botComments.length > 0) {
    const lastComment = botComments[0];
    const reportPrefix = `${latestCommitHash} Azure: `
    const successReportString = `${reportPrefix}[SUCCESS]`
    const failureReportString = `${reportPrefix}[FAILURE]`

    if (lastComment.body.includes(reportPrefix)) {
      if (lastComment.body.includes(successReportString)) {
        message = 'Successful on the latest commit';
        status = 'success';
      } else if (lastComment.body.includes(failureReportString)) {
        message = 'Failed on the latest commit';
        status = 'failure';
      }
    }

    const linkRegex = /\[[a-zA-Z]+\]\((https?:\/\/[^\s]+)\)/;
    const parts = lastComment.body.split(reportPrefix);
    const secondPart = parts.length > 1 ? parts[1] : '';
    const match = secondPart.match(linkRegex);

    if (match) {
      azureRunLink = match[1];
    }
  }

  console.log(`Status: ${status}`);
  console.log(`Azure Run Link: ${azureRunLink}`);
  console.log(`${message}`);

  console.log(`- Create commit status of PR based on Azure CI status: ${prNumber} ${latestCommitHash}`);
  // Create or update the commit status for Azure CI
  await github.rest.repos.createCommitStatus({
    owner: context.repo.owner,
    repo: context.repo.repo,
    sha: latestCommitHash,
    state: status,
    target_url: azureRunLink,
    description: message,
    context: 'Azure CI'
  });

  return { status, message, azureRunLink };
}

module.exports = checkAzureCiAndCreateCommitStatus;
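The comment-parsing step in `azure_ci.js` can be tried in isolation, without the GitHub API. The following is a minimal sketch, not part of the PR: `parseAzureReport` is a hypothetical helper that mirrors the prefix matching and link extraction above, and the sample comment body is an assumed shape inferred only from the strings the script searches for (`<hash> Azure: [SUCCESS](<url>)`).

```javascript
// Hypothetical helper (not in the PR): mirrors the comment-parsing logic of
// checkAzureCiAndCreateCommitStatus as a pure function.
function parseAzureReport(commentBody, latestCommitHash) {
  const reportPrefix = `${latestCommitHash} Azure: `;
  let status = 'pending';
  let message = 'In progress';
  let azureRunLink = '';

  // A report counts only if it is prefixed with the latest commit hash.
  if (commentBody.includes(`${reportPrefix}[SUCCESS]`)) {
    status = 'success';
    message = 'Successful on the latest commit';
  } else if (commentBody.includes(`${reportPrefix}[FAILURE]`)) {
    status = 'failure';
    message = 'Failed on the latest commit';
  }

  // Same pattern as the script: first markdown-style link after the prefix.
  const linkRegex = /\[[a-zA-Z]+\]\((https?:\/\/[^\s]+)\)/;
  const parts = commentBody.split(reportPrefix);
  const match = (parts.length > 1 ? parts[1] : '').match(linkRegex);
  if (match) {
    azureRunLink = match[1];
  }

  return { status, message, azureRunLink };
}

// Assumed comment shape; the URL is a placeholder, not a real build link.
const sample = '* abc1234 Azure: [SUCCESS](https://dev.azure.com/example/_build/results?buildId=1) ';
console.log(parseAzureReport(sample, 'abc1234'));
```

A comment that does not mention the latest commit hash leaves the status at `pending` with an empty link, which matches the defaults the script passes to `createCommitStatus`.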
80 changes: 80 additions & 0 deletions .github/workflows/azure_ci_check.yml
@@ -0,0 +1,80 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

name: Azure CI

on:
  issue_comment:
    types: [ created, edited, deleted ]

permissions:
  statuses: write
  pull-requests: read
  issues: read

jobs:
  check-azure-ci-and-add-commit-status:
    if: |
      github.event.issue.pull_request != null &&
      github.event.comment.user.login == 'hudi-bot'
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Check PR state
        id: check_pr_state
        uses: actions/github-script@v7
        with:
          github-token: ${{secrets.GITHUB_TOKEN}}
          script: |
            const issueNumber = context.issue.number;
            const { data: pullRequest } = await github.rest.pulls.get({
              owner: context.repo.owner,
              repo: context.repo.repo,
              pull_number: issueNumber
            });

            // Only check open PRs and a PR that is not a HOTFIX
            const shouldSkip = (pullRequest.body.includes('HOTFIX: SKIP AZURE CI')
              || pullRequest.state != 'open');

            if (!shouldSkip) {
              const commitHash = pullRequest.head.sha;
              console.log(`Latest commit hash: ${commitHash}`);
              // Set the output variable to be used in subsequent step
              core.setOutput("latest_commit_hash", commitHash);
            }
            console.log(`Should skip Azure CI? ${shouldSkip}`);
            return shouldSkip;

      - name: Check Azure CI report and create commit status to PR
        if: steps.check_pr_state.outputs.result != 'true'
        uses: actions/github-script@v7
        with:
          github-token: ${{secrets.GITHUB_TOKEN}}
          script: |
            const latestCommitHash = '${{ steps.check_pr_state.outputs.latest_commit_hash }}'
            const issueNumber = context.issue.number;
            const checkAzureCiAndCreateCommitStatus = require(`${process.env.GITHUB_WORKSPACE}/.github/workflows/azure_ci.js`);

            await checkAzureCiAndCreateCommitStatus({
              github,
              context,
              prNumber: issueNumber,
              latestCommitHash: latestCommitHash
            });
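The skip decision made in the "Check PR state" step can be sketched as a standalone function. This is an illustrative sketch, not code from the PR: `shouldSkipAzureCi` is a hypothetical name, and defaulting a missing body to an empty string is an added assumption (the workflow step calls `includes()` on `pullRequest.body` directly, and the REST API may return `null` for an empty description).

```javascript
// Hypothetical helper (not in the PR): the skip rule from the workflow step,
// applied to a plain object with the same `body` and `state` fields.
function shouldSkipAzureCi(pullRequest) {
  // Assumption: treat a null or missing PR description as an empty string.
  const body = pullRequest.body || '';
  // Skip when the description carries the hotfix marker or the PR is not open.
  return body.includes('HOTFIX: SKIP AZURE CI') || pullRequest.state !== 'open';
}

console.log(shouldSkipAzureCi({ body: 'Fixes a flaky test', state: 'open' }));     // false
console.log(shouldSkipAzureCi({ body: 'HOTFIX: SKIP AZURE CI', state: 'open' }));  // true
console.log(shouldSkipAzureCi({ body: null, state: 'closed' }));                   // true
```

In the workflow, the value returned by the script is what the second step compares against the string `'true'` through `steps.check_pr_state.outputs.result`.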