doc(iceberg): iceberg doc updates #12787

Merged 2 commits on Mar 6, 2025
10 changes: 7 additions & 3 deletions docs/iceberg-catalog.md
@@ -46,6 +46,8 @@ Before starting, ensure you have:
DH_ICEBERG_DATA_ROOT="s3://your-bucket/path"

```
`DH_ICEBERG_CLIENT_ID` is the `AWS_ACCESS_KEY_ID`, and `DH_ICEBERG_CLIENT_SECRET` is the `AWS_SECRET_ACCESS_KEY`.

4. If using PyIceberg, configure it to use your local DataHub in one of the ways it supports. For example, create `~/.pyiceberg.yaml` with:
```commandline
catalog:
@@ -124,8 +126,10 @@ You can create Iceberg tables using PyIceberg with a defined schema. Here's an e
<Tabs>
<TabItem value="spark" label="spark-sql" default>

Connect to the DataHub Iceberg Catalog using Spark SQL by defining `$GMS_HOST`, `$GMS_PORT`, `$WAREHOUSE` to connect to and `$USER_PAT` - the DataHub Personal Access Token used to connect to the catalog:
When datahub is running locally, set `GMS_HOST` to `localhost` and `GMS_PORT` to `8080`.
Connect to the DataHub Iceberg Catalog using Spark SQL by defining `$GMS_HOST`, `$GMS_PORT`, the `$WAREHOUSE` to connect to, and `$USER_PAT` (the DataHub Personal Access Token used to connect to the catalog).
When using DataHub Cloud (Acryl), the Iceberg Catalog URL is `https://<your-instance>.acryl.io/gms/iceberg/`.
If you're running DataHub locally, set `GMS_HOST` to `localhost` and `GMS_PORT` to `8080`.

For this example, set `WAREHOUSE` to `arctic_warehouse`
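
As a rough illustration of how these settings fit together, the following Python sketch reads the same variables from the environment and assembles REST catalog properties. The helper `iceberg_catalog_uri` is hypothetical, and the `/iceberg/` path for a local instance is an assumption modeled on the Cloud URL above, so verify it against your deployment.

```python
import os


def iceberg_catalog_uri(host: str, port: str, scheme: str = "http") -> str:
    """Build a catalog URI from the GMS host and port.

    Hypothetical helper: the trailing `/iceberg/` path is an assumption
    for local instances, not something confirmed by this page.
    """
    return f"{scheme}://{host}:{port}/iceberg/"


# Defaults mirror the local values named in the text above.
gms_host = os.environ.get("GMS_HOST", "localhost")
gms_port = os.environ.get("GMS_PORT", "8080")
warehouse = os.environ.get("WAREHOUSE", "arctic_warehouse")
user_pat = os.environ.get("USER_PAT", "")  # DataHub Personal Access Token

# Property names follow PyIceberg's REST catalog configuration keys.
catalog_properties = {
    "uri": iceberg_catalog_uri(gms_host, gms_port),
    "warehouse": warehouse,
    "token": user_pat,
}

# Print only non-secret values.
print(catalog_properties["uri"], catalog_properties["warehouse"])
```

If PyIceberg is installed, a dictionary like this could be passed to `pyiceberg.catalog.load_catalog(name, **catalog_properties)`. The Spark SQL invocation below remains the documented path for this tab.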

```cli
@@ -518,4 +522,4 @@ A: Check that:

- [Apache Iceberg Documentation](https://iceberg.apache.org/)
- [PyIceberg Documentation](https://py.iceberg.apache.org/)
- [DataHub Documentation](https://datahubproject.io/docs/)