Merge branch 'master' into peter/column-level-lineage
PeteMango authored Mar 6, 2025
2 parents 1405b27 + fcabe88 commit 073b652
Showing 45 changed files with 3,822 additions and 525 deletions.
@@ -40,7 +40,8 @@ public class DomainType
Constants.OWNERSHIP_ASPECT_NAME,
Constants.INSTITUTIONAL_MEMORY_ASPECT_NAME,
Constants.STRUCTURED_PROPERTIES_ASPECT_NAME,
Constants.FORMS_ASPECT_NAME);
Constants.FORMS_ASPECT_NAME,
Constants.DISPLAY_PROPERTIES_ASPECT_NAME);
private final EntityClient _entityClient;

public DomainType(final EntityClient entityClient) {
1 change: 1 addition & 0 deletions datahub-web-react/src/app/domain/DomainsList.tsx
@@ -196,6 +196,7 @@ export const DomainsList = () => {
},
ownership: null,
entities: null,
displayProperties: null,
},
pageSize,
);
1 change: 1 addition & 0 deletions datahub-web-react/src/app/domain/utils.ts
@@ -72,6 +72,7 @@ export const updateListDomainsCache = (
children: null,
dataProducts: null,
parentDomains: null,
displayProperties: null,
},
1000,
parentDomain,
@@ -125,8 +125,8 @@ export default function MLGroupModels() {
},
},
{
title: 'Tags',
key: 'tags',
title: 'Properties',
key: 'properties',
width: 200,
render: (_: any, record: any) => {
const tags = record.properties?.tags || [];
@@ -63,13 +63,14 @@ export default function FormByEntity({ formUrn }: Props) {
<ContentWrapper>
<ProgressBar formUrn={formUrn} />
<FlexWrapper>
<ProfileSidebar
sidebarSections={loading ? [] : sections}
topSection={{ component: () => <EntityInfo formUrn={formUrn} /> }}
backgroundColor="white"
alignLeft
/>
<Form formUrn={formUrn} />
{selectedEntityData && (
<ProfileSidebar
sidebarSections={loading ? [] : sections}
topSection={{ component: () => <EntityInfo formUrn={formUrn} /> }}
backgroundColor="white"
/>
)}
</FlexWrapper>
</ContentWrapper>
</EntityContext.Provider>
@@ -128,8 +128,8 @@ export default function MLGroupModels() {
},
},
{
title: 'Tags',
key: 'tags',
title: 'Properties',
key: 'properties',
width: 200,
render: (_: any, record: any) => {
const tags = record.properties?.tags || [];
@@ -1,5 +1,4 @@
import { Input, Modal } from 'antd';
import { debounce } from 'lodash';
import React from 'react';
import styled from 'styled-components';

@@ -53,13 +52,6 @@ const IconColorPicker: React.FC<IconColorPickerProps> = ({
const [stagedColor, setStagedColor] = React.useState<string>(color || '#000000');
const [stagedIcon, setStagedIcon] = React.useState<string>(icon || 'account_circle');

// a debounced version of updateDisplayProperties that takes in the same arguments
// eslint-disable-next-line react-hooks/exhaustive-deps
const debouncedUpdateDisplayProperties = React.useCallback(
debounce((...args) => updateDisplayProperties(...args).then(() => setTimeout(() => refetch(), 1000)), 500),
[],
);

return (
<Modal
open={open}
@@ -77,7 +69,7 @@ const IconColorPicker: React.FC<IconColorPickerProps> = ({
},
},
},
});
}).then(() => refetch());
onChangeColor?.(stagedColor);
onChangeIcon?.(stagedIcon);
onClose();
@@ -93,44 +85,10 @@ const IconColorPicker: React.FC<IconColorPickerProps> = ({
marginBottom: 30,
marginTop: 15,
}}
onChange={(e) => {
setStagedColor(e.target.value);
debouncedUpdateDisplayProperties?.({
variables: {
urn,
input: {
colorHex: e.target.value,
icon: {
iconLibrary: IconLibrary.Material,
name: stagedIcon,
style: 'Outlined',
},
},
},
});
}}
onChange={(e) => setStagedColor(e.target.value)}
/>
<Title>Choose an icon for {name || 'Domain'}</Title>
<ChatIconPicker
color={stagedColor}
onIconPick={(i) => {
console.log('picking icon', i);
debouncedUpdateDisplayProperties?.({
variables: {
urn,
input: {
colorHex: stagedColor,
icon: {
iconLibrary: IconLibrary.Material,
name: capitalize(snakeToCamel(i)),
style: 'Outlined',
},
},
},
});
setStagedIcon(i);
}}
/>
<ChatIconPicker color={stagedColor} onIconPick={(i) => setStagedIcon(i)} />
</Modal>
);
};
6 changes: 6 additions & 0 deletions datahub-web-react/src/graphql/domain.graphql
@@ -40,6 +40,9 @@ query getDomain($urn: String!) {
forms {
...formsFields
}
displayProperties {
...displayPropertiesFields
}
...domainEntitiesFields
...notes
}
@@ -64,6 +67,9 @@ query listDomains($input: ListDomainsInput!) {
ownership {
...ownershipFields
}
displayProperties {
...displayPropertiesFields
}
...domainEntitiesFields
}
}
9 changes: 9 additions & 0 deletions datahub-web-react/src/graphql/fragments.graphql
@@ -229,6 +229,9 @@ fragment parentNodesFields on ParentNodesResult {
properties {
name
}
displayProperties {
...displayPropertiesFields
}
}
}

@@ -238,6 +241,9 @@ fragment parentDomainsFields on ParentDomainsResult {
urn
type
... on Domain {
displayProperties {
...displayPropertiesFields
}
properties {
name
description
@@ -1259,6 +1265,9 @@ fragment entityDomain on DomainAssociation {
...parentDomainsFields
}
...domainEntitiesFields
displayProperties {
...displayPropertiesFields
}
}
associatedUrn
}
3 changes: 3 additions & 0 deletions datahub-web-react/src/graphql/glossaryNode.graphql
@@ -54,6 +54,9 @@ query getGlossaryNode($urn: String!) {
}
}
}
displayProperties {
...displayPropertiesFields
}
...notes
}
}
3 changes: 3 additions & 0 deletions datahub-web-react/src/graphql/preview.graphql
@@ -341,6 +341,9 @@ fragment entityPreview on Entity {
parentDomains {
...parentDomainsFields
}
displayProperties {
...displayPropertiesFields
}
...domainEntitiesFields
}
... on Container {
3 changes: 3 additions & 0 deletions datahub-web-react/src/graphql/search.graphql
@@ -845,6 +845,9 @@ fragment searchResultsWithoutSchemaField on Entity {
parentDomains {
...parentDomainsFields
}
displayProperties {
...displayPropertiesFields
}
...domainEntitiesFields
structuredProperties {
properties {
43 changes: 38 additions & 5 deletions docs/advanced/writing-mcps.md
@@ -39,15 +39,48 @@ For example, if you want to understand the structure of entities in your DataHub

## Saving MCPs to a file

### Exporting rom DataHub Instance
### Exporting from Ingestion Source

You can export MCPs directly from your DataHub instance using a recipe file. This is useful when you want to:
You can export MCPs from an ingestion source (such as BigQuery or Snowflake) to a file using the `file` sink type in your recipe. This approach is useful when you want to:

- Examine existing entities in your DataHub instance
- Save MCPs for later ingestion
- Examine existing entities in the source
- Debug ingestion issues

To get started, create a recipe file (e.g., `export_mcps.yaml`) specifying your source and the `file` sink type:

```yaml
source:
  type: bigquery # Replace with your source type
  config:
    ... # Add your source configuration here
sink:
  type: "file"
  config:
    filename: "mcps.json"
```

Run the ingestion with the following command:

```bash
datahub ingest -c export_mcps.yaml
```

This command will extract all entities from your source and write them to `mcps.json` in MCP format.

For more details about the `file` sink type, please refer to [Metadata File](../../metadata-ingestion/sink_docs/metadata-file.md).
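
You can sanity-check the exported file before using it anywhere else. The snippet below is a minimal sketch, assuming the `file` sink's default JSON layout (a JSON array of MCP records); field names such as `entityType` and `aspectName` are based on that assumption and may differ slightly across DataHub versions:

```python
import json
from collections import Counter

# Load the metadata file written by the `file` sink (a JSON array of records).
with open("mcps.json") as f:
    records = json.load(f)

# Tally (entityType, aspectName) pairs. These field names are assumptions
# based on the default MCP serialization; adjust them if your file differs.
counts = Counter(
    (r.get("entityType", "unknown"), r.get("aspectName", "unknown")) for r in records
)
for (entity_type, aspect_name), count in counts.most_common(10):
    print(f"{entity_type:<20} {aspect_name:<40} {count}")
```

A quick summary like this makes it easy to confirm the source emitted the aspects you expect before you re-ingest or share the file.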

### Exporting from DataHub Instance

You can also export MCPs directly from an existing DataHub instance using a similar recipe approach. This method is particularly useful when you need to:

- Examine entities already in your DataHub instance
- Create test cases based on real data
- Debug entity relationships

First, create a recipe file (e.g., `export_mcps.yaml`):
The process is the same as exporting from an ingestion source; the only difference is that you use `datahub` as the source type.
Create a recipe file (e.g., `export_mcps.yaml`) with this configuration:
```yaml
source:
@@ -69,7 +102,7 @@ Run the ingestion:
datahub ingest -c export_mcps.yaml
```
This will write all the entities from your DataHub instance to `mcps.json` in MCP format.
This will extract all entities from your DataHub instance and save them to `mcps.json` in MCP format.
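
If you only need to inspect a handful of entities rather than a full export, a small SDK script can be a lighter-weight complement. The sketch below is illustrative only: it assumes the Python SDK's `DataHubGraph` client, a locally running GMS at `http://localhost:8080`, and a hypothetical domain URN, none of which come from this guide's recipes:

```python
from datahub.ingestion.graph.client import DataHubGraph, DatahubClientConfig
from datahub.metadata.schema_classes import DomainPropertiesClass

# Connect to a locally running GMS; swap in your own server URL and access token.
graph = DataHubGraph(DatahubClientConfig(server="http://localhost:8080", token=None))

# Hypothetical URN used purely for illustration.
domain_urn = "urn:li:domain:marketing"

# Fetch a single aspect instead of exporting the whole instance.
props = graph.get_aspect(entity_urn=domain_urn, aspect_type=DomainPropertiesClass)
if props is not None:
    print(props.name, props.description)
else:
    print(f"No domainProperties aspect found for {domain_urn}")
```

The file export remains the better option when you want a complete, replayable snapshot; the SDK route is simply convenient while debugging individual entities or relationships.
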
### Creating MCPs with Python SDK
2 changes: 2 additions & 0 deletions docs/how/updating-datahub.md
@@ -20,6 +20,8 @@ This file documents any backwards-incompatible changes in DataHub and assists pe

### Breaking Changes

- #12673: Business Glossary ID generation has been modified to handle special characters and URL cleaning. When `enable_auto_id` is false (default), IDs are now generated by cleaning the name (converting spaces to hyphens, removing special characters except periods which are used as path separators) while preserving case. This may result in different IDs being generated for terms with special characters.

- #12580: The OpenAPI source handled nesting incorrectly. 12580 fixes it to create proper nested field paths, however, this will re-write the incorrect schemas of existing OpenAPI runs.

- #12408: The `platform` field in the DataPlatformInstance GraphQL type is removed. Clients need to retrieve the platform via the optional `dataPlatformInstance` field.
10 changes: 7 additions & 3 deletions docs/iceberg-catalog.md
@@ -46,6 +46,8 @@ Before starting, ensure you have:
DH_ICEBERG_DATA_ROOT="s3://your-bucket/path"

```
The `DH_ICEBERG_CLIENT_ID` corresponds to your `AWS_ACCESS_KEY_ID`, and the `DH_ICEBERG_CLIENT_SECRET` corresponds to your `AWS_SECRET_ACCESS_KEY`.

4. If using pyiceberg, configure pyiceberg to use your local datahub using one of its supported ways. For example, create `~/.pyiceberg.yaml` with
```commandline
catalog:
@@ -124,8 +126,10 @@ You can create Iceberg tables using PyIceberg with a defined schema. Here's an e
<Tabs>
<TabItem value="spark" label="spark-sql" default>

Connect to the DataHub Iceberg Catalog using Spark SQL by defining `$GMS_HOST`, `$GMS_PORT`, `$WAREHOUSE` to connect to and `$USER_PAT` - the DataHub Personal Access Token used to connect to the catalog:
When datahub is running locally, set `GMS_HOST` to `localhost` and `GMS_PORT` to `8080`.
Connect to the DataHub Iceberg Catalog using Spark SQL by defining `$GMS_HOST`, `$GMS_PORT`, and `$WAREHOUSE`, along with `$USER_PAT`, the DataHub Personal Access Token used to authenticate against the catalog.
When using DataHub Cloud (Acryl), the Iceberg Catalog URL is `https://<your-instance>.acryl.io/gms/iceberg/`.
If you're running DataHub locally, set `GMS_HOST` to `localhost` and `GMS_PORT` to `8080`.

For this example, set `WAREHOUSE` to `arctic_warehouse`.

```cli
@@ -518,4 +522,4 @@ A: Check that:

- [Apache Iceberg Documentation](https://iceberg.apache.org/)
- [PyIceberg Documentation](https://py.iceberg.apache.org/)
- [DataHub Documentation](https://datahubproject.io/docs/)
- [DataHub Documentation](https://datahubproject.io/docs/)