[#3001] docs(spark-connector): add spark connector document #3018

Merged Apr 22, 2024 (3 commits)

@FANNG1 FANNG1 commented Apr 18, 2024

What changes were proposed in this pull request?

add spark connector document

Why are the changes needed?

Fix: #3001

Does this PR introduce any user-facing change?

no

How was this patch tested?

document

3. Execute the spark sql query.

Suppose there are two catalogs in the metalake `test`, `hive` and `iceberg`, and the table `hive_table1` in the catalog `hive`, and the table `iceberg_table1` in the catalog `iceberg`.
```sql
Contributor

Add a blank line between paragraphs.

Contributor Author

done

use hive;
select * from db.hive_table1;
```
:::info
Contributor

Also here.

Contributor Author

done

select * from db.hive_table1;
```
:::info
The command SHOW CATALOGS will only display the Spark default catalog, named spark_catalog, due to limitations within the Spark catalog manager. It does not list the catalogs present in the metalake. However, after explicitly using the USING CATALOG command with a specific catalog name, that catalog name then becomes visible in the output of SHOW CATALOGS.
Contributor

Please wrap `SHOW CATALOGS` in backquotes.

Contributor Author

done
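
For readers following this thread, the behavior described in the note can be sketched in Spark SQL roughly as follows. This is an illustrative sketch, not part of the PR: it assumes the `hive` catalog from the example above is registered in the metalake and follows the `use hive;` statement shown in the document under review. Result sets are omitted because they depend on the deployment.

```sql
-- Before any metalake catalog is used, the note says only the
-- Spark default catalog (spark_catalog) is listed.
SHOW CATALOGS;

-- Switch to a catalog managed through the metalake
-- (catalog name taken from the example in the document).
USE hive;

-- Per the note, the catalog that was just used should now
-- also appear in the output.
SHOW CATALOGS;
```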

This software is licensed under the Apache License version 2."
---

With the Gravitino Spark connector, accessing data or managing metadata in Hive catalogs becomes straightforward, enabling seamless federation queries across different Hive catalogs.
Contributor

Do you have some Hive or Iceberg specific configurations that should be listed here?

Contributor Author

No, the configurations are retrieved from the Gravitino server.

[ COMMENT table_comment ]
[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ]
[ AS select_statement ]
```
Contributor

Do you need to add more SQL examples covering DDL, DML, and DQL to show users how to use it?

Contributor Author

ok
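
A minimal sketch of the kind of DDL, DML, and DQL examples requested in this thread might look like the following. These are not the examples that were added in the PR. The database `demo_db` and table `employees` are hypothetical names chosen for illustration, and the sketch assumes a Hive catalog named `hive` is available through the connector, as in the document under review.

```sql
-- Select the Hive catalog (name assumed from the document's example).
USE hive;

-- DDL: create a database and a table (hypothetical names).
CREATE DATABASE IF NOT EXISTS demo_db;
CREATE TABLE IF NOT EXISTS demo_db.employees (
  id INT,
  name STRING
) COMMENT 'illustrative table';

-- DML: insert a few rows.
INSERT INTO demo_db.employees VALUES (1, 'Alice'), (2, 'Bob');

-- DQL: read the rows back with a filter.
SELECT id, name FROM demo_db.employees WHERE id > 1;
```

The statements use only standard Spark SQL syntax. Anything catalog-specific, such as table properties or storage formats, is left out because the thread does not specify it.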

```sql
select * from hive.db.hive_table1 union all select * from iceberg.db.iceberg_table1;
use hive;
select * from db.hive_table1;
Contributor

Uppercase all the SQL reserved words.

Contributor Author

done

@jerryshao jerryshao merged commit 4cf2ed2 into apache:main Apr 22, 2024
9 checks passed
diqiu50 pushed a commit to diqiu50/gravitino that referenced this pull request Jun 13, 2024
…ache#3018)

Successfully merging this pull request may close these issues.

[Subtask] add spark connector document