[#2780] Improvement(server,core): Move tree lock from rest api to the corresponding implementation to minimize tree lock range. #2873

yuqi1129 · 2024-04-10T13:10:55Z

What changes were proposed in this pull request?

Modify the rest API and move the tree lock to the core module.

Why are the changes needed?

The rest API should not be locked entirely by a tree lock.

Fix: #2780

Does this PR introduce any user-facing change?

N/A.

How was this patch tested?

N/A.

yuqi1129 · 2024-04-11T02:53:30Z

server/src/main/java/com/datastrato/gravitino/server/web/rest/PartitionOperations.java

-                });
+            Table loadTable = dispatcher.loadTable(tableIdent);
+
+            Partition p =


Since the tree-lock logic has moved into the dispatcher, we can't completely remove the tree lock from the APIs for partition-related logic.

This is not very elegant, but I haven't found a better approach so far.

The change here will possibly lead to inconsistency, right?

Yeah, I'm working on it.

In that case, I would rethink whether this change is necessary.

OK, I have some doubts about whether this PR is necessary.

You can defer this and work on other things first.

jerryshao · 2024-04-16T15:04:23Z

@yuqi1129 I'm going to close this for now. We can reopen it when we feel the change is necessary and have a better solution.

yuqi1129 · 2024-04-16T15:24:34Z

Got it

jerryshao · 2024-12-10T07:57:36Z

I think you can apply finer-grained control over the lock scope, rather than blindly wrapping the whole logic in one lock. For example, for "load fileset", you only need to lock the doWithCatalog operation; there is no need to lock the whole logic. For "create fileset", you can lock the catalog for property validation, and then lock the fileset for fileset creation. Can you please carefully check the code and optimize it as much as you can?
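
The "create fileset" suggestion above can be sketched roughly as follows. This is a minimal illustration, not Gravitino's implementation: `NarrowLockScopeSketch`, `withLock`, the dotted names, and the validation rule are all hypothetical, and each name gets a plain `ReentrantReadWriteLock` rather than a real tree lock. It only shows the shape: one short lock for validation, a second short lock for creation, and everything else outside any lock.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

/** Hypothetical sketch of narrow lock scopes; not Gravitino's TreeLockUtils. */
final class NarrowLockScopeSketch {
  private final Map<String, ReentrantReadWriteLock> locks = new ConcurrentHashMap<>();

  /** Records where each lock scope starts and ends, for illustration only. */
  final List<String> trace = Collections.synchronizedList(new ArrayList<>());

  /** Runs body while holding the read or write lock registered under name. */
  <T> T withLock(String name, boolean write, Supplier<T> body) {
    ReentrantReadWriteLock rw = locks.computeIfAbsent(name, k -> new ReentrantReadWriteLock());
    Lock lock = write ? rw.writeLock() : rw.readLock();
    lock.lock();
    trace.add("lock " + name);
    try {
      return body.get();
    } finally {
      trace.add("unlock " + name);
      lock.unlock();
    }
  }

  String createFileset(String catalog, String fileset, Map<String, String> props) {
    // Scope 1: hold the catalog READ lock only while validating properties.
    withLock(catalog, false, () -> {
      if (props.containsKey("")) { // made-up validation rule
        throw new IllegalArgumentException("empty property key");
      }
      return null;
    });

    // Scope 2: hold the fileset WRITE lock only while creating the fileset.
    String id = catalog + "." + fileset;
    String created = withLock(id, true, () -> {
      trace.add("create");
      return id;
    });

    // Everything else (building the response, logging) runs outside any lock.
    return "created:" + created;
  }

  public static void main(String[] args) {
    NarrowLockScopeSketch sketch = new NarrowLockScopeSketch();
    System.out.println(sketch.createFileset("cat", "fs", Map.of("k", "v")));
    System.out.println(sketch.trace);
  }
}
```

The trade-off is that the catalog can change between the two scopes, so this only fits when the second step does not depend on the validated state staying frozen.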

yuqi1129 · 2025-02-10T03:11:38Z

I think you can apply finer-grained control over the lock scope, rather than blindly wrapping the whole logic in one lock. For example, for "load fileset", you only need to lock the doWithCatalog operation; there is no need to lock the whole logic. For "create fileset", you can lock the catalog for property validation, and then lock the fileset for fileset creation. Can you please carefully check the code and optimize it as much as you can?

@jerryshao

I have encountered several pieces of code like the following:

Is this behavior acceptable? The consequences are complex, and the goal of the tree lock is to guarantee that the behavior inside the method is deterministic. In the code above, I do not see any effect from the tree lock; it seems to behave the same as having no lock at all.

Could you share your view on this point? If this is acceptable, I will follow it and change many places accordingly.

jerryshao · 2025-02-10T06:19:28Z

I see your point. The original code may have some problems, or it can be optimized. I think it would be better to optimize it (guaranteeing consistency) as much as we can. If that cannot be achieved, then the change should be no worse than the original code.

jerryshao · 2025-02-12T07:08:57Z

core/src/main/java/org/apache/gravitino/authorization/AccessControlManager.java

  }

  @Override
  public User getUser(String metalake, String user)
      throws NoSuchUserException, NoSuchMetalakeException {
-    return userGroupManager.getUser(metalake, user);
+    return TreeLockUtils.doWithTreeLock(
+        AuthorizationUtils.ofGroup(metalake, user),


jerryshao · 2025-02-12T07:18:15Z

core/src/main/java/org/apache/gravitino/catalog/CatalogManager.java

+            }
+            throw new RuntimeException(e);
+          }
+        });


Do we need to add a lock for this part? I feel it is not really necessary. What do you think, @mchades?

Agree. There should be no race condition for a mocked catalog. And the test result only represents the status at that time.

jerryshao · 2025-02-12T07:20:21Z

core/src/main/java/org/apache/gravitino/catalog/CatalogManager.java

+          try {
+            catalogWrapper.doWithPropertiesMeta(
+                f -> {
+                  Pair<Map<String, String>, Map<String, String>> alterProperty =
+                      getCatalogAlterProperty(changes);
+                  validatePropertyForAlter(
+                      f.catalogPropertiesMetadata(),
+                      alterProperty.getLeft(),
+                      alterProperty.getRight());
+                  return null;
+                });
+          } catch (IllegalArgumentException e1) {
+            throw e1;
+          } catch (Exception e) {
+            LOG.error("Failed to alter catalog {}", ident, e);
+            throw new RuntimeException(e);
+          }


Do we need to add this part into the lock?

Yes, it does not need to be inside the lock. However, this logic sits between loadCatalogAndWrap and the catalog update, so we can only move it out of the lock range if we split those two steps and lock them separately.

I think we can split into two locks, it should be fine.

jerryshao · 2025-02-12T07:25:25Z

@mchades can you please help review the CatalogManager part? It became a bit complex after you introduced the in-use mechanism.

@jerqi can you please review the access control part?

jerryshao · 2025-02-27T12:11:05Z

core/src/main/java/org/apache/gravitino/authorization/AccessControlManager.java

+        LockType.WRITE,
+        () ->
+            TreeLockUtils.doWithTreeLock(
+                NameIdentifier.of(AuthorizationUtils.ofRoleNamespace(metalake).levels()),


I think it should be the role, not the role namespace. WDYT?

This should be fine, as it will not change the name of the role.

I guess we use the role namespace rather than the role here because this method accesses multiple roles (a List). For ease of use, we lock the roles' parent directly.

@jerqi Please help confirm this.

I prefer locking the role namespace, because we should stay aligned with the other code.

jerryshao · 2025-02-27T12:24:19Z

core/src/main/java/org/apache/gravitino/catalog/CatalogManager.java

+          try {
+            catalogWrapper.doWithPropertiesMeta(
+                f -> {
+                  Pair<Map<String, String>, Map<String, String>> alterProperty =
+                      getCatalogAlterProperty(changes);
+                  validatePropertyForAlter(
+                      f.catalogPropertiesMetadata(),
+                      alterProperty.getLeft(),
+                      alterProperty.getRight());
+                  return null;
+                });
+          } catch (IllegalArgumentException e1) {
+            throw e1;
+          } catch (Exception e) {
+            LOG.error("Failed to alter catalog {}", ident, e);
+            throw new RuntimeException(e);
+          }


I think we can split into two locks, it should be fine.

jerryshao · 2025-02-27T12:36:33Z

core/src/main/java/org/apache/gravitino/catalog/SchemaOperationDispatcher.java

-          .withHiddenProperties(
-              getHiddenPropertyNames(
+    return TreeLockUtils.doWithTreeLock(
+        NameIdentifier.of(ident.namespace().levels()),


I'm curious why we need to add the write lock at the catalog level. I think it should be fine to add the write lock at the schema level, am I right?

The following scenario will cause problems if we use a write lock at the schema level, assuming the schema name is schema1:

ThreadA ThreadB Hold schema1 write lock Rename schema1 to schema2 Store schema2 to storage hold schema2 write lock Block due to some reason rename schema2 to schema1 return store schema2 to storage x return

This problem only happens for rename, and I think we don't support schema rename. Besides, we can treat rename and other changes separately: if the changes contain a rename, we use the parent-level write lock; otherwise we use the object's own write lock. WDYT?

Let me verify this and check whether it can be applied to catalogs, tables, and other objects.

I changed the code to determine whether we need to lock the parent or the object itself.
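
The rule agreed on above (parent write lock when the changes contain a rename, the object's own write lock otherwise) can be sketched like this. It is a hedged illustration only: `LockTargetSketch`, `ChangeKind`, and `lockTarget` are hypothetical names, identifiers are plain dotted strings rather than `NameIdentifier` objects, and the merged code may make this decision differently.

```java
import java.util.List;

/** Hypothetical sketch of choosing which node an alter operation should write-lock. */
final class LockTargetSketch {
  /** Made-up change kinds standing in for the real change types. */
  enum ChangeKind { RENAME, SET_PROPERTY, REMOVE_PROPERTY, UPDATE_COMMENT }

  /** Returns the dotted identifier whose WRITE lock an alter with these changes should take. */
  static String lockTarget(String ident, List<ChangeKind> changes) {
    if (!changes.contains(ChangeKind.RENAME)) {
      // No rename: the name stays stable, so locking the object itself is enough.
      return ident;
    }
    // Rename: old and new names map to different lock nodes, so lock their shared parent.
    int lastDot = ident.lastIndexOf('.');
    // Assumption: an identifier with no parent level falls back to locking itself.
    return lastDot < 0 ? ident : ident.substring(0, lastDot);
  }

  public static void main(String[] args) {
    System.out.println(lockTarget("metalake.catalog.schema1", List.of(ChangeKind.RENAME)));
    System.out.println(lockTarget("metalake.catalog.schema1", List.of(ChangeKind.SET_PROPERTY)));
  }
}
```

Locking the parent only for renames keeps the common property-update path from serializing every sibling under the same namespace.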

jerryshao · 2025-02-27T12:41:17Z

core/src/main/java/org/apache/gravitino/catalog/TableOperationDispatcher.java

@@ -194,70 +198,75 @@ public Table createTable(
  public Table alterTable(NameIdentifier ident, TableChange... changes)
      throws NoSuchTableException, IllegalArgumentException {
    validateAlterProperties(ident, HasPropertyMetadata::tablePropertiesMetadata, changes);
+    return TreeLockUtils.doWithTreeLock(
+        NameIdentifier.of(ident.namespace().levels()),
+        LockType.WRITE,


Same question here: why does the alter operation need the parent write lock?

yuqi1129 · 2025-02-28T07:52:34Z

I think we can split into two locks, it should be fine.

Changed.

jerryshao · 2025-02-28T08:59:15Z

core/src/main/java/org/apache/gravitino/authorization/PermissionManager.java

+        TreeLockUtils.doWithTreeLock(
+            AuthorizationUtils.ofRole(metalake, role),
+            LockType.READ,
+            () -> roleEntitiesToGrant.add(roleManager.getRole(metalake, role)));


I guess that, with the overhead of requesting multiple read locks, the performance may not be as good as requesting one parent read lock. I would be inclined to use the old way. What do you think?

Locking the parent node with a read lock is NOT logically correct. For example, if we take a read lock on the parent, other threads can still modify the role while this method is loading it. So if we are going to use the namespace lock, the lock type should be changed to WRITE. Is that acceptable to you?

A write lock is not acceptable.
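
The argument above can be checked with a small model. This sketch assumes a tree lock that takes READ locks on every ancestor and the requested lock type only on the last level of the path; `ParentReadLockSketch` and its methods are hypothetical and built on `ReentrantReadWriteLock`, not on Gravitino's lock manager. Under that assumption, a READ lock on the role namespace does not stop another thread from write-locking a role beneath it, whereas a READ lock on the role itself does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical model of a tree lock: READ on ancestors, requested type on the last level. */
final class ParentReadLockSketch {
  private final Map<String, ReentrantReadWriteLock> nodes = new ConcurrentHashMap<>();

  /** Tries to lock a dotted path without blocking; returns the held locks, or null on conflict. */
  List<Lock> tryLockPath(String path, boolean write) {
    List<Lock> held = new ArrayList<>();
    String[] levels = path.split("\\.");
    StringBuilder prefix = new StringBuilder();
    for (int i = 0; i < levels.length; i++) {
      if (i > 0) {
        prefix.append('.');
      }
      prefix.append(levels[i]);
      boolean last = i == levels.length - 1;
      ReentrantReadWriteLock rw =
          nodes.computeIfAbsent(prefix.toString(), k -> new ReentrantReadWriteLock());
      Lock lock = (last && write) ? rw.writeLock() : rw.readLock();
      if (!lock.tryLock()) {
        unlockAll(held); // release what we took so far and report the conflict
        return null;
      }
      held.add(lock);
    }
    return held;
  }

  /** Releases locks in reverse acquisition order; must run on the thread that took them. */
  static void unlockAll(List<Lock> held) {
    for (int i = held.size() - 1; i >= 0; i--) {
      held.get(i).unlock();
    }
  }

  /** Attempts the same path lock on a different thread and reports whether it succeeded. */
  boolean otherThreadCanLock(String path, boolean write) {
    return CompletableFuture.supplyAsync(() -> {
      List<Lock> held = tryLockPath(path, write);
      if (held == null) {
        return false;
      }
      unlockAll(held);
      return true;
    }).join();
  }

  public static void main(String[] args) {
    ParentReadLockSketch tree = new ParentReadLockSketch();
    List<Lock> namespaceRead = tree.tryLockPath("metalake.roles", false);
    System.out.println(
        "writer can lock role1 under a namespace READ lock: "
            + tree.otherThreadCanLock("metalake.roles.role1", true));
    unlockAll(namespaceRead);
  }
}
```

In this model only a WRITE lock on the namespace, or a READ lock on each role, excludes a concurrent role writer, which is the choice the two comments above disagree on.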

yuqi1129 self-assigned this Apr 10, 2024

yuqi1129 closed this Apr 10, 2024

yuqi1129 reopened this Apr 10, 2024

yuqi1129 closed this Apr 11, 2024

yuqi1129 reopened this Apr 11, 2024

yuqi1129 closed this Apr 11, 2024

yuqi1129 reopened this Apr 11, 2024

yuqi1129 commented Apr 11, 2024

jerryshao closed this Apr 16, 2024

FANNG1 mentioned this pull request Apr 17, 2024

[Improvement] Rethinking the implement of partition operations #2999

Closed

yuqi1129 reopened this Nov 25, 2024

Update code.

9f8d5ea

yuqi1129 force-pushed the issue_2780 branch from e7f5224 to 9f8d5ea Compare December 6, 2024 09:26

yuqi1129 and others added 3 commits December 6, 2024 19:05

fix

5b6a993

Fix test error.

e2a94d3

Merge branch 'main' into issue_2780

4d1af74

yuqi1129 added 4 commits February 6, 2025 21:00

Merge branch 'main' of github.com:apache/gravitino into issue_2780

ce53770

Merge branch 'issue_2780' of github.com:yuqi1129/gravitino into issue…

8f0534b

…_2780

Narrow span of tree lock

45eaf79

Merge branch 'main' of github.com:apache/gravitino into issue_2780

551d806

correct some typos.

3fc50bc

jerryshao reviewed Feb 12, 2025

yuqi1129 added 2 commits February 14, 2025 15:29

fix comments

7b8e21b

fix lock error.

6b16780

revert some unnecessary code.

6c61f6d

jerryshao reviewed Feb 27, 2025

Resolve comments.

ab1997e

jerryshao reviewed Feb 28, 2025

fix a mistake

8664692

jerryshao approved these changes Feb 28, 2025

jerryshao merged commit 9f5f3ee into apache:main Feb 28, 2025
28 checks passed
