
PWX-36475 : Creating node PDB for parallel upgrades #1508

Merged: 10 commits, May 27, 2024
Conversation

svijaykumar-px (Contributor)

What this PR does / why we need it: Creates a node PDB for every node in the cluster that is part of quorum. This will be used in parallel upgrades. The PDB for each node is created with the px-storage- name prefix and matches the node name label of the corresponding Portworx pod.

Which issue(s) this PR fixes (optional)
Closes PWX-36475

Special notes for your reviewer: This per-node PDB will only be created when the Portworx version is greater than or equal to 3.1.2.
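
For context, here is a minimal sketch (not the operator's actual code) of what such a per-node PDB object could look like, assuming the policy/v1 API. The name format is illustrative (the description mentions a px-storage- prefix, while the diff discussed later uses "px-"), and the pod label keys in the selector are assumptions for illustration only:

```go
package main

import (
	"fmt"

	policyv1 "k8s.io/api/policy/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// nodePDB builds a PodDisruptionBudget that protects the Portworx pod on a
// single node: with minAvailable=1 that pod cannot be evicted, which is what
// lets the operator gate node drains during parallel upgrades.
func nodePDB(nodeName, namespace string) *policyv1.PodDisruptionBudget {
	minAvailable := intstr.FromInt(1)
	return &policyv1.PodDisruptionBudget{
		ObjectMeta: metav1.ObjectMeta{
			// Name format is illustrative; the PR description mentions a
			// px-storage- prefix followed by the node name.
			Name:      "px-storage-" + nodeName,
			Namespace: namespace,
		},
		Spec: policyv1.PodDisruptionBudgetSpec{
			MinAvailable: &minAvailable,
			// The selector must match only the Portworx pod scheduled on this
			// node; the label keys below are illustrative assumptions.
			Selector: &metav1.LabelSelector{
				MatchLabels: map[string]string{
					"name":         "portworx",
					"px/node-name": nodeName,
				},
			},
		},
	}
}

func main() {
	fmt.Println(nodePDB("node-1", "kube-system").Name) // px-storage-node-1
}
```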

@svijaykumar-px svijaykumar-px marked this pull request as draft April 2, 2024 16:43
@svijaykumar-px svijaykumar-px marked this pull request as ready for review April 4, 2024 05:16
Comment on lines 1934 to 1938
shouldRun, shouldContinueRunning, err := k8sutil.CheckPredicatesForStoragePod(&node, cluster, nil)
if err != nil {
return nil, err
}
if shouldRun || shouldContinueRunning {
Contributor:
Why do we need to check if that k8s node can run a portworx pod or not? Don't we need to only check if the portworx node is part of the current cluster or not? Basically this can be a simple map -

k8sNodes := make(map[string]bool)
for _, node := range k8sNodeList.Items {
  k8sNodes[node.Name] = true
}

Contributor (Author):

It was mentioned that we would create a PDB only for k8s nodes that can run PX, right? So during a surge upgrade, when a node is being removed from the cluster, it shouldn't run PX and we won't create a node PDB for it.

Contributor (Author):

For now I will make the suggested change, but let's discuss why we need to create a node PDB for k8s nodes that should not run a PX pod.

Contributor:

If a Portworx node has a scheduler node name that is part of the cluster, then we should protect that k8s node from upgrading, irrespective of whether the k8s node should currently run Portworx or not, right?

Let's discuss this more offline to make sure we are covering all cases correctly.

Contributor (Author):

I assumed that when a Kubernetes node goes into the "should not run Portworx" state, PX will be deleted from that node soon anyway, so why should we create a PDB for it only to delete it shortly after?

err = driver.PreInstall(cluster)
require.NoError(t, err)

// PDB is created for 2 storage nodes and kvdb
Contributor:

We are not setting the spec.kvdb.internal field in the storage cluster. Then why is the kvdb pdb getting created here?

Contributor (Author):

It is creating the kvdb PDB because the kvdb spec itself is nil, so it doesn't check whether internal is set or not.

@svijaykumar-px (Contributor, Author) commented Apr 5, 2024:

With if cluster.Spec.Kvdb != nil && !cluster.Spec.Kvdb.Internal, the first condition fails (Kvdb is nil), so the kvdb PDB is created. In the other tests, kvdb is set, or the total PDB count is not asserted.
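
To illustrate the behavior being discussed, here is a small sketch of that check, using simplified stand-in types rather than the real StorageCluster CRD types:

```go
package main

import "fmt"

// Simplified stand-ins for the real CRD types (hypothetical, for illustration).
type KvdbSpec struct{ Internal bool }
type StorageClusterSpec struct{ Kvdb *KvdbSpec }

// kvdbPDBNeeded mirrors the condition discussed above: only an explicitly
// external kvdb (Kvdb set and Internal == false) skips the kvdb PDB; a nil
// Kvdb spec falls through, which is why the test still sees a kvdb PDB.
func kvdbPDBNeeded(spec StorageClusterSpec) bool {
	if spec.Kvdb != nil && !spec.Kvdb.Internal {
		return false
	}
	return true
}

func main() {
	fmt.Println(kvdbPDBNeeded(StorageClusterSpec{}))                                // true: nil Kvdb defaults to internal
	fmt.Println(kvdbPDBNeeded(StorageClusterSpec{Kvdb: &KvdbSpec{}}))               // false: external kvdb, no kvdb PDB
	fmt.Println(kvdbPDBNeeded(StorageClusterSpec{Kvdb: &KvdbSpec{Internal: true}})) // true: internal kvdb
}
```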

github-actions bot commented Apr 9, 2024:

This PR is stale because it has been in review for 3 days with no activity.

codecov bot commented Apr 15, 2024:

Codecov Report

Attention: Patch coverage is 83.01887%, with 18 lines in your changes missing coverage. Please review.

Project coverage is 75.97%. Comparing base (e085c30) to head (8233eb5).

Files Patch % Lines
...rs/storage/portworx/component/disruption_budget.go 73.91% 10 Missing and 8 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1508      +/-   ##
==========================================
- Coverage   75.97%   75.97%   -0.01%     
==========================================
  Files          77       77              
  Lines       20483    20578      +95     
==========================================
+ Hits        15562    15634      +72     
- Misses       3837     3850      +13     
- Partials     1084     1094      +10     


@svijaykumar-px svijaykumar-px requested a review from a team May 2, 2024 06:48
@nikita-bhatia (Contributor) left a comment:

LGTM


This PR is stale because it has been in review for 3 days with no activity.

Conflicts:
	drivers/storage/portworx/component/disruption_budget.go
errors := []error{}
for _, node := range nodesNeedingPDB {
minAvailable := intstr.FromInt(1)
PDBName := "px-" + node
@piyush-nimbalkar (Contributor) commented Jul 16, 2024:
nit: internal variables should be lower camelCase, e.g. pdbName instead of PDBName.

@piyush-nimbalkar piyush-nimbalkar deleted the PWX-36475 branch July 16, 2024 23:21