add metrics. #186
d33e9a2 to 755fae3
/hold Wait for the merge of open-cluster-management-io/sdk-go#76
/assign @clyang82
Could you append the metrics results?
755fae3 to 466200f
Metrics after running e2e testing:
# HELP advisory_lock_count Number of advisory lock requests.
# TYPE advisory_lock_count counter
advisory_lock_count{status="OK",type="events"} 16
advisory_lock_count{status="OK",type="instances"} 22
advisory_lock_count{status="OK",type="resource_status"} 38
advisory_lock_count{status="OK",type="resources"} 10
# HELP advisory_lock_duration Advisory Lock durations in seconds.
# TYPE advisory_lock_duration histogram
advisory_lock_duration_bucket{status="OK",type="events",le="0.1"} 16
advisory_lock_duration_bucket{status="OK",type="events",le="0.2"} 16
advisory_lock_duration_bucket{status="OK",type="events",le="0.5"} 16
advisory_lock_duration_bucket{status="OK",type="events",le="1"} 16
advisory_lock_duration_bucket{status="OK",type="events",le="2"} 16
advisory_lock_duration_bucket{status="OK",type="events",le="10"} 16
advisory_lock_duration_bucket{status="OK",type="events",le="+Inf"} 16
advisory_lock_duration_sum{status="OK",type="events"} 0.07476534900000001
advisory_lock_duration_count{status="OK",type="events"} 16
advisory_lock_duration_bucket{status="OK",type="instances",le="0.1"} 22
advisory_lock_duration_bucket{status="OK",type="instances",le="0.2"} 22
advisory_lock_duration_bucket{status="OK",type="instances",le="0.5"} 22
advisory_lock_duration_bucket{status="OK",type="instances",le="1"} 22
advisory_lock_duration_bucket{status="OK",type="instances",le="2"} 22
advisory_lock_duration_bucket{status="OK",type="instances",le="10"} 22
advisory_lock_duration_bucket{status="OK",type="instances",le="+Inf"} 22
advisory_lock_duration_sum{status="OK",type="instances"} 0.032695262999999995
advisory_lock_duration_count{status="OK",type="instances"} 22
advisory_lock_duration_bucket{status="OK",type="resource_status",le="0.1"} 38
advisory_lock_duration_bucket{status="OK",type="resource_status",le="0.2"} 38
advisory_lock_duration_bucket{status="OK",type="resource_status",le="0.5"} 38
advisory_lock_duration_bucket{status="OK",type="resource_status",le="1"} 38
advisory_lock_duration_bucket{status="OK",type="resource_status",le="2"} 38
advisory_lock_duration_bucket{status="OK",type="resource_status",le="10"} 38
advisory_lock_duration_bucket{status="OK",type="resource_status",le="+Inf"} 38
advisory_lock_duration_sum{status="OK",type="resource_status"} 0.40729072899999996
advisory_lock_duration_count{status="OK",type="resource_status"} 38
advisory_lock_duration_bucket{status="OK",type="resources",le="0.1"} 10
advisory_lock_duration_bucket{status="OK",type="resources",le="0.2"} 10
advisory_lock_duration_bucket{status="OK",type="resources",le="0.5"} 10
advisory_lock_duration_bucket{status="OK",type="resources",le="1"} 10
advisory_lock_duration_bucket{status="OK",type="resources",le="2"} 10
advisory_lock_duration_bucket{status="OK",type="resources",le="10"} 10
advisory_lock_duration_bucket{status="OK",type="resources",le="+Inf"} 10
advisory_lock_duration_sum{status="OK",type="resources"} 0.08142236900000001
advisory_lock_duration_count{status="OK",type="resources"} 10
# HELP grpc_server_called_total Total number of RPCs called on the server.
# TYPE grpc_server_called_total counter
grpc_server_called_total{source="sourceclient-testfpnxv",type="Publish"} 6
grpc_server_called_total{source="sourceclient-testfpnxv",type="Subscribe"} 3
# HELP grpc_server_message_received_total Total number of messages received on the server from agent and client.
# TYPE grpc_server_message_received_total counter
grpc_server_message_received_total{source="sourceclient-testfpnxv",type="Publish"} 6
grpc_server_message_received_total{source="sourceclient-testfpnxv",type="Subscribe"} 3
# HELP grpc_server_message_sent_total Total number of messages sent by the server to agent and client.
# TYPE grpc_server_message_sent_total counter
grpc_server_message_sent_total{source="sourceclient-testfpnxv",type="Publish"} 6
grpc_server_message_sent_total{source="sourceclient-testfpnxv",type="Subscribe"} 30
# HELP grpc_server_processed_duration_seconds Histogram of the duration of RPCs processed on the server.
# TYPE grpc_server_processed_duration_seconds histogram
grpc_server_processed_duration_seconds_bucket{source="sourceclient-testfpnxv",type="Publish",le="0.005"} 0
grpc_server_processed_duration_seconds_bucket{source="sourceclient-testfpnxv",type="Publish",le="0.01"} 5
grpc_server_processed_duration_seconds_bucket{source="sourceclient-testfpnxv",type="Publish",le="0.025"} 5
grpc_server_processed_duration_seconds_bucket{source="sourceclient-testfpnxv",type="Publish",le="0.05"} 6
grpc_server_processed_duration_seconds_bucket{source="sourceclient-testfpnxv",type="Publish",le="0.1"} 6
grpc_server_processed_duration_seconds_bucket{source="sourceclient-testfpnxv",type="Publish",le="0.25"} 6
grpc_server_processed_duration_seconds_bucket{source="sourceclient-testfpnxv",type="Publish",le="0.5"} 6
grpc_server_processed_duration_seconds_bucket{source="sourceclient-testfpnxv",type="Publish",le="1"} 6
grpc_server_processed_duration_seconds_bucket{source="sourceclient-testfpnxv",type="Publish",le="2.5"} 6
grpc_server_processed_duration_seconds_bucket{source="sourceclient-testfpnxv",type="Publish",le="5"} 6
grpc_server_processed_duration_seconds_bucket{source="sourceclient-testfpnxv",type="Publish",le="10"} 6
grpc_server_processed_duration_seconds_bucket{source="sourceclient-testfpnxv",type="Publish",le="+Inf"} 6
grpc_server_processed_duration_seconds_sum{source="sourceclient-testfpnxv",type="Publish"} 0.075799143
grpc_server_processed_duration_seconds_count{source="sourceclient-testfpnxv",type="Publish"} 6
# HELP grpc_server_processed_total Total number of RPCs processed on the server, regardless of success or failure.
# TYPE grpc_server_processed_total counter
grpc_server_processed_total{code="OK",source="sourceclient-testfpnxv",type="Publish"} 6
grpc_server_processed_total{code="OK",source="sourceclient-testfpnxv",type="Subscribe"} 3
# HELP resource_processed_total Number of processed resources.
# TYPE resource_processed_total counter
resource_processed_total{action="update",id="01dbff20-f86d-41dd-b972-ea9bbfdef7f0"} 4
resource_processed_total{action="update",id="240b3897-d620-4749-8ad9-eb0a0e631ef6"} 4
resource_processed_total{action="update",id="3a3d6f1f-75f2-4169-be70-c9ef90bbf743"} 6
resource_processed_total{action="update",id="4628bd82-5a8d-457a-bcc3-ade46f54a690"} 2
resource_processed_total{action="update",id="629a9314-1670-410d-9dde-a3dada2667df"} 12
resource_processed_total{action="update",id="8b1cc8aa-5ad1-4aba-b6cd-5ca90b0a0444"} 3
resource_processed_total{action="update",id="d4d3b89c-e11e-4a51-a9c0-e3da68955c7f"} 6
# HELP rest_api_inbound_request_count Number of requests served.
# TYPE rest_api_inbound_request_count counter
rest_api_inbound_request_count{code="200",method="GET",path="/api/maestro/v1/resource-bundles/-"} 2
rest_api_inbound_request_count{code="200",method="GET",path="/api/maestro/v1/resources/-"} 8
rest_api_inbound_request_count{code="200",method="PATCH",path="/api/maestro/v1/resources/-"} 1
rest_api_inbound_request_count{code="201",method="POST",path="/api/maestro/v1/resources"} 4
rest_api_inbound_request_count{code="204",method="DELETE",path="/api/maestro/v1/resources/-"} 5
rest_api_inbound_request_count{code="404",method="GET",path="/api/maestro/v1/resource-bundles/-"} 1
rest_api_inbound_request_count{code="404",method="GET",path="/api/maestro/v1/resources/-"} 1
# HELP rest_api_inbound_request_duration Request duration in seconds.
# TYPE rest_api_inbound_request_duration histogram
rest_api_inbound_request_duration_bucket{code="200",method="GET",path="/api/maestro/v1/resource-bundles/-",le="0.1"} 2
rest_api_inbound_request_duration_bucket{code="200",method="GET",path="/api/maestro/v1/resource-bundles/-",le="1"} 2
rest_api_inbound_request_duration_bucket{code="200",method="GET",path="/api/maestro/v1/resource-bundles/-",le="10"} 2
rest_api_inbound_request_duration_bucket{code="200",method="GET",path="/api/maestro/v1/resource-bundles/-",le="30"} 2
rest_api_inbound_request_duration_bucket{code="200",method="GET",path="/api/maestro/v1/resource-bundles/-",le="+Inf"} 2
rest_api_inbound_request_duration_sum{code="200",method="GET",path="/api/maestro/v1/resource-bundles/-"} 0.004490571
rest_api_inbound_request_duration_count{code="200",method="GET",path="/api/maestro/v1/resource-bundles/-"} 2
rest_api_inbound_request_duration_bucket{code="200",method="GET",path="/api/maestro/v1/resources/-",le="0.1"} 8
rest_api_inbound_request_duration_bucket{code="200",method="GET",path="/api/maestro/v1/resources/-",le="1"} 8
rest_api_inbound_request_duration_bucket{code="200",method="GET",path="/api/maestro/v1/resources/-",le="10"} 8
rest_api_inbound_request_duration_bucket{code="200",method="GET",path="/api/maestro/v1/resources/-",le="30"} 8
rest_api_inbound_request_duration_bucket{code="200",method="GET",path="/api/maestro/v1/resources/-",le="+Inf"} 8
rest_api_inbound_request_duration_sum{code="200",method="GET",path="/api/maestro/v1/resources/-"} 0.021776432999999998
rest_api_inbound_request_duration_count{code="200",method="GET",path="/api/maestro/v1/resources/-"} 8
rest_api_inbound_request_duration_bucket{code="200",method="PATCH",path="/api/maestro/v1/resources/-",le="0.1"} 1
rest_api_inbound_request_duration_bucket{code="200",method="PATCH",path="/api/maestro/v1/resources/-",le="1"} 1
rest_api_inbound_request_duration_bucket{code="200",method="PATCH",path="/api/maestro/v1/resources/-",le="10"} 1
rest_api_inbound_request_duration_bucket{code="200",method="PATCH",path="/api/maestro/v1/resources/-",le="30"} 1
rest_api_inbound_request_duration_bucket{code="200",method="PATCH",path="/api/maestro/v1/resources/-",le="+Inf"} 1
rest_api_inbound_request_duration_sum{code="200",method="PATCH",path="/api/maestro/v1/resources/-"} 0.013018656
rest_api_inbound_request_duration_count{code="200",method="PATCH",path="/api/maestro/v1/resources/-"} 1
rest_api_inbound_request_duration_bucket{code="201",method="POST",path="/api/maestro/v1/resources",le="0.1"} 4
rest_api_inbound_request_duration_bucket{code="201",method="POST",path="/api/maestro/v1/resources",le="1"} 4
rest_api_inbound_request_duration_bucket{code="201",method="POST",path="/api/maestro/v1/resources",le="10"} 4
rest_api_inbound_request_duration_bucket{code="201",method="POST",path="/api/maestro/v1/resources",le="30"} 4
rest_api_inbound_request_duration_bucket{code="201",method="POST",path="/api/maestro/v1/resources",le="+Inf"} 4
rest_api_inbound_request_duration_sum{code="201",method="POST",path="/api/maestro/v1/resources"} 0.03167025
rest_api_inbound_request_duration_count{code="201",method="POST",path="/api/maestro/v1/resources"} 4
rest_api_inbound_request_duration_bucket{code="204",method="DELETE",path="/api/maestro/v1/resources/-",le="0.1"} 5
rest_api_inbound_request_duration_bucket{code="204",method="DELETE",path="/api/maestro/v1/resources/-",le="1"} 5
rest_api_inbound_request_duration_bucket{code="204",method="DELETE",path="/api/maestro/v1/resources/-",le="10"} 5
rest_api_inbound_request_duration_bucket{code="204",method="DELETE",path="/api/maestro/v1/resources/-",le="30"} 5
rest_api_inbound_request_duration_bucket{code="204",method="DELETE",path="/api/maestro/v1/resources/-",le="+Inf"} 5
rest_api_inbound_request_duration_sum{code="204",method="DELETE",path="/api/maestro/v1/resources/-"} 0.057168279
rest_api_inbound_request_duration_count{code="204",method="DELETE",path="/api/maestro/v1/resources/-"} 5
rest_api_inbound_request_duration_bucket{code="404",method="GET",path="/api/maestro/v1/resource-bundles/-",le="0.1"} 1
rest_api_inbound_request_duration_bucket{code="404",method="GET",path="/api/maestro/v1/resource-bundles/-",le="1"} 1
rest_api_inbound_request_duration_bucket{code="404",method="GET",path="/api/maestro/v1/resource-bundles/-",le="10"} 1
rest_api_inbound_request_duration_bucket{code="404",method="GET",path="/api/maestro/v1/resource-bundles/-",le="30"} 1
rest_api_inbound_request_duration_bucket{code="404",method="GET",path="/api/maestro/v1/resource-bundles/-",le="+Inf"} 1
rest_api_inbound_request_duration_sum{code="404",method="GET",path="/api/maestro/v1/resource-bundles/-"} 0.001403407
rest_api_inbound_request_duration_count{code="404",method="GET",path="/api/maestro/v1/resource-bundles/-"} 1
rest_api_inbound_request_duration_bucket{code="404",method="GET",path="/api/maestro/v1/resources/-",le="0.1"} 1
rest_api_inbound_request_duration_bucket{code="404",method="GET",path="/api/maestro/v1/resources/-",le="1"} 1
rest_api_inbound_request_duration_bucket{code="404",method="GET",path="/api/maestro/v1/resources/-",le="10"} 1
rest_api_inbound_request_duration_bucket{code="404",method="GET",path="/api/maestro/v1/resources/-",le="30"} 1
rest_api_inbound_request_duration_bucket{code="404",method="GET",path="/api/maestro/v1/resources/-",le="+Inf"} 1
rest_api_inbound_request_duration_sum{code="404",method="GET",path="/api/maestro/v1/resources/-"} 0.001669214
rest_api_inbound_request_duration_count{code="404",method="GET",path="/api/maestro/v1/resources/-"} 1
# HELP resources_spec_resync_duration_seconds The duration of the resource spec resync in seconds.
# TYPE resources_spec_resync_duration_seconds histogram
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifestbundles",le="0.1"} 1
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifestbundles",le="0.2"} 1
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifestbundles",le="0.5"} 1
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifestbundles",le="1"} 1
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifestbundles",le="2"} 1
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifestbundles",le="10"} 1
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifestbundles",le="30"} 1
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifestbundles",le="+Inf"} 1
resources_spec_resync_duration_seconds_sum{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifestbundles"} 0.000883473
resources_spec_resync_duration_seconds_count{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifestbundles"} 1
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifests",le="0.1"} 1
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifests",le="0.2"} 1
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifests",le="0.5"} 1
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifests",le="1"} 1
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifests",le="2"} 1
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifests",le="10"} 1
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifests",le="30"} 1
resources_spec_resync_duration_seconds_bucket{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifests",le="+Inf"} 1
resources_spec_resync_duration_seconds_sum{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifests"} 0.001288316
resources_spec_resync_duration_seconds_count{cluster="32326ea5-77aa-487a-b7c5-916e7862571e",source="maestro",type="io.open-cluster-management.works.v1alpha1.manifests"} 1
/ok-to-test
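As a reading aid for the dump above: Prometheus histogram buckets are cumulative (each `le` bucket counts every observation at or below that bound), and the mean observation is `_sum` divided by `_count`. The following is a minimal standard-library sketch of that arithmetic, using values copied from the `grpc_server_processed_duration_seconds` lines for `type="Publish"`. The names `bucket`, `mean`, and `perBucket` are hypothetical helpers for illustration and are not part of maestro or the Prometheus client.

```go
package main

import "fmt"

// bucket is one cumulative histogram bucket: the number of
// observations less than or equal to the upper bound le.
type bucket struct {
	le    float64 // upper bound in seconds
	count uint64  // cumulative observation count
}

// mean returns sum/count, or 0 when there are no observations.
func mean(sum float64, count uint64) float64 {
	if count == 0 {
		return 0
	}
	return sum / float64(count)
}

// perBucket converts cumulative counts into the number of
// observations that fell into each individual bucket.
func perBucket(cum []bucket) []uint64 {
	out := make([]uint64, len(cum))
	var prev uint64
	for i, b := range cum {
		out[i] = b.count - prev
		prev = b.count
	}
	return out
}

func main() {
	// First four Publish buckets from the output above.
	publish := []bucket{{0.005, 0}, {0.01, 5}, {0.025, 5}, {0.05, 6}}
	fmt.Println(perBucket(publish)) // [0 5 0 1]

	// Mean Publish duration: _sum / _count.
	fmt.Printf("%.4f\n", mean(0.075799143, 6)) // 0.0126
}
```

Read this way, five of the six Publish calls finished in 5 to 10 ms and one took between 25 and 50 ms, for a mean of about 12.6 ms.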
ec38997 to 59b13af
Signed-off-by: morvencao <lcao@redhat.com>
59b13af to e932349
LGTM
@@ -142,6 +148,15 @@ func (s *sqlResourceService) Update(ctx context.Context, resource *api.Resource)
		return nil, handleUpdateError("Resource", err)
	}

	// Create the set of labels that we will add to all the resource process:
	labels := prometheus.Labels{
		metricsIDLabel: updated.ID,
The `id` field is the resource ID, right?
If so, this label will lead to cardinality explosion. Consider removing it.
Good advice! We’re keeping this label for two reasons: it helps diagnose frequent updates to a single resource, and we don’t yet have enough resources to cause cardinality issues.
ref: https://issues.redhat.com/browse/ACM-12921
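
To make the cardinality concern in this thread concrete: each distinct combination of label values becomes its own time series, so an `id` label yields one series per resource. Below is a minimal standard-library sketch that deliberately avoids the Prometheus client. `counterVec` and its methods are hypothetical stand-ins for a labelled counter, not the code in this PR, and the IDs and counts are three of the `resource_processed_total` samples from the e2e output.

```go
package main

import (
	"fmt"
	"strings"
)

// counterVec is a hypothetical stand-in for a labelled counter:
// it keeps one map entry per distinct combination of label values.
type counterVec struct {
	series map[string]float64
}

func newCounterVec() *counterVec {
	return &counterVec{series: make(map[string]float64)}
}

// Add adds n to the series identified by the given label values.
func (c *counterVec) Add(n float64, labelValues ...string) {
	c.series[strings.Join(labelValues, "\x00")] += n
}

// Series reports how many distinct time series exist.
func (c *counterVec) Series() int { return len(c.series) }

func main() {
	// Resource IDs and update counts taken from the e2e output.
	updates := map[string]float64{
		"01dbff20-f86d-41dd-b972-ea9bbfdef7f0": 4,
		"240b3897-d620-4749-8ad9-eb0a0e631ef6": 4,
		"3a3d6f1f-75f2-4169-be70-c9ef90bbf743": 6,
	}

	withID, withoutID := newCounterVec(), newCounterVec()
	for id, n := range updates {
		withID.Add(n, "update", id) // labels: action, id
		withoutID.Add(n, "update")  // labels: action only
	}

	// One series per resource with the id label, one in total without it.
	fmt.Println(withID.Series(), withoutID.Series()) // 3 1
}
```

The series count with the `id` label grows linearly with the number of resources, which is the trade-off weighed above: per-resource visibility now, at the cost of more series as the resource count rises.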