forked from ray-project/ray
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[Data] Adding in per node metrics (ray-project#49705)
## Why are these changes needed? Adding in per node metrics will allow us to debug more efficiently as we can see a more granular view of what is happening at the per node level. Currently for Ray Data we collect a number of metrics that we would want to group by node, but we do not report node for these metrics. The stats collector actor aggregates these across multiple nodes. This PR adds per node metrics to `OpRuntimeMetrics` which will not be aggregated across multiple nodes so we can visualize this data segmented by node in the data dashboard. ## Example ### Script ```python import ray import time def f(x): time.sleep(0.1) return x file_path = "s3://air-example-data-2/100G-xgboost-data.parquet/" ds = ray.data.read_parquet(file_path).limit(10_000_000) ds = ds.map_batches(f) for _ in ds.iter_batches(): pass ``` ### Output ``` (base) ray@ip-10-0-61-222:~/default$ python gen_metrics.py 2025-01-16 16:33:49,968 INFO worker.py:1654 -- Connecting to existing Ray cluster at address: 10.0.61.222:6379... 2025-01-16 16:33:49,977 INFO worker.py:1832 -- Connected to Ray cluster. View the dashboard at https://session-64eiepsal97ynjwq1gb53c43vb.i.anyscaleuserdata-staging.com 2025-01-16 16:33:49,979 INFO packaging.py:366 -- Pushing file package 'gcs://_ray_pkg_e7896c7ed49efce702fc2ded295073e96fe54a3a.zip' (0.00MiB) to Ray cluster... 2025-01-16 16:33:49,979 INFO packaging.py:379 -- Successfully pushed file package 'gcs://_ray_pkg_e7896c7ed49efce702fc2ded295073e96fe54a3a.zip'. 2025-01-16 16:33:51,418 INFO streaming_executor.py:108 -- Starting execution of Dataset. Full logs are in /tmp/ray/session_2025-01-16_16-33-18_905648_2451/logs/ray-data 2025-01-16 16:33:51,418 INFO streaming_executor.py:109 -- Execution plan of Dataset: InputDataBuffer[Input] -> TaskPoolMapOperator[ListFiles] -> TaskPoolMapOperator[PartitionFiles] -> TaskPoolMapOperator[ReadFiles] -> LimitOperator[limit=10000000] -> TaskPoolMapOperator[MapBatches(f)] (autoscaler +1m19s) Tip: use `ray status` to view detailed cluster status. To disable these messages, set RAY_SCHEDULER_EVENTS=0. ✔️ Dataset execution finished in 113.05 seconds: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10.0M/10.0M [01:53<00:00, 88.5k row/s] - ListFiles: Tasks: 0; Queued blocks: 0; Resources: 0.0 CPU, 0.0B object store: : 1.00k row [01:53, 8.85 row/s] - PartitionFiles: Tasks: 0; Queued blocks: 0; Resources: 0.0 CPU, 87.3KB object store: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.00k/1.00k [01:53<00:00, 8.85 row/s] - ReadFiles: Tasks: 0; Queued blocks: 311; Resources: 0.0 CPU, 2.4GB object store: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 17.9M/17.9M [01:53<00:00, 159k row/s] - limit=10000000: Tasks: 0; Queued blocks: 19; Resources: 0.0 CPU, 0.0B object store: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10.0M/10.0M [01:53<00:00, 88.5k row/s] - MapBatches(f): Tasks: 0; Queued blocks: 0; Resources: 0.0 CPU, 1.2GB object store: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10.0M/10.0M [01:53<00:00, 88.5k row/s ``` ### New Charts <img width="1462" alt="image" src="https://github.com/user-attachments/assets/218183df-243f-4c84-9af9-cc362fac0b7e" /> <img width="1465" alt="image" src="https://github.com/user-attachments/assets/4bdfef16-c773-45c2-bc56-512df70ba0c4" /> <img width="1468" alt="image" src="https://github.com/user-attachments/assets/da19ac5f-33f8-46fe-9677-afbe08a55af1" /> <img width="1471" alt="image" src="https://github.com/user-attachments/assets/72a85d24-ea4e-4c7b-9d96-244325d6333d" /> --------- Signed-off-by: Matthew Owen <mowen@anyscale.com>
- Loading branch information
1 parent
79aa176
commit 53678c1
Showing
8 changed files
with
270 additions
and
11 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.