Xla parallel proxy #12
Conversation
Originally imported from the gemma repository https://github.com/google/gemma_pytorch.git, at commit cf8658c.
This allows executing models in parallel and interacting with them from the caller thread. To see how it works in output mode, you can launch the test with debug enabled like this: DEBUG=1 pytest -s tests/test_parallel_proxy.py
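As a rough illustration of the idea only (not the actual ModelProxy API introduced by this PR, whose names and internals differ), a caller can spawn a worker process that owns the model and talk to it through queues:

```python
import multiprocessing as mp

def _model_loop(cmd_queue, out_queue):
    # Hypothetical worker loop: in the real code this would run the model on
    # the TPU side; here it just echoes commands back to the caller.
    while True:
        command, data = cmd_queue.get()
        if command == "stop":
            break
        out_queue.put(f"handled {command!r} with {data!r}")

if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    cmd_queue, out_queue = ctx.Queue(), ctx.Queue()
    worker = ctx.Process(target=_model_loop, args=(cmd_queue, out_queue))
    worker.start()
    # The caller thread interacts with the running model through the queues.
    cmd_queue.put(("generate", {"prompt": "Hello"}))
    print(out_queue.get())
    cmd_queue.put(("stop", None))
    worker.join()
```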
Force-pushed from 7317e4b to c8d7bbe.
Force-pushed from c8d7bbe to 747ad57.
optimum/tpu/xla_parallel_proxy.py
Outdated
xmp.spawn(_mp_fn, args=(args), join=True, daemon=False)
...
class ModelProxy:
I would rename this, it's not clear imo it handles the MP logic, wdyt? What about something like DistributedModel or DistributedTpuModel?
optimum/tpu/xla_parallel_proxy.py
Outdated
def send(self, command: ModelCommand, data: Dict = None):
    # First wait until model is ready to receive commands
    debug(f" MM Command {command} waiting for model to be ready")
Let's remove the DEBUG/debug reference and leverage Python logging.
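A minimal sketch of that suggestion, with the original type annotations dropped and the logger name chosen here only for illustration:

```python
import logging

logger = logging.getLogger(__name__)

def send(self, command, data=None):
    # Same trace as before, but routed through the standard logging module;
    # verbosity is then controlled with e.g.
    # logging.basicConfig(level=logging.DEBUG) instead of a DEBUG env variable.
    logger.debug("Command %s waiting for model to be ready", command)
    ...
```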
What does this PR do?
This PR does several things:
- Adds an example with gemma-2b that shows how inference works.
- Adds an implementation that allows launching and communicating with models running in parallel, using ModelProxy.
- Adds a test and wires it into a CI workflow (a rough sketch of what such a test could look like is shown below).
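The real tests/test_parallel_proxy.py in this PR will look different; this only sketches the round-trip shape such a test could take, reusing the toy queue-based worker idea from the illustration above rather than the actual ModelProxy API:

```python
import multiprocessing as mp

def _echo_loop(cmd_queue, out_queue):
    # Stand-in for the model worker: answers a single command and exits.
    command, data = cmd_queue.get()
    out_queue.put((command, data))

def test_parallel_proxy_roundtrip():
    ctx = mp.get_context("spawn")
    cmd_queue, out_queue = ctx.Queue(), ctx.Queue()
    worker = ctx.Process(target=_echo_loop, args=(cmd_queue, out_queue))
    worker.start()
    # Send a command from the caller and check the worker's reply.
    cmd_queue.put(("generate", {"prompt": "Hello"}))
    assert out_queue.get(timeout=30) == ("generate", {"prompt": "Hello"})
    worker.join()
```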
Did you write any new necessary tests?