Computer use is not working #223

Open
southkorea2013 opened this issue Mar 3, 2025 · 0 comments
southkorea2013 commented Mar 3, 2025

Hi,

I ran autonomous_gui_agent_voice.py, but it does not work as expected.
Environment: macOS 15.3.1, MacBook with M3 Max, 128 GB.
The click does not land on the Safari icon in the Dock, and the script then crashes with a NameError.
Full output:

This is a beta version of the video understanding. It may not work as expected.
Screen Navigation Assistant
Press Ctrl+C to quit
Fetching 12 files: 100%|█████████████████████| 12/12 [00:00<00:00, 21201.20it/s]
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Fetching 11 files: 100%|█████████████████████| 11/11 [00:00<00:00, 16008.79it/s]
Press Enter to start listening...
Listening...
Fetching 4 files: 100%|█████████████████████████| 4/4 [00:00<00:00, 9049.20it/s]

Heard:  Please open Safari and navigate to Apple.com.
Planner response:
 Thought: The task is to open Safari and navigate to Apple.com. The next step is
to locate and click on the Safari icon in the dock to open the browser.
Action: CLICK on the Safari icon in the dock.
GUI Agent Response:
 {'action': 'CLICK', 'value': None, 'position': [0.4, 0.81]}
Executing action: CLICK
Clicking at position
(604, 795)
Drawing ellipse at pixel coordinates: (604, 795)
Updated navigation history
Saved image to screenshots/screenshot_20250303-142311.png
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Planner response:
 Thought: The task is to open Safari and navigate to Apple.com. The next step is
to locate and click on the Safari icon in the dock to open the browser.
Action: CLICK on the Safari icon in the dock.
{"action": "CLICK", "value": null, "position": [604.8000000000001, 
795.4200000000001]}
GUI Agent Response:
 {'action': 'CLICK', 'value': null, 'position': [604.8000000000001, 
795.4200000000001]}
Error: name 'null' is not defined
Traceback (most recent call last):
  File "/Users/nanhan/demos/mlx-vlm/computer_use/autonomous_gui_agent_voice.py", line 632, in <module>
    main()
  File "/Users/nanhan/demos/mlx-vlm/computer_use/autonomous_gui_agent_voice.py", line 616, in main
    raise e
  File "/Users/nanhan/demos/mlx-vlm/computer_use/autonomous_gui_agent_voice.py", line 611, in main
    past_actions = process_command(
                   ^^^^^^^^^^^^^^^^
  File "/Users/nanhan/demos/mlx-vlm/computer_use/autonomous_gui_agent_voice.py", line 534, in process_command
    response = eval(response)
               ^^^^^^^^^^^^^^
  File "<string>", line 1, in <module>
NameError: name 'null' is not defined
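
A possible workaround, as a minimal sketch only: the traceback shows `eval(response)` at line 534 failing because the second response contains the JSON literal `null`, which is not a Python name. Parsing the string as JSON first, and falling back to a Python-literal parser, avoids `eval` entirely. The helper name `parse_agent_response` is hypothetical and not part of the repository; the last fallback is an assumption meant to cover mixed output such as single-quoted keys together with `null`.

```python
import ast
import json
import re

# Mapping from JSON literals to their Python equivalents.
_JSON_LITERALS = {"null": "None", "true": "True", "false": "False"}


def parse_agent_response(text):
    """Parse a model reply that may be JSON or a Python dict literal.

    Hypothetical helper; it never executes the text as code.
    """
    text = text.strip()
    # 1) Strict JSON, e.g. {"action": "CLICK", "value": null, ...}
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # 2) Python literal, e.g. {'action': 'CLICK', 'value': None, ...}
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        pass
    # 3) Mixed form (single quotes plus null/true/false). This naive
    #    substitution would also rewrite those words inside string values.
    patched = re.sub(
        r"\b(null|true|false)\b",
        lambda m: _JSON_LITERALS[m.group(1)],
        text,
    )
    return ast.literal_eval(patched)
```

With something like this, `response = parse_agent_response(response)` could stand in for the `eval` call, and a reply that cannot be parsed would raise a `ValueError` or `SyntaxError` rather than a `NameError`.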
[Image attached]
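
On the click position: in the log the first response is normalized (`[0.4, 0.81]`) and is clicked at `(604, 795)`, while the second response already carries pixel-like values (`[604.8000000000001, 795.4200000000001]`). Those numbers are consistent with a screen size of 1512 x 982 points, but that size is inferred from the log and not confirmed. A rough sketch of a conversion that scales only normalized positions, so a value already in pixels is not scaled a second time (the function name `to_pixels` is hypothetical):

```python
def to_pixels(position, screen_width, screen_height):
    """Convert a model position to integer screen coordinates.

    Assumption: positions with both values in [0, 1] are normalized;
    anything else is treated as already being in pixels. The pixel
    (1, 1) is therefore ambiguous and is treated as normalized.
    """
    x, y = position
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return round(x * screen_width), round(y * screen_height)
    return round(x), round(y)


# Screen size below is inferred from the log, not measured.
print(to_pixels([0.4, 0.81], 1512, 982))
print(to_pixels([604.8000000000001, 795.4200000000001], 1512, 982))
```

Both calls give the same point, so this would not by itself explain a click that misses the Safari icon; the predicted position or display scaling may also be involved.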

Thanks,
Nan
