I ran autonomous_gui_agent_voice.py, but it does not work as expected.
macOS 15.3.1
MacBook M3 Max 128G
The click does not land on the Safari icon in the Dock, and the script then crashes with a NameError.
The full console output is:
This is a beta version of the video understanding. It may not work as expected.
Screen Navigation Assistant
Press Ctrl+C to quit
Fetching 12 files: 100%|█████████████████████| 12/12 [00:00<00:00, 21201.20it/s]
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Fetching 11 files: 100%|█████████████████████| 11/11 [00:00<00:00, 16008.79it/s]
Press Enter to start listening...
Listening...
Fetching 4 files: 100%|█████████████████████████| 4/4 [00:00<00:00, 9049.20it/s]
Heard: Please open Safari and navigate to Apple.com.
Planner response:
Thought: The task is to open Safari and navigate to Apple.com. The next step is
to locate and click on the Safari icon in the dock to open the browser.
Action: CLICK on the Safari icon in the dock.
GUI Agent Response:
{'action': 'CLICK', 'value': None, 'position': [0.4, 0.81]}
Executing action: CLICK
Clicking at position
(604, 795)
Drawing ellipse at pixel coordinates: (604, 795)
Updated navigation history
Saved image to screenshots/screenshot_20250303-142311.png
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Planner response:
Thought: The task is to open Safari and navigate to Apple.com. The next step is
to locate and click on the Safari icon in the dock to open the browser.
Action: CLICK on the Safari icon in the dock.
{"action": "CLICK", "value": null, "position": [604.8000000000001,
795.4200000000001]}
GUI Agent Response:
{'action': 'CLICK', 'value': null, 'position': [604.8000000000001,
795.4200000000001]}
Error: name 'null' is not defined
Traceback (most recent call last):
File "/Users/nanhan/demos/mlx-vlm/computer_use/autonomous_gui_agent_voice.py", line 632, in <module>
main()
File "/Users/nanhan/demos/mlx-vlm/computer_use/autonomous_gui_agent_voice.py", line 616, in main
raise e
File "/Users/nanhan/demos/mlx-vlm/computer_use/autonomous_gui_agent_voice.py", line 611, in main
past_actions = process_command(
^^^^^^^^^^^^^^^^
File "/Users/nanhan/demos/mlx-vlm/computer_use/autonomous_gui_agent_voice.py", line 534, in process_command
response = eval(response)
^^^^^^^^^^^^^^
File "<string>", line 1, in <module>
NameError: name 'null' is not defined
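A possible workaround, as a minimal sketch only: the traceback shows `eval(response)` failing because the second response is JSON (`null`) rather than a Python literal (`None`). Assuming `response` is a string at that point, it could be parsed with `json.loads` first and `ast.literal_eval` as a fallback. The helper name `parse_agent_response` is hypothetical and not part of the script, and I have not tested this against the repository.

```python
import ast
import json


def parse_agent_response(text):
    """Parse a model response string into a dict without calling eval()."""
    try:
        # JSON form, e.g. {"action": "CLICK", "value": null, ...}
        return json.loads(text)
    except json.JSONDecodeError:
        # Python-literal form, e.g. {'action': 'CLICK', 'value': None, ...}
        return ast.literal_eval(text)


# Both response styles seen in the log parse to the same kind of dict.
json_style = '{"action": "CLICK", "value": null, "position": [0.4, 0.81]}'
python_style = "{'action': 'CLICK', 'value': None, 'position': [0.4, 0.81]}"
print(parse_agent_response(json_style))
print(parse_agent_response(python_style))
```

This does not handle a mixed string that uses single quotes together with `null`; that case still raises an exception and would need separate handling.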
Thanks,
Nan