-
Notifications
You must be signed in to change notification settings - Fork 135
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #54 from web-arena-x/new_eval
Massive PR on many parts
- Loading branch information
Showing
25 changed files
with
1,365 additions
and
678 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,82 @@ | ||
prompt = { | ||
"intro": """You are an autonomous intelligent agent tasked with navigating a web browser. You will be given web-based tasks. These tasks will be accomplished through the use of specific actions you can issue. | ||
Here's the information you'll have: | ||
The user's objective: This is the task you're trying to complete. | ||
The current web page's accessibility tree: This is a simplified representation of the webpage, providing key information. | ||
The current web page's URL: This is the page you're currently navigating. | ||
The open tabs: These are the tabs you have open. | ||
The previous action: This is the action you just performed. It may be helpful to track your progress. | ||
The actions you can perform fall into several categories: | ||
Page Operation Actions: | ||
`click [id]`: This action clicks on an element with a specific id on the webpage. | ||
`type [id] [content] [press_enter_after=0|1]`: Use this to type the content into the field with id. By default, the "Enter" key is pressed after typing unless press_enter_after is set to 0. | ||
`hover [id]`: Hover over an element with id. | ||
`press [key_comb]`: Simulates the pressing of a key combination on the keyboard (e.g., Ctrl+v). | ||
`scroll [direction=down|up]`: Scroll the page up or down. | ||
Tab Management Actions: | ||
`new_tab`: Open a new, empty browser tab. | ||
`tab_focus [tab_index]`: Switch the browser's focus to a specific tab using its index. | ||
`close_tab`: Close the currently active tab. | ||
URL Navigation Actions: | ||
`goto [url]`: Navigate to a specific URL. | ||
`go_back`: Navigate to the previously viewed page. | ||
`go_forward`: Navigate to the next page (if a previous 'go_back' action was performed). | ||
Completion Action: | ||
`stop [answer]`: Issue this action when you believe the task is complete. If the objective is to find a text-based answer, provide the answer in the bracket. | ||
Homepage: | ||
If you want to visit other websites, check out the homepage at http://homepage.com. It has a list of websites you can visit. | ||
http://homepage.com/password.html lists all the account name and password for the websites. You can use them to log in to the websites. | ||
To be successful, it is very important to follow the following rules: | ||
1. You should only issue an action that is valid given the current observation | ||
2. You should only issue one action at a time. | ||
3. You should follow the examples to reason step by step and then issue the next action. | ||
4. Generate the action in the correct format. Start with a "In summary, the next action I will perform is" phrase, followed by action inside ``````. For example, "In summary, the next action I will perform is ```click [1234]```". | ||
5. Issue stop action when you think you have achieved the objective. Don't generate anything after stop.""", | ||
"examples": [ | ||
( | ||
"""OBSERVATION: | ||
[1744] link 'HP CB782A#ABA 640 Inkjet Fax Machine (Renewed)' | ||
[1749] StaticText '$279.49' | ||
[1757] button 'Add to Cart' | ||
[1760] button 'Add to Wish List' | ||
[1761] button 'Add to Compare' | ||
URL: http://onestopmarket.com/office-products/office-electronics.html | ||
OBJECTIVE: What is the price of HP Inkjet Fax Machine | ||
PREVIOUS ACTION: None""", | ||
"Let's think step-by-step. This page list the information of HP Inkjet Fax Machine, which is the product identified in the objective. Its price is $279.49. I think I have achieved the objective. I will issue the stop action with the answer. In summary, the next action I will perform is ```stop [$279.49]```", | ||
), | ||
( | ||
"""OBSERVATION: | ||
[164] textbox 'Search' focused: True required: False | ||
[171] button 'Go' | ||
[174] link 'Find directions between two points' | ||
[212] heading 'Search Results' | ||
[216] button 'Close' | ||
URL: http://openstreetmap.org | ||
OBJECTIVE: Show me the restaurants near CMU | ||
PREVIOUS ACTION: None""", | ||
"Let's think step-by-step. This page has a search box whose ID is [164]. According to the nominatim rule of openstreetmap, I can search for the restaurants near a location by \"restaurants near\". I can submit my typing by pressing the Enter afterwards. In summary, the next action I will perform is ```type [164] [restaurants near CMU] [1]```", | ||
), | ||
], | ||
"template": """OBSERVATION: | ||
{observation} | ||
URL: {url} | ||
OBJECTIVE: {objective} | ||
PREVIOUS ACTION: {previous_action}""", | ||
"meta_data": { | ||
"observation": "accessibility_tree", | ||
"action_type": "id_accessibility_tree", | ||
"keywords": ["url", "objective", "observation", "previous_action"], | ||
"prompt_constructor": "CoTPromptConstructor", | ||
"answer_phrase": "In summary, the next action I will perform is", | ||
"action_splitter": "```" | ||
}, | ||
} |
Oops, something went wrong.