Merge pull request #6 from chaosparrot/terminal_support
Terminal support
chaosparrot authored Dec 31, 2024
2 parents e3e8c95 + dd18dcd commit 042cbe6
Showing 45 changed files with 1,256 additions and 179 deletions.
27 changes: 20 additions & 7 deletions README.md
@@ -25,6 +25,10 @@ If you do not want marithime dictation, but instead only want to use the selecti

You can always say `marithime` followed by a phrase to use it if you do not want to override the regular dictation insert.

### List of supported programs

Generally this package tries to support all kinds of programs through the accessibility APIs. To know whether they actually work, however, programs are tested manually. [The list of supported programs is documented here](programs/SUPPORTED.md)

### Privacy statement

Because most software isn't accessible, this package relies on a couple of ways to understand what is inside a text field, and where the caret inside of it is. One of these methods is **locally tracking keystrokes that happen through Talon voice**.
@@ -49,10 +53,10 @@ If you want to highlight a specific set of tests, go inside of the specific test

#### Documentation

[] - Create a usage and installation video
[] - Create a usage and installation video
Videos seem to speak to people more than written text does, so accompany this with a video as well

[] - Extension possibilities for other packages
[] - Extension possibilities for other packages
There's a ton of ways other packages can make use of our captures, settings and detections, but we will need to document them so they are easier to reuse as well.

#### Dictation
@@ -61,7 +65,7 @@ There's a ton of ways other packages can make use of our captures, settings and
This boils down to matching `an` and `the` to be similar despite them being phonetically different.
We can add something configurable so it's easy for users to extend.

[] - Terminator words
[~] - Terminator words
Right now the word `quill` is used, instead of the word `over`, to terminate a command. We probably want to extend this a bit, though we need to take into account that they need to not only be used in commands, but filtered out in other ways.

[] - Making automatic fixing work
@@ -91,16 +95,25 @@ We could make it easier to loop through sentences since we already have the buff
[] - Implement flow for digits
Right now, you still need to say `numb zero` every time between commands. We can detect if we should allow digits, periods and other kinds of formatters as single words if we can be very certain that the next character will be

[] - Word wrap detection
We need to find a way to deal with word wrap, where text is on a single logical line but is shown ( and, most importantly, navigated with the keyboard ) as multiple lines. Our current Up and Down arrow key pressing does not deal with that.

[] - Zero width space indexation selection fix
When a zero width space indexation is used, it is possible that a current selection is removed. We can fix that selection afterwards so we don't have issues where content is removed unnecessarily

[] - Add clipboard pasting insert support
Right now it isn't possible to use clipboard pasting as a way to insert things rather than typing out the characters one by one. This makes the insertion slower than it could be. This can be done with 'Ctrl+C' and 'Ctrl+V', or 'Ctrl+Shift+C' and 'Ctrl+Shift+V' in terminals, though we probably want to use `actions.edit.paste()` to make it compatible with other packages. We do need to be aware that in terminals there is a possibility that `Remove trailing white-space when pasting` is turned on, which might cause desyncs.

#### Programs

[] - Improved MacOS support
While there are programs where the accessibility API works pretty well, others just don't connect properly when finding the right focused element. We'll need to address these one by one unfortunately, because accessibility APIs are all over the place from program to program.

[] - Terminal support
Right now terminals have a ton of issues because they do not allow for text selection, have painful accessibility support, and use a ton of custom key binds that don't correlate with other document builders.
[] - Text editor support
This means we should be able to support vim, nano and other keybindings. This runs into the same issues as using a terminal does however, namely poor accessibility support and hard to detect when something is inside of a text editor in the first place. Another is no line wrapping when reaching the start or end of the line and key-pressing beyond that boundary.

[] - Single line detection / support
Some fields, like name fields, do not have the possibility to add multiple lines. In that case, we probably want to either clear the buffer or simply not allow the enter to change the field. We should probably do a refresh if we are in an accessible field, and a clear in a terminal.
[~] - Single line detection
Some fields, like name fields, do not have the possibility to add multiple lines. In that case, we probably want to either clear the buffer or simply not allow the enter to change the field. We should probably do a refresh if we are in an accessible field, and a clear in a terminal.

[] - Accessibility input tags
We can detect a field type, like email, phone number etc from the accessibility APIs. That means we could expose that information for other packages to use as well, so you can say `Homer` to input `homer@odyssey.com` for example.
20 changes: 15 additions & 5 deletions accessibility/windows.py
@@ -30,7 +30,7 @@ def index_element_text(self, element = None) -> AccessibilityText:
if value is not None:
accessibility_text = AccessibilityText(value)

if accessibility_text and "Text" in element.patterns:
if not accessibility_text and "Text" in element.patterns:
try:
value = element.text_pattern.document_range.text
# Windows sometimes just throws operation successful errors...
@@ -99,10 +99,11 @@ def determine_caret_positions(self, element = None) -> List[AccessibilityCaret]:
return None

# Code adapted from AndreasArvidsson's talon files
# Currently only Text and Text2 are supported
has_text_pattern = False if "Text2" not in element.patterns and "Text" not in element.patterns else True
# Currently only Text2 is supported
# Text(1) doesn't seem to have caret_range and other methods
has_text_pattern = "Text2" in element.patterns# or "Text" in element.patterns
if has_text_pattern:
text_pattern = element.text_pattern2 if "Text2" in element.patterns else element.text_pattern
text_pattern = element.text_pattern2# if "Text2" in element.patterns else element.text_pattern

# Make copy of the document range to avoid modifying the original
range_before_selection = text_pattern.document_range.clone()
@@ -180,8 +181,17 @@ def determine_caret_positions(self, element = None) -> List[AccessibilityCaret]:
return [end_caret, start_caret] if is_reversed else [start_caret, end_caret]
else:
return []
# IAccessible proves to be harder to implement
# Further investigation can be done with the code seen in NVAccess
# https://github.com/nvaccess/nvda/blob/e80d7822160f7d2ff151140bc97ca84e5798c1fb/source/NVDAObjects/IAccessible/__init__.py#L465
#elif "LegacyIAccessible" in element.patterns:
# print("ATTEMPTING!")
# pattern = element.legacyiaccessible_pattern
# selection = pattern.selection
# if len(selection) > 0:
# print( dir( selection ), selection )
# print( dir(element), element.aria_role )

return []


windows_api = WindowsAccessibilityApi()
25 changes: 20 additions & 5 deletions context.py
@@ -3,21 +3,36 @@
mod = Module()
ctx = Context()
mod.list("marithime_terminator_word", desc="A list of all the end-of-command terminator words used within dictation and other commands")

mod.tag("marithime_available", desc="Check that the Marithime package is available for a user")
mod.tag("marithime_dictation", desc="Overrides the dictation insert with the marithime one")
mod.tag("marithime_context_disable_shift_selection", desc="Disables shift selection for the current context")
mod.tag("marithime_context_disable_word_wrap", desc="Disables word wrap detection for the current context")
mod.tag("marithime_input_field_text", desc="Set when a single line text input field is focused")

mod.setting("marithime_auto_fixing_enabled", type=int, default=0, desc="Whether to allow auto-fixing ( auto correct ) based on earlier corrections")

# Settings that handle multi-line
mod.setting("marithime_context_shift_selection", type=int, default=1, desc="Enables or disables the use of shift press selection for the current context")
mod.setting("marithime_context_multiline_supported", type=int, default=1, desc="Enables or disables the use of multiple lines through the enter key")
mod.setting("marithime_context_word_wrap_width", type=int, default=-1, desc="Sets the width of the current input element for word wrapping, negative for disabled")

# Key tracking
mod.setting("marithime_context_clear_key", type=str, default="", desc="When this key is pressed, the context is cleared - For example, enter presses clearing the terminal")
mod.setting("marithime_context_remove_undo", type=str, default="ctrl-z", desc="The key combination to undo a paste action")
mod.setting("marithime_context_remove_word", type=str, default="ctrl-backspace", desc="The key combination to clear a word to the left of the caret")
mod.setting("marithime_context_remove_letter", type=str, default="backspace", desc="The key combination to clear a single letter to the left of the caret")
mod.setting("marithime_context_remove_forward_word", type=str, default="ctrl-delete", desc="The key combination to clear a word to the right of the caret")
mod.setting("marithime_context_remove_forward_letter", type=str, default="delete", desc="The key combination to clear a single letter to the right of the caret")
mod.setting("marithime_context_remove_line", type=str, default="", desc="The key to remove an entire line from the current text")
mod.setting("marithime_context_start_line_key", type=str, default="home", desc="The key to move to the start of the line")
mod.setting("marithime_context_end_line_key", type=str, default="end", desc="The key to move to the end of the line")

# This is default turned to aggressive ( to re-index after every action ) until we fix most of the edgecases where de-syncs happen
# Options - "" (default) - Whenever the confidence that it has lost the location in the file, re-index
# - aggressive - After every marithime command that requires context, we re-index
# - disabled - Disable indexing altogether, so no shift-select, clipboard, file or accessibility indexing
mod.setting("marithime_indexing_strategy", type=str, default="", desc="Determine what strategy we should use to begin reindexing documents")

ctx.tags = ["user.marithime_available"]
ctx.lists["user.marithime_terminator_word"] = ["quill", "quilt"]
ctx.lists["user.marithime_terminator_word"] = [
# "over",
"quill",
"quilt"
]
2 changes: 1 addition & 1 deletion phonetics/phonetics.py
@@ -1,7 +1,7 @@
from typing import List, Callable
from .detection import detect_phonetic_fix_type, phonetic_normalize, levenshtein, syllable_count, EXACT_MATCH, HOMOPHONE_MATCH, PHONETIC_MATCH

class PhoneticSearch:
class PhoneticSearch:

# The file content and callbacks, - Separated from talon bindings for easier testing
homophone_content = ""
57 changes: 57 additions & 0 deletions programs/SUPPORTED.md
@@ -0,0 +1,57 @@
# Supported programs

This is a list of programs that are known to be supported. Others might have varying degrees of support depending on how well accessibility is handled. Testing is generally done on the beta release of Talon Voice. On Windows, testing is done on Windows 11 ( Windows 10 where specifically mentioned ).

## Operating system support

Both Windows and MacOS have supported accessibility APIs, which allow us to introspect the currently focused text area in great detail, but Linux does not. For that reason, Linux doesn't have as good of a user experience since we cannot poll the content of a text area directly.

## Word processor support

TODO


## Browser support

TODO

## Code editor / IDE support

| Program | OS | Cursor tracking | Content tracking | Notes |
|-----------------|---------|-----------------|------------------|-------|
| VSCode editor | Windows | Yes* | Yes* | This requires turning on accessibility support |
| VSCode editor | MacOS | Yes* | Yes* | This requires turning on accessibility support `Shift+Option+F1`|

VSCode has some issues when creating new files that do not have a sticky filename however.

## Terminal support

| Program | OS | Cursor tracking | Content tracking | Selection | Notes |
|-----------------|---------|-----------------|------------------|-----------|-------|
| Terminal | Windows | Key tracking | Key tracking | Virtual | |
| Git BASH | Windows | Key tracking | Key tracking | Virtual | |
| CMD | Windows | Key tracking | Key tracking | Virtual | |
| Cygwin | Windows | Key tracking | Key tracking | Virtual | |
| ConEmu | Windows | Key tracking | Key tracking | Virtual | |
| Cmder | Windows | Key tracking | Key tracking | Shift | |
| PowerShell | Windows | Key tracking | Key tracking | Shift | |
| iTerm | MacOS | Key tracking | Key tracking | Virtual | |
| iTerm2 | MacOS | Key tracking | Key tracking | Virtual | |
| KiTTY | MacOS | Key tracking | Key tracking | Virtual | |
| Gnome Terminal | Linux | Key tracking | Key tracking | Virtual | |
| Guake | Linux | Key tracking | Key tracking | Virtual | |
| VSCode terminal | Windows / MacOS / Linux | Key tracking | Key tracking | Virtual | This requires [changing the windows title as described in talonhub community](https://github.com/talonhub/community/tree/main/apps/vscode#terminal) |

Terminal programs generally aren't as well supported as other programs with a richer set of accessibility APIs. On top of that, text editors such as VIM, emacs and nano each have their own set of hotkeys to navigate the displayed text, so key tracking becomes increasingly hard to do and prone to desyncs.

It seems that the `TextPattern` is supported on Windows 11 for terminals, so it might be worth exploring this more in the future, though each terminal program has a different leading character set ( '$ ' for bash-likes, `λ ` for Cmder, '...>' for PowerShell et cetera ) and we can realistically only support single line programs for now.

While it is possible to tackle this, it is also quite hard to do without major time investments and plugins designed for each text editor available.
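
To illustrate the leading character problem mentioned above, here is a minimal sketch of stripping a prompt from a single terminal line. It is not part of this package: `PROMPT_MARKERS` and `strip_prompt` are hypothetical names, and the marker list is an assumption loosely based on the examples in the text ( real prompts are user configurable ).

```python
from typing import Sequence

# Hypothetical prompt markers, loosely based on the examples above.
# Real prompts are user configurable, so this list is an assumption.
PROMPT_MARKERS = ("$ ", "λ ", "> ")


def strip_prompt(line: str, markers: Sequence[str] = PROMPT_MARKERS) -> str:
    """Return the text after the earliest prompt marker, or the line unchanged."""
    earliest_end = None
    earliest_index = len(line)
    for marker in markers:
        index = line.find(marker)
        # Keep the marker that appears first in the line, since later
        # matches are more likely part of the typed command itself.
        if index != -1 and index < earliest_index:
            earliest_index = index
            earliest_end = index + len(marker)
    return line if earliest_end is None else line[earliest_end:]


print(strip_prompt("user@host:~$ git status"))
print(strip_prompt("PS C:\\Users\\me> ls"))
```

A heuristic like this would misfire on output lines that happen to contain a marker, which is one reason only single line programs are realistic for now.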

Terminals are detected by the `tag: terminal` which is generally retrieved from .talon files like the ones shown in the talon community repository.

### Virtual selection

Virtual selection is used when shift selection isn't supported. What this boils down to is that the text caret will be set after the selected text. Follow-up commands, like inserting text, will behave as if the text was actually selected, meaning we would replace the selection in the case of replacing text. This allows you to continue using the same exact commands without worrying about the internals of the specific programs.

While it would be possible to support text selection like the mode supported by holding down Shift in ConEmu, or using mark modes like the ones shown when pressing `Ctrl+Shift+M` in a windows terminal, doing so would further complicate selection, and might not work with Text editors either.
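
The idea behind virtual selection can be sketched roughly as follows. This is a simplified illustration and not the package's actual implementation: the `VirtualBuffer` class, its fields and the `"type:"` key notation are all hypothetical. The selection is only remembered in the buffer, the caret sits after the selected text, and a follow-up insert first emits backspaces to remove the remembered range.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VirtualBuffer:
    """Hypothetical tracked text with a caret and a remembered selection."""
    text: str = ""
    caret: int = 0
    selection_start: Optional[int] = None

    def select(self, start: int, end: int) -> None:
        # No shift presses are sent: the caret is placed after the
        # selected text and the start of the range is only remembered.
        self.selection_start = start
        self.caret = end

    def insert(self, new_text: str) -> List[str]:
        """Return the key presses needed to insert text, replacing any virtual selection."""
        keys: List[str] = []
        if self.selection_start is not None:
            # Remove the virtually selected text with backspaces.
            count = self.caret - self.selection_start
            keys.extend(["backspace"] * count)
            self.text = self.text[:self.selection_start] + self.text[self.caret:]
            self.caret = self.selection_start
            self.selection_start = None
        keys.append("type:" + new_text)
        self.text = self.text[:self.caret] + new_text + self.text[self.caret:]
        self.caret += len(new_text)
        return keys


buffer = VirtualBuffer("git stauts", 10)
buffer.select(4, 10)
print(buffer.insert("status"), buffer.text)
```

From the caller's point of view, replacing a virtual selection looks the same as replacing a real one, which is what lets the same commands work in both kinds of programs.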
4 changes: 4 additions & 0 deletions programs/input_field.talon
@@ -0,0 +1,4 @@
tag: user.marithime_input_field_text
-
settings():
user.marithime_context_multiline_supported = 0
6 changes: 6 additions & 0 deletions programs/mac/kitty.talon
@@ -0,0 +1,6 @@
os: mac
and app.bundle: net.kovidgoyal.kitty
-
tag(): terminal
settings():
user.marithime_context_shift_selection = 0
Original file line number Diff line number Diff line change
@@ -5,4 +5,7 @@ settings():
user.marithime_context_remove_word = "cmd-backspace"
user.marithime_context_remove_letter = "backspace"
user.marithime_context_remove_forward_word = "cmd-delete"
user.marithime_context_remove_forward_letter = "delete"
user.marithime_context_remove_forward_letter = "delete"
user.marithime_context_remove_line = ""
user.marithime_context_end_line_key = "cmd-right"
user.marithime_context_start_line_key = "cmd-left"
14 changes: 14 additions & 0 deletions programs/terminal.talon
@@ -0,0 +1,14 @@
os: mac
os: windows
os: linux
tag: terminal
-
settings():
user.marithime_indexing_strategy = "disabled"
user.marithime_context_multiline_supported = 0
user.marithime_context_shift_selection = 0
user.marithime_context_end_line_key = "ctrl-e"
user.marithime_context_start_line_key = "ctrl-a"
user.marithime_context_remove_line = "ctrl-u"
user.marithime_context_remove_word = "ctrl-w"
user.marithime_context_clear_key = "enter"
6 changes: 6 additions & 0 deletions programs/windows/cmder.talon
@@ -0,0 +1,6 @@
os: windows
tag: terminal
win.title: /Cmder/
-
settings():
user.marithime_context_shift_selection = 1
6 changes: 6 additions & 0 deletions programs/windows/powershell.talon
@@ -0,0 +1,6 @@
os: windows
tag: terminal
win.title: /PowerShell/
-
settings():
user.marithime_context_shift_selection = 1
3 changes: 1 addition & 2 deletions streamdeck.talon
@@ -1,5 +1,4 @@
tag: user.talon_hud_enabled
tag: user.talon_hud_deck_enabled
-
#deck(compass): user.marithime_index_textarea()
#deck(question-mark): user.marithime_dump_context()
#deck(compass): user.marithime_index_textarea()
9 changes: 8 additions & 1 deletion talon_hud_integration/index_visualisation.py
@@ -5,6 +5,7 @@

def index_document(_ = None, _2 = None):
actions.user.marithime_index_textarea()
actions.user.marithime_toggle_track_context()

ctx_override = Context()
ctx_override.matches = """
@@ -63,4 +64,10 @@ def marithime_update_sensory_state(scanning: bool, level: str, caret_confidence:

absolute_status_bar_image = os.path.join(IMAGES_DIR, status_bar_image + ".png")
status_bar_icon = actions.user.hud_create_status_icon("virtual_buffer", absolute_status_bar_image, "", "Virtual buffer unavailable", index_document)
actions.user.hud_publish_status_icon("virtual_buffer", status_bar_icon)
actions.user.hud_publish_status_icon("virtual_buffer", status_bar_icon)

def marithime_show_context() -> str:
"""Show the current context in a Window if given the chance"""
content_to_render = actions.next()
actions.user.hud_publish_content(content_to_render, "documentation", "Virtual buffer context", True, [], [])
return content_to_render