---
title: "symbol2unicode: Generate unicode symbols from similar ascii character combinations"
author: Huub de Beer
date: "June, 2016"
keywords:
- ascii
- unicode
- javascript
- dfa
- symbols
- mathematics
...
While reading Nederpelt and Kamareddine (2011), Logical reasoning: A first course, for a project to explore constructionist learning approaches, I found myself entering Unicode symbols a lot. There is nothing wrong with entering one or two Unicode symbols now and then (a fitting symbol enhances the readability of a text enormously), but when a text is symbol-heavy it soon becomes a chore.
For example, in Vim, the text editor I use for all my writing, you can enter the logical not operator as follows: to get "¬" you have to press Control + "v", then "u", and then "00ac". This is a lot more typing than, say, ! to denote "not". Would it not be great if there were a program where I could enter ASCII representations of the symbols I want to use, which would then be converted to their Unicode equivalents? symbol2unicode is such a program!
symbol2unicode is free software; it is licensed under the GNU General Public Licence Version 3. You will find its source code on GitHub.
There are two ways to use symbol2unicode: via a web interface and via a command-line interface, which has an interactive mode. Both interfaces work mostly the same: you enter an ASCII representation of a symbol, such as =>, and pressing ENTER converts it to Unicode. You can also supply the ASCII representation as a parameter to the symbol2unicode program.
You can install symbol2unicode via npm as follows:
npm install -g symbol2unicode
If you do not want to install the program globally, remove the -g parameter from the line above.
Run the program symbol2unicode with the ASCII representations of the symbols you want to convert as parameters. For example, to convert => to ⇒, run the program as follows:
symbol2unicode "=>"
You can specify as many parameters as you like. These will be joined together with a space (" ") and run through the converter as one long string. For example,
symbol2unicode 'P /\ Q' '=>' '!Q \/ P === !P'
results in the output P ∧ Q ⇒ ¬Q ∨ P ≡ ¬P. The input string (<forall i: i in ZZ:i <= i^2>) will be converted to 〈∀ i: i ∈ ℤ:i ≤ i²〉.
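The joining step described above can be pictured with a few lines of JavaScript. This is only a minimal sketch, not the program's actual source: `joinArguments` is a hypothetical helper, and the array literal stands in for what a Node.js script would receive as `process.argv.slice(2)`.

```javascript
// Hypothetical helper (not part of symbol2unicode): join the command-line
// parameters with a single space so they can be converted as one string.
function joinArguments(args) {
  return args.join(" ");
}

// Stand-in for process.argv.slice(2) in a real Node.js script.
const parameters = ["P /\\ Q", "=>", "!Q \\/ P === !P"];
const joined = joinArguments(parameters);

console.log(joined);
```

The joined string is what the converter would then see as a single input, which is why symbols may also span the boundary between two parameters.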
If the symbol2unicode program is executed without any parameters, it will run in interactive mode. The interactive mode starts by printing the following short welcome message:
Welcome by symbol2unicode.
Usage:
Enter a string of ascii symbols after the prompt (? ) and press
ENTER to convert it to unicode. Press CONTROL+C to quit.
After that, enter the ASCII representations of the symbols you want to convert at the ? prompt. Press ENTER to convert your input to Unicode. To quit interactive mode and the program, press CONTROL+C.
Finally, it is possible to use the symbol2unicode program with pipes. For example:
echo "P /\ Q === true" | symbol2unicode
will result in P ∧ Q ≡ true.
As I am a heavy Vim user, I like to use symbol2unicode from inside Vim. Of course, I can call it like any other external program in Vim:
:r !symbol2unicode "(forall i:i in NN:i <= i^2)"
This will insert (∀ i:i ∈ ℕ:i ≤ i²) on the line below the cursor. This works fine, but the command is quite a "lineful", particularly if you only want to insert a single symbol now and then. A simple way to shorten the invocation is to create an alias in Bash (or any other shell that supports them) that maps symbol2unicode to something shorter, such as s2u or uu.
A better way, however, is to create a custom Vim command that feeds its argument to symbol2unicode and inserts the output in the current file. I like the sound of S2u for that (custom commands must start with a capital letter). The above example then becomes:
:S2u (forall i:i in NN: i <= i^2)
To create the S2u command, run
:command -nargs=+ S2u r! symbol2unicode "<args>"
or add it to your .vimrc. As a next step, you could map S2u to a key, such as F7, with
:map <F7> <Esc>:S2u <Space>
All you have to do now is press that key, type your ASCII string of symbols, and press ENTER.
For a full overview of the ASCII to Unicode mappings, see the source code file src/DEFAULT_REPLACEMENTS.js.
The rules for replacements are simple:
- An ASCII symbol representation can occur only once, but a Unicode symbol can occur as often as needed.
- A Unicode symbol is exactly one character, but an ASCII symbol representation can have as many characters as needed.
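These two rules can be checked mechanically. The sketch below is an illustration only: `checkRules` and the array-of-pairs layout are assumptions made for this example and do not reflect how src/DEFAULT_REPLACEMENTS.js is organised, and "one character" is interpreted here as one Unicode code point.

```javascript
// Hypothetical check of a rule table against the two rules above.
// `rules` is an array of [ascii, unicode] pairs (an assumed layout);
// the function returns a list of human-readable problems.
function checkRules(rules) {
  const seen = new Set();
  const problems = [];
  for (const [ascii, unicode] of rules) {
    // Rule 1: every ASCII representation may occur only once.
    if (seen.has(ascii)) {
      problems.push(`duplicate ASCII representation: ${ascii}`);
    }
    seen.add(ascii);
    // Rule 2: the Unicode side is exactly one character. Spreading a string
    // iterates over code points, so a symbol outside the Basic Multilingual
    // Plane still counts as one.
    if ([...unicode].length !== 1) {
      problems.push(`not a single Unicode character: ${unicode}`);
    }
  }
  return problems;
}

// Two ASCII representations may share one Unicode symbol: no problems.
console.log(checkRules([["=>", "⇒"], ["!", "¬"], ["not", "¬"]]));

// A repeated ASCII representation and a multi-character target are reported.
console.log(checkRules([["=>", "⇒"], ["=>", "⋔"], ["<=>", "<=>"]]));
```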
Where there are clear conventions for ASCII symbol representations, such as in programming languages, these conventions have priority over more "logical" representations. Therefore, <= is converted to ≤ rather than ⇐ (which you get with <==).
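For <= and <== to coexist, the converter has to prefer the longest representation that matches at a given position. The keywords of this text suggest the real converter is built as a DFA; the sketch below swaps in a plain longest-match table scan instead, purely to illustrate the behaviour. The names `RULES`, `MAX_LENGTH` and `convert` are assumptions for this example, and the table holds only a few mappings mentioned in this text.

```javascript
// A few sample rules taken from the examples in this text.
const RULES = new Map([
  ["<=", "≤"],
  ["<==", "⇐"],
  ["=>", "⇒"],
  ["!", "¬"],
]);

// Length of the longest ASCII representation; bounds the lookahead.
const MAX_LENGTH = Math.max(...[...RULES.keys()].map((key) => key.length));

// Hypothetical converter: at each position, try the longest candidate first,
// so "<==" wins over "<=" whenever both could match.
function convert(input) {
  let output = "";
  let i = 0;
  while (i < input.length) {
    let matched = false;
    for (let len = Math.min(MAX_LENGTH, input.length - i); len > 0; len--) {
      const candidate = input.slice(i, i + len);
      if (RULES.has(candidate)) {
        output += RULES.get(candidate);
        i += len;
        matched = true;
        break;
      }
    }
    if (!matched) {
      output += input[i]; // no rule applies: copy the character unchanged
      i += 1;
    }
  }
  return output;
}

console.log(convert("a <= b"));  // a ≤ b
console.log(convert("a <== b")); // a ⇐ b
```

Trying the longest candidate first is what keeps the shorter, conventional <= usable without giving up the longer <== for the arrow.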
You can add a symbol to the default list of replacements by either opening a pull request or shooting me an email. Before you do, however, check that your new replacement rule does not interfere with pre-existing rules. You can check that in one of two ways:
- On the command line: add your rules to the src/DEFAULT_REPLACEMENTS.js file, build the program with npm run build, and start the program bin/cli. If your new rule is in conflict with a pre-existing rule, it will complain and exit.
- In the web browser: go to the web interface and open the JavaScript console. The converter is in the global scope. You can try to add your rules by calling the rule method on the converter like so: converter.rule("=>", "⋔"); The first argument to the rule method is the ASCII representation and the second one is the Unicode symbol. Again, if your rule interferes with a pre-existing rule or is otherwise not okay, it will complain.
Of course, as symbol2unicode is free software, you are free to create your own set of (default) translation rules.