- Spell Checker
Ever wondered how spell checkers work?
I actually found them quite fascinating, so I decided to build one myself. This project is a spell checker implemented in C++ using the strategy pattern. The spell checker reads a sample text file, identifies misspelled words, [TODO: and provides suggestions for corrections]. The strategy pattern allows for easy integration of different spell-checking algorithms, starting with a Levenshtein distance-based strategy.
# compile the spell checker
g++ -std=c++11 -o spell_checker -Iinclude src/*.cpp main.cpp -ftime-report
# compile the spell checker
# 7 times faster execution
g++ -std=c++11 -o spell_checker -Iinclude src/*.cpp main.cpp -ftime-report -O3
# run the spell checker
./spell_checker
# run the spell checker with a file
./spell_checker <input_file>
# run the spell checker with a file and output the result to a file
./spell_checker <input_file> > <output_file>
# run the spell checker with a string
./spell_checker "In twilight's glow, I adore applle's embrace and banaanas' dance. Oranges, a symphony of sunsets, are greaet! Grapes, peears, and cheries waltz in a delicious duet. Lemons and limmes serenade the senses with their sour song. Strawberries and blueberries, nature's jewels, are my favorite fruiits. I savor a watermellon's whisper and a kiwii's sweet refrain."
# file io
spell-checker> ./spell_checker
Spell-checking complete. Corrected file written to: fixed.txt
Runtime: 15870561 microseconds
#dictionary file used: words.txt -> 4.75MB
# input: sample.txt
In twilight's glow, I adore applle's embrace and banaanas' dance.
Oranges, a symphony of sunsets, are greaet!
Grapes, peears, and cheries waltz in a delicious duet.
Lemons and limmes serenade the senses with their sour song.
Strawberries and blueberries, nature's jewels, are my favorite fruiits.
I savor a watermellon's whisper and a kiwii's sweet refrain.
# output: fixed.txt
In twilight's glow, I adore [applle's] embrace and [banaanas'] dance.
Oranges, a symphony of sunsets, are [greaet!]
Grapes, [peears,] and cheries waltz in a delicious duet.
Lemons and [limmes] serenade the senses with their sour song.
Strawberries and blueberries, nature's jewels, are my favorite [fruiits.]
I savor a [watermellon's] whisper and a [kiwii's] sweet refrain.
# string io
spell-checker> ./spell_checker "In twilight's glow, I adore applle's embrace and banaanas' dance. Oranges, a symphony of sunsets, are greaet! Grapes, peears, and cheries waltz in a delicious duet. Lemons and limmes serenade the senses with their sour song. Strawberries and blueberries, nature's jewels, are my favorite fruiits. I savor a watermellon's whisper and a kiwii's sweet refrain."
Corrected string: In twilight's glow, I adore [applle's] embrace and [banaanas'] dance. Oranges, a symphony of sunsets, are [greaet!] Grapes, [peears,] and cheries waltz in a delicious duet. Lemons and [limmes] serenade the senses with their sour song. Strawberries and blueberries, nature's jewels, are my favorite [fruiits.] I savor a [watermellon's] whisper and a [kiwii's] sweet refrain.
Runtime: 15533753 microseconds
./spell_checker
Spell-checking complete. Corrected file written to: fixed.txt
Runtime: 17275530 microseconds
spell-checker> ./spell_checker "In twilight's glow, I adore applle's embrace and banaanas' dance. Oranges, a symphony of sunsets, are greaet! Grapes, peears, and cheries waltz in a delicious duet. Lemons and limmes serenade the senses with their sour song. Strawberries and blueberries, nature's jewels, are my favorite fruiits. I savor a watermellon's whisper and a kiwii's sweet refrain.
>> "
Corrected string: In twilight's glow, I [adore] [applle's] embrace and [banaanas'] dance. Oranges, a symphony [of] sunsets, [are] [greaet!] Grapes, [peears,] and [cheries] waltz in a [delicious] duet. Lemons and [limmes] serenade the senses with their [sour] [song.] Strawberries and blueberries, nature's jewels, [are] [my] favorite [fruiits.] I savor a [watermellon's] whisper and a [kiwii's] [sweet] refrain.
Runtime: 5387747 microseconds
Explored the strategy pattern for designing flexible and interchangeable algorithms. Applied it to seamlessly switch between different spell-checking strategies.
Recognized the advantages of design patterns, especially the strategy pattern, for creating adaptable software. Learned how to easily integrate or switch algorithms without modifying the core logic.
Created a custom logger to enhance console output, highlighting incorrect words for better visibility. Used ANSI escape codes for simple and effective text highlighting.
Emphasized the significance of modular code for improved readability and maintainability. Structured the project into distinct components, such as file I/O and spell-checking strategies.
Applied C++ concepts in a practical project, reinforcing language features and syntax. Gained hands-on experience in file I/O, string manipulation, and algorithm integration.
Applied an iterative development process, refining the spell checker through testing and feedback. Overcame challenges and improved the implementation to achieve the desired functionality.
- nothing actually new here, just a refresher on the algorithm and how to apply it for a spell checker!
- good speed, better than expected for a naive implementation
-
it is somehow worse than the levenshtein distance implementation
-
NOTE: i was wrong. The trie implementation is actually faster than the levenshtein distance implementation. i was not reusing the same dictionary, instead i was recheckeing the dictionary for each word and adding it to the trie.
-
it is like 3 times faster: see results.
-
but seems like it identifies some good words as incorrect too! need to fix this.
-
mutable
andconst functions
const
function: a function that does not modify the object's state// ex: int get_value() const { return value; }
mutable
keyword is used to modify the class member variables in a const function.// ex: mutable int value;
now we can create a weird pattern where a method is
const
but it modifies the object's state// ex: class Trie { mutable int value; int get_value() const { value = 10; return value; } }
-
is this a good practice? why even try
const-correctness
if we are going to usemutable
anyways on the states we want to modify? -
but this helped me implement the TrieSpellChecker's
isSpelledCorrectly
method, while following theconst-correctness
of theSpellChecker
interface (concrete strategy). -
the
mutable
was added in the TrieSpellChecker.h.
class Strategy {
public:
virtual int execute() const = 0;
};
class CachedStrategy : public Strategy {
public:
CachedStrategy() : cachedResult(0), isResultValid(false) {}
int execute() const override {
if (!isResultValid) {
// Compute and cache the result
cachedResult = computeResult();
isResultValid = true;
}
return cachedResult;
}
private:
mutable int cachedResult;
mutable bool isResultValid;
int computeResult() const {
// expensive computation
return 42;
}
};