Digital Reasoning Remote Programming Exercise
Below are instructions for building, testing and executing the exercise. I have checked in all sources, classes, docs and outputs. If you have any questions, please do not hesitate to call or email. tel: 201 955-8113 email: stephenpince@gmail.com
The ant command can be used to build, test and execute the exercise.
Unix (bash) and Windows batch scripts have also been provided. The scripts run without arguments, but they do accept optional parameters. See each script for usage details.
USING ANT
$ ant
$ ant test
$ ant run
$ ant javadoc
$ ant clean
USING SCRIPTS
$ test_part[123]
where [123] is the question number (1, 2 or 3). For example, to test question 1:
$ test_part1
$ run_part[123]
where [123] is the question number (1, 2 or 3). For example, to execute question 1:
$ run_part1
All the Java sources are in the digitalreasoning package under /src.
assumptions: US locale.
limitations: size of source files, size of dictionary.
alternative implementation: use a trie for data storage. It can be more memory efficient because words with a common prefix share nodes.
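The trie alternative could look roughly like the sketch below. WordTrie, insert and contains are hypothetical names chosen for illustration and are not part of the checked-in sources. Whether a trie actually saves memory depends on how the nodes are represented; the HashMap per node used here is chosen for brevity, not compactness.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: WordTrie is not one of the checked-in classes.
class WordTrie {
    // One node per character; words with a common prefix share the prefix nodes.
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean endOfWord;
    }

    private final Node root = new Node();

    /** Adds a word, reusing nodes already created for a shared prefix. */
    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.endOfWord = true;
    }

    /** Returns true only if exactly this word was inserted earlier. */
    boolean contains(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.get(word.charAt(i));
            if (node == null) {
                return false;
            }
        }
        return node.endOfWord;
    }
}
```

Inserting "token" and then "tokenizer" creates nine nodes below the root instead of fourteen, because the first five characters are stored once.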
/src/digitalreasoning/DocumentTokenizer.java is the Java source implementing question 1.
/src/digitalreasoning/TestDocumentTokenizer.java is the unit test driver source for testing question 1.
assumptions: US locale.
limitations: linear search for proper names in a sentence, size of source files, size of dictionary.
alternative implementation: You could use a suffix tree for the sentence data structure.
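This would allow a proper name to be looked up in time proportional to the length of the name, independent of the length of the sentence.
Ukkonen's algorithm can build a suffix tree in O(n) time, where n is the length of the sentence.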
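To illustrate the lookup side of this idea, here is a minimal sketch of a naive, uncompressed suffix trie. It is not Ukkonen's algorithm: construction here takes O(n^2) time and space, and only the lookup behaves like a suffix tree, costing one step per character of the name regardless of sentence length. NaiveSuffixTrie and containsSubstring are hypothetical names and are not part of the checked-in sources.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a naive suffix trie, NOT a linear-time suffix tree.
class NaiveSuffixTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
    }

    private final Node root = new Node();

    /** Inserts every suffix of the text, character by character (O(n^2)). */
    NaiveSuffixTrie(String text) {
        for (int start = 0; start < text.length(); start++) {
            Node node = root;
            for (int i = start; i < text.length(); i++) {
                node = node.children.computeIfAbsent(text.charAt(i), c -> new Node());
            }
        }
    }

    /** Walks one node per pattern character; cost does not depend on text length. */
    boolean containsSubstring(String pattern) {
        Node node = root;
        for (int i = 0; i < pattern.length(); i++) {
            node = node.children.get(pattern.charAt(i));
            if (node == null) {
                return false;
            }
        }
        return true;
    }
}
```

This sketch matches raw substrings, so a real implementation would still need to check word boundaries before reporting a proper name.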
/src/digitalreasoning/ProperNameDocumentTokenizer.java is the Java source implementing question 2.
/src/digitalreasoning/TestProperNameDocumentTokenizer.java is the unit test driver source for testing question 2.
assumptions: US locale.
limitations: linear search for proper names in a sentence, size of source files, size of dictionary.
alternative implementation: You could use a suffix tree for the sentence data structure.
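This would allow a proper name to be looked up in time proportional to the length of the name, independent of the length of the sentence.
Ukkonen's algorithm can build a suffix tree in O(n) time, where n is the length of the sentence.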
/src/digitalreasoning/ProperNameAggregator.java is the Java source implementing question 3.
/src/digitalreasoning/TestProperNameAggregator.java is the unit test driver source for testing question 3.