[[BackLinksMenu]]

[[TicketQuery(summary=SPELL_CHECK_R0, format=table, col=summary|owner|status|type|component|priority|effort|importance, rows=description|analysis_owners|analysis_reviewers|analysis_score|design_owners|design_reviewers|design_score|implementation_owners|implementation_reviewers|implementation_score|test_owners|test_reviewers|test_score|)]]

= Analysis =

== Overview ==
Support for spell-checking needs to be added to Sophie. After this task is completed, spell-checking capabilities should be available. In the first revision of this task, the expected end-user functionality is defined and research is performed to find an appropriate third-party spell-checking library to integrate into Sophie. A preliminary design for the integration should be available. The actual implementation takes place in subsequent revisions.

== Task requirements ==
 * Research an appropriate library for spell-checking. Spell check in Sophie will initially be implemented in the following way:
   * The Tools tab will contain a spell-check palette. It will contain the following elements:
     * Spell-check button - runs a spell check on the text within the currently selected chain.
     * Toggle underline button - turns underlining of misspelled words on/off (they will be underlined with a dotted line, as in Trac).
     * Replace/Ignore buttons - replace/ignore the currently selected misspelled word.
     * A list of misspelled words - contains the misspelled words found so far.
       * Clicking on a misspelled word will highlight (select) it in the text, going to the page it is on if necessary.
     * A list of suggestions for correction - contains possible corrections for the selected misspelled word.
       * Double-clicking on a suggestion will replace the misspelled word.
     * When a word is replaced/ignored, the next misspelled word is selected.
   * Initially, up to two languages in a book should be selectable (that is, spell-check should be performed against two dictionaries).
     * The UI for language selection is not defined yet.
   * NOTE: These requirements are subject to refinement in the next revision of the task. They are listed here to serve as general guidelines for the research.
 * Describe the research findings in the [wiki:SPELL_CHECK_R0#Design Design] section of this page. Include the following information about each library:
   * Name and website;
   * Licensing information;
   * Provided features (incl. supported languages or dictionary format);
   * Ease of integration.
 * In the [wiki:SPELL_CHECK_R0#Implementation Implementation] section, suggest the library that is most appropriate for use in Sophie. Provide a preliminary design of how to integrate it in Sophie. The required spell-check functionality can be redefined here based on the library chosen.

== Task result ==
This wiki page (containing research findings and a preliminary design for integration).

== Implementation idea ==
Jazzy and Suggester are two possible candidates. Suggester provides search suggestions as well.

== Related ==
http://aspell.net/

== How to demo ==
Show the research findings and how the library will be integrated.

= Design =
Here is a list of the libraries reviewed:

== Jazzy ==
 * Websites: http://jazzy.sourceforge.net/, http://sourceforge.net/projects/jazzy
 * License and code: LGPL, open-source
 * Features:
   * Based on Aspell algorithms (it was initially a port of Aspell).
   * Dictionaries are specified as word lists (these can easily be generated from Aspell dictionaries).
   * Comes with an English dictionary only, but any of about 90 Aspell dictionaries can be made to work.
   * Can be used on Strings or in a JTextComponent.
 * Documentation:
   * No online docs or support forums (except a few threads with no answers on SourceForge).
   * Relatively good JavaDoc and in-code comments.
 * Internal:
   * Uses an event-driven approach (each spelling error is an event containing the misspelled word and a list of suggestions; listeners can be attached to handle the error).
   * Words are passed as a WordTokenizer object constructed from a String.
 * Useful links:
   * http://www.ibm.com/developerworks/java/library/j-jazzy/

== Suggester Spellcheck ==
 * Websites: http://www.softcorporation.com/products/suggester/, http://www.softcorporation.com/products/spellcheck/
 * License and code: binaries are free to use, source is proprietary, no common license
 * Features:
   * Dictionary compression (up to 2GB in memory).
   * Fast (0.002ms to check a word against the dictionary, 40ms to provide suggestions).
   * Written in Java 1.2.
   * Provides dictionaries for 9 languages.
 * Documentation:
   * No documentation is provided with the Basic Edition. The Advanced and Enterprise versions (which are paid) provide documentation.
   * No access to the code except for some samples.
 * Internal:
   * Uses .ind files (LaTeX processed index data) packed in jars as dictionaries.
   * The JAR contains classes named a.class, b.class, etc.

== Other ==
 * JOrtho - open-source, GPL-licensed, JTextComponent based only.
 * JSpell - commercial server-based solution.
 * JMySpell - open-source, LGPL-licensed, supports OpenOffice dictionaries (some of them LGPL-licensed), which are more compact. It is at an early stage of development, has no documentation or JavaDoc, and only one project actually uses it. It might be a viable option in the future if it becomes stable.

== Conclusion ==
Jazzy seems to be the better choice. It is flexible as far as language dictionaries are concerned, incorporates powerful algorithms, provides JavaDoc, and seems easy to use and modify. As a disadvantage, it might be slower due to its event-driven approach. However, Suggester has no documentation, so understanding it will be more difficult. It would also be harder to supply it with dictionaries. In the implementation section, a prototype/demo of using Jazzy in Sophie will be provided.

'''Note:''' Possible issue - the dictionary files may be too big (a lot bigger than the .dic + .aff files that JMySpell uses, for example).
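
The event-driven approach noted under Jazzy's internals, combined with the requirement to check against two dictionaries, can be illustrated with a minimal, self-contained sketch. It does not use Jazzy itself: `SpellingError`, `SpellingErrorListener` and `SketchSpellChecker` are hypothetical names invented for this illustration, dictionaries are assumed to be plain lowercase word sets, and suggestions are left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical event describing one misspelled word (not a Jazzy class). */
final class SpellingError {
    final String word;   // the word exactly as it appears in the text
    final int offset;    // start position of the word in the checked text

    SpellingError(String word, int offset) {
        this.word = word;
        this.offset = offset;
    }
}

/** Hypothetical listener; a palette could use it to fill its list of misspelled words. */
interface SpellingErrorListener {
    void onError(SpellingError error);
}

/** Sketch of an event-driven checker that accepts a word if ANY dictionary contains it. */
final class SketchSpellChecker {
    private static final Pattern WORD = Pattern.compile("\\p{L}+");

    private final List<Set<String>> dictionaries = new ArrayList<>();
    private final List<SpellingErrorListener> listeners = new ArrayList<>();

    /** Assumption: every word in the set is already lowercase. */
    void addDictionary(Set<String> lowercaseWords) {
        dictionaries.add(lowercaseWords);
    }

    void addListener(SpellingErrorListener listener) {
        listeners.add(listener);
    }

    /** Fires one event per unknown word and returns the number of events fired. */
    int check(String text) {
        int errors = 0;
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            String normalized = m.group().toLowerCase(Locale.ROOT);
            boolean known = false;
            for (Set<String> dictionary : dictionaries) {
                if (dictionary.contains(normalized)) {
                    known = true;
                    break;
                }
            }
            if (!known) {
                errors++;
                SpellingError error = new SpellingError(m.group(), m.start());
                for (SpellingErrorListener listener : listeners) {
                    listener.onError(error);
                }
            }
        }
        return errors;
    }
}
```

In such a design, the listener is where the misspelled-word list and the dotted underlining would hook in; the offset carried by each event is what would allow the word to be selected in the text.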
= Implementation =
^Describe and link the implementation results here (from the wiki or the repository).

= Testing =
^Place the testing results here.

= Comments =
 * http://sourceforge.net/projects/jazzydicts/files/ provides a lot of dictionaries in Jazzy format and a tool for conversion under GPL license.
 * http://wiki.services.openoffice.org/wiki/Dictionaries#Bulgarian_.28Bulgaria.29 - OpenOffice.org dictionaries that can be converted using the above mentioned tool.
 * http://sourceforge.net/projects/bgoffice/files/ - a lot of Bulgarian dictionaries.