I need a web application that takes a group of MS Word files belonging to one particular project (placed in a particular directory), extracts the words that a dictionary marks as errors (any dictionary will do: Word, Google, something in AJAX, whatever really), adds them to a database and displays the list of such words in a table.
We will have two tables. One "project" table and one "permanent" table.
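To make the two tables concrete, here is a minimal sketch of one possible schema. It uses Python with SQLite only because that runs without a server; MySQL would work the same way. All table names, column names and status values are assumptions for illustration, not requirements.

```python
import sqlite3

# Hypothetical schema: every name and status value below is an assumption.
SCHEMA = """
CREATE TABLE IF NOT EXISTS permanent (
    error       TEXT PRIMARY KEY,   -- the misspelled word
    replacement TEXT NOT NULL       -- what it should be replaced with
);
CREATE TABLE IF NOT EXISTS project (
    project_name TEXT NOT NULL,     -- which project the word was found in
    word         TEXT NOT NULL,     -- word the dictionary flagged
    replacement  TEXT,              -- correction typed by the user, if any
    status       TEXT NOT NULL DEFAULT 'pending'
        CHECK (status IN ('pending', 'not_error', 'project', 'common')),
    PRIMARY KEY (project_name, word)
);
"""


def create_tables(conn):
    """Create both tables if they do not exist yet."""
    conn.executescript(SCHEMA)
    conn.commit()


conn = sqlite3.connect(":memory:")
create_tables(conn)
```

The status column is one way to record the three choices a user has for each word: not an error, fix for this project only, or fix and mark as common.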
The permanent table will hold a list much like the one in the attached Excel file: common spelling errors and the words those errors should be replaced with. The system should start by doing a find/replace of the common errors in all the files in the directory.
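As a rough sketch of that first pass, the function below replaces whole words only, so that a short entry does not rewrite part of a longer word. It works on plain text that has already been extracted from a file; the function name and the sample entries are made up for illustration, and matching is case-sensitive in this version.

```python
import re


def apply_replacements(text, replacements):
    """Replace each whole-word error in `text` using an {error: fix} mapping."""
    if not replacements:
        return text
    # Longest entries first, so a longer error wins over a shorter one it contains.
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )
    return pattern.sub(lambda m: replacements[m.group(1)], text)


# Hypothetical entries of the kind the permanent table might hold.
common = {"tambien": "también", "habia": "había"}
print(apply_replacements("tambien habia algo", common))
```

The lookarounds `(?<!\w)` and `(?!\w)` are used instead of `\b` so that entries starting or ending with punctuation still match sensibly.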
After that, the system will query the words still reported as errors, and populate a "project" table in the database with the list of words that are marked as wrong. Users will edit the words, either marking them as "not errors", which could be added to the dictionary or not, or by editing them for that particular project, or finally by editing the words and marking them as "common".
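The step of collecting the words still reported as errors could look roughly like this. A plain Python set stands in for the real dictionary here; which spell-check engine to use is left open, and the function name is an assumption.

```python
import re
import unicodedata
from collections import Counter

# Runs of Unicode letters, so accented vowels and ñ stay inside the word.
WORD_RE = re.compile(r"[^\W\d_]+")


def find_unknown_words(text, dictionary):
    """Count each word in `text` whose lowercase form is not in `dictionary`."""
    # Normalize so that composed and decomposed accents compare equal.
    text = unicodedata.normalize("NFC", text)
    counts = Counter()
    for word in WORD_RE.findall(text):
        if word.lower() not in dictionary:
            counts[word] += 1
    return counts


# Tiny stand-in dictionary, for illustration only.
known = {"el", "niño", "come", "y"}
print(find_unknown_words("El niño come mansana y mansana", known))
```

Each key of the result would become one row of the project table, and the count could help users decide which words to deal with first.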
Words marked as common errors will be copied to the "permanent" table so that in the next project we do not need to manually edit them again.
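Copying the common words across can be a single query. The sketch below assumes a hypothetical layout in which the project table has a status column and the permanent table is keyed by the misspelled word. `INSERT OR REPLACE` is SQLite syntax; MySQL has `REPLACE INTO` or `INSERT ... ON DUPLICATE KEY UPDATE` for the same purpose.

```python
import sqlite3


def promote_common(conn, project_name):
    """Copy rows marked 'common' for one project into the permanent table."""
    conn.execute(
        "INSERT OR REPLACE INTO permanent (error, replacement) "
        "SELECT word, replacement FROM project "
        "WHERE project_name = ? AND status = 'common' "
        "AND replacement IS NOT NULL",
        (project_name,),
    )
    conn.commit()


# Hypothetical tables and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE permanent (error TEXT PRIMARY KEY, replacement TEXT NOT NULL);
CREATE TABLE project (project_name TEXT, word TEXT, replacement TEXT, status TEXT);
""")
conn.executemany(
    "INSERT INTO project VALUES (?, ?, ?, ?)",
    [
        ("demo", "tambien", "también", "common"),    # should be promoted
        ("demo", "mansana", "manzana", "project"),   # this project only
        ("demo", "Gutiérrez", None, "not_error"),    # a name, not an error
    ],
)
promote_common(conn, "demo")
print(conn.execute("SELECT error, replacement FROM permanent").fetchall())
```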
The system then runs a simple find and replace operation on the files using the project list.
The idea, of course, is that the permanent table will have more words as time passes, so that at some point, most corrections will be done automatically before we see errors.
Note that the files I need to edit can be .doc, .docx or .rtf, and I can adjust to produce any of these. So if you can do this with, say, PHP and MySQL or AJAX, and use a different dictionary than Microsoft's, that is fine with me.
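Of the three formats, .docx is probably the easiest to read without Word installed, since a .docx file is a zip archive with the body text stored as XML in `word/document.xml`. The sketch below pulls out the paragraph text using only the Python standard library; the function name is an assumption, and .doc and .rtf files would need a different reader.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source):
    """Return the text of each paragraph in a .docx file (path or file object)."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across one or more <w:t> elements.
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

This covers reading only. Writing corrections back is harder, because Word may split a single word across several runs, so a replacement step would have to take that into account.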
The main idea is to fix all the spelling errors in the files without reading each file, with most of the common errors replaced automatically before you even see them.
One last thing: an administrator should, of course, be able to edit the permanent table. Oh, and 95% of the time the files will be in Spanish, so please make sure you get a Spanish dictionary. The code can, of course, be in English.