
tamilspellchecker team mailing list archive

Root word extraction done - 150k root words

 

Hi all,

Yesterday I went through the working logic of the TamilwordNet software and
did a few interesting things, as follows:

1) I extracted the root words (140k) and the derived words (3000k) from the
database.

2) I wrote Python code that converts words written in English letters into
Tamil script (like Google transliteration),
   e.g. manithar - மனிதர்

3) I then converted all the words to Tamil and stored them in a file (using
Python's cPickle module).
   a) Basically, each root word is a key and its derived words are the values.
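
The script itself is not attached to this mail, so the following is only a
minimal sketch of what steps 2 and 3 might look like. The romanization table
is a tiny hypothetical one that covers just the example word, the function
names (transliterate, build_mapping, save_mapping, load_mapping) are made up
for illustration, and the root/derived pair used in the demo is not taken
from the database. It uses Python 3's pickle module, which replaces
Python 2's cPickle.

```python
import os
import pickle
import tempfile

# Hypothetical, deliberately tiny romanization table (an assumption; the
# real scheme used by the script is not shown in the mail).
CONSONANTS = {
    "th": "\u0ba4",  # TAMIL LETTER TA
    "m": "\u0bae",   # TAMIL LETTER MA
    "n": "\u0ba9",   # TAMIL LETTER NNNA
    "r": "\u0bb0",   # TAMIL LETTER RA
}
VOWEL_SIGNS = {
    "a": "",         # inherent vowel, no sign is written
    "i": "\u0bbf",   # TAMIL VOWEL SIGN I
    "ai": "\u0bc8",  # TAMIL VOWEL SIGN AI
}
PULLI = "\u0bcd"     # marks a consonant that has no following vowel


def _longest_match(table, word, i):
    """Return the longest key of `table` that starts at word[i], or None."""
    for size in (2, 1):
        chunk = word[i:i + size]
        if len(chunk) == size and chunk in table:
            return chunk
    return None


def transliterate(word):
    """Convert a romanized word to Tamil script, one consonant(+vowel) at a time."""
    out, i = [], 0
    while i < len(word):
        cons = _longest_match(CONSONANTS, word, i)
        if cons is None:
            raise ValueError("unsupported sequence: %r" % word[i:])
        out.append(CONSONANTS[cons])
        i += len(cons)
        vowel = _longest_match(VOWEL_SIGNS, word, i)
        if vowel is None:
            out.append(PULLI)
        else:
            out.append(VOWEL_SIGNS[vowel])
            i += len(vowel)
    return "".join(out)


def build_mapping(pairs):
    """Group (root, derived) romanized pairs into {tamil_root: [tamil_derived, ...]}."""
    mapping = {}
    for root, derived in pairs:
        mapping.setdefault(transliterate(root), []).append(transliterate(derived))
    return mapping


def save_mapping(mapping, path):
    with open(path, "wb") as fh:
        pickle.dump(mapping, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_mapping(path):
    with open(path, "rb") as fh:
        return pickle.load(fh)


if __name__ == "__main__":
    # Illustrative pair only, not taken from the TamilwordNet database.
    mapping = build_mapping([("manithar", "manitharai")])
    path = os.path.join(tempfile.mkdtemp(), "roots.pkl")
    save_mapping(mapping, path)
    print(len(load_mapping(path)), "root word(s) stored")
```

The table does not handle word-initial vowels or most letters; a real table
would need the full set of Tamil consonants and vowel signs.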

I am now thinking about how to technically implement the above work in our
spell checker.
Your suggestions and queries are welcome.
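
As one possible starting point for discussion (a rough sketch, not a decided
design), the root-to-derived dictionary could be inverted so that any known
word form can be looked up directly. The names build_lookup and find_root
are hypothetical, and the sample entries are ASCII placeholders rather than
real data.

```python
def build_lookup(root_to_derived):
    """Invert {root: [derived, ...]} into {word: root}."""
    lookup = {}
    for root, derived_words in root_to_derived.items():
        for word in derived_words:
            # Keep the first root seen if a form is listed under several roots.
            lookup.setdefault(word, root)
    # Every root is a valid word and maps to itself.
    for root in root_to_derived:
        lookup[root] = root
    return lookup


def find_root(word, lookup):
    """Return the root of a known word, or None if the word is not in the list."""
    return lookup.get(word)


if __name__ == "__main__":
    # Placeholder entries standing in for the pickled mapping.
    sample = {"root_a": ["form_a1", "form_a2"], "root_b": ["form_b1"]}
    lookup = build_lookup(sample)
    print(find_root("form_a2", lookup))
    print(find_root("unknown", lookup))
```

A word for which find_root returns None would be a candidate for flagging as
misspelled; how suggestions are then generated is a separate question.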

-- 
Yours,
S.Selvam