tamilspellchecker team mailing list archive

Re: code updated to launchpad and mit demo

 

On Sun, Feb 1, 2009 at 8:25 PM, Elanjelian Venugopal <tamiliam@xxxxxxxxx> wrote:

> Greetings.
>
> 2009/2/1 S.Selvam Siva <s.selvamsiva@xxxxxxxxx>:
>
>> Thank you for your response. I am basically a final-year engineering
>> student (CSE). I initially wanted to improve aspell's Tamil spell checking,
>> but I did not know how to implement Tamil grammar with aspell or add GUI
>> support.
>>
>> At that time I came to know about a Malayalam spell checker developed as a
>> gedit plugin by thottilingam in Python. So I decided to develop a spell
>> checking engine (tamilspell.py) that can be reused with any editor.
>> As a first step I integrated it with gedit. Open Office will be next.
>>
>
> Open Office, I understand, is now using hunspell. Could the engine you are
> working on be converted to a hunspell spellchecking extension?
>

What we need is one Python file that calls tamilspell.py with some Tamil text
as an argument. Though I have little knowledge of the Open Office plugin
mechanism, adding it to Open Office will require an Open Office specific
module (pyUNO, I guess). And our first aim needs to be to develop a powerful
spell checking engine. So our plugin may not depend on hunspell.
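
A minimal sketch of such an editor-independent front end, in Python. The
interface of tamilspell.py is not shown in this thread, so the names
find_unknown_words, main and is_known below are hypothetical stand-ins, not
the project's actual API. The sketch only shows how text passed as an
argument could be split into Tamil words and handed to whatever lookup the
engine provides.

```python
import re

# Match runs of characters from the Tamil Unicode block (U+0B80..U+0BFF).
# Python's \w is avoided on purpose: Tamil vowel signs and the pulli are
# combining marks, which \w does not match, so \w+ would cut words apart.
TAMIL_WORD = re.compile(r"[\u0B80-\u0BFF]+")


def find_unknown_words(text, is_known):
    """Return (offset, word) for every Tamil word that is_known() rejects.

    is_known is any callable taking a word and returning True or False;
    it stands in for the real engine, whose interface is an assumption here.
    """
    return [
        (match.start(), match.group())
        for match in TAMIL_WORD.finditer(text)
        if not is_known(match.group())
    ]


def main(argv, is_known):
    """Treat the command-line arguments as text and print unknown words."""
    text = " ".join(argv)
    for offset, word in find_unknown_words(text, is_known):
        print(f"{offset}\t{word}")
    return 0
```

An editor plugin (gedit now, Open Office later) would only need to call
find_unknown_words with the buffer text; a script could call
main(sys.argv[1:], is_known) with the engine's own lookup function.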


>
>
>> I am eager to know whether you have added any grammatical rules to aspell,
>> and to get any hints on developing aspell for improved Tamil spell checking.
>>
>
> I'm not a computer scientist; however, I did try to develop an affix file
> then, but support for Unicode and complex scripts was not provided by
> aspell/Ispell/Myspell, so it didn't work. I think all that is now past.
>
> Anyway, from what I remember, it wouldn't be too difficult -- though really
> tedious -- to develop the necessary rules. At the least, we could develop
> the easier, more common rules first. I could organise an effort, if you
> could look at the technical aspects of the challenge. By the way, how does
> the engine you are developing handle affix rules? See:
> http://lingucomponent.openoffice.org/affix.readme
>

As of now, I just maintain a list of Tamil words (one per line) and compare
the text against it to find misspelled words. (This is the starting point of
our project.)
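
A rough sketch of this word-list comparison, under stated assumptions: the
function names and the file name in the usage note are made up for
illustration, and the NFC normalization step is an addition that the current
engine may or may not perform. It is included because the same Tamil
syllable can be encoded in more than one way (for example the vowel sign "o"
as one code point or as two), and a plain string comparison would treat
those as different words.

```python
import unicodedata


def normalize(word):
    """Strip surrounding whitespace and bring the word to Unicode NFC form."""
    return unicodedata.normalize("NFC", word.strip())


def load_word_list(lines):
    """Build a set of known words from an iterable of lines, one word each.

    Blank lines are skipped. A set gives constant-time membership tests,
    which matters once the list grows to tens of thousands of words.
    """
    return {normalize(line) for line in lines if line.strip()}


def misspelled(words, dictionary):
    """Return the words (in input order) that are not in the dictionary."""
    return [word for word in words if normalize(word) not in dictionary]
```

With a word file, say a hypothetical tamil_words.txt, the dictionary could be
built with load_word_list(open("tamil_words.txt", encoding="utf-8")).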

Affix rules seem to be a critical part of Tamil spell checking, and I have
not got any clue about them so far, except that AU-KBC has developed
morphological analysis and released software for it (Acharam.exe, written in
Java). We will be really happy if you can help us with affix rules.

I agree that we need to develop the common rules first, and we are ready to
implement them technically.
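
As a possible starting point for the technical side, here is a small sketch
of suffix rules in the strip/add style used by hunspell-like affix files. It
is an illustration only: SuffixRule, RULES, candidate_stems and is_known are
hypothetical names, the two rules shown cover just the plural suffix கள்,
and this is not how Acharam or hunspell are actually implemented.

```python
from typing import NamedTuple


class SuffixRule(NamedTuple):
    strip: str  # ending removed from the stem before the suffix is attached
    add: str    # suffix attached in its place


# Two illustrative rules for the plural suffix:
#   வீடு + கள்  -> வீடுகள்   (nothing stripped)
#   மரம் + கள்  -> மரங்கள்   (final ம் is replaced by ங்கள்)
RULES = [
    SuffixRule(strip="", add="கள்"),
    SuffixRule(strip="ம்", add="ங்கள்"),
]


def candidate_stems(word, rules):
    """Undo each rule whose suffix the word ends with and yield the stem."""
    for rule in rules:
        if word.endswith(rule.add) and len(word) > len(rule.add):
            yield word[: -len(rule.add)] + rule.strip


def is_known(word, stems, rules=RULES):
    """Accept a word if it is a listed stem or a stem plus a known suffix."""
    if word in stems:
        return True
    return any(stem in stems for stem in candidate_stems(word, rules))
```

With stems = {"வீடு", "மரம்"}, both வீடுகள் and மரங்கள் are accepted without
listing them separately. The first rule has no condition, so it would also
accept the incorrect மரம்கள்; hunspell-style rules carry a condition field
for exactly this reason, and a fuller version would need one.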

Note: I have got some basic rules for sandhi errors (சந்திப் பிழை) from
Tamil teachers.


>
>
>> Note: We have collected nearly 30,000 Tamil words (including aspell's
>> 14,000), extracted from Tamil web pages.
>>
>
> Nice! You are using one of the corpus catchers available, I presume...
>
>


-- 
Yours,
S.Selvam