sahana-s08-de-duplicator team mailing list archive
Message #00006
Re: Help needed
See inline.
On Tue, Nov 23, 2010 at 3:26 PM, Akilandeswari Ramakrishnan <
aramakr@xxxxxxxx> wrote:
> Hi Pradnya,
>
> I have a question from the 'People deduplicator' point of view.
>
> For example, if there are 2 records, akila and Akhila (*these are first
> names, i.e. the values of one column and not entire records*), I'll pass
> each of these to get the soundex values. The soundex values will be the same
> (a240) because they are pronounced alike, so this is a dupe suspect.
>
*Soundex can be used only for the first and last name.* *For the rest of the
fields we can use JW (Jaro-Winkler).* *That's what I think; if anyone has any
other input on this, please reply.*
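
For illustration, here is a minimal Python 3 sketch of the soundex check described above. It assumes the standard American Soundex rules and is only a stand-in, not the code in s3deduplicator.py. It returns the code in upper case (A240 rather than a240), and the helper name same_sound is hypothetical.

```python
def soundex(name):
    """Return a 4-character American Soundex code (sketch, not the module's code)."""
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in group:
            codes[ch] = digit

    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0]               # the first letter is kept as-is
    prev = codes.get(letters[0], "")  # code of the previously seen letter
    for ch in letters[1:]:
        if ch in "HW":
            continue                  # H and W do not separate equal codes
        digit = codes.get(ch, "")     # vowels (and Y) have no code
        if digit and digit != prev:
            result += digit
        prev = digit                  # a vowel resets prev to ""
    return (result + "000")[:4]       # pad or truncate to 4 characters


def same_sound(name1, name2):
    """Hypothetical helper: True if both names share a Soundex code."""
    return soundex(name1) == soundex(name2)


print(soundex("akila"), soundex("Akhila"), same_sound("akila", "Akhila"))
```

Both names map to the same code here, so the pair would be flagged as a dupe suspect.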
>
> Then for the JW algo, should I pass just the 2 strings or the 2 entire
> records?
>
*You can read the record and call this method for each column in the DB
record, passing the two column values as the input strings.* *If there is any
other approach for this, please discuss.*
jaro_winkler("akila", "akhila")
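
A rough Python 3 sketch of the per-column approach, for discussion only. The jaro_winkler below is a textbook version written for this example and may differ from the one in s3deduplicator.py (it returns a similarity where 1.0 means identical). The function compare_records and the column names first_name and last_name are hypothetical.

```python
def jaro(s1, s2):
    """Jaro similarity between two strings, in the range 0.0 to 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)  # how far apart matches may be
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, p=0.1):
    """Jaro similarity boosted for a common prefix of up to 4 characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def compare_records(rec1, rec2, columns):
    """Hypothetical helper: score each column of two records (dicts)."""
    scores = {}
    for col in columns:
        v1 = str(rec1.get(col, "")).lower()  # compare case-insensitively
        v2 = str(rec2.get(col, "")).lower()
        scores[col] = jaro_winkler(v1, v2)
    return scores


r1 = {"first_name": "akila", "last_name": "Ramakrishnan"}
r2 = {"first_name": "Akhila", "last_name": "Ramakrishnan"}
print(compare_records(r1, r2, ["first_name", "last_name"]))
```

The controller could then decide which per-column scores count as a dupe suspect; the cut-off value is something we would still need to agree on.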
>
>
> Thanks,
> Akila
>
>
> On Tue, Nov 23, 2010 at 10:51 AM, Pradnya Kulkarni <
> kulkarni.pradnya@xxxxxxxxx> wrote:
>
>> 1] I have implemented the Jaro-Winkler algorithm
>> inputs - the two strings to be compared
>> output - the distance between the strings as a decimal value; for an
>> example see http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance
>> method name - jaro_winkler(str1, str2)
>>
>> 2] Soundex
>> inputs - an input string such as a person's name
>> output - the soundex value
>> method name - soundex(name)
>>
>> To compare two names you can call this function twice and compare the
>> return values. If the values are the same, the names are phonetically
>> similar.
>>
>> You can go ahead and write code that calls these methods for now.
>>
>> Thanks,
>> Pradnya
>>
>>
>> On Tue, Nov 23, 2010 at 10:43 AM, Akilandeswari Ramakrishnan <
>> aramakr@xxxxxxxx> wrote:
>>
>>> It would be helpful if you could share how your module works after your
>>> testing is complete,
>>> in the sense of what input it expects
>>> and how it will give the output:
>>> basically, the input/output parameters.
>>>
>>> Thanks,
>>> Akila
>>>
>>> That way, from the controllers we can provide the necessary inputs and
>>> process the output from your module accordingly.
>>>
>>> On Tue, Nov 23, 2010 at 10:16 AM, Pradnya Kulkarni <
>>> kulkarni.pradnya@xxxxxxxxx> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I have created a new file, 's3deduplicator.py', in modules and added
>>>> functions for the algorithms.
>>>> Does anyone have an idea about how to call methods from modules, and how
>>>> to import modules in another file?
>>>>
>>>> Let me know, as I want to test the algorithm code.
>>>>
>>>> Thanks,
>>>> Pradnya
>>>>