
sahana-s08-de-duplicator team mailing list archive

Re: Help needed

 

Can you please send the code that accepts the records as an array, so that
I can test on my side?
I think that will be helpful while checking for location as well. Right,
Shishir?
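
A minimal sketch of what an array-based call could look like, assuming each
record arrives as a plain list of field values in the same column order. The
names compare_records and default_similarity are hypothetical, not taken from
s3deduplicator.py, and the default scorer is difflib's ratio used only as a
stand-in so the sketch runs on its own; the real jaro_winkler(str1, str2)
could be passed in its place.

```python
from difflib import SequenceMatcher


def default_similarity(a, b):
    # Stand-in scorer only: this is difflib's ratio, NOT Jaro-Winkler.
    return SequenceMatcher(None, a, b).ratio()


def compare_records(record1, record2, similarity=default_similarity):
    """Return one match score in [0, 1] for two records.

    Each record is a list of field values in the same column order
    (hypothetical interface, assumed for illustration).
    """
    if len(record1) != len(record2):
        raise ValueError("records must have the same number of fields")

    scores = []
    for value1, value2 in zip(record1, record2):
        # Normalise every field to a lowercase string before comparing.
        text1 = "" if value1 is None else str(value1).strip().lower()
        text2 = "" if value2 is None else str(value2).strip().lower()
        if not text1 and not text2:
            continue  # field is empty in both records, so it tells us nothing
        scores.append(similarity(text1, text2))

    if not scores:
        return 0.0
    # Plain average of the per-field scores gives a single value per pair.
    return sum(scores) / float(len(scores))


if __name__ == "__main__":
    person1 = ["akila", "ramakrishnan"]
    person2 = ["Akhila", "Ramakrishnan"]
    print(compare_records(person1, person2))
```

Since the function only sees lists, the same call would work for person or
location records, as long as both lists come from the same table.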

On Wed, Nov 24, 2010 at 6:08 PM, Akilandeswari Ramakrishnan <
aramakr@xxxxxxxx> wrote:

> Hi,
>
> No luck with the string conversion.
> But yes! I can pass you 2 records as arrays of fields. Can you change the
> code accordingly? I think it will be really useful.
>
> thanks
> Akila
>
>
> On Tue, Nov 23, 2010 at 4:55 PM, Pradnya Kulkarni <
> kulkarni.pradnya@xxxxxxxxx> wrote:
>
>> I can probably return a single value if you send the records as
>> arrays of values, so we can reuse the JW and soundex implementations.
>>
>>
>> On Tue, Nov 23, 2010 at 4:44 PM, Pradnya Kulkarni <
>> kulkarni.pradnya@xxxxxxxxx> wrote:
>>
>>> see inline
>>>
>>>  On Tue, Nov 23, 2010 at 4:22 PM, Akilandeswari Ramakrishnan <
>>> aramakr@xxxxxxxx> wrote:
>>>
>>>> If we pass each field separately, then we'll have to deal with
>>>> multiple match % values per record, which will be cumbersome.
>>>>
>>>> If we pass the entire object to the JW algo, will it be possible to
>>>> treat the whole record as a string and compare?
>>>> *(If you send records as inputs, I will be doing the same thing, as
>>>> the JW algorithm works on 2 strings at a time. I am not sure how
>>>> different tables could be handled that way, as location and person
>>>> have different columns.)*
>>>>
>>>
>>>
>>>> If that's not possible, then we'll just send the first name and last
>>>> name only. What do you guys say?
>>>>
>>>> *Others, please provide some input on this.*
>>>
>>>>
>>>> On Tue, Nov 23, 2010 at 3:41 PM, Pradnya Kulkarni <
>>>> kulkarni.pradnya@xxxxxxxxx> wrote:
>>>>
>>>>> see inline
>>>>>
>>>>>  On Tue, Nov 23, 2010 at 3:26 PM, Akilandeswari Ramakrishnan <
>>>>> aramakr@xxxxxxxx> wrote:
>>>>>
>>>>>> hi pradnya,
>>>>>>
>>>>>> I have a question from the 'People deduplicator' point of view.
>>>>>>
>>>>>> For example, if there are 2 records, akila and Akhila (*these are
>>>>>> first names, i.e. one of the columns and not entire records*), I'll
>>>>>> pass each of these to get the soundex values. The soundex values will
>>>>>> be the same (a240) as these are pronounced alike, so this is a dupe
>>>>>> suspect.
>>>>>>
>>>>> *Soundex can be used only for first and last names; for the rest of
>>>>> the fields we can use JW. That's what I think. If anyone has any
>>>>> other input on this, please reply.*
>>>>>
>>>>>>
>>>>>> Then for the JW algo, should I pass just the 2 strings or the 2
>>>>>> entire records?
>>>>>>
>>>>> *You can read the record and call this method for each column in the
>>>>> DB record, with the column values as input strings. If there is any
>>>>> other approach for this, please discuss.*
>>>>>
>>>>>   jaro_winkler(akila, akhila)
>>>>>>
>>>>>>
>>>>>  thanks
>>>>>> Akila
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 23, 2010 at 10:51 AM, Pradnya Kulkarni <
>>>>>> kulkarni.pradnya@xxxxxxxxx> wrote:
>>>>>>
>>>>>>> 1] I have implemented the Jaro-Winkler algorithm
>>>>>>> inputs - two strings to be compared
>>>>>>> output - distance for the strings, i.e. a decimal value; example -
>>>>>>> http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance
>>>>>>> method name - jaro_winkler(str1, str2)
>>>>>>>
>>>>>>> 2] for soundex
>>>>>>> inputs - input string such as person name
>>>>>>> output - soundex value
>>>>>>> method name - soundex(name)
>>>>>>>
>>>>>>> To compare two names you can call this function twice and compare
>>>>>>> the return values. If the values are the same, then the names are
>>>>>>> phonetically similar.
>>>>>>>
>>>>>>> You can go ahead and write code and call these methods for now.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Pradnya
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 23, 2010 at 10:43 AM, Akilandeswari Ramakrishnan <
>>>>>>> aramakr@xxxxxxxx> wrote:
>>>>>>>
>>>>>>>> It would be helpful if you could share how your module works
>>>>>>>> after your testing is complete, in the sense of what input it
>>>>>>>> expects and how it will give the output.
>>>>>>>> Basically, the input/output parameters.
>>>>>>>>
>>>>>>>> thnx
>>>>>>>> Akila
>>>>>>>>
>>>>>>>> That way, from the controllers, we can provide the necessary
>>>>>>>> inputs and process the output from your module accordingly.
>>>>>>>>
>>>>>>>>   On Tue, Nov 23, 2010 at 10:16 AM, Pradnya Kulkarni <
>>>>>>>> kulkarni.pradnya@xxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>>>  Hi all,
>>>>>>>>>
>>>>>>>>> I have created a new file 's3deduplicator.py' in modules and
>>>>>>>>> added functions for the algos.
>>>>>>>>> Does anyone have an idea about how to call methods from modules,
>>>>>>>>> and how to import modules in another file?
>>>>>>>>>
>>>>>>>>> Let me know, as I want to test the algo code.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Pradnya
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Mailing list: https://launchpad.net/~sahana-s08-de-duplicator
>>>>>>>>> Post to     : sahana-s08-de-duplicator@xxxxxxxxxxxxxxxxxxx
>>>>>>>>> Unsubscribe : https://launchpad.net/~sahana-s08-de-duplicator
>>>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
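
For anyone who wants to try the two calls discussed in this thread without
the module at hand, here is a rough standalone sketch of soundex(name) and
jaro_winkler(str1, str2). It follows the textbook definitions (classic
American Soundex, Jaro similarity with the usual 0.1 prefix scale) and is
NOT the code in s3deduplicator.py, which may differ in details such as
letter case or rounding. The prefix_scale parameter is an assumption added
for illustration.

```python
def soundex(name):
    """Classic American Soundex code, e.g. 'A240' (textbook version)."""
    codes = {}
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")):
        for ch in letters:
            codes[ch] = digit

    letters_only = [c for c in name.lower() if c.isalpha()]
    if not letters_only:
        return ""
    first = letters_only[0]
    result = first.upper()
    prev = codes.get(first, "")
    for ch in letters_only[1:]:
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result += digit
        if ch not in "hw":  # h and w do not break a run of equal codes
            prev = digit
    return (result + "000")[:4]  # pad or truncate to letter + 3 digits


def jaro_winkler(str1, str2, prefix_scale=0.1):
    """Jaro-Winkler similarity in [0, 1]; 1.0 means identical strings."""
    if str1 == str2:
        return 1.0
    len1, len2 = len(str1), len(str2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters match only if equal and within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(str1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and str2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Matched characters that appear in a different order, counted in pairs.
    seq1 = [c for c, hit in zip(str1, matched1) if hit]
    seq2 = [c for c, hit in zip(str2, matched2) if hit]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    m = float(matches)
    jaro = (m / len1 + m / len2 + (m - transpositions) / m) / 3.0

    # Winkler boost for a shared prefix of up to 4 characters.
    prefix = 0
    for a, b in zip(str1[:4], str2[:4]):
        if a != b:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1.0 - jaro)


if __name__ == "__main__":
    print(soundex("akila"), soundex("Akhila"))
    print(jaro_winkler("akila", "akhila"))
```

Note that jaro_winkler as sketched here is case-sensitive, so callers would
probably want to lowercase both strings first; soundex lowercases internally.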
