sahana-s08-de-duplicator team mailing list archive

Re: Help needed

 

Hi,

No luck with the string conversion.
But yes, I can pass you the 2 records as arrays of fields. Can you change the code
accordingly? I think it will be really useful.
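
A minimal sketch of what that could look like, assuming each record arrives as a list of field values in the same column order. The name record_similarity and the plain averaging of per-field scores are assumptions for illustration only, not the actual s3deduplicator.py code, and the jaro_winkler below is a textbook version included just so the sketch runs on its own:

```python
def jaro(s1, s2):
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    m = float(matches)
    t = out_of_order / 2.0
    return (m / len1 + m / len2 + (m - t) / m) / 3.0


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * prefix_scale * (1.0 - j)


def record_similarity(rec1, rec2, compare=jaro_winkler):
    """Hypothetical helper: one score per record pair, the mean of the
    per-field scores. Fields are compared as lower-cased, trimmed strings."""
    if len(rec1) != len(rec2):
        raise ValueError("records must have the same number of fields")
    if not rec1:
        return 0.0
    total = 0.0
    for a, b in zip(rec1, rec2):
        total += compare(str(a).strip().lower(), str(b).strip().lower())
    return total / len(rec1)
```

Called as record_similarity(["akila", "ramakrishnan"], ["akhila", "ramakrishnan"]), this gives a single value between 0 and 1 for the pair. Weighting some fields more than others would be a natural variation, but that is a design choice the thread has not settled.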

thanks
Akila


On Tue, Nov 23, 2010 at 4:55 PM, Pradnya Kulkarni <
kulkarni.pradnya@xxxxxxxxx> wrote:

> I can probably return a single value if you send the records as arrays
> of values, so we can reuse the JW and soundex implementations.
>
>
> On Tue, Nov 23, 2010 at 4:44 PM, Pradnya Kulkarni <
> kulkarni.pradnya@xxxxxxxxx> wrote:
>
>> see inline
>>
>>  On Tue, Nov 23, 2010 at 4:22 PM, Akilandeswari Ramakrishnan <
>> aramakr@xxxxxxxx> wrote:
>>
>>>  If we pass each field separately, then we'll have to deal with
>>> multiple match percentages per record, which will be cumbersome.
>>>
>>> If we pass the entire object to the JW algo, will it be possible to
>>> treat the whole record as a string and compare?
>>> *(If you send records as inputs, I will be doing the same thing,
>>> as the JW algorithm works on 2 strings at a time. I am not sure how
>>> different tables could be handled that way, as location and person have
>>> different columns.)*
>>>
>>
>>
>>>  If that's not possible, then we'll just send the first name and last name only.
>>> What do you guys say?
>>>
>>>    *Others, please provide some input on this.*
>>
>>>
>>> On Tue, Nov 23, 2010 at 3:41 PM, Pradnya Kulkarni <
>>> kulkarni.pradnya@xxxxxxxxx> wrote:
>>>
>>>> see inline
>>>>
>>>>  On Tue, Nov 23, 2010 at 3:26 PM, Akilandeswari Ramakrishnan <
>>>> aramakr@xxxxxxxx> wrote:
>>>>
>>>>> Hi Pradnya,
>>>>>
>>>>> I have a question from the 'People deduplicator' point of view.
>>>>>
>>>>> For example, if there are 2 records, akila and Akhila (*these are first names,
>>>>> i.e. the values of one column and not entire records*), I'll pass each of
>>>>> these to get the soundex values. The soundex values will be the same (a240) as
>>>>> these are pronounced alike, so this is a dupe suspect.
>>>>>
>>>> *   Soundex can be used only for the first and last name; for the rest of
>>>> the fields we can use JW. That's what I think. If anyone has any other
>>>> input on this, please reply.*
>>>>
>>>>>
>>>>> Then for the JW algo, should I just pass the 2 strings or the 2 entire
>>>>> records?
>>>>>
>>>>     *You can read the record and call this method for each column in the DB
>>>> record with the two input strings. If there is any other approach for this, please
>>>> discuss.*
>>>>
>>>>   jaro_winkler(akila, akhila)
>>>>>
>>>>>
>>>>  thanks
>>>>> Akila
>>>>>
>>>>>
>>>>> On Tue, Nov 23, 2010 at 10:51 AM, Pradnya Kulkarni <
>>>>> kulkarni.pradnya@xxxxxxxxx> wrote:
>>>>>
>>>>>> 1] I have implemented the Jaro-Winkler algorithm
>>>>>> inputs - two strings to be compared
>>>>>> output - distance between the strings, i.e. a decimal value; example -
>>>>>> http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance
>>>>>> method name - jaro_winkler(str1, str2)
>>>>>>
>>>>>> 2] for soundex
>>>>>> inputs - input string such as person name
>>>>>> output - soundex value
>>>>>> method name - soundex(name)
>>>>>>
>>>>>> To compare two names you can call this function twice and compare the
>>>>>> return values. If the values are the same then the names are phonetically similar.
>>>>>>
>>>>>> You can go ahead and write code and call these methods for now.
>>>>>>
>>>>>> Thanks,
>>>>>> Pradnya
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 23, 2010 at 10:43 AM, Akilandeswari Ramakrishnan <
>>>>>> aramakr@xxxxxxxx> wrote:
>>>>>>
>>>>>>> It would be helpful if you could share how your module works after
>>>>>>> your testing is complete,
>>>>>>> in the sense of what input it expects and
>>>>>>> how it will give the output,
>>>>>>> basically the input/output parameters.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Akila
>>>>>>>
>>>>>>> That way, from the controllers we can provide the necessary inputs
>>>>>>> and process the output from your module accordingly.
>>>>>>>
>>>>>>>   On Tue, Nov 23, 2010 at 10:16 AM, Pradnya Kulkarni <
>>>>>>> kulkarni.pradnya@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>>>  Hi all,
>>>>>>>>
>>>>>>>> I have created a new file 's3deduplicator.py' in modules and
>>>>>>>> added functions for the algos.
>>>>>>>> Does anyone have an idea about how to call methods from modules, and
>>>>>>>> how to import modules in another file?
>>>>>>>>
>>>>>>>> Let me know, as I want to test the algo code.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Pradnya
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Mailing list: https://launchpad.net/~sahana-s08-de-duplicator
>>>>>>>> Post to     : sahana-s08-de-duplicator@xxxxxxxxxxxxxxxxxxx
>>>>>>>> Unsubscribe : https://launchpad.net/~sahana-s08-de-duplicator
>>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
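
For reference, the soundex check described in the thread (call soundex on each name and compare the return values) can be sketched as below. This is the standard American Soundex written out as an assumption; the real soundex(name) in s3deduplicator.py may differ, for example it appears to return lower case ("a240") where this sketch returns "A240". The helper phonetically_similar is a hypothetical name.

```python
# Letter-to-digit table of the standard American Soundex.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """Return the 4-character Soundex code of a name, or "" if it has no letters."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    result = first.upper()
    prev = _CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            # h and w are skipped and do not separate equal codes.
            continue
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        # Vowels have no code, so they reset prev and allow a repeat.
        prev = code
    # Pad with zeros and cut to one letter plus three digits.
    return (result + "000")[:4]


def phonetically_similar(name1, name2):
    """Hypothetical helper: True when both names share a Soundex code."""
    return soundex(name1) == soundex(name2)
```

With this version, phonetically_similar("akila", "Akhila") is True because both names map to A240, which matches the example given earlier in the thread.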
