Fuzzy Duplicate

Joined: 10/31/2015 - 04:46

Hi Brian,
Hope you are well.
Quick question about fuzzy duplicate:
I am trying to group similar names because names come in as free text and tend to be error-prone.
Current output from fuzzy duplicate match:
Name                                   Group Name (from Fuzzy)
John Smith                            John Smith
Smith, John                           Smith John
Why does it not match when the first name and last name are reversed but otherwise exactly the same? Do you have any suggestions for how I can match those as well?
Thank you,

Brian Element
Joined: 07/11/2012 - 19:57

I think it is because it uses the Levenshtein distance algorithm, so it measures how many single-character edits it takes to turn one string into the other. I assume in your case, since the names are reversed instead of just having typos, that the distance is too far apart. In your example they are using a comma when the names are reversed, so maybe create a virtual field that reverses the name if it contains a comma, and then do the fuzzy match on that field. That would cover this scenario.
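
To illustrate the idea outside of IDEA, here is a minimal Python sketch. It is not IDEA's implementation: levenshtein and normalize_name are hypothetical helper names, and it assumes that a comma always means "Last, First". It shows that the raw names are several edits apart, while the comma-normalized names are identical.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook edit distance: inserts, deletes and substitutions each cost 1."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalize_name(name: str) -> str:
    """Assumption: a comma means 'Last, First'; rewrite it as 'First Last'."""
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first.strip()} {last.strip()}"
    return " ".join(name.split())  # collapse repeated spaces


raw_a, raw_b = "John Smith", "Smith, John"
print(levenshtein(raw_a, raw_b))  # large: far more than a typo would cause
print(levenshtein(normalize_name(raw_a), normalize_name(raw_b)))  # 0
```

In IDEA the same normalization would live in the virtual field, and the fuzzy match would then run on that field instead of the original name.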