Apply fuzzy matching across a dataframe column and save results in a new column

I couldn’t tell what you were doing. This is how I would do it.

import pandas as pd
from fuzzywuzzy import fuzz
from fuzzywuzzy import process

Create a series of tuples to compare:

compare = pd.MultiIndex.from_product([df1['Company'],
                                      df2['FDA Company']]).to_series()

Create a special function to calculate fuzzy metrics and return a series.

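def metrics(tup):
    # Score one (Company, FDA Company) pair with two fuzzy metrics.
    # token_sort_ratio is assumed for the 'token' entry.
    return pd.Series([fuzz.ratio(*tup),
                      fuzz.token_sort_ratio(*tup)],
                     ['ratio', 'token'])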

Apply metrics to the compare series



There are a bunch of ways to do this next part:

Get closest matches to each row of df1



Get closest matches to each row of df2


