Name Matching Experiment
(Part 6)

Eurospider carried out a simple experiment with the popular Levenshtein distance string metric. Around 600 names taken from the media were used as queries against a test database of more than 1000 entries. For each of the 600 names, the test database contained the full and correct name, which differed from the form used in the media. The candidate hits for each name were ranked by ascending Levenshtein distance. Finally, yield and precision were determined for the case in which the top n ranks are sifted. What can we learn from this?
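To make the ranking step concrete, here is a minimal sketch in Python. It is not Eurospider's implementation: the function names, the lower-casing of names before comparison, and the sample names are assumptions made purely for illustration. The sketch computes the Levenshtein distance with the standard dynamic-programming recurrence and sorts database entries by ascending distance to a query name.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def rank_candidates(query: str, database: list[str]) -> list[tuple[int, str]]:
    """Return (distance, entry) pairs sorted by ascending distance.
    Lower-casing before comparison is an assumption of this sketch."""
    scored = [(levenshtein(query.lower(), entry.lower()), entry)
              for entry in database]
    return sorted(scored)


# Hypothetical sample data, not taken from the experiment.
test_database = ["Jane Smyth", "Joan Smithe", "John Smith"]
for distance, entry in rank_candidates("Jon Smith", test_database):
    print(distance, entry)
```

Sifting the top n ranks then simply means reviewing the first n entries of the list returned by `rank_candidates`.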

[Chart: yield and precision by number of ranks sifted]

The more ranks are sifted, the more correct hits (true positives) are found. As expected, precision declines at the same time: the more ranks are sifted, the more false hits (false positives) turn up among them. The sharp drop in the precision curve means that the verification effort increases significantly.
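The trade-off can be illustrated with a small Python sketch. The text does not give the exact formulas used in the experiment, so the definitions here are assumptions: yield is taken as the share of queries whose correct entry appears in the top n ranks, and precision as the share of all sifted hits that are correct. The function name and the toy data are hypothetical.

```python
def yield_and_precision_at_n(rankings: dict[str, list[str]],
                             correct: dict[str, str],
                             n: int) -> tuple[float, float]:
    """rankings maps each query name to its ranked candidate list,
    correct maps each query name to its single correct database entry.
    Returns (yield, precision) when the top n ranks are sifted."""
    true_positives = 0
    sifted = 0
    for query, ranked in rankings.items():
        top = ranked[:n]
        sifted += len(top)            # every sifted hit must be verified
        if correct[query] in top:
            true_positives += 1       # the correct entry was found
    found_share = true_positives / len(rankings) if rankings else 0.0
    precision = true_positives / sifted if sifted else 0.0
    return found_share, precision


# Toy data: two queries, three ranked candidates each (hypothetical).
rankings = {"q1": ["a", "b", "c"], "q2": ["d", "e", "f"]}
correct = {"q1": "a", "q2": "f"}
for n in (1, 2, 3):
    print(n, yield_and_precision_at_n(rankings, correct, n))
```

In this toy data, sifting one rank finds one of the two correct entries with two hits to verify, while sifting three ranks finds both but requires verifying six hits, so yield rises while precision falls.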

Complete Revision of the Federal Data Protection Act

“As of 15 September 2017, the draft and report for a completely revised Federal Data Protection Act are public. In a first step, parliament and the people agreed to adaptations in order to comply with EU law. The second part of the revision has been debated by parliament since September 2019. Data protection is to be strengthened by giving people more control over their private data and by reinforcing transparency regarding the handling of confidential data.”


Eurospider Information Technology AG
Winterthurerstrasse 92
8006 Zürich

