Does Screena provide rules-based and/or fuzzy matching capabilities?

Screena's name-matching approach is twofold: it combines deterministic rules with predictive scoring methods to minimize both false negatives (i.e., achieving high recall) and false positives (i.e., achieving high precision).

Rules-based and fuzzy matching form the core of the approach: they ensure high recall first, and our machine learning models then reduce false positives.

We first employ traditional string-similarity algorithms such as Jaro to measure how closely two names resemble each other when initiating searches on lists. We also apply rules-based algorithms and use proprietary name libraries to detect specific name patterns that cannot be addressed through string distance alone. These include, but are not limited to:

  • Name order variations and missing name components,

  • Misspellings and errors (inverted letters, missing letters, substituted letters),

  • Truncated names,

  • Name concatenations,

  • Acronyms and initials,

  • Nicknames, synonyms and common aliases,

  • Titles, honorifics and company legal forms,

  • Phonetic resemblances,

  • Detection of stopwords, linking words and weighting of common words,

  • Detection of locations (cities, towns, regions, ports),

  • Number variations,

  • Domain names.
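
To illustrate how string similarity and rules can complement each other, here is a minimal Python sketch. It is not Screena's implementation: the function names (jaro, normalize_name, match_score) and the small TITLES set are hypothetical, chosen only for this example. It computes the standard Jaro similarity, then applies two simple rules from the list above (removing titles and ignoring name order) before comparing again.

```python
import re

# Hypothetical, deliberately tiny list of titles; a real name library would be far larger.
TITLES = {"mr", "mrs", "ms", "dr", "sir"}


def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity between two strings, in the range [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2  # transpositions

    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def normalize_name(name: str) -> str:
    """Rule step: lowercase, drop punctuation and titles, ignore token order."""
    tokens = re.findall(r"\w+", name.lower())
    kept = [tok for tok in tokens if tok not in TITLES]
    return " ".join(sorted(kept))


def match_score(a: str, b: str) -> float:
    """Best of the raw Jaro score and the score after rule-based normalization."""
    raw = jaro(a.lower(), b.lower())
    normalized = jaro(normalize_name(a), normalize_name(b))
    return max(raw, normalized)
```

In this sketch, match_score("Dr. John Smith", "Smith, John") returns 1.0 because both inputs normalize to the same string, even though the raw strings differ in order, punctuation and title. A misspelling such as "MARTHA" versus "MARHTA" is still caught by the Jaro score alone.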
