How does Screena handle common names?
We combine multiple techniques to increase matching accuracy for common names.
Secondary-attribute matching is an efficient way to automatically discard a considerable portion of false hits.
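As an illustration only, secondary-attribute filtering could be sketched as follows. This is not Screena's implementation: the `Party` class, the `birth_year` and `country` fields, and the rule "discard a hit when an attribute is present on both sides and disagrees" are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Party:
    """Hypothetical record; the attribute names are assumptions."""
    name: str
    birth_year: Optional[int] = None
    country: Optional[str] = None


def conflicts(screened: Party, listed: Party) -> bool:
    """Return True when an attribute is present on both sides and differs."""
    for attr in ("birth_year", "country"):
        a, b = getattr(screened, attr), getattr(listed, attr)
        if a is not None and b is not None and a != b:
            return True
    return False


def filter_hits(screened: Party, hits: List[Party]) -> List[Party]:
    """Keep only the hits whose secondary attributes do not contradict the screened party."""
    return [hit for hit in hits if not conflicts(screened, hit)]


customer = Party("Mohamed Ali", birth_year=1985, country="EG")
hits = [
    Party("Mohamed Ali", birth_year=1942, country="US"),  # contradicts: discarded
    Party("Mohamed Ali", birth_year=1985),                # consistent: kept
    Party("Mohamed Ali"),                                 # no attributes: kept
]
remaining = filter_hits(customer, hits)
```

In this sketch a missing attribute never discards a hit, which is why the quality of secondary data matters for this technique.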
In addition to secondary-attribute matching, our algorithms can exclude matches against specific entity types. This prevents irrelevant matches when the names of individuals or organizations coincide with vessel names (e.g., "Christina" or "Mariana").
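A minimal sketch of entity-type exclusion, assuming a simple rule table. The `EntityType` enum, the `EXCLUDED` mapping and the `is_allowed` function are hypothetical names chosen for this example and do not describe Screena's actual configuration.

```python
from enum import Enum


class EntityType(Enum):
    INDIVIDUAL = "individual"
    ORGANIZATION = "organization"
    VESSEL = "vessel"


# Hypothetical rule table: for each screened entity type,
# the listed entity types whose matches should be ignored.
EXCLUDED = {
    EntityType.INDIVIDUAL: {EntityType.VESSEL},
    EntityType.ORGANIZATION: {EntityType.VESSEL},
}


def is_allowed(screened_type: EntityType, listed_type: EntityType) -> bool:
    """Return False when matches between these two entity types are excluded."""
    return listed_type not in EXCLUDED.get(screened_type, set())


# An individual named "Christina" is not matched against a vessel named "Christina",
# but a vessel is still screened against listed vessels.
person_vs_vessel = is_allowed(EntityType.INDIVIDUAL, EntityType.VESSEL)
vessel_vs_vessel = is_allowed(EntityType.VESSEL, EntityType.VESSEL)
```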
When secondary attributes are missing or unreliable because of data quality issues, we rely on machine learning models that have been extensively trained to handle names that are very common within specific cultures (e.g., "Mohamed Ali", "Liu Wei").
Furthermore, threshold sensitivity is automatically recalibrated based on the detected culture. This technique delivers better results for highly challenging cultures such as Chinese, Korean or Vietnamese.
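The idea of a culture-dependent threshold can be sketched as below. The threshold values, the culture labels and the assumption that stricter thresholds apply to these cultures are illustrative only; culture detection itself is assumed to happen upstream and is not shown.

```python
from typing import Optional

# Illustrative values only; they are not Screena's defaults.
DEFAULT_THRESHOLD = 0.85
CULTURE_THRESHOLDS = {
    "chinese": 0.92,
    "korean": 0.92,
    "vietnamese": 0.90,
}


def threshold_for(culture: Optional[str]) -> float:
    """Return the match threshold for a detected culture, or the default."""
    if culture is None:
        return DEFAULT_THRESHOLD
    return CULTURE_THRESHOLDS.get(culture.lower(), DEFAULT_THRESHOLD)


def is_hit(score: float, culture: Optional[str] = None) -> bool:
    """A candidate is reported only if its score reaches the calibrated threshold."""
    return score >= threshold_for(culture)


# The same score can be a hit under the default threshold
# and a non-hit once the threshold is recalibrated.
default_result = is_hit(0.88)
chinese_result = is_hit(0.88, "Chinese")
```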
Our machine learning models include features that apply specific weights to common name elements of individuals (e.g., "Al", "Ben") or organizations (e.g., "bank", "international", "services"). We also provide a list of stopwords and linking words in multiple languages (e.g., "and", "or") to eliminate false positives when screening narrative fields within payments.
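To show why down-weighting common elements helps, here is a small sketch that uses a weighted token overlap in place of the machine learning features described above. The word lists, the weight of 0.2 and the scoring formula are assumptions for illustration, and tokenization is a plain whitespace split that ignores punctuation.

```python
# Hypothetical lists and weights, for illustration only.
STOPWORDS = {"and", "or"}
LOW_WEIGHT = {"al": 0.2, "ben": 0.2, "bank": 0.2, "international": 0.2, "services": 0.2}


def tokens(text: str) -> set:
    """Lowercase, split on whitespace and drop stopwords."""
    return {t for t in text.lower().split() if t not in STOPWORDS}


def weight(token: str) -> float:
    """Common name elements count for less than distinctive ones."""
    return LOW_WEIGHT.get(token, 1.0)


def weighted_overlap(a: str, b: str) -> float:
    """Weighted Jaccard similarity: weight of shared tokens over weight of all tokens."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    if not union:
        return 0.0
    shared = sum(weight(t) for t in ta & tb)
    total = sum(weight(t) for t in union)
    return shared / total


# Two organizations that share only generic words score low...
generic = weighted_overlap("Acme International Bank", "Zenith International Bank")
# ...while sharing the distinctive token keeps the score high.
distinctive = weighted_overlap("Acme Bank", "Acme Bank and Services")
```

With equal weights the first pair would share two of four tokens; the weighting is what pushes the score down.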
Screena comes with a set of options that determine how matches against short names are handled. For instance, it is possible to systematically discard matches against single-token names contained within full names (e.g., "Arthur Timothy Smith" matching "Arthur"). These parameters can be set separately for each entity type and tuned based on the number of tokens being screened.
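A short-name rule of this kind might look like the sketch below. The `ShortNamePolicy` fields, the `POLICIES` table and the `keep_match` function are hypothetical and only illustrate per-entity-type settings combined with a token-count condition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShortNamePolicy:
    """Hypothetical options; the names and defaults are assumptions."""
    discard_single_token_in_full_name: bool = True
    min_screened_tokens: int = 2  # the rule applies only from this many screened tokens


# Hypothetical per-entity-type settings.
POLICIES = {
    "individual": ShortNamePolicy(),
    "vessel": ShortNamePolicy(discard_single_token_in_full_name=False),
}


def keep_match(screened: str, listed: str, entity_type: str) -> bool:
    """Return False when a single-token listed name is merely contained in a longer screened name."""
    policy = POLICIES.get(entity_type, ShortNamePolicy())
    screened_tokens = screened.lower().split()
    listed_tokens = listed.lower().split()
    contained = (
        len(listed_tokens) == 1
        and listed_tokens[0] in screened_tokens
        and len(screened_tokens) >= policy.min_screened_tokens
    )
    return not (policy.discard_single_token_in_full_name and contained)


full_vs_single = keep_match("Arthur Timothy Smith", "Arthur", "individual")
single_vs_single = keep_match("Arthur", "Arthur", "individual")
vessel_case = keep_match("Christina Star", "Christina", "vessel")
```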