Good data is essential to developing successful interventions to reduce crime and reform our criminal justice system. But too often, available datasets do not provide the necessary level of detail to “talk to” one another across systems; for example, many lack a unique identifier such as a Social Security number, while others have unreliable information caused by typos, name changes, and nicknames. These issues make linking an individual’s criminal record across multiple datasets—which is crucial for tracking how well an intervention works to improve outcomes—difficult, if not impossible.
To solve this problem, Crime Lab researchers developed Name Match, which uses machine learning (tools that leverage data to “learn” and improve performance) to identify the same individual in records from different systems. Our algorithm works by comparing identifying information such as name, date of birth, home address, gender, and race, looking for records whose personal information consistently overlaps.
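To make the idea of comparing identifying information concrete, here is a minimal sketch of a pairwise record comparison. It is not Name Match's actual code or API: the function names, field names, and example records are all hypothetical, and the string similarity comes from Python's standard `difflib` module rather than whatever measures the package itself uses.

```python
from difflib import SequenceMatcher


def string_similarity(a: str, b: str) -> float:
    """Return a 0-1 similarity ratio, ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def compare_records(rec_a: dict, rec_b: dict) -> dict:
    """Build per-field comparison features for one pair of records.

    Hypothetical helper: names tolerate typos via fuzzy similarity,
    while date of birth and gender are compared exactly.
    """
    return {
        "first_name_sim": string_similarity(rec_a["first_name"], rec_b["first_name"]),
        "last_name_sim": string_similarity(rec_a["last_name"], rec_b["last_name"]),
        "dob_match": float(rec_a["dob"] == rec_b["dob"]),
        "gender_match": float(rec_a["gender"] == rec_b["gender"]),
    }


# Made-up records from two systems, differing by a one-letter typo in the first name.
arrest = {"first_name": "Jonathan", "last_name": "Smith", "dob": "1990-04-12", "gender": "M"}
program = {"first_name": "Jonathon", "last_name": "Smith", "dob": "1990-04-12", "gender": "M"}

features = compare_records(arrest, program)
print(features)
```

In a machine learning approach, features like these could be passed to a trained classifier that estimates whether the two records refer to the same person, so that a near-match on name combined with an exact match on date of birth counts as stronger evidence than either alone.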
Since development began in 2016, we’ve used the Name Match tool to perform record linkage for several of our evaluation projects, including READI Chicago and One Summer Chicago. In October 2023, we released Name Match on GitHub as an open-source Python package so other researchers, non-profits, and public sector agencies could benefit from the tool and contribute to its ongoing development.