• Computer Society Webinar – Large scale Entity Resolution: A Practical Blueprint from Noisy Records to Trustworthy Entities

    Virtual: https://events.vtools.ieee.org/m/520810

    Entity resolution is the backbone of any data platform that aims to present a single, trustworthy view of an organization across noisy, overlapping sources. This talk shares a practical, system-oriented blueprint for company entity resolution that you can adapt to your stack. We'll begin with upstream data preparation (standardization, canonicalization, and normalization of names, websites, addresses, and phones) to reduce ambiguity before matching. We'll then cover signature construction (e.g., relaxed/collapsed variants), blocking to avoid the N² explosion of pairwise comparisons, and a match function that combines exact agreement on one core attribute (website, name, or address) with a second fuzzy signal to balance precision and recall. You'll see how constraints (e.g., unique primary website; unique name+HQ address), attribute scoring/selection, and separation of company vs. location resolution improve quality and explainability. We'll discuss pre-merge signals from authoritative linkages, human-in-the-loop controls for edge cases, and governance patterns: provenance ("why this value"), rollovers for stable IDs, and reproducibility. Finally, we'll outline evaluation and monitoring tactics (drift checks, audits) and deployment considerations for both batch and streaming environments. Attendees will leave with a clear set of building blocks to move from noisy inputs to reliable, auditable entities.

    Speaker(s): Rohit Muthyala
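
    For readers unfamiliar with the terms in the abstract, the sketch below illustrates normalization, blocking, and an "exact core attribute plus fuzzy signal" match rule. It is not taken from the talk: the record fields (`id`, `name`, `website`), the helper names, the legal-suffix list, and the 0.85 threshold are all assumptions chosen for illustration, and the standard library's `difflib.SequenceMatcher` stands in for whatever fuzzy measure a real system would use.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations
from urllib.parse import urlparse

# Hypothetical suffix list; a real pipeline would use a much larger one.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}


def normalize_website(url: str) -> str:
    """Reduce a URL to a bare lowercase host without a leading 'www.'."""
    url = url.strip().lower()
    if "://" not in url:
        url = "http://" + url  # urlparse needs a scheme to find the host
    host = urlparse(url).netloc
    return host[4:] if host.startswith("www.") else host


def normalize_name(name: str) -> str:
    """Lowercase, replace punctuation with spaces, drop legal suffixes."""
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in name.lower())
    return " ".join(t for t in cleaned.split() if t not in LEGAL_SUFFIXES)


def candidate_pairs(records):
    """Blocking: only compare records that share a normalized website."""
    blocks = defaultdict(list)
    for rec in records:
        key = normalize_website(rec["website"])
        if key:  # records without a website are not blocked together
            blocks[key].append(rec)
    for block in blocks.values():
        yield from combinations(block, 2)


def is_match(a, b, threshold=0.85):
    """Exact agreement on website plus a fuzzy name similarity check."""
    same_site = normalize_website(a["website"]) == normalize_website(b["website"])
    name_sim = SequenceMatcher(
        None, normalize_name(a["name"]), normalize_name(b["name"])
    ).ratio()
    return same_site and name_sim >= threshold
```

    Blocking on the website alone misses duplicates that share only a name or an address, so a fuller pipeline would presumably generate candidates from several blocking keys and take their union.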