The Canadian Roots of Data Quality


Ivan Fellegi deserves to be better known, especially in Canada, where he spent half a century at Statistics Canada (StatsCan) and rose to head the agency. In the 1960s and 1970s he pioneered the field of Data Quality (long before it was ever called that), laying the foundation of probabilistic record linkage in 1969 (the Fellegi-Sunter model, essentially equivalent to Naive Bayes in machine learning) as well as missing-data imputation and editing (the Fellegi-Holt model). He has an interesting bio as well (link to PDF).
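
For readers who want to see why the Fellegi-Sunter model resembles Naive Bayes, here is a minimal sketch, not a faithful reproduction of the 1969 paper. It scores a pair of records by summing per-field log-likelihood ratios, treating fields as conditionally independent, then sorts the pair into one of three decisions. The field names, the m- and u-probabilities, and the two thresholds are all made-up values for illustration; in practice they would be estimated from data.

```python
import math

# Hypothetical probabilities, chosen only for illustration.
# m = P(field agrees | the two records are a true match)
# u = P(field agrees | the two records are not a match)
M_PROBS = {"surname": 0.95, "birth_year": 0.90, "postal_code": 0.85}
U_PROBS = {"surname": 0.01, "birth_year": 0.05, "postal_code": 0.02}


def match_weight(agreement):
    """Sum the per-field log2 likelihood ratios for an agreement pattern.

    `agreement` maps each field name to True (agrees) or False (disagrees).
    Summing the logs assumes fields are conditionally independent,
    which is the same assumption Naive Bayes makes.
    """
    total = 0.0
    for field, agrees in agreement.items():
        m, u = M_PROBS[field], U_PROBS[field]
        if agrees:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, lower=0.0, upper=8.0):
    """Three-way decision; the thresholds here are arbitrary assumptions."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "possible link"  # would normally go to clerical review


if __name__ == "__main__":
    pair = {"surname": True, "birth_year": True, "postal_code": False}
    w = match_weight(pair)
    print(f"weight = {w:.2f}, decision = {classify(w)}")
```

The middle "possible link" band is what distinguishes this from a plain two-class classifier: pairs that fall between the thresholds are set aside for a human to check.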

(Photo by Flickr user abdallahh, Creative Commons License)