There is a nice story in the Globe and Mail today regarding StatsCan and the Canadian census. The story describes two very important aspects of data quality: the missing data problem and the record linkage problem. The context is the Canadian government's decision to scrap the mandatory long-form census for 2011, which has made it difficult to gather and analyse vital information about Canada and its citizens. Response rates to the voluntary survey have been low in many areas, and people are often reluctant to report their income (the missing data problem). StatsCan is doing its best in this difficult situation, and in this case is asking a sample of respondents to include their Social Insurance Number (SIN) so that the census data can be linked to more reliable income data from the Canada Revenue Agency (the record linkage problem).
“When the data from the survey was released last year, information on thousands of smaller communities was withheld because of low response rates. And because some people didn’t want to fill out the voluntary form or parts of it, collected data on income levels has been criticized as flawed.
The agency is now asking a broad sample of those who fill out the tests of the mandatory, short-form census to include their SIN. The number will help tap into specific information from tax returns held by the Canada Revenue Agency, the type of solid data that could backstop the census.”
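To make the two problems concrete, here is a minimal Python sketch of exact-match linkage on a shared identifier. It is only an illustration, not StatsCan's actual procedure: the records, the made-up SIN values, the field names, and the `link_income` helper are all assumptions invented for this example.

```python
# Hypothetical census responses: self-reported income may be missing (None),
# and some respondents do not supply an identifier at all.
census = [
    {"sin": "000-000-001", "community": "A", "income": 52000},
    {"sin": "000-000-002", "community": "A", "income": None},
    {"sin": None, "community": "B", "income": None},
]

# Hypothetical tax records, keyed by the same (made-up) identifier.
tax_income = {"000-000-001": 50500, "000-000-002": 41000}


def link_income(record, tax):
    """Return a copy of the record, using tax income when an exact link exists."""
    linked = dict(record)
    sin = record.get("sin")
    if sin is not None and sin in tax:
        # Linked record: prefer the administrative figure.
        linked["income"] = tax[sin]
        linked["income_source"] = "tax"
    elif record.get("income") is not None:
        linked["income_source"] = "self-reported"
    else:
        # No identifier and no self-reported value: still missing.
        linked["income_source"] = "missing"
    return linked


linked = [link_income(r, tax_income) for r in census]

# Count missing incomes before and after linkage.
missing_before = sum(r["income"] is None for r in census)
missing_after = sum(r["income"] is None for r in linked)
print(missing_before, missing_after)
```

In this toy example, linkage fills in one of the two missing incomes, while the respondent who supplied no identifier stays missing. Real linkage is harder than a dictionary lookup, since identifiers can be mistyped or withheld.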
(Photo by Flickr user abdallah, Creative Commons license)