Summary
Microaggregation by individual ranking is one of the most commonly applied disclosure control techniques for continuous microdata. The paper studies the effect of microaggregation by individual ranking on the least squares estimation of a multiple linear regression model. It is shown that the traditional least squares estimates are asymptotically unbiased. Moreover, the least squares estimates asymptotically have the same variances as the least squares estimates based on the original (non-aggregated) data. Thus, asymptotically, microaggregation by individual ranking does not result in a loss of efficiency in the least squares estimation of a multiple linear regression model.
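The claim in the summary can be illustrated with a small simulation. The following is a minimal sketch, not the paper's own procedure: it assumes the common description of individual ranking (each variable is sorted separately, consecutive groups of size k are formed, and every value is replaced by its group mean), and the function names `microaggregate_ir` and `ols`, the sample size, the group size, and the regression coefficients are all hypothetical choices made for illustration.

```python
import random

def microaggregate_ir(values, k):
    """Individual ranking (assumed convention): sort one variable on its own,
    then replace each value by the mean of its group of k consecutive ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    for start in range(0, len(order), k):
        group = order[start:start + k]  # the last group may be smaller than k
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            out[i] = mean
    return out

def ols(columns, y):
    """Least squares with intercept, solved from the normal equations
    by Gauss-Jordan elimination with partial pivoting."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented system [X'X | X'y]
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[r][p] / A[r][r] for r in range(p)]

# Hypothetical data-generating process: y = 1 + 2*x1 - 1*x2 + noise
rng = random.Random(0)
n, k = 3000, 3
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + rng.gauss(0, 1) for a in x1]
y = [1.0 + 2.0 * a - 1.0 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

# Estimates from the original data and from the microaggregated data
beta_orig = ols([x1, x2], y)
beta_agg = ols([microaggregate_ir(x1, k), microaggregate_ir(x2, k)],
               microaggregate_ir(y, k))

print("original:       ", [round(b, 3) for b in beta_orig])
print("microaggregated:", [round(b, 3) for b in beta_agg])
```

With a large sample and a small group size, the two sets of estimates should lie close together, which is consistent with the asymptotic result stated above. A single simulated data set only illustrates the statement and does not establish it.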
Additional information
I thank Hans Schneeweiss for very helpful discussions and comments. Financial support from the Deutsche Forschungsgemeinschaft (German Science Foundation) is gratefully acknowledged.
About this article
Cite this article
Schmid, M. Estimation of a linear model under microaggregation by individual ranking. Allgemeines Statistisches Arch 90, 419–438 (2006). https://doi.org/10.1007/s10182-006-0243-z
DOI: https://doi.org/10.1007/s10182-006-0243-z
Keywords
- asymptotic variance
- consistent estimation
- disclosure control
- individual ranking
- linear model
- microaggregation