The Hidden Problem in Big Data: Even Infinite Information does not Guarantee Consistent Measurement

Abstract

The social sciences depend heavily on the measurement of abstract constructs for quantifying effects, identifying associations between variables, and testing hypotheses. In data science, constructs are also often used for forecasting, and, thanks to the recent big data revolution, they promise to enhance forecasting accuracy by leveraging the constantly increasing stream of digital information around us. However, the possibility of optimizing various social indicators implicitly hinges on our ability to reliably reduce complex and abstract constructs (such as life satisfaction or social trust) to numeric measures. While many scientists are aware of the issue of measurement error, there is a widespread, implicit hope that access to more data will eventually render this issue irrelevant. This paper examines the nature of measurement error under quasi-ideal conditions. We show mathematically and through simulations that single measurements fail to converge even when we have access to progressively more information. Then, using real-world data from the Social Capital Benchmark Surveys, we demonstrate how the addition of new information increases the dimensionality of the measured construct quasi-indefinitely, which further contributes to measurement divergence. We conclude by discussing implications and future research directions that might solve this problem.

Philip Warncke
Post-doctoral fellow in Political Science

Philip Warncke is a recent doctoral graduate from the University of North Carolina at Chapel Hill. Philip studies mass belief systems, particularly how to measure and compare them, as well as their consequences for political outcomes.
