The Hidden Problem in Big Data: Even Infinite Information does not Guarantee Consistent Measurement

Philip Warncke, Dino Carpentras

January 2025

Abstract

The social sciences depend heavily on the measurement of abstract constructs for quantifying effects, identifying associations between variables, and testing hypotheses. In data science, constructs are also often used for forecasting, and, thanks to the recent big data revolution, they promise ever greater accuracy by leveraging the constantly increasing stream of digital information around us. However, the possibility of optimizing various social indicators implicitly hinges on our ability to reliably reduce complex and abstract constructs (such as life satisfaction or social trust) to numeric measures. While many scientists are aware of the issue of measurement error, there is a widespread, implicit hope that access to more data will eventually render this issue irrelevant. This paper examines the nature of measurement error under quasi-ideal conditions. We show, both mathematically and through simulations, that single measurements fail to converge even when we have access to progressively more information. Then, using real-world data from the Social Capital Benchmark Surveys, we demonstrate how the addition of new information increases the dimensionality of the measured construct quasi-indefinitely, which further contributes to measurement divergence. We conclude by discussing implications and future research directions that could help solve this problem.

Type

Journal article

Philip Warncke

Post-doctoral fellow in Political Psychology

Philip is a comparative political psychologist studying mass belief systems, particularly how to measure and compare them, as well as their consequences for political outcomes.
