Enhancing the Robustness of Observational Social Science Research by Computational Multi-Model Analyses
Replication science has so far focused heavily on experimental research, where uncertainty about results is primarily caused by sampling error. However, assessing the robustness of research with non-experimental data, which is dominant in many social science disciplines, requires other methods that also capture the uncertainty caused by model choice. In research with non-experimental data, there are often numerous ways to specify the analysed samples, the functional form of the studied associations, the selection of covariates, and the regression models. In addition, unobserved heterogeneity can endanger the validity of non-experimental research (the so-called sensitivity of results). So far, however, large-scale evaluation studies are missing, and suitable methods for this purpose are also lacking.
This project therefore asks: How can the robustness of non-experimental social science research be assessed and improved with the help of computational “multi-model” programs? Three closely related research objectives serve this purpose:
- The further development of promising programs for robustness and sensitivity analyses (so-called "multi-model", "multiverse", or "specification curve" analyses) for use in large-scale evaluations. For example, we aim to develop standardised robustness measures and to define the model variants to be tested (in terms of samples, variables, and regression models).
- The first large-scale robustness analysis of effect estimates from regression analyses of non-experimental data. For this purpose, the tools developed under the first objective will be applied to 100 studies published in leading journals of the relevant disciplines (sociology, political science, and economics). We will investigate three rates. The reproducibility rate: To what extent are results reproducible with the models and data of the primary studies, and what role do possible (coding) errors play in this? The robustness rate: To what extent are results robust against alternative models, and against which ones? The sensitivity rate: To what extent does unobserved heterogeneity threaten the robustness and validity of estimated effects? These comprehensive analyses also allow, for the first time, a systematic identification of key (statistical) sources of robustness.
- The exploration of routines to improve robustness in primary research: To what extent can multi-model and sensitivity analyses already help researchers to arrive at more robust estimates? As a further novelty, we will introduce a new publication format, "robustness notes".
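To illustrate the kind of multi-model analysis referred to in the objectives above, the following minimal Python sketch re-estimates one effect under every possible subset of control variables and reports a simple robustness measure. It is only an illustration under stated assumptions, not the project's actual tooling: the simulated data, the function names (`multi_model_estimates`, `sign_stability`), and the choice of "share of specifications with the same sign as the baseline estimate" as the robustness measure are all hypothetical.

```python
from itertools import combinations

import numpy as np


def ols_coef(y, X):
    """Return OLS coefficients of y on X (X must already contain an intercept)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def multi_model_estimates(y, treatment, controls):
    """Estimate the treatment coefficient under every subset of control variables."""
    n, k = controls.shape
    estimates = {}
    for size in range(k + 1):
        for subset in combinations(range(k), size):
            cols = controls[:, np.array(subset, dtype=int)]
            X = np.column_stack([np.ones(n), treatment, cols])
            estimates[subset] = ols_coef(y, X)[1]  # index 1 = treatment
    return estimates


def sign_stability(estimates, baseline):
    """Hypothetical robustness measure: share of models agreeing in sign with the baseline."""
    values = np.array(list(estimates.values()))
    return float(np.mean(np.sign(values) == np.sign(baseline)))


# Simulated data (assumption): the first control confounds the treatment effect.
rng = np.random.default_rng(0)
n = 500
controls = rng.normal(size=(n, 3))
treatment = 0.5 * controls[:, 0] + rng.normal(size=n)
y = 1.0 * treatment + 0.8 * controls[:, 0] + rng.normal(size=n)

estimates = multi_model_estimates(y, treatment, controls)
baseline = estimates[(0, 1, 2)]  # model with all controls as the reference
share = sign_stability(estimates, baseline)
print(f"{len(estimates)} models, sign stability = {share:.2f}")
```

With three candidate controls, the sketch fits 2^3 = 8 models. A full analysis in the sense described above would additionally vary the analysed samples, functional forms, and regression models, and would use standardised robustness measures rather than this ad hoc one.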
With these three closely intertwined research objectives, the project makes important contributions to the "what" and "how" questions of the META-REP Priority Programme: What is the replication rate (robustness), how can it be determined, and how can robustness already be improved in primary research?