Forests of UncertainT(r)ees: Using tree-based ensembles to estimate probability distributions of future conflict

Abstract:

The high uncertainty of point predictions when forecasting conflict, especially at the subnational level, is a significant shortcoming and a major obstacle to the practical application of conflict prediction systems. In our contribution to the 2023/24 ViEWS prediction challenge at the PRIO-GRID-month (pgm) level, we employ a quasi-hurdle combination of tree-based models to generate pgm-specific predictions of N=1000 samples each for three to fourteen months into the future. Our strategy combines predictions from a binary classification task on the occurrence of fatalities with sample outputs from distributional regressors trained only on non-zero targets. We address the problem of zero-inflation by interpreting the classifier's predicted probability as the share of non-zero predictions in the final samples drawn to represent the predicted distributions. We design a modeling pipeline that automatically tunes multiple classifiers and regressors and selects the best model for each prediction timestep based on tuning performance. In an effort to address a lack of data as a source of uncertainty, we additionally generate “local” model predictions for semi-automatically generated spatial clusters of violence based on pgms experiencing any fatalities in our training data, thus accounting for context-specific systematic differences in conflict dynamics. While all our models beat a series of benchmarks across almost all test windows and metrics, the “global”-only model (unibw_trees_global) and a global-local combination selected based on past performance (unibw_trees_global-local) scored best, with the local model (unibw_trees_local) performing only slightly worse.
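
The quasi-hurdle combination described above can be illustrated with a minimal sketch. It is not the pipeline used for the submission: the function name `hurdle_samples`, its parameters, and the choice to resample the regressor draws with replacement are assumptions made for illustration only. The sketch takes a classifier probability for one pgm and timestep, treats it as the share of non-zero entries among N samples, and fills that share with draws from a regressor trained on non-zero targets.

```python
import numpy as np


def hurdle_samples(p_nonzero, positive_draws, n_samples=1000, rng=None):
    """Combine a classifier probability with draws from a positive-only regressor.

    p_nonzero      -- predicted probability of any fatalities (hypothetical input)
    positive_draws -- 1-D array of draws from a distributional regressor
                      trained only on non-zero targets (hypothetical input)
    """
    rng = np.random.default_rng() if rng is None else rng
    p = float(np.clip(p_nonzero, 0.0, 1.0))

    # The classifier probability is read as the share of non-zero samples.
    n_nonzero = int(round(p * n_samples))

    # Start from an all-zero sample vector, then fill the non-zero share.
    samples = np.zeros(n_samples)
    if n_nonzero > 0:
        # Assumption: resample the regressor output with replacement.
        samples[:n_nonzero] = rng.choice(positive_draws, size=n_nonzero, replace=True)
    return samples


# Example: a 25% chance of fatalities yields 250 non-zero and 750 zero samples.
example = hurdle_samples(0.25, np.array([1.0, 3.0, 10.0]), rng=np.random.default_rng(0))
```

Under this reading, a pgm with a low classifier probability still receives a full sample of size N, but most of its mass sits at zero, which is how the zero-inflation of the target is carried into the predicted distribution.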

Authors:

Daniel Mittermaier, Tobias Bohne, and Martin Hofer