Imputation of erosivity values under incomplete rainfall data by machine learning methods


This article compares empirical equations with machine learning methods for estimating and imputing rainfall erosivity values where a significant share of the rainfall measurements is missing from the recording rain gauge data of the Greek Hydroscope database. The empirical equations are mainly based on exponential relations between erosivity and rainfall, while the machine learning methods employed are feed-forward neural networks with Bayesian regularization and ridge regression with a nonlinear transformation. The data came from 81 measuring stations of the Ministry of the Environment and Energy. In the employed algorithms, the output was the weekly cumulative erosivity value, obtained by processing the data of all rain gauges and pluviographs, while the inputs were the weekly cumulative rainfall, the month, the coordinates and elevation of the station, and the number of days for which rainfall was recorded. Nested cross-validation was used for validation. The machine learning methods gave significantly better results than the empirical equations, thus reducing the effects of estimating rainfall erosivity (R) from weekly rainfall records alone.
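To make the validation scheme concrete, the following is a minimal sketch of nested cross-validation wrapped around a ridge regression on nonlinearly transformed inputs. It is not the authors' implementation. The feature transform (log rainfall plus seasonal harmonics of the month), the log-transformed target, the fold counts, the candidate penalties, and the synthetic data generator are all assumptions made for illustration, and the station coordinates and elevation are left out for brevity.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge regression; the intercept (first column) is not penalized."""
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)


def features(rain, month, days):
    """Hypothetical nonlinear transform: log rainfall, month harmonics, days recorded."""
    angle = 2 * np.pi * month / 12
    return np.column_stack(
        [np.ones_like(rain), np.log1p(rain), np.sin(angle), np.cos(angle), days]
    )


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def k_folds(n, k, rng):
    """Split the indices 0..n-1 into k shuffled folds."""
    return np.array_split(rng.permutation(n), k)


def nested_cv(X, y, lambdas, outer_k=5, inner_k=4, seed=0):
    """Outer loop estimates the error; inner loop selects the ridge penalty."""
    rng = np.random.default_rng(seed)
    outer_scores = []
    for test_idx in k_folds(len(y), outer_k, rng):
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        X_tr, y_tr = X[train_idx], y[train_idx]
        inner_folds = k_folds(len(y_tr), inner_k, rng)

        def inner_error(lam):
            errors = []
            for val_idx in inner_folds:
                fit_idx = np.setdiff1d(np.arange(len(y_tr)), val_idx)
                w = ridge_fit(X_tr[fit_idx], y_tr[fit_idx], lam)
                errors.append(rmse(y_tr[val_idx], X_tr[val_idx] @ w))
            return np.mean(errors)

        # The penalty is chosen using the outer training data only.
        best_lam = min(lambdas, key=inner_error)
        w = ridge_fit(X_tr, y_tr, best_lam)
        outer_scores.append(rmse(y[test_idx], X[test_idx] @ w))
    return float(np.mean(outer_scores))


def make_synthetic(n, seed=0):
    """Synthetic weekly records (assumed, not Hydroscope data): power law with noise."""
    rng = np.random.default_rng(seed)
    rain = rng.gamma(shape=1.5, scale=20.0, size=n)  # weekly rainfall
    month = rng.integers(1, 13, size=n)              # 1..12
    days = rng.integers(1, 8, size=n)                # days recorded, 1..7
    erosivity = 0.05 * rain ** 1.6 * rng.lognormal(0.0, 0.3, size=n)
    return features(rain, month, days), np.log1p(erosivity)


if __name__ == "__main__":
    X, y = make_synthetic(500)
    print("nested-CV RMSE (log scale):", nested_cv(X, y, [0.01, 0.1, 1.0, 10.0]))
```

Because the penalty is selected inside each outer training split, the outer test folds play no part in model selection, so the averaged outer error is not biased by the tuning step.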

In European Water 57: p. 193-199, 2017