Sustainable management of the environment is based on the preservation of natural resources, first of all freshwater (both surface water and groundwater), from exhaustion and contamination. Thus, the development of adequate monitoring solutions, including fast and adaptive modelling approaches, is of high importance. Recent progress in machine learning techniques provides an opportunity to improve the accuracy with which the spatial distribution of properties of natural objects is predicted, and to automate all stages of this process so as to exclude the uncertainties caused by handcrafting. We propose a technique to construct a weighted Water Quality Index (WQI) and a spatial prediction map of the WQI over the tested area. In particular, the WQI is calculated using a dimensionality reduction technique (Principal Component Analysis), and the spatial map of the WQI is constructed using Gaussian Process Regression (GPR) with automatic kernel structure selection based on the Bayesian Information Criterion (BIC). We validate our approach on a new dataset for groundwater quality in the New Moscow region, where groundwater is mostly used for drinking purposes. According to the estimated WQI values, groundwater quality across the study region is relatively high, with only a few points (fewer than 0.5% of all observations) severely contaminated. The estimated WQIs were then used to construct spatial distribution models, and the GPR-BIC approach was compared with Ordinary Kriging (OK) and Universal Kriging (UK) with exponential, Gaussian, polynomial and periodic kernels. The quality of the models was assessed using a cross-validation scheme, according to which the GPR-BIC approach performed better, with an R² score 15% higher on average than those of the Kriging models. We show that the proposed geospatial interpolation is a potentially powerful and adaptable tool for predicting the spatial distribution of properties of natural resources.
- Bayesian information criterion
- Gaussian process regression
- PCA-loading index
- Water quality
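
As an illustration of the kernel selection step named in the abstract, the following is a minimal sketch of choosing a Gaussian process kernel by BIC, where BIC = k ln(n) - 2 ln(L), k is the number of kernel hyperparameters, n is the number of observations and L is the marginal likelihood. It is not the authors' implementation. The kernel set, the fixed length scale, the noise level, the one-dimensional inputs and all function names are assumptions made for the example; a real pipeline would optimise the hyperparameters before scoring each kernel.

```python
import math

# Candidate kernels (illustrative; hyperparameters are fixed, not optimised).
def rbf(a, b, length=1.0):
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def exponential(a, b, length=1.0):
    return math.exp(-abs(a - b) / length)

def rbf_plus_exponential(a, b):
    return rbf(a, b) + exponential(a, b)

def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                low[i][j] = (matrix[i][j] - s) / low[j][j]
    return low

def log_marginal_likelihood(xs, ys, kernel, noise=1e-2):
    """Zero-mean GP log marginal likelihood with noise variance on the diagonal."""
    n = len(xs)
    cov = [[kernel(xs[i], xs[j]) + (noise if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    # Forward substitution: solve low @ z = ys, so that z.z = ys^T cov^-1 ys.
    z = []
    for i in range(n):
        z.append((ys[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    quad = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * quad - 0.5 * log_det - 0.5 * n * math.log(2.0 * math.pi)

def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion; lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_lik

# name -> (kernel function, assumed number of hyperparameters)
CANDIDATES = {
    "rbf": (rbf, 1),
    "exponential": (exponential, 1),
    "rbf+exponential": (rbf_plus_exponential, 2),
}

def select_kernel(xs, ys, candidates=CANDIDATES):
    """Return the name of the kernel with the lowest BIC and all BIC scores."""
    scores = {
        name: bic(log_marginal_likelihood(xs, ys, kern), k, len(xs))
        for name, (kern, k) in candidates.items()
    }
    return min(scores, key=scores.get), scores
```

Calling `select_kernel(xs, ys)` on a list of locations and the corresponding index values returns the name of the lowest-BIC kernel together with the score of every candidate, so the penalty for the extra hyperparameter of the sum kernel can be inspected directly.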