Machine learning methods for estimation the indicators of phosphogypsum influence in soil

Maria A. Pukalchik, Alexandr M. Katrutsa, Dmitry Shadrin, Vera A. Terekhova, Ivan V. Oseledets

    Research output: Contribution to journal, Article, peer-review

    5 Citations (Scopus)


    Purpose: The effect of mineral waste-based fertilizers on soil is still not fully understood, because of their extremely complex chemical composition and the plethora of their action pathways. The purpose of this paper is to quantify the input of phosphogypsum (PG) into soil ecosystem processes, considering the direct effects of PG as a whole on the soil environment, using a range of chemical, toxicological, and biological tests.
    Materials and methods: The greenhouse experiment includes different PG doses (0, 1%, 3%, 7.5%, 15%, 25%, and 40%) and two collection time points after treatment: 7 and 28 days. For each treatment and each collection time point, we measure (i) soil pH and bioavailable (H2O- and NH4COOH-extractable) element content (S, P, K, Na, Mg, Ca, Fe, Zn, Sr, Ba, F); (ii) soil enzyme activities (dehydrogenase, urease, acid phosphatase, FDA); (iii) soil CO2 respiration activity with and without glucose addition; (iv) Eisenia fetida, Sinapis alba, and Avena sativa responses. Finally, we combine the ordinary chemical, toxicological, and biological measurements of soil properties with state-of-the-art mathematical analysis, namely (i) support vector machines (used for prediction), (ii) the mutual information test (used for variable importance tasks), (iii) the t-SNE and LLE algorithms (used for unsupervised classification).
    Results and discussion: The results show similarity between the 0%, 1%, and 3% PG treatments at all collection times based on the toxicological and biological properties. Beyond 7.5% PG, some biological tests were significantly inhibited in response to trace element stress. Among all tested parameters, soil urease activities, soil respiration activities after glucose addition, S. alba root lengths, and E. fetida survival rates show sensitivity to PG addition. Furthermore, the machine learning algorithms revealed that only several elements (mobile and water-soluble forms of Ca, Ba, Sr, S, and Na; water-soluble F) could be responsible for the elevated soil toxicity for those indicators. Support vector regression (SVR) models were able to predict soil biological and ecotoxicity properties, and increasing the share of randomly selected training examples from 50% to 90% of the initial experimental data significantly improved model performance.
    Conclusions: In this study, we demonstrate the benefits of unsupervised machine learning methods for investigating the toxicity of man-made substances in soil; these methods can be further applied to risk assessments of various toxins, which are of significant interest to environmental protection.
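To make the variable-importance step in the abstract more concrete, the sketch below ranks two candidate predictors by their mutual information with a response. It is only an illustration under stated assumptions: the abstract does not say which mutual information estimator the authors used, so a simple equal-width histogram (plug-in) estimate is swapped in here; the data are synthetic, and the names `ca`, `zn`, and `urease` are hypothetical stand-ins rather than the study's measurements.

```python
import math
import random
from collections import Counter


def discretize(values, n_bins=5):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would fall into bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=5):
    """Plug-in estimate of I(X; Y) in nats from binned samples."""
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    mi = 0.0
    for (i, j), count in joint.items():
        # p(i, j) * log(p(i, j) / (p(i) * p(j))), written with raw counts.
        mi += (count / n) * math.log(count * n / (px[i] * py[j]))
    return mi


# Synthetic stand-in data (an assumption, not the paper's dataset).
rng = random.Random(0)
n_samples = 500
ca = [rng.uniform(0, 1) for _ in range(n_samples)]  # hypothetical "Ca" level
zn = [rng.uniform(0, 1) for _ in range(n_samples)]  # hypothetical "Zn" level
# Hypothetical response that depends on "Ca" only, plus Gaussian noise.
urease = [1.0 - 0.8 * c + rng.gauss(0, 0.05) for c in ca]

scores = {
    "Ca": mutual_information(ca, urease),
    "Zn": mutual_information(zn, urease),
}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

With real measurements, each extractable element would take the place of `ca` or `zn` and each biological indicator the place of `urease`. The histogram estimate is biased upward for small samples, so scores from a dataset of this kind are better read as a relative ranking than as absolute values.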

    Original language: English
    Pages (from-to): 2265-2276
    Number of pages: 12
    Journal: Journal of Soils and Sediments
    Issue number: 5
    Publication status: Published - 1 May 2019


    • Bioassay
    • Biological properties
    • Feature relevance
    • Machine learning
    • Pollution
    • Regression
    • Soil
    • Trace element
    • Waste

