Decision tree and geostatistics in reducing the number of soil micronutrient analyses



Sample density, Statistical learning, Ordinary kriging


Interpolation by kriging requires that each point of the semivariogram be estimated from at least 30 pairs of points, drawn from at least 100 samples, which makes the process expensive for the farmer. As an alternative, machine learning methods were used, particularly decision trees. The main objective of this work was to evaluate the use of decision trees to reduce the sample density for soil attributes, so that ordinary kriging can be performed with a reduced sample size. To this end, 50 samples were drawn with the Latin Hypercube Sampling (LHS) algorithm for meshes containing 82, 112 and 127 sampled points, and the missing values were predicted with the decision tree until each mesh reached 150 points. Ordinary kriging was then carried out for the resulting MR127, MR112 and MR82 meshes, which were generated by combining the 50 decision tree predictions, and the results were evaluated using the Root Mean Square Error (RMSE) and the Mean Absolute Error (MAE).
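The pair-count requirement on the semivariogram can be checked before any model is fitted. The following is a minimal sketch, not the procedure used in this work: it counts how many point pairs fall into each distance bin. The function name `pairs_per_lag`, the regular 10 x 15 grid, the lag width and the number of lags are all assumptions chosen for illustration.

```python
import math
from itertools import combinations


def pairs_per_lag(coords, lag_width, n_lags):
    """Count the point pairs that fall into each distance bin (lag)."""
    counts = [0] * n_lags
    for (x1, y1), (x2, y2) in combinations(coords, 2):
        distance = math.hypot(x2 - x1, y2 - y1)
        k = int(distance // lag_width)  # index of the lag bin
        if k < n_lags:                  # pairs beyond the last lag are ignored
            counts[k] += 1
    return counts


# Hypothetical layout: 150 points on a regular grid with unit spacing.
coords = [(float(i), float(j)) for i in range(10) for j in range(15)]

counts = pairs_per_lag(coords, lag_width=1.5, n_lags=6)

# Lags supported by fewer than 30 pairs would be flagged here.
under_supported = [k for k, c in enumerate(counts) if c < 30]
```

Running the same check on a thinned set of coordinates shows which lags lose support as the sample density drops.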

Both RMSE and MAE decreased as the number of original samples was reduced. Maps of the attributes from the reduced meshes show a micronutrient concentration pattern similar to the original one: areas with higher concentrations remain high, and areas with lower concentrations remain low. The decision tree therefore proved effective in preserving the soil's micronutrient concentration pattern.
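The two error statistics used to compare the meshes, RMSE and MAE, follow their standard definitions and can be computed as in the minimal sketch below. The function names and the numeric values are made up for illustration and are not data from this work.

```python
import math


def rmse(observed, predicted):
    """Root Mean Square Error: square root of the mean squared difference."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean Absolute Error: mean of the absolute differences."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


# Made-up values standing in for measured and predicted concentrations.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.0, 4.5]

error_rmse = rmse(observed, predicted)
error_mae = mae(observed, predicted)
```

RMSE weights large deviations more heavily than MAE, so reporting both gives a sense of whether the error is dominated by a few poorly predicted points.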




