Monitoring Soil Organic Matter (SOM) is essential for Monitoring, Reporting, and Verification (MRV) protocols in agricultural ecosystems. However, balancing sampling costs against fit-for-purpose modeling remains a critical bottleneck for scalable Digital Soil Mapping (DSM). This study evaluates the sensitivity of the Random Forest (RF) algorithm to training sample density when predicting SOM (0–30 cm) across a 52-ha agricultural area in Chile. The methodology integrated georeferenced SOM data with multi-source geoenvironmental covariates at 30 m resolution. Predictors included terrain attributes (elevation, slope, curvatures), Sentinel-1/2 multispectral and radar bands, and ALOS PALSAR data. To quantify model sensitivity, multiple sample-size scenarios (n) were generated using conditioned Latin Hypercube Sampling (cLHS) within stratified principal components (PC1–PC5). RF models were optimized via grid search and validated against an independent dataset using RMSE, R², MAPE, and RPIQ. K-means clustering (k = 3) was applied to the performance metrics to identify Low, Medium, and High performance tiers. Results indicated a clear performance plateau within the "Medium" cluster at an average of 26 samples, corresponding to 1.98 ha/sample. High-performance stability was achieved at approximately 1.7 ha/sample (~30 samples). With fewer samples than these thresholds, model error increased significantly and predictive power (R²) was low. These findings indicate that increasing sampling density beyond roughly one sample per 2 ha yields diminishing returns in predictive accuracy for this landscape. This study provides a data-driven framework for optimizing soil sampling designs, ensuring robust SOM predictions while minimizing operational costs. Furthermore, integrating regional datasets through similarity-based weighting can enhance local model performance, reducing the need for intensive primary and additional sampling.
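The validation metrics and the density conversion named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `validation_metrics` and `hectares_per_sample` are hypothetical, R² is assumed to be the coefficient of determination (1 − SS_res/SS_tot), RPIQ is assumed to be the interquartile range of the observed values divided by RMSE, and the quartile method is an arbitrary choice.

```python
import math
import statistics


def validation_metrics(observed, predicted):
    """Return RMSE, R2, MAPE (%) and RPIQ for paired observed/predicted values.

    Assumptions (not confirmed by the source): R2 is 1 - SS_res / SS_tot,
    and RPIQ is IQR(observed) / RMSE using inclusive quartiles.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally sized sequences with at least 2 values")
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    rmse = math.sqrt(ss_res / n)
    # R2 is undefined when the observed values have no variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    # MAPE requires non-zero observed values (true for SOM percentages).
    mape = 100.0 * sum(abs(r / o) for r, o in zip(residuals, observed)) / n
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    rpiq = (q3 - q1) / rmse if rmse > 0 else float("inf")
    return {"rmse": rmse, "r2": r2, "mape": mape, "rpiq": rpiq}


def hectares_per_sample(area_ha, n_samples):
    """Convert a sample count over a study area into hectares per sample."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return area_ha / n_samples


if __name__ == "__main__":
    # Made-up SOM values (%) purely for illustration, not data from the study.
    obs = [2.0, 3.0, 4.0, 5.0]
    pred = [2.5, 2.5, 4.5, 4.5]
    print(validation_metrics(obs, pred))
    print(hectares_per_sample(52, 26), hectares_per_sample(52, 30))
```

For the 52-ha area, 26 samples gives 2.0 ha/sample and 30 samples gives about 1.73 ha/sample, consistent with the rounded figures in the abstract; the reported 1.98 ha/sample presumably reflects a non-integer cluster average.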