3.2.2 Predicting treatment stocks under degraded conditions
To conceptually estimate the carbon sequestered since restoration treatment, we predicted carbon stocks at treatment floodplains using models built for degraded floodplains. Hypothetically, these model-predicted stocks represent pre-treatment conditions at the treatment floodplain, i.e., the stocks expected if the floodplain were still degraded. We use this method as a thought exercise. It rests on the large and central assumption that pre-restoration conditions at each treatment floodplain were similar to those of the degraded floodplains associated with the same restoration project; direct, repeated pre- and post-restoration measurements would provide a better estimate of carbon sequestered since restoration. After predicting treatment stocks with the degraded-derived models, we calculated the difference between measured and predicted treatment stocks as a first-order approximation of how much carbon could be sequestered if environmental conditions are sufficiently similar among degraded, treatment, and reference sites.
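The arithmetic of this thought exercise can be sketched as follows. The example is illustrative only: the single predictor (clay fraction), the ordinary least-squares fit, and every number are hypothetical placeholders, not the models or data used in this study.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical degraded-floodplain samples: predictor (clay %) and stock (Mg/ha)
degraded_clay = [10.0, 20.0, 30.0, 40.0]
degraded_stock = [30.0, 40.0, 50.0, 60.0]

# Step 1: build the model from degraded floodplains only
a, b = fit_line(degraded_clay, degraded_stock)

# Hypothetical treatment-floodplain samples
treatment_clay = [15.0, 25.0, 35.0]
treatment_measured = [45.0, 58.0, 66.0]

# Step 2: predict what the treatment stocks would be "if still degraded"
treatment_predicted = [a + b * c for c in treatment_clay]

# Step 3: measured minus predicted = first-order estimate of carbon gained
stock_difference = [m - p for m, p in zip(treatment_measured, treatment_predicted)]
```

A positive difference means the measured treatment stock exceeds the stock the degraded-derived model would expect at that location. The difference is only as reliable as the assumption that the treatment floodplain resembled the degraded floodplains before restoration.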
3.2.3 Across-site comparisons
We tested for differences among degraded, treatment, and reference carbon stocks (Mg/ha) for sites grouped by Level III ecoregion (Omernik and Griffith, 2014). Each set of degraded, treatment, and reference floodplains within a site was assigned to the Level III ecoregion in which its treatment floodplain was located.
In addition to testing categories and regions across all data, we explored models that best explain carbon stock for the entire dataset. We used ANOVA for both the ecoregion comparisons and the whole-dataset categorical comparisons, and compared estimated marginal means with the emmeans package in R (Lenth, 2023). Then, using the complete dataset, we investigated correlations between carbon stock and potential numeric predictor variables related to soil texture and climate. Using the variables with significant correlations, together with additional categorical predictor variables of research interest, we fit three types of models to estimate carbon content (%). We modeled carbon content rather than stocks to avoid the uncertainty introduced by the assigned bulk density values used to calculate stocks, i.e., we used the direct laboratory results with no modification. We split the dataset, using 80% of sample points for model building and the remaining 20% for model evaluation.
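The correlation screening and the 80/20 split described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the text does not say whether the split was simple random, stratified, or grouped by floodplain, so a simple random split is assumed here. The function names, the seed, and the sample values are hypothetical.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def split_train_test(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists.

    Assumption: simple random split; the source does not specify the scheme.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical sample points: (clay %, carbon content %)
samples = [
    (5.0, 1.1), (10.0, 1.8), (15.0, 2.0), (20.0, 2.9), (25.0, 3.1),
    (30.0, 3.8), (35.0, 4.2), (40.0, 4.9), (45.0, 5.3), (50.0, 6.0),
]

# Screen a candidate numeric predictor against the response
clay = [s[0] for s in samples]
carbon = [s[1] for s in samples]
r = pearson_r(clay, carbon)

# Reserve 20% of the sample points for model evaluation
train, test = split_train_test(samples)
```

With samples nested within holes and floodplains, a simple random split can place samples from the same hole in both subsets. Whether that matters depends on how the evaluation results are meant to be interpreted.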
We compared three modeling approaches to carbon estimation using a variety of predictors, including treatment category within a site. In all models discussed, we excluded data from Colorado because the degraded, treatment, and reference datasets in this region are not associated with one another: the sites are far (>10 km) apart and not along the same stream. Using data from Oregon and Utah only, we began with a linear mixed model. To account for the lack of independence among samples from the same floodplain, the lack of independence among samples from different depths in the same hole, and the availability-based selection of the stream restoration projects we sampled, we modeled carbon content with random and fixed effects in a nested block study design using the lmer function from the lme4 package in R version 4.1.3 (R Core Team, 2022). Because of the large number of complexities to be considered in the linear mixed model, we also modeled carbon content with a gradient boosted regression tree model, which combines decision trees and machine learning to account for characteristics of the dataset without the same sensitivities as the linear model. The gradient boosting model was built with the dismo package in R (Hijmans et al., 2021), using the parameters suggested by Elith et al. (2008). In addition, we modeled the data with a random forest of regression trees using the randomForest package in R (Liaw and Wiener, 2002). We compared results from the three models using the root mean square error (RMSE) and the coefficient of determination (R²) between the model predictions and the 20% of the data reserved for model evaluation.
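The two evaluation metrics can be written out explicitly. In this sketch, R² is computed as one minus the ratio of residual to total sum of squares; the text does not state which definition of R² was used (the squared correlation between observed and predicted values is another common choice), so that choice is an assumption. The observed and predicted values are hypothetical.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    obs_mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out carbon content (%) and one model's predictions
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 6.0, 7.0]

model_rmse = rmse(observed, predicted)      # sqrt(0.5), about 0.707
model_r2 = r_squared(observed, predicted)   # 1 - 2/20 = 0.9
```

Applying the same two functions to each model's predictions on the same held-out samples gives directly comparable scores: lower RMSE and higher R² indicate a better fit to the evaluation data.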