3.2.2 Predicting treatment stocks under degraded conditions
To obtain a conceptual estimate of carbon sequestered since restoration
treatment, we predicted treatment floodplain carbon stocks with models
built for degraded floodplains. Under this approach, the model-predicted
treatment stocks represent hypothetical pre-treatment conditions at the
treatment floodplain, i.e., the stocks expected if it were still
degraded. We used this method as a thought exercise and recognize that
it rests on a large assumption: that pre-restoration conditions
resembled those of the degraded floodplains associated with each
restoration project. We suggest that direct, repeated pre- and
post-restoration measurements would better estimate the carbon
sequestered since restoration. After predicting treatment stocks with
the degraded-derived models, we calculated the differences between
measured and predicted treatment stocks as a first-order approximation
of how much carbon could have been sequestered, provided environmental
conditions are sufficiently similar among degraded, treatment, and
reference sites.
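As an illustration only, the differencing step can be sketched in a few lines of Python (the analysis itself was done in R). The function name and the stock values below are hypothetical and are not data from this study; the sketch assumes that measured and predicted stocks are paired sample by sample and expressed in Mg/ha.

```python
from statistics import mean


def sequestration_estimate(measured, predicted):
    """Return per-sample differences (measured - predicted) and their mean.

    measured:  stocks measured at the treatment floodplain (Mg/ha)
    predicted: stocks predicted for the same samples by a model built on
               degraded floodplains (Mg/ha), i.e., the "if still degraded" case
    """
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must be paired")
    diffs = [m - p for m, p in zip(measured, predicted)]
    return diffs, mean(diffs)


# Hypothetical values for illustration only (not study data).
measured_treatment = [120.0, 95.5, 140.0]
predicted_if_degraded = [100.0, 90.5, 110.0]

diffs, mean_diff = sequestration_estimate(measured_treatment, predicted_if_degraded)
```

A positive difference indicates more carbon measured at the treatment floodplain than the degraded-derived model predicts, subject to the assumption described above.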
3.2.3 Across-site comparisons
We tested for differences between degraded, treatment, and reference
carbon stocks (Mg/ha) for sites combined by Level III ecoregion (Omernik
and Griffith, 2014). We assigned each set of degraded, treatment, and
reference floodplains within a site to the Level III ecoregion in which
the treatment floodplain was located.
In addition to testing for differences among categories and ecoregions,
we explored models that best explain carbon stock for the entire
dataset. We used ANOVA tests for both the ecoregion comparisons and the
categorical comparisons across the entire dataset, and compared
estimated marginal means with the emmeans package in R (Lenth, 2023).
Then, using the complete dataset, we
investigated correlations between carbon stock and potential numeric
predictor variables related to soil texture and climate. Using variables
with significant correlations and additional categorical predictor
variables of research interest, we used three types of models to
estimate carbon content (%). We used carbon content rather than stocks
to avoid uncertainty introduced by the assigned bulk density values that
were used to calculate stocks, i.e., we used direct laboratory results
with no modification. We split the dataset using 80% of sample points
for model building and the remaining 20% for model evaluation.
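A minimal sketch of an 80/20 split is given below in Python, for illustration only. It assumes a simple random split at the level of individual sample points; the text does not state whether the split was stratified or grouped by site. The function name, the seed, and the placeholder sample identifiers are all hypothetical.

```python
import random


def split_samples(rows, train_fraction=0.8, seed=42):
    """Randomly split rows into (training, evaluation) lists.

    Assumption: a simple random split of sample points. The seed is
    arbitrary and only makes the example reproducible.
    """
    shuffled = list(rows)                      # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder sample identifiers (hypothetical, not study data).
samples = list(range(100))
train, evaluation = split_samples(samples)
```

Fixing the seed keeps the evaluation set identical across the models being compared, so differences in error metrics reflect the models rather than the split.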
We compared three modeling approaches to carbon estimation using a
variety of predictors including treatment category within a site. In all
models discussed, we excluded data from Colorado because the degraded,
treatment, and reference sites in that dataset are not associated with
one another: they are far apart (>10 km) and not along the same stream.
Using data from Oregon and Utah only, we began with a linear mixed
model. To account for the lack of independence of samples from the same
floodplains, lack of independence of samples from different depths from
the same hole, and the availability-based nature of the stream
restoration projects we chose to sample, we modeled carbon content as a
mixed model with random and fixed effects in a nested block study design
using the lmer function from the lme4 package in R version 4.1.3 (R Core
Team, 2022). Because of the many complexities to be considered in the
linear mixed model, we also modeled carbon content using a gradient
boosted regression tree model, which combines decision trees with
boosting and does not share the linear model's sensitivities to these
characteristics of the dataset. The gradient boosting model was built
with the dismo
package in R (Hijmans et al., 2021). Parameters for this model were
those suggested by Elith et al. (2008). In addition, we modeled the data
with a random forest model using regression decision trees with the
randomForest package in R (Liaw and Wiener, 2002). We compared results
from the three models using the root mean square error (RMSE) and the
coefficient of determination (R²) between the model predictions and the
20% of the data reserved for model evaluation.
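For illustration, the two evaluation metrics can be computed as in the Python sketch below. It assumes R² is defined as 1 - SS_res/SS_tot on the evaluation data; the squared Pearson correlation between observations and predictions is another common convention, and the text does not specify which was used. The function names and example values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical carbon content values (%) for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]

model_rmse = rmse(observed, predicted)
model_r2 = r_squared(observed, predicted)
```

Under this definition R² can be negative when a model predicts the evaluation data worse than the mean of the observations would.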