Study hypotheses and objectives
H1: The AGI model will perform better than the dissolved oxygen and null model, and the dissolved oxygen model will perform better than the null model.
study objective being met: Which model performs the best and presents the best predictions (i.e., best predictive performance scores, most ecologically realistic suitability maps)?
H2: The inclusion of dissolved oxygen at deeper depths will result in better/more ecologically realistic habitat suitability predictions relative to the dissolved oxygen model considering surface values alone.
study objective being met: How does dissolved oxygen at different depths influence habitat suitability predictions relative to oxygen at the surface?
H3: The inclusion of the AGI at deeper depths will result in better/more ecologically realistic habitat suitability predictions relative to the AGI model considering surface values alone.
study objective being met: How does the aerobic growth index (AGI; environmental oxygen supply:theoretical oxygen demand) at different depths influence habitat suitability predictions relative to the aerobic growth index at the surface?
H4: There will be important relationships between dissolved oxygen/the AGI and latitude/distance to coast.
study objective being met: Are there any important relationships between dissolved oxygen or AGI at the surface or at depth and latitude or distance to the coast?
H5: The null model will predict higher habitat suitability in areas or during seasons or periods (upwelling or La Niña) with lower dissolved oxygen through the water column relative to the dissolved oxygen and AGI models.
study objective being met: How do the habitat suitability maps differ between the models? How do these variations compare for different points in time?
Model summaries
explore_brt (mod_file_path = brt_outputs_crw[7 ],
test_data = base_test_daily_crw)
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862823
Residual.Deviance 0.7986447
Correlation 0.7174286
AUC 0.9148000
Per.Expl 42.3894599
cvDeviance 1.0127285
cvCorrelation 0.5642493
cvAUC 0.8220800
cvPer.Expl 26.9464423
[1] "Relative influence of predictor variables"
rel.inf
bathy_mean 27.367481
temp_mean 18.477169
sal_mean 10.540214
chl_mean 8.710175
ssh_mean 6.015610
mld_mean 5.861958
vostr_mean 5.303627
bathy_sd 5.270798
vo_mean 3.496298
uo_mean 3.392592
uostr_mean 2.849782
pred_var 2.714296
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 10 bathy_mean 2 temp_mean 363.44
2 6 vo_mean 3 sal_mean 193.16
3 12 pred_var 4 uo_mean 168.67
4 8 ssh_mean 2 temp_mean 162.10
5 2 temp_mean 1 chl_mean 129.68
6 10 bathy_mean 8 ssh_mean 110.34
7 8 ssh_mean 1 chl_mean 98.98
[1] "External percent deviance explained"
[1] 0.3850404
[1] "TPR"
[1] 0.6952086
[1] "TSS"
[1] 0.6133057
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
4150 iterations were performed.
There were 12 predictors of which 12 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.3706824 0.6788681 0.8917619 1.003817 0.3850404 0.4238946
explore_brt (mod_file_path = brt_outputs_back[7 ],
test_data = base_test_daily_back)
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862741
Residual.Deviance 0.2948092
Correlation 0.9249630
AUC 0.9922000
Per.Expl 78.7336988
cvDeviance 0.5909966
cvCorrelation 0.8025147
cvAUC 0.9464300
cvPer.Expl 57.3679835
[1] "Relative influence of predictor variables"
rel.inf
bathy_mean 37.726838
temp_mean 23.806676
sal_mean 7.021355
chl_mean 5.980357
ssh_mean 5.413233
uostr_mean 5.244057
vostr_mean 3.838429
bathy_sd 2.871536
mld_mean 2.581925
uo_mean 2.392186
vo_mean 1.868975
pred_var 1.254433
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 10 bathy_mean 2 temp_mean 835.60
2 10 bathy_mean 8 ssh_mean 650.18
3 8 ssh_mean 2 temp_mean 556.11
4 10 bathy_mean 3 sal_mean 496.83
5 10 bathy_mean 4 uo_mean 406.37
6 3 sal_mean 2 temp_mean 343.56
7 8 ssh_mean 1 chl_mean 337.30
[1] "External percent deviance explained"
[1] -3.437823
[1] "TPR"
[1] 0.2602644
[1] "TSS"
[1] 0
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
4250 iterations were performed.
There were 12 predictors of which 12 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.8898691 -0.8883681 0.01999832 0.9823639 -3.437823 0.787337
explore_brt (mod_file_path = "data/brt/mod_outputs/crw/refined/brt_agi_0m_60m_250m_dail_seas_ann_no_wind.rds" ,
test_data = agi_test_daily_seasonal_annual_crw)
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862598
Residual.Deviance 0.1784155
Correlation 0.9620158
AUC 0.9980000
Per.Expl 87.1297181
cvDeviance 0.4443144
cvCorrelation 0.8616943
cvAUC 0.9680100
cvPer.Expl 67.9486949
[1] "Relative influence of predictor variables"
rel.inf
bathy_mean 29.410142
temp_mean 22.090687
AGI_250m_seas 8.885923
AGI_0m 7.036726
AGI_0m_seas 4.031791
sal_mean 3.863634
AGI_250m_ann 3.717057
AGI_250m 3.185755
AGI_60m_ann 3.018782
ssh_mean 2.964206
chl_mean 2.686814
AGI_60m_seas 2.575530
AGI_0m_ann 1.639762
AGI_60m 1.581991
bathy_sd 1.553865
mld_mean 1.073780
pred_var 0.683555
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 8 AGI_0m 2 temp_mean 3465.55
2 15 AGI_0m_ann 12 AGI_0m_seas 386.42
3 6 bathy_mean 3 sal_mean 359.44
4 16 AGI_60m_ann 6 bathy_mean 297.40
5 16 AGI_60m_ann 12 AGI_0m_seas 290.80
6 17 AGI_250m_ann 3 sal_mean 258.66
7 14 AGI_250m_seas 6 bathy_mean 192.39
8 14 AGI_250m_seas 2 temp_mean 163.51
9 8 AGI_0m 4 ssh_mean 162.57
10 6 bathy_mean 2 temp_mean 154.13
11 14 AGI_250m_seas 10 AGI_250m 153.38
12 3 sal_mean 2 temp_mean 150.76
13 12 AGI_0m_seas 2 temp_mean 123.54
14 8 AGI_0m 3 sal_mean 103.44
[1] "External percent deviance explained"
[1] -2.981096
[1] "TPR"
[1] 0.3431204
[1] "TSS"
[1] 0
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
9050 iterations were performed.
There were 17 predictors of which 17 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.8144573 -0.5895665 0.1853505 0.6929731 -2.981096 0.8712972
explore_brt (mod_file_path = "data/brt/mod_outputs/crw/refined/brt_do_0m_60m_250m_dail_seas_ann_no_wind.rds" ,
test_data = do_test_daily_seasonal_annual_crw)
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862871
Residual.Deviance 0.5891149
Correlation 0.8163435
AUC 0.9606000
Per.Expl 57.5041190
cvDeviance 0.8440735
cvCorrelation 0.6692730
cvAUC 0.8827300
cvPer.Expl 39.1126493
[1] "Relative influence of predictor variables"
rel.inf
o2_mean_0m 17.287654
o2_mean_250m_ann 14.631246
o2_mean_0m_seas 8.910687
o2_mean_250m_seas 7.142539
temp_mean 6.627435
o2_mean_60m_ann 5.495030
o2_mean_60m_seas 5.325783
bathy_mean 4.964121
sal_mean 4.592187
o2_mean_250m 4.465235
chl_mean 4.270698
o2_mean_0m_ann 3.305816
o2_mean_60m 3.042965
ssh_mean 2.916029
mld_mean 2.724895
bathy_sd 2.244270
pred_var 2.053409
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 3 temp_mean 1 o2_mean_0m 508.25
2 12 o2_mean_0m_seas 1 o2_mean_0m 301.24
3 14 o2_mean_250m_seas 10 o2_mean_250m 166.76
4 12 o2_mean_0m_seas 5 ssh_mean 150.68
5 12 o2_mean_0m_seas 2 chl_mean 122.58
6 4 sal_mean 3 temp_mean 101.22
7 15 o2_mean_0m_ann 4 sal_mean 89.73
8 16 o2_mean_60m_ann 7 bathy_mean 86.63
9 9 o2_mean_60m 7 bathy_mean 81.59
10 9 o2_mean_60m 8 bathy_sd 77.18
11 11 pred_var 7 bathy_mean 66.90
12 7 bathy_mean 3 temp_mean 65.75
13 12 o2_mean_0m_seas 4 sal_mean 63.82
14 13 o2_mean_60m_seas 4 sal_mean 63.74
[1] "External percent deviance explained"
[1] 0.5285492
[1] "TPR"
[1] 0.7203563
[1] "TSS"
[1] 0.7319403
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
5650 iterations were performed.
There were 17 predictors of which 17 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.3166705 0.7799527 0.9421311 1.001803 0.5285492 0.5750412
explore_brt (mod_file_path = "data/brt/mod_outputs/background/refined/brt_agi_0m_60m_250m_dail_seas_ann_no_wind.rds" ,
test_data = agi_test_daily_seasonal_annual_back)
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862937
Residual.Deviance 0.1804510
Correlation 0.9611911
AUC 0.9978000
Per.Expl 86.9832054
cvDeviance 0.4441876
cvCorrelation 0.8615975
cvAUC 0.9680800
cvPer.Expl 67.9586240
[1] "Relative influence of predictor variables"
rel.inf
bathy_mean 30.0158951
temp_mean 22.6707636
AGI_250m_seas 9.1916259
AGI_0m 6.6966664
AGI_0m_seas 3.8422741
sal_mean 3.5425059
AGI_250m_ann 3.2096436
ssh_mean 3.1559174
AGI_60m_seas 2.9237078
AGI_60m_ann 2.8613663
chl_mean 2.6961486
AGI_250m 2.6213203
AGI_60m 1.6572538
AGI_0m_ann 1.6292092
bathy_sd 1.6259866
mld_mean 1.0917162
pred_var 0.5679992
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 8 AGI_0m 2 temp_mean 4282.43
2 15 AGI_0m_ann 6 bathy_mean 498.47
3 16 AGI_60m_ann 6 bathy_mean 428.77
4 15 AGI_0m_ann 12 AGI_0m_seas 302.09
5 6 bathy_mean 3 sal_mean 265.89
6 3 sal_mean 2 temp_mean 254.57
7 12 AGI_0m_seas 9 AGI_60m 224.51
8 14 AGI_250m_seas 2 temp_mean 215.85
9 8 AGI_0m 4 ssh_mean 210.70
10 12 AGI_0m_seas 2 temp_mean 189.86
11 16 AGI_60m_ann 12 AGI_0m_seas 180.44
12 6 bathy_mean 2 temp_mean 138.32
13 9 AGI_60m 6 bathy_mean 130.20
14 17 AGI_250m_ann 3 sal_mean 123.95
[1] "External percent deviance explained"
[1] 0.8308918
[1] "TPR"
[1] 0.7459277
[1] "TSS"
[1] 0.9328179
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
9000 iterations were performed.
There were 17 predictors of which 17 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.1727172 0.9399568 0.9925128 0.9983244 0.8308918 0.8698321
explore_brt (mod_file_path = "data/brt/mod_outputs/background/refined/brt_do_0m_60m_250m_dail_seas_ann_no_wind.rds" ,
test_data = do_test_daily_seasonal_annual_back)
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862943
Residual.Deviance 0.1909322
Correlation 0.9569687
AUC 0.9975000
Per.Expl 86.2271541
cvDeviance 0.4463350
cvCorrelation 0.8600541
cvAUC 0.9682600
cvPer.Expl 67.8037354
[1] "Relative influence of predictor variables"
rel.inf
bathy_mean 26.2112221
o2_mean_0m 21.2023536
o2_mean_250m_seas 10.1374648
o2_mean_0m_seas 8.4989877
o2_mean_60m_seas 5.2518224
o2_mean_250m_ann 4.0182405
o2_mean_0m_ann 3.4428506
chl_mean 2.9180736
temp_mean 2.7140167
o2_mean_250m 2.6253593
ssh_mean 2.6104859
sal_mean 2.4001687
o2_mean_60m_ann 2.3289369
o2_mean_60m 2.3143331
bathy_sd 1.3897784
mld_mean 1.2730290
pred_var 0.6628767
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 3 temp_mean 1 o2_mean_0m 780.03
2 16 o2_mean_60m_ann 7 bathy_mean 417.55
3 13 o2_mean_60m_seas 4 sal_mean 365.89
4 7 bathy_mean 2 chl_mean 256.48
5 4 sal_mean 1 o2_mean_0m 238.13
6 16 o2_mean_60m_ann 13 o2_mean_60m_seas 224.10
7 16 o2_mean_60m_ann 9 o2_mean_60m 214.13
8 12 o2_mean_0m_seas 3 temp_mean 202.33
9 7 bathy_mean 5 ssh_mean 191.77
10 9 o2_mean_60m 7 bathy_mean 190.11
11 2 chl_mean 1 o2_mean_0m 171.76
12 5 ssh_mean 4 sal_mean 164.64
13 16 o2_mean_60m_ann 3 temp_mean 158.76
14 7 bathy_mean 3 temp_mean 143.24
[1] "External percent deviance explained"
[1] 0.8138392
[1] "TPR"
[1] 0.7447363
[1] "TSS"
[1] 0.9235923
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
8300 iterations were performed.
There were 17 predictors of which 17 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.182767 0.9320937 0.9901405 0.9997562 0.8138392 0.8622715
Here, dissolved oxygen was included at 0m and at all temporal resolutions, while AGI was included at 250m and all temporal resolutions.
explore_brt (mod_file_path = "data/brt/mod_outputs/crw/refined/brt_agi_250_DO_0_dail_seas_ann.rds" ,
test_data = dat_test_do_agi_all)
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862875
Residual.Deviance 0.6250167
Correlation 0.7985999
AUC 0.9528000
Per.Expl 54.9143526
cvDeviance 0.8554645
cvCorrelation 0.6626543
cvAUC 0.8795600
cvPer.Expl 38.2909726
[1] "Relative influence of predictor variables"
rel.inf
o2_mean_0m 16.470171
AGI_250m_ann 15.811049
o2_mean_0m_seas 12.815606
AGI_250m_seas 7.984658
temp_mean 7.744521
bathy_mean 7.655246
sal_mean 6.784886
AGI_250m 5.144102
chl_mean 4.881779
o2_mean_0m_ann 4.757163
ssh_mean 4.089160
mld_mean 3.229369
bathy_sd 2.632290
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 3 temp_mean 1 o2_mean_0m 461.19
2 9 o2_mean_0m_seas 1 o2_mean_0m 277.38
3 12 AGI_250m_seas 4 sal_mean 266.29
4 10 o2_mean_0m_ann 3 temp_mean 262.22
5 4 sal_mean 3 temp_mean 249.21
6 5 ssh_mean 4 sal_mean 136.76
7 5 ssh_mean 2 chl_mean 126.46
8 12 AGI_250m_seas 11 AGI_250m 120.24
[1] "External percent deviance explained"
[1] 0.5094073
[1] "TPR"
[1] 0.7172322
[1] "TSS"
[1] 0.7106185
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
5400 iterations were performed.
There were 13 predictors of which 13 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.3247552 0.7664765 0.9358957 0.9980248 0.5094073 0.5491435
#load do final model
agi_mod <- readRDS (here ("data/brt/mod_outputs/crw/ensemble/brt_agi_only.rds" ))
do_mod <- readRDS (here ("data/brt/mod_outputs/crw/ensemble/brt_do_final.rds" ))
# make df of predictions from each model
test_ensemble_df <- readRDS (here ("data/brt/mod_eval/agi_do_test_daily_seasonal_annual.rds" ))
pred_testdata <- data.frame (
do = predict.gbm (do_mod, test_ensemble_df,
n.trees = do_mod$ gbm.call$ best.trees,
type = "response" ),
agi = predict.gbm (agi_mod, test_ensemble_df,
n.trees = agi_mod$ gbm.call$ best.trees,
type = "response" )
)
# Mean of probabilities
mean_prob <- rowMeans (pred_testdata)
# performance measures for "mean of probabilities"
(perf_mean_prob <- mecofun:: evalSDM (test_ensemble_df$ PA, mean_prob))
AUC TSS Kappa Sens Spec PCC D2 thresh
1 0.9310446 0.7052047 0.7052807 0.8729466 0.8322581 0.8526573 0.4576839 0.52
#calculate percent external deviance explained
ext.residual.deviance <- calc.deviance (obs = test_ensemble_df$ PA, pred= mean_prob, family= "bernoulli" , calc.mean= TRUE ) #get % deviance
null.dev = calc.deviance (test_ensemble_df$ PA ,rep (mean (test_ensemble_df$ PA),length (test_ensemble_df$ PA)), family= "bernoulli" , calc.mean= T)
(dev= (null.dev - ext.residual.deviance)/ null.dev)
explore_brt (mod_file_path = "data/brt/mod_outputs/crw/refined/brt_agi_0m_250m_dail_seas_ann.rds" ,
test_data = agi_test_daily_seasonal_annual_crw)
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862903
Residual.Deviance 0.6277548
Correlation 0.7973163
AUC 0.9522000
Per.Expl 54.7169289
cvDeviance 0.8617301
cvCorrelation 0.6582930
cvAUC 0.8768100
cvPer.Expl 37.8391330
[1] "Relative influence of predictor variables"
rel.inf
temp_mean 15.504668
AGI_250m_ann 15.163685
AGI_0m 11.740021
bathy_mean 11.710734
AGI_0m_seas 8.197830
sal_mean 6.807123
AGI_250m_seas 6.751818
AGI_0m_ann 5.161877
chl_mean 5.090897
AGI_250m 4.624616
bathy_sd 3.784137
mld_mean 3.106183
pred_var 2.356411
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 7 AGI_0m 2 temp_mean 2843.30
2 7 AGI_0m 5 bathy_mean 355.65
3 12 AGI_0m_ann 10 AGI_0m_seas 197.02
4 10 AGI_0m_seas 8 AGI_250m 181.68
5 13 AGI_250m_ann 12 AGI_0m_ann 165.05
6 11 AGI_250m_seas 8 AGI_250m 149.59
7 10 AGI_0m_seas 2 temp_mean 136.02
8 13 AGI_250m_ann 11 AGI_250m_seas 122.24
[1] "External percent deviance explained"
[1] 0.4999456
[1] "TPR"
[1] 0.7155658
[1] "TSS"
[1] 0.7086853
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
5450 iterations were performed.
There were 13 predictors of which 13 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.328907 0.7586342 0.9325364 1.007354 0.4999456 0.5471693
explore_brt (mod_file_path = "data/brt/mod_outputs/crw/refined/brt_do_0m_250m_dail_seas_ann.rds" ,
test_data = do_test_daily_seasonal_annual_crw)
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862871
Residual.Deviance 0.6163731
Correlation 0.8037426
AUC 0.9552000
Per.Expl 55.5378474
cvDeviance 0.8577154
cvCorrelation 0.6617449
cvAUC 0.8786300
cvPer.Expl 38.1285877
[1] "Relative influence of predictor variables"
rel.inf
o2_mean_0m 17.645183
o2_mean_250m_ann 17.206276
o2_mean_0m_seas 10.345249
temp_mean 7.771990
o2_mean_250m_seas 7.199949
bathy_mean 7.171107
sal_mean 6.275883
chl_mean 5.065733
o2_mean_0m_ann 4.840965
o2_mean_250m 4.413205
ssh_mean 4.028459
mld_mean 3.069241
bathy_sd 2.700490
pred_var 2.266270
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 3 temp_mean 1 o2_mean_0m 510.24
2 13 o2_mean_0m_ann 3 temp_mean 283.65
3 12 o2_mean_250m_seas 4 sal_mean 254.65
4 11 o2_mean_0m_seas 1 o2_mean_0m 212.12
5 4 sal_mean 3 temp_mean 189.20
6 11 o2_mean_0m_seas 5 ssh_mean 181.69
7 7 bathy_mean 3 temp_mean 158.82
8 13 o2_mean_0m_ann 4 sal_mean 151.35
9 12 o2_mean_250m_seas 9 o2_mean_250m 148.24
10 5 ssh_mean 4 sal_mean 132.70
[1] "External percent deviance explained"
[1] 0.5114889
[1] "TPR"
[1] 0.7177333
[1] "TSS"
[1] 0.7216508
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
5400 iterations were performed.
There were 14 predictors of which 14 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.3235065 0.7685859 0.9368822 1.002484 0.5114889 0.5553785
explore_brt (mod_file_path = "data/brt/mod_outputs/background/refined/brt_agi_0m_250m_dail_seas_ann.rds" ,
test_data = agi_test_daily_seasonal_annual_back)
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862896
Residual.Deviance 0.2010998
Correlation 0.9542854
AUC 0.9970000
Per.Expl 85.4936623
cvDeviance 0.4550922
cvCorrelation 0.8583417
cvAUC 0.9666300
cvPer.Expl 67.1719242
[1] "Relative influence of predictor variables"
rel.inf
bathy_mean 30.944500
temp_mean 20.937253
AGI_250m_seas 10.045099
AGI_0m 7.256943
ssh_mean 6.121208
AGI_250m_ann 4.606362
sal_mean 4.514486
AGI_0m_seas 4.190922
chl_mean 3.030145
AGI_250m 2.700179
bathy_sd 2.167797
AGI_0m_ann 2.088657
mld_mean 1.396450
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 8 AGI_0m 2 temp_mean 4136.97
2 6 bathy_mean 3 sal_mean 529.40
3 3 sal_mean 2 temp_mean 284.85
4 13 AGI_250m_ann 3 sal_mean 275.46
5 11 AGI_250m_seas 6 bathy_mean 272.17
6 8 AGI_0m 4 ssh_mean 255.06
7 13 AGI_250m_ann 12 AGI_0m_ann 244.55
8 12 AGI_0m_ann 10 AGI_0m_seas 213.62
[1] "External percent deviance explained"
[1] 0.8167617
[1] "TPR"
[1] 0.7449762
[1] "TSS"
[1] 0.9286281
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
9000 iterations were performed.
There were 13 predictors of which 13 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.1797948 0.9348405 0.9907361 0.9974215 0.8167617 0.8549366
explore_brt (mod_file_path = "data/brt/mod_outputs/background/refined/brt_do_0m_250m_dail_seas_ann.rds" ,
test_data = do_test_daily_seasonal_annual_back)
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862943
Residual.Deviance 0.2038413
Correlation 0.9528997
AUC 0.9971000
Per.Expl 85.2959607
cvDeviance 0.4624112
cvCorrelation 0.8532735
cvAUC 0.9663100
cvPer.Expl 66.6440829
[1] "Relative influence of predictor variables"
rel.inf
bathy_mean 27.1431098
o2_mean_0m 21.2576581
o2_mean_250m_seas 14.6524628
o2_mean_0m_seas 10.3343085
o2_mean_250m_ann 3.7807478
chl_mean 3.5499333
temp_mean 3.4191758
o2_mean_250m 3.3145643
o2_mean_0m_ann 2.9473000
sal_mean 2.8866095
ssh_mean 2.6457934
bathy_sd 1.7336484
mld_mean 1.5223918
pred_var 0.8122965
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 3 temp_mean 1 o2_mean_0m 550.04
2 12 o2_mean_250m_seas 4 sal_mean 330.52
3 5 ssh_mean 4 sal_mean 277.30
4 7 bathy_mean 2 chl_mean 228.67
5 13 o2_mean_0m_ann 3 temp_mean 217.07
6 7 bathy_mean 3 temp_mean 200.66
7 11 o2_mean_0m_seas 5 ssh_mean 200.53
8 2 chl_mean 1 o2_mean_0m 197.61
9 7 bathy_mean 5 ssh_mean 180.78
10 14 o2_mean_250m_ann 7 bathy_mean 176.73
[1] "External percent deviance explained"
[1] 0.8042935
[1] "TPR"
[1] 0.7442145
[1] "TSS"
[1] 0.9193297
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
8400 iterations were performed.
There were 14 predictors of which 14 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.1882474 0.9277648 0.9891198 0.9997909 0.8042935 0.8529596
Final models w/o random predictor variables
explore_brt (mod_file_path = "data/brt/mod_outputs/final_mods/brt_base_0m_dail_no_wind.rds" ,
test_data = base_test_daily_crw)
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862823
Residual.Deviance 0.8160352
Correlation 0.7038050
AUC 0.9065000
Per.Expl 41.1349935
cvDeviance 0.9973663
cvCorrelation 0.5753416
cvAUC 0.8287700
cvPer.Expl 28.0546057
[1] "Relative influence of predictor variables"
rel.inf
bathy_mean 30.831647
temp_mean 22.940454
sal_mean 12.777564
chl_mean 10.523532
bathy_sd 7.978901
ssh_mean 7.803239
mld_mean 7.144664
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 6 bathy_mean 2 temp_mean 730.22
2 4 ssh_mean 2 temp_mean 560.59
[1] "External percent deviance explained"
[1] 0.3774799
[1] "TPR"
[1] 0.6925418
[1] "TSS"
[1] 0.5960813
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
4300 iterations were performed.
There were 7 predictors of which 7 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.3740063 0.6699421 0.8864212 1.002138 0.3774799 0.4113499
explore_brt (mod_file_path = "data/brt/mod_outputs/final_mods/brt_do_0m_60m_250m_dail_seas_ann_no_wind.rds" ,
test_data = do_test_daily_seasonal_annual_crw)
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862871
Residual.Deviance 0.6004545
Correlation 0.8099189
AUC 0.9575000
Per.Expl 56.6861326
cvDeviance 0.8423891
cvCorrelation 0.6697309
cvAUC 0.8832400
cvPer.Expl 39.2341516
[1] "Relative influence of predictor variables"
rel.inf
o2_mean_0m 17.027705
o2_mean_250m_ann 15.896634
o2_mean_0m_seas 9.384843
temp_mean 7.043452
o2_mean_250m_seas 6.239946
o2_mean_60m_ann 5.633626
bathy_mean 5.169538
o2_mean_60m_seas 5.037047
sal_mean 4.915712
o2_mean_250m 4.514210
chl_mean 4.181528
o2_mean_0m_ann 3.503057
o2_mean_60m 3.087269
ssh_mean 3.035695
mld_mean 2.872809
bathy_sd 2.456927
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 3 temp_mean 1 o2_mean_0m 421.52
2 11 o2_mean_0m_seas 1 o2_mean_0m 256.74
3 14 o2_mean_0m_ann 4 sal_mean 210.40
4 13 o2_mean_250m_seas 10 o2_mean_250m 190.15
5 11 o2_mean_0m_seas 5 ssh_mean 144.83
6 7 bathy_mean 3 temp_mean 138.68
7 5 ssh_mean 4 sal_mean 124.56
8 4 sal_mean 3 temp_mean 115.46
9 11 o2_mean_0m_seas 4 sal_mean 107.40
10 15 o2_mean_60m_ann 7 bathy_mean 105.21
11 9 o2_mean_60m 7 bathy_mean 105.03
12 12 o2_mean_60m_seas 9 o2_mean_60m 94.94
13 9 o2_mean_60m 8 bathy_sd 94.79
[1] "External percent deviance explained"
[1] 0.5236399
[1] "TPR"
[1] 0.7194688
[1] "TSS"
[1] 0.7276229
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
5400 iterations were performed.
There were 16 predictors of which 16 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.3188455 0.7760116 0.9403423 1.001684 0.5236399 0.5668613
explore_brt (mod_file_path = "data/brt/mod_outputs/final_mods/brt_agi_0m_60m_250m_dail_seas_ann_no_wind.rds" ,
test_data = agi_test_daily_seasonal_annual_crw)
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862903
Residual.Deviance 0.5858010
Correlation 0.8161710
AUC 0.9602000
Per.Expl 57.7432667
cvDeviance 0.8309178
cvCorrelation 0.6751996
cvAUC 0.8859900
cvPer.Expl 40.0617726
[1] "Relative influence of predictor variables"
rel.inf
temp_mean 13.808659
AGI_250m_ann 13.248314
AGI_0m 11.562619
bathy_mean 8.434040
AGI_0m_seas 7.508202
sal_mean 5.742767
AGI_250m_seas 5.436860
AGI_60m_ann 4.943205
AGI_60m_seas 4.543656
AGI_0m_ann 4.409229
AGI_250m 4.252699
ssh_mean 4.180552
chl_mean 3.804112
bathy_sd 2.999297
mld_mean 2.623939
AGI_60m 2.501850
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 8 AGI_0m 2 temp_mean 1760.25
2 15 AGI_60m_ann 11 AGI_0m_seas 315.74
3 8 AGI_0m 6 bathy_mean 235.87
4 11 AGI_0m_seas 10 AGI_250m 189.96
5 16 AGI_250m_ann 14 AGI_0m_ann 183.28
6 8 AGI_0m 4 ssh_mean 145.41
7 8 AGI_0m 3 sal_mean 141.05
8 13 AGI_250m_seas 10 AGI_250m 109.75
9 14 AGI_0m_ann 11 AGI_0m_seas 106.67
10 9 AGI_60m 6 bathy_mean 98.24
11 12 AGI_60m_seas 4 ssh_mean 96.58
12 14 AGI_0m_ann 7 bathy_sd 92.19
13 15 AGI_60m_ann 4 ssh_mean 73.03
[1] "External percent deviance explained"
[1] 0.5261534
[1] "TPR"
[1] 0.719393
[1] "TSS"
[1] 0.7302788
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
5750 iterations were performed.
There were 16 predictors of which 16 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.3183118 0.77655 0.940186 1.009582 0.5261534 0.5774327
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862871
Residual.Deviance 0.6216714
Correlation 0.7997202
AUC 0.9530000
Per.Expl 55.1556520
cvDeviance 0.8550097
cvCorrelation 0.6633890
cvAUC 0.8795600
cvPer.Expl 38.3237656
[1] "Relative influence of predictor variables"
rel.inf
o2_mean_0m 18.096412
o2_mean_250m_ann 17.371934
o2_mean_0m_seas 10.555409
o2_mean_250m_seas 8.392439
temp_mean 8.008430
sal_mean 6.610190
bathy_mean 6.609148
chl_mean 5.117774
o2_mean_0m_ann 4.922417
o2_mean_250m 4.500747
ssh_mean 3.912119
mld_mean 3.194189
bathy_sd 2.708790
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 3 temp_mean 1 o2_mean_0m 666.21
2 10 o2_mean_0m_seas 1 o2_mean_0m 368.62
3 11 o2_mean_250m_seas 4 sal_mean 296.11
4 12 o2_mean_0m_ann 3 temp_mean 248.07
5 11 o2_mean_250m_seas 9 o2_mean_250m 221.72
6 4 sal_mean 3 temp_mean 220.04
7 10 o2_mean_0m_seas 5 ssh_mean 188.99
8 7 bathy_mean 3 temp_mean 180.12
[1] "External percent deviance explained"
[1] 0.5113989
[1] "TPR"
[1] 0.7174051
[1] "TSS"
[1] 0.712829
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
5300 iterations were performed.
There were 13 predictors of which 13 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.3240668 0.7672517 0.9361937 1.003678 0.5113989 0.5515565
explore_brt (mod_file_path = "data/brt/mod_outputs/final_mods/brt_agi_0m_250m_dail_seas_ann.rds" ,
test_data = agi_test_daily_seasonal_annual_crw)
[1] "Model performance metrics"
Model 1
Total.Deviance 1.3862903
Residual.Deviance 0.6227688
Correlation 0.7983866
AUC 0.9524000
Per.Expl 55.0765938
cvDeviance 0.8514835
cvCorrelation 0.6647368
cvAUC 0.8804000
cvPer.Expl 38.5782656
[1] "Relative influence of predictor variables"
rel.inf
AGI_250m_ann 15.624829
temp_mean 14.313855
AGI_0m 12.578555
bathy_mean 10.022586
AGI_0m_seas 8.257366
sal_mean 7.104914
AGI_250m_seas 5.966422
ssh_mean 5.289787
AGI_0m_ann 4.800005
AGI_250m 4.606867
chl_mean 4.471600
bathy_sd 4.007598
mld_mean 2.955618
[1] "Partial plots"
[1] "Top most important pairwise interactions as identified by the model"
var1.index var1.names var2.index var2.names int.size
1 8 AGI_0m 2 temp_mean 2100.42
2 10 AGI_0m_seas 9 AGI_250m 333.68
3 8 AGI_0m 6 bathy_mean 264.37
4 11 AGI_250m_seas 9 AGI_250m 150.53
5 8 AGI_0m 4 ssh_mean 145.22
6 12 AGI_0m_ann 10 AGI_0m_seas 128.88
7 13 AGI_250m_ann 12 AGI_0m_ann 127.32
8 11 AGI_250m_seas 3 sal_mean 106.97
[1] "External percent deviance explained"
[1] 0.5028644
[1] "TPR"
[1] 0.7155138
[1] "TSS"
[1] 0.7032217
[1] "Model evaluation using a 75/25 train/test data split"
gbm::gbm(formula = y.data ~ ., distribution = as.character(family),
data = x.data, weights = site.weights, var.monotone = var.monotone,
n.trees = target.trees, interaction.depth = tree.complexity,
shrinkage = learning.rate, bag.fraction = bag.fraction, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
5350 iterations were performed.
There were 13 predictors of which 13 had non-zero influence.
RMSE Cor C-index PredRatio DevianceExplained PseudoR2
1 0.3280184 0.759673 0.932425 1.008251 0.5028644 0.5507659
HSI maps
These dates run from January to December of each year, but I am happy to modify this if there are better ways to categorize/plot ENSO phases.
Strong La Niña year (-1.5 to -1.9 temperature anomaly; 2011)
Moderate/strong El Niño year (1.5 to 1.9 temperature anomaly; 2009)
Strong La Niña year (-1.5 to -1.9 temperature anomaly; 2011)
Moderate/strong El Niño year (1.5 to 1.9 temperature anomaly; 2009)
HSI difference maps
Strong La Niña year (-1.5 to -1.9 temperature anomaly; 2011)
Moderate/strong El Niño year (1.5 to 1.9 temperature anomaly; 2009)
AGI maps
Strong La Niña year (-1.5 to -1.9 temperature anomaly; 2011)
Moderate/strong El Niño year (1.5 to 1.9 temperature anomaly; 2009)
Model evaluation
Predictor variable relative importance