Abstract: In species distribution models, species frequency is termed
prevalence, and for unbiased samples, prevalence in samples should be similar
to natural species prevalence. However, modelers commonly adjust sampling
prevalence, producing a modeling prevalence that has a different frequency of
occurrences than sampling prevalence. Two issues remain unresolved: (1) the
effect of using sampling prevalence rather than adjusted modeling prevalence
and (2) the modifications needed in thresholds, which convert continuous
probabilities to discrete presence or absence predictions, to account for
prevalence. We examined effects of prevalence and
thresholds and two types of pseudoabsences on model accuracy. Use of sampling
prevalence produced similar models compared to use of adjusted modeling
prevalences. The mean correlation between probabilities predicted at the
lowest (0.33) and highest (0.83) modeling prevalences was 0.86. Mean predicted
probability values increased with increasing prevalence; therefore, unlike
constant thresholds, varying the threshold to match prevalence values was
effective in holding true positive rate, true negative rate, and species
prediction areas relatively constant for every modeling prevalence. The area
under the curve (AUC) values appeared to be as informative as sensitivity and
specificity when surveyed pseudoabsences were used as absent cases, but when
the entire study area was coded as absent, AUC values reflected the area of
predicted presence. Less frequent species had greater AUC values when
pseudoabsences represented the study background. Modeling prevalence had a
mild impact on species distribution models and accuracy assessment metrics
when the threshold varied with prevalence. AUC values may be misinterpreted
when they are based on background absences, because such values correlate
with species frequency.
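
The threshold adjustment described in the abstract can be illustrated with a minimal sketch. It assumes that "varying the threshold to match prevalence" means setting the presence/absence cutoff equal to the modeling prevalence, which is one reading of the text and not a confirmed description of the authors' procedure. The function names, labels, and probabilities are hypothetical and chosen only for illustration.

```python
def modeling_prevalence(labels):
    """Fraction of modeled cases that are presences (label 1)."""
    return sum(labels) / len(labels)


def sensitivity_specificity(probs, labels, threshold):
    """Cut continuous probabilities at `threshold` and return
    (true positive rate, true negative rate)."""
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        predicted_present = p >= threshold
        if y == 1 and predicted_present:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_present:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: 2 presences and 4 pseudoabsences (prevalence 1/3).
# Predicted probabilities sit low because presences are the minority class.
labels = [1, 1, 0, 0, 0, 0]
probs = [0.60, 0.40, 0.20, 0.10, 0.35, 0.30]

prevalence = modeling_prevalence(labels)

# Constant threshold of 0.5, regardless of prevalence.
fixed = sensitivity_specificity(probs, labels, 0.5)

# Assumed rule: threshold set equal to the modeling prevalence.
matched = sensitivity_specificity(probs, labels, prevalence)

print("prevalence:", prevalence)
print("fixed 0.5 threshold (sensitivity, specificity):", fixed)
print("prevalence-matched threshold (sensitivity, specificity):", matched)
```

With these made-up values, the constant threshold misses the presence predicted at 0.40, while the prevalence-matched threshold recovers it at the cost of one false positive. This is the kind of trade-off that shifts as mean predicted probabilities rise or fall with modeling prevalence.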