Merge average df with predictions with original data frame
Source: R/utils_average_trend.R
merge_average_df.Rd
Merge average df with predictions with original data frame
Usage
merge_average_df(
  avg_df,
  df,
  response,
  average_cols,
  group_col,
  obs_filter,
  sort_col,
  pred_col,
  pred_upper_col,
  pred_lower_col,
  test_col
)
Arguments
- avg_df
Data frame with average trends.
- df
Data frame of model data.
- response
Column name of response variable.
- average_cols
Column name(s) of column(s) for use in grouping data for averaging, such as regions. If missing, uses global average of the data for infilling.
- group_col
Column name(s) of group(s) to use in `dplyr::group_by()` when supplying type, calculating mean absolute scaled error on data involving time series, and, if `group_models`, fitting and predicting models too. If `NULL`, not used. Defaults to `"iso3"`.
- obs_filter
String value of the form "logical operator integer" that specifies the number of observations required to fit the model and replace observations with predicted values. This is done in conjunction with `group_col`. So, if `group_col = "iso3"` and `obs_filter = ">= 5"`, then for this model, predictions will only be used for `iso3` values that have 5 or more observations. Possible logical operators to use are `>`, `>=`, `<`, `<=`, `==`, and `!=`.
If `group_models = FALSE`, then `obs_filter` is only used to determine when predicted values replace observed values but **is not** used to restrict values from being used in model fitting. If `group_models = TRUE`, then a model is only fit for a group if it meets the `obs_filter` requirements. This provides speed benefits, particularly when running INLA time series using `predict_inla()`.
- sort_col
Column name(s) to use to `dplyr::arrange()` the data prior to supplying type and calculating mean absolute scaled error on data involving time series. If `NULL`, not used. Defaults to `"year"`.
- pred_col
Column name to store predicted value.
- pred_upper_col
Column name to store upper bound of confidence interval generated by the `predict_...` function. This stores the full set of generated values for the upper bound.
- pred_lower_col
Column name to store lower bound of confidence interval generated by the `predict_...` function. This stores the full set of generated values for the lower bound.
- test_col
Name of logical column specifying which response values to remove for testing the model's predictive accuracy. If `NULL`, ignored. See `model_error()` for details on the methods and metrics returned.
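
The interaction between `group_col` and `obs_filter` described above can be illustrated with a small base R sketch. This is not the package's internal implementation: the toy data frame, the `value` column, and the parsing approach are all assumptions made for illustration. It only shows one way a string such as `">= 5"` could be split into an operator and an integer and then applied to per-group observation counts.

```r
# Hypothetical illustration only; not how merge_average_df() works internally.
# Toy data: "AFG" has 6 observations, "BEL" has 3.
df <- data.frame(
  iso3  = c(rep("AFG", 6), rep("BEL", 3)),
  year  = c(2010:2015, 2010:2012),
  value = c(1, 2, 3, 4, 5, 6, 7, 8, 9)
)

obs_filter <- ">= 5"

# Count non-missing observations of the response within each group.
n_obs <- tapply(!is.na(df$value), df$iso3, sum)

# Split the filter string into its operator and integer parts.
pattern <- "^[[:space:]]*(>=|<=|==|!=|>|<)[[:space:]]*([0-9]+)[[:space:]]*$"
parts <- regmatches(obs_filter, regexec(pattern, obs_filter))[[1]]
op <- match.fun(parts[2])          # e.g. the `>=` function
threshold <- as.integer(parts[3])  # e.g. 5L

# Groups that satisfy the filter; under the documented behaviour, only these
# groups would have predictions used.
keep <- names(n_obs)[op(n_obs, threshold)]
keep
# "AFG"
```

With `obs_filter = ">= 5"`, only `"AFG"` passes, which matches the documented example: predictions are used only for `iso3` values with 5 or more observations.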