predict_average_fn() does simple imputation and flat extrapolation
using averages grouped by average_cols.
Usage
predict_average_fn(
df,
col,
average_cols = NULL,
weight_col = NULL,
flat_extrap = TRUE,
test_col = NULL,
group_col = NULL,
obs_filter = NULL,
pred_col = "pred",
sort_col = NULL,
sort_descending = FALSE,
error_correct = FALSE,
error_correct_cols = NULL,
shift_trend = FALSE
)
Arguments
- df
Data frame of model data.
- col
Name of column to extrapolate/interpolate.
- average_cols
Column name(s) of column(s) for use in grouping data for averaging, such as regions. If missing, uses global average of the data for infilling.
- weight_col
Column name of column of weights to be used in averaging, such as country population.
- flat_extrap
Logical value determining whether or not to flat extrapolate using the latest average for missing rows with no data available.
- test_col
Name of logical column specifying which response values to remove for testing the model's predictive accuracy. If NULL, ignored. See model_error() for details on the methods and metrics returned.
- group_col
Column name(s) of group(s) to use in dplyr::group_by() when supplying type, calculating mean absolute scaled error on data involving time series, and, if group_models, when fitting and predicting models too. If NULL, not used. Defaults to "iso3".
- obs_filter
String value of the form "logical operator integer" that specifies the number of observations required to fit the model and replace observations with predicted values. This is done in conjunction with group_col. So, if group_col = "iso3" and obs_filter = ">= 5", then for this model, predictions will only be used for iso3 values that have 5 or more observations. Possible logical operators to use are >, >=, <, <=, ==, and !=.
If `group_models = FALSE`, then `obs_filter` is only used to determine when predicted values replace observed values but **is not** used to restrict values from being used in model fitting. If `group_models = TRUE`, then a model is only fit for a group if it meets the `obs_filter` requirements. This provides speed benefits, particularly when running INLA time series using `predict_inla()`.
- pred_col
Column name to store predicted value.
- sort_col
Column name(s) to use to dplyr::arrange() the data prior to supplying type and calculating mean absolute scaled error on data involving time series. If NULL, not used. Defaults to "year".
- sort_descending
Logical value on whether the sorted values from sort_col should be sorted in descending order. Defaults to FALSE.
- error_correct
Logical value indicating whether or not mean error should be used to adjust predicted values. If TRUE, the mean error between observed and predicted data points will be used to adjust predictions. If error_correct_cols is not NULL, mean error will be used within those groups instead of overall mean error.
- error_correct_cols
Column names of data frame to group by when applying error correction to the predicted values.
- shift_trend
Logical value specifying whether or not to shift predictions so that the trend matches up to the last observation. If error_correct and shift_trend are both TRUE, shift_trend takes precedence.
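
The difference between error_correct and shift_trend can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data frame d, the column names obs, pred_ec, and pred_shift, and the sign convention (observed minus predicted) are assumptions made for illustration only.

```r
# Toy series for a single group, already sorted by year.
# obs is NA where there is no observation; pred is a model prediction.
d <- data.frame(
  year = 2018:2022,
  obs  = c(10, 12, 13, NA, NA),
  pred = c(9, 10, 11, 12, 13)
)

# Error correction (assumed convention): add the mean of (obs - pred),
# taken over rows that have an observation, to every prediction.
mean_error <- mean(d$obs - d$pred, na.rm = TRUE)
d$pred_ec <- d$pred + mean_error

# Trend shift: add a constant offset so that the prediction equals the
# observation at the last observed row, keeping the predicted trend.
last_obs <- max(which(!is.na(d$obs)))
d$pred_shift <- d$pred + (d$obs[last_obs] - d$pred[last_obs])

print(d)
```

In this sketch both adjustments are constant offsets. They differ only in which residuals define the offset: all observed rows for error correction, or the last observed row alone for the trend shift.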

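
As a rough illustration of averaging within average_cols using weight_col, followed by flat extrapolation, the base R sketch below works through a toy example. It is only a sketch under stated assumptions and does not reproduce the internals of predict_average_fn(): the data, the helper names group_weighted_mean() and carry_forward(), the choice of region and year as the averaging columns, and the decision to replace only missing values are all hypothetical.

```r
# Toy data: three countries in two regions; NA marks values to infill.
df <- data.frame(
  iso3   = rep(c("AAA", "BBB", "CCC"), each = 3),
  region = rep(c("R1", "R1", "R2"), each = 3),
  year   = rep(2019:2021, times = 3),
  value  = c(10, 20, NA, 30, NA, NA, 5, 7, NA),
  pop    = rep(c(1, 3, 2), each = 3),
  stringsAsFactors = FALSE
)

# Weighted mean over observed values; NA if the group has no data.
group_weighted_mean <- function(x, w) {
  ok <- !is.na(x) & !is.na(w)
  if (!any(ok)) return(NA_real_)
  sum(x[ok] * w[ok]) / sum(w[ok])
}

# Carry the latest non-missing value forward (flat extrapolation).
carry_forward <- function(x) {
  last <- NA_real_
  for (i in seq_along(x)) {
    if (is.na(x[i])) x[i] <- last else last <- x[i]
  }
  x
}

# One population-weighted average per region and year.
avgs <- unique(df[c("region", "year")])
avgs$avg <- unname(mapply(function(r, y) {
  i <- df$region == r & df$year == y
  group_weighted_mean(df$value[i], df$pop[i])
}, avgs$region, avgs$year))

# Flat extrapolate the averages within each region, in year order.
avgs <- avgs[order(avgs$region, avgs$year), ]
avgs$avg <- ave(avgs$avg, avgs$region, FUN = carry_forward)

# Attach the averages and fill only the missing values.
out <- merge(df, avgs, by = c("region", "year"))
out$pred <- ifelse(is.na(out$value), out$avg, out$value)

print(out)
```

Here region R1 has no observations in 2021, so its 2020 average is carried forward, which is the situation flat_extrap describes.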