
predict_average_fn() does simple imputation and flat extrapolation using averages grouped by average_cols.


  average_cols = NULL,
  weight_col = NULL,
  flat_extrap = TRUE,
  test_col = NULL,
  group_col = NULL,
  obs_filter = NULL,
  pred_col = "pred",
  sort_col = NULL,
  sort_descending = FALSE,
  error_correct = FALSE,
  error_correct_cols = NULL,
  shift_trend = FALSE



Data frame of model data.


Name of column to extrapolate/interpolate.


Name(s) of the column(s) used to group data for averaging, such as regions. If missing, the global average of the data is used for infilling.


Name of the column of weights to be used in averaging, such as country population.
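The grouped, weighted averaging described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the data frame, its column names (`region`, `value`, `pop`), and the output column `pred` are hypothetical and only show the idea of filling missing values with a weighted group mean.

```r
# Hypothetical toy data: two regions, each with one missing value
df <- data.frame(
  region = c("A", "A", "A", "B", "B", "B"),
  value  = c(10, 20, NA, 1, NA, 3),
  pop    = c(1, 3, 2, 1, 1, 1)
)

# Rows that have an observed value
obs <- !is.na(df$value)

# Population-weighted mean of the observed values within each region
group_means <- sapply(
  split(df[obs, ], df$region[obs]),
  function(g) weighted.mean(g$value, g$pop)
)

# Keep observed values; fill missing rows with their region's weighted mean
df$pred <- ifelse(obs, df$value, group_means[as.character(df$region)])
```

Dropping the weights (an unweighted `mean()`) or the grouping (a single overall mean) gives the simpler variants that correspond to leaving `weight_col` or `average_cols` as NULL.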


Logical value determining whether or not to flat extrapolate using the latest average for missing rows with no data available.


Name of logical column specifying which response values to remove for testing the model's predictive accuracy. If NULL, ignored. See model_error() for details on the methods and metrics returned.


Column name(s) of group(s) to use in dplyr::group_by() when supplying type, calculating mean absolute scaled error on data involving time series, and if group_models, then fitting and predicting models too. If NULL, not used. Defaults to "iso3".


String value of the form "logical operator integer" that specifies the number of observations required to fit the model and replace observations with predicted values. This is done in conjunction with group_col. So, if group_col = "iso3" and obs_filter = ">= 5", then for this model, predictions will only be used for iso3 values that have 5 or more observations. Possible logical operators to use are >, >=, <, <=, ==, and !=.

If `group_models = FALSE`, then `obs_filter` is only used to determine when
predicted values replace observed values but **is not** used to restrict values
from being used in model fitting. If `group_models = TRUE`, then a model
is only fit for a group if it meets the `obs_filter` requirements. This provides
speed benefits, particularly when running INLA time series using `predict_inla()`.
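As a rough illustration of how a string such as ">= 5" could be turned into a per-group filter, the base R sketch below splits the string into an operator and an integer and applies it to observation counts. The parsing approach, the assumption that operator and integer are separated by whitespace, and the counts in `n_obs` are all hypothetical; the package may parse `obs_filter` differently.

```r
obs_filter <- ">= 5"

# Split into operator and threshold (assumes whitespace between the two)
parts <- strsplit(trimws(obs_filter), "\\s+")[[1]]
op <- match.fun(parts[1])          # look up the comparison function, e.g. `>=`
threshold <- as.integer(parts[2])

# Hypothetical number of observations per iso3 group
n_obs <- c(AFG = 7, BRA = 5, CHE = 2)

# TRUE for groups that meet the filter, FALSE otherwise
keep <- op(n_obs, threshold)
```

Groups where `keep` is FALSE would be the ones whose predictions are not used.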

Column name to store predicted value.


Column name(s) to use to dplyr::arrange() the data prior to supplying type and calculating mean absolute scaled error on data involving time series. If NULL, not used. Defaults to "year".


Logical value on whether the sorted values from sort_col should be sorted in descending order. Defaults to FALSE.


Logical value indicating whether or not mean error should be used to adjust predicted values. If TRUE, the mean error between observed and predicted data points will be used to adjust predictions. If error_correct_cols is not NULL, mean error will be used within those groups instead of overall mean error.


Column names of data frame to group by when applying error correction to the predicted values.


Logical value specifying whether or not to shift predictions so that the trend matches up to the last observation. If error_correct and shift_trend are both TRUE, shift_trend takes precedence.
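The difference between the two adjustments can be sketched in base R for a single series. This is an illustration under stated assumptions, not the package's code: the vectors are made up, and the sign convention for mean error (observed minus predicted) is assumed.

```r
# Hypothetical series: three observed values followed by two missing ones
obs  <- c(10, 12, 13, NA, NA)
pred <- c(9, 10, 11, 12, 13)

# Error correction: add the mean of (observed - predicted) to every prediction
mean_err <- mean(obs - pred, na.rm = TRUE)
pred_corrected <- pred + mean_err

# Trend shift: move predictions so they pass through the last observed value
last <- max(which(!is.na(obs)))
pred_shifted <- pred + (obs[last] - pred[last])
```

In this sketch the error-corrected series is offset by the average gap over all observed points, while the shifted series matches the last observation exactly and keeps the predicted trend after it.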


A data frame.