Linearly interpolate data — predict_simple

predict_simple_fn() performs simple linear interpolation and flat extrapolation on a specified column of a data frame using zoo::na.approx().

Usage

predict_simple_fn(
  df,
  model,
  col,
  test_col = NULL,
  group_col = NULL,
  obs_filter = NULL,
  pred_col = "pred",
  sort_col = NULL,
  sort_descending = FALSE
)

Arguments

df

Data frame of model data.

model

Type of simple extrapolation or interpolation to perform:

forward: Just flat_extrap and linear_interp. (default)
all: All of flat_extrap, linear_interp, and back_extrap.
flat_extrap: Flat extrapolation from latest observed point.
linear_interp: Linear interpolation between observed data points.
back_extrap: Flat extrapolation from first observed data point backwards.
both_extrap: Both flat_extrap and back_extrap.
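
As a rough illustration of what each model value means, the sketch below fills a numeric vector with base R approx(), on which zoo::na.approx() is built. The helper fill_simple() is hypothetical and is not the package's implementation; as a simplification it leaves vectors with fewer than two observed points untouched.

```r
# Hypothetical helper (not part of the package): fill NAs in `y`,
# indexed by `x`, according to one of the documented `model` values.
fill_simple <- function(y, x = seq_along(y), model = "forward") {
  obs <- !is.na(y)
  if (sum(obs) < 2) return(y)  # simplification: approx() needs two points

  # rule = 2 extends the first/last observed value beyond the observed range
  full <- approx(x[obs], y[obs], xout = x, rule = 2)$y

  before <- x < min(x[obs])    # positions before the first observation
  after  <- x > max(x[obs])    # positions after the latest observation
  inside <- !before & !after   # positions between observations

  keep <- switch(model,
    forward       = inside | after,
    all           = rep(TRUE, length(y)),
    flat_extrap   = after,
    linear_interp = inside,
    back_extrap   = before,
    both_extrap   = before | after,
    stop("unknown model: ", model)
  )

  out <- y
  fill <- is.na(y) & keep      # only missing values are ever replaced
  out[fill] <- full[fill]
  out
}

y <- c(NA, 10, NA, 30, NA)
fill_simple(y, model = "linear_interp")  # NA 10 20 30 NA
fill_simple(y, model = "forward")        # NA 10 20 30 30
fill_simple(y, model = "all")            # 10 10 20 30 30
```

Here forward leaves the leading NA in place because it never extrapolates backwards, while all fills every gap.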

col

Name of column to extrapolate/interpolate.

test_col

Name of logical column specifying which response values to remove for testing the model's predictive accuracy. If NULL, ignored. See model_error() for details on the methods and metrics returned.

group_col

Column name(s) of group(s) to use in dplyr::group_by() when supplying type, when calculating mean absolute scaled error on data involving time series, and, if group_models is set, when fitting and predicting models. If NULL, not used. Defaults to "iso3".

obs_filter

String value of the form "logical operator integer" that specifies the number of observations required to fit the model and replace observations with predicted values. This is done in conjunction with group_col. So, if group_col = "iso3" and obs_filter = ">= 5", then for this model, predictions will only be used for iso3 values that have 5 or more observations. Possible logical operators to use are >, >=, <, <=, ==, and !=.

If group_models = FALSE, then obs_filter is only used to determine when predicted values replace observed values; it is not used to restrict which values are used in model fitting. If group_models = TRUE, then a model is only fit for a group if that group meets the obs_filter requirements. This provides speed benefits, particularly when running INLA time series using predict_inla().
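
One way to read the "logical operator integer" format is sketched below in base R. The helper passes_obs_filter() and the example data are hypothetical; the package may parse and apply obs_filter differently.

```r
# Hypothetical helper: check observation counts against a string such as ">= 5".
passes_obs_filter <- function(n_obs, obs_filter) {
  pattern <- "^[[:space:]]*(>=|<=|==|!=|>|<)[[:space:]]*([0-9]+)[[:space:]]*$"
  parts <- regmatches(obs_filter, regexec(pattern, obs_filter))[[1]]
  if (length(parts) != 3) {
    stop("obs_filter must look like '>= 5'")
  }
  op <- match.fun(parts[2])          # turn ">=" into the comparison function
  op(n_obs, as.integer(parts[3]))
}

# Count non-missing observations per group (here grouped by "iso3")
df <- data.frame(
  iso3  = c("AAA", "AAA", "AAA", "BBB"),
  value = c(1, NA, 3, 2)
)
n_obs <- tapply(!is.na(df$value), df$iso3, sum)

passes_obs_filter(n_obs, ">= 2")     # TRUE for AAA, FALSE for BBB
```

Under this reading, only groups for which the check is TRUE would have their predictions used.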

pred_col

Column name to store predicted value.

sort_col

Column name(s) used to sort the data with dplyr::arrange() prior to supplying type and calculating mean absolute scaled error on data involving time series. If NULL, not used. Defaults to "year". For predict_simple(), the first value in sort_col is passed to zoo::na.approx() as xout to ensure linear interpolation is based on sort_col indexing rather than default data frame indexing.
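
The difference between interpolating on sort_col values and on default row positions shows up when the index is irregular. The sketch below uses base R approx() (on which zoo::na.approx() is built) with made-up data; the column names are assumptions for illustration only.

```r
# Made-up data with a gap in years: 2000, 2001, 2005
df <- data.frame(year = c(2005, 2000, 2001), value = c(20, 10, NA))
df <- df[order(df$year), ]   # sort by the assumed sort column "year"

obs <- !is.na(df$value)

# Interpolating on row position treats the rows as equally spaced
by_position <- approx(which(obs), df$value[obs],
                      xout = seq_along(df$value))$y

# Interpolating on year respects the actual spacing between observations
by_year <- approx(df$year[obs], df$value[obs], xout = df$year)$y

by_position  # 10 15 20
by_year      # 10 12 20
```

Because 2001 lies one fifth of the way from 2000 to 2005, the year-based result is 12 rather than the midpoint 15.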

sort_descending

Logical value on whether the sorted values from sort_col should be sorted in descending order. Defaults to FALSE.

Value

A data frame.