Fit INLA model to data — fit_inla_model

Used within predict_inla(), this function fits the model to the data frame, whether the model is fit across the entire data frame or to each group individually. The data is filtered prior to fitting, the model(s) are fit, and fitted values are then generated on the original data frame.

Usage

fit_inla_model(
  df,
  formula,
  control.predictor,
  ...,
  formula_vars,
  test_col,
  group_col,
  group_models,
  obs_filter,
  sort_col,
  sort_descending,
  pred_col,
  pred_upper_col,
  pred_lower_col,
  filter_na,
  ret,
  error_correct,
  error_correct_cols,
  shift_trend
)

Arguments

df

Data frame of model data.

formula

A formula that will be supplied to the model, such as y~x.

control.predictor

Used to set compute = TRUE to ensure that the posterior marginals of the fitted values are obtained and the mean and standard deviation of the fitted values returned for use in the infilling and predictions. Additional arguments can be passed in the control.predictor list, but must always include compute = TRUE. See INLA::control.predictor() for details.
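As a minimal sketch of the "must always include compute = TRUE" requirement, the snippet below merges user-supplied options with compute = TRUE using base R only. The helper name with_compute() is hypothetical and is not part of this package or of INLA; link = 1 is used only as an example of an additional option.

```r
# Hypothetical helper (not part of the package): keep any options the caller
# supplied, but always force compute = TRUE.
with_compute <- function(control.predictor = list()) {
  utils::modifyList(control.predictor, list(compute = TRUE))
}

# A caller passes an extra option and, by mistake, compute = FALSE.
cp <- with_compute(list(link = 1, compute = FALSE))
str(cp)
```

The resulting list could then be passed as the control.predictor argument.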

...

Additional arguments passed to INLA::inla().

formula_vars

Variables included in the model formula, generated by all.vars(formula).
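For illustration, all.vars() can be applied to a formula without loading INLA, since the formula is only parsed and not evaluated. The formula below, including the f(year, model = "rw1") term, is an assumed example rather than one taken from the package.

```r
# Example formula with an INLA-style latent term; it is only parsed here.
formula <- y ~ x + f(year, model = "rw1")

# Function names such as f and string arguments such as "rw1" are not
# variables, so only the column names remain.
formula_vars <- all.vars(formula)
formula_vars
```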

test_col

Name of logical column specifying which response values to remove for testing the model's predictive accuracy. If NULL, ignored. See model_error() for details on the methods and metrics returned.

group_col

Column name(s) of group(s) to use in dplyr::group_by() when supplying type, when calculating mean absolute scaled error on data involving time series, and, if group_models is TRUE, when fitting and predicting models. If NULL, not used. Defaults to "iso3".

group_models

Logical. If TRUE, a separate model is fit and used for prediction for each group defined by group_col. If FALSE, a general model is fit across the entire data frame.
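The difference between the two modes can be sketched with base R. To keep the example runnable without INLA, stats::lm() is swapped in as a stand-in for INLA::inla(); the data and column names are made up for illustration.

```r
# Toy data: two groups with opposite trends.
df <- data.frame(
  iso3  = rep(c("AFG", "ALB"), each = 5),
  year  = rep(2001:2005, times = 2),
  value = c(1, 2, 3, 4, 5, 10, 8, 6, 4, 2)
)

# Analogue of group_models = FALSE: one model across the whole data frame.
overall <- lm(value ~ year, data = df)

# Analogue of group_models = TRUE: one model per group.
per_group <- lapply(split(df, df$iso3), function(d) lm(value ~ year, data = d))

overall_slope <- coef(overall)[["year"]]
group_slopes  <- sapply(per_group, function(m) coef(m)[["year"]])
overall_slope
group_slopes
```

The single model averages over the groups, while the per-group models recover each group's own trend.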

obs_filter

String value of the form "logical operator integer" that specifies the number of observations required to fit the model and replace observations with predicted values. This is done in conjunction with group_col. So, if group_col = "iso3" and obs_filter = ">= 5", then for this model, predictions will only be used for iso3 values that have 5 or more observations. Possible logical operators to use are >, >=, <, <=, ==, and !=.

If `group_models = FALSE`, then `obs_filter` is only used to determine when
predicted values replace observed values but **is not** used to restrict values
from being used in model fitting. If `group_models = TRUE`, then a model
is only fit for a group if it meets the `obs_filter` requirements. This provides
speed benefits, particularly when running INLA time series using `predict_inla()`.
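One way such a string could be interpreted is sketched below in base R. This is not the package's implementation; it assumes that "observations" means non-missing values of the response within each group, and the data are invented.

```r
obs_filter <- ">= 5"

# Split "operator integer" into its two parts and validate the operator.
parts   <- strsplit(trimws(obs_filter), "\\s+")[[1]]
allowed <- c(">", ">=", "<", "<=", "==", "!=")
stopifnot(length(parts) == 2, parts[1] %in% allowed)
op         <- match.fun(parts[1])
n_required <- as.integer(parts[2])

df <- data.frame(
  iso3  = rep(c("AFG", "ALB"), times = c(6, 3)),
  value = c(1.2, NA, 1.5, 1.7, 1.9, 2.0, 3.1, NA, 3.4)
)

# Assumption: count non-missing response values per group.
n_obs  <- vapply(split(df$value, df$iso3), function(v) sum(!is.na(v)), integer(1))
passes <- op(n_obs, n_required)
passes
```

Here AFG has 5 observed values and passes, while ALB has 2 and does not.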

sort_col

Column name(s) to use to dplyr::arrange() the data prior to supplying type and calculating mean absolute scaled error on data involving time series. If NULL, not used. Defaults to "year".

sort_descending

Logical value on whether the sorted values from sort_col should be sorted in descending order. Defaults to FALSE.

pred_col

Column name to store predicted value.

pred_upper_col

Column name to store upper bound of confidence interval generated by the predict_... function. This stores the full set of generated values for the upper bound.

pred_lower_col

Column name to store lower bound of confidence interval generated by the predict_... function. This stores the full set of generated values for the lower bound.

filter_na

Character value specifying how, if at all, to filter NA values from the dataset prior to applying the model. By default, only observations with missing predictors are removed. Alternatively, rows can be removed if they have missing dependent or independent variables, or no filtering can be applied at all. Because model fitting and prediction are done in one pass with INLA::inla(), there will be no predictions if observations with missing dependent variables are removed.
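The default behaviour described above can be sketched in base R as follows. This is an illustration under the assumption that the response is the first variable returned by all.vars(); the data are made up.

```r
df <- data.frame(
  y = c(1, NA, 3, 4),
  x = c(10, 20, NA, 40)
)

formula_vars <- all.vars(y ~ x)
predictors   <- formula_vars[-1]  # assumes the response comes first

# Drop rows with a missing predictor, but keep rows with a missing response
# so that they can still receive predictions.
filtered <- df[stats::complete.cases(df[predictors]), , drop = FALSE]
filtered
```

The row with a missing y is retained, which is what allows a one-pass fit to produce a prediction for it.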

ret

Character vector specifying what values the function returns. Defaults to returning a data frame, but can return a vector of model error, the model itself, or a list with all three as components.

error_correct

Logical value indicating whether or not mean error should be used to adjust predicted values. If TRUE, the mean error between observed and predicted data points will be used to adjust predictions. If error_correct_cols is not NULL, mean error will be used within those groups instead of overall mean error.

error_correct_cols

Column names of data frame to group by when applying error correction to the predicted values.
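A grouped mean-error adjustment might look like the sketch below. The sign convention (error taken as observed minus predicted, then added to the prediction) is an assumption, as are the column names region, value, and pred.

```r
df <- data.frame(
  region = c("A", "A", "A", "B", "B", "B"),
  value  = c(10, 12, NA, 20, 19, NA),
  pred   = c(9, 11, 13, 21, 20, 19)
)

# Assumed convention: error = observed - predicted (NA where unobserved).
err <- df$value - df$pred

# Mean error within each group, repeated for every row of that group.
mean_err <- ave(err, df$region, FUN = function(e) mean(e, na.rm = TRUE))

df$pred_corrected <- df$pred + mean_err
df
```

Without grouping columns, the same idea would use a single overall mean(err, na.rm = TRUE).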

shift_trend

Logical value specifying whether or not to shift predictions so that the trend matches up to the last observation. If error_correct and shift_trend are both TRUE, shift_trend takes precedence.
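The idea of shifting a trend to meet the last observation can be sketched as a constant offset. This is an assumed reading of the description, not the package's code, and the data are invented.

```r
df <- data.frame(
  year  = 2001:2006,
  value = c(5, 6, 7, 9, NA, NA),
  pred  = c(5.2, 6.1, 7.0, 8.0, 8.9, 9.8)
)

# Position of the last observed value (data assumed sorted by year).
last_obs <- max(which(!is.na(df$value)))

# Offset that makes the prediction agree with the last observation.
offset <- df$value[last_obs] - df$pred[last_obs]

df$pred_shifted <- df$pred + offset
df
```

After the shift, the predicted series passes through the 2004 observation and continues from there with the same slope.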

Value

List of mdl (fitted model) and df (data frame with fitted values and confidence bounds generated from the model).

Details

If fitting models individually to each group, mdl will never be returned, as the result is instead a large group of models. Otherwise, a list of mdl and df is returned and used within predict_inla().