Used within predict_forecast(), this function fits the model to the data
frame, working whether the model is being fit across the entire data frame or
being fit to each group individually. Data is filtered prior to fitting,
model(s) are fit, and then fitted values are generated on the original data frame.
Usage
fit_forecast_model(
df,
forecast_function,
response,
...,
test_col,
group_col,
group_models,
obs_filter,
sort_col,
sort_descending,
pred_col,
pred_upper_col,
pred_lower_col,
filter_na,
ret
)
Arguments
- df
Data frame of model data.
- forecast_function
An R function that outputs a forecast object coming from the forecast package. You can directly pass forecast::forecast() to the function, or you can pass other wrappers to it such as forecast::holt() or forecast::ses().
- response
Column name of response variable to be used as the input to the forecast function.
- ...
Additional arguments passed to the forecast function.
- test_col
Name of logical column specifying which response values to remove for testing the model's predictive accuracy. If NULL, ignored. See model_error() for details on the methods and metrics returned.
- group_col
Column name(s) of group(s) to use in dplyr::group_by() when supplying type, when calculating mean absolute scaled error on data involving time series, and, if group_models is TRUE, when fitting and predicting models too. If NULL, not used. Defaults to "iso3".
- group_models
Logical. If TRUE, fits and predicts models individually for each group_col. If FALSE, a general model is fit across the entire data frame.
- obs_filter
String value of the form "logical operator integer" that specifies the number of observations required to fit the model and replace observations with predicted values. This is done in conjunction with group_col. So, if group_col = "iso3" and obs_filter = ">= 5", then for this model, predictions will only be used for iso3 values that have 5 or more observations. Possible logical operators to use are >, >=, <, <=, ==, and !=.
If `group_models = FALSE`, then `obs_filter` is only used to determine when predicted values replace observed values but **is not** used to restrict values from being used in model fitting. If `group_models = TRUE`, then a model is only fit for a group if it meets the `obs_filter` requirements. This provides speed benefits, particularly when running INLA time series using `predict_inla()`.
- sort_col
Column name of column to arrange data by in dplyr::arrange(), prior to filtering for latest contiguous time series and producing the forecast. Not used if NULL, defaults to "year".
- sort_descending
Logical value on whether the sorted values from sort_col should be sorted in descending order. Defaults to FALSE.
- pred_col
Column name to store predicted value.
- pred_upper_col
Column name to store upper bound of confidence interval generated by the predict_... function. This stores the full set of generated values for the upper bound.
- pred_lower_col
Column name to store lower bound of confidence interval generated by the predict_... function. This stores the full set of generated values for the lower bound.
- filter_na
Character value specifying how, if at all, to filter NA values from the dataset prior to applying the model. By default, all observations with missing values are removed, although it can also remove rows only if they have missing dependent or independent variables, or apply no filtering at all.
- ret
Character vector specifying what values the function returns. Defaults to returning a data frame, but can return a vector of model error, the model itself or a list with all 3 as components.
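The obs_filter rule described above can be illustrated with a small base R sketch. The helper groups_meeting_filter() is hypothetical and is not part of the package. It also assumes that "number of observations" means the count of non-missing response values per group, which the documentation does not state explicitly.

```r
# Hypothetical helper (not from the package): parse an obs_filter string such
# as ">= 5" and return the groups whose observation count satisfies it.
groups_meeting_filter <- function(df, group_col, response, obs_filter) {
  # Split the string into a comparison operator and an integer threshold
  pattern <- "^[[:space:]]*(>=|<=|==|!=|>|<)[[:space:]]*([0-9]+)[[:space:]]*$"
  parts <- regmatches(obs_filter, regexec(pattern, obs_filter))[[1]]
  if (length(parts) != 3) {
    stop("obs_filter must look like '>= 5'")
  }
  op <- match.fun(parts[2])    # e.g. the `>=` function
  n <- as.integer(parts[3])    # e.g. 5

  # Assumption: an "observation" is a non-missing response value
  counts <- tapply(!is.na(df[[response]]), df[[group_col]], sum)
  names(counts)[as.vector(op(counts, n))]
}

# Made-up data: AFG has 6 non-missing values, ALB has 2
df <- data.frame(
  iso3 = c(rep("AFG", 6), rep("ALB", 3)),
  value = c(1, 2, 3, 4, 5, 6, 1, NA, 3)
)

groups_meeting_filter(df, "iso3", "value", ">= 5")  # returns "AFG"
```

With group_models = TRUE, only groups returned by a check like this would have a model fit; with group_models = FALSE, the same check would only decide where predictions replace observed values.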
Value
List of mdl (fitted model) and df (data frame with fitted values and confidence bounds generated from the model).
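As a rough illustration of where such predicted values and bounds come from, the sketch below calls forecast::holt() directly and copies the point forecasts and interval bounds into a data frame. It assumes the forecast package is installed. The toy data and the column names pred, pred_upper and pred_lower are placeholders, and this is not necessarily how fit_forecast_model() assembles its own output.

```r
library(forecast)

# Toy annual series; the values are made up for illustration
df <- data.frame(
  year = 2001:2015,
  value = c(10, 12, 13, 15, 18, 21, 22, 25, 27, 30, 31, 34, 36, 39, 41)
)

# Any function returning a forecast object would do; holt() is one wrapper.
# A single interval level keeps the upper and lower bounds to one series each.
fc <- forecast::holt(df$value, h = 3, level = 95)

# Collect point forecasts and bounds under placeholder column names
out <- data.frame(
  year = max(df$year) + seq_along(fc$mean),
  pred = as.numeric(fc$mean),
  pred_upper = as.numeric(fc$upper),
  pred_lower = as.numeric(fc$lower)
)
out
```

The forecast object also carries in-sample fitted values in fc$fitted, one per input observation.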
Details
If fitting models individually to each group, mdl will never be returned, as these are instead a large list of models. Otherwise, a list of mdl and df is returned and used within predict_inla().
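To see why per-group fitting produces a list of models rather than a single mdl, here is a minimal sketch using base R split() and lapply() with forecast::ses(). It assumes the forecast package is installed. The data are invented, and the package's own grouping logic (which the arguments describe in terms of dplyr::group_by()) may differ.

```r
library(forecast)

# Made-up data for two groups, ten years each
df <- data.frame(
  iso3 = rep(c("AFG", "ALB"), each = 10),
  year = rep(2001:2010, times = 2),
  value = c(5, 6, 6, 7, 9, 9, 10, 12, 12, 13,
            20, 19, 21, 22, 22, 24, 23, 25, 26, 27)
)

# One model per group: the result is a named list with one entry per iso3
group_fits <- lapply(split(df, df$iso3), function(g) {
  g <- g[order(g$year), ]         # arrange by the sort column before fitting
  forecast::ses(g$value, h = 2)   # returns a forecast object for this group
})

names(group_fits)
```

With many groups this list grows with the number of groups, which is consistent with the note above that mdl is not returned in that case.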