Merges predicted data into a data frame. By default, observed values are not replaced with modeled data.
Usage
merge_prediction(
  df,
  response,
  group_col,
  obs_filter,
  sort_col,
  sort_descending,
  pred_col,
  pred_upper_col,
  pred_lower_col,
  upper_col,
  lower_col,
  type_col,
  types,
  source_col,
  source,
  scenario_detail_col,
  scenario_detail,
  replace_obs
)
Arguments
- df
Data frame of model data.
- response
Column name of response variable.
- group_col
Column name(s) of group(s) to use in `dplyr::group_by()` when supplying type, calculating mean absolute scaled error on data involving time series, and if `group_models`, then fitting and predicting models too. If `NULL`, not used. Defaults to `"iso3"`.
- obs_filter
String value of the form "`logical operator` `integer`" that specifies the number of observations required to fit the model and replace observations with predicted values. This is done in conjunction with `group_col`. So, if `group_col = "iso3"` and `obs_filter = ">= 5"`, then for this model, predictions will only be used for `iso3` values that have 5 or more observations. Possible logical operators to use are `>`, `>=`, `<`, `<=`, `==`, and `!=`.
If `group_models = FALSE`, then `obs_filter` is only used to determine when predicted values replace observed values but **is not** used to restrict values from being used in model fitting. If `group_models = TRUE`, then a model is only fit for a group if it meets the `obs_filter` requirements. This provides speed benefits, particularly when running INLA time series using `predict_inla()`.
- sort_col
Column name(s) to use to `dplyr::arrange()` the data prior to supplying type and calculating mean absolute scaled error on data involving time series. If `NULL`, not used. Defaults to `"year"`.
- sort_descending
Logical value on whether the sorted values from `sort_col` should be sorted in descending order. Defaults to `FALSE`.
- pred_col
Column name to store predicted value.
- pred_upper_col
Column name to store upper bound of confidence interval generated by the `predict_...` function. This stores the full set of generated values for the upper bound.
- pred_lower_col
Column name to store lower bound of confidence interval generated by the `predict_...` function. This stores the full set of generated values for the lower bound.
- upper_col
Column name that contains upper bound information, including upper bound of the input data to the model. Values from `pred_upper_col` are put into this column in the exact same way the response is filled by `pred` based on `replace_na` (only when there is a missing value in the response).
- lower_col
Column name that contains lower bound information, including lower bound of the input data to the model. Values from `pred_lower_col` are put into this column in the exact same way the response is filled by `pred` based on `replace_na` (only when there is a missing value in the response).
- type_col
Column name specifying data type.
- types
Vector of length 3 that provides the type labels to assign to data produced by the model. These values are only used to fill in type values where the dependent variable is missing. The first value is given to missing observations that precede the first observation, the second to those between the first and last observations, and the third to those following the final observation.
- source_col
Column name containing source information for the data frame. If provided, the argument in `source` is used to fill in where predictions have filled in missing data.
- source
Source to add to missing values.
- scenario_detail_col
Column name containing scenario_detail information for the data frame. If provided, the argument in `scenario_detail` is used to fill in where predictions have filled in missing data.
- scenario_detail
Scenario details to add to missing values (usually the name of the model being used to generate the projection, optionally with relevant parameters).
- replace_obs
Character value specifying how, if at all, observations should be replaced by fitted values. Defaults to replacing only missing values, but can be used to replace all values or none.
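The interaction between `obs_filter`, `types`, and the default behavior of filling only missing values can be illustrated with a small base R sketch. This is not the package's implementation: `parse_obs_filter()` is a hypothetical helper, the data and the type labels are invented for illustration, and the sketch assumes that the second element of `types` applies to gaps between the first and last observation of a group.

```r
# Hypothetical helper (not part of the package): turn a string such as
# ">= 5" into a function that tests an observation count.
parse_obs_filter <- function(obs_filter) {
  pattern <- "^[[:space:]]*(>=|<=|==|!=|>|<)[[:space:]]*([0-9]+)[[:space:]]*$"
  m <- regmatches(obs_filter, regexec(pattern, obs_filter))[[1]]
  if (length(m) != 3) stop("obs_filter must look like '>= 5'")
  op <- match.fun(m[2])   # e.g. the `>=` function
  n <- as.integer(m[3])
  function(count) op(count, n)
}

# Invented example data, already sorted by year within each group.
df <- data.frame(
  iso3  = rep(c("AAA", "BBB"), each = 6),
  year  = rep(2000:2005, times = 2),
  value = c(NA, 2, 3, NA, 5, NA,     # AAA has 3 observations
            NA, NA, 1, NA, NA, NA),  # BBB has 1 observation
  pred  = c(1.1, 2.1, 3.1, 4.1, 5.1, 6.1,
            0.8, 0.9, 1.0, 1.1, 1.2, 1.3),
  type  = NA_character_,
  stringsAsFactors = FALSE
)

# Illustrative labels: before first obs, between obs, after last obs.
types <- c("backcast", "interpolated", "projected")

# Observations per group, and observations seen so far within each group.
is_obs <- as.numeric(!is.na(df$value))
n_obs  <- ave(is_obs, df$iso3, FUN = sum)
seen   <- ave(is_obs, df$iso3, FUN = cumsum)

# Only groups passing the filter are eligible; only missing rows are filled.
eligible <- parse_obs_filter(">= 3")(n_obs)
fill <- eligible & is.na(df$value)

# 1 = before the first observation, 3 = after the last, 2 = in between.
position <- ifelse(seen == 0, 1L, ifelse(seen == n_obs, 3L, 2L))

df$type[fill]  <- types[position[fill]]
df$value[fill] <- df$pred[fill]
```

In this sketch, group `AAA` passes the `">= 3"` filter, so its three missing years are filled from `pred` and labeled by position, while group `BBB` has a single observation and is left unchanged.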