expand_df() is a wrapper around tidyr::expand_grid() and dplyr::right_join() that can be used to make missing values explicit within a data frame prior to it being passed to a predict_...() function.
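
The underlying pattern can be sketched with tidyr and dplyr alone. This is a minimal illustration of the expand-then-join idea, not the function's actual implementation; the data frame, country codes, and year range below are made up for the example.

```r
library(dplyr)
library(tidyr)

# Hypothetical input: two countries with gaps in their yearly series
df <- tibble(
  iso3  = c("AFG", "AFG", "ALB"),
  year  = c(2016L, 2018L, 2017L),
  value = c(1.2, 1.5, 3.4)
)

# Every iso3/year combination we want represented in the output
grid <- expand_grid(
  iso3 = c("AFG", "ALB", "DZA"),
  year = 2015:2018
)

# right_join() keeps every row of the grid; combinations with no match
# in df receive value = NA, which makes the missing values explicit
expanded <- right_join(df, grid, by = c("iso3", "year")) %>%
  arrange(iso3, year)
```

The result has one row per combination in the grid, with NA in value wherever df had no observation.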

Usage

expand_df(
  df,
  ...,
  response = "value",
  keep_no_obs = TRUE,
  keep_before_obs = FALSE,
  sort_col = "year",
  sort_descending = FALSE,
  group_col = "iso3",
  join_covariates = FALSE
)

Arguments

df

Data frame.

...

Named vectors to pass to tidyr::expand_grid().

response

Column name of the response variable whose missing values will be infilled and projected. Defaults to "value".

keep_no_obs

Logical value indicating whether or not to keep rows in the expanded data frame for groups that have no observed data. Defaults to TRUE. Groups are determined by group_col, if provided.

keep_before_obs

Logical value indicating whether or not, when data is available, to keep rows in the expanded data frame that lie before the first observed point. Defaults to FALSE. The first observed point is determined using sort_col and group_col, if provided.

sort_col

Column name(s) to use to dplyr::arrange() the data prior to supplying type and calculating mean absolute scaled error on data involving time series. If NULL, not used. Defaults to "year".

sort_descending

Logical value indicating whether the values in sort_col should be sorted in descending order. Defaults to FALSE.

group_col

Column name(s) of group(s) to use in dplyr::group_by() when supplying type, when calculating mean absolute scaled error on data involving time series, and, if group_models is set, when fitting and predicting models. If NULL, not used. Defaults to "iso3".

join_covariates

Logical value indicating whether or not to join the final expanded data frame to the covariates_df data frame. If TRUE, iso3 and year must be columns within the input df.

Value

Expanded data frame with explicit missing values.
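
The row-keeping rules described for keep_no_obs and keep_before_obs can be approximated with grouped dplyr filters. The sketch below is one way to reproduce the documented behavior on an already expanded data frame; it is an assumption about the semantics, not the package's code, and the data and object names are hypothetical.

```r
library(dplyr)

# Hypothetical expanded data: AFG has observations, DZA has none
expanded <- tibble(
  iso3  = rep(c("AFG", "DZA"), each = 4),
  year  = rep(2015:2018, times = 2),
  value = c(NA, 1.2, NA, 1.5, NA, NA, NA, NA)
)

grouped <- expanded %>%
  group_by(iso3) %>%
  arrange(year, .by_group = TRUE)

# Analogue of the defaults (keep_no_obs = TRUE, keep_before_obs = FALSE):
# groups with no data are kept whole; in groups with data, rows before
# the first observed value are dropped
default_like <- grouped %>%
  filter(all(is.na(value)) | cumsum(!is.na(value)) > 0) %>%
  ungroup()

# Analogue of keep_no_obs = FALSE, keep_before_obs = FALSE:
# groups with no data are dropped as well
strict <- grouped %>%
  filter(any(!is.na(value))) %>%
  filter(cumsum(!is.na(value)) > 0) %>%
  ungroup()
```

Here cumsum(!is.na(value)) stays at zero until the first observation in each group, so filtering on it being positive removes only the leading rows; gaps after the first observation remain as explicit NA values.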