
wrangle_gho_rural_urban_data() expands the functionality of wrangle_gho_data() to handle indicators that have TOTL/RUR/URB values in the Dim1 column of the GHO data response. It first pivots the data frame and then, for each (iso3, year) combination, keeps a single value: the total, rural, or urban value, in that order of preference.
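The pivot-then-prefer step described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: it assumes dplyr and tidyr are installed, the toy values are made up, and the output column names `value` and `dim_used` are hypothetical.

```r
library(dplyr)
library(tidyr)

# Toy data in a GHO-like shape (values are made up for illustration)
gho <- data.frame(
  SpatialDim   = c("AFG", "AFG", "AFG", "BGD", "BGD", "COL"),
  TimeDim      = c(2015, 2015, 2015, 2015, 2015, 2015),
  Dim1         = c("TOTL", "RUR", "URB", "RUR", "URB", "URB"),
  NumericValue = c(50, 40, 70, 35, 60, 80)
)

# One row per (SpatialDim, TimeDim), one column per Dim1 level
wide <- pivot_wider(
  gho,
  id_cols     = c("SpatialDim", "TimeDim"),
  names_from  = "Dim1",
  values_from = "NumericValue"
)

# Keep the first non-missing value in the order TOTL, RUR, URB,
# and record which dimension it came from (hypothetical column names)
picked <- wide %>%
  mutate(
    value = coalesce(TOTL, RUR, URB),
    dim_used = case_when(
      !is.na(TOTL) ~ "TOTL",
      !is.na(RUR)  ~ "RUR",
      !is.na(URB)  ~ "URB"
    )
  ) %>%
  select(SpatialDim, TimeDim, value, dim_used)
```

In this toy input, AFG has all three dimensions, BGD has only rural and urban values, and COL has only an urban value, so each row exercises a different branch of the preference order.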

Usage

wrangle_gho_rural_urban_data(
  df,
  source = NULL,
  type = NULL,
  ind = NULL,
  upload_date = NULL,
  scenario = NULL,
  id_cols = c("SpatialDim", "TimeDim"),
  names_from = "Dim1",
  values_from = c("NumericValue", "High", "Low", "DataSourceDim", "Comments", "Date")
)

Arguments

df

A data frame in GHO format, returned from ghost::gho_data() (https://gpw13.github.io/ghost/reference/gho_data.html).

source

Character string of source to be provided to the data frame. If NULL, the source column is generated from the GHO's DataSourceDim column. If not NULL, it overrides the source provided by the GHO.

type

Character string of type to be provided to the data frame. If NULL, the type column is filled with NA_character_.

ind

Character string of the indicator name to be provided to the data frame. This is a required argument and will raise an error if not provided. If only a RUR or URB value is available, the indicator name has _rural or _urban appended to it in the output data frame.
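The suffixing rule for ind could look something like the sketch below. The helper `suffix_indicator()` and the indicator name "water" are hypothetical and used only for illustration; the package may implement this differently.

```r
# Hypothetical helper: append a suffix depending on which dimension was kept
suffix_indicator <- function(ind, dim_used) {
  suffix <- c(TOTL = "", RUR = "_rural", URB = "_urban")
  paste0(ind, unname(suffix[dim_used]))
}

ind_names <- suffix_indicator("water", c("TOTL", "RUR", "URB"))
```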

upload_date

Character string indicating the day on which the data was last updated on GHO.

scenario

Character string of scenario to be provided to the data frame. If NULL, the scenario is set to NA_character_.

id_cols

Character vector of the columns that are the same regardless of the TOTL/RUR/URB dimension. Used as the argument of the same name in pivot_wider.

names_from, values_from

A pair of character vectors used as the arguments of the same name in pivot_wider.
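To see what passing several values_from columns does, the sketch below calls tidyr::pivot_wider() directly on made-up data. With more than one value column, tidyr names the output columns by joining the value column and the Dim1 level with an underscore (for example NumericValue_TOTL). The data are invented, and only a subset of the default values_from columns is used to keep the example short.

```r
library(tidyr)

# Made-up GHO-like rows for a single country and year
gho <- data.frame(
  SpatialDim   = c("AFG", "AFG"),
  TimeDim      = c(2015, 2015),
  Dim1         = c("TOTL", "RUR"),
  NumericValue = c(50, 40),
  High         = c(55, 46)
)

wide_multi <- pivot_wider(
  gho,
  id_cols     = c("SpatialDim", "TimeDim"),
  names_from  = "Dim1",
  values_from = c("NumericValue", "High")
)

# Inspect the generated column names
names(wide_multi)
```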

Value

A data frame

Details

The function also automatically filters 'mixed' time series, i.e., instances where the time series for a given country contains a combination of TOTL, RUR, and URB values, by keeping only the part of the series associated with the most commonly occurring of these options. For example, a time series with URB data from 2000 to 2015 and TOTL data from 2016 to 2020 will be cut off at 2015, so that only the URB data is kept.
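The mixed-series filter in the example above can be approximated as follows. This is a sketch under assumptions rather than the function's actual code: it assumes dplyr is installed, the column names `iso3`, `year`, and `dim_used` are hypothetical, and ties between equally common dimensions are not resolved here because the documentation does not say how they are handled.

```r
library(dplyr)

# URB data for 2000-2015 followed by TOTL data for 2016-2020
series <- data.frame(
  iso3     = "AFG",
  year     = 2000:2020,
  dim_used = c(rep("URB", 16), rep("TOTL", 5))
)

filtered <- series %>%
  group_by(iso3) %>%
  add_count(dim_used, name = "n_dim") %>%  # rows per dimension within country
  filter(n_dim == max(n_dim)) %>%          # keep the most common dimension
  ungroup() %>%
  select(-n_dim)
```

Here URB occurs 16 times and TOTL 5 times, so only the URB rows (2000 to 2015) remain.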

TODO

  • Convert to more generic unspool_gho_dim function which can work with any other DimType, and not just TOTL/RUR/URB.

  • Re-write to make it work better with wrangle_gho_data to avoid the significant amount of redundant logic. This means that, eventually, users may do gho_data(.) %>% unspool_gho_dim(.) %>% wrangle_gho_data(.)