Save the output to disk after ensuring column specs
Source: R/utils_wrangling.R
save_wrangled_output.Rd
Helper function that saves a wrangled output data frame to disk if it has the columns required by the Triple Billions xMart tables.
Usage
save_wrangled_output(
df,
path,
data_type = c("wrangled_data", "projected_data", "final_data"),
na_rm = FALSE,
compression = "gzip"
)
Arguments
- df
the output data frame
- path
the path where the output should be saved
- data_type
(string): the type of data
wrangled_data (default): raw data that has been wrangled into a suitable form for analysis.
projected_data: data that has been fully projected to the target year but has not yet been transformed or calculated upon.
final_data: the complete set of billions data with transformed values, contributions, and all calculations available.
- na_rm
(logical) Specifies whether to remove rows where value is missing. Defaults to FALSE.
- compression
Compression algorithm to use for parquet format. "gzip" by default.
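The Usage signature lists all three data_type values as the default. In R, a vector default like this is commonly resolved with match.arg(), which picks the first element when the argument is not supplied and rejects values outside the list. Whether save_wrangled_output() does this internally is an assumption; the sketch below uses a hypothetical helper, resolve_data_type(), purely to illustrate the pattern.

```r
# Hypothetical helper for illustration only; not part of the package.
resolve_data_type <- function(data_type = c("wrangled_data", "projected_data", "final_data")) {
  # match.arg() returns the first choice when the argument is left at its
  # default, and raises an error if the supplied value matches no choice.
  match.arg(data_type)
}

default_type <- resolve_data_type()             # falls back to the first choice
final_type   <- resolve_data_type("final_data") # explicit, valid choice
bad_type     <- tryCatch(
  resolve_data_type("raw"),                     # not one of the listed choices
  error = function(e) "error"
)
```

Under this assumption, omitting data_type is equivalent to passing "wrangled_data", which agrees with the "(default)" label in the argument description.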
Value
A data frame. Note that this is the modified version of the input: changes made by
the function (such as removing rows where value is missing when na_rm = TRUE) are
carried over to the output.
Details
The function returns a data frame (like readr::write_csv()) so that it can be used
in the middle of a pipeline, with later steps operating on the saved data.
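A minimal sketch of this check, save, and return pattern follows. It is not the package's implementation: save_checked() is a hypothetical stand-in, the column names iso3, year, and value are illustrative, and it writes CSV with base R rather than parquet so that it runs without extra packages. It also uses the native pipe, which requires R 4.1 or later.

```r
# Hypothetical stand-in for illustration; not the package's implementation.
save_checked <- function(df, path, required_cols, na_rm = FALSE) {
  # Refuse to write if any required column is absent.
  missing_cols <- setdiff(required_cols, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing required columns: ", paste(missing_cols, collapse = ", "))
  }
  # Optionally drop rows whose `value` is missing (assumes a `value` column).
  if (na_rm) {
    df <- df[!is.na(df$value), , drop = FALSE]
  }
  # CSV is a simplification here; the documented function targets parquet.
  write.csv(df, path, row.names = FALSE)
  # Return the (possibly modified) data frame so a pipeline can continue.
  invisible(df)
}

# Illustrative input with one missing value.
input <- data.frame(
  iso3  = c("AFG", "ALB", "DZA"),
  year  = c(2020, 2020, 2020),
  value = c(1.5, NA, 3.2)
)
out_path <- tempfile(fileext = ".csv")

# The saved data frame flows on to the next step of the pipeline.
saved <- input |>
  save_checked(out_path, required_cols = c("iso3", "year", "value"), na_rm = TRUE)
n_saved <- nrow(saved)
```

Because the returned data frame reflects the na_rm filtering, later steps see exactly what was written to disk.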
See also
Wrangle data functions: add_missing_xmart_rows(), get_data_lake_name(), get_whdh_path(), has_xmart_cols(), save_gho_backup_to_whdh(), wrangle_gho_data(), wrangle_gho_rural_urban_data(), wrangle_unsd_data(), xmart_col_types(), xmart_cols()