With Crunch, you can add rows to a dataset by appending a second dataset to the bottom of the original. Crunch makes intelligent guesses to align the variables between the two datasets and to harmonize the categories and subvariables of variables, as appropriate.

appendDataset(dataset1, dataset2, upsert = FALSE)



dataset1: a CrunchDataset


dataset2: another CrunchDataset, or possibly a data.frame. If dataset2 is not a Crunch dataset, it will be uploaded as a new dataset before appending. If it is a CrunchDataset, it may be subsetted with a filter expression on the rows and a selection of variables on the columns.


upsert: Logical: should the append instead "update" rows based on the primary key variable and "insert" (append) where the primary key values are new? Default is FALSE. Note that this upserting behavior requires a primary key variable to have been set previously; see pk().
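
The upsert behavior can be sketched roughly as follows. This assumes an authenticated Crunch session and that both datasets contain a variable with unique values per row; the alias `respondent_id` and the dataset names are hypothetical and only serve the illustration.

```r
library(crunch)

# Hypothetical dataset names; substitute datasets you have access to
ds <- loadDataset("Survey, 2016")
corrections <- loadDataset("Survey, 2016 corrections")

# Upserting requires a primary key on the dataset being appended to.
# `respondent_id` is an assumed alias for a variable with unique values.
pk(ds) <- ds$respondent_id

# Rows whose key already exists in ds are updated;
# rows with new key values are appended
ds <- appendDataset(ds, corrections, upsert = TRUE)
```

Without `upsert = TRUE`, the same call would append every row of the second dataset, including rows whose key values already exist.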


dataset1, updated with the rows of dataset2 (potentially filtered on rows and variables) appended to it.
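
A minimal sketch of appending only part of a second dataset is shown here. The variable aliases `country`, `age`, and `gender` are hypothetical, and the code assumes an authenticated Crunch session.

```r
library(crunch)

ds <- loadDataset("Survey, 2016")
new_wave <- loadDataset("Survey, 2017")

# Subset the incoming dataset: a filter expression selects the rows,
# and a character vector of aliases selects the variables.
# All three aliases are assumed for illustration.
new_subset <- new_wave[new_wave$country == "USA", c("country", "age", "gender")]

# Only the selected rows and variables are appended
ds <- appendDataset(ds, new_subset)
```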


Variables are matched between datasets based on their aliases. Variables present in only one of the two datasets are fine; they're handled by filling in missing values for the rows that come from the dataset where they don't exist. For variables present in both datasets, you will get the best results if you ensure that the two datasets have the same variable names and types, and that their categorical and array variables have consistent categories. To preview how datasets will align when appended, see compareDatasets().
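
As an illustration of previewing the alignment before appending, the following sketch compares the two datasets first. It assumes an authenticated Crunch session; the dataset names are taken from the example on this page.

```r
library(crunch)

ds <- loadDataset("Survey, 2016")
new_wave <- loadDataset("Survey, 2017")

# Compare variables, categories, and subvariables across the two datasets
comparison <- compareDatasets(ds, new_wave)

# Print a report of how the datasets line up, so that mismatched
# types or categories can be fixed before the append
summary(comparison)
```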

Particularly if you're appending to datasets that are already shared with others, you may want to use the fork-edit-merge workflow when appending datasets. This allows you to verify your changes before releasing them to the other viewers of the dataset. To do this, fork the dataset with forkDataset(), append the new data to the fork, ensure that the append worked as expected, and then merge the fork back to the original dataset with mergeFork(). For more, see vignette("fork-and-merge", package = "crunch").
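
The fork-edit-merge steps might be sketched like this. The row-count check is only one possible way to verify the append and is an assumption of this example, not a prescribed step; it applies to a plain append (not an upsert). An authenticated Crunch session is assumed.

```r
library(crunch)

ds <- loadDataset("Survey, 2016")
new_wave <- loadDataset("Survey, 2017")

# Work on a fork so that viewers of ds do not see intermediate changes
fork <- forkDataset(ds)
rows_before <- nrow(fork)

# Append the new data to the fork rather than to the original
fork <- appendDataset(fork, new_wave)

# Illustrative sanity check: the fork should have gained exactly
# the rows of the appended dataset
stopifnot(nrow(fork) == rows_before + nrow(new_wave))

# Once satisfied, merge the fork's changes back into the original
ds <- mergeFork(ds, fork)
```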


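if (FALSE) {
# Load the existing dataset and the new wave of data
ds <- loadDataset("Survey, 2016")
new_wave <- loadDataset("Survey, 2017")
# Append the 2017 rows to the bottom of the 2016 dataset
ds <- appendDataset(ds, new_wave)
}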