As base::merge() does for data.frames, this function takes two datasets,
matches rows based on a specified key variable, and adds columns from one to
the other.
joinDatasets(
  x,
  y,
  by = intersect(names(x), names(y)),
  by.x = by,
  by.y = by,
  all = FALSE,
  all.x = TRUE,
  all.y = FALSE,
  copy = TRUE
)

extendDataset(
  x,
  y,
  by = intersect(names(x), names(y)),
  by.x = by,
  by.y = by,
  all = FALSE,
  all.x = TRUE,
  all.y = FALSE,
  ...
)

# S3 method for CrunchDataset
merge(
  x,
  y,
  by = intersect(names(x), names(y)),
  by.x = by,
  by.y = by,
  all = FALSE,
  all.x = TRUE,
  all.y = FALSE,
  ...
)
x: CrunchDataset to add data to
y: CrunchDataset to copy data from. May be filtered by rows and/or columns.
by: character, optional shortcut for specifying by.x and by.y by alias if
the key variables have the same alias in both datasets.
by.x: CrunchVariable in x on which to join, or the alias (following
crunch.namekey.dataset) of a variable. Must be type numeric or text and have
all unique, non-missing values.
by.y: CrunchVariable in y on which to join, or the alias (following
crunch.namekey.dataset) of a variable. Must be type numeric or text and have
all unique, non-missing values.
all: logical: should all rows in x and y be kept, i.e. a "full outer" join?
Only FALSE is currently supported.
all.x: logical: should all rows in x be kept, i.e. a "left outer" join?
Only TRUE is currently supported.
all.y: logical: should all rows in y be kept, i.e. a "right outer" join?
Only FALSE is currently supported.
copy: logical: make a virtual or materialized join. Default is TRUE, which
means materialized. Virtual joins are not currently implemented, so the
default is the only valid value.
...: additional arguments, ignored
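The requirements on the key variables (numeric or text, all values unique and non-missing) can be checked locally before attempting a join. The following is a minimal sketch in base R; is_valid_join_key() is a hypothetical helper, not part of crunch, and it operates on an ordinary R vector rather than on a CrunchVariable. It does not reproduce whatever validation the server performs.

```r
# Hypothetical helper (not part of crunch): checks that a plain R vector
# satisfies the documented key requirements for by.x / by.y.
is_valid_join_key <- function(values) {
  # Key must be numeric or text
  right_type <- is.numeric(values) || is.character(values)
  # Key must have no missing values and no duplicates
  right_type && !anyNA(values) && anyDuplicated(values) == 0
}

is_valid_join_key(c(101, 102, 103))   # unique numeric ids
is_valid_join_key(c("a", "b", "b"))   # duplicated value
is_valid_join_key(c(1, NA, 3))        # missing value
```

To apply a check like this to a key held in a Crunch dataset, the values would first have to be pulled into R, for example with as.vector() on the variable; whether that is practical depends on the size of the dataset.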
x extended by the columns of y, matched on the "by" variables.
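Because the only supported combination is all.x = TRUE with all.y = FALSE, the result has left-join semantics. The sketch below illustrates those semantics with base::merge() on two small data.frames; the data are invented for illustration, and the example shows the analogy with base R only, not the behavior of a Crunch server.

```r
# Illustrative data (made up): x is the dataset being extended,
# y supplies the new column.
x <- data.frame(id = c(1, 2, 3), age = c(34, 51, 27))
y <- data.frame(id = c(3, 1, 4),
                region = c("West", "East", "North"),
                stringsAsFactors = FALSE)

# Left join: keep every row of x, drop rows of y with no match in x.
joined <- merge(x, y, by = "id", all.x = TRUE, all.y = FALSE)

# id 2 has no match in y, so its region is NA;
# id 4 exists only in y, so it does not appear in the result.
joined
```

Every row of x survives and unmatched keys get missing values in the new columns, which is why keys that differ between the two datasets can yield surprising results.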
Since joining two datasets can sometimes produce unexpected results if the
keys differ between the two datasets, you may want to follow the
fork-edit-merge workflow for this operation. To do this, fork the dataset
with forkDataset(), join the new data to the fork, ensure that the resulting
dataset is correct, and merge it back to the original dataset with
mergeFork(). For more, see
vignette("fork-and-merge", package = "crunch").
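The fork-edit-merge steps described above might be sketched as follows. The wrapper join_via_fork() and its argument names are hypothetical. The row-count check is only one possible way to "ensure that the resulting dataset is correct" and is an assumption, not a procedure prescribed by the package. Calling the function requires the crunch package, an authenticated session, and two existing CrunchDatasets.

```r
# Hypothetical wrapper (not part of crunch) around the documented workflow.
# ds:       CrunchDataset to extend
# new_data: CrunchDataset to copy columns from
# key:      alias of the key variable, assumed identical in both datasets
join_via_fork <- function(ds, new_data, key) {
  # 1. Fork the dataset so the original is untouched while we experiment
  fork <- crunch::forkDataset(ds)

  # 2. Join the new data to the fork
  fork <- crunch::joinDatasets(fork, new_data, by = key)

  # 3. Sanity check (an assumption about what "correct" means here):
  #    a left join on a unique key should not change the number of rows
  stopifnot(nrow(fork) == nrow(ds))

  # 4. Merge the verified fork back into the original dataset
  crunch::mergeFork(ds, fork)
}
```

In practice it is usually worth inspecting the joined columns on the fork interactively before step 4, since a matching row count alone does not show that the keys lined up as intended.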