crunch::cut() is equivalent to base::cut() except that it operates on Crunch variables instead of in-memory R objects. The function takes a Datetime variable and derives a new categorical variable from it based on the breaks argument. You can either break the variable into evenly spaced categories by giving the interval as a string that defines a period, or supply a vector containing the start and end points of each category. For example, breaks = "2 weeks" breaks the datetime data into two-week bins, while breaks = as.Date(c("2020-01-01", "2020-01-15", "2020-02-01")) recodes the data into two groups based on whether each value falls between January 1 and 14 or between January 15 and 31.
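Because the function mirrors base::cut(), both forms of breaks can be tried on an ordinary in-memory Date vector first. The sketch below uses base R only and does not call crunch::cut(). The vector `dates` is made-up sample data, and it is an assumption, not something this page states, that Crunch aligns bin boundaries for interval strings the same way base R does.

```r
# Base R only: an in-memory analogue, not a call to crunch::cut().
# `dates` is made-up sample data.
dates <- as.Date(c("2019-12-31", "2020-01-01", "2020-01-14",
                   "2020-01-15", "2020-01-31", "2020-02-20"))

# Form 1: a period string. Base R labels each bin by its start date.
by_month <- cut(dates, breaks = "month")

# Form 2: explicit cut points. Three breaks define two categories;
# values outside the range of the breaks become NA.
cut_points <- as.Date(c("2020-01-01", "2020-01-15", "2020-02-01"))
by_points <- cut(dates, breaks = cut_points,
                 labels = c("Jan 1-14", "Jan 15-31"))

# Side-by-side view of the original dates and both codings.
data.frame(dates, by_month, by_points)
```

Note that base R names the period bins by their start date, whereas the default labels described under the labels argument are date ranges.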

# S4 method for DatetimeVariable
cut(x, breaks, labels = NULL, dates = NULL, name, right = FALSE, ...)

Arguments

x

A Crunch DatetimeVariable

breaks

Either a numeric vector of two or more unique cut point datetimes, or a single string giving the interval size into which x is to be cut: an optional leading number followed by "day", "weeks", "months", "quarters" or "years" (for example "2 weeks"). If specifying cut points, values that are less than the smallest value in breaks or greater than the largest value in breaks will be marked missing in the resulting categorical variable.

labels

A character vector representing the labels for the levels of the resulting categories. The length of the labels argument should be the same as the number of categories, which is one fewer than the number of breaks. If not specified, labels are constructed in the format "YYYY/MM/DD - YYYY/MM/DD" (for example, "2020/01/01 - 2020/01/14").

dates

(Optionally) A character vector with the date strings that should be associated with the resulting categories. These dates can have the form "YYYY-MM-DD", "YYYY-MM", "YYYY", "YYYY-WXX" (where "XX" is the ISO week number) or "YYYY-MM-DD,YYYY-MM-DD". If left NULL, it will be created from the categories.

name

The name of the resulting Crunch variable as a character string.

right

logical, indicating if the intervals should be closed on the right (and open on the left) or vice versa. This only applies if giving a vector of break points.

...

further arguments passed to makeCaseVariable
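The effect of right and the shape of the default labels can be illustrated without a server. This is a sketch in base R, whose cut() shares the right argument. `range_labels()` is a hypothetical helper, not part of crunch: it imitates the documented "YYYY/MM/DD - YYYY/MM/DD" label style, assuming that each label ends on the day before the next break, as the documented example suggests.

```r
# Base R only; nothing here calls the Crunch API.
cut_points <- as.Date(c("2020-01-01", "2020-01-15", "2020-02-01"))
on_breaks <- cut_points  # values that sit exactly on a break

# right = FALSE: intervals are [start, end), so a value on a break
# opens the next category and the last break falls outside.
left_closed <- cut(on_breaks, breaks = cut_points,
                   labels = c("first", "second"), right = FALSE)

# right = TRUE: intervals are (start, end], so a value on a break
# closes the previous category and the first break falls outside.
right_closed <- cut(on_breaks, breaks = cut_points,
                    labels = c("first", "second"), right = TRUE)

# Hypothetical helper: one label per category (number of breaks minus one),
# assuming left-closed bins that end the day before the next break.
range_labels <- function(breaks) {
  n <- length(breaks)
  starts <- breaks[-n]
  ends <- breaks[-1] - 1
  paste(format(starts, "%Y/%m/%d"), "-", format(ends, "%Y/%m/%d"))
}

range_labels(cut_points)
```

With three cut points the helper returns two labels, matching the rule that the number of labels is one fewer than the number of breaks.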

Value

a Crunch VariableDefinition. Assign it into the dataset to create it as a derived variable on the server.

Examples

if (FALSE) {
library(crunch)
ds <- loadDataset("example")

# Interval strings: monthly bins, then four-week bins
ds$month_cat <- cut(ds$date, breaks = "month", name = "monthly")
ds$four_weeks_cat <- cut(ds$date, breaks = "4 weeks", name = "four week categorical date")

# Explicit cut points: four breaks define three labelled categories
ds$wave_cat <- cut(
    ds$date,
    breaks = as.Date(c("2020-01-01", "2020-02-15", "2020-04-01", "2020-05-15")),
    labels = c("wave1", "wave2", "wave3"),
    name = "wave var"
)
}