• Variables can now be created as materialized by default instead of derived, by setting environment variable R_CRUNCH_DEFAULT_DERIVED or option crunch.default.derived to FALSE. See ?toVariable for more information (#648).

  • The concept of a personal folder is being removed from the API imminently. This has a few implications for rcrunch:

    • All datasets must be created with a project (e.g. via the project argument of newDataset())

    • Dataset forks will be created in the same folder as their parent

    • Because all datasets will now live in projects, and loading by name alone doesn’t work for datasets in projects, you will need to specify the full project path to load a dataset by name.

    • To make things easier, it is possible to set a default project path with environment variable R_CRUNCH_DEFAULT_PROJECT or option crunch.default.project. This will be used as the default project folder when creating and loading datasets. Forks will still be put next to parents.
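As a minimal sketch, the default project path described above could be configured in either of the two ways below. The path "Surveys/2024" is a made-up placeholder, not a real project, and only base R calls are used, so nothing here contacts the Crunch API.

```r
# Hypothetical project path, used only for illustration.
project_path <- "Surveys/2024"

# Option 1: environment variable (could also live in .Renviron).
Sys.setenv(R_CRUNCH_DEFAULT_PROJECT = project_path)

# Option 2: R option (could also live in .Rprofile).
options(crunch.default.project = project_path)

# Inspect what is currently configured.
Sys.getenv("R_CRUNCH_DEFAULT_PROJECT")
getOption("crunch.default.project")
```

Setting one of the two should be enough; both are shown only to illustrate the two mechanisms named above.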

  • Fix typo which relied on partial argument matching when using the variable catalog cache (#625, thanks @rossellhayes)
  • Can now upload a dataset with metadata stored as a .json file (#620)
  • Avoid traversing hidden/private variable folder tree in more situations when the crunch.warn.hidden & crunch.warn.private options are set (#619).
  • Support for including hidden/private variables in names(), enabled with option crunch.names.includes.hidden.private.variables, to avoid stuttering in RStudio auto-complete (#619).
  • Experimental support for avoiding loading the variable order with option crunch.order.var.catalog (#619).
  • Miscellaneous fixes for CRAN checks
  • You can now run crunch automation scripts on project folders. For a list of commands see https://help.crunch.io/hc/en-us/sections/10343332025101-Folder-commands. (#617)
  • Fix for printing ScriptCatalog (and removal of the ScriptCatalog method for scriptBody(); if you need the body text, subset to the particular script or use vapply(scripts(ds), function(x) scriptBody(x), character(1))).
  • Bug fix for setting the encoding type when running crunch automation commands.
  • Fixes for problems with R-devel found by CRAN checks.
  • Authenticating with crunch via login() is no longer supported. See vignette("crunch") or ?crunch-api-key for more details on authenticating with API keys.
  • Experimental support for parquet export in exportDataset() (#595)
  • Fix a bug where environment variables could not set numeric or logical options like “R_CRUNCH_TIMEOUT”. (#593)
  • Uses the “public” variable folder rather than the root to pave the way for an upcoming join feature. In general end users should not notice. (#592)
  • Authenticating with an API key is now the recommended way to use the R crunch package. See the vignette("crunch") or ?crunch-api-key for more details. The login() authentication flow is deprecated and will be removed from an upcoming release.
  • You can now export a csv version of a tabbook for use in Tableau or other BI software (#583)
  • Crunch slides have been improved:
    • Can set multiple filters per slide (#570)
    • Can create and edit markdown slides (#568)
  • Fixed a warning in R 4.1 when setting attributes on S4 objects (#578)
  • Can create and edit subtotal insertions on Multiple Response variables (#580)
  • Update to the pets example (#585)
  • interactVariables() now uses server side logic to create categories so that it’s faster, but the sep argument is no longer supported (it’s always set to " and ") and the category order will not be the same.
  • as.data.frame() and as.vector() work with numeric arrays now (#558)
  • Can now create subtotal differences from rcrunch (#559)
  • Can now use options(crunch.show.progress.url = TRUE) to show the URL checked for progress (#565)
  • Avoids a possible bug when a hidden or private folder has bad entities in it (#561)
  • Fixed a bug in as.data.frame() where it would not respect the include.hidden argument (#560)
  • Fixed a warning message in R 4.1 when assigning two levels deep (#578)
  • Variable catalogs are now loaded lazily, which means some dataset operations will be much faster. You can override this behavior (useful e.g. for testing, or in rare situations when working on many datasets at once) by using forceVariableCatalog() or options(crunch.lazy.variable.catalog = FALSE).
  • You can now set the filter, weight and viz_type of a slide when creating it.
  • searchDatasets() gained an argument f that allows you to pass R objects to filter on.
  • Added a vignette for common deck operations.
  • You can now set the analyses manually using the analyses argument of newSlide() or set analysis<- to a list, allowing all slide customizations from R.
  • Documentation improvements.
  • Renaming variable folders via setName() is fixed (#487)
  • You can set the weight and filter to the same value for every slide in a deck by assigning a weight or filter on the deck object.
  • You can now derive a variable directly from a logical expression (#286)
  • Fixes to pkgdown documentation website
  • Added support for crunch automation, see ?runCrunchAutomation for more information
  • Previously defined expropriateUser() function is now accessed through the function call reassignUser(). Functionality of the call has not changed.
  • Added makeCaseWhenVariable() that helps with many common recoding needs.
  • Added alterArrayExpr() which allows adding, removing, reordering and renaming subvariables in a derived Array.
  • You can now use the keyring package to store your credentials. See ?login() for details (thanks @mainwaringb!)
  • Add several new expressions that let you create derived variables in more flexible ways than was previously possible.

  • crunch::filter() now falls back to the next filter on your searchpath when no method is defined.

New features

  • You can now mark variables private (and remove their private status)
  • tabBook() by default uses a new endpoint, which allows for more options. The old endpoint is deprecated, but while the server supports it, you can still use it. See ?tabBook for more details.
  • You can now view and modify the weight on a slide using weight() and weight()<- (#440)
  • You can now use a named filter or filter object when using tabBook(). Filtering by expression in the dataset argument is also supported again.
  • You can now cast crunch expressions to other variable types without saving as a variable first.

Bug fixes

  • newMultiTable() now correctly passes ..., so arguments like is_public work (#424)
  • hiddenVariables() works when the hidden variables folder has subdirectories (#372).

Internal Changes & Deprecations

  • importMultitable() has been removed because it was deprecated on the server.
  • crunch now uses folders for hidden
  • https verification can be disabled during testing by setting environment variable R_TEST_VERIFY_SSL=FALSE
  • All mock test files are compressed
  • Can now deriveArray() using expressions to create the subvariables.
  • New function slideCategories() helps you create overlapping categorical variables (#396).
  • crunch will work with upcoming changes to stringsAsFactors defaults (#402).
  • Exporting a Deck requires (and now includes) a valid JSON body in its request.
  • Documentation for newSlide includes examples of vizType settings and other improvements.
  • A new copyFolders() function that copies folders and variable order from one dataset to another (similar to copyOrder() which was deprecated)
  • Slides in decks can be manipulated more robustly (and their filters can be removed or added with filters(slide_object) <- NULL or filters(slide_object) <- filter_object)
  • There is a new helper function for dealing with API changes that lets you expect two different outcomes while testing (expect_either())
  • Extraneous ... removed from documentation
  • Internally, the (un)hideVariables() functions are upgraded to use folder operations.

Personal folder

  • You can now access your “personal folder” of datasets, which contains only those datasets you imported and that haven’t been moved into another project. This dataset folder is denoted in paths by ~, as in a *nix file system. cd(projects(), "~") takes you there; mv(projects(), ds, "~") moves ds into your personal folder.
  • listDatasets() now by default only prints datasets in your personal folder, not a combination of your personal datasets and some of the datasets that have been shared with you.
  • When you give a dataset name to loadDataset(), it now searches to find datasets exactly matching that name unless you specify a project to load from. If you have multiple datasets with the same name in different locations, loadDataset("your dataset name") may return a different one than it did previously. If you want to identify a dataset precisely in loadDataset(), either specify the dataset URL (most effective but not as human friendly) or provide project = "path/to/folder".
  • Calling loadDataset(<integer>) is no longer supported.

Bug fixes

  • Fix is.public(multitables[[i]]) <- TRUE and several other similar assignments of attributes on an element of a catalog, which previously successfully updated the value on the server but errored when returning to R (#303, #367)
  • subvariables() on non-array variables returns NULL instead of an error (#237)
  • Fixed a bug with the display of univariate cube prop.tables

Other enhancements

  • newExampleDataset() creates a sample dataset for you to explore
  • exportDeck() can now write to PowerPoint with format = "pptx"
  • newDataset() now supports importing data in Triple-S format, providing a schema file in addition to the row data.
  • resolution() lets you see the data units of a datetime variable (“Y”, “M”, “D”, “ms”, etc.); resolution<- lets you set it (#234)
  • deleteDataset() accepts web app URLs, just as loadDataset() already did (#279)
  • Set options(crunch.warn.hidden=FALSE) to suppress the “Variable x is hidden” messages when accessing hidden variables (#172)
  • Support sharing decks with a team via team(deck) <-
  • Added upsert argument to appendDataset() to allow datasets to be updated based on the primary-key variable; see pk() for details on primary keys (#49)
  • combineCategories() and combineResponses() are aliases for combine(), providing a way to avoid accidental clashes with dplyr::combine() (#359)

Shared Crunch assets

  • Initial support for deck creation and manipulation.
  • Filters and multitables can now be shared with teams by assigning teams()<- on them. View which teams can access them by calling teams() on them.

Folder enhancements

  • Improved robustness of API usage for moving datasets in projects.
  • Support for "." as a folder path/segment, referencing the current folder. cd(project, ".") returns project; mv(project, ds, ".") moves ds into project.
  • Fix bug in printing folders that contain entities with excessively long names.

Cube computation

  • Summary statistics (means and medians) can now be calculated for any direction of a CrunchCube.
  • Improved speed for calculating insertions (subtotals, headers, etc.) on large cubes (speedups of ~25x on large, realistic cubes).

Internal

  • Remove (for now) support for the experimental virtual join feature.
  • Remove previously deprecated variable order functions.
  • 404 Not Found HTTP responses now print the request URL to aid in debugging.
  • Fix a duplicated vignette title.
  • Suppress check for new GitHub release of the package in non-interactive sessions.
  • Fixed a bug that wouldn’t allow vignette mocks to use datasetReference()
  • Removed excess metadata in some cube fixtures in anticipation of Crunch not sending that information any more (no code changes were necessary).
  • The RStudio gadgets for listDatasets() and makeArrayGadget() have been moved to the crunchy package. Wiring for them, including RStudio add-in configuration, remains here, but you’ll have to install that package to use them.
  • Minor fixes for backwards compatibility with the old projects API
  • Remove code paths that modify the project dataset order, which was removed from the Crunch API.
  • List as deprecated many functions that modify variable order, suggesting mv() and the other folder operations. These functions will be removed in December 2018.

Organization

Bugfixes

  • Requesting tab books for subsets of variables with weights no longer errors.
  • makeWeight() is now correct for categorical variables with non-sequential IDs.
  • Hidden variables are now included in the output of write.csv or as.data.frame(force = TRUE) if requested.
  • The print method for empty dataset/variable folders now prints something informative.

Other

  • Adjusted the calculation of index.table() to better reflect analysts’ intentions. Now, index.table() calculates the index with respect to the marginal proportion of the margin given, so for index.table(cube, 2) the column proportions of the table are indexed to the marginal row proportions. In other words: for each column how much larger or smaller is the proportion in that column when compared to the proportions for the row variable alone.
  • Updated for compatibility with the upcoming release of the haven package and its new haven_labelled and haven_labelled_spss object classes.
  • The package now warns when an API endpoint is deprecated.
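The index.table() calculation described above can be illustrated with plain base R on a small table of counts. This is a sketch of the arithmetic as described, not the package's implementation: the counts are invented, and scaling by 100 (so that 100 means "same as the margin") is an assumption.

```r
# Made-up 2 x 2 table of counts: rows are one variable, columns another.
counts <- matrix(
  c(30, 10, 20, 40),
  nrow = 2,
  dimnames = list(row = c("a", "b"), col = c("x", "y"))
)

# Column proportions: each column sums to 1.
col_props <- prop.table(counts, 2)

# Marginal proportions of the row variable alone.
row_marginal <- rowSums(counts) / sum(counts)

# Index each column proportion to the row marginal. The vector recycles
# down each column, so every row is divided by its own marginal.
index <- col_props / row_marginal * 100

round(index, 1)
```

In this example, row "a" makes up 75% of column "x" but only 50% of the table overall, so its index in that column is 150.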

Improved support for subtotals

  • CrunchCubes can now be displayed with subtotals on any axis (not just rows).
  • Subtotals in CrunchCubes have been improved and stabilized, and should work in many more places than they did before (e.g. margin.table, prop.table, etc.)

Bugfixes

  • Fix order of variables when using mv() to move them to a folder.
  • deleteVariables() no longer tries to delete duplicate variables.
  • Resolved a bug when using as.data.frame(..., force = TRUE) with numeric variables that have missing values.
  • Resolve missing Suggests reference for test packages, following new check requirement.

Internal improvements

  • getDimTypes() returns a richer set of cube dimension types differentiating multiple response from categorical array dimensions.
  • Added support for alias, description, and notes on VariableTuples
  • New introductory vignette: vignette("crunch")
  • changeCategoryID() tries to unset then reset the dataset exclusion if that impacts its progress. Best practice is to disable exclusions before running changeCategoryID() if at all possible.
  • Setting the ordering<- of datasets within a project will now drop any invalid entries with a warning, rather than error.
  • Fix a bug introduced in 1.22.0 in creating categorical variables from factors with missing values.
  • Fix a similar yet unrelated bug in creating numeric and other types of variables with all-NA data.
  • Fix streamRows() for case when sending only one row (#253).
  • Internal: support for the “selected_array” method of multiple response calculation, deprecated since 1.20.0, has been removed.
  • Internal: getDimTypes() returns a richer set of cube dimension types differentiating multiple response from categorical array dimensions.
  • Internal: Added support for alias, description, and notes on VariableTuples
  • makeArrayGadget() launches an RStudio gadget to help you build valid categorical arrays and multiple response variables.

Analysis methods

  • CrunchCubes can now be subset just like R arrays using the [ method.
  • Add summary statistics to CrunchCubes that have categorical variables with scale values (numeric_values). See ?addSummaryStat for more information.
  • index.table() to return tables indexed to a margin.

Bug fixes and other enhancements

  • Fix bug in assigning subtotals(var) <- NULL when it already was NULL (#231).
  • Consistently return "" for variable metadata fields if no value is set (#232).
  • Better subvariable metadata methods for CrunchCubes (#215).
  • Clarified the error message when using makeMRFromText() with a categorical variable.
  • Export GitHub package version checking function so that other crunch* packages can use it.
  • %in% and == on Crunch objects now follow R semantics more closely with regards to missing data.
  • Add some forward-compatible code to prepare for API changes to logical variables. This led to a couple of trivial changes to internals around boolean types that should not affect package users.

Variable organization

  • New functions for organizing variables in a dataset, modeled on file system operations: cd(), mv(), mkdir(), rmdir(). These functions use a new API for variable folders (unlike the experimental versions of some that were introduced in the 1.19.0 package release). This API is currently in a beta testing phase. See vignette("variable-order", package="crunch") for examples and details.

Loading and navigating

  • listDatasets(shiny = TRUE) launches an RStudio addin which allows you to select your dataset in order to generate a valid loadDataset() call. You can also associate this addin with a hotkey in RStudio through Tools > Modify Keyboard Shortcuts.
  • webApp() now works for Crunch variables: it will take you to the “browse” view of the web application with the given variable card loaded on screen.

New variable creation and derivation

  • Create a derived view of a variable as another type without altering the underlying data. Have a text input that is only numbers, such as an ID, and want to have a variable that is a true numeric, but you also want to make sure that new (text) values can be appended to the dataset? Use ds$id_var_numeric <- as.Numeric(ds$id_var). There are as.* methods for all Crunch data types except for array-like variables.
  • Preliminary support for haven’s labelled class when converting to Crunch variable types.
  • makeMRFromText() to take a variable imported as delimited strings, parse the multiple-response options, and return a (derived) multiple_response variable.

Miscellaneous

  • Added support for setting population sizes on datasets with setPopulation(ds, size = 24.13e6, magnitude = 3) and for getting population sizes (or magnitudes) with popSize(ds) and popMagnitude(ds) respectively.
  • Added support for getting and setting rollup resolutions for displaying Datetime variables. Get resolution with rollupResolution(ds$datetime) and set with rollupResolution(ds$datetime) <- "M".
  • Add options(crunch.show.progress) to govern whether to report progress of long-running requests. Default is TRUE, but set it to FALSE to run quietly.
  • Export pollProgress() and recommend using that when a long-running request fails to complete within the local timeout.

Subtotals and Headings

  • Added support for subtotals and headings on categorical variables and CrunchCubes. Subtotals can be set with subtotals(variable) <- Subtotal(name = 'subtotal', categories = c(1, 2), after = 2). Use subtotals(variable) to see what subtotals are set for a variable.
  • By default, subtotals will be displayed on CrunchCube results. Arrays consisting of only subtotals can be created using subtotalArray([cube])
  • See ?subtotals or vignette("subtotals", package="crunch") for more information.

Multiple response in CrunchCubes

  • The default method for including multiple response variables in CrunchCubes has changed, allowing for better handling of variables with different missingness across subvariables. (Internally: queries with multiple response variables now use the as_selected function instead of selected_array, which is now deprecated).
  • For now, the deprecated method can be restored by setting options(crunch.mr.selection = "selected_array").

Improvements to conditionalTransform()

Optimizations and bugfixes

  • Improved efficiency when loading a dataset from URL.
  • refresh() for Datasets is now more efficient.
  • Fixed a bug where CrunchCubes with categorical variables that had categories “Selected”, “Not selected”, and “No data” might not display correctly.

New functions

  • Variable groups (folders) can now be referenced by “path”: either a vector of nested folder names (as in ordering(ds)[[c("Top folder", "Nested folder")]]) or a single string with nested folders separated by a delimiter (as in ordering(ds)[["Top folder/Nested folder"]]). “/” is the default path delimiter, and this is configurable via options(crunch.delimiter). If you have folders that actually contain “/” in the folder name, this may be a breaking change. If so, set options(crunch.delimiter="|") or some other string so that folder names are not incorrectly interpreted as paths.
  • Introduce new mv() and mkdir() functions for creating variable folders and moving variables into them. These take a Dataset as their argument and can be chained together for convenience/readability.
  • Other helper functions folder() and folder<- to locate a variable in the folder hierarchy and to move it to a new folder. folder(ds$var) <- "New folder/subfolder" is equivalent to ds <- mv(ds, "var", c("New folder", "subfolder")).
  • Create new variables that take on different values when specific conditions are met using conditionalTransform() (#64, #153)
  • collapseCategories() allows you to combine categories in place without creating a new variable
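The path delimiter behavior described above can be sketched in base R. The helper split_path() is hypothetical and exists only for illustration; it is not how the package parses paths internally, and the folder names are invented.

```r
# Hypothetical helper: split a folder path on the configured delimiter,
# falling back to "/" when the option is not set.
split_path <- function(path) {
  delim <- getOption("crunch.delimiter", "/")
  strsplit(path, delim, fixed = TRUE)[[1]]
}

# A single string is read as nested folders.
split_path("Top folder/Nested folder")

# Under the default delimiter, a folder literally named "Income/Wealth"
# is misread as two nested folders.
split_path("Income/Wealth")

# Switching the delimiter keeps that name intact and still allows nesting.
old <- options(crunch.delimiter = "|")
split_path("Income/Wealth|2018")
options(old)  # restore the previous setting
```

This is why setting options(crunch.delimiter) matters only if existing folder names contain the delimiter character.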

Enhancements and fixes

  • Deep copying variables with copy() has been made more efficient
  • CrunchDataFrames have been improved to act more data.frame-like. You can now access and overwrite values with standard data.frame methods like crdf$variable1 or crdf[,"variable1"] and crdf$variable1 <- 1 or crdf[,"variable1"] <- 1. CrunchDataFrames now also support adding arbitrary columns, although it should be noted that these columns are not stored on the Crunch server, so if you want to keep that data outside of your current R session, you should send it back to your Dataset as a new variable.
  • is.selected() is now vectorized to work with Categories, as is.na() has always been. You can also now assign into the function (#123)
  • addSubvariable() now accepts variable definitions directly (#72)
  • makeCaseVariable() has better errors when a user doesn’t name all of their case definitions (#158).
  • The size limit on as.data.frame() when force = TRUE has been removed (#150)
  • All catalog objects now have an as.data.frame() method.
  • The list of dataset “weight variables” can now be set with modifyWeightVariables(), weightVariables(ds) <- ds$newWeight or is.weightVariables(ds$var) <- TRUE
  • Users with account admin privileges can now expropriateUser() to transfer datasets, projects, and other objects owned by one user to another, as when that user has left your organization.
  • Access members of UserCatalogs by email (e.g. catalog[["you@example.com"]]) by default. All catalog extract methods ([ and [[) now also accept a secondary argument for setting an index to match against to change that default.
  • Crunch authentication email and password can be stored in and read from the environment variables R_CRUNCH_EMAIL and R_CRUNCH_PW respectively.
  • Cube queries with as_selected multiple-response variables have margin and prop.table methods
  • Cube variables() now contain additional metadata, including “type”
  • Fix bases() when called on a univariate statistic (#124)
  • Update some tests and code to anticipate changes in an upcoming release of testthat
  • makeWeight() allows you to generate new weighting variables based on categorical variables (#80).
  • cut(), equivalent to base::cut, allows you to generate a derived categorical variable based on a numeric variable (#93).
  • Create a new Crunch dataset from a file by calling newDataset() directly instead of newDatasetFromFile. Also, you can now create a dataset from a hosted file by passing its URL to newDataset(FromFile).
  • as.data.frame() method for VariableCatalog for a view of variable metadata (#75)
  • crunchBox() now allows you to specify colors for branding or even category-specific coloring.
  • RStudio users will now be prompted for their password on login() in a way that conceals the input.
  • Changed the behavior of changeCategoryID() to only update numeric values of the category having its id changed when the id and the numeric value are the same.
  • The autorollback argument of appendDataset() has been deprecated. The option no longer has any effect and a warning will be printed to notify users about the deprecation.
  • Long-deprecated newDatasetByCSV was removed.

Support for mapping

  • Crunch-hosted geographic data can now be set and updated. Use geo() on a variable to see if there is already associated geographic data.
  • addGeoMetadata() function to match a text or categorical variable with available geodata based on the contents of the variable and metadata associated with Crunch-hosted geographic data.

Better derived variable support

  • Derivation expressions can now be retrieved from derived variables with derivation()
  • Derived variables can be integrated or instantiated by setting derivation() <- NULL

Other new functions

Fixes and adjustments

  • Categories are now selectable by names as well as ids
  • Fix issue where deleteSubvariable() by index instead deleted the parent variable
  • Add a missing import from the methods package so that Rscript works (#90)
  • Allow deep copying of multiple-response type variables
  • Experimental support for merging CrunchDataFrames with standard data.frames

Two attempts to fix download issues introduced by 1.17.4:

  • Changed file downloads to crGET with httr::write_disk() to hopefully work around issues caused by utils::download.file with method “libcurl”.
  • Add a retry for downloads to hopefully work around a delay in CDN population.
  • searchDatasets() to use the Crunch search API.
  • Added support for viewing and changing the number of digits after the decimal place to be printed with digits() (useful when exporting to SPSS files).
  • crtabs and table where a dimension is a CrunchLogicalExpr now return a boolean dimension with names “FALSE” and “TRUE”, rather than the previous behavior of dropping the dimension and only returning the TRUE value.
  • Added support for case variables (#36): makeCaseVariable() takes a sequence of case statements to derive a new variable based on the values from other variables.
  • Added a function to create interactions of variables (#42): interactVariables() takes two or more categorical variables and derives a new variable with the combination of each.
  • Fixed a bug where exports (data and tab book) might not work on Windows. If you’re using a version of R older than 3.3, and you now have problems downloading, and you’re not on Windows, try options(download.file.method="curl").
  • Support for streaming data: check for received data with pendingStream(); append that pending stream data to the dataset with appendStream() (#40)
  • Multitables can now be updated with multitables(ds)[["Multitable name"]] <- ~ var1 + var2 syntax. Similarly, multitables can be deleted with multitables(ds)[["Multitable name"]] <- NULL. Multitables also have new name() and delete() methods.
  • toVariable() now accepts (and then strips) arguments of class AsIs (#44)
  • Fixed a bug where changeCategoryID() failed on multiple response variables.
  • dashboard and dashboard<- methods to view and set a dashboard URL on a dataset
  • changeCategoryID function to map categorical data to a new “id” and value in the data (#38, #47)
  • Added importMultitable() to copy a multitable from one dataset to another. Additionally, Multitables now have a show method that displays their name and column variables.
  • Can now extract variables from a dataset by the variable URL
  • appendDataset() now truly appends a dataset and no longer upserts if there is a primary key set. This is accomplished by removing the primary key before appending. (#35)
  • Primary keys can now be viewed with pk(dataset) and set with pk(dataset) <- variable.
  • Fix issue in printing filter expressions with long value columns (#39, #45)
  • Progress bars now clean up after themselves and do not leave the prompt hanging out at the end of the line
  • Test setup code moved to inst/ so that other packages that depend on crunch can use the same setup.

Cube and tab book improvements

  • Reshape TabBookResults that contain categorical array variables so that prop.table computations line up with those not containing array variables (i.e. move subvariables to the third array dimension in the result).
  • Add names, aliases, and descriptions methods to CrunchCube (corresponding to variables of the dimensions in the cube), MultitableResult (corresponding to the “column” variables of the cubes in the result), and TabBookResult (corresponding to the “row”/“sheet” variables in each multitable result).
  • Fix names method for TabBookResults following an API change.
  • Extend crtabs formula parsing to support multiple, potentially named, measures

Other new features

  • weightVariables method to display the set of variables designated as valid weights. (Works like hiddenVariables.)
  • In appendDataset, allow specifying a subset of rows to append (in addition to the already supported selection of variables)
  • loadDataset can now load a dataset by its URL.

Housekeeping

  • Remove “confirm” argument from various delete functions (deprecated since 1.14.4) and the “cleanup” argument to append (deprecated since 1.13.4)
  • All destructive actions now require ‘consent’, even in non-interactive mode. See ?with_consent for more details.
  • Improvements to validation when updating values in a dataset.
  • Move mock API fixtures to inst/ so that other packages depending on this package can access them more easily.
  • Support for additional dataset export arguments
  • Add is.derived method for Variables
  • Allow a ‘message’ when sharing a dataset (#27)
  • More validation for the input to the various export functions
  • Fix handling of “total” column in TabBookResults when the row variable is a categorical array
  • multitables method to access catalog from a Dataset. newMultitable to create one. See ?multitables and ?newMultitable for docs and examples.
  • tabBook to compute a tab book with a multitable. If format="json" (the default), returns a TabBookResult containing CrunchCube objects with which further analysis or formatting can be done.
  • bases method for cubes and tab book responses to access unweighted counts and margin tables.
  • Handle case of attempting to saveVersion when there are no changes since the last saved version.
  • Update to work with roxygen2 6.0.0 release
  • newFilter and newProject functions to create those objects more directly, rather than by assigning into their respective catalogs.
  • Require confirmation before doing a “force” merge in mergeFork.
  • Add with_consent as an alternative to with(consent(), ...)
  • Deprecate the “confirm” argument to destructive functions and methods such as delete in favor of the consent context manager.
  • Add deprecation warning that destructive actions will soon also require consent when running in a non-interactive R session.
  • Use httptest for mocking HTTP and the Crunch API.
  • Trivial change to DESCRIPTION to meet new, hidden CRAN requirement
  • embedCrunchBox to generate embeddable HTML markup for CrunchBoxes
  • duplicated method for Crunch variables and expressions
  • Prevent invalid expressions with incorrect variable references from making bad requests
  • Print methods for Category/ies now show category ids
  • Speed up as.vector and as.data.frame methods by smarter pagination of requests.
  • Option “crunch.namekey.variableorder” to govern how VariableOrder is printed. Current default is “name”, the status quo, but set it to “alias” to have ordering print aliases.
  • Support for is.na<- to set missing values on a variable, equivalent to assigning NA
  • Fix behavior and validation for subsetting datasets/variables that are already subsetted by a Crunch expression object.
  • Allow setting a variable entity to settings(ds)$weight and not just its self URL.
  • crunchBox to make a public, embeddable analysis widget
  • settings and settings<- to view and modify dataset-level controls, such as default “weight” and viewer permissions (“viewers_can_change_weight”, “viewers_can_export”)
  • flattenOrder to strip out nested groups from an order
  • Univariate statistics on variables, such as mean, median, and sd, now respect filter expressions, as does the summary method.
  • “median” can now be used in crtabs
  • Copying and deriving variables now bring in the “notes” attribute.
  • Improve error handling when attempting to loadDataset from a nonexistent project.
  • More utility functions for working with order objects: dedupeOrder, removeEmptyGroups
  • appendDataset can now append a subset of variables
  • Update to changes in the dataset version API
  • Fix bug in assigning NA to an array subvariable that didn’t already have the “No Data” category
  • flipArrays function to generate derived views of array subvariables
  • Add autorollback argument to appendDataset, defaulted to TRUE, which ensures that a failed append leaves the dataset in a clean state.
  • allVariables is now ordered by the variable catalog’s order, just as variables has always been.
  • Add “force” argument to mergeFork.
  • Support an as_array (pseudo-)function in crtabs that allows crosstabbing a multiple-response variable as if it were a categorical array.
  • Fix bug in dataset export when attempting to export a single variable
  • Support deep copying of categorical array variables.
  • Join (merge) a subset of variables and/or rows of a dataset.
  • moveToGroup function and setter for easier adding of variables to existing groups.
  • locateEntity function to find a variable or dataset within a potentially deeply nested order.
  • Change default key for printing hiddenVariables from “name” to “alias”, governed by options(crunch.namekey.dataset) as elsewhere
  • Allow disabling of check for new package releases on load by setting options(crunch.check.updates=FALSE).
  • Return a Session object from session() that fetches catalogs lazily, on first access, rather than when it is instantiated.
  • as.vector on a categorical-array or multiple-response variable now returns a data.frame. While a matrix is a more accurate representation of the data type, using data.frame allows for more intuitive accessing of subvariables by $, just as they are from the Crunch dataset.
  • Enhancements to merge/extendDataset: a “by” argument as a shortcut for “by.x” and “by.y”; referencing “by” variables by alias; and aliasing the function also through joinDatasets with its (new) default copy=TRUE argument.
  • POST new array variable definitions that are a series of subvariable definitions as a single request, rather than uploading each subvariable separately and then binding.
  • Improve addSubvariable to PATCH rather than unbind and rebind; also extend it to accept more than one (sub)variable to add to the array.
  • Remove pattern matching argument from makeArray, makeMR, deleteVariables, and hideVariables, deprecated since 1.9.6.
  • Standardize deleteSubvariable to follow the model of deleteVariable, including requiring consent to delete.
  • New vignette on downloading data to your local R session and exporting datasets to file formats.
  • Preparation for upcoming API changes.
  • Patch a test for handling duplicate factor levels, which is deprecated in current R releases but converted to an error in the upcoming release.
  • Breaking change: Accessing subvariables from array variables is now done by alias, just as variables are extracted from a Dataset. The “crunch.namekey.dataset” and “crunch.namekey.array” options have existed for a while, but they’ve had different default values. Now both default to “alias”, which should offer a more consistent interface. If you want to maintain the old behavior, you can set options(crunch.namekey.array="name") in your script or in your .Rprofile.
  • deleteSubvariable now follows “crunch.namekey.array” and will take either subvariable names or aliases, depending on the value of the setting.
  • New extendDataset function, also aliased as merge, to allow you to add columns from one dataset to another, joining on a key variable from each.
  • compareDatasets now checks the subvariable matching across array variables in the datasets to identify additional conflicts.
  • Creating Crunch logical expressions that reference category names that do not exist for the given variable no longer errors; instead, a warning is given, and the unknown category names are dropped so that the expression evaluates as intended.
  • notes and notes<- methods for datasets, variables, and variable catalogs to view and edit those new metadata fields.
  • Update for API change in dataset export.
  • Attempting to assign a name<- on NULL (i.e. when you reference a variable in a dataset using $ and the variable does not exist) returns a helpful message.
  • Fix dataset import via newDataset when passing a data.frame or similar that has spaces in the column names.
  • Handle the (deprecated in R) case of duplicate factor levels when translating to categorical in toVariable
  • Fix issue with sharing datasets owned by a project.
  • Support updating Categorical variables created from R logical-type vectors with logical values
  • Remove the “crunch.max.categories” option, which governed converting factors to Crunch categorical variables only if they had fewer categories than that threshold. Use as.character if you have a factor and want it to be imported as type Text.
  • Increase default “crunch.timeout” for long-running jobs to 15 minutes, after which point progress polling will give up.
  • Add cleanseBatches function to remove batch records from failed append attempts. Remove deprecated code around batch conflict reporting.
  • Validation to prevent attempting to set NA category names.
  • Generic datasets and projects functions to get dataset and project catalogs. (datasets previously existed only as a method for Project entities.)
  • Add project argument to listDatasets and add project and refresh to loadDataset to facilitate viewing and loading datasets that belong to projects.
  • New function compareDatasets that shows how datasets will line up when appending. A summary method on its return value prints a report that highlights areas of possible mismatch.
  • Support computing numeric aggregates (mean, max, etc.) of categorical variables with numeric values in crtabs
  • Allow NULL assignment into Variable/DatasetGroups to remove elements
  • Fix refresh method for Datasets that have been transferred to a Project.
  • (Re-)improve print method for expressions involving categorical variables
  • Improve handling of filters when composing complex expressions of CrunchExpr, Variable, and Dataset objects
  • Add expression support for operations involving a DatetimeVariable and a character vector, assumed to be ISO-8601 formatted.
  • Export a permissions method for Datasets to work directly with sharing privileges.
  • Fix as.data.frame/as.environment for CrunchDataset when a variable alias contained an apostrophe.
  • Better print method for project MemberCatalog.
  • Fix for change in jsonlite API in its v0.9.22
  • Progress polling now returns the error message, if given, if a job fails.
  • exportDataset to download a CSV or SAV file of a dataset. write.csv convenience method for CSV export.
  • Correctly parse datetimes that don’t include timezone information.
  • Add icon and icon<- methods for Projects to read the project’s current icon URL and to set a new icon by supplying a local file name to upload.
  • Get and set “archived” and “published” status of a dataset with is.archived, is.draft, and is.published (the inverse of is.draft). See ?publish for more.
  • Add draft argument to forkDataset
  • Support for future API to handle failed long-running jobs.
  • Assorted updates to new API usage
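
The note above about as.vector returning a data.frame for array variables can be illustrated in base R alone. The sketch below does not call the crunch package; the column names pets_cat and pets_dog are made-up stand-ins for subvariable aliases. It only shows why $ access favors a data.frame over a matrix.

```r
# Hypothetical local stand-in for the values of a multiple-response variable.
responses <- data.frame(
  pets_cat = c("selected", "not selected"),
  pets_dog = c("not selected", "selected"),
  stringsAsFactors = FALSE
)

# With a data.frame, a column can be pulled out with `$`.
from_df <- responses$pets_dog

# With a matrix, the same data needs bracket indexing by column name ...
as_matrix <- as.matrix(responses)
from_matrix <- as_matrix[, "pets_dog"]

# ... because `$` is not defined for atomic vectors such as a matrix.
dollar_on_matrix <- tryCatch(
  as_matrix$pets_dog,
  error = function(e) "error"
)
```

Both access paths yield the same values; only the data.frame supports the $ shorthand that mirrors how subvariables are accessed from a Crunch dataset.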

New support for working with users and their permissions on datasets and projects

  • Add owner and owner<- for datasets to read and modify the owner
  • Add owners and ownerNames for DatasetCatalog
  • is.editor and is.editor<- for project MemberCatalog
  • me function to get the user entity for yourself

Other changes

  • Add missing print method for DatasetOrder
  • Support creating OrderGroups (for both Datasets and Variables) by assigning URLs into a new group name
  • Improve support for parsing datetime data values
  • Fix bug in setting nested groups inside DatasetOrder
  • Fix failure on interactive login in R.app on OS X
  • Generalize and update to new Progress API. Add a progress bar.
  • Remove deprecated query parameter on variable catalog
  • variableMetadata function to export all variable metadata associated with the dataset
  • Better support for deleting hidden variables
  • Allow subsetting of datasets to include hidden variables
  • Require that version names be a single string value
  • Fix bug in print method for VariableOrder that manifested when fixing the variable catalog’s relative URL API
  • Add warning that the pattern argument for functions including makeArray, makeMR, deleteVariables, and hideVariables is being deprecated. The help pages for those functions advise you to grep for or otherwise identify your variables outside of these functions.
  • unshare to revoke access of a user or a team to a dataset.
  • Support for DatasetOrder, in particular for datasets within a project.
  • Do more validation that type<- assignment is safe.
  • Make paginated requests to GET /table/ (for CrunchExprs) for greater reliability
  • Finally fix bug that prevented sharing datasets with non-editors when the dataset had already been shared with a team.
  • Add a “session” object, retrievable by either session() or returned from login(), containing the various catalog resources (Datasets, etc.).
  • Additional methods on the dataset catalog, such as names<-.
  • Extract from most catalogs either by URL or name.
  • Initial implementation of Projects API.
  • loadDataset with a dataset catalog tuple, allowing some degree of tab completion by dataset name. (Example: cr <- login(...); ds <- loadDataset(cr$datasets$My_Dataset_Name))
  • Update tests to pass with forthcoming release of testthat.
  • Remove useAlias attribute of datasets and move it to a global option, “crunch.namekey.dataset”, defaulted to “alias”. Implement the same for array variables, “crunch.namekey.array”, and default to “name” for consistency with previous versions. This default will change in a future release.
  • New Progress API for checking status of pending, long-running server jobs.
  • Switch as.vector for CrunchExpr to GET rather than POST.
  • forkDataset to make a fork (copy) of a dataset; mergeFork to merge changes from a fork back to its parent (or vice versa)
  • Remove a duplicate request made when setting variable order
  • Update to new API to get a datetime variable’s rollup resolution and save a request

Major changes

  • Pull HTTP query cache out to the httpcache package and take dependency on that. Remove dependency on digest package (httpcache depends on it instead).
  • New vignette on filters and exclusions
  • combine categories of categorical and categorical-array variables, and responses of multiple-response variables, into new derived variables
  • startDate and endDate attributes and setters for dataset entities (#10, #11)
  • Allow editing of filter expressions in UI filter objects (CrunchFilter)

Other changes

  • Improved validation for “name” setting, especially for categories
  • Speed up ncol(ds) by removing a server request
  • Speed up variable catalog editing by avoiding unnecessary updates to the variable order
  • Fix cache invalidation when reordering subvariables
  • Improve error message for subscript out of bounds in catalog objects
  • Include active filter in print method for datasets and variables, if applicable
  • More formal support for creating and managing UI filters
  • Better print method for Crunch expressions (CrunchExpr): prints an R formula-like expression
  • Fix error in reading/writing query cache with a very long querystring. Requires new dependency on the digest package.
  • Fix bug in assigning name(ds$var$subvar) <- value
  • Fix overly rigid validation in share
  • Update API usage to always send full variable URLs in queries
  • Add method for R logical &/| Crunch expression
  • Upgrade for compatibility with httr 1.1
  • addSubvariable function to add to array and multiple response variables (#7)
  • Make paginated requests to GET /values/ for greater reliability
  • Update to match changes in filter API
  • dropRows to permanently delete rows from a dataset.
  • Better print method for catalog resources, using the new catalogToDataFrame function.
  • Export a few more functions (shojiURL, batches)
  • Catch NULL in cube dimension when referencing subvariable that does not exist (as when using alias instead of name) and return a useful message.
  • Fix for unintended substring matching in %in% expression translation.
  • Internal change to match user catalog API update
  • Update docs to conform to R-devel changes to as.vector’s signature.
  • addVariables function to add multiple variables to a dataset efficiently
  • Support aggregating with CrunchExprs and filtered variables in table
  • Save a variable catalog refresh on (un)dichotomize. Slight speedup as a result.
  • Fix bug in creating VariableOrder with a named list.
  • Improve performance of many operations by more lazily loading variable entities from the server. Changes to several internal package APIs to make that happen, but the public package interface should be unchanged.
  • Also speed up loading of variable catalogs by deferring resolution of relative subvariable URLs until requested. Eliminates significant load time for datasets with lots of array variables.
  • Fix bug in results from crtabs when requesting a crosstab of three or more dimensions.
  • VariableDefinition (or VarDef) function and class for creating variable definitions with more metadata (rather than assigning R vectors into a dataset and having to add metadata after).
  • Reworked various new variable functions, including copy, makeArray, and makeMR, to return VariableDefinitions rather than creating the new variables themselves. Creation happens on assignment into the dataset.
  • Support adding No Data (NA for categoricals) even if No Data doesn’t already exist
  • Tools for logging and profiling HTTP requests and cache performance. See ?startLog and ?logMessage.
  • Support deep copying of non-array variables.
  • Check for new version of the package on GitHub when the package is loaded.
  • Make a shallow copy of a variable. See ?copyVariable.
  • Fix error in updating the values of a subvariable in an array.
  • Handle case of assigning NULL into a dataset when the referenced variable (alias) does not exist.
  • More support for NA assignment into variables.
  • Gradually slow the polling of /batches/ while waiting for an append to complete. Improves the performance of the append operation.
  • New c method for Categories, plus support for creating and adding new categories to variables. See ?Categories and ?"c-categories"
  • Get category ids or numeric values from as.vector by specifying a “mode” of “id” or “numeric”, respectively. See ?"variable-to-R"
  • Set values as missing by assigning NA into variables.
  • Always send No Data category when creating Categorical Variables.
  • Fixed minor bugs in margin.table on CrunchCube objects.
  • Better validation of category subsetting.
  • Add Python-esque context manager for use in with statements. Use it to give consent() to delete things.
  • Delete variables by <- NULL into a dataset (like removing a column from a data.frame). Requires consent. Also add deleteVariable(s) functions, which return the dataset object. Use either method to prevent your dataset from getting out of sync with the server when you delete variables.
  • Delete subvariables from within array variables with deleteSubvariable(s).
  • Better evaluation of formulas within crtabs to allow you to crosstab array subvariables.
  • Update to new exclusion API.
  • Validate inputs on making filter expressions with categorical variables
  • Very basic print methods for all Crunch objects
  • Subset rows of datasets and variables for analysis, using either [ or subset
  • Access and set exclusion filters on datasets to drop certain rows
  • Fix some inconsistent handling in R of filters that are set on the server (i.e. for persistent viewing in the web application)
  • (un)lock datasets for editing when there are multiple editors
  • Send better emails when sharing datasets
  • Support for auto-login in Jupyter notebooks
  • One more CRANdated import
  • Import functions from methods, stats, and utils, per change in CRAN policy.
  • Functions saveVersion and restoreVersion for dataset versioning
  • Update requirement to httr 1.0; remove dependency on RCurl in favor of curl
  • Minor API updates
  • Fix for some issues authenticating on Windows
  • Fix bug in editing array variables with a single subvariable
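
The entry above on the Python-esque context manager describes a pattern that can be sketched in base R. The following is a minimal illustration of the idea, not the package's actual implementation: the class name sketch_context, the option sketch.consent, and the helpers give_consent and delete_thing are all hypothetical.

```r
# Hypothetical context manager: run `enter`, evaluate the block, always run `exit`.
context_manager <- function(enter, exit) {
  structure(list(enter = enter, exit = exit), class = "sketch_context")
}

# S3 method so the object can be used as the first argument to with().
with.sketch_context <- function(data, expr, ...) {
  data$enter()
  on.exit(data$exit())
  eval(substitute(expr), envir = parent.frame())
}

# A consent-like context: sets an option for the duration of the block only.
give_consent <- function() {
  context_manager(
    enter = function() options(sketch.consent = TRUE),
    exit = function() options(sketch.consent = NULL)
  )
}

# A destructive action that refuses to run unless consent is in effect.
delete_thing <- function(name) {
  if (!isTRUE(getOption("sketch.consent"))) {
    stop("Must confirm deleting ", name, call. = FALSE)
  }
  paste("deleted", name)
}

allowed <- with(give_consent(), delete_thing("var1"))
refused <- tryCatch(delete_thing("var1"), error = function(e) conditionMessage(e))
```

Using on.exit means the consent flag is cleared even if the block fails, so consent cannot leak into later destructive calls.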

crunch 1.3.3

  • More tools (not yet exported) for managing users

crunch 1.3.2

  • Adapt to minor updates in append API: new intermediate “appended” state for append operations.

crunch 1.3.1

  • More methods for managing teams
  • Prepare for httr 1.0
  • Provisional interface for managing users and teams.
  • Improved messaging for failure modes in appendDataset.
  • Adapt to minor updates in append API
  • Fix bug in updating an array with only one subvariable.
  • Add types method to VariableCatalog.
  • Additional methods for working with VariableOrder and VariableGroup. You can create new Groups by assigning into an Order or Group with a new name. And, with the new duplicates parameter, which is FALSE by default, adding new Groups to an Order “moves” the variable references to the new Group, rather than creating copies. See the variable order vignette for more details.
  • Add share function for sharing a dataset with other users.
  • Remove all non-ASCII from test files so that tests will run on Solaris.
  • Add query cache, on by default.

  • as.data.frame no longer returns an actual data.frame unless given the argument force=TRUE. Instead, it returns a CrunchDataFrame, an environment containing unevaluated promises. This allows R functions, particularly those of the form function(formula, data), to work with CrunchDatasets without copying the entire dataset from the server to local memory. Only the variables referenced in the formula fetch data, when their promises are evaluated.
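
The idea of an environment of unevaluated promises can be sketched with base R's delayedAssign. This is only an illustration under stated assumptions, not how CrunchDataFrame is implemented: fetch_column is a hypothetical stand-in for a server request, and make_lazy_env and the column names are invented for the example.

```r
# Hypothetical stand-in for a server fetch; records which columns were requested.
fetched <- character(0)
fetch_column <- function(name) {
  fetched <<- c(fetched, name)
  switch(name, age = c(23, 35, 41), income = c(10, 20, 30))
}

# Build an environment in which each column is a promise, not a value.
make_lazy_env <- function(columns, fetch) {
  env <- new.env()
  invisible(lapply(columns, function(col) {
    # The call fetch(col) is stored unevaluated and runs on first access only.
    delayedAssign(col, fetch(col), assign.env = env)
  }))
  env
}

lazy_df <- make_lazy_env(c("age", "income"), fetch_column)

# Evaluating an expression that mentions only `age` forces only that promise.
mean_age <- eval(quote(mean(age)), lazy_df)
```

After this runs, fetched records a single request for age; income stays unfetched until something references it.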

  • Remove RJSONIO dependency in favor of jsonlite for toJSON.

  • Rename package to crunch. Update all docs to reflect that. Make amendments to pass CRAN checks.
  • newDataset2 renamed to newDatasetByCSV and made to be the default strategy in newDataset. The old newDataset has been moved to newDatasetByColumn.

  • Support for NA and NaN in crtabs response.

  • getCube is now crtabs. Ready for more extensive beta testing. Has prop.table and margin.table methods. Vignette forthcoming.

  • newDataset2 that uses the CSV+JSON import method, rather than the column-by-column strategy that newDataset uses.

  • Support for shoji:order document for hierarchical variable order. HTTP API change.

  • Initial, limited support for xtabs-like crosstabbing with a formula with the getCube function.