NEWS.md
Variables can now be created as materialized by default instead of derived, by setting environment variable R_CRUNCH_DEFAULT_DERIVED or option crunch.default.derived to FALSE. See ?toVariable for more information (#648).
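A minimal sketch of the option route, assuming only what this entry states: the option name crunch.default.derived and the value FALSE come from the entry, while the save-and-restore pattern around them is plain base R and just one way to scope the change.

```r
# Set the option named in this entry so that newly created variables
# default to materialized rather than derived. Nothing here contacts the
# Crunch server; options() only records the value for the session.
old <- options(crunch.default.derived = FALSE)

# Inspect the value currently in effect.
getOption("crunch.default.derived")

# Restore whatever was set before (options() returned it above).
options(old)
```

Saving the return value of options() makes it easy to switch the default for one block of work without changing it for the rest of the session.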
The concept of a personal folder is being removed from the API imminently. This has a few implications for rcrunch:
All datasets must be created with a project (e.g. via the project argument of newDataset()).
Dataset forks will be created in the same folder as their parent
Because loading datasets by name doesn’t work for datasets in projects, it’s not really possible to load a dataset by name without specifying the full project path.
To make things easier, it is possible to set a default project path with environment variable R_CRUNCH_DEFAULT_PROJECT or option crunch.default.project. This will be used as the default project folder when creating and loading datasets. Forks will still be put next to parents.
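The two ways of setting the default project path can be sketched as follows. The path "Surveys/2023" is a hypothetical placeholder, and its slash-separated form is an assumption; only the environment variable and option names come from this entry.

```r
# Hypothetical project path; replace it with a folder that exists in your account.
default_project <- "Surveys/2023"

# Route 1: environment variable (could also live in a .Renviron file).
Sys.setenv(R_CRUNCH_DEFAULT_PROJECT = default_project)

# Route 2: session-level option (could also live in a .Rprofile file).
options(crunch.default.project = default_project)

# Check what is currently set.
Sys.getenv("R_CRUNCH_DEFAULT_PROJECT")
getOption("crunch.default.project")
```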
crunch.warn.hidden
& crunch.warn.private
options are set (#619).
names()
to avoid issues with stuttering by RStudio auto-complete with option crunch.names.includes.hidden.private.variables
(#619).
crunch.order.var.catalog
(#619).
ScriptCatalog
(and removal of the ScriptCatalog
method for ScriptBody
the full body, subset to the particular script if you need the body text with vapply(scripts(ds), function(x) scriptBody(x), character(1))
).
login()
is no longer supported. See the vignette("crunch")
or ?crunch-api-key
for more details on authenticating with API keys.
exportDataset()
(#595)
vignette("crunch")
or ?crunch-api-key
for more details. The login()
authentication flow is deprecated and will be removed in an upcoming release.
interactVariables()
now uses server side logic to create categories so that it’s faster, but the sep
argument is no longer supported (it’s always set to " and "
) and the category order will not be the same.
as.data.frame()
and as.vector()
work with numeric arrays now (#558)
options(crunch.show.progress.url = TRUE)
to show the URL checked for progress (#565)
as.data.frame()
where it would not respect the include.hidden
argument (#560)
forceVariableCatalog()
or options(crunch.lazy.variable.catalog = FALSE)
).
searchDatasets()
gained an argument f
that allows you to pass R objects to filter on.
deriveArray()
and makeArray()
, and you can perform calculations on them via crtabs()
and tabBook()
.
analyses
argument of newSlide()
or set analysis<-
to a list, allowing all slide customizations from R.
?runCrunchAutomation
for more information
expropriateUser()
function is now accessed through the function call reassignUser()
. Functionality of the call has not changed.
makeCaseWhenVariable()
that helps with many common recoding needs.
alterArrayExpr()
which allows adding, removing, reordering and renaming subvariables in a derived Array.
keyring
package to store your credentials. See ?login()
for details (thanks @mainwaringb!)
Add several new expressions
that let you create derived variables in more flexible ways than was previously possible.
crunch::filter()
now falls back to the next filter
on your searchpath when no method is defined.
tabBook()
by default uses a new endpoint, which allows for more options. The old endpoint is deprecated, but while the server supports it, you can still use it. See ?tabBook
for more details.
weight()
and weight()<-
(#440)
filter
or filter
object when using tabBook()
. Filtering by expression in the dataset argument is also supported again.
newMultitable()
now correctly passes ...
, so arguments like is_public
work (#424)
hiddenVariables()
works when the hidden variables folder has subdirectories (#372).
deriveArray()
using expressions to create the subvariables.
slideCategories()
helps you create overlapping categorical variables (#396).
stringsAsFactors
defaults (#402).
newSlide
includes examples of vizType
settings and other improvements.
copyFolders()
function that copies folders and variable order from one dataset to another (similar to copyOrder()
which was deprecated)
filters(slide_object) <- NULL
or filters(slide_object) <- filter_object
)
expect_either()
)
...
removed from documentation
(un)hideVariables()
functions are upgraded to use folder operations.
~
, as in a *nix
file system. cd(projects(), "~")
takes you there; mv(projects(), ds, "~")
moves ds
into your personal folder.
listDatasets()
now by default only prints datasets in your personal folder, not a combination of your personal datasets and some of the datasets that have been shared with you.
loadDataset()
, it now searches to find datasets exactly matching that name unless you specify a project to load from. If you have multiple datasets with the same name in different locations, loadDataset("your dataset name")
may return a different one than it did previously. If you want to identify a dataset precisely in loadDataset()
, either specify the dataset URL (most effective but not as human friendly) or provide project = "path/to/folder"
.
loadDataset(<integer>)
is no longer supported.
is.public(multitables[[i]]) <- TRUE
and several other similar assignments of attributes on an element of a catalog, which previously successfully updated the value on the server but errored when returning to R (#303, #367)
subvariables()
on non-array variables returns NULL
instead of an error (#237)
prop.tables
newExampleDataset()
creates a sample dataset for you to explore
exportDeck()
can now write to PowerPoint with format = "pptx"
newDataset()
now supports importing data in Triple-S format, providing a schema
file in addition to the row data.
resolution()
lets you see the data units of a datetime variable (“Y”, “M”, “D”, “ms”, etc.); resolution<-
lets you set it (#234)
deleteDataset()
accepts web app URLs, just as loadDataset()
already did (#279)
options(crunch.warn.hidden=FALSE)
to suppress the “Variable x is hidden” messages when accessing hidden variables (#172)
team(deck) <-
upsert
argument to appendDataset()
to allow datasets to be updated based on the primary-key variable; see pk()
for details on primary keys (#49)
combineCategories()
and combineResponses()
are aliases for combine()
, providing a way to avoid accidental clashes with dplyr::combine()
(#359)
teams()<-
on them. View which teams can access them by calling teams()
on them.
"."
as a folder path/segment, referencing the current folder. cd(project, ".")
returns project
; mv(project, ds, ".")
moves ds
into project
.
datasetReference()
listDatasets()
and makeArrayGadget()
have been moved to the crunchy
package. Wiring for them, including RStudio add-in configuration, remains here, but you’ll have to install that package to use them.
mv()
and the other folder operations. These functions will be removed in December 2018.
cd()
, mv()
, mkdir()
, rmdir()
) for organizing datasets within projects, following the pattern of variable folders. See vignette("projects", package = "crunch")
.
setName()
and setNames()
for renaming folders and folder contents, respectively.
makeWeight()
is now correct for categorical variables with non-sequential IDs.
write.csv
or as.data.frame(force = TRUE)
if requested.
index.table()
to better reflect analysts’ intentions. Now, index.table()
calculates the index with respect to the marginal proportion of the margin
given, so for index.table(cube, 2)
the column proportions of the table are indexed to the marginal row proportions. In other words: for each column, how much larger or smaller is the proportion in that column when compared to the proportions for the row variable alone.
haven
package and its new haven_labelled
and haven_labelled_spss
object classes.
margin.table
, prop.table
, etc.)
mv()
to move them to a folder.
deleteVariables()
no longer tries to delete duplicate variables.
as.data.frame(..., force = TRUE)
with numeric variables that have missing values.
Suggests
reference for test packages, following new check
requirement.
getDimTypes()
returns a richer set of cube dimension types differentiating multiple response from categorical array dimensions.
alias
, description
, and notes
on VariableTuples
vignette("crunch")
changeCategoryID()
tries to unset then reset the dataset exclusion if that impacts its progress. Best practice is to disable exclusions before running changeCategoryID()
if at all possible.
ordering<-
of datasets within a project will now drop any invalid entries with a warning, rather than error.
NA
data.
streamRows()
for case when sending only one row (#253).
getDimTypes()
returns a richer set of cube dimension types differentiating multiple response from categorical array dimensions.
alias
, description
, and notes
on VariableTuples
makeArrayGadget()
launches an RStudio gadget to help you build valid categorical arrays and multiple response variables.
CrunchCubes can now be subset just like R arrays using the [
method.
numeric_values
). See ?addSummaryStat
for more information.
index.table()
to return tables indexed to a margin.
subtotals(var) <- NULL
when it already was NULL
(#231).
""
for variable metadata fields if no value is set (#232).
makeMRFromText()
with a categorical variable.
crunch*
packages can use it.
%in%
and ==
on Crunch objects now follow R semantics more closely with regard to missing data.
cd()
, mv()
, mkdir()
, rmdir()
. These functions use a new API for variable folders (unlike the experimental versions of some that were introduced in the 1.19.0 package release). This API is currently in a beta testing phase. See vignette("variable-order", package="crunch")
for examples and details.
listDatasets(shiny = TRUE)
launches an RStudio addin which allows you to select your dataset in order to generate a valid loadDataset()
call. You can also associate this addin with a hotkey in RStudio through Tools
> Modify Keyboard Shortcuts
.
webApp()
now works for Crunch variables: it will take you to the “browse” view of the web application with the given variable card loaded on screen.
ds$id_var_numeric <- as.Numeric(ds$id_var)
. There are as.*
methods for all Crunch data types except for array-like variables.
haven’s labelled
class when converting to Crunch variable types.
makeMRFromText()
to take a variable imported as delimited strings, parse the multiple-response options, and return a (derived) multiple_response
variable.
setPopulation(ds, size = 24.13e6, magnitude = 3)
and for getting population sizes (or magnitudes) with popSize(ds)
and popMagnitude(ds)
respectively.
rollupResolution(ds$datetime)
and set with rollupResolution(ds$datetime) <- "M"
.
options(crunch.show.progress)
to govern whether to report progress of long-running requests. Default is TRUE
, but set it to FALSE
to run quietly.
pollProgress()
and recommend using that when a long-running request fails to complete within the local timeout.
subtotals(variable) <- Subtotal(name = 'subtotal', categories = c(1, 2), after = 2)
. Use subtotals(variable)
to see what subtotals are set for a variable.
subtotalArray([cube])
?subtotals
or vignette("subtotals", package="crunch")
for more information.
as_selected
function instead of selected_array
, which is now deprecated).
options(crunch.mr.selection = "selected_array")
.
conditionalTransform()
conditionalTransform()
now has a formulas
argument to specify a list of conditions to be used.
conditionalTransform()
.
refresh()
for Datasets is now more efficient.
ordering(ds)[[c("Top folder", "Nested folder")]]
) or a single string with nested folders separated by a delimiter (as in ordering(ds)[["Top folder/Nested folder"]]
). “/” is the default path delimiter, and this is configurable via options(crunch.delimiter)
. If you have folders that actually contain “/” in the folder name, this may be a breaking change. If so, set options(crunch.delimiter="|")
or some other string so that folder names are not incorrectly interpreted as paths.
mv()
and mkdir()
functions for creating variable folders and moving variables into them. These take a Dataset as their argument and can be chained together for convenience/readability.
folder()
and folder<-
to locate a variable in the folder hierarchy and to move it to a new folder. folder(ds$var) <- "New folder/subfolder"
is equivalent to ds <- mv(ds, "var", c("New folder", "subfolder"))
.
conditionalTransform()
(#64, #153)
collapseCategories()
allows you to combine categories in place without creating a new variable
copy()
has been made more efficient
CrunchDataFrames
have been improved to act more data.frame
-like. You can now access and overwrite values with standard data.frame
methods like crdf$variable1
or crdf[,"variable1"]
and crdf$variable1 <- 1
or crdf[,"variable1"] <- 1
. CrunchDataFrames
now also support adding arbitrary columns, although it should be noted that these columns are not stored on the Crunch server, so if you want to keep that data outside of your current R session, you should send it back to your Dataset as a new variable.
is.selected()
is now vectorized to work with Categories, as is.na()
has always been. You can also now assign into the function (#123)
addSubvariable()
now accepts variable definitions directly (#72)
makeCaseVariable()
has better errors when a user doesn’t name all of their case definitions (#158).
as.data.frame()
when force = TRUE
has been removed (#150)
as.data.frame()
method.
modifyWeightVariables()
, weightVariables(ds) <- ds$newWeight
or is.weightVariables(ds$var) <- TRUE
expropriateUser()
to transfer datasets, projects, and other objects owned by one user to another, as when that user has left your organization.
UserCatalogs
by email (e.g. catalog[["you@example.com"]]
) by default. All catalog extract methods ([
and [[
) now also accept a secondary
argument for setting an index to match against to change that default.
R_CRUNCH_EMAIL
and R_CRUNCH_PW
respectively.
as_selected
multiple-response variables have margin and prop.table methods
variables()
now contain additional metadata, including “type”
bases()
when called on a univariate statistic (#124)
testthat
makeWeight()
allows you to generate new weighting variables based on categorical variables (#80).
cut()
, equivalent to base::cut
, allows you to generate a derived categorical variable based on a numeric variable (#93).
newDataset()
directly instead of newDatasetFromFile
. Also, you can now create a dataset from a hosted file passing its URL to newDataset(FromFile)
.
as.data.frame()
method for VariableCatalog
for a view of variable metadata (#75)
crunchBox()
now allows you to specify colors for branding or even category-specific coloring.
login()
in a way that conceals the input.
changeCategoryID()
to only update numeric values of the category having its id changed when the id and the numeric value are the same.
autorollback
argument of appendDataset()
has been deprecated. The option no longer has any effect and a warning will be printed to notify users about the deprecation.
newDatasetByCSV
was removed.
geo()
on a variable to see if there is already associated geographic data.
addGeoMetadata()
function to match a text or categorical variable with available geodata based on the contents of the variable and metadata associated with Crunch-hosted geographic data.
derivation()
derivation() <- NULL
resetPassword()
function
copyOrder()
to copy the ordering of variables from one dataset to another.
loadDataset()
and it will now load the same dataset in your R session.
webApp()
function to go the other way: open the dataset from your R session in your web browser.
categoriesFromLevels()
is now exported (#77)
deleteSubvariable()
by index instead deleted the parent variable
methods
package so that Rscript
works (#90)
CrunchDataFrames with standard data.frames
Two attempts to fix download issues introduced by 1.17.4:
crGET
with httr::write_disk()
to hopefully work around issues caused by utils::download.file
with method “libcurl”.
retry
for downloads to hopefully work around a delay in CDN population.
searchDatasets()
to use the Crunch search API.
digits()
(useful when exporting to SPSS files).
crtabs
and table
where a dimension is a CrunchLogicalExpr
now return a boolean dimension with names “FALSE” and “TRUE”, rather than the previous behavior of dropping the dimension and only returning the TRUE
value.
makeCaseVariable()
takes a sequence of case statements to derive a new variable based on the values from other variables.
interactVariables()
takes two or more categorical variables and derives a new variable with the combination of each.
options(download.file.method="curl")
.
pendingStream()
; append that pending stream data to the dataset with appendStream()
(#40)
multitables(ds)[["Multitable name"]] <- ~ var1 + var2
syntax. Similarly, multitables can be deleted with multitables(ds)[["Multitable name"]] <- NULL
. Multitables also have new name()
and delete()
methods.
toVariable()
now accepts (and then strips) arguments of class AsIs
(#44)
changeCategoryID()
failed on multiple response variables.
dashboard
and dashboard<-
methods to view and set a dashboard URL on a dataset
changeCategoryID
function to map categorical data to a new “id” and value in the data (#38, #47)
importMultitable()
to copy a multitable from one dataset to another. Additionally, Multitables now have a show method showing their name and column variables.
appendDataset()
now truly appends a dataset and no longer upserts if there is a primary key set. This is accomplished by removing the primary key before appending. (#35)
pk(dataset)
and set with pk(dataset) <- variable
.
inst/
so that other packages that depend on crunch
can use the same setup.
prop.table
computations line up with those not containing array variables (i.e. move subvariables to the third array dimension in the result).
names
, aliases
, and descriptions
methods to CrunchCube
(corresponding to variables of the dimensions in the cube), MultitableResult
(corresponding to the “column” variables of the cubes in the result), and TabBookResult
(corresponding to the “row”/“sheet” variables in each multitable result).
names
method for TabBookResults following an API change.
crtabs
formula parsing to support multiple, potentially named, measures
weightVariables
method to display the set of variables designated as valid weights. (Works like hiddenVariables
.)
appendDataset
, allow specifying a subset of rows to append (in addition to the already supported selection of variables)
loadDataset
can now load a dataset by its URL.
?with_consent
for more details.
inst/
so that other packages depending on this package can access them more easily.
is.derived
method for Variables
TabBookResults when the row variable is a categorical array
multitables
method to access catalog from a Dataset. newMultitable
to create one. See ?multitables
and ?newMultitable
for docs and examples.
tabBook
to compute a tab book with a multitable. If format="json"
(the default), returns a TabBookResult
containing CrunchCube
objects with which further analysis or formatting can be done.
bases
method for cubes and tab book responses to access unweighted counts and margin tables.
saveVersion
when there are no changes since the last saved version.
roxygen2
6.0.0 release
newFilter
and newProject
functions to create those objects more directly, rather than by assigning into their respective catalogs.
mergeFork
.
with_consent
as an alternative to with(consent(), ...)
delete
in favor of the consent
context manager.
httptest
for mocking HTTP and the Crunch API.
embedCrunchBox
to generate embeddable HTML markup for CrunchBoxes
duplicated
method for Crunch variables and expressions
as.vector
and as.data.frame
methods by smarter pagination of requests.
ordering
print aliases.
is.na<-
to set missing values on a variable, equivalent to assigning NA
settings(ds)$weight
and not just its self
URL.
crunchBox
to make a public, embeddable analysis widget
settings
and settings<-
to view and modify dataset-level controls, such as default “weight” and viewer permissions (“viewers_can_change_weight”, “viewers_can_export”)
flattenOrder
to strip out nested groups from an order
mean
, median
, and sd
, now respect filter expressions, as does the summary
method.
crtabs
loadDataset
from a nonexistent project.
dedupeOrder
, removeEmptyGroups
appendDataset
can now append a subset of variables
flipArrays
function to generate derived views of array subvariables
autorollback
argument to appendDataset
, defaulted to TRUE
, which ensures that a failed append leaves the dataset in a clean state.
allVariables
is now ordered by the variable catalog’s order, just as variables
has always been.
mergeFork
.
as_array
(pseudo-)function in crtabs
that allows crosstabbing a multiple-response variable as if it were a categorical array.
merge
) a subset of variables and/or rows of a dataset.
moveToGroup
function and setter for easier adding of variables to existing groups.
locateEntity
function to find a variable or dataset within a potentially deeply nested order.
hiddenVariables
from “name” to “alias”, governed by options(crunch.namekey.dataset)
as elsewhere
options(crunch.check.updates=FALSE)
.
session()
that lazily fetches catalogs rather than when instantiated.
as.vector
on a categorical-array or multiple-response variable now returns a data.frame
. While a matrix
is a more accurate representation of the data type, using data.frame
allows for more intuitive accessing of subvariables by $
, just as they are from the Crunch dataset.
joinDatasets
with its (new) default copy=TRUE
argument.
addSubvariable
to PATCH rather than unbind and rebind; also extend it to accept more than one (sub)variable to add to the array.
pattern
matching argument from makeArray
, makeMR
, deleteVariables
, and hideVariables
, deprecated since 1.9.6.
deleteSubvariable
to follow model of deleteVariable
, including requiring consent to delete.
options(crunch.namekey.array="name")
in your script or in your .Rprofile.
deleteSubvariable
now follows “crunch.namekey.array” and will take either subvariable names or aliases, depending on the value of the setting.
extendDataset
function, also aliased as merge
, to allow you to add columns from one dataset to another, joining on a key variable from each.
compareDatasets
now checks the subvariable matching across array variables in the datasets to identify additional conflicts.
notes
and notes<-
methods for datasets, variables, and variable catalogs to view and edit those new metadata fields.
name<-
on NULL
(i.e. when you reference a variable in a dataset using $
and the variable does not exist) returns a helpful message.
newDataset
when passing a data.frame
or similar that has spaces in the column names.
toVariable
as.character
if you have a factor and want it to be imported as type Text.
cleanseBatches
function to remove batch records from failed append attempts. Remove deprecated code around batch conflict reporting.
datasets
and projects
functions to get dataset and project catalogs. (datasets
previously existed only as a method for Project entities.)
project
argument to listDatasets
and add project
and refresh
to loadDatasets
to facilitate viewing and loading datasets that belong to projects.
compareDatasets
that shows how datasets will line up when appending. A summary
method on its return value prints a report that highlights areas of possible mismatch.
crtabs
NULL
assignment into Variable/DatasetGroups to remove elements
CrunchExpr
, Variable, and Dataset objects
DatetimeVariable
and a character vector, assumed to be ISO-8601 formatted.
permissions
method for Datasets to work directly with sharing privileges.
as.data.frame
/as.environment
for CrunchDataset
when a variable alias contained an apostrophe.
MemberCatalog
.
jsonlite
API in its v0.9.22
exportDataset
to download a CSV or SAV file of a dataset. write.csv
convenience method for CSV export.
icon
and icon<-
methods for Projects to read the project’s current icon URL and to set a new icon by supplying a local file name to upload.
is.archived
, is.draft
, and is.published
(the inverse of is.draft
). See ?publish
for more.
draft
argument to forkDataset
owner
and owner<-
for datasets to read and modify the owner
owners
and ownerNames
for DatasetCatalog
is.editor
and is.editor<-
for project MemberCatalog
me
function to get the user entity for yourself
pattern
argument for functions including makeArray
, makeMR
, deleteVariables
, and hideVariables
is being deprecated. The help pages for those functions advise you to grep for or otherwise identify your variables outside of these functions.
unshare
to revoke access of a user or a team to a dataset.
type<-
assignment is safe.
CrunchExprs) for greater reliability
session()
or returned from login()
, containing the various catalog resources (Datasets, etc.).
names<-
.
loadDataset
with a dataset catalog tuple, allowing some degree of tab completion by dataset name. (Example: cr <- login(...); ds <- loadDataset(cr$datasets$My_Dataset_Name)
)
testthat
.
useAlias
attribute of datasets and move it to a global option, “crunch.namekey.dataset”, defaulted to “alias”. Implement the same for array variables, “crunch.namekey.array”, and default to “name” for consistency with previous versions. This default will change in a future release.
as.vector
for CrunchExpr
to GET rather than POST.
forkDataset
to make a fork (copy) of a dataset; mergeFork
to merge changes from a fork back to its parent (or vice versa)
digest
package (httpcache depends on it instead).
combine
categories of categorical and categorical-array variables, and responses of multiple-response variables, into new derived variables
startDate
and endDate
attributes and setters for dataset entities (#10, #11)
CrunchFilter
)
ncol(ds)
by removing a server request
CrunchExpr
): prints an R formula-like expression
digest
package.
name(ds$var$subvar) <- value
share
addSubvariable
function to add to array and multiple response variables (#7)
dropRows
to permanently delete rows from a dataset.
catalogToDataFrame
function.
shojiURL
, batches
)
NULL
in cube dimension when referencing subvariable that does not exist (as when using alias instead of name) and return a useful message.
%in%
expression translation.
addVariables
function to add multiple variables to a dataset efficiently
CrunchExprs and filtered variables in table
crtabs
when requesting a crosstab of three or more dimensions.
VariableDefinition
(or VarDef
) function and class for creating variable definitions with more metadata (rather than assigning R vectors into a dataset and having to add metadata after).
copy
, makeArray
, and makeMR
, to return VariableDefinitions rather than creating the new variables themselves. Creation happens on assignment into the dataset.
NA
for categoricals) even if No Data doesn’t already exist
?startLog
and ?logMessage
.
copy
of a variable. See ?copyVariable
.
NULL
into a dataset when the referenced variable (alias) does not exist.
NA
assignment into variables.
/batches/
while waiting for an append to complete. Improves the performance of the append operation.
c
method for Categories, plus support for creating and adding new categories to variables. See ?Categories
and ?"c-categories"
as.vector
by specifying a “mode” of “id” or “numeric”, respectively. See ?"variable-to-R"
NA
into variables.
margin.table
on CrunchCube
objects.
with
statements. Use it to give consent()
to delete things.
<- NULL
into a dataset (like removing a column from a data.frame). Requires consent. Also create deleteVariable(s)
functions that also return the dataset object. Use either method to prevent your dataset from getting out of sync with the server when you delete variables.
deleteSubvariable(s)
.
crtabs
to allow you to crosstab array subvariables.
[
or subset
exclusion
filters on datasets to drop certain rows
(un)lock
datasets for editing when there are multiple editors
saveVersion
and restoreVersion
for dataset versioning
httr
1.0; remove dependency on RCurl
in favor of curl
appendDataset
.
duplicates
parameter, which is FALSE
by default, adding new Groups to an Order “moves” the variable references to the new Group, rather than creating copies. See the variable order vignette for more details.
share
function for sharing a dataset with other users.
New vignettes for deriving variables and analyzing datasets.
Update appending workflow to support new API.
Add query cache, on by default.
as.data.frame
now does not return an actual data.frame
unless given the argument force=TRUE
. Instead, it returns a CrunchDataFrame
, an environment containing unevaluated promises. This allows R functions, particularly those of the form function(formula, data)
to work with CrunchDatasets without copying the entire dataset from the server to local memory. Only the variables referenced in the formula fetch data when their promises are evaluated.
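The lazy-fetch behavior described above can be illustrated with base R promises. This is only a sketch of the general idea, not how CrunchDataFrame is implemented: fetch_column() is a hypothetical stand-in for a server request, and the data values are made up.

```r
# Log of which columns were actually "fetched".
fetch_log <- new.env()
fetch_log$fetched <- character(0)

# Hypothetical stand-in for a server request: records the column name,
# then returns the values.
fetch_column <- function(name, values) {
  fetch_log$fetched <- c(fetch_log$fetched, name)
  values
}

# An environment whose bindings are unevaluated promises.
lazy_data <- new.env()
delayedAssign("x", fetch_column("x", c(1, 2, 3, 4, 5)), assign.env = lazy_data)
delayedAssign("y", fetch_column("y", c(2.1, 3.9, 6.2, 7.8, 10.1)), assign.env = lazy_data)
delayedAssign("z", fetch_column("z", c(9, 8, 7, 6, 5)), assign.env = lazy_data)

# lm() accepts an environment as `data`; only the variables named in the
# formula are looked up, so only their promises are forced.
fit <- lm(y ~ x, data = lazy_data)

# Columns that were fetched; "z" is never touched.
fetch_log$fetched
```

Because the environment only holds promises, a column the formula never mentions costs nothing, which is the property that lets modeling functions run against a remote dataset without downloading all of it.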
Remove RJSONIO
dependency in favor of jsonlite
for toJSON
.
crunch
. Update all docs to reflect that. Make amendments to pass CRAN checks.
newDataset2
renamed to newDatasetByCSV
and made to be the default strategy in newDataset
. The old newDataset
has been moved to newDatasetByColumn
.
Support for NA
and NaN
in crtabs
response.