| Type: | Package |
| Title: | Convenient Access to NYS Open Data API Endpoints |
| Version: | 0.1.1 |
| Description: | Provides helper functions for accessing datasets on the NYS Open Data platform (https://data.ny.gov/). Functions return results as tidy tibbles and support optional filtering, sorting, and row limits via the Socrata API. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.3 |
| Imports: | dplyr, tibble, stringr, jsonlite, httr, janitor, rlang |
| Suggests: | curl, covr, knitr, testthat (≥ 3.0.0), vcr, withr, webmockr, ggplot2 |
| URL: | https://martinezc1.github.io/nysOpenData/, https://github.com/martinezc1/nysOpenData |
| BugReports: | https://github.com/martinezc1/nysOpenData/issues |
| VignetteBuilder: | knitr |
| Config/testthat/edition: | 3 |
| Depends: | R (≥ 4.1.0) |
| NeedsCompilation: | no |
| Packaged: | 2026-03-27 19:19:58 UTC; christianmartinez |
| Author: | Christian Martinez |
| Maintainer: | Christian Martinez <c.martinez0@outlook.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-04-01 08:00:14 UTC |
Load Any NYS Open Data Dataset
Description
Downloads any NYS Open Data dataset given its Socrata JSON endpoint.
Usage
nys_any_dataset(
  json_link,
  limit = 10000,
  timeout_sec = 30,
  clean_names = TRUE,
  coerce_types = TRUE
)
Arguments
| json_link | A Socrata dataset JSON endpoint URL (e.g., "https://data.ny.gov/resource/28gk-bu58.json"). |
| limit | Number of rows to retrieve (default = 10,000). |
| timeout_sec | Request timeout in seconds (default = 30). |
| clean_names | Logical; if TRUE, convert column names to snake_case (default = TRUE). |
| coerce_types | Logical; if TRUE, attempt light type coercion (default = TRUE). |
Value
A tibble containing the requested dataset.
Examples
# Examples that hit the live NYS Open Data API are guarded so CRAN checks
# do not fail when the network is unavailable or slow.
if (interactive() && curl::has_internet()) {
  endpoint <- "https://data.ny.gov/resource/28gk-bu58.json"
  out <- try(nys_any_dataset(endpoint, limit = 3), silent = TRUE)
  if (!inherits(out, "try-error")) {
    head(out)
  }
}
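The 'limit' argument corresponds to row limiting on the Socrata side. As a rough sketch of what such a request could look like (not the package's actual implementation), the snippet below appends the SoQL '$limit' parameter to a JSON endpoint. The helper name 'build_limit_url()' is hypothetical and exists only for illustration.

```r
# Hypothetical helper, NOT part of nysOpenData: build a Socrata request URL
# that asks for at most `limit` rows via the SoQL `$limit` parameter.
build_limit_url <- function(json_link, limit = 10000) {
  # format(..., scientific = FALSE) keeps large limits from rendering as 1e+05
  paste0(json_link, "?$limit=", format(limit, scientific = FALSE, trim = TRUE))
}

url <- build_limit_url("https://data.ny.gov/resource/28gk-bu58.json", limit = 3)
print(url)

# Fetching is guarded so the sketch stays runnable offline.
if (interactive() && requireNamespace("jsonlite", quietly = TRUE)) {
  rows <- try(jsonlite::fromJSON(url), silent = TRUE)
  if (!inherits(rows, "try-error")) str(rows)
}
```

Building the URL separately from fetching makes it easy to inspect the exact request before sending it.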
List datasets available in nysOpenData
Description
Retrieves the current Open NY catalog and returns datasets available for use with 'nys_pull_dataset()'.
Usage
nys_list_datasets()
Details
Keys are generated from dataset titles using 'janitor::make_clean_names()'.
Value
A tibble of available datasets, including generated 'key', dataset 'uid', and dataset 'title'.
Examples
if (interactive() && curl::has_internet()) {
  nys_list_datasets()
}
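To get a feel for how keys relate to titles, the sketch below runs 'janitor::make_clean_names()' on two made-up titles. The titles are hypothetical and not taken from the live catalog, and the package may apply additional steps when it builds its keys.

```r
# Hypothetical dataset titles, used only to illustrate key generation.
titles <- c(
  "Example Dataset: Beginning 2002",
  "Another Example (Annual Totals)"
)

# Guarded so the sketch does not fail when janitor is not installed.
if (requireNamespace("janitor", quietly = TRUE)) {
  # make_clean_names() returns one cleaned, unique name per input string
  keys <- janitor::make_clean_names(titles)
  print(data.frame(title = titles, key = keys))
}
```

Because a key changes whenever the underlying title changes, recording the 'uid' alongside the key is a reasonable precaution.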
Pull a NYS Open Data dataset from the NYS Open Data catalog
Description
Uses a dataset 'key' or 'uid' from 'nys_list_datasets()' to pull data from NYS Open Data.
Usage
nys_pull_dataset(
  dataset,
  limit = 10000,
  filters = list(),
  date = NULL,
  from = NULL,
  to = NULL,
  date_field = NULL,
  where = NULL,
  order = NULL,
  timeout_sec = 30,
  clean_names = TRUE,
  coerce_types = TRUE
)
Arguments
| dataset | A dataset key or UID from 'nys_list_datasets()'. |
| limit | Number of rows to retrieve (default = 10,000). |
| filters | Optional named list of filters. Vector values are translated to IN(). |
| date | Optional single date (matches all times on that day) applied to 'date_field'. |
| from | Optional start date (inclusive) applied to 'date_field'. |
| to | Optional end date (exclusive) applied to 'date_field'. |
| date_field | Optional date/datetime column to use with 'date', 'from', or 'to'. Must be supplied whenever any of 'date', 'from', or 'to' is used. |
| where | Optional raw SoQL WHERE clause. If 'date', 'from', or 'to' is provided, the resulting conditions are AND-ed with this clause. |
| order | Optional SoQL ORDER BY clause. |
| timeout_sec | Request timeout in seconds (default = 30). |
| clean_names | Logical; if TRUE, convert column names to snake_case (default = TRUE). |
| coerce_types | Logical; if TRUE, attempt light type coercion (default = TRUE). |
Details
Dataset keys are generated from dataset titles using 'janitor::make_clean_names()'. Because keys are derived from live catalog metadata, dataset UIDs are the more stable option.
Value
A tibble containing the rows returned for the requested dataset.
Examples
if (interactive() && curl::has_internet()) {
  # Pull by key (keys come from nys_list_datasets())
  nys_pull_dataset("311_service_requests", limit = 3)
  # Pull by UID
  nys_pull_dataset("28gk-bu58", limit = 3)
  # Filter rows on an exact column value
  nys_pull_dataset("28gk-bu58", limit = 3, filters = list(award_name = "MBA"))
}
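The argument descriptions above imply a particular way of combining conditions: vector filters become IN(), 'from' is inclusive, 'to' is exclusive, a single 'date' covers the whole day, and everything is AND-ed with a raw 'where' clause. The sketch below shows one way such a SoQL WHERE clause could be assembled. It is an assumption-laden illustration, not the package's internal code: 'soql_quote()' and 'build_where()' are hypothetical helpers, and the 'award_date' column and the "MS" value are invented for the example.

```r
# Hypothetical helpers, NOT the internals of nysOpenData.

# Quote a value as a SoQL string literal (embedded single quotes are doubled).
soql_quote <- function(x) {
  paste0("'", gsub("'", "''", as.character(x), fixed = TRUE), "'")
}

build_where <- function(filters = list(), date = NULL, from = NULL, to = NULL,
                        date_field = NULL, where = NULL) {
  # A single date is treated as the half-open range [date, date + 1 day).
  if (!is.null(date)) {
    from <- as.Date(date)
    to <- as.Date(date) + 1
  }
  if ((!is.null(from) || !is.null(to)) && is.null(date_field)) {
    stop("`date_field` must be supplied with `date`, `from`, or `to`")
  }

  clauses <- character(0)
  for (name in names(filters)) {
    values <- soql_quote(filters[[name]])
    clause <- if (length(values) > 1) {
      # Vector filter: match any of the listed values
      paste0(name, " IN(", paste(values, collapse = ", "), ")")
    } else {
      paste0(name, " = ", values)
    }
    clauses <- c(clauses, clause)
  }
  if (!is.null(from)) {  # inclusive lower bound
    clauses <- c(clauses, paste0(date_field, " >= ", soql_quote(format(as.Date(from)))))
  }
  if (!is.null(to)) {    # exclusive upper bound
    clauses <- c(clauses, paste0(date_field, " < ", soql_quote(format(as.Date(to)))))
  }
  if (!is.null(where)) { # raw clause is parenthesized, then AND-ed
    clauses <- c(clauses, paste0("(", where, ")"))
  }
  paste(clauses, collapse = " AND ")
}

clause <- build_where(
  filters = list(award_name = c("MBA", "MS")),  # "MS" is a made-up value
  from = "2024-01-01",
  to = "2024-02-01",
  date_field = "award_date"                     # hypothetical column name
)
print(clause)
```

Using a half-open range for dates means consecutive windows (for example, one call per month) never count the same row twice.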