Skip to contents

Getting Started

  1. Ensure you are connected to any required VPNs

  2. Run setupAzureCreds(). You’ll be prompted to enter the server, database, username and password. They will be securely stored in your operating system’s credential manager.

  3. To update credentials use updateAzureCreds(). This will prompt you for the fields that change.

Getting results from Azure database.

The getAzureResults() function is able to query the database, so long as you are connected to the VPN.

By default, the query results are also stored as a .parquet file in the data directory.

In the example below, data/query_name.parquet already exists and so no query will be executed. Instead, the .parquet file will be read in using arrow::read_parquet()

query <- "queries/query_name.sql"
results <- getAzureResults(
  SQLscript = query
)
results
#>    student_keyno student_year_of_graduation
#> 1              4                       2025
#> 2         204166                       2025
#> 3         204167                       2025
#> 4        1659705                       2024
#> 5        3252277                       2023
#> 6        3282158                       2025
#> 7        4348489                       2050
#> 8        5051723                       2024
#> 9        5364090                       2024
#> 10       5488836                       2024
#> 11       5489693                       2024
#> 12       5499102                       2024
#> 13       5540574                       2026
#> 14       5541548                       2024
#> 15       5547144                       2025
#> 16       5547149                       2025
#> 17       5549276                       2024
#> 18       5549277                       2024
#> 19       5549282                       2024
#> 20       5914913                       2027

The default behaviour can be overridden if, for example, the data or query has changed by using the overwrite = TRUE argument.

results <- getAzureResults(
  SQLscript = query,
  output = "data/query_name.parquet",
  overwrite = TRUE
)
results

Sometimes, you might want to change the query before passing to the function. In this case the raw .sqlquery must first be formatted using helper functions from glue and readr. In the example below we first read in the sql script.

query_text <- readr::read_file("queries/query_name.sql")
query_text
#> [1] "SELECT TOP(20) student_keyno, student_year_of_graduation\r\nFROM student\r\n"

Next we modify the script.

modified <- query_text |>
  stringr::str_replace("student_year_of_graduation", "student_school")
modified
#> [1] "SELECT TOP(20) student_keyno, student_school\r\nFROM student\r\n"

The script must be ‘glued’ before passing to the function.

query <- glue::glue_sql(modified)
query
#> <SQL> SELECT TOP(20) student_keyno, student_school
#> FROM student
results <- getAzureResults(
  SQLscript = query,
  output = "data/query_results2.parquet",
  glued = FALSE
)
results
#>    student_keyno student_school
#> 1         158341            248
#> 2         158342            543
#> 3         158343            543
#> 4         158344            581
#> 5         158345            248
#> 6         158346            531
#> 7         158347            498
#> 8         158348            498
#> 9         158349            498
#> 10        158350            498
#> 11        158351            498
#> 12        158352            498
#> 13        158353            498
#> 14        158354            498
#> 15        158355            498
#> 16        158356            498
#> 17        158357            498
#> 18        158358            116
#> 19        158359            116
#> 20        158360            269