Azure
Azure.Rmd
Getting Started
Ensure you are connected to any required VPNs
Run
setupAzureCreds()
. You’ll be prompted to enter the server, database, username and password. They will be securely stored in your operating system’s credential manager.To update credentials use
updateAzureCreds()
. This will prompt you for the fields that change.
Getting results from Azure database.
The getAzureResults()
function is able to query the
database, so long as you are connected to the VPN.
By default, the query results are also stored as a .parquet file in the data directory.
In the example below, data/query_name.parquet already exists and so
no query will be executed. Instead, the .parquet file will be read in
using arrow::read_parquet()
query <- "queries/query_name.sql"
results <- getAzureResults(
SQLscript = query
)
results
#> student_keyno student_year_of_graduation
#> 1 4 2025
#> 2 204166 2025
#> 3 204167 2025
#> 4 1659705 2024
#> 5 3252277 2023
#> 6 3282158 2025
#> 7 4348489 2050
#> 8 5051723 2024
#> 9 5364090 2024
#> 10 5488836 2024
#> 11 5489693 2024
#> 12 5499102 2024
#> 13 5540574 2026
#> 14 5541548 2024
#> 15 5547144 2025
#> 16 5547149 2025
#> 17 5549276 2024
#> 18 5549277 2024
#> 19 5549282 2024
#> 20 5914913 2027
The default behaviour can be overridden if, for example, the data or
query has changed by using the overwrite = TRUE
argument.
results <- getAzureResults(
SQLscript = query,
output = "data/query_name.parquet",
overwrite = TRUE
)
results
Sometimes, you might want to change the query before passing to the
function. In this case the raw .sql
query must first be
formatted using helper functions from glue
and
readr
. In the example below we first read in the sql
script.
query_text <- readr::read_file("queries/query_name.sql")
query_text
#> [1] "SELECT TOP(20) student_keyno, student_year_of_graduation\r\nFROM student\r\n"
Next we modify the script.
modified <- query_text |>
stringr::str_replace("student_year_of_graduation", "student_school")
modified
#> [1] "SELECT TOP(20) student_keyno, student_school\r\nFROM student\r\n"
The script must be ‘glued’ before passing to the function.
query <- glue::glue_sql(modified)
query
#> <SQL> SELECT TOP(20) student_keyno, student_school
#> FROM student
results <- getAzureResults(
SQLscript = query,
output = "data/query_results2.parquet",
glued = FALSE
)
results
#> student_keyno student_school
#> 1 158341 248
#> 2 158342 543
#> 3 158343 543
#> 4 158344 581
#> 5 158345 248
#> 6 158346 531
#> 7 158347 498
#> 8 158348 498
#> 9 158349 498
#> 10 158350 498
#> 11 158351 498
#> 12 158352 498
#> 13 158353 498
#> 14 158354 498
#> 15 158355 498
#> 16 158356 498
#> 17 158357 498
#> 18 158358 116
#> 19 158359 116
#> 20 158360 269