Site icon R-bloggers

Pass parameter to SQL Server DATEADD function using R

[This article was first published on Data by John, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I wanted to try and create some parameters in my flexdashboard script, so I could extract data from our warehouse over a flexible date range.

In SQL Server, I’d create a variable and reference it like this:

DECLARE @ndays int
SET @ndays = -7
SELECT cols, that, I , want
FROM [server].[schema].[table1] t1
LEFT JOIN [server].[schema].[table2] t2 ON t1.common_ID = t2.common_id
WHERE Certain_Date >= DATEADD(D, @ndays, GETDATE())
ORDER BY Certain_Date DESC

This is fine, but my SQL query was buried way down in my .Rmd file, and I didn’t want to be amending that query each time. Also, I wanted the parameter to have a positive value for use in various strings, and a negative value for use in the SQL Server DATEADD function.

A brief description of DATEADD

Disclaimer – this is my description, not an offical transcript from Microsoft or another global authority

DATEADD(interval, number, date)
Interval The time element that we want to add or subtract. It could be hours, minutes, seconds, days or more..I’ll leave it to you to look into.
Number – the number of intervals you want to add or subtract – this can be a positive or negative number. But it can’t be a string – which is important in this case.
Date – the actual date (or datetime) you want to perform the calculation on. Durations will be added or subtracted relative to this date.

In this case, I’m using GETDATE() to retrieve the current system date – as I am running this report on a rolling basis, so every day the date range changes.

In the query , the WHERE clause ensures that the dates I return are between ndays ago and today’s date

WHERE Certain_Date >= DATEADD(D, @ndays, GETDATE())

OK? Let’s get back to R

I need a parameter defined in R that will pass through to the SQL query.

My friend Chris Beeley tweeted about the {glue} package, which I hadn’t really used , because paste0() has always worked for me.

I had a look into it, and discovered sql_glue, and immediately started thinking about various ways this could transform my work.

So with a bit of googling, I tried to pass the parameter to the SQL query, but it failed, due to binding errors.

This is what I tried :

ndays <- 7 # for plot titles and other strings
ndays_sql <- ndays * -1 # for passing to DATEADD

con <- DBI::dbConnect(odbc::odbc(),
                      Driver = "SQL Server",
                      Server = "SERVER", 
                      Database = "TABLE",
                      Trusted_Connection = "True",
                      Port = 1433)
                      
                      
data_sql <- dbFetch(dbSendQuery(con,
"SELECT cols, that, I , want
FROM [server].[schema].[table1] t1
LEFT JOIN [server].[schema].[table2] t2 ON t1.common_ID = t2.common_id
WHERE Certain_Date >= DATEADD(D,?ndays_sql, GETDATE())
ORDER BY Certain_Date DESC"))

Now the binding error threw me off on the wrong tangent for a bit – researching that took me to the well known but still difficult problem of having to have your long text columns at the end of the query. I have found a couple of workarounds for this, but they weren’t helping.

Eventually I found myself on the RStudio guide to databases in R and I began to work through the options.

Parameterised queries and glue_sql didn’t help, because they return strings and I need to pass an integer to the DATEADD function.

How?..

Interpolation, baby

What does interpolation mean?

The first definition I saw read “ the insertion of something of a different nature into something else”

But being more sensible – “the addition of something different in the middle of a text, piece of music, etc. or …thing” does describe what we’re doing pretty well.

I need to add an integer into a function, using a method that works with strings.

First – use sqlInterpolate to add the parameter, which itself is aliased.
This has to be wrapped in a call to dbSendQuery, and this finally, has to be passed to dbFetch.

The reason I didn’t wrap the whole thing up in dbFetchwas due to further binding errors. This two step process works:

ndays <- 7 # for plot titles and other strings
ndays_sql <- ndays * -1 # for passing to DATEADD

con <- DBI::dbConnect(odbc::odbc(),
                      Driver = "SQL Server",
                      Server = "SERVER", 
                      Database = "TABLE",
                      Trusted_Connection = "True",
                      Port = 1433)
                                            
data_sql <- dbSendQuery(con,sqlInterpolate(con,"SELECT cols, that, I , want
FROM [server].[schema].[table1] t1
LEFT JOIN [server].[schema].[table2] t2 ON t1.common_ID = t2.common_id
WHERE Certain_Date >= DATEADD(D,?ndays2, GETDATE())
ORDER BY Certain_Date DESC",
ndays2 = ndays_sql))

data <- dbFetch(data_sql)  

I had to use ndays2 within the query, and refer that back to my ndays_sql variable outside of it. The results of this were saved to data_sql, and the actual data was then retrieved using dbFetch.

Job done, and on to some more parameterised SQL, which will be the subject of the next post..

In the meantime, this gist has the code you need without the explanation:

interpolation of integer in DATEADD function

To leave a comment for the author, please follow the link and comment on their blog: Data by John.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.