Use dplyr with a database without creating an explicit DBI connection object

Most code examples showing how to use dplyr with a database include creating a database connection object:

    connStr <- "driver=driver;server=hostname;database=mydatabase;..."
    db <- DBI::dbConnect(odbc::odbc(), .connection_string = connStr)
    tbl <- tbl(db, "mytable")
    tbl %>% verb1 %>% verb2 %>% ...

However, suppose I omit the creation of the db object:

    tbl <- tbl(DBI::dbConnect(odbc::odbc(), .connection_string = connStr), "mytable")
    tbl %>% verb1 %>% verb2 %>% ...

Are there any implications to doing this? Will it hold database resources, leak memory, etc.?

The DBMS in question is SQL Server and the driver package is odbc, if that matters.

1 answer

The current DBI specification expects the caller to release every connection allocated with dbConnect() via a matching call to dbDisconnect(). If you don't, the connection is only closed when the connection object is garbage-collected (or at the end of the R session), which delays the release of resources and can even leak the connection.
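
For illustration, a minimal sketch of the explicit pattern the specification expects; the connection string and table name are the placeholders from the question, and filter(x > 0) is an assumed example verb:

    library(DBI)
    library(dplyr)

    # Open the connection, query the table, and always disconnect,
    # even if an error occurs partway through.
    query_mytable <- function(connStr) {
      con <- dbConnect(odbc::odbc(), .connection_string = connStr)
      on.exit(dbDisconnect(con), add = TRUE)  # released no matter how we exit
      tbl(con, "mytable") %>%
        filter(x > 0) %>%
        collect()   # materialise the result before the connection is closed
    }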

The exact behavior depends on the DBI backend used (in this case, the odbc package). According to Jim Hester, the odbc maintainer:

[odbc] automatically calls dbDisconnect() when the connection object is garbage collected, so this won't cause connections to leak. It is always best to be explicit if you are opening a large number of connections, but if you are just doing this interactively, relying on the garbage collector is probably fine in this case.
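
To illustrate what that means for the implicit pattern from the question (same placeholder connection string and table name, and an assumed example verb): the connection lives on inside the lazy tbl, and odbc's finalizer disconnects it only once nothing references it and the garbage collector runs.

    library(dplyr)

    # The connection object is referenced only by the lazy tbl.
    lazy_tbl <- tbl(DBI::dbConnect(odbc::odbc(), .connection_string = connStr),
                    "mytable")

    result <- lazy_tbl %>% filter(x > 0) %>% collect()

    # The connection stays open for as long as lazy_tbl exists. Once the
    # object is dropped, odbc's finalizer calls dbDisconnect() at the next
    # garbage collection (or at the end of the R session).
    rm(lazy_tbl)
    invisible(gc())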

