sqldf package in R: querying a data frame

I am trying to rewrite some code using the sqldf library in R, which should let me run SQL queries on data frames. The problem is that whenever I run a query, R tries to query the actual MySQL database connection that I use, and looks there for a table with the name of the data frame I'm trying to query.

When I run this:

sqldf("SELECT COUNT(*) from work.class_scores") 

I get:

Error in mysqlNewConnection(drv, ...) : RS-DBI driver: (Failed to connect to database: Error: Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2))

When I try to specify the driver using two different methods (the first is from the Google Code page, the second should be correct according to the documentation), I get:

 > sqldf("SELECT COUNT(*) from work.class_scores", sqldf.driver = "SQLite")
 Error in sqldf("SELECT COUNT(*) from work.class_scores", sqldf.driver = "SQLite") :
   unused argument(s) (sqldf.driver = "SQLite")
 > sqldf("SELECT COUNT(*) from work.class_scores", drv = "SQLite")
 Loading required package: tcltk
 Loading Tcl/Tk interface ...
 Error : .onLoad failed in loadNamespace() for 'tcltk', details:
   call: dyn.load(file, DLLpath = DLLpath, ...)
   error: unable to load shared library '/Library/Frameworks/R.framework/Resources/library/tcltk/libs/x86_64/tcltk.so':
   dlopen(/Library/Frameworks/R.framework/Resources/library/tcltk/libs/x86_64/tcltk.so, 10): Library not loaded: /usr/local/lib/libtcl8.5.dylib
   Referenced from: /Library/Frameworks/R.framework/Resources/library/tcltk/libs/x86_64/tcltk.so
   Reason: image not found
 Error: require(tcltk) is not TRUE

So I thought this might be a problem with the tcltk package, which I had never heard of. I tried to sort that out and ran into more problems:

 > install.packages("tcltk")
 Warning in install.packages :
   argument 'lib' is missing: using '/Users/michaeldiscenza/Library/R/2.11/library'
 Warning in install.packages :
   package 'tcltk' is not available
 > install.packages("tcltk2", lib="/Applications/RStudio.app/Contents/Resources/R/library")
 trying URL 'http://lib.stat.cmu.edu/R/CRAN/bin/macosx/leopard/contrib/2.11/tcltk2_1.1-5.tgz'
 Content type 'application/x-gzip' length 940835 bytes (918 Kb)
 opened URL
 ==================================================
 downloaded 918 Kb

 The downloaded packages are in
   /var/folders/Y1/Y1gdz9tKFiSnWsGP9+BDcU+++TI/-Tmp-//RtmpL07KTL/downloaded_packages
 > library("tcltk")
 Loading Tcl/Tk interface ...
 Error : .onLoad failed in loadNamespace() for 'tcltk', details:
   call: dyn.load(file, DLLpath = DLLpath, ...)
   error: unable to load shared library '/Library/Frameworks/R.framework/Resources/library/tcltk/libs/x86_64/tcltk.so':
   dlopen(/Library/Frameworks/R.framework/Resources/library/tcltk/libs/x86_64/tcltk.so, 10): Library not loaded: /usr/local/lib/libtcl8.5.dylib
   Referenced from: /Library/Frameworks/R.framework/Resources/library/tcltk/libs/x86_64/tcltk.so
   Reason: image not found
 Error: package/namespace load failed for 'tcltk'

 Error in !dbPreExists : invalid argument type

At this point I just don't know what the problem is. Do I need to move something?

Another approach I tried was to establish the sqldf connection before running the query on the data frame, so that R would look there rather than trying to connect to the actual local MySQL database. That did not work either: it led straight back to the socket problem (even though I can query the local database itself without any problems).

 > con <- sqldf()
 Error in mysqlNewConnection(drv, ...) :
   RS-DBI driver: (Failed to connect to database: Error: Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2))

In the end, I want to run a query that counts the entries where, for example, the value of C is greater than 2, and I'm comfortable writing that SQL. The only problem is that I don't know whether there is another way to indicate that what I'm querying is a data frame, not the actual database. Did I miss something really stupid and easy here?

Thanks!

+4
2 answers

This answer has been carried over from my previous comments.

The post and comments indicate that:

  1. the goal is to use SQLite with sqldf even though RMySQL is loaded

  2. there was a message that tcltk is missing

  3. there was a problem with sqldf("select count(*) from work.class_scores"), where work.class_scores is a data frame

On the sqldf home page, FAQ #7 addresses (1) above and FAQ #5 addresses (2). Problem (3) arises because the dot is an SQL operator, so data frame names that contain a dot must either be quoted or be changed to remove the dot.

Below we provide a reproducible example that applies all three fixes.

The sqldf.driver option is used to force the use of SQLite even though RMySQL is loaded.

There are three approaches to the tcltk problem: (i) The gsubfn.engine option lets you use R code instead of tcltk, so that you do not need the tcltk package at all. See the options(gsubfn.engine = "R") line in the sample code below. (ii) Install tcltk itself. (iii) This question was asked when sqldf 0.4-4 was the current version. Now that sqldf 0.4-5 is out, additional detection of the tcltk package has been added, making it more likely that all of this is handled automatically, without the user having to set any options or install tcltk. Thus the simplest solution would be to upgrade to sqldf 0.4-5 or later.

We either quote the name of the data frame that has a dot in it, or rename the data frame to a name that does not contain a dot:
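To decide which of these three approaches applies, it can help to look at the local setup first. The following is only a rough diagnostic sketch using base R functions. Note that capabilities("tcltk") reports whether this R build includes Tcl/Tk support; it may still report TRUE when the underlying Tcl library cannot be loaded, as in the error shown in the question. The variable names are illustrative.

```r
# Does this R build report Tcl/Tk support?
# (TRUE does not guarantee that library(tcltk) will actually load.)
has_tcltk <- capabilities("tcltk")
print(has_tcltk)

# Check the installed sqldf version without loading the package.
sqldf_installed <- "sqldf" %in% rownames(installed.packages())
if (sqldf_installed) {
  # TRUE means the improved tcltk detection from 0.4-5 is present.
  print(packageVersion("sqldf") >= "0.4.5")
}
```

If tcltk is not usable and upgrading sqldf is not an option, setting options(gsubfn.engine = "R") before loading sqldf remains the fallback.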

 options(sqldf.driver = "SQLite") # as per FAQ #7 force SQLite
 options(gsubfn.engine = "R")     # as per FAQ #5 use R code rather than tcltk
 library(RMySQL)
 library(sqldf)

 work.class_scores <- BOD # BOD is built in
 sqldf("select count(*) from 'work.class_scores'")

 # or
 work_class_scores <- work.class_scores
 sqldf("select count(*) from work_class_scores")
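To connect this back to the original goal of counting rows where C is greater than 2, here is a minimal sketch. It assumes a small made-up data frame with a numeric column C (the values and the student column are purely illustrative, not the asker's data) and passes drv = "SQLite" for this one call instead of setting the global option.

```r
options(gsubfn.engine = "R")  # avoid needing tcltk, as in the example above
library(sqldf)

# Hypothetical stand-in for the asker's data; the values are made up.
work_class_scores <- data.frame(
  student = c("a", "b", "c", "d"),
  C = c(1, 3, 2, 5)
)

# drv = "SQLite" selects the SQLite backend for this call only,
# so no MySQL connection is attempted.
res <- sqldf("select count(*) as n from work_class_scores where C > 2",
             drv = "SQLite")
print(res)
```

The result comes back as a one-row data frame, so the count itself is available as res$n.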

EDIT:

Added information about sqldf 0.4-5.

+11

Can you try installing the tcl package from here? (Assuming you are on a Mac.)

+3

Source: https://habr.com/ru/post/1382397/

