R RODBC places a list of numbers in an IN () statement

I looked at the "Transfer R variable" in RODBC sqlQuery with multiple records already, but it can't seem to get it working. I'm trying to make sqlQuery() from R on SQL Server 2008 R2 db. I'm trying to get a sample from a big db on based on line numbers. First I created a list of random numbers:

 sampRowNum <- sample(seq(1,100000,1), 5000) 

Then I try to use these numbers in the query using:

 query1 <- sqlQuery(channel, paste("select * FROM db where row_id in (", sampRowNum,")", sep="")) 

I only get results from db, where row_id is equal to the first number in sampRowNum . Any suggestions?

+4
source share
1 answer

You do not paste correctly with your request.

If you execute the paste statement in isolation, you will see that you get a vector of length 5000, so sqlQuery only executes the first one, corresponding to the first element in samRowNum .

What you want to do is something more:

 paste("select * FROM db where row_id in (", paste(sampRowNum,collapse = ","),")", sep="") 

Just like the note added (and since I had to do a lot of things ...), building sql queries with an IN clause with strings is a bit more unpleasant, since you need to stick with all single quotes:

 vec <- letters[1:5] paste("SELECT * FROM db WHERE col IN ('", paste(vec,collapse = "','"),"')",sep = "") [1] "SELECT * FROM db WHERE col IN ('a','b','c','d','e')" 

If you do this a lot, you will end up writing a small function that performs symbol vector embedding for you.

As always, this type of SQL string manipulation is not very good if you are dealing with user inputs (for example, in a web application) due to SQL injection attacks. In my particular situation, this is not a big concern, but in general, people prefer parameterized queries if you don't have much control over the input values.

+6
source

Source: https://habr.com/ru/post/1388062/


All Articles