ODBC connect, get table in R

I've been fighting with this problem all day.

I want to access data on Hadoop (via Hive), and I have installed the odbc package.

I can establish a connection to the server:

con <- dbConnect(odbc::odbc(), "hadoop")

And I can see the table I want:

dbListTables(con, schema = "aacs")

Output:

   [1] "dev_1"                  "dev_2"     
   [3] "dev_3"                  "dev_4"

I want to load "dev_4" into my R environment as a data frame. I tried:

db_orders <- tbl(con, "dev_4")

But I get an error: the table or view was not found. The next line doesn't lead anywhere either:

db_orders <- tbl(con, "aacs.dev_4")

How can I load this table into my R environment?
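From what I have read, tbl() does not split a dotted "schema.table" string into schema and table; dbplyr provides in_schema() for exactly that. A sketch of that approach, assuming the dplyr and dbplyr packages are installed:

library(dplyr)
library(dbplyr)

# in_schema() keeps "aacs" and "dev_4" as separate identifiers, so the
# generated SQL refers to aacs.dev_4 instead of one quoted name.
db_orders <- tbl(con, in_schema("aacs", "dev_4"))

# collect() then pulls the remote table into a local data frame.
orders_df <- collect(db_orders)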

EDIT 1

I tried to run the following two queries:

result <- dbSendQuery(con, "SELECT * FROM aacs.dev_4")

This came back with an error: no space left on the device.

So I shortened the query with a LIMIT:

result <- dbSendQuery(con, "SELECT * FROM aacs.dev_4 LIMIT 100")

But I got the same error again:

Error: <SQL> 'SELECT * FROM aacs.dev_4 limit 100'
  nanodbc/nanodbc.cpp:1587: HY000: [Hortonworks][Hardy] (35) Error from server: error code: '2' error message: 'Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_15177720341_0081_2_08, diagnostics=[Task failed, taskId=task_15177723341_0081_2_08_000146, diagnostics=[TaskAttempt 0 failed, info=[Error: FS Error in Child JVM:org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:261)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
    at java.io.DataOutputStream.write(DataOutputStream.java:107)
    at org.apache.tez.runtime.library.common.sort.impl.IFileOutputStream.write(IFil

Does anyone know how to solve this? It is strange that it runs out of space ... because I have plenty of room (more than enough to store the data!).
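Judging by the stack trace, the failure happens inside a Tez task on the cluster, so the disk that fills up is presumably a scratch directory on the Hadoop worker nodes rather than my machine. To check that the connection itself is fine, a metadata-only statement should work, since it is answered from the metastore without launching a Tez job. A sketch (untested; assumes the ODBC driver passes DESCRIBE through as-is):

# DESCRIBE is served from the Hive metastore, so it should succeed even
# while Tez tasks fail for lack of scratch space on the worker nodes.
cols <- dbGetQuery(con, "DESCRIBE aacs.dev_4")
head(cols)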

EDIT 2

As @Florian suggested, I tried:

data <- dbReadTable(con, "aacs.dev_4") 

This fails with:

Error: <SQL> 'SELECT * FROM `aacs.dev_4`'
  nanodbc/nanodbc.cpp:1587: HY000: [Hortonworks][Hardy] (35) Error from server: error code: '2' error message: 'Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1517772023341_0082_1_08, diagnostics=[Task failed, taskId=task_1517772023341_0082_1_08_000236, diagnostics=[TaskAttempt 0 failed, info=[Error: exceptionThrown=org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in fetcher {Map_4} #10
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:360)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:337)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java
One answer suggested reading the table without the schema prefix:

x <- dbReadTable(con, "dev_4")

with a minimal reproducible example using an in-memory SQLite database:

library(DBI)
library(RSQLite)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

dbListTables(con)
dbWriteTable(con, "mtcars", mtcars)
x <- dbReadTable(con, "mtcars")

Hope this helps!
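A side note on EDIT 2: the generated SQL there was SELECT * FROM `aacs.dev_4`, i.e. the dotted string was quoted as a single table name. With DBI the schema can be passed separately via Id() (added in DBI 1.0, so this assumes a reasonably recent DBI):

library(DBI)

# Id() keeps schema and table as separate identifiers, so the generated
# SQL qualifies the name as aacs.dev_4 rather than quoting the dot.
data <- dbReadTable(con, Id(schema = "aacs", table = "dev_4"))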

It turned out the problem was with the table ("dev_4") itself. Reading a different table works:

test <- dbReadTable(con, "dev_3") 

It works. Strange... but problem solved!


I think you are looking for:

data <- dbGetQuery(con, "SELECT * FROM aacs.dev_4 LIMIT 100")
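If the full table is too large to pull in one go, the rows can also be streamed in chunks with the lower-level DBI calls. A sketch (the chunk size of 10000 is an arbitrary choice):

res <- dbSendQuery(con, "SELECT * FROM aacs.dev_4")
while (!dbHasCompleted(res)) {
  chunk <- dbFetch(res, n = 10000)  # fetch the next 10000 rows
  # ... process or accumulate `chunk` here ...
}
dbClearResult(res)                  # free the result set on the server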