Run R script from SSIS package

I want to execute R code from an SSIS package. How can I add a data control that executes R code? As far as I know, the SSIS Script Task only supports VB.NET and C#.

SSIS has many data transformations, but R is very friendly when it comes to data manipulation.

I want to run R code from SSIS scripts or some other way. Basically, I'm trying to integrate R into the ETL process.

I want to extract (E) data from a CSV file, transform (T) it in R, and load (L) it into a Microsoft database. Can this workflow be run inside an SSIS package by executing an R script from SSIS data controls? Thanks!

2 answers

Here are a few ways you could integrate R into your ETL process.

  • Quick and dirty. Use an Execute Process Task in the control flow. This is like calling Rscript from the command line. You would probably do your transformation, save the result to a file on disk, and pass that file name out of the Execute Process Task so that you can feed it to the Data Flow Task. The upside is that you keep your R code clean and separate from your C#/VB.

  • Integrated via R.NET. You can use the R.NET (RDotNet) library (I believe; I have not tried integrating it myself). You will need to register the DLL in the GAC, and then you can either work with .NET objects in SSIS scripts or call R scripts directly.

  • Integrated in SQL Server 2016. Microsoft has added R support that is exposed through a stored procedure (sp_execute_external_script). You call the R script through that stored procedure, use a SQL query to supply the input data, and can store the output. See the SQL Server documentation for details. This would mean using an Execute SQL Task in SSIS.
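
For the SQL Server 2016 option, the R code runs as the script body of sp_execute_external_script. By default the procedure hands the input query result to the script as a data frame named InputDataSet and returns whatever is assigned to OutputDataSet. Below is a minimal sketch of such a script body. The input is simulated so the snippet can be tried outside SQL Server, and the column names `id` and `amount` as well as the transformation itself are hypothetical.

```r
# Inside SQL Server, sp_execute_external_script supplies InputDataSet from
# the @input_data_1 query. It is simulated here so the sketch runs on its own.
# The columns `id` and `amount` are hypothetical.
if (!exists("InputDataSet")) {
  InputDataSet <- data.frame(id = 1:3, amount = c(10, 20, 30))
}

# Hypothetical transformation: add each row's share of the total amount.
OutputDataSet <- InputDataSet
OutputDataSet$share <- OutputDataSet$amount / sum(OutputDataSet$amount)
```

In SSIS, an Execute SQL Task would then call the stored procedure with @language = N'R' and this text as the @script argument.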


I hope this helps you or someone else. Since you want to process the data, you can export your data set to a CSV file (via a Data Flow Task) and then run an R file with "Rscript" (it can be executed as a command with an Execute Process Task). Inside the R file you load the data set into a data frame (for example with read.csv(), or readLines() for raw text), do all the calculations you need, and write the data or the results of the calculations to a CSV file. You then read that file back in from SSIS.

This is not an elegant solution, but it works :). At least until Microsoft integrates R into the control flow / data flow.
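
The file-based round trip described above could be sketched as the following R script. The script name `transform.R`, the function names, and the transformation itself (drop incomplete rows, add a row total) are hypothetical placeholders for whatever calculations are actually needed.

```r
# Hypothetical transformation step: keep complete rows and add a derived
# column that sums the numeric columns. Replace with the real calculations.
transform_data <- function(df) {
  df <- df[complete.cases(df), , drop = FALSE]
  numeric_cols <- vapply(df, is.numeric, logical(1))
  df$row_total <- rowSums(df[numeric_cols])
  df
}

# Read the CSV written by the SSIS data flow, transform it, and write the
# result to a second CSV that SSIS reads back in.
run_etl <- function(input_path, output_path) {
  df <- read.csv(input_path, stringsAsFactors = FALSE)
  result <- transform_data(df)
  write.csv(result, output_path, row.names = FALSE)
  invisible(result)
}

# Entry point when called as: Rscript transform.R input.csv output.csv
args <- commandArgs(trailingOnly = TRUE)
if (length(args) == 2) {
  run_etl(args[1], args[2])
}
```

In the Execute Process Task, the Executable property would point at Rscript and the Arguments property would carry the script path followed by the input and output file paths.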

KDM

PS. For how to run R files from the command line, see: Run R script from the command line


Source: https://habr.com/ru/post/1234549/

