How to convert a sas7bdat file to CSV?

I want to convert a .sas7bdat file to .csv or .txt format so that I can load it into a Hive table. I get the .sas7bdat file from an external server and do not have SAS on my machine.

Thanks in advance.

+6
5 answers

Use one of the R packages to read the file, then write it out as CSV from R.

http://cran.r-project.org/doc/manuals/R-data.pdf (page 12)

Alternatively, use the sas7bdat package. It appears to ignore user-defined formats and read the underlying data values.

In SAS:

 proc format;
   value agegrp
     low - 12  = 'Pre Teen'
     13 - 15   = 'Teen'
     16 - high = 'Driver';
 run;
 libname test 'Z:\Consulting\SAS Programs';
 data test.class;
   set sashelp.class;
   age2 = age;
   format age2 agegrp.;
 run;

In R:

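  install.packages("sas7bdat")  # the package name must be quoted
  library(sas7bdat)
  x <- read.sas7bdat("class.sas7bdat", debug=TRUE)
  x
  write.csv(x, "class.csv", row.names=FALSE)  # save the data frame as CSV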
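Because the reader returns the underlying values rather than the formatted labels, a column such as age2 arrives as plain numbers. If the labels are needed in the CSV, they have to be reapplied after reading. The following is a minimal Python sketch of that step, assuming the agegrp ranges from the SAS example above; the function name and the sample values are illustrative only.

```python
def agegrp(age):
    """Mimic the agegrp user format from the SAS example (illustrative only)."""
    if age <= 12:
        return "Pre Teen"
    if 13 <= age <= 15:
        return "Teen"
    if age >= 16:
        return "Driver"
    # Values between the declared ranges (e.g. 12.5) are left unlabeled.
    return str(age)


# Hypothetical raw values, as a reader that ignores formats would return them.
raw_ages = [11, 13, 16]
labels = [agegrp(a) for a in raw_ages]
print(labels)
```

Whether to store the raw value, the label, or both in the CSV depends on how the target table will be queried.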
+7

If this is a one-off task, you can download the SAS viewer for free (after registering for an account, which is also free):

http://support.sas.com/downloads/package.htm?pid=176

Then you can open the SAS dataset in the viewer and save it as a CSV file. As far as I can tell, there is no CLI, but if you really wanted to, you could write an AutoHotkey script or similar to automate converting SAS datasets to CSV.

It is also possible to use the SAS Provider for OLE DB to read SAS datasets without having SAS installed. It is available here:

http://support.sas.com/downloads/browse.htm?fil=0&cat=64

However, this is rather complicated. Some documentation is available here if you want to get an idea:

http://support.sas.com/documentation/cdl/en/oledbpr/59558/PDF/default/oledbpr.pdf

+2

I recently wrote this package, which lets you convert sas7bdat to CSV using Hadoop / Spark. It can split a very large sas7bdat file and so achieve a high level of parallelism. It also uses parso for parsing, as suggested by @Ashpreet.

https://github.com/saurfang/spark-sas7bdat

+2

Thank you for your help. I ended up using the parso utility in Java and it worked like a charm. The utility returns rows as arrays of objects, which I wrote to a text file.

The utility I am referring to: http://lifescience.opensource.epam.com/parso.html
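The step of writing rows of objects to a text file can be sketched as follows. This is a Python stand-in rather than the Java code used with parso; the function name, the sample rows, and the choice of `\N` for missing values (the marker Hive's default text format reads as NULL) are assumptions for illustration.

```python
import csv

# Assumption: the target Hive table uses the default NULL marker for text files.
HIVE_NULL = r"\N"


def write_rows(rows, path, delimiter=","):
    """Write an iterable of rows (sequences of objects) as delimited text.

    Returns the number of rows written.
    """
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
        for row in rows:
            # Replace missing values so they are not written as empty strings.
            writer.writerow([HIVE_NULL if v is None else v for v in row])
            count += 1
    return count
```

For example, `write_rows(rows, "class.csv")` writes one line per row. Note that `csv.writer` quotes fields containing the delimiter, so the table's row format has to be chosen to match.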

thanks

+1

The Python sas7bdat package includes a library for reading sas7bdat files:

 from sas7bdat import SAS7BDAT
 with SAS7BDAT('foo.sas7bdat') as f:
     for row in f:
         print(row)

and a command-line program that requires no programming:

 $ sas7bdat_to_csv in.sas7bdat out.csv 
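If files arrive from the server regularly, the command above can be run over a whole directory. The following is a minimal sketch, assuming `sas7bdat_to_csv` is on the PATH and takes an input and an output path as shown; the function names and the directory layout are hypothetical.

```python
import subprocess
from pathlib import Path


def build_commands(src_dir, dst_dir):
    """Return one sas7bdat_to_csv command (as an argv list) per .sas7bdat file."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    commands = []
    for sas_file in sorted(src_dir.glob("*.sas7bdat")):
        csv_file = dst_dir / (sas_file.stem + ".csv")
        commands.append(["sas7bdat_to_csv", str(sas_file), str(csv_file)])
    return commands


def convert_all(src_dir, dst_dir):
    """Run the converter once per input file, stopping at the first failure."""
    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    for argv in build_commands(src_dir, dst_dir):
        subprocess.run(argv, check=True)
```

Calling `convert_all("incoming", "csv_out")` would then produce one CSV per dataset. Keeping command construction separate from execution makes it easy to inspect what will be run first.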
0

Source: https://habr.com/ru/post/977140/

