Extract hyperlink from Excel file in R

Is it possible to extract hyperlinks from Excel files in R? I looked through XLConnect and xlsx, but the only thing I found was how to write hyperlinks, not how to read them.

Additional information: the Excel file contains hyperlinks in one of the columns, that is, text you can click on that takes you to a file or URL. When I open the file with XLConnect or xlsx, I only see the text (the link is gone), but I need the URL.

1 answer

I found a super convoluted way to extract hyperlinks:

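library(XML)

# path to the workbook (placeholder name; point this at your own file)
my.excel.file <- "my_workbook.xlsx"

# an .xlsx file is a zip archive, so copy it with a .zip extension
# (the pattern is anchored so only the file extension is replaced)
my.zip.file <- sub("\\.xlsx$", ".zip", my.excel.file)
file.copy(from = my.excel.file, to = my.zip.file)

# unzip the file into the current working directory
unzip(my.zip.file)

# unzipping produces a bunch of XML files which we can read using the XML package
# assume sheet1 has our data
xml <- xmlParse("xl/worksheets/sheet1.xml")

# finally grab the hyperlinks; "x" serves as the prefix for the sheet's default namespace
hyperlinks <- xpathApply(xml, "//x:hyperlink/@display", namespaces = "x")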

Adapted from this blog post.
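One caveat: the display attribute read above holds the text shown in the cell, and it is optional. For links that point outside the workbook, the actual target is normally stored in a separate relationships file (xl/worksheets/_rels/sheet1.xml.rels for sheet1), and each hyperlink element refers to its entry through an r:id attribute. Below is a minimal sketch of joining the two. So that it runs on its own, it parses two small inline XML strings that stand in for those files; the cell references, ids and URLs in them are made up for illustration, and with a real workbook you would call xmlParse() on the unzipped files instead.

```r
library(XML)

# Made-up stand-in for xl/worksheets/sheet1.xml (only the hyperlinks part)
sheet_xml <- '<worksheet
    xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
    xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <hyperlinks>
    <hyperlink ref="A2" r:id="rId1" display="Example"/>
    <hyperlink ref="A3" r:id="rId2"/>
  </hyperlinks>
</worksheet>'

# Made-up stand-in for xl/worksheets/_rels/sheet1.xml.rels
rels_xml <- '<Relationships
    xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1"
    Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
    Target="https://example.com/" TargetMode="External"/>
  <Relationship Id="rId2"
    Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
    Target="https://example.org/report.pdf" TargetMode="External"/>
</Relationships>'

sheet <- xmlParse(sheet_xml, asText = TRUE)
rels  <- xmlParse(rels_xml, asText = TRUE)

# Both documents use a default namespace, so bind explicit prefixes for XPath
ns_sheet <- c(x = "http://schemas.openxmlformats.org/spreadsheetml/2006/main")
ns_rels  <- c(rel = "http://schemas.openxmlformats.org/package/2006/relationships")

link_nodes <- getNodeSet(sheet, "//x:hyperlink", namespaces = ns_sheet)
rel_nodes  <- getNodeSet(rels, "//rel:Relationship", namespaces = ns_rels)

# Small helper: read one attribute from a node, NA if it is missing
get_attr <- function(node, name) xmlGetAttr(node, name, NA_character_)

# Lookup table from relationship id to target URL
targets <- setNames(
  vapply(rel_nodes, get_attr, character(1), name = "Target"),
  vapply(rel_nodes, get_attr, character(1), name = "Id")
)

# Hyperlinks without an r:id (links inside the workbook) get NA as URL
ids <- vapply(link_nodes, get_attr, character(1), name = "r:id")

links <- data.frame(
  cell    = vapply(link_nodes, get_attr, character(1), name = "ref"),
  display = vapply(link_nodes, get_attr, character(1), name = "display"),
  url     = unname(targets[ids]),
  stringsAsFactors = FALSE
)

print(links)
```

The ref column tells you which cell each link belongs to, so the result can be matched back to the data you read with XLConnect or xlsx.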


Source: https://habr.com/ru/post/1544048/

