One option is to target the problem columns individually and replace them.
The margin columns can be targeted with XPath:
    library(rvest)

    # get the html (URL is the page address from the question)
    html <- URL %>% read_html()

    # Example using the first margin column (column 6)
    html %>%
      html_nodes(xpath = '//table[2]') %>%       # get table 2
      html_nodes(xpath = '//td[6]/text()') %>%   # get column 6 using text()
      iconv("UTF-8", "UTF-8")                    # to convert "âˆ’" to "-"
    #  [1] "−10.44%" "−3.00%" "−0.83%" "−0.51%" "0.09%"  "0.17%"  "0.57%"
    #  [8] "0.70%"   "1.45%"  "2.06%"  "2.46%"  "3.01%"  "3.12%"  "3.86%"
    # [15] "4.31%"   "4.48%"  "4.79%"  "5.32%"  "5.56%"  "6.05%"  "6.12%"
    # [22] "6.95%"   "7.27%"  "7.50%"  "7.72%"  "8.51%"  "8.53%"  "9.74%"
    # [29] "9.96%"   "10.08%" "10.13%" "10.85%" "11.80%" "12.20%" "12.25%"
    # [36] "14.20%"  "14.44%" "15.40%" "17.41%" "17.76%" "17.81%" "18.21%"
    # [43] "18.83%"  "22.58%" "23.15%" "24.26%" "25.22%" "26.17%"
Do the same for the other margin column. I used iconv to convert "âˆ’" to "-", since this is an encoding problem, but you could use a replacement-based solution instead (e.g. with sub).
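As a rough illustration of the replacement-based alternative, here is a minimal sketch. The sample vector and the variable names (margin_raw, margin_clean, margin_num) are made up for the example, and the pattern assumes the broken minus sign shows up either as the three-character sequence "âˆ’" or as the Unicode minus sign (U+2212); check what your own output contains before relying on it.

```r
# Hypothetical sample values standing in for the scraped margin column.
# \u00e2\u02c6\u2019 is the mis-decoded minus sign, \u2212 is the real Unicode minus.
margin_raw <- c("\u00e2\u02c6\u201910.44%", "\u22120.83%", "0.09%", "26.17%")

# Replace either form of the minus sign with a plain ASCII hyphen
margin_clean <- sub("\u00e2\u02c6\u2019|\u2212", "-", margin_raw)

# Drop the percent sign and convert to numeric
margin_num <- as.numeric(sub("%", "", margin_clean, fixed = TRUE))

print(margin_clean)
print(margin_num)
```

Since sub only replaces the first match in each string, this is enough for a leading sign; gsub would be the choice if the character could appear more than once per value.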
To target the column with the presidents' names, you can use XPath again:
    html %>%
      html_nodes(xpath = '//table[2]') %>%
      html_nodes(xpath = '//td[3]/a/text()') %>%
      html_text()
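Once the individual columns are extracted, they can be put back together. The following is only a sketch on a made-up two-row table (the HTML, the names "President A" and "President B", and the variable names are all placeholders, not the real page), assuming the name sits in an a element in column 3 and the margin in column 4:

```r
library(rvest)

# Hypothetical miniature table standing in for the real page
page <- read_html(paste0(
  "<table>",
  "<tr><td>1</td><td>x</td><td><a href='#'>President A</a></td><td>-10.44%</td></tr>",
  "<tr><td>2</td><td>y</td><td><a href='#'>President B</a></td><td>0.09%</td></tr>",
  "</table>"
))

# Names: third cell of each row, text of the link inside it
presidents <- page %>%
  html_nodes(xpath = "//table[1]//td[3]/a") %>%
  html_text()

# Margins: fourth cell of each row, as text
margin <- page %>%
  html_nodes(xpath = "//table[1]//td[4]") %>%
  html_text()

# Combine the cleaned columns into one data frame
result <- data.frame(
  president = presidents,
  margin    = as.numeric(sub("%", "", margin, fixed = TRUE)),
  stringsAsFactors = FALSE
)
print(result)
```

On the real page the column positions and the table index would need to match the XPath expressions used above.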