Using ggplot to create a multi-line chart from a CSV file

I am completely new to ggplot (and, to some extent, to R). I was impressed by the quality of the graphs that can be created with ggplot, and I am trying to learn how to create a simple multi-line graph with it.

Unfortunately, I did not find any tutorials that get me closer to what I am trying to do:

I have a CSV file that contains the following data:

    id,f1,f2,f3,f4,f5,f6
    30,0.841933670833,0.842101814883,0.842759547545,1.88961562347,1.99808377527,0.841933670833
    40,1.47207692205,1.48713866811,1.48717177671,1.48729643008,1.48743226992,1.48713866811
    50,0.823895293045,0.900091982861,0.900710334491,0.901274168324,0.901413662472,0.901413662472

I would like to plot:

  • the first column (id) on the x axis
  • each subsequent column as a line graph, with smoothing between the points to give a nice smooth line
  • a legend for f1, f2, ...
  • a specified line color, plus point markers (for example crosses, i.e. "+") on the line for one column (f2, say)

I am really new to ggplot, and I have not even managed to read the file into R.

Any help with creating the plot described above would be very educational and would flatten the ggplot learning curve for me.

1 answer
    dat <- structure(list(
      id = c(30L, 40L, 50L),
      f1 = c(0.841933670833, 1.47207692205, 0.823895293045),
      f2 = c(0.842101814883, 1.48713866811, 0.900091982861),
      f3 = c(0.842759547545, 1.48717177671, 0.900710334491),
      f4 = c(1.88961562347, 1.48729643008, 0.901274168324),
      f5 = c(1.99808377527, 1.48743226992, 0.901413662472),
      f6 = c(0.841933670833, 1.48713866811, 0.901413662472)),
      .Names = c("id", "f1", "f2", "f3", "f4", "f5", "f6"),
      class = "data.frame", row.names = c(NA, -3L))

From here I would use melt. Read ?melt.data.frame for more information, but in one sentence: it converts data from a "wide" format (one column per series) to a "long" format (one row per observation).

    library(reshape2)
    dat.m <- melt(dat, id.vars='id')
    > dat.m
       id variable     value
    1  30       f1 0.8419337
    2  40       f1 1.4720769
    3  50       f1 0.8238953
    4  30       f2 0.8421018
    5  40       f2 1.4871387
    6  50       f2 0.9000920
    7  30       f3 0.8427595
    8  40       f3 1.4871718
    9  50       f3 0.9007103
    10 30       f4 1.8896156
    11 40       f4 1.4872964
    12 50       f4 0.9012742
    13 30       f5 1.9980838
    14 40       f5 1.4874323
    15 50       f5 0.9014137
    16 30       f6 0.8419337
    17 40       f6 1.4871387
    18 50       f6 0.9014137
    >
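To make the wide-to-long step less of a black box, here is a minimal sketch of the same reshaping in base R, without reshape2. It uses a rounded two-column subset of the sample data, and the names wide, long and value_cols are mine, not part of the answer; it illustrates the idea and is not how melt is implemented.

```r
# Rounded subset of the sample data, in "wide" format (one column per series).
wide <- data.frame(
  id = c(30L, 40L, 50L),
  f1 = c(0.842, 1.472, 0.824),
  f2 = c(0.842, 1.487, 0.900)
)

# Every column except the identifier becomes a series.
value_cols <- setdiff(names(wide), "id")

# "Long" format: one row per (id, series) pair.
long <- data.frame(
  id       = rep(wide$id, times = length(value_cols)),
  variable = factor(rep(value_cols, each = nrow(wide)), levels = value_cols),
  value    = unlist(wide[value_cols], use.names = FALSE)
)

print(long)
```

The result has the same three columns (id, variable, value) that melt produces, so it can be passed to ggplot() in the same way.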

Then plot it however you like:

    ggplot(dat.m, aes(x=id, y=value, colour=variable)) +
      geom_line() +
      geom_point(data=dat.m[dat.m$variable=='f2',], size=2)

Here aes defines the aesthetics, such as the x value, the y value, the colour, and so on. You then add "layers". In the previous example, I added a line for what I defined in the ggplot() call using geom_line(), and added points with geom_point(), restricted to the f2 variable.
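The layering idea can be made explicit by storing the plot in a variable and adding one layer at a time. This is only a sketch on a rounded two-series subset of the sample data; the names long, p, p_lines and p_both are mine.

```r
library(ggplot2)

# Rounded subset of the sample data, already in long format.
long <- data.frame(
  id       = rep(c(30, 40, 50), times = 2),
  variable = rep(c("f1", "f2"), each = 3),
  value    = c(0.842, 1.472, 0.824, 0.842, 1.487, 0.900)
)

# Base plot: data and aesthetics only, no layers yet.
p <- ggplot(long, aes(x = id, y = value, colour = variable))

# First layer: one line per series.
p_lines <- p + geom_line()

# Second layer: crosses (shape = 3) on the f2 series only.
p_both <- p_lines +
  geom_point(data = long[long$variable == "f2", ], shape = 3, size = 3)

print(p_both)
```

Each intermediate object is a complete plot in its own right, which makes it easy to see what each layer contributes.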

Below, I add a smoothed line with geom_smooth(). See ?geom_smooth for more information on what it does.

    ggplot(dat.m, aes(x=id, y=value, colour=variable)) +
      geom_smooth() +
      geom_point(data=dat.m[dat.m$variable=='f2',], shape=3)
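With only three observations per series, as in the sample data, the default smoother may have too little to work with and may emit warnings. If what is wanted is simply a smooth curve passing through the points, one possible alternative (my assumption, not part of the answer) is to interpolate each series with base R's spline() and draw the result with geom_line(). The names long, smooth_one and smoothed are mine.

```r
library(ggplot2)

# Rounded subset of the sample data, already in long format.
long <- data.frame(
  id       = rep(c(30, 40, 50), times = 2),
  variable = rep(c("f1", "f2"), each = 3),
  value    = c(0.842, 1.472, 0.824, 0.842, 1.487, 0.900)
)

# Interpolate one series onto 50 evenly spaced x values.
smooth_one <- function(d) {
  s <- spline(d$id, d$value, n = 50)
  data.frame(id = s$x, value = s$y, variable = d$variable[1])
}

# Apply the interpolation to each series and stack the results.
smoothed <- do.call(rbind, lapply(split(long, long$variable), smooth_one))

# Smooth curves from the interpolated data, original points on top.
p <- ggplot(smoothed, aes(x = id, y = value, colour = variable)) +
  geom_line() +
  geom_point(data = long)

print(p)
```

Unlike geom_smooth(), this is interpolation rather than statistical smoothing: the curve goes exactly through every observed point.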

Or use a different point shape for every series. Here I put shape in the aesthetics of ggplot(). Aesthetics placed there apply to all subsequent layers, so they do not need to be specified each time. However, I can override the values specified in ggplot() in any later layer:

    ggplot(dat.m, aes(x=id, y=value, colour=variable, shape=variable)) +
      geom_smooth() +
      geom_point() +
      # colour is set outside aes() so the points are drawn in literal red
      geom_point(data=dat, aes(x=id, y=f2), colour='red', size=10, shape=2)
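The question also asks for a specific line color. One way to do that, which the answer above does not show, is scale_colour_manual() with a named vector of colors. The sketch below uses a rounded two-series subset of the sample data; the color choices and the names long, line_colours and p are arbitrary and mine.

```r
library(ggplot2)

# Rounded subset of the sample data, already in long format.
long <- data.frame(
  id       = rep(c(30, 40, 50), times = 2),
  variable = rep(c("f1", "f2"), each = 3),
  value    = c(0.842, 1.472, 0.824, 0.842, 1.487, 0.900)
)

# Named vector: series name -> colour (arbitrary choices for illustration).
line_colours <- c(f1 = "steelblue", f2 = "firebrick")

p <- ggplot(long, aes(x = id, y = value, colour = variable)) +
  geom_line() +
  # Crosses on the f2 series only.
  geom_point(data = long[long$variable == "f2", ], shape = 3, size = 3) +
  # Replace the default palette with the chosen colours.
  scale_colour_manual(values = line_colours)

print(p)
```

Because the vector is named, each color stays tied to its series regardless of the order in which the series appear in the data.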

Getting comfortable with ggplot does take some time, though. Check out the examples in the documentation and on the ggplot2 website. If your experience is anything like mine, after battling with it for a few days or weeks, it will eventually click. As for the data, if you assign your data to dat, the code above will not change: dat <- read.csv(...). I do not use data as a variable name because it is a built-in function.
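Since the question mentions trouble reading the file, here is a minimal sketch of that step. To stay self-contained it parses the sample data from a string through the text argument of read.csv; with a real file you would pass its path instead (the file name in the comment is hypothetical).

```r
# The sample data from the question, as CSV text.
csv_text <- "id,f1,f2,f3,f4,f5,f6
30,0.841933670833,0.842101814883,0.842759547545,1.88961562347,1.99808377527,0.841933670833
40,1.47207692205,1.48713866811,1.48717177671,1.48729643008,1.48743226992,1.48713866811
50,0.823895293045,0.900091982861,0.900710334491,0.901274168324,0.901413662472,0.901413662472"

# Parse the text; the first line is used as the header by default.
dat <- read.csv(text = csv_text)

# With a file on disk (hypothetical name) this would be:
# dat <- read.csv("mydata.csv")

# Inspect the column names and types.
str(dat)
```

The resulting dat has the same columns as the data frame built with structure() above, so the melt and ggplot calls apply unchanged.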


Source: https://habr.com/ru/post/1439794/

