I want to visualize the difference between two points with a line segment in ggplot2.
Suppose we have some data on income and spending as a time series. We would like to visualize not only these two, but also the balance (= income - spending). In addition, we would like to indicate whether the balance was positive (= surplus) or negative (= deficit).
I have tried several approaches, but none of them gave a satisfactory result. Here is a reproducible example.
library(dplyr)
library(ggplot2)
library(tidyr)
df <- data.frame(year = rep(2000:2009, times=3),
                 var = rep(c("income","spending","balance"), each=10),
                 value = c(0:9, 9:0, rep(c("deficit","surplus"), each=5)))
df
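One detail of this sample data that may matter for the approaches below: the value column mixes numbers with the labels "deficit" and "surplus", and c() coerces such mixed input to character. A small check, as a sketch only (the name nums is just illustrative):

```r
# Mixed numeric and character input to c() is coerced to character
value <- c(0:9, 9:0, rep(c("deficit", "surplus"), each = 5))
class(value)

# Only the first 20 entries (income and spending) hold numbers,
# so only those can be converted back without producing NA
nums <- as.numeric(value[1:20])
```

So any layer that needs a continuous y scale would presumably need the income and spending values converted back to numbers first.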
1. Approach with LONG data
It is not surprising that this does not work with LONG data, because the ymin and ymax arguments of geom_linerange cannot be specified correctly. ymin=value, ymax=value is certainly the wrong way to do it (expected behavior). ymin=income, ymax=spending is also clearly wrong (expected behavior).
df %>%
  ggplot() +
  geom_point(aes(x=year, y=value, colour=var)) +
  geom_linerange(aes(x=year, ymin=value, ymax=value, colour=net))
#> Error in function_list[[i]](value) : could not find function "spread"
2. Approach with WIDE data
I almost got it working with WIDE data. The plot looks good, but the legend for the geom_point(s) is missing (expected behavior). Simply adding show.legend = TRUE to the two geom_point(s) does not solve the problem, because it overprints the geom_linerange legend. In addition, I would rather have the two geom_point lines of code combined into one (see 1. Approach).
df %>%
  spread(var, value) %>%
  ggplot() +
  geom_linerange(aes(x=year, ymin=spending, ymax=income, colour=balance)) +
  geom_point(aes(x=year, y=spending), colour="red", size=3) +
  geom_point(aes(x=year, y=income), colour="green", size=3) +
  ggtitle("income (green) - spending (red) = balance")

3. Approach using LONG and WIDE data
Combining 1. Approach with 2. Approach results in yet another unsatisfactory plot. The legend does not differentiate between balance and var (= expected behavior).
ggplot() +
  geom_point(data=(df %>% filter(var=="income" | var=="spending")),
             aes(x=year, y=value, colour=var)) +
  geom_linerange(data=(df %>% spread(var, value)),
                 aes(x=year, ymin=spending, ymax=income, colour=balance))
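One direction that might separate the two legends is to map the points to fill (using a fillable point shape such as 21) and keep colour for the line ranges, since ggplot2 builds one legend per aesthetic. The following is only a minimal sketch under some assumptions: convert = TRUE in spread() is used to get numeric income and spending columns, the green/red colours are taken from the second approach, and the names wide and pts are purely illustrative.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same data as above; value is character because it mixes numbers and labels
df <- data.frame(year = rep(2000:2009, times = 3),
                 var = rep(c("income", "spending", "balance"), each = 10),
                 value = c(0:9, 9:0, rep(c("deficit", "surplus"), each = 5)),
                 stringsAsFactors = FALSE)

# WIDE data for the line ranges; convert = TRUE re-parses each new column,
# so income and spending become integers while balance stays character
wide <- df %>% spread(var, value, convert = TRUE)

# LONG data for the points, restricted to the two numeric series
pts <- df %>%
  filter(var %in% c("income", "spending")) %>%
  mutate(value = as.numeric(value))

p <- ggplot() +
  # colour is used only by the line ranges, giving a "balance" legend
  geom_linerange(data = wide,
                 aes(x = year, ymin = spending, ymax = income, colour = balance)) +
  # fill is used only by the points (shape 21 has a fill), giving a "var" legend
  geom_point(data = pts,
             aes(x = year, y = value, fill = var),
             shape = 21, size = 3) +
  scale_fill_manual(values = c(income = "green", spending = "red"))

p
```

Because colour and fill are separate aesthetics, this should yield one legend for balance and one for var, with a single geom_point call for both series. Whether this is the best geom for the job is still open.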

Should I use a different geom instead of geom_linerange?