Visualizing the difference between two points with ggplot2

I want to visualize the difference between two points with a connecting line in ggplot2.

Suppose we have some data on income and expenses as a time series. We would like to visualize not only them, but also the balance (= income - expenses). In addition, we would like to indicate whether the balance was positive (= surplus) or negative (= deficit).

I tried several approaches, but none of them gave a satisfactory result. Below is a reproducible example.

# Load libraries and create LONG data example data.frame
library(dplyr)
library(ggplot2)
library(tidyr)

df <- data.frame(year  = rep(2000:2009, times=3),
                 var   = rep(c("income","spending","balance"), each=10),
                 value = c(0:9, 9:0, rep(c("deficit","surplus"), each=5)))
df
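
One thing worth checking in this example data: because c() combines the digits with the "deficit"/"surplus" labels, the whole value column is coerced to a non-numeric type. The following is only a minimal sketch, assuming tidyr's spread() with its convert argument is an acceptable way to get numeric income and spending columns back; the name wide is introduced here purely for illustration.

```r
library(tidyr)

# Same construction as in the question: numbers and labels share one column
df <- data.frame(year  = rep(2000:2009, times = 3),
                 var   = rep(c("income", "spending", "balance"), each = 10),
                 value = c(0:9, 9:0, rep(c("deficit", "surplus"), each = 5)))

# Not numeric: "character" in R >= 4.0 ("factor" in older versions)
class(df$value)

# convert = TRUE asks spread() to run type.convert() on the new columns,
# so the digit strings become integers while balance stays text
wide <- spread(df, var, value, convert = TRUE)
str(wide)
```

With numeric columns the y axis is continuous, which matters as soon as the values are no longer single digits that happen to sort correctly as text.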

1.Approach with LONG data

It is not surprising that this does not work with LONG data, because the ymin and ymax arguments of geom_linerange cannot be specified correctly. ymin=value, ymax=value is certainly the wrong way to go (expected behavior). ymin=income, ymax=spending is also clearly wrong (expected behavior).

df %>% 
ggplot() + 
  geom_point(aes(x=year, y=value, colour=var)) +
  geom_linerange(aes(x=year, ymin=value, ymax=value, colour=net))

#>Error in function_list[[i]](value) : could not find function "spread"

2.Approach with WIDE data

I almost got it working with WIDE data. The plot looks good, but there is no legend for the geom_point(s) (expected behavior). Simply adding show.legend = TRUE to the two geom_point(s) does not solve the problem, because it overprints the geom_linerange legend. In addition, I would prefer to merge the geom_point lines of code into one (see 1.Approach).

df %>% 
  spread(var, value) %>% 
ggplot() + 
  geom_linerange(aes(x=year, ymin=spending, ymax=income, colour=balance)) +
  geom_point(aes(x=year, y=spending), colour="red", size=3) +
  geom_point(aes(x=year, y=income), colour="green", size=3) +
  ggtitle("income (green) - spending (red) = balance")

[Plot: 2.Approach]

3.Approach using LONG and WIDE data

The combination of 1.Approach with the result of 2.Approach leads to yet another unsatisfactory plot. The legend does not distinguish between balance and var (= expected behavior).

ggplot() + 
  geom_point(data=(df %>% filter(var=="income" | var=="spending")),
             aes(x=year, y=value, colour=var)) +
  geom_linerange(data=(df %>% spread(var, value)), 
                 aes(x=year, ymin=spending, ymax=income, colour=balance)) 

[Plot: 3.Approach]

  • () ?
  • geom geom_linerange?
  • ?

Try

ggplot(df[df$var != "balance", ]) + 
  # Points: map var to fill (pch = 21 is a fillable shape) and hide the outline
  geom_point(
    aes(x = year, y = value, fill = var), 
    size = 3, pch = 21, colour = alpha("white", 0)) +
  # Line ranges: drawn from the WIDE data and coloured by balance
  geom_linerange(
    aes(x = year, ymin = income, ymax = spending, colour = balance), 
    data = spread(df, var, value)) +
  scale_fill_manual(values = c("green", "red"))


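The idea is to use a different aesthetic for the points (fill with a fillable pch instead of colour), so that the points and the line ranges each get their own legend.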
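
The approach in the answer above, with points mapped to fill and line ranges mapped to colour, can also be written in a self-contained form. This is only a sketch under a few assumptions that are not in the original answer: the values are converted to numbers first, the fill colours are named explicitly so they do not depend on level order, and the object names wide, pts and p are illustrative.

```r
library(ggplot2)
library(tidyr)

df <- data.frame(year  = rep(2000:2009, times = 3),
                 var   = rep(c("income", "spending", "balance"), each = 10),
                 value = c(0:9, 9:0, rep(c("deficit", "surplus"), each = 5)))

# WIDE copy for the line ranges; convert = TRUE makes income/spending integers
wide <- spread(df, var, value, convert = TRUE)

# LONG copy for the points, restricted to the two numeric series
pts <- df[df$var != "balance", ]
pts$value <- as.numeric(as.character(pts$value))

p <- ggplot() +
  # colour is used only by the line ranges, so its legend shows balance
  geom_linerange(data = wide,
                 aes(x = year, ymin = spending, ymax = income,
                     colour = balance)) +
  # fill is used only by the points, so its legend shows var
  geom_point(data = pts,
             aes(x = year, y = value, fill = var),
             size = 3, pch = 21, colour = alpha("white", 0)) +
  scale_fill_manual(values = c(income = "green", spending = "red"))

p
```

Because each layer maps a different aesthetic, ggplot2 builds one legend for fill and one for colour instead of merging them.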


Source: https://habr.com/ru/post/1659145/

