ggplot2 - How to apply a manual gradient with a legend when geom_dotplot won't color dots at the same coordinates

I know that I am using geom_dotplot in a slightly unusual way, but it produces the graphic I want: it shows how many players each Premier League football club has in each position, with each dot representing one player. I have several categories that show whether a player is a squad player or a youth player. They are plotted separately, and the second set is nudged down so that the two do not overlap.

I want to add another layer of information by shading the dots according to how many minutes each player has played. I have this data in my data frame.

This colors the dots fine, except where the data is "grouped", in which case the dots turn gray.

I have read the guide on writing a good question. I cut the data down to show the problem without it being huge, and removed all unrelated code, such as the data manipulation leading up to this point, the plot titles, etc.

This is a sample of 20 players that produces 16 nicely colored dots and 2 pairs of gray, uncolored dots.

    u21 <- structure(list(
        team = structure(c(2L, 3L, 4L, 4L, 5L, 6L, 8L, 9L, 11L, 12L, 5L, 6L, 7L, 10L, 12L, 12L, 1L, 4L, 5L, 7L),
            .Label = c("AFC Bournemouth", "Arsenal", "Brighton & Hove Albion", "Chelsea", "Crystal Palace", "Everton", "Huddersfield Town", "Leicester City", "Liverpool", "Swansea City", "Tottenham Hotspur", "West Bromwich Albion"),
            class = "factor"),
        role = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
            .Label = "U21", class = "factor"),
        name = structure(c(10L, 2L, 1L, 15L, 13L, 19L, 4L, 7L, 20L, 8L, 17L, 9L, 18L, 11L, 3L, 6L, 14L, 5L, 12L, 16L),
            .Label = c("Boga", "Brown", "Burke", "Chilwell", "Christensen", "Field", "Grujic", "Harper", "Holgate", "Iwobi", "Junior Luz Sanches", "Loftus Cheek", "Lumeka", "Mousset", "Musonda", "Palmer", "Riedwald", "Sabiri", "Vlasic", "Walker-Peters"),
            class = "factor"),
        pos = structure(c(6L, 7L, 6L, 6L, 6L, 5L, 2L, 4L, 3L, 6L, 1L, 1L, 5L, 4L, 6L, 4L, 7L, 1L, 4L, 5L),
            .Label = c("2. CB", "3. LB", "3. RB", "4. CM", "5. AM", "5. WM", "6. CF"),
            class = "factor"),
        mins = c(11, 24, 18, 1, 25, 10, 90, 6, 90, 20, 99, 180, 97, 127, 35, 156, 32, 162, 258, 124)),
        .Names = c("team", "role", "name", "pos", "mins"),
        row.names = 471:490, class = "data.frame")

Here is the code I'm using:

    library(ggplot2)

    ggplot() +
        geom_dotplot(data = u21, aes(x = team, y = pos, fill = mins),
                     binaxis = "y", stackdir = "center", stackratio = 1,
                     dotsize = 0.1, binwidth = 0.75,
                     position = position_nudge(y = -0.1)) +
        scale_fill_gradient(low = "pink", high = "red")

In my actual code, I run the ggplot line again with another data frame, a different color gradient, and a different nudge so that the dots do not overlap.

1 answer

Basically, these "grouped" points are treated as NA values because ggplot gets two mins values for the same x, y coordinates, which breaks the coloring mechanism. For example, at the intersection of "team = Chelsea" and "pos = 5. WM" there are two values of mins: 18 and 1. The following code changes the color used for NA values from the default gray to yellow to show what is happening:

    # df is the sample data frame from the question (named u21 there).
    ggplot() +
        geom_dotplot(data = df, aes(x = team, y = pos, fill = mins),
                     binaxis = "y", stackdir = "center", stackratio = 1,
                     dotsize = 0.2, binwidth = 0.75,
                     position = position_nudge(y = -0.1)) +
        scale_fill_gradient(low = "pink", high = "red", na.value = "yellow") +
        theme(axis.text.x = element_text(angle = 90, vjust = 0.2, hjust = 1, size = 8))
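
The affected cells can also be found without plotting. The following is a minimal base R sketch, assuming a five-row excerpt of the sample data; the excerpt and the names cells, distinct_mins, and conflicts are illustrative and not part of the original code. It counts how many distinct mins values share each team/position cell and keeps the cells with more than one.

```r
# Five rows excerpted from the question's sample data (not the full data frame).
df <- data.frame(
  team = c("Arsenal", "Chelsea", "Chelsea", "Chelsea", "Crystal Palace"),
  pos  = c("5. WM", "5. WM", "5. WM", "2. CB", "4. CM"),
  mins = c(11, 18, 1, 162, 258)
)

# Count the distinct 'mins' values in every team/position cell.
cells <- aggregate(mins ~ team + pos, data = df,
                   FUN = function(m) length(unique(m)))
names(cells)[3] <- "distinct_mins"

# Cells holding more than one distinct value are the ones whose fill becomes NA.
conflicts <- cells[cells$distinct_mins > 1, ]
print(conflicts)
```

On this excerpt the only cell flagged is Chelsea at "5. WM", the same cell used as the example above.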

That was a creative use of geom_dotplot. It is not that you cannot do what you are asking with it, but getting the effect you want this way is overly complicated. Instead, you may have more luck with geom_jitter, which was designed for plotting this type of data.

    ggplot(df) +
        geom_jitter(aes(x = team, y = pos, col = mins), width = 0.2, height = 0) +
        scale_color_gradient(low = "pink", high = "red", na.value = "yellow") +
        theme(axis.text.x = element_text(angle = 90, vjust = 0.2, hjust = 1, size = 8))

EDIT:

If you still want the more complicated dotplot version and want to avoid jitter, here it is:

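    # Step 1: build a pink-to-red palette function.
    cols <- colorRampPalette(c("pink", "red"))

    # Step 2: assign each row a hex color according to its mins value.
    max_mins <- max(df$mins, na.rm = TRUE)
    df$cols <- cols(max_mins)[findInterval(df$mins, seq_len(max_mins))]

    # Step 3: map color to mins (for the legend) and fill to the precomputed hex colors.
    ggplot() +
        geom_dotplot(data = df, aes(x = team, y = pos, col = mins, fill = cols),
                     binaxis = "y", stackdir = "centerwhole", stackgroups = TRUE,
                     binpositions = "all", stackratio = 1, dotsize = 0.2,
                     binwidth = 0.75, position = position_nudge(y = -0.1)) +
        scale_color_gradient(low = "pink", high = "red", na.value = "yellow") +
        scale_fill_identity() +
        theme(axis.text.x = element_text(angle = 90, vjust = 0.2, hjust = 1, size = 8))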

For those who are less familiar with what is happening in the code for the third chart: step 1 stores the gradient range using colorRampPalette; step 2 assigns a hexadecimal color value to each row according to that row's df$mins value; step 3 plots the data with both the color and fill aesthetics set, so that a legend appears while the gray (or yellow) grouped dots are painted over with the correct manual gradient color, which is applied as-is by scale_fill_identity(). With this configuration, you get the right colors and the right legend.
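
Steps 1 and 2 can be tried on their own to see which color each value receives. This is a small sketch under stated assumptions: the vector mins holds four values picked from the sample data, and the names ramp, palette_vec, idx, and player_cols are illustrative rather than taken from the code above.

```r
# Four 'mins' values picked from the sample data, for illustration only.
mins <- c(1, 18, 90, 258)

# Step 1: a function that interpolates any number of colors from pink to red.
ramp <- colorRampPalette(c("pink", "red"))

# Step 2: one color per whole minute from 1 to max(mins), then look up
# each value's color by its position in that vector.
max_mins <- max(mins, na.rm = TRUE)
palette_vec <- ramp(max_mins)
idx <- findInterval(mins, seq_len(max_mins))
player_cols <- palette_vec[idx]

print(data.frame(mins, idx, player_cols))
```

The smallest value here maps to the pink end of the palette and the largest to the red end. Note that a value below 1 would get index 0 from findInterval, and indexing with 0 drops that element, so data containing players with 0 minutes would need the index shifted (for example by adding 1 to both the values and the palette length) before assigning the result to a column.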


Source: https://habr.com/ru/post/1271889/