The amount of density in heatmaps

Question

The amount of density in heatmaps

I have a problem with my heat mask, which displays the density level but says nothing about density. (for example, how many points are in one area).

My data is divided into more columns, but the most important are: lat, lon.

I would like to have something like this, but with "count": https://stackoverflow.com/a/166268/240524/ , however, when I try to apply the code that it uses in this answer, my maximum “level” density is not reflects my density. (Intead 7500 I get, for example, 6, even if I have thousands and thousands of data) What is my code:

us_map_g_str <- get_map(location = c(-90.0,41.5,-81.0,42.7), zoom = 7) ggmap(us_map_g_str, extent = "device") + geom_tile(data = data1, aes(x = as.numeric(lon), y = as.numeric(lat)), size = 0.3) + stat_density2d(data = data1, aes(x = as.numeric(lon), y = as.numeric(lat), fill = ..level.., alpha = ..level..), size = 0.3, bins = 10, geom = "polygon") + scale_fill_gradient(name= "Ios",low = "green", high = "red", trans= "exp") + scale_alpha(range = c(0, 0.3), guide = FALSE)

This is what I get:

This is a piece of data:

  lat lon tag device 1 43.33622 -83.67445 0 iPhone5 2 43.33582 -83.69964 0 iPhone5 3 43.33623 -83.68744 0 iPhone5 4 43.33584 -83.72186 0 iPhone5 5 43.33616 -83.67526 0 iPhone5 6 43.25040 -83.78234 0 iPhone5

(The column "tag" is not important)

+3

r ggplot2 ggmap

U. Cremona Sep 11 '15 at 11:27

source share

1 answer

jazzurro · Accepted Answer · 2015-09-11T17:10:10+0000

REVISED

I realized that my previous answer needs to be reviewed. So there it is. If you want to know how many data points exist at each contour level, you really have a lot to do. If you are happy to use the leaflet option below, your life will be much easier.

First, let's get a Detroit map and create a sample data frame.

 library(dplyr) library(ggplot2) library(ggmap) mymap <- get_map(location = "Detroit", zoom = 8) ### Create a sample data set.seed(123) mydata <- data.frame(long = runif(min = -84, max = -82.5, n = 100), lat = runif(min = 42, max = 42.7, n = 100))

Now we will draw a map and save it as g .

 g <- ggmap(mymap) + stat_density2d(data = mydata, aes(x = long, y = lat, fill = ..level..), size = 0.5, bins = 10, geom = "polygon")

The real work begins here. To find out the number of data points at all levels, you want to use the data frame that ggplot generates. In this data frame, you have data for polygons. These polygons are used to draw level lines. You can see this in the following image, in which I draw three levels on the map.

 ### Create a data frame so that we can find how many data points exist ### in each level. mydf <- ggplot_build(g)$data[[4]] ### Check where the polygon lines are. This is just for a check. check <- ggmap(mymap) + geom_point(data = mydata, aes(x = long, y = lat)) + geom_path(data = subset(mydf, group == "1-008"), aes(x = x, y = y)) + geom_path(data = subset(mydf, group == "1-009"), aes(x = x, y = y)) + geom_path(data = subset(mydf, group == "1-010"), aes(x = x, y = y))

The next step is to create a level vector for the legend. We group the data into groups (for example, 1-010 ) and take the first row for each group using slice() . Then ungroup the data and select the second column. Finally, create a vector with unlist() . In the end we will return to lev .

 mydf %>% group_by(group) %>% slice(1) %>% ungroup %>% select(2) %>% unlist -> lev

Now we break the polygonal data (i.e. mydf) into groups and create a polygon for each level. Since we have 11 levels (11 polygons), we use lapply() . In the lapply loop we need to do; 1) extract a column for longitude and latitude, 2) create a polygon, 3) convert polygons to spatial polygons, 4) assign CRS, 5) create a dummy data frame, and 6) create SpatialPolygonsDataFrames.

 mylist <- split(mydf, f = mydf$group) test <- lapply(mylist, function(x){ xy <- x[, c(3,4)] circle <- Polygon(xy, hole = as.logical(NA)) SP <- SpatialPolygons(list(Polygons(list(circle), ID = "1"))) proj4string(SP) <- CRS("+proj=longlat +ellps=WGS84") df <- data.frame(value = 1, row.names = "1") circleDF <- SpatialPolygonsDataFrame(SP, data = df) })

Now back to the original data. We need to convert the data frame to SpatialPointsDataFrame. This is due to the fact that we need to multiply the data and determine how many data points exist in each polygon (at each level). First, get long and lat from your data.frame. Make sure the order is in lon / lat mode.

 xy <- mydata[,c(1,2)]

Then we create the SPDF (SpatialPolygonsDataFrame). You want to have an identical proj4string between spatial polygons and spatial point data.

 spdf <- SpatialPointsDataFrame(coords = xy, data = mydata, proj4string = CRS("+proj=longlat +ellps=WGS84"))

Then we multiply the data ( mydata ) using each polygon.

 ana <- lapply(test, function(y){ mydf <- as.data.frame(spdf[y, ]) })

Data points overlap between levels; we have duplication. First, we try to find unique data points for each level. We bind the data frames in ana and create a data frame that is equal to foo1 . We also create a data frame that we want to find a unique number of data points. We guarantee that the column names are identical between foo1 and foo2 . Using setdiff() and nrow() , we can find a unique number of data points at each level.

 total <- lapply(11:2, function(x){ foo1 <- bind_rows(ana[c(11:x)]) foo2 <- as.data.frame(ana[x-1]) names(foo2) <- names(foo1) nrow(setdiff(foo2, foo1)) })

Finally, we need to find the number of data points for the innermost level, which is level 11. We select a data frame for level 11 in ana and create a data frame and count the number of rows.

  bob <- nrow(as.data.frame(ana[11])) out <- c(bob,unlist(total)) ### check if total is 100 ### sum(out) ### [1] 100

We assign reverse out as names for lev . This is because we want to show how many data points exist for each level in the legend.

  names(lev) <- rev(out)

Now we are ready to add a legend.

  final <- g + scale_fill_continuous(name = "Total", guide = guide_legend(), breaks = lev) final

LEAFLET OPTION

If you are using a booklet package, you can group data points with different scales. The flyer counts data points in specific areas and shows the numbers in circles, as shown in the following figure. The more you zoom in, the more leaflets break the data into small groups. In terms of workload, this is much easier. In addition, your map is interactive. This might be the best option.

 library(leaflet) leaflet(mydf) %>% addTiles() %>% addMarkers(clusterOptions = markerClusterOptions())

The amount of density in heatmaps

More articles: