Google Polylines

D Cooley

2020-12-15

The goal of googlePolylines is to encode and decode coordinates using Google’s polyline encoding algorithm

Polyline encoding is a lossy compression algorithm that allows you to store a series of coordinates as a single string.

Encoded polylines are used by Google Maps to draw lines and polygons, and are therefore supported in the googleway package.

I am intending to update googleway to support plotting sf objects using these encoded polylines.

Caution

The word lossy is important to keep in mind, as the encoding process could reduce precision of your data. If you are after highly accurate coordinates this process probably isn’t for you.

However, if you need to reduce the size of spatial objects/data, and want quicker plots (see the Benchmarking section), then this could help.

Encoding

Encoding is split across two functions

vectors

Given two vectors of longitude and latitude coordinates:

library(googlePolylines)

lon <- c(144.9709, 144.9713, 144.9715, 144.9719, 144.9728, 144.9732, 
144.973, 144.9727, 144.9731, 144.9749, 144.9742)

lat <- c(-37.8075, -37.8076, -37.8076, -37.8078, -37.8079, -37.8079, 
-37.8091, -37.8107, -37.8115, -37.8153, -37.8155)

encodeCoordinates(lon, lat)
## [1] "xgweFcsysZToA?g@f@oAPsD?oApFf@|Hz@`DmAvViJf@jC"

data.frame

The encode() function will attempt to find the lon & lat columns inside a data.frame using regex matching. However, you can also specify the columns of coordinates:

df <- data.frame(
  lon = c(144.9709, 144.9713, 144.9715, 144.9719, 144.9728, 144.9732, 144.973, 144.9727, 144.9731, 144.9749, 144.9742),
  lat = c(-37.8075, -37.8076, -37.8076, -37.8078, -37.8079, -37.8079, -37.8091, -37.8107, -37.8115, -37.8153, -37.8155)
  )

encode(df)
## [1] "xgweFcsysZToA?g@f@oAPsD?oApFf@|Hz@`DmAvViJf@jC"
## or specify the columns to use
# encode(df, lon = "lon", lat = "lat")

sf

encode() will currently work on sf objects with geometry types

It will not work on GEOMETRYCOLLECTION objects.

library(sf)
## Linking to GEOS 3.8.1, GDAL 3.1.4, PROJ 6.3.1
## data set of North Carolina states
nc <- sf::st_read(system.file("shape/nc.shp", package="sf"))
## Reading layer `nc' from data source `/Library/Frameworks/R.framework/Versions/4.0/Resources/library/sf/shape/nc.shp' using driver `ESRI Shapefile'
## Simple feature collection with 100 features and 14 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
## geographic CRS: NAD27
nc[1:5,]
## Simple feature collection with 5 features and 14 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -81.74107 ymin: 36.07282 xmax: -75.77316 ymax: 36.58965
## geographic CRS: NAD27
##    AREA PERIMETER CNTY_ CNTY_ID        NAME  FIPS FIPSNO CRESS_ID BIR74 SID74
## 1 0.114     1.442  1825    1825        Ashe 37009  37009        5  1091     1
## 2 0.061     1.231  1827    1827   Alleghany 37005  37005        3   487     0
## 3 0.143     1.630  1828    1828       Surry 37171  37171       86  3188     5
## 4 0.070     2.968  1831    1831   Currituck 37053  37053       27   508     1
## 5 0.153     2.206  1832    1832 Northampton 37131  37131       66  1421     9
##   NWBIR74 BIR79 SID79 NWBIR79                       geometry
## 1      10  1364     0      19 MULTIPOLYGON (((-81.47276 3...
## 2      10   542     3      12 MULTIPOLYGON (((-81.23989 3...
## 3     208  3616     6     260 MULTIPOLYGON (((-80.45634 3...
## 4     123   830     2     145 MULTIPOLYGON (((-76.00897 3...
## 5    1066  1606     3    1197 MULTIPOLYGON (((-77.21767 3...

When used on an sf object, an sfencoded object is returned, with the encoded polylines replacing the sf::st_geometry column.

enc <- encode(nc)
str(enc)
## Classes 'sfencoded' and 'data.frame':    100 obs. of  15 variables:
##  $ AREA     : num  0.114 0.061 0.143 0.07 0.153 0.097 0.062 0.091 0.118 0.124 ...
##  $ PERIMETER: num  1.44 1.23 1.63 2.97 2.21 ...
##  $ CNTY_    : num  1825 1827 1828 1831 1832 ...
##  $ CNTY_ID  : num  1825 1827 1828 1831 1832 ...
##  $ NAME     : chr  "Ashe" "Alleghany" "Surry" "Currituck" ...
##  $ FIPS     : chr  "37009" "37005" "37171" "37053" ...
##  $ FIPSNO   : num  37009 37005 37171 37053 37131 ...
##  $ CRESS_ID : int  5 3 86 27 66 46 15 37 93 85 ...
##  $ BIR74    : num  1091 487 3188 508 1421 ...
##  $ SID74    : num  1 0 5 1 9 7 0 0 4 1 ...
##  $ NWBIR74  : num  10 10 208 123 1066 ...
##  $ BIR79    : num  1364 542 3616 830 1606 ...
##  $ SID79    : num  0 3 6 2 3 5 2 2 2 5 ...
##  $ NWBIR79  : num  19 12 260 145 1197 ...
##  $ geometry :encoded_column of length 100; first element: List of 1
##   ..$ : chr "u_d|EtsgpNmmFphLyEbcCibLf{Lk~H`bT}rNmjGihHd[kvL_lEzgBkl~@lyE`MnvCimCbmEqfAxnGieH~gEeTd_DmiCxvA_D|oAdwCidAtsKr_A"| __truncated__
##   .. ..- attr(*, "sfc")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
##  - attr(*, "encoded_column")= chr "geometry"
##  - attr(*, "sfAttributes")=List of 7
##   ..$ type: chr "MULTIPOLYGON"
##   ..$ dim : chr "XY"
##   ..$ bbox: 'bbox' Named num [1:4] -84.3 33.9 -75.5 36.6
##   .. ..- attr(*, "names")= chr [1:4] "xmin" "ymin" "xmax" "ymax"
##   ..$ epsg: int 4267
##   ..$ proj: chr "+proj=longlat +datum=NAD27 +no_defs"
##   ..$ prec: num 0
##   ..$ n_em: int 0
attr(enc, "encoded_column")
## [1] "geometry"
enc[1, attr(enc, "encoded_column")]
## [[1]]
## [1] "u_d|EtsgpNmmFphLyEbcCibLf{Lk~H`bT}rNmjGihHd[kvL_lEzgBkl~@lyE`MnvCimCbmEqfAxnGieH~gEeTd_DmiCxvA_D|oAdwCidAtsKr_AjiEdwA~YztFggAjbCfrAwf@hrF~mBdd@bh@zsB`cCpgCt_@d{B"
## attr(,"sfc")
## [1] "XY"           "MULTIPOLYGON" "sfg"         
## 
## attr(,"class")
## [1] "list"

As you can see, the geometry attributes are kept on the encoded object. However, you can remove them by specifying strip = TRUE

encLite <- encode(nc, strip = T)
str(encLite)
## Classes 'sfencodedLite' and 'data.frame':    100 obs. of  15 variables:
##  $ AREA     : num  0.114 0.061 0.143 0.07 0.153 0.097 0.062 0.091 0.118 0.124 ...
##  $ PERIMETER: num  1.44 1.23 1.63 2.97 2.21 ...
##  $ CNTY_    : num  1825 1827 1828 1831 1832 ...
##  $ CNTY_ID  : num  1825 1827 1828 1831 1832 ...
##  $ NAME     : chr  "Ashe" "Alleghany" "Surry" "Currituck" ...
##  $ FIPS     : chr  "37009" "37005" "37171" "37053" ...
##  $ FIPSNO   : num  37009 37005 37171 37053 37131 ...
##  $ CRESS_ID : int  5 3 86 27 66 46 15 37 93 85 ...
##  $ BIR74    : num  1091 487 3188 508 1421 ...
##  $ SID74    : num  1 0 5 1 9 7 0 0 4 1 ...
##  $ NWBIR74  : num  10 10 208 123 1066 ...
##  $ BIR79    : num  1364 542 3616 830 1606 ...
##  $ SID79    : num  0 3 6 2 3 5 2 2 2 5 ...
##  $ NWBIR79  : num  19 12 260 145 1197 ...
##  $ geometry :encoded_column of length 100; first element: List of 1
##   ..$ : chr "u_d|EtsgpNmmFphLyEbcCibLf{Lk~H`bT}rNmjGihHd[kvL_lEzgBkl~@lyE`MnvCimCbmEqfAxnGieH~gEeTd_DmiCxvA_D|oAdwCidAtsKr_A"| __truncated__
##  - attr(*, "encoded_column")= chr "geometry"
encLite[1, attr(encLite, "encoded_column")]
## [[1]]
## [1] "u_d|EtsgpNmmFphLyEbcCibLf{Lk~H`bT}rNmjGihHd[kvL_lEzgBkl~@lyE`MnvCimCbmEqfAxnGieH~gEeTd_DmiCxvA_D|oAdwCidAtsKr_AjiEdwA~YztFggAjbCfrAwf@hrF~mBdd@bh@zsB`cCpgCt_@d{B"
## 
## attr(,"class")
## [1] "list"

The benefit of stripping the attributes is to reduce the size of the object, which can be useful for web plotting if bandwidth/data transfer speeds are an issue.

vapply(mget(c('nc', 'enc', 'encLite') ), function(x) { format(object.size(x), units = "Kb") }, '')
##         nc        enc    encLite 
## "139.8 Kb"  "90.6 Kb"  "52.6 Kb"

Well-known Text

The two functions polyline_wkt and wkt_polylne can be used to convert sfencoded objects to and from well-known text.

wkt <- polyline_wkt(enc)
wkt[1, ]
##    AREA PERIMETER CNTY_ CNTY_ID NAME  FIPS FIPSNO CRESS_ID BIR74 SID74 NWBIR74
## 1 0.114     1.442  1825    1825 Ashe 37009  37009        5  1091     1      10
##   BIR79 SID79 NWBIR79                          geometry
## 1  1364     0      19 MULTIPOLYGON (((-81.472748 36....
enc2 <- wkt_polyline(wkt)
enc2[1, ]
##    AREA PERIMETER CNTY_ CNTY_ID NAME  FIPS FIPSNO CRESS_ID BIR74 SID74 NWBIR74
## 1 0.114     1.442  1825    1825 Ashe 37009  37009        5  1091     1      10
##   BIR79 SID79 NWBIR79                              geometry
## 1  1364     0      19 MULTIPOLYGON: erenEdma_NcvDgbCmhEw...

I’ve provided these functions to enable the conversion of encoded polylines into other geometry formats, should they be required by other packages.

Converting to sf

sf can read well-known text, so you can convert the wkt object back to sf / sfc objects

# sfc from wkt
st_as_sfc(wkt$geometry)
## Geometry set for 100 features 
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -84.32384 ymin: 33.88199 xmax: -75.45697 ymax: 36.58964
## CRS:            NA
## First 5 geometries:
## MULTIPOLYGON (((-81.47275 36.23435, -81.54084 3...
## MULTIPOLYGON (((-81.23989 36.36536, -81.24068 3...
## MULTIPOLYGON (((-80.45634 36.24255, -80.47638 3...
## MULTIPOLYGON (((-76.00897 36.31959, -76.01734 3...
## MULTIPOLYGON (((-77.21766 36.24098, -77.23461 3...
## back to sf - use `as.data.frame` to remove sfencoded attributes
sf_wkt <- as.data.frame(wkt)
sf_wkt$geometry <- st_as_sfc(sf_wkt$geometry)
sf_wkt <- st_sf(sf_wkt)
  
head(sf_wkt[, c("AREA", "PERIMETER", "geometry")])
## Simple feature collection with 6 features and 2 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -81.74107 ymin: 36.07281 xmax: -75.77315 ymax: 36.58964
## CRS:            NA
##    AREA PERIMETER                       geometry
## 1 0.114     1.442 MULTIPOLYGON (((-81.47275 3...
## 2 0.061     1.231 MULTIPOLYGON (((-81.23989 3...
## 3 0.143     1.630 MULTIPOLYGON (((-80.45634 3...
## 4 0.070     2.968 MULTIPOLYGON (((-76.00897 3...
## 5 0.153     2.206 MULTIPOLYGON (((-77.21766 3...
## 6 0.097     1.670 MULTIPOLYGON (((-76.74506 3...

Decoding

Use decode() to decode polylines into coordinates. This function will return a list of data.frames with lon/lat column.

polylines <- c(
 "ohlbDnbmhN~suq@am{tAw`qsAeyhGvkz`@fge}A",
 "ggmnDt}wmLgc`DesuQvvrLofdDorqGtzzV"
)

decode(polylines)
## [[1]]
##      lat       lon
## 1 26.774 -80.18999
## 2 18.466 -66.11799
## 3 32.321 -64.75700
## 4 26.774 -80.18999
## 
## [[2]]
##      lat       lon
## 1 28.745 -70.57899
## 2 29.570 -67.51400
## 3 27.339 -66.66800
## 4 28.745 -70.57899

Benchmarking

You will note that the encoded strings are different between the two enc objects created earlier. For example

enc[1, 'geometry'][[1]] == enc2[1, 'geometry'][[1]]
## [1] FALSE FALSE

This results from the lossy-ness of the encoding. However, the general shape of the information is preserved. These two maps are plots of the enc and enc2 objects respectively:

library(googleway)

## You'll need a Google Map API key to run this code
mapKey <- "your_api_key"

google_map(key = mapKey) %>%
  add_polygons(data = nc, polyline = "geometry", fill_colour = "#00FF00", fill_opacity = 0.2)

map_enc

google_map(key = mapKey) %>%
  add_polygons(data = enc2, polyline = "geometry", fill_colour = "#FF00FF", fill_opacity = 0.2)

map_enc2

Plotting

This benchmark compares plotting an sf object through leaflet vs plotting the encoded object through googleway


library(microbenchmark)
library(leaflet)

microbenchmark(

  goog = { 
    google_map(key = mapKey) %>%
      add_polygons(data = enc, polyline = "polyline")
      },
      
  leaf = { 
    leaflet() %>%
      addTiles() %>%
      addPolygons(data = nc)
    }, 
    times = 25
)

Unit: milliseconds
 expr       min        lq      mean    median        uq        max neval
 goog  5.457699  5.972203  6.379053  6.162676  6.467225   8.876789   100
 leaf 29.915151 32.093100 34.251388 33.088324 34.629769 106.124047   100
 

These benchmarks don’t account for the time taken for the browswer to render the maps