Introduction to rgeolocate

Os Keyes

2021-12-18

rgeolocate is an R package for geolocating IP addresses. It contains bindings to a variety of web APIs for IP geolocation, as well as an interface to the proprietary MaxMind GeoIP databases (and a corresponding sample database).

For the package to install properly, you’ll first need libmaxminddb: the source code and installation instructions can be found here.

MaxMind databases

The biggest chunk of functionality is around the MaxMind proprietary databases, which you can read about here. rgeolocate has support for retrieving continent, country, region, netspeed, and timezone information from these databases, which (unfortunately!) must be paid for if you want to access them.

MaxMind does openly release a more limited set of databases covering country and city data; these are also supported and can be obtained here. To minimise the effort for the individual user, the smaller, country-level database is included in rgeolocate. To demonstrate both how to use the database, and how to use the geolocation function, we can do:

library(rgeolocate)
file <- system.file("extdata","GeoLite2-Country.mmdb", package = "rgeolocate")
results <- maxmind("196.200.60.51", file, c("continent_name", "country_code", "country_name"))
results
##   continent_name country_code country_name
## 1         Africa           ML         Mali

The maxmind() function is fully vectorised and will happily geolocate around a million IP addresses in just under 5 seconds. Only the country data, as said, is shipped with rgeolocate: if you’d like another database, you should download it yourself from the MaxMind website, and point the file argument of maxmind() to the downloaded database.

Web APIs

rgeolocate also contains bindings to various IP geolocation web APIs. At the moment, there is support for:

These are somewhat different, with different sources for their data and different amounts of data. While vectorised, they’re going to be a lot slower than maxmind(), but are more viable for (say) obtaining information we don’t currently extract from the MaxMind databases, or handling IP addresses the free MaxMind databases don’t cover if you can’t obtain the paid ones.