This vignette demonstrates how to use the functions rtry_geocoding and rtry_revgeocoding within the ‘rtry’ package to perform geocoding and reverse geocoding for a list of locations or coordinates.
Geocoding is the process of converting an address into geographic coordinates (latitude and longitude), while reverse geocoding is the process of converting geographic coordinates into an address.
The functions rtry_geocoding and rtry_revgeocoding are based on Nominatim, a search engine for OpenStreetMap (OSM) data. The data provided are free to use for any purpose, including commercial use, but note that they are governed by the Open Database License (ODbL). Under the Nominatim Usage Policy, this OSM service allows an absolute maximum of 1 request per second (no heavy usage) and requires a valid email address to identify the requests. For details, please refer to: https://wiki.openstreetmap.org/wiki/Nominatim.
Note that the georeference system used is WGS84.
Make sure you have the ‘rtry’ package installed. If not, you may refer to the vignette “Introduction to rtry” (rtry-introduction).
To start, set the working directory to the desired location:
# Set the working directory
setwd("<path_to_dir>")
# Check the working directory
getwd()
Note: The character “\” is used as the escape character in R to give the following character a special meaning (e.g. “\n” for newline, “\t” for tab, “\r” for carriage return and so on). Therefore, for Windows users, it is important to use “/” (or the escaped form “\\”) in the file path of the command instead of a single “\” in order for R to correctly understand the input path.
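As a small illustration of the note above, the following sketch writes the same directory in three ways and checks that they agree. The directory itself is a made-up example, not a path used elsewhere in this vignette.

```r
# Three ways to write the same (hypothetical) Windows directory in R
path_forward <- "C:/Users/user/Documents/rtry"        # forward slashes
path_escaped <- "C:\\Users\\user\\Documents\\rtry"    # each backslash doubled
path_built   <- file.path("C:", "Users", "user", "Documents", "rtry")

# Converting the escaped form shows that it denotes the same path
path_converted <- gsub("\\", "/", path_escaped, fixed = TRUE)

identical(path_converted, path_forward)
identical(path_built, path_forward)
```

Building paths with file.path() avoids typing separators by hand, which is why it is also used for the export steps later in this vignette.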
Load the required packages using the commands:
# Load the rtry package
library(rtry)
# Check the version of rtry
packageVersion("rtry")
# Load the dplyr package which is used for piping (%>%)
library(dplyr)
rtry_geocoding() takes two parameters, address and email, and returns a data frame that contains latitude (lat) and longitude (lon) in WGS84.
rtry_geocoding(address = NULL, email = NULL)
Argument | Description |
---|---|
address | String of an address |
email | String of an email address |
In the context of this example workflow, we will use the location data provided within the ‘rtry’ package. The path to the file data_locations.csv can be obtained via system.file(), which returns the full path to a file inside the installed ‘rtry’ package:
# Obtain and print the path to the sample dataset within the rtry package
path_to_data <- system.file("testdata", "data_locations.csv", package = "rtry")
path_to_data
## [1] "C:/Program Files/R/R-4.0.5/library/rtry/testdata/data_locations.csv"
To load the .csv file with location information, use rtry_import():
# Load the locations from a .csv file
input_locations <- rtry_import(path_to_data, separator = ",", encoding = "UTF-8", quote = "\"")
# View the location data in the data viewer
View(input_locations)
## input: C:/Program Files/R/R-4.0.5/library/rtry/testdata/data_locations.csv
## dim: 20 3
## col: Country code Country Location
Then, the location data should be converted into the required format for address, i.e. <location>, <country>:
# Extract and combine the location and country names
input_addresses <- paste(input_locations$Location, input_locations$Country, sep = ", ")
# Display the first six rows
head(input_addresses)
## [1] "Hajdúdorog, Hungary" "Diósd, Hungary" "Fót, Hungary" "Bőcs, Hungary"
## [5] "Regéc, Hungary" "Sáska, Hungary"
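One thing worth keeping in mind with paste() is that a missing value is converted into the text "NA", which would then be sent as a search term. The sample data do not necessarily contain missing values; the sketch below uses a made-up three-row data frame (the names locations, naive and addresses are illustrative only) to show one possible way of keeping such rows as NA.

```r
# Hypothetical mini data set; the second row has no location name
locations <- data.frame(
  Location = c("Fót", NA, "Regéc"),
  Country  = c("Hungary", "Hungary", "Hungary"),
  stringsAsFactors = FALSE
)

# paste() turns NA into the text "NA"
naive <- paste(locations$Location, locations$Country, sep = ", ")
naive[2]

# Build the address only where both parts are present, otherwise keep NA
complete <- !is.na(locations$Location) & !is.na(locations$Country)
addresses <- ifelse(complete,
                    paste(locations$Location, locations$Country, sep = ", "),
                    NA_character_)
is.na(addresses)
```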
Note that the file encoding UTF-8 is used. Depending on the system language setting, the RStudio console may display non-ASCII characters as Unicode escapes (in the <U+0000> notation), which is normal. For example, “Bőcs” might be displayed as “B<U+0151>cs”.
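To see that only the display is affected and not the data, the string can be inspected directly. This is a minimal sketch using base R functions; the variable name name_escaped is illustrative.

```r
# "\u0151" is the escape for the letter o with double acute (U+0151)
name_escaped <- "B\u0151cs"

# Number of characters in the string, whatever the console shows
nchar(name_escaped)

# Code point of the second character (0x151 is 337 in decimal)
utf8ToInt(substr(name_escaped, 2, 2))

# Encoding mark carried by the string
Encoding(name_escaped)
```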
To apply the function rtry_geocoding() to the list input_addresses, use lapply(). Please remember to replace the email address with your own.
Since the OSM service allows an absolute maximum of 1 request per second, the following example sets a 2-second delay between searches.
# Prepare counter for printed progress messages
counter <- 1
# Initialise the output first; otherwise an 'object not found' error is sometimes raised
output_coordinates <- NULL

# Use lapply to apply function to the list of addresses
output_coordinates <- lapply(input_addresses, function(address) {
  # Calling the Nominatim OpenStreetMap API
  # Please change the email address into your own email address
  geocode_output <- rtry_geocoding(address, email = "john.doe@example.com")

  # No heavy uses (an absolute maximum of 1 request per second)
  # Here set to 2 seconds between each search
  Sys.sleep(2)

  # Print message in console to see the progress
  message("Geocoding ", counter, "/", nrow(input_locations), " completed.")
  counter <<- counter + 1

  # Return data.frame with the input address, output of the rtry_geocoding function
  return(data.frame(address = address, geocode_output))
}) %>%
  # Stack the list output into data.frame
  bind_rows() %>% data.frame()
## Geocoding 1/20 completed.
## Geocoding 2/20 completed.
## Geocoding 3/20 completed.
## ...
## Geocoding 20/20 completed.
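The pacing used in the loop above can also be written as a small reusable helper. The sketch below is one possible way to do this and is not part of the ‘rtry’ package: paced_lapply and fake_geocoder are hypothetical names, and the stand-in geocoder returns fixed placeholder coordinates so that the example runs offline. It sleeps only for the part of the interval that has not yet elapsed and uses the loop index for the progress message instead of a global counter.

```r
# Apply `fun` to every element of `x`, leaving at least `min_interval`
# seconds between the start of consecutive calls
paced_lapply <- function(x, fun, min_interval = 2) {
  results <- vector("list", length(x))
  last_start <- -Inf
  for (i in seq_along(x)) {
    # Sleep only for the part of the interval that has not yet elapsed
    wait <- min_interval - (as.numeric(Sys.time()) - last_start)
    if (wait > 0) Sys.sleep(wait)
    last_start <- as.numeric(Sys.time())
    results[[i]] <- fun(x[[i]])
    message("Request ", i, "/", length(x), " completed.")
  }
  results
}

# Offline stand-in for a geocoder (hypothetical, returns placeholder values)
fake_geocoder <- function(address) {
  data.frame(address = address, lat = 47.5, lon = 19.0)
}

demo_addresses <- c("Fót, Hungary", "Regéc, Hungary", "Sáska, Hungary")
started <- Sys.time()
demo_output <- do.call(
  rbind,
  paced_lapply(demo_addresses, fake_geocoder, min_interval = 0.2)
)
elapsed <- as.numeric(difftime(Sys.time(), started, units = "secs"))
```

In the real workflow, the function passed to the helper would wrap rtry_geocoding() with your own email address, and min_interval would stay at 1 second or more to respect the usage policy.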
The progress messages appear in the console while the geocoding runs. Once the geocoding is completed, view output_coordinates using the View function.
The output_coordinates data frame contains each input address together with the returned lat and lon. Note that for a location unknown to OSM, the resulting latitude and longitude are marked as NA.
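To find out which addresses were not resolved, the NA entries can be selected directly. The sketch below uses a made-up data frame with the same column names as the geocoding output (address, lat, lon); the coordinates are placeholder values rather than actual Nominatim results.

```r
# Hypothetical result in the shape produced by the geocoding loop
example_output <- data.frame(
  address = c("Fót, Hungary", "Nowhere, Hungary", "Regéc, Hungary"),
  lat = c(47.6, NA, 48.4),
  lon = c(19.2, NA, 21.3),
  stringsAsFactors = FALSE
)

# Addresses for which no coordinates were returned
missing_coords <- is.na(example_output$lat) | is.na(example_output$lon)
unresolved <- example_output$address[missing_coords]
unresolved

# Number of addresses that were resolved
n_resolved <- sum(!missing_coords)
n_resolved
```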
Substitute the coordinates into the corresponding columns of the input data.
# Add the output coordinates to the corresponding columns in the input data
input_locations$Latitude <- output_coordinates$lat
input_locations$Longitude <- output_coordinates$lon
# If necessary, re-arrange the columns
input_locations <- rtry_select_col(input_locations, "Country code", Country, Location, Latitude, Longitude, showOverview = FALSE)
# View data
head(input_locations)
# Export into .csv
output_file <- file.path(tempdir(), "locations_to_coordinates.csv")
rtry_export(input_locations, output_file)
## File saved at: C:/Users/user/AppData/Local/Temp/Rtmp4wJAvQ/locations_to_coordinates.csv
rtry_revgeocoding() takes two parameters, lat_lon and email, and returns a data frame that contains the corresponding location.
rtry_revgeocoding(lat_lon = NULL, email = NULL)
Argument | Description |
---|---|
lat_lon | A data frame containing latitude and longitude in WGS84 |
email | String of an email address |
Here, we will use the coordinates data provided within the ‘rtry’ package. The path to the file data_coordinates.csv can be obtained via system.file(), which returns the full path to a file inside the installed ‘rtry’ package:
# Obtain and print the path to the sample dataset within the rtry package
path_to_data <- system.file("testdata", "data_coordinates.csv", package = "rtry")
path_to_data
## [1] "C:/Program Files/R/R-4.0.5/library/rtry/testdata/data_coordinates.csv"
To load the .csv file with coordinates information, use rtry_import():
input_coordinates <- rtry_import(path_to_data, separator = ",", encoding = "UTF-8", quote = "\"")
## input: C:/Program Files/R/R-4.0.5/library/rtry/testdata/data_coordinates.csv
## dim: 20 2
## col: Latitude Longitude
Then, the coordinates data should be converted into a data.frame:
# Extract and convert the coordinates into a data frame
input_lat_lon <- data.frame(lat = input_coordinates$Latitude, lon = input_coordinates$Longitude)
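Before sending any requests, it can be useful to check that the coordinates are plausible, since swapped latitude and longitude columns are an easy mistake to make. The helper below is a hypothetical sketch, not an ‘rtry’ function; it flags rows whose values are missing or fall outside the valid ranges of -90 to 90 degrees for latitude and -180 to 180 degrees for longitude.

```r
# Return TRUE for rows with non-missing, in-range latitude and longitude
check_lat_lon <- function(lat_lon) {
  !is.na(lat_lon$lat) & !is.na(lat_lon$lon) &
    lat_lon$lat >= -90 & lat_lon$lat <= 90 &
    lat_lon$lon >= -180 & lat_lon$lon <= 180
}

# Made-up coordinates: one valid row, one out-of-range latitude, one missing
example_lat_lon <- data.frame(lat = c(47.5, 120, NA), lon = c(19.0, 19.0, 21.3))
valid <- check_lat_lon(example_lat_lon)
valid
```

Rows flagged as FALSE could then be inspected or dropped before calling the Nominatim service.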
To apply the function rtry_revgeocoding to input_lat_lon row by row, use apply(). Please remember to replace the email address with your own.
Since the OSM service allows an absolute maximum of 1 request per second, the following example sets a 2-second delay between searches.
# Prepare counter for printed progress messages
counter <- 1
# Initialise the output first; otherwise an 'object not found' error is sometimes raised
output_locations <- NULL

# Use apply to apply function to the data.frame that contains the coordinates
# Please change the email address to your own email address
output_locations <- apply(input_lat_lon, 1, function(lat_lon) {
  # Calling the Nominatim OpenStreetMap API
  rev_geocode_output <- rtry_revgeocoding(lat_lon, email = "john.doe@example.com")

  # No heavy uses (an absolute maximum of 1 request per second)
  # Here set to 2 seconds between each search
  Sys.sleep(2)

  # Print message in console to see the progress
  message("Reverse Geocoding ", counter, "/", length(input_lat_lon$lat), " completed.")
  counter <<- counter + 1

  # Return data.frame with the input coordinates, output of the rtry_revgeocoding function
  return(data.frame(lat = lat_lon[1], lon = lat_lon[2], rev_geocode_output))
}) %>%
  # Stack the list output into data.frame
  bind_rows() %>% data.frame()
## Reverse Geocoding 1/20 completed.
## Reverse Geocoding 2/20 completed.
## Reverse Geocoding 3/20 completed.
## ...
## Reverse Geocoding 20/20 completed.
The progress messages appear in the console while the reverse geocoding runs. Once the reverse geocoding is completed, view output_locations using the View function.
The output contains the location information returned for each pair of coordinates. Note that for some coordinates OpenStreetMap might not have the town/city information; in such cases, those columns are marked as NA.
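One detail of the apply() call used for reverse geocoding is that each row reaches the function as a named numeric vector rather than as a one-row data frame, which is why lat_lon[1] and lat_lon[2] pick out the latitude and longitude. The offline sketch below illustrates this with made-up coordinates; no request is sent, and the names example_lat_lon and row_info are illustrative.

```r
# Made-up coordinates in the same shape as input_lat_lon
example_lat_lon <- data.frame(lat = c(47.5, 48.4), lon = c(19.0, 21.3))

# Inspect what the function receives for each row
row_info <- apply(example_lat_lon, 1, function(lat_lon) {
  data.frame(
    is_vector  = is.numeric(lat_lon) && !is.data.frame(lat_lon),
    first_name = names(lat_lon)[1],
    lat        = lat_lon[["lat"]],
    lon        = lat_lon[["lon"]],
    stringsAsFactors = FALSE
  )
})

# apply() returns a list with one data frame per row
length(row_info)
row_info[[1]]
```

Because apply() converts the data frame to a matrix first, adding a character column to the input would turn every value into text, so it is safest to keep only the numeric lat and lon columns.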
Substitute country_code and country into the corresponding columns of the input data, while the location name is taken from town or, where town is missing, from city.
# Add the output location information to the corresponding columns in the input data
input_coordinates$'Country code' <- output_locations$country_code
input_coordinates$Country <- output_locations$country
input_coordinates$Location <- ifelse(!is.na(output_locations$town), output_locations$town, output_locations$city)
# If necessary, re-arrange the columns
input_coordinates <- rtry_select_col(input_coordinates, Latitude, Longitude, "Country code", Country, Location, showOverview = FALSE)
# View data
head(input_coordinates)
# Export into .csv
output_file <- file.path(tempdir(), "coordinates_to_locations.csv")
rtry_export(input_coordinates, output_file)
## File saved at: C:/Users/user/AppData/Local/Temp/Rtmp4wJAvQ/coordinates_to_locations.csv
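The town-or-city fallback used when filling the Location column can be tried out on a small made-up example. The data frame below is hypothetical and only mimics the town and city columns of the reverse geocoding output; it also shows that a row with neither value stays NA and may need a manual check.

```r
# Hypothetical output with the town and city columns only
example_locations <- data.frame(
  town = c("Fót", NA, NA),
  city = c(NA, "Budapest", NA),
  stringsAsFactors = FALSE
)

# Take town where available, otherwise fall back to city
location <- ifelse(!is.na(example_locations$town),
                   example_locations$town,
                   example_locations$city)
location

# Rows where neither town nor city was returned
needs_review <- which(is.na(location))
needs_review
```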