The helsinki R package provides tools to access open data from the Helsinki region in Finland.
For contact information, source code and bug reports, see the project’s GitHub page. For other similar packages and related blog posts, see the rOpenGov project website.
Release version for most users:
install.packages("helsinki")
Development version for developers and other interested parties:
library(remotes)
remotes::install_github("ropengov/helsinki")
Load the package.
library(helsinki)
The package has basic functions for interacting with WFS APIs, courtesy of the FMI2 package: wfs_api() for returning “wfs_api” objects and to_sf() for converting these objects to sf objects.
All available features of a given API can be listed with the get_feature_list() function. Partly for legacy reasons and partly for user convenience, the HSY API URL is often used in these function examples, sometimes even as a default option. The API functions can, however, be used with a wide variety of different base.url values.
url <- "https://kartta.hsy.fi/geoserver/wfs"
hsy_features <- get_feature_list(base.url = url)
# Select only features which are related to water utilities and services
hsy_vesihuolto <- hsy_features[which(hsy_features$Namespace == "vesihuolto"),]
hsy_vesihuolto
#> Name
#> 73 vesihuolto:VH_Vesipostit_HSY
#> 74 vesihuolto:VH_muut_vesihuollon_toiminta_alueet_2020
#> 75 vesihuolto:VH_toiminta_alue_2020
#> 76 vesihuolto:VH_toiminta_alue_2020_alustava_laajeneminen_21_22
#> 77 vesihuolto:VH_toiminta_alueen_laajeneminen_2020
#> 78 vesihuolto:VH_vesihuollon_toiminta_alue_vedenjakelu_2020
#> 85 vesihuolto:Vh_HSY_toiminta_al_2017_2019
#> 86 vesihuolto:Vh_Muu_vesihuo_toiminta_al_2017_2019
#> 87 vesihuolto:Vh_Vedenjak_toiminta_al_2017_2019
#> 204 vesihuolto:vesihuollon_toimipisteet
#> 206 vesihuolto:vh_hulevesiviemaroity_alue
#> 207 vesihuolto:vh_hva_laajennusalueet
#> 208 vesihuolto:vh_jatevesi_matkaaika_puhdistamolle
#> 209 vesihuolto:vh_sekaviemarointialue
#> 216 vesihuolto:ylimaarainen_verkostopaine
#> Title Namespace
#> 73 VH_Vesipostit_HSY vesihuolto
#> 74 VH_muut_vesihuollon_toiminta_alueet_2020 vesihuolto
#> 75 VH_toiminta_alue_2020 vesihuolto
#> 76 VH_toiminta_alue_2020_alustava_laajeneminen_21_22 vesihuolto
#> 77 VH_toiminta_alueen_laajeneminen_2020 vesihuolto
#> 78 VH_vesihuollon_toiminta_alue_vedenjakelu_2020 vesihuolto
#> 85 Vh_HSY_toiminta_al_2017_2019 vesihuolto
#> 86 Vh_Muu_vesihuo_toiminta_al_2017_2019 vesihuolto
#> 87 Vh_Vedenjak_toiminta_al_2017_2019 vesihuolto
#> 204 vesihuollon_toimipisteet vesihuolto
#> 206 vh_hulevesiviemaroity_alue vesihuolto
#> 207 vh_hva_laajennusalueet vesihuolto
#> 208 vh_jatevesi_matkaaika_puhdistamolle vesihuolto
#> 209 vh_sekaviemarointialue vesihuolto
#> 216 ylimaarainen_verkostopaine vesihuolto
# We select our feature of interest from this list: Location of waterposts
feature_of_interest <- "vesihuolto:VH_Vesipostit_HSY"
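The same selection can also be written with subset() or by matching the prefix of the Name column. A minimal sketch, using a hand-built stand-in for the feature list so that it runs without network access: the first two rows are copied from the output above, while the third row is a made-up non-matching entry added purely for illustration.

```r
# Stand-in for the data frame returned by get_feature_list().
# Rows 1-2 are copied from the output above; row 3 is hypothetical.
features <- data.frame(
  Name = c("vesihuolto:VH_Vesipostit_HSY",
           "vesihuolto:vesihuollon_toimipisteet",
           "example:some_other_layer"),
  Title = c("VH_Vesipostit_HSY",
            "vesihuollon_toimipisteet",
            "some_other_layer"),
  Namespace = c("vesihuolto", "vesihuolto", "example"),
  stringsAsFactors = FALSE
)

# Equivalent to features[which(features$Namespace == "vesihuolto"), ]
by_namespace <- subset(features, Namespace == "vesihuolto")

# Alternative: match on the "Namespace:" prefix of the Name column
by_prefix <- features[startsWith(features$Name, "vesihuolto:"), ]

by_namespace$Name
```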
When the wanted feature and its Name (in other words, the Namespace:Title combination) are known, the feature can be downloaded with get_feature() by providing the correct base.url and passing the Name as the typename parameter.
# downloading a feature
waterposts <- get_feature(base.url = url, typename = feature_of_interest)
# Visualizing the location of waterposts
plot(waterposts$geom)
Dots on a blank canvas do not make much sense, and therefore the helsinki package has the get_city_map() function for downloading city district boundaries. An example of this is provided in the Helsinki region district maps section of this vignette.
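For readers unfamiliar with WFS, it may help to see how a Name is composed and what a generic WFS 2.0.0 GetFeature request for it looks like. The sketch below is an illustration of the protocol only: build_getfeature_url() is a hypothetical helper, not a function of the helsinki package, and the request that get_feature() actually sends may differ (for example in WFS version or output format).

```r
# Compose the Name (typename) from its two parts
namespace <- "vesihuolto"
title <- "VH_Vesipostit_HSY"
typename <- paste(namespace, title, sep = ":")

# Hypothetical helper: build a generic WFS 2.0.0 GetFeature URL.
# Illustrates the key-value request format, not helsinki's internals.
build_getfeature_url <- function(base.url, typename) {
  query <- c(service = "WFS",
             version = "2.0.0",
             request = "GetFeature",
             typeNames = typename)
  paste0(base.url, "?",
         paste(names(query), query, sep = "=", collapse = "&"))
}

request_url <- build_getfeature_url("https://kartta.hsy.fi/geoserver/wfs",
                                    typename)
request_url
```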
The helsinki package provides an easy-to-use, menu-driven select_feature() function that effectively combines get_feature_list() and get_feature(). By default it only returns the Name of the wanted feature, but if the get parameter is set to TRUE, it returns an sf object that can be easily visualized.
# Interactive example with select_feature
selected_feature <- select_feature(base.url = url)
feature <- get_feature(base.url = url, typename = selected_feature)
# Skipping a redundant step with parameter get = TRUE
feature <- select_feature(base.url = url, get = TRUE)
The above example shows a general use case which can easily be applied to Helsinki Region Environmental Services (HSY) WFS API as well as other service providers’ APIs.
For legacy reasons, the helsinki package also has some specialized functions that aim to make downloading often-used data as easy as possible.
Specifically, there are two new functions that replace deprecated functionality from the get_hsy() function: get_vaestotietoruudukko() (population grid) and get_rakennustietoruudukko() (building information grid). As of writing, years 2015 to 2020 are supported, but the API may be updated at any time, and get_feature_list() together with get_feature() can be used to find and download datasets that are not baked into these functions.
pop_grid <- get_vaestotietoruudukko(year = 2018)
building_grid <- get_rakennustietoruudukko(year = 2020)
library(ggplot2)
# Logarithmic scales to make the visualizations more discernible
ggplot(pop_grid) + geom_sf(aes(colour=log(asukkaita), fill=log(asukkaita)))
ggplot(building_grid) + geom_sf(aes(colour=log(kerala_yht), fill=log(kerala_yht)))
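One caveat with a plain log() transform is that log(0) is -Inf, which a continuous colour scale cannot place. Whether the grids above contain zero values has not been checked here; the sketch below uses made-up counts only to show the difference between log() and log1p(), which maps zero to zero.

```r
# Hypothetical, strongly skewed counts (not taken from the HSY data)
counts <- c(0, 1, 10, 100, 1000)

log_counts <- log(counts)       # first element is -Inf
log1p_counts <- log1p(counts)   # log(1 + x), finite for zero counts

is.finite(log_counts)
is.finite(log1p_counts)
```

If zeros turn out to be a problem in the plots above, replacing log(asukkaita) with log1p(asukkaita) would be one way around it.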
While easy enough to build, specialized functions such as these are probably not something that power users want to rely on in their workflows. They also add more manual steps to package maintenance and are therefore probably not the direction we are heading in the future. If you feel differently about this and there is a dataset that gets a lot of use, feel free to drop us a suggestion on GitHub.
The get_servicemap() function retrieves regional service data from the City of Helsinki Service Map API.
# Search for "puisto" (park) (specify q="query")
search_puisto <- get_servicemap(query="search", q="puisto")
# Study results: 47 variables in the data frame
str(search_puisto, max.level = 1)
#> List of 4
#> $ count : int 2155
#> $ next : chr "https://api.hel.fi/servicemap/v2/search/?page=2&q=puisto"
#> $ previous: NULL
#> $ results :'data.frame': 20 obs. of 47 variables:
We can see that this search returns a large number of results, over 2000. The results are returned as pages, where each page has 20 results by default. By giving no additional search parameters, we get 20 results from the first page of search results.
# Get names for the first 20 results
search_puisto$results$name$fi
#> [1] "Sinebrychoffin puiston yleisövessa"
#> [2] "Sibeliuksen puiston yleisövessa"
#> [3] "Töölönlahden puisto / Leikkipaikka"
#> [4] "Esplanadin puiston WLAN-tukiasema"
#> [5] "Hesperian puiston yleisövessa"
#> [6] "Topeliuksen puiston yleisövessa"
#> [7] "Ankkuripohjanpuisto"
#> [8] "Sibeliuksen puiston yleisövessa"
#> [9] "Pysäköintilippuautomaatti 334, Castreninkatu, puisto, korttimaksu"
#> [10] "Pysäköintilippuautomaatti 132, Merimiehenkatu, puisto, korttimaksu"
#> [11] "Pysäköintilippuautomaatti 88, Pengerkatu, puisto, kolikkomaksu"
#> [12] "Shakkilauta, Von Glanin puisto"
#> [13] "Matti Heleniuksen puiston yleisövessa"
#> [14] "Shakkilauta, Katri Valan puisto"
#> [15] "Tove Janssonin puiston yleisövessa"
#> [16] "Ala-Malmin puisto"
#> [17] "Pyhän Birgitan puisto"
#> [18] "Everstinpuisto"
#> [19] "Kirkkojärvenpuisto"
#> [20] "Lehtikaskenpuisto"
# See what kind of data is given for services
names(search_puisto$results)
#> [1] "id" "connections"
#> [3] "entrances" "accessibility_properties"
#> [5] "identifiers" "department"
#> [7] "root_department" "provider_type"
#> [9] "organizer_type" "contract_type"
#> [11] "is_active" "deleted_at"
#> [13] "organizer_name" "organizer_business_id"
#> [15] "picture_url" "picture_entrance_url"
#> [17] "streetview_entrance_url" "description"
#> [19] "short_description" "name"
#> [21] "street_address" "www"
#> [23] "address_postal_full" "call_charge_info"
#> [25] "picture_caption" "phone"
#> [27] "fax" "email"
#> [29] "accessibility_phone" "accessibility_email"
#> [31] "accessibility_www" "created_time"
#> [33] "address_zip" "data_source"
#> [35] "extensions" "last_modified_time"
#> [37] "accessibility_viewpoints" "root_service_nodes"
#> [39] "municipality" "service_nodes"
#> [41] "services" "keywords"
#> [43] "location" "accessibility_shortcoming_count"
#> [45] "sort_index" "object_type"
#> [47] "score"
More results can be retrieved and viewed by giving additional search parameters.
search_puisto <- get_servicemap(query="search", q="puisto", page_size = 30, page = 2)
search_puisto$results$name$fi
#> [1] "Kaupungintalon puisto"
#> [2] "Kuttulammenpuisto"
#> [3] "Rinkelipuisto"
#> [4] "Olarin asukaspuisto"
#> [5] "Alli Tryggin puiston opaskoira-aitaus"
#> [6] "Hurtigin puisto"
#> [7] "Matinkylän asukaspuisto"
#> [8] "Nurmilinnunpuisto"
#> [9] "Ruusutorpanpuisto"
#> [10] "Tynnyripuisto"
#> [11] "Perkkaan asukaspuisto"
#> [12] "Asematien leikkipuisto"
#> [13] "Stenbergin puisto"
#> [14] "Viherkallion asukaspuisto"
#> [15] "Leppävaaran asukaspuisto"
#> [16] "Veijarivuoren puiston talviuintipaikka (Humaus ry)"
#> [17] "Kasavuoren puisto"
#> [18] "Suvelan asukaspuisto"
#> [19] "Karakallion asukaspuisto"
#> [20] "Marketanpuisto"
#> [21] "Kylätalo Palttinan asukaspuisto"
#> [22] "Mankkaan asukaspuisto"
#> [23] "Tapiolan asukaspuisto"
#> [24] "Pohjankulman puisto"
#> [25] "Suvelan puisto"
#> [26] "Alberganesplanadin puisto"
#> [27] "Parkvillanpihan puisto"
#> [28] "Träskändan kartanopuisto"
#> [29] "Laivatorin puisto"
#> [30] "Rantaraitin puisto"
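With count and the page size known, the number of pages follows directly, and several pages can be collected in a loop. The sketch below works under stated assumptions: fetch_names() is a hypothetical helper, not part of helsinki, and it relies only on the query, q, page_size and page arguments and on the results$name$fi structure shown above. The fetch argument allows the logic to be tried with an offline stub instead of the live API.

```r
# Number of pages needed for the 2155 results reported earlier
total_results <- 2155
page_size <- 20
n_pages <- ceiling(total_results / page_size)
n_pages

# Hypothetical helper: collect Finnish names from several result pages.
# `fetch` defaults to helsinki::get_servicemap; the default is only
# evaluated if no other function is supplied.
fetch_names <- function(q, pages, page_size = 20,
                        fetch = helsinki::get_servicemap) {
  per_page <- lapply(pages, function(p) {
    res <- fetch(query = "search", q = q, page_size = page_size, page = p)
    res$results$name$fi
  })
  unlist(per_page)
}

# Offline stub mimicking the nested structure of the API response
stub <- function(query, q, page_size, page) {
  fake_names <- paste0(q, "_", page, "_", seq_len(page_size))
  list(results = list(name = list(fi = fake_names)))
}

stub_names <- fetch_names("puisto", pages = 1:3, page_size = 2, fetch = stub)
stub_names
```

Against the live API, fetch_names("puisto", pages = 1:3) would be expected to return the names from the first three pages. Each page is a separate request, so a larger page_size reduces the number of calls.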
As the str() output earlier showed, the returned data frame has 47 variables. At full width this output can be messy to handle in the R console. One option is to turn it into a more manageable tibble (which is often not a bad idea); another is to limit the extent of the query at the start. We do the latter, using the function parameter only:
# Search for padel-related services in Helsinki
search_padel <- get_servicemap(query="search", input="padel", only="unit.name, unit.location.coordinates, unit.street_address", municipality="helsinki")
search_padel$results
#> id name.fi
#> 1 59998 Padel Helsinki / Padelkentät
#> 2 63925 Padel Messukeskus / Padelkentät
#> 3 64716 Padel Aurinkolahti / Padelkenttä
#> 4 45413 PadelCenter Helsinki / Padelkentät
#> 5 57222 Padel Club Viikinranta / Padelkenttä
#> 6 63927 Padel Arena Center Myllypuro / Padelkentät
#> 7 66781 Padel Messukeskus / Padelkentät (ulko)
#> 8 57221 ProPADEL Sörnäinen / Padelkenttä
#> 9 63926 Billebeino Padel / Padelkenttä
#> 10 62776 Talihalli / Padelkenttä
#> 11 64747 Urheiluhallit Vuosaari / Padelkentät
#> 12 42296 Smash Center / Padelkenttä
#> 13 62904 Laajasalon Palloiluhallit Oy / Padelkenttä
#> name.sv street_address.fi
#> 1 Padel Helsinki / Padelplaner Lahnalahdentie 11
#> 2 Padel Mässcentrum / Padelplaner Messuaukio 1
#> 3 Padel Aurinkolahti / Padelplan Urheilukalastajankuja 1
#> 4 PadelCenter Helsingfors / Padelplaner Hernesaarenranta 1
#> 5 Padel Club Viikinranta / Padelplan Jokisuuntie 5
#> 6 Padel Arena Center Myllypuro / Padelplaner Alakiventie 2
#> 7 Padel Mässcentrum / Padelplaner Messuaukio 1
#> 8 ProPADEL Sörnäs / Padelplan Sörnäisten rantapromenadi 1
#> 9 Billebeino Padel / Padelplan Traverssikuja 3
#> 10 Talihalli / Padelplan Huopalahdentie 28
#> 11 Urheiluhallit Vuosaari / Padelplaner Vuosaarentie 5
#> 12 Smash Center / Padelplan Varikkotie 4
#> 13 Laajasalon Palloiluhallit Oy / Padelplan Sarvastonkaari 23
#> street_address.sv street_address.en location.type
#> 1 Braxviksvägen 11 Lahnalahdentie 11 Point
#> 2 Mässplatsen 1 Messuaukio 1 Point
#> 3 Sportfiskargränden 1 Urheilukalastajankuja 1 Point
#> 4 Ärtholmsstranden 1 Hernesaarenranta 1 Point
#> 5 Åminnevägen 5 Jokisuuntie 5 Point
#> 6 Understensvägen 2 Alakiventie 2 Point
#> 7 Mässplatsen 1 Messuaukio 1 Point
#> 8 Sörnäs strandpromenad 1 Sörnäisten rantapromenadi 1 Point
#> 9 Traversgränden 3 Traverssikuja 3 Point
#> 10 Hoplaksvägen 28 Huopalahdentie 28 Point
#> 11 Nordsjövägen 5 Vuosaarentie 5 Point
#> 12 Depåvägen 4 Varikkotie 4 Point
#> 13 Fladabågen 23 Sarvastonkaari 23 Point
#> location.coordinates sort_index object_type score
#> 1 24.86652, 60.16435 0 unit 7.416649
#> 2 24.93602, 60.20486 1 unit 7.416649
#> 3 25.15954, 60.20301 2 unit 7.309525
#> 4 24.93146, 60.15325 3 unit 7.309525
#> 5 24.98746, 60.21826 4 unit 6.799793
#> 6 25.07979, 60.22095 5 unit 6.533953
#> 7 24.93741, 60.20461 6 unit 6.533953
#> 8 24.96320, 60.18242 7 unit 2.782590
#> 9 24.94567, 60.19430 8 unit 2.765424
#> 10 24.87776, 60.21116 9 unit 2.765424
#> 11 25.14068, 60.20865 10 unit 2.279006
#> 12 25.06807, 60.20983 11 unit 2.268616
#> 13 25.05760, 60.17416 12 unit 2.254439
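The location.coordinates column holds a longitude/latitude pair per row, which is awkward to plot directly. A minimal sketch of unpacking such a column into two numeric columns, assuming it arrives as a list of length-two numeric vectors (the actual column class returned by the API has not been verified here). The values are copied from the first three rows above.

```r
# First three rows of the padel results, rebuilt by hand for illustration
padel <- data.frame(id = c(59998, 63925, 64716))
padel$coordinates <- list(
  c(24.86652, 60.16435),
  c(24.93602, 60.20486),
  c(25.15954, 60.20301)
)

# Bind the pairs into a two-column matrix: longitude first, latitude second
xy <- do.call(rbind, padel$coordinates)
padel$lon <- xy[, 1]
padel$lat <- xy[, 2]

padel[, c("id", "lon", "lat")]
```

With sf installed, sf::st_as_sf(padel, coords = c("lon", "lat"), crs = 4326) would turn this into a point layer, assuming the coordinates are WGS84.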
The function parameter only is not as straightforward as filtering and subsetting columns in a data.frame / tibble, as it requires a valid type prefix at the beginning of each field, in this case “unit”. Other valid prefixes would be “service_node”, “service” and “address”, but during testing the unit prefix seemed to work best. When in doubt, checking the original documentation may help.
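Writing the prefix by hand for every field is easy to get wrong. As a small convenience, the value for only can be assembled from a vector of field names. build_only() below is a hypothetical helper, not part of helsinki; it simply reproduces the comma-separated format used in the call above.

```r
# Hypothetical helper: prefix each field with a type and join into one string
build_only <- function(fields, prefix = "unit") {
  paste(paste(prefix, fields, sep = "."), collapse = ", ")
}

only_value <- build_only(c("name", "location.coordinates", "street_address"))
only_value
```

The result can then be passed on as only = only_value in the get_servicemap() call.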
The get_linkedevents() function retrieves regional event data from the new Linked Events API.
# Search for current events
events <- get_linkedevents(query="event")
# Get names for the first 20 results
events$data$name$fi
#> [1] "Kirjastokurren etäsatutuokio"
#> [2] "Kirjastokurren etäsatutuokio"
#> [3] "Kirjastokurren etäsatutuokio"
#> [4] "Luetaan yhdessä ryhmä"
#> [5] "Kirjastokurren etäsatutuokio"
#> [6] "Etätapahtuma: Opitaan kielioppia!"
#> [7] "Luetaan yhdessä ryhmä"
#> [8] "Kirjastokurren etäsatutuokio"
#> [9] "Etätapahtuma: Opitaan kielioppia!"
#> [10] "Luetaan yhdessä ryhmä"
#> [11] "Kirjastokurren etäsatutuokio"
#> [12] "Etätapahtuma: Opitaan kielioppia!"
#> [13] "Luetaan yhdessä ryhmä"
#> [14] "Kirjastokurren etäsatutuokio"
#> [15] "Etätapahtuma: Opitaan kielioppia!"
#> [16] "Luetaan yhdessä ryhmä"
#> [17] "Kirjastokurren etäsatutuokio"
#> [18] "Etätapahtuma: Opitaan kielioppia!"
#> [19] "Tikkurilan Marttojen Kässäkahvila #opitaanyhdessä"
#> [20] "Luetaan yhdessä ryhmä"
# See what kind of data is given for events
names(events$data)
#> [1] "id" "location"
#> [3] "keywords" "super_event"
#> [5] "event_status" "type_id"
#> [7] "external_links" "offers"
#> [9] "data_source" "publisher"
#> [11] "sub_events" "images"
#> [13] "videos" "in_language"
#> [15] "audience" "created_time"
#> [17] "last_modified_time" "date_published"
#> [19] "start_time" "end_time"
#> [21] "custom_data" "audience_min_age"
#> [23] "audience_max_age" "super_event_type"
#> [25] "deleted" "maximum_attendee_capacity"
#> [27] "remaining_attendee_capacity" "minimum_attendee_capacity"
#> [29] "enrolment_start_time" "enrolment_end_time"
#> [31] "local" "search_vector_fi"
#> [33] "search_vector_en" "search_vector_sv"
#> [35] "kulke_new" "replaced_by"
#> [37] "location_extra_info" "provider"
#> [39] "name" "info_url"
#> [41] "provider_contact_info" "short_description"
#> [43] "description" "@id"
#> [45] "@context" "@type"
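The first page of results contains many repeated names, presumably because recurring events are listed once per occurrence (an assumption, not something confirmed here from the API documentation). A quick way to summarise such a vector is table(). The sketch below rebuilds the 20 names printed above by hand so that it runs offline; with the live object the input would be events$data$name$fi.

```r
# Event names as printed above, rebuilt by hand so the example runs offline
event_names <- c(
  rep("Kirjastokurren etäsatutuokio", 8),
  rep("Luetaan yhdessä ryhmä", 6),
  rep("Etätapahtuma: Opitaan kielioppia!", 5),
  "Tikkurilan Marttojen Kässäkahvila #opitaanyhdessä"
)

# Count how often each name occurs, most frequent first
name_counts <- sort(table(event_names), decreasing = TRUE)
name_counts

# Number of distinct event names among the 20 results
n_distinct <- length(unique(event_names))
n_distinct
```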
Helsinki region geographic data can be accessed from a WFS API by using the get_city_map() function. Data is available for all 4 cities in the capital region: Helsinki, Espoo, Vantaa and Kauniainen.
Administrative divisions can be accessed on 3 distinct levels: “suuralue”, “tilastoalue” and “pienalue”. Literal, completely unofficial translations for these could be “grand district”, “statistical area” and “(minor) district”. The naming of these levels is sometimes confusing even in Finnish documents, and the names can vary by city and over time.
The main takeaway is that “suuralue” is the highest-level division and “pienalue” is the most granular level of division. “Tilastoalue” is somewhere between these two. These are the names to be used even if the city of interest might not use them in their Finnish or English website.
As promised earlier in API Access, the following example gives an idea of how to visualize waterpost locations (and, of course, other types of spatial data) on a map of the capital region.
helsinki <- get_city_map(city = "helsinki", level = "suuralue")
espoo <- get_city_map(city = "espoo", level = "suuralue")
vantaa <- get_city_map(city = "vantaa", level = "suuralue")
kauniainen <- get_city_map(city = "kauniainen", level = "suuralue")
library(ggplot2)
ggplot() +
geom_sf(data = helsinki) +
geom_sf(data = espoo) +
geom_sf(data = vantaa) +
geom_sf(data = kauniainen) +
geom_sf(data = waterposts)
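The four near-identical calls above can also be written as a loop over city names. A sketch under assumptions: fetch_city_maps() is a hypothetical helper, not part of helsinki, and it uses only the city and level arguments shown above. The fetch argument lets the helper be exercised with an offline stub.

```r
# Hypothetical helper: download the same level for several cities.
# `fetch` defaults to helsinki::get_city_map; the default is only
# evaluated if no other function is supplied.
fetch_city_maps <- function(cities, level = "suuralue",
                            fetch = helsinki::get_city_map) {
  maps <- lapply(cities, function(city) fetch(city = city, level = level))
  names(maps) <- cities
  maps
}

cities <- c("helsinki", "espoo", "vantaa", "kauniainen")

# Offline stub standing in for get_city_map(), used to exercise the helper
stub <- function(city, level) paste(city, level, sep = "/")
stub_maps <- fetch_city_maps(cities, fetch = stub)
names(stub_maps)

# With the real function and ggplot2 loaded, the layers can be added as a list:
# maps <- fetch_city_maps(cities)
# ggplot() +
#   lapply(maps, function(m) geom_sf(data = m)) +
#   geom_sf(data = waterposts)
```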
In addition, it is possible to download “aanestysalue” (voting district) divisions for the city of Helsinki. Currently this data is not available for other cities and it must be accessed from other sources.
library(sf)
map <- get_city_map(city = "helsinki", level = "suuralue")
plot(sf::st_geometry(map))
voting_district <- get_city_map(city = "helsinki", level = "aanestysalue")
plot(sf::st_geometry(voting_district))
Voting districts are currently not available for cities other than Helsinki.
See help() to get citation information for each function and related data sources.
If no such information is explicitly stated, see the data provider’s website for more information.
citation("helsinki")
Kindly cite the helsinki R package as follows:
(C) Juuso Parkkinen, Joona Lehtomaki, Pyry Kantanen and Leo Lahti 2014-2021. helsinki R package
A BibTeX entry for LaTeX users is
@Misc{,
title = {helsinki R package},
author = {Juuso Parkkinen and Joona Lehtomaki and Pyry Kantanen and Leo Lahti},
year = {2014-2021},
}
Many thanks for all contributors! For more info, see:
https://github.com/rOpenGov/helsinki
This vignette was created with
sessionInfo()
#> R version 4.1.1 (2021-08-10)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Catalina 10.15.7
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] ggplot2_3.3.5 helsinki_1.0.5
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.7 highr_0.9 pillar_1.6.2 compiler_4.1.1
#> [5] class_7.3-19 tools_4.1.1 digest_0.6.27 gtable_0.3.0
#> [9] jsonlite_1.7.2 evaluate_0.14 lifecycle_1.0.0 tibble_3.1.3
#> [13] pkgconfig_2.0.3 rlang_0.4.11 DBI_1.1.1 curl_4.3.2
#> [17] yaml_2.2.1 xfun_0.25 e1071_1.7-8 withr_2.4.2
#> [21] xml2_1.3.2 dplyr_1.0.7 stringr_1.4.0 httr_1.4.2
#> [25] knitr_1.33 generics_0.1.0 vctrs_0.3.8 grid_4.1.1
#> [29] classInt_0.4-3 tidyselect_1.1.1 glue_1.4.2 sf_1.0-2
#> [33] R6_2.5.1 fansi_0.5.0 rmarkdown_2.10 farver_2.1.0
#> [37] purrr_0.3.4 magrittr_2.0.1 scales_1.1.1 units_0.7-2
#> [41] ellipsis_0.3.2 htmltools_0.5.1.1 assertthat_0.2.1 colorspace_2.0-2
#> [45] httpcache_1.2.0 labeling_0.4.2 utf8_1.2.2 KernSmooth_2.23-20
#> [49] stringi_1.7.3 proxy_0.4-26 munsell_0.5.0 crayon_1.4.1