An introduction to the R package PxWebApiData is given below. Three calls to the main function, ApiData, are demonstrated. First, two calls for reading data sets are shown. The last call captures metadata. In practice, however, one may look at the metadata first. Then three more examples and some background are given.
Note that the text below was written before the possibility to return a single data set was included in the package (the functions ApiData1, ApiData2, ApiData12).
The dataset below has three variables: Region, ContentsCode and Tid. The variables can be used as input parameters. Here two of the parameters are specified by variable ids and one parameter is specified by indices. Negative values specify reversed indices, counted from the end. Thus, we obtain the first two and the last two years in the data.
A list of two data frames is returned: the label version and the id version.
ApiData("http://data.ssb.no/api/v0/en/table/04861",
Region = c("1103", "0301"), ContentsCode = "Bosatte", Tid = c(1, 2, -2, -1))
$`04861: Area and population of urban settlements, by region, contents and year`
region contents year value
1 Oslo municipality Number of residents 2000 504348
2 Oslo municipality Number of residents 2002 508134
3 Oslo municipality Number of residents 2019 677139
4 Oslo municipality Number of residents 2020 689560
5 Stavanger Number of residents 2000 106804
6 Stavanger Number of residents 2002 108271
7 Stavanger Number of residents 2019 132771
8 Stavanger Number of residents 2020 137663
$dataset
Region ContentsCode Tid value
1 0301 Bosatte 2000 504348
2 0301 Bosatte 2002 508134
3 0301 Bosatte 2019 677139
4 0301 Bosatte 2020 689560
5 1103 Bosatte 2000 106804
6 1103 Bosatte 2002 108271
7 1103 Bosatte 2019 132771
8 1103 Bosatte 2020 137663
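The index rule used for Tid in the call above can be illustrated without contacting the API. The sketch below is not the package's internal code: resolve_index is a hypothetical helper, and tid_values is typed in from the Tid metadata listed later in this text. It only shows how positive indices count from the start and negative indices from the end.

```r
# Hypothetical helper (not part of PxWebApiData): map indices to values,
# counting negative indices from the end of the vector.
resolve_index <- function(values, idx) {
  n <- length(values)
  pos <- ifelse(idx < 0, n + 1 + idx, idx)
  values[pos]
}

# Years available in table 04861, as listed in the Tid metadata
# (2001 and 2010 are absent from that list)
tid_values <- as.character(c(2000, 2002:2009, 2011:2020))

# Same index vector as in the ApiData call above
resolve_index(tid_values, c(1, 2, -2, -1))
```

The result is the years 2000, 2002, 2019 and 2020, which agrees with the years in the output above.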
All possible values are obtained by TRUE, which corresponds to “all” in the API query. Elimination of a variable is obtained by FALSE. An imaginary value corresponds to “top” in the API query.
x <- ApiData("http://data.ssb.no/api/v0/en/table/04861",
             Region = FALSE, ContentsCode = TRUE, Tid = 3i)
It is possible to select either the label version or the id version.
x[[1]]
contents year value
1 Area of urban settlements (km²) 2018 2205.07
2 Area of urban settlements (km²) 2019 2206.45
3 Area of urban settlements (km²) 2020 2218.08
4 Number of residents 2018 4327937.00
5 Number of residents 2019 4368614.00
6 Number of residents 2020 4416981.00
x[[2]]
ContentsCode Tid value
1 Areal 2018 2205.07
2 Areal 2019 2206.45
3 Areal 2020 2218.08
4 Bosatte 2018 4327937.00
5 Bosatte 2019 4368614.00
6 Bosatte 2020 4416981.00
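The way the type of an input value decides the kind of selection can be summarised in a few lines of R. This is only a sketch of the rules described in the text, not the package's own implementation; describe_selection is a hypothetical helper and the returned strings are informal descriptions.

```r
# Hypothetical helper (not part of PxWebApiData): describe which kind of
# selection an input value corresponds to, following the rules in the text.
describe_selection <- function(x) {
  if (is.logical(x)) return(if (isTRUE(x)) "all" else "eliminated")
  if (is.complex(x)) return(paste("top", Im(x)))
  if (is.numeric(x)) return("item (chosen by index)")
  "item (chosen by id)"
}

describe_selection(FALSE)            # Region in the second call
describe_selection(TRUE)             # ContentsCode in the second call
describe_selection(3i)               # Tid in the second call
describe_selection(c(1, 2, -2, -1))  # Tid in the first call
describe_selection("Bosatte")        # ContentsCode in the first call
```

An imaginary number is a convenient marker here because it cannot be confused with an ordinary index: 3i asks for the top three values, whereas 3 would ask for the third value.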
Meta information about the data set can be obtained by “returnMetaFrames = TRUE”.
ApiData("http://data.ssb.no/api/v0/en/table/04861", returnMetaFrames = TRUE)
$Region
values valueTexts
1 3001 Halden
2 3002 Moss
3 3003 Sarpsborg
4 3004 Fredrikstad
5 3005 Drammen
6 3006 Kongsberg
7 3007 Ringerike
8 3011 Hvaler
9 3012 Aremark
10 3013 Marker
11 3014 Indre Østfold
12 3015 Skiptvet
13 3016 Rakkestad
14 3017 Råde
15 3018 Våler (Viken)
16 3019 Vestby
17 3020 Nordre Follo
18 3021 Ås
19 3022 Frogn
20 3023 Nesodden
21 3024 Bærum
22 3025 Asker
23 3026 Aurskog-Høland
24 3027 Rælingen
25 3028 Enebakk
[ reached 'max' / getOption("max.print") -- omitted 796 rows ]
$ContentsCode
values valueTexts
1 Areal Area of urban settlements (km²)
2 Bosatte Number of residents
$Tid
values valueTexts
1 2000 2000
2 2002 2002
3 2003 2003
4 2004 2004
5 2005 2005
6 2006 2006
7 2007 2007
8 2008 2008
9 2009 2009
10 2011 2011
11 2012 2012
12 2013 2013
13 2014 2014
14 2015 2015
15 2016 2016
16 2017 2017
17 2018 2018
18 2019 2019
19 2020 2020
attr(,"text")
Region ContentsCode Tid
"region" "contents" "year"
attr(,"elimination")
Region ContentsCode Tid
TRUE FALSE FALSE
attr(,"time")
Region ContentsCode Tid
FALSE FALSE TRUE
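The meta frames pair each id (values) with its label (valueTexts), so they can be used to find the ids needed for a query. The sketch below types in the first four rows of the Region frame by hand from the output above, so that it runs without network access; the object names region_meta, wanted and region_ids are only illustrative.

```r
# First rows of the $Region meta frame, typed in by hand from the output above
region_meta <- data.frame(
  values = c("3001", "3002", "3003", "3004"),
  valueTexts = c("Halden", "Moss", "Sarpsborg", "Fredrikstad"),
  stringsAsFactors = FALSE
)

# Look up the ids that belong to the labels of interest
wanted <- c("Moss", "Fredrikstad")
region_ids <- region_meta$values[match(wanted, region_meta$valueTexts)]
region_ids
```

With the complete meta frame returned by returnMetaFrames = TRUE, ids found this way could then be supplied as the Region argument of ApiData.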
PxWebApi has two more filters for groupings, agg: and vs:. You can see these filters in the code “API Query for this table” when you have made a table in PxWeb.
agg: is used for ready-made aggregation groupings. This example shows aggregation into age groups and aggregated time series for the new Norwegian municipality structure from 2020.
ApiData("http://data.ssb.no/api/v0/no/table/07459",
Region = list("agg:KommSummer", c("K-3001", "K-3002")),
Tid = 3i,
Alder = list("agg:TodeltGrupperingB", c("H17", "H18")),
Kjonn = TRUE)
$`07459: Befolkning, etter region, kjønn, alder, statistikkvariabel og år`
region kjønn alder statistikkvariabel år value
1 Halden Menn 0-17 år Personer 2019 3209
2 Halden Menn 0-17 år Personer 2020 3197
3 Halden Menn 0-17 år Personer 2021 3148
4 Halden Menn 18 år eller eldre Personer 2019 12509
5 Halden Menn 18 år eller eldre Personer 2020 12609
6 Halden Menn 18 år eller eldre Personer 2021 12674
7 Halden Kvinner 0-17 år Personer 2019 3005
8 Halden Kvinner 0-17 år Personer 2020 3023
[ reached 'max' / getOption("max.print") -- omitted 16 rows ]
$dataset
Region Kjonn Alder ContentsCode Tid value
1 K-3001 1 H17 Personer1 2019 3209
2 K-3001 1 H17 Personer1 2020 3197
3 K-3001 1 H17 Personer1 2021 3148
4 K-3001 1 H18 Personer1 2019 12509
5 K-3001 1 H18 Personer1 2020 12609
6 K-3001 1 H18 Personer1 2021 12674
7 K-3001 2 H17 Personer1 2019 3005
8 K-3001 2 H17 Personer1 2020 3023
[ reached 'max' / getOption("max.print") -- omitted 16 rows ]
There are two limitations in the PxWebApi here.
The other filter, vs:, specifies the grouping value sets, which are part of the value pool. As it is only possible to give single elements as input, it is easier to query the value pool. That means that vs: is redundant.
In this example Region is the value pool and Fylker is the value set. These two will return the same result:
Region = list("vs:Fylker",c("01","02"))
Region = list(c("01","02")).
In PxWebApi the original query is formulated in JSON. Using the parameter returnApiQuery can be useful for debugging.
ApiData("http://data.ssb.no/api/v0/en/table/04861", returnApiQuery = TRUE)
{
"query": [
{
"code": "Region",
"selection": {
"filter": "item",
"values": ["3001", "2399", "9999"]
}
},
{
"code": "ContentsCode",
"selection": {
"filter": "item",
"values": ["Areal", "Bosatte"]
}
},
{
"code": "Tid",
"selection": {
"filter": "item",
"values": ["2000", "2019", "2020"]
}
}
],
"response": {
"format": "json-stat"
}
}
Statistics Norway also provides an API with ready-made datasets, available by HTTP GET. Use the parameter getDataByGET = TRUE. By changing to lang=no you get the label version in Norwegian.
This dataset is from Economic trends forecasts.
x <- ApiData("https://data.ssb.no/api/v0/dataset/934516.json?lang=en", getDataByGET = TRUE)
x[[1]]
year contents value
1 2021 Gross domestic product 3.0
2 2021 GDP Mainland Norway 3.6
3 2021 Employed persons 0.7
4 2021 Unemployment rate (level) 4.7
5 2021 Wages per standard man-year 3.1
6 2021 Consumer price index (CPI) 3.3
7 2021 CPI-ATE 1.9
8 2021 Housing prices 9.7
9 2021 Money market rate (level) 0.5
10 2021 Import-weighted NOK exchange rate (44 countries) -5.0
11 2022 Gross domestic product 4.1
12 2022 GDP Mainland Norway 3.8
13 2022 Employed persons 1.4
14 2022 Unemployment rate (level) 4.4
15 2022 Wages per standard man-year 3.1
16 2022 Consumer price index (CPI) 1.9
[ reached 'max' / getOption("max.print") -- omitted 24 rows ]
We would like to extract the number of female R&D personnel in the services sector of the Norwegian business sector for the years 2017 and 2018.
Locate the relevant table at https://www.ssb.no that contains information on R&D personnel. Having found the relevant table, table 07964, we create the link https://data.ssb.no/api/v0/no/table/07964/
Load the package.
library(PxWebApiData)
variables <- ApiData("https://data.ssb.no/api/v0/no/table/07964/",
                     returnMetaFrames = TRUE)
names(variables)
## [1] "NACE2007" "ContentsCode" "Tid"
values <- ApiData("https://data.ssb.no/api/v0/no/table/07964/",
                  returnMetaData = TRUE)
values[[1]]$values
## [1] "A-N" "A03" "B05-B09" "B06_B09.1" "C" "C10-C11"
## [7] "C13" "C14-C15" "C16" "C17" "C18" "C19-C20"
## [13] "C21" "C22" "C23" "C24" "C25" "C26"
## [19] "C26.3" "C26.5" "C27" "C28" "C29" "C30"
## [25] "C30.1" "C31" "C32" "C32.5" "C33" "D35"
## [31] "E36-E39" "F41-F43" "G-N" "G46" "H49-H53" "J58"
## [37] "J58.2" "J59-J60" "J61" "J62" "J63" "K64-K66"
## [43] "M70" "M71" "M72" "M74.9" "N82.9"
values[[2]]$values
## [1] "EnhetTot" "EnheterFoU" "FoUpersonale"
## [4] "KvinneligFoUpers" "FoUPersonaleUoHutd" "FoUPersonaleDoktor"
## [7] "FoUArsverk" "FoUArsverkPers" "FoUArsverkUtd"
values[[3]]$values
## [1] "2007" "2008" "2009" "2010" "2011" "2012" "2013" "2014" "2015" "2016"
## [11] "2017" "2018" "2019"
data <- ApiData("https://data.ssb.no/api/v0/en/table/07964/",
                Tid = c("2017", "2018"),               # Select the years 2017 and 2018
                NACE2007 = "G-N",                      # Select the services sector
                ContentsCode = c("KvinneligFoUpers"))  # Select female R&D personnel
data <- data[[1]]  # Extract the first list element, which contains full variable names.
head(data)
## industry (SIC2007) contents year value
## 1 Services total Female R&D personnel 2017 4408
## 2 Services total Female R&D personnel 2018 4528
PxWeb and its API, PxWebApi, are used as output database (Statbank) by many statistical agencies in the Nordic countries and several others, e.g. Statistics Norway, Statistics Finland and Statistics Sweden. See the list of installations: https://www.scb.se/en/services/statistical-programs-for-px-files/px-web/pxweb-examples/