The official French open data portal (data.gouv.fr) offers a huge quantity of information and provides a well-structured API. The BARIS package lets you use this API from R to retrieve the data you need from the portal.

The package is available on CRAN; you can also install the development version from GitHub.
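The installation commands are not reproduced in this copy of the vignette. A minimal sketch of what they would typically look like follows; the repository path `feddelegrand7/BARIS` and the use of the remotes package are assumptions, so check the package's CRAN page for the authoritative location.

```r
# Released version from CRAN
install.packages("BARIS")

# Development version from GitHub
# (assumption: the repository path is feddelegrand7/BARIS)
# install.packages("remotes")
remotes::install_github("feddelegrand7/BARIS")
```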

BARIS_search()

The BARIS_search() function allows you to search for a data set by keyword. A quick tip: within your query, use plain nouns and avoid prepositions and determiners (le, la, de, des, en, à … and so on):

library(BARIS)

BARIS_search(query = "Monuments Historiques Marseille")
#> # A tibble: 20 x 11
#>    id     title   organization   page   views frequency created_at last_modified
#>    <chr>  <chr>   <chr>          <chr>  <chr> <chr>     <chr>      <chr>        
#>  1 5cebf~ "Marse~ https://stati~ https~ 81190 unknown   2013-10-2~ 2020-06-29T0~
#>  2 536c4~ "Monum~ https://stati~ https~ 43725 annual    2013-11-0~ 2022-01-13T1~
#>  3 54a13~ "Monum~ https://stati~ https~ 0     punctual  2014-12-2~ 2015-08-07T1~
#>  4 5dde8~ "Monum~ https://stati~ https~ 57    punctual  2019-11-2~ 2019-11-28T1~
#>  5 5fdaa~ "Monum~ https://stati~ https~ 4     unknown   2020-12-1~ 2020-12-16T0~
#>  6 6206f~ "Monum~ https://stati~ https~ 2     unknown   2022-02-1~ 2022-03-24T0~
#>  7 55253~ "Monum~ <NA>           https~ 0     punctual  2015-04-0~ 2016-02-10T1~
#>  8 55253~ "Monum~ <NA>           https~ 0     punctual  2015-04-0~ 2015-12-23T0~
#>  9 55520~ "Monum~ <NA>           https~ 0     unknown   2015-05-1~ 2015-05-12T1~
#> 10 55520~ "Monum~ <NA>           https~ 0     unknown   2015-05-1~ 2015-05-12T1~
#> 11 5e78d~ "Monum~ https://stati~ https~ 6     punctual  2020-03-2~ 2020-03-23T1~
#> 12 602fb~ "Epône~ https://stati~ https~ 1     unknown   2021-02-1~ 2021-02-18T1~
#> 13 618b8~ "Avign~ https://stati~ https~ 2     unknown   2018-03-1~ 2020-12-24T0~
#> 14 58aef~ "Monum~ https://stati~ https~ 0     unknown   2017-02-2~ 2019-03-05T0~
#> 15 619f8~ "Monum~ https://stati~ https~ 2     punctual  2021-11-2~ 2021-11-26T1~
#> 16 5beab~ "Liste~ https://stati~ https~ 81297 unknown   2018-11-1~ 2016-08-04T1~
#> 17 617ba~ "Couch~ <NA>           https~ 5     unknown   2021-10-2~ 2021-10-29T0~
#> 18 53699~ "Monum~ https://stati~ https~ 0     unknown   2013-09-1~ 2015-07-15T1~
#> 19 54296~ "Monum~ https://stati~ https~ 1155~ unknown   2014-09-2~ 2016-03-03T1~
#> 20 5878e~ "Monum~ https://stati~ https~ 6     unknown   2013-05-1~ 2017-07-10T0~
#> # ... with 3 more variables: last_update <chr>, archived <chr>, deleted <chr>
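If you build queries programmatically, the tip about prepositions and determiners can be applied with a small helper. The sketch below is only an illustration: `clean_query()` and its stop-word list are hypothetical and not part of BARIS.

```r
# Hypothetical helper (not part of BARIS): drop common French
# prepositions and determiners from a search query.
clean_query <- function(query,
                        stop_words = c("le", "la", "les", "de", "des", "du",
                                       "en", "au", "aux", "\u00e0")) { # "\u00e0" is "à"
  # Split the query on whitespace after trimming both ends
  words <- strsplit(trimws(query), "\\s+")[[1]]
  # Keep only the words that are not in the stop-word list (case-insensitive)
  kept <- words[!tolower(words) %in% stop_words]
  paste(kept, collapse = " ")
}

clean_query("Monuments Historiques de la ville de Marseille")
#> [1] "Monuments Historiques ville Marseille"
```

The cleaned string can then be passed as the query argument of BARIS_search().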

Cool, we have our data sets … but wait, it would be better to get a description of the one we are interested in.

BARIS_explain()

The BARIS_explain() function provides a description of a data set. The function takes one argument, the ID of the data set:


BARIS_explain(datasetId = "5cebfa8306e3e77ffdb31ef5")
#> [1] "Monuments historiques situés sur le territoire de Marseille, avec adresse, numéro de base Mérimée (base de données du Ministère de la Culture recensant les monuments historiques de toute la France) et points de géolocalisation"

Don’t panic if you’re not a French speaker. You can always translate the description with the great googleLanguageR package.

Now it’s time to list the resources contained in this data set!

BARIS_resources()

The BARIS_resources() function displays the resources (data frames and other files) available within a data set. The function takes one argument, the ID of the data set:

BARIS_resources(datasetId = "5cebfa8306e3e77ffdb31ef5")
#> # A tibble: 2 x 6
#>   id         title       format published   url             description         
#>   <chr>      <chr>       <chr>  <chr>       <chr>           <chr>               
#> 1 59ea7bba-~ MARSEILLE_~ csv    2019-05-27~ https://trouve~ Monuments historiqu~
#> 2 6328f8b3-~ Plan des M~ pdf    2019-05-27~ https://trouve~ Edition Janvier 2013

You can see from the output above that the data set has two resources, a csv and a pdf. Now we’ve reached the interesting part: extracting the data frame that you’ll work on!

BARIS_extract()

BARIS_extract() loads a resource directly into your R session. Currently, “only” these formats are supported: json, csv, xls, xlsx, xml, geojson and shp. For any other format, you can always use the url of the resource to download it manually.
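For a format that is not supported, such as the pdf listed earlier, a manual download could look like the sketch below. `resource_url()` is a hypothetical helper, not a BARIS function; it only assumes a table with `format` and `url` columns, as in the BARIS_resources() output shown above.

```r
# Hypothetical helper (not part of BARIS): return the URL of the first
# resource with the requested format from a BARIS_resources()-like table.
resource_url <- function(resources, fmt) {
  urls <- resources$url[resources$format == fmt]
  if (length(urls) == 0) {
    stop("no resource with format: ", fmt)
  }
  urls[1]
}

# Usage (needs network access, so it is left commented out here):
# resources <- BARIS::BARIS_resources(datasetId = "5cebfa8306e3e77ffdb31ef5")
# download.file(resource_url(resources, "pdf"),
#               destfile = "plan.pdf", mode = "wb")
```

The `mode = "wb"` argument matters for binary files such as pdf, especially on Windows.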

To use BARIS_extract(), you have to specify two arguments: the ID of the resource and its format.

You can see at a glance that the two kinds of ID are structured differently: the ID of a data set is one continuous string, while the ID of a resource is split into groups separated by hyphens.
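To avoid passing the wrong kind of ID, the difference in shape can be checked in code. The sketch below is an assumption-laden illustration: `id_kind()` is a hypothetical helper, and its patterns are inferred only from the two IDs used in this vignette, not from any documented rule of the portal.

```r
# Hypothetical helper (not part of BARIS): guess the kind of ID from its shape.
# Patterns are inferred from the IDs shown in this vignette (an assumption).
id_kind <- function(id) {
  if (grepl("^[0-9a-f]{24}$", id)) {
    "dataset"    # 24 hexadecimal characters, no hyphens
  } else if (grepl("^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", id)) {
    "resource"   # five hyphen-separated hexadecimal groups (8-4-4-4-12)
  } else {
    "unknown"
  }
}

id_kind("5cebfa8306e3e77ffdb31ef5")
#> [1] "dataset"
id_kind("59ea7bba-f38a-4d75-b85f-2d1955050e53")
#> [1] "resource"
```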


data <- BARIS_extract(resourceId = "59ea7bba-f38a-4d75-b85f-2d1955050e53", format = "csv")

head(data)
#> # A tibble: 6 x 10
#>   n_base_merimee date_de_protection_a~ denomination        adresse   code_postal
#>   <chr>          <chr>                 <chr>               <chr>           <int>
#> 1 PA00081336     Classement : liste d~ Ancienne église de~ "/"             13002
#> 2 PA00081340     Classement: 13/09/19~ Eglise Saint-Laure~ "Esplana~       13002
#> 3 PA00081331     Classement: 29/01/19~ Chapelle et Hospic~ "2, Rue ~       13002
#> 4 PA00081344     Classement: 16/06/19~ Fort Saint-Jean     ""              13002
#> 5 PA00081325     Inscription : 23/11/~ Les deux bâtiments~ "Quai du~       13002
#> 6 PA00081334     Inscription : 07/07/~ Clocher des Accoul~ "Montée ~       13002
#> # ... with 5 more variables: proprietaire_du_monument <chr>,
#> #   epoque_de_construction <chr>, date_de_construction <chr>, longitude <dbl>,
#> #   latitude <dbl>

End of the vignette.
