[API] eurostat


Access statistics at European level through the Eurostat API.

Thierry Warin (HEC Montréal and CIRANO, Canada), https://warin.ca/aboutme.html, https://www.hec.ca/en/profs/thierry.warin.html
01-28-2020

Database description

Eurostat is the statistical office of the European Union. While statistical authorities in Member States collect and analyse data, Eurostat's role is to consolidate the data and ensure they are comparable. It provides statistics at European level that enable comparisons between countries and regions. From EU policies, economy and finance to social conditions and environment, Eurostat is a powerful tool that consolidates data using a harmonized methodology.

Eurostat: https://ec.europa.eu/eurostat/fr/about/overview

Functions

Each of these functions is detailed in this course, with examples.

get_eurostat_toc()

The function get_eurostat_toc() downloads the table of contents of Eurostat datasets.

# Load the package
library(eurostat)
library(rvest)

# Get Eurostat data listing
toc <- get_eurostat_toc()
| title | code | type | last update of data | last table structure change | data start | data end | values |
|---|---|---|---|---|---|---|---|
| Database by themes | data | folder | NA | NA | NA | NA | NA |
| General and regional statistics | general | folder | NA | NA | NA | NA | NA |
| European and national indicators for short-term analysis | euroind | folder | NA | NA | NA | NA | NA |
| Business and consumer surveys (source: DG ECFIN) | ei_bcs | folder | NA | NA | NA | NA | NA |
| Consumer surveys (source: DG ECFIN) | ei_bcs_cs | folder | NA | NA | NA | NA | NA |
| Consumers - monthly data | ei_bsco_m | dataset | 29.10.2020 | 29.10.2020 | 1980M01 | 2020M10 | NA |
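The listing mixes folders and datasets, and the update stamps are plain day.month.year strings. A minimal sketch of how such a listing could be narrowed to datasets and its dates parsed, assuming the column names shown above; toc_sample is a hypothetical stand-in built from two of the rows, so the code runs without a network call:

```r
# Hypothetical stand-in for two rows of the table of contents shown above
toc_sample <- data.frame(
  title = c("Consumer surveys (source: DG ECFIN)", "Consumers - monthly data"),
  code = c("ei_bcs_cs", "ei_bsco_m"),
  type = c("folder", "dataset"),
  `last update of data` = c(NA, "29.10.2020"),
  `data start` = c(NA, "1980M01"),
  `data end` = c(NA, "2020M10"),
  check.names = FALSE,       # keep the column names with spaces as they are
  stringsAsFactors = FALSE
)

# Keep only rows that hold data; folder rows carry no dates in the listing
datasets <- toc_sample[toc_sample$type == "dataset", ]

# Parse the day.month.year stamp into a Date so it can be sorted or compared
datasets$updated <- as.Date(datasets[["last update of data"]], format = "%d.%m.%Y")
print(datasets[, c("code", "updated")])
```

The same two steps should apply to the full toc object, provided its columns are named as in the listing.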

search_eurostat()

With search_eurostat() you can search the table of contents for particular patterns, e.g. all datasets related to passenger transport. The type argument of this function restricts the search to, for instance, datasets or tables.

# info about passengers
search_eurostat("passenger transport")
| title | code | type | last update of data | last table structure change | data start | data end | values |
|---|---|---|---|---|---|---|---|
| Volume of passenger transport relative to GDP | tran_hv_pstra | dataset | 01.09.2020 | 31.08.2020 | 1990 | 2018 | NA |
| Modal split of passenger transport | tran_hv_psmod | dataset | 01.09.2020 | 31.08.2020 | 1990 | 2018 | NA |
| Air passenger transport by reporting country | avia_paoc | dataset | 20.11.2020 | 09.10.2020 | 1993 | 2020Q3 | NA |
| Air passenger transport by main airports in each reporting country | avia_paoa | dataset | 20.11.2020 | 09.10.2020 | 1993 | 2020Q3 | NA |
| Air passenger transport between reporting countries | avia_paocc | dataset | 28.10.2020 | 09.10.2020 | 1993 | 2020Q3 | NA |
| Air passenger transport between main airports in each reporting country and partner reporting countries | avia_paoac | dataset | 20.11.2020 | 09.10.2020 | 1993 | 2020Q3 | NA |
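Conceptually, this kind of search is a pattern match on the title column of the table of contents, optionally narrowed by type. A minimal sketch of that idea, assuming a plain substring match; search_toc_sketch() and toc_sample are hypothetical names used for illustration only, and the package function may differ in its details:

```r
# Hypothetical stand-in for a few table-of-contents rows shown in this course
toc_sample <- data.frame(
  title = c(
    "Volume of passenger transport relative to GDP",
    "Modal split of passenger transport",
    "Air passenger transport by reporting country",
    "Consumers - monthly data"
  ),
  code = c("tran_hv_pstra", "tran_hv_psmod", "avia_paoc", "ei_bsco_m"),
  type = "dataset",
  stringsAsFactors = FALSE
)

# Hypothetical helper: substring match on titles, optionally restricted by type
search_toc_sketch <- function(toc, pattern, type = NULL) {
  hits <- grepl(pattern, toc$title, fixed = TRUE)
  if (!is.null(type)) {
    hits <- hits & toc$type == type
  }
  toc[hits, ]
}

res <- search_toc_sketch(toc_sample, "passenger transport", type = "dataset")
print(res$code)
```

Because the match is on the title only, a broad pattern such as "passenger transport" returns road, rail, and air datasets alike; a longer pattern or the type restriction narrows the result.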

Once you have found the dataset you are looking for, you can store its id in a variable of your choice.

id <- search_eurostat("Modal split of passenger transport", 
                         type = "table")$code[1]
print(id)
[1] "t2020_rk310"

get_eurostat()

The function get_eurostat() takes the id of a dataset as input and returns the data from that dataset. The str() function lets you inspect the structure of the downloaded data set.

dat <- get_eurostat(id)
str(dat)
tibble [2,798 × 5] (S3: tbl_df/tbl/data.frame)
 $ unit   : chr [1:2798] "PC" "PC" "PC" "PC" ...
 $ vehicle: chr [1:2798] "BUS_TOT" "BUS_TOT" "BUS_TOT" "BUS_TOT" ...
 $ geo    : chr [1:2798] "AT" "BE" "CH" "DE" ...
 $ time   : Date[1:2798], format: "1990-01-01" ...
 $ values : num [1:2798] 8.2 10.6 3.7 9.1 11.3 32.4 14.9 13.5 6 24.8 ...
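| unit | vehicle | geo | time | values |
|---|---|---|---|---|
| PC | BUS_TOT | AT | 1990-01-01 | 8.2 |
| PC | BUS_TOT | BE | 1990-01-01 | 10.6 |
| PC | BUS_TOT | CH | 1990-01-01 | 3.7 |
| PC | BUS_TOT | DE | 1990-01-01 | 9.1 |
| PC | BUS_TOT | DK | 1990-01-01 | 11.3 |
| PC | BUS_TOT | EL | 1990-01-01 | 32.4 |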
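Since the result is an ordinary data frame with a Date column, the usual base R tools apply to it. A minimal sketch of ranking countries and extracting the year, using dat_sample, a hypothetical stand-in rebuilt from the six rows shown above so that it runs without downloading anything:

```r
# Hypothetical stand-in for the first six rows of the downloaded data
dat_sample <- data.frame(
  unit = "PC",
  vehicle = "BUS_TOT",
  geo = c("AT", "BE", "CH", "DE", "DK", "EL"),
  time = as.Date("1990-01-01"),
  values = c(8.2, 10.6, 3.7, 9.1, 11.3, 32.4),
  stringsAsFactors = FALSE
)

# Rank countries by bus share, highest first
ranked <- dat_sample[order(-dat_sample$values), c("geo", "values")]
print(ranked)

# Annual observations are stored as 1 January dates; pull out the year
dat_sample$year <- as.integer(format(dat_sample$time, "%Y"))
```

The same lines should work on dat itself once a single vehicle and year have been selected.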

Filters can be added to retrieve only a specific part of the dataset.

By default, variables are returned as Eurostat codes; to get human-readable labels instead, use the type = "label" argument.

datl <- get_eurostat(id, filters = list(geo = c("EU28", "FI"), 
                                         lastTimePeriod = 1), 
                      type = "label", time_format = "num")
| unit | vehicle | geo | time | values |
|---|---|---|---|---|
| Percentage | Motor coaches, buses and trolley buses | European Union - 28 countries (2013-2020) | 2018 | 8.7 |
| Percentage | Motor coaches, buses and trolley buses | Finland | 2018 | 10.1 |
| Percentage | Passenger cars | European Union - 28 countries (2013-2020) | 2018 | 83.3 |
| Percentage | Passenger cars | Finland | 2018 | 84.2 |
| Percentage | Trains | European Union - 28 countries (2013-2020) | 2018 | 8.0 |
| Percentage | Trains | Finland | 2018 | 5.7 |

As we can see, we now have the modal split of passenger transport, in percent, for Finland compared with the European Union (28 countries) in 2018.
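To make the comparison explicit, the long table can be pivoted so that each region becomes a column. A minimal sketch in base R, using datl_sample, a hypothetical stand-in rebuilt from the six labelled rows shown above:

```r
# Hypothetical stand-in for the six labelled rows shown above
eu <- "European Union - 28 countries (2013-2020)"
datl_sample <- data.frame(
  vehicle = rep(c("Motor coaches, buses and trolley buses",
                  "Passenger cars", "Trains"), each = 2),
  geo = rep(c(eu, "Finland"), times = 3),
  time = 2018,
  values = c(8.7, 10.1, 83.3, 84.2, 8.0, 5.7),
  stringsAsFactors = FALSE
)

# Pivot to a vehicle-by-region matrix (one value per cell, so sum() just copies it)
wide <- tapply(datl_sample$values, list(datl_sample$vehicle, datl_sample$geo), sum)

# Percentage-point gap: Finland minus the EU figure, per transport mode
gap <- round(wide[, "Finland"] - wide[, eu], 1)
print(gap)
```

A positive gap means the mode has a larger share in Finland than in the EU aggregate; within each region the three shares add up to 100.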

tl;dr

# Load the packages
library(eurostat)
library(rvest)
library(knitr)  # provides kable(), used below

# Get Eurostat data listing
toc <- get_eurostat_toc()

# Info about passengers
kable(head(search_eurostat("passenger transport")))

# Id of the dataset
id <- search_eurostat("Modal split of passenger transport", 
                         type = "table")$code[1]

# Raw data
dat <- get_eurostat(id)
str(dat)

# Filters addition
datl <- get_eurostat(id, filters = list(geo = c("EU28", "FI"), 
                                         lastTimePeriod = 1), 
                      type = "label", time_format = "num")

Code learned this week

| Command | Detail |
|---|---|
| get_eurostat_toc() | Downloads the table of contents of Eurostat datasets |
| search_eurostat() | Searches the table of contents for particular patterns |
| get_eurostat() | Reads Eurostat data from the specific id of a dataset |

References

This course uses the Eurostat Tutorial


Citation

For attribution, please cite this work as

Warin (2020, Jan. 28). Thierry Warin, PhD: [API] eurostat. Retrieved from https://warin.ca/posts/api-eurostat/

BibTeX citation

@misc{warin2020[api],
  author = {Warin, Thierry},
  title = {Thierry Warin, PhD: [API] eurostat},
  url = {https://warin.ca/posts/api-eurostat/},
  year = {2020}
}