Repo of R scripts elaborated for the paper: "The impact of temporal resolution on public transport accessibility measurement" (2019).
This repo contains the scripts used for the paper "The impact of temporal resolution on public transport accessibility measurement: review and case study in Poland", accepted for publication in the Journal of Transport Geography on 11th January 2019 (submitted: 18th July 2018).
Stepniak, M., Pritchard, J.P., Geurs, K.T., Goliszek, S., 2019. The impact of temporal resolution on public transport accessibility measurement: review and case study in Poland. Journal of Transport Geography. doi: https://doi.org/10.1016/j.jtrangeo.2019.01.007.
All the data used for the study can be downloaded from the Open Data Repository RepOD. Direct link and reference for the dataset:
Stepniak, M., Goliszek, S., Pritchard, J., Geurs, K., 2019. The Impact of Temporal Resolution on Public Transport Accessibility Measurement. [Dataset] RepOD. https://doi.org/10.18150/repod.7727991.
Authors:
- Marcin Stępniak (tGIS, Department of Geography, Complutense University of Madrid, Spain)
- Sławomir Goliszek (Institute of Geography and Spatial Organization, Polish Academy of Sciences)
- John P. Pritchard (Centre for Transport Studies, University of Twente)
- Karst T. Geurs (Centre for Transport Studies, University of Twente)
R01 Compare precision of accessibility measurement
R02 Travel time calculations
R03 Compare travel times
R04 Compare accessibility measures
R05 Combine Gini coefficients
R06 Frequency graph
The inputs for the repo are origin-destination (OD) travel time matrices which use census tract centroids as origins. All ODs are stored in two subfolders in Data.zip, which can be downloaded from here.
Destinations in ODs are:
- Subfolder `f03_Aggregates`
  For proximity measure:
  - `Adm`: Local administration office (1 point)
  - `Zlob`: Nurseries (30 points)

  For cumulative opportunities measure:
  - `Teatr`: Theatres (21 points)
  - `SpecHC`: Specialized health centres (169 points)
- Subfolder `f03_Aggregates_Ai`
  For potential accessibility measure:
  - `HOS`: Hospitals with attached number of beds (9 points)
  - `Edu_Lo`: Secondary schools with attached number of classes (68 points)
  - `OBWOD`: census tract centroids with number of inhabitants (1745 points)
For the details please consult the file Data_description.pdf which can be found here.
The following sampling procedures were tested for the study:
a) Systematic Sampling: departure times are selected using a regular interval.
b) Simple Random Sampling: a specified number of sample times are selected at random (without replacement).
c) Hybrid Sampling: departure times are randomly selected from given time intervals (which result from the applied temporal resolution).
d) Constrained Random Walk Sampling: the first departure time is randomly selected within the first time interval, and the next ones from subsequent time intervals defined by the temporal resolution (+1 resolution +/- 0.5 temporal resolution).
For details please consult Owen & Murphy (2018).
A detailed description of the script that generates departure times can be found in this repo.
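The four procedures listed above can be sketched in R as follows. This is only an illustration under stated assumptions, not the code used in the repo: the function names (`sample_systematic`, `sample_random`, `sample_hybrid`, `sample_crw`) are hypothetical, departure times are expressed as minutes after the start of a 60-minute window, and the constrained random walk follows the "+1 resolution +/- 0.5 resolution" rule literally.

```r
# Illustrative sketch only; all names below are hypothetical.
window <- 60  # length of the time window in minutes

# a) Systematic: one departure time every `res` minutes.
sample_systematic <- function(res, start = 0) {
  seq(from = start, to = window - 1, by = res)
}

# b) Simple random: window / res distinct minutes drawn from the whole window.
sample_random <- function(res) {
  n <- window / res
  sort(sample(0:(window - 1), size = n, replace = FALSE))
}

# c) Hybrid: one random minute from each consecutive interval of length `res`.
sample_hybrid <- function(res) {
  starts <- seq(0, window - res, by = res)
  starts + floor(runif(length(starts), min = 0, max = res))
}

# d) Constrained random walk: the first time falls in the first interval,
#    each next time is the previous one + res, shifted by up to +/- 0.5 * res.
#    Note: the last times can overshoot the window; how that case is handled
#    in the actual scripts is not covered by this sketch.
sample_crw <- function(res) {
  n <- window / res
  times <- numeric(n)
  times[1] <- runif(1, min = 0, max = res)
  for (i in seq_len(n)[-1]) {
    times[i] <- times[i - 1] + res + runif(1, min = -0.5 * res, max = 0.5 * res)
  }
  times
}

set.seed(1)
sample_systematic(15)
sample_hybrid(10)
```

With a 15-minute resolution the systematic sample is fixed (minutes 0, 15, 30 and 45), while the other three methods return a different set of departure times on every call, which is why they are repeated many times.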
The table below shows the applied temporal resolutions and the number of iterations required for a 1-hour-long time window:
| resolution (minutes) | iterations |
|---|---|
| 2 | 30 |
| 3 | 20 |
| 4 | 15 |
| 5 | 12 |
| 6 | 10 |
| 10 | 6 |
| 12 | 5 |
| 15 | 4 |
| 20 | 3 |
| 30 | 2 |
| 60 | 1 |
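The values in the table are consistent with dividing the 60-minute window by the resolution. A minimal sketch that reproduces them (the variable names are illustrative):

```r
# Resolutions listed in the table, in minutes.
resolutions <- c(2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60)

# Number of departure times needed to cover a 60-minute window.
iterations <- 60 / resolutions

data.frame(resolution = resolutions, iterations = iterations)
```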
The repo consists of several consecutive scripts:
The R01_Ai_Calculations.R script applies different functions, depending on which accessibility measure is in use. The code selects departure times according to a given sampling method for all considered temporal resolutions. It then calculates accessibility measures and the following (aggregated) errors:
- MAPE (Mean Absolute Percentage Error), expressed in %, calculated according to the formula: `mean(abs((y - x)/y))*100`;
- MAE (Mean Absolute Error), expressed in absolute values (e.g. minutes), calculated according to the formula: `mean(abs(y-x))`;
- maxdif (maximum difference), expressed in absolute values (e.g. minutes), calculated according to the formula: `max(abs(y-x))`;

where `x` is an evaluated value and `y` is a benchmark one.
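The three formulas can be wrapped into small R functions. The function names and the toy vectors below are illustrative and are not taken from the scripts:

```r
# x: evaluated values, y: benchmark values (same length, y != 0 for MAPE).
mape   <- function(x, y) mean(abs((y - x) / y)) * 100  # in %
mae    <- function(x, y) mean(abs(y - x))              # in the units of x and y
maxdif <- function(x, y) max(abs(y - x))               # in the units of x and y

# Toy example: travel times in minutes.
y <- c(20, 40, 50)  # benchmark
x <- c(22, 38, 50)  # evaluated

mape(x, y)
mae(x, y)
maxdif(x, y)
```

For these toy vectors the absolute differences are 2, 2 and 0 minutes, so MAE is 4/3 of a minute, the maximum difference is 2 minutes, and MAPE is about 5%.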
Additionally, each of the scripts calculates Gini coefficients for all tested temporal resolutions as well as for benchmark values.
Particular functions (separate for each of the applied accessibility measures) are stored in separate R scripts:
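For readers unfamiliar with the Gini coefficient, a minimal sketch of one standard way to compute it is given below. How the scripts compute it (for example, with a dedicated package) is not stated here, so the `gini` function is an assumption made for illustration only.

```r
# Gini coefficient of a non-negative numeric vector (hypothetical helper).
# Uses the rank-based formula on sorted values:
#   G = 2 * sum(i * a_i) / (n * sum(a)) - (n + 1) / n
gini <- function(a) {
  a <- sort(a)
  n <- length(a)
  2 * sum(seq_len(n) * a) / (n * sum(a)) - (n + 1) / n
}

gini(c(5, 5, 5, 5))  # perfectly equal accessibility
gini(c(0, 0, 0, 1))  # all accessibility concentrated in one zone
```

A perfectly equal distribution gives 0, and a distribution concentrated in a single one of n zones gives (n - 1) / n.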
- proximity (or travel-time-to-nearest-provider) for public administration `Adm` and nurseries `Zlob`. Function stored in `R011_Ai_proximity.R`.
  Function syntax: `R011_Ai_proximity(file_all, mc_max)`
  Application of temporal resolution: this script aggregates selected travel times using an arithmetic mean.
- cumulative opportunities (or isochrones) for accessibility to specialized health care `SpecHC` and theatres `Teatr`.
  Function syntax: `Ai_cumulative(file_all, threshold, mc_max)`
  Application of temporal resolution: this script aggregates calculated accessibility measures using an arithmetic mean.
- potential accessibility for accessibility to education (secondary schools, `Edu`), hospitals `HOS` and population `OBWOD`. The function uses a negative exponential function: `mass*exp(-beta*TravelTime)`.
  Application of temporal resolution: this script aggregates selected travel times using harmonic-based means (for details please consult Stępniak & Jacobs-Crisioni (2017)).
  Function syntax: `Ai_potential(file_mass, file_all, beta, mc_max)`

where:
- `mc_max`: number of iterations in case of simple random, hybrid, and constrained random walk sampling methods;
- `file_all`: defines a list of OD matrices of a given type (for different types of destinations; e.g. HOS);
- `file_mass`: (relative or absolute) path to the file where the quantitative value of attractiveness of destinations is stored. The file should have two columns: the first contains the ID (to be used while merging with destination IDs) and the second the value of attractiveness of a given destination, e.g.:
  - number of beds in a hospital ("Results/t00_data/HOS.csv")
  - number of classes in a school ("Results/t00_data/EduLO.csv")
  - number of inhabitants in a census tract ("Results/t00_data/POP.csv")
- `beta`: the value of the beta parameter (applied value: 0.023105).
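The three measures described above can be illustrated on a toy OD table. This sketch does not reproduce the functions stored in the repo: the data frames `od` and `mass`, their column names and the 30-minute threshold are hypothetical, and the aggregation across departure times (arithmetic or harmonic-based means) is left out.

```r
# Hypothetical long-format OD table: one row per origin-destination pair.
od <- data.frame(
  origin      = c("A", "A", "A", "B", "B", "B"),
  destination = c("d1", "d2", "d3", "d1", "d2", "d3"),
  tt          = c(10, 25, 40, 35, 15, 50)  # travel time in minutes
)

# Hypothetical two-column attractiveness table (ID, value), e.g. number of beds.
mass <- data.frame(destination = c("d1", "d2", "d3"), mass = c(100, 50, 200))

# Proximity: travel time to the nearest destination, per origin.
proximity <- tapply(od$tt, od$origin, min)

# Cumulative opportunities: number of destinations reachable within a threshold.
threshold <- 30  # minutes (assumed value, for illustration)
cumulative <- tapply(od$tt <= threshold, od$origin, sum)

# Potential accessibility: sum of mass * exp(-beta * travel time), per origin.
beta <- 0.023105
od_m <- merge(od, mass, by = "destination")
od_m$weighted <- od_m$mass * exp(-beta * od_m$tt)
potential <- tapply(od_m$weighted, od_m$origin, sum)

proximity
cumulative
potential
```

With beta = 0.023105, the weight `exp(-beta * 30)` is approximately 0.5, i.e. a destination 30 minutes away counts about half as much as one at zero travel time.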
Each sampling procedure is repeated a user-defined number of times (100 in the case of the paper).
Set of origin-destination matrices stored in two folders:
- `f03_Aggregates` for measures without distance decay (proximity and cumulative opportunities measures)
- `f03_Aggregates_Ai` for measures with distance decay (potential accessibility measure)
Set of csv files stored in two folders (names of files depend on the type of destination):
- `t06_Results`: files with aggregated values of:
  - MAPE, MAE, maximum difference
  - values of Gini (stored in subfolder `Gini`)
- `t04_Temporary`: files with disaggregated values (calculated separately for each of the randomly selected scenarios):
  - MAPE (data used for Table 4 in the paper)
  - MAE
  - maxdiff
  - Gini
Additionally, in the t04_Temporary folder the script saves values of accessibility measures calculated for the systematic approach (one file for each destination and time-window period; for details please consult the Data_description.pdf file).
R02_TravelTime.R compares the precision of travel time estimation using MAPE, MAE and maximum difference (maxdiff). The script uses all 4 sampling methods.
Set of OD matrices stored in the f03_Aggregates_Ai folder (census tract centroids as origin & destination points)
TravTime.csv file stored in Results/t05_TTResults folder which contains MAPE, MAE and maxdiff (maximum difference) indicators (MAPE values used for Table 3 in the paper)
R03_Comparison_TravelTime.R prepares graphs which present the loss of precision of travel time estimation due to reduced temporal resolution.
TravTime.csv file (stored in t05_TTResults folder) which contains MAPE, MAE & maxdiff of travel times aggregated for different temporal resolutions and obtained using different sampling methods.
- `TravelTime_sampling.png`: figure which compares the quality of different sampling methods (Figure 3 in the paper).
- `TravelTime_estimation.png`: figure which presents the loss of precision (MAPE & MAE) of the hybrid model in different 1-hour-long scenarios and their average (Figure 5 in the paper).
The R04 script prepares graphs which present the loss of precision of accessibility measures due to reduced temporal resolution.
Set of files (one per destination) with MAPE values stored in the t06_Results folder.
- `Ai_Sampling.png`: compares the quality of different sampling methods (Figure 4 in the paper).
- `Ai_Hybrid.png`: compares all measures calculated for particular destinations, using MAPE of the hybrid model in different 1-hour-long scenarios (not used in the paper).
- `Ai_Hybrid_complex.png`: the same as above, with an added zoom-in that excludes the curves for cumulative opportunities (Figure 6 in the paper).
- `Ai_H_scenarios.png`: presents MAPE of the hybrid model in different 1-hour-long scenarios, for different types of measures and destinations (Figure 7 in the annex).
R05_Gini_Ai.R combines Gini coefficients stored in separate files (one for each destination and time window) and saves the output to an xlsx file.
Set of files (one per each destination and time window) with Gini coefficients stored in t06_Results folder.
`Gini_table.xlsx` stored in the `t07_Graphs` folder (data used for Table 5 in the paper)
Simple script used to draw a graph which presents the total number of vehicles in 1-hour-long periods of time (Figure 2 in the paper).
Selected GTFS files: stop_times.txt, trips.txt, calendar.txt and routes.txt stored in Results/p00_data/GTFS_Szczecin folder.
Graph Plot_Freq.png stored in Results/p07_Graphs/ folder (Figure 2 in the paper)
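The kind of hourly count behind such a graph can be sketched on a toy table that mimics the GTFS `stop_times.txt` fields `trip_id` and `departure_time`. This is an assumption-laden illustration, not the script itself: here a "vehicle" is taken to be a trip with at least one departure in a given hour, and the filtering by service day and route that `calendar.txt`, `trips.txt` and `routes.txt` would allow is left out.

```r
# Toy stand-in for stop_times.txt (real GTFS fields: trip_id, departure_time).
stop_times <- data.frame(
  trip_id        = c("t1", "t1", "t2", "t2", "t3", "t3"),
  departure_time = c("06:05:00", "06:20:00", "06:50:00",
                     "07:10:00", "07:30:00", "07:45:00"),
  stringsAsFactors = FALSE
)

# Hour of each departure (GTFS times are "HH:MM:SS" and may exceed 24:00:00).
stop_times$hour <- as.integer(substr(stop_times$departure_time, 1, 2))

# Assumption: count distinct trips with at least one departure in each hour.
per_hour <- tapply(stop_times$trip_id, stop_times$hour,
                   function(ids) length(unique(ids)))

per_hour
# barplot(per_hour, xlab = "Hour", ylab = "Number of vehicles")
```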
This document is created within the MSCA CAlCULUS project.
This project has received funding from the European Union's Horizon 2020 research and innovation Programme under the Marie Sklodowska-Curie Grant Agreement no. 749761.
The views and opinions expressed herein do not necessarily reflect those of the European Commission.
License for scripts: CC-BY-4.0