In this repository, we process data on French students by aggregating it at different geographical levels and correcting some features.
To use it, first:
- Create an `INPUT` folder in the root directory containing the raw data, in .parquet format, for each year.
- Create an empty `POST_GENTAB` folder in the root directory.
- Add this year's new formats to the `format` page of the ATLAS Google Sheet here: https://docs.google.com/spreadsheet/ccc?key=11NFXSIg6gQMCsMa8zWQQyypvvYBEmfyJfF2yytXqgMk
- Add this year's new EPE to the `D_EPE` page of the ESR Google Sheet here: https://docs.google.com/spreadsheet/ccc?key=11NFXSIg6gQMCsMa8zWQQyypvvYBEmfyJfF2yytXqgMk
- Run the notebook, which walks through all the details of the code step by step. In the Jupyter notebook, first set the `rentree` parameter, which is the starting year of the academic year (e.g. 2024 for 2024-2025), then execute each cell to obtain the data in a standardized format for each year.
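The setup steps above can be sketched in Python. This is a minimal sketch, assuming it is run with the repository root as `root`. The helper names `prepare_folders` and `academic_year_label` are hypothetical and are not part of the repository.

```python
from pathlib import Path


def prepare_folders(root: Path) -> list[Path]:
    """Create INPUT and POST_GENTAB under root if they are missing.

    Returns the sorted list of raw .parquet files found in INPUT.
    """
    input_dir = root / "INPUT"
    output_dir = root / "POST_GENTAB"
    input_dir.mkdir(parents=True, exist_ok=True)
    output_dir.mkdir(parents=True, exist_ok=True)
    return sorted(input_dir.glob("*.parquet"))


def academic_year_label(rentree: int) -> str:
    """Turn the starting year into an academic-year label, e.g. 2024 -> '2024-2025'."""
    return f"{rentree}-{rentree + 1}"
```

For example, `academic_year_label(2024)` gives `"2024-2025"`, matching the meaning of the `rentree` parameter described above.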
This notebook makes it possible to correct the incorrect features and fill in the missing ones. At the end of the process, once every year has been processed, the `POST_GENTAB` folder must contain the same number of files as the `INPUT` folder. These new files are used to produce the 'état du supérieur' map.
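The file-count condition can be checked with a few lines of Python. This is a minimal sketch, assuming both folders sit directly under the repository root and contain only data files. The function names `count_files` and `folders_match` are hypothetical, and hidden files (names starting with a dot) are skipped as an assumption so that system files do not distort the count.

```python
from pathlib import Path


def count_files(folder: Path) -> int:
    """Count regular, non-hidden files directly inside folder."""
    return sum(
        1 for p in folder.iterdir() if p.is_file() and not p.name.startswith(".")
    )


def folders_match(root: Path) -> bool:
    """Return True when POST_GENTAB holds as many files as INPUT."""
    return count_files(root / "INPUT") == count_files(root / "POST_GENTAB")
```

Calling `folders_match(Path("."))` from the repository root returns `False` as long as some years have not been processed yet.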
- Run the notebook for the "communal" level step by step, with all the details of the code.
- Run the notebook for the "établissement" level, which is much more granular, step by step, with all the details of the code.
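To illustrate why the "établissement" level is more granular than the "communal" level, here is a toy sketch of the same aggregation done with two different keys. It is not the notebooks' actual code. The field names `commune`, `etablissement` and `effectif`, the `aggregate` helper, and the sample rows are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical rows: field names and values are placeholders, not real data.
rows = [
    {"commune": "A", "etablissement": "E1", "effectif": 120},
    {"commune": "A", "etablissement": "E2", "effectif": 80},
    {"commune": "B", "etablissement": "E3", "effectif": 50},
]


def aggregate(rows: list[dict], key: str) -> dict[str, int]:
    """Sum the 'effectif' field over all rows sharing the same value of key."""
    totals: dict[str, int] = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["effectif"]
    return dict(totals)


by_commune = aggregate(rows, "commune")
by_etablissement = aggregate(rows, "etablissement")
```

In this toy example the communal aggregation yields two groups while the "établissement" aggregation yields three, because several establishments can belong to one commune.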