An aggregator for the Rutgers SOC API with a simpler developer experience. This project was built to parse courses for Scarlet Navigator. It prevents conflicts in which one semester's course data overrides the other's when compiling courses for both Spring and Fall semesters, and it adds metadata that the Scarlet Navigator project relies on.
Make sure you have the following installed:
git clone [email protected]:openscarletorg/ScarletCourse.git
cd ScarletCourse
pip3 install -r requirements.txt

To run the scraper, specify the year and semester you want to scrape. The scraper collects all the courses for that semester and stores them in a JSON file.
python3 scraper.py [year] [semester]

python3 scraper.py 2019 fall-spring -o output.json

The above command will scrape all the courses for the Fall 2019 semester and Spring 2020 semester, storing them in output.json.
If there are conflicts (duplicate courses) between the two semesters, the scraper automatically adds information to the course metadata that distinguishes the Fall offering from the Spring offering. The scraper then stores the courses in the JSON file.
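The conflict handling described above can be pictured with a minimal sketch. It is not the scraper's actual implementation: the `course_id` key, the `semesters` metadata field, and the `merge_semesters` function are hypothetical names chosen for illustration, and the real metadata may look different.

```python
from typing import Dict, List


def merge_semesters(fall: List[dict], spring: List[dict]) -> Dict[str, dict]:
    """Merge two course lists without letting one semester overwrite the other.

    Assumption: each course dict has a unique "course_id" key (hypothetical).
    """
    merged: Dict[str, dict] = {}
    for semester, courses in (("fall", fall), ("spring", spring)):
        for course in courses:
            key = course["course_id"]
            if key not in merged:
                # First time this course is seen: copy it and start its metadata.
                entry = dict(course)
                entry["semesters"] = [semester]  # hypothetical metadata field
                merged[key] = entry
            else:
                # Duplicate course: keep the existing entry, record the extra semester.
                merged[key]["semesters"].append(semester)
    return merged


fall_courses = [{"course_id": "A", "title": "Course A"}]
spring_courses = [
    {"course_id": "A", "title": "Course A"},
    {"course_id": "B", "title": "Course B"},
]
merged_courses = merge_semesters(fall_courses, spring_courses)
```

Here course `A` appears once in the result with both semesters recorded, while course `B` is marked as Spring only.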
The following values for semester are supported:

- fall
- spring
- winter
- summer
- fall-spring
- fall-winter-spring
- summer-fall-winter-spring
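As a rough illustration of how a combined value might map to individual terms, the sketch below splits the value on hyphens and assigns a calendar year to each term. The `expand_semester` function is hypothetical, and the rule that winter and spring roll over to the next calendar year when they follow fall is an assumption inferred from the `2019 fall-spring` example (Fall 2019 and Spring 2020); the scraper may resolve years differently.

```python
from typing import List, Tuple

# The semester values listed in this README.
SUPPORTED_SEMESTERS = {
    "fall",
    "spring",
    "winter",
    "summer",
    "fall-spring",
    "fall-winter-spring",
    "summer-fall-winter-spring",
}


def expand_semester(year: int, semester: str) -> List[Tuple[int, str]]:
    """Expand a semester argument into (calendar year, term) pairs."""
    if semester not in SUPPORTED_SEMESTERS:
        raise ValueError(f"Unsupported semester value: {semester!r}")

    pairs: List[Tuple[int, str]] = []
    seen_fall = False
    for term in semester.split("-"):
        # Assumption: terms that follow fall in a combined value
        # fall in the next calendar year.
        rolls_over = seen_fall and term in ("winter", "spring")
        pairs.append((year + 1 if rolls_over else year, term))
        if term == "fall":
            seen_fall = True
    return pairs


print(expand_semester(2019, "fall-spring"))
```

Under these assumptions, `expand_semester(2019, "fall-spring")` yields `[(2019, 'fall'), (2020, 'spring')]`, and an unsupported value raises `ValueError` before any request is made.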
Before the scraper calls the SOC API, it first checks that the parameters are valid. For example, if the current year is 2025 and the Spring semester data is not ready yet, the scraper will not run.
The following command creates an output.json file with all the undergraduate courses in Camden for the Fall 2019 semester.
python3 scraper.py 2019 fall -o output.json -p cam -l undergrad

You can specify the following options:
- --output or -o: The output file to store the scraped data in. Defaults to output.json.
- --level or -l: The level of the courses to scrape. Defaults to undergraduate.
  - undergrad: Scrape all undergraduate courses
  - grad: Scrape all graduate courses
  - all: Scrape both grad and undergrad courses
- --place or -p: The campus or location of the courses to scrape.
  - nb: Scrape New Brunswick courses
  - cam: Scrape Camden courses
  - newark: Scrape Newark courses
  - online: Scrape online courses
  - all: Scrape courses from all campuses/locations
- --verbose or -v: Whether to print course titles and conflict management details to the console. Defaults to False.
- --help or -h: Show the help message and exit.
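For readers who want to see how such a command line could be declared, here is a minimal sketch using Python's standard `argparse` module. It is not taken from scraper.py: the `build_parser` function is hypothetical, the `undergrad` default for `--level` is an assumption based on the accepted values, and `--place` is given no default because this README does not state one.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the options documented in this README."""
    parser = argparse.ArgumentParser(
        prog="scraper.py",
        description="Scrape Rutgers SOC courses for a given year and semester.",
    )
    parser.add_argument("year", type=int, help="Year to scrape, e.g. 2019")
    parser.add_argument(
        "semester",
        choices=[
            "fall", "spring", "winter", "summer",
            "fall-spring", "fall-winter-spring", "summer-fall-winter-spring",
        ],
        help="Semester or combination of semesters to scrape",
    )
    parser.add_argument("-o", "--output", default="output.json",
                        help="File to store the scraped data in")
    # Assumption: the default is spelled "undergrad" to match the accepted values.
    parser.add_argument("-l", "--level", choices=["undergrad", "grad", "all"],
                        default="undergrad", help="Level of courses to scrape")
    # Assumption: no default is set, since the README does not specify one.
    parser.add_argument("-p", "--place",
                        choices=["nb", "cam", "newark", "online", "all"],
                        default=None, help="Campus or location to scrape")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Print course titles and conflict handling")
    # argparse adds -h/--help automatically.
    return parser


# Parse the Camden example from this README instead of reading sys.argv.
args = build_parser().parse_args(
    ["2019", "fall", "-o", "output.json", "-p", "cam", "-l", "undergrad"]
)
```

Declaring the accepted values through `choices` lets `argparse` reject an unsupported semester, level, or campus with a usage message before any scraping starts.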