1. Group Goals
-
Migrate data into structure that allows for real-time searching
-
Preprocess as many calculations as possible
-
Write query script for database
-
Begin capturing Wikidata
-
Disambiguate duplicate author entries
-
Began writing script to transfer file statistics into MongoDB
-
Began reconciling authors with Wikidata IDs, process running overnight
-
Created SPARQL query for extracting Wikidata for filtering
-
Began disambiguation of authorName metadata
-
Miscellaneous work on branding, user interviews, and developing a means of sorting in Excel (e.g., COUNTIF(G2:Z2, ">0"))
-
Stephen had a great idea for an article: the problems of anecdotal content
-
Completed user interviews
-
Brainstorming - Initial Branding
-
Run the chunking script again, including volume tails (DONE)
-
Complete migration of data into structure that allows for real-time searching
-
Preprocess as many calculations as possible
-
Include metadata in database structure
-
Write query script for database
-
Code script to input passages, analyze most frequent words, and input additional words
-
Begin mapping out website structure, including a wireframe
-
Complete reconciling author strings with Wikidata IDs
-
Complete author disambiguation
-
Retrieve and integrate Wikidata data with ECCO metadata: https://github.com/artshumrc/search-hackathon/blob/master/Corpus_Enrichment/SPARQL%20query%20for%20data%20by%20ID
-
Add "person or non-person" indicator to ECCO metadata: https://github.com/artshumrc/search-hackathon/blob/master/Corpus_Enrichment/Sift%20works%20by%20possible%20persons%20and%20likely%20non-persons.py
-
Include calculation for cosine similarity in MongoDB
-
Document progress over the course of the hackathon
-
Set up slack channel for future collaboration
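
Two of the items above, the script that analyzes the most frequent words in a passage and the cosine similarity calculation, fit together naturally, so a rough sketch of both may be useful. This is only an illustration in plain Python, not the group's actual script and not a MongoDB implementation. The function names (`word_counts`, `most_frequent`, `cosine_similarity`) and the simple tokenization rule (lowercase alphabetic runs) are assumptions made for the example.

```python
import math
import re
from collections import Counter


def word_counts(passage: str) -> Counter:
    """Count words in a passage (assumed tokenization: lowercase a-z runs)."""
    return Counter(re.findall(r"[a-z]+", passage.lower()))


def most_frequent(passage: str, n: int = 5) -> list:
    """Return the n most frequent words as (word, count) pairs."""
    return word_counts(passage).most_common(n)


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two word-count vectors."""
    # Only words present in both passages contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty passage is treated as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    query = word_counts("the sublime and the beautiful")
    chunk = word_counts("an enquiry into the sublime")
    print(most_frequent("the sublime and the beautiful", 2))
    print(round(cosine_similarity(query, chunk), 3))
```

Because the norm of each chunk's vector depends only on that chunk, it could be computed once and stored with the chunk, in line with the goal of preprocessing as many calculations as possible. Only the dot product would then need to be computed at query time.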