
1. Group Goals

Vittorio2015 edited this page Mar 30, 2018 · 13 revisions

Main Goals for Day 1

  1. Migrate data into structure that allows for real-time searching

  2. Preprocess as many calculations as possible

  3. Write query script for database

  4. Begin capturing Wikidata data

  5. Disambiguate duplicate author entries
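
One simple way to approach goal 5 is to reduce each author string to a crude comparison key and group the strings that share a key. The sketch below is only an illustration of that idea, not the script the group wrote: the function names and the sample author strings are hypothetical, and the rule of dropping digits and punctuation is an assumption about how duplicate entries differ.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_author(name: str) -> str:
    """Reduce an author string to a crude comparison key (hypothetical helper)."""
    # Fold accented characters to their unaccented base letters.
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Replace digits (for example life dates) and punctuation with spaces.
    text = re.sub(r"[^A-Za-z\s]", " ", text)
    # Lowercase and collapse runs of whitespace.
    return " ".join(text.lower().split())


def group_duplicates(names):
    """Group raw author strings that share the same normalized key."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize_author(name)].append(name)
    return dict(groups)


# Illustrative strings only; not taken from the project's metadata.
sample = ["Defoe, Daniel, 1661?-1731.", "DEFOE, Daniel", "Swift, Jonathan"]
grouped = group_duplicates(sample)
```

Key matching of this kind only catches spelling and punctuation variants. Two different people with the same name would still be merged, which is where reconciliation against external identifiers becomes useful.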

Progress

  1. Began writing a script to transfer file statistics into MongoDB

  2. Began reconciling authors with Wikidata IDs; the process is running overnight

  3. Created SPARQL query for extracting Wikidata for filtering

  4. Began disambiguation of authorName metadata

  5. Miscellaneous work on branding, user interviews, and developing a means of sorting the data in Excel (e.g., =COUNTIF(G2:Z2, ">0"))

  6. Stephen had a great idea for an article: the problems of anecdotal content ("")

  7. Completed user interviews

  8. Brainstorming - Initial Branding
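
The Excel formula in item 5 counts how many cells in the range G2:Z2 hold a value greater than zero. If the same check is needed outside a spreadsheet, it could be sketched as follows. The function name and the sample row are hypothetical, and treating blank or text cells as "not counted" is an assumption about the desired behaviour.

```python
def countif_positive(row):
    """Count the cells in a row whose numeric value is greater than zero."""
    count = 0
    for cell in row:
        # Skip blanks, text, and booleans; only real numbers are compared.
        if isinstance(cell, bool) or not isinstance(cell, (int, float)):
            continue
        if cell > 0:
            count += 1
    return count


# Hypothetical row standing in for the cells G2:Z2.
row = [0, 3, "", None, 2.5, -1, "7"]
positive_cells = countif_positive(row)
```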

Main Goals for Day 2

  1. Run the chunking script again, including volume tails (DONE)

  2. Complete migration of data into structure that allows for real-time searching

  3. Preprocess as many calculations as possible

  4. Include metadata in database structure

  5. Write query script for database

  6. Write a script that takes passages as input, analyzes their most frequent words, and accepts additional words

  7. Begin mapping out website structure, including a wireframe

  8. Static HTML prototype

  9. Complete reconciling author strings with Wikidata IDs

  10. Complete author disambiguation

  11. Retrieve and integrate Wikidata data with ECCO metadata: https://github.com/artshumrc/search-hackathon/blob/master/Corpus_Enrichment/SPARQL%20query%20for%20data%20by%20ID

  12. Add "person or non-person" indicator to ECCO metadata: https://github.com/artshumrc/search-hackathon/blob/master/Corpus_Enrichment/Sift%20works%20by%20possible%20persons%20and%20likely%20non-persons.py

  13. Include calculation for cosine similarity in MongoDB

  14. Document progress over the course of the hackathon

  15. Set up a Slack channel for future collaboration
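
Goals 6 and 13 both rest on word counts: the most frequent words of a passage come from counting its tokens, and cosine similarity compares two such count vectors. The sketch below shows both calculations in plain Python. It is an assumption-laden illustration rather than the project's code: the function names and the simple regular-expression tokenizer are hypothetical, and nothing here touches MongoDB.

```python
import math
import re
from collections import Counter


def word_counts(passage: str) -> Counter:
    """Count lowercase word tokens in a passage (simple regex tokenizer)."""
    return Counter(re.findall(r"[a-z']+", passage.lower()))


def most_frequent(passage: str, n: int = 10):
    """Return the n most frequent words with their counts."""
    return word_counts(passage).most_common(n)


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two word-count vectors."""
    # Only words present in both passages contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty passage has no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical passages for illustration.
first = word_counts("the cat and the hat")
second = word_counts("a dog on a log")
```

Precomputing and storing the word counts (and their norms) for each chunk would fit the goal of preprocessing as many calculations as possible, since only the dot product would then need to be computed at query time.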

Progress

State of Project at End of Hackathon
