Added program totals management scripts and tables #429
suecarmol
left a comment
Thank you for all of the work on this PR. The UI now loads lightning-fast thanks to your changes 😄
I have a few minor comments and nits before we can merge this in.
Note that the tests for the management commands will be added in PR #439; I tested the commands locally and they work.
extlinks/aggregates/management/commands/fill_top_organisations_totals.py
* Program totals and statistics now use the new tables
* Program CSV downloads now use the new tables
suecarmol
left a comment
This looks good to me! Thank you so much for your efforts on making Wikilink load faster!
Thanks for your work. I validated locally that this works as expected, except that program stats will be unavailable until we start backfilling the new database tables.
Description
Added management commands for creating precomputed program totals from aggregate data.
Rationale
Querying program data from our aggregates table is causing performance issues. We plan to archive this data to help with performance, but for old program statistics to remain accurate, we need to compute program totals for the previous months that will be archived. As a byproduct, this also improves program query performance substantially.
Phabricator Ticket
https://phabricator.wikimedia.org/T370980
How Has This Been Tested?
Existing programs tests cover the accuracy of the new program totals command output.
Totals Command Examples
Totals function can be tested like so:
If no date option is passed, the earliest possible date is identified and totals are calculated for the whole dataset. The top organisations and top projects totals calculate quickly, but top users can take ~10 minutes for the entire dataset.
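For reviewers unfamiliar with the approach, here is a minimal, framework-free sketch of the date-defaulting behaviour described above. It is not the code in this PR: the function name `monthly_totals`, the row fields (`organisation`, `full_date`, `links_added`, `links_removed`) and the sample data are all hypothetical, chosen only to illustrate falling back to the earliest date when none is given.

```python
from collections import defaultdict
from datetime import date


def monthly_totals(rows, start=None):
    """Sum link counts per (organisation, year, month) from `start` onward.

    `rows` is a list of dicts standing in for aggregate records (hypothetical
    shape). When `start` is None, the earliest date in `rows` is used, so the
    whole dataset is covered.
    """
    if not rows:
        return {}
    if start is None:
        # No date passed: fall back to the earliest date available.
        start = min(row["full_date"] for row in rows)

    totals = defaultdict(lambda: {"added": 0, "removed": 0})
    for row in rows:
        if row["full_date"] < start:
            continue  # Skip records before the requested start date.
        key = (row["organisation"], row["full_date"].year, row["full_date"].month)
        totals[key]["added"] += row["links_added"]
        totals[key]["removed"] += row["links_removed"]
    return dict(totals)


# Hypothetical sample data.
rows = [
    {"organisation": "Org A", "full_date": date(2024, 1, 5), "links_added": 3, "links_removed": 1},
    {"organisation": "Org A", "full_date": date(2024, 1, 20), "links_added": 2, "links_removed": 0},
    {"organisation": "Org B", "full_date": date(2024, 2, 1), "links_added": 7, "links_removed": 4},
]

all_totals = monthly_totals(rows)                               # whole dataset
recent_totals = monthly_totals(rows, start=date(2024, 2, 1))    # from February on
```

Once totals like these are stored per month, the underlying aggregate rows can be archived without changing what the older program statistics report.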
Screenshots of your changes (if appropriate):
N/A
Types of changes
What types of changes does your code introduce? Add an `x` in all the boxes that apply: