-
Notifications
You must be signed in to change notification settings - Fork 3k
Update dataset_infos for UDHN/udhr dataset #4362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update dataset_infos for UDHN/udhr dataset #4362
Conversation
|
The documentation is not available anymore as the PR was closed or merged. |
|
Thanks for contributing @leondz. The checksums of the files have changed because more languages have been added:
|
|
Yep! All done (also fixed the language tags in the README which were iso639-3 instead of the expected bcp47) |
|
I guess the language code CI failure is due to languages.json being a subset of bcp47 (see issue #4304), happy to contribute a solution here, e.g. autogeneration of the lang list from the relevant isos and the ietf bcp47 subtag register or full code for validation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again for your contribution, @leondz.
Yes, I think it is OK to set version 1.0.0 (as previous was 0.0.0).
One of the CI failures is related to dummy data: once you have updated the dataset version, the dummy_data ZIP file should be moved from "dummy/0.0.0/dummy_data.zip" to "dummy/1.0.0/dummy_data.zip".
Other CI failure is related to missing languages in our resources file. This has been addressed in this PR:
You should merge master branch into your feature branch to incorporate that fix.
Oh, thanks, I missed that one
Yeah, I saw this :) I already have the merge, thanks. I'm talking about the longer-term picture: every time another language code comes up (e.g. da-bornholm or es-VE), the json will need updating, because the current approach is non-exhaustive manual whitelisting instead of relying on the established bcp standard. |
albertvillanova
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @leondz,
Could you please merge the master branch to see if the tests pass?
git fetch upstream master
git merge upstream/master
git push
albertvillanova
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
Checksum update to
udhrfor issue #4361