Commit 6bdc52d
authored
Fix IndexError while loading Arabic Billion Words dataset (#2729)
* Catch IndexError and ignore that record
* Update metadata JSON
* Add pretty_name tag to dataset card
* Add Data Splits to dataset card
* Add Data Instances to dataset card
* Fix licenses tag in dataset card1 parent 202f253 commit 6bdc52d
File tree
3 files changed
+32
-7
lines changed- datasets/arabic_billion_words
3 files changed
+32
-7
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
6 | 6 | | |
7 | 7 | | |
8 | 8 | | |
9 | | - | |
| 9 | + | |
10 | 10 | | |
11 | 11 | | |
12 | 12 | | |
| |||
37 | 37 | | |
38 | 38 | | |
39 | 39 | | |
| 40 | + | |
40 | 41 | | |
41 | 42 | | |
42 | 43 | | |
| |||
92 | 93 | | |
93 | 94 | | |
94 | 95 | | |
95 | | - | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
96 | 108 | | |
97 | 109 | | |
98 | 110 | | |
| |||
104 | 116 | | |
105 | 117 | | |
106 | 118 | | |
107 | | - | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
108 | 133 | | |
109 | 134 | | |
110 | 135 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
157 | 157 | | |
158 | 158 | | |
159 | 159 | | |
160 | | - | |
| 160 | + | |
161 | 161 | | |
162 | 162 | | |
163 | 163 | | |
164 | 164 | | |
165 | 165 | | |
166 | | - | |
167 | | - | |
| 166 | + | |
| 167 | + | |
168 | 168 | | |
169 | 169 | | |
170 | 170 | | |
0 commit comments