Closed
Description
In jiant/jiant/tasks/lib/wsc.py, _create_examples() fails at the statement span1_idx=line["span1_index"] (line 177) with KeyError: 'span1_index', because the lookup does not match the structure of the JSON task data.
The statement should be span1_idx=line["target"]["span1_index"], and the next 3 statements need the same change.
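A minimal sketch of the proposed lookup, independent of jiant. The record below is invented for illustration; its layout (span fields nested under "target") is assumed from the fix described above. The helper name extract_spans is hypothetical, as is the guess that the other three statements read span2_index, span1_text and span2_text.

```python
# Hypothetical record in the nested layout described above; the values are invented.
sample_line = {
    "text": "Mark told Pete many lies about himself.",
    "target": {
        "span1_text": "Mark",
        "span1_index": 0,
        "span2_text": "himself",
        "span2_index": 6,
    },
    "idx": 0,
    "label": True,
}


def extract_spans(line):
    """Read the span fields from the nested "target" object (hypothetical helper)."""
    target = line["target"]
    return {
        "span1_idx": target["span1_index"],
        "span2_idx": target["span2_index"],
        "span1_text": target["span1_text"],
        "span2_text": target["span2_text"],
    }


spans = extract_spans(sample_line)

# The top-level lookup used at line 177 has nothing to find in this layout.
has_top_level_key = "span1_index" in sample_line
```

With a record shaped like this, line["span1_index"] raises KeyError, while line["target"]["span1_index"] returns the index.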
To Reproduce
- Install jiant v2 from any recent commit (where wsc.py hasn't been touched since 3bd801c)
- I doubt this matters, but I am running Python 3.8 on a recent Linux, on a 40-core, 80-thread Skylake CPU with 384 GB of RAM and a V100 16 GB GPU.
- In the Python REPL,
from jiant.proj.simple import runscript as run
import jiant.scripts.download_data.runscript as downloader
downloader.download_data(["wsc"], "/home/rasmussen.63/wsc-speed/tasks")
args = run.RunConfiguration(
    run_name="wsc-speed",
    exp_dir="wsc-speed",
    data_dir="wsc-speed/tasks",
    model_type="roberta-base",
    tasks="wsc",
    train_batch_size=16,
    num_train_epochs=3
)
run.run_simple(args)
Watch stderr come back to you until it stops at:
Tokenizing Task 'wsc' for phases 'train,val,test'
WSCTask
[train]: /home/rasmussen.63/wsc-speed/tasks/data/wsc/train.jsonl
[val]: /home/rasmussen.63/wsc-speed/tasks/data/wsc/val.jsonl
[test]: /home/rasmussen.63/wsc-speed/tasks/data/wsc/test.jsonl
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rasmussen.63/jiant/jiant/proj/simple/runscript.py", line 148, in run_simple
    tokenize_and_cache.main(
  File "/home/rasmussen.63/jiant/jiant/proj/main/tokenize_and_cache.py", line 165, in main
    examples=task.get_train_examples(),
  File "/home/rasmussen.63/jiant/jiant/tasks/lib/wsc.py", line 160, in get_train_examples
    return self._create_examples(lines=read_json_lines(self.train_path), set_type="train")
  File "/home/rasmussen.63/jiant/jiant/tasks/lib/wsc.py", line 177, in _create_examples
    span1_idx=line["span1_index"],
KeyError: 'span1_index'
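To confirm the mismatch behind this KeyError, one can inspect the keys of the first record in the downloaded file. This is a rough sketch using only the standard library; first_record_keys is a hypothetical helper, not part of jiant.

```python
import json


def first_record_keys(path):
    """Return the sorted top-level keys of the first JSON Lines record,
    plus the sorted keys of its "target" object if one is present."""
    with open(path, encoding="utf-8") as f:
        record = json.loads(f.readline())
    target = record.get("target")
    nested = sorted(target) if isinstance(target, dict) else []
    return sorted(record), nested
```

Calling first_record_keys("/home/rasmussen.63/wsc-speed/tasks/data/wsc/train.jsonl") should show whether span1_index appears at the top level or only under "target".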
Expected behavior
WSCTask should initialize an Example from the downloaded data.