Make feedback and finetune-data paths respect subject/model args

Hi, I found a path/lineage issue in the problem-solving self-improvement pipeline.

The README describes the loop as parameterized by subject/model, with data at:

```text
dataset/{subject}_train.jsonl
dataset/{subject}_test.jsonl
```

and examples such as:

```bash
python Problem_solving/PhyChem/get_a_sol.py --model='gpt-3.5-turbo' --task='MMLU_physics' --prompt_type='multi_agent' --mode='generate' --subject='phy'
```

`get_c_regenerate.py` follows this pattern and derives its files from `args.subject` and `args.model`:

```python
inputfile = f"Problem_solving/PhyChem/logs/solve_{args.subject}_{args.model}/feedback.jsonl"
regenerate_sol_file = f"Problem_solving/PhyChem/logs/solve_{args.subject}_{args.model}/regenerate_sol.jsonl"
```

But `get_b_feedback.py` hardcodes the feedback source and destination:

```python
inputfile = "Problem_solving/PhyChem/logs/solve_phy_gpt-3.5-turbo/wrong/wrong.jsonl"
feedback_file = f"Problem_solving/PhyChem/logs/solve_phy_gpt-3.5-turbo/feedback.jsonl"
```

and `get_finetune_data.py` also hardcodes:

```python
ditc = "Problem_solving/PhyChem/logs/solve_phy_gpt-3.5-turbo"
sft_dic = "Problem_solving/PhyChem/logs/solve_phy_gpt-3.5-turbo"
```

This can break the self-improvement lineage for any run that is not exactly `subject=phy` and `model=gpt-3.5-turbo`. For example, a user running `--subject chem` or a different model can generate trajectories into one run directory, but feedback/finetune data can still be read from or written to the physics GPT-3.5 directory. The regenerate stage is already parameterized, so the B/C/D phases can silently diverge.
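To make the divergence concrete, here is a minimal sketch comparing where the feedback stage writes with where the regenerate stage reads. The `Namespace` values are hypothetical stand-ins for parsed CLI args (`gpt-4` is only an example model name); the two path strings are copied from the snippets quoted above.

```python
from argparse import Namespace

# Hypothetical run: any subject/model other than phy + gpt-3.5-turbo.
args = Namespace(subject="chem", model="gpt-4")

# Destination as hardcoded in get_b_feedback.py (quoted above).
feedback_written_to = "Problem_solving/PhyChem/logs/solve_phy_gpt-3.5-turbo/feedback.jsonl"

# Source as derived in get_c_regenerate.py (quoted above).
feedback_read_from = (
    f"Problem_solving/PhyChem/logs/solve_{args.subject}_{args.model}/feedback.jsonl"
)

# The two stages only agree for subject=phy, model=gpt-3.5-turbo.
paths_match = feedback_written_to == feedback_read_from
print(paths_match)  # False
```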

For a self-improving system, the experience library is effectively the promotion surface: successful trajectories and regenerated failures become training data for the next agent version. In bounded verifier-style RSI harnesses, that lineage needs to stay tied to the same run identity and evaluation split; otherwise the improvement claim becomes difficult to audit.

Suggested fix:

```python
base_dir = f"Problem_solving/PhyChem/logs/solve_{args.subject}_{args.model}"
inputfile = f"{base_dir}/wrong/wrong.jsonl"
feedback_file = f"{base_dir}/feedback.jsonl"
```

and make `get_finetune_data.py` derive `ditc` and `sft_dic` from `args.subject` and `args.model` in the same way. It may also be useful to write a small manifest per run recording `subject`, `model`, `mode`, source files, and output files, so the experience-library lineage is auditable across improvement rounds.
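One possible shape for the manifest idea, as a rough sketch rather than a drop-in patch. The helper names `run_dir` and `write_manifest`, the `manifest.json` filename, and the `root` parameter are all assumptions for illustration; none of them exist in the repo today.

```python
import json
from argparse import Namespace
from pathlib import Path


def run_dir(args, root="Problem_solving/PhyChem/logs"):
    """Return the per-run log directory for the given subject/model (hypothetical helper)."""
    return Path(root) / f"solve_{args.subject}_{args.model}"


def write_manifest(args, sources, outputs, root="Problem_solving/PhyChem/logs"):
    """Record run identity plus input/output files next to the run's logs (hypothetical helper)."""
    base = run_dir(args, root)
    base.mkdir(parents=True, exist_ok=True)
    manifest = {
        "subject": args.subject,
        "model": args.model,
        "mode": getattr(args, "mode", None),  # not every script may define --mode
        "sources": [str(p) for p in sources],
        "outputs": [str(p) for p in outputs],
    }
    path = base / "manifest.json"  # assumed filename
    path.write_text(json.dumps(manifest, indent=2))
    return path
```

For example, the feedback stage could call `write_manifest(args, [base / "wrong" / "wrong.jsonl"], [base / "feedback.jsonl"])` with `base = run_dir(args)` after it finishes, so each round leaves a record of what it read and wrote.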

