Community-contributed datasets for PSYCTL personality steering.
Base conversation datasets are the source material from which steering datasets are generated. Each record contains a social dialogue scenario with situational context.
Format Example:
{
  "narrative": "Alice is at a party...",
  "speakers": ["Friend", "Alice"],
  "dialogue": ["Want to dance?", "Sure!"]
}

| Repository | Language | Samples | License | Description |
|---|---|---|---|---|
| CaveduckAI/simplified_soda_kr | Korean | - | - | Korean version of SoDA dataset |
| allenai/soda | English | ~1.5M | ODC-BY | Social dialogue dataset |
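
The relationship between a base conversation record and the `situation` / `char_name` fields of a steering sample can be sketched as below. This is only an illustration inferred from the two format examples in this document, not PSYCTL's actual implementation: the function name `build_situation` is hypothetical, and the rule that the last speaker is the steered character is an assumption.

```python
def build_situation(record: dict) -> dict:
    """Turn a base conversation record into a situation prompt.

    Assumption (not confirmed by PSYCTL): the last dialogue turn belongs to
    the character being steered, and every earlier turn becomes a
    "Speaker: utterance" context line.
    """
    speakers = record["speakers"]
    dialogue = record["dialogue"]
    if not dialogue or len(speakers) != len(dialogue):
        raise ValueError("speakers and dialogue must be non-empty and equal in length")

    # Every turn except the last one is context for the character's reply.
    context = "".join(
        f"{speaker}: {utterance}\n"
        for speaker, utterance in zip(speakers[:-1], dialogue[:-1])
    )
    return {
        "situation": f"{record['narrative']}\n{context}",
        "char_name": speakers[-1],
    }


example = {
    "narrative": "Alice is at a party...",
    "speakers": ["Friend", "Alice"],
    "dialogue": ["Want to dance?", "Sure!"],
}
result = build_situation(example)
print(result)
```

For the example record, `result["situation"]` matches the `situation` string shown in the steering format example, and `result["char_name"]` is `"Alice"`.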
Datasets for extracting personality steering vectors with methods such as mean_diff (Mean Difference) and BiPO. Each sample pairs a personality-specific (positive) response with a neutral response to the same situation. The extracted vectors are then applied with CAA (Contrastive Activation Addition).
Format Example:
{
  "situation": "Alice is at a party...\nFriend: Want to dance?\n",
  "char_name": "Alice",
  "positive": "Absolutely! Let's get everyone together!",
  "neutral": "Sure, I'll join you."
}

| Repository | Personality | Language | Samples | Source Dataset | Model | License |
|---|---|---|---|---|---|---|
| CaveduckAI/steer-personality-extroversion-ko | Extroversion (외향성) | Korean | 100 | simplified_soda_kr | kimi-k2-0905 | MIT |
| CaveduckAI/steer-personality-rudeness-ko | Rudeness (무례함) | Korean | 500 | simplified_soda_kr | kimi-k2-0905 | MIT |
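
To make the mean-difference idea concrete, here is a toy sketch in plain Python. It assumes that each positive and neutral response has already been reduced to one activation vector; the numbers are made up, and the function names and the `strength` parameter are hypothetical rather than part of PSYCTL's API. In practice the vectors would come from a model's hidden states at a chosen layer.

```python
from statistics import fmean


def mean_diff_vector(positive_acts, neutral_acts):
    """Per-dimension mean of positive activations minus mean of neutral ones."""
    if not positive_acts or not neutral_acts:
        raise ValueError("both activation lists must be non-empty")
    # zip(*rows) iterates over columns, i.e. one hidden dimension at a time.
    pos_mean = [fmean(column) for column in zip(*positive_acts)]
    neu_mean = [fmean(column) for column in zip(*neutral_acts)]
    return [p - n for p, n in zip(pos_mean, neu_mean)]


def apply_steering(hidden, vector, strength=1.0):
    """Add the scaled steering vector to a hidden state (CAA-style addition)."""
    return [h + strength * v for h, v in zip(hidden, vector)]


# Toy 3-dimensional activations (illustrative values, not model outputs).
positive = [[1.0, 2.0, 0.0], [3.0, 2.0, 2.0]]
neutral = [[0.0, 1.0, 1.0], [2.0, 1.0, 1.0]]

vector = mean_diff_vector(positive, neutral)
steered = apply_steering([0.5, 0.5, 0.5], vector, strength=2.0)
print(vector, steered)
```

A larger `strength` pushes the hidden state further along the personality direction; a negative value would push it away.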
- Generate your dataset using PSYCTL
- Upload to HuggingFace Hub
- Add a row to the appropriate table above via pull request
Dataset Naming Convention: {username}/steer-personality-{trait}-{lang}
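
A small helper can build or check a repository name against this convention before uploading. This is a sketch only: the function names are hypothetical, and the lowercase, hyphenated trait slug and the short language code are assumptions inferred from the two repositories listed above.

```python
import re

# Assumed shape: {username}/steer-personality-{trait}-{lang}, with a lowercase
# trait slug and a 2-3 letter language code (inferred, not an official rule).
PATTERN = re.compile(r"^[\w.-]+/steer-personality-[a-z0-9-]+-[a-z]{2,3}$")


def steering_repo_id(username: str, trait: str, lang: str) -> str:
    """Build a repository ID that follows the naming convention."""
    # Lowercase the trait and collapse anything non-alphanumeric into hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", trait.lower()).strip("-")
    if not slug:
        raise ValueError("trait must contain at least one letter or digit")
    return f"{username}/steer-personality-{slug}-{lang.lower()}"


def follows_convention(repo_id: str) -> bool:
    """Return True if repo_id matches the assumed naming pattern."""
    return bool(PATTERN.match(repo_id))


repo_id = steering_repo_id("CaveduckAI", "Extroversion", "ko")
print(repo_id, follows_convention(repo_id))
```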
Example:
psyctl dataset.build.steer \
--openrouter-api-key "your-key" \
--openrouter-model "moonshotai/kimi-k2-0905" \
--personality "Your Trait" \
--output "./results/dataset" \
--limit-samples 100 \
--dataset "CaveduckAI/simplified_soda_kr"