Modeling personality as adversarial equilibrium, with novelty-driven attention
A computational framework for representing a person's decision-making as competing internal drives that stake evidence, debate in adversarial arenas, and shift allocations based on novelty-modulated attention. Not a profile. Not a summary. A dynamic system that is the personality, in computational form.
Ideas are atomic. Position is relational.
The observation "Lives paycheck to paycheck at 42" has no inherent meaning. Its meaning emerges from what you're optimizing for:
- SURVIVAL sees: financial risk, precarity
- MEANING sees: sacrifice for purpose, choosing mission over money
- AUTONOMY sees: rejecting salary slavery, freedom at cost
- COMFORT sees: unsustainable intensity, warning sign
Same fact. Different positions. The World Model captures this by letting the same observation appear across multiple value trees with different polarities.
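A minimal sketch of that idea, using illustrative dataclasses rather than the library's actual API: one observation object, staked with opposite polarities in different value trees.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Observation:
    text: str

@dataclass(frozen=True)
class Position:
    tendency: str
    polarity: str  # "PRO" supports the tendency's claim, "CON" undermines it

# One atomic fact...
obs = Observation("Lives paycheck to paycheck at 42")

# ...positioned differently in four value trees (polarities from the list above)
positions = [
    Position("SURVIVAL", "CON"),  # financial risk, precarity
    Position("MEANING", "PRO"),   # sacrifice for purpose
    Position("AUTONOMY", "PRO"),  # freedom at cost
    Position("COMFORT", "CON"),   # unsustainable intensity
]

for p in positions:
    print(f"{p.tendency}: {obs.text!r} is {p.polarity}")
```

The same `Observation` is shared (content-hashed in the real system); only its relational position differs per tree.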
Three interlocking systems form a feedback loop:
```
Input (text/voice/tweets)
      │
      ▼
┌────────────┐   "How surprising is this?"
│  NOVELTY   │   Fetch/parse loop against reference frame
│            │   4 dimensions: integration resistance,
│            │   contradiction depth, coverage gap,
│            │   allocation disruption
└─────┬──────┘
      │ novelty score
      ▼
┌────────────┐   "Should I attend to it?"
│ ATTENTION  │   Bounded symbol streams force prioritization
│            │   Novelty captures attention → shifts toward CURIOSITY
│            │   Salience filters: recency, convergence, repetition
└─────┬──────┘
      │ promoted observations
      ▼
┌────────────┐   "What do my tendencies think?"
│   ARENA    │   7 tendencies propose claims, stake evidence
│   (life)   │   Winners gain allocation, losers shrink
│            │   Equilibrium IS the personality
└─────┬──────┘
      │ updated allocations
      └──────────────► feeds back into novelty & attention
```
Seven universal human drives act as agents competing for influence:
| Tendency | Default | Question It Asks |
|---|---|---|
| SURVIVAL | 18% | "Is this safe? Do I have enough?" |
| CONNECTION | 20% | "Am I known? Do I belong?" |
| COMFORT | 18% | "Is this pleasant? Can I sustain this?" |
| STATUS | 12% | "Am I respected? Do I matter to others?" |
| AUTONOMY | 12% | "Am I free? Can I choose?" |
| MEANING | 10% | "Does this matter? Will it outlast me?" |
| CURIOSITY | 10% | "Do I understand? What's there to learn?" |
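The defaults above form a probability distribution; here is a quick arithmetic check (the dict is illustrative, while in the library `AgentSet` maintains the sum-to-1 invariant):

```python
DEFAULT_ALLOCATIONS = {
    "SURVIVAL": 0.18,
    "CONNECTION": 0.20,
    "COMFORT": 0.18,
    "STATUS": 0.12,
    "AUTONOMY": 0.12,
    "MEANING": 0.10,
    "CURIOSITY": 0.10,
}

# Allocations are shares of a fixed attention budget: they must sum to 1
assert abs(sum(DEFAULT_ALLOCATIONS.values()) - 1.0) < 1e-9
```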
Each agent:
- PROPOSES claims about what matters
- STAKES observations as evidence (PRO or CON)
- WINS or LOSES debates based on evidence strength
- GAINS or LOSES allocation based on outcomes
The equilibrium that emerges (which tendencies dominate, where they conflict, how they resolve) is the personality.
Run a full adversarial debate between the tendencies:

```python
from world_model import create_world_model, Arena

model = create_world_model("Person", "observations.json")
arena = Arena()

trees, result = arena.run_full_debate(
    observations=model.observations,
    agents=model.agents,
)

print(f"Winner: {result.winner}")
print(f"Allocations: {model.agents}")
```

Train ML-style, with convergence detection and a validation split:

```python
from world_model.dynamics import Trainer, TrainConfig

config = TrainConfig(
    max_epochs=5,
    convergence_threshold=0.01,
    validation_split=0.2,
)
trainer = Trainer(config)
history, result = trainer.train(
    observations=model.observations,
    agents=model.agents,
)

print(f"Validation accuracy: {history.validation_results[-1].accuracy:.1%}")
```

Measure how novel a concept is against existing claims:

```python
from world_model.novelty import measure_against_claims

result = measure_against_claims(
    concept="Bitcoin",
    claim_texts=[
        "Traditional banking provides security",
        "Trust in institutions is necessary",
    ],
)

print(f"Termination: {result.termination}")  # CONTRADICTS_ROOT
print(f"Composite novelty: {result.composite:.3f}")
```

Route symbols through bounded attention sequences:

```python
from world_model.attention import Sequence, Symbol, NoveltyProcess

# Bounded buffers force prioritization (like working memory)
conscious = Sequence("conscious", capacity=7, min_value=0.5)
working = Sequence("working", capacity=20, min_value=0.3)

# Novel items auto-promote from working memory to conscious attention
proc = NoveltyProcess("filter", inputs=[working], outputs=[conscious])
proc.start()

working.publish(Symbol(data="something novel", value=0.6))
```

Wire novelty and attention into the arena:

```python
from world_model.integration import AttentionBridge

bridge = AttentionBridge(
    agent_set=model.agents,
    observation_store=model.observations,
)

event = bridge.process("New information about the person")
print(f"Novelty: {event.novelty_score:.2f}")
print(f"Promoted: {event.was_promoted}")
print(f"Dominant tendency: {event.dominant_tendency}")
```

```
world_model/
├── models/                      Core data structures
│   ├── observation.py           Atomic facts (~280 bytes, content-hash deduped)
│   ├── agent.py                 Tendency enum, Agent, AgentSet (allocations sum to 1)
│   └── tree.py                  Value hierarchies, PRO/CON positioning, weight propagation
│
├── dynamics/                    Adversarial competition
│   ├── arena.py                 Debate orchestration (propose → stake → resolve → reallocate)
│   └── trainer.py               ML-style training (epochs, convergence, learning rate decay)
│
├── novelty/                     Novelty measurement
│   ├── core.py                  Abstract interfaces (NoveltyProbe, ReferenceFrame)
│   ├── hybrid_probe.py          Wikidata graph + Neural NLI (recommended)
│   ├── neural_probe.py          Pure NLI, no external dependencies
│   ├── wikidata_probe.py        Wikidata graph structure only
│   ├── wikidata.py              Wikidata API integration
│   └── embeddings.py            Sentence embeddings + NLI stance detection
│
├── attention/                   Attention routing
│   ├── curves.py                Novelty → attention allocation (sigmoid curves)
│   ├── sequence.py              Bounded symbol buffers with eviction policies
│   ├── process.py               Pattern matching (repetition, convergence, loop detection)
│   ├── salience.py              Value functions (recency, keywords, novelty, allocations)
│   └── novelty_process.py       Novelty-aware routing between sequences
│
├── staking/                     Evidence staking
│   ├── staker.py                Tendency-specific evidence analysis
│   └── hierarchical_staker.py   Recursive claim decomposition
│
├── extraction/                  Observation extraction
│   ├── observation_extractor.py Text → atomic observations
│   ├── voice_extractor.py       Voice transcripts → voice profile
│   └── tweet_processor.py       Twitter data → observations
│
├── agents/                      Autonomous agents
│   └── moltbook_agent.py        Social media agent driven by world model
│
├── storage/                     Persistence
│   ├── world_model_store.py     JSON serialization
│   └── firestore_adapter.py     Google Firestore
│
└── integration.py               AttentionBridge + ArenaFeedback (wires everything together)

api/        FastAPI service
data/       Sample encoded world model
docs/       Full documentation
scripts/    CLI tools for extraction
tests/      Test suite
```
Novelty is not a one-shot score. It's the termination reason of a fetch/parse loop that explores how a concept relates to existing beliefs.
```
WHILE NOT TERMINATED:
    data    = fetch(focus)          # Query knowledge graph
    verdict = parse(data, frame)    # Evaluate against beliefs
    IF verdict.terminates:
        BREAK                       # Termination reason IS the novelty
    ELSE:
        frame = frame.absorb(data)  # Update reference frame
        focus = verdict.next_focus  # Expand search outward
```
Termination reasons:
- INTEGRATED → concept fits naturally (low novelty)
- CONTRADICTS_ROOT → opposes foundational beliefs (high novelty)
- ORTHOGONAL → no connection found despite search (high novelty)
- DISRUPTS → would restructure tendency allocations (high novelty)
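A runnable toy version of the loop makes the control flow concrete. The real probes query Wikidata or an NLI model; here `fetch` reads a tiny hard-coded graph and `parse` does naive substring matching, so every name and fact below is illustrative.

```python
GRAPH = {
    "Bitcoin": ["decentralized currency", "no central bank"],
    "decentralized currency": ["peer-to-peer value transfer"],
}

BELIEFS = {"Traditional banking provides security"}  # reference frame

def fetch(focus):
    """Query the toy 'knowledge graph' for facts about the current focus."""
    return GRAPH.get(focus, [])

def parse(facts, frame):
    """Evaluate fetched facts against the reference frame."""
    for fact in facts:
        if "no central bank" in fact and any("banking" in b for b in frame):
            return "CONTRADICTS_ROOT", None  # conflicts with a foundational belief
    return None, (facts[0] if facts else None)

def novelty_loop(concept, frame, max_iters=5):
    focus = concept
    for _ in range(max_iters):
        facts = fetch(focus)
        termination, next_focus = parse(facts, frame)
        if termination:
            return termination        # termination reason IS the novelty
        if next_focus is None:
            return "ORTHOGONAL"       # no connection found despite search
        frame = frame | set(facts)    # absorb data into the reference frame
        focus = next_focus            # expand the search outward
    return "INTEGRATED"               # search exhausted without conflict

print(novelty_loop("Bitcoin", BELIEFS))  # CONTRADICTS_ROOT
```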
Four dimensions (combined via geometric mean):
| Dimension | Measures |
|---|---|
| Integration Resistance | How many iterations before the loop terminates |
| Contradiction Depth | How foundational the conflicting belief is |
| Coverage Gap | Fraction of worldview untouched by the concept |
| Allocation Disruption | How much tendency priorities would shift |
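A sketch of the composition step, assuming each dimension is normalized to [0, 1]. Unlike an arithmetic mean, the geometric mean drives the composite toward zero when any single dimension is near zero, so a concept must register on all four axes to score as highly novel.

```python
from math import prod

def composite_novelty(integration_resistance: float,
                      contradiction_depth: float,
                      coverage_gap: float,
                      allocation_disruption: float) -> float:
    """Geometric mean of the four novelty dimensions."""
    dims = [integration_resistance, contradiction_depth,
            coverage_gap, allocation_disruption]
    return prod(dims) ** (1 / len(dims))

print(f"{composite_novelty(0.9, 0.8, 0.7, 0.6):.3f}")  # 0.742
print(f"{composite_novelty(0.9, 0.9, 0.9, 0.0):.3f}")  # 0.000: one dead axis zeroes it
```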
Attention is modeled as cascading symbol streams with finite capacity:
```
                 ┌─────────────┐
Input ─────────► │  Sequence   │ ◄───── Salience
                 │  (bounded)  │        Function
                 └──────┬──────┘
                        │
           ┌────────────┼────────────┐
           ▼            ▼            ▼
      ┌─────────┐  ┌─────────┐  ┌─────────┐
      │ Process │  │ Process │  │ Process │
      └────┬────┘  └────┬────┘  └────┬────┘
           │            │            │
           └────────────┼────────────┘
                        ▼
                 ┌─────────────┐
                 │  Sequence   │
                 │  (output)   │
                 └─────────────┘
```
- Sequences are bounded buffers (like Miller's 7±2). Overflow forces eviction: low-value items are dropped.
- Processes watch sequences for patterns: repetition, convergence across sources, loops.
- Salience functions assign value: recency decay, keyword matching, novelty scores, tendency allocations.
- Novelty captures attention: High novelty shifts allocation toward CURIOSITY via sigmoid curves, configurable per personality (explorer, balanced, conservative).
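A hedged sketch of the last point: a logistic curve gates how much allocation moves toward CURIOSITY. The midpoint, steepness, and cap below are invented for illustration (`attention/curves.py` holds the real curves and per-personality presets), and which tendency funds the shift is a modeling choice.

```python
from math import exp

def curiosity_shift(novelty: float, midpoint: float = 0.5,
                    steepness: float = 10.0, max_shift: float = 0.15) -> float:
    """Sigmoid gate: low novelty shifts almost nothing, high novelty near max_shift."""
    gate = 1.0 / (1.0 + exp(-steepness * (novelty - midpoint)))
    return max_shift * gate

allocations = {"CURIOSITY": 0.10, "COMFORT": 0.18}

shift = curiosity_shift(novelty=0.9)  # a very novel input
allocations["CURIOSITY"] += shift
allocations["COMFORT"] -= shift       # budget is fixed, so another tendency pays

print({k: round(v, 3) for k, v in allocations.items()})
```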
First training run on 165 observations:
| Metric | Value |
|---|---|
| Accuracy | 27.3% |
| Baseline | 7.1% (random chance) |
| P-value | 0.001 |
The model predicts which tendency "owns" an observation 4x better than chance.
This architecture bets that:
- Binary distinction + recursion = sufficient to model meaning
- Seven tendencies = comprehensive but tractable agent set
- Competition = produces coherent personality from plurality
- Novelty is relative = same information, different surprise depending on who's hearing it
- Attention is finite = bounded buffers produce intelligent filtering
- Same structure = applies to anyone (swap observations + allocations)
If the bet pays off, this is a general architecture for modeling minds.
```shell
# Core (no external dependencies)
pip install -e .

# With novelty measurement (requires ML models)
pip install -e ".[novelty]"

# With API server
pip install -e ".[api]"

# Everything
pip install -e ".[all]"
```

Requires Python 3.11+.
- Architecture: System design and components
- Core Concepts: Key ideas and terminology
- Training Guide: ML-style training with convergence detection
- API Reference: Module documentation
- Novelty Theory: Theoretical foundation for novelty measurement
- Novelty Formalization: Mathematical specification
- Wikidata Integration: Knowledge graph specifics
- Attention Mechanisms: Salience, routing, and novelty-modulated allocation
- Future: Digital Twin: Vision for complete embodiment
This system is infrastructure for After Me: trustless estate planning with posthumous digital continuity.
The world model captures the mind (values and decision-making). A complete digital twin adds the body (voice, face, mannerisms). Together they create a digital double that:
- Reasons from your values (world model)
- Looks and sounds like you (embodiment model, future)
- Carries cryptographic attestation via embedded weight hashes
This inverts the deepfake problem: instead of detecting fakes, you prove authenticity.
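A minimal sketch of that attestation idea, assuming the model's state serializes deterministically; the field names and the use of SHA-256 here are illustrative, not the project's actual scheme.

```python
import hashlib
import json

# Publish (or sign) the digest of the canonical serialized state
state = {"person": "Person", "allocations": {"SURVIVAL": 0.18, "CONNECTION": 0.20}}
blob = json.dumps(state, sort_keys=True).encode()
digest = hashlib.sha256(blob).hexdigest()

def verify(candidate_state: dict, expected_digest: str) -> bool:
    """An authentic copy reproduces the published digest; a tampered one cannot."""
    candidate = json.dumps(candidate_state, sort_keys=True).encode()
    return hashlib.sha256(candidate).hexdigest() == expected_digest

print(verify(state, digest))                # True
print(verify({"allocations": {}}, digest))  # False
```

This is the inversion described above: verification proves a copy authentic rather than trying to detect fakes.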
MIT