nibzard/agent-perceptions

AI Agent Survey Analysis

This repository contains the full pipeline for preparing, analyzing, and visualizing survey responses for the "Will Agents Replace Us?" preprint project. All scripts, outputs, and documentation match the final manuscript and run in a reproducible environment.

Project Structure

  • scripts/01_prepare.py: Cleans and processes the raw survey data, expands JSON answers, extracts region info, and outputs tidy data files.
  • scripts/02_explore_v2.py: Generates all exploratory figures (bar charts, grid, heatmap, mosaic) and saves summary statistics.
  • scripts/03_infer_v2.py: Performs all inferential analyses (pairwise tests, MCA, K-Modes clustering, logistic regression) and generates the corresponding result tables and manuscript figures.
  • scripts/csv_to_md.py: Utility script to convert the K-Modes cluster table CSV (results/kmodes_table1.csv) to Markdown and LaTeX tables for manuscript inclusion.
  • data/raw/: Place raw survey data files here (e.g., survey_responses_rows_20250512.csv).
  • data/: Output directory for cleaned data and extracted text.
  • manuscript/figs/: All figures for the manuscript (auto-generated).
  • results/: All result tables, JSONs, and model outputs (auto-generated).
  • manuscript/article.tex: The main LaTeX manuscript referencing all canonical outputs.
  • manuscript/article.pdf: The compiled PDF of the manuscript.
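
As an illustration of the "expands JSON answers" step attributed to scripts/01_prepare.py, here is a minimal sketch using only the standard library. The field name `answers`, the `id` field, and the question keys are assumptions made for this example; the real script may use different column names and may rely on pandas.

```python
import json


def expand_answers(rows, answers_key="answers"):
    """Flatten a JSON-encoded answers field into one key per question."""
    tidy = []
    for row in rows:
        # Copy every field except the raw JSON blob.
        flat = {k: v for k, v in row.items() if k != answers_key}
        raw = row.get(answers_key) or "{}"
        try:
            answers = json.loads(raw)
        except json.JSONDecodeError:
            answers = {}  # keep the respondent, drop the unreadable payload
        if isinstance(answers, dict):
            flat.update(answers)
        tidy.append(flat)
    return tidy


# Hypothetical rows; the real CSV layout is not shown in this README.
rows = [
    {"id": "r1", "answers": '{"Q1": "Yes", "Q2": "No"}'},
    {"id": "r2", "answers": "not json"},
]
tidy = expand_answers(rows)
```

Rows whose payload cannot be parsed are kept with their remaining fields, so the respondent count stays stable across the cleaning step.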

Setup

  1. Install dependencies (Recommended: Conda)

    This project requires Python 3.11 and the dependencies listed in environment.yml:

    conda env create -f environment.yml
    conda activate ai_survey
  2. Prepare data

    Place your raw survey CSV file in the data/raw/ directory. By default, the script expects a file named survey_responses_rows_20250512.csv.

  3. Run the preparation script

    python scripts/01_prepare.py

    This will generate:

    • data/clean_survey.parquet: Cleaned and expanded survey data.
    • data/q11.txt: Free-text responses to question 11.
  4. Run exploratory analysis

    python scripts/02_explore_v2.py

    This will generate:

    • Individual bar charts for Q1-Q10: manuscript/figs/Q1_barh.png, ..., manuscript/figs/Q10_barh.png
    • Grid of all questions: manuscript/figs/all_questions_grid_improved.png
    • Heatmap: manuscript/figs/all_questions_heatmap.png
    • Mosaic plot: manuscript/figs/Q1xQ3_mosaic_supplementary_s1.png
    • Cramér's V heatmap: manuscript/figs/cramers_v_heatmap_figure2.png
    • Proportion summary: results/all_questions_results.json
  5. Run inferential analysis

    python scripts/03_infer_v2.py

    This will generate:

    • Pairwise test results: results/pairwise_tests.csv, results/pairwise_summary.json, results/pairwise_matrices.json
    • MCA outputs: results/mca_inertia.json, results/mca_row_coordinates.csv, manuscript/figs/mca_biplot_clusters_figure3.png
    • K-Modes clustering: results/kmodes_results.json, results/kmodes_table1.csv, results/kmodes_table1.json, results/kmodes_elbow.png
    • Logistic regression: results/logit_deployment_summary.txt, results/logit_deployment_coefs.csv, results/logit_deployment_vif.csv, manuscript/figs/logit_forest_plot_figure4.png, manuscript/figs/logit_forest_plot_significant_only.png
  6. (Optional) Convert clustering results to Markdown/LaTeX

    To generate Markdown and LaTeX tables from the K-Modes clustering results for manuscript inclusion:

    python scripts/csv_to_md.py
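
For readers unfamiliar with the statistic behind the Cramér's V heatmap in step 4, the sketch below computes the uncorrected form, V = sqrt(chi² / (n · (min(r, c) − 1))), for two categorical variables. It is a standard-library illustration only; whether scripts/02_explore_v2.py uses this form, a bias-corrected variant, or a library routine is not stated here.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of categories."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    k = min(len(row_totals), len(col_totals))
    if n == 0 or k < 2:
        return 0.0  # association is undefined with a single category
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            # Expected count under independence of the two variables.
            expected = r_total * c_total / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))
```

A value near 0 indicates little association between two questions, and a value of 1 indicates that one answer fully determines the other.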

Reproducing the Manuscript

All figures and tables referenced in manuscript/article.tex are generated by the above scripts and saved in manuscript/figs/ and results/. To fully reproduce the manuscript:

  1. Run the three main scripts in order (scripts/01_prepare.py, scripts/02_explore_v2.py, scripts/03_infer_v2.py), as described in Setup.
  2. Compile manuscript/article.tex to PDF using your preferred LaTeX toolchain (e.g., pdflatex, latexmk, or an online LaTeX editor).
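
The three main scripts can also be chained from a small driver. The following is a sketch rather than part of the repository: `run_pipeline` and `PIPELINE` are hypothetical names, and it assumes the scripts resolve their paths relative to the repository root.

```python
import subprocess
import sys
from pathlib import Path

# Script order taken from the Setup section.
PIPELINE = [
    "scripts/01_prepare.py",
    "scripts/02_explore_v2.py",
    "scripts/03_infer_v2.py",
]


def run_pipeline(root=".", runner=subprocess.run):
    """Run each pipeline script in order, stopping at the first failure."""
    root = Path(root)
    for script in PIPELINE:
        # check=True raises CalledProcessError on a non-zero exit code,
        # so later scripts never run on stale or missing inputs.
        runner([sys.executable, script], cwd=root, check=True)
    return list(PIPELINE)
```

Calling `run_pipeline()` from the repository root inside the activated environment runs all three steps. The `runner` parameter exists only so the driver can be exercised without executing the scripts.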

Rendering the Manuscript as PDF

To generate a PDF of the manuscript, run the following command from the manuscript/ directory (after all scripts have been run):

pdflatex article.tex
  • The output will be saved as manuscript/article.pdf.
  • This works both on your local machine and inside the Docker container.
  • Note: Rendering to PDF requires a working LaTeX installation with the tabularx and booktabs packages. If you encounter errors like Environment tabularx undefined, install a full TeX distribution (e.g., TeX Live, MacTeX, or TinyTeX). See Overleaf's documentation for details and troubleshooting.
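
Because pdflatex fails when a referenced figure is missing, a quick pre-flight check can save a compile cycle. The sketch below lists expected files that are absent. The file names come from the output lists in Setup, but which of them manuscript/article.tex actually includes is an assumption, and `missing_outputs` is a hypothetical helper.

```python
from pathlib import Path

# Figure paths named in the Setup section; adjust to match article.tex.
EXPECTED_FIGURES = [
    "manuscript/figs/cramers_v_heatmap_figure2.png",
    "manuscript/figs/mca_biplot_clusters_figure3.png",
    "manuscript/figs/logit_forest_plot_figure4.png",
    "manuscript/figs/Q1xQ3_mosaic_supplementary_s1.png",
]


def missing_outputs(root=".", expected=EXPECTED_FIGURES):
    """Return the expected files that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]
```

An empty list from `missing_outputs()` suggests the analysis scripts have produced the figures and the manuscript is ready to compile.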

Limitations & Notes

  • Region Extraction: The region is extracted from timezone metadata and is a coarse proxy (may be inaccurate due to VPNs, travel, or shared infrastructure). See manuscript Methods/Limitations for discussion.
  • Hardcoded Cleaning: Merging of Q3 and Q5 categories is done via string matching in scripts/01_prepare.py. This is brittle: any change to the survey wording will break the matching. It is acceptable for this analysis, but future work should use a config file or reference questions.json.
  • Qualitative Analysis (Q11): Only a manual thematic summary is performed for Q11 (n=11). No further scripting or automation is included.
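
The first two limitations can be made concrete with a short sketch. It shows one coarse way to derive a region from an IANA timezone name, and a mapping-driven alternative to hardcoded string matching. Both functions, the `MERGE_MAP` structure, and the category labels are hypothetical; the logic in scripts/01_prepare.py may differ.

```python
def region_from_timezone(tz):
    """Return the top-level area of an IANA timezone name, e.g. 'Europe'."""
    if not tz or "/" not in tz:
        return "Unknown"  # covers missing values and bare names such as 'UTC'
    return tz.split("/", 1)[0]


# Hypothetical mapping; the real Q3/Q5 labels come from the survey itself
# and could be loaded from a config file instead of being defined here.
MERGE_MAP = {
    "Q3": {"Option A (old wording)": "Option A"},
}


def merge_categories(row, merge_map=MERGE_MAP):
    """Replace answers listed in merge_map; leave all other answers as-is."""
    merged = dict(row)
    for question, mapping in merge_map.items():
        if question in merged:
            merged[question] = mapping.get(merged[question], merged[question])
    return merged
```

Keeping the merge rules in data rather than in code means a wording change in the survey only requires updating the mapping. The timezone prefix remains a coarse proxy, for example "America" spans both North and South America.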

Using Docker

  1. Build the Docker image
    docker build -t ai-agent-survey .
  2. Run the container
    docker run -it --rm -v "$PWD":/workspace ai-agent-survey
  3. Run the scripts as above inside the container.

Code and Workflow Notes

  • All code is implemented as Python scripts (.py), not Jupyter notebooks, in accordance with project rules.
  • The project workflow follows: specs review → data review → todo.md planning → script writing → manuscript writing. Each phase is logged for traceability.

License

  • Code & Data: All Rights Reserved
