A lightweight CLI tool for running batches of prompts through OpenAI's GPT models using Python. This repo contains three versions of a script that:
- Reads prompts from a file (`prompts.txt`)
- Sends them one by one to the OpenAI API
- Logs each response
- Estimates token usage and cost
- Optionally saves results to `.txt` and `.csv` files
Before running, create a `.env` file in the project folder:

```
OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```
Make sure to:
- Use your own OpenAI key
- Never commit `.env` to public repos
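To sanity-check your setup, the core pattern the scripts rely on looks roughly like this. This is a minimal sketch using the `openai` v1 client and `python-dotenv` (both listed under dependencies below); the model name and `max_tokens` value are illustrative, not taken from the scripts.

```python
import os

from dotenv import load_dotenv
from openai import OpenAI

# Load OPENAI_API_KEY from the local .env file into the environment.
load_dotenv()

# The client can also pick up OPENAI_API_KEY from the environment automatically.
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

response = client.chat.completions.create(
    model="gpt-3.5-turbo",   # illustrative; v2 uses a GPT-4 Turbo model
    messages=[{"role": "user", "content": "Say hello in three languages."}],
    max_tokens=150,          # v1's token limit
)

print(response.choices[0].message.content)
print("Tokens used:", response.usage.total_tokens)
```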
📂 Script Versions

batch_from_file.py
- Model: GPT-3.5 Turbo
- Token Limit: 150
- Output: Individual `.txt` files
- No CSV logging
- Simplest version – great for quick testing.
batch_from_file_v2.py
- Model: GPT-4 Turbo
- Token Limit: 600
- Output: `.txt` file per prompt with usage summary
- Central CSV log (`summary_log.csv`) with tokens, cost, and truncation info
- Added rate limit protection
batch_from_file_v3.py
- Model: GPT-3.5 Turbo
- Token Limit: 1000
- Output: `.txt` file per response
- CSV log (`summary_log_v3.csv`)
- Includes cost calculation per prompt
- Detects and logs truncation
- Clean terminal summaries
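The rate-limit protection, per-prompt cost calculation, and truncation detection mentioned above can be sketched roughly as follows. The backoff logic, the per-1K-token prices, and the helper name `ask_with_retry` are illustrative assumptions, not code lifted from the scripts; check current OpenAI pricing before relying on the numbers.

```python
import time

from openai import OpenAI, RateLimitError

client = OpenAI()

# Illustrative GPT-3.5 Turbo prices in USD per 1K tokens -- verify against current pricing.
PRICE_PER_1K_INPUT = 0.0005
PRICE_PER_1K_OUTPUT = 0.0015


def ask_with_retry(prompt: str, max_tokens: int = 1000, retries: int = 3):
    """Send one prompt, backing off and retrying when the API rate-limits us."""
    for attempt in range(retries):
        try:
            return client.chat.completions.create(
                model="gpt-3.5-turbo",
                messages=[{"role": "user", "content": prompt}],
                max_tokens=max_tokens,
            )
        except RateLimitError:
            time.sleep(2 ** attempt)  # simple exponential backoff
    raise RuntimeError("Rate limit retries exhausted")


response = ask_with_retry("Summarize the plot of Hamlet.")
usage = response.usage

# Cost estimate for this single prompt.
cost = (usage.prompt_tokens / 1000) * PRICE_PER_1K_INPUT + \
       (usage.completion_tokens / 1000) * PRICE_PER_1K_OUTPUT

# finish_reason == "length" means the response hit the token limit (truncated).
truncated = response.choices[0].finish_reason == "length"
print(f"cost=${cost:.6f}  truncated={truncated}")
```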
📊 Helper Scripts
This repo includes two report builders that generate visual summaries from OpenAI batch CSV logs.
report_builder.py

Purpose:
Generate a lightweight summary report (HTML or Markdown) from a GPT batch CSV.
Requires standard columns: `prompt`, `response`, `prompt_tokens`, `completion_tokens`, `total_tokens`, `cost`
Features:
- Summary statistics (min, mean, median, max)
- Optional charts:
  - Scatter: Total Tokens vs. Cost
  - Histogram: Response Length
  - Bar chart: Top-N costly prompts
- Exports either:
  - HTML with embedded Base64 images
  - Markdown with PNG charts saved to disk
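For reference, embedding a chart as a Base64 image inside a single HTML file can be done with `matplotlib` like this. This is a sketch of the general technique, not the actual `report_builder.py` implementation.

```python
import base64
import io

import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("results/summary_log_v3.csv")

# Scatter: total tokens vs. cost.
fig, ax = plt.subplots()
ax.scatter(df["total_tokens"], df["cost"])
ax.set_xlabel("Total tokens")
ax.set_ylabel("Cost (USD)")

# Encode the figure as Base64 so it can be inlined in a self-contained HTML file.
buf = io.BytesIO()
fig.savefig(buf, format="png")
encoded = base64.b64encode(buf.getvalue()).decode("ascii")
html = f'<img src="data:image/png;base64,{encoded}" alt="Tokens vs. cost">'

with open("batch_report_snippet.html", "w") as fh:
    fh.write(html)
```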
Example Commands:

```
# Generate HTML report
python report_builder.py -i results/summary_log_v3.csv -f html -o batch_report.html

# Generate Markdown report with PNGs
python report_builder.py -i results/summary_log_v3.csv -f md -o batch_report.md --top-n 10
```

report_builder_v2.py

Purpose:
An advanced version with extra metrics, filtering, and comparison features.
Key Features:
- Cost-per-token, prompt/completion ratios
- Correlation matrix + heatmap
- Percentile tables
- Before/after comparison (e.g., for optimization)
- Near-duplicate prompt detection
- Flexible CLI filters: `--min-cost`, `--keyword`, `--opt-flag-col`
- Multiple chart types via `--charts`
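Near-duplicate prompt detection can be approximated with Python's standard `difflib`; the sketch below illustrates the idea only (the 0.9 threshold is an arbitrary assumption, not the script's setting). The pairwise comparison is quadratic in the number of prompts, which is fine for small batch logs.

```python
from difflib import SequenceMatcher

import pandas as pd

df = pd.read_csv("results/summary_log_v3.csv")
prompts = df["prompt"].astype(str).tolist()

# Flag pairs whose similarity ratio exceeds a threshold (0.9 is an arbitrary choice).
THRESHOLD = 0.9
for i in range(len(prompts)):
    for j in range(i + 1, len(prompts)):
        ratio = SequenceMatcher(None, prompts[i], prompts[j]).ratio()
        if ratio >= THRESHOLD:
            print(f"Prompts {i + 1} and {j + 1} look similar (ratio={ratio:.2f})")
```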
Example Commands:

```
# Full-feature HTML report
python report_builder_v2.py -i results/summary_log_v3.csv -f html -o batch_report_v2.html \
    --title "GPT-3.5 Batch Report v2" \
    --charts scatter_cost_tokens hist_response_length heatmap_corr \
    --top-n 5 --min-cost 0.0002 --keyword "error" --opt-flag-col optimized

# Markdown version (lightweight)
python report_builder_v2.py -i results/summary_log_v3.csv -f md -o batch_report_v2.md \
    --charts scatter_cost_tokens hist_response_length
```

✅ Use report_builder.py for quick insights
🧠 Use report_builder_v2.py for advanced analysis
The repo also includes two CSV analyzers: one tailored to the OpenAI batch logs and one for arbitrary CSV files.
csv_analyzer.py

Purpose:
Analyzes `results/summary_log_v3.csv` generated by `batch_from_file_v3.py`.
Outputs token + cost stats, and saves a detailed report.
What it reports:
- Total & average prompt, completion, and total tokens
- Total & average cost
- Cost per token / 1000 tokens / 100 prompts
- Truncated responses (with Prompt #s)
How to use:

```
python csv_analyzer.py
```

- Input: `results/summary_log_v3.csv` (must exist)
- Output: `results/summary_report.txt`
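If you want to sanity-check the report, the same kinds of numbers can be reproduced in a few lines of pandas. The `truncated` column name below is an assumption; adjust it to match the actual log headers.

```python
import pandas as pd

df = pd.read_csv("results/summary_log_v3.csv")

print("Prompts:               ", len(df))
print("Total tokens:          ", df["total_tokens"].sum())
print("Average total tokens:  ", round(df["total_tokens"].mean(), 1))
print("Total cost ($):        ", round(df["cost"].sum(), 4))
print("Cost per 1K tokens ($):",
      round(df["cost"].sum() / df["total_tokens"].sum() * 1000, 4))

# The truncation column name is an assumption -- change it to match your log.
if "truncated" in df.columns:
    flagged = df.index[df["truncated"] == True].tolist()
    print("Truncated prompt #s:   ", [i + 1 for i in flagged])
```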
generic_csv_analyzer.py

Purpose:
Works with any CSV file. Gives numeric stats (mean, std, percentiles) and categorical summaries (most common values, unique counts).
What it reports:
- Row & column count
- Per-column stats for numbers and strings
- Optional: Save report to a `.txt` file
How to use:

```
# Just print to console:
python generic_csv_analyzer.py path/to/your_file.csv

# Save report to a text file:
python generic_csv_analyzer.py path/to/your_file.csv --output reports/summary.txt
```

Dependencies:

```
pip install pandas
```

✅ Tip: Use csv_analyzer.py for OpenAI prompt logs
📂 Use generic_csv_analyzer.py for any custom CSV
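Under the hood, this style of generic summary is close to what pandas provides out of the box, as in this sketch (not the actual script):

```python
import pandas as pd

df = pd.read_csv("path/to/your_file.csv")
print(f"{len(df)} rows x {len(df.columns)} columns")

# Numeric columns: mean, std, percentiles, min/max.
print(df.describe())

# String/categorical columns: unique counts and most common values.
for col in df.select_dtypes(include="object").columns:
    print(col, "-", df[col].nunique(), "unique values")
    print(df[col].value_counts().head(3))
```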
📜 prompts.txt Format
Create a file named `prompts.txt` with one prompt per line. For example:

```
prompt1
prompt2
prompt3
prompt4
...
promptx
```
Avoid blank lines at the end.
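Trailing blank lines would otherwise turn into empty API calls, so a defensive way to load the file is to strip and skip blanks, for example:

```python
# Read prompts.txt, dropping blank lines and surrounding whitespace.
with open("prompts.txt", encoding="utf-8") as fh:
    prompts = [line.strip() for line in fh if line.strip()]

print(f"Loaded {len(prompts)} prompts")
```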
▶️ How to Run
- Install dependencies:

  ```
  pip install openai python-dotenv
  ```

- Prepare files:
  - `.env` with your key
  - `prompts.txt` with your prompts

- Run a script:

  ```
  python batch_from_file.py      # basic
  python batch_from_file_v2.py   # with CSV and GPT-4
  python batch_from_file_v3.py   # GPT-3.5, 1000-token limit
  ```

- Check outputs:
  - Text files: `response_1.txt`, `response_2.txt`, ...
  - CSV logs: `summary_log.csv` or `summary_log_v3.csv`
🧾 Output Breakdown
Each script (except v1) tracks:
- Prompt and response tokens
- Truncation status
- Cost per prompt and total cost
- First 250 characters of response in CSV
- Full response saved in `.txt` files with usage details
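As an illustration, a log row like that could be appended with `csv.DictWriter` roughly as follows. The column headers and the helper name `log_row` are illustrative, not necessarily the exact ones the scripts use.

```python
import csv


def log_row(path, prompt_num, prompt, response_text, usage, cost, truncated):
    """Append one result row to the CSV log, keeping only a 250-character response preview."""
    fieldnames = ["prompt_num", "prompt", "response_preview",
                  "prompt_tokens", "completion_tokens", "total_tokens",
                  "cost", "truncated"]
    with open(path, "a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        if fh.tell() == 0:  # write the header only for a new/empty file
            writer.writeheader()
        writer.writerow({
            "prompt_num": prompt_num,
            "prompt": prompt,
            "response_preview": response_text[:250],   # first 250 characters only
            "prompt_tokens": usage.prompt_tokens,      # usage = response.usage
            "completion_tokens": usage.completion_tokens,
            "total_tokens": usage.total_tokens,
            "cost": round(cost, 6),
            "truncated": truncated,
        })
```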
🤖 Model Comparison Summary
This table summarizes the tradeoff between cost per prompt and average response quality:
| Model | Avg. Cost per Prompt | Avg. Quality Score (out of 20) |
|---|---|---|
| GPT-4 Turbo | $0.0184 | 18.0 |
| GPT-3.5 Turbo | $0.0006 | 16.0 |
- GPT-3.5 Turbo is cost-efficient and fast.
- GPT-4 Turbo is more powerful and accurate but more expensive.
Clone the repo and start with `batch_from_file.py` to get familiar with the workflow.
Then explore v2 and v3 for more advanced logging and flexibility. Use a cheaper model when testing, and limit the token count of responses. Never share your API key or commit your `.env` file publicly.
This project is intended for educational or research use. API usage costs are your responsibility.
