vesteinn/bibtex_cleanup

BibTeX Cleanup Tool

An interactive tool for cleaning and formatting BibTeX entries according to the guidelines listed under Guidelines Followed.

Features

  • Interactive Processing: Review and approve each change before applying
  • Citation Tracking: Scans TeX files to identify which BibTeX entries are actually used
  • Smart Formatting: Automatically fixes common BibTeX issues:
    • Protects capital letters in titles (acronyms, proper nouns, etc.)
    • Fixes entry types (e.g., @misc to @article for arXiv papers)
    • Cleans up fields (removes abstracts, unnecessary publishers, etc.)
    • Standardizes arXiv entries with proper journal fields
    • Fixes page ranges to use double dashes
  • Diff Display: Shows clear before/after comparisons with color highlighting
  • Manual Editing: Option to manually edit entries in your text editor
  • Safe Operation: Never overwrites original files; saves results to a new file
  • Validation: Ensures output BibTeX is properly formatted and parseable
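
To illustrate the Citation Tracking feature, here is a minimal sketch of how cited keys might be collected from a TeX source. It is not the code in citation_scanner.py: the names `CITE_RE`, `strip_comments` and `find_cited_keys` are hypothetical, and the regular expression only covers common `\cite`-style commands with optional bracketed arguments.

```python
import re

# Matches \cite, \citep, \citet, \parencite, ... with optional [..] arguments,
# and captures the comma-separated key list inside the braces.
CITE_RE = re.compile(r"\\[A-Za-z]*cite[A-Za-z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")


def strip_comments(tex: str) -> str:
    """Drop LaTeX comments: an unescaped % up to the end of the line."""
    return re.sub(r"(?<!\\)%.*", "", tex)


def find_cited_keys(tex: str) -> set:
    """Return the set of citation keys used in a TeX source string."""
    keys = set()
    for match in CITE_RE.finditer(strip_comments(tex)):
        for key in match.group(1).split(","):
            key = key.strip()
            if key:
                keys.add(key)
    return keys


if __name__ == "__main__":
    sample = r"See \cite{smith2020, doe2019} and \citep[p.~3]{lee2021}. % \cite{unused}"
    print(sorted(find_cited_keys(sample)))
```

Entries whose keys are absent from the returned set could then be left out of the interactive review.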

Installation

  1. Clone or download this tool into the bibtex_cleanup folder
  2. Install required Python packages:
cd bibtex_cleanup
pip install -r requirements.txt

Usage

Basic usage:

python cleanup_tool.py paper.tex references.bib

Specify output file:

python cleanup_tool.py paper.tex references.bib -o cleaned_refs.bib

Alternative syntax:

python cleanup_tool.py --tex main.tex --bib refs.bib --output refs_clean.bib
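
The two invocation styles above can coexist in one parser. The following is a sketch using the standard `argparse` module, under the assumption that the positional arguments and the `--tex`/`--bib` flags are interchangeable; the function names `build_parser` and `resolve_inputs` are hypothetical and the real cleanup_tool.py may be organised differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cleanup_tool.py")
    # Positional form: cleanup_tool.py paper.tex references.bib
    parser.add_argument("tex_file", nargs="?", help="TeX file to scan for citations")
    parser.add_argument("bib_file", nargs="?", help="BibTeX file to clean")
    # Flag form: cleanup_tool.py --tex main.tex --bib refs.bib
    parser.add_argument("--tex", dest="tex_flag", help="TeX file (alternative syntax)")
    parser.add_argument("--bib", dest="bib_flag", help="BibTeX file (alternative syntax)")
    parser.add_argument("-o", "--output", help="path of the cleaned output file")
    return parser


def resolve_inputs(argv):
    """Return (tex_path, bib_path, output_path) from either syntax."""
    parser = build_parser()
    args = parser.parse_args(argv)
    tex = args.tex_flag or args.tex_file
    bib = args.bib_flag or args.bib_file
    if not tex or not bib:
        parser.error("both a TeX file and a BibTeX file are required")
    return tex, bib, args.output


if __name__ == "__main__":
    print(resolve_inputs(["paper.tex", "references.bib", "-o", "cleaned_refs.bib"]))
```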

Interactive Commands

When processing each entry, you can:

  • [a] Accept changes - Apply the suggested formatting
  • [s] Skip entry - Keep the original formatting
  • [e] Edit manually - Open in text editor for custom changes
  • [d] Show detailed diff - Display the differences again
  • [v] View side-by-side - Show original and formatted versions
  • [q] Quit - Save progress and exit
  • [x] Exit - Exit without saving
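
As a rough illustration of how these keys might drive the review loop, the sketch below maps each key to an action and distinguishes quitting with a save from exiting without one. The `Action` enum and the `parse_command` and `review` functions are hypothetical names, not taken from interactive_cli.py, and the edit, diff and view actions are reduced to a re-prompt.

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "a"
    SKIP = "s"
    EDIT = "e"
    DIFF = "d"
    VIEW = "v"
    QUIT = "q"   # save progress and exit
    EXIT = "x"   # exit without saving


def parse_command(raw: str):
    """Return the Action for a typed key, or None if it is not recognised."""
    try:
        return Action(raw.strip().lower())
    except ValueError:
        return None


def review(entries, read_key):
    """Walk (original, suggested) pairs and return (kept_entries, should_save)."""
    kept = []
    for original, suggested in entries:
        while True:
            action = parse_command(read_key())
            if action is Action.ACCEPT:
                kept.append(suggested)
                break
            if action is Action.SKIP:
                kept.append(original)
                break
            if action is Action.QUIT:
                return kept, True
            if action is Action.EXIT:
                return kept, False
            # EDIT, DIFF, VIEW and unknown keys simply re-prompt in this sketch.
    return kept, True
```

Passing `read_key` as a parameter (for example `lambda: input("> ")`) keeps the loop testable with scripted input.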

Guidelines Followed

The tool follows these BibTeX formatting guidelines:

  1. Entry Types:

    • Uses @article for journal papers and arXiv preprints
    • Uses @inproceedings for conference papers
    • Uses @book for books
  2. Capitalization:

    • Protects acronyms (NASA, IEEE, etc.) with braces
    • Protects proper nouns and CamelCase words
    • Protects capitals after colons and periods
  3. Field Cleaning:

    • Removes abstract fields for readability
    • Removes publisher field from articles
    • Removes location information from booktitle/journal fields
    • Ensures page ranges use double dashes (--)
  4. arXiv Entries:

    • Converts @misc to @article
    • Adds journal field: "arXiv preprint arXiv:XXXX.XXXXX"
    • Ensures URL field is present
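
A few of the guidelines above can be sketched as small functions. This is an illustration under stated assumptions, not the rules in bibtex_formatter.py: entries are represented as plain dicts with a `type` key, all function names are hypothetical, the `https://arxiv.org/abs/` URL pattern is an assumed default, and capital protection only covers words with two or more capitals (acronyms and CamelCase), not proper nouns or capitals after colons.

```python
import re

WORD_RE = re.compile(r"\{[^{}]*\}|[A-Za-z0-9]+")
ARXIV_ID_RE = re.compile(r"\d{4}\.\d{4,5}")


def fix_page_range(pages: str) -> str:
    """Replace any run of hyphens or en/em dashes with a double dash."""
    return re.sub(r"\s*[-\u2013\u2014]+\s*", "--", pages.strip())


def protect_capitals(title: str) -> str:
    """Wrap words containing two or more capitals in braces."""
    def repl(match):
        word = match.group(0)
        if word.startswith("{"):
            return word  # already protected
        if sum(ch.isupper() for ch in word) >= 2:
            return "{" + word + "}"
        return word
    return WORD_RE.sub(repl, title)


def standardize_arxiv(entry: dict) -> dict:
    """Return a copy of an arXiv entry converted to the @article convention."""
    entry = dict(entry)  # leave the caller's dict untouched
    blob = " ".join(
        str(entry.get(field, ""))
        for field in ("eprint", "url", "journal", "archiveprefix")
    )
    match = ARXIV_ID_RE.search(blob)
    if "arxiv" not in blob.lower() or match is None:
        return entry  # not recognisably an arXiv entry
    arxiv_id = match.group(0)
    if entry.get("type") == "misc":
        entry["type"] = "article"
    entry["journal"] = f"arXiv preprint arXiv:{arxiv_id}"
    entry.setdefault("url", f"https://arxiv.org/abs/{arxiv_id}")
    return entry


if __name__ == "__main__":
    print(fix_page_range("4171-4186"))
    print(protect_capitals("BERT: Pre-training of Deep Bidirectional Transformers"))
    print(standardize_arxiv({"type": "misc", "eprint": "1810.04805", "archiveprefix": "arXiv"}))
```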

Files

  • cleanup_tool.py - Main script
  • bibtex_parser.py - BibTeX parsing module
  • citation_scanner.py - TeX file citation scanner
  • bibtex_formatter.py - Formatting rules implementation
  • interactive_cli.py - User interface components
  • requirements.txt - Python dependencies
  • README.md - This file

Notes

  • The tool only processes entries that are actually cited in your TeX file
  • Original files are never modified; results are saved to a new file
  • The output file is validated to ensure proper BibTeX formatting
  • You can interrupt the process at any time and save partial progress

Requirements

  • Python 3.6+
  • colorama (for colored terminal output)
  • bibtexparser (optional, for enhanced parsing)
