General Conference PDF Generator

A Python web scraping tool that extracts talks from Church of Jesus Christ General Conference pages and generates formatted PDF documents.

NOTE: This tool was AI Generated and not reflective of my skills. I am a JavaScript developer not a Python developer. I am making this public in case someone else would like a PDF of confrence in a nice simple format like I do.

Features

Scrapes conference talks from churchofjesuschrist.org
Extracts speaker names, talk titles, and full content
Includes images from talks in the PDF (photos, artwork, etc.)
PDF Bookmarks/Outline - Navigate easily between sessions and talks using the PDF reader's sidebar
Generates professionally formatted PDF documents
Includes cover page and session dividers
Reusable for different conference years and sessions
Saves intermediate JSON data for debugging
Can adjust page size

Installation

Prerequisites

Python 3.7 or higher
pip (Python package manager)

Quick Setup (Recommended)

Run the setup script to automatically create a virtual environment and install dependencies:

./setup.sh

Then activate the virtual environment:

source venv/bin/activate

Manual Setup

If you prefer to set up manually:

# Create virtual environment
python3 -m venv venv

# Activate virtual environment
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

Or install packages directly:

pip install reportlab Pillow

Usage

Important: Make sure to activate the virtual environment before running the scripts:

source venv/bin/activate

Quick Start

Generate a PDF from a conference URL (note the quotes around the URL):

python generate_conference_pdf.py "https://www.churchofjesuschrist.org/study/general-conference/2025/04?lang=eng"

Important: Always wrap the URL in quotes to prevent shell expansion issues.

This will create a PDF file named 2025_April.pdf in the Output directory with automatic bookmarks for easy navigation.

💡 New! See QUICK_START_BOOKMARKS.md to learn about the bookmark navigation feature.

Specify Output Filename

python generate_conference_pdf.py "https://www.churchofjesuschrist.org/study/general-conference/2025/04?lang=eng" my_conference.pdf

Advanced Usage

Scrape Only (No PDF)

To scrape conference data and save as JSON without generating a PDF:

python conference_scraper.py "https://www.churchofjesuschrist.org/study/general-conference/2025/04?lang=eng"

This creates a JSON file with all scraped data.

Generate PDF from Existing JSON

If you already have scraped data in JSON format:

python pdf_generator.py conference_data.json output.pdf

File Structure

.
├── generate_conference_pdf.py  # Main script (scrape + generate PDF)
├── conference_scraper.py       # Web scraping module
├── pdf_generator.py            # PDF generation module
├── requirements.txt            # Python dependencies
├── README.md                   # This file
└── example/
    └── 2025_April.pdf         # Example output

How It Works

Web Scraping: The script uses the Church's API to fetch conference data
- Retrieves the main conference page
- Extracts links to individual talks
- Fetches full content for each talk
- Parses HTML to extract clean text and images
- Downloads images from the Church's servers
PDF Generation: Creates a formatted PDF using ReportLab
- Session divider pages for each conference session
- Individual pages for each talk with speaker and title
- Images embedded at their proper locations within talks
- Image captions with titles/descriptions
- Professional formatting and styling

Examples

April 2025 General Conference

python generate_conference_pdf.py "https://www.churchofjesuschrist.org/study/general-conference/2025/04?lang=eng"

October 2024 General Conference

python generate_conference_pdf.py "https://www.churchofjesuschrist.org/study/general-conference/2024/10?lang=eng"

Output

The script generates two files:

PDF File: Formatted document with all talks
- Example: 2025_April.pdf
JSON Data File: Raw scraped data (for debugging)
- Example: 2025_April_data.json

Troubleshooting

Import Errors

If you get import errors, make sure all dependencies are installed:

pip install --upgrade reportlab

Network Errors

If scraping fails, check your internet connection and verify the conference URL is correct.

PDF Generation Issues

If PDF generation fails, check that you have write permissions in the output directory.

Customization

Modify PDF Styling

Edit pdf_generator.py and modify the _setup_custom_styles() method to change:

Font sizes
Colors
Spacing
Alignment

Change Scraping Behavior

Edit conference_scraper.py to:

Filter specific types of talks
Extract additional metadata
Modify text extraction logic

Notes

The script respects the Church's website structure and uses their public API
Scraping may take several minutes depending on the number of talks
Generated PDFs match the format of the example in example/2025_April.pdf

License

This tool is provided as-is for personal use. Please respect the Church's copyright and terms of service when using scraped content.

Support

For issues or questions, please check:

Python version (3.7+)
All dependencies installed
Valid conference URL format
Internet connection

Additional Documentation

Bookmarks/Outline Feature - Navigate PDFs with hierarchical bookmarks
Image Support - How images are handled in PDFs
Footnotes Feature - Footnote extraction and formatting

Version History

v1.2 (2025-10-05): PDF Bookmarks/Outline added
- Hierarchical bookmarks for sessions and talks
- Easy navigation in PDF readers
- Verification script to check bookmarks
v1.1 (2025-10-05): Image support added
- Images from talks are now included in PDFs
- Automatic image downloading and embedding
- Image captions with titles/descriptions
- Proper image positioning within talk content
v1.0 (2025-10-05): Initial release
- Web scraping functionality
- PDF generation with formatting
- Support for different conference years/sessions

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
example		example
.gitignore		.gitignore
README.md		README.md
conference_scraper.py		conference_scraper.py
generate_conference_pdf.py		generate_conference_pdf.py
pdf_generator.py		pdf_generator.py
requirements.txt		requirements.txt
setup.sh		setup.sh

Folders and files

Latest commit

History

Repository files navigation

General Conference PDF Generator

Features

Installation

Prerequisites

Quick Setup (Recommended)

Manual Setup

Usage

Quick Start

Specify Output Filename

Advanced Usage

Scrape Only (No PDF)

Generate PDF from Existing JSON

File Structure

How It Works

Examples

April 2025 General Conference

October 2024 General Conference

Output

Troubleshooting

Import Errors

Network Errors

PDF Generation Issues

Customization

Modify PDF Styling

Change Scraping Behavior

Notes

License

Support

Additional Documentation

Version History

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages