An intelligent tool that uses OpenAI's GPT-5 to forge comprehensive summaries of ebooks in multiple formats.
Repository: [email protected]:profullstack/summary-forge-module.git
- Multiple Input Formats: Supports PDF, EPUB files, and web page URLs
- Web Page Summarization: Fetch and summarize any web page with automatic content extraction
- AI-Powered Summaries: Uses GPT-5 with direct PDF upload for better quality
- Vision API: Preserves formatting, tables, diagrams, and images from PDFs
- Intelligent Chunking: Automatically processes large PDFs (500+ pages) without truncation
- Directory Protection: Prompts before overwriting existing summaries (use --force to skip)
- Multiple Output Formats: Creates Markdown, PDF, EPUB, plain text, and MP3 audio summaries
- Printable Flashcards: Generates double-sided flashcard PDFs for studying
- Flashcard Images: Individual PNG images for web app integration (q-001.png, a-001.png, etc.)
- Natural Audio Narration: AI-generated conversational audio script for better listening
- Bundled Output: Packages everything into a convenient .tgz archive
- Auto-Conversion: Automatically converts EPUB to PDF using Calibre
- Book Search: Search Amazon by title using Rainforest API
- Auto-Download: Downloads books from Anna's Archive with CAPTCHA solving
- CLI & Module: Use as a command-line tool or import as an ESM module
- Interactive Mode: Guided workflow with inquirer prompts
- EPUB Priority: Automatically prefers EPUB format (open standard, more flexible)
# Global install (CLI)
pnpm install -g @profullstack/summary-forge-module
# Local install (module)
pnpm add @profullstack/summary-forge-module
- Node.js v20 or newer
- Calibre (for EPUB conversion; provides the ebook-convert command)
  # macOS
  brew install calibre
  # Ubuntu/Debian
  sudo apt-get install calibre
  # Arch Linux
  sudo pacman -S calibre
- Pandoc (for document conversion)
  # macOS
  brew install pandoc
  # Ubuntu/Debian
  sudo apt-get install pandoc
  # Arch Linux
  sudo pacman -S pandoc
- XeLaTeX (for PDF generation)
  # macOS
  brew install --cask mactex
  # Ubuntu/Debian
  sudo apt-get install texlive-xetex
  # Arch Linux
  sudo pacman -S texlive-core texlive-xetex
Before using the CLI, configure your API keys:
summary setup
This interactive command will prompt you for:
- OpenAI API Key (required)
- Rainforest API Key (optional - for Amazon book search)
- ElevenLabs API Key (optional - for audio generation, get a key at https://try.elevenlabs.io/oh7kgotrpjnv)
- 2Captcha API Key (optional - for CAPTCHA solving, sign up at https://2captcha.com/?from=9630996)
- Browserless API Key (optional)
- Browser and proxy settings
Configuration is saved to ~/.config/summary-forge/settings.json and used automatically by all CLI commands.
# View current configuration
summary config
# Update configuration
summary setup
# Delete configuration
summary config --delete
Note: The CLI will use configuration in this priority order:
- Environment variables (.env file)
- Configuration file (~/.config/summary-forge/settings.json)
summary interactive
# or
summary i
This launches an interactive menu where you can:
- Process local files (PDF/EPUB)
- Process web page URLs
- Search for books by title
- Look up books by ISBN/ASIN
summary file /path/to/book.pdf
summary file /path/to/book.epub
# Force overwrite if directory already exists
summary file /path/to/book.pdf --force
summary file /path/to/book.pdf -f
summary url https://example.com/article
summary url https://blog.example.com/post/123
# Force overwrite if directory already exists
summary url https://example.com/article --force
summary url https://example.com/article -f
Features:
- Automatically fetches web page content using Puppeteer
- Sanitizes HTML to remove navigation, ads, footers, and other non-content elements
- Saves web page as PDF for processing
- Generates clean title from page title or uses OpenAI to create one
- Prompts specifically optimized for web page content (ignores nav/ads/footers)
- Creates same output formats as book processing (MD, TXT, PDF, EPUB, MP3, flashcards)
# Search for books (defaults to 1lib.sk - faster, no DDoS protection)
summary search "LLM Fine Tuning"
summary search "JavaScript" --max-results 5 --extensions pdf,epub
summary search "Python" --year-from 2020 --year-to 2024
summary search "Machine Learning" --languages english --order date
# Use Anna's Archive instead (has DDoS protection, slower)
summary search "Clean Code" --source anna
summary search "Rare Book" --source anna --sources zlib,lgli
# Title search (shortcut for search command)
summary title "A Philosophy of Software Design"
summary title "Clean Code" --force # Auto-select first result
summary title "Python" --source anna # Use Anna's Archive
# ISBN lookup (defaults to 1lib.sk)
summary isbn 9780134685991
summary isbn B075HYVHWK --force # Auto-select and process
summary isbn 9780134685991 --source anna # Use Anna's Archive
# Common Options:
# --source <source> Search source: zlib (1lib.sk, default) or anna (Anna's Archive)
# -n, --max-results <number> Maximum results to display (default: 10)
# -f, --force Auto-select first result and process immediately
#
# 1lib.sk Options (--source zlib, default):
# --year-from <year> Filter by publication year from (e.g., 2020)
# --year-to <year> Filter by publication year to (e.g., 2024)
# -l, --languages <languages> Language filter, comma-separated (default: english)
# -e, --extensions <extensions> File extensions, comma-separated (case-insensitive, default: PDF)
# --content-types <types> Content types, comma-separated (default: book)
# -s, --order <order> Sort order: date (newest) or empty for relevance
# --view <view> View type: list or grid (default: list)
#
# Anna's Archive Options (--source anna):
# -f, --format <format> Filter by format: pdf, epub, pdf,epub, or all (default: pdf)
# -s, --sort <sort> Sort by: date (newest) or empty for relevance (default: '')
# -l, --language <language> Language code(s), comma-separated (e.g., en, es, fr) (default: en)
# --sources <sources> Data sources, comma-separated (default: all sources)
# Options: zlib, lgli, lgrs, and others
summary isbn B075HYVHWK
# Force overwrite if directory already exists
summary isbn B075HYVHWK --force
summary isbn B075HYVHWK -f
summary --help
summary file --help
All methods now return consistent JSON objects with the following structure:
{
success: true | false, // Indicates if operation succeeded
...data, // Method-specific data fields
error?: string, // Error message (only when success is false)
message?: string // Success message (optional)
}
This enables:
- Consistent error handling - Check the success field instead of try-catch
- REST API ready - Direct JSON responses for HTTP endpoints
- Better debugging - Rich metadata in all responses
- Type-safe - Predictable structure for TypeScript users
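As a rough illustration of the "REST API ready" point, the sketch below maps a result object of the documented shape onto an HTTP-style response. `toHttpResponse` and its `failureStatus` parameter are hypothetical names introduced here; they are not part of the module.

```javascript
// Hypothetical helper (not part of the module): turns a result of the
// documented shape { success, ...data, error?, message? } into a plain
// { status, body } pair that an HTTP handler could send back.
function toHttpResponse(result, failureStatus = 500) {
  if (result && result.success) {
    // Pass through every data field unchanged.
    return { status: 200, body: { ...result } };
  }
  return {
    status: failureStatus,
    body: {
      success: false,
      error: (result && result.error) || 'Unknown error',
    },
  };
}

const ok = toHttpResponse({ success: true, archive: './book_bundle.tgz' });
const failed = toHttpResponse({ success: false, error: 'File not found' }, 404);
console.log(ok.status, failed.status);
```

Because failures arrive as data rather than exceptions, the handler needs a single branch on `success` and no try-catch around the call.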
import { SummaryForge } from '@profullstack/summary-forge-module';
import { loadConfig } from '@profullstack/summary-forge-module/config';
// Load config from ~/.config/summary-forge/settings.json
const configResult = await loadConfig();
if (!configResult.success) {
console.error('Failed to load config:', configResult.error);
process.exit(1);
}
const forge = new SummaryForge(configResult.config);
const result = await forge.processFile('./my-book.pdf');
if (result.success) {
console.log('Summary created:', result.archive);
console.log('Files:', result.files);
console.log('Costs:', result.costs);
} else {
console.error('Processing failed:', result.error);
}
import { SummaryForge } from '@profullstack/summary-forge-module';
const forge = new SummaryForge({
// Required
openaiApiKey: 'sk-...',
// Optional API keys
rainforestApiKey: 'your-key', // For Amazon search
elevenlabsApiKey: 'sk-...', // For audio generation (get key: https://try.elevenlabs.io/oh7kgotrpjnv)
twocaptchaApiKey: 'your-key', // For CAPTCHA solving (sign up: https://2captcha.com/?from=9630996)
browserlessApiKey: 'your-key', // For browserless.io
// Processing options
maxChars: 500000, // Max chars to process
maxTokens: 20000, // Max tokens in summary
// Audio options
voiceId: '21m00Tcm4TlvDq8ikWAM', // ElevenLabs voice
voiceSettings: {
stability: 0.5,
similarity_boost: 0.75
},
// Browser options
headless: true, // Run browser in headless mode
enableProxy: false, // Enable proxy
proxyUrl: 'http://proxy.com', // Proxy URL
proxyUsername: 'user', // Proxy username
proxyPassword: 'pass', // Proxy password
proxyPoolSize: 36 // Number of proxies in pool (default: 36)
});
const result = await forge.processFile('./book.epub');
console.log('Archive:', result.archive);
const forge = new SummaryForge({
openaiApiKey: process.env.OPENAI_API_KEY,
rainforestApiKey: process.env.RAINFOREST_API_KEY
});
const searchResult = await forge.searchBookByTitle('Clean Code');
if (!searchResult.success) {
console.error('Search failed:', searchResult.error);
process.exit(1);
}
console.log(`Found ${searchResult.count} results:`);
console.log(searchResult.results.map(b => ({
title: b.title,
author: b.author,
asin: b.asin
})));
// Get download URL
const url = forge.getAnnasArchiveUrl(searchResult.results[0].asin);
console.log('Download from:', url);
const forge = new SummaryForge({
openaiApiKey: process.env.OPENAI_API_KEY,
enableProxy: true,
proxyUrl: process.env.PROXY_URL,
proxyUsername: process.env.PROXY_USERNAME,
proxyPassword: process.env.PROXY_PASSWORD
});
// Basic search
const searchResult = await forge.searchAnnasArchive('JavaScript', {
maxResults: 10,
format: 'pdf',
sortBy: 'date' // Sort by newest
});
if (!searchResult.success) {
console.error('Search failed:', searchResult.error);
process.exit(1);
}
console.log(`Found ${searchResult.count} results`);
console.log(searchResult.results.map(r => ({
title: r.title,
author: r.author,
format: r.format,
size: `${r.sizeInMB.toFixed(1)} MB`,
url: r.url
})));
// Download the first result
if (searchResult.results.length > 0) {
const md5 = searchResult.results[0].href.match(/\/md5\/([a-f0-9]+)/)[1];
const downloadResult = await forge.downloadFromAnnasArchive(md5, '.', searchResult.results[0].title);
if (downloadResult.success) {
console.log('Downloaded:', downloadResult.filepath);
console.log('Directory:', downloadResult.directory);
} else {
console.error('Download failed:', downloadResult.error);
}
}
const forge = new SummaryForge({
openaiApiKey: process.env.OPENAI_API_KEY,
enableProxy: true,
proxyUrl: process.env.PROXY_URL,
proxyUsername: process.env.PROXY_USERNAME,
proxyPassword: process.env.PROXY_PASSWORD
});
// Basic search
const searchResult = await forge.search1lib('LLM Fine Tuning', {
maxResults: 10,
yearFrom: 2020,
languages: ['english'],
extensions: ['PDF']
});
if (!searchResult.success) {
console.error('Search failed:', searchResult.error);
process.exit(1);
}
console.log(`Found ${searchResult.count} results`);
console.log(searchResult.results.map(r => ({
title: r.title,
author: r.author,
year: r.year,
extension: r.extension,
size: r.size,
language: r.language,
isbn: r.isbn,
url: r.url
})));
// Download the first result
if (searchResult.results.length > 0) {
const downloadResult = await forge.downloadFrom1lib(
searchResult.results[0].url,
'.',
searchResult.results[0].title
);
if (downloadResult.success) {
console.log('Downloaded:', downloadResult.filepath);
// Process the downloaded book
const processResult = await forge.processFile(downloadResult.filepath, downloadResult.identifier);
if (processResult.success) {
console.log('Summary created:', processResult.archive);
console.log('Costs:', processResult.costs);
} else {
console.error('Processing failed:', processResult.error);
}
} else {
console.error('Download failed:', downloadResult.error);
}
}
Enhanced Error Handling:
The 1lib.sk download functionality includes robust error handling with automatic debugging:
- Multiple Selector Fallbacks: Tries 6 different selectors to find download buttons
- Debug HTML Capture: Saves page HTML when download button isn't found
- Link Analysis: Lists all links on the page for troubleshooting
- Detailed Error Messages: Provides actionable information for debugging
If a download fails, check the debug-book-page.html file in the book's directory for detailed page structure information.
new SummaryForge({
// API Keys
openaiApiKey: string, // Required: OpenAI API key
rainforestApiKey: string, // Optional: For title search
elevenlabsApiKey: string, // Optional: For audio generation
twocaptchaApiKey: string, // Optional: For CAPTCHA solving
browserlessApiKey: string, // Optional: For browserless.io
// Processing Options
maxChars: number, // Optional: Max chars to process (default: 400000)
maxTokens: number, // Optional: Max tokens in summary (default: 16000)
// Audio Options
voiceId: string, // Optional: ElevenLabs voice ID (default: Brian)
voiceSettings: object, // Optional: Voice customization settings
// Browser Options
headless: boolean, // Optional: Run browser in headless mode (default: true)
enableProxy: boolean, // Optional: Enable proxy (default: false)
proxyUrl: string, // Optional: Proxy URL
proxyUsername: string, // Optional: Proxy username
proxyPassword: string, // Optional: Proxy password
proxyPoolSize: number // Optional: Number of proxies in pool (default: 36)
})
All methods return JSON objects in the { success, ...data, error?, message? } format.
- processFile(filePath, asin?) - Process a PDF or EPUB file
  - Returns: { success, basename, markdown, files, archive, hasAudio, asin, costs, message, error? }
  - Example:
    const result = await forge.processFile('./book.pdf');
    if (result.success) {
      console.log('Archive:', result.archive);
      console.log('Costs:', result.costs);
    }
- processWebPage(url, outputDir?) - Process a web page URL
  - Returns: { success, basename, dirName, markdown, files, directory, archive, hasAudio, url, title, costs, message, error? }
  - Example:
    const result = await forge.processWebPage('https://example.com/article');
    if (result.success) {
      console.log('Summary:', result.markdown.substring(0, 100));
    }
- searchBookByTitle(title) - Search Amazon using Rainforest API
  - Returns: { success, results, count, query, message, error? }
  - Example:
    const result = await forge.searchBookByTitle('Clean Code');
    if (result.success) {
      console.log(`Found ${result.count} books`);
    }
- searchAnnasArchive(query, options?) - Search Anna's Archive directly
  - Returns: { success, results, count, query, options, message, error? }
  - Example:
    const result = await forge.searchAnnasArchive('JavaScript', {
      maxResults: 10,
      format: 'pdf',
      sortBy: 'date'
    });
    if (result.success) {
      console.log(`Found ${result.count} results`);
    }
- search1lib(query, options?) - Search 1lib.sk
  - Returns: { success, results, count, query, options, message, error? }
- downloadFromAnnasArchive(asin, outputDir?, bookTitle?) - Download from Anna's Archive
  - Returns: { success, filepath, directory, asin, format, message, error? }
  - Example:
    const result = await forge.downloadFromAnnasArchive('B075HYVHWK', '.');
    if (result.success) {
      console.log('Downloaded to:', result.filepath);
    }
- downloadFrom1lib(bookUrl, outputDir?, bookTitle?, downloadUrl?) - Download from 1lib.sk
  - Returns: { success, filepath, directory, title, format, message, error? }
- search1libAndDownload(query, searchOptions?, outputDir?, selectCallback?) - Search and download in one session
  - Returns: { success, results, download, message, error? }
- generateSummary(pdfPath) - Generate AI summary from PDF
  - Returns: { success, markdown, length, method, chunks?, message, error? }
  - Methods: gpt5_pdf_upload, text_extraction_single, text_extraction_chunked
  - Example:
    const result = await forge.generateSummary('./book.pdf');
    if (result.success) {
      console.log(`Generated ${result.length} char summary using ${result.method}`);
    }
- generateAudioScript(markdown) - Generate audio-friendly narration script
  - Returns: { success, script, length, message }
- generateAudio(text, outputPath) - Generate audio using ElevenLabs TTS
  - Returns: { success, path, size, duration, message, error? }
- generateOutputFiles(markdown, basename, outputDir) - Generate all output formats
  - Returns: { success, files: {...}, message }
- convertEpubToPdf(epubPath) - Convert EPUB to PDF
  - Returns: { success, pdfPath, originalPath, message, error? }
- createBundle(files, archiveName) - Create tar.gz archive
  - Returns: { success, path, files, message, error? }
- getCostSummary() - Get cost tracking information
  - Returns: { success, openai, elevenlabs, rainforest, total, breakdown }
For CLI usage, run the setup command to configure your API keys:
summary setup
This saves your configuration to ~/.config/summary-forge/settings.json so you don't need to manage environment variables.
For programmatic usage or if you prefer environment variables, create a .env file:
OPENAI_API_KEY=sk-your-key-here
RAINFOREST_API_KEY=your-key-here
ELEVENLABS_API_KEY=sk-your-key-here # Optional: for audio generation
TWOCAPTCHA_API_KEY=your-key-here # Optional: for CAPTCHA solving
BROWSERLESS_API_KEY=your-key-here # Optional
# Browser Configuration
HEADLESS=true # Run browser in headless mode
ENABLE_PROXY=false # Enable proxy for browser requests
PROXY_URL=http://proxy.example.com # Proxy URL (if enabled)
PROXY_USERNAME=username # Proxy username (if enabled)
PROXY_PASSWORD=password # Proxy password (if enabled)
PROXY_POOL_SIZE=36 # Number of proxies in your pool (default: 36)
Or set them in your shell:
export OPENAI_API_KEY=sk-your-key-here
export RAINFOREST_API_KEY=your-key-here
export ELEVENLABS_API_KEY=sk-your-key-here # Optional
When using the module programmatically, configuration is loaded in this order (highest priority first):
- Constructor options - Passed directly to new SummaryForge(options)
- Environment variables - From .env file or shell
- Config file - From ~/.config/summary-forge/settings.json (CLI only)
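The precedence above can be sketched as a small merge function. This is an illustration only: `resolveConfig` and `ENV_NAMES` are hypothetical names, the mapping covers just three of the documented keys, and it assumes the config file uses the same key names as the constructor options.

```javascript
// Hypothetical sketch of the documented precedence:
// constructor options > environment variables > config file.
// Only a few keys are mapped here; the module's own logic may differ.
const ENV_NAMES = {
  openaiApiKey: 'OPENAI_API_KEY',
  rainforestApiKey: 'RAINFOREST_API_KEY',
  elevenlabsApiKey: 'ELEVENLABS_API_KEY',
};

function resolveConfig(options = {}, env = {}, fileConfig = {}) {
  const resolved = {};
  for (const [key, envName] of Object.entries(ENV_NAMES)) {
    // Candidates are listed from highest to lowest priority.
    const candidates = [options[key], env[envName], fileConfig[key]];
    const value = candidates.find((v) => v !== undefined && v !== '');
    if (value !== undefined) resolved[key] = value;
  }
  return resolved;
}

const config = resolveConfig(
  { openaiApiKey: 'sk-from-options' },
  { OPENAI_API_KEY: 'sk-from-env', RAINFOREST_API_KEY: 'rf-from-env' },
  { rainforestApiKey: 'rf-from-file', elevenlabsApiKey: 'el-from-file' }
);
console.log(config);
```

In real use the second argument would be `process.env` and the third the parsed settings file.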
To avoid IP bans when downloading from Anna's Archive, configure a proxy during setup:
summary setup
When prompted:
- Enable proxy: Yes
- Enter proxy URL: http://your-proxy.com:8080
- Enter proxy username and password
Why use a proxy?
- Avoids IP bans from Anna's Archive
- USA-based proxies prevent geo-location issues
- Works with both browser navigation and file downloads
- Automatically applied to all download operations
Recommended Proxy Service:
We recommend Webshare.io for reliable, USA-based proxies:
- USA-based IPs (no geo-location issues)
- Fast and reliable
- Affordable pricing with free tier
- HTTP/HTTPS/SOCKS5 support
Important: Use Static Proxies for Sticky Sessions
For Anna's Archive downloads, you need a static/direct proxy (not rotating) to maintain the same IP:
- In your Webshare dashboard, go to Proxy → List
- Copy a Static Proxy endpoint (not the rotating endpoint)
- Use the format: http://host:port (e.g., http://45.95.96.132:8080)
- Username format: dmdgluqz-US-{session_id} (session ID added automatically)
The tool automatically generates a unique session ID (1 to PROXY_POOL_SIZE) for each download to get a fresh IP, while maintaining that IP throughout the 5-10 minute download process.
Proxy Pool Size Configuration:
Set PROXY_POOL_SIZE to match your Webshare plan (default: 36):
- Free tier: 10 proxies → PROXY_POOL_SIZE=10
- Starter plan: 25 proxies → PROXY_POOL_SIZE=25
- Professional plan: 100 proxies → PROXY_POOL_SIZE=100
- Enterprise plan: 250+ proxies → PROXY_POOL_SIZE=250
The tool will randomly select a session ID from 1 to your pool size, distributing load across all available proxies.
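A minimal sketch of the session selection described above, assuming the session ID is a uniformly random integer from 1 to the pool size and is appended to the base username with a hyphen. `pickSessionId` and `buildProxyUsername` are hypothetical names; the module's internals may differ.

```javascript
// Hypothetical helpers illustrating the session-ID scheme described above.
function pickSessionId(poolSize, random = Math.random) {
  if (!Number.isInteger(poolSize) || poolSize < 1) {
    throw new RangeError('poolSize must be a positive integer');
  }
  // Math.floor(random() * poolSize) lies in [0, poolSize - 1]; shift to 1-based.
  return Math.floor(random() * poolSize) + 1;
}

function buildProxyUsername(baseUsername, sessionId) {
  // Matches the documented pattern, e.g. dmdgluqz-US-{session_id}.
  return `${baseUsername}-${sessionId}`;
}

const sessionId = pickSessionId(36);
console.log(buildProxyUsername('dmdgluqz-US', sessionId));
```

Passing the random source as a parameter keeps the selection testable; in normal use the default `Math.random` is enough.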
Smart ISBN Detection:
When searching Anna's Archive, the tool automatically detects whether an identifier is a real ISBN or an Amazon ASIN:
- Real ISBNs (10 or 13 numeric digits): Searches by ISBN for precise results
- Amazon ASINs (alphanumeric): Searches by book title instead for better results
- This ensures you get relevant search results even when Amazon returns proprietary ASINs instead of standard ISBNs
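The detection rule can be approximated as below. This is a sketch under assumptions: it follows the "10 or 13 numeric digits" wording literally (so an ISBN-10 ending in X would be treated as a non-ISBN), it strips hyphens and spaces first, and `isRealIsbn` and `chooseSearchTerm` are hypothetical names rather than module functions.

```javascript
// Hypothetical sketch of the ISBN-vs-ASIN rule described above.
function isRealIsbn(identifier) {
  // Assumption: hyphens and spaces are ignored before counting digits.
  const compact = String(identifier).replace(/[\s-]/g, '');
  return /^(\d{10}|\d{13})$/.test(compact);
}

function chooseSearchTerm(identifier, title) {
  // Real ISBNs are searched directly; ASINs fall back to the book title.
  return isRealIsbn(identifier)
    ? { by: 'isbn', term: String(identifier) }
    : { by: 'title', term: title };
}

console.log(chooseSearchTerm('9780134685991', 'Effective Java'));
console.log(chooseSearchTerm('B075HYVHWK', 'A Philosophy of Software Design'));
```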
Note: Rotating proxies (p.webshare.io) don't support sticky sessions. Use individual static proxy IPs from your proxy list instead.
Testing your proxy:
node test-proxy.js <ASIN>
This will verify your proxy configuration by attempting to download a book.
Audio generation is optional and requires an ElevenLabs API key. If the key is not provided, the tool will skip audio generation and only create text-based outputs.
Get ElevenLabs API Key: Sign up at https://try.elevenlabs.io/oh7kgotrpjnv for high-quality text-to-speech.
Features:
- Uses ElevenLabs Turbo v2.5 model (optimized for audiobooks)
- Default voice: Brian (best for technical content, customizable)
- Automatically truncates long texts to fit API limits
- Generates high-quality MP3 audio files
- Natural, conversational narration style
The tool generates:
- <book_name>_summary.md - Markdown summary
- <book_name>_summary.txt - Plain text summary
- <book_name>_summary.pdf - PDF summary with table of contents
- <book_name>_summary.epub - EPUB summary with clickable TOC
- <book_name>_summary.mp3 - Audio summary (if ElevenLabs key provided)
- <book_name>.pdf - Original or converted PDF
- <book_name>.epub - Original EPUB (if input was EPUB)
- <book_name>_bundle.tgz - Compressed archive containing all files
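If you need to check a bundle's contents from a script, the naming scheme above can be reproduced with a small helper. `expectedOutputFiles` is a hypothetical function written for this example; it only mirrors the list above and is not exported by the module.

```javascript
// Hypothetical helper: lists the file names documented above for one book.
function expectedOutputFiles(bookName, { hasAudio = false, inputWasEpub = false } = {}) {
  const files = [
    `${bookName}_summary.md`,
    `${bookName}_summary.txt`,
    `${bookName}_summary.pdf`,
    `${bookName}_summary.epub`,
  ];
  // The MP3 exists only when an ElevenLabs key was provided.
  if (hasAudio) files.push(`${bookName}_summary.mp3`);
  files.push(`${bookName}.pdf`);
  // The original EPUB is kept only when the input was an EPUB.
  if (inputWasEpub) files.push(`${bookName}.epub`);
  files.push(`${bookName}_bundle.tgz`);
  return files;
}

console.log(expectedOutputFiles('clean_code', { hasAudio: true }));
```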
# 1. Search for a book
summary search
# Enter: "A Philosophy of Software Design"
# Select from results, get ASIN
# 2. Download and process automatically
summary isbn B075HYVHWK
# Downloads, asks if you want to process
# Creates summary bundle automatically!
# Alternative: Process a local file
summary file ~/Downloads/book.epub
- Input Processing: Accepts PDF or EPUB files (EPUB is converted to PDF)
- Smart Processing Strategy:
  - Small PDFs (<400k chars): Direct upload to OpenAI's vision API
  - Large PDFs (>400k chars): Intelligent chunking with synthesis
- AI Summarization: GPT-5 analyzes content with full formatting, tables, and diagrams
- Format Conversion: Uses Pandoc to convert the Markdown summary to PDF and EPUB
- Audio Generation: Optional TTS conversion using ElevenLabs
- Bundling: Creates a compressed archive with all generated files
For PDFs exceeding 400,000 characters (typically 500+ pages), the tool automatically uses an intelligent chunking strategy:
How it works:
- Analysis: Calculates optimal chunk size based on PDF statistics
- Page-Based Chunking: Splits PDF into logical chunks (typically 50-150k chars each)
- Parallel Processing: Each chunk is summarized independently by GPT-5
- Intelligent Synthesis: All chunk summaries are combined into a cohesive final summary
- Quality Preservation: Maintains narrative flow and eliminates redundancy
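The page-based chunking step can be sketched as a greedy split under a character budget. This is an assumption-laden illustration, not the module's algorithm: `chunkPages` is a hypothetical name, it takes an array of per-page text, and it covers only the splitting step (not summarization or synthesis).

```javascript
// Hypothetical sketch: greedy page-based chunking under a character budget.
// Pages are never split; a single page larger than maxChars becomes its own
// oversized chunk.
function chunkPages(pageTexts, maxChars) {
  const chunks = [];
  let current = null;
  pageTexts.forEach((text, index) => {
    const pageNumber = index + 1;
    // Close the current chunk when adding this page would exceed the budget.
    if (current && current.chars + text.length > maxChars) {
      chunks.push(current);
      current = null;
    }
    if (!current) {
      current = { startPage: pageNumber, endPage: pageNumber, chars: 0 };
    }
    current.endPage = pageNumber;
    current.chars += text.length;
  });
  if (current) chunks.push(current);
  return chunks;
}

const pages = Array.from({ length: 5 }, () => 'x'.repeat(40));
console.log(chunkPages(pages, 100));
```

Each resulting chunk records its page range and size, which matches the kind of per-chunk detail shown in the example output (page range plus character count).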
Benefits:
- Complete Coverage: Processes entire books without truncation
- High Quality: Each section gets full AI attention
- Seamless Output: Final summary reads as a unified document
- Cost Efficient: Optimizes token usage across multiple API calls
- Automatic: No configuration needed - works transparently
Example Output:
PDF Stats: 523 pages, 1,245,678 chars, ~311,420 tokens
PDF is large - using intelligent chunking strategy
This will process the ENTIRE 523-page PDF without truncation
Using chunk size: 120,000 chars
Created 11 chunks for processing
Chunk 1: Pages 1-48 (119,234 chars)
Chunk 2: Pages 49-95 (118,901 chars)
...
All 11 chunks processed successfully
Synthesizing chunk summaries into final comprehensive summary...
Final summary synthesized: 45,678 characters
The tool prioritizes OpenAI's vision API for direct PDF upload when possible:
- Better Quality: Preserves document formatting, tables, and diagrams
- More Accurate: AI can see the actual PDF layout and structure
- Better for Technical Books: Code examples and diagrams are preserved
- Fallback Strategy: Automatically switches to intelligent chunking for large files
Summary Forge includes a comprehensive test suite using Vitest.
# Run all tests
pnpm test
# Run tests in watch mode
pnpm test:watch
# Run tests with coverage report
pnpm test:coverage
The test suite includes:
- 30+ passing tests
- Constructor validation
- Helper method tests
- PDF upload functionality tests
- API integration tests
- Error handling tests
- Edge case coverage
- File operation tests
See test/summary-forge.test.js for the complete test suite.
Summary Forge includes powerful flashcard generation capabilities for study and review.
Generate double-sided flashcard PDFs optimized for printing:
import { extractFlashcards, generateFlashcardsPDF } from '@profullstack/summary-forge-module/flashcards';
import fs from 'node:fs/promises';
// Read your markdown summary
const markdown = await fs.readFile('./book_summary.md', 'utf-8');
// Extract Q&A pairs
const extractResult = extractFlashcards(markdown, { maxCards: 50 });
console.log(`Extracted ${extractResult.count} flashcards`);
// Generate printable PDF
const pdfResult = await generateFlashcardsPDF(
extractResult.flashcards,
'./flashcards.pdf',
{
title: 'JavaScript Fundamentals',
branding: 'SummaryForge.com',
cardWidth: 3.5, // inches
cardHeight: 2.5, // inches
fontSize: 11
}
);
console.log(`PDF created: ${pdfResult.path}`);
console.log(`Total pages: ${pdfResult.pages}`);
Generate individual PNG images for each flashcard, perfect for web applications:
import { extractFlashcards, generateFlashcardImages } from '@profullstack/summary-forge-module/flashcards';
import fs from 'node:fs/promises';
// Read your markdown summary
const markdown = await fs.readFile('./book_summary.md', 'utf-8');
// Extract Q&A pairs
const extractResult = extractFlashcards(markdown);
// Generate individual PNG images
const imageResult = await generateFlashcardImages(
extractResult.flashcards,
'./flashcards', // Output directory
{
title: 'JavaScript Fundamentals',
branding: 'SummaryForge.com',
width: 800, // pixels
height: 600, // pixels
fontSize: 24
}
);
if (imageResult.success) {
console.log(`Generated ${imageResult.images.length} images`);
console.log('Files:', imageResult.images);
// Output: ['./flashcards/q-001.png', './flashcards/a-001.png', ...]
}
Image Naming Convention:
- q-001.png, q-002.png, etc. - Question cards
- a-001.png, a-002.png, etc. - Answer cards
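A web app that consumes these images can rebuild the file names from the card count alone. The helper below is a hypothetical sketch (`flashcardImageNames` is not a module export) that follows the zero-padded, question-then-answer order shown above.

```javascript
// Hypothetical helper: rebuilds the documented file names for `count` cards.
function flashcardImageNames(count) {
  const names = [];
  for (let i = 1; i <= count; i += 1) {
    // Card numbers are zero-padded to three digits: 001, 002, ...
    const id = String(i).padStart(3, '0');
    names.push(`q-${id}.png`, `a-${id}.png`);
  }
  return names;
}

console.log(flashcardImageNames(2));
// → [ 'q-001.png', 'a-001.png', 'q-002.png', 'a-002.png' ]
```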
Use Cases:
- Web-based flashcard applications
- Mobile learning apps
- Interactive quiz games
- Study progress tracking systems
- Spaced repetition software
Features:
- Clean, professional design with book title
- Automatic text wrapping for long content
- Customizable dimensions and styling
- SVG-based rendering for crisp quality
- Works in Docker (no native dependencies)
The extractFlashcards function supports multiple markdown formats:
1. Explicit Q&A Format:
**Q: What is a closure?**
A: A closure is a function that has access to variables in its outer scope.
2. Definition Lists:
**Closure**
: A function that has access to variables in its outer scope.
3. Question Headers:
### What is a closure?
A closure is a function that has access to variables in its outer scope.
See the examples/ directory for more usage examples:
- programmatic-usage.js - Using as a module
- flashcard-images-demo.js - Generating flashcard images
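To make the explicit Q&A format from the extraction section concrete, here is a minimal regex-based parser for that one format. It is a sketch only: `parseExplicitQA` and the `{ question, answer }` shape are assumptions made for this example, and extractFlashcards itself may parse differently and handles more formats.

```javascript
// Hypothetical sketch: parses only the explicit "**Q: ...**" / "A: ..." format.
// Answers are assumed to fit on a single line.
function parseExplicitQA(markdown) {
  const pattern = /\*\*Q:\s*(.+?)\*\*\s*\n\s*A:\s*(.+)/g;
  const cards = [];
  for (const match of markdown.matchAll(pattern)) {
    cards.push({ question: match[1].trim(), answer: match[2].trim() });
  }
  return cards;
}

const sample = [
  '**Q: What is a closure?**',
  'A: A closure is a function that has access to variables in its outer scope.',
].join('\n');
console.log(parseExplicitQA(sample));
```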
If you encounter "Too many requests" errors from 1lib.sk:
Error Message:
Too many requests from your IP xxx.xxx.xxx.xxx
Please wait 10 seconds. [email protected]. Err #ipd1
Automatic Handling: The tool automatically detects rate limiting and:
- Waits the requested time (usually 10 seconds)
- Retries up to 3 times with exponential backoff
- Adds a 2-second buffer to ensure rate limit has cleared
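The retry timing can be illustrated with two small functions. The README states only that the tool waits the requested time, adds a 2-second buffer, and backs off exponentially; the exact formula below (doubling per attempt) and the names `parseWaitSeconds` and `retryDelayMs` are assumptions for illustration.

```javascript
// Hypothetical sketch of the delay calculation; the module's formula may differ.
function parseWaitSeconds(message, fallbackSeconds = 10) {
  // Reads the wait time from text such as "Please wait 10 seconds."
  const match = /wait\s+(\d+)\s+seconds?/i.exec(message);
  return match ? Number(match[1]) : fallbackSeconds;
}

function retryDelayMs(message, attempt, bufferSeconds = 2) {
  // attempt is 0-based: 0 for the first retry, 1 for the second, and so on.
  const baseSeconds = parseWaitSeconds(message) + bufferSeconds;
  return baseSeconds * 2 ** attempt * 1000;
}

const message = 'Too many requests from your IP 203.0.113.7\nPlease wait 10 seconds.';
console.log([0, 1, 2].map((attempt) => retryDelayMs(message, attempt)));
// → [ 12000, 24000, 48000 ]
```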
Manual Solutions:
- Wait a few minutes before trying again
- Use a different proxy session (the tool rotates through your proxy pool automatically)
- Switch to Anna's Archive: summary search "book title" --source anna
- Reduce concurrent requests if running multiple downloads
Note: The proxy pool helps distribute requests across different IPs, reducing rate limiting issues.
If you encounter "Download button not found" errors when downloading from 1lib.sk:
- Check Debug Files: The tool automatically saves debug-book-page.html in the book's directory
  - Open this file to inspect the actual page structure
  - Look for download links or buttons that might have different selectors
- Review Error Output: The error message includes:
  - All selectors that were tried
  - List of links found on the page
  - Location of the debug HTML file
- Common Causes:
  - Z-Access/Library Access Page: Book page redirects to authentication page (most common)
  - Page structure changed (1lib.sk updates their site)
  - Book is deleted or unavailable
  - Session expired or cookies not maintained
  - Proxy issues preventing proper page load
- Solutions:
  - Recommended: Use Anna's Archive instead: summary search "book title" --source anna
  - Try the search1lib command separately to verify the book exists
  - Check if the book page loads correctly in a regular browser with the same proxy
  - Verify proxy configuration is working correctly
  - Try a different book from search results
- Known Issue - Z-Access Page: If you see links to library-access.sk or Z-Access page in the debug output, this means:
  - The book page requires authentication or special access
  - 1lib.sk's session management is blocking automated access
  - Workaround: Use Anna's Archive which has better automation support
Example Debug Output (Z-Access Issue):
Download button not found on book page
Debug HTML saved to: ./uploads/book_name/debug-book-page.html
Found 6 links on page
First 5 links:
- https://library-access.sk (Z-Access page)
- mailto:[email protected] ([email protected])
- https://www.reddit.com/r/zlibrary (https://www.reddit.com/r/zlibrary)
Recommended Alternative:
# Use Anna's Archive instead (more reliable for automation)
summary search "prompt engineering" --source anna
If you're getting blocked by Anna's Archive:
- Enable proxy in your configuration:
  summary setup
- Use a USA-based proxy to avoid geo-location issues
- Test your proxy before downloading:
  node test-proxy.js B0BCTMXNVN
- Run browser in visible mode to debug:
  summary config --headless false
The proxy is used for:
- Browser navigation (Puppeteer)
- File downloads (fetch with https-proxy-agent)
- All HTTP requests to Anna's Archive
Supported proxy formats:
- http://proxy.example.com:8080
- https://proxy.example.com:8080
- socks5://proxy.example.com:1080
- http://proxy.example.com:8080-session-<SESSION_ID> (sticky session)
Recommended Service: Webshare.io - Reliable USA-based proxies with free tier available.
Webshare Sticky Sessions:
Add -session-<YOUR_SESSION_ID> to your proxy URL to maintain the same IP:
http://p.webshare.io:80-session-myapp123
When downloading from Anna's Archive, you may encounter CAPTCHAs. To automatically solve them:
- Sign up for 2Captcha and get an API key at https://2captcha.com/?from=9630996
- Add to configuration:
  summary setup
- Enter your 2Captcha API key when prompted
The tool will automatically detect and solve CAPTCHAs during downloads, making the process fully automated.
- Maximum PDF file size: No practical limit (intelligent chunking handles any size)
- GPT-5 uses default temperature of 1 (not configurable)
- Requires external tools: Calibre, Pandoc, XeLaTeX
- CAPTCHA solving requires 2captcha.com API key (optional)
- Very large PDFs (1000+ pages) may incur higher API costs due to multiple chunk processing
- Anna's Archive may block IPs without proxy configuration
- Chunked processing uses text extraction (images/diagrams described in text only)
- ISBN/ASIN lookup via Anna's Archive
- Automatic download from Anna's Archive with CAPTCHA solving
- Book title search via Rainforest API
- CLI with interactive mode
- ESM module for programmatic use
- Audio generation with ElevenLabs TTS
- Direct PDF upload to OpenAI vision API
- EPUB format prioritization (open standard)
- Support for more input formats (MOBI, AZW3)
- Chunked processing for very large books (>100MB)
- Custom summary templates
- Web interface
- Multiple voice options for audio
- Audio chapter markers
- Batch processing multiple books
ISC
Contributions are welcome! Please feel free to submit a Pull Request.