# EPUB Reader Documentation

## Overview

This project implements two major enhancements to pyWebLayout:

1. **Enhanced Page Class**: Moved HTML rendering logic from the browser into the `Page` class for better separation of concerns
2. **Tkinter EPUB Reader**: A complete EPUB reader application with pagination support

## Files Created/Modified

### 1. Enhanced Page Class (`pyWebLayout/concrete/page.py`)

**New Features Added:**
- `load_html_string()` - Load HTML content directly into a Page
- `load_html_file()` - Load HTML from a file
- Private conversion methods to transform abstract blocks to renderables
- Integration with existing HTML extraction system

**Key Methods:**
```python
page = Page(size=(800, 600))
page.load_html_string(html_content)  # Load HTML string
page.load_html_file("file.html")     # Load HTML file
image = page.render()                # Render to PIL Image
```

**Benefits:**
- Reuses existing `html_extraction.py` infrastructure
- Converts abstract blocks to concrete renderables
- Supports headings, paragraphs, lists, images, etc.
- Proper error handling with fallback rendering

### 2. EPUB Reader Application (`epub_reader_tk.py`)

**Features:**
- Complete Tkinter-based GUI
- EPUB file loading using existing `epub_reader.py`
- Chapter navigation with dropdown selection
- Page-by-page display with navigation controls
- Adjustable font size (8-24pt)
- Keyboard shortcuts (arrow keys, Ctrl+O)
- Status bar with loading feedback
- Scrollable content display

**GUI Components:**
- File open dialog for EPUB selection
- Chapter dropdown and navigation buttons
- Page navigation controls
- Font size adjustment
- Canvas with scrollbars for content display
- Status bar for feedback

**Navigation:**
- **Left/Right arrows**: Previous/Next page
- **Up/Down arrows**: Previous/Next chapter
- **Ctrl+O**: Open file dialog
- **Mouse**: Dropdown chapter selection

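The arrow-key navigation listed above can be modeled independently of the GUI. The following is a minimal sketch, not the code in `epub_reader_tk.py`: `ReaderState` and `key_bindings` are hypothetical names, and clamping at the first and last page (rather than rolling over into the neighbouring chapter) is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ReaderState:
    """Hypothetical navigation state: one page count per chapter."""
    pages_per_chapter: List[int]
    chapter: int = 0
    page: int = 0

    def next_page(self) -> None:
        # Assumption: stop at the last page of the current chapter.
        if self.page < self.pages_per_chapter[self.chapter] - 1:
            self.page += 1

    def prev_page(self) -> None:
        if self.page > 0:
            self.page -= 1

    def next_chapter(self) -> None:
        # Changing chapter always restarts at the chapter's first page.
        if self.chapter < len(self.pages_per_chapter) - 1:
            self.chapter += 1
            self.page = 0

    def prev_chapter(self) -> None:
        if self.chapter > 0:
            self.chapter -= 1
            self.page = 0


def key_bindings(state: ReaderState) -> Dict[str, Callable[[], None]]:
    """Map Tk event sequences to the navigation actions listed above."""
    return {
        "<Left>": state.prev_page,
        "<Right>": state.next_page,
        "<Up>": state.prev_chapter,
        "<Down>": state.next_chapter,
    }
```

In a Tkinter application, each entry could be registered with something like `root.bind(sequence, lambda event, action=action: action())`, with `<Control-o>` bound separately to the file dialog. Keeping the state free of widgets makes the navigation rules testable without opening a window.
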
### 3. Test Suite (`test_enhanced_page.py`)

**Test Coverage:**
- HTML string loading and rendering
- HTML file loading and rendering
- EPUB reader app import and instantiation
- Error handling verification

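As an illustration of the first two coverage items, the sketch below shows how such checks might look with `unittest`. It is not the contents of `test_enhanced_page.py`: the class name, test names, and `run_suite` helper are hypothetical, and the tests only assume that `render()` returns an image object. The tests skip themselves when pyWebLayout is not importable.

```python
import os
import tempfile
import unittest

try:
    from pyWebLayout.concrete.page import Page
except ImportError:  # pyWebLayout (or one of its dependencies) is unavailable
    Page = None


@unittest.skipIf(Page is None, "pyWebLayout is not importable")
class EnhancedPageSketchTest(unittest.TestCase):
    """Hypothetical tests mirroring the coverage list above."""

    def test_load_html_string(self):
        page = Page(size=(800, 600))
        page.load_html_string("<h1>Title</h1><p>Body text.</p>")
        # Assumption: render() returns a PIL Image object on success.
        self.assertIsNotNone(page.render())

    def test_load_html_file(self):
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "sample.html")
            with open(path, "w", encoding="utf-8") as handle:
                handle.write("<p>Loaded from a file.</p>")
            page = Page(size=(800, 600))
            page.load_html_file(path)
            self.assertIsNotNone(page.render())


def run_suite():
    """Run the sketch tests and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(EnhancedPageSketchTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```
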
## Technical Architecture

### HTML Processing Flow
```
HTML String/File → parse_html_string() → Abstract Blocks → Page._convert_block_to_renderable() → Concrete Renderables → Page.render() → PIL Image
```

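The conversion step in the middle of this flow can be pictured as a dispatch from block type to a converter, with a fallback for anything unrecognized. The sketch below only illustrates that idea and is not the actual `Page._convert_block_to_renderable()`: `Block`, `Renderable`, `CONVERTERS`, and `convert_block` are hypothetical stand-ins, and the heading size rule is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical abstract block: a kind tag plus its text."""
    kind: str
    text: str


@dataclass
class Renderable:
    """Hypothetical concrete renderable: text plus a font size in points."""
    text: str
    font_size: int


def convert_heading(block: Block, base_size: int) -> Renderable:
    # Assumption: headings are drawn at twice the base font size.
    return Renderable(block.text, base_size * 2)


def convert_paragraph(block: Block, base_size: int) -> Renderable:
    return Renderable(block.text, base_size)


# Dispatch table from block kind to its converter.
CONVERTERS = {"heading": convert_heading, "paragraph": convert_paragraph}


def convert_block(block: Block, base_size: int = 12) -> Renderable:
    """Convert one abstract block, falling back to a placeholder."""
    converter = CONVERTERS.get(block.kind)
    if converter is None:
        # Fallback rendering: show a placeholder instead of raising.
        return Renderable(f"[unsupported block: {block.kind}]", base_size)
    return converter(block, base_size)
```

A dispatch table keeps each converter small and makes the fallback path explicit, which matches the "error handling with fallback rendering" goal stated for the `Page` class.
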
### EPUB Reading Flow
```
EPUB File → read_epub() → Book → Chapters → Abstract Blocks → Page Conversion → Tkinter Display
```

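For context on the first step of this flow: an EPUB file is a ZIP archive whose `META-INF/container.xml` points to a package document, and that document's spine lists the content files in reading order. The sketch below recovers that order using only the standard library. It is not the implementation of `read_epub()`, and `spine_paths` is a hypothetical helper name.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def spine_paths(epub_file):
    """Return archive paths of the content documents in reading order."""
    with zipfile.ZipFile(epub_file) as archive:
        # container.xml names the package (OPF) document.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf_path = rootfile.attrib["full-path"]

        # The manifest maps item ids to hrefs relative to the OPF file.
        package = ET.fromstring(archive.read(opf_path))
        manifest = {
            item.attrib["id"]: item.attrib["href"]
            for item in package.findall("opf:manifest/opf:item", OPF_NS)
        }

        # The spine lists item ids in reading order.
        base = posixpath.dirname(opf_path)
        return [
            posixpath.join(base, manifest[ref.attrib["idref"]])
            for ref in package.findall("opf:spine/opf:itemref", OPF_NS)
        ]
```

Each returned path can then be read from the archive as XHTML and handed to the HTML extraction step shown in the previous flow.
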
## Usage Examples

### Basic HTML Page Rendering
```python
from pyWebLayout.concrete.page import Page

# Create and load HTML
page = Page(size=(800, 600))
page.load_html_string("""
<h1>Hello World</h1>
<p>This is a <strong>test</strong> paragraph.</p>
""")

# Render to image
image = page.render()
image.save("output.png")
```

### EPUB Reader Application
```python
# Run the EPUB reader from a shell:
#   python epub_reader_tk.py

# Or import and use it programmatically
from epub_reader_tk import EPUBReaderApp
app = EPUBReaderApp()
app.run()
```

## Features Demonstrated

### HTML Parsing & Rendering
- ✅ Paragraphs with inline formatting (bold, italic)
- ✅ Headers (H1-H6) with proper sizing
- ✅ Lists (ordered and unordered)
- ✅ Images with alt text fallback
- ✅ Error handling for malformed content

### EPUB Processing
- ✅ Full EPUB metadata extraction
- ✅ Chapter-by-chapter navigation
- ✅ Table of contents integration
- ✅ Multi-format content support

### User Interface
- ✅ Intuitive navigation controls
- ✅ Responsive layout with scrolling
- ✅ Font size customization
- ✅ Keyboard shortcuts
- ✅ Status feedback

## Dependencies

The EPUB reader leverages existing pyWebLayout infrastructure:
- `pyWebLayout.io.readers.epub_reader` - EPUB parsing
- `pyWebLayout.io.readers.html_extraction` - HTML to abstract blocks
- `pyWebLayout.concrete.*` - Renderable objects
- `pyWebLayout.abstract.*` - Abstract document model
- `pyWebLayout.style.*` - Styling system

## Testing

Run the test suite to verify functionality:
```bash
python test_enhanced_page.py
```

Expected output:
- ✅ HTML String Loading: PASS
- ✅ HTML File Loading: PASS
- ✅ EPUB Reader Imports: PASS

## Future Enhancements

1. **Advanced Pagination**: Break long chapters across multiple pages
2. **Search Functionality**: Full-text search within books
3. **Bookmarks**: Save reading position
4. **Themes**: Dark/light mode support
5. **Export**: Save pages as images or PDFs
6. **Zoom**: Variable zoom levels for accessibility

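To make the first item concrete, one simple way to break a long chapter across pages is a greedy pass over the measured heights of its blocks. The sketch below assumes the height of each block is already known; `paginate` is a hypothetical helper, not part of pyWebLayout, and a real implementation would also need to split blocks taller than a page.

```python
def paginate(block_heights, page_height):
    """Greedily group block indices into pages that fit page_height.

    A block taller than the page is placed alone on its own page
    (it overflows) instead of being split.
    """
    pages, current, used = [], [], 0
    for index, height in enumerate(block_heights):
        # Start a new page when this block would not fit on the current one.
        if current and used + height > page_height:
            pages.append(current)
            current, used = [], 0
        current.append(index)
        used += height
    if current:
        pages.append(current)
    return pages
```

For example, `paginate([300, 200, 200, 700, 100], 600)` groups the first two blocks on one page and gives each remaining block its own page, because the 700-unit block exceeds the page height on its own.
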
## Integration with Existing Browser

The enhanced `Page` class can be used to simplify the existing `html_browser.py`:

```python
# Before: complex parsing inside the browser
parser = HTMLParser()
page = parser.parse_html_string(html_content)

# After: delegate loading to the new Page class
page = Page()
page.load_html_string(html_content)
```

This provides better separation of concerns and reuses the robust HTML extraction system.