# EPUB Reader Documentation

## Overview

This project implements two major enhancements to pyWebLayout:

1. **Enhanced Page Class**: HTML rendering logic now lives in the `Page` class rather than in the browser, which improves separation of concerns
2. **Tkinter EPUB Reader**: A complete EPUB reader application with pagination support

## Files Created/Modified

### 1. Enhanced Page Class (`pyWebLayout/concrete/page.py`)

**New Features Added:**
- `load_html_string()` - Load HTML content directly into a Page
- `load_html_file()` - Load HTML from a file
- Private conversion methods (such as `_convert_block_to_renderable()`) that turn abstract blocks into concrete renderables
- Integration with existing HTML extraction system

**Key Methods:**
```python
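from pyWebLayout.concrete.page import Page

html_content = "<p>Hello</p>"        # Any HTML string
page = Page(size=(800, 600))
page.load_html_string(html_content)  # Load HTML from a string
page.load_html_file("file.html")     # Or load HTML from a file
image = page.render()                # Render to a PIL Image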
```

**Benefits:**
- Reuses existing `html_extraction.py` infrastructure
- Converts abstract blocks to concrete renderables
- Supports headings, paragraphs, lists, images, etc.
- Proper error handling with fallback rendering

### 2. EPUB Reader Application (`epub_reader_tk.py`)

**Features:**
- Complete Tkinter-based GUI
- EPUB file loading using existing `epub_reader.py`
- Chapter navigation with dropdown selection
- Page-by-page display with navigation controls
- Adjustable font size (8-24pt)
- Keyboard shortcuts (arrow keys, Ctrl+O)
- Status bar with loading feedback
- Scrollable content display

**GUI Components:**
- File open dialog for EPUB selection
- Chapter dropdown and navigation buttons
- Page navigation controls
- Font size adjustment
- Canvas with scrollbars for content display
- Status bar for feedback

**Navigation:**
- **Left/Right arrows**: Previous/Next page
- **Up/Down arrows**: Previous/Next chapter
- **Ctrl+O**: Open file dialog
- **Mouse**: Dropdown chapter selection
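
The shortcuts above could be wired up roughly as follows. This is a minimal sketch, not the code in `epub_reader_tk.py`: `ReaderState`, `bind_navigation`, and the `open_file` callback are hypothetical names, and clamping at the first and last page or chapter is an assumption. Only the Tk event sequences (`<Left>`, `<Control-o>`, and so on) and the `bind()` call are standard Tkinter.

```python
from dataclasses import dataclass


@dataclass
class ReaderState:
    """Reading position; page_counts[i] is the number of pages in chapter i."""
    page_counts: list
    chapter: int = 0
    page: int = 0

    def next_page(self):
        # Assumption: stay on the last page instead of rolling into the next chapter.
        if self.page < self.page_counts[self.chapter] - 1:
            self.page += 1

    def prev_page(self):
        if self.page > 0:
            self.page -= 1

    def next_chapter(self):
        if self.chapter < len(self.page_counts) - 1:
            self.chapter, self.page = self.chapter + 1, 0

    def prev_chapter(self):
        if self.chapter > 0:
            self.chapter, self.page = self.chapter - 1, 0


def bind_navigation(widget, state, open_file):
    """Attach the shortcuts to any Tk widget (typically the root window)."""
    actions = {
        "<Left>": state.prev_page,
        "<Right>": state.next_page,
        "<Up>": state.prev_chapter,
        "<Down>": state.next_chapter,
        "<Control-o>": open_file,
    }
    for sequence, action in actions.items():
        # Tk passes an event object to the callback; the actions ignore it.
        widget.bind(sequence, lambda event, action=action: action())
```

A real application would also redraw the canvas after each action. Keeping the position logic in a plain class, separate from the widget bindings, lets it be exercised without opening a window.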

### 3. Test Suite (`test_enhanced_page.py`)

**Test Coverage:**
- HTML string loading and rendering
- HTML file loading and rendering
- EPUB reader app import and instantiation
- Error handling verification

## Technical Architecture

### HTML Processing Flow
```
HTML String/File → parse_html_string() → Abstract Blocks → Page._convert_block_to_renderable() → Concrete Renderables → Page.render() → PIL Image
```
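
To make the stages of this flow concrete, here is a toy version built only on the standard library. It is an illustration under stated assumptions, not pyWebLayout's implementation: `AbstractBlock`, `Renderable`, `BlockExtractor`, `extract_blocks`, `to_renderable`, and the heading sizes are all hypothetical stand-ins, and the final rendering step to a PIL image is left out.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


# Hypothetical stand-ins for pyWebLayout's abstract and concrete types.
@dataclass
class AbstractBlock:
    kind: str  # tag name, e.g. "h1" or "p"
    text: str


@dataclass
class Renderable:
    text: str
    font_size: int


class BlockExtractor(HTMLParser):
    """Collects the text of headings and paragraphs into AbstractBlock objects."""

    BLOCK_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "p"}

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._kind = None
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self._kind, self._parts = tag, []

    def handle_data(self, data):
        # Inline tags such as <strong> are ignored; only their text is kept.
        if self._kind is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == self._kind:
            self.blocks.append(AbstractBlock(tag, "".join(self._parts).strip()))
            self._kind = None


def extract_blocks(html):
    """Stage 1: HTML string -> abstract blocks."""
    parser = BlockExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


# Assumed heading sizes in points, for illustration only.
HEADING_SIZES = {"h1": 32, "h2": 28, "h3": 24, "h4": 20, "h5": 18, "h6": 16}


def to_renderable(block, base_size=16):
    """Stage 2: abstract block -> concrete renderable with a resolved font size."""
    return Renderable(block.text, HEADING_SIZES.get(block.kind, base_size))


blocks = extract_blocks("<h1>Hello World</h1><p>This is a <strong>test</strong> paragraph.</p>")
renderables = [to_renderable(block) for block in blocks]
```

The point of the split is that the abstract blocks carry no layout decisions. Font sizes and other concrete properties are resolved only in the conversion step, which is the role `Page._convert_block_to_renderable()` plays in the flow above.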

### EPUB Reading Flow
```
EPUB File → read_epub() → Book → Chapters → Abstract Blocks → Page Conversion → Tkinter Display
```
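
For readers unfamiliar with the container format, the first step of this flow (finding the chapters in reading order) can be sketched with the standard library alone. This is not necessarily how `read_epub()` works internally, and `spine_documents` is a hypothetical helper. It relies only on the EPUB container layout: `META-INF/container.xml` points to the package document, whose manifest and spine list the content files.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


def spine_documents(epub_file):
    """Return the archive paths of the content documents in reading order."""
    with zipfile.ZipFile(epub_file) as zf:
        # container.xml names the package (OPF) document.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find("c:rootfiles/c:rootfile", NS).get("full-path")

        package = ET.fromstring(zf.read(opf_path))
        # The manifest maps item ids to hrefs relative to the OPF file.
        manifest = {
            item.get("id"): item.get("href")
            for item in package.findall("opf:manifest/opf:item", NS)
        }
        base = posixpath.dirname(opf_path)
        # The spine lists item ids in reading order.
        return [
            posixpath.join(base, manifest[ref.get("idref")])
            for ref in package.findall("opf:spine/opf:itemref", NS)
        ]
```

Calling `spine_documents("book.epub")` returns the paths of the XHTML files inside the archive. Each file's contents would then go through the HTML processing flow described above.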

## Usage Examples

### Basic HTML Page Rendering
```python
from pyWebLayout.concrete.page import Page

# Create and load HTML
page = Page(size=(800, 600))
page.load_html_string("""
<h1>Hello World</h1>
<p>This is a <strong>test</strong> paragraph.</p>
""")

# Render to image
image = page.render()
image.save("output.png")
```

### EPUB Reader Application
```python
# Run the EPUB reader from a shell:
#   python epub_reader_tk.py

# Or import and use it programmatically
from epub_reader_tk import EPUBReaderApp

app = EPUBReaderApp()
app.run()
```

## Features Demonstrated

### HTML Parsing & Rendering
- ✅ Paragraphs with inline formatting (bold, italic)
- ✅ Headings (H1-H6) with proper sizing
- ✅ Lists (ordered and unordered)
- ✅ Images with alt text fallback
- ✅ Error handling for malformed content

### EPUB Processing
- ✅ Full EPUB metadata extraction
- ✅ Chapter-by-chapter navigation
- ✅ Table of contents integration
- ✅ Multi-format content support

### User Interface
- ✅ Intuitive navigation controls
- ✅ Responsive layout with scrolling
- ✅ Font size customization
- ✅ Keyboard shortcuts
- ✅ Status feedback

## Dependencies

The EPUB reader leverages existing pyWebLayout infrastructure:
- `pyWebLayout.io.readers.epub_reader` - EPUB parsing
- `pyWebLayout.io.readers.html_extraction` - HTML to abstract blocks
- `pyWebLayout.concrete.*` - Renderable objects
- `pyWebLayout.abstract.*` - Abstract document model
- `pyWebLayout.style.*` - Styling system

## Testing

Run the test suite to verify functionality:
```bash
python test_enhanced_page.py
```

Expected output:
- ✅ HTML String Loading: PASS
- ✅ HTML File Loading: PASS
- ✅ EPUB Reader Imports: PASS

## Future Enhancements

1. **Advanced Pagination**: Break long chapters across multiple pages
2. **Search Functionality**: Full-text search within books
3. **Bookmarks**: Save reading position
4. **Themes**: Dark/light mode support
5. **Export**: Save pages as images or PDFs
6. **Zoom**: Variable zoom levels for accessibility
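
For the first item, one possible starting point is greedy packing: measure each block's rendered height, then fill pages in order. The sketch below assumes block heights are already known in pixels; `paginate` is a hypothetical helper and not part of pyWebLayout.

```python
def paginate(block_heights, page_height):
    """Group consecutive block indices into pages without exceeding page_height.

    A block taller than the page still gets a page of its own; splitting a
    single block across pages is out of scope for this sketch.
    """
    pages, current, used = [], [], 0
    for index, height in enumerate(block_heights):
        # Start a new page when the next block would overflow the current one.
        if current and used + height > page_height:
            pages.append(current)
            current, used = [], 0
        current.append(index)
        used += height
    if current:
        pages.append(current)
    return pages


# Five blocks on a 600-pixel page.
pages = paginate([100, 300, 250, 400, 50], page_height=600)
# pages == [[0, 1], [2], [3, 4]]
```

Greedy packing preserves reading order and needs only one pass over the blocks, but it can leave large gaps at the bottom of a page when a tall block follows several short ones.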

## Integration with Existing Browser

The enhanced Page class can be used to improve the existing `html_browser.py`:

```python
from pyWebLayout.concrete.page import Page

# Before: parsing handled inside the browser
# parser = HTMLParser()
# page = parser.parse_html_string(html_content)

# After: let the Page class load the HTML
page = Page()
page.load_html_string(html_content)
```

Delegating to `Page` keeps parsing logic out of the browser and reuses the existing HTML extraction system.