# pyWebLayout Architecture: Abstract vs Concrete

This document explains the fundamental architectural separation between **Abstract** and **Concrete** layers in the pyWebLayout library.

## Overview

The pyWebLayout library is organized into two distinct layers:

- **Abstract Layer**: Represents the logical structure and content of documents (HTML/EPUB text)
- **Concrete Layer**: Handles the spatial rendering and visual representation of content

This separation keeps content independent of presentation, which makes the library more flexible and easier to test.

## Abstract Layer (`pyWebLayout/abstract/`)

The Abstract layer deals with the **logical structure** of documents without concerning itself with how content will be visually rendered.

### Key Components

#### `abstract/block.py`
- `Block`: Base class for all block-level content
- `Paragraph`: Represents a logical paragraph containing words
- `Heading`: Represents headings with semantic levels (H1-H6)
- `HList`: Represents ordered/unordered lists
- `Image`: Represents image references

#### `abstract/inline.py`
- `Word`: Represents individual words with text content and styling information
- Contains methods for hyphenation and text manipulation
- Does **not** handle rendering or spatial layout

#### `abstract/document.py`
- `Document`: Container for the overall document structure
- `Chapter`: Logical grouping of blocks (for books/long documents)

### Characteristics of Abstract Classes

1. **Content-focused**: Store text, structure, and semantic meaning
2. **Layout-agnostic**: No knowledge of pixel measurements, positions, or rendering (a `Word` carries styling information but never measures or draws itself)
3. **Reusable**: Same content can be rendered in different formats/sizes
4. **Serializable**: Can be saved/loaded without rendering context
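
The characteristics above can be illustrated with a minimal, self-contained sketch. `SketchWord` and `SketchParagraph` are hypothetical stand-ins, not pyWebLayout's actual `Word` and `Paragraph` classes; they only show that content-only objects can round-trip through JSON with no pixel sizes or rendering state involved.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class SketchWord:
    """Hypothetical abstract word: text plus a semantic style name, no pixels."""
    text: str
    style: str = "normal"


@dataclass
class SketchParagraph:
    """Hypothetical abstract paragraph: an ordered list of words."""
    words: List[SketchWord] = field(default_factory=list)


def to_json(paragraph: SketchParagraph) -> str:
    # asdict() recurses into nested dataclasses, so no rendering context is needed.
    return json.dumps(asdict(paragraph))


def from_json(payload: str) -> SketchParagraph:
    data = json.loads(payload)
    return SketchParagraph([SketchWord(**w) for w in data["words"]])


original = SketchParagraph([SketchWord("Hello"), SketchWord("world", style="bold")])
restored = from_json(to_json(original))
```

Because nothing in these objects depends on a display, the restored paragraph compares equal to the original and could be laid out later at any size.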

### Example: Abstract Word

```python
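from pyWebLayout.abstract.inline import Word  # module path per abstract/inline.py above

# An abstract Word knows its text content and semantic properties.
# `font_style` is assumed to be a style object created elsewhere.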
word = Word("supercalifragilisticexpialidocious", font_style)
word.hyphenate()  # Logical operation - finds break points
parts = word.get_hyphenated_parts()  # Returns ["super-", "cali-", "fragi-", ...]
```

## Concrete Layer (`pyWebLayout/concrete/`)

The Concrete layer handles the **spatial representation** and actual rendering of content.

### Key Components

#### `concrete/text.py`
- `Text`: Renders a specific text fragment with precise positioning
- `Line`: Manages a line of `Text` objects with spacing and alignment
- Handles actual pixel measurements, font rendering, and positioning

#### `concrete/page.py`
- `Page`: Top-level container for rendered content
- `Container`: Layout manager for organizing renderable objects
- Handles spatial layout, pagination, and visual composition

#### `concrete/box.py`
- `Box`: Base class for all spatially-aware renderable objects
- Provides positioning, sizing, and rendering capabilities

### Characteristics of Concrete Classes

1. **Rendering-focused**: Handle pixels, fonts, images, and visual output
2. **Spatially-aware**: Know exact positions, sizes, and layout constraints
3. **Implementation-specific**: Tied to specific rendering technologies (PIL, etc.)
4. **Non-portable**: Rendering results are tied to specific display contexts
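
As a rough illustration of what "spatially-aware" means, the sketch below models an object that knows its pixel origin and size. `SketchBox` and its `contains` method are hypothetical and are not the API of pyWebLayout's `Box`.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SketchBox:
    """Hypothetical spatially-aware object: a pixel origin and a pixel size."""
    origin: Tuple[int, int]
    size: Tuple[int, int]

    def contains(self, point: Tuple[int, int]) -> bool:
        # True when the point lies inside the box (right and bottom edges excluded).
        x, y = point
        left, top = self.origin
        width, height = self.size
        return left <= x < left + width and top <= y < top + height


box = SketchBox(origin=(10, 20), size=(100, 30))
inside = box.contains((10, 20))
outside = box.contains((110, 20))
```

Values like these only make sense for one page size and one font, which is why concrete objects are regenerated rather than stored with the content.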

### Example: Concrete Text

```python
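from pyWebLayout.concrete.text import Text  # module path per concrete/text.py above

# A concrete Text object handles actual rendering.
# `font` is assumed to be a font object created elsewhere.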
text = Text("super-", font)  # Specific text fragment
text._calculate_dimensions()  # Computes exact pixel size
image = text.render()  # Produces actual visual output
```

## The Transformation Process

The architecture involves a clear transformation from Abstract to Concrete:

```
Abstract Document
       ↓
   [Parser Layer]
       ↓  
Abstract Blocks (Paragraph, Heading, etc.)
       ↓
   [Layout Engine]
       ↓
Concrete Objects (Text, Line, Page)
       ↓
   [Rendering Engine]  
       ↓
Visual Output (Images, PDF, etc.)
```

### Example Transformation

```python
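# Imports follow the module layout described in this document;
# `font` is assumed to be a font object created elsewhere.
from pyWebLayout.abstract.block import Paragraph
from pyWebLayout.abstract.inline import Word
from pyWebLayout.typesetting.paragraph_layout import ParagraphLayout  # assumed location

# 1. Abstract content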
paragraph = Paragraph()
paragraph.add_word(Word("This", font))
paragraph.add_word(Word("is", font))
paragraph.add_word(Word("a", font))
paragraph.add_word(Word("test", font))

# 2. Layout transformation
layout = ParagraphLayout(line_width=200, line_height=20)
lines = layout.layout_paragraph(paragraph)  # Returns List[Line]

# 3. Each Line contains concrete Text objects
for line in lines:
    for text_obj in line.text_objects:  # List[Text]
        print(f"Text: '{text_obj.text}' at position {text_obj._origin}")
```
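
To make the layout step less abstract, here is a self-contained sketch of greedy line breaking, one common way to turn a word sequence into lines. It is an assumption for illustration only: the source does not say which algorithm `ParagraphLayout` uses, and `break_into_lines` and `measure` are hypothetical names.

```python
from typing import Callable, List


def break_into_lines(
    words: List[str],
    line_width: int,
    measure: Callable[[str], int],
    space_width: int,
) -> List[List[str]]:
    """Greedy line breaking: fill each line until the next word would not fit."""
    lines: List[List[str]] = []
    current: List[str] = []
    used = 0
    for word in words:
        width = measure(word)
        # A non-empty line also needs one space before the new word.
        needed = width if not current else used + space_width + width
        if current and needed > line_width:
            lines.append(current)
            current, used = [word], width
        else:
            # An over-long word on an empty line is placed anyway and overflows.
            current.append(word)
            used = needed
    if current:
        lines.append(current)
    return lines


def measure(text: str) -> int:
    # Stand-in measurement: 10 px per character. A real engine would ask the font.
    return 10 * len(text)


lines = break_into_lines(["This", "is", "a", "test"], line_width=80,
                         measure=measure, space_width=10)
```

With an 80 px line, the four words split into two lines of two words each. The word list itself is never modified, so the same input can be broken again at another width.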

## Key Architectural Principles

### 1. **Single Responsibility**
- Abstract classes: Handle content and structure
- Concrete classes: Handle rendering and layout

### 2. **Separation of Concerns**
- Text parsing/processing ≠ Text rendering
- Document structure ≠ Page layout
- Content semantics ≠ Visual presentation

### 3. **Immutable Abstract Content**
- Abstract content remains unchanged during rendering
- Multiple concrete representations can be generated from the same abstract content
- Enables pagination, different output formats, and responsive layouts
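
One way to picture this principle is a frozen dataclass laid out at two widths. This is a sketch under assumptions: `FrozenParagraph` and `layout` are hypothetical, the source does not state that pyWebLayout enforces immutability with `frozen=True`, and `textwrap.wrap` stands in for a real pixel-based layout engine.

```python
import textwrap
from dataclasses import dataclass, FrozenInstanceError
from typing import List, Tuple


@dataclass(frozen=True)
class FrozenParagraph:
    """Hypothetical immutable abstract paragraph."""
    words: Tuple[str, ...]


def layout(paragraph: FrozenParagraph, width_chars: int) -> List[str]:
    # Stand-in layout: wrap by character count instead of measured pixel width.
    return textwrap.wrap(" ".join(paragraph.words), width=width_chars)


paragraph = FrozenParagraph(("This", "is", "a", "test"))
narrow = layout(paragraph, width_chars=4)
wide = layout(paragraph, width_chars=7)

try:
    paragraph.words = ("changed",)  # Assigning to a frozen dataclass raises.
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Both layouts are derived from the same paragraph object, and the attempted assignment fails, so neither layout can affect the other.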

### 4. **One-to-Many Relationships**
- One Abstract Word → Multiple Concrete Text objects (hyphenation)
- One Abstract Paragraph → Multiple Concrete Lines
- One Abstract Document → Multiple Concrete Pages
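
The first relationship can be sketched as a function that turns one word into several fragments, each of which would become its own concrete `Text` object. `split_at_breaks` is a hypothetical helper, and the break offsets are supplied by hand here; a real implementation would obtain them from a hyphenation dictionary or algorithm.

```python
from typing import List, Sequence


def split_at_breaks(word: str, breaks: Sequence[int]) -> List[str]:
    """Split `word` at the given character offsets, hyphenating all but the last part."""
    parts: List[str] = []
    start = 0
    for offset in breaks:
        parts.append(word[start:offset] + "-")
        start = offset
    parts.append(word[start:])
    return parts


# Offsets 2 and 6 are illustrative break points, not computed ones.
fragments = split_at_breaks("hyphenation", breaks=[2, 6])
```

Removing the added hyphens and joining the fragments reproduces the original word, so the abstract content is recoverable from its concrete pieces.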

## Common Anti-Patterns to Avoid

### ❌ **Mixing Concerns**
```python
# WRONG: Abstract class knowing about pixels
class Word:
    def __init__(self, text):
        self.text = text
        self.rendered_width = None  # ❌ Concrete concern in abstract class
```

### ❌ **The `renderable_words` Concept**
```python
# WRONG: Confusing abstract and concrete
line.renderable_words  # ❌ This suggests Words are renderable
                      # Words are abstract - only Text objects render
```

### ✅ **Correct Separation**
```python
# CORRECT: Clear separation
abstract_word = Word("test", font)  # Abstract content
concrete_text = Text("test", font)  # Concrete rendering
line.text_objects.append(concrete_text)  # Concrete objects in concrete container
```

## Benefits of This Architecture

### 1. **Flexibility**
- Same content can be rendered at different sizes
- Multiple output formats from a single source
- Easy to implement responsive design

### 2. **Testability**
- Abstract logic can be tested without rendering
- Layout algorithms can be tested independently
- Visual rendering can be mocked
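
As a sketch of the last point, layout arithmetic can be checked with a mocked measurement function, so no font or image library is loaded. `total_width` is a hypothetical helper written for this example, not a pyWebLayout function.

```python
from unittest.mock import Mock


def total_width(words, measure, space_width):
    """Width of the words laid side by side, separated by single spaces."""
    if not words:
        return 0
    return sum(measure(w) for w in words) + space_width * (len(words) - 1)


# The mock replaces a real font measurement: 8 px per character.
fake_measure = Mock(side_effect=lambda text: 8 * len(text))
width = total_width(["This", "is"], fake_measure, space_width=4)
```

The mock also records how it was called, which lets a test confirm that each word was measured exactly once.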

### 3. **Performance**
- Abstract content can be cached and reused
- Layout can be computed once for multiple renderings
- Incremental updates possible
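
A minimal sketch of layout caching, assuming the abstract content is hashable: `functools.lru_cache` memoizes a layout function keyed on the words and the target width. `cached_layout` is hypothetical, and `textwrap.wrap` again stands in for an expensive pixel-based layout pass.

```python
import textwrap
from functools import lru_cache
from typing import Tuple


@lru_cache(maxsize=128)
def cached_layout(words: Tuple[str, ...], width_chars: int) -> Tuple[str, ...]:
    # Tuples keep both the cache key and the cached result hashable and immutable.
    return tuple(textwrap.wrap(" ".join(words), width=width_chars))


words = ("This", "is", "a", "test")
first = cached_layout(words, 7)
second = cached_layout(words, 7)  # Same arguments: served from the cache.
info = cached_layout.cache_info()
```

This only works because the content does not change between calls; a mutable paragraph would make the cache key unreliable.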

### 4. **Maintainability**
- Clear boundaries between text processing and rendering
- Changes to rendering don't affect content parsing
- Easy to swap rendering backends
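
Swapping backends can be pictured with a small interface that every renderer implements. Everything here is an assumption for illustration: `RenderBackend`, `PlainTextBackend`, and `SvgBackend` are hypothetical names, and the source mentions only PIL as an actual rendering technology.

```python
from typing import List, Protocol, Tuple

Placed = Tuple[int, int, str]  # (x, y, text) for one positioned fragment


class RenderBackend(Protocol):
    def draw(self, items: List[Placed]) -> str: ...


class PlainTextBackend:
    def draw(self, items: List[Placed]) -> str:
        return "\n".join(f"{x},{y}: {text}" for x, y, text in items)


class SvgBackend:
    def draw(self, items: List[Placed]) -> str:
        # No XML escaping is done here; the sketch assumes plain alphanumeric text.
        body = "".join(f'<text x="{x}" y="{y}">{text}</text>' for x, y, text in items)
        return f"<svg>{body}</svg>"


def render(items: List[Placed], backend: RenderBackend) -> str:
    # The caller depends only on the protocol, never on a specific backend.
    return backend.draw(items)


items = [(0, 0, "This"), (40, 0, "is")]
as_text = render(items, PlainTextBackend())
as_svg = render(items, SvgBackend())
```

The positioned items are computed once and handed to either backend unchanged, so adding a new output format does not touch parsing or layout code.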

## File Organization

```
pyWebLayout/
├── abstract/           # Content and structure
│   ├── block.py       # Document blocks (Paragraph, Heading, etc.)
│   ├── inline.py      # Inline content (Word, etc.)
│   ├── document.py    # Document structure
│   └── functional.py  # Links, buttons, etc.
│
├── concrete/          # Rendering and layout
│   ├── text.py        # Text and Line rendering
│   ├── page.py        # Page layout and containers
│   ├── box.py         # Base rendering classes
│   ├── image.py       # Image rendering
│   └── functional.py  # Interactive elements
│
├── typesetting/       # Layout algorithms
│   ├── paragraph_layout.py  # Abstract → Concrete transformation
│   ├── flow.py        # Text flow management
│   └── pagination.py  # Page breaking logic
│
└── style/             # Styling and formatting
    ├── fonts.py       # Font management
    ├── layout.py      # Layout constants
    └── alignment.py   # Alignment enums
```

## Conclusion

The Abstract/Concrete separation is fundamental to pyWebLayout's design. It keeps content processing independent of visual rendering, which makes document processing pipelines flexible, maintainable, and testable.

**Remember**: 
- **Abstract** = What to display (content, structure, semantics)
- **Concrete** = How to display it (pixels, fonts, positioning, rendering)

This architecture enables the library to handle complex document layouts while maintaining clear, understandable code organization.