dtourolle/pyWebLayout

Fork 0

Duncan Tourolle f7ad69f9ec first code commit

2025-05-27 11:58:19 +02:00

7.8 KiB

Raw Blame History

PyWebLayout Testing Strategy

This document outlines the comprehensive unit testing strategy for the pyWebLayout project.

Testing Philosophy

The testing strategy follows these principles:

Separation of Concerns: Each component is tested independently
Comprehensive Coverage: All public APIs and critical functionality are tested
Integration Testing: End-to-end workflows are validated
Regression Prevention: Tests prevent breaking changes
Documentation: Tests serve as living documentation of expected behavior

Test Organization

Current Test Files (Implemented)

✅ `test_html_style.py`

Tests the HTMLStyleManager class for CSS parsing and style management.

Coverage:

Style initialization and defaults
Style stack operations (push/pop)
CSS property parsing (font-size, font-weight, colors, etc.)
Color parsing (named, hex, rgb, rgba)
Tag-specific default styles
Inline style parsing
Font object creation
Style combination (tag + inline styles)

✅ `test_html_text.py`

Tests the HTMLTextProcessor class for text buffering and word creation.

Coverage:

Text buffer management
HTML entity reference handling
Character reference processing (decimal/hex)
Word creation with styling
Paragraph management
Text flushing operations
Buffer state operations

✅ `test_html_content.py`

Integration tests for the HTMLContentReader class covering complete HTML parsing.

Coverage:

Simple paragraph parsing
Heading levels (h1-h6)
Styled text (bold, italic)
Lists (ul, ol, dl)
Tables with headers and cells
Blockquotes with nested content
Code blocks with language detection
HTML entities
Nested element structures
Complex document parsing

✅ `test_abstract_blocks.py`

Tests for the core abstract block element classes.

Coverage:

Paragraph word management
Heading levels and properties
Quote nesting capabilities
Code block line management
List creation and item handling
Table structure (rows, cells, sections)
Image properties and scaling
Simple elements (hr, br)

✅ `test_runner.py`

Test runner script for executing all tests with summary reporting.

Additional Tests Needed

🔄 High Priority (Should Implement Next)

`test_abstract_inline.py`

Tests for inline elements and text formatting.

Needed Coverage:

Word creation and properties
Word hyphenation functionality
FormattedSpan management
Word chaining (previous/next relationships)
Font style application
Language-specific hyphenation

`test_abstract_document.py`

Tests for document structure and metadata.

Needed Coverage:

Document creation and initialization
Metadata management (title, author, language, etc.)
Block addition and management
Anchor creation and resolution
Resource management
Table of contents generation
Chapter and book structures

`test_abstract_functional.py`

Tests for functional elements (links, buttons, forms).

Needed Coverage:

Link creation and type detection
Link execution for different types
Button functionality and state
Form field management
Form validation and submission
Parameter handling

`test_style_system.py`

Tests for the style system (fonts, colors, alignment).

Needed Coverage:

Font creation and properties
Color representation and manipulation
Font weight, style, decoration enums
Alignment enums and behavior
Style inheritance and cascading

🔧 Medium Priority

`test_html_elements.py`

Unit tests for the HTML element handlers.

Needed Coverage:

BlockElementHandler individual methods
ListElementHandler state management
TableElementHandler complex scenarios
InlineElementHandler link processing
Handler coordination and delegation
Error handling in handlers

`test_html_metadata.py`

Tests for HTML metadata extraction.

Needed Coverage:

Meta tag parsing
Open Graph extraction
JSON-LD structured data
Title and description extraction
Language detection
Character encoding handling

`test_html_resources.py`

Tests for HTML resource extraction.

Needed Coverage:

CSS stylesheet extraction
JavaScript resource identification
Image source collection
Media element detection
External resource resolution
Base URL handling

`test_io_base.py`

Tests for the base reader architecture.

Needed Coverage:

BaseReader interface compliance
MetadataReader abstract methods
ContentReader abstract methods
ResourceReader abstract methods
CompositeReader coordination

🔍 Lower Priority

`test_concrete_elements.py`

Tests for concrete rendering implementations.

Needed Coverage:

Box model calculations
Text rendering specifics
Image rendering and scaling
Page layout management
Functional element rendering

`test_typesetting.py`

Tests for the typesetting system.

Needed Coverage:

Flow algorithms
Pagination logic
Document pagination
Line breaking
Hyphenation integration

`test_epub_reader.py`

Tests for EPUB reading functionality.

Needed Coverage:

EPUB file structure parsing
Manifest processing
Chapter extraction
Metadata reading
Navigation document parsing

`test_integration.py`

End-to-end integration tests.

Needed Coverage:

Complete HTML-to-document workflows
EPUB-to-document workflows
Style application across parsers
Resource resolution chains
Error handling scenarios

Testing Infrastructure

Test Dependencies

# Required for testing
unittest  # Built-in Python testing framework
unittest.mock  # For mocking and test doubles

Test Data

Create tests/data/ directory with sample files:
- sample.html - Well-formed HTML document
- complex.html - Complex nested HTML
- malformed.html - Edge cases and error conditions
- sample.epub - Sample EPUB file
- test_images/ - Sample images for testing

Continuous Integration

Tests should run on Python 3.6+
All tests must pass before merging
Aim for >90% code coverage
Performance regression testing for parsing speed

Running Tests

Run All Tests

python tests/test_runner.py

Run Specific Test Module

python tests/test_runner.py html_style
python -m unittest tests.test_html_style

Run Individual Test

python -m unittest tests.test_html_style.TestHTMLStyleManager.test_color_parsing

Run with Coverage

pip install coverage
coverage run -m unittest discover tests/
coverage report -m
coverage html  # Generate HTML report

Test Quality Guidelines

Test Naming

Test files: test_<module_name>.py
Test classes: Test<ClassName>
Test methods: test_<specific_functionality>

Test Structure

Arrange: Set up test data and mocks
Act: Execute the functionality being tested
Assert: Verify the expected behavior

Mock Usage

Mock external dependencies (file I/O, network)
Mock complex objects when testing units in isolation
Prefer real objects for integration tests

Edge Cases

Empty inputs
Invalid inputs
Boundary conditions
Error scenarios
Performance edge cases

Success Metrics

Coverage: >90% line coverage across all modules
Performance: No test takes longer than 1 second
Reliability: Tests pass consistently across environments
Maintainability: Tests are easy to understand and modify
Documentation: Tests clearly show expected behavior

Implementation Priority

Week 1: Complete high-priority abstract tests
Week 2: Implement HTML processing component tests
Week 3: Add integration and end-to-end tests
Week 4: Performance and edge case testing

This testing strategy ensures comprehensive coverage of the pyWebLayout library while maintaining good separation of concerns and providing clear documentation of expected behavior.

7.8 KiB Raw Blame History

PyWebLayout Testing Strategy

Testing Philosophy

Test Organization

Current Test Files (Implemented)

✅ test_html_style.py

✅ test_html_text.py

✅ test_html_content.py

✅ test_abstract_blocks.py

✅ test_runner.py

Additional Tests Needed

🔄 High Priority (Should Implement Next)

test_abstract_inline.py

test_abstract_document.py

test_abstract_functional.py

test_style_system.py

🔧 Medium Priority

test_html_elements.py

test_html_metadata.py

test_html_resources.py

test_io_base.py

🔍 Lower Priority

test_concrete_elements.py

test_typesetting.py

test_epub_reader.py

test_integration.py

Testing Infrastructure

Test Dependencies

Test Data

Continuous Integration

Running Tests

Run All Tests

Run Specific Test Module

Run Individual Test

Run with Coverage

Test Quality Guidelines

Test Naming

Test Structure

Mock Usage

Edge Cases

Success Metrics

Implementation Priority

7.8 KiB

Raw Blame History

✅ `test_html_style.py`

✅ `test_html_text.py`

✅ `test_html_content.py`

✅ `test_abstract_blocks.py`

✅ `test_runner.py`

`test_abstract_inline.py`

`test_abstract_document.py`

`test_abstract_functional.py`

`test_style_system.py`

`test_html_elements.py`

`test_html_metadata.py`

`test_html_resources.py`

`test_io_base.py`

`test_concrete_elements.py`

`test_typesetting.py`

`test_epub_reader.py`

`test_integration.py`