Cleaning unused prototypes
Some checks failed
Python CI / test (push) Failing after 6m20s

Duncan Tourolle 2025-11-04 19:36:34 +01:00
parent 993095caf9
commit 12d6fcd5db
5 changed files with 0 additions and 1820 deletions

@ -1,371 +0,0 @@
# Recursive Position System
A flexible, hierarchical position tracking system for dynamic content positioning in document layout applications.
## Overview
The Recursive Position System provides a powerful way to track positions within complex, nested document structures. Unlike traditional flat position systems that only track basic coordinates, this system can reference any type of content (words, images, table cells, list items, etc.) with full hierarchical context.
## Key Features
- **Hierarchical Position Tracking**: Navigate through nested document structures with precision
- **Dynamic Content Type Support**: Handle words, images, tables, lists, forms, and more
- **Flexible Serialization**: Save positions as JSON or Python shelf objects
- **Position Relationships**: Query ancestor/descendant relationships between positions
- **Fluent Builder Pattern**: Easy position creation with method chaining
- **Metadata Support**: Store rendering context (font scale, themes, etc.)
- **Real-world Applications**: Perfect for ereaders, document editors, and CMS systems
## Architecture
### Core Components
1. **ContentType Enum**: Defines all supported content types
2. **LocationNode**: Represents a single position within a content type
3. **RecursivePosition**: Hierarchical position with a path of LocationNodes
4. **PositionBuilder**: Fluent interface for creating positions
5. **PositionStorage**: Persistent storage with JSON and shelf support
### Position Hierarchy
Positions are represented as paths from document root to specific locations:
```
Document → Chapter[2] → Block[5] → Paragraph → Word[12] → Character[3]
Document → Chapter[1] → Block[3] → Table → Row[2] → Cell[1] → Word[0]
Document → Chapter[0] → Block[1] → Image
```
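The path notation above can be sketched with a small stand-in structure. This is only an illustration: `Node` and `format_path` are hypothetical names, not the module's API, and the rendering rule (append `+offset` only when the offset is nonzero) is an assumption chosen to match the printed form used in the usage examples.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for one path entry: a content type name plus an index."""
    kind: str
    index: int = 0
    offset: int = 0


def format_path(path: List[Node]) -> str:
    """Render each node as 'kind[index]', adding '+offset' when the offset is nonzero."""
    parts = []
    for node in path:
        text = f"{node.kind}[{node.index}]"
        if node.offset > 0:
            text += f"+{node.offset}"
        parts.append(text)
    return " -> ".join(parts)


word_path = [
    Node("document"),
    Node("chapter", 2),
    Node("block", 5),
    Node("paragraph"),
    Node("word", 12, offset=3),
]
print(format_path(word_path))
# document[0] -> chapter[2] -> block[5] -> paragraph[0] -> word[12]+3
```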
## Usage Examples
### Basic Position Creation
```python
from pyWebLayout.layout.recursive_position import PositionBuilder
# Create a word position with character-level precision
position = (PositionBuilder()
.chapter(2)
.block(5)
.paragraph()
.word(12, offset=3)
.with_rendering_metadata(font_scale=1.5, theme="dark")
.build())
print(position) # document[0] -> chapter[2] -> block[5] -> paragraph[0] -> word[12]+3
```
### Different Content Types
```python
from pyWebLayout.layout.recursive_position import (
create_word_position, create_image_position,
create_table_cell_position, create_list_item_position
)
# Word in a paragraph
word_pos = create_word_position(chapter=1, block=3, word=15, char_offset=2)
# Image in a block
image_pos = create_image_position(chapter=2, block=1, image_index=0)
# Cell in a table
table_pos = create_table_cell_position(chapter=0, block=4, row=2, col=1, word=5)
# Item in a list
list_pos = create_list_item_position(chapter=1, block=2, item=3, word=0)
```
### Complex Nested Structures
```python
# Position in a nested list
nested_pos = (PositionBuilder()
.chapter(2)
.block(5)
.list(0, list_type="ordered")
.list_item(2)
.list(1, list_type="unordered") # Nested list
.list_item(1)
.word(3)
.build())
# Position in a table cell with metadata
table_pos = (PositionBuilder()
.chapter(3)
.block(10)
.table(0, table_type="financial", columns=5)
.table_row(2, row_type="data")
.table_cell(1, cell_type="currency", format="USD")
.word(0, text="$1,234.56")
.build())
```
### Position Relationships
```python
# Check ancestor/descendant relationships
chapter_pos = PositionBuilder().chapter(1).block(2).build()
word_pos = PositionBuilder().chapter(1).block(2).paragraph().word(5).build()
print(chapter_pos.is_ancestor_of(word_pos)) # True
print(word_pos.is_descendant_of(chapter_pos)) # True
# Find common ancestors
other_pos = create_word_position(1, 3, 0) # Different block
common = word_pos.get_common_ancestor(other_pos)
print(common) # document[0] -> chapter[1]
```
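One way to read the ancestor relationship is as a strict prefix test on the two paths. The sketch below assumes each path entry can be reduced to a `(content type, index)` pair; `Step` and `is_ancestor` are hypothetical names used only for illustration.

```python
from typing import List, Tuple

Step = Tuple[str, int]  # (content type name, index); illustrative only


def is_ancestor(a: List[Step], b: List[Step]) -> bool:
    """Return True when path a is a strict prefix of path b."""
    return len(a) < len(b) and b[:len(a)] == a


block = [("document", 0), ("chapter", 1), ("block", 2)]
word = block + [("paragraph", 0), ("word", 5)]

print(is_ancestor(block, word))   # True
print(is_ancestor(word, block))   # False
print(is_ancestor(block, block))  # False: a position is not its own ancestor
```

Under this reading, a position is never an ancestor of itself, because the prefix must be strictly shorter.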
### Serialization and Storage
```python
from pyWebLayout.layout.recursive_position import PositionStorage
# JSON storage
storage = PositionStorage("bookmarks", use_shelf=False)
# Save positions
storage.save_position("my_document", "bookmark1", position)
storage.save_position("my_document", "bookmark2", other_pos)
# Load positions
loaded = storage.load_position("my_document", "bookmark1")
all_bookmarks = storage.list_positions("my_document")
# Shelf storage (binary, more efficient for large datasets)
shelf_storage = PositionStorage("bookmarks", use_shelf=True)
shelf_storage.save_position("my_document", "bookmark1", position)
```
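As a rough illustration of what a JSON round trip involves, the sketch below uses only the standard `json` module on a plain dictionary. The dictionary layout (a `path` list of nodes plus `rendering_metadata`) is an assumption about the serialized form, not a guaranteed file format.

```python
import json

# Assumed serialized layout: a list of path nodes plus rendering metadata.
position = {
    "path": [
        {"content_type": "document", "index": 0, "offset": 0, "metadata": {}},
        {"content_type": "chapter", "index": 2, "offset": 0, "metadata": {}},
        {"content_type": "word", "index": 12, "offset": 3, "metadata": {}},
    ],
    "rendering_metadata": {"font_scale": 1.5, "page_size": [800, 600]},
}

encoded = json.dumps(position, indent=2)
restored = json.loads(encoded)
print(restored == position)  # True

# JSON has no tuple type, so a tuple comes back as a list.
round_tripped = json.loads(json.dumps({"page_size": (800, 600)}))
print(round_tripped["page_size"])  # [800, 600]
```

Because tuples come back as lists, metadata such as `page_size` is safest stored as a list if positions are compared after loading.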
## Content Types
The system supports the following content types:
| Type | Description | Example Usage |
|------|-------------|---------------|
| `DOCUMENT` | Document root | Always present as root node |
| `CHAPTER` | Document chapters/sections | Chapter navigation |
| `BLOCK` | Block-level elements | Paragraphs, headings, tables |
| `PARAGRAPH` | Text paragraphs | Text content |
| `HEADING` | Section headings | H1-H6 elements |
| `TABLE` | Table structures | Data tables |
| `TABLE_ROW` | Table rows | Row navigation |
| `TABLE_CELL` | Table cells | Cell-specific content |
| `LIST` | List structures | Ordered/unordered lists |
| `LIST_ITEM` | List items | Individual list entries |
| `WORD` | Individual words | Word-level precision |
| `IMAGE` | Images | Visual content |
| `LINK` | Hyperlinks | Interactive links |
| `BUTTON` | Interactive buttons | Form controls |
| `FORM_FIELD` | Form input fields | User input |
| `LINE` | Rendered text lines | Layout-specific |
| `PAGE` | Rendered pages | Pagination |
## Ereader Integration
The system is designed for ereader applications with features like:
### Bookmark Management
```python
# Save reading position with context
reading_pos = (PositionBuilder()
.chapter(3)
.block(15)
.paragraph()
.word(23, offset=7)
.with_rendering_metadata(
font_scale=1.2,
page_size=[600, 800],
theme="sepia"
)
.build())
storage.save_position("novel", "chapter3_climax", reading_pos)
```
### Chapter Navigation
```python
# Jump to chapter start
chapter_start = PositionBuilder().chapter(5).block(0).paragraph().word(0).build()
# Navigate within chapter
current_pos = PositionBuilder().chapter(5).block(12).paragraph().word(45).build()
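# Check if positions are in the same chapter
# (ContentType is imported from the same module as PositionBuilder)
same_chapter = chapter_start.get_common_ancestor(current_pos)
chapter_node = same_chapter.get_node(ContentType.CHAPTER)
if chapter_node is not None:  # None when the positions are in different chapters
    print(f"Both in chapter {chapter_node.index}")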
```
### Font Scaling Support
```python
# Position with rendering metadata
position = (PositionBuilder()
.chapter(2)
.block(8)
.paragraph()
.word(15)
.with_rendering_metadata(
font_scale=1.5,
page_size=[800, 600],
line_height=24,
theme="dark"
)
.build())
# Metadata persists through serialization
json_str = position.to_json()
restored = RecursivePosition.from_json(json_str)
print(restored.rendering_metadata["font_scale"]) # 1.5
```
## Advanced Features
### Position Navigation
```python
# Truncate position to specific level
word_pos = create_word_position(2, 5, 12, 3)
block_pos = word_pos.copy().truncate_to_type(ContentType.BLOCK)
print(block_pos) # document[0] -> chapter[2] -> block[5]
# Navigate between related positions
table_cell_pos = create_table_cell_position(1, 3, 2, 1, 0)
next_cell_pos = table_cell_pos.copy()
cell_node = next_cell_pos.get_node(ContentType.TABLE_CELL)
cell_node.index = 2 # Move to next column
```
### Metadata Usage
```python
# Rich metadata support
position = (PositionBuilder()
.chapter(1)
.block(5)
.table(0,
table_type="financial",
columns=5,
rows=20,
title="Q3 Results")
.table_row(3,
row_type="data",
category="revenue")
.table_cell(2,
cell_type="currency",
format="USD",
precision=2)
.word(0, text="$1,234,567.89")
.build())
# Access metadata
table_node = position.get_node(ContentType.TABLE)
print(table_node.metadata["title"]) # "Q3 Results"
cell_node = position.get_node(ContentType.TABLE_CELL)
print(cell_node.metadata["format"]) # "USD"
```
## Performance Considerations
### Memory Usage
- Positions are lightweight (typically < 1KB serialized)
- Path-based structure minimizes memory overhead
- Metadata is optional and only stored when needed
### Serialization Performance
- **JSON**: Human-readable, cross-platform, ~2-3x larger than the shelf format
- **Shelf**: Binary format, faster for large datasets, Python-specific
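The two formats can be compared with the standard library alone. The following is a minimal sketch, not the storage class itself: the file names and the bookmark dictionary are made up for illustration, and it only shows that both `json` and `shelve` return the same data.

```python
import json
import shelve
import tempfile
from pathlib import Path

# Illustrative bookmark data; the layout is an assumption.
bookmark = {
    "path": [{"content_type": "chapter", "index": 3}],
    "rendering_metadata": {"theme": "sepia"},
}

with tempfile.TemporaryDirectory() as tmp:
    # JSON: plain text, readable from any language
    json_file = Path(tmp) / "bookmarks.json"
    json_file.write_text(json.dumps({"bookmark1": bookmark}))
    from_json = json.loads(json_file.read_text())["bookmark1"]

    # Shelf: pickle-backed key/value store, readable only from Python
    shelf_path = str(Path(tmp) / "bookmarks_shelf")
    with shelve.open(shelf_path) as shelf:
        shelf["bookmark1"] = bookmark
    with shelve.open(shelf_path) as shelf:
        from_shelf = shelf["bookmark1"]

print(from_json == from_shelf == bookmark)  # True
```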
### Comparison Operations
- Position equality: O(n) where n is path depth
- Ancestor/descendant checks: O(min(depth1, depth2))
- Common ancestor finding: O(min(depth1, depth2))
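The linear bound on common ancestor finding follows from walking both paths in step and stopping at the first mismatch. A minimal sketch, assuming path entries reduce to `(content type, index)` pairs and using the hypothetical name `common_prefix`:

```python
from typing import List, Tuple

Step = Tuple[str, int]  # (content type name, index); illustrative only


def common_prefix(a: List[Step], b: List[Step]) -> List[Step]:
    """Collect matching leading entries; at most min(len(a), len(b)) comparisons."""
    shared = []
    for left, right in zip(a, b):
        if left != right:
            break
        shared.append(left)
    return shared


word_a = [("document", 0), ("chapter", 1), ("block", 2), ("paragraph", 0), ("word", 5)]
word_b = [("document", 0), ("chapter", 1), ("block", 3), ("paragraph", 0), ("word", 0)]

print(common_prefix(word_a, word_b))
# [('document', 0), ('chapter', 1)]
```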
## Integration with Existing Systems
### Backward Compatibility
The system can coexist with existing position tracking:
```python
# Convert from old RenderingPosition
def convert_old_position(old_pos):
    return (PositionBuilder()
            .chapter(old_pos.chapter_index)
            .block(old_pos.block_index)
            .paragraph()
            .word(old_pos.word_index)
            .build())

# Convert to old format (lossy)
def convert_to_old(recursive_pos):
    chapter_node = recursive_pos.get_node(ContentType.CHAPTER)
    block_node = recursive_pos.get_node(ContentType.BLOCK)
    word_node = recursive_pos.get_node(ContentType.WORD)
    return RenderingPosition(
        chapter_index=chapter_node.index if chapter_node else 0,
        block_index=block_node.index if block_node else 0,
        word_index=word_node.index if word_node else 0
    )
```
### Migration Strategy
1. **Phase 1**: Implement recursive system alongside existing system
2. **Phase 2**: Update bookmark storage to use new format
3. **Phase 3**: Migrate existing bookmarks
4. **Phase 4**: Update layout engines to generate recursive positions
5. **Phase 5**: Remove old position system
## Testing
The test suite covers:
- Position creation and manipulation
- Serialization/deserialization
- Storage systems (JSON and shelf)
- Position relationships
- Real-world scenarios
- Performance benchmarks
Run tests with:
```bash
python -m pytest tests/layout/test_recursive_position.py -v
```
## Examples
See `examples/recursive_position_demo.py` for a complete demonstration of all features.
## Future Enhancements
Potential improvements:
1. **Position Comparison**: Implement `<`, `>`, `<=`, `>=` operators for sorting
2. **Path Compression**: Optimize storage for deep hierarchies
3. **Query Language**: SQL-like queries for position sets
4. **Indexing**: B-tree indexing for large position collections
5. **Diff Operations**: Calculate differences between positions
6. **Batch Operations**: Efficient bulk position updates
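As a sketch of how the first item (position comparison) might look, paths can be ordered by comparing their `(index, offset)` pairs from the root down. This assumes the compared positions share the same structure; how to order positions with different content types at the same depth is left open. `sort_key` and the bookmark names are hypothetical.

```python
from typing import Dict, List, Tuple

PathEntry = Tuple[str, int, int]  # (content type name, index, offset); illustrative only


def sort_key(path: List[PathEntry]) -> Tuple[Tuple[int, int], ...]:
    """Reduce a path to (index, offset) pairs so tuples compare in document order."""
    return tuple((index, offset) for _kind, index, offset in path)


bookmarks: Dict[str, List[PathEntry]] = {
    "late": [("document", 0, 0), ("chapter", 5, 0), ("block", 12, 0),
             ("paragraph", 0, 0), ("word", 23, 7)],
    "early": [("document", 0, 0), ("chapter", 1, 0), ("block", 0, 0),
              ("paragraph", 0, 0), ("word", 0, 0)],
    "middle": [("document", 0, 0), ("chapter", 2, 0), ("block", 15, 0),
               ("paragraph", 0, 0), ("word", 8, 0)],
}

ordered = sorted(bookmarks, key=lambda name: sort_key(bookmarks[name]))
print(ordered)  # ['early', 'middle', 'late']
```

With this key, a shorter path that is a prefix of a longer one sorts first, so a chapter position would come before any word inside that chapter.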
## Conclusion
The Recursive Position System provides a robust, flexible foundation for position tracking in complex document structures. Its hierarchical approach, rich metadata support, and efficient serialization make it ideal for modern ereader applications and document management systems.
The system's design prioritizes:
- **Flexibility**: Handle any content type or nesting level
- **Performance**: Efficient operations and minimal memory usage
- **Usability**: Intuitive builder pattern and clear APIs
- **Persistence**: Reliable serialization and storage options
- **Extensibility**: Easy to add new content types and features
This makes it a significant improvement over traditional flat position systems and provides a solid foundation for advanced document navigation features.

@ -56,10 +56,6 @@ These examples demonstrate rendering HTML content to multi-page layouts:
For detailed information about HTML rendering, see `README_HTML_MULTIPAGE.md`.
### Advanced Topics
**`recursive_position_demo.py`** - Demonstrates the recursive position tracking system
## Documentation
- `README_EREADER.md` - Detailed EbookReader API documentation

@ -1,386 +0,0 @@
#!/usr/bin/env python3
"""
Demonstration of the Recursive Position System
This example shows how to use the hierarchical position tracking system
that can reference any type of content (words, images, table cells, etc.)
in a nested document structure.
Key Features Demonstrated:
- Hierarchical position tracking
- Dynamic content type support
- JSON and shelf serialization
- Position relationships (ancestor/descendant)
- Bookmark management
- Real-world ereader scenarios
"""
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))
from pyWebLayout.layout.recursive_position import (
ContentType, LocationNode, RecursivePosition, PositionBuilder, PositionStorage,
create_word_position, create_image_position, create_table_cell_position, create_list_item_position
)
def demonstrate_basic_position_creation():
    """Show basic position creation and manipulation"""
    print("=== Basic Position Creation ===")
    # Create a position using the builder pattern
    position = (PositionBuilder()
                .chapter(2)
                .block(5)
                .paragraph()
                .word(12, offset=3)
                .with_rendering_metadata(font_scale=1.5, page_size=[800, 600])
                .build())
    print(f"Position path: {position}")
    print(f"Depth: {position.get_depth()}")
    print(f"Leaf node: {position.get_leaf_node()}")
    # Query specific nodes
    chapter_node = position.get_node(ContentType.CHAPTER)
    word_node = position.get_node(ContentType.WORD)
    print(f"Chapter: {chapter_node.index}")
    print(f"Word: {word_node.index}, offset: {word_node.offset}")
    print(f"Font scale: {position.rendering_metadata.get('font_scale')}")
    print()

def demonstrate_different_content_types():
    """Show positions for different content types"""
    print("=== Different Content Types ===")
    # Word position
    word_pos = create_word_position(1, 3, 15, 2)
    print(f"Word position: {word_pos}")
    # Image position
    image_pos = create_image_position(2, 1, 0)
    print(f"Image position: {image_pos}")
    # Table cell position
    table_pos = create_table_cell_position(0, 4, 2, 1, 5)
    print(f"Table cell position: {table_pos}")
    # List item position
    list_pos = create_list_item_position(1, 2, 3, 0)
    print(f"List item position: {list_pos}")
    # Complex nested structure
    complex_pos = (PositionBuilder()
                   .chapter(3)
                   .block(7)
                   .table(0, table_type="data", columns=4)
                   .table_row(2, row_type="header")
                   .table_cell(1, cell_type="data", colspan=2)
                   .link(0, url="https://example.com", text="Click here")
                   .build())
    print(f"Complex nested position: {complex_pos}")
    print()

def demonstrate_position_relationships():
    """Show ancestor/descendant relationships"""
    print("=== Position Relationships ===")
    # Create related positions
    chapter_pos = (PositionBuilder()
                   .chapter(1)
                   .block(2)
                   .build())
    paragraph_pos = (PositionBuilder()
                     .chapter(1)
                     .block(2)
                     .paragraph()
                     .build())
    word_pos = (PositionBuilder()
                .chapter(1)
                .block(2)
                .paragraph()
                .word(5)
                .build())
    # Test relationships
    print(f"Chapter position: {chapter_pos}")
    print(f"Paragraph position: {paragraph_pos}")
    print(f"Word position: {word_pos}")
    print(f"Chapter is ancestor of paragraph: {chapter_pos.is_ancestor_of(paragraph_pos)}")
    print(f"Chapter is ancestor of word: {chapter_pos.is_ancestor_of(word_pos)}")
    print(f"Word is descendant of chapter: {word_pos.is_descendant_of(chapter_pos)}")
    # Find common ancestors
    unrelated_pos = create_word_position(2, 1, 0)  # Different chapter
    common = word_pos.get_common_ancestor(unrelated_pos)
    print(f"Common ancestor of word and unrelated: {common}")
    print()

def demonstrate_serialization():
    """Show JSON and shelf serialization"""
    print("=== Serialization ===")
    # Create a complex position
    position = (PositionBuilder()
                .chapter(4)
                .block(8)
                .table(0, table_type="financial", columns=5, rows=20)
                .table_row(3, row_type="data", category="Q2")
                .table_cell(2, cell_type="currency", format="USD")
                .word(0, text="$1,234.56")
                .with_rendering_metadata(
                    font_scale=1.2,
                    page_size=[600, 800],
                    theme="light",
                    currency_format="USD"
                )
                .build())
    # JSON serialization
    json_str = position.to_json()
    print("JSON serialization:")
    print(json_str[:200] + "..." if len(json_str) > 200 else json_str)
    # Deserialize and verify
    restored = RecursivePosition.from_json(json_str)
    print(f"Restored position equals original: {position == restored}")
    print()

def demonstrate_storage_systems():
    """Show both JSON and shelf storage"""
    print("=== Storage Systems ===")
    # Create test positions
    positions = {
        "bookmark1": create_word_position(1, 5, 20, 3),
        "bookmark2": create_image_position(2, 3, 1),
        "bookmark3": create_table_cell_position(3, 1, 2, 1, 0)
    }
    # Test JSON storage
    print("JSON Storage:")
    json_storage = PositionStorage("demo_positions_json", use_shelf=False)
    for name, pos in positions.items():
        json_storage.save_position("demo_doc", name, pos)
        print(f" Saved {name}: {pos}")
    # List and load positions
    saved_positions = json_storage.list_positions("demo_doc")
    print(f" Saved positions: {saved_positions}")
    loaded = json_storage.load_position("demo_doc", "bookmark1")
    print(f" Loaded bookmark1: {loaded}")
    print(f" Matches original: {loaded == positions['bookmark1']}")
    # Test shelf storage
    print("\nShelf Storage:")
    shelf_storage = PositionStorage("demo_positions_shelf", use_shelf=True)
    for name, pos in positions.items():
        shelf_storage.save_position("demo_doc", name, pos)
    shelf_positions = shelf_storage.list_positions("demo_doc")
    print(f" Shelf positions: {shelf_positions}")
    # Clean up demo files; ignore_errors removes each directory independently
    # instead of hiding every exception behind a bare except
    import shutil
    shutil.rmtree("demo_positions_json", ignore_errors=True)
    shutil.rmtree("demo_positions_shelf", ignore_errors=True)
    print()

def demonstrate_ereader_scenario():
    """Show realistic ereader bookmark scenario"""
    print("=== Ereader Bookmark Scenario ===")
    # Simulate user reading progress
    reading_positions = [
        # User starts reading chapter 1
        (PositionBuilder()
         .chapter(1)
         .block(0)
         .paragraph()
         .word(0)
         .with_rendering_metadata(font_scale=1.0, page_size=[600, 800], theme="light")
         .build(), "Chapter 1 Start"),
        # User bookmarks an interesting quote in chapter 2
        (PositionBuilder()
         .chapter(2)
         .block(15)
         .paragraph()
         .word(8, offset=0)
         .with_rendering_metadata(font_scale=1.2, page_size=[600, 800], theme="sepia")
         .build(), "Interesting Quote"),
        # User bookmarks a table in chapter 3
        (PositionBuilder()
         .chapter(3)
         .block(22)
         .table(0, table_type="data", title="Sales Figures")
         .table_row(1, row_type="header")
         .table_cell(0, cell_type="header", text="Quarter")
         .with_rendering_metadata(font_scale=1.1, page_size=[600, 800], theme="dark")
         .build(), "Sales Table"),
        # User bookmarks an image caption
        (PositionBuilder()
         .chapter(4)
         .block(8)
         .image(0, alt_text="Company Logo", caption="Figure 4.1: Corporate Identity")
         .with_rendering_metadata(font_scale=1.0, page_size=[600, 800], theme="light")
         .build(), "Logo Image"),
        # User's current reading position (with character-level precision)
        (PositionBuilder()
         .chapter(5)
         .block(12)
         .paragraph()
         .word(23, offset=7)  # 7 characters into word 23
         .with_rendering_metadata(font_scale=1.3, page_size=[600, 800], theme="dark")
         .build(), "Current Position")
    ]
    # Save all bookmarks
    storage = PositionStorage("ereader_bookmarks", use_shelf=False)
    for position, description in reading_positions:
        bookmark_name = description.lower().replace(" ", "_")
        storage.save_position("my_novel", bookmark_name, position)
        print(f"Saved bookmark '{description}': {position}")
    print(f"\nTotal bookmarks: {len(storage.list_positions('my_novel'))}")
    # Demonstrate bookmark navigation
    print("\n--- Bookmark Navigation ---")
    current_pos = reading_positions[-1][0]  # Current reading position
    for position, description in reading_positions[:-1]:  # All except current
        # Calculate relationship to current position
        if position.is_ancestor_of(current_pos):
            relationship = "ancestor of current"
        elif current_pos.is_ancestor_of(position):
            relationship = "descendant of current"
        else:
            common = position.get_common_ancestor(current_pos)
            if len(common.path) > 1:
                relationship = f"shares {common.get_leaf_node().content_type.value} with current"
            else:
                relationship = "unrelated to current"
        print(f"'{description}' is {relationship}")
    # Clean up. shutil is imported here because no module-level import exists;
    # previously the resulting NameError was swallowed by a bare except.
    import shutil
    shutil.rmtree("ereader_bookmarks", ignore_errors=True)
    print()

def demonstrate_advanced_navigation():
    """Show advanced navigation scenarios"""
    print("=== Advanced Navigation Scenarios ===")
    # Multi-level list navigation
    print("Multi-level List Navigation:")
    nested_list_pos = (PositionBuilder()
                       .chapter(2)
                       .block(5)
                       .list(0, list_type="ordered", title="Main Topics")
                       .list_item(2, text="Data Structures")
                       .list(1, list_type="unordered", title="Subtopics")
                       .list_item(1, text="Hash Tables")
                       .word(3, text="implementation")
                       .build())
    print(f" Nested list position: {nested_list_pos}")
    # Navigate to parent list item
    parent_item_pos = nested_list_pos.copy().truncate_to_type(ContentType.LIST_ITEM)
    print(f" Parent list item: {parent_item_pos}")
    # Navigate to main list
    main_list_pos = nested_list_pos.copy().truncate_to_type(ContentType.LIST)
    print(f" Main list: {main_list_pos}")
    # Table navigation
    print("\nTable Navigation:")
    table_pos = (PositionBuilder()
                 .chapter(3)
                 .block(10)
                 .table(0, table_type="comparison", rows=5, columns=3)
                 .table_row(2, row_type="data")
                 .table_cell(1, cell_type="data", header="Price")
                 .word(0, text="$99.99")
                 .build())
    print(f" Table cell position: {table_pos}")
    # Navigate to different cells in same row
    next_cell_pos = table_pos.copy()
    cell_node = next_cell_pos.get_node(ContentType.TABLE_CELL)
    cell_node.index = 2  # Move to next column
    cell_node.metadata["header"] = "Quantity"
    word_node = next_cell_pos.get_node(ContentType.WORD)
    # The builder stores text in the node's metadata, so update it there
    word_node.metadata["text"] = "5"
    print(f" Next cell position: {next_cell_pos}")
    # Verify they share the same row
    common = table_pos.get_common_ancestor(next_cell_pos)
    row_node = common.get_node(ContentType.TABLE_ROW)
    print(f" Shared row index: {row_node.index if row_node else 'None'}")
    print()

def main():
    """Run all demonstrations"""
    print("Recursive Position System Demonstration")
    print("=" * 50)
    print()
    demonstrate_basic_position_creation()
    demonstrate_different_content_types()
    demonstrate_position_relationships()
    demonstrate_serialization()
    demonstrate_storage_systems()
    demonstrate_ereader_scenario()
    demonstrate_advanced_navigation()
    print("=== Summary ===")
    print("The Recursive Position System provides:")
    print("✓ Hierarchical position tracking for any content type")
    print("✓ Dynamic content type support (words, images, tables, lists, etc.)")
    print("✓ Flexible serialization (JSON and Python shelf)")
    print("✓ Position relationships (ancestor/descendant queries)")
    print("✓ Fluent builder pattern for easy position creation")
    print("✓ Metadata support for rendering context")
    print("✓ Real-world ereader bookmark management")
    print("✓ Advanced navigation capabilities")
    print()
    print("This system is ideal for:")
    print("• Ereader applications with precise bookmarking")
    print("• Document editors with complex navigation")
    print("• Content management systems")
    print("• Any application requiring hierarchical position tracking")


if __name__ == "__main__":
    main()

@ -1,481 +0,0 @@
"""
Recursive location index system for dynamic content positioning.
This module provides a flexible, hierarchical position tracking system that can
reference any type of content (words, images, table cells, list items, etc.)
in a nested document structure.
"""
from __future__ import annotations
from dataclasses import dataclass, field
from typing import List, Dict, Any, Optional, Union, Tuple
from enum import Enum
import json
import pickle
import shelve
from pathlib import Path
class ContentType(Enum):
    """Types of content that can be referenced in the position index"""
    DOCUMENT = "document"
    CHAPTER = "chapter"
    BLOCK = "block"
    PARAGRAPH = "paragraph"
    HEADING = "heading"
    TABLE = "table"
    TABLE_ROW = "table_row"
    TABLE_CELL = "table_cell"
    LIST = "list"
    LIST_ITEM = "list_item"
    WORD = "word"
    IMAGE = "image"
    LINK = "link"
    BUTTON = "button"
    FORM_FIELD = "form_field"
    LINE = "line"  # Rendered line of text
    PAGE = "page"  # Rendered page

@dataclass
class LocationNode:
    """
    A single node in the recursive location index.
    Each node represents a position within a specific content type.
    """
    content_type: ContentType
    index: int = 0  # Position within this content type
    offset: int = 0  # Offset within the indexed item (e.g., character offset in word)
    metadata: Dict[str, Any] = field(default_factory=dict)  # Additional context

    def to_dict(self) -> Dict[str, Any]:
        """Serialize node to dictionary"""
        return {
            'content_type': self.content_type.value,
            'index': self.index,
            'offset': self.offset,
            'metadata': self.metadata
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> 'LocationNode':
        """Deserialize node from dictionary"""
        return cls(
            content_type=ContentType(data['content_type']),
            index=data['index'],
            offset=data['offset'],
            metadata=data.get('metadata', {})
        )

    def __str__(self) -> str:
        """Human-readable representation"""
        if self.offset > 0:
            return f"{self.content_type.value}[{self.index}]+{self.offset}"
        return f"{self.content_type.value}[{self.index}]"

@dataclass
class RecursivePosition:
    """
    Hierarchical position that can reference any nested content structure.
    The path represents a traversal from document root to the specific location:
    - Document -> Chapter[2] -> Block[5] -> Paragraph -> Word[12] -> Character[3]
    - Document -> Chapter[1] -> Block[3] -> Table -> Row[2] -> Cell[1] -> Word[0]
    - Document -> Chapter[0] -> Block[1] -> Image
    """
    path: List[LocationNode] = field(default_factory=list)
    rendering_metadata: Dict[str, Any] = field(default_factory=dict)  # Font scale, page size, etc.

    def __post_init__(self):
        """Ensure we always have at least a document root"""
        if not self.path:
            self.path = [LocationNode(ContentType.DOCUMENT)]

    def copy(self) -> 'RecursivePosition':
        """Create a deep copy of this position"""
        return RecursivePosition(
            path=[LocationNode(node.content_type, node.index, node.offset, node.metadata.copy())
                  for node in self.path],
            rendering_metadata=self.rendering_metadata.copy()
        )

    def get_node(self, content_type: ContentType) -> Optional[LocationNode]:
        """Get the first node of a specific content type in the path"""
        for node in self.path:
            if node.content_type == content_type:
                return node
        return None

    def get_nodes(self, content_type: ContentType) -> List[LocationNode]:
        """Get all nodes of a specific content type in the path"""
        return [node for node in self.path if node.content_type == content_type]

    def add_node(self, node: LocationNode) -> 'RecursivePosition':
        """Add a node to the path (returns self for chaining)"""
        self.path.append(node)
        return self

    def pop_node(self) -> Optional[LocationNode]:
        """Remove and return the last node in the path"""
        if len(self.path) > 1:  # Keep at least document root
            return self.path.pop()
        return None

    def get_depth(self) -> int:
        """Get the depth of the position (number of nodes)"""
        return len(self.path)

    def get_leaf_node(self) -> LocationNode:
        """Get the deepest (most specific) node in the path"""
        return self.path[-1] if self.path else LocationNode(ContentType.DOCUMENT)

    def truncate_to_type(self, content_type: ContentType) -> 'RecursivePosition':
        """Truncate path to end at the first occurrence of the given content type"""
        for i, node in enumerate(self.path):
            if node.content_type == content_type:
                self.path = self.path[:i+1]
                break
        return self

    def is_ancestor_of(self, other: 'RecursivePosition') -> bool:
        """Check if this position is an ancestor of another position"""
        if len(self.path) >= len(other.path):
            return False
        for i, node in enumerate(self.path):
            if i >= len(other.path):
                return False
            other_node = other.path[i]
            if (node.content_type != other_node.content_type or
                    node.index != other_node.index):
                return False
        return True

    def is_descendant_of(self, other: 'RecursivePosition') -> bool:
        """Check if this position is a descendant of another position"""
        return other.is_ancestor_of(self)

    def get_common_ancestor(self, other: 'RecursivePosition') -> 'RecursivePosition':
        """Find the deepest common ancestor with another position"""
        common_path = []
        min_length = min(len(self.path), len(other.path))
        for i in range(min_length):
            if (self.path[i].content_type == other.path[i].content_type and
                    self.path[i].index == other.path[i].index):
                common_path.append(self.path[i])
            else:
                break
        return RecursivePosition(path=common_path)

    def to_dict(self) -> Dict[str, Any]:
        """Serialize position to dictionary for JSON storage"""
        return {
            'path': [node.to_dict() for node in self.path],
            'rendering_metadata': self.rendering_metadata
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> 'RecursivePosition':
        """Deserialize position from dictionary"""
        return cls(
            path=[LocationNode.from_dict(node_data) for node_data in data['path']],
            rendering_metadata=data.get('rendering_metadata', {})
        )

    def to_json(self) -> str:
        """Serialize to JSON string"""
        return json.dumps(self.to_dict(), indent=2)

    @classmethod
    def from_json(cls, json_str: str) -> 'RecursivePosition':
        """Deserialize from JSON string"""
        return cls.from_dict(json.loads(json_str))

    def __str__(self) -> str:
        """Human-readable path representation"""
        return " -> ".join(str(node) for node in self.path)

    def __eq__(self, other) -> bool:
        """Check equality with another position"""
        if not isinstance(other, RecursivePosition):
            return False
        return (self.path == other.path and
                self.rendering_metadata == other.rendering_metadata)

    def __hash__(self) -> int:
        """Make position hashable for use as dict key"""
        path_tuple = tuple((node.content_type, node.index, node.offset) for node in self.path)
        return hash(path_tuple)

class PositionBuilder:
    """
    Builder class for constructing RecursivePosition objects fluently.
    Example usage:
        position = (PositionBuilder()
                    .chapter(2)
                    .block(5)
                    .paragraph()
                    .word(12, offset=3)
                    .build())
    """

    def __init__(self):
        self._position = RecursivePosition()

    def document(self, index: int = 0, **metadata) -> 'PositionBuilder':
        """Add document node"""
        self._position.add_node(LocationNode(ContentType.DOCUMENT, index, metadata=metadata))
        return self

    def chapter(self, index: int, **metadata) -> 'PositionBuilder':
        """Add chapter node"""
        self._position.add_node(LocationNode(ContentType.CHAPTER, index, metadata=metadata))
        return self

    def block(self, index: int, **metadata) -> 'PositionBuilder':
        """Add block node"""
        self._position.add_node(LocationNode(ContentType.BLOCK, index, metadata=metadata))
        return self

    def paragraph(self, index: int = 0, **metadata) -> 'PositionBuilder':
        """Add paragraph node"""
        self._position.add_node(LocationNode(ContentType.PARAGRAPH, index, metadata=metadata))
        return self

    def heading(self, index: int = 0, **metadata) -> 'PositionBuilder':
        """Add heading node"""
        self._position.add_node(LocationNode(ContentType.HEADING, index, metadata=metadata))
        return self

    def table(self, index: int = 0, **metadata) -> 'PositionBuilder':
        """Add table node"""
        self._position.add_node(LocationNode(ContentType.TABLE, index, metadata=metadata))
        return self

    def table_row(self, index: int, **metadata) -> 'PositionBuilder':
        """Add table row node"""
        self._position.add_node(LocationNode(ContentType.TABLE_ROW, index, metadata=metadata))
        return self

    def table_cell(self, index: int, **metadata) -> 'PositionBuilder':
        """Add table cell node"""
        self._position.add_node(LocationNode(ContentType.TABLE_CELL, index, metadata=metadata))
        return self

    def list(self, index: int = 0, **metadata) -> 'PositionBuilder':
        """Add list node"""
        self._position.add_node(LocationNode(ContentType.LIST, index, metadata=metadata))
        return self

    def list_item(self, index: int, **metadata) -> 'PositionBuilder':
        """Add list item node"""
        self._position.add_node(LocationNode(ContentType.LIST_ITEM, index, metadata=metadata))
        return self

    def word(self, index: int, offset: int = 0, **metadata) -> 'PositionBuilder':
        """Add word node"""
        self._position.add_node(LocationNode(ContentType.WORD, index, offset, metadata=metadata))
        return self

    def image(self, index: int = 0, **metadata) -> 'PositionBuilder':
        """Add image node"""
        self._position.add_node(LocationNode(ContentType.IMAGE, index, metadata=metadata))
        return self

    def link(self, index: int, **metadata) -> 'PositionBuilder':
        """Add link node"""
        self._position.add_node(LocationNode(ContentType.LINK, index, metadata=metadata))
        return self

    def button(self, index: int, **metadata) -> 'PositionBuilder':
        """Add button node"""
        self._position.add_node(LocationNode(ContentType.BUTTON, index, metadata=metadata))
        return self

    def form_field(self, index: int, **metadata) -> 'PositionBuilder':
        """Add form field node"""
        self._position.add_node(LocationNode(ContentType.FORM_FIELD, index, metadata=metadata))
        return self

    def line(self, index: int, **metadata) -> 'PositionBuilder':
        """Add rendered line node"""
        self._position.add_node(LocationNode(ContentType.LINE, index, metadata=metadata))
        return self

    def page(self, index: int, **metadata) -> 'PositionBuilder':
        """Add page node"""
        self._position.add_node(LocationNode(ContentType.PAGE, index, metadata=metadata))
        return self

    def with_rendering_metadata(self, **metadata) -> 'PositionBuilder':
        """Add rendering metadata (font scale, page size, etc.)"""
        self._position.rendering_metadata.update(metadata)
        return self

    def build(self) -> RecursivePosition:
        """Build and return the final position"""
        return self._position

class PositionStorage:
"""
Storage manager for recursive positions supporting both JSON and shelf formats.
"""
def __init__(self, storage_dir: str = "positions", use_shelf: bool = False):
"""
Initialize position storage.
Args:
storage_dir: Directory to store position files
use_shelf: If True, use Python shelf format; if False, use JSON
"""
self.storage_dir = Path(storage_dir)
self.storage_dir.mkdir(parents=True, exist_ok=True)
self.use_shelf = use_shelf
def save_position(self, document_id: str, position_name: str, position: RecursivePosition):
"""Save a position to storage"""
if self.use_shelf:
self._save_to_shelf(document_id, position_name, position)
else:
self._save_to_json(document_id, position_name, position)
def load_position(self, document_id: str, position_name: str) -> Optional[RecursivePosition]:
"""Load a position from storage"""
if self.use_shelf:
return self._load_from_shelf(document_id, position_name)
else:
return self._load_from_json(document_id, position_name)
def list_positions(self, document_id: str) -> List[str]:
"""List all saved positions for a document"""
if self.use_shelf:
return self._list_shelf_positions(document_id)
else:
return self._list_json_positions(document_id)
def delete_position(self, document_id: str, position_name: str) -> bool:
"""Delete a position from storage"""
if self.use_shelf:
return self._delete_from_shelf(document_id, position_name)
else:
return self._delete_from_json(document_id, position_name)
def _save_to_json(self, document_id: str, position_name: str, position: RecursivePosition):
"""Save position as JSON file"""
file_path = self.storage_dir / f"{document_id}_{position_name}.json"
with open(file_path, 'w') as f:
json.dump(position.to_dict(), f, indent=2)
def _load_from_json(self, document_id: str, position_name: str) -> Optional[RecursivePosition]:
"""Load position from JSON file"""
file_path = self.storage_dir / f"{document_id}_{position_name}.json"
if not file_path.exists():
return None
try:
with open(file_path, 'r') as f:
data = json.load(f)
return RecursivePosition.from_dict(data)
except Exception:
return None
def _list_json_positions(self, document_id: str) -> List[str]:
"""List JSON position files for a document"""
pattern = f"{document_id}_*.json"
files = list(self.storage_dir.glob(pattern))
# Strip only the leading "<document_id>_" prefix; str.replace would also remove later occurrences
return [f.stem[len(document_id) + 1:] for f in files]
def _delete_from_json(self, document_id: str, position_name: str) -> bool:
"""Delete JSON position file"""
file_path = self.storage_dir / f"{document_id}_{position_name}.json"
if file_path.exists():
file_path.unlink()
return True
return False
def _save_to_shelf(self, document_id: str, position_name: str, position: RecursivePosition):
"""Save position to shelf database"""
shelf_path = str(self.storage_dir / f"{document_id}.shelf")
with shelve.open(shelf_path) as shelf:
shelf[position_name] = position
def _load_from_shelf(self, document_id: str, position_name: str) -> Optional[RecursivePosition]:
"""Load position from shelf database"""
shelf_path = str(self.storage_dir / f"{document_id}.shelf")
try:
with shelve.open(shelf_path) as shelf:
return shelf.get(position_name)
except Exception:
return None
def _list_shelf_positions(self, document_id: str) -> List[str]:
"""List positions in shelf database"""
shelf_path = str(self.storage_dir / f"{document_id}.shelf")
try:
with shelve.open(shelf_path) as shelf:
return list(shelf.keys())
except Exception:
return []
def _delete_from_shelf(self, document_id: str, position_name: str) -> bool:
"""Delete position from shelf database"""
shelf_path = str(self.storage_dir / f"{document_id}.shelf")
try:
with shelve.open(shelf_path) as shelf:
if position_name in shelf:
del shelf[position_name]
return True
except Exception:
pass
return False
# Convenience functions for common position patterns
def create_word_position(chapter: int, block: int, word: int, char_offset: int = 0) -> RecursivePosition:
"""Create a position pointing to a specific word and character"""
return (PositionBuilder()
.chapter(chapter)
.block(block)
.paragraph()
.word(word, offset=char_offset)
.build())
def create_image_position(chapter: int, block: int, image_index: int = 0) -> RecursivePosition:
"""Create a position pointing to an image"""
return (PositionBuilder()
.chapter(chapter)
.block(block)
.image(image_index)
.build())
def create_table_cell_position(chapter: int, block: int, row: int, col: int, word: int = 0) -> RecursivePosition:
"""Create a position pointing to content in a table cell"""
return (PositionBuilder()
.chapter(chapter)
.block(block)
.table()
.table_row(row)
.table_cell(col)
.word(word)
.build())
def create_list_item_position(chapter: int, block: int, item: int, word: int = 0) -> RecursivePosition:
"""Create a position pointing to content in a list item"""
return (PositionBuilder()
.chapter(chapter)
.block(block)
.list()
.list_item(item)
.word(word)
.build())
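
The JSON backend of `PositionStorage` names each file `{document_id}_{position_name}.json` and recovers the position names by globbing on that prefix. The following is a minimal standalone sketch of that naming and listing scheme, not the module's own code: `save` and `list_names` are hypothetical helpers, and plain dicts stand in for `RecursivePosition.to_dict()` output. It strips the prefix by slicing, because a position name may itself contain the document id.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def save(storage_dir: Path, document_id: str, name: str, data: dict) -> None:
    """Write one position dict to <document_id>_<name>.json (hypothetical helper)."""
    path = storage_dir / f"{document_id}_{name}.json"
    with open(path, "w") as f:
        json.dump(data, f, indent=2)


def list_names(storage_dir: Path, document_id: str) -> List[str]:
    """Return the saved names for a document, removing only the leading prefix."""
    prefix = f"{document_id}_"
    return sorted(p.stem[len(prefix):] for p in storage_dir.glob(f"{prefix}*.json"))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    save(root, "novel", "bookmark1", {"path": []})
    # This name repeats the document id; slicing keeps it intact.
    save(root, "novel", "novel_end", {"path": []})
    names = list_names(root, "novel")
```

Note that with this naming scheme a glob for document `book` would also match files belonging to a document called `book_2`, so document ids that are prefixes of one another can leak into each other's listings.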

@ -1,578 +0,0 @@
"""
Unit tests for the recursive position system.
Tests the hierarchical position tracking that can reference any nested content structure.
"""
import unittest
import tempfile
import shutil
import json
from pathlib import Path
from pyWebLayout.layout.recursive_position import (
ContentType, LocationNode, RecursivePosition, PositionBuilder, PositionStorage,
create_word_position, create_image_position, create_table_cell_position, create_list_item_position
)
class TestLocationNode(unittest.TestCase):
"""Test cases for LocationNode"""
def test_node_creation(self):
"""Test basic node creation"""
node = LocationNode(ContentType.WORD, 5, 3, {"text": "hello"})
self.assertEqual(node.content_type, ContentType.WORD)
self.assertEqual(node.index, 5)
self.assertEqual(node.offset, 3)
self.assertEqual(node.metadata["text"], "hello")
def test_node_serialization(self):
"""Test node serialization to/from dict"""
node = LocationNode(ContentType.TABLE_CELL, 2, 0, {"colspan": 2})
# Serialize
data = node.to_dict()
expected = {
'content_type': 'table_cell',
'index': 2,
'offset': 0,
'metadata': {'colspan': 2}
}
self.assertEqual(data, expected)
# Deserialize
restored = LocationNode.from_dict(data)
self.assertEqual(restored.content_type, ContentType.TABLE_CELL)
self.assertEqual(restored.index, 2)
self.assertEqual(restored.offset, 0)
self.assertEqual(restored.metadata, {'colspan': 2})
def test_node_string_representation(self):
"""Test string representation of nodes"""
node1 = LocationNode(ContentType.PARAGRAPH, 3)
self.assertEqual(str(node1), "paragraph[3]")
node2 = LocationNode(ContentType.WORD, 5, 2)
self.assertEqual(str(node2), "word[5]+2")
class TestRecursivePosition(unittest.TestCase):
"""Test cases for RecursivePosition"""
def test_position_creation(self):
"""Test basic position creation"""
pos = RecursivePosition()
# Should have document root by default
self.assertEqual(len(pos.path), 1)
self.assertEqual(pos.path[0].content_type, ContentType.DOCUMENT)
def test_position_building(self):
"""Test building complex positions"""
pos = RecursivePosition()
pos.add_node(LocationNode(ContentType.CHAPTER, 2))
pos.add_node(LocationNode(ContentType.BLOCK, 5))
pos.add_node(LocationNode(ContentType.PARAGRAPH, 0))
pos.add_node(LocationNode(ContentType.WORD, 12, 3))
self.assertEqual(len(pos.path), 5) # Including document root
self.assertEqual(pos.path[1].content_type, ContentType.CHAPTER)
self.assertEqual(pos.path[1].index, 2)
self.assertEqual(pos.path[-1].content_type, ContentType.WORD)
self.assertEqual(pos.path[-1].index, 12)
self.assertEqual(pos.path[-1].offset, 3)
def test_position_copy(self):
"""Test position copying"""
original = RecursivePosition()
original.add_node(LocationNode(ContentType.CHAPTER, 1))
original.add_node(LocationNode(ContentType.WORD, 5, 2, {"text": "test"}))
original.rendering_metadata = {"font_scale": 1.5}
copy = original.copy()
# Should be equal but not the same object
self.assertEqual(original, copy)
self.assertIsNot(original, copy)
self.assertIsNot(original.path, copy.path)
self.assertIsNot(original.rendering_metadata, copy.rendering_metadata)
# Modifying copy shouldn't affect original
copy.add_node(LocationNode(ContentType.IMAGE, 0))
self.assertNotEqual(len(original.path), len(copy.path))
def test_node_queries(self):
"""Test querying nodes by type"""
pos = RecursivePosition()
pos.add_node(LocationNode(ContentType.CHAPTER, 2))
pos.add_node(LocationNode(ContentType.BLOCK, 5))
pos.add_node(LocationNode(ContentType.TABLE, 0))
pos.add_node(LocationNode(ContentType.TABLE_ROW, 1))
pos.add_node(LocationNode(ContentType.TABLE_CELL, 2))
# Get single node
chapter_node = pos.get_node(ContentType.CHAPTER)
self.assertIsNotNone(chapter_node)
self.assertEqual(chapter_node.index, 2)
# Get non-existent node
word_node = pos.get_node(ContentType.WORD)
self.assertIsNone(word_node)
# Get all nodes of a given type (this path contains only one table row)
row_nodes = pos.get_nodes(ContentType.TABLE_ROW)
self.assertEqual(len(row_nodes), 1)
self.assertEqual(row_nodes[0].index, 1)
def test_position_hierarchy_operations(self):
"""Test ancestor/descendant relationships"""
# Create ancestor position: document -> chapter[1] -> block[2]
ancestor = RecursivePosition()
ancestor.add_node(LocationNode(ContentType.CHAPTER, 1))
ancestor.add_node(LocationNode(ContentType.BLOCK, 2))
# Create descendant position: document -> chapter[1] -> block[2] -> paragraph -> word[5]
descendant = ancestor.copy()
descendant.add_node(LocationNode(ContentType.PARAGRAPH, 0))
descendant.add_node(LocationNode(ContentType.WORD, 5))
# Create unrelated position: document -> chapter[2] -> block[1]
unrelated = RecursivePosition()
unrelated.add_node(LocationNode(ContentType.CHAPTER, 2))
unrelated.add_node(LocationNode(ContentType.BLOCK, 1))
# Test relationships
self.assertTrue(ancestor.is_ancestor_of(descendant))
self.assertTrue(descendant.is_descendant_of(ancestor))
self.assertFalse(ancestor.is_ancestor_of(unrelated))
self.assertFalse(unrelated.is_descendant_of(ancestor))
# Test common ancestor
common = ancestor.get_common_ancestor(descendant)
self.assertEqual(len(common.path), 3) # document + chapter + block
common_unrelated = ancestor.get_common_ancestor(unrelated)
self.assertEqual(len(common_unrelated.path), 1) # Only document root
def test_position_truncation(self):
"""Test truncating position to specific content type"""
pos = RecursivePosition()
pos.add_node(LocationNode(ContentType.CHAPTER, 1))
pos.add_node(LocationNode(ContentType.BLOCK, 2))
pos.add_node(LocationNode(ContentType.PARAGRAPH, 0))
pos.add_node(LocationNode(ContentType.WORD, 5))
# Truncate to block level
truncated = pos.copy().truncate_to_type(ContentType.BLOCK)
self.assertEqual(len(truncated.path), 3) # document + chapter + block
self.assertEqual(truncated.path[-1].content_type, ContentType.BLOCK)
def test_position_serialization(self):
"""Test position serialization to/from dict and JSON"""
pos = RecursivePosition()
pos.add_node(LocationNode(ContentType.CHAPTER, 2))
pos.add_node(LocationNode(ContentType.WORD, 5, 3, {"text": "hello"}))
pos.rendering_metadata = {"font_scale": 1.5, "page_size": [800, 600]}
# Test dict serialization
data = pos.to_dict()
restored = RecursivePosition.from_dict(data)
self.assertEqual(pos, restored)
# Test JSON serialization
json_str = pos.to_json()
restored_json = RecursivePosition.from_json(json_str)
self.assertEqual(pos, restored_json)
def test_position_equality_and_hashing(self):
"""Test position equality and hashing"""
pos1 = RecursivePosition()
pos1.add_node(LocationNode(ContentType.CHAPTER, 1))
pos1.add_node(LocationNode(ContentType.WORD, 5))
pos2 = RecursivePosition()
pos2.add_node(LocationNode(ContentType.CHAPTER, 1))
pos2.add_node(LocationNode(ContentType.WORD, 5))
pos3 = RecursivePosition()
pos3.add_node(LocationNode(ContentType.CHAPTER, 1))
pos3.add_node(LocationNode(ContentType.WORD, 6)) # Different word
# Test equality
self.assertEqual(pos1, pos2)
self.assertNotEqual(pos1, pos3)
# Test hashing (should be able to use as dict keys)
position_dict = {pos1: "value1", pos3: "value2"}
self.assertEqual(position_dict[pos2], "value1") # pos2 should hash same as pos1
def test_string_representation(self):
"""Test human-readable string representation"""
pos = RecursivePosition()
pos.add_node(LocationNode(ContentType.CHAPTER, 2))
pos.add_node(LocationNode(ContentType.BLOCK, 5))
pos.add_node(LocationNode(ContentType.WORD, 12, 3))
expected = "document[0] -> chapter[2] -> block[5] -> word[12]+3"
self.assertEqual(str(pos), expected)
class TestPositionBuilder(unittest.TestCase):
"""Test cases for PositionBuilder"""
def test_fluent_building(self):
"""Test fluent interface for building positions"""
pos = (PositionBuilder()
.chapter(2)
.block(5)
.paragraph()
.word(12, offset=3)
.with_rendering_metadata(font_scale=1.5, page_size=[800, 600])
.build())
# Check path structure
self.assertEqual(len(pos.path), 5) # document + chapter + block + paragraph + word
self.assertEqual(pos.path[1].content_type, ContentType.CHAPTER)
self.assertEqual(pos.path[1].index, 2)
self.assertEqual(pos.path[-1].content_type, ContentType.WORD)
self.assertEqual(pos.path[-1].index, 12)
self.assertEqual(pos.path[-1].offset, 3)
# Check rendering metadata
self.assertEqual(pos.rendering_metadata["font_scale"], 1.5)
self.assertEqual(pos.rendering_metadata["page_size"], [800, 600])
def test_table_building(self):
"""Test building table cell positions"""
pos = (PositionBuilder()
.chapter(1)
.block(3)
.table()
.table_row(2)
.table_cell(1)
.word(0)
.build())
# Verify table structure
table_node = pos.get_node(ContentType.TABLE)
row_node = pos.get_node(ContentType.TABLE_ROW)
cell_node = pos.get_node(ContentType.TABLE_CELL)
self.assertIsNotNone(table_node)
self.assertIsNotNone(row_node)
self.assertIsNotNone(cell_node)
self.assertEqual(row_node.index, 2)
self.assertEqual(cell_node.index, 1)
def test_list_building(self):
"""Test building list item positions"""
pos = (PositionBuilder()
.chapter(0)
.block(2)
.list()
.list_item(3)
.word(1)
.build())
# Verify list structure
list_node = pos.get_node(ContentType.LIST)
item_node = pos.get_node(ContentType.LIST_ITEM)
self.assertIsNotNone(list_node)
self.assertIsNotNone(item_node)
self.assertEqual(item_node.index, 3)
def test_image_building(self):
"""Test building image positions"""
pos = (PositionBuilder()
.chapter(1)
.block(4)
.image(0, alt_text="Test image", width=300, height=200)
.build())
image_node = pos.get_node(ContentType.IMAGE)
self.assertIsNotNone(image_node)
self.assertEqual(image_node.metadata["alt_text"], "Test image")
self.assertEqual(image_node.metadata["width"], 300)
class TestPositionStorage(unittest.TestCase):
"""Test cases for PositionStorage"""
def setUp(self):
"""Set up temporary directory for testing"""
self.temp_dir = tempfile.mkdtemp()
self.storage_json = PositionStorage(self.temp_dir, use_shelf=False)
self.storage_shelf = PositionStorage(self.temp_dir, use_shelf=True)
def tearDown(self):
"""Clean up temporary directory"""
shutil.rmtree(self.temp_dir)
def test_json_storage(self):
"""Test JSON-based position storage"""
# Create test position
pos = (PositionBuilder()
.chapter(2)
.block(5)
.word(12, offset=3)
.with_rendering_metadata(font_scale=1.5)
.build())
# Save position
self.storage_json.save_position("test_doc", "bookmark1", pos)
# Load position
loaded = self.storage_json.load_position("test_doc", "bookmark1")
self.assertIsNotNone(loaded)
self.assertEqual(pos, loaded)
# List positions
positions = self.storage_json.list_positions("test_doc")
self.assertIn("bookmark1", positions)
# Delete position
success = self.storage_json.delete_position("test_doc", "bookmark1")
self.assertTrue(success)
# Verify deletion
loaded_after_delete = self.storage_json.load_position("test_doc", "bookmark1")
self.assertIsNone(loaded_after_delete)
def test_shelf_storage(self):
"""Test shelf-based position storage"""
# Create test position
pos = (PositionBuilder()
.chapter(1)
.block(3)
.table()
.table_row(2)
.table_cell(1)
.build())
# Save position
self.storage_shelf.save_position("test_doc", "table_pos", pos)
# Load position
loaded = self.storage_shelf.load_position("test_doc", "table_pos")
self.assertIsNotNone(loaded)
self.assertEqual(pos, loaded)
# List positions
positions = self.storage_shelf.list_positions("test_doc")
self.assertIn("table_pos", positions)
# Delete position
success = self.storage_shelf.delete_position("test_doc", "table_pos")
self.assertTrue(success)
def test_multiple_positions(self):
"""Test storing multiple positions for same document"""
pos1 = create_word_position(0, 1, 5)
pos2 = create_image_position(1, 2)
pos3 = create_table_cell_position(2, 3, 1, 2, 0)
# Save multiple positions
self.storage_json.save_position("multi_doc", "pos1", pos1)
self.storage_json.save_position("multi_doc", "pos2", pos2)
self.storage_json.save_position("multi_doc", "pos3", pos3)
# List all positions
positions = self.storage_json.list_positions("multi_doc")
self.assertEqual(len(positions), 3)
self.assertIn("pos1", positions)
self.assertIn("pos2", positions)
self.assertIn("pos3", positions)
# Load and verify each position
loaded1 = self.storage_json.load_position("multi_doc", "pos1")
loaded2 = self.storage_json.load_position("multi_doc", "pos2")
loaded3 = self.storage_json.load_position("multi_doc", "pos3")
self.assertEqual(pos1, loaded1)
self.assertEqual(pos2, loaded2)
self.assertEqual(pos3, loaded3)
class TestConvenienceFunctions(unittest.TestCase):
"""Test cases for convenience functions"""
def test_create_word_position(self):
"""Test word position creation"""
pos = create_word_position(2, 5, 12, 3)
chapter_node = pos.get_node(ContentType.CHAPTER)
block_node = pos.get_node(ContentType.BLOCK)
word_node = pos.get_node(ContentType.WORD)
self.assertEqual(chapter_node.index, 2)
self.assertEqual(block_node.index, 5)
self.assertEqual(word_node.index, 12)
self.assertEqual(word_node.offset, 3)
def test_create_image_position(self):
"""Test image position creation"""
pos = create_image_position(1, 3, 0)
chapter_node = pos.get_node(ContentType.CHAPTER)
block_node = pos.get_node(ContentType.BLOCK)
image_node = pos.get_node(ContentType.IMAGE)
self.assertEqual(chapter_node.index, 1)
self.assertEqual(block_node.index, 3)
self.assertEqual(image_node.index, 0)
def test_create_table_cell_position(self):
"""Test table cell position creation"""
pos = create_table_cell_position(0, 2, 1, 3, 5)
chapter_node = pos.get_node(ContentType.CHAPTER)
block_node = pos.get_node(ContentType.BLOCK)
table_node = pos.get_node(ContentType.TABLE)
row_node = pos.get_node(ContentType.TABLE_ROW)
cell_node = pos.get_node(ContentType.TABLE_CELL)
word_node = pos.get_node(ContentType.WORD)
self.assertEqual(chapter_node.index, 0)
self.assertEqual(block_node.index, 2)
self.assertEqual(row_node.index, 1)
self.assertEqual(cell_node.index, 3)
self.assertEqual(word_node.index, 5)
def test_create_list_item_position(self):
"""Test list item position creation"""
pos = create_list_item_position(1, 4, 2, 7)
chapter_node = pos.get_node(ContentType.CHAPTER)
block_node = pos.get_node(ContentType.BLOCK)
list_node = pos.get_node(ContentType.LIST)
item_node = pos.get_node(ContentType.LIST_ITEM)
word_node = pos.get_node(ContentType.WORD)
self.assertEqual(chapter_node.index, 1)
self.assertEqual(block_node.index, 4)
self.assertEqual(item_node.index, 2)
self.assertEqual(word_node.index, 7)
class TestRealWorldScenarios(unittest.TestCase):
"""Test cases for real-world usage scenarios"""
def test_ereader_bookmark_scenario(self):
"""Test typical ereader bookmark usage"""
# User is reading chapter 3, block 8, first paragraph, word 15, character 5
reading_pos = (PositionBuilder()
.chapter(3)
.block(8) # Block 8 in chapter 3
.paragraph()
.word(15, offset=5)
.with_rendering_metadata(
font_scale=1.2,
page_size=[600, 800],
theme="dark"
)
.build())
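# Save as bookmark (use a temporary directory so the test leaves no files in the working directory)
temp_dir = tempfile.mkdtemp()
self.addCleanup(shutil.rmtree, temp_dir)
storage = PositionStorage(temp_dir, use_shelf=False)
storage.save_position("my_novel", "chapter3_climax", reading_pos)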
# Later, load bookmark
loaded_pos = storage.load_position("my_novel", "chapter3_climax")
self.assertEqual(reading_pos, loaded_pos)
# Verify we can extract the reading context
chapter_node = loaded_pos.get_node(ContentType.CHAPTER)
word_node = loaded_pos.get_node(ContentType.WORD)
self.assertEqual(chapter_node.index, 3)
self.assertEqual(word_node.index, 15)
self.assertEqual(word_node.offset, 5)
self.assertEqual(loaded_pos.rendering_metadata["font_scale"], 1.2)
def test_table_navigation_scenario(self):
"""Test navigating within a complex table"""
# User is in a table: chapter 2, table block 5, row 3, cell 2, word 1
table_pos = (PositionBuilder()
.chapter(2)
.block(5)
.table(0, table_type="data", columns=4, rows=10)
.table_row(3, row_type="data")
.table_cell(2, cell_type="data", colspan=1)
.word(1)
.build())
# Navigate to next cell (same row, next column)
next_cell_pos = table_pos.copy()
cell_node = next_cell_pos.get_node(ContentType.TABLE_CELL)
cell_node.index = 3 # Move to next column
word_node = next_cell_pos.get_node(ContentType.WORD)
word_node.index = 0 # Reset to first word in new cell
# Verify positions are different but related
self.assertNotEqual(table_pos, next_cell_pos)
# They should share common ancestor up to table row level
common = table_pos.get_common_ancestor(next_cell_pos)
row_node = common.get_node(ContentType.TABLE_ROW)
self.assertIsNotNone(row_node)
self.assertEqual(row_node.index, 3)
def test_multi_level_list_scenario(self):
"""Test navigating nested lists"""
# Position in nested list: chapter 1, list block 3, item 2, sub-list, sub-item 1, word 3
nested_pos = (PositionBuilder()
.chapter(1)
.block(3)
.list(0, list_type="ordered")
.list_item(2)
.list(1, list_type="unordered") # Nested list
.list_item(1)
.word(3)
.build())
# Verify we can distinguish between the two list levels
list_nodes = nested_pos.get_nodes(ContentType.LIST)
self.assertEqual(len(list_nodes), 2)
self.assertEqual(list_nodes[0].index, 0) # Outer list
self.assertEqual(list_nodes[1].index, 1) # Inner list
# Verify list item hierarchy
item_nodes = nested_pos.get_nodes(ContentType.LIST_ITEM)
self.assertEqual(len(item_nodes), 2)
self.assertEqual(item_nodes[0].index, 2) # Outer item
self.assertEqual(item_nodes[1].index, 1) # Inner item
def test_position_comparison_and_sorting(self):
"""Test comparing positions for sorting/ordering"""
# Create positions at different locations
pos1 = create_word_position(1, 2, 5) # Chapter 1, block 2, word 5
pos2 = create_word_position(1, 2, 10) # Chapter 1, block 2, word 10
pos3 = create_word_position(1, 3, 1) # Chapter 1, block 3, word 1
pos4 = create_word_position(2, 1, 1) # Chapter 2, block 1, word 1
positions = [pos4, pos2, pos1, pos3] # Unsorted
# For proper sorting, we'd need to implement comparison operators
# For now, we can test that positions are distinguishable
unique_positions = set(positions)
self.assertEqual(len(unique_positions), 4)
# Test that we can find common ancestors
common_12 = pos1.get_common_ancestor(pos2)
common_13 = pos1.get_common_ancestor(pos3)
common_14 = pos1.get_common_ancestor(pos4)
# pos1 and pos2 share paragraph-level ancestor (same chapter, block, paragraph)
self.assertEqual(len(common_12.path), 4) # document + chapter + block + paragraph
# pos1 and pos3 share chapter-level ancestor (same chapter, different blocks)
self.assertEqual(len(common_13.path), 2) # document + chapter
# pos1 and pos4 share only document-level ancestor (different chapters)
self.assertEqual(len(common_14.path), 1) # document only
if __name__ == '__main__':
unittest.main()
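
The comment in `test_position_comparison_and_sorting` notes that proper sorting would need comparison operators. One possible approach, sketched here as an assumption rather than anything the removed module implemented, is to order positions lexicographically by the `(index, offset)` pair of each node along the path. The lists of tuples below are hypothetical stand-ins for `RecursivePosition.path`, and `sort_key` is an illustrative name.

```python
from typing import List, Tuple

# Hypothetical stand-in for RecursivePosition.path: one (index, offset) pair per node,
# in the order document, chapter, block, paragraph, word.
NodePath = List[Tuple[int, int]]


def sort_key(path: NodePath) -> Tuple[Tuple[int, int], ...]:
    """Turn a path into a tuple so Python compares it lexicographically."""
    return tuple(path)


def word_path(chapter: int, block: int, word: int) -> NodePath:
    """Mirror the shape produced by create_word_position (offsets all zero)."""
    return [(0, 0), (chapter, 0), (block, 0), (0, 0), (word, 0)]


pos1 = word_path(1, 2, 5)
pos2 = word_path(1, 2, 10)
pos3 = word_path(1, 3, 1)
pos4 = word_path(2, 1, 1)

# Same unsorted order as in the test above.
ordered = sorted([pos4, pos2, pos1, pos3], key=sort_key)
```

This key ignores the content type of each node, so it only gives a meaningful reading order when the compared paths have the same node types at each depth.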