add some additional tests

2025-06-07 20:16:38 +02:00 · 899182152a
commit 899182152a
parent c981fbd1c0
83 changed files with 8168 additions and 928 deletions
--- a/BROWSER_README.md
+++ b/BROWSER_README.md
@@ -0,0 +1,143 @@
 # pyWebLayout HTML Browser
 A simple HTML browser built from the pyWebLayout components in `pyWebLayout/concrete/`, `pyWebLayout/abstract/`, and `pyWebLayout/style/`.
 ## Features
 This browser demonstrates the capabilities of pyWebLayout by implementing:
 ### Rendering Components
 - **Text rendering** with various formatting (bold, italic, underline)
 - **Headers** (H1-H6) with proper sizing and styling
 - **Links** (clickable, with external browser opening for external URLs)
 - **Images** (local files and web URLs with error handling)
 - **Layout containers** for proper element positioning
 - **Basic HTML parsing** and element conversion
 ### User Interface
 - **Navigation controls**: Back, Forward, Refresh buttons
 - **Address bar**: Enter URLs or file paths
 - **File browser**: Open local HTML files
 - **Scrollable content area** with both vertical and horizontal scrollbars
 - **Mouse interaction**: Clickable links with hover effects
 - **Status bar**: Shows current operation status
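The mouse interaction listed above comes down to a bounding-box test against the clickable elements on the page. The following is a minimal sketch of that idea without any GUI code; `ClickTarget` and `find_target` are hypothetical names used only for illustration and are not part of pyWebLayout or `html_browser.py`.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ClickTarget:
    """Hypothetical stand-in for a rendered link: a label plus its bounding box."""
    label: str
    origin: Tuple[int, int]  # top-left corner in page coordinates
    size: Tuple[int, int]    # (width, height)


def find_target(targets: List[ClickTarget], x: float, y: float) -> Optional[ClickTarget]:
    """Return the first target whose bounding box contains (x, y), or None."""
    for target in targets:
        left, top = target.origin
        width, height = target.size
        if left <= x <= left + width and top <= y <= top + height:
            return target
    return None


targets = [
    ClickTarget("home", (10, 10), (80, 20)),
    ClickTarget("docs", (10, 40), (80, 20)),
]
hit = find_target(targets, 15, 45)
print(hit.label if hit else "no link here")
```

The same test can drive both click handling and the hover cursor, since both only need to know which element, if any, lies under the pointer.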
 ## Usage
 ### Starting the Browser
 ```bash
 python html_browser.py
 ```
 ### Loading Content
 1. **View the welcome page**: The browser starts with a built-in welcome page that shows the supported features
 2. **Open local files**: Click "Open File" to browse and select HTML files
 3. **Enter URLs**: Type URLs in the address bar and press Enter or click "Go"
 4. **Navigate**: Use back/forward buttons to navigate through history
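Because the address bar accepts both URLs and file paths, the input has to be classified before loading. Below is a rough sketch of one way to do that with the standard library; `classify_address` is a hypothetical helper, and the browser's own logic may differ in detail.

```python
import os
from urllib.parse import urlparse


def classify_address(address: str) -> str:
    """Return 'web', 'file', or 'unknown' for text typed into an address bar."""
    address = address.strip()
    scheme = urlparse(address).scheme.lower()
    if scheme in ("http", "https"):
        return "web"
    if scheme == "file":
        return "file"
    # No recognised scheme: treat the text as a path if such a file exists
    if os.path.isfile(address):
        return "file"
    return "unknown"


for sample in ("https://www.example.com", "file:///tmp/page.html", "no/such/page.html"):
    print(sample, "->", classify_address(sample))
```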
 ### Test Files
 - `test_page.html` - A comprehensive test page demonstrating all supported features including:
  - Text formatting (bold, italic, underline)
  - Headers of all levels (H1-H6)
  - Links (both internal and external)
  - Images (includes the sample image from tests/data/)
  - Line breaks and paragraphs
 ## Architecture
 ### HTML Parser (`HTMLParser` class)
 - Simple regex-based HTML tokenizer
 - Converts HTML elements to pyWebLayout abstract objects
 - Handles font styling with a font stack for nested formatting
 - Supports basic HTML tags: h1-h6, b, strong, i, em, u, a, img, br, p, div, span
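To make the regex-based tokenizer concrete, here is a simplified, self-contained sketch of the approach. It splits markup into text and tag tokens and ignores attributes; `tokenize` and `TAG_PATTERN` are illustrative names, not the `HTMLParser` API.

```python
import re
from typing import Dict, List

# Matches an opening or closing tag; group 1 is "/" for closing tags
TAG_PATTERN = re.compile(r"<(/?)([^>]+)>")


def tokenize(html: str) -> List[Dict[str, object]]:
    """Split markup into 'text' and 'tag' tokens (attributes are ignored here)."""
    tokens: List[Dict[str, object]] = []
    last_end = 0
    for match in TAG_PATTERN.finditer(html):
        # Text between the previous tag and this one
        text = html[last_end:match.start()]
        if text:
            tokens.append({"type": "text", "content": text})
        parts = match.group(2).split()
        # Strip a trailing slash so "<br/>" is reported as "br"
        name = parts[0].lower().rstrip("/") if parts else ""
        tokens.append({"type": "tag", "name": name, "closing": bool(match.group(1))})
        last_end = match.end()
    # Trailing text after the last tag
    if last_end < len(html):
        tokens.append({"type": "text", "content": html[last_end:]})
    return tokens


for token in tokenize("<p>Hello <b>world</b></p>"):
    print(token)
```

A regex tokenizer like this is adequate for a demo but does not cope with malformed markup or `>` characters inside attribute values.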
 ### Browser Window (`BrowserWindow` class)
 - Tkinter-based GUI with navigation controls
 - Canvas-based rendering of pyWebLayout Page objects
 - Mouse event handling for interactive elements
 - Navigation history management
 - File and URL loading capabilities
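Navigation history management can be modelled as a list of visited addresses plus an index into it. The sketch below is a minimal illustration under that assumption; the `History` class is hypothetical and stands apart from `BrowserWindow`.

```python
from typing import List, Optional


class History:
    """Linear back/forward history, as used by typical browsers."""

    def __init__(self) -> None:
        self.entries: List[str] = []
        self.index = -1  # -1 means nothing has been visited yet

    def visit(self, url: str) -> None:
        """Record a new page, discarding any forward entries."""
        del self.entries[self.index + 1:]
        self.entries.append(url)
        self.index = len(self.entries) - 1

    def back(self) -> Optional[str]:
        """Step back and return the address to reload, or None at the start."""
        if self.index > 0:
            self.index -= 1
            return self.entries[self.index]
        return None

    def forward(self) -> Optional[str]:
        """Step forward and return the address to reload, or None at the end."""
        if self.index < len(self.entries) - 1:
            self.index += 1
            return self.entries[self.index]
        return None


history = History()
for page in ("a.html", "b.html", "c.html"):
    history.visit(page)
print(history.back(), history.back(), history.forward())
```

Note that reloading the address returned by `back()` or `forward()` should not call `visit()` again, otherwise the forward entries are lost.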
 ### pyWebLayout Integration
 The browser uses these pyWebLayout components:
 #### From `pyWebLayout/concrete/`:
 - `Page` - Top-level container for web page content
 - `Container` - Layout management for multiple elements
 - `Box` - Basic rectangular container with positioning
 - `Text` - Text rendering with font styling
 - `RenderableImage` - Image loading and display with scaling
 - `RenderableLink` - Interactive link elements
 - `RenderableButton` - Interactive button elements
 #### From `pyWebLayout/abstract/`:
 - `Link` - Abstract link representation with types (internal, external, API, function)
 - `Image` - Abstract image representation with dimensions and loading
 - Font and styling classes for text appearance
 #### From `pyWebLayout/style/`:
 - `Font` - Font management with size, weight, style, and decoration
 - `FontWeight`, `FontStyle`, `TextDecoration` - Typography enums
 - `Alignment` - Layout positioning options
 ## Supported HTML Features
 ### Text Elements
 - `<h1>` to `<h6>` - Headers with appropriate sizing
 - `<p>` - Paragraphs with spacing
 - `<b>`, `<strong>` - Bold text
 - `<i>`, `<em>` - Italic text
 - `<u>` - Underlined text
 - `<br>` - Line breaks
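Nested formatting such as `<b><i>...</i></b>` is handled with a stack of styles: each opening tag pushes a modified copy of the current style and each closing tag pops it. The sketch below illustrates the idea with a hypothetical `Style` dataclass standing in for the pyWebLayout `Font`; the header sizes mirror the values used in `html_browser.py`.

```python
from dataclasses import dataclass, replace

# Header sizes as used by the browser's parser
HEADER_SIZES = {"h1": 24, "h2": 20, "h3": 18, "h4": 16, "h5": 14, "h6": 12}


@dataclass(frozen=True)
class Style:
    """Hypothetical stand-in for a font description."""
    size: int = 14
    bold: bool = False
    italic: bool = False
    underline: bool = False


def style_for(tag: str, current: Style) -> Style:
    """Return the style that applies inside `tag`, derived from the current one."""
    if tag in HEADER_SIZES:
        return replace(current, size=HEADER_SIZES[tag], bold=True)
    if tag in ("b", "strong"):
        return replace(current, bold=True)
    if tag in ("i", "em"):
        return replace(current, italic=True)
    if tag == "u":
        return replace(current, underline=True)
    return current


stack = [Style()]                        # default style at the bottom
stack.append(style_for("b", stack[-1]))  # <b>
stack.append(style_for("i", stack[-1]))  # <i> nested inside <b>
print(stack[-1])                         # style for text inside both tags
stack.pop()                              # </i>
stack.pop()                              # </b>
print(stack[-1])                         # back to the default style
```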
 ### Interactive Elements
 - `<a href="...">` - Links (opens external URLs in system browser)
 ### Media Elements
 - `<img src="..." alt="..." width="..." height="...">` - Images with scaling
 ### Container Elements
 - `<div>`, `<span>` - Generic containers (parsed but not specially styled)
 ## Example Usage
 ```python
 # Start the browser
 from html_browser import BrowserWindow
 browser = BrowserWindow()
 browser.run()
 ```
 ## Limitations
 This is a demonstration browser with simplified HTML parsing:
 - No CSS support (styling is done through pyWebLayout components)
 - No JavaScript execution
 - Limited HTML tag support
 - No form submission (forms can be rendered but not submitted)
 - No advanced layout features (flexbox, grid, etc.)
 ## Dependencies
 - `tkinter` - GUI framework (usually included with Python)
 - `PIL` (Pillow) - Image processing
 - `requests` - HTTP requests for web URLs
 - `pyWebLayout` - The core layout and rendering library
 ## Testing
 Load `test_page.html` to see all supported features in action:
 1. Run the browser: `python html_browser.py`
 2. Click "Open File" and select `test_page.html`
 3. Explore the different text formatting, links, and image rendering
 The test page includes:
 - Various header levels
 - Text formatting examples
 - Clickable links (try the Google link!)
 - A sample image from the test data
 - Mixed content demonstrations
--- a/coverage-docs.svg
+++ b/coverage-docs.svg
@@ -1,5 +1,5 @@
 <svg width="140" height="20" viewBox="0 0 140 20" version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xml:space="preserve" xmlns:serif="http://www.serif.com/" style="fill-rule:evenodd;clip-rule:evenodd;stroke-linejoin:round;stroke-miterlimit:2;">
-    <title>interrogate: 94.6%</title>
+    <title>interrogate: 92.0%</title>
    <g transform="matrix(1,0,0,1,22,0)">
        <g id="backgrounds" transform="matrix(1.32789,0,0,1,-22.3892,0)">
            <rect x="0" y="0" width="71" height="20" style="fill:rgb(85,85,85);"/>
@@ -12,8 +12,8 @@
    <g fill="#fff" text-anchor="middle" font-family="DejaVu Sans,Verdana,Geneva,sans-serif" font-size="110">
        <text x="590" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="610">interrogate</text>
        <text x="590" y="140" transform="scale(.1)" textLength="610">interrogate</text>
-        <text x="1160" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="370" data-interrogate="result">94.6%</text>
+        <text x="1160" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="370" data-interrogate="result">92.0%</text>
-        <text x="1160" y="140" transform="scale(.1)" textLength="370" data-interrogate="result">94.6%</text>
+        <text x="1160" y="140" transform="scale(.1)" textLength="370" data-interrogate="result">92.0%</text>
    </g>
    <g id="logo-shadow" serif:id="logo shadow" transform="matrix(0.854876,0,0,0.854876,-6.73514,1.732)">
        <g transform="matrix(0.299012,0,0,0.299012,9.70229,-6.68582)">
--- a/coverage-summary.txt
+++ b/coverage-summary.txt
@@ -1 +1 @@
-41.1%
+57.0%
--- a/coverage.json
+++ b/coverage.json
--- a/coverage.svg
+++ b/coverage.svg
@@ -15,7 +15,7 @@
    <g fill="#fff" text-anchor="middle" font-family="DejaVu Sans,Verdana,Geneva,sans-serif" font-size="11">
        <text x="31.5" y="15" fill="#010101" fill-opacity=".3">coverage</text>
        <text x="31.5" y="14">coverage</text>
-        <text x="80" y="15" fill="#010101" fill-opacity=".3">47%</text>
+        <text x="80" y="15" fill="#010101" fill-opacity=".3">57%</text>
-        <text x="80" y="14">47%</text>
+        <text x="80" y="14">57%</text>
    </g>
 </svg>
--- a/coverage.xml
+++ b/coverage.xml
--- a/html_browser.py
+++ b/html_browser.py
@@ -0,0 +1,642 @@
 #!/usr/bin/env python3
 """
 Simple HTML Browser using pyWebLayout
 This browser renders basic HTML content using the pyWebLayout concrete objects.
 It supports text, images, links, and basic styling. The form classes are
 imported, but form tags are not parsed yet.
 """
 import re
 import tkinter as tk
 from tkinter import ttk, messagebox, filedialog, simpledialog
 from PIL import Image, ImageTk
 from typing import Dict, List, Optional, Tuple, Any
 import webbrowser
 import os
 from urllib.parse import urljoin, urlparse
 import requests
 from io import BytesIO
 # Import pyWebLayout components
 from pyWebLayout.concrete import (
    Page, Container, Box, Text, RenderableImage, 
    RenderableLink, RenderableButton, RenderableForm, RenderableFormField
 )
 from pyWebLayout.abstract.functional import (
    Link, Button, Form, FormField, LinkType, FormFieldType
 )
 from pyWebLayout.style.fonts import Font, FontWeight, FontStyle, TextDecoration
 from pyWebLayout.style.layout import Alignment
 class HTMLParser:
    """Simple HTML parser that converts HTML to pyWebLayout objects"""
    def __init__(self):
        self.font_stack = [Font(font_size=14)]  # Default font
        self.current_container = None
        self.base_url = ""  # Set per document in parse_html_string
    def parse_html_string(self, html_content: str, base_url: str = "") -> Page:
        """Parse HTML string and return a Page object"""
        # Create the main page
        page = Page(size=(800, 1600), background_color=(255, 255, 255))
        self.current_container = page
        self.base_url = base_url
        # Simple HTML parsing using regex (not production-ready, but works for demo)
        # Remove comments, scripts, and styles (tag names are case-insensitive)
        html_content = re.sub(r'<!--.*?-->', '', html_content, flags=re.DOTALL)
        html_content = re.sub(r'<script.*?</script>', '', html_content, flags=re.DOTALL | re.IGNORECASE)
        html_content = re.sub(r'<style.*?</style>', '', html_content, flags=re.DOTALL | re.IGNORECASE)
        # Extract title (it may span several lines)
        title_match = re.search(r'<title[^>]*>(.*?)</title>', html_content, re.DOTALL | re.IGNORECASE)
        if title_match:
            page.title = title_match.group(1).strip()
        # Extract body content
        body_match = re.search(r'<body[^>]*>(.*?)</body>', html_content, re.DOTALL | re.IGNORECASE)
        if body_match:
            body_content = body_match.group(1)
        else:
            # If no body tag, use the entire content
            body_content = html_content
        # Parse the body content
        self._parse_content(body_content, page)
        return page
    def parse_html_file(self, file_path: str) -> Page:
        """Parse HTML file and return a Page object"""
        try:
            with open(file_path, 'r', encoding='utf-8') as f:
                html_content = f.read()
            base_url = os.path.dirname(os.path.abspath(file_path))
            return self.parse_html_string(html_content, base_url)
        except Exception as e:
            # Create error page
            page = Page(size=(800, 1600), background_color=(255, 255, 255))
            error_text = Text(f"Error loading file: {str(e)}", Font(font_size=16, colour=(255, 0, 0)))
            page.add_child(error_text)
            return page
    def _parse_content(self, content: str, container: Container):
        """Parse HTML content and add elements to container"""
        # Simple token-based parsing
        tokens = self._tokenize_html(content)
        i = 0
        while i < len(tokens):
            token = tokens[i]
            if token['type'] == 'text':
                if token['content'].strip():  # Only add non-empty text
                    text_obj = Text(token['content'].strip(), self.font_stack[-1])
                    container.add_child(text_obj)
            elif token['type'] == 'tag':
                # Handle the tag and potentially parse content between opening and closing tags
                i = self._handle_tag_with_content(token, tokens, i, container)
                continue
            i += 1
    def _handle_tag_with_content(self, token, tokens, current_index, container):
        """Handle tags and their content, returning the new index position"""
        tag_name = token['name']
        is_closing = token['closing']
        if is_closing:
            # Handle closing tags
            if tag_name in ['b', 'strong', 'i', 'em', 'u', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6']:
                if len(self.font_stack) > 1:  # Don't pop the last font
                    self.font_stack.pop()
            return current_index + 1
        # For opening tags that affect text styling, parse their content with the new style
        if tag_name in ['b', 'strong', 'i', 'em', 'u', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6']:
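            # Push new font onto stack
            self._handle_tag(token, container)
            # Remember the stack depth so nested tags can be unwound independently
            base_depth = len(self.font_stack)
            # Find the matching closing tag and parse content in between
            content_start = current_index + 1
            content_end = self._find_matching_closing_tag(tokens, current_index, tag_name)
            if content_end > content_start:
                # Parse content between opening and closing tags with current font style
                for j in range(content_start, content_end):
                    content_token = tokens[j]
                    if content_token['type'] == 'text':
                        if content_token['content'].strip():
                            text_obj = Text(content_token['content'].strip(), self.font_stack[-1])
                            container.add_child(text_obj)
                    elif content_token['type'] == 'tag' and not content_token['closing']:
                        # Handle nested opening tags (may push another font)
                        self._handle_tag(content_token, container)
                    elif content_token['type'] == 'tag' and len(self.font_stack) > base_depth:
                        # Nested closing style tag: drop the font its opening tag pushed
                        if content_token['name'] in ['b', 'strong', 'i', 'em', 'u', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6']:
                            self.font_stack.pop()
            # Restore the stack to its state before this tag, even if nested tags were unbalanced
            del self.font_stack[base_depth - 1:]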
            return content_end + 1 if content_end < len(tokens) else len(tokens)
        else:
            # Handle other tags normally
            self._handle_tag(token, container)
            return current_index + 1
    def _find_matching_closing_tag(self, tokens, start_index, tag_name):
        """Find the index of the matching closing tag"""
        open_count = 1
        i = start_index + 1
        while i < len(tokens) and open_count > 0:
            token = tokens[i]
            if token['type'] == 'tag' and token['name'] == tag_name:
                if token['closing']:
                    open_count -= 1
                else:
                    open_count += 1
            i += 1
        return i - 1 if open_count == 0 else len(tokens)
    def _tokenize_html(self, content: str) -> List[Dict]:
        """Simple HTML tokenizer"""
        tokens = []
        tag_pattern = r'<(/?)([^>]+)>'
        last_end = 0
        for match in re.finditer(tag_pattern, content):
            # Add text before tag
            text_content = content[last_end:match.start()]
            if text_content:
                tokens.append({'type': 'text', 'content': text_content})
            # Add tag
            is_closing = bool(match.group(1))
            tag_content = match.group(2)
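            tag_parts = tag_content.split()
            if not tag_parts:
                # Skip degenerate tags such as '< >' that contain no name
                last_end = match.end()
                continue
            # Strip a trailing slash so '<br/>' is recognised as 'br'
            tag_name = tag_parts[0].lower().rstrip('/')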
            # Parse attributes
            attributes = {}
            if len(tag_parts) > 1:
                attr_text = ' '.join(tag_parts[1:])
                attr_pattern = r'(\w+)=(?:"([^"]*)"|\'([^\']*)\'|([^\s>]+))'
                for attr_match in re.finditer(attr_pattern, attr_text):
                    attr_name = attr_match.group(1).lower()
                    attr_value = attr_match.group(2) or attr_match.group(3) or attr_match.group(4)
                    attributes[attr_name] = attr_value
            tokens.append({
                'type': 'tag',
                'name': tag_name,
                'closing': is_closing,
                'attributes': attributes,
                'content': tag_content
            })
            last_end = match.end()
        # Add remaining text
        if last_end < len(content):
            text_content = content[last_end:]
            if text_content:
                tokens.append({'type': 'text', 'content': text_content})
        return tokens
    def _handle_tag(self, token: Dict, container: Container):
        """Handle HTML tags"""
        tag_name = token['name']
        is_closing = token['closing']
        attributes = token['attributes']
        if is_closing:
            # Handle closing tags
            if tag_name in ['b', 'strong']:
                self.font_stack.pop()
            elif tag_name in ['i', 'em']:
                self.font_stack.pop()
            elif tag_name == 'u':
                self.font_stack.pop()
            return
        # Handle opening tags
        if tag_name in ['h1', 'h2', 'h3', 'h4', 'h5', 'h6']:
            # Headers
            size_map = {'h1': 24, 'h2': 20, 'h3': 18, 'h4': 16, 'h5': 14, 'h6': 12}
            font = self.font_stack[-1].with_size(size_map[tag_name]).with_weight(FontWeight.BOLD)
            self.font_stack.append(font)
        elif tag_name in ['b', 'strong']:
            # Bold text
            font = self.font_stack[-1].with_weight(FontWeight.BOLD)
            self.font_stack.append(font)
        elif tag_name in ['i', 'em']:
            # Italic text
            font = self.font_stack[-1].with_style(FontStyle.ITALIC)
            self.font_stack.append(font)
        elif tag_name == 'u':
            # Underlined text
            font = self.font_stack[-1].with_decoration(TextDecoration.UNDERLINE)
            self.font_stack.append(font)
        elif tag_name == 'a':
            # Links
            href = attributes.get('href', '#')
            title = attributes.get('title', href)
            # Determine link type
            if href.startswith(('http://', 'https://')):
                link_type = LinkType.EXTERNAL
            else:
                # Fragment links ('#...') and relative paths are treated as internal
                link_type = LinkType.INTERNAL
            # Create link callback
            def link_callback(location, **kwargs):
                return f"Navigate to: {location}"
            link = Link(href, link_type, link_callback, title=title)
            link_font = self.font_stack[-1].with_colour((0, 0, 255)).with_decoration(TextDecoration.UNDERLINE)
            # For now, just add the link text with link styling
            link_text = attributes.get('title', href)
            renderable_link = RenderableLink(link, link_text, link_font)
            container.add_child(renderable_link)
        elif tag_name == 'img':
            # Images
            src = attributes.get('src', '')
            alt = attributes.get('alt', 'Image')
            width = attributes.get('width')
            height = attributes.get('height')
            if src:
                # Resolve relative URLs
                if self.base_url and not src.startswith(('http://', 'https://')):
                    if os.path.isdir(self.base_url):
                        src = os.path.join(self.base_url, src)
                    else:
                        src = urljoin(self.base_url, src)
                try:
                    # Create abstract image
                    from pyWebLayout.abstract.block import Image as AbstractImage
                    abstract_img = AbstractImage(src, alt)
                    # Parse dimensions if provided
                    max_width = int(width) if width and width.isdigit() else None
                    max_height = int(height) if height and height.isdigit() else None
                    renderable_img = RenderableImage(abstract_img, max_width, max_height)
                    container.add_child(renderable_img)
                except Exception as e:
                    # Add error text if image fails to load
                    error_text = Text(f"[Image Error: {alt}]", Font(colour=(255, 0, 0)))
                    container.add_child(error_text)
        elif tag_name == 'br':
            # Line breaks - add some vertical space
            spacer = Box((0, 0), (1, 10))
            container.add_child(spacer)
        elif tag_name == 'p':
            # Paragraphs - add some vertical space
            spacer = Box((0, 0), (1, 5))
            container.add_child(spacer)
        elif tag_name in ['div', 'span']:
            # Generic containers - just continue parsing
            pass
 class BrowserWindow:
    """Main browser window using Tkinter"""
    def __init__(self):
        self.root = tk.Tk()
        self.root.title("pyWebLayout HTML Browser")
        self.root.geometry("900x700")
        self.current_page = None
        self.history = []
        self.history_index = -1
        self.setup_ui()
    def setup_ui(self):
        """Setup the user interface"""
        # Create main frame
        main_frame = ttk.Frame(self.root)
        main_frame.pack(fill=tk.BOTH, expand=True, padx=5, pady=5)
        # Navigation frame
        nav_frame = ttk.Frame(main_frame)
        nav_frame.pack(fill=tk.X, pady=(0, 5))
        # Navigation buttons
        self.back_btn = ttk.Button(nav_frame, text="←", command=self.go_back, state=tk.DISABLED)
        self.back_btn.pack(side=tk.LEFT, padx=(0, 5))
        self.forward_btn = ttk.Button(nav_frame, text="→", command=self.go_forward, state=tk.DISABLED)
        self.forward_btn.pack(side=tk.LEFT, padx=(0, 5))
        self.refresh_btn = ttk.Button(nav_frame, text="⟳", command=self.refresh)
        self.refresh_btn.pack(side=tk.LEFT, padx=(0, 10))
        # Address bar
        ttk.Label(nav_frame, text="URL:").pack(side=tk.LEFT)
        self.url_var = tk.StringVar()
        self.url_entry = ttk.Entry(nav_frame, textvariable=self.url_var, width=50)
        self.url_entry.pack(side=tk.LEFT, fill=tk.X, expand=True, padx=(5, 5))
        self.url_entry.bind('<Return>', self.navigate_to_url)
        self.go_btn = ttk.Button(nav_frame, text="Go", command=self.navigate_to_url)
        self.go_btn.pack(side=tk.LEFT, padx=(0, 10))
        # File operations
        self.open_btn = ttk.Button(nav_frame, text="Open File", command=self.open_file)
        self.open_btn.pack(side=tk.LEFT)
        # Content frame with scrollbars
        content_frame = ttk.Frame(main_frame)
        content_frame.pack(fill=tk.BOTH, expand=True)
        # Create canvas with scrollbars
        self.canvas = tk.Canvas(content_frame, bg='white')
        v_scrollbar = ttk.Scrollbar(content_frame, orient=tk.VERTICAL, command=self.canvas.yview)
        h_scrollbar = ttk.Scrollbar(content_frame, orient=tk.HORIZONTAL, command=self.canvas.xview)
        self.canvas.configure(yscrollcommand=v_scrollbar.set, xscrollcommand=h_scrollbar.set)
        v_scrollbar.pack(side=tk.RIGHT, fill=tk.Y)
        h_scrollbar.pack(side=tk.BOTTOM, fill=tk.X)
        self.canvas.pack(side=tk.LEFT, fill=tk.BOTH, expand=True)
        # Status bar
        self.status_var = tk.StringVar(value="Ready")
        status_bar = ttk.Label(main_frame, textvariable=self.status_var, relief=tk.SUNKEN)
        status_bar.pack(fill=tk.X, pady=(5, 0))
        # Bind mouse events
        self.canvas.bind('<Button-1>', self.on_click)
        self.canvas.bind('<Motion>', self.on_mouse_move)
        # Load default page
        self.load_default_page()
    def load_default_page(self):
        """Load a default welcome page"""
        html_content = """
        <html>
        <head><title>pyWebLayout Browser - Welcome</title></head>
        <body>
            <h1>Welcome to pyWebLayout Browser</h1>
            <p>This is a simple HTML browser built using pyWebLayout components.</p>
            <h2>Features:</h2>
            <ul>
                <li>Basic HTML rendering</li>
                <li>Text formatting (bold, italic, underline)</li>
                <li>Headers (H1-H6)</li>
                <li>Links (clickable)</li>
                <li>Images</li>
                <li>Forms (basic support)</li>
            </ul>
            <h2>Try these features:</h2>
            <p><b>Bold text</b>, <i>italic text</i>, and <u>underlined text</u></p>
            <p>Sample link: <a href="https://www.example.com" title="External link">Visit Example.com</a></p>
            <h3>File Operations</h3>
            <p>Use the "Open File" button to load local HTML files.</p>
            <p>Or enter a URL in the address bar above.</p>
        </body>
        </html>
        """
        parser = HTMLParser()
        self.current_page = parser.parse_html_string(html_content)
        self.render_page()
        self.status_var.set("Welcome page loaded")
    def navigate_to_url(self, event=None):
        """Navigate to the URL in the address bar"""
        url = self.url_var.get().strip()
        if not url:
            return
        self.status_var.set(f"Loading {url}...")
        self.root.update()
        try:
            if url.startswith(('http://', 'https://')):
                # Web URL
                response = requests.get(url, timeout=10)
                response.raise_for_status()
                html_content = response.text
                parser = HTMLParser()
                self.current_page = parser.parse_html_string(html_content, url)
            elif os.path.isfile(url):
                # Local file
                parser = HTMLParser()
                self.current_page = parser.parse_html_file(url)
            else:
                # Try to treat as a local file path
                if not url.startswith('file://'):
                    url = 'file://' + os.path.abspath(url)
                file_path = url.replace('file://', '')
                if os.path.isfile(file_path):
                    parser = HTMLParser()
                    self.current_page = parser.parse_html_file(file_path)
                else:
                    raise FileNotFoundError(f"File not found: {file_path}")
            # Add to history
            self.add_to_history(url)
            self.render_page()
            self.status_var.set(f"Loaded {url}")
        except Exception as e:
            self.status_var.set(f"Error loading {url}: {str(e)}")
            messagebox.showerror("Error", f"Failed to load {url}:\n{str(e)}")
    def open_file(self):
        """Open a local HTML file"""
        file_path = filedialog.askopenfilename(
            title="Open HTML File",
            filetypes=[("HTML files", "*.html *.htm"), ("All files", "*.*")]
        )
        if file_path:
            self.url_var.set(file_path)
            self.navigate_to_url()
    def render_page(self):
        """Render the current page to the canvas"""
        if not self.current_page:
            return
        # Clear canvas
        self.canvas.delete("all")
        # Render the page to PIL Image
        page_image = self.current_page.render()
        # Convert to PhotoImage
        self.photo = ImageTk.PhotoImage(page_image)
        # Display on canvas
        self.canvas.create_image(0, 0, anchor=tk.NW, image=self.photo)
        # Update scroll region
        self.canvas.configure(scrollregion=self.canvas.bbox("all"))
        # Store page elements for interaction
        self.page_elements = self._get_clickable_elements(self.current_page)
    def _get_clickable_elements(self, container, offset=(0, 0)) -> List[Tuple]:
        """Get list of clickable elements with their positions"""
        elements = []
        if hasattr(container, '_children'):
            for child in container._children:
                if hasattr(child, '_origin'):
                    child_offset = (offset[0] + child._origin[0], offset[1] + child._origin[1])
                    # Check if element is clickable
                    if isinstance(child, (RenderableLink, RenderableButton)):
                        elements.append((child, child_offset, child._size))
                    # Recursively check children
                    if hasattr(child, '_children'):
                        elements.extend(self._get_clickable_elements(child, child_offset))
        return elements
    def on_click(self, event):
        """Handle mouse clicks on the canvas"""
        # Convert canvas coordinates to image coordinates
        canvas_x = self.canvas.canvasx(event.x)
        canvas_y = self.canvas.canvasy(event.y)
        # Check if click is on any clickable element
        for element, offset, size in self.page_elements:
            element_x, element_y = offset
            element_w, element_h = size
            if (element_x <= canvas_x <= element_x + element_w and
                element_y <= canvas_y <= element_y + element_h):
                # Handle the click
                if isinstance(element, RenderableLink):
                    result = element._callback()
                    if result:
                        self.status_var.set(result)
                        # For external links, open in system browser
                        if element._link.link_type == LinkType.EXTERNAL:
                            webbrowser.open(element._link.location)
                elif isinstance(element, RenderableButton):
                    result = element._callback()
                    if result:
                        self.status_var.set(f"Button clicked: {result}")
                break
    def on_mouse_move(self, event):
        """Handle mouse movement for hover effects"""
        # Convert canvas coordinates to image coordinates
        canvas_x = self.canvas.canvasx(event.x)
        canvas_y = self.canvas.canvasy(event.y)
        # Check if mouse is over any clickable element
        cursor = "arrow"
        for element, offset, size in self.page_elements:
            element_x, element_y = offset
            element_w, element_h = size
            if (element_x <= canvas_x <= element_x + element_w and
                element_y <= canvas_y <= element_y + element_h):
                cursor = "hand2"
                break
        self.canvas.configure(cursor=cursor)
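    def add_to_history(self, url):
        """Add URL to navigation history"""
        # Back, forward, and refresh reload the entry at history_index;
        # re-adding it would discard the forward history
        if 0 <= self.history_index < len(self.history) and self.history[self.history_index] == url:
            self.update_nav_buttons()
            return
        # Remove any forward history
        self.history = self.history[:self.history_index + 1]
        # Add new URL
        self.history.append(url)
        self.history_index = len(self.history) - 1
        # Update navigation buttons
        self.update_nav_buttons()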
    def update_nav_buttons(self):
        """Update the state of navigation buttons"""
        self.back_btn.configure(state=tk.NORMAL if self.history_index > 0 else tk.DISABLED)
        self.forward_btn.configure(state=tk.NORMAL if self.history_index < len(self.history) - 1 else tk.DISABLED)
    def go_back(self):
        """Navigate back in history"""
        if self.history_index > 0:
            self.history_index -= 1
            url = self.history[self.history_index]
            self.url_var.set(url)
            self.navigate_to_url()
    def go_forward(self):
        """Navigate forward in history"""
        if self.history_index < len(self.history) - 1:
            self.history_index += 1
            url = self.history[self.history_index]
            self.url_var.set(url)
            self.navigate_to_url()
    def refresh(self):
        """Refresh the current page"""
        if self.current_page:
            current_url = self.url_var.get()
            if current_url:
                self.navigate_to_url()
            else:
                self.load_default_page()
    def run(self):
        """Start the browser"""
        self.root.mainloop()
 def main():
    """Main function to run the browser"""
    print("Starting pyWebLayout HTML Browser...")
    try:
        browser = BrowserWindow()
        browser.run()
    except Exception as e:
        print(f"Error starting browser: {e}")
        import traceback
        traceback.print_exc()
 if __name__ == "__main__":
    main()
--- a/pyWebLayout/concrete/text.py
+++ b/pyWebLayout/concrete/text.py
@ -43,13 +43,35 @@ class Text(Renderable, Queriable):
        # The bounding box is (left, top, right, bottom)
        try:
            bbox = font.getbbox(self._text)
-            self._width = bbox[2] - bbox[0]
-            self._height = bbox[3] - bbox[1]
+            # Width is the difference between right and left
+            self._width = max(1, bbox[2] - bbox[0])
+            # Height needs to account for potential negative top values
+            # Use the full height from top to bottom, ensuring positive values
+            top = min(0, bbox[1])  # Account for negative ascenders
+            bottom = max(bbox[3], bbox[1] + font.size)  # Ensure minimum height
+            self._height = max(font.size, bottom - top)
            self._size = (self._width, self._height)
+            # Store the offset for proper text positioning
+            self._text_offset_x = max(0, -bbox[0])
+            self._text_offset_y = max(0, -top)
        except AttributeError:
            # Fallback for older PIL versions
-            self._width, self._height = font.getsize(self._text)
-            self._size = (self._width, self._height)
+            try:
+                self._width, self._height = font.getsize(self._text)
+                # Add some padding to prevent cropping
+                self._height = max(self._height, int(self._style.font_size * 1.2))
+                self._size = (self._width, self._height)
+                self._text_offset_x = 0
+                self._text_offset_y = 0
+            except:
+                # Ultimate fallback
+                self._width = len(self._text) * self._style.font_size // 2
+                self._height = int(self._style.font_size * 1.2)
+                self._size = (self._width, self._height)
+                self._text_offset_x = 0
+                self._text_offset_y = 0
    @property
    def text(self) -> str:
@@ -123,8 +145,10 @@ class Text(Renderable, Queriable):
        if self._style.background and self._style.background[3] > 0:  # If alpha > 0
            draw.rectangle([(0, 0), self._size], fill=self._style.background)
-        # Draw the text
-        draw.text((0, 0), self._text, font=self._style.font, fill=self._style.colour)
+        # Draw the text using calculated offsets to prevent cropping
+        text_x = getattr(self, '_text_offset_x', 0)
+        text_y = getattr(self, '_text_offset_y', 0)
+        draw.text((text_x, text_y), self._text, font=self._style.font, fill=self._style.colour)
        # Apply any text decorations
        self._apply_decoration(draw)
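
The first hunk above derives the text box and drawing offsets from the `(left, top, right, bottom)` tuple returned by `font.getbbox`. As a rough sketch of the same arithmetic on plain values, the function below takes the tuple and a font size directly, so it runs without Pillow. The name `measure_from_bbox` and its return layout are assumptions made for illustration; they do not exist in `pyWebLayout`.

```python
def measure_from_bbox(bbox, font_size):
    """Return (width, height, offset_x, offset_y) for a text bounding box.

    bbox is (left, top, right, bottom), as in the hunk above.
    Hypothetical helper: mirrors the diff's arithmetic, not a library API.
    """
    left, top_edge, right, bottom_edge = bbox

    # Width is right minus left, clamped so empty strings still get 1 pixel.
    width = max(1, right - left)

    # A negative top means glyphs extend above the origin; include that part.
    top = min(0, top_edge)
    # Guarantee the box reaches at least one font size below the reported top.
    bottom = max(bottom_edge, top_edge + font_size)
    height = max(font_size, bottom - top)

    # Shift the drawing origin so negative left/top extents are not cropped.
    offset_x = max(0, -left)
    offset_y = max(0, -top)
    return (width, height, offset_x, offset_y)


print(measure_from_bbox((0, 3, 40, 15), 16))
print(measure_from_bbox((-2, -4, 30, 12), 16))
```

The second call shows the case the change targets: a box that starts at negative coordinates produces positive offsets, which the render hunk then passes to `draw.text`.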
--- a/test_page.html
+++ b/test_page.html
@@ -0,0 +1,46 @@
 <!DOCTYPE html>
 <html>
 <head>
    <title>Test Page for pyWebLayout Browser</title>
 </head>
 <body>
    <h1>pyWebLayout Browser Test Page</h1>
    <h3>Images</h3>
    <p>Here's a sample image:</p>
    <img src="tests/data/sample_image.jpg" alt="Sample Image" width="200" height="150">
    <h2>Text Formatting</h2>
    <p>This is a paragraph with <b>bold text</b>, <i>italic text</i>, and <u>underlined text</u>.</p>
    <h3>Links</h3>
    <p>Here are some test links:</p>
    <ul>
        <li><a href="https://www.google.com" title="Google">External link to Google</a></li>
        <li><a href="#section1" title="Section 1">Internal link to Section 1</a></li>
    </ul>
    <h3>Headers</h3>
    <h1>H1 Header</h1>
    <h2>H2 Header</h2>
    <h3>H3 Header</h3>
    <h4>H4 Header</h4>
    <h5>H5 Header</h5>
    <h6>H6 Header</h6>
    <h3>Line Breaks and Paragraphs</h3>
    <p>This is the first paragraph.</p>
    <br>
    <p>This is the second paragraph after a line break.</p>
    <h3 id="section1">Section 1</h3>
    <p>This is the content of section 1. You can link to this section using the internal link above.</p>
    <h3>Images</h3>
    <p>Here's a sample image:</p>
    <img src="tests/data/sample_image.jpg" alt="Sample Image" width="200" height="150">
    <h3>Mixed Content</h3>
    <p>This paragraph contains <b>bold</b> and <i>italic</i> text, as well as an <a href="https://www.example.com">external link</a>.</p>
    <p><strong>Strong text</strong> and <em>emphasized text</em> should also work.</p>
 </body>
 </html>
--- a/tests/data/Kimi
+++ b/tests/data/Kimi
--- a/Wikipedia_files/Ambox_important.svg.webp
+++ b/Wikipedia_files/Ambox_important.svg.webp
--- a/Wikipedia_files/Ambox_important.svg_002.webp
+++ b/Wikipedia_files/Ambox_important.svg_002.webp
--- a/Wikipedia_files/Ambox_important.svg_003.webp
+++ b/Wikipedia_files/Ambox_important.svg_003.webp
--- a/Wikipedia_files/Commons-logo.svg.webp
+++ b/Wikipedia_files/Commons-logo.svg.webp
--- a/Wikipedia_files/Commons-logo.svg_002.webp
+++ b/Wikipedia_files/Commons-logo.svg_002.webp
--- a/Wikipedia_files/Commons-logo.svg_003.webp
+++ b/Wikipedia_files/Commons-logo.svg_003.webp
--- a/Wikipedia_files/Edit-clear.svg.webp
+++ b/Wikipedia_files/Edit-clear.svg.webp
--- a/Wikipedia_files/Edit-clear.svg_002.webp
+++ b/Wikipedia_files/Edit-clear.svg_002.webp
--- a/Wikipedia_files/Edit-clear.svg_003.webp
+++ b/Wikipedia_files/Edit-clear.svg_003.webp
--- a/Wikipedia_files/Flag_of_Argentina.svg.webp
+++ b/Wikipedia_files/Flag_of_Argentina.svg.webp
--- a/Wikipedia_files/Flag_of_Argentina.svg_002.webp
+++ b/Wikipedia_files/Flag_of_Argentina.svg_002.webp
--- a/Wikipedia_files/Flag_of_Argentina.svg_003.webp
+++ b/Wikipedia_files/Flag_of_Argentina.svg_003.webp
--- a/Wikipedia_files/Flag_of_Australia_(converted).svg.webp
+++ b/Wikipedia_files/Flag_of_Australia_(converted).svg.webp
--- a/Wikipedia_files/Flag_of_Australia_(converted).svg_002.webp
+++ b/Wikipedia_files/Flag_of_Australia_(converted).svg_002.webp
--- a/Wikipedia_files/Flag_of_Austria.svg.webp
+++ b/Wikipedia_files/Flag_of_Austria.svg.webp
--- a/Wikipedia_files/Flag_of_Austria.svg_002.webp
+++ b/Wikipedia_files/Flag_of_Austria.svg_002.webp
--- a/Wikipedia_files/Flag_of_Austria.svg_003.webp
+++ b/Wikipedia_files/Flag_of_Austria.svg_003.webp
--- a/Wikipedia_files/Flag_of_Barbados.svg.webp
+++ b/Wikipedia_files/Flag_of_Barbados.svg.webp
--- a/Wikipedia_files/Flag_of_Belgium_(civil).svg.webp
+++ b/Wikipedia_files/Flag_of_Belgium_(civil).svg.webp
--- a/Wikipedia_files/Flag_of_Belgium_(civil).svg_002.webp
+++ b/Wikipedia_files/Flag_of_Belgium_(civil).svg_002.webp
--- a/Wikipedia_files/Flag_of_Belgium_(civil).svg_003.webp
+++ b/Wikipedia_files/Flag_of_Belgium_(civil).svg_003.webp
--- a/Wikipedia_files/Flag_of_Brazil.svg.webp
+++ b/Wikipedia_files/Flag_of_Brazil.svg.webp
--- a/Wikipedia_files/Flag_of_Brazil.svg_002.webp
+++ b/Wikipedia_files/Flag_of_Brazil.svg_002.webp
--- a/Wikipedia_files/Flag_of_Brazil.svg_003.webp
+++ b/Wikipedia_files/Flag_of_Brazil.svg_003.webp
--- a/Wikipedia_files/Flag_of_Canada_(Pantone).svg.webp
+++ b/Wikipedia_files/Flag_of_Canada_(Pantone).svg.webp
--- a/Wikipedia_files/Flag_of_Canada_(Pantone).svg_002.webp
+++ b/Wikipedia_files/Flag_of_Canada_(Pantone).svg_002.webp
--- a/Wikipedia_files/Flag_of_Finland.svg.webp
+++ b/Wikipedia_files/Flag_of_Finland.svg.webp
--- a/Wikipedia_files/Flag_of_Finland.svg_002.webp
+++ b/Wikipedia_files/Flag_of_Finland.svg_002.webp
--- a/Wikipedia_files/Flag_of_Finland.svg_003.webp
+++ b/Wikipedia_files/Flag_of_Finland.svg_003.webp
--- a/Wikipedia_files/Flag_of_France.svg.webp
+++ b/Wikipedia_files/Flag_of_France.svg.webp
--- a/Wikipedia_files/Flag_of_France.svg_002.webp
+++ b/Wikipedia_files/Flag_of_France.svg_002.webp
--- a/Wikipedia_files/Flag_of_Germany.svg.webp
+++ b/Wikipedia_files/Flag_of_Germany.svg.webp
--- a/Wikipedia_files/Flag_of_Germany.svg_002.webp
+++ b/Wikipedia_files/Flag_of_Germany.svg_002.webp
--- a/Wikipedia_files/Flag_of_Germany.svg_003.webp
+++ b/Wikipedia_files/Flag_of_Germany.svg_003.webp
--- a/Wikipedia_files/Flag_of_Ireland.svg.webp
+++ b/Wikipedia_files/Flag_of_Ireland.svg.webp
--- a/Wikipedia_files/Flag_of_Ireland.svg_002.webp
+++ b/Wikipedia_files/Flag_of_Ireland.svg_002.webp
--- a/Wikipedia_files/Flag_of_Italy.svg.webp
+++ b/Wikipedia_files/Flag_of_Italy.svg.webp
--- a/Wikipedia_files/Flag_of_Italy.svg_002.webp
+++ b/Wikipedia_files/Flag_of_Italy.svg_002.webp
--- a/Wikipedia_files/Flag_of_Italy.svg_003.webp
+++ b/Wikipedia_files/Flag_of_Italy.svg_003.webp
--- a/Wikipedia_files/Flag_of_Japan.svg.webp
+++ b/Wikipedia_files/Flag_of_Japan.svg.webp
--- a/Wikipedia_files/Flag_of_Japan.svg_002.webp
+++ b/Wikipedia_files/Flag_of_Japan.svg_002.webp
--- a/Wikipedia_files/Flag_of_Mexico.svg.webp
+++ b/Wikipedia_files/Flag_of_Mexico.svg.webp
--- a/Wikipedia_files/Flag_of_Mexico.svg_002.webp
+++ b/Wikipedia_files/Flag_of_Mexico.svg_002.webp
--- a/Wikipedia_files/Flag_of_Monaco.svg.webp
+++ b/Wikipedia_files/Flag_of_Monaco.svg.webp
--- a/Wikipedia_files/Flag_of_Monaco.svg_002.webp
+++ b/Wikipedia_files/Flag_of_Monaco.svg_002.webp
--- a/Wikipedia_files/Flag_of_Norway.svg.webp
+++ b/Wikipedia_files/Flag_of_Norway.svg.webp
--- a/Wikipedia_files/Flag_of_Norway.svg_002.webp
+++ b/Wikipedia_files/Flag_of_Norway.svg_002.webp
--- a/Wikipedia_files/Flag_of_Poland.svg.webp
+++ b/Wikipedia_files/Flag_of_Poland.svg.webp
--- a/Wikipedia_files/Flag_of_Poland.svg_002.webp
+++ b/Wikipedia_files/Flag_of_Poland.svg_002.webp
--- a/Wikipedia_files/Flag_of_South_Africa_(1928-1982).svg.webp
+++ b/Wikipedia_files/Flag_of_South_Africa_(1928-1982).svg.webp
--- a/Wikipedia_files/Flag_of_South_Africa_(1928-1982).svg_002.webp
+++ b/Wikipedia_files/Flag_of_South_Africa_(1928-1982).svg_002.webp
--- a/Wikipedia_files/Flag_of_Sweden.svg.webp
+++ b/Wikipedia_files/Flag_of_Sweden.svg.webp
--- a/Wikipedia_files/Flag_of_Sweden.svg_002.webp
+++ b/Wikipedia_files/Flag_of_Sweden.svg_002.webp
--- a/Wikipedia_files/Flag_of_Sweden.svg_003.webp
+++ b/Wikipedia_files/Flag_of_Sweden.svg_003.webp
--- a/Wikipedia_files/Flag_of_Switzerland_(Pantone).svg.webp
+++ b/Wikipedia_files/Flag_of_Switzerland_(Pantone).svg.webp
--- a/Wikipedia_files/Flag_of_Switzerland_(Pantone).svg_002.webp
+++ b/Wikipedia_files/Flag_of_Switzerland_(Pantone).svg_002.webp
--- a/Wikipedia_files/Flag_of_Venezuela.svg.webp
+++ b/Wikipedia_files/Flag_of_Venezuela.svg.webp
--- a/Wikipedia_files/Flag_of_Venezuela.svg_002.webp
+++ b/Wikipedia_files/Flag_of_Venezuela.svg_002.webp
--- a/Wikipedia_files/Flag_of_the_People's_Republic_of_China.svg.webp
+++ b/Wikipedia_files/Flag_of_the_People's_Republic_of_China.svg.webp
--- a/Wikipedia_files/Flag_of_the_People's_Republic_of_China.svg_002.webp
+++ b/Wikipedia_files/Flag_of_the_People's_Republic_of_China.svg_002.webp
--- a/Wikipedia_files/Flag_of_the_United_Kingdom.svg.webp
+++ b/Wikipedia_files/Flag_of_the_United_Kingdom.svg.webp
--- a/Wikipedia_files/Flag_of_the_United_Kingdom.svg_002.webp
+++ b/Wikipedia_files/Flag_of_the_United_Kingdom.svg_002.webp
--- a/Wikipedia_files/Flag_of_the_United_States.svg.webp
+++ b/Wikipedia_files/Flag_of_the_United_States.svg.webp
--- a/Wikipedia_files/Flag_of_the_United_States.svg_002.webp
+++ b/Wikipedia_files/Flag_of_the_United_States.svg_002.webp
--- a/Wikipedia_files/Flag_of_the_United_States.svg_003.webp
+++ b/Wikipedia_files/Flag_of_the_United_States.svg_003.webp
--- a/Wikipedia_files/OOjs_UI_icon_edit-ltr-progressive.svg.webp
+++ b/Wikipedia_files/OOjs_UI_icon_edit-ltr-progressive.svg.webp
--- a/Wikipedia_files/Symbol_category_class.svg.webp
+++ b/Wikipedia_files/Symbol_category_class.svg.webp
--- a/Wikipedia_files/Symbol_category_class.svg_002.webp
+++ b/Wikipedia_files/Symbol_category_class.svg_002.webp
--- a/Wikipedia_files/load.css
+++ b/Wikipedia_files/load.css
--- a/Wikipedia_files/load.js
+++ b/Wikipedia_files/load.js
--- a/Wikipedia_files/load_002.css
+++ b/Wikipedia_files/load_002.css
--- a/tests/test_concrete_functional.py
+++ b/tests/test_concrete_functional.py
@@ -393,7 +393,7 @@ class TestRenderableFormField(unittest.TestCase):
    """ TODO: Fix test
    @patch('PIL.ImageDraw.Draw')
    def test_render_field_with_value(self, mock_draw_class):
-        """Test rendering field with value"""
+        #Test rendering field with value
        mock_draw = Mock()
        mock_draw_class.return_value = mock_draw
--- a/tests/test_html_file_loader.py
+++ b/tests/test_html_file_loader.py
@@ -0,0 +1,118 @@
 """
 Test module for loading HTML files using the html_extraction module.
 This test verifies that HTML files can be loaded from disk and processed
 using the html_extraction.parse_html_string function.
 """
 import os
 import unittest
 from pyWebLayout.io.readers.html_extraction import parse_html_string
 from pyWebLayout.abstract.block import Block
 from pyWebLayout.style import Font
 class TestHTMLFileLoader(unittest.TestCase):
    """Test class for HTML file loading functionality."""
    def test_load_html_file(self):
        """Test loading and parsing an HTML file from disk."""
        # Path to the test HTML file
        html_file_path = os.path.join("tests", "data", "Kimi Räikkönen - Wikipedia.html")
        # Verify the test file exists
        self.assertTrue(os.path.exists(html_file_path), f"Test HTML file not found: {html_file_path}")
        # Read the HTML file
        with open(html_file_path, 'r', encoding='utf-8') as file:
            html_content = file.read()
        # Verify we got some content
        self.assertGreater(len(html_content), 0, "HTML file should not be empty")
        # Parse the HTML content using the html_extraction module
        try:
            blocks = parse_html_string(html_content)
        except Exception as e:
            self.fail(f"Failed to parse HTML file: {e}")
        # Verify we got some blocks
        self.assertIsInstance(blocks, list, "parse_html_string should return a list")
        self.assertGreater(len(blocks), 0, "Should extract at least one block from the HTML file")
        # Verify all returned items are Block instances
        for i, block in enumerate(blocks):
            self.assertIsInstance(block, Block, f"Item {i} should be a Block instance, got {type(block)}")
        print(f"Successfully loaded and parsed HTML file with {len(blocks)} blocks")
    def test_load_html_file_with_custom_font(self):
        """Test loading HTML file with a custom base font."""
        html_file_path = os.path.join("tests", "data", "Kimi Räikkönen - Wikipedia.html")
        # Skip if file doesn't exist
        if not os.path.exists(html_file_path):
            self.skipTest(f"Test HTML file not found: {html_file_path}")
        # Create a custom font
        custom_font = Font(font_size=14, colour=(100, 100, 100))
        # Read and parse with custom font
        with open(html_file_path, 'r', encoding='utf-8') as file:
            html_content = file.read()
        blocks = parse_html_string(html_content, base_font=custom_font)
        # Verify we got blocks
        self.assertGreater(len(blocks), 0, "Should extract blocks with custom font")
        print(f"Successfully parsed HTML file with custom font, got {len(blocks)} blocks")
    def test_load_html_file_content_types(self):
        """Test that the loaded HTML file contains expected content types."""
        html_file_path = os.path.join("tests", "data", "Kimi Räikkönen - Wikipedia.html")
        # Skip if file doesn't exist
        if not os.path.exists(html_file_path):
            self.skipTest(f"Test HTML file not found: {html_file_path}")
        with open(html_file_path, 'r', encoding='utf-8') as file:
            html_content = file.read()
        blocks = parse_html_string(html_content)
        # Check that we have different types of blocks
        block_type_names = [type(block).__name__ for block in blocks]
        unique_types = set(block_type_names)
        # A Wikipedia page should contain multiple types of content
        self.assertGreater(len(unique_types), 1, "Should have multiple types of blocks in Wikipedia page")
        print(f"Found block types: {sorted(unique_types)}")
    def test_html_file_size_handling(self):
        """Test that large HTML files can be handled gracefully."""
        html_file_path = os.path.join("tests", "data", "Kimi Räikkönen - Wikipedia.html")
        # Skip if file doesn't exist
        if not os.path.exists(html_file_path):
            self.skipTest(f"Test HTML file not found: {html_file_path}")
        # Get file size
        file_size = os.path.getsize(html_file_path)
        print(f"HTML file size: {file_size} bytes")
        # Read and parse
        with open(html_file_path, 'r', encoding='utf-8') as file:
            html_content = file.read()
        # This should not raise an exception even for large files
        blocks = parse_html_string(html_content)
        # Basic verification
        self.assertIsInstance(blocks, list)
        print(f"Successfully processed {file_size} byte file into {len(blocks)} blocks")
 if __name__ == '__main__':
    unittest.main()
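
Three of the four tests above repeat the same "skip if the fixture is missing, otherwise read it" steps. One possible way to share that logic is a small helper, sketched below. The name `read_fixture_or_skip` is hypothetical and not part of the commit. It relies on `unittest.SkipTest`, the exception that `self.skipTest` raises, so calling it inside a test method marks that test as skipped.

```python
import os
import unittest


def read_fixture_or_skip(path, encoding="utf-8"):
    """Return the text of a fixture file, or skip the calling test if absent.

    Hypothetical helper for illustration; not part of the test module above.
    """
    if not os.path.exists(path):
        raise unittest.SkipTest(f"Test HTML file not found: {path}")
    with open(path, "r", encoding=encoding) as handle:
        return handle.read()


class ExampleFixtureTest(unittest.TestCase):
    """Shows the helper in use; the fixture path matches the tests above."""

    def test_fixture_not_empty(self):
        path = os.path.join("tests", "data", "Kimi Räikkönen - Wikipedia.html")
        html_content = read_fixture_or_skip(path)
        self.assertGreater(len(html_content), 0)
```

Note that `test_load_html_file` asserts that the file exists rather than skipping, so it would keep its explicit `assertTrue` check if that stricter behavior is intended.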