Skip to content

Implementation Status

Overview

Sidedoc MVP is complete with all core features implemented and tested.

  • Completion: MVP complete with table and track changes support
  • Tests: Run pytest --cov=sidedoc to verify

✅ Implemented Features

Core Infrastructure

  • ✅ Project structure and dependencies (Python 3.11+)
  • ✅ Data models (Block, Style, Manifest)
  • ✅ CLI framework with click
  • ✅ Full test suite with pytest

Commands

  • sidedoc extract - Extract docx to .sidedoc/ directory
  • sidedoc build - Build docx from sidedoc
  • sidedoc sync - Sync edited markdown back to docx
  • sidedoc diff - Show changes since last sync
  • sidedoc unpack - Extract archive contents to directory
  • sidedoc pack - Create archive from directory
  • sidedoc validate - Validate archive integrity
  • sidedoc info - Display archive metadata

Content Extraction

  • ✅ Paragraphs → markdown
  • ✅ Headings (H1-H6) → markdown headers
  • ✅ Inline formatting (bold, italic, underline) → markdown
  • ✅ Bulleted and numbered lists → markdown
  • ✅ Images extracted to assets directory
  • ✅ Hyperlinks → markdown [text](url) syntax
  • ✅ Tables → GFM pipe syntax (merged cells, cell formatting)
  • ✅ Track changes → CriticMarkup (insertions/deletions with author/date)
  • ✅ Basic paragraph formatting preserved
  • ✅ Block-level structure maintained
  • ✅ Round-trip testing (extract → build → validate)

Archive Format

  • ✅ .sidedoc/ directory and .sdoc ZIP formats
  • content.md - Clean markdown content
  • structure.json - Block mappings
  • styles.json - Formatting data
  • manifest.json - Metadata and hashes

Benchmark Suite

  • ✅ Benchmark framework with 4 comparison pipelines
  • ✅ 3 benchmark tasks (summarize, single-edit, multi-turn edit)
  • ✅ Token counting with tiktoken (cl100k_base encoding)
  • ✅ Cost calculation for LLM and Document Intelligence APIs
  • ✅ Fidelity scoring (structural, style, visual)
  • ✅ CLI tools for running benchmarks and generating reports

See Benchmark Suite for usage instructions.

🚧 Post-MVP Roadmap

Next Priority (v0.2.0)

  • ⏳ Nested lists (2+ levels)
  • ⏳ Enhanced style preservation

Future Enhancements (v0.3.0+)

  • ⏳ Multi-column layouts
  • ⏳ Headers and footers
  • ⏳ Comments
  • ⏳ Footnotes and endnotes
  • ⏳ Advanced document features

Version 1.0.0

  • Full feature parity with spec
  • Production ready
  • PyPI package

Test Coverage

Tests cover extract, build, sync, inline formatting, lists, images, tables, track changes, hyperlinks, archive management, CLI commands, and round-trip validation. Run pytest --cov=sidedoc to see current counts and coverage.

Known Limitations

  1. Nested Lists: Only single-level lists supported
  2. Complex Formatting: Multi-column layouts, headers/footers not yet supported

Current Workflow

The MVP supports the full extract → edit → sync → build workflow:

# Extract Word document
sidedoc extract document.docx
# → Creates document.sidedoc/

# Edit the markdown directly
vim document.sidedoc/content.md

# View changes (optional)
sidedoc diff document.sidedoc/
# → Shows what changed since extraction

# Sync content changes
sidedoc sync document.sidedoc/
# → Updates internal structure

# Rebuild Word document
sidedoc build document.sidedoc/
# → Creates document.docx with formatting preserved

Contributing

The project is actively developed and welcoming contributions! See CONTRIBUTING.md for guidelines.