Implementation Status
Overview
Sidedoc MVP is complete with all core features implemented and tested.
- Completion: MVP complete with table and track changes support
- Tests: Run
pytest --cov=sidedocto verify
✅ Implemented Features
Core Infrastructure
- ✅ Project structure and dependencies (Python 3.11+)
- ✅ Data models (Block, Style, Manifest)
- ✅ CLI framework with click
- ✅ Full test suite with pytest
Commands
- ✅
sidedoc extract- Extract docx to .sidedoc/ directory - ✅
sidedoc build- Build docx from sidedoc - ✅
sidedoc sync- Sync edited markdown back to docx - ✅
sidedoc diff- Show changes since last sync - ✅
sidedoc unpack- Extract archive contents to directory - ✅
sidedoc pack- Create archive from directory - ✅
sidedoc validate- Validate archive integrity - ✅
sidedoc info- Display archive metadata
Content Extraction
- ✅ Paragraphs → markdown
- ✅ Headings (H1-H6) → markdown headers
- ✅ Inline formatting (bold, italic, underline) → markdown
- ✅ Bulleted and numbered lists → markdown
- ✅ Images extracted to assets directory
- ✅ Hyperlinks → markdown
[text](url)syntax - ✅ Tables → GFM pipe syntax (merged cells, cell formatting)
- ✅ Track changes → CriticMarkup (insertions/deletions with author/date)
- ✅ Basic paragraph formatting preserved
- ✅ Block-level structure maintained
- ✅ Round-trip testing (extract → build → validate)
Archive Format
- ✅ .sidedoc/ directory and .sdoc ZIP formats
- ✅
content.md- Clean markdown content - ✅
structure.json- Block mappings - ✅
styles.json- Formatting data - ✅
manifest.json- Metadata and hashes
Benchmark Suite
- ✅ Benchmark framework with 4 comparison pipelines
- ✅ 3 benchmark tasks (summarize, single-edit, multi-turn edit)
- ✅ Token counting with tiktoken (cl100k_base encoding)
- ✅ Cost calculation for LLM and Document Intelligence APIs
- ✅ Fidelity scoring (structural, style, visual)
- ✅ CLI tools for running benchmarks and generating reports
See Benchmark Suite for usage instructions.
🚧 Post-MVP Roadmap
Next Priority (v0.2.0)
- ⏳ Nested lists (2+ levels)
- ⏳ Enhanced style preservation
Future Enhancements (v0.3.0+)
- ⏳ Multi-column layouts
- ⏳ Headers and footers
- ⏳ Comments
- ⏳ Footnotes and endnotes
- ⏳ Advanced document features
Version 1.0.0
- Full feature parity with spec
- Production ready
- PyPI package
Test Coverage
Tests cover extract, build, sync, inline formatting, lists, images, tables, track changes, hyperlinks, archive management, CLI commands, and round-trip validation. Run pytest --cov=sidedoc to see current counts and coverage.
Known Limitations
- Nested Lists: Only single-level lists supported
- Complex Formatting: Multi-column layouts, headers/footers not yet supported
Current Workflow
The MVP supports the full extract → edit → sync → build workflow:
# Extract Word document
sidedoc extract document.docx
# → Creates document.sidedoc/
# Edit the markdown directly
vim document.sidedoc/content.md
# View changes (optional)
sidedoc diff document.sidedoc/
# → Shows what changed since extraction
# Sync content changes
sidedoc sync document.sidedoc/
# → Updates internal structure
# Rebuild Word document
sidedoc build document.sidedoc/
# → Creates document.docx with formatting preserved
Contributing
The project is actively developed and welcoming contributions! See CONTRIBUTING.md for guidelines.