Everything you need to know about DOCX
DOCX (.docx) has been Microsoft Word's default document format since Office 2007. It is based on Office Open XML (OOXML), first published as ECMA-376 in 2006 and standardized as ISO/IEC 29500 in 2008. Under the hood, it's a ZIP archive of XML files describing document structure, styling, and content. Open one with any unzip tool to see the XML inside.
How it works under the hood
- ZIP package. Rename .docx to .zip and extract: you'll find `word/document.xml` (the content), `word/styles.xml` (style definitions), `word/numbering.xml` (list definitions, present when the document uses lists), and `[Content_Types].xml` (the content-type manifest at the package root).
- Office Open XML. ECMA-376 and ISO/IEC 29500 specify how documents are structured. The schemas are open and documented, but Word also writes Microsoft-specific extensions beyond the base standard.
- Tracked changes. Revisions are stored inline in the XML with author and timestamp, so deleted text stays in the file until the change is accepted or rejected. This is why you should resolve all changes before sharing sensitive documents.
- Embedded media. Images live in `word/media/`, OLE objects in `word/embeddings/`, and embedded fonts in `word/fonts/`.
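The package layout described above can be explored without Word at all. The following is a minimal sketch using only Python's standard library; the function names `list_parts` and `paragraph_texts` are illustrative, not part of any DOCX library. It assumes a well-formed package and only reads the main body, so headers, footers, footnotes, and text boxes are not handled.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace used by word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_parts(docx):
    """Return the names of all parts (files) inside the DOCX package."""
    with zipfile.ZipFile(docx) as package:
        return package.namelist()


def paragraph_texts(docx):
    """Return the plain text of each paragraph in word/document.xml."""
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's visible text is split across w:t elements in its runs.
        paragraphs.append("".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t")))
    return paragraphs
```

Both functions accept a path or a binary file-like object, for example `list_parts("report.docx")`, where `report.docx` is a hypothetical file. For anything beyond inspection, a dedicated library is usually a better choice than walking the XML by hand.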
Where you'll actually use it
- Business correspondence and reports
- Academic papers and theses
- Collaborative editing with Track Changes
- Templated documents with Mail Merge
How it compares to alternatives
- DOCX vs PDF: DOCX is editable and reflows to fit the page or window; PDF has a fixed layout and is meant as final output.
- DOCX vs ODT: Same concept, different XML schema. LibreOffice handles both.
- DOCX vs DOC: DOC was the old binary format used by Word 97 through 2003; DOCX is the modern XML version.
Things that will trip you up
- DOCX from Word and DOCX from Google Docs have subtle differences - perfect round-trip editing is rare
- Track Changes can leak previous versions of text - run Word's 'Inspect Document' check before sharing externally
- Embedded Excel charts often break when opened on a machine without Excel - export as images for portability
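The Track Changes leak can also be checked outside Word. This is a rough sketch, assuming revisions are recorded as `w:ins` and `w:del` elements in `word/document.xml`; the function name `tracked_changes` is illustrative. It only scans the main body, so revisions in headers, footers, footnotes, or comments are not covered, and it is no substitute for Word's own inspection.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace, in ElementTree's "{uri}" prefix form
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def tracked_changes(docx):
    """Return (kind, author, date, text) for each tracked insertion or deletion."""
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    changes = []
    for el in root.iter():
        if el.tag == W + "ins":
            kind, text_tag = "insert", W + "t"
        elif el.tag == W + "del":
            # Deleted text is kept in w:delText rather than w:t.
            kind, text_tag = "delete", W + "delText"
        else:
            continue
        text = "".join(t.text or "" for t in el.iter(text_tag))
        changes.append((kind, el.get(W + "author"), el.get(W + "date"), text))
    return changes
```

An empty result suggests the main body carries no tracked insertions or deletions. Entries with empty text can appear when the tracked change applies to a paragraph mark rather than to a run of text.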