Everything you need to know about XLSX
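XLSX (.xlsx) has been Microsoft Excel's default spreadsheet format since Excel 2007. It is part of Office Open XML, standardized as ECMA-376 and ISO/IEC 29500. Like DOCX, it's a ZIP archive of XML files: one for the workbook structure, one per sheet, and separate files for shared strings and styles. Formulas are not kept in a file of their own; they live in each sheet's XML, inside the cells they belong to.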
How it works under the hood
- ZIP architecture. Inside: `xl/workbook.xml` (sheet list), `xl/worksheets/sheet1.xml` (cell data per sheet), `xl/sharedStrings.xml` (deduplicated text), `xl/styles.xml` (formatting).
- Cell references. A1 notation is the standard: column letter + row number. Formulas use this same syntax: `=SUM(A1:A10)`.
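- Formulas as text. Each formula is stored as text alongside its last-computed value. Excel recalculates when it opens the file; reading the XML directly gives you both the formula and the cached result. Files written by libraries rather than by Excel may omit the cached value.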
- Limits. A sheet holds at most 1,048,576 rows by 16,384 columns (the last column is XFD). If you hit those limits, the data belongs in a database, not Excel.
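The points above can be sketched with nothing but Python's standard library. This is a minimal illustration, not a full reader: the two XML parts are hand-written stand-ins for what a real file contains, the helper names (`column_number`, `split_reference`, `build_archive`, `read_cells`) are made up for this example, and it assumes a single sheet stored at `xl/worksheets/sheet1.xml`.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by worksheet and shared-string parts.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN_NS}

# Hand-written, heavily simplified parts (an assumption for this sketch).
SHEET_XML = f"""<worksheet xmlns="{MAIN_NS}">
  <sheetData>
    <row r="1">
      <c r="A1" t="s"><v>0</v></c>
      <c r="B1"><v>2</v></c>
      <c r="C1"><v>3</v></c>
      <c r="D1"><f>SUM(B1:C1)</f><v>5</v></c>
    </row>
  </sheetData>
</worksheet>"""

SHARED_STRINGS_XML = f"""<sst xmlns="{MAIN_NS}" count="1" uniqueCount="1">
  <si><t>Total</t></si>
</sst>"""


def column_number(letters: str) -> int:
    """Convert a column label such as 'A' or 'XFD' to its 1-based number."""
    number = 0
    for ch in letters.upper():
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


def split_reference(ref: str) -> tuple:
    """Split an A1-style reference into (column number, row number)."""
    match = re.fullmatch(r"([A-Za-z]+)([0-9]+)", ref)
    if match is None:
        raise ValueError(f"not an A1-style reference: {ref!r}")
    return column_number(match.group(1)), int(match.group(2))


def build_archive() -> bytes:
    """Pack the two sample parts into an in-memory ZIP (not a complete XLSX)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("xl/worksheets/sheet1.xml", SHEET_XML)
        archive.writestr("xl/sharedStrings.xml", SHARED_STRINGS_XML)
    return buffer.getvalue()


def read_cells(data: bytes) -> dict:
    """Map each cell reference to (formula text or None, stored value text)."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        strings_root = ET.fromstring(archive.read("xl/sharedStrings.xml"))
        sheet_root = ET.fromstring(archive.read("xl/worksheets/sheet1.xml"))

    # Each <si> is one deduplicated string; join its <t> text nodes.
    shared = [
        "".join(t.text or "" for t in si.iter(f"{{{MAIN_NS}}}t"))
        for si in strings_root.findall("m:si", NS)
    ]

    cells = {}
    for cell in sheet_root.iterfind(".//m:c", NS):
        formula = cell.find("m:f", NS)
        value = cell.find("m:v", NS)
        text = value.text if value is not None else None
        # t="s" means the stored value is an index into the shared strings.
        if cell.get("t") == "s" and text is not None:
            text = shared[int(text)]
        cells[cell.get("r")] = (formula.text if formula is not None else None, text)
    return cells
```

With these definitions, `read_cells(build_archive())["D1"]` gives `("SUM(B1:C1)", "5")`: the formula text (stored without the leading `=`) next to its cached result. `column_number("XFD")` returns 16384, which matches the column limit above. For real files a library such as openpyxl is usually the better choice; this sketch only shows what sits inside the archive.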
Where you'll actually use it
- Financial models and budgets
- Data analysis and pivot tables
- Inventory tracking and lightweight CRM
- Configuration tables for engineering and ops teams
How it compares to alternatives
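- XLSX vs CSV. XLSX carries formulas, formatting, multiple sheets, and charts. CSV is plain delimited text: one table, no types, no formatting.
- XLSX vs ODS. ODS is the OpenDocument equivalent, the native spreadsheet format of LibreOffice Calc.
- XLSX vs Google Sheets. Google Sheets imports and exports XLSX, but real-time collaboration happens inside Google Sheets itself, not in the exported file.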
Things that will trip you up
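- Excel's auto-conversion eats data - leading zeros vanish from zip codes, and text that looks like a date is converted to one (the famous 'gene name' problem in genomics, where symbols such as SEPT2 and MARCH1 turned into dates)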
- VBA macros in .xlsm files can be malware vectors - never enable macros from unknown sources
- Formulas can contain circular references - Excel warns about them, but a naive formula evaluator may recurse until it crashes
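One way to guard against the circular-reference problem is to check the dependency graph before evaluating anything. The sketch below is an illustration under loud assumptions: formulas arrive as a plain dict of cell reference to formula text, references are pulled out with a simple regex that ignores sheet names, `$` anchors, string literals, and the cells inside a range such as `A1:A10`, and `find_cycle` is a name made up for this example. It uses an iterative depth-first search so a long dependency chain cannot exhaust Python's recursion limit.

```python
import re
from typing import Dict, List, Optional

# Plain A1-style references only; the lookahead skips function names such as LOG10(.
CELL_REF = re.compile(r"\b([A-Z]{1,3}[0-9]{1,7})\b(?!\()")


def find_cycle(formulas: Dict[str, str]) -> Optional[List[str]]:
    """Return one circular chain of cell references, or None if there is none."""
    # Edges point from a cell to the cells its formula mentions.
    graph = {cell: CELL_REF.findall(text) for cell, text in formulas.items()}
    done = set()  # cells fully explored and known to be cycle-free

    for start in graph:
        if start in done:
            continue
        path = [start]                    # current chain of dependencies
        on_path = {start}
        stack = [iter(graph[start])]      # one iterator of pending deps per path entry
        while stack:
            dep = next(stack[-1], None)
            if dep is None:
                # All dependencies of the last cell on the path are explored.
                finished = path.pop()
                on_path.discard(finished)
                done.add(finished)
                stack.pop()
                continue
            if dep in on_path:
                # Reached a cell already on the current chain: that is a cycle.
                return path[path.index(dep):] + [dep]
            if dep in done or dep not in graph:
                continue                  # constant cell or already cleared
            path.append(dep)
            on_path.add(dep)
            stack.append(iter(graph[dep]))
    return None
```

For example, `find_cycle({"A1": "=B1+1", "B1": "=C1*2", "C1": "=A1"})` returns `["A1", "B1", "C1", "A1"]`, while a sheet whose formulas only point "downstream" returns `None`. A production evaluator would need a real formula tokenizer and range expansion; the regex here is only good enough to show the idea.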