.csv

Sample CSV files download

CSV files store tabular data in plain text, which makes CSV one of the simplest formats for exchanging data between applications.

| File size | Label | Specs / Info | Format |
| --- | --- | --- | --- |
| 1 KB | 10 rows | 10 rows | CSV |
| 5 KB | 100 rows | 100 rows | CSV |
| 50 KB | 1,000 rows | 1,000 rows | CSV |
| 500 KB | 10,000 rows | 10,000 rows | CSV |
| 5 KB | Semicolon | Semicolon-separated | CSV |
Technical guide

Everything you need to know about CSV

CSV (Comma-Separated Values, .csv) is the simplest serious data interchange format: rows of fields separated by commas, documented in 2005 by RFC 4180, which is an informational memo rather than a binding standard. Despite its apparent simplicity, CSV is tricky to get right because no single specification is followed everywhere, and edge cases (quotes, commas in values, line breaks in cells) trip up many parsers.

How it works under the hood

  • Field separator varies. Comma is the namesake, but TSV uses tabs, and locales that write decimals with a comma (much of Europe) often use a semicolon instead.
  • Quoting rules. RFC 4180 says wrap fields containing commas, quotes, or newlines in double quotes; escape internal quotes by doubling them. Many tools don't follow this strictly.
  • No native data types. Everything is a string. The receiver decides what's a number, what's a date, what's a boolean. This causes endless type-inference bugs.
  • No nesting, no schema. CSV is flat tabular data. For trees or schema enforcement, use JSON or Parquet.
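
The quoting and typing rules above can be illustrated with a minimal sketch using Python's standard `csv` module. The sample rows and variable names are made up for illustration; the behavior shown is that of the module's default dialect, not a claim about how every tool writes CSV.

```python
import csv
import io

# Hypothetical sample rows: one field contains a comma, one a double quote,
# and one a line break.
rows = [
    ["id", "name", "note"],
    ["007", "Smith, Jane", 'She said "hi"'],
    ["008", "Lee", "line one\nline two"],
]

# Write with the default dialect: fields are quoted only when they need it,
# and embedded double quotes are escaped by doubling them.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
text = buffer.getvalue()

# Read it back. Every value arrives as a string, including "007".
parsed = list(csv.reader(io.StringIO(text, newline="")))

# Converting types is the receiver's decision; int() drops the leading zeros.
record_id = int(parsed[1][0])

# The same rows with a semicolon separator: the comma in "Smith, Jane"
# is no longer special, so that field is written without quotes.
semicolon_buffer = io.StringIO(newline="")
csv.writer(semicolon_buffer, delimiter=";").writerows(rows)
semicolon_text = semicolon_buffer.getvalue()
```

Reading the semicolon variant back requires passing the same `delimiter=";"` to `csv.reader`; nothing in the file itself declares which separator was used.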

Where you'll actually use it

  • Database imports/exports
  • Spreadsheet data exchange between Excel and other tools
  • Bulk record updates (CRM imports, mailing lists)
  • Simple log files for ad-hoc analysis

How it compares to alternatives

  • CSV vs TSV: TSV separates fields with tabs, which is often safer because tabs appear in real data far less often than commas.
  • CSV vs JSON: JSON has types and nesting; CSV is flat strings.
  • CSV vs Parquet: Parquet is a columnar binary format that can be 10-100x faster for analytics on large datasets, depending on the workload.
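
To make the CSV vs JSON difference concrete, here is a small sketch that round-trips one record through both formats using only the Python standard library. The record and its field names are hypothetical.

```python
import csv
import io
import json

# Hypothetical record with an integer, a boolean, and a nested list.
record = {"id": 42, "active": True, "tags": ["a", "b"]}

# JSON keeps the types and the nesting.
json_back = json.loads(json.dumps(record))

# The csv module converts each value with str() on write,
# and hands back plain strings on read.
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=list(record))
writer.writeheader()
writer.writerow(record)
csv_back = next(csv.DictReader(io.StringIO(buffer.getvalue(), newline="")))

print(json_back)  # types and list survive
print(csv_back)   # every value is now a string
```

After the CSV round trip, the list is just the text of its Python representation; recovering the original structure would need a convention agreed on by writer and reader.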

Things that will trip you up

  • Excel's auto-conversion silently alters CSV data: it strips leading zeros, reinterprets date-like values, and shows long numbers in scientific notation
  • Embedded newlines inside quoted fields confuse simple line-based parsers - always use a real CSV library
  • A byte order mark (BOM) at the start of the file (`UTF-8 with BOM`) breaks naive parsers - handle or strip it explicitly

Test it yourself: `csvkit` (csvlook, csvstat) for command-line analysis, Python's `csv` module or `pandas.read_csv()` for code, OpenRefine for cleanup of messy CSVs.
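
Two of the pitfalls above, the BOM and embedded newlines, can be reproduced in a few lines. This is a sketch with made-up file contents held in memory; it compares a naive split-based approach with Python's `csv` module and the `utf-8-sig` codec.

```python
import csv
import io

# Hypothetical file contents: a UTF-8 BOM, a header row, and one record
# whose quoted field contains a line break.
raw = b'\xef\xbb\xbfname,note\r\nAda,"line one\nline two"\r\n'

# Naive approach: decode as plain UTF-8 and split on line breaks and commas.
naive_lines = raw.decode("utf-8").splitlines()
naive_header = naive_lines[0].split(",")
# The BOM is now glued to the first column name, and the single record
# has been cut into two "lines".

# Safer approach: "utf-8-sig" drops the BOM if one is present, and
# csv.reader keeps the quoted line break inside its field.
text = raw.decode("utf-8-sig")
rows = list(csv.reader(io.StringIO(text, newline="")))
```

For a file on disk, the equivalent is opening it with `open(path, encoding="utf-8-sig", newline="")` before handing it to `csv.reader`, where `path` stands for your own file.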

Format details

MIME Types

  • text/csv

License

CC0 1.0 (Public Domain)

Free for personal and commercial use, no attribution required.
