CDX Internet Archive Index

http://the-fr.org/id/file-format/1322

Name: CDX Internet Archive Index

Version:

Description: A CDX file consists of individual lines of text, each of which summarizes a single web document. The first line of the file is a legend for interpreting the data, and each following line contains the data for referencing the corresponding page within the host. The first character of the file is the field delimiter used throughout the rest of the file, and it is followed by the literal "CDX". For signature strength, the field delimiter is currently assumed to be a space character; please contact the PRONOM team if you encounter CDX index files that use a different delimiter.
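
The layout described above can be illustrated with a minimal Python sketch. It assumes only what the description states: the first character of the legend line is the delimiter and the next three characters are the literal "CDX". Treating the rest of the legend as delimiter-separated field codes is an assumption, and the function names, the field codes (a, b, m, s) and the sample record are hypothetical placeholders chosen for illustration, not values taken from this registry entry.

```python
def parse_cdx_legend(first_line):
    """Return (delimiter, field_codes) from the first line of a CDX file."""
    line = first_line.rstrip("\r\n")
    # The delimiter is the first character; "CDX" must follow immediately.
    if len(line) < 4 or line[1:4] != "CDX":
        raise ValueError("not a CDX legend line")
    delimiter = line[0]
    # Assumption: the remainder is a delimiter-separated list of field codes.
    fields = [code for code in line[4:].split(delimiter) if code]
    return delimiter, fields


def parse_cdx_record(line, delimiter, fields):
    """Map one data line onto the field codes declared in the legend."""
    values = line.rstrip("\r\n").split(delimiter)
    return dict(zip(fields, values))


# Hypothetical sample: a space-delimited legend followed by one record.
delimiter, fields = parse_cdx_legend(" CDX a b m s\n")
record = parse_cdx_record(
    "http://example.com/ 20200101000000 text/html 200\n", delimiter, fields
)
print(delimiter == " ", fields, record["m"])
```

Reading the delimiter from the legend, rather than hard-coding a space, keeps the sketch usable for the less common files the description mentions, where the delimiter differs.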

Deprecated: false

MIMEType:

PUID: fmt/869

sameAs: PRONOM: http://www.nationalarchives.gov.uk/PRONOM/fmt/869

Extension: cdx

Magic: true

Container Magic: false

Binary Magic: true

Signature Priority Over:

See Also (e.g. Wikidata, Library of Congress):

Software that can read the format:

Alias:

Class: http://the-fr.org/def/format-registry/FileFormat

Type: http://the-fr.org/def/format-registry/StructuredText

SPARQL: http://the-fr.org/public/sparql/endpoint.php?query=describe+%3Chttp://the-fr.org/id/file-format/1322%3E&output=&jsonp=&key=&show_inline=1