WARC

http://the-fr.org/id/file-format/742

Name: WARC

Version:

Description: "The WARC (Web ARChive) file format offers a convention for concatenating multiple resource records (data objects), each consisting of a set of simple text headers and an arbitrary data block into one long file. The WARC format is an extension of the ARC file format (ARC) that has traditionally been used to store 'web crawls' as sequences of content blocks harvested from the World Wide Web (…) Besides the primary content recorded in ARCs, the extended WARC format accommodates related secondary content, such as assigned metadata, abbreviated duplicate detection events, later-date transformations, and segmentation of large resources". The WARC format was written by members of the IIPC (http://www.netpreserve.org/) working within ISO/TC46/SC4/WG12.
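The record layout described above (text headers followed by an arbitrary data block, with records concatenated into one file) can be illustrated with a minimal Python sketch. It is not a reference implementation: the function name iter_warc_records is hypothetical, the input is assumed to be an uncompressed WARC stream, header continuation lines are not handled, and the sample records omit the other fields the standard makes mandatory (such as the record ID and date) to keep the example short.

```python
import io


def iter_warc_records(stream):
    """Yield (version, headers, block) for each record in an uncompressed WARC stream.

    Simplified sketch: assumes one "Name: value" header per line and a
    Content-Length header on every record.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if line in (b"\r\n", b"\n"):
            continue  # blank lines separating consecutive records
        version = line.strip().decode("utf-8")
        if not version.startswith("WARC/"):
            raise ValueError(f"not a WARC record: {version!r}")

        # Read the text headers up to the empty line that ends them.
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()

        # The data block is exactly Content-Length bytes of arbitrary data.
        length = int(headers["Content-Length"])
        block = stream.read(length)
        yield version, headers, block


# Two tiny illustrative records concatenated into one stream.
sample = (
    b"WARC/1.0\r\n"
    b"WARC-Type: resource\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
    b"\r\n\r\n"
    b"WARC/1.0\r\n"
    b"WARC-Type: metadata\r\n"
    b"Content-Length: 3\r\n"
    b"\r\n"
    b"abc"
    b"\r\n\r\n"
)

records = list(iter_warc_records(io.BytesIO(sample)))
for version, headers, block in records:
    print(version, headers["WARC-Type"], len(block))
```

Reading the block by its declared length, rather than scanning for a delimiter, is what allows the data block to hold arbitrary bytes. Files with a .warc.gz extension would need to be decompressed first, which this sketch does not attempt.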

Deprecated: false

MIMEType:

PUID: fmt/289

sameAs: PRONOM: http://www.nationalarchives.gov.uk/PRONOM/fmt/289

Extension: warc

Magic: true

Container Magic: false

Binary Magic: true

Signature Priority Over:

Alias: ISO 28500-2009

Class: http://the-fr.org/def/format-registry/FileFormat

Type: http://the-fr.org/def/format-registry/Aggregate

SPARQL: http://the-fr.org/public/sparql/endpoint.php?query=describe+%3Chttp://the-fr.org/id/file-format/742%3E&output=&jsonp=&key=&show_inline=1