Erratum
No truncation indicator in WARC records
Originally reported by Henry Thompson.
Due to an issue with our crawler, not all truncations were indicated correctly. In the WARC files, the truncation indicator is the "WARC-Truncated" header. As a workaround for detecting length truncation, treat a record as suspect if its content is exactly 1048576 bytes (1 MiB) long. No such workaround exists for truncations caused by time limits or network problems.
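The length workaround can be sketched as a small check. This is a minimal illustration, not part of any crawler or library: the function name `truncation_status` and its return labels are hypothetical, and the caller is assumed to have already read the content length and the "WARC-Truncated" header value (if any) from each record.

```python
# Length limit named in this erratum: 1048576 bytes (1 MiB).
TRUNCATION_LIMIT = 1048576


def truncation_status(content_length, warc_truncated=None):
    """Classify a record as 'flagged', 'suspect', or 'unknown'.

    content_length: size of the record's content in bytes.
    warc_truncated: value of the WARC-Truncated header, or None if absent.
    (Hypothetical helper; names and labels are illustrative assumptions.)
    """
    if warc_truncated:
        # The crawler recorded the truncation itself.
        return "flagged"
    if content_length == TRUNCATION_LIMIT:
        # No header, but the content sits exactly at the limit.
        return "suspect"
    # Time or network truncation cannot be ruled out from length alone.
    return "unknown"


print(truncation_status(1048576))            # suspect
print(truncation_status(52310))              # unknown
print(truncation_status(1048576, "length"))  # flagged
```

The last case returns "unknown" rather than something like "complete" because a record truncated for time or network reasons can have any length, so the absence of a header proves nothing for affected records.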
Affected Crawls
Affected Web Graphs
No items found.