August 1, 2012

Mat Kelcey Joins The Common Crawl Advisory Board

Note: this post has been marked as obsolete.
Common Crawl Foundation
Common Crawl builds and maintains an open repository of web crawl data that can be accessed and analyzed by anyone.

We are excited to announce that Mat Kelcey has joined the Common Crawl Board of Advisors! Mat has been extremely helpful to Common Crawl over the last several months and we are very happy to have him as an official Advisor to the organization.

Mat is a brilliant engineer with a knack for machine learning, information retrieval, natural language processing, and artificial intelligence. He is currently working on machine learning and natural language processing systems at Wavii. You can learn more about him by taking a look at some of his code on GitHub, and you can keep up with what is on his mind on Twitter or on his blog. If you frequent the Common Crawl Discussion Group, you will see lots of helpful comments and advice from Mat. Please join me in welcoming Mat and celebrating Common Crawl's good fortune to have him as part of our team by posting a comment here, on the discussion group, or on Twitter.


Erratum: Content is truncated


Some archived content is truncated due to fetch size limits imposed during crawling. This is necessary to handle infinite or exceptionally large data streams (e.g., radio streams). Prior to March 2025 (CC-MAIN-2025-13), the truncation threshold was 1 MiB. From the March 2025 crawl onwards, this limit has been increased to 5 MiB.

For more details, see our truncation analysis notebook.
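As a rough illustration of the thresholds described above, the sketch below maps a crawl ID to its truncation limit. The helper names (`truncation_limit_bytes`, `may_be_truncated`) are hypothetical and not part of any Common Crawl tooling, and the code assumes crawl IDs follow the `CC-MAIN-<year>-<week>` pattern used in the text. A payload at or above the limit is only a hint that truncation may have occurred, not proof.

```python
import re

MIB = 1024 * 1024

# Assumption: crawl IDs look like "CC-MAIN-2025-13" (year, then week number).
_CRAWL_ID = re.compile(r"^CC-MAIN-(\d{4})-(\d{2})$")


def truncation_limit_bytes(crawl_id: str) -> int:
    """Return the fetch size limit in bytes for a crawl (hypothetical helper)."""
    match = _CRAWL_ID.match(crawl_id)
    if match is None:
        raise ValueError(f"unrecognized crawl ID: {crawl_id!r}")
    year, week = int(match.group(1)), int(match.group(2))
    # From CC-MAIN-2025-13 onwards the limit is 5 MiB; before that, 1 MiB.
    return 5 * MIB if (year, week) >= (2025, 13) else 1 * MIB


def may_be_truncated(payload_bytes: int, crawl_id: str) -> bool:
    """Heuristic: a payload that reaches the limit may have been cut off."""
    return payload_bytes >= truncation_limit_bytes(crawl_id)
```

For example, `truncation_limit_bytes("CC-MAIN-2024-51")` gives the 1 MiB limit, while `truncation_limit_bytes("CC-MAIN-2025-13")` gives the 5 MiB limit.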