Lawyers have told us that we need to have some rules about how you can use this site, our service and/or our data. Since we don’t like wading through pages of legal documents any more than you do, we have provided a summary of the ToU below. Scroll further down for the full legally-binding terms.
- Use of our site or services means you have agreed to be legally bound by the full ToU. (We know we already said that, but it is important enough to repeat).
- We may amend the ToU by posting an updated version online. If we change it, you are bound by the most current version even if you didn’t look at it.
- We grant you a non-assignable, non-transferable, non-sub-licensable, limited license to use our site and data in accordance with the ToU.
- Don’t break the law or do anything illegal with our site or data. Among other things, this means honoring the restrictions of robots.txt files and NOFOLLOW meta tags. Other examples of illegal stuff you can’t do are:
- Engage in abusive, harassing, hateful or otherwise offensive activities
- Invade other people’s privacy
- Harm minors
- Violate other people’s rights (IP, proprietary, etc.)
- Circumvent copy-protection
- Interfere with or disrupt our site, service or security
- Spam people
- Stalk people
- Impersonate people or otherwise disguise your identity
- Forge headers or otherwise disguise our content
- Harvest personally identifiable information
- Communicate for commercial solicitation
- Break the law (Again, important enough to repeat)
- We didn’t produce the crawled content; we just found it on the web. So we are not vouching for the content, and we are not liable if there is something wrong with it.
- We have the right to delete, recategorize or otherwise change the contents of the crawl repository.
- We will kick you off and ban you if you break the law.
- This site is protected by copyright, trademark and service marks.
- We will take appropriate action if you let us know about a copyright infringement.
- When you tell us about a copyright infringement, you have to: notify us in writing, sign the notification, describe the copyrighted work being infringed, and give us your contact information.
- To report a copyright infringement, please contact us at:
9663 Santa Monica Blvd., #425
Beverly Hills, CA 90210
- Our site, services and data are provided “as is” with no warranty.
- If you violate the ToU or do something that is somehow illegal with our site, services or data and you get sued, we are not liable and it is your obligation to make it clear we are not liable.
- Common Crawl is based in Los Angeles, California, so any disputes will be settled in Los Angeles courts.
- Even if we fail to enforce the ToU in some instances, that doesn’t mean we are giving up the right to enforce it in other instances.
- The ToU is the only agreement between you, the user, and Common Crawl.
If you know of a violation of the ToU, you can report it to us at:
Common Crawl Foundation
9663 Santa Monica Blvd., #425
Beverly Hills, CA 90210
LAST UPDATED: August 21, 2023
Welcome to the commoncrawl.org website (the “Site”). The Common Crawl Foundation (“CC,” “we,” or “us”) established the Site and the databases, tools and information we collected and developed using the ccBot crawler (collectively, the “Service”) for anyone to access a comprehensive crawl of the Internet for the purpose of enabling a new wave of innovation, education and research.
We may make changes to the Site and/or the Service at any time. You understand that we may discontinue or restrict your use of the Site and/or Service for any reason or no reason with or without notice.
Your use of the Site and/or the Service signifies that you agree to the ToU and constitutes your binding acceptance of the ToU, including any modifications that we make from time to time. We may include links to these ToU on all data, information and content accumulated by the Service, and your use of such data, information and/or content signifies your acceptance of the ToU.
1. LIMITED LICENSE.
CC grants you a limited, non-assignable, non-transferable, non-sublicensable license to access the Service subject to the terms and conditions of these ToU. You may not use the Service or the Site in any manner except as authorized under these ToU. CC reserves all other rights not otherwise granted in these ToU.
2. UNLAWFUL AND PROHIBITED USE AND CONDUCT.
In addition to the foregoing restrictions, you hereby agree that you will not use or access the Service for any of the following:
(a) engaging in harmful, defamatory, threatening, abusive, harassing, tortious, vulgar, libelous, hateful or otherwise offensive or objectionable activities;
(b) invading other people’s privacy;
(c) exploiting children or otherwise harming minors;
(d) violating the rights of another individual or entity, including but not limited to such party’s intellectual property rights or other proprietary rights;
(e) circumventing copy-protected devices and software;
(f) interfering with or disrupting the Service and/or the servers or networks connected to the Service or circumventing, disabling or interfering with security features on the Site;
(g) displaying, distributing or transmitting unsolicited advertisements, promotional materials, “spam”, “junk mail”, “chain letters”, and “pyramid schemes”;
(h) stalking or otherwise harassing another person;
(i) impersonating any person or entity or falsely stating or otherwise misrepresenting your affiliation with a person or entity;
(j) forging or intentionally modifying headers or other information for the purpose of disguising the origin of any Crawled Content available through the Service;
(k) violating any applicable local, state, national or international law, and any applicable regulations having the force of law;
(l) collecting or harvesting any personally identifiable information; or
(m) using the communication systems provided by the Site for any commercial solicitation purposes.
3. CONTENT DISCLAIMERS AND RESTRICTIONS.
You understand and agree that the Crawled Content made available through the Service is the sole responsibility of the individual or entity from which such Crawled Content originated. We are not responsible for any Crawled Content accessible through the Service. CC cannot guarantee the truthfulness, authenticity, quality or accuracy of said Crawled Content. Under no circumstances will CC be liable in any way for any Crawled Content, including, without limitation, for any errors or omissions in any Crawled Content, or for any loss or damage of any kind incurred as a result of the use of any Crawled Content made available via the Service. In addition, the Site may contain links to third party websites that are not owned or controlled by CC. By using the Service, you agree that we shall not be liable in any manner for your use of any third party website, including, without limitation, websites linked from the Site or websites accessible through the Service.
You acknowledge that we and our designees shall have the right (but not the obligation) in our sole discretion to refuse, change, delete, or recategorize any Crawled Content that is available via the Service. Without limiting the foregoing, we and our designees shall have the right to remove any Crawled Content that violates the ToU or is otherwise objectionable.
You agree that any improper, abusive, fraudulent, and/or illegal activity or violation of the ToU may be grounds for immediate termination of your right to continue to use the Site or the Service. You further agree that you must evaluate, and bear all risks associated with, the use of any Crawled Content, including any reliance on the accuracy, completeness, or usefulness of such Crawled Content. In this regard, you acknowledge that you may not rely on any Crawled Content created or accumulated by CC.
4. INTELLECTUAL PROPERTY.
The Site is protected by copyrights, trademarks, service marks, and/or other proprietary rights under the laws of the U.S. and other countries. By using or accessing the Site or the Service you agree to comply with all state and federal laws that protect our proprietary interest in the material appearing on the Site.
5. NOTIFICATION AND PROCEDURE FOR MAKING CLAIMS OF COPYRIGHT INFRINGEMENT OR INTELLECTUAL PROPERTY INFRINGEMENT.
We will take appropriate action in response to notice of copyright infringement. If you believe that your work has been used or copied in a way that constitutes copyright infringement and such infringement is occurring on or through the Site or the Service, please provide Notice to our Copyright Agent.
Pursuant to Title 17, United States Code, Section 512(c)(3), a notification of claimed infringement must be a written communication addressed to the designated agent as set forth below (the “Notice”), and must include substantially all of the following:
(a) a physical or electronic signature of the person authorized to act on behalf of the owner of the copyright interest that is alleged to have been infringed;
(b) a description of the copyrighted work or works that you claim have been infringed (“infringed work”) and identification of what material in such work(s) is claimed to be infringing (“infringing work”) and which you request to be removed or access to which is to be disabled;
(c) a description of the exact name of the infringing work on the Site or the Service (and the location of the infringing work, if it appears on the Site or the Service) or, if the infringing work appears on a site linked to from the Site or the Service, where the material that you claim is infringing is located on such site;
(d) information sufficient to permit us to contact you, such as your physical address, telephone number, and email address;
(e) a statement by you that you have a good faith belief that the use of the material identified in your Notice in the manner complained of is not authorized by the copyright owner, its agent, or the law;
(f) a statement by you that the information in your Notice is accurate and, under penalty of perjury, that you are the copyright owner or authorized to act on the copyright owner’s behalf.
To reach our Copyright Agent for Notice of claims of copyright infringement:
Common Crawl Foundation
9663 Santa Monica Blvd., #425
Beverly Hills, CA 90210
The Copyright Agent should only be contacted if you believe that your work has been used or copied in a way that constitutes copyright infringement and such infringement is occurring on or through the Site or the Service. The Copyright Agent will not respond to any other inquiries.
6. DISCLAIMER OF WARRANTIES.
YOU EXPRESSLY UNDERSTAND AND AGREE THAT:
YOUR USE OF THE SERVICE IS AT YOUR SOLE RISK. THE SERVICE IS PROVIDED ON AN “AS IS” AND “AS AVAILABLE” BASIS AND TO THE FULLEST EXTENT PERMITTED BY LAW, CC, ITS OFFICERS, DIRECTORS, EMPLOYEES, AND AGENTS EXPRESSLY DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, IN CONNECTION WITH THE SITE AND/OR SERVICE AND YOUR USE THEREOF, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, ACCURACY, AND NON-INFRINGEMENT. CC MAKES NO REPRESENTATIONS OR WARRANTIES ABOUT THE ACCURACY OR COMPLETENESS OF THE SITE’S OR THE SERVICE’S CONTENT OR THE CONTENT OF ANY SITES LINKED TO THE SITE AND ASSUMES NO LIABILITY OR RESPONSIBILITY FOR (i) ANY ERRORS, MISTAKES, OR INACCURACIES OF CONTENT, (ii) THE TIMELINESS, DELETION, OR MIS-DELIVERY OF ANY CONTENT, OR FAILURE TO PROVIDE ANY CONTENT; (iii) PERSONAL INJURY OR PROPERTY DAMAGE, OF ANY NATURE WHATSOEVER, RESULTING FROM YOUR ACCESS TO AND USE OF THE SITE OR SERVICE, (iv) ANY UNAUTHORIZED ACCESS TO THE SITE AND/OR SERVICE AND/OR ANY AND ALL PERSONAL INFORMATION AND/OR FINANCIAL INFORMATION STORED THEREIN, (v) ANY INTERRUPTION OR CESSATION OF TRANSMISSION TO OR FROM THE SITE, (vi) ANY BUGS, VIRUSES, TROJAN HORSES, OR THE LIKE WHICH MAY BE TRANSMITTED TO OR THROUGH THE SITE OR SERVICE BY ANY THIRD PARTY, AND/OR (vii) ANY ERRORS OR OMISSIONS IN ANY CONTENT OR FOR ANY LOSS OR DAMAGE OF ANY KIND INCURRED AS A RESULT OF THE USE OF ANY CONTENT MADE AVAILABLE THROUGH THE SITE OR SERVICE. NO ADVICE OR INFORMATION, WHETHER ORAL OR WRITTEN, OBTAINED BY YOU FROM CC OR THROUGH THE SERVICE SHALL CREATE ANY WARRANTY NOT EXPRESSLY STATED IN THESE TOU.
7. LIMITATION OF LIABILITY.
YOU UNDERSTAND AND AGREE THAT CC SHALL NOT BE LIABLE TO YOU FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL OR EXEMPLARY DAMAGES, INCLUDING WITHOUT LIMITATION DAMAGES FOR LOSS OF PROFITS, GOODWILL, USE, DATA OR OTHER INTANGIBLE LOSSES (EVEN IF CC HAD BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES), RESULTING FROM: (i) THE USE OR THE INABILITY TO USE THE SERVICE; (ii) THE COST OF PROCUREMENT OF SUBSTITUTE SERVICES RESULTING FROM ANY SERVICES OBTAINED THROUGH OR FROM THE SERVICE; (iii) STATEMENTS OR CONDUCT OF ANY THIRD PARTY ON THE SERVICE; (iv) INACCURACIES, MISTAKES, OR ERRORS OF CONTENT; (v) PERSONAL INJURY OR PROPERTY DAMAGE OF ANY NATURE WHATSOEVER RESULTING FROM YOUR ACCESS TO AND USE OF THE SITE OR SERVICE; (vi) ANY BUGS, VIRUSES, TROJAN HORSES, OR THE LIKE, WHICH MAY BE TRANSMITTED TO OR THROUGH THE SITE OR SERVICE BY A THIRD PARTY; OR (vii) ANY OTHER MATTER RELATING TO THE SITE OR SERVICE, WHETHER BASED ON WARRANTY, CONTRACT, TORT, OR ANY OTHER LEGAL THEORY.
8. EXCLUSIONS AND LIMITATIONS.
SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OF CERTAIN WARRANTIES OR THE LIMITATION OR EXCLUSION OF LIABILITY FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES. ACCORDINGLY, SOME OF THE ABOVE LIMITATIONS MAY NOT APPLY TO YOU.
9. INDEMNIFICATION.
YOU AGREE, AT YOUR OWN EXPENSE, TO INDEMNIFY, DEFEND, AND HOLD HARMLESS CC, ITS EMPLOYEES, AGENTS, AND REPRESENTATIVES AGAINST ANY CLAIM, SUIT, ACTION, OR ADMINISTRATIVE PROCEEDING BROUGHT AGAINST CC, ITS EMPLOYEES, AGENTS, AND REPRESENTATIVES TO THE EXTENT SAID CLAIM, SUIT, ACTION, OR ADMINISTRATIVE PROCEEDING ARISES OUT OF, OR IS RELATED TO YOUR CONTENT, YOUR USE OF OR CONNECTION TO THE SERVICE, YOUR VIOLATION OF THESE TOU, OR YOUR VIOLATION OF ANY RIGHTS OF ANOTHER. WITHOUT LIMITATION TO THE FOREGOING, THE INDEMNIFICATION AGREEMENT HEREUNDER EXTENDS TO, AMONG OTHER THINGS, ANY CLAIM, SUIT, ACTION, OR ADMINISTRATIVE PROCEEDING ARISING OUT OF: (A) YOUR ACCESS OR USE OF THE SERVICE; (B) ACCESS OR USE OF THE SERVICE BY SOMEONE ELSE USING YOUR COMPUTER; (C) A VIOLATION OF THE TOU BY YOU OR ANYONE USING YOUR COMPUTER. THIS INDEMNIFICATION AGREEMENT APPLIES TO ANY CLAIM, SUIT, ACTION, OR ADMINISTRATIVE PROCEEDING THAT IS DIRECTLY OR INDIRECTLY RELATED TO THE SERVICE, INCLUDING BY WAY OF ILLUSTRATION A CLAIM, SUIT, ACTION, OR ADMINISTRATIVE PROCEEDING THAT YOUR USE OR ACCESS OR ANOTHER PERSON’S USE OR ACCESS OF YOUR COMPUTER INFRINGES ANY COPYRIGHT, TRADEMARK, SERVICE MARK, PATENT, TRADE SECRET, RIGHT OF PUBLICITY, OR OTHER PROPRIETARY RIGHT OF A THIRD PARTY UNDER ANY U.S., INTERNATIONAL, FOREIGN, OR STATE LAW. AS ANOTHER EXAMPLE, WHICH IS NOT TO BE CONSTRUED AS LIMITING THE SCOPE OF THE INDEMNIFICATION PROVISION, THIS INDEMNIFICATION AGREEMENT APPLIES TO ANY CLAIM, SUIT, OR ACTION WHERE YOUR USE OR ACCESS TO THE SERVICE IS IN ANY WAY DEFAMATORY, SLANDEROUS, LIBELOUS, OR OTHERWISE INJURIOUS TO A THIRD PARTY. YOU AGREE TO PAY ANY AND ALL COSTS, DAMAGES AND EXPENSES, INCLUDING, BUT NOT LIMITED TO, REASONABLE ATTORNEY’S FEES AND COSTS AWARDED OR INCURRED BY OR IN CONNECTION WITH ANY SUCH CLAIM, SUIT, ACTION, OR ADMINISTRATIVE PROCEEDING. THIS DEFENSE AND INDEMNIFICATION OBLIGATION SHALL SURVIVE THESE TOU AND YOUR USE OF THE SITE OR THE SERVICE.
10. GOVERNING LAW.
The ToU and the relationship between you and CC shall be exclusively governed by and construed in all respects under the laws of California, United States of America, without giving effect to any choice-of-law or conflict-of-laws provisions. You and CC further agree to submit to the personal jurisdiction of the courts located in Los Angeles, California, United States of America for any and all legal claims, suits or actions that arise in connection with the Site, the Service or from a dispute as to the interpretation or breach of these ToU.
11. WAIVER AND SEVERABILITY.
The failure of CC to exercise or enforce any right or provision of the ToU shall not constitute a waiver of such right or provision. If any of the provisions of the ToU are held invalid, unenforceable, or void by a court or other tribunal of competent jurisdiction, the parties nevertheless agree that the court should endeavor to give effect to the parties’ intentions as reflected in the provision, and the other provisions of the ToU remain in full force and effect. The application of the United Nations Convention on Contracts for the International Sale of Goods is expressly excluded.
12. ENTIRE AGREEMENT.
These ToU constitute the entire and only Agreement between you and CC with respect to the Service, and supersede any prior agreements, oral or written, between you and CC. Any and all other collateral representations, promises, and conditions, and any representation, warranty, promise or condition not incorporated herein or made as provided for in these ToU, shall not be binding on CC.
13. NO THIRD PARTY BENEFICIARIES.
You agree that, except as otherwise expressly provided in these ToU, there shall be no third party beneficiaries to this agreement.
14. SECTION TITLES.
The section titles in the ToU are for convenience only and have no legal or contractual effect.
These ToU, and any rights and licenses granted hereunder, may not be transferred or assigned by you, but may be assigned or transferred by CC without restriction.
Please report any violations of these ToU to Common Crawl at the following address:
Common Crawl Foundation
9663 Santa Monica Blvd., #425
Beverly Hills, CA 90210