Robots.txt

A robots.txt file is a file that is usually placed in the root of a website (for example, https://www.example.com/robots.txt). It specifies whether crawlers are allowed to access the entire website or only specified resources. A restrictive robots.txt file can reduce the bandwidth that crawlers consume.

A site owner can forbid crawlers from accessing a certain path (and all files in that path) or a specific file. This is often done to prevent these resources from being indexed or served by search engines.
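As a minimal sketch of how such rules are read, the following Python snippet parses a small robots.txt with the standard library's urllib.robotparser and checks which URLs a crawler may fetch. The rules, the paths (/private/, /drafts/secret.html), and the crawler name ExampleBot are assumptions made up for illustration. They are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: blocks one directory and one specific file
# for every crawler ("User-agent: *").
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/secret.html
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a made-up user agent; it falls under the "*" group above.
for url in (
    "https://www.example.com/private/report.html",
    "https://www.example.com/drafts/secret.html",
    "https://www.example.com/index.html",
):
    print(url, parser.can_fetch("ExampleBot", url))
```

Under these assumed rules, the first two URLs are reported as disallowed and the last one as allowed. Compliance is voluntary: the file only tells well-behaved crawlers what the site owner requests.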

If a crawler is allowed to access resources, you can define indexing rules for those resources via <meta name="robots"> elements (commonly referred to as a "robots tag") and X-Robots-Tag HTTP headers. Search-related crawlers use these rules to determine how to index and serve resources in search results, or to adjust the crawl rate for specific resources over time.
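The sketch below illustrates one way a crawler might collect indexing rules from both sources for a single resource. It is a simplified illustration, not how any particular search engine works: the helper names (split_directives, RobotsMetaParser, indexing_directives) are hypothetical, and the code assumes directives are a plain comma-separated list, ignoring crawler-specific forms such as a user-agent prefix in the header value.

```python
from html.parser import HTMLParser


def split_directives(value: str) -> set:
    """Split a comma-separated directive list into a set of lowercase names."""
    return {part.strip().lower() for part in value.split(",") if part.strip()}


class RobotsMetaParser(HTMLParser):
    """Collects directives from every <meta name="robots"> element in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "meta" and (attr_map.get("name") or "").lower() == "robots":
            self.directives |= split_directives(attr_map.get("content") or "")


def indexing_directives(html: str, headers: dict) -> set:
    """Combine directives from the page's robots tag and the X-Robots-Tag header."""
    meta_parser = RobotsMetaParser()
    meta_parser.feed(html)
    # HTTP header names are case-insensitive, so compare in lowercase.
    header_value = next(
        (v for k, v in headers.items() if k.lower() == "x-robots-tag"), ""
    )
    return meta_parser.directives | split_directives(header_value)


# Hypothetical page and response headers, for illustration only.
page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
response_headers = {"X-Robots-Tag": "noarchive"}
print(sorted(indexing_directives(page, response_headers)))
```

Note that a crawler can only see these rules if robots.txt allows it to fetch the resource in the first place.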