Robots.txt Generator — Create Robots.txt File
Free robots.txt generator. Create a robots.txt file for your website, control which pages search engines can crawl and index, and block AI bots.
Configure Your Robots.txt
How to Use
Choose a quick preset (WordPress, Laravel, etc.) or build custom rules below.
Add your disallow/allow paths, sitemap URL, and any comments you want.
Click Generate, then copy or download the file. Upload it to the root of your website.
What is a Robots.txt File?
A robots.txt file is a plain text file at the root of your website (e.g. yoursite.com/robots.txt) that tells search engine crawlers which pages they can and cannot access. It uses the Robots Exclusion Protocol supported by Google, Bing, Yahoo, and all major search engines.
Complete Guide to Robots.txt
How Robots.txt Works
When a search engine bot like Googlebot visits your site, it first checks for a robots.txt file at the root directory. The file contains rules that tell the bot which URLs it is allowed to crawl and which it should skip. Every rule starts with a User-agent line (which bot the rule applies to) followed by Disallow and Allow directives.
Basic Robots.txt Syntax
A robots.txt file is made up of one or more rule groups. Each group targets a specific crawler (or all crawlers with *).
User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /public/
Sitemap: https://yoursite.com/sitemap.xml
This example blocks all crawlers from /admin/ and /private/, allows /public/, and points to the XML sitemap.
Common Robots.txt Rules
Block entire site: Disallow: / — prevents all crawling. Use this for staging or development sites.
Allow entire site: Allow: / or an empty Disallow — lets all crawlers access everything.
Block specific folder: Disallow: /wp-admin/ — blocks a directory and everything inside it.
Block specific file type: Disallow: /*.pdf$ — blocks crawling of all PDF files.
Crawl delay: Crawl-delay: 10 — asks bots to wait 10 seconds between requests. Supported by Bing and Yandex, but not Google.
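Combined in a single file, the rules above look like this (the paths are illustrative):

User-agent: *
Disallow: /wp-admin/
Disallow: /*.pdf$
Crawl-delay: 10
Sitemap: https://yoursite.com/sitemap.xml

Note that Googlebot ignores the Crawl-delay line but still honors the Disallow rules in the same group.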
Robots.txt for WordPress
WordPress sites should block /wp-admin/, /wp-includes/, /trackback/, /feed/, and search result pages (/?s=). Always allow /wp-admin/admin-ajax.php since many themes and plugins depend on it. Our generator has a WordPress preset that applies these rules automatically.
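Put together, a WordPress robots.txt following these recommendations might look like this (the generator's preset may differ slightly):

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /trackback/
Disallow: /feed/
Disallow: /?s=
Allow: /wp-admin/admin-ajax.php
Sitemap: https://yoursite.com/sitemap.xml

The Allow line sits in the same group and carves admin-ajax.php out of the /wp-admin/ block.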
Robots.txt vs Noindex — What's the Difference?
Robots.txt blocks crawling — the bot won't visit the page. A noindex meta tag blocks indexing — the bot visits the page but doesn't add it to search results. If you want a page completely hidden from Google, use noindex. If you just want to save crawl budget, use robots.txt.
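For reference, a noindex directive goes in the page's HTML head (or in an X-Robots-Tag HTTP response header), not in robots.txt:

<meta name="robots" content="noindex">

For noindex to work, the page must remain crawlable — if robots.txt blocks the page, the bot never sees the tag.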
Where to Upload Your Robots.txt File
Upload the file to the root directory of your website so it's accessible at https://yoursite.com/robots.txt. Every domain and subdomain needs its own robots.txt file. You can verify it's working with the robots.txt report in Google Search Console.
Common Mistakes to Avoid
Blocking CSS/JS files: Google needs to render your pages. Blocking stylesheets and JavaScript can hurt your rankings.
Blocking your sitemap: Make sure the folder containing your sitemap.xml is not disallowed.
Using robots.txt for sensitive data: Robots.txt is publicly visible. Anyone can read it. Never use it to hide sensitive URLs — use authentication or server-side access control instead.
How to Block AI Bots in Robots.txt
Block PerplexityBot in Robots.txt
Perplexity AI uses PerplexityBot as its user agent to crawl websites for its AI search engine. To block PerplexityBot from crawling your site, add this to your robots.txt:
User-agent: PerplexityBot
Disallow: /
PerplexityBot respects robots.txt according to their documentation. This blocks it from crawling any page on your site. You can also block specific directories instead of the entire site.
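For example, to keep PerplexityBot out of only part of your site (the paths here are illustrative):

User-agent: PerplexityBot
Disallow: /blog/
Disallow: /docs/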
Block Ahrefs Bot in Robots.txt
Ahrefs uses AhrefsBot to crawl websites for its SEO database. If you want to prevent Ahrefs from crawling your site (to hide your backlink profile or reduce server load), add:
User-agent: AhrefsBot
Disallow: /
Other SEO bots you may want to block include SemrushBot, MJ12bot (Majestic), and DotBot (Moz).
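To block those crawlers as well, add a separate group for each user agent:

User-agent: SemrushBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: DotBot
Disallow: /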
Block All AI Crawlers at Once
To block all known AI bots from training on your content, add these rules to your robots.txt:
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: CCBot
Disallow: /
This blocks OpenAI (GPTBot, ChatGPT-User), Google AI (Google-Extended), Perplexity (PerplexityBot), Anthropic (ClaudeBot, anthropic-ai), and Common Crawl (CCBot) from scraping your content. Note: this does not block regular Googlebot search crawling.
Robots.txt Allow All & Troubleshooting
Robots.txt Allow All — Open Your Site to Crawlers
To allow all search engine crawlers to access your entire site, use this minimal robots.txt:
User-agent: *
Allow: /
Sitemap: https://yoursite.com/sitemap.xml
Alternatively, an empty robots.txt file or no robots.txt file at all has the same effect — all crawlers are allowed by default. However, including the Sitemap directive is recommended to help search engines discover your pages faster.
Blocked by Robots.txt — How to Fix It
If Google Search Console shows "Blocked by robots.txt" for pages you want indexed, your robots.txt is preventing Googlebot from crawling those pages. To fix it:
1. Check your robots.txt: Visit yoursite.com/robots.txt and look for Disallow rules that match the blocked URL path.
2. Remove or modify the rule: Delete the Disallow line blocking the URL, or add a more specific Allow rule for it (Google follows the most specific matching rule, and Allow wins a tie with Disallow).
3. Test in Search Console: Use the robots.txt report in Google Search Console to verify the URL is now allowed.
4. Request re-crawl: After updating, use the URL Inspection tool in Search Console to request Google re-crawl the page.
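As a sketch of step 2, if /blog/ is disallowed but you want one page inside it crawled (the paths are illustrative):

User-agent: *
Disallow: /blog/
Allow: /blog/getting-started/

Because /blog/getting-started/ is the longer, more specific match, Google crawls that page while the rest of /blog/ stays blocked.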
Indexed Though Blocked by Robots.txt
This Search Console warning means Google found a URL (through links from other pages) but cannot crawl it because robots.txt blocks it. Google indexes the URL but not its content — it may appear in search results with no snippet or description. To fix this: if you want the page indexed, remove the robots.txt block so Google can crawl it; if you want it out of search results, add a noindex meta tag to the page and also remove the robots.txt block, since Google must crawl the page to see the noindex tag.
How to Remove Robots.txt in WordPress
WordPress generates a virtual robots.txt by default. If you want to modify or remove it:
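The virtual file WordPress serves is minimal — typically something like:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

(The exact output can vary by WordPress version and installed plugins.)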
Option 1 — Override with a physical file: Create a robots.txt file using our generator and upload it to your WordPress root directory via FTP or cPanel File Manager. A physical file overrides WordPress's virtual robots.txt.
Option 2 — Use an SEO plugin: Plugins like Yoast SEO and Rank Math let you edit robots.txt directly from the WordPress admin (in Yoast, under SEO → Tools → File editor; Rank Math has a similar editor in its settings).
Option 3 — Check "Discourage search engines": Go to Settings → Reading and make sure "Discourage search engines from indexing this site" is unchecked. When checked, WordPress tells search engines to stay away from your entire site (via a noindex directive in recent versions; older versions added Disallow: / to the virtual robots.txt).
Frequently Asked Questions
What is a robots.txt file?
How do I generate a robots.txt file?
How do I block PerplexityBot in robots.txt?
How do I block Ahrefs bot in robots.txt?
How do I set robots.txt to allow all?
What does "Blocked by robots.txt" mean in Search Console?
What does "Indexed though blocked by robots.txt" mean?
How do I remove or edit robots.txt in WordPress?
How do I block all AI crawlers in robots.txt?
Does robots.txt block pages from Google?
Should I add my sitemap to robots.txt?
Does Google respect the Crawl-delay directive?
Share This Tool
Found it useful? Share it with your friends, classmates, or colleagues.