Question 1

How does robots.txt path matching work?

Accepted Answer

Rules are matched as case-sensitive prefixes against the URL path, starting from its first character. A rule of Disallow: /admin/ blocks every URL whose path begins with /admin/. The asterisk (*) is a wildcard matching any sequence of characters, including none. A dollar sign ($) at the end of a rule anchors the match to the end of the URL. When multiple rules match, the rule with the longest (most specific) path wins. If an Allow rule and a Disallow rule tie on length, Allow wins per Google's specification.
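The matching rules above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function names rule_to_regex and is_allowed, and the (directive, path) tuple format, are assumptions made for the example.

```python
import re


def rule_to_regex(rule_path):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    pattern = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_allowed(rules, url_path):
    """Decide whether url_path may be crawled.

    rules is a list of (directive, path) pairs (hypothetical format),
    e.g. ("disallow", "/admin/").
    """
    best_len = -1
    best_allow = True  # no matching rule means the path may be crawled
    for directive, rule_path in rules:
        if not rule_path:
            continue  # an empty path matches nothing
        # re.match anchors at the start of the path, giving prefix matching.
        if rule_to_regex(rule_path).match(url_path) is None:
            continue
        allow = directive.lower() == "allow"
        length = len(rule_path)
        # Longest rule wins; on a tie in length, Allow wins.
        if length > best_len or (length == best_len and allow):
            best_len, best_allow = length, allow
    return best_allow


rules = [
    ("disallow", "/admin/"),
    ("allow", "/admin/public/"),
    ("disallow", "/*.pdf$"),
]
print(is_allowed(rules, "/admin/settings"))     # blocked by /admin/
print(is_allowed(rules, "/admin/public/page"))  # longer Allow rule wins
print(is_allowed(rules, "/docs/file.pdf"))      # blocked by /*.pdf$
```

Measuring specificity by the character length of the rule path is a simplification that is adequate for a sketch like this one.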

Question 2

Does robots.txt prevent pages from appearing in search results?

Accepted Answer

No. Blocking a URL in robots.txt stops Google from crawling the page, but it does not remove the URL from the index. Google can still index a URL it has never crawled if other pages link to it. To keep a page out of search results, use the noindex robots meta tag or the X-Robots-Tag HTTP header instead, and leave the page crawlable: a crawler that is blocked by robots.txt never fetches the page, so it never sees the noindex directive.
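To make the two noindex mechanisms concrete, here is a small Python sketch that checks a response for either one. The helper has_noindex and its arguments are hypothetical names chosen for this example, and it deliberately ignores bot-specific forms such as a header value prefixed with a crawler name.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(headers, html):
    """Return True if the X-Robots-Tag header or a robots meta tag says noindex."""
    header_value = ""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":  # header names are case-insensitive
            header_value = value
    header_directives = [d.strip().lower() for d in header_value.split(",")]

    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in header_directives or "noindex" in parser.directives


# Form 1: HTTP response header
print(has_noindex({"X-Robots-Tag": "noindex, nofollow"}, ""))
# Form 2: meta tag in the page's <head>
print(has_noindex({}, '<head><meta name="robots" content="noindex"></head>'))
```

The header form is the only option for non-HTML resources such as PDFs, since they have no place for a meta tag.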

Question 3

What is the wildcard (*) user-agent in robots.txt?

Accepted Answer

The wildcard group, User-agent: *, applies to every crawler that is not explicitly named in another group. Most bots that respect robots.txt, including Googlebot and Bingbot, follow the * rules when no group names them. A named group (e.g. User-agent: Googlebot) takes precedence for that specific bot: the crawler follows only its named group and ignores the * rules rather than merging the two.
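As a rough illustration of that precedence, the sketch below picks the rule group a crawler would follow. The select_group function and the dictionary layout are assumptions for this example, and the substring test on the crawler name is a simplification of how real crawlers compare user-agent tokens.

```python
def select_group(groups, crawler_name):
    """Pick the rule group that applies to crawler_name.

    groups maps a lowercase user-agent token (or "*") to its rule list
    (hypothetical format). A named group replaces the "*" group entirely;
    the two are never merged.
    """
    name = crawler_name.lower()
    # Named tokens that appear in the crawler's name; prefer the longest one.
    candidates = [token for token in groups if token != "*" and token in name]
    if candidates:
        return groups[max(candidates, key=len)]
    # No named group applies, so fall back to the wildcard group if present.
    return groups.get("*", [])


groups = {
    "*": [("disallow", "/private/")],
    "googlebot": [("disallow", "/drafts/")],
}
print(select_group(groups, "Googlebot"))  # only the Googlebot group
print(select_group(groups, "Bingbot"))    # falls back to the * group
```

Note that in this example Googlebot is not bound by the /private/ rule, because that rule appears only in the * group.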

Question 4

Is my robots.txt parsed in the browser?

Accepted Answer

Yes. All parsing and testing run entirely in your browser using JavaScript. Your robots.txt content is never sent to any server, and the tool keeps working offline once the page has loaded.

robots.txt Tester
