|
Creating Your Robots.txt File
|
Do not create your robots.txt file in Microsoft Word or an HTML editor. Always use a plain-text (ASCII) editor such as Notepad or TextPad; these programs do not insert hidden HTML entities or formatting characters that could cause errors in your robots.txt file and stop it from working.
The file must be saved as robots.txt (all lowercase), not with an .html or .htm extension. Upload it to the root directory of your site so that it is reachable at http://www.yoursite.com/robots.txt
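If you would rather produce the file from a script than from an editor, the plain-ASCII requirement can be enforced in code. The following is only a sketch: the helper name write_robots_txt and its arguments are made up for this illustration, and it assumes you upload the resulting file to your site root yourself.

```python
from pathlib import Path


def write_robots_txt(directory, lines):
    """Write the given lines to <directory>/robots.txt as plain ASCII.

    Hypothetical helper for illustration only; it does not upload anything.
    """
    text = "\n".join(lines) + "\n"
    # Encode first: this raises UnicodeEncodeError if a non-ASCII character
    # (for example a "smart quote" pasted from a word processor) slipped in,
    # and it does so before any existing file is overwritten.
    data = text.encode("ascii")
    path = Path(directory) / "robots.txt"
    path.write_bytes(data)
    return path
```

Calling write_robots_txt(".", ["User-agent: *", "Disallow:"]) would create a two-line robots.txt in the current directory.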
|
|
Allowing all bots
|
To allow all bots to index every page on your site, use the wildcard * after the User-agent: statement and leave the Disallow: statement empty:
User-agent: *
Disallow:
|
|
Disallowing all bots
|
To stop all bots from indexing any page on your site, use the forward slash / after the Disallow: statement:
User-agent: *
Disallow: /
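One way to check the difference between an empty Disallow: and Disallow: / is to feed both rule sets to the robots.txt parser in Python's standard library. This is a rough sketch rather than a description of how any particular search engine behaves; the bot name ExampleBot and the page URL are placeholders.

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = ["User-agent: *", "Disallow:"]   # empty Disallow: nothing is blocked
DENY_ALL = ["User-agent: *", "Disallow: /"]  # "/" matches every path on the site


def is_allowed(robots_lines, user_agent, url):
    """Return True if the rules in robots_lines let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# Placeholder bot name and URL, chosen only for this example.
page = "http://www.yoursite.com/page.html"
print(is_allowed(ALLOW_ALL, "ExampleBot", page))
print(is_allowed(DENY_ALL, "ExampleBot", page))
```

The first call should report that the page may be fetched and the second that it may not.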
|
|
Disallowing specific bots
|
If you do not want certain bots to index your pages, you can refer to them by name. For example, many webmasters prefer not to let Google's Image Search bot (Googlebot-Image) or the Picsearch bot (psbot) index the images on their site.
In the User-agent field, list the name of the bot.
In the Disallow field, use the forward slash / to prevent indexing. Give each bot its own record and separate the records with a blank line.
User-agent: Googlebot-Image
Disallow: /

User-agent: psbot
Disallow: /
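To confirm that rules like these shut out only the named bots, you can run them through Python's standard-library parser. Treat this as a sketch: the image URL is invented for the example, and real crawlers may match bot names slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /

User-agent: psbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(bot_name, url="http://www.yoursite.com/images/logo.gif"):
    """Return True if bot_name may fetch url (the default URL is made up)."""
    return parser.can_fetch(bot_name, url)


for bot in ("Googlebot-Image", "psbot", "Googlebot"):
    print(bot, may_fetch(bot))
```

The two image bots should be refused, while Googlebot, which has no record of its own and no * record to fall back on, stays allowed.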
|
|
Disallowing Specific Pages and Directories
|
You may want to stop bots from indexing only certain pages or directories on your site.
The directory that webmasters most commonly close to search bots is cgi-bin, since it normally contains scripts and code that do not need to be indexed and that webmasters do not want to appear in the search engines.
For individual pages you can often use the robots meta tag instead, but this example shows how to do it with the robots.txt file. Each Disallow line takes one path, and a path ending in / covers the whole directory.
User-agent: *
Disallow: /cgi-bin/
Disallow: /public_html/
Disallow: /customer/files.html
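Because each Disallow path is matched as a prefix of the requested URL path, it can help to test a few sample paths against your rules before uploading them. Here is a small sketch using Python's standard-library parser; the function name blocked_paths, the bot name ExampleBot, and the sample paths are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser

RULES = [
    "User-agent: *",
    "Disallow: /cgi-bin/",
    "Disallow: /public_html/",
    "Disallow: /customer/files.html",
]


def blocked_paths(paths, rules=RULES, site="http://www.yoursite.com"):
    """Return the subset of paths that the rules forbid a generic bot to fetch."""
    parser = RobotFileParser()
    parser.parse(rules)
    # "ExampleBot" is a placeholder; it falls under the "*" record.
    return [p for p in paths if not parser.can_fetch("ExampleBot", site + p)]


samples = ["/index.html", "/cgi-bin/form.pl",
           "/customer/files.html", "/customer/other.html"]
print(blocked_paths(samples))
```

Only the script inside /cgi-bin/ and the single listed page should come back as blocked; other files in /customer/ remain open because the rule names one page, not the directory.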
|
|
What to do with the Robots.txt file?
|
Once you have listed all the bots, pages, and directories you want to keep out of the search engines, save the file as robots.txt and upload it to the root directory of your server.
Alternatively, you can use our free robots.txt generator to have it generated for you.
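If you prefer to script this step yourself, generating the file amounts to writing one record per bot. The sketch below is not the generator mentioned above; build_robots_txt is a hypothetical function that assumes your rules are held in a dictionary mapping each bot name to the paths it should not index.

```python
def build_robots_txt(rules):
    """Build robots.txt text from {bot_name: [disallowed_path, ...]}.

    An empty list of paths produces an empty "Disallow:" line, which
    allows that bot to index everything.
    """
    records = []
    for agent, paths in rules.items():
        lines = [f"User-agent: {agent}"]
        if paths:
            lines.extend(f"Disallow: {path}" for path in paths)
        else:
            lines.append("Disallow:")
        records.append("\n".join(lines))
    # Records are separated by a blank line; the file ends with a newline.
    return "\n\n".join(records) + "\n"


print(build_robots_txt({"psbot": ["/"], "*": ["/cgi-bin/"]}))
```

The returned text can then be saved as robots.txt and uploaded like a hand-written file.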
|
|
Copyright © 2006 Invision-Graphics Inc.
|