How To Use a Robots.txt File
Using a Robots.txt File
Search engines look for a special file called robots.txt before crawling (spidering) your site. The robots.txt file exists specifically to give directions to web crawlers/robots. If you wish to allow search engines to crawl everything on your site, place the following two lines in your robots.txt file:
robots.txt example
User-agent: *
Disallow:
The * in the first line specifies that the directions are for all search engines. The second line indicates that nothing is disallowed.
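Conversely, if you wanted to keep all search engines out of your entire site, you would disallow the root path. A minimal sketch of that opposite case:
robots.txt example
User-agent: *
Disallow: /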
Once you have created your robots.txt file, upload it to your website's main (root) directory, where your homepage and other HTML files are located.
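For example, if your domain were www.example.com (a placeholder), crawlers would expect to find the file at http://www.example.com/robots.txt. They only look for it at that root location and will ignore a copy placed in a subdirectory.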
Robots.txt Usage Examples
Below are several common examples of how you can use a robots.txt file to set parameters and control how different crawlers/robots access your website.
The following example would allow all crawlers/robots to access all files except those in your images directory.
robots.txt example
User-agent: *
Disallow: /images/
The following example points Google and other search engines at your XML sitemap via the Sitemap directive, and directs crawlers/robots to crawl all files except those in the cgi-bin, logs, and images directories (www.example.com below is a placeholder for your own domain):
robots.txt example
User-agent: *
Disallow: /cgi-bin/
Disallow: /logs/
Disallow: /images/
Sitemap: http://www.example.com/sitemap.xml
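Most major crawlers, including Googlebot and Bingbot, also honor a nonstandard Allow directive, which lets you carve out an exception inside a blocked directory. A minimal sketch, assuming you want to keep /images/ blocked but expose a single logo file (logo.png is a hypothetical filename):
robots.txt example
User-agent: *
Disallow: /images/
Allow: /images/logo.png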
You can also use the "Crawl-delay" parameter in the robots.txt file. This parameter indicates the number of seconds a crawler/spider should wait between requests.
robots.txt with Crawl-delay
User-agent: Googlebot
Crawl-delay: 20
User-agent: Slurp
Crawl-delay: 20
User-agent: msnbot
Crawl-delay: 20
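Note that support for Crawl-delay varies: Yahoo's Slurp and Bing's msnbot have honored it, but Googlebot ignores the directive, so Google's crawl rate has to be managed through Google's own webmaster tools instead.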
Bad Robots and Email Harvesters
Below are several robots/crawlers that you might want to block. The Disallow: / line at the end of the example applies to every user-agent listed above it, blocking them from your entire site.
robots.txt example
User-agent: Titan
User-agent: EmailCollector
User-agent: EmailSiphon
User-agent: EmailWolf
User-agent: ExtractorPro
User-agent: WebZip
User-agent: larbin
User-agent: b2w/0.1
User-agent: htdig/3.1.5
User-agent: teleport
User-agent: NPBot
User-agent: TurnitinBot
User-agent: dloader(NaverRobot)
User-agent: dloader(Speedy Spider)
User-agent: FunWebProducts
User-agent: WebStripper
User-agent: WebSauger
User-agent: WebCopier
Disallow: /
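Keep in mind that robots.txt is purely advisory: well-behaved crawlers will obey it, but many email harvesters and scrapers simply ignore the file. Blocking those reliably requires server-side measures, such as the Apache techniques linked in the resources below.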
Robots Resources and Tools
- The Robots Exclusion Protocol
- Web Robots FAQ
- Using Apache To Stop Bad Robots
- List of Robots
- Database of Web Robots
- Types and Details of Robots
- Articles and Papers