How To Use a Robots.txt File

Using a Robots Text File

Search engines look for a special file called robots.txt before spidering your site. The robots.txt file exists specifically to give directions to web crawlers/robots. Place the following two lines in your robots.txt file if you wish to allow search engines to crawl/spider everything on your site:

robots.txt example

User-agent: *
Disallow:

The * in the first line specifies that the directions are for all search engines. The second line indicates that nothing is disallowed.

Once you have created your robots.txt file, upload it to your website's root directory, the same directory where your homepage and other HTML files are located.
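
If you want to confirm that the uploaded file is being read the way you expect, you can test it with a short script. The sketch below uses Python's built-in urllib.robotparser module to download and parse a robots.txt file and check whether a page may be crawled; www.example.com is only a placeholder for your own domain.

Python example (urllib.robotparser)

from urllib import robotparser

# Point the parser at the robots.txt file in the site's root directory.
rp = robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()  # download and parse the file

# With "User-agent: *" and an empty "Disallow:", everything is crawlable.
print(rp.can_fetch("*", "https://www.example.com/"))              # True
print(rp.can_fetch("Googlebot", "https://www.example.com/page"))  # True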

Robots.txt Usage Examples

Below are several common examples of how you can use a robots.txt file to set parameters and control how different crawlers/robots access your website.

The following example would allow all crawlers/robots to access all files except for your images directory.

robots.txt example

User-agent: *
Disallow: /images/
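
To see the effect of that rule without uploading anything, you can feed the same two lines straight into a parser. A minimal sketch, again using Python's urllib.robotparser:

Python example (urllib.robotparser)

from urllib import robotparser

rules = """\
User-agent: *
Disallow: /images/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "/images/logo.png"))  # False - inside /images/
print(rp.can_fetch("*", "/about.html"))       # True  - everything else is allowed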

The following example points Google and the other search engines that support the Sitemap directive to your XML sitemap, and directs crawlers/robots to crawl all files except the cgi-bin directory, your log files, and your images directory. Replace the sitemap URL with the actual location of your own sitemap.

robots.txt example

User-agent: *
Disallow: /cgi-bin/
Disallow: /logs/
Disallow: /images/
Sitemap: https://www.example.com/sitemap.xml
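
Crawlers that support the Sitemap directive pick the sitemap URL up directly from this file. As a rough illustration, Python's urllib.robotparser (version 3.8 or newer) exposes it through site_maps():

Python example (urllib.robotparser)

from urllib import robotparser

rules = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /logs/
Disallow: /images/
Sitemap: https://www.example.com/sitemap.xml
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

print(rp.site_maps())                     # ['https://www.example.com/sitemap.xml']
print(rp.can_fetch("*", "/cgi-bin/run"))  # False
print(rp.can_fetch("*", "/index.html"))   # True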

You can also use the "Crawl-delay" parameter in the robots.txt file. This parameter indicates the number of seconds a crawler/spider should wait between requests. (Not every crawler honors it; Googlebot, for example, ignores Crawl-delay.)

robots.txt with Crawl-delay

User-agent: Googlebot
Crawl-delay: 20

User-agent: Slurp
Crawl-delay: 20

User-agent: msnbot
Crawl-delay: 20
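
A well-behaved crawler reads this value and pauses between requests. The sketch below shows one way a crawler might honor it, again with Python's urllib.robotparser; the page list is just a placeholder.

Python example (urllib.robotparser)

import time
from urllib import robotparser

rules = """\
User-agent: Slurp
Crawl-delay: 20
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

delay = rp.crawl_delay("Slurp") or 1           # None when no Crawl-delay is set
pages = ["/", "/about.html", "/contact.html"]  # placeholder URLs

for page in pages:
    # ...fetch the page here...
    print("crawled", page)
    time.sleep(delay)  # wait the requested number of seconds before the next request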

Bad Robots and Email Harvesters

Below are several robots/crawlers that you might want to block. Listing them in a single group followed by "Disallow: /" tells them to stay off the entire site. Keep in mind that robots.txt is only advisory; badly behaved robots often ignore it, so server-level blocking (see "Using Apache To Stop Bad Robots" in the resources below) may also be needed.

robots.txt example

User-agent: Titan
User-agent: EmailCollector
User-agent: EmailSiphon
User-agent: EmailWolf
User-agent: ExtractorPro
User-agent: WebZip
User-agent: larbin
User-agent: b2w/0.1
User-agent: htdig/3.1.5
User-agent: teleport
User-agent: NPBot
User-agent: TurnitinBot
User-agent: dloader(NaverRobot)
User-agent: dloader(Speedy Spider)
User-agent: FunWebProducts
User-agent: WebStripper
User-agent: WebSauger
User-agent: WebCopier
Disallow: /
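
You can sanity-check a block list like this the same way. In the sketch below (Python's urllib.robotparser again, with the list shortened for brevity), the listed user agents are denied everything while an unlisted crawler such as Googlebot is unaffected:

Python example (urllib.robotparser)

from urllib import robotparser

rules = """\
User-agent: EmailCollector
User-agent: WebStripper
Disallow: /
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("EmailCollector", "/"))  # False - blocked everywhere
print(rp.can_fetch("WebStripper", "/"))     # False
print(rp.can_fetch("Googlebot", "/"))       # True  - no rule applies to it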

Robots Resources and Tools

  • The Robots Exclusion Protocol
  • Web Robots FAQ
  • Using Apache To Stop Bad Robots
  • List of Robots
  • Database of Web Robots
  • Types and Details of Robots
  • Articles and Papers

Tools

  • Robots.txt Generator
  • Robots.txt File Checker
  • Internet Marketing Ninjas Robots.txt Generator
  • Robots Text File Manager
About The Author:

Sean Odom is the founder of SeOpt Internet Marketing, a Houston-based local SEO service provider. Sean is also co-founder of Socialot.com, a SaaS social contact management system for small businesses.

