Tuesday, March 06, 2007
Search engine robots, including our very own Googlebot, are incredibly polite. They work hard to respect your every wish regarding what pages they should and should not crawl. How can they tell the difference? You have to tell them, and you have to speak their language, which is an industry standard called the Robots Exclusion Protocol.
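To see the protocol in action, here is a minimal sketch using Python's standard-library `urllib.robotparser`, which implements the same Robots Exclusion Protocol that crawlers like Googlebot honor. The `robots.txt` rules and the `example.com` URLs below are illustrative assumptions, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: block every crawler from /private/.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A polite robot checks each URL against the rules before fetching it.
print(rp.can_fetch("Googlebot", "https://example.com/private/page.html"))  # False
print(rp.can_fetch("Googlebot", "https://example.com/public/page.html"))   # True
```

A real crawler would fetch the site's live `/robots.txt` (for example with `rp.set_url(...)` followed by `rp.read()`) rather than parse an inline string, but the rule-matching logic is the same.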
Dan Crow has written about this on the Google Blog recently, including an introduction to setting up your own rules for robots and a description of some of the more advanced options. His first two posts in the series are:

- Controlling how search engines access and index your website
- The Robots Exclusion Protocol