
User-Agent Strings in SEO

User-agent strings are one of many aspects of SEO that the prospective website promoter needs to be aware of. A user-agent string is a piece of text that a web client sends as a header in each HTTP request it makes to your website. It is most often used to describe the application or program that initiated the request.

An example of such a request is as follows:

GET / HTTP/1.1
Host: www.cybernac.com
Connection: Close
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
Referer: http://www.cybernac.com
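
To make it concrete where the user-agent sits inside a request, here is a minimal JavaScript sketch that pulls the header out of raw request text like the example above. The function name extractUserAgent is hypothetical, the sample user-agent is shortened, and the sketch assumes one header per line; in practice your web server or framework does this parsing for you.

```javascript
// Hypothetical helper: find the User-Agent header in raw HTTP request text.
// Assumes one header per line; header names are treated case-insensitively.
function extractUserAgent(rawRequest) {
  const lines = rawRequest.split(/\r?\n/);
  // Skip the request line (e.g. "GET / HTTP/1.1") and stop at the blank
  // line that separates the headers from the body.
  for (const line of lines.slice(1)) {
    if (line === "") break;
    const colon = line.indexOf(":");
    if (colon === -1) continue;
    const name = line.slice(0, colon).trim().toLowerCase();
    if (name === "user-agent") {
      return line.slice(colon + 1).trim();
    }
  }
  return null; // no User-Agent header was sent
}

const request = [
  "GET / HTTP/1.1",
  "Host: www.cybernac.com",
  "Connection: Close",
  "User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)",
  "",
  "",
].join("\r\n");

console.log(extractUserAgent(request));
```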

By observing the user-agent, we can learn a number of interesting facts about our visitors. Many user-agent strings, for instance, include details about the browser and operating system used to access the site. Similarly, many search engine spiders (automated programs run by search companies such as Yahoo! and Google) identify themselves in the user-agent. The following user-agent is sent by Slurp, the spidering program operated by Yahoo!.

Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
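
As a rough illustration of how a spider might be recognized, the JavaScript sketch below checks a user-agent against a short list of patterns. The helper name identifySpider and the pattern list are assumptions made for this example; the tokens that real spiders send can change, so treat the list as illustrative rather than complete.

```javascript
// Illustrative list only: the tokens real spiders send can change over time.
const SPIDER_PATTERNS = [
  { name: "Yahoo! Slurp", pattern: /Yahoo! Slurp/i },
  { name: "Googlebot", pattern: /Googlebot/i },
];

// Hypothetical helper: return the spider's name, or null for other clients.
function identifySpider(userAgent) {
  if (typeof userAgent !== "string") return null;
  const match = SPIDER_PATTERNS.find((spider) => spider.pattern.test(userAgent));
  return match ? match.name : null;
}

console.log(identifySpider(
  "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
));
console.log(identifySpider("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"));
```

Keep in mind that this only reports what the client claims to be; it cannot confirm that the request really came from a search engine.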

In its most basic sense, analyzing this can be helpful in profiling your viewing audience. On a more technical level, however, a creative author can use the user-agent string to customize pages for different visitors, based upon the browser being used to access the page. Depending on the rationale behind it, the practice may be either encouraged or frowned upon.

For example, hand-held computers and cell phones are increasingly being used to access the Internet. However, content that looks perfectly fine on a laptop or desktop may be virtually unusable on these smaller devices. Further, support for web technologies may vary considerably between the devices themselves. A website promoter may wish to deliver content to these devices using an alternative layout, or offer a different selection of content if it may be more relevant to visitors accessing the site “on-the-go”. Below is an example of a user-agent from one such device, a Nokia 6681 cellular telephone.

Nokia6681/2.0 (5.37.01) SymbianOS/8.0 Series 60/2.6 Profile/MIDP-2.0
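
One way to act on such a string is sketched below in JavaScript: a function that picks a layout name based on the user-agent. The name chooseLayout, the layout names, and the token list are all assumptions for illustration; the tokens come only from the Nokia string above, and a real site would need a much longer, regularly updated list.

```javascript
// Illustrative tokens only, taken from the Nokia 6681 string above.
const HANDHELD_PATTERN = /Nokia|SymbianOS|Series 60|MIDP/i;

// Hypothetical helper: pick a layout name for the requesting client.
function chooseLayout(userAgent) {
  if (typeof userAgent === "string" && HANDHELD_PATTERN.test(userAgent)) {
    return "handheld";
  }
  return "desktop"; // default when unsure or when no user-agent was sent
}

console.log(chooseLayout(
  "Nokia6681/2.0 (5.37.01) SymbianOS/8.0 Series 60/2.6 Profile/MIDP-2.0"
));
console.log(chooseLayout("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"));
```

Falling back to the regular layout when nothing matches means an unrecognized device still receives the full site rather than an error.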

On the other hand, some website operators use user-agent strings for less-than-ethical reasons. For example, the practice of ‘cloaking’ describes a method of delivering different content to a search engine spider than you would to a regular user. To do this, a website promoter develops a webpage with highly optimized content, often so optimized that it is unreadable by a human, in addition to the regular, human-readable content. The operator then checks the user-agent before replying: if the user-agent contains a reference to a search engine, the highly optimized version is sent; otherwise, the more readable version is. The intended result is improved search engine rankings without affecting website presentation. Such practices are frowned upon by search engines, and website operators employing cloaking technologies to deceive search engines have often been punished by having their content removed entirely from search engine results.

A final example would be to restrict access to the site altogether. Some website operators might want to discourage certain types of uses of their website. As an example, the user-agent shown below is from cURL, a command-line tool used for transferring files, most commonly found on Unix-like systems such as Linux and Solaris. The program has numerous uses (many of which the average website operator will never observe), nearly all legitimate. However, the program is also very often used to ‘scrape’ content from a website or to perform similar actions, and it is almost never used in place of a browser when accessing web pages. Thus, some website operators may choose to deny any and all traffic from users of the program.

curl/7.15.5 (x86_64-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5
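
A denial rule of this kind might look like the JavaScript sketch below. The policy itself (refuse anything identifying as curl), the helper name statusFor, and the choice of a 403 response are assumptions made for the example, not a recommendation.

```javascript
// Assumed policy for this sketch: refuse any client identifying itself as curl.
const DENIED_PATTERN = /^curl\//i;

// Hypothetical helper: HTTP status code to answer with for a given user-agent.
function statusFor(userAgent) {
  if (typeof userAgent === "string" && DENIED_PATTERN.test(userAgent)) {
    return 403; // Forbidden
  }
  return 200; // OK
}

console.log(statusFor(
  "curl/7.15.5 (x86_64-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5"
));
console.log(statusFor("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"));
```

Such a rule only stops casual use: curl lets the user set any user-agent with its -A (--user-agent) option.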

Retrieving the user-agent depends on the programming language the site is built upon. Below are a few examples:

PHP:
$stUserAgent = $_SERVER['HTTP_USER_AGENT'];

JavaScript:
var stUserAgent = navigator.userAgent;

ASP (.NET):
stUserAgent = Request.ServerVariables("HTTP_USER_AGENT")

In each case, the variable stUserAgent will contain the user-agent of the program making the request.

There are a number of factors to consider when using the user-agent string:

  • It is trivial to spoof the user-agent, so do not rely on it as a dependable means of identifying visitors.
  • Some privacy programs will prevent a user-agent string from being transmitted.
  • Some adware, malware, and viruses will also modify the user-agent.
  • Do not expect search engines to always send their usual user-agent string; they may be checking up on you.
  • User-Agent strings often change with the release of new browsers.
  • It is unwise to use an exact string comparison to exclude a user-agent outright. Instead, use pattern-matching functions to include or exclude a user-agent from your selection.
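
The last point, together with the possibility of a missing user-agent, can be sketched in JavaScript as follows. The helper names are hypothetical, and the second sample string is an assumed variation of the Internet Explorer string shown earlier.

```javascript
// The exact string an operator might have recorded from one visitor.
const RECORDED =
  "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)";

// Brittle: only matches visitors whose string is character-for-character identical.
function isRecordedBrowserExact(userAgent) {
  return userAgent === RECORDED;
}

// More tolerant: matches any string containing an "MSIE <major>.<minor>" token.
// Returns false when no user-agent was sent at all.
function isMsie(userAgent) {
  if (typeof userAgent !== "string" || userAgent === "") return false;
  return /\bMSIE \d+\.\d+/.test(userAgent);
}

// Same browser family on a different machine: the exact comparison misses it.
const other = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)";
console.log(isRecordedBrowserExact(other)); // false
console.log(isMsie(other));                 // true
console.log(isMsie(undefined));             // false
```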

Further Information

http://www.useragentstring.com/ - A user-agent decoder website. It displays your user-agent along with a description, or decodes one that you supply.
http://www.botsvsbrowsers.com/ - An extensive database of user-agents that have been found in the wild.
http://en.wikipedia.org/wiki/User_Agent - The Wikipedia entry on user-agent strings.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html - RFC 2616 describes the HTTP/1.1 protocol standard; the user-agent string is specified in section 14.43.

Comments?

Did I miss anything here? If you have an idea how else SEOs can use user-agent strings, we'd love to hear from you. Please leave a comment below.
