Search engine optimization

From Citizendium

Search engine optimization, sometimes abbreviated as SEO, describes strategies to make a particular web site appear earlier in search engine results and thereby boost traffic to it. SEO aims to achieve the highest possible ranking on the search engine results page, or SERP, for a website or a set of web pages.

When a computer user hunts for information using a search engine, sites appearing on the first page of results are more likely to be chosen, thereby increasing chances that a user will click on a particular web site. This increases a web site's effectiveness since it will have greater exposure to the public. For a business selling over the Internet using a web site, this can translate into more customers. There is terrific competition among web sites to increase both the quality and volume of traffic, and SEO techniques are valuable for Internet marketing.

While most SEO techniques focus on specific words and terms known as "keywords", it's possible to optimize searches based on images, audio recordings, and video recordings as well.

The acronym "SEO" sometimes means "search engine optimizers" to describe consultants hired to boost Internet traffic as well as in-house employees charged with this task. Sometimes they recommend changes to the HTML code of a web site to make it more "search engine friendly" and, as a result, modify menus, pictures, images, videos, sounds and other aspects to make a particular web site more attractive to targeted users. They consider how search engines work and what people search for, and study specific keywords extensively. It is important for them to understand how search engine algorithms work, and how search engine software sub-programs, sometimes called "robots", operate.

When done right, SEO strategies can build a bridge between a user and a web site, and can help a seeker of specific information find it. They can help customers find what they want to buy and can help sellers find buyers. But SEO strategies can be overdone and can even backfire when a web site attracts the wrong or irrelevant customers, or when a user finds only clutter instead of the specific information desired. This diminishes the effectiveness of a particular search engine. SEO techniques which seek to manipulate the searching experience for unproductive purposes are sometimes called black hat or spamdexing methods; they misuse tricks such as link farms, keyword stuffing and article spinning, and degrade the relevance of search results.

In response, to keep search results relevant, search engine software engineers use techniques to weed out artificially enhanced web sites and help users sift through the clutter. They try to figure out which sites are trying to game the system artificially in unproductive ways, and take steps to remove these sites from the indexes created by their "crawlers", sometimes called "spiders", which are software programs that search engines use to continually explore the World Wide Web, hunting for content and creating indexes.

Battling between search engines and optimizers

Generally, the early history of the Internet reveals a running battle of wits between SEO consultants and search engine programmers which continues to the present, and is likely to continue well into the future.

The first search engines began cataloging the Web in the early 1990s. It wasn't long before webmasters and content providers began efforts to boost traffic to their sites in the mid-1990s. At the beginning, all a webmaster needed to do was submit the address of a page, or URL, to the various engines, which would send a spider to "crawl" that page, extract links to other pages from it, and return information found on the page to be indexed.[1] The process involves a search engine spider downloading a page and storing it on the search engine's own server. There a second program, known as an indexer, extracts information about the page, such as the words it contains and where these are located, and tries to determine the weight or importance given to specific terms. The indexer also extracts all links the page contains, which are then placed into a scheduler for crawling at a later date.
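
The indexing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular search engine works: the class `PageIndexer`, the function `index_page`, the sample page and the use of a simple queue as the "scheduler" are all hypothetical, and no weighting of terms is attempted.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser
import re


class PageIndexer(HTMLParser):
    """Collects word positions and outgoing links from one HTML page."""

    def __init__(self):
        super().__init__()
        self.positions = defaultdict(list)  # word -> positions where it occurs
        self.links = []                     # href values found in <a> tags
        self._next_position = 0

    def handle_starttag(self, tag, attrs):
        # Record the target of every hyperlink on the page.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Record each word together with its running position on the page.
        for word in re.findall(r"[a-z0-9]+", data.lower()):
            self.positions[word].append(self._next_position)
            self._next_position += 1


def index_page(html, schedule):
    """Index one stored page and queue its links for a later crawl."""
    indexer = PageIndexer()
    indexer.feed(html)
    schedule.extend(indexer.links)
    return dict(indexer.positions)


# Hypothetical stored page, standing in for a copy downloaded by the spider.
page = '<h1>Cheap flights</h1><p>Find cheap flights <a href="/deals">here</a>.</p>'
schedule = deque()
index = index_page(page, schedule)
```

Here `index` maps each word to the positions where it occurs, and `schedule` holds the links waiting to be crawled.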

Web site owners quickly recognized the value of having their sites highly ranked and visible in search engine results. There were opportunities for both white hat and black hat SEO consultants. According to software analyst Danny Sullivan, the phrase search engine optimization probably came into use in 1997.[2]

Early versions of search algorithms relied on webmaster-provided information such as the keyword meta tag, or index files in engines like ALIWEB. Meta tags provide a guide to each page's content. But search engineers found that using these meta data to index pages was often unreliable because the webmaster's choice of keywords in the meta tag could be an inaccurate representation of the site's actual content. Inaccurate, incomplete, and inconsistent data in meta tags caused pages to rank highly for irrelevant searches.[3] Some web content providers manipulated a number of attributes within the HTML source code of a page to boost traffic artificially in search engines.[4]

By over-relying on factors such as keyword density, which were exclusively within a webmaster's control, early search engines suffered from abuse and ranking manipulation. To provide better results for users, search engines had to adapt to ensure their results pages showed the most relevant search results, rather than unrelated pages stuffed with numerous keywords by unscrupulous webmasters. Since the usefulness of a search engine is determined by its ability to produce the most relevant results to any given search, allowing false results was counterproductive. Search engines responded by developing more complex ranking algorithms, taking into account additional factors that were more difficult for webmasters to manipulate.
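
To see why keyword density was so easy to manipulate, consider a minimal sketch of the measure. It assumes density is simply the share of words on a page that match the keyword; the function name `keyword_density` and both sample texts are hypothetical, and real engines of the period may have computed it differently.

```python
import re


def keyword_density(text, keyword):
    """Return the fraction of words in text that equal keyword."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


# Hypothetical page texts: one written for readers, one stuffed with a keyword.
natural = "Our shop sells hiking boots and trail maps for weekend walkers"
stuffed = "boots boots cheap boots buy boots boots online boots"

natural_density = keyword_density(natural, "boots")
stuffed_density = keyword_density(stuffed, "boots")
```

Because the page author controls every word, the stuffed text scores far higher on this measure without being any more useful to a reader.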

By 1997 search engines recognized that webmasters were making efforts to rank well in their search engines, and that some webmasters were even manipulating their rankings in search results by stuffing pages with excessive or irrelevant keywords. Early search engines, such as Infoseek, adjusted their algorithms in an effort to prevent webmasters from manipulating rankings.[5]

Graduate students at Stanford University, Larry Page and Sergey Brin, developed a search engine named "BackRub" that relied on a mathematical algorithm to rate the prominence of web pages. The number calculated by the algorithm, PageRank, is a function of the quantity and strength of inbound links.[6] PageRank estimates the likelihood that a given page will be reached by a web user who randomly surfs the web, and follows links from one page to another. In effect, this meant that some links were treated as stronger than others, since a page with a higher PageRank is more likely to be reached by the random surfer.

Diagram with different colored circles with percentages and arrows.
Google's "PageRank" method seeks to determine the most relevant web sites by tracing links. Even though more sites link to web site E, web site C is ranked as more relevant since its links are stronger. A web surfer who chooses a random link on every site (but with 15% likelihood jumps to a random page on the whole web) is going to be on Page E for 8.1% of the time. The most relevant web site is B, if hypothetical users click randomly and follow links for a limited amount of time.
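
The random-surfer idea can be illustrated with a short Python sketch. It is a simplified textbook-style iteration, not Google's actual implementation. The four-page link graph is hypothetical and is not the graph shown in the diagram; the damping value of 0.85 corresponds to the 15% chance of jumping to a random page mentioned in the caption, and the function assumes every link target also appears as a key in `links`.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate how often a random surfer visits each page.

    links maps each page to the list of pages it links to.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives the share that comes from random jumps.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if targets:
                # A page passes its rank on in equal parts to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without links spreads its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: B and C link to each other, A and D only link out.
links = {
    "A": ["B"],
    "B": ["C"],
    "C": ["B"],
    "D": ["B", "C"],
}
ranks = pagerank(links)
```

In this graph B ends up with the highest score because it collects links from three pages, including the strong page C, while A and D receive no links at all and keep only the share that comes from random jumps.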

Page and Brin founded Google in 1998. Google attracted a loyal following among the growing number of Internet users, who liked its simple design.[7] Google's search software used so-called off-page factors such as PageRank and hyperlink analysis as well as so-called on-page factors such as keyword frequency, meta tags, headings, links and site structure. As a result, Google searches bypassed much of the manipulation which flummoxed search engines which only considered on-page factors for their rankings. It helped searchers find what they were searching for.

Webmasters seeking to boost traffic to their particular sites responded with new methods to outwit the savvier Google search engine. Although PageRank was more difficult to game, webmasters found that link building tools and schemes, which had been developed earlier to influence the Inktomi search engine, had some effectiveness in gaming PageRank. Many web sites focused on exchanging, buying, and selling links, often on a massive scale, and they quickly earned the moniker of "link farms". A link farm is a group of web sites that all hyperlink to every other site in the group; while it's possible to create link farms manually, most are created by automated programs, and are built to trick a search engine into boosting traffic to particular web sites. Link farms are a form of spamming the index of a web search engine and are sometimes called spamdexing or spamexing. Sometimes link farms consist of thousands of sites.[8]

Diagram showing circles representing web sites in a larger circle, linked by arrows.
Early attempts by search engine optimizers to game the system led to the development of link farms, usually created using special programs, but linking thousands of artificially created web sites. The multiplication of links was designed to fool search engines into thinking that each individual website was important. Since then, search engine programmers have developed counter-measures.

Search engines battled back during the first decade of the twenty-first century with a new slew of proprietary techniques to outwit the link farms and other measures to game the system, such as PageRank sculpting. Their ranking algorithms tried to reduce the impact of link manipulation. Google says it ranks sites using more than 200 different signals.[9] Understandably, leading search engines such as Google and Yahoo refuse to disclose their page ranking algorithms. SEO consultants have studied different approaches to search engine optimization and sometimes publish their opinions in online forums and blogs;[10][11] some even study software patents to gain insight into the algorithms.[12]

In 2005 Google began personalizing search results for each user based on their history of previous searches.[13] One expert suggested that personalized searching meant overall page ranking was meaningless because results would be different for each user and even each search.[14]

In 2007 Google announced a campaign against paid links that transfer PageRank.[15] In June 2009, Google disclosed that it had taken measures to mitigate the effects of PageRank sculpting by use of the nofollow attribute on links. Google engineer Matt Cutts announced that Googlebot would no longer treat nofollowed links in the same way, in order to prevent SEOs from using nofollow links for PageRank sculpting.[16] As a countermeasure, SEOs developed alternative techniques that replace nofollowed links with obfuscated JavaScript and thus tried to keep sculpting PageRank. Additionally, several solutions have been suggested that include the usage of iframes, Flash and JavaScript.[17]

In December 2009 Google announced plans to use the web search history of all users to populate search results.[18] In late 2009, a new technique called "real-time-search" was introduced to make search results even more timely and relevant. Historically, site administrators have spent months or even years optimizing a website to increase search rankings. With the growth in popularity of social media sites and blogs, the leading search engines made changes to their algorithms to allow fresh content to rank quickly within the search results.[19] This new approach to search places importance on current, fresh and unique content, and tries to avoid the bias towards older and more established content.

Due to the high marketing value of targeted search results, there is an adversarial aspect to the relation between search engine software firms and SEOs. In 2005, an annual conference, AIRWeb, an acronym for Adversarial Information Retrieval on the Web,[20] was created to discuss and minimize the damaging effects of aggressive web content providers. SEO companies using overly aggressive techniques to jigger search results can have their efforts backfire: in some cases, search firms have banned some websites from search results. In 2005, the Wall Street Journal reported on a company, Traffic Power, which allegedly used high-risk techniques and failed to disclose those risks to its clients.[21] Google's Matt Cutts later confirmed that Google did in fact ban Traffic Power and some of its clients.[22]

Some search engine firms have reached out to the SEO industry to seek ways to make the system work for all parties. Search engine consultants are frequent sponsors and guests at SEO conferences, chats, and seminars. With the advent of paid inclusion, some search engines now have a vested interest in the health of the optimization community. Major search engine firms provide information and guidelines to help SEO consultants with site optimization.[23][24][25] Google has a Sitemaps program[26] to help webmasters learn if Google is having any problems indexing their website and also provides data on Google traffic to the website. Google guidelines are a list of suggested practices which Google has provided as guidance to webmasters. Yahoo! Site Explorer provides a way for webmasters to submit URLs, determine how many pages are in the Yahoo! index and view link information.[27]


Getting indexed

The leading search engines, such as Google and Yahoo!, use crawlers to find pages for their algorithmic search results. Pages that are linked from other search engine indexed pages do not need to be submitted because they are found automatically. Some search engines, notably Yahoo!, operate a paid submission service that guarantees crawling for either a set fee or cost per click.[28] Such programs usually guarantee inclusion in the database, but do not guarantee specific ranking within the search results.[29] Two major directories, the Yahoo Directory and the Open Directory Project, both require manual submission and human editorial review.[30] Google offers Google Webmaster Tools, for which an XML Sitemap feed can be created and submitted for free to ensure that all pages are found, especially pages that aren't discoverable by automatically following links.[31]
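
As an illustration of what such an XML Sitemap feed contains, the following Python sketch builds a minimal one. It assumes the basic layout of the public Sitemaps protocol (a `urlset` element holding one `url` and `loc` entry per page); the function name `build_sitemap` and the example.com addresses are hypothetical, and optional fields such as the last modification date are left out.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Sitemaps protocol, version 0.9.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML Sitemap listing the given page addresses."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, including one that no other page links to.
sitemap_xml = build_sitemap([
    "https://www.example.com/",
    "https://www.example.com/archive/unlinked-page.html",
])
```

The resulting text would typically be saved as a file on the site and its address submitted to the search engine.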

Search engine crawlers examine different factors when crawling a site. Not every page is indexed by the search engines. The distance of a particular page from the root directory of a site may also be a factor in whether or not pages get crawled.[32]

Preventing crawling

To avoid undesirable content in the search indexes, webmasters can instruct spiders not to crawl certain files or directories through the standard robots.txt file in the root directory of the domain. Pages typically prevented from being crawled include login-specific pages such as shopping carts and user-specific content such as search results from internal searches. Additionally, a page can be explicitly excluded from a search engine's database by using a meta tag specific to robots. When a search engine visits a site, the robots.txt located in the root directory is the first file crawled. The robots.txt file is then parsed, and it tells the robot which pages are not to be crawled. Search engine crawlers typically keep a copy of pages crawled in a cache. It remains possible, however, for crawlers to visit pages despite instructions to the contrary from webmasters. In March 2007, Google warned webmasters that they should prevent indexing of internal search results because those pages are considered as "search spam".[33]
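
How a well-behaved crawler might apply these rules can be sketched with Python's standard `urllib.robotparser` module. The robots.txt content, the example.com addresses, the helper `may_crawl` and the crawler name "ExampleBot" are all hypothetical; the rules are parsed from a string here rather than fetched from a live site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of the shopping cart
# and out of internal search results.
robots_txt = """\
User-agent: *
Disallow: /cart/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def may_crawl(url, user_agent="ExampleBot"):
    """Return True if the parsed rules allow this crawler to fetch the URL."""
    return parser.can_fetch(user_agent, url)


product_allowed = may_crawl("https://www.example.com/products/widget")
cart_allowed = may_crawl("https://www.example.com/cart/checkout")
search_allowed = may_crawl("https://www.example.com/search?q=shoes")
```

With these rules the product page may be crawled while the cart and internal search pages may not. The check is voluntary on the crawler's side, which is why pages can still be visited despite a webmaster's instructions.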

Increasing prominence

A variety of methods can be used to optimize search results. These include:

  • Cross linking between pages of the same website. This gives more links to a website's main pages to increase its PageRank.[34]
  • Encouraging legitimate links from other websites. Note: link farming and comment spam risk harming a site's search position.
  • Writing content that includes frequently searched keyword phrases.[35] Adding relevant keywords to a web page's meta tags, a practice that becomes keyword stuffing when overdone.
  • URL normalization. Web pages that are accessible via multiple URLs can be consolidated using the "canonical" link element.[36] URL normalization, sometimes called URL canonicalization, is the modification of URLs so they are standardized and consistent, so that a search engine can determine whether two syntactically different URLs are equivalent. Search engines employ URL normalization in order to assign importance to web pages and to reduce indexing of duplicate pages. Web crawlers perform URL normalization in order to avoid crawling the same resource more than once. Web browsers may perform normalization to determine if a link has been visited or to determine if a page has been cached. An example of normalization: converting to lower case, so that HTTP:// becomes
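
A few common normalization steps can be sketched in Python with the standard `urllib.parse` module. This is a simplified illustration rather than a complete normalizer: the function name `normalize_url` and the example.com addresses are hypothetical, only the scheme and host are lower-cased, default ports and fragments are dropped, an empty path becomes "/", and any user name or password in the URL is discarded.

```python
from urllib.parse import urlsplit, urlunsplit

# Ports that are implied by the scheme and can be left out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url):
    """Return a standardized form of url for comparing addresses."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    netloc = host
    # Keep the port only when it differs from the scheme's default.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"
    path = parts.path or "/"
    # The fragment is dropped because it is not sent to the server.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


first = normalize_url("HTTP://www.Example.com:80/Page?id=1#top")
second = normalize_url("http://www.example.com/Page?id=1")
```

Both calls yield the same string, so a crawler comparing normalized addresses would treat the two as one resource. The path is left in its original case because paths may be case-sensitive on the server.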

White hat versus black hat

SEO techniques can be classified into recommended techniques which search engine firms agree are acceptable, and manipulative techniques, such as spamdexing, which try to game the system. Some industry commentators call the recommended techniques white hat SEO and the undesired techniques black hat SEO.[37] White hats tend to produce results that last a long time, whereas black hats anticipate that their sites may eventually be banned either temporarily or permanently once the search engines discover what they are doing.[38]

A search engine optimization technique is considered white hat SEO (or sometimes "legal SEO") if it conforms to the search engines' guidelines and involves no deception. Search engine guidelines[23][24][25][39] are not, however, written as a series of fixed rules or commandments, but are more a matter of judgment and accepted industry practice. White hat SEO is not merely following guidelines but ensuring the content a search engine indexes and ranks is the same content a user will see. White hat advice generally consists of web strategies focused on benefitting users and on making quality content easily accessible to search engine spiders, while black hat advice consists of attempts to divert the algorithm from its intended purpose. White hat SEO is in many ways similar to web development that promotes accessibility,[40] although the two are not identical.

Black hat SEO tries to improve rankings by deceiving search engines. One technique uses hidden text, either as text of the same color as the background, or placed in an invisible div, or positioned off screen. Another method gives a different page depending on whether the page is being requested by a human visitor or a search engine, a technique known as cloaking. Another method is to use invisible iframes, where the page a user sees is not necessarily from the company that is hosting that webpage. This is one way in which duplicitous websites download software onto a user's computer in so-called "drive-by" downloads without the user's knowledge or permission.

If and when search engines discover black hat methods, they may penalize sites using these techniques by reducing their rankings or eliminating their listings from their databases altogether. Such penalties can be applied either automatically by the search engines' algorithms, or by a manual site review. In February 2006 Google removed both BMW Germany and Ricoh Germany for use of deceptive practices,[41] and in April 2006 it removed the PPC agency BigMouthMedia.[42] All three companies, however, apologized, fixed the offending pages, and were restored to Google's list.[43]

Search optimization does not guarantee increased sales. SEO is not necessarily an appropriate strategy for every website, and other Internet marketing strategies can be much more effective, depending on the site operator's goals.[44] A successful Internet marketing campaign may drive organic traffic to web pages, but it also may involve the use of paid advertising on search engines and other pages, building high quality web pages to engage and persuade, addressing technical issues that may keep search engines from crawling and indexing those sites, setting up analytics programs to enable site owners to measure their successes, and improving a site's conversion rate.[45]

Nevertheless, SEO may generate a substantial return on investment. Since search engines are not paid for organic search traffic and their algorithms change, there can be no guarantees of continued referrals. A business overly reliant on search engine traffic can suffer major losses if the search engines stop sending visitors.[46] It is considered sound business practice for website operators to liberate themselves from dependence on particular search engine strategies.[47] A top-ranked SEO blog[48] suggested, "Search marketers, in a twist of irony, receive a very small share of their traffic from search engines." Instead, their main sources of traffic are links from other websites.[49]

The dominance of Google

While the search engines' market shares vary from market to market, technologist Danny Sullivan reported that Google was used in 75% of all searches in 2003.[50] In markets outside the United States, Google's share is often larger, and Google remains the dominant search engine worldwide as of 2007.[51] In 2006, Google had an 85-90% market share in Germany.[52] While there were hundreds of SEO firms in the US at that time, there were only about five in Germany.[52] In June 2008, the market share of Google in the UK was close to 90% according to Hitwise.[53] That market share is achieved in a number of countries.[54]

As of 2009, there are only a few large markets where Google is not the leading search engine. In most cases, when Google is not leading in a given market, it is lagging closely behind a local player. The most notable markets where this is the case are China, Japan, South Korea, Russia and the Czech Republic, where Baidu, Yahoo! Japan, Naver, Yandex and Seznam, respectively, are market leaders.

Successful search optimization for international markets may require professional translation of web pages, registration of a domain name with a top level domain in the target market, and web hosting that provides a local IP address. Otherwise, the fundamental elements of search optimization are essentially the same, regardless of language.[52]

Legal activity

On October 17, 2002, SearchKing filed a lawsuit against Google. SearchKing's claim was that Google's tactics to prevent spamdexing constituted a tortious interference with contractual relations. On May 27, 2003, the court granted Google's motion to dismiss the complaint because SearchKing "failed to state a claim upon which relief may be granted."[55][56]

In March 2006, KinderStart filed a lawsuit against Google over search engine rankings. KinderStart's web site was removed from Google's index prior to the lawsuit and the amount of traffic to the site dropped by 70%. On March 16, 2007, a United States court in San Jose, California, dismissed KinderStart's complaint without leave to amend, and partially granted Google's motion for sanctions against KinderStart's attorney, requiring him to pay part of Google's legal expenses.[57][58]


  1. Brian Pinkerton. Finding What People Want: Experiences with the WebCrawler (PDF). The Second International WWW Conference Chicago, USA, October 17–20, 1994. Retrieved on 2007-05-07.
  2. Danny Sullivan (June 14, 2004). Who Invented the Term "Search Engine Optimization"?. Search Engine Watch. Retrieved on 2007-05-14.
  3. Cory Doctorow (August 26, 2001). Metacrap: Putting the torch to seven straw-men of the meta-utopia. e-LearningGuru. Retrieved on 2007-05-08.
  4. Pringle, G., Allison, L., and Dowe, D. (April 1998). What is a tall poppy among web pages?. Proc. 7th Int. World Wide Web Conference. Retrieved on 2007-05-08.
  5. Laurie J. Flynn (November 11, 1996). Desperately Seeking Surfers. New York Times. Retrieved on 2007-05-09.
  6. Brin, Sergey and Page, Larry (1998). The Anatomy of a Large-Scale Hypertextual Web Search Engine. Proceedings of the seventh international conference on World Wide Web. Retrieved on 2007-05-08.
  7. Thompson, Bill (December 19, 2003). Is Google good for you?. BBC News. Retrieved on 2007-05-16.
  8. Zoltan Gyongyi and Hector Garcia-Molina (2005). Link Spam Alliances (PDF). Proceedings of the 31st VLDB Conference, Trondheim, Norway. Retrieved on 2007-05-09.
  9. Google Keeps Tweaking Its Search Engine. New York Times (June 3, 2007). Retrieved on 2007-06-06.
  10. Danny Sullivan (September 29, 2005). Rundown On Search Ranking Factors. Search Engine Watch. Retrieved on 2007-05-08.
  11. Search Engine Ranking Factors V2. (April 2, 2007). Retrieved on 2007-05-14.
  12. Christine Churchill (November 23, 2005). Understanding Search Engine Patents. Search Engine Watch. Retrieved on 2007-05-08.
  13. Google Personalized Search Leaves Google Labs - Search Engine Watch (SEW). Retrieved on 2009-09-05.
  14. Will Personal Search Turn SEO On Its Ear?. Retrieved on 2009-09-05.
  15. 8 Things We Learned About Google PageRank. Retrieved on 2009-08-17.
  16. PageRank sculpting. Matt Cutts. Retrieved on 2010-01-12.
  17. Google Loses “Backwards Compatibility” On Paid Link Blocking & PageRank Sculpting. Retrieved on 2009-08-17.
  18. Personalized Search for everyone. Google. Retrieved on 2009-12-14.
  19. Relevance Meets Real Time Web. Google Blog.
  20. AIRWeb. Adversarial Information Retrieval on the Web, annual conference. Retrieved on 2007-05-09.
  21. David Kesmodel (September 22, 2005). Sites Get Dropped by Search Engines After Trying to 'Optimize' Rankings. Wall Street Journal. Retrieved on 2008-07-30.
  22. Matt Cutts (February 2, 2006). Confirming a penalty. Retrieved on 2007-05-09.
  23. Google's Guidelines on Site Design. Retrieved on 2007-04-18.
  24. Site Owner Help: MSN Search Web Crawler and Site Indexing. Retrieved on 2007-04-18.
  25. Yahoo! Search Content Quality Guidelines. Retrieved on 2007-04-18.
  26. Google Webmaster Tools. Retrieved on 2007-05-09.
  27. Yahoo! Site Explorer. Retrieved on 2007-05-09.
  28. Submitting To Search Crawlers: Google, Yahoo, Ask & Microsoft's Live Search. Search Engine Watch (2007-03-12). Retrieved on 2007-05-15.
  29. Search Submit. Retrieved on 2007-05-09.
  30. Submitting To Directories: Yahoo & The Open Directory. Search Engine Watch (2007-03-12). Retrieved on 2007-05-15.
  31. What is a Sitemap file and why should I have one?. Retrieved on 2007-03-19.
  32. Cho, J., Garcia-Molina, H. (1998). Efficient crawling through URL ordering. Proceedings of the seventh conference on World Wide Web, Brisbane, Australia. Retrieved on 2007-05-09.
  33. Newspapers Amok! New York Times Spamming Google? LA Times Hijacking Search Engine Land (May 8, 2007). Retrieved on 2007-05-09.
  34. "Link Development".
  35. "Keyword rich text".
  36. Bing - Partnering to help solve duplicate content issues - Webmaster Blog - Bing Community. Retrieved on 2009-10-30.
  37. Andrew Goodman. Search Engine Showdown: Black hats vs. White hats at SES. SearchEngineWatch. Retrieved on 2007-05-09.
  38. Jill Whalen (November 16, 2004). Black Hat/White Hat Search Engine Optimization. Retrieved on 2007-05-09.
  39. What's an SEO? Does Google recommend working with companies that offer to make my site Google-friendly?. Retrieved on 2007-04-18.
  40. Andy Hagans (November 8, 2005). High Accessibility Is Effective Search Engine Optimization. A List Apart. Retrieved on 2007-05-09.
  41. Matt Cutts (February 4, 2006). Ramping up on international webspam. Retrieved on 2007-05-09.
  42. seobook (April 4, 2006). Big Mouth Media Banned for Excessive Hidden Text Spamming - Google's Matt Cutts Confirms Hand Job. Retrieved on 2007-05-09.
  43. Matt Cutts (February 7, 2006). Recent reinclusions. Retrieved on 2007-05-09.
  44. What SEO Isn't. (June 24, 2006). Retrieved on 2007-05-16.
  45. Melissa Burdon (March 13, 2007). The Battle Between Search Engine Optimization and Conversion: Who Wins?. Retrieved on 2007-05-09.
  46. Andy Greenberg (April 30, 2007). Condemned To Google Hell. Forbes. Retrieved on 2007-05-09.
  47. Jakob Nielsen (January 9, 2006). Search Engines as Leeches on the Web. Retrieved on 2007-05-14.
  48. SEOmoz: Best SEO Blog of 2006. (January 3, 2007). Retrieved on 2007-05-31.
  49. A survey of 25 blogs in the search space comparing external metrics to visitor tracking data. Retrieved on 2007-05-31.
  50. The search engine that could, USA Today, 2003-08-26. Retrieved on 2007-05-15.
  51. Greg Jarboe (2007-02-22). Stats Show Google Dominates the International Search Landscape. Search Engine Watch. Retrieved on 2007-05-15.
  52. Mike Grehan (April 3, 2006). Search Engine Optimizing for Europe. Click. Retrieved on 2007-05-14.
  53. Jack Schofield (2008-06-10). Google UK closes in on 90% market share. Guardian. Retrieved on 2008-06-10.
  54. Alex Chitu (2009-03-13). Google's Market Share in Your Country. Google Operating System. Retrieved on 2009-05-16.
  55. Search King, Inc. v. Google Technology, Inc., CIV-02-1457-M (PDF). (May 27, 2003). Retrieved on 2008-05-23.
  56. Stefanie Olsen (May 30, 2003). Judge dismisses suit against Google. CNET. Retrieved on 2007-05-10.
  57. Technology & Marketing Law Blog: KinderStart v. Google Dismissed—With Sanctions Against KinderStart's Counsel. Retrieved on 2008-06-23.
  58. Technology & Marketing Law Blog: Google Sued Over Rankings— v. Google. Retrieved on 2008-06-23.