As search engines become more sophisticated, the methods for keeping pace and achieving top rankings must constantly evolve too. Search engines have always loved good, original content, so you might think, “Why don’t I just publish more content and get it out there?” When you get content published, you increase your chances of generating valuable site links, and that can lead to more site traffic. So what’s the downside?
What Is Duplicate Content?
Duplicate content is *identical* content that appears on two or more different web pages. These web pages may have different URLs and reside on different web sites. Some companies and individuals purchase multiple domain names and point them to the same content for marketing purposes, or to prevent competitors from purchasing similar domains. Because search engines index pages by URL, the search engine treats each distinct URL string as a separate page. For example, www.mycoolsite.com and www.mykoolsite.com may both point to the same files, but www.mycoolsite.com/index.html and www.mykoolsite.com/index.html will be indexed separately and will appear as duplicate content.
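To see why two domains serving the same files look like duplicates, consider this minimal Python sketch. It assumes a hypothetical in-memory mapping of URLs to page text; the `pages` data and the `group_by_content` helper are illustrative only and are not how any particular search engine works. It fingerprints each page and groups the URLs whose fingerprints match.

```python
import hashlib
from collections import defaultdict


def group_by_content(pages):
    """Group URLs by a SHA-256 fingerprint of their page text.

    `pages` is a hypothetical dict mapping URL -> page text.
    Returns only the groups holding two or more URLs (the duplicates).
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        fingerprint = hashlib.sha256(text.encode("utf-8")).hexdigest()
        groups[fingerprint].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Illustrative data: two domains serving the same file, plus one unique page.
pages = {
    "www.mycoolsite.com/index.html": "Welcome to my cool site!",
    "www.mykoolsite.com/index.html": "Welcome to my cool site!",
    "www.mycoolsite.com/about.html": "About us.",
}

duplicates = group_by_content(pages)
print(duplicates)
```

The two index.html URLs land in the same group because their text is byte-for-byte identical, even though the URL strings differ.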
Why Do Search Engines Care About Duplicate Content?
In an effort to satisfy searchers and deliver relevant and unique search results, search engines filter out duplicate content so that search results for specific keywords don’t contain a long list of URLs with the exact same content. Filtering or ignoring duplicate content helps keep search engines fast and efficient by keeping the database of indexed URLs as relevant and clean as possible. Duplicate content has also been used to “spam” search engines in hopes of higher rankings, so the search engines have taken measures to curtail this practice. Google has even been issued a patent on the way it handles duplicate content.
Even if the sites where duplicate content appears have different navigation, footers, graphics, etc., a page with duplicate content may trigger a search engine filter. In this case, the page can remain in the search engine database but will not appear in search engine results.
You can see this in practice if you perform a Google search for this specific excerpt from a news story: “A tsunami-warning system for Pacific rim countries has operated for decades, with the Hawaii center issuing alerts to 26 countries.” Scroll to the bottom of the Google results page and you’ll see this message from Google: “In order to show you the most relevant results, we have omitted some entries very similar to the 9 already displayed. If you like, you can repeat the search with the omitted results included.” You can then click on the link to see the omitted pages.
Duplicate content within a web site’s internal structure can also lessen the value of the site, lowering its ranking in search engine results pages and possibly even getting the site banned from the search engine.
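Pages that carry the same article but differ in navigation and footers are not byte-for-byte identical, so an exact comparison would miss them. One common way to measure such overlap is to compare sets of overlapping word sequences ("shingles"). The sketch below is an illustration of that general idea under assumed sample text; the function names are hypothetical, and it is not a description of Google's patented method.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# The same news excerpt wrapped in different (made-up) navigation and footers.
article = ("a tsunami warning system for pacific rim countries has operated "
           "for decades with the hawaii center issuing alerts to 26 countries")
page_one = "home news sports " + article + " copyright site one"
page_two = "menu contact " + article + " all rights reserved site two"
unrelated = "our bakery sells fresh bread and pastries every morning of the week"

similar = jaccard(shingles(page_one), shingles(page_two))
different = jaccard(shingles(page_one), shingles(unrelated))
print(f"{similar:.2f} {different:.2f}")
```

The two pages that share the article score far higher than the unrelated pair. Where to set the cutoff for "duplicate" is a judgment call that this sketch does not attempt to settle.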
What Is Considered Duplicate Content?
Duplicate content can appear on the Web in a variety of ways:
- Syndicated columns that appear on more than one site
- Licensed material that appears on more than one page
- RSS, RDF, and Atom Feeds
- News Wire Stories
- Blog Archive pages (new blog posts appear on the front page but are also automatically created in the archive as well)
- Newsletter archives which reproduce articles submitted by others
- Fair use text
- Manufacturers’ product descriptions on retail sites
- Editors’ and producers’ descriptions and reviews on retail sites for books, movies, and music
- Multiple domain names pointing to the same site
- Mirrored sites
- Plagiarized and copyright infringed content
What Content Pages Win?
If search engines are set up to filter duplicate content, how is it determined what content is shown and what content is filtered? The Google patent may shed a few clues as to what factors are being taken into consideration:
- Which page has the highest PageRank
- Which page is the oldest
- Which page is the most recent
- Which page is linked to BY more “authority sites”
- Which page links TO more “authority sites”
Got Duplicate Content? Now What?
If nothing else, be judicious about adding duplicate content pages to your web site, and definitely do NOT replicate the same exact content on more than one page of your site. Make sure that each content page, even if it addresses a similar topic, is unique. If you notice duplicate content on your site, remove the duplicate page, or re-write it.
If you are writing content to distribute to other sites for re-publication, try to write unique articles for different sites. Do not simply submit duplicate articles or information from your current site. When you do distribute these articles, submit them to a limited number of sites with good search engine rankings, good credentials, and an audience that matches the content of your article.
If you have multiple domains (e.g., mycoolsite.com, mykoolsite.com, mycoolsight.com) all pointing to one main site, only promote one URL to the search engines. Use a 301 (permanent) redirect on all other domains so that they forward to it.
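As a rough illustration of the 301 approach, here is a minimal Python sketch written as a WSGI application. The canonical host name, the `redirect_app` function, and the `simulate` helper are all assumptions made for this example. In practice, many sites set up the redirect in the web server or hosting control panel rather than in application code.

```python
from wsgiref.util import setup_testing_defaults

CANONICAL_HOST = "www.mycoolsite.com"  # assumption: the one domain you promote


def redirect_app(environ, start_response):
    """WSGI app: 301-redirect any non-canonical host to the canonical one."""
    host = environ.get("HTTP_HOST", "").split(":")[0].lower()
    path = environ.get("PATH_INFO", "/") or "/"
    query = environ.get("QUERY_STRING", "")
    if host != CANONICAL_HOST:
        # Keep the path and query string so deep links still land correctly.
        location = f"http://{CANONICAL_HOST}{path}"
        if query:
            location += f"?{query}"
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Welcome to my cool site!"]


def simulate(host, path="/"):
    """Call the app with a fake request and return (status, headers)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_HOST"] = host
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    redirect_app(environ, start_response)
    return captured["status"], captured["headers"]


status, headers = simulate("www.mykoolsite.com", "/index.html")
print(status, headers.get("Location"))
```

A request for the alternate domain is answered with a permanent redirect to the same path on the promoted domain, while requests for the promoted domain are served normally.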
If you have mirrored sites, you can keep the search engines from indexing more than one of them by using either a robots.txt file in the root directory of the domain or a noindex meta tag in the <head> section of each page you don’t want indexed.
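The sketch below shows what those two options might look like for a mirror domain and uses Python's standard `urllib.robotparser` module to check the robots.txt rules. The file contents, the domain, and the `is_crawl_allowed` helper are illustrative assumptions; adjust the rules to the pages you actually want excluded.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a mirror domain: ask all crawlers to stay out.
MIRROR_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# Alternative for individual pages: a noindex meta tag placed in the page head.
NOINDEX_META_TAG = '<meta name="robots" content="noindex">'


def is_crawl_allowed(robots_txt, url, user_agent="*"):
    """Check a URL against robots.txt rules using the standard library parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


allowed = is_crawl_allowed(MIRROR_ROBOTS_TXT, "http://www.mykoolsite.com/index.html")
print(allowed)
```

With these rules in place, a well-behaved crawler is told not to fetch any page on the mirror, which you can confirm before uploading the file.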
In general, use common sense. The search engines understand that content gets shared, that news articles get republished hundreds of times over, and that it’s good business practice to pass along helpful information you find to your site visitors or newsletter readers. Just proceed with caution: if it doesn’t smell good, look good, or feel good, it probably isn’t good.
RELATED LINKS:
Google’s patent for handling duplicate content:
Web Ad.vantage is a full-service online marketing company with core competencies in search engine optimization, PPC campaign management, and online media buying. Visit our Internet Marketing Services section to learn more about our full range of services.
WebAdvantage.net encourages the reprinting of our marketing tips and articles. Before doing so, however, please contact us at for permission to do so. The company bio located above is required to accompany any reprint. Thank you in advance for your professional courtesy.