Debunking Duplicate Content Myths
Many of you have probably heard about the penalties Google applies when it discovers duplicate content on your pages. Your site may lose its ranking, or even get deindexed.
Then again, you’ve also heard that you can copy an article word for word from another website and nothing bad will happen. So does big G care about duplicate content or not?
--- What is duplicate content in the eyes of Google? ---
The first generation of spiders compared strings A and B like this:
A – “The duplicate content is a myth”
B – “The duplicate content is a myth”
Content counted as duplicate when most of the strings on a page were exactly the same as strings the spiders already knew.
The first attempt to get around this was to create a new string with some extra punctuation, like
C – “The. duplicate. content, is A myth”
String C was different from A, but that trick lasted only several months, until Google improved its search algorithm.
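The comparison of A, B and C can be sketched in a few lines of Python. This is only an illustration of the idea, not how Google's spiders actually work. The helper name `normalize` and the decision to strip punctuation and lowercase the text are assumptions made for the example.

```python
import string

A = "The duplicate content is a myth"
B = "The duplicate content is a myth"
C = "The. duplicate. content, is A myth"


def normalize(text: str) -> str:
    """Lowercase the text, drop ASCII punctuation, collapse whitespace."""
    no_punct = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.lower().split())


# Naive, first-generation style check: exact string equality.
print(A == B)  # True: B is an exact copy of A
print(A == C)  # False: the extra punctuation hides the copy

# Once both strings are normalized (an assumed fix), the trick stops working.
print(normalize(A) == normalize(C))  # True
```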
Around the same time, the spiders started to recognise ‘Yoda’ language,
so the string
D – “Myth is the duplicate content” became classified as a duplicate of A.
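One simple way to see why reordering the words does not help is to compare the sets of words rather than the strings themselves. The sketch below uses Jaccard similarity on word sets. That is a stand-in technique chosen for illustration, and the function names `word_set` and `jaccard` are hypothetical. Nothing here is confirmed to be what Google uses.

```python
def word_set(text: str) -> set[str]:
    """Return the set of lowercase words in the text."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Share of distinct words the two strings have in common (0.0 to 1.0)."""
    words_a, words_b = word_set(a), word_set(b)
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


A = "The duplicate content is a myth"
D = "Myth is the duplicate content"

score = jaccard(A, D)
print(round(score, 2))  # 0.83: five of the six distinct words are shared
```

Word order plays no part in the score, so a 'Yoda' rewrite stays very close to the original.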
The latest generation of spiders recognises much more, including full phrases, and …
E – “Repeating content is the bogus” may also be a duplicate of A.
If the spiders have the original article stored in the data center, they can spot rewritten news and other data pretty well. The rule of thumb is that the first published article is considered the original, and the rest are duplicates. The time when a spider first met the article is crucial; however, if the article is republished by a website of established authority, that website may sometimes be the winner.
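The rule of thumb above can be written down as a small decision function. Everything in this sketch is hypothetical: the `Copy` record, the `authority` score, the `authority_override` threshold, the example URLs and the dates are all invented for illustration. The real choice is also not as clear-cut as this code makes it look, since an authoritative site only sometimes wins.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Copy:
    url: str              # hypothetical page address
    first_seen: datetime  # when the spider first met this copy
    authority: float      # made-up trust score, 0.0 to 1.0


def pick_original(copies: list[Copy], authority_override: float = 0.9) -> Copy:
    """Pick the copy treated as the original.

    Earliest first_seen wins, unless some copy has an authority score at or
    above authority_override (an assumed threshold), in which case the most
    authoritative such copy wins.
    """
    trusted = [c for c in copies if c.authority >= authority_override]
    if trusted:
        return max(trusted, key=lambda c: c.authority)
    return min(copies, key=lambda c: c.first_seen)


copies = [
    Copy("small-blog.example/post", datetime(2009, 3, 1), 0.2),
    Copy("scraper.example/post", datetime(2009, 3, 5), 0.1),
]
print(pick_original(copies).url)  # small-blog.example/post

# A late republication on a high-authority site takes over in this sketch.
copies.append(Copy("big-news.example/post", datetime(2009, 3, 9), 0.95))
print(pick_original(copies).url)  # big-news.example/post
```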
--- The punishment ---
One fact is obvious: no one needs many duplicates of the same article, and the spiders cannot store the same information over and over again.
One of the biggest problems in the tourism industry is that hotel data is exactly the same everywhere: there are not many creative ways to say that a hotel room has a shower and a plasma TV set. Likewise, the height of the Empire State Building or the date the Statue of Liberty was created is the same in every tourist guidebook. And a simple rewrite does not necessarily work.
A few years ago all duplicate content was considered spam, and many new websites were deindexed without any chance of ranking, or lost their rank for long months.
The usual recipe was to make, say, over 40% of the page unique to avoid the penalty.
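To check a page against that kind of recipe, you could estimate what share of it is new. The sketch below counts sentences that are not already known. It assumes that sentences are a reasonable unit and that a full stop ends each one, and the function name `unique_share` and the sample hotel text are made up. The 40% figure is the folk recipe mentioned above, not a documented Google threshold.

```python
def unique_share(page: str, known_sentences: set[str]) -> float:
    """Fraction of the page's sentences that are not already known."""
    sentences = [s.strip().lower() for s in page.split(".") if s.strip()]
    if not sentences:
        return 0.0
    unique = [s for s in sentences if s not in known_sentences]
    return len(unique) / len(sentences)


# Sentences assumed to be in the index already (lowercased, no full stop).
known = {
    "the room has a shower",
    "the room has a plasma tv set",
}

page = (
    "The room has a shower. The room has a plasma TV set. "
    "We stayed here in May and loved the quiet courtyard. "
    "The staff helped us book a late train. "
    "Breakfast on the roof terrace was the highlight."
)

share = unique_share(page, known)
print(share)         # 0.6: three of the five sentences are new
print(share > 0.40)  # True: the page clears the "over 40%" recipe
```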
Some two years ago you could again see many duplicate pages in the Google index without visible signs of penalties. Google PR guru Matt Cutts announced that they don’t punish legitimate duplicates, but are still severe with spammers.
For now, this is the official statement.
--- Doesn’t it hurt at all? ---
I have used the word punishment several times, and Google used to behave in a way that was easy to spot. Today, from what I see, in many cases they don’t apply a filter, a ban, or any other visible form of pain. The spiders just tend to ignore mistakes, or tricks meant to game the system.
Although duplicate content may be indexed by Google, it is hardly ever seen in the search results, as the duplicates are considered worthless. Some duplicate pages on high-authority sites may rank well, while the same pages on other sites are ignored.
--- Yes, it does ---
As you can see, Google has plenty of instruments to spot content that is not original, and it keeps improving the ways it ranks only fresh, original articles.
Some duplicates on your website may not hurt your rankings, but on the other hand they won’t help either.