A reader of my other blog, Web-World Watch, left this link, http://www.copyscape.com/, on a post about Google dinging sites for showing duplicate content.
I entered my own blog address in this tool and found that some sites had snatched my blog content verbatim, without supplying a link back or even identifying me as the author. In fact, they had passed the content off as their own, and had picked some of my highest-traffic posts!
I have notified them of copyright infringement! You should check your own content to see if you have a similar problem. If you are like me, you don’t mind if others quote you, show one or two paragraphs of your post with a link back to the full content, or contact you for approval. But to simply snatch content, provide no link back, and pass it off as their own intellectual property? Very bad form!
This is exactly the duplicate content problem that Google is targeting in one of its most recent patent disclosures: who should get the credit for duplicate content? Google is developing a way to identify the original author in a case like this. I would imagine that this will revolve around the initial post date recorded by the web server, plus how well the content matches the other content and writing style on the site. Eventually, I expect to see a trust certification that site owners can embed on their pages to tag their content for Google.
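To make my guess concrete, here is a minimal Python sketch of the "earliest post date wins" idea. This is my own illustration, not Google's method and not how Copyscape works. The function names, the five-word shingle size, the 0.8 threshold, and the example URLs are all assumptions I picked for the example, and the writing-style factor is not modeled at all.

```python
from datetime import date

def shingles(text, k=5):
    """Return the set of k-word sequences in text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Overlap between two shingle sets, from 0.0 (none) to 1.0 (identical)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def likely_original(query, others, threshold=0.8):
    """Guess which page is the original.

    Each page is a (url, first_seen_date, text) tuple (a hypothetical format).
    Pages whose text overlaps the query page by at least `threshold` are
    treated as copies of the same post, and the earliest date wins.
    """
    q = shingles(query[2])
    cluster = [query] + [p for p in others if jaccard(q, shingles(p[2])) >= threshold]
    return min(cluster, key=lambda p: p[1])[0]

post = "Google is dinging sites for showing duplicate content and here is why it matters"
mine = ("https://example.com/my-post", date(2007, 3, 1), post)
copy = ("https://example.org/scraped", date(2007, 3, 9), post)
unrelated = ("https://example.net/garden", date(2007, 1, 1),
             "how to grow tomatoes in a small garden during the spring season")

print(likely_original(copy, [mine, unrelated]))
```

Even though the unrelated page is older, it does not overlap with the post, so it is ignored. The obvious weakness is that a date is only as trustworthy as whoever recorded it, which is why I think something like a trust certification would still be needed.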
In the meantime, if you are scraping someone else’s content from their blog, please stop! It’s time to create your own. And if you aren’t scraping, check Copyscape.com to see if someone is scraping yours.