There’s no doubt about it – content is king on the web. Entire search engine optimization strategies are built around the ability to write high-quality, clear, engaging content in order to rank higher on Google and, to a lesser extent, other search engines.
However, the rush to not only produce high-quality content but also get it ranking quickly has led to the development of many shortcuts in the content creation world – some more detrimental than others. Plagiarism, or using someone else’s work without giving them proper credit, has become a major concern among search engine optimization professionals, content agencies and writers alike.
In this article, we’ll take a closer look at how plagiarism and SEO are related, and how duplicate content can harm your website’s ranking both in the short and long term.
Before we get into the nitty-gritty of SEO, it’s important to understand what duplicate content is, exactly, because it’s not just copying and pasting large swaths of content from someone else’s site (although it is that, too).
If a content creator is tasked with creating several pages, they may use AI to paraphrase existing text for them, changing words here and there to try to sidestep plagiarism detectors and AI writing detectors. However, new content that is largely similar to existing content will still be considered duplicate content, even if it isn’t a word-for-word reproduction.
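To see why swapping a few words doesn’t make content unique, here is a minimal sketch that compares two passages by their overlapping three-word sequences (often called “shingles”). This is an illustration only, not how Google or any particular plagiarism detector works; the function names and sample sentences are made up for the example.

```python
from __future__ import annotations

import re


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole thing as a single unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a: str, b: str, size: int = 3) -> float:
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical sample passages, invented for this illustration.
original = "Our widget is built from recycled steel and ships free anywhere in the country."
paraphrase = "Our widget is built from recycled steel and ships for free anywhere in the nation."
unrelated = "The quarterly report shows revenue growth across every region."

print("paraphrase score:", jaccard_similarity(original, paraphrase))
print("unrelated score:", jaccard_similarity(original, unrelated))
```

The lightly reworded passage still shares most of its word sequences with the original, while the unrelated passage shares none. Real detectors are far more sophisticated, but the principle is the same: small word swaps leave most of the text recognizable.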
What many website owners don’t realize is that boilerplate content can also be considered duplicate content. A common example is manufacturers’ product descriptions, which are prevalent on reseller and e-commerce sites across the web.
Search engines, and Google in particular, are focused on delivering the most relevant, high-quality and unique results they can, so that searchers have an enjoyable and informative experience and keep coming back.
When a search engine “robot” (the script that crawls the web looking at content and indexing sites) finds multiple versions of the same content, it has to consider things like:
On its own, Google doesn’t penalize sites for duplicate content. Instead, it filters the duplicates out, showing a single version in the search results. Even if you wrote the content first, if the offending site (or sites) has written more concretely on the topic using your plagiarized content, your own page may not get the visibility it deserves, which can cause a substantial drop-off in traffic and potential orders.
When someone plagiarizes your website content, the duplicate content can have a ripple effect on many different areas of your business, including:
Loss of Trust from Consumers - If customers see the same content on your site that they’ve seen elsewhere, they’ll be less inclined to trust your business or even visit your site again.
Less Visibility in the Search Engines - As search engines become more adept at spotting duplicate content (even paraphrased content), the material that was plagiarized moves further and further down the search results (if it’s visible at all).
Legal Issues - Using copyrighted content without permission isn’t just harmful to your search engine ranking; it can cost you legally as well.
Whether you’re a writer, content creator, website owner or marketing agency, preventing plagiarism and duplicate content from affecting your SEO is paramount to getting and maintaining your ranking. Here’s how to do it:
If you work with writers or are in charge of a team of content creators, always emphasize the importance of creating original content. This means swift repercussions for using plagiarized material, rewards for high-quality content, and clear-cut objectives and processes for creating your content.
Run your content through Originality.AI to check for plagiarized content as well as potentially AI-written content before you publish it.
Canonical tags are your way of telling Google which page is the authority if there are multiple versions. Because you’re likely tracking your different advertising initiatives to see where your traffic is coming from, you might have URLs like:
www.example.com/example?refID=facebook
www.example.com/example?refID=tiktok
www.example.com/example?refID=instagram
And so on. All of these URLs show the same content; the only difference is where the traffic is coming from. However, the search engines can see each URL as a separate page carrying the same content, repeated many times. To tell the search engines “this page is the authoritative version” even if there are multiple versions, you’d place the following in between the <head> tags of the page:
<link rel="canonical" href="https://www.example.com/example" />
This lets the search engine know that even if there are several variations of the Example page, the one above is the one it should consider displaying in its results pages.
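If your pages are generated by a script, the canonical tag can be built automatically. The sketch below uses Python’s standard urllib.parse module and assumes the tracking value is passed as a query-string parameter named refID (for example, ?refID=facebook). The TRACKING_PARAMS set and both function names are our own choices for illustration, so adjust them to match your own tracking setup.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these are the query parameters used only for tracking.
TRACKING_PARAMS = {"refid"}


def canonical_url(url: str) -> str:
    """Strip tracking parameters (and any #fragment) from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> tag for the cleaned-up URL."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


for source in ("facebook", "tiktok", "instagram"):
    print(canonical_link_tag(f"https://www.example.com/example?refID={source}"))
```

All three tracked URLs produce the same tag, which is the point: however visitors arrive, the search engine is pointed at a single authoritative address. Parameters that actually change the content (such as a page number) are left in place.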
Google Search Console is a way for website owners to monitor the performance of their site and diagnose any issues related to how Google searches, crawls and indexes their pages. One of the most common issues Google will encounter when it comes to duplicate content is thinking that the www and non-www versions of a site are two different versions.
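Older versions of Google Search Console let you specify either the www or non-www version as the “preferred domain”. Google has since retired that setting, so the preference is now typically signaled with canonical tags or a redirect from one version to the other. Beyond this, there may be several other factors that affect the crawlability of your site or whether Google believes there is more than one version of your content.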
Search Console also used to offer a URL Parameters tool for changing how Google “understands” these pieces of information, but that tool has since been retired. It was only ever meant for people well-versed in making those edits, as one wrong configuration could cause Googlebot to index or crawl a site incorrectly or even miss entire sections of it as it crawled.
Last but not least, try to avoid publishing syndicated content if you can. If you have to include it, put a link back to the original article and add a “noindex” robots meta tag to your page so Google will know not to index your version.
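The “noindex” instruction is normally added as a robots meta tag in the page’s <head>, in the form <meta name="robots" content="noindex">. As a quick sanity check before publishing, the sketch below uses Python’s built-in html.parser to confirm a page actually carries it. The class and function names are our own, and the sample HTML is invented for the example.

```python
from __future__ import annotations

from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical syndicated page, invented for this illustration.
syndicated_page = """
<html><head>
  <meta name="robots" content="noindex, follow">
</head><body>
  <p>Originally published at <a href="https://www.example.com/example">Example</a>.</p>
</body></html>
"""

print(has_noindex(syndicated_page))
```

This only inspects the HTML itself. The same instruction can also be sent as an HTTP response header, which this sketch does not check.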
So far, we’ve talked about what plagiarized content is, how plagiarism can hurt your ranking and your business reputation, and how to prevent it from happening if you work alongside other writers and content creators.
But what if you are the one that’s plagiarized? Although there’s no 100% guaranteed way to prevent all types of plagiarism from happening all of the time, there are some steps you can take to minimize the chances of your content appearing on others’ sites and being labeled as duplicate content in the search engines.
Before you do anything else, document the instances of plagiarism. Take screenshots, copy URLs and write down the date(s) you accessed the page. Keep these records on Google Drive or Dropbox rather than only on your local hard drive, as you’ll need them as evidence if it comes down to proving copyright infringement.
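If you’d like to keep those records in a consistent format, a small script can help. The sketch below is one possible layout, not a legal standard: it stores the offending URL, your original URL, the date and time of access, and a SHA-256 fingerprint of the copied text so you can later show the saved copy hasn’t been altered. The function name, field names and sample values are all hypothetical.

```python
from __future__ import annotations

import hashlib
import json
from datetime import datetime, timezone


def evidence_record(
    infringing_url: str,
    page_text: str,
    original_url: str,
    accessed_at: datetime | None = None,
) -> dict[str, str]:
    """Build one evidence entry for a suspected copy of your content."""
    accessed_at = accessed_at or datetime.now(timezone.utc)
    return {
        "infringing_url": infringing_url,
        "original_url": original_url,
        "accessed_at": accessed_at.isoformat(),
        # Fingerprint of the copied text as you saw it on that date.
        "sha256": hashlib.sha256(page_text.encode("utf-8")).hexdigest(),
    }


# Hypothetical values, invented for this illustration.
record = evidence_record(
    infringing_url="https://copycat.example/stolen-post",
    page_text="Text copied from the offending page goes here.",
    original_url="https://www.example.com/example",
)
print(json.dumps(record, indent=2))
```

Save the output alongside your screenshots in the same cloud folder so everything for a given incident stays together.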
The next step is to reach out to the website owner. Sometimes they won’t even be aware that the content that’s on their page has been copied from elsewhere, particularly if they work with a roster of content creators.
Let them know about the situation and ask them to either remove the content or provide a link back to the original (yours). Be polite but firm at this stage since it’s entirely possible that the plagiarism was unintentional and they may want to resolve it as quickly as possible to avoid any legal repercussions.
If the site owner doesn’t respond or they aren’t cooperative, you can take the matter to their web host. Online tools like https://digital.com/who-is-hosting-this/ will tell you which company is hosting a particular website. Most, if not all, web hosting companies do not want to be in a position where they could be involved in legal issues. In addition, their terms of service often prohibit plagiarism, with the end result being that the offending content will be removed.
If you’re in the United States or the offending site is hosted in the U.S. (or a country with similar copyright protections), you can issue a DMCA (Digital Millennium Copyright Act) takedown notice. This formal document states that the material is copyrighted and is being used on a third-party website without permission. When most hosting companies and search engines receive this notice, they’re generally obligated to remove the content.
Many law firms and online resource repositories have DMCA takedown notice templates that you can download, fill out and use for free or for a very low cost.
If none of the above gains any traction, the next step may be legal action. This is particularly common if the offending content harms the company or its employees in terms of financial losses or loss of reputation in some way. If you decide to pursue this route, it’s advisable to consult with a copyright attorney to learn what your options may be.
There are some ways to protect your content from being copied in the first place. Watermarking images is a common way to protect them and is often used because the process of removing the watermark can be difficult and time-consuming. Plain text has no such protection. You can, however, set up a Google Alert for snippets of text that are directly identifiable as yours and get notified when this text appears elsewhere. This will let you address any potential plagiarism before it becomes widespread.
There are also other content protection tools, such as third-party scripts that disable the user’s ability to right-click (to avoid showing the context menu to copy) or even highlight and select text. Keep in mind that these tools can easily be bypassed by turning off such scripts. What’s more, the ability to select text isn’t just used by would-be plagiarizers, but also by screen readers and other assistive devices.
Beyond that, disabling core functions (such as the context menu) can hinder a user’s ability to freely navigate your site. We recommend avoiding these types of scripts, as they often do more harm than good when it comes to creating a positive user experience.
Now that you have a better idea of how plagiarism and duplicate content can affect your search engine optimization initiatives, the next step is to check your website’s content. Try Originality.AI’s comprehensive plagiarism scanner and AI writing detector now for as low as 1 cent per 100 words scanned.
You can choose a plagiarism scan or AI writing detection on a case-by-case basis to ensure that the content you produce (or content that’s written for you) is unique, authoritative and original – the very things Google looks for when deciding which content should rank highest in the search engines. Take steps today to protect your content, assure your ranking and keep your site’s reputation as the go-to source for information in your chosen niche or industry.