Most readers don’t care about duplicate content. After all, when a user is done reading an article, he or she will usually close the tab and move on. The problem lies with Google’s crawlers. If you have a lot of duplicate content, Google’s bots might flag your website as spam, and Google may penalize your site. In the worst case, Google will not display your content at all.
Fortunately, there are easy ways to remove your duplicate content and avoid being penalized. Let’s get started:
Seven ways to remove duplicate content
1. 301 Redirect
One of the most popular ways to remove duplicate content is by using 301 redirects. Use this method carefully. A 301 is a permanent redirect, so your content does not go to a trash bin you can casually dig through later; as far as visitors and search engines are concerned, the old URL is gone. You can remove the redirect afterwards, but browsers and search engines may have cached it, so treat the move as permanent.
A 301 redirect doesn’t erase the page; it just tells Google to go to another page instead. However, the redirected page will no longer be indexed. A 301 redirect is the ideal method if you created multiple articles about one topic in the past and have decided to consolidate them into the main article.
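As a rough illustration, a 301 can be issued by the web server or the application itself rather than a plugin. The sketch below is a minimal WSGI app in Python; the paths in `REDIRECTS` and the name `app` are hypothetical and exist only for this example.

```python
from http import HTTPStatus

# Hypothetical mapping: old duplicate URLs -> the main article that replaces them.
REDIRECTS = {
    "/seo-tips-2018": "/seo-tips",
    "/seo-tips-old": "/seo-tips",
}


def app(environ, start_response):
    """Minimal WSGI app: answer 301 for consolidated pages, 200 otherwise."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        moved = HTTPStatus.MOVED_PERMANENTLY
        # The Location header tells browsers and crawlers where the content lives now.
        start_response(f"{moved.value} {moved.phrase}", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"main article"]
```

Any WSGI server (for example `wsgiref.simple_server` from the standard library) could serve this app; in practice the same mapping is often written as server configuration instead.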
2. Using Canonical Tags
The canonical tag, also known as the rel=canonical tag, is considered by many site owners to be the best method for removing duplicates. It was introduced by Google to trim out duplicate content for faster crawling. It tells Google’s crawlers that a specific page is the primary one.
Although it shares some similarities with the 301 redirect method, there are key differences between the two. A 301 is set up on the server (often through a plugin) and sends the user to the main page. The canonical tag, on the other hand, is simply added to the page’s HTML and does not redirect the user to the primary page.
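To make the tag concrete, here is a small Python sketch that builds the `<link rel="canonical">` element a duplicate page would carry in its head section. The helper name `canonical_tag` and the example URL are assumptions for illustration only.

```python
from html import escape


def canonical_tag(primary_url: str) -> str:
    """Return the canonical link element pointing at the primary page."""
    # Escape the URL so characters such as & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(primary_url, quote=True)}">'


print(canonical_tag("https://example.com/seo-tips"))
```

The resulting line goes inside the head section of every duplicate page, each one pointing at the same primary URL.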
3. 302 Redirect
302 redirects are used when you don’t want the duplicate page to be gone from your website forever. This redirect’s effect is only temporary, while the 301 is permanent. However, it is worth noting that a 302 left unchanged for a long time (six months is a commonly cited figure) may end up being treated as a 301.
Use 302 redirects when you only need to make the page unavailable to users for a short amount of time. Don’t use them if you plan to retire the content for good. Only use 302s if you’re planning to move to a new domain or if you’re testing out a new page feature, and you don’t want your rankings to be affected.
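At the HTTP level, the only difference between the two redirects is the status code sent with the `Location` header. The Python sketch below shows that choice; the function name `redirect` and the target path are hypothetical.

```python
from http import HTTPStatus


def redirect(target: str, permanent: bool = False):
    """Return a (status line, headers) pair for a temporary or permanent redirect."""
    # 302 Found is the temporary redirect; 301 Moved Permanently is the permanent one.
    status = HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND
    return f"{status.value} {status.phrase}", [("Location", target)]


# Temporarily send visitors to a page that is being tested.
print(redirect("/new-feature-test"))
```

Defaulting to the temporary code is a deliberate choice in this sketch: a 302 is easier to undo, so the permanent variant has to be requested explicitly.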
4. Google Search Console: Remove URLs
One of the most reasonable options for removing duplicate content without redirects is Google Search Console. Open Search Console (formerly Webmaster Tools) and use the “Remove URL” option. Although you initiate it yourself, you are not the one who removes the URL; the tool only submits a removal request.
Why would you want to use this feature instead of redirects? It’s an easier, less cluttered solution. Do take note that it is only a temporary fix, so you will still want a more permanent way to remove your duplicate content. Only use this if you’re going to remove tons of webpages and don’t have the time to do so manually.
5. Mark Page As A 404
A lot of people would argue that a 301 redirect is the best method for removing a duplicate. But if you truly want to treat an article as if it had never been written, your best bet is to have it return a 404. The error tells users that the page is no longer on the server. Pages that return a 404 are eventually dropped from search engine indexes.
Use this method with caution, as it will not transfer page authority to the main article. There is an alternative: you can send a 410 header, which can expedite the deletion of the page from the Google index. That said, the 301 redirect works in the majority of cases for a lot of webmasters.
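The difference between the two status codes can be sketched in a few lines of Python. The page sets and the function name `status_for` are hypothetical; the point is only that a deliberately removed duplicate answers 410 (Gone), while an unknown URL answers 404 (Not Found).

```python
from http import HTTPStatus

# Hypothetical site content for illustration.
LIVE_PAGES = {"/seo-tips"}
REMOVED_PAGES = {"/seo-tips-duplicate"}


def status_for(path: str) -> HTTPStatus:
    """Pick the HTTP status a server would send for the given path."""
    if path in LIVE_PAGES:
        return HTTPStatus.OK
    if path in REMOVED_PAGES:
        # 410 says the page was removed on purpose and will not come back.
        return HTTPStatus.GONE
    # 404 only says that nothing was found at this URL.
    return HTTPStatus.NOT_FOUND
```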
6. Using Meta NoIndex
But what if you still want site users to read the duplicate? What if, for some reason, you don’t want to erase the page completely? If you want a webpage to stay on your server but don’t want it to affect your SEO, your best bet is to use the meta noindex tag.
Using the noindex tag is very easy: just put it in the head section of the page’s source code. Search engines will stop indexing pages with this tag, without the page being redirected or deleted. However, the pages will also not affect your rankings positively. Just like with the other methods in the list, make sure that the page doesn’t carry anything essential, such as inbound links.
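The tag itself is a single line: `<meta name="robots" content="noindex">`. As a minimal sketch, the Python helper below inserts it into a page’s head section. The name `add_noindex` is hypothetical, and the code assumes a plain lowercase `<head>` tag without attributes, which real pages may not have.

```python
NOINDEX_TAG = '<meta name="robots" content="noindex">'


def add_noindex(html: str) -> str:
    """Insert the noindex tag right after the opening <head> tag, if absent."""
    if NOINDEX_TAG in html:
        return html  # already tagged, nothing to do
    marker = "<head>"  # assumption: plain lowercase tag with no attributes
    index = html.find(marker)
    if index == -1:
        raise ValueError("no <head> tag found")
    insert_at = index + len(marker)
    return html[:insert_at] + NOINDEX_TAG + html[insert_at:]
```

For anything beyond a quick experiment, the tag is usually added in the page template or through the CMS rather than by string manipulation.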
7. Use Robots.txt to Disallow Crawling
The robots.txt file is a special file that enables communication between the webmaster and search engine crawlers. Webmasters use this file to give crawlers instructions on how they want their pages to be crawled. You can use this file to specify the location of your sitemaps. It can also keep files such as .png or .pdf from appearing directly in public SERPs. For the same reason, the file is also used to keep duplicate content out of the search results. This is done by adding Disallow rules for the duplicate URLs to the site’s robots.txt file, so that search engines do not crawl them.
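To see how such rules are read, the sketch below feeds a small robots.txt to Python’s standard `urllib.robotparser` module and asks whether a URL may be crawled. The file contents, the `/print/` path, and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the printer-friendly duplicates under /print/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url: str, user_agent: str = "*") -> bool:
    """Return True if the rules above allow the given crawler to fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("https://example.com/seo-tips"))        # main article
print(is_crawlable("https://example.com/print/seo-tips"))  # blocked duplicate
```

Note that this only mirrors how a well-behaved crawler interprets the file; whether a given search engine honors every rule is up to that engine.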