How to Overcome the Excluded Status in the Index Coverage Report


In a previous post, we discussed the Valid and Valid with warnings statuses shown in the Index Coverage Report in Google Search Console. This time, we turn to the Excluded status.

Google Search Console (image © Google Search Console)

Before we learn more about the Excluded status in the Google Search Console Index Coverage Report, let's first get to know two terms that come up often:

#1: Canonical URL (unique URL) is the preferred URL of a page, which helps webmasters prevent duplication issues when creating content/articles. Each page should have exactly one canonical URL that distinguishes it from other page URLs.
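In practice, the canonical URL is declared with a link tag in the page's <head>; the URL below is a hypothetical example:

```html
<!-- Tells search engines which URL is the "official" version of this page. -->
<link rel="canonical" href="https://example.com/my-original-post.html" />
```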

#2: Robots.txt is a text file that tells search engines what to crawl and what not to crawl. When you publish a new page or post on your website, search engine robots crawl the content so it can be indexed and shown in Google search results. If there are parts of your website that you don't want indexed, you can tell the robots to skip them so they don't show up on the results page.
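A minimal robots.txt sketch, with hypothetical paths, looks like this:

```
# robots.txt - lives at the site root, e.g. https://example.com/robots.txt
User-agent: *          # these rules apply to all crawlers
Disallow: /search/     # don't crawl internal search result pages
Allow: /               # everything else may be crawled
```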

Now let's go through the indexing categories most commonly found under the Excluded status in Google Search Console, starting with the most frequent, along with how to fix each of them:

Excluded status in search console

1. Blocked By Robots.txt

Blocked by robots.txt. If the Index Coverage report places a URL in this status, the page is not indexed by Google because the URL is blocked by the robots.txt file.

This also means that Google has not found signals strong enough to justify indexing the URL anyway; if it had, the page URL would instead appear under Valid with warnings as "Indexed, though blocked by robots.txt".

Problem Solution:

There are several fixes worth trying to overcome this status:

  • Check the robots.txt settings in your blog dashboard to see whether you intentionally blocked (Disallow) the page URL from being crawled by Google's search engine.
  • If there is no intentional blocking rule, test the URL with the robots.txt Tester to find the rule that is blocking it; you can also test it yourself, as in the sketch after this list.
  • Ask Google to recrawl the page manually: open the URL Inspection tool in Search Console, enter the URL flagged with this status, and click Request Indexing.
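If you'd rather check robots.txt rules from your own machine, here is a minimal sketch using Python's standard urllib.robotparser module; the domain and page URL are hypothetical placeholders:

```python
from urllib import robotparser

# Hypothetical site - point this at your own robots.txt.
rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

# Ask whether Googlebot may fetch the page flagged in the report.
print(rp.can_fetch("Googlebot", "https://example.com/my-post.html"))
```

If this prints False, one of your robots.txt rules is blocking the URL.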

2. Crawled - Currently Not Indexed

Crawled - currently not indexed is one of the most actionable reports in Search Console, but Google doesn't tell you why the URL wasn't indexed.

Some things that might cause Crawled - currently not indexed are:

Poor internal linking structure: the page may have broken or removed internal links, or even no internal links pointing to it at all.

Poor content quality: according to Google, the page's content quality may be inadequate, for example too low a word count, or content that is not unique enough or not clearly different from other content on the same theme (niche).

Duplication: the content is considered a duplicate of, or too similar to, content that Google has already crawled on other websites.

Problem Solution:

If you find the page really is important and should be indexed, take action to address it right away, such as:

  1. Fix and add internal links on the page, and build external links (backlinks) to it, as in the sketch after this list.
  2. Improve your content by writing more unique words from your own thoughts rather than copying and pasting, and aim for at least 600 words.
  3. Share your posts and pages on various social media to get backlinks.
  4. Submit your site to other search engines such as Bing and Yahoo.
  5. Re-index, or ping, the URL of the content page.
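For point 1, an internal link is just an ordinary anchor tag with descriptive text, placed on a related page of the same site and pointing at the page you want indexed; the URL below is a hypothetical example:

```html
<!-- On a related post, link to the page stuck in "Crawled - currently not indexed". -->
<p>
  For a step-by-step walkthrough, see our guide to
  <a href="https://example.com/reading-the-index-coverage-report.html">
    reading the Index Coverage Report</a>.
</p>
```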

3. Discovered - Currently Not Indexed

Discovered - currently not indexed. This status means that Google's crawler has found the URL of the page, but Google has chosen not to crawl and index it yet.

Some things that might cause Discovered - currently not indexed are:

Server overload: Google is having problems crawling your site because the server appears to be overloaded. Check with your hosting provider whether this is the case.

Excessive content: your website/blog contains more content than Google can crawl at the time, so Google's crawler has decided it is not interested in crawling the page for now.

Poor internal linking structure: the page may have broken or removed internal links, or even no internal links pointing to it at all.

Poor content quality: according to Google, the page's content quality may be insufficient, for example too low a word count, or content that is not unique enough or not clearly different from other content on the same theme (niche).

Problem Solution:

If you find the page really is important and should be indexed, take action to address it right away, such as:

  1. Trim thin content and make the remaining content more unique if you want Google to crawl and index it, or remove the links that point to a low-value URL and update the robots.txt file to prevent Google from accessing that URL again (see the sketch after this list).
  2. Improve your content by writing more unique words from your own thoughts rather than copying and pasting, and aim for at least 600 words.
  3. Share your posts and pages on various social media to get backlinks, which makes Google more interested in crawling and indexing them.
  4. As a last step after fixing the content, create a new permalink, remove the old one, and request re-indexing of the new permalink.
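For point 1, blocking a low-value URL from being crawled is a one-line rule in robots.txt; the path below is a hypothetical example:

```
# robots.txt - stop all crawlers from fetching one low-value page
User-agent: *
Disallow: /old-thin-page.html
```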

4. Duplicate Without User-Selected Canonical

Duplicate without user-selected canonical. Google finds and treats these URLs as duplicates; even if you canonicalize a URL to a version of your own choosing, Google may ignore your choice and decide to exclude the URL from indexing.

This can happen when a website's content is nearly or exactly the same as that of several competing websites in the same niche as yours. Google doesn't assess similarity only between content with the same language and wording; it can also compare content against pages in a different language and still consider it a duplication.

It could be that your content is entirely original, the product of your own thinking, and that others have even copied and pasted it from your blog/website, yet your content is still judged to be the duplicate of theirs because you lose out in terms of backlinks and visitor numbers, so Google chooses to ignore your content instead.

Problem Solution:

Add a canonical URL pointing to the preferred version of the page, for example the product details page (the link tag was shown earlier). If this URL should not be indexed at all, implement a noindex directive via the robots meta tag instead, as sketched below. You can then use the URL Inspection tool to see which URL Google has selected as the canonical.
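A minimal sketch of the noindex option, placed in the page's <head>:

```html
<!-- Keeps this page out of Google's index entirely. -->
<meta name="robots" content="noindex" />
```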

5. Alternate Page With Proper Canonical Tag

Alternate page with proper canonical tag means that Google considers the URL of the page to be a duplicate of, or similar to, the URL of another page, so Google chooses not to crawl and index it.

Each page has one canonical URL, which prevents duplicate page URLs. Here, Google considers the URL we created to be similar to the URL of another page, so it prefers to index the canonical URL of the other page and not ours, which is why the exclusion report above appears.

Problem Solution:

Change the URL of the page so that it is a separate canonical URL, distinct from the others. Also keep an eye on the number of pages you have: if you notice a large increase in pages while your site/blog hasn't actually grown that much, you are potentially facing internal-link issues that waste crawling.

6. Blocked Due To Unauthorized Request (401)

Blocked due to unauthorized request (401). This URL is not accessible to Google: when it tries to crawl the page, Google receives an HTTP 401 response, which means it is not allowed to access the URL.

Problem Solution:

Make sure you are not blocking Googlebot from crawling the page: either grant it authorization or remove the authentication requirement so that Googlebot can access your web/blog. You can check what the crawler receives yourself, as sketched below.
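One way to see what a crawler-like request receives is to fetch the URL with Googlebot's user-agent string; here is a minimal sketch with Python's requests library (the URL is hypothetical, and some servers treat the real Googlebot differently, so this is only an approximation):

```python
import requests

# Hypothetical URL - replace with the page flagged in the report.
url = "https://example.com/members-only-page.html"

# Approximate what Googlebot sees by sending its user-agent string.
headers = {"User-Agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"}

resp = requests.get(url, headers=headers, timeout=10)
print(resp.status_code)  # 401 means the page still demands authentication
```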

7. Blocked Due To Access Forbidden (403)

Blocked due to access forbidden (403). This means that when Googlebot performs a crawl, it is not allowed to access this URL and receives an HTTP 403 response code.

Problem Solution:

Make sure Google (and other search engines) have unrestricted access to the URLs you want indexed. For URLs in this report that you don't want indexed, implement the noindex directive instead (either in the HTML source or in the HTTP headers, as sketched below).
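Setting noindex through an HTTP header is done with X-Robots-Tag. A minimal sketch for an Apache server (this assumes .htaccess is honored and mod_headers is enabled; the file name is hypothetical):

```apacheconf
# .htaccess - send a noindex header for a file that must stay out of the index
<Files "private-report.pdf">
  Header set X-Robots-Tag "noindex"
</Files>
```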

8. Pages With Redirects

Page with redirect. These URLs are read by Googlebot as redirected URLs, and are therefore not indexed by Google.

Problem Solution:

Remove the old URL and replace it with the new one, then request indexing for the new page URL. Submit the old URL to the removals tool so that it is not crawled again by Googlebot, and set up a redirect so that the old URL points to the new one, as sketched below.
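On an Apache server, the redirect itself is a single line in .htaccess (an assumption; other servers have equivalents, and both paths here are hypothetical):

```apacheconf
# .htaccess - permanently redirect the old URL to the new one
Redirect 301 /old-post.html https://example.com/new-post.html
```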

9. Submitted URL Not Selected As Canonical

Submitted URL not selected as canonical. You submitted this URL via an XML sitemap, but Google considers it a duplicate of another URL, and has therefore canonicalized it to a canonical URL handpicked by Google.

Problem Solution:

The fix for this status is essentially the same as the solution for the Alternate Page With Proper Canonical Tag status above.

10. Excluded By 'Noindex' Tag

Excluded by 'noindex' tag. This URL has not been indexed by Google because of a noindex directive (either in the HTML source or in the HTTP header).

Problem Solution:

Check the custom robots tags in your blog dashboard (for example, the tags applied to search pages) to see whether you have set a noindex tag. If a noindex tag is set and you want the page indexed, remove or change the tag. A quick way to check for the directive is sketched below.
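Here is a minimal sketch for spotting the directive from your own machine using Python's requests library (a hypothetical URL, and a naive substring check rather than a full HTML parse):

```python
import requests

url = "https://example.com/my-post.html"  # hypothetical URL
resp = requests.get(url, timeout=10)

# The directive can arrive as an HTTP response header...
print("X-Robots-Tag header:", resp.headers.get("X-Robots-Tag"))

# ...or as a robots meta tag inside the HTML source.
if "noindex" in resp.text.lower():
    print("Found 'noindex' in the page source - inspect its robots meta tag.")
```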

11. Crawl Anomaly

Crawl anomaly. Google is having problems indexing these URLs, possibly because it is receiving response codes in the 4xx and 5xx ranges other than the types listed elsewhere in the Index Coverage report.

Problem Solution:

Re-inspect the URL and review the latest result, then request indexing directly. If the re-inspection shows no problems, simply submit an indexing request and the error should disappear on its own.

12. Blocked by Page Removal Tool

Blocked by page removal tool. This URL is currently not showing in Google search results because of a URL removal request. A removal request only hides the URL from Google search results for about 90 days; after that period, Google can bring the URL back to the surface.

The URL removal request feature should only be used as a quick, temporary measure to hide URLs. We recommend taking additional measures to prevent these URLs from reappearing at all.

Problem Solution:

Send Google a clear signal not to index this URL by adding a noindex directive (via the robots meta tag shown earlier), and make sure the URL is recrawled before the 90 days expire. Alternatively, add the deleted URL to the custom Errors and Redirects settings in your blog dashboard.

13. Blocked Due To Other 4xx Issue

Blocked due to other 4xx issue. Google can't access these URLs because it receives a 4xx response code other than 401, 403, or 404. This can happen with a malformed URL, for example, which sometimes returns a 400 response code.

Problem Solution:

There is no single right fix for this status. Run Test Live URL in the URL Inspection tool to see the exact response code, fix whatever 4xx error it reveals, and then click Request Indexing.

14. Not Found (404)

Not found (404). This URL wasn't included in the XML sitemap, but Google found it anyway and couldn't index it because it returns an HTTP 404 status. Google may have found this URL through another site, or the URL may have existed before and since been deleted.

Problem Solution:

If this URL no longer has page content, we recommend redirecting it to another page URL, either with a 301 redirect (as sketched in the Pages With Redirects section above) or directly via the Errors and Custom Redirects settings.

15. Page Removed Because Of Legal Complaint

Page removed because of legal complaint. This URL has been removed from Google's index because of a legal complaint.

Problem Solution:

Someone has filed a report requesting that your URL be removed from Google's index, so make sure the content really is your own original work.

16. Soft 404

Soft 404. This URL is considered a soft 404, meaning that it does not return an HTTP 404 status code, but its content gives the impression that it is actually a 404 page, for example by displaying a "Page could not be found" message. Alternatively, this error can be caused by a redirect to a page that Google deems insufficiently relevant.

Problem Solution:

If this URL really is a missing page, make sure it returns a proper HTTP 404 status code (as sketched below), and then take the actions described under Not Found (404) above.
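The key is that the server sends a real 404 status along with the "not found" page, instead of a 200. A minimal sketch in Python with Flask (Flask is an assumption here; any web framework exposes the same idea):

```python
from flask import Flask

app = Flask(__name__)

@app.errorhandler(404)
def not_found(error):
    # Return the "not found" page together with a genuine 404 status code.
    # Serving this page with a 200 status is exactly what creates a soft 404.
    return "<h1>Page could not be found</h1>", 404

if __name__ == "__main__":
    app.run()
```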

Those are the problems most often encountered under the Excluded status category of the Index Coverage Report in Google Search Console, and how you can solve them if you run into these errors.

