Google changes: recently I see more of “discovered – currently not indexed”

Has Google become more conservative about indexing content from personal websites? I think we might see less and less high-quality but low-traffic content in Google search results.

I have carefully done basic search engine optimization for my personal website since some time before 2010. For example, I have tooling in place to generate an XML sitemap, encouraging search engines to keep track of my individual pages/documents.
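For readers who have never built one: a sitemap is just a small XML file listing URLs and, optionally, when they last changed. The following is a minimal sketch of such a generator, not my actual tooling. The function name `build_sitemap` and the example.com URLs are made up for illustration; only the namespace and the `urlset`/`url`/`loc`/`lastmod` element names come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return a sitemap XML string.

    `pages` is an iterable of (url, last_modified) tuples, where
    last_modified is a datetime.date. (Hypothetical helper.)
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes special characters such as "&" in the URL.
        ET.SubElement(entry, "loc").text = url
        # The protocol accepts W3C datetime; YYYY-MM-DD is a valid form.
        ET.SubElement(entry, "lastmod").text = lastmod.isoformat()
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder pages; a real setup would enumerate the site's posts.
    print(build_sitemap([
        ("https://example.com/", date(2023, 1, 15)),
        ("https://example.com/blog/some-post", date(2023, 1, 10)),
    ]))
```

The resulting file is typically served at the site root and referenced from robots.txt or submitted in Search Console, so that crawlers learn about new pages without having to find links to them first.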

Via Google Search Console, I have observed over the years that this process worked: after publishing a blog post, it was usually visited and indexed by Google within a day at most. The explicit confirmation, of course, is when an anonymous search yields my content; I often tried that to check directly whether my setup was yielding satisfying results. It did, so far.

Recently, something has changed.

Content announced via the sitemap is generally no longer picked up by Google in a timely fashion. Interestingly, Search Console reports that Google has seen the content. It says:

Page is not indexed: Discovered – currently not indexed

The official summary for this state is ambiguous, and in my opinion slightly deceiving:

The page was found by Google, but not crawled yet. Typically, Google wanted to crawl the URL but this was expected to overload the site; therefore Google rescheduled the crawl. This is why the last crawl date is empty on the report.

I did a bit of a web search on the topic (using Google :shrug:).

Other people claim that they’ve been caught by surprise by this phenomenon, too. There are indications that things changed significantly in late 2022, at least according to this Reddit thread. Quotes:

This is like the 4th post I’ve seen about indexing problems across a few subs. Seems like there’s a lag with indexing after the recent update.

Spam brain update killing the website. After 19th October [2022] every one [is] facing this issue.

One solution is to manually request indexing:

In your list of URLs of ‘Discovered, not indexed” Hover over the first URL, the magnifying glass and “request indexing”. This should fix it in a day or two.

This has worked in my case: a blog post was sitting in said state (discovered, not indexed) for ~three months. I then requested it to be indexed manually and it appeared in Google’s search results about one day later.

While there is a solution (good!), my message here is that I never had to do this before, and nothing of relevance (I believe, at least) changed on my website’s end. It’s Google that changed, and I think we should be aware.

While “performance”, “crawl budget”, and “quality of content” might be relevant concepts for explaining the phenomenon, I think it’s also pretty obvious (and kind of makes sense) that Google has a certain incentive to gradually lower its quality of service for non-paying entities. Google became big by providing high-quality search results to everyone for “free”, but that is probably not its priority anymore. My words here are maybe overly diplomatic; I think we all know that the quality of Google search results has changed in the recent past, and not for the better.

Before finishing off, let me share the following plot which shows Google’s own perspective on the frequency of the search term “discovered currently not indexed”:

Let me know what you think!

