Preventing search engine bots from indexing callback URLs

Two-stage crawling

Some search engines, especially Google, now render JavaScript when crawling, but they do so in two stages. In the first stage, the bot processes only the raw HTML, which doesn’t include JavaScript-generated content.

In the second, more resource-intensive rendering pass, the bot executes client-side code and can therefore load our callback URL, despite efforts to block bot traffic.

Blocking the indexing of callback URLs

Robots.txt: Publishers should add the following lines to their robots.txt file to block URLs that contain the callback parameter:

User-agent: *
Disallow: /*?callback=in
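
To see which URLs this rule is likely to cover, the pattern can be checked locally before deploying it. The following is a minimal Python sketch, not a definitive implementation. It assumes the wildcard semantics that major crawlers document for robots.txt (`*` matches any run of characters, a trailing `$` anchors the end, and matching starts at the beginning of the path). The helper names `rule_to_regex` and `is_disallowed` and the example.com URLs are hypothetical.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule_path: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end, and everything else is matched literally.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(url: str, rule_path: str) -> bool:
    """Return True if the URL's path plus query string matches the pattern."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # re.match anchors at the start, mirroring prefix matching in robots.txt.
    return rule_to_regex(rule_path).match(target) is not None


RULE = "/*?callback=in"  # the Disallow value from the rule above

if __name__ == "__main__":
    for url in (
        "https://example.com/article?callback=in",         # hypothetical URL
        "https://example.com/article?page=2&callback=in",  # hypothetical URL
        "https://example.com/article",                     # hypothetical URL
    ):
        print(url, is_disallowed(url, RULE))
```

Under these assumptions, the rule matches only when `callback` is the first query parameter, because the pattern requires a literal `?` directly before `callback=in`. Publishers whose URLs may carry other parameters ahead of it might need an additional pattern such as `Disallow: /*&callback=in`; this is a suggestion to verify against each search engine's own robots.txt testing tools.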