Yoast SEO settings: Crawl optimization

Crawlability is essential in SEO. If you want search engines to find your site and show it in the search results, your site must be crawlable. Not only that, but you must ensure that search engines get a chance to crawl the pages that you want to rank with. There is no easy way to do that. But, with Yoast SEO, you can clear all the URLs that don’t have any SEO value out of the search engine’s way.

There is also another, less talked about, side to this story. Crawling requires a lot of resources. Search engines and other parties, such as various apps, need a lot of electricity to crawl the growing number of sites and their URLs. Website owners also need powerful servers so that visitors can browse their sites and robots can crawl them. So, by making crawling more efficient, you not only improve your site’s SEO but also reduce electricity consumption!

By using the crawl optimization settings in Yoast SEO, you can easily clear out URLs that don’t have any SEO value. This makes crawling more efficient and reduces your site’s carbon footprint. In this article, we’ll explain all of the crawl optimization settings one by one.


Where to find the crawl optimization settings

You can find the crawl optimization settings by following these steps:

  1. Log into your WordPress site.

    You will be in your WordPress dashboard.

  2. Click “Yoast SEO”.

    In the menu on the left-hand side, find the “Yoast SEO” menu item.
    Screenshot of the "Yoast SEO" menu item.

  3. Click “Settings”.

    In the menu that unfolds when clicking “Yoast SEO”, click “Settings”.
    Screenshot of the settings menu item in Yoast SEO

  4. Navigate to the “Advanced” heading and click “Crawl optimization”.

    On the Yoast SEO settings page, navigate to the “Advanced” heading and click “Crawl optimization” to open the crawl optimization settings.
    Screenshot of the crawl optimization menu item.

  5. That’s it!

    You’ll be on the crawl optimization settings page in Yoast SEO.
    Screenshot of the crawl optimization settings in Yoast SEO

The crawl optimization settings are divided into five sections. We’ll explain them one by one.

Remove unwanted metadata

The first section is called “Remove unwanted metadata”. What can you do here? Unlike humans, who read what’s on the front end of your site, robots read what they find in the source code. If you open the source code of your site (see the image below), you will notice many URLs there. When crawlers come to crawl your site, they’ll visit each one of the URLs they find. And they will do that tens or hundreds of times per day.

WordPress adds a lot of URLs and tags to your website’s header and <head> section, as you can see in the source code of your site

So, why is that a problem? Well, WordPress adds a lot of URLs and tags to your website’s header and <head> section. A lot of those additions are unnecessary, and they don’t have any SEO value. So, we’ve created multiple toggles that allow you to disable a specific piece of output. Below, you can read more about what each of the toggles does.

Remove shortlinks

In the <head> section of a single post, WordPress creates a shortlink output (see example below).

<link rel='shortlink' href='http://testsite.com/?p=1' />

The shortlink is basically a shortened version of the URL of the same page. With this toggle, you can remove that output.

screenshot of the "Remove shortlinks" toggle in the crawl optimization settings in Yoast SEO

Remove REST API links

The WordPress REST API is a developer-oriented feature that lets applications interact with your WordPress site. By default, WordPress adds a REST API link to the <head> section of your site for discoverability.

<link rel="https://api.w.org/" href="http://testsite.com/wp-json/" />

However, most sites don’t use the WordPress REST API. If your site is one of those, you can safely remove the link with this feature.

screenshot of the "Remove REST API links" toggle in the crawl optimization settings in Yoast SEO

Remove RSD/WLW links

The RSD (Really Simple Discovery) link in the <head> section of your site helps external clients, such as blog editing software, discover the services your site offers. If you do not use such services, it is safe to remove the link with this toggle.

<link rel="EditURI" type="application/rsd+xml" title="RSD" href="http://testsite.com/xmlrpc.php?rsd" />

The WLW link is intended for users of the discontinued Windows Live Writer. If you do not use it, you can safely remove this link as well.

<link rel="wlwmanifest" type="application/wlwmanifest+xml" href="http://testsite.com/wp-includes/wlwmanifest.xml" />

screenshot of the "Remove RSD/WLW links" toggle in the crawl optimization settings in Yoast SEO

Remove oEmbed links

With this toggle, you can remove the oEmbed links from the <head> section of all your single posts.

<link rel="alternate" type="application/json+oembed" href="http://testsite.com/wp-json/oembed/1.0/embed?url=http%3A%2F%2Ftestsite.com%2F2022%2F05%2Fhello-world%2F" />
<link rel="alternate" type="text/xml+oembed" href="http://testsite.com/wp-json/oembed/1.0/embed?url=http%3A%2F%2Ftestsite.com%2F2022%2F05%2Fhello-world%2F&format=xml" />

These links help other sites consume your content. You won’t harm any of your content by removing them.

screenshot of the "Remove oEmbed links" toggle in the crawl optimization settings in Yoast SEO

Remove generator tag

The generator tag displays the WordPress version your site is using.

<meta name="generator" content="WordPress 6.0" />

This tag has no SEO value. In fact, because it reveals which WordPress version you are running, it can be a security risk. You can easily remove it with this toggle.

screenshot of the "Remove generator tag" toggle in the crawl optimization settings in Yoast SEO
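If you want to check which of these tags your pages still output, a small script can scan the markup. The following is a minimal sketch using only the Python standard library. The `rel`, `type`, and `name` values come from the snippets above; the names `HeadTagAudit` and `audit_head` are hypothetical, and the mapping to toggle names is an assumption based on this article rather than anything Yoast SEO ships.

```python
from html.parser import HTMLParser

# rel values taken from the example snippets in this article (lowercased).
LINK_RELS = {
    "shortlink": "Remove shortlinks",
    "https://api.w.org/": "Remove REST API links",
    "edituri": "Remove RSD/WLW links",
    "wlwmanifest": "Remove RSD/WLW links",
}
OEMBED_TYPES = {"application/json+oembed", "text/xml+oembed"}


class HeadTagAudit(HTMLParser):
    """Collects the toggle names whose output is still present in the markup."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # Attribute names arrive lowercased; values may be None for bare attributes.
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "link":
            rel = attributes.get("rel", "").lower()
            link_type = attributes.get("type", "").lower()
            if rel in LINK_RELS:
                self.found.append(LINK_RELS[rel])
            elif rel == "alternate" and link_type in OEMBED_TYPES:
                self.found.append("Remove oEmbed links")
        elif tag == "meta" and attributes.get("name", "").lower() == "generator":
            self.found.append("Remove generator tag")


def audit_head(html):
    """Return the toggles that would still remove something from this HTML."""
    parser = HeadTagAudit()
    parser.feed(html)
    parser.close()
    # Preserve first-seen order while dropping duplicates.
    return list(dict.fromkeys(parser.found))
```

Pass the HTML source of a page to `audit_head`. An empty list suggests that none of the tags discussed above are present in that page.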

Pingback HTTP header

Pingbacks are used to notify you when someone has added a link to your site. However, this standard is very old, and you are most likely not using it anymore. If you switch the toggle to remove, Yoast SEO removes the X-Pingback: http://testsite.com/xmlrpc.php header from the HTTP response.

screenshot of the "Pingback HTTP header" toggle in the crawl optimization settings in Yoast SEO

Powered by HTTP header

With this toggle, you remove the X-Powered-By header, which reveals the PHP version your site is using, from the HTTP response. This information is not required for your site to function properly, so you can safely remove it.

screenshot of the "Remove powered by HTTP header" toggle in the crawl optimization settings in Yoast SEO
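To see whether a response still carries these two headers, you can inspect them with a few lines of code. This is a minimal sketch, not part of Yoast SEO; the function name `unwanted_headers` is hypothetical, and it expects a plain mapping of header names to values.

```python
# Header names discussed above; HTTP header names are case-insensitive.
UNWANTED_HEADERS = ("X-Pingback", "X-Powered-By")


def unwanted_headers(headers):
    """Return the unwanted headers found in a mapping of response headers."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        name: lowered[name.lower()]
        for name in UNWANTED_HEADERS
        if name.lower() in lowered
    }
```

For a live check, you could pass in the headers of a real response, for example `dict(urllib.request.urlopen(url).headers.items())` from the standard library. An empty result means neither header was sent.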

Disable unwanted content formats

The next section is called “Disable unwanted content formats”. Your site probably has more URLs than you realize. For instance, WordPress creates feeds for a lot of content on your site, which can be a problem for crawlers. A crawler will start crawling the URLs, and, at some point, it might run out of crawl budget. As a result, there won’t be any budget left for your important posts and pages. That’s why it’s wise to remove those URLs and let search engines crawl your site more efficiently.

The “Disable unwanted content formats” section in the Yoast SEO crawl optimization settings

In the crawl optimization settings in Yoast SEO, you can toggle multiple switches that let you keep or remove the various feeds. We don’t automatically remove them for each site because we can’t predict the needs of all Yoast SEO users. But, if you don’t get any value from them, we recommend you switch the toggles on. Below, you can see exactly which feeds you can remove with the crawl optimization settings.

Remove global feed

The “Remove global feed” toggle lets you remove the global feed, which is an overview of your recent posts.

  • Type of page: any page
  • Example feed: https://www.example.com/feed/
screenshot of the "Remove global feed" toggle in the crawl optimization settings in Yoast SEO

Remove global comments feed

The “Remove global comments feed” toggle lets you remove the global comments feed, an overview of recent comments on your site.

  • Type of page: any page
  • Example feed: https://www.example.com/comments/feed/

Note: Disabling this feed will also disable the post comments feeds.

screenshot of the "Remove global comments feed" toggle in the crawl optimization settings in Yoast SEO

Remove post comments feed

The “Remove post comments feed” toggle is for removing the feed for recent comments on each post. If you enable or disable the “Remove global comments feed” toggle, the “Remove post comments feed” toggle is automatically enabled or disabled too.

  • Example feed: https://www.example.com/example-post/feed/
screenshot of the "Remove post comments feed" toggle in the crawl optimization settings in Yoast SEO

Remove post authors feeds

The “Remove post authors feeds” toggle is for removing the feeds for recent posts by specific authors.

  • Type of page: author archive, e.g., https://www.example.com/author/admin/
  • Example feed: https://www.example.com/author/admin/feed/
screenshot of the "Remove post authors feeds" toggle in the crawl optimization settings in Yoast SEO

Remove post type feeds

The “Remove post type feeds” toggle lets you remove post type feeds, which provide information about your recent posts, for each post type.

  • Type of page: post type archive, e.g., https://www.example.com/my-books/
  • Example feed: https://www.example.com/my-books/feed/
screenshot of the "Remove post type feeds" toggle in the crawl optimization settings in Yoast SEO

Remove category feeds

The “Remove category feeds” toggle lets you remove category feeds, which provide information about your recent posts, for each category.

  • Type of page: category archive, e.g., https://www.example.com/category/fiction/
  • Example feed: https://www.example.com/category/fiction/feed/
screenshot of the "Remove category feeds" toggle in the crawl optimization settings in Yoast SEO

Remove tag feeds

The “Remove tag feeds” toggle lets you remove tag feeds, which provide information about your recent posts, for each tag.

  • Type of page: tag archive, e.g., https://www.example.com/tag/fantasy/
  • Example feed: https://www.example.com/tag/fantasy/feed/
screenshot of the "Remove tag feeds" toggle in the crawl optimization settings in Yoast SEO

Remove custom taxonomy feeds

The “Remove custom taxonomy feeds” toggle lets you remove custom taxonomy feeds, which provide information about your recent posts, for each custom taxonomy.

  • Type of page: custom taxonomy archive, e.g., https://www.example.com/book-genre/crime/
  • Example feed: https://www.example.com/book-genre/crime/feed/
screenshot of the "Remove custom taxonomy feeds" toggle in the crawl optimization settings in Yoast SEO

Search results feeds

The “Remove search results feeds” toggle lets you remove search results feeds, which provide information about your search results.

  • Type of page: search results, e.g., https://www.example.com/?s=world
  • Example feed: https://www.example.com/search/world/feed/rss2/
screenshot of the "Remove search results feeds" toggle in the crawl optimization settings in Yoast SEO

Atom/RDF feeds

The final toggle allows you to remove Atom/RDF feeds, which are specific formats for feeds.

  • Type of page: any page
  • Example feed: any feed listed above, with /atom or /rdf added at the end, e.g.:
    • https://www.example.com/feed/atom
    • https://www.example.com/feed/rdf
    • https://www.example.com/comments/feed/atom
    • https://www.example.com/comments/feed/rdf
    • https://www.example.com/hello-world/feed/atom
    • https://www.example.com/hello-world/feed/rdf
screenshot of the "Remove Atom/RDF feeds" toggle in the crawl optimization settings in Yoast SEO
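When you test these toggles, it can help to list the feed URLs that belong to a page so you can request each one and see whether it still responds. The sketch below follows the URL patterns shown in the examples above. The function name `feed_variants` is hypothetical, and it assumes pretty permalinks with the default /feed/ suffix.

```python
def feed_variants(page_url):
    """Build the default, Atom, and RDF feed URLs for a page URL.

    Assumes the feed lives at <page>/feed/, as in the examples in this article.
    """
    base = page_url.rstrip("/") + "/feed/"
    return {
        "default": base,
        "atom": base + "atom",
        "rdf": base + "rdf",
    }
```

Calling `feed_variants` with a tag or category archive URL gives you the three URLs to check after switching the related toggles.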

Remove unused resources

In the “Remove unused resources” section, you can remove the resources that WordPress usually loads but that your site doesn’t always need.

The “Remove unused resources” section in the Yoast SEO crawl optimization settings

Remove emoji scripts

If you don’t use emojis in your content, you can safely remove the JavaScript used for converting emoji characters in older browsers. You can do that by switching on the “Remove emoji scripts” toggle.

screenshot of the "Remove emoji scripts" toggle in the crawl optimization settings in Yoast SEO

Remove WP-JSON API

The “Remove WP-JSON API” toggle allows you to prevent robots from crawling the WordPress JSON API endpoints. Unless you’re using the WordPress REST API to output important content, you can switch this toggle on to improve your crawl efficiency.

This adds a “disallow” rule to your robots.txt file to prevent the crawling of the WordPress JSON API endpoints, e.g., https://www.example.com/wp-json/ and https://www.example.com/?rest_route=/.

screenshot of the "Remove WP-JSON API" toggle in the crawl optimization settings in Yoast SEO
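To get a feel for how such a disallow rule works, here is a minimal sketch of prefix matching against a robots.txt file. The two `Disallow` lines are an assumption derived from the example URLs above, not a copy of what Yoast SEO writes. The matcher is deliberately simplified: it ignores user-agent groups, `Allow` rules, and wildcards, and the names `disallowed_prefixes` and `is_blocked` are hypothetical.

```python
from urllib.parse import urlsplit

# Assumed rules for illustration; check your own robots.txt for the real ones.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-json/
Disallow: /?rest_route=
"""


def disallowed_prefixes(robots_txt):
    """Collect non-empty Disallow values, ignoring groups and wildcards."""
    prefixes = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            prefixes.append(value.strip())
    return prefixes


def is_blocked(url, robots_txt=ROBOTS_TXT):
    """Return True if the URL's path and query start with a disallowed prefix."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return any(target.startswith(prefix) for prefix in disallowed_prefixes(robots_txt))
```

With these assumed rules, both example endpoints above are reported as blocked, while a regular post URL is not.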

Internal site search cleanup

The next section is called “Internal site search cleanup”. Spammers sometimes target internal site search URLs on your site for their own purposes. Those URLs might get crawled by search engines and might be seen by users. That can harm your SEO (and your branding)! This feature identifies some common spam patterns and stops them in their tracks.

The “Internal site search cleanup” section in the Yoast SEO crawl optimization settings

Filter search terms

First of all, you can choose to filter search terms by switching the toggle behind “Filter search terms”. If you enable this option, then you get more specific options, which we will discuss below.

screenshot of the "Filter search terms" toggle in the crawl optimization settings in Yoast SEO

Max number of characters to allow in searches

If you choose to filter search terms, then you can set a maximum number of characters to allow in searches. This reduces the impact of spam attacks and confusing URLs.

screenshot of the "Max number of characters to allow" toggle in the crawl optimization settings in Yoast SEO

Filter searches with emojis and other special characters

You can also decide whether you want to block searches with emojis and other special characters, as these searches may be part of a spam attack.

screenshot of the "Filter searches with emojis and other special characters" toggle in the crawl optimization settings in Yoast SEO

Filter searches with common spam patterns

Finally, you can choose to filter searches with common spam patterns. The common spam patterns our plugin cleans up are: TALK: QQ: [:()【】[]].

screenshot of the "Filter searches with common spam patterns" toggle in the crawl optimization settings in Yoast SEO
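The three filters described above can be pictured as a simple chain of checks. The following is a rough sketch of that idea, not Yoast SEO's implementation. The limit of 50 characters, the emoji code point ranges, the reading of the spam pattern list, and the names `contains_emoji` and `should_filter` are all assumptions made for illustration.

```python
import re

MAX_LENGTH = 50  # assumed value; the real limit is whatever you configure

# Assumed reading of the pattern list above: the literal prefixes "TALK:" and
# "QQ:", plus several bracket characters.
SPAM_PATTERNS = [
    re.compile(r"(TALK|QQ):"),
    re.compile(r"[【】\[\]()]"),
]


def contains_emoji(term):
    """Rough check that flags characters in a few common emoji ranges."""
    return any(
        0x1F300 <= ord(char) <= 0x1FAFF or 0x2600 <= ord(char) <= 0x27BF
        for char in term
    )


def should_filter(term, max_length=MAX_LENGTH):
    """Return the reason a search term would be filtered, or None to allow it."""
    if len(term) > max_length:
        return "too long"
    if contains_emoji(term):
        return "emoji"
    if any(pattern.search(term) for pattern in SPAM_PATTERNS):
        return "spam pattern"
    return None
```

An ordinary search such as `world` passes all three checks, while a term that is too long, contains an emoji, or starts with `QQ:` is flagged with the matching reason.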

Redirect pretty URLs to ‘raw’ format

Next, you can choose to redirect pretty URLs for search pages to the raw format. WordPress supports two endpoint formats for site search queries:

  • A raw format: example.com/?s=example
  • A pretty format: example.com/search/example

The pretty format will only be supported when pretty permalinks are enabled. When both formats exist, this can lead to problems, because this doubles the number of URLs that search engines can crawl. In addition, it can increase the number of ways in which your site can be attacked by spammers.

Therefore, Yoast SEO provides an option to turn off one of these formats. When you switch on the “Redirect pretty URLs for search pages to raw format” toggle, Yoast SEO disables the pretty format and redirects requests from the pretty format to the raw format, while maintaining any query parameters and pagination. We disable the pretty format because the raw format is relatively universal, is language- and territory-agnostic, and is more (natively) interoperable with most analytics and tracking systems.

screenshot of the "Redirect pretty URLs to 'raw' format" toggle in the crawl optimization settings in Yoast SEO
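The mapping from the pretty format to the raw format can be sketched in a few lines. This is an illustration of the idea only, not the plugin's code: the function name `pretty_search_to_raw` is hypothetical, and this simplified version keeps existing query parameters but does not handle pagination segments.

```python
from urllib.parse import parse_qsl, unquote, urlencode, urlsplit, urlunsplit


def pretty_search_to_raw(url):
    """Rewrite /search/<term>/ to /?s=<term>, keeping other query parameters.

    Returns None when the URL is not a pretty search URL. Pagination
    segments are not handled in this sketch.
    """
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    if len(segments) != 2 or segments[0] != "search":
        return None
    # Put the search term first, then carry over any existing parameters.
    query = [("s", unquote(segments[1]))]
    query += parse_qsl(parts.query, keep_blank_values=True)
    return urlunsplit((parts.scheme, parts.netloc, "/", urlencode(query), ""))
```

A redirect handler could call this function for each request and issue a redirect whenever it returns a URL instead of `None`.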

Prevent crawling of internal site search URLs

The final option in this section is to prevent crawling of internal site search URLs. This adds a disallow rule to your robots.txt file so your internal site search URLs won’t be crawled.

In general, we don’t advise blocking your internal search pages via robots.txt. It’s better to allow search engines to crawl these pages but to prevent them from indexing them with a noindex tag, which Yoast SEO adds automatically. However, if your search results pages are being crawled excessively and there’s evidence that this is harmful (for example, to your crawl budget), or if your search results pages are under attack, you should enable this option.

screenshot of the "Prevent crawling of internal site search URLs" toggle in the crawl optimization settings in Yoast SEO

Advanced: URL cleanup

These are advanced settings that you should only use if you know what you are doing! To learn more, read the Advanced crawl settings article.

Will using the crawl settings affect my site’s rankings?

We understand that this all might sound a bit scary. But don’t worry, using the crawl settings in Yoast SEO will not harm your website’s crawlability or rankings. The crawl settings are there to help you clean up unnecessary URLs and can help search engines crawl your site more efficiently.

In addition to this, it’s important to note that the crawl settings in Yoast SEO do not have any effect on the crawl rate of a website. This means that the speed at which search engines, like Google, crawl and index a website is not impacted by the crawl settings in Yoast SEO.

