We at FirmCatalyst check this using three methods:

Check your WordPress Settings

First of all, we check whether our WordPress installation is accessible to crawlers at all. To do this, we log in to the admin interface of our WordPress site and go to “Settings” > “Reading”. Make sure that under “Search engine visibility” the option “Discourage search engines from indexing this site” is not checked.

WordPress setting: Visibility for search engines

Meta Robots Checker from SEO Review Tools

Analyze a URL of your choice in the input mask. The tool shows you whether the page in question carries a noindex or nofollow tag; a noindex tag would prevent the URL from being indexed.

With the Meta Robots Checker from SEO Review Tools you can check whether your website can be indexed by search engines
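If you want to run this check yourself instead of relying on the online tool, a few lines of Python with the requests library are enough. The following is a minimal sketch (the URL is a placeholder); it fetches a page and reports the robots directives found in the X-Robots-Tag response header and in the meta robots tag:

import re
import requests

def check_robots_directives(url):
    """Fetch a URL and report its robots directives."""
    response = requests.get(url, timeout=10)
    # Directives can arrive as an HTTP response header ...
    header = response.headers.get("X-Robots-Tag", "")
    # ... or as a meta tag in the HTML source. The simple pattern below
    # assumes the name attribute comes before the content attribute.
    meta = re.findall(
        r'<meta[^>]+name=["\']robots["\'][^>]+content=["\']([^"\']*)["\']',
        response.text,
        flags=re.IGNORECASE,
    )
    print(url, "->", "HTTP", response.status_code)
    print("X-Robots-Tag header:", header or "none")
    print("Meta robots tag:", ", ".join(meta) or "none")

check_robots_directives("https://your-website.com/")  # placeholder URL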

Use the site:domain.de search query

To do this, open the Google search at google.de. In the search box, type the command “site:your-website.com”. You should now see a list of all URLs of your website that are indexed in the Google search.

The site:domain.de query in Google search

Check error messages in the Search Console

The Search Console is Google’s hub for informing webmasters about penalties, errors and other notices that may affect your site. The new version of the Search Console (as of the end of 2019) also shows you how Google indexes your website and how fast the pages of your website load, provided you have linked the Search Console to your website.

Indexing status of your website within the Search Console

This is possible once your domain has been verified, for example via Google Analytics or the required meta tag.

Make sure that the registered property corresponds exactly to the version of your website that is actually reachable. For example, if the primary version of your website is served at https://your-website.com, the property entered in the Search Console should not be https://www.your-website.com. In such a case, the results would be distorted and you would not be able to access all data.

Hint: If your website is not yet verified for the Google Search Console, follow the corresponding tutorial on growthwizard.de/yoast-seo-instellungen/.

Remove 404 Search Errors

Under “Coverage”, the Search Console shows the 404 errors found on your website

404 errors are among the most harmful issues your website has to deal with. The task of every search engine is to always offer the user the best possible answer to their search query. Therefore, search engines are constantly adapting their algorithms to find the best possible result for the user.

If a user clicks on a search result and the page with the relevant information can no longer be found, this is not only bad for the user, but also for you as the website operator, and it puts the search engine in a bad light. In such a situation there are only losers.

From the point of view of search engine optimization (SEO), we have to deal with another problem. Every website builds up backlinks over time, and these backlinks are an indicator of quality and trust for search engines. You can picture it like this: every URL of your website carries a score that reflects its quality. If the requested URL is no longer available and is not redirected properly, the accumulated trust fizzles out.

It should therefore be your job to always redirect such 404 errors properly to the correct destination. For this purpose, there are various status codes that tell search engines what has happened to the respective content:

  • 301 (permanently redirected): indicates that the content is now permanently located at a different URL.
  • 307 (temporarily redirected): indicates that the content is temporarily located at another URL.
  • 410 (gone): indicates that the content has been permanently removed from the website.
  • There are many more status codes; a list can be found in the Ryte Wiki: https://de.ryte.com/wiki/HTTP_Status_Code
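
If you want to see these status codes in action, you can trace the redirect chain of a URL yourself. A minimal sketch in Python with the requests library (the URL is a placeholder):

import requests

def trace_redirects(url):
    """Follow a URL and print every hop with its HTTP status code."""
    response = requests.get(url, timeout=10)
    for hop in response.history:  # intermediate redirects, if any
        print(hop.status_code, hop.url)
    print(response.status_code, response.url, "(final)")

# A properly redirected URL should print e.g. a 301 hop followed by a final 200.
trace_redirects("https://your-website.com/old-path/")  # placeholder URL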

Matt Cutts (a former Google employee) has described this problem in a YouTube video, in which he explains why it is so important to pay attention to correct redirects and how to handle them.


Hint: You can find a list of the 404 errors of your website in the Search Console under “Index > Coverage > Excluded > Not Found (404)“.

To redirect URLs correctly in WordPress, you can use a redirect plugin.

Check URLs for the Noindex tag

Also in the Search Console (Index > Coverage > Excluded > Excluded by “noindex” tag), you will find a list of all URLs that contain a so-called NoIndex tag. This meta tag tells search engines that the corresponding URL should not be included in the search results.

It does no harm to check all of these URLs at regular intervals to see whether the pages in question really should not be indexed. Especially when several people are working on a website or plugins are in use, a noindex tag may be set by mistake.

NoIndex Search Console
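
If you export the affected URLs from the Search Console, you can repeat the check from above for the whole list at once. A minimal sketch, assuming a plain text file (here hypothetically named “excluded-urls.txt”) with one URL per line:

import requests

def find_noindex(urls):
    """Report which of the given URLs carry a noindex directive."""
    for url in urls:
        response = requests.get(url, timeout=10)
        header = response.headers.get("X-Robots-Tag", "")
        # Rough check of the HTML source; a proper HTML parser would be stricter.
        in_meta = "noindex" in response.text.lower()
        if "noindex" in header.lower() or in_meta:
            print("noindex:", url)

with open("excluded-urls.txt") as f:  # hypothetical export file
    find_noindex(line.strip() for line in f if line.strip())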

Check the location of your Sitemap.xml

A sitemap.xml is a list of all your URLs, images and content, including the time of the last modification. Large websites in particular benefit from a sitemap, because it makes it easier for search engines to understand the structure of your website. For search engines, a sitemap is a guide to all the content of your website.

You can reference the sitemap of your website both in the robots.txt and in the Search Console, so that search engines know exactly where to find it.

With the help of the plugin “Yoast SEO” you can easily edit the robots.txt:

  1. Open the WordPress backend at “yourdomain.com/wp-admin/”.
  2. Navigate to “SEO > Tools > File Editor“.
  3. Create a “robots.txt“.
  4. Add the following entry: Sitemap: https://your-website.com/sitemap_index.xml.
  5. Save robots.txt.
Example of how a sitemap can be stored in the robots.txt

If you want to add the Sitemap.xml in the Search Console, follow the steps below:

  1. Access the Search Console at search.google.com/search-console/about?hl=de.
  2. Navigate to: “Sitemaps > Add new sitemap“.
  3. Enter the URL of your sitemap (https://yourdomain.com/sitemap_index.xml) in the input field and confirm your entry.

Note: If you are not using Yoast SEO to create your sitemap, you may find it at “yourdomain.com/sitemap.xml”. This is the most commonly used path for sitemaps.

Animation: Adding a sitemap in the Search Console
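
Whether your sitemap is reachable and up to date can also be verified with a short script. The following is a minimal sketch in Python that fetches a Yoast-style sitemap index and lists the sub-sitemaps with their last modification dates (the domain is a placeholder):

import requests
import xml.etree.ElementTree as ET

# XML namespace used by the sitemap protocol (sitemaps.org).
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def list_sitemaps(index_url):
    """Fetch a sitemap index and print each sub-sitemap with its lastmod date."""
    response = requests.get(index_url, timeout=10)
    response.raise_for_status()  # fail loudly if the sitemap is unreachable
    root = ET.fromstring(response.content)
    for sitemap in root.findall("sm:sitemap", NS):
        loc = sitemap.findtext("sm:loc", default="?", namespaces=NS)
        lastmod = sitemap.findtext("sm:lastmod", default="unknown", namespaces=NS)
        print(lastmod, loc)

list_sitemaps("https://your-website.com/sitemap_index.xml")  # placeholder URL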

Check the status of your robots.txt

The robots.txt is an optional text file in the root directory of your website, usually accessible at “yourdomain.com/robots.txt”. This file is only relevant for crawlers and contains instructions as to which URL paths of your domain may be read and which paths are excluded from crawling.

Note for experienced webmasters: the “Noindex” directive within the robots.txt is no longer supported since 2019. Google advises blocking crawling with alternative instructions such as “Disallow: /path/” instead, or controlling indexing directly on the page with robots meta tags such as “noindex” and “nofollow”.

Any professional SEO tool should be able to check the robots.txt as part of an SEO audit. If you do not have access to professional SEO tools, you can also use the free robots.txt checker from Ryte.

With the robots.txt checker from Ryte, the WordPress site can be checked for errors

Ideally, your robots.txt for WordPress should be as minimal as possible. Below you will find an example of a lean basic structure of a robots.txt for WordPress:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://your-website.com/sitemap_index.xml
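
You can test the effect of such rules with Python’s built-in robots.txt parser. A minimal sketch (the domain is a placeholder); note that Python’s parser applies the rules in file order, whereas Google gives the most specific rule precedence, so an Allow exception like the one above may be reported differently:

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://your-website.com/robots.txt")  # placeholder URL
rp.read()  # download and parse the live file

for path in ("/", "/wp-admin/", "/wp-admin/admin-ajax.php"):
    allowed = rp.can_fetch("*", "https://your-website.com" + path)
    print(path, "->", "allowed" if allowed else "blocked")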

Conclusion: Make it easy for crawlers!

Findability is one of the fundamentals of successful search engine optimization. There are many tags and problems that can hinder the crawlability of a website. We regularly work on websites of new clients who were not even aware that certain pages were excluded from indexing.

This concludes the second step of our SEO audit. In the third step of our SEO audit, we will check the loading time of our website.