SEO – Crawlability and Indexability


Search traffic provides a great opportunity to engage with potential customers who are searching for products, services, and businesses that meet their needs. Good SEO involves the content on your website as well as more technical considerations, such as how easily search crawlers can access and understand that content.

Note: Some aspects of SEO may require assistance from a website developer and others might be things that you can do yourself, depending on the kind of platform your website is built on.

Crawling

It all starts with a spider. In fact, millions of spiders! We use the terms spider, crawler, robot, or bot for the computer programs that crawl the web to discover websites. They do that by following links from one page to another, starting with sites they already know from previous crawls. Sitemaps can also be a good starting point; we'll tell you more about sitemaps later on in this course. Every time a spider finds a new link, it tries to follow it, and by following links it discovers new pages. In that sense, it's similar to your own browsing behavior. Spiders look in particular for content, headings, and links, so they can relay what a site is about and how it's structured.

In some cases, you may want to block crawlers from pages that you don't want to appear in search results.
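A common way to steer crawlers is a robots.txt file at the root of your site. Here's a minimal sketch (the /private/ path is just a placeholder for whatever you want to keep crawlers out of):

User-agent: *
Disallow: /private/

Keep in mind that robots.txt controls crawling rather than indexing; to keep an individual page out of the index, the usual approach is a robots meta tag in that page's HTML, such as <meta name="robots" content="noindex">.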

Indexing

Next is indexing. When a crawler passes through a website, it saves the HTML version of each page – the version with all the code that's 'behind' the page visitors see – in a giant database called an index. It's like a library, so big you can hardly imagine. The process of adding new and updated data, images, videos, and more is called 'indexing'. Processing and analyzing the content is also part of this process, because that makes it easier to put a page on the right shelf of the giant library.

Get Indexed

The first thing you need to do is to set up a Google account if you don’t have one yet. Then you can open Google Search Console.

Next, go to 'Sitemaps'. The URL of your site is already pre-filled, so you just need to add the part that leads to your sitemap.

For WordPress add: sitemap_index.xml (this is the sitemap name used by SEO plugins such as Yoast; WordPress's built-in sitemap, covered later in this section, lives at wp-sitemap.xml instead).

Hit Submit.
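For example, if your site were https://www.example.com (a placeholder domain), the full sitemap address you'd be submitting is https://www.example.com/sitemap_index.xml. Once submitted, Search Console reports whether the sitemap could be read and how many URLs it found.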

Crawlability

Crawlability refers to the search engine's ability to comb through the content on a webpage. If a web crawler can access all of the content on your site by following links in the main navigation and between pages (internal links), then your site content is easily discoverable and you should not have issues with crawlability. Crawlability becomes a problem, however, when pages aren't easily discoverable, either because they aren't part of the site's main architecture or because broken links prevent the crawler from reaching certain content.

Indexability

While crawlability is about the search engine finding your website content, indexability is about its ability to understand and categorize that content. In order to serve up search results, all of the discoverable content on the web has to be indexed (categorized) by the algorithm. Then, when a user searches for something they need, the algorithm ranks all of the pages in the index according to which ones seem like the best fit for the user's needs. If your site struggles to be indexed, you'll have problems showing up in search.

Checking Your Indexability

Open Google Search Console. Select the URL Inspection tab on the left. Enter the URL of the page you would like to check. A report will be generated with details on whether the page is indexed and any issues that were found.


Ranking algorithms

The index is updated every time the crawler comes around again and finds new information. How often the crawler comes around depends on the importance of your site (according to the search engines) and how often you make changes to your website.

As soon as you start typing, the search engine's algorithms start sorting and ranking. The algorithms take the data from the index and evaluate a wide range of factors tied to user experience, then use that evaluation to show you the most relevant results. Site speed, great content, and site security are important, among many other things.

Improving Crawlability and Indexability

Avoid duplicate content.

If several pages on your site include the same content (or focus on the same topic with slight variations in wording), the crawler can struggle to determine which page is the best one to show users in search results. Often, rather than ranking both pages and letting users decide, it reduces visibility for both because it can't confidently determine which is most applicable. When planning the pages of your website, it's a good idea to focus each one on a specific topic rather than having multiple pages cover the same or significantly overlapping subjects. If you already have duplicate pages and want to fix that, permanently redirect the weaker of the two pages to the stronger one. That consolidates your site's authority.
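How you set up a permanent redirect depends on your server or platform. As a minimal sketch, assuming an Apache server and using /old-page/ and /new-page/ as placeholder paths, you could add a line like this to your site's .htaccess file:

Redirect 301 /old-page/ https://www.example.com/new-page/

The 301 status code tells crawlers the move is permanent, so the old page's authority passes to the new one. Many platforms, including WordPress via redirect plugins, let you set this up without editing server files.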

Use relevant anchor text.

Anchor text is the clickable text used for a hyperlink. When including links in text, it's important to think about the actual words you're linking from, because it's more effective to use text that's relevant to the page where you're sending the user. High-quality anchor text gives visitors (and the search crawlers) an idea of the type of content they'll find after the click. Let's use the example of a local craft brewery that's updating its website to include information about brewery tours:

Bad anchor text: Find out more information about our tours here.

Good anchor text: Find out more information about our brewery tours.
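In HTML, the difference is simply which words sit inside the link tag. A quick sketch, using example.com as a placeholder domain:

Bad: Find out more information about our tours <a href="https://www.example.com/tours/">here</a>.

Good: Find out more information about our <a href="https://www.example.com/tours/">brewery tours</a>.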

Use high-quality internal links.

When creating links between pages, think about what the user might want or need while reading a given page. If you have a page that describes all of the services your business provides, you don't want to add so many links that the user never finishes the page before being sent somewhere else. Try to use links that add helpful context or lead a user to more information based on what they're reading about, rather than linking from every possible noun. Too many links create issues for users and increase the chance that you'll end up with broken links that hurt the crawlability of your site.

XML Sitemaps

This file lists all the important pages (URLs) and files on your website, making sure that search engines can find and crawl them. It also helps search engines understand the structure of your website. If your site is new and doesn't receive many links yet, search engines might have a hard time finding it; a sitemap can help speed up content discovery. Your sitemap can also provide more information about specific types of content, such as videos, images, and news articles.
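Here's a minimal sketch of what a sitemap file contains, using example.com as a placeholder domain and an illustrative date:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-05-01</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/brewery-tours/</loc>
  </url>
</urlset>

Each <url> entry points to one page, and optional tags such as <lastmod> tell search engines when a page last changed.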

Find Your XML Sitemap

The WordPress XML sitemap is automatically generated and enabled on your WordPress site. You can find it by typing the URL of your site into the address bar, followed by /wp-sitemap.xml, like this: https://www.example.com/wp-sitemap.xml (using example.com as a placeholder for your own domain).

To control what goes into the built-in sitemap, you'll need some coding skills; you can't do it from the standard WordPress backend.
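As a minimal sketch of what that code could look like, assuming WordPress 5.5 or later (the version that introduced the built-in sitemap and its wp_sitemaps_* filters), this snippet in a theme's functions.php or a small plugin removes a post type from wp-sitemap.xml:

<?php
// Exclude the "page" post type from WordPress's built-in XML sitemap;
// posts and any other registered post types remain listed.
add_filter( 'wp_sitemaps_post_types', function ( $post_types ) {
    unset( $post_types['page'] );
    return $post_types;
} );

If you'd rather not touch code, SEO plugins such as Yoast replace the built-in sitemap with their own (the sitemap_index.xml mentioned earlier) and expose these controls in the admin screens.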