Most active website owners put a lot of time and effort into the content they put online. We’re constantly fine-tuning and redeveloping our SEO strategies to target search engines and win that precious traffic. But do the search engines you’re targeting even know that your site exists? For your site to be considered accurately (and at all) when Google churns out results for a search query, it needs to have been crawled and added to the Google search index. Has yours?
What is Crawling?
In the simplest terms, web crawling means following links. The crawling of a web page is carried out by something called Googlebot. Googlebot is search bot software that Google sends out into the web to collect information and add content to the search index. Googlebot quite literally ‘crawls’ through sites, following links from page to page and recording the data it finds to report back to Google.
What is Indexing?
Indexing is the processing of the data that Googlebot collects while crawling. The information is processed, and if it is determined to be quality content, it is added to the Google search index. Any content that has been indexed by Google can appear within search engine results pages. Googlebot collects whatever details are available to it – including title tags, meta descriptions and alt text – and this information helps determine where your pages end up within search results. So if you want your web page to appear for relevant search queries, you need to make sure it’s optimised and indexed!
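To illustrate, these on-page details all live in a page’s HTML. The page title, description wording and file names below are hypothetical placeholders, not taken from any real site:

```html
<head>
  <!-- Title tag: typically shown as the clickable headline in search results -->
  <title>What is Indexing? | Example Blog</title>
  <!-- Meta description: often used as the snippet below the headline -->
  <meta name="description" content="Learn how Googlebot crawls and indexes your website's content.">
</head>
<body>
  <!-- Alt text: describes an image to search engines (and screen readers) -->
  <img src="crawler-diagram.png" alt="Diagram of Googlebot following links between pages">
</body>
```

The more of these details Googlebot can read, the more accurately it can place your page for relevant queries.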
Why is it important that your site has been indexed?
Google doesn’t know that your content exists until it has been crawled and indexed. Likewise, if you have made changes to previously indexed content, Google will continue to serve the old, out-of-date version until it has re-indexed the updated information. Regular indexing is crucial to your site’s performance and ranking in search engine results pages. Let’s use this blog post as an example:
We are writing this post as a source of information for people to read and share, because it’s important and valuable. While we can share our post manually using social media, word of mouth, or directly via our website, we want it to appear in relevant SERPs for people who are seeking out information on this topic. If this post has not been crawled and indexed by Google, it won’t appear when people search for things like “what is indexing?”. In order to appear in Google search results, our post needs to have first been crawled, processed and added to the search index.
How do I check if my site has been indexed?
Now that we understand how important it is for Google to index our web pages, how do we check that it’s been done? Thankfully, Google makes this relatively easy.
Index Status Report
Using Google Search Console, you can view your website’s Index Status report. The Index Status report “provides data about the URLs that Google tried to index in the current property…” It shows the total number of pages that have been indexed on your site, the dates on which that indexing took place, and the number of pages blocked by noindex tags.
You can use the report to spot gaps or issues in your site’s indexed content. For example, if you have 300 pages on your website but the report only shows 250 indexed pages, you know you need to request a re-crawl so that the missing content can be recorded. Similarly, if the report shows that the site has not been indexed recently or regularly, you need to prompt Google to perform the crawl again.
Get more information about Index Status Reports here.
You can control which pages of your website are added to the search index and which aren’t. Pages can be kept out of the index with a noindex meta tag, which means they won’t show up in search results. It’s common to noindex pages such as thank-you and confirmation pages, because they don’t hold any valuable content. If a page of your site isn’t getting indexed, check the meta tag!
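For reference, the noindex directive is a single line in the page’s HTML head. A minimal example:

```html
<head>
  <!-- Tells search engine bots not to add this page to their index -->
  <meta name="robots" content="noindex">
</head>
```

Remove this tag (and request a re-crawl) if a page that should rank has picked it up by mistake.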
How to prompt Google to crawl your site
You’ve created a new website, edited your existing site or just added some new content. Maybe you’ve generated your Index Status report and are feeling a little forgotten! How many millions of pages does that little bot need to crawl before it gets to mine? Thankfully, there are things you can do to get Googlebot’s attention and get your site crawled and indexed faster. Here are some important steps to take:
Create a Sitemap
An XML sitemap is an important document on your website’s server. Put simply, it is a list of all of the pages on your website, and search engines use it to find new content that has been added to your site. Having a sitemap speeds up the crawling process, because it tells the bot where to travel within your website.
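A minimal sitemap looks like the file below. The URLs and dates are placeholders for illustration; a real sitemap lists your own pages, each with an optional last-modified date so bots know what has changed:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2020-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/blog/what-is-indexing/</loc>
    <lastmod>2020-01-20</lastmod>
  </url>
</urlset>
```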
How you create a sitemap will depend on your website’s CMS; platforms such as WordPress have plugins and simple tools that let you generate and submit your sitemap directly. If you’re unsure, it’s something worth contacting your web developer about!
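If your CMS doesn’t offer a plugin, a sitemap is simple enough to generate yourself. Here is a minimal sketch using only Python’s standard library; the page URLs and dates are hypothetical placeholders you would swap for your own:

```python
# Minimal sketch: build a sitemap.xml from a list of (url, lastmod) pairs.
# The example URLs/dates below are placeholders, not a real site.
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """pages: iterable of (url, lastmod) tuples; returns the sitemap as an XML string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    # xml_declaration=True requires Python 3.8+
    return ET.tostring(urlset, encoding="unicode", xml_declaration=True)

if __name__ == "__main__":
    print(build_sitemap([
        ("https://www.example.com/", "2020-01-15"),
        ("https://www.example.com/blog/what-is-indexing/", "2020-01-20"),
    ]))
```

Save the output as `sitemap.xml` at the root of your site so search engines can find it.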
Submit your Sitemap
Manually submitting your sitemap to Google is a great way to speed up the crawling and indexing process, as it sends a clear signal to Google. In fact, according to a study by HubSpot, Googlebot was able to crawl a website with a manually submitted sitemap in just 14 minutes, but took over 1,300 minutes to crawl a site without one!
You can easily submit your sitemap to Google using Google Search Console. Simply select Sitemaps under the Crawl tab and enter your sitemap URL using the ‘Add/Test Sitemap’ button. Get full instructions here.