HTTP Response Codes and SEO: An Introduction
Sometimes the most technical aspects of SEO are the most important, because they can dictate whether or not your pages end up in the search engine indexes at all. It doesn’t matter how well you write your copy or optimise your pages - if you can’t be indexed, you can’t be found.
HTTP headers are one such topic - they can be hard to understand for the non-technical SEO, but can completely decide the fate of your site in the SERPs.
An Introduction to HTTP Requests
When you point your browser in the direction of a website, the first thing it does is send a request to that website. This request details exactly what data it wants, in what format it will accept a response and generally, who it is. The request may look like this:
GET /sem-blog/ HTTP/1.1
Accept:*/*
Accept-Language: en-gb
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9b1) Gecko/2007110703 Firefox/3.0b1
Host: www.thesempai.com
Connection: Keep-Alive
This request is saying: “I am the Firefox web browser v3.0b1. I want the page called /sem-blog/ from your site www.thesempai.com. I would prefer it if you return it in British English, and if you want to compress it and make the file size smaller, that’s fine by me.”
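To make the structure of that request a little more concrete, here is a minimal Python sketch that splits it into its request line and its headers. The function name `parse_request` is my own invention for illustration, and the parsing is deliberately naive (it assumes a well-formed request with one header per line).

```python
def parse_request(raw: str) -> dict:
    """Split a raw HTTP request head into its request line and headers."""
    lines = [line for line in raw.strip().splitlines() if line.strip()]
    # The first line holds the method, the path and the protocol version.
    method, path, version = lines[0].split()
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, since values may contain colons too.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "path": path, "version": version, "headers": headers}


RAW_REQUEST = """\
GET /sem-blog/ HTTP/1.1
Accept:*/*
Accept-Language: en-gb
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9b1) Gecko/2007110703 Firefox/3.0b1
Host: www.thesempai.com
Connection: Keep-Alive
"""

request = parse_request(RAW_REQUEST)
print(request["method"], request["path"], "from", request["headers"]["host"])
```

Header names are lower-cased here because HTTP header names are case-insensitive, so `Host` and `host` mean the same thing.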
The HTTP Response
The web server receives the request from the browser, and forms a response. This response is broken down into two parts - the header and the content.
The header tells the browser a bit about the web server, and a bit about what it thinks of the request. The content is the HTML code that makes up a web page.
The header of the response may look like this:
HTTP/1.1 200 OK
Date: Mon, 26 May 2008 18:24:21 GMT
Server: Apache/1.3.34 Ben-SSL/1.55
X-Pingback: http://www.thesempai.com/sem-blog/xmlrpc.php
X-Powered-By: PHP/4.4.8
Connection: close
Content-Type: text/html; charset=UTF-8
This says: “Your request was ok and the page is below, in HTML format, encoded in UTF-8. According to me, the date is the 26th of May 2008 and the time is 18:24:21 GMT. I’m running the Apache web server, and I’m powered by PHP 4.4.8. If you want to ping my blog use this URL…”.
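If you would rather read a header like this with code than by eye, the sketch below shows one way to do it in Python. It leans on the standard library's `email` parser, which understands the same "Name: value" layout; the helper name `parse_response_head` is an assumption of mine, not part of any library.

```python
from email import message_from_string

RAW_RESPONSE_HEAD = """\
HTTP/1.1 200 OK
Date: Mon, 26 May 2008 18:24:21 GMT
Server: Apache/1.3.34 Ben-SSL/1.55
X-Pingback: http://www.thesempai.com/sem-blog/xmlrpc.php
X-Powered-By: PHP/4.4.8
Connection: close
Content-Type: text/html; charset=UTF-8
"""


def parse_response_head(raw: str):
    """Return (status code, reason phrase, headers) from a raw response head."""
    # The status line comes first; everything after it is headers.
    status_line, _, header_block = raw.strip().partition("\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = message_from_string(header_block)
    return int(code), reason, headers


code, reason, headers = parse_response_head(RAW_RESPONSE_HEAD)
print(code, reason)
print(headers["Server"])
print(headers.get_content_type(), headers.get_content_charset())
```

The status code on the first line is the part that matters most for SEO; the rest is supporting detail.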
Response Codes
From an SEO point of view, quite a lot of the response information can be useful, but the most important bit is the line:
HTTP/1.1 200 OK
This says “your request was ok - the document was found”. That is called a 200 response code.
There are a number of different response codes that can be returned by an HTTP server, but the most important ones are:
- 200 - Everything is ok, the document was found and is below
- 404 - The document could not be found. Instead the document below is what you should show to the user to tell them what to do next
- 301 - The document has moved permanently. Go to the following web address instead.
- 302 - The document has moved temporarily. Go to the following web address instead.
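As a quick reference, Python's standard library already knows the official name of every response code. The small sketch below looks up the four codes listed above and groups them by their first digit; the `describe` helper and its family labels are my own shorthand.

```python
from http import HTTPStatus

# Rough families, keyed by the first digit of the code.
FAMILIES = {2: "success", 3: "redirect", 4: "client error", 5: "server error"}


def describe(code: int) -> str:
    """Return the code, its official reason phrase and its rough family."""
    status = HTTPStatus(code)
    family = FAMILIES.get(code // 100, "other")
    return f"{code} {status.phrase} ({family})"


for code in (200, 404, 301, 302):
    print(describe(code))
```

Note that the official phrase for 302 is simply "Found"; it is the temporary redirect described above.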
How does this affect SEO?
The way search engine spiders react when browsing your site very much depends on the response codes that they get back from your server. A badly configured server sending back the wrong response codes can stop your site from ever being indexed.
Some common examples of really bad server configurations are:
- The server always returns the response code 404 - Just because you can see the site and browse it fine, doesn’t mean that your server isn’t returning a 404 response instead of a 200 response. Some badly programmed scripts that give sites “search engine friendly URLs” return 404 values instead of 200. In this case, the search engines won’t index these pages at all.
- The server never returns 404, even when a page is not found - Say you go to a nonsense URL on your web server - chances are you’ll get a page telling you the content cannot be found. This is great for the user, but if this page isn’t returning a 404 code then the spider will assume the page is ok. This means that whenever you remove a page, or if content on your site expires, then the page will still be indexed in the search engines, but with the “Sorry this page cannot be found” text instead of the original content. This page could compete with your other pages in the search results, and creates unnecessary duplicate content throughout your listings.
- The server redirects pages using 302, not 301 - Let’s say you have a special campaign which has a short URL like “/springoffer/” and it redirects to another page on your site like /garmin-nuvi-spring-offer/. It should use a 301 redirect to tell the search engines that the real page is /garmin-nuvi-spring-offer/, not /springoffer/. Using a 302 will confuse the search engines as you are saying “The real page is /springoffer/ and /garmin-nuvi-spring-offer/ is just a temporary page”. This will make it hard for /garmin-nuvi-spring-offer/ to be listed properly in the search results.
As you can see, the examples above can have quite a drastic impact on your search results. It is therefore extremely important to understand what response codes your site is giving in different situations.
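The three misconfigurations above boil down to a simple rule: each kind of URL has one response code it should return. Here is a rough Python sketch of that rule as a checklist. The labels (`live_page`, `missing_page`, `permanent_redirect`) and the sample observations are made up for illustration; in practice you would fill them in from your own site.

```python
from typing import Optional

# The response code each kind of URL should return (labels are hypothetical).
EXPECTED = {
    "live_page": 200,           # a page you want indexed
    "missing_page": 404,        # a nonsense or removed URL
    "permanent_redirect": 301,  # e.g. a short campaign URL
}


def audit(kind: str, status: int) -> Optional[str]:
    """Return a warning if the observed status is wrong, otherwise None."""
    expected = EXPECTED[kind]
    if status == expected:
        return None
    return f"{kind}: expected {expected}, got {status}"


# Made-up observations: one per misconfiguration above, plus one healthy page.
observations = [
    ("live_page", 404),
    ("missing_page", 200),
    ("permanent_redirect", 302),
    ("live_page", 200),
]

problems = []
for kind, status in observations:
    message = audit(kind, status)
    if message:
        problems.append(message)

print("\n".join(problems))
```

This only compares numbers you have already collected; it does not fetch anything itself.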
Finding out response codes
There are numerous tools available which will allow you to enter a web address and find the server response header. One of them is on this site: our HTTP response checker.
These web-based tools are great because you can access them from any PC - for example, when on site with a client. For your desktop, however, you probably need something a little more interactive, like the Firefox Live HTTP Headers add-on. This Firefox plugin allows you to view the headers being received by your browser in real time while you surf the web.

Because Live HTTP Headers follows you as you surf, you can see step by step exactly what any visit to your site looks like, in great detail.
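If you are comfortable with a little scripting, you can also check a response code yourself. The sketch below uses Python's built-in `http.client` module, which does not follow redirects automatically, so a 301 or 302 is reported as-is together with its `Location` header. The function name `fetch_status` is my own, and the sketch assumes a plain HTTP (not HTTPS) site.

```python
import http.client


def fetch_status(host: str, path: str = "/", port: int = 80, timeout: float = 10):
    """Request one URL and return (status code, Location header or None)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        # http.client never follows redirects, so this is the first response.
        return response.status, response.getheader("Location")
    finally:
        conn.close()


# Example call (needs network access, so it is left commented out):
# print(fetch_status("www.thesempai.com", "/sem-blog/"))
```

Try it with a nonsense path on your own site as well: if the first value that comes back is 200 rather than 404, you have the "never returns 404" problem described earlier.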
I’ve found a problem, what next?
So you’ve gone through your site with Live HTTP Headers or our HTTP Response Codes tool, and you’ve found a few bad pages - what next? Hopefully you have some clever developer nearby who can help you fix the issue, but if not and it’s down to you, then you need to make some changes. Unfortunately this is a bit more complex than just changing your HTML code, and depends on what sort of setup you have on your web server.
The most common server setups are Apache with PHP, or Microsoft IIS with ASP. Here are a few resources for further reading:
- For Apache users: 301 redirects using .htaccess
- For PHP users: Returning a 301 Header (note that Apache / PHP users can use either this or the one above. If using this method, make sure it’s one of the first things in your PHP file - before you return any HTML content.).
- For IIS users: 301 redirects in IIS
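Whatever your setup, the logic you are aiming for is the same. The sketch below shows that logic in Python rather than Apache, PHP or IIS, purely as an illustration: the `REDIRECTS` and `PAGES` tables and the `route` function are hypothetical, and the URLs are the campaign example from earlier in this post. A real site would do the equivalent in its own server configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site data: one short campaign URL and one real page.
REDIRECTS = {"/springoffer/": "/garmin-nuvi-spring-offer/"}
PAGES = {"/garmin-nuvi-spring-offer/": "<h1>Spring offer</h1>"}
HTML = {"Content-Type": "text/html; charset=UTF-8"}


def route(path: str):
    """Decide the (status, headers, body) for a requested path."""
    if path in REDIRECTS:
        # Permanent move: 301 plus the new address in the Location header.
        return 301, {"Location": REDIRECTS[path]}, ""
    if path in PAGES:
        return 200, dict(HTML), PAGES[path]
    # Unknown URL: a friendly page for the user, but with a real 404 code.
    return 404, dict(HTML), "<h1>Sorry, this page cannot be found</h1>"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers, body = route(self.path)
        payload = body.encode("utf-8")
        # The status line and headers go out before any content.
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


# To try it locally:
# HTTPServer(("127.0.0.1", 8000), Handler).serve_forever()
```

Notice that the status and headers are always sent before the body, which is the same reason the PHP method has to come before any HTML output.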
Other Forms of Redirect
Just to make life more complicated, redirects do not necessarily happen only through HTTP headers. They can happen in HTML code too. Common examples are:
- META redirects - As an example, the page http://www.somesillysite.com/the-first-page.html has a line of code such as:
<meta http-equiv="Refresh" content="0; URL=http://www.somesillysite.com/the-real-destination.html" />
Once the web page the-first-page.html has loaded, this code will cause the browser to jump instantly to the page the-real-destination.html.
- JavaScript redirects - The page contains some JavaScript code that runs once it has fully loaded, and redirects the browser to a new page.
- iFrame redirects - Although not strictly a redirect, an unfortunately common method of achieving the same result is to host one page within an iFrame on another page. For example, if you want to use my-friendly-page.html to point to some-really-unfriendly-url.html, you create my-friendly-page.html with an iFrame containing the contents of some-really-unfriendly-url.html.
These three methods of redirecting are often used by developers simply because they are easier to implement than a proper 301 redirect. However they are very difficult for a search engine spider to follow, and they provide no extra information to the spider about why the redirect is occurring. Is it permanent or is it temporary? Which content should they pay attention to? The spiders have to make their best guess.
In all of these cases the server would have returned a 200 response code, before passing the user on to a second page, also with a 200 response code.
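Because the response code is 200, a header check alone will not reveal these redirects; you have to look inside the HTML. As a rough illustration, the Python sketch below scans a page for a META refresh tag and pulls out the destination. The class and function names are my own, and it only covers the META case, not JavaScript or iFrames.

```python
from html.parser import HTMLParser
from typing import Optional


class MetaRefreshFinder(HTMLParser):
    """Remember the target URL of the first META refresh tag seen."""

    def __init__(self):
        super().__init__()
        self.target: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "0; URL=http://example/page.html".
        _, _, rest = (attributes.get("content") or "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            self.target = rest[4:].strip()


def find_meta_refresh(html: str) -> Optional[str]:
    """Return the META refresh destination, or None if the page has none."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.target


SAMPLE = (
    '<html><head><meta http-equiv="Refresh" '
    'content="0; URL=http://www.somesillysite.com/the-real-destination.html" />'
    "</head><body></body></html>"
)
print(find_meta_refresh(SAMPLE))
```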
In an ideal world, the developer would have used none of these methods, and instead used a 301 redirect to point from the initial page to the new page.
Please note: META and JavaScript redirects are bad for SEO, but the iFrame method is a messy botch job and I advise you avoid it like the plague.
June 4th, 2008 at 6:10 pm
I’d suggest using Apache mod_rewrite to 301 redirect documents - as you say, META and JS are not ideal for SEO, and iFrames quickly become nasty to maintain and scale.
You can use mod_rewrite via the .htaccess file in your root folder. With this you can 301 permanently redirect any pages and preserve any inbound links they have acquired.
June 4th, 2008 at 6:35 pm
JJ - sounds good to me. Thanks for your feedback!