404 Errors with .NET

Published on September 21, 2008 | Filed Under SEO

I have had the opportunity to do some Search Engine Optimization (SEO) with the company I currently work for. One of the major problems we have been experiencing is that IIS 6 and ASP.NET 2.0 don’t handle 404 errors the way you might expect them to.

The purpose of this article is to teach you how to use 301 redirects when you’re creating 404 error pages.

As a side note, I want to mention that I’m not sure if this issue has been resolved in IIS 7 and/or ASP.NET 3.5.

Background

When you set up your website to have a custom 404 error page, your web browser appears to take you straight there, but the server actually answers with a 302 (Temporary Redirect) first. Here are the headers I received from a test website I set up:

  HTTP/1.1 302 Found
  Server: ASP.NET Development Server/8.0.0.0
  Date: Mon, 22 Sep 2008 02:15:40 GMT
  X-AspNet-Version: 2.0.50727
  Location: /DevPaper/404ErrorPage.aspx?aspxerrorpath=/DEVPAPER/X.ASPX
  Cache-Control: private
  Content-Type: text/html; charset=utf-8
  Content-Length: 175
  Connection: Close

Why is this bad? A 302 is not an error at all: it tells the spider crawling your website that the page has only moved temporarily and will return one day, so the page remains in the search engine’s index. Not only will this drive more people to pages that no longer exist, it will also cause your overall ranking within the search engine to get worse.

The solution for this is actually quite simple; it involves creating a new HTTP Module.

What is an HTTP Module?

If you have not worked with an HTTP Module before, it is basically a way of controlling how each request to your website is processed on the server before the response is ever sent to the client. Microsoft’s AJAX Libraries, for example, use a module called ScriptModule, which hooks into request processing to support their JavaScript features.

The Solution

In order to add the HTTP Module to your website, we have to create it as its own project. To make this easier, add this project to your current website solution file.

Open your current website Solution and go to File > Add > New Project

Select Class Library and give your new module a name. I have chosen 404ErrorModule.

Now the nice thing about this is that you don’t actually need to give it a namespace. For the sake of making sure that I completely explain everything, I am going to give it the namespace of DevPaper. Don’t forget to give your class a name as well. For the purposes of this article, I’m going to call my class ErrorHandler.

A new Class Library project does not reference System.Web, so your class will not recognize any of the System.Web classes. Right-click on your new project (not the solution) and select Add Reference. Under the .NET tab, select System.Web and press OK.

Now that we have our reference, add the using directive at the top of the class file for System.Web.

Now you can have your class implement the IHttpModule interface. The interesting thing to note here is that as soon as you enter IHttpModule after your class name, you will get a little blue box right under the word. Click it and select “Implement Interface IHttpModule”. This will automatically fill in the methods required by the IHttpModule interface.

Now the fun begins!

You have only two methods here: Dispose and Init. They are exactly as they sound:

  • Dispose: Handles Cleanup
  • Init: Runs when the module is first initialized.

We only need to handle one event: Error.

Add the following line to your Init method:

  context.Error += new EventHandler(context_Error);

If you press Tab after you enter “+=” it will create the following method for you:

  void context_Error(object sender, EventArgs e)
  {
    throw new Exception("The method or operation is not implemented.");
  }

You can go ahead and remove the Exception being thrown. In fact, remove that line from all of your methods; Visual Studio throws them in there so that you know when they’re being executed during debugging.

For your almost-final step, add these lines of code to your new method:

  HttpApplication context = (HttpApplication)sender;
  HttpResponse response = context.Context.Response;
  string newLocation = "http://localhost:2164/DevPaper/404ErrorPage.aspx?aspxerrorpath=/DevPaper/x.aspx";
  newLocation += context.Request.Url.PathAndQuery;
  response.StatusCode = 301;
  response.AddHeader("Location", newLocation);
  response.End();

This really isn’t as scary as it looks. Here’s the quick-and-dirty breakdown:

  HttpApplication context = (HttpApplication)sender;
  HttpResponse response = context.Context.Response;

These lines get hold of the Response object, which you’re used to seeing in ASP and ASP.NET.

  string newLocation = "http://localhost:2164/DevPaper/404ErrorPage.aspx?aspxerrorpath=/DevPaper/x.aspx";
  newLocation += context.Request.Url.PathAndQuery;

These lines build the URL of your 404 page, which is where the redirect will point. You don’t actually need the aspxerrorpath variable in there, but if you are logging 404 events, this information is vital!
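
As written, the requested path is appended straight after the hard-coded aspxerrorpath value, which matches the Location header shown later in this article (it ends in x.aspx/DEVPAPER/X.ASPX). If you would rather have aspxerrorpath carry only the requested path, one option is to build the URL in a small helper and encode the value. This is only a sketch: the RedirectTarget class and its Build method are hypothetical names, not part of the module above.

```csharp
using System;

// Hypothetical helper, not part of the original module.
public static class RedirectTarget
{
    // Appends the requested path to the error page URL as an encoded
    // aspxerrorpath value, so characters such as '?' and '&' in the
    // original request are not read as part of the error page's own query string.
    public static string Build(string errorPageUrl, string requestedPathAndQuery)
    {
        return errorPageUrl + "?aspxerrorpath=" + Uri.EscapeDataString(requestedPathAndQuery);
    }
}
```

Inside context_Error this could be called as RedirectTarget.Build("http://localhost:2164/DevPaper/404ErrorPage.aspx", context.Request.Url.PathAndQuery), and the result passed to AddHeader in place of newLocation.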

  response.StatusCode = 301;

Setting your status code to 301 causes the browser to redirect for you and it tells the spiders that this is a permanent redirect. This is very important, because it tells the spider to take that page out of the search engine’s index! This is the whole point of this process.

  response.AddHeader("Location", newLocation);

We are telling the spider (and browser) to redirect to the location that we have just created in the newLocation string.

  response.End();

Finally we want to end our response so that the spider doesn’t see any more information at this point.

That’s just about it! Our whole class now looks like this:
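
The listing itself appears to be missing from this copy of the article. Pieced together from the steps above, it would presumably look like the following. The public modifiers and the comments are assumptions; everything else is taken from the snippets already shown.

```csharp
using System;
using System.Web;

namespace DevPaper
{
    // Assembled from the steps above; names follow the article.
    public class ErrorHandler : IHttpModule
    {
        public void Init(HttpApplication context)
        {
            // Run context_Error whenever the application raises an unhandled error.
            context.Error += new EventHandler(context_Error);
        }

        public void Dispose()
        {
            // Nothing to clean up in this module.
        }

        void context_Error(object sender, EventArgs e)
        {
            HttpApplication context = (HttpApplication)sender;
            HttpResponse response = context.Context.Response;
            string newLocation = "http://localhost:2164/DevPaper/404ErrorPage.aspx?aspxerrorpath=/DevPaper/x.aspx";
            newLocation += context.Request.Url.PathAndQuery;
            response.StatusCode = 301;
            response.AddHeader("Location", newLocation);
            response.End();
        }
    }
}
```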

Before You Build It:

Before we build our class, let’s take a few moments to actually set it up!

Right-click on your new project and select Properties. In the Build tab, set your output path to be the bin folder of your website and save it.

Web.config:

Your Web.config file must also be set up. Add the following lines under your system.web node:
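
The markup appears to have been stripped from this copy of the article. Going by the description of the name and type attributes that follows, the entry would presumably look something like this. The name value "ErrorHandler" is an assumption, and the surrounding system.web element is shown only for context.

```xml
<system.web>
  <httpModules>
    <!-- name is arbitrary; type is "Namespace.ClassName, AssemblyName" -->
    <add name="ErrorHandler" type="DevPaper.ErrorHandler, 404ErrorModule" />
  </httpModules>
</system.web>
```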

  
    
  

Your name attribute can be whatever you would like it to be, but your type attribute is what is important. The first part is the name of your class (including its namespace), and the second part (after the comma) is the name of your class project, which is also the name of your .dll file.

So go ahead and build your project. After I built it, I tried to access a page that didn’t exist (and had the .aspx extension on it), and I received the following headers:

  HTTP/1.1 301 Moved Permanently
  Server: ASP.NET Development Server/8.0.0.0
  Date: Mon, 22 Sep 2008 03:15:30 GMT
  X-AspNet-Version: 2.0.50727
  Location: http://localhost:2164/DevPaper/404ErrorPage.aspx?aspxerrorpath=/DevPaper/x.aspx/DEVPAPER/X.ASPX
  Cache-Control: private
  Content-Type: text/html
  Connection: Close

As you can see, this is now spider-friendly and falls in line with good SEO practices.

This seems like a lot of work just to get a simple 301 redirect in place, but as far as I can tell, IIS 6 and ASP.NET 2.0 do not offer a simpler built-in way to do this.

The Final Step!

Everything seems to be in order at this point, but you may begin to notice that you don’t only get sent to the 404 page when a page doesn’t exist; you also get sent there if an Exception is thrown on the page, or on any other page error, because the Error event fires for all of them. The fix for this is actually quite easy. Just slightly change your context_Error method to read the following:

  void context_Error(object sender, EventArgs e)
  {
    HttpApplication context = (HttpApplication)sender;
    Exception lastError = context.Context.Server.GetLastError();
    // Only redirect when the error text says the requested page does not exist.
    if (lastError != null && lastError.ToString().Contains("does not exist"))
    {
      HttpResponse response = context.Context.Response;
      string newLocation = "http://localhost:2164/DevPaper/404ErrorPage.aspx?aspxerrorpath=/DevPaper/x.aspx";
      newLocation += context.Request.Url.PathAndQuery;
      response.StatusCode = 301;
      response.AddHeader("Location", newLocation);
      response.End();
    }
  }

The if statement checks that the text of the last error says the page does not exist. I’m personally not 100% positive that this is the only way to verify it, but it has worked for me up to this point. If anyone else knows a better way, please don’t hesitate to leave a comment on the blog!
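
One possible alternative to matching on the error text, offered as a sketch rather than a tested replacement: ASP.NET reports HTTP-level failures as HttpException, and its GetHttpCode method returns the status code the error maps to. The NotFoundCheck class and IsNotFound method below are hypothetical names, and whether this catches exactly the same cases as the string match is an assumption worth verifying against your own site.

```csharp
using System;
using System.Web;

// Hypothetical helper; the class and method names are illustrative.
public static class NotFoundCheck
{
    // True when the error is an HttpException that maps to HTTP status 404.
    public static bool IsNotFound(Exception error)
    {
        HttpException httpError = error as HttpException;
        return httpError != null && httpError.GetHttpCode() == 404;
    }
}
```

In context_Error, the condition would then read if (NotFoundCheck.IsNotFound(context.Context.Server.GetLastError())), which avoids depending on the wording of the error message.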

Further Reading:

There is actually a different way to handle this situation, and that is with a Global.asax file. Take a look at the alternate method and see if it seems more suitable for your needs.

As always, I welcome questions, comments, concerns, criticism and any other issues you may want to discuss. Please don’t hesitate to ask if you are curious!

~Derek Torrence
