Monday, June 8, 2009

Using Google SiteMaps

Today I wanted to create a dynamic Google Sitemap file for my site, so I started reading the documentation on creating a sitemap. Google's format is very well documented and makes all the points clear; you can view it here. Google Webmaster Tools is also a powerful feature: it lets you view your site stats, and it is where you need to submit your Google Sitemap.

The sitemaps protocol is defined at http://www.sitemaps.org/protocol.php

The sitemap file is placed in the site's root directory and lists all the site URLs that you want Google to index. You can have a maximum of 50,000 URLs per Sitemap. For additional URLs, create another sitemap file and link the files together with a Sitemap index file. Google does a great job of explaining all this, so make sure you go through the documentation linked above.
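As a rough illustration of the 50,000 URL limit and the index file, here is a minimal sketch. The class name, the base URL parameter and the sitemap1.xml, sitemap2.xml naming scheme are my own assumptions for the example, not something the protocol or Google prescribes. It only works out how many sitemap files are needed and builds the index that points at them.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;
using System.Xml;

public static class SitemapIndexSketch
{
    // Namespace and per-file URL limit as given by the sitemaps.org protocol.
    private const string Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
    public const int MaxUrlsPerSitemap = 50000;

    // Number of sitemap files needed to hold totalUrls entries (rounds up).
    public static int SitemapFileCount(int totalUrls)
    {
        if (totalUrls <= 0) return 0;
        return (totalUrls + MaxUrlsPerSitemap - 1) / MaxUrlsPerSitemap;
    }

    // Builds a sitemap index listing each child sitemap file.
    // The file names (sitemap1.xml, sitemap2.xml, ...) are hypothetical.
    public static string BuildIndex(string baseUrl, int fileCount, DateTime lastModified)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            XmlWriterSettings settings = new XmlWriterSettings();
            settings.Indent = true;
            settings.Encoding = new UTF8Encoding(false); // UTF-8 without a byte order mark

            using (XmlWriter writer = XmlWriter.Create(ms, settings))
            {
                writer.WriteStartDocument();
                writer.WriteStartElement("sitemapindex", Ns);

                for (int i = 1; i <= fileCount; i++)
                {
                    writer.WriteStartElement("sitemap", Ns);
                    writer.WriteElementString("loc", Ns, baseUrl + "/sitemap" + i + ".xml");
                    writer.WriteElementString("lastmod", Ns,
                        lastModified.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
                    writer.WriteEndElement(); // </sitemap>
                }

                writer.WriteEndElement(); // </sitemapindex>
                writer.WriteEndDocument();
            }

            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }
}
```

For example, a site with 120,000 URLs would need SitemapIndexSketch.SitemapFileCount(120000) files, that is three, and BuildIndex("http://www.example.com", 3, DateTime.Now) would produce the index to submit in place of a single sitemap.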

Generating the dynamic SiteMap file:

Generating it is just like generating any other XML file, except that you have to adhere to the standards set out in the sitemap protocol.
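To give an idea of what adhering to the protocol means for a single entry, here is a small sketch of the main rules as I read them on sitemaps.org: loc must start with the protocol and be shorter than 2,048 characters, changefreq must be one of a fixed set of values, priority must lie between 0.0 and 1.0, and lastmod uses a W3C date format such as YYYY-MM-DD. The class and method names are made up for this example.

```csharp
using System;
using System.Globalization;

public static class SitemapEntryRules
{
    // The changefreq values listed in the sitemaps.org protocol.
    private static readonly string[] AllowedChangeFreq =
        { "always", "hourly", "daily", "weekly", "monthly", "yearly", "never" };

    // loc must begin with the protocol and be shorter than 2,048 characters.
    public static bool IsValidLoc(string loc)
    {
        if (string.IsNullOrEmpty(loc) || loc.Length >= 2048) return false;
        return loc.StartsWith("http://", StringComparison.OrdinalIgnoreCase)
            || loc.StartsWith("https://", StringComparison.OrdinalIgnoreCase);
    }

    // changefreq must be one of the allowed lowercase values.
    public static bool IsValidChangeFreq(string value)
    {
        return Array.IndexOf(AllowedChangeFreq, value) >= 0;
    }

    // priority must be a number from 0.0 to 1.0 inclusive.
    public static bool IsValidPriority(string value)
    {
        double p;
        return double.TryParse(value, NumberStyles.AllowDecimalPoint,
                               CultureInfo.InvariantCulture, out p)
            && p >= 0.0 && p <= 1.0;
    }

    // lastmod in the short W3C date form, for example 2009-06-08.
    public static string FormatLastMod(DateTime date)
    {
        return date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
    }
}
```

Checks like these are optional; they are just a cheap way to catch a bad entry before Google does.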

Here's the code for the page that generates the sitemap. You have to modify the query to suit your needs.


using System;
using System.Linq;
using System.Text;
using System.Xml;

public partial class NewsSiteMap : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // DataClassesDataContext is this site's LINQ to SQL data context.
        using (DataClassesDataContext db = new DataClassesDataContext())
        {
            // Take(49999) keeps the file under the 50,000 URL limit.
            // Add an orderby on publish_date if you need the newest articles first.
            var latestNewsArticles = (from articles in db.article_to_magazine_mappings
                                      where articles.magazine_id == 1
                                            && articles.article.article_type_id != 20
                                      select new
                                      {
                                          ID = articles.article.id,
                                          Title = articles.article.web_title,
                                          TitleLink = articles.TitleLink,
                                          Author = articles.AuthorName,
                                          TitleKeywords = articles.article.meta_tags,
                                          Date = articles.publish_date,
                                          LegacyID = articles.article.legacy_id
                                      })
                                     .Take(49999);

            // Send XML instead of the usual HTML page output.
            Response.Clear();
            Response.ContentType = "text/xml";

            using (XmlTextWriter writer = new XmlTextWriter(Response.OutputStream, Encoding.UTF8))
            {
                writer.WriteStartDocument();

                // Root element with the namespace required by the protocol.
                writer.WriteStartElement("urlset");
                writer.WriteAttributeString("xmlns", "http://www.sitemaps.org/schemas/sitemap/0.9");

                // One <url> entry per article.
                foreach (var newsarticle in latestNewsArticles)
                {
                    writer.WriteStartElement("url");

                    // WriteElementString escapes XML special characters in the value.
                    writer.WriteElementString("loc", ConcatenateUrls(newsarticle.TitleLink, newsarticle.Author, newsarticle.TitleKeywords, newsarticle.ID.ToString()));
                    writer.WriteElementString("lastmod", String.Format("{0:yyyy-MM-dd}", newsarticle.Date));
                    writer.WriteElementString("changefreq", "daily");
                    writer.WriteElementString("priority", "1.0");

                    writer.WriteEndElement();
                }

                // Closes any elements that are still open, including <urlset>.
                writer.WriteEndDocument();
                writer.Flush();
            }
        }

        Response.End();
    }

    // Builds the article URL. GetHost is a helper defined elsewhere in the
    // site (not shown in this post).
    protected string ConcatenateUrls(string title, string author, string keywords, string articleid)
    {
        return GetHost(title + "/By_" + author + keywords + "/" + articleid);
    }
}


In the .aspx markup file, you just need the following line:
<%@ Page Language="C#" AutoEventWireup="true" CodeFile="newsSitemap.aspx.cs" Inherits="NewsSiteMap" %>


Make sure your content is properly encoded so that the XML doesn't break. For example, an & in a URL has to appear as &amp; in the file.
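To see what that encoding looks like in practice, here is a tiny sketch that writes a single loc element through XmlWriter and returns the result as a string. The class name and the example URL are made up, and the namespace and XML declaration are left out to keep the output short. The point is only that the writer escapes the ampersand for you, so you should not escape it by hand as well.

```csharp
using System;
using System.IO;
using System.Xml;

public static class LocEscapingSketch
{
    // Writes one <loc> element and returns the XML text that was produced.
    public static string WriteLoc(string url)
    {
        StringWriter sw = new StringWriter();

        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true; // keep the output to the element itself

        using (XmlWriter writer = XmlWriter.Create(sw, settings))
        {
            // The writer escapes & and < in the text content.
            writer.WriteElementString("loc", url);
        }

        return sw.ToString();
    }
}
```

Calling LocEscapingSketch.WriteLoc("http://www.example.com/news.aspx?id=12&type=3") gives <loc>http://www.example.com/news.aspx?id=12&amp;type=3</loc>.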
After you are done, submit your sitemap file to Google, and you are good!
