Searching Craigslist Using C#

10/22/2009

My projects have been all over the place lately. Along with the Hulu, Netflix, etc. search functions, I wanted to be able to search other sites, and one of them is Craigslist. I don't search it that often, but it is one of those sites that I hit from time to time. So I figured, how hard could it be to search? Not that difficult, it turns out:

    using System.Web; // HttpUtility

    /// <summary>
    /// Craigslist helper
    /// </summary>
    public static class Craigslist
    {
        #region Public Static Functions

        /// <summary>
        /// Searches craigslist
        /// </summary>
        /// <param name="Site">Site to search (for instance http://charlottesville.craigslist.org/ for Charlottesville, VA)</param>
        /// <param name="Category">Category to search within</param>
        /// <param name="SearchString">Search term</param>
        /// <returns>RSS feed object</returns>
        public static Document Search(string Site, Category Category, string SearchString)
        {
            // Look up the abbreviation once; it is used in both the path and the query string.
            string Abbreviation = Names[(int)Category];
            // Trim any trailing slash so the sample site above does not produce "//search/".
            return new Document(Site.TrimEnd('/') + "/search/" + Abbreviation + "?query=" + HttpUtility.UrlEncode(SearchString) + "&catAbbreviation=" + Abbreviation + "&minAsk=min&maxAsk=max&format=rss");
        }

        #endregion

        #region Private Static Variables

        /// <summary>
        /// Abbreviations for categories that Craigslist uses.
        /// The order must match the Category enum, since the enum value is used as the index.
        /// </summary>
        private static readonly string[] Names ={"sss","art","pts","bab","bar","bik","boa","bks","bfs","cta","ctd",
            "cto","emd","clo","clt","sys","ele","grd","zip","fua","fud","fuo","tag","gms","for","hsh",
            "wan","jwl","mat","mcy","msg","pho","rvs","spo","tix","tls"};

        #endregion
    }

    #region Enum

    /// <summary>
    /// Category to search within
    /// </summary>
    public enum Category
    {
        All_For_Sale,
        For_Sale_Arts_Crafts,
        For_Sale_Auto_Parts,
        For_Sale_Baby_Kid_Stuff,
        For_Sale_Barter,
        For_Sale_Bicycles,
        For_Sale_Boats,
        For_Sale_Books,
        For_Sale_Business,
        For_Sale_Cars_And_Trucks_All,
        For_Sale_Cars_And_Trucks_Dealer,
        For_Sale_Cars_And_Trucks_Owner,
        For_Sale_CDs_DVDs_VHS,
        For_Sale_Clothing,
        For_Sale_Collectibles,
        For_Sale_Computers_Tech,
        For_Sale_Electronics,
        For_Sale_Farm_Garden,
        For_Sale_Free_Stuff,
        For_Sale_Furniture_All,
        For_Sale_Furniture_By_Dealer,
        For_Sale_Furniture_By_Owner,
        For_Sale_Games_Toys,
        For_Sale_Garage_Sales,
        For_Sale_General,
        For_Sale_Household,
        For_Sale_Items_Wanted,
        For_Sale_Jewelry,
        For_Sale_Materials,
        For_Sale_Motorcycles,
        For_Sale_Musical_Instruments,
        For_Sale_Photo_Video,
        For_Sale_Recreational_Vehicles,
        For_Sale_Sporting_Goods,
        For_Sale_Tickets,
        For_Sale_Tools
    }

    #endregion
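
To make the URL that Search puts together a bit more concrete, here is a minimal sketch of just the string-building part, pulled out so it can run without the Document class. The names CraigslistUrlSketch and BuildSearchUrl are made up for this illustration and are not in the utility library. It also assumes Uri.EscapeDataString in place of HttpUtility.UrlEncode so that no reference to System.Web is needed; the two encode spaces differently (%20 versus +). It only mirrors the URL format used above and does not check whether Craigslist accepts it.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the utility library.
public static class CraigslistUrlSketch
{
    // Builds the RSS search URL from a site, a category abbreviation
    // (for instance "bik" for bicycles) and a search term.
    public static string BuildSearchUrl(string site, string categoryAbbreviation, string searchTerm)
    {
        // Drop any trailing slash so the result never contains "//search/".
        string baseUrl = site.TrimEnd('/');

        // Assumption: Uri.EscapeDataString stands in for HttpUtility.UrlEncode here.
        string query = Uri.EscapeDataString(searchTerm);

        return baseUrl + "/search/" + categoryAbbreviation
            + "?query=" + query
            + "&catAbbreviation=" + categoryAbbreviation
            + "&minAsk=min&maxAsk=max&format=rss";
    }
}
```

Calling BuildSearchUrl("http://charlottesville.craigslist.org/", "bik", "road bike") gives http://charlottesville.craigslist.org/search/bik?query=road%20bike&catAbbreviation=bik&minAsk=min&maxAsk=max&format=rss, which is the kind of string that ends up being handed to the Document constructor.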

Note that there are other sections to Craigslist, but I really only look for things being sold (although I did manage to find my current job through the site). Basically you just need to pass in the individual site (since they have hundreds of them), the category you want to search in, and the search term. In return you're given an RSS Document object. (Note that this is a slightly modified version of the RSS helper that can be found in my utility library.) The reason it's modified is that Craigslist doesn't serve a normal RSS feed. Instead they use RDF with RSS elements (essentially RSS 1.0), where the item elements sit next to the channel element instead of inside it. The only real difference, in terms of code, from the version that is currently in the utility library is in the Document class itself, specifically the constructor:

    // Needs: using System; using System.Collections.Generic; using System.Xml;
    public Document(string Location)
    {
        try
        {
            XmlDocument Document = new XmlDocument();
            Document.Load(Location);
            foreach (XmlNode Children in Document.ChildNodes)
            {
                // Ordinal comparisons keep the element name checks independent of the machine's culture.
                if (Children.Name.Equals("rss", StringComparison.OrdinalIgnoreCase))
                {
                    // Normal RSS: the items live inside each channel.
                    foreach (XmlNode Child in Children.ChildNodes)
                    {
                        if (Child.Name.Equals("channel", StringComparison.OrdinalIgnoreCase))
                        {
                            Channels.Add(new Channel((XmlElement)Child));
                        }
                    }
                }
                else if (Children.Name.Equals("rdf:rdf", StringComparison.OrdinalIgnoreCase))
                {
                    // RDF: the items are siblings of the channel, so collect them separately.
                    List<Item> Items = new List<Item>();
                    foreach (XmlNode Child in Children.ChildNodes)
                    {
                        if (Child.Name.Equals("channel", StringComparison.OrdinalIgnoreCase))
                        {
                            Channels.Add(new Channel((XmlElement)Child));
                        }
                        else if (Child.Name.Equals("item", StringComparison.OrdinalIgnoreCase))
                        {
                            Items.Add(new Item((XmlElement)Child));
                        }
                    }
                    // Attach the collected items to the first channel.
                    if (Channels.Count > 0)
                    {
                        Channels[0].Items = Items;
                    }
                }
            }
        }
        // Any load or parse failure is swallowed, which leaves the document with no channels.
        catch { }
    }

I'm not that happy with the way that ends up, but it is what it is. Other than that, though, it's pretty simple to use. And if you want to use your own RSS parser instead, it shouldn't be too difficult to write, especially considering it's not exactly an RSS feed. Anyway, try it out, leave feedback, and happy coding.
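
For anyone who does want to roll their own parser, something along these lines might be a starting point. It is only a rough sketch, assuming LINQ to XML (System.Xml.Linq) is available and that all you care about is each item's title and link. RdfFeedSketch, ReadItems and ChildValue are hypothetical names invented for this example; they are not the Document, Channel or Item classes from the utility library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-alone reader for illustration; not the utility library's Document class.
public static class RdfFeedSketch
{
    // Returns a (title, link) pair for every <item> element found in the feed text.
    public static List<KeyValuePair<string, string>> ReadItems(string feedXml)
    {
        XDocument feed = XDocument.Parse(feedXml);
        var results = new List<KeyValuePair<string, string>>();

        // Matching on the local name ignores the namespace, so the same loop handles
        // <rss><channel><item> and <rdf:RDF><item> without special cases.
        foreach (XElement item in feed.Descendants().Where(e => e.Name.LocalName == "item"))
        {
            string title = ChildValue(item, "title");
            string link = ChildValue(item, "link");
            results.Add(new KeyValuePair<string, string>(title, link));
        }
        return results;
    }

    // Returns the text of the first direct child with the given local name, or "" if there is none.
    private static string ChildValue(XElement parent, string localName)
    {
        XElement child = parent.Elements().FirstOrDefault(e => e.Name.LocalName == localName);
        return child == null ? "" : child.Value;
    }
}
```

The sketch takes the feed as a string so it can be tried without a network connection. To read straight from a URL, XDocument.Load accepts a URI string and could replace the XDocument.Parse call.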


