Parse US Street Addresses with Regular Expression in C#

In my business, we do a lot with addresses. Generally, we rely on 3rd party products from companies like ESRI for what we need, but from time to time, we still need to parse an address the old-fashioned way. Something like US Address Parser is exactly what I need, but I can’t use it since it’s GPL’d. I didn’t need an exhaustive, perfect solution, so I thought I’d just whip one up with regular expressions.

Sample input:

  • 100 MAIN
  • 100 MAIN ST
  • 100 S MAIN ST
  • 100 S MAIN ST W
  • 100 S MAIN ST W APT 1A

Create StreetAddress class

The first step was simply to create an address object with the properties I needed:

public class StreetAddress
{
    public string HouseNumber { get; set; }
    public string StreetPrefix { get; set; }
    public string StreetName { get; set; }
    public string StreetType { get; set; }
    public string StreetSuffix { get; set; }
    public string Apt { get; set; }
}

Build regular expression

The next thing I did was get to work on my regular expression. I built my expression with the help of RegExr and did my initial testing. Once I was satisfied, I moved it over to code. Here’s what I came up with:

private static string BuildPattern()
{
    var pattern = "^" +                                                       // beginning of string
                    "(?<HouseNumber>\\d+)" +                                  // 1 or more digits
                    "(?:\\s+(?<StreetPrefix>" + GetStreetPrefixes() + "))?" + // whitespace + valid prefix (optional)
                    "(?:\\s+(?<StreetName>.*?))" +                            // whitespace + anything
                    "(?:" +                                                   // group (optional) {
                    "(?:\\s+(?<StreetType>" + GetStreetTypes() + "))" +       //   whitespace + valid street type
                    "(?:\\s+(?<StreetSuffix>" + GetStreetSuffixes() + "))?" + //   whitespace + valid street suffix (optional)
                    "(?:\\s+(?<Apt>.*))?" +                                   //   whitespace + anything (optional)
                    ")?" +                                                    // }
                    "$";                                                      // end of string

    return pattern;
}

Functions for valid values

Note that there are several functions called while building the regular expression. This is done purely for readability and maintainability. Here are the functions, which each just return a pipe-delimited list of valid values:

private static string GetStreetPrefixes()
{
    return "TE|NW|HW|RD|E|MA|EI|NO|AU|SE|GR|OL|W|MM|OM|SW|ME|HA|JO|OV|S|OH|NE|K|N";
}

private static string GetStreetTypes()
{
    return "TE|STCT|DR|SPGS|PARK|GRV|CRK|XING|BR|PINE|CTS|TRL|VI|RD|PIKE|MA|LO|TER|UN|CIR|WALK|CO|RUN|FRD|LDG|ML|AVE|NO|PA|SQ|BLVD|VLGS|VLY|GR|LN|HOUSE|VLG|OL|STA|CH|ROW|EXT|JC|BLDG|FLD|CT|HTS|MOTEL|PKWY|COOP|ACRES|ESTS|SCH|HL|CORD|ST|CLB|FLDS|PT|STPL|MDWS|APTS|ME|LOOP|SMT|RDG|UNIV|PLZ|MDW|EXPY|WALL|TR|FLS|HBR|TRFY|BCH|CRST|CI|PKY|OV|RNCH|CV|DIV|WA|S|WAY|I|CTR|VIS|PL|ANX|BL|ST TER|DM|STHY|RR|MNR";
}

private static string GetStreetSuffixes()
{
    return "NW|E|SE|W|SW|S|NE|N";
}

Parse the input

At this point, the work is done. All that’s left is to run the regular expression on your address string and deal with the results.

public static StreetAddress Parse(string address)
{
    if (string.IsNullOrEmpty(address))
        return new StreetAddress();
            
    StreetAddress result;
    var input = address.ToUpper();
            
    var re = new Regex(BuildPattern());
    if (re.IsMatch(input))
    {
        var m = re.Match(input);
        result = new StreetAddress
                        {
                            HouseNumber = m.Groups["HouseNumber"].Value,
                            StreetPrefix = m.Groups["StreetPrefix"].Value,
                            StreetName = m.Groups["StreetName"].Value,
                            StreetType = m.Groups["StreetType"].Value,
                            StreetSuffix = m.Groups["StreetSuffix"].Value,
                            Apt = m.Groups["Apt"].Value,
                        };
    }
    else
    {
        result = new StreetAddress
                        {
                            StreetName = input,
                        };
    }
    return result;
}

End product

And, finally, for those of you who love big, gnarly regular expressions, here’s my end product:

^(?<HouseNumber>\\d+)(?:\\s+(?<StreetPrefix>TE|NW|HW|RD|E|MA|EI|NO|AU|SE|GR|OL|W|MM|OM|SW|ME|HA|JO|OV|S|OH|NE|K|N))?(?:\\s+(?<StreetName>.*?))(?:(?:\\s+(?<StreetType>TE|STCT|DR|SPGS|PARK|GRV|CRK|XING|BR|PINE|CTS|TRL|VI|RD|PIKE|MA|LO|TER|UN|CIR|WALK|CO|RUN|FRD|LDG|ML|AVE|NO|PA|SQ|BLVD|VLGS|VLY|GR|LN|HOUSE|VLG|OL|STA|CH|ROW|EXT|JC|BLDG|FLD|CT|HTS|MOTEL|PKWY|COOP|ACRES|ESTS|SCH|HL|CORD|ST|CLB|FLDS|PT|STPL|MDWS|APTS|ME|LOOP|SMT|RDG|UNIV|PLZ|MDW|EXPY|WALL|TR|FLS|HBR|TRFY|BCH|CRST|CI|PKY|OV|RNCH|CV|DIV|WA|S|WAY|I|CTR|VIS|PL|ANX|BL|ST TER|DM|STHY|RR|MNR))(?:\\s+(?<StreetSuffix>NW|E|SE|W|SW|S|NE|N))?(?:\\s+(?<Apt>.*))?)?$
Advertisements

9 thoughts on “Parse US Street Addresses with Regular Expression in C#”

  1. Dear Adam,

    This week I need to parse an address into different fields in SQL and I found your blog. I am try to follow your steps with VS 2005 but get 14 errors. Could you please provide more details or a sample project ? You help will be highly appreciated.
    Thanks a lot!

  2. Hi Adam,
    thanks for your time and your code. Would you please if possible to place an example code of how to use your code for people who are new into c# and like to take advantage of your code?
    Thanks again

    1. Sure! Here’s the code from the console application that produces the output in the screenshot example:

      static void Main(string[] args)
      {
      string input = string.Empty;
      while (!string.Equals(input, “x”, StringComparison.CurrentCultureIgnoreCase))
      {
      Console.Write(“Address (x to quit): “);
      input = Console.ReadLine();

      var a = Parser.Parse(input);
      Console.WriteLine(“House Number: {0}”, a.HouseNumber);
      Console.WriteLine(“Street Prefix: {0}”, a.StreetPrefix);
      Console.WriteLine(“Street Name: {0}”, a.StreetName);
      Console.WriteLine(“Street Type: {0}”, a.StreetType);
      Console.WriteLine(“Street Suffix: {0}”, a.StreetSuffix);
      Console.WriteLine(“Apt: {0}”, a.Apt);
      Console.WriteLine();
      }
      }

  3. This is great. Needed it for a SharePoint custom workflow. I’m marginal at regular expressions. This saved me hours of work. Thanks.

Leave a comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s