TFS SDK: Improve Performance of Parent-Child Queries

I’ve been helping a co-worker with a web application that we’ll be using in lieu of generating daily stand-up reports from Excel spreadsheets connected to TFS. We want this application to retrieve a list of User Stories and their child Tasks by Iteration Path and display them per user.

One of the pain points of Excel is that refreshing our TFS queries is a slow process that keeps getting slower as more data is added. So a key requirement of this project is to have fast-running queries. We’re writing queries using WIQL, which I’ve touched on previously. WIQL has okay performance, but executing queries is an expensive operation. You can improve the performance of your application by executing larger queries and then filtering programmatically instead of doing many queries for specific information.

For this example, I have a query to retrieve all User Stories. I’m using Iteration Path to filter items to just my team.

SELECT *
FROM WorkItems 
WHERE [System.WorkItemType] = 'User Story'
    AND [System.IterationPath] UNDER 'MyTeam\'

Each of the returned User Story work items has a collection of links. If we want to retrieve all the child work items, we can look at the link type and retrieve items by Id. The most obvious solution is to loop through collection of linked work items and retrieve any children. This code does just that:

var linkType = workItemStore.WorkItemLinkTypes
    .LinkTypeEnds["System.LinkTypes.Hierarchy-Forward"];

foreach (WorkItem d in deliverables)
{
    var stories = d.WorkItemLinks.OfType<WorkItemLink>()
        .Where(x => x.LinkTypeEnd == linkType)
        .Select(x => 
            workItemStore.Query(@"
                SELECT *
                FROM WorkItems 
                WHERE [System.Id] == " + x.TargetId)
            .Cast<WorkItem>().FirstOrDefault());
}

This solution works, but it’s slow due to the number of queries being executed. We can get a great performance boost by simply retrieving the collection of Tasks and querying the in-memory collection to find each Story’s children, like this:

var userStories = workItemStore.Query(@"
    SELECT *
    FROM WorkItems 
    WHERE [System.WorkItemType] = 'Task'
        AND [System.IterationPath] UNDER 'MyTeam\'
    ").Cast<WorkItem>();

var linkType = workItemStore.WorkItemLinkTypes
    .LinkTypeEnds["System.LinkTypes.Hierarchy-Forward"];

foreach (WorkItem d in deliverables)
{
    var stories = d.WorkItemLinks.OfType<WorkItemLink>()
        .Where(x => x.LinkTypeEnd == linkType)
        .Select(x => 
            userStories.Where(us => us.Id == x.TargetId)
                .FirstOrDefault());
}

I wrote a quick test program to test the performance gained by using this technique, and the results were quite dramatic. (55.3%!)

Average time to retrieve 85 User Stories with 250 linked child Tasks:
Method A (querying per link): avg. 4817.67 ms
Method B (query for all tasks up-front): avg. 2151.33 ms

The lesson to be learned here is that it’s typically best to execute broader queries and filter programmatically rather than to execute many queries for specific information. The above example could be improved further by retrieving Tasks and User Stories in a single query. The application could then extract User Stories from the results collection and retrieve linked Tasks from the same collection by Id.

Advertisements

One thought on “TFS SDK: Improve Performance of Parent-Child Queries”

Leave a comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s