Using Parallel.ForEach To Aggregate Results From JSON Files Stored in Windows Azure Blob Storage

April 17 2012

I love NoSQL, except when it comes to reporting.  Then I miss those handy SQL aggregation calls. I recently had a situation where I needed to look at a whole bunch of JSON files stored in Windows Azure Blob Storage and aggregate values from within those JSON files. The exact scenario was to get a count of how many people had achieved each achievement as part of the Visual Studio Achievements project.

This turned out to be an ideal use case for some of the parallel programming features in .NET 4.0. Rather than download and process N JSON blobs one after another, I could put the loop body in a Parallel.ForEach block and gain speed from the multi-core machine I was using. The code was pretty straightforward, especially once I discovered the System.Collections.Concurrent namespace with its handy ConcurrentDictionary.
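To make the shape of that change concrete, here is a minimal sketch of the same loop written both ways. It is not the reporting code: FakeDownload is a made-up stand-in that sleeps to mimic network latency, and the class and method names are assumptions for illustration only.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class ParallelDownloadSketch
{
    // Hypothetical stand-in for downloading one blob: sleeps to mimic latency.
    public static string FakeDownload(int id)
    {
        Thread.Sleep(20);
        return "blob-" + id;
    }

    // Sequential version: one download at a time.
    public static List<string> DownloadSequential(IEnumerable<int> ids)
    {
        var results = new List<string>();
        foreach (var id in ids)
        {
            results.Add(FakeDownload(id));
        }
        return results;
    }

    // Parallel version: the loop body becomes a lambda, and the results go
    // into a thread-safe collection because several threads add to it.
    public static List<string> DownloadParallel(IEnumerable<int> ids)
    {
        var results = new ConcurrentBag<string>();
        Parallel.ForEach(ids, id => results.Add(FakeDownload(id)));
        return results.ToList(); // order is not preserved
    }

    static void Main()
    {
        var ids = Enumerable.Range(0, 40).ToList();

        var watch = Stopwatch.StartNew();
        DownloadSequential(ids);
        Console.WriteLine("sequential ms: " + watch.ElapsedMilliseconds);

        watch.Restart();
        DownloadParallel(ids);
        Console.WriteLine("parallel ms:   " + watch.ElapsedMilliseconds);
    }
}
```

How much faster the parallel version runs depends on the machine and on how much of the work is waiting on the network, so treat any timing from this sketch as indicative only.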

And, most significantly, it increased performance by 400 percent!

Below is the code in its entirety; I’ll walk through it here:

For JSON deserialization, I’m using a library provided by the WCF team up on CodePlex which is now part of the ASP.NET Web API. It provides some nifty features for turning JSON into dynamic objects.

The first thing I do is download a JSON file that has all the achievements, which is publicly available (line 25).

I then put all the achievements in a ConcurrentDictionary<string, int> which I’ll use to build my report (line 33).

Then, I get the blobs and start my Parallel.ForEach loop (line 41). Inside the Action<TSource>, I walk the list of the user’s earned achievements, incrementing the count in the ConcurrentDictionary of each achievement (line 57).
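The increment on lines 57 and 58 reads a value and then writes it back as two separate steps, which is what the comments below question. One way to sidestep the shared read-then-write entirely is the Parallel.ForEach overload that takes thread-local state: each worker keeps a private tally and merges it into the shared totals under a lock when it finishes. This is a minimal sketch of that pattern, not the code used for the report; the achievement names and the in-memory user lists are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class LocalTallySketch
{
    // users: each inner list holds the achievement names one
    // (hypothetical) user has earned.
    public static Dictionary<string, int> CountEarned(IEnumerable<IList<string>> users)
    {
        var totals = new Dictionary<string, int>();
        var gate = new object();

        Parallel.ForEach(
            users,
            // localInit: each worker starts with its own private tally.
            () => new Dictionary<string, int>(),
            // body: touches only the private tally, so no locking is needed here.
            (earned, loopState, local) =>
            {
                foreach (var name in earned)
                {
                    int count;
                    local.TryGetValue(name, out count); // count is 0 when the key is missing
                    local[name] = count + 1;
                }
                return local;
            },
            // localFinally: merge the private tally into the shared totals under a lock.
            local =>
            {
                lock (gate)
                {
                    foreach (var pair in local)
                    {
                        int count;
                        totals.TryGetValue(pair.Key, out count);
                        totals[pair.Key] = count + pair.Value;
                    }
                }
            });

        return totals;
    }

    static void Main()
    {
        // Made-up data: 1000 users who each earned the same two achievements.
        var users = new List<IList<string>>();
        for (int i = 0; i < 1000; i++)
        {
            users.Add(new[] { "Achievement A", "Achievement B" });
        }

        foreach (var pair in CountEarned(users))
        {
            Console.WriteLine(pair.Key + "\t" + pair.Value);
        }
    }
}
```

The lock is taken once per worker rather than once per achievement, so contention stays low even with many blobs.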

Finally, once I exit the loop, I write the dictionary out as a tab-delimited file with an .xls extension that Excel can open (line 69) – folks tend to like that format.

You’ll notice the remmed out code; that’s the old non-parallelized code if you’d like to compare.

 1 using System;
 2 using System.Collections.Concurrent;
 3 using System.IO;
 4 using System.Json;
 5 using System.Net;
 6 using System.Threading.Tasks;
 7 using Microsoft.WindowsAzure;
 8 using Microsoft.WindowsAzure.StorageClient;
 9
10 namespace AchievementsReporting
11 {
12     class Program
13     {
14
15         static void Main(string[] args)
16         {
17             var cloudBlobClient = new CloudBlobClient(new Uri("https://---.blob.core.windows.net", UriKind.Absolute),
18                 new StorageCredentialsAccountAndKey("---",
19                 "---"));
20             var container = cloudBlobClient.GetContainerReference("users");
21
22             string masterJson = string.Empty;
23             using (var webClient = new WebClient())
24             {
25                 masterJson =
26                     webClient.DownloadString(new Uri("http://channel9.msdn.com/achievements/visualstudio?json=true"));
27             }
28             dynamic masterList = JsonValue.Parse(masterJson);
29             var statisticsDictionary = new ConcurrentDictionary<string, int>();
30             //var statisticsDictionary = new Dictionary<string, int>();
31             foreach (var achieve in masterList.Achievements)
32             {
33                 statisticsDictionary.GetOrAdd(achieve.Name.ToString(), 0);
34                 //statisticsDictionary.Add(achieve.Name.ToString(), 0);
35             }
36             BlobRequestOptions options = new BlobRequestOptions();
37             options.UseFlatBlobListing = true;
38             options.BlobListingDetails = BlobListingDetails.Snapshots;
39             Console.WriteLine("Starting...");
40             DateTime start = DateTime.Now;
41             Parallel.ForEach(container.ListBlobs(options), blobListItem =>
42             //foreach (var blobListItem in container.ListBlobs(options))
43             {
44                 CloudBlob blob =
45                     container.GetBlobReference(
46                         blobListItem.Uri.AbsoluteUri);
47                 string json = blob.DownloadText();
48                 if (!string.IsNullOrEmpty(json))
49                 {
50                     dynamic achievementsDynamic =
51                         JsonValue.Parse(json) as dynamic;
52                     foreach (
53                         var achieve in achievementsDynamic.Achievements)
54                     {
55                         if (achieve.DateEarned != null)
56                         {
57                             statisticsDictionary[achieve.Name.ToString()] =
58                                 statisticsDictionary[achieve.Name.ToString()] + 1;
59                         }
60                     }
61                 }
62             }
63             );
64
65             using (StreamWriter writer = new StreamWriter("report.xls"))
66             {
67                 foreach (var key in statisticsDictionary.Keys)
68                 {
69                     writer.WriteLine(key + "\t" + statisticsDictionary[key].ToString());
70                 }
71             }
72             TimeSpan diff = DateTime.Now - start;
73             Console.WriteLine("done - took: ");
74             Console.WriteLine(diff.TotalMinutes);
75         }
76     }
77 }

Comments (5) -

4/17/2012 8:53:27 PM #

Scott Bussinger

Are lines 57 and 58 really thread-safe? I haven't worked with ConcurrentDictionary, but there's nothing I could find in the MSDN documentation that says this should work.

Just looking at it, I'm not sure how it could be. What would prevent one thread from interleaving its read-update-writes of the value with another thread's read-update-writes of the value?

Scott Bussinger

4/18/2012 5:00:20 AM #

Phil Bolduc

With ConcurrentDictionary, you do not need to initialize it on line 33.  You can replace lines 57 and 58 with the following:

statisticsDictionary.AddOrUpdate(achieve.Name.ToString(), 0, (key, oldValue) => oldValue + 1);

This is the correct way to use ConcurrentDictionary. See msdn.microsoft.com/en-us/library/ee378665.aspx

Phil Bolduc

4/18/2012 5:01:55 AM #

Phil Bolduc

Small error in my previous comment: it should use an add value of 1, not 0 (the second argument):

statisticsDictionary.AddOrUpdate(achieve.Name.ToString(), 1, (key, oldValue) => oldValue + 1);

Phil Bolduc

4/18/2012 8:17:11 AM #

Karsten Januszewski

@Phil Bolduc -- Good catch, thanks for that.  

@Scott -- From MSDN (msdn.microsoft.com/en-us/library/dd287191.aspx): The ConcurrentDictionary "Represents a thread-safe collection of key-value pairs that can be accessed by multiple threads concurrently."

Karsten Januszewski

4/18/2012 11:43:54 AM #

tobi

Phil Bolduc, no! This is not atomic either. The update lambda is not called under a lock. Two concurrent invocations will lose one of the two updates.

ConcurrentDictionary cannot be used (directly) to make this work. It does not offer atomic operations with user code in them.

tobi

4/19/2012 8:00:16 AM #

Karsten Januszewski

@tobi - Even better catch. My instincts were to build the dictionary values before adding them.  And looking into AddOrUpdate, I found this on MSDN msdn.microsoft.com/.../dd997369.aspx :

ConcurrentDictionary<TKey, TValue> is designed for multithreaded scenarios. You do not have to use locks in your code to add or remove items from the collection. However, it is always possible for one thread to retrieve a value, and another thread to immediately update the collection by giving the same key a new value.

Also, although all methods of ConcurrentDictionary<TKey, TValue> are thread-safe, not all methods are atomic, specifically GetOrAdd and AddOrUpdate. The user delegate that is passed to these methods is invoked outside of the dictionary's internal lock. (This is done to prevent unknown code from blocking all threads.) Therefore it is possible for this sequence of events to occur:

1) threadA calls GetOrAdd, finds no item and creates a new item to Add by invoking the valueFactory delegate.

2) threadB calls GetOrAdd concurrently, its valueFactory delegate is invoked and it arrives at the internal lock before threadA, and so its new key-value pair is added to the dictionary.

3) threadA's user delegate completes, and the thread arrives at the lock, but now sees that the item exists already

4) threadA performs a "Get", and returns the data that was previously added by threadB.

Therefore, it is not guaranteed that the data that is returned by GetOrAdd is the same data that was created by the thread's valueFactory. A similar sequence of events can occur when AddOrUpdate is called.

Karsten Januszewski
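
The GetOrAdd sequence quoted from MSDN above can be observed with a small experiment. This is only a sketch with made-up names: many callers race on one key, each proposing a different value, and the code counts how often the valueFactory delegate ran and how many distinct values the callers got back. The delegate may run more than once, which is the behavior the quote describes, while every caller still receives the one value that was actually stored.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class GetOrAddSketch
{
    // Returns { number of valueFactory invocations, number of distinct values returned }.
    public static int[] Run(int callers)
    {
        var dictionary = new ConcurrentDictionary<string, int>();
        var seen = new ConcurrentBag<int>();
        int factoryCalls = 0;

        Parallel.For(0, callers, i =>
        {
            int value = dictionary.GetOrAdd("key", k =>
            {
                // Invoked outside the dictionary's internal lock,
                // so more than one thread may get here.
                Interlocked.Increment(ref factoryCalls);
                return i; // each caller proposes its own value
            });
            seen.Add(value);
        });

        return new[] { factoryCalls, seen.Distinct().Count() };
    }

    static void Main()
    {
        int[] result = Run(10000);
        Console.WriteLine("valueFactory invocations: " + result[0]);
        Console.WriteLine("distinct values returned: " + result[1]);
    }
}
```

The number of invocations varies from run to run and is often 1 on a lightly loaded machine, so a single run showing 1 does not mean the race cannot happen.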

