Usually web application or service that run in the Windows Azure Platform follow the architecture shown in the figure below. A web farm consisting of two or more web roles serve requests coming from clients. Azure provides the necessary load balancer infrastructure. This principle is called "scale out" because you can enhance performance by adding new instances of the web role. Relational data is stored in SQL Azure. As you might know every SQL Azure database is in fact a failover cluster consisting of three nodes. The problem is that currently SQL Azure does not offer a built-in scale-out mechanism (note that the SQL Azure team announced that there will be an out-of-the-box scale out solution called SQL Azure Federation in the future).

At the database layer you can implement scale-out by using the concept of "sharding" (note that there is an excellent white paper from Microsoft describing the different types of sharding). In this example we implement a custom LINQ provider that demonstrates the use of expression trees and TPL/PLINQ for running distributed queries on a sharded database. On top we put a custom WCF Data Service (=OData) to enable REST queries. Our goal is to do some rudimentary performance tests that compare a single 10GB SQL Azure database with 10 x 1GB SQL Azure databases (sharded).

The test scenario that we have chosen for this example is a real estate search engine. To keep things simple the database just consists of a single table with approx. 35 properties.
If you would like to rebuild the sample step by step for learning purposes you can follow the hands-on-lab shown in my blog at http://www.timecockpit.com/en/blogs/11-02-16/Custom_OData_Provider_for_Windows_Azure.aspx.
You can download the attached sample solution and build it in Visual Studio 2010. You will need the Windows Azure SDK and Windows Azure Tools for Microsoft Visual Studio.
In order to run the example you have to perform the following steps:

Description
If you want to learn more about the sample I recommend starting with the standard WCF Data Service that is included in the solution. You can find it in DefaultRealEstateService.svc in the project Samples.Sharding.Service. Try to run it in your browser using the local development web server (if you like you could also add a cloud project and run it in the Windows Azure Compute Emulator or even deploy it to Windows Azure). Here is a sample OData query that you could try:
http://localhost:<YourPort>/DefaultRealEstateService.svc/RealEstate?$top=25&$filter=SizeOfParcel ge 200 and SizeOfParcel lt 1000 and HasBalcony eq true&$orderby=SizeOfBuildingArea desc
Behind the scenes the WCF Data Services runtime translates the query string into an expression tree. After that it uses the LINQ provider for SQL Server to translate the expression tree into T-SQL and run it in SQL Azure. For those of you who are not that familiar with WCF Data Services/OData I have added a sample query in the form of a unit test, too (see project Samples.Sharding.Tests):
[TestMethod]
public void TestEntityFrameworkProvider()
{
using (var context = RealEstateEntities.Create())
{
Measure("Large", () =>
(from re in context.RealEstate
where re.Location == "New York City" && re.HasBalcony.Value
orderby re.SizeOfGarden
select re)
.Take(25)
.ToArray());
}
}
[TestMethod] public void TestEntityFrameworkProvider() { using (var context = RealEstateEntities.Create()) { Measure("Large", () => (from re in context.RealEstate where re.Location == "New York City" && re.HasBalcony.Value orderby re.SizeOfGarden select re) .Take(25) .ToArray()); } }
You can find the core of the sharding logic in the file ShardingProvider.cs in the project Samples.Sharding.Provider. Our implementation of the custom LINQ provider is based on the IQToolkit. It takes the expression tree of the corresponding query and sends it to all shards in parallel using PLINQ. Note that we use .NET's visitor classes to analyize and manipulate the query's expression tree.
// Send query to all sharding databases in parallel
var result = this.ConnectionStrings
.AsParallel()
.WithDegreeOfParallelism(this.ConnectionStrings.Length)
.SelectMany(connectionString =>
{
using (var context = this.ContextCreator(connectionString))
{
context.CommandTimeout = 300;
var rewriter = new SwitchQueryable<TEntity>(Expression.Constant(this.EntityProvider(context)));
var ex2 =
Expression.Lambda<Func<IEnumerable<TEntity>>>(
Expression.Call(
methodInfoExpr.Method,
rewriter.Visit(methodInfoExpr.Arguments[0]),
methodInfoExpr.Arguments[1]));
return ex2.Compile()().ToArray();
}
})
.ToArray();
// Send query to all sharding databases in parallel var result = this.ConnectionStrings .AsParallel() .WithDegreeOfParallelism(this.ConnectionStrings.Length) .SelectMany(connectionString => { using (var context = this.ContextCreator(connectionString)) { context.CommandTimeout = 300; var rewriter = new SwitchQueryable<TEntity>(Expression.Constant(this.EntityProvider(context))); var ex2 = Expression.Lambda<Func<IEnumerable<TEntity>>>( Expression.Call( methodInfoExpr.Method, rewriter.Visit(methodInfoExpr.Arguments[0]), methodInfoExpr.Arguments[1])); return ex2.Compile()().ToArray(); } }) .ToArray();
You can immediately try the custom LINQ provider with sharding by running the corresponding unit test (see project Samples.Sharding.Tests). As you can see the LINQ query with and without sharding look nearly the same:
[TestMethod]
public void TestShardingProvider()
{
using (var context = RealEstateEntities.Create())
{
Measure("Many small", () =>
(from re in CreateQueryableRoot()
where re.Location == "New York City" && re.HasBalcony.Value
orderby re.SizeOfGarden
select re)
.Take(25)
.ToArray());
}
}
[TestMethod] public void TestShardingProvider() { using (var context = RealEstateEntities.Create()) { Measure("Many small", () => (from re in CreateQueryableRoot() where re.Location == "New York City" && re.HasBalcony.Value orderby re.SizeOfGarden select re) .Take(25) .ToArray()); } }
http://localhost:<YourPort>/CustomRealEstateService.svc/RealEstate?$top=25&$filter=SizeOfParcel ge 200 and SizeOfParcel lt 1000 and HasBalcony eq true&$orderby=SizeOfBuildingArea desc
Performance Evaluation
If you compare the two OData services or unit tests manually you should already see that the sharding queries run much faster. We have published the OData service in Windows Azure and used the load testing tool LoadStorm to simulate load on both implementations. We defined three reference queries and let 10 to 50 concurrent users (step-up load testing scenario; each of them firing six queries per minute) do some queries. The result has been as expected: Because of the large database on the single SQL Azure server the standard OData service does not really scale. The first chart shows the number of users and the throughput. In the second chart you can see that from a certain number of concurrent users the response time gets greater then 35 seconds; as a result we see a lot of HTTP errors.



If we run the same load test against our sharding service the results look very different - of course. No errors and an average response time of approx. 3 seconds instead of more than 10 :-)



Please keep in mind that this example is not production code! It should demonstrate the basic ideas of
If you want to use a similar scenario in a real world project be prepared that you will have to write a lot more code. Typical issues that you will have to solve are:
I generally recommend that you use SQL Azure Federations as soon as this feature becomes available. However, there are situations in which you will not be able to use it (e.g. slightly different data models on different shards). I hope that this example will be useful for you in these cases.