Creating Automated Integration Tests for a DocumentDB Repository

You can unit test all your layers, but when you get to the repository it becomes an exercise in mocking that provides little real benefit. That’s why we have to give up on the unit test idea for repositories and embrace integration tests. There’s nothing wrong with integration tests, as I explained a while back. So let’s review the criteria of effective automated tests and see if this is worthwhile.

  1. Automated – Yes, my tests will be standard NUnit tests that can be run just about anywhere by just about any automated process.

Refactoring My Repository Methods to be Asynchronous

Now I’ve written a DocumentDB based repository for Equipment data in DungeonMart, but I didn’t try to make it perfect, I just tried to make it work. Here it is, if you’re catching up: Part 1 and Part 2.

My next step in getting it closer to perfect is to make all the methods asynchronous. Face it, we’re going out to the cloud to get or write data and we shouldn’t make the whole service wait while that happens. DocumentDB gives you Async methods at least for writing, and we can figure out how to make the read methods async. If we were in Entity Framework, we would already have async methods for both reading and writing.
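To show the difference in shape, here’s a minimal sketch that doesn’t touch DocumentDB at all. FetchNameFromStoreAsync is a hypothetical stand-in for a cloud call (it just delays and returns a string), and GetName and GetNameAsync are made-up names for illustration, not the repository’s real methods.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncRepositorySketch
{
    // Hypothetical stand-in for a round trip to the data store.
    private static async Task<string> FetchNameFromStoreAsync(string id)
    {
        await Task.Delay(50); // simulated network latency
        return "Equipment " + id;
    }

    // Blocking version: the calling thread sits idle until the task completes.
    public static string GetName(string id)
    {
        return FetchNameFromStoreAsync(id).Result;
    }

    // Asynchronous version: the caller awaits, and the thread is free
    // to do other work while the call is in flight.
    public static async Task<string> GetNameAsync(string id)
    {
        return await FetchNameFromStoreAsync(id);
    }
}
```

Both versions return the same value. The only change is whether the caller blocks on .Result or awaits the task.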

Creating a DocumentDB Repository Layer (Part 2 – Writing)

“Last time, on Bill DeLude dot com…”

Hmm, it kind of loses its effect without a good narrator voice. Does someone have Morgan Freeman’s number? Because he could say that last time I talked about creating a data repository class using DocumentDB as the data store and covered the setup of the class and the read methods. This time, I’m going to go over the write methods.

But first a disclaimer before we get started. Not everything in this class is complete. As with normal programming, I like to take a highly iterative approach. First, I’ll make it work. Then I’ll improve some part of it. And then I’ll improve some other part, and so on. There are two things conspicuously missing from today’s code. First, the write methods, as they are offered up, are synchronous. Second, it doesn’t allow for the Equipment type to change. I’ll address both of these in a later post.

First, I’m going to work on adding Equipment to the collection. Here’s the implementation of the AddEquipment method:

public Equipment AddEquipment(Equipment equipment)
{
    dynamic doc = Client.CreateDocumentAsync(CollectionLink, equipment)
        .Result.Resource;
    Equipment result = doc;
    return result;
}

Adding a document to a collection in DocumentDB is pretty straight-forward. All you really have to do is call CreateDocumentAsync with the link to the collection and the document you’re adding.

I made it a little more complicated because I like to return the added document back to the caller, so they can have any changes such as the id or a timestamp. To accomplish that, I need to get the Resource of the CreateDocumentAsync result as a dynamic type. Then I put that into an Equipment type and return the equipment.

Now that we can add documents, let’s make sure we can update them. Updating in DocumentDB is a two step process. You first have to get a document link by reading the collection, and then you replace the document with your changed version. Gone are the days of “UPDATE Equipment SET …” because DocumentDB, as with most NoSQL databases, just doesn’t work that way. Here’s the code for updating:

public Equipment UpdateEquipment(string id, Equipment equipment)
{
    var doc = Client.CreateDocumentQuery(CollectionLink)
        .AsEnumerable().First(d => d.Id == id);
    dynamic updatedDoc = Client.ReplaceDocumentAsync(doc.SelfLink,
        equipment).Result.Resource;
    Equipment result = updatedDoc;
    return result;
}

Notice that when I get the document this time, I’m not specifying a type to return. By omitting the return type, I get it as a Document and not as an Equipment. I want to do that because Equipment doesn’t have the self link I need to do the replace on the next line. Once you have the document, replacing it is pretty simple. Just call ReplaceDocumentAsync with the document’s self link and the new document. Returning the updated equipment requires some extra work, but it’s just like the Create.

Finally, we need to delete documents from the collection. This works a lot like updating: get a self link, then call a delete method. It looks like this:

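public void DeleteEquipment(string id)
{
    var doc = Client.CreateDocumentQuery(CollectionLink)
        .AsEnumerable().First(d => d.Id == id);
    // Block until the delete finishes so failures surface here instead of
    // being lost in an unobserved task.
    Client.DeleteDocumentAsync(doc.SelfLink).Wait();
}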

And that’s it. Now you have a full blown Equipment repository using DocumentDB as the data store.

The code that was used at the time of this writing is in my github at this commit.

Creating a DocumentDB Repository Layer (Part 1 – Reading)

Even with NoSQL, having a repository layer is a good idea. In the world of DocumentDB, writing this layer is really simple. First, we’ll start with an interface, of course. In your application, this interface might already exist.

public interface IEquipmentRepository
{
    IEnumerable<Equipment> GetEquipments();
    Equipment GetEquipmentById(string id);
    Equipment AddEquipment(Equipment equipment);
    Equipment UpdateEquipment(string id, Equipment equipment);
    void DeleteEquipment(string id);
}

As you can see, this is a pretty straight-forward interface with CRUD (Create-Read-Update-Delete) operations. Notice that the identifier is a string, not an integer like you might be used to seeing. DocumentDB, like other document databases, likes the identifier to be a GUID. Don’t try to make it the .NET GUID type. Just go with string, and then the database will handle it just like SQL does with integer identifiers.
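If you ever want to set the identifier yourself instead of leaving it to the database, a sketch like this keeps it a string while still using a GUID value. The Item class and the EnsureId helper are hypothetical and only here for illustration. They are not part of the DungeonMart code.

```csharp
using System;

public static class StringIdSketch
{
    // Hypothetical minimal model; the real Equipment class has more properties.
    public class Item
    {
        public string Id { get; set; }
        public string Name { get; set; }
    }

    // Assigns a GUID, formatted as a string, only when the caller has not
    // already supplied an identifier.
    public static Item EnsureId(Item item)
    {
        if (string.IsNullOrEmpty(item.Id))
        {
            item.Id = Guid.NewGuid().ToString();
        }
        return item;
    }
}
```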

If you’re following along in your own code, you’ll need the Equipment model. Here’s a snapshot of what it looked like when this post was written: Equipment.cs.

The next thing would be to implement the repository, but before you do that you’ll need a couple of extension methods to make your life easier. You can find them in my previous post on getting existing databases and collections. Do that, and then come back. I’ll be right here. You can also look at the source for the whole project, using the link at the bottom.

You’re going to need two fields in your repository: the DocumentClient and the Document Link. This is how I’m getting them at the time I write this. It isn’t optimal, but I have plans for more posts about making that better and those topics are beyond the scope of this post. So here it is:

private static readonly DocumentClient Client =
    new DocumentClient(new Uri(<Your Url>), <Your Key>);

private static readonly string DocumentLink =
    Client.GetOrCreateDocumentCollectionAsync("dmart", "equipment")
    .Result.DocumentsLink;

Now that we have those, let’s implement the first method, GetEquipments. There’s not much to it, but there’s a lot to talk about:

public IEnumerable<Equipment> GetEquipments()
{
    return Client.CreateDocumentQuery<Equipment>(DocumentLink)
        .AsEnumerable();
}

This method uses the DocumentClient we created as a field. It calls the CreateDocumentQuery<T> method to query the document collection referenced by the DocumentLink (our other field).

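Notice that the method returns an IEnumerable<T> and not a List<T>. What this means is that when the method returns, we haven’t actually queried the collection. We’re putting that off until later to allow the caller to filter on the IEnumerable. This makes the usage of our get list method more flexible, and possibly saves us from unnecessarily passing a huge collection across the wire. One caveat: anything the caller chains after AsEnumerable runs on the client as documents stream in, so the savings come from stopping the enumeration early, not from the server applying the filter.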
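If deferred execution is new to you, here’s a minimal sketch of the idea with no DocumentDB involved. GetEquipmentNames is a hypothetical stand-in for the query, and the ItemsRead counter exists only so you can see how many items were actually produced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Counts how many items the fake source has actually produced.
    public static int ItemsRead;

    // Hypothetical stand-in for a query result: items are produced lazily,
    // one at a time, only while someone is enumerating.
    public static IEnumerable<string> GetEquipmentNames()
    {
        foreach (var name in new[] { "Rope", "Torch", "Lantern", "Tent" })
        {
            ItemsRead++;
            yield return name;
        }
    }

    public static string FirstStartingWith(string prefix)
    {
        ItemsRead = 0;
        // Nothing has been read yet; the iterator body runs only on enumeration.
        var names = GetEquipmentNames();
        // First stops at the first match, so later items are never produced.
        return names.First(n => n.StartsWith(prefix, StringComparison.Ordinal));
    }
}
```

Calling FirstStartingWith("To") returns "Torch" after reading only two of the four items, which is the kind of early exit a lazy IEnumerable allows.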

However, in some cases there will be a way of filtering that is so common we’re just going to provide the users of our class a built in method for that. The classic example would be wanting to get an Equipment by its identifier. For that, we’ll implement the next method.

public Equipment GetEquipmentById(string id)
{
    return Client.CreateDocumentQuery<Equipment>(DocumentLink)
        .AsEnumerable().First(d => d.id == id);
}

It looks a lot like the last method, except that it filters and returns a single Equipment object. Notice that I’m using First, which throws an exception if the Equipment with that guid doesn’t exist, instead of FirstOrDefault, which would return null. Since a guid is a really hard key to fake, I don’t expect that will happen much and an exception is probably appropriate. I might regret that decision later.
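To make the trade-off concrete, here’s a minimal sketch that uses an in-memory list instead of DocumentDB. The Item class and the GetById and FindById names are hypothetical stand-ins, not the repository’s real code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstVersusFirstOrDefaultSketch
{
    // Hypothetical minimal model; the real Equipment class has more properties.
    public class Item
    {
        public string Id { get; set; }
        public string Name { get; set; }
    }

    private static readonly List<Item> Items = new List<Item>
    {
        new Item { Id = "a1", Name = "Rope" },
        new Item { Id = "b2", Name = "Torch" }
    };

    // Throws InvalidOperationException when no item has the given id.
    public static Item GetById(string id)
    {
        return Items.First(i => i.Id == id);
    }

    // Returns null when no item has the given id.
    public static Item FindById(string id)
    {
        return Items.FirstOrDefault(i => i.Id == id);
    }
}
```

With First, a missing id fails loudly at the repository. With FirstOrDefault, every caller has to remember to check for null.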

That’s it for reading from your DocumentDB collection. In my next post, I’ll go over the remaining write methods.

The code used for this post lives here:
https://github.com/qanwi1970/dungeon-mart/tree/3ad6081b36b5fc804b0c6dce03daa1a2699edd87

Continue with the writing methods in Part 2.

Update: The code when I published was missing the .AsEnumerable() in the GetEquipmentById method. Sorry about that. By the time I found and fixed it, the code had moved on and would just be confusing to point to, now.

Really, How You Start a Project in Visual Studio Can Mean So Much

You know what I hate? Being well into a project and realizing that you started the wrong type or picked the wrong scaffolding options. Some things can be very difficult to change later. Also, one of my pet peeves is bloated projects. It’s crazy that a new WebAPI project will have styles, javascript files, areas, and a we-only-supported-it-for-one-release help page. In my drive to avoid these issues, I pretty much have project creation down to a science.

You might have your favorite way of creating projects, and I’d love to hear about them. Give me ideas in the comments. But for right now, this is how I normally go about a web site. The DungeonMart website is going to use Web API 2 for data services, AngularJS for UI, and Azure’s DocumentDB for the database (for now, at least).

Start with a web application.

Create a new Web Application

This one looks simple, but there’s so much you can mess up here. Name and Location are important. You can move the whole solution if you don’t like the location, but changing the name is really annoying. I also like to uncheck “Create directory for solution” so I get a flatter folder structure, and I’m already in a source controlled folder so I’m not going to add source control again.

Remember, Templates are bad!

Select Template

These templates are where all the bloat that I hate so much comes from. So I always pick empty, and then add the core references I want using the checkboxes below the template list. Since the UI will be AngularJS, and not MVC with Razor, I did not choose the MVC checkbox.

I did check the unit tests box, but won’t be doing that again. In VS2012, it would put the unit test project folder in the same sub-folder as the main project folder, but VS2013 is placing the unit test project in the MyDocuments project path. So I had to remove the unit test project and create a new one to get it in the right folder.

If you choose to host in the cloud…

… then make sure you pick the right name here. It’s one of those things that is really, really hard to change later.

What’s next?

Well, now you have an empty project waiting for you to add stuff. If you’ve used templates before, you’ll notice that a lot of things are missing: Areas, Scripts, Content, HomeController, four of the App_Start files, billions of Nuget dependencies, and so on. You have to add the things you’re going to want yourself, as you need them. Probably the first thing would be a start page, and then your first ApiController, but I’ll get into that next time.

Getting an Existing Database and Collection in DocumentDB

The API methods for creating databases and collections are easy to find, but how do you get a database and collection that’s already created? It’s definitely the more common use case, but it’s so obviously not there. Never fear, for we can wrap the queries used to get databases and collections in extension methods and pretend that it was always part of the API.

Did you say queries?

Yes, I did. You have to query the system database collection to get a link (not a URI, more like an ID) for your database, and ditto for your collection. The good news is that you don’t actually have to know where those collections live, because there are built in methods for querying them. It was almost easy.

First, get the Database

It’s kind of a two step process. Before you can get a collection, you have to get the database that contains that collection. But we’re going to make it easy with two simple extension methods. The first one will be used to get a Database object. You’ll rarely use this method directly, because the most common reason to get a Database object is to query it for the collection you need.

public static async Task<Database> GetOrCreateDatabaseAsync(
    this DocumentClient client, string databaseId)
{
    var databases = client.CreateDatabaseQuery()
        .Where(db => db.Id == databaseId).ToArray();

    if (databases.Any())
    {
        return databases.First();
    }

    return await client.CreateDatabaseAsync(
        new Database {Id = databaseId});
}

The method first tries to get the Database from the query. If it’s not found, then it creates the Database and returns it. I put the extension on DocumentClient because that is where CreateDatabaseAsync lives, so it made sense that the Get method would be, or appear to be, in the same place.

Then, get the collection

Now that we have a Database object, we can use its SelfLink to get the DocumentCollection. This extension method will be used a lot. It looks like this:

public static async Task<DocumentCollection>
    GetOrCreateDocumentCollectionAsync(
        this DocumentClient client,
        string databaseId,
        string collectionId)
{
    var database = await GetOrCreateDatabaseAsync(
        client, databaseId);

    var collections = client
        .CreateDocumentCollectionQuery(database.SelfLink)
        .Where(col => col.Id == collectionId).ToArray();

    if (collections.Any())
    {
        return collections.First();
    }

    return await client.CreateDocumentCollectionAsync(
        database.SelfLink,
        new DocumentCollection {Id = collectionId});
}

This works a lot like the other method, where it tries to get the collection and if it doesn’t exist then it creates the DocumentCollection and returns it. It also uses the other method to get the Database for its SelfLink. Note that this method asks for the database Id, not the object. I did that to make the two step process only look like one step from the outside.

Okay, now what?

This part’s easy. When you need a DocumentCollection object, just use the extension method. You’ll probably want to only do that once and store it in a static. We’re not going to put state in the DocumentCollection object, we really just want the DocumentsLink. Here’s how you get it:

private const string DatabaseId = "ContentDB";
private const string CollectionId = "ContentCollection";
private static readonly DocumentCollection Collection =
    Client.GetOrCreateDocumentCollectionAsync(DatabaseId, CollectionId)
        .Result;

And then you use it for a document query like so:

public List<Content> GetList()
{
    var documentsLink = Collection.DocumentsLink;
    var contentList = Client
        .CreateDocumentQuery<Content>(documentsLink)
        .AsEnumerable().ToList();
    return contentList;
}

And that’s it, a very natural way to get the DocumentsLink. Then again, maybe it makes sense to just store the DocumentsLink as a static and just use that…

Consider that homework.