
I've got an existing Grails/MongoDB application that I am adding some automated tests to. I want those tests to be executed against a specific set of data in a Mongo collection. I want the tests to be able to mangle the data (with predictable results, if I'm lucky) and then be able to quickly drop and recreate/reload the database so that I can run the test again.

Since I'm going to base this seed test data on real data from our production system, I'd like to be able to perhaps load the data from a JSON/BSON format that I could retrieve from a query in the Mongo shell or something similar.

Basically I don't want to have to write a hundred lines of code like the following:

new Record(name: 'John Doe', age: '25', favoriteColor: 'blue').save()

Except with 30 properties each, all the while ensuring that constraints are met and that the data is realistic. That's why I want to use production data.

I also don't want to have to resort to spawning execs that run mongorestore to load and reload real data, since that would require additional software to be running on the tester's machine.

Is there a better way? Perhaps somehow unmarshalling raw JSON into something that I can then execute with the Grails MongoDB GORM or GMongo or a direct call to the Java MongoDB driver?


2 Answers

You can use the com.mongodb.util.JSON class to convert JSON data directly to a DBObject. Take a look at this example, which demonstrates how to do it using the Java driver.
This MongoDB blog post shows how to do the same thing using GORM and the Groovy driver.
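A minimal sketch of that approach, assuming the legacy MongoDB Java driver (which provides com.mongodb.util.JSON) is on the classpath; the record shape and the commented-out collection name are made up for illustration:

```groovy
import com.mongodb.DBObject
import com.mongodb.util.JSON

// One production record, exported as JSON from the mongo shell.
String json = '{"name": "John Doe", "age": 25, "favoriteColor": "blue"}'

// JSON.parse returns a DBObject that the driver can insert as-is,
// so there is no need to hand-write constructor calls per record.
DBObject doc = (DBObject) JSON.parse(json)

// collection.insert(doc)   // e.g. via GMongo: db.records.insert(doc)
```

The same DBObject can be passed straight to a GMongo collection or to the raw Java driver, which is what makes the JSON export format convenient here.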


Do you need to store your test data in a transportable file, or will you always have access to a MongoDB instance on which it could live? Say, for example, that you have a test MongoDB server and can rely on having access to it whenever your tests are run.

In that case, the simplest solution is to keep the test data in a collection which you'd clone before each test run. Tests would then be free to mangle the cloned collection as much as they want without any actual data loss.
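That pre-test clone is only a few lines with GMongo (a sketch; the database and collection names here are hypothetical, and a local test instance is assumed to be running):

```groovy
import com.gmongo.GMongo

// Connect to the assumed local test instance.
def db = new GMongo().getDB('myapp-test')

// Drop the working collection, then refill it from the pristine seed
// collection so each test run starts from the same known state.
db.records.drop()
db.seedRecords.find().each { doc ->
    db.records.insert(doc)
}
```

Running this in a setUp/before block gives every test a disposable copy while the seed collection itself is never touched.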

If you need your test data to live in a file (because, for example, you want to store it in your code repository), then you need a format that's easy to serialize to and deserialize from BSON. JSON is the obvious choice, especially since, as @drorb said above, MongoDB already has tools to do that for you.

You'd then just need to write one script to dump the contents of an existing collection to JSON files, and another to load a set of JSON files into a collection - probably no more than a few lines each.
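A sketch of that pair of scripts, using Groovy's built-in JsonSlurper and JsonOutput (the helper names are made up; the resulting maps can be handed to GMongo or the Java driver for insertion):

```groovy
import groovy.json.JsonOutput
import groovy.json.JsonSlurper

// Dump: write each document (a Map) as one JSON object per line.
def dumpCollection(List<Map> docs, File file) {
    file.withWriter { w ->
        docs.each { doc -> w << JsonOutput.toJson(doc) << '\n' }
    }
}

// Load: parse the file back into a list of maps, ready to insert
// into a collection via GMongo or the Java driver.
def loadCollection(File file) {
    def slurper = new JsonSlurper()
    file.readLines().findAll { it.trim() }.collect { slurper.parseText(it) }
}
```

The one-document-per-line layout also means the load side can process the file line by line instead of parsing one giant JSON array.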

I'd suggest storing each object in a separate JSON file rather than keeping all the test data in one large file. As much as I like JSON, it doesn't lend itself well to streaming: you'd have to hold the whole collection in memory before you could start loading it into MongoDB. If your test data is big enough, that could start causing memory problems.

