
We are working on a Java EE project that handles a huge amount of data and has to provide full-text search (in Hungarian). So we started to think about what kind of architecture could fulfill our requirements. My thoughts are the following:

Using ElasticSearch as a database is an antipattern, so it should be used only for indexing and searching.

MongoDB fits our expectations, so it seems to be a good choice as the database.

The problem is how to index MongoDB data with ElasticSearch. I created a POC with 13 million documents. I iterated through the documents, and in each iteration I saved the document into MongoDB (which gave me an ID for it), then put it into ElasticSearch but stored only the Mongo ID there. Document indexing was quite fast, averaging 4.8 ms per document.
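To make the write path above concrete, here is a minimal sketch using in-memory stand-ins rather than the real MongoDB and ElasticSearch clients. The class `DualWriteSketch`, its two maps, and the tokenizing rule are all hypothetical and only illustrate the idea: the full body lives in one store, while the index keeps terms pointing at IDs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// In-memory stand-ins for the two stores; every name here is hypothetical.
class DualWriteSketch {

    // Stand-in for the MongoDB collection: id -> full document body.
    final Map<String, String> documentStore = new HashMap<>();

    // Stand-in for the ElasticSearch index: term -> ids of documents containing it.
    final Map<String, Set<String>> searchIndex = new HashMap<>();

    // Saves the full body first, then indexes its terms under the returned id only.
    String save(String body) {
        String id = UUID.randomUUID().toString(); // a real database would assign the id
        documentStore.put(id, body);
        // Split on anything that is not a Unicode letter or digit, so accented
        // Hungarian letters stay inside their words.
        for (String term : body.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                searchIndex.computeIfAbsent(term, t -> new HashSet<>()).add(id);
            }
        }
        return id;
    }

    // Asks the index for matching ids, then loads the bodies by id.
    List<String> search(String term) {
        Set<String> ids = searchIndex.getOrDefault(term.toLowerCase(Locale.ROOT), Set.of());
        List<String> bodies = new ArrayList<>();
        for (String id : ids) {
            bodies.add(documentStore.get(id));
        }
        return bodies;
    }

    public static void main(String[] args) {
        DualWriteSketch sketch = new DualWriteSketch();
        sketch.save("Full text search in Hungarian");
        sketch.save("Unrelated document");
        System.out.println(sketch.search("hungarian")); // bodies of the matching documents
    }
}
```

Note that the two writes in `save` are not atomic: if the second one fails, the stores diverge. Also, if the 4.8 ms average reflects a sequential loop, 13 million documents take about 13,000,000 × 4.8 ms ≈ 62,400 s, or a bit over 17 hours, so batching the writes may be worth measuring.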

When I search with ElasticSearch, it gives me back the matching document IDs, and I can load the documents from Mongo with the $in operator. This also seemed quite fast.
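One detail of this read path is worth sketching: a `$in` query is not guaranteed to return documents in the order of the ID list, so the relevance ranking from the search step has to be restored after the fetch. The helper below is a hypothetical illustration in plain Java (the class and method names are not from any driver); it also skips IDs for which no document came back, which can happen if the two stores are out of sync.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: restores search ranking after an unordered fetch by id.
class RankedFetchSketch {

    // Returns the fetched documents in the order given by rankedIds.
    // Ids that have no fetched document are skipped.
    static <T> List<T> inSearchOrder(List<String> rankedIds,
                                     List<T> fetched,
                                     Function<T, String> idOf) {
        Map<String, T> byId = new HashMap<>();
        for (T doc : fetched) {
            byId.put(idOf.apply(doc), doc);
        }
        List<T> ordered = new ArrayList<>();
        for (String id : rankedIds) {
            T doc = byId.get(id);
            if (doc != null) {
                ordered.add(doc);
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Ids as ranked by the search step, best match first.
        List<String> rankedIds = List.of("c", "a", "b");
        // Documents as returned by the database, in arbitrary order; "b" is missing.
        List<Map.Entry<String, String>> fetched =
                List.of(Map.entry("a", "alpha"), Map.entry("c", "gamma"));
        System.out.println(inSearchOrder(rankedIds, fetched, Map.Entry::getKey));
    }
}
```

Building the lookup map costs one pass over the fetched page, so the reordering stays linear in the page size.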

All that suggests it can be a good approach, but is it really? I can't figure out when this architecture slows down or what could be a bottleneck. Maybe synchronizing ElasticSearch with Mongo, but that can be run in a distributed environment (Hadoop).
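As one possible shape for such a synchronization job, the sketch below compares the IDs known to each store and reports what is missing on either side. It is only an assumption about how a reconciliation pass could look: `ReconcileSketch` and `difference` are hypothetical names, and with millions of documents the ID sets would have to be compared in pages or ranges rather than loaded whole.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical reconciliation helper for two stores that should hold the same ids.
class ReconcileSketch {

    // Ids contained in 'present' but absent from 'other', in sorted order.
    static Set<String> difference(Set<String> present, Set<String> other) {
        Set<String> result = new TreeSet<>(present);
        result.removeAll(other);
        return result;
    }

    public static void main(String[] args) {
        Set<String> storeIds = Set.of("1", "2", "3"); // ids read from the database
        Set<String> indexIds = Set.of("2", "3", "4"); // ids read from the search index
        System.out.println("to index:  " + difference(storeIds, indexIds));
        System.out.println("to delete: " + difference(indexIds, storeIds));
    }
}
```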

So my question: is there a better way to synchronize MongoDB with ElasticSearch?

"I can't figure out when this architecture slows down or what could be a bottleneck." Run some tests. Find out if your architecture will perform under heavy load. That's really the only way to know for sure. – Robert Harvey Jan 6 at 16:18

I had the same requirement and found these references that could help you.

For Java + MongoDB + ElasticSearch, there is the River plugin, which you can find at https://github.com/richardwilly98/elasticsearch-river-mongodb/wiki

And if you are really going to have a huge amount of data to manage, please read about this interesting experience and the conclusion from Quarkslab: http://blog.quarkslab.com/mongodb-vs-elasticsearch-the-quest-of-the-holy-performances.html

Hope it helps.
