I'm writing a system that dispatches reports and need some help with the design:

The requirements at a high level:

  1. reports are created ad hoc
  2. reports are sent by email to a list of users (the list is defined by the sender)
  3. each user might have their own customization for the report (look and feel, layout, etc.)
  4. sending the report should be done in the background, notifying the sender only if there is an error
  5. the status of each report delivery should be persistent (e.g. pending, sent / error)

The implementation must be in Java, Scala, or another JVM language, or at least be interoperable with the JVM.

Overall architecture plan

  1. The report sender generates a request to send a report; the request includes the report body and a list of recipients
  2. The server receives the request and passes it to a persistent queue
  3. A set of worker threads / processes reads from the queue and performs the customization logic, so that each recipient eventually gets a unique version of the report
  4. When customization is done, the same worker thread sends the customized versions of the report to the individual users (e.g. via email or another messaging mechanism), updates the status to "sent", and returns to the pool
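
To make steps 3 and 4 concrete, here is a minimal sketch of what one worker might do with a request taken from the queue. All names (ReportRequest, ReportWorker, Customizer, Sender, DeliveryStatus) are hypothetical, and the returned map only stands in for whatever persistent status storage is eventually chosen.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical status values, taken from requirement 5 (pending, sent / error).
enum DeliveryStatus { PENDING, SENT, ERROR }

// Assumed stand-in for the per-user customization (look and feel, layout).
interface Customizer { String customize(String body, String recipient); }

// Assumed stand-in for email or any other messaging mechanism.
interface Sender { void send(String recipient, String content) throws Exception; }

// The request from step 1: a report body plus the sender-defined recipient list.
class ReportRequest {
    final String reportId;
    final String body;
    final List<String> recipients;

    ReportRequest(String reportId, String body, List<String> recipients) {
        this.reportId = reportId;
        this.body = body;
        this.recipients = recipients;
    }
}

class ReportWorker {
    private final Customizer customizer;
    private final Sender sender;

    ReportWorker(Customizer customizer, Sender sender) {
        this.customizer = customizer;
        this.sender = sender;
    }

    // Handles one queued request: customize and send once per recipient,
    // recording a status per recipient. One failure does not stop the rest.
    Map<String, DeliveryStatus> process(ReportRequest request) {
        Map<String, DeliveryStatus> statuses = new LinkedHashMap<>();
        for (String recipient : request.recipients) {
            statuses.put(recipient, DeliveryStatus.PENDING);
            try {
                String customized = customizer.customize(request.body, recipient);
                sender.send(recipient, customized);
                statuses.put(recipient, DeliveryStatus.SENT);
            } catch (Exception e) {
                statuses.put(recipient, DeliveryStatus.ERROR);
            }
        }
        return statuses;
    }
}
```

Tracking the status per recipient rather than per request is an assumption here; it keeps one failed delivery from hiding the successful ones.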

This is where I'm not sure how to continue with the implementation. Here are the options I thought of:

  1. Roll my own, version 1 (in memory): create a thread pool using java.util.concurrent Executors, with a blocking queue holding the Runnable tasks that do the customization and notification. The problem is that it doesn't meet the need for persistent queue statuses: I need a "DB-backed" blocking queue in which I can see at any given time which tasks are queued and what their status is (pending, running, done / error). Another issue is that this is probably not the Java EE way...

  2. Roll my own, version 2: write the tasks to a database table and have a @Schedule task run at specific intervals, check for "new" tasks, and dispatch worker threads (managed by a thread pool executor) that change the status of each task to "running / done / error" based on its execution. I don't like this one because of the polling; it simply feels wrong. I could do a hybrid: write to the database as the task arrives and, in parallel, add the task to the blocking queue; when the task starts, its first step is to find the corresponding task ID in the database and change it to pending. This avoids the polling, but I'm worried about synchronization issues between the in-memory queue and the database, especially after recovery from a failure.

  3. The Java EE way: since I'm using Java and already use some Java EE features, I could go fully Java EE and use one of the following:

    • JMS queue? I'm not sure whether it's overkill, and not sure how I would meet the persistence requirement (e.g. see a list of all pending "send report" jobs). Do I keep a copy of each queue task in the database? How do I synchronize it?
    • @Asynchronous session bean? I'm not sure how it differs from a message-driven bean, or who handles the thread pool / queue for me. Is it used only for simple asynchronous actions?
    • @MessageDriven beans? I assume they are ideal in conjunction with a JMS queue, but I'm not sure I want the network and configuration overhead, and I'm not sure what the benefit is. I'm also not sure how to do the actual implementation: does a session bean get the request and pass it to the message-driven bean? Can it all be done locally at first and later changed to @Remote, or are all message-driven beans remote by definition? I think I'm lacking a lot of Java EE basics here (although I've briefly read the Java EE tutorial on that topic).
  4. The "modern" way: use "new" technologies such as Kestrel / Storm. However, I'm not sure they fit the use case. I've watched some introductions to Storm, but I'm not sure whether I can use Storm distributed RPC here, e.g. with each "Bolt" doing the customization logic for one recipient (which can be done in parallel).

  5. "Other" way? Use ActiveMQ? RabbitMQ? No need for any MQ at all? Any other option I missed?
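
To illustrate the hybrid from option 2 and the recovery worry that comes with it, here is a rough sketch under stated assumptions: InMemoryTaskStore is only a stand-in for the database table (a real version would use JDBC or JPA), and HybridDispatcher, TaskAction and TaskStatus are hypothetical names. The sketch marks a task "running" when it starts, which is one possible reading of the status flow. The idea is to persist first and enqueue second, so that after a crash the in-memory queue can be rebuilt from the stored statuses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

enum TaskStatus { PENDING, RUNNING, DONE, ERROR }

// Stand-in for a database table keyed by task ID (assumption, not a real DB).
class InMemoryTaskStore {
    private final Map<String, TaskStatus> rows = new ConcurrentHashMap<>();

    void save(String taskId, TaskStatus status) { rows.put(taskId, status); }

    TaskStatus find(String taskId) { return rows.get(taskId); }

    // Tasks that never reached DONE or ERROR, e.g. because the server died.
    List<String> findUnfinished() {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, TaskStatus> row : rows.entrySet()) {
            if (row.getValue() == TaskStatus.PENDING || row.getValue() == TaskStatus.RUNNING) {
                ids.add(row.getKey());
            }
        }
        return ids;
    }
}

// Hypothetical hook for the customization + notification work of one task.
interface TaskAction { void run(String taskId) throws Exception; }

class HybridDispatcher {
    private final InMemoryTaskStore store;
    private final ExecutorService pool;
    private final TaskAction action;

    HybridDispatcher(InMemoryTaskStore store, int threads, TaskAction action) {
        this.store = store;
        this.pool = Executors.newFixedThreadPool(threads);
        this.action = action;
    }

    // Persist first, then enqueue: the store stays the source of truth.
    void submit(String taskId) {
        store.save(taskId, TaskStatus.PENDING);
        enqueue(taskId);
    }

    // Call once at startup: re-enqueue everything the store says is unfinished.
    int recover() {
        List<String> unfinished = store.findUnfinished();
        for (String taskId : unfinished) {
            store.save(taskId, TaskStatus.PENDING);
            enqueue(taskId);
        }
        return unfinished.size();
    }

    private void enqueue(String taskId) {
        pool.execute(() -> {
            store.save(taskId, TaskStatus.RUNNING);
            try {
                action.run(taskId);
                store.save(taskId, TaskStatus.DONE);
            } catch (Exception e) {
                store.save(taskId, TaskStatus.ERROR);
            }
        });
    }

    // Stops accepting work and waits; returns false on timeout or interrupt.
    boolean shutdownAndWait(long seconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Note that recover() re-runs a task that was "running" when the server went down, so a recipient could receive the same report twice; the action would need to tolerate that (at-least-once delivery) or check whether the send already happened.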

My Question:

Which of the above is a "good enough" approach, and is there any other I missed? How would you implement it at a high level using that approach?

closed as too broad by Jimmy Hoffa, Bart van Ingen Schenau, user61852, Martijn Pieters, MichaelT Jun 29 at 0:33

There are either too many possible answers, or good answers would be too long for this format. Please add details to narrow the answer set or to isolate an issue that can be answered in a few paragraphs. If this question can be reworded to fit the rules in the help center, please edit the question.