We are considering integrating Apache Spark into our calculation process, for which we originally planned to use Apache Oozie with standard MapReduce (MR) or map-only (MO) jobs.

After some research, several questions remain:

  1. Is it possible to orchestrate an Apache Spark process with Apache Oozie? If so, how?
  2. Is Oozie still necessary, or can Spark handle orchestration by itself? (Unification seems to be one of Spark's main goals.)

Please consider the following scenarios when answering:

  1. Executing a workflow every 4 hours
  2. Executing a workflow whenever specific data becomes available
  3. Triggering a workflow and configuring it with parameters
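
To make the three scenarios concrete, here is a minimal sketch in plain Python of the trigger conditions an orchestrator would have to evaluate. It uses neither the Oozie nor the Spark API. The function names, the marker-file convention, and the `--key=value` argument format are all assumptions made for illustration only.

```python
from datetime import datetime, timedelta
from pathlib import Path

# Scenario 1: the workflow should run every 4 hours.
INTERVAL = timedelta(hours=4)


def due_by_schedule(last_run: datetime, now: datetime) -> bool:
    """True once at least INTERVAL has passed since the last run."""
    return now - last_run >= INTERVAL


def due_by_data(marker: Path) -> bool:
    """Scenario 2: true once a (hypothetical) marker file shows the data is there."""
    return marker.exists()


def build_job_args(params: dict) -> list:
    """Scenario 3: turn workflow parameters into command-line style arguments."""
    return [f"--{key}={value}" for key, value in sorted(params.items())]
```

For example, `build_job_args({"region": "eu", "date": "2014-07-14"})` produces the arguments a triggered job could be started with. Whatever tool ends up doing the orchestration would need an equivalent of each of these three checks.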

Thanks for your answers in advance.

I don't know much about Oozie, but for Spark I would keep things as simple as possible, since most of the flow handling is done within the job. –  aaronman Jul 14 at 15:18
