I'm rather new to distributed computing and would like some assistance with the overall architecture of my application.
My application has Jobs that can be added to a JobQueue. One or more JobRunner instances can then be set up to run the jobs on the queue and generate JobResults. The JobResults are then sent to some destination, such as a report, log file, or email notification.
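To make the flow concrete, here is a minimal single-process sketch of it. It is only an illustration, not my actual code: threads and `queue.Queue` stand in for whatever distributed queue and worker processes would really be used, and the fields on `Job` and `JobResult`, the `job_runner` function, and the `run_all` helper are all names I made up for this example.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Job:
    job_id: str
    work: Callable[[], str]  # hypothetical: the unit of work to execute


@dataclass
class JobResult:
    job_id: str
    output: str


def job_runner(job_queue: "queue.Queue[Optional[Job]]",
               results: "queue.Queue[JobResult]") -> None:
    """Pull jobs off the queue until a None sentinel arrives."""
    while True:
        job = job_queue.get()
        if job is None:  # sentinel: no more work for this runner
            break
        results.put(JobResult(job.job_id, job.work()))


def run_all(jobs: List[Job], runner_count: int = 2) -> List[JobResult]:
    """Run every job on runner_count JobRunner threads and collect results."""
    job_queue: "queue.Queue[Optional[Job]]" = queue.Queue()
    results: "queue.Queue[JobResult]" = queue.Queue()
    runners = [
        threading.Thread(target=job_runner, args=(job_queue, results))
        for _ in range(runner_count)
    ]
    for runner in runners:
        runner.start()
    for job in jobs:
        job_queue.put(job)
    for _ in runners:
        job_queue.put(None)  # one sentinel per runner
    for runner in runners:
        runner.join()
    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected
```

Each job is picked up by whichever runner is free, so results can arrive in any order.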
However, I also want to be able to group a related set of Jobs into a JobSet, which in turn will be processed into a JobSetResult that contains all the corresponding JobResults. Each Job will still be processed independently by a JobRunner. Once all the JobResults are collected, the final JobSetResult will be sent to some destination, such as a log or email notification.
For example, a user may create a set of jobs to process a list of files. They would create a JobSet containing a number of FileProcessingJobs and submit it to be run. I obviously don't want the user to get an email notification for every file, only a single notification with the final JobSetResult when the entire JobSet is complete.
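The behaviour I'm after could be sketched roughly as below, in a single process. This is an assumption-laden illustration rather than a real design: `JobSetTracker`, its `record` method, and the `on_complete` callback are hypothetical names, and the in-memory set and lock would have to be replaced by some shared store in a distributed environment, which is exactly the part I'm unsure about.

```python
import threading
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class JobResult:
    job_id: str
    output: str


@dataclass
class JobSetResult:
    job_set_id: str
    results: List[JobResult]


class JobSetTracker:
    """Collects JobResults for one JobSet and notifies once, when all are in."""

    def __init__(self, job_set_id: str, job_ids: Iterable[str],
                 on_complete: Callable[[JobSetResult], None]) -> None:
        self._job_set_id = job_set_id
        self._pending = set(job_ids)     # jobs still awaiting a result
        self._results: List[JobResult] = []
        self._on_complete = on_complete  # hypothetical: e.g. send the email
        self._lock = threading.Lock()

    def record(self, result: JobResult) -> None:
        with self._lock:
            if result.job_id not in self._pending:
                return  # unknown or duplicate result: ignore it
            self._pending.remove(result.job_id)
            self._results.append(result)
            finished = not self._pending
            snapshot = list(self._results)
        if finished:
            # Fires exactly once, after the last outstanding job reports in.
            self._on_complete(JobSetResult(self._job_set_id, snapshot))


notifications: List[JobSetResult] = []
tracker = JobSetTracker("set-1", ["a.txt", "b.txt"], notifications.append)
tracker.record(JobResult("a.txt", "ok"))  # no notification yet
tracker.record(JobResult("b.txt", "ok"))  # set complete: one notification
```

Here `notifications.append` stands in for the email or log destination, so the user would see one JobSetResult instead of one message per file.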
I'm having trouble figuring out the best way to keep track of all this in a distributed environment. Is there some existing architectural design pattern which matches what I'm trying to do?