fema-team team mailing list archive
Message #00050
Re: FEMA Server design flaw
One way someone could get more than the initial set of work units working by Monday would be to generate new work units after X time has elapsed, instead of after a certain number of them have been completed as intended.
This has problems in the long term, though. WorkUnit results aren't returned on a fixed schedule, so the WorkGenerator could generate thousands of WorkUnits before a single one is processed by a client. Alternatively, the clients could process everything and then sit waiting for work until the WorkGenerator's next run.
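To make the trade-off concrete, here is a minimal sketch of the two triggers side by side. The names (`GeneratorState`, `shouldGenerateByCount`, `shouldGenerateByTime`) and the thresholds are hypothetical and only for illustration; they are not taken from the FEMA code or from BOINC.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical snapshot of what the WorkGenerator knows about one island.
struct GeneratorState {
    int sent;                             // work units handed out so far
    int completed;                        // work units marked completed
    std::chrono::seconds sinceLastBatch;  // time since the last generation
};

// Intended trigger: generate once enough of the sent units have completed.
bool shouldGenerateByCount(const GeneratorState& s, double requiredFraction) {
    if (s.sent == 0) {
        return true;  // nothing is out yet, so create the initial batch
    }
    return static_cast<double>(s.completed) / s.sent >= requiredFraction;
}

// Stop-gap trigger: generate whenever the interval has elapsed,
// regardless of how many units have actually come back.
bool shouldGenerateByTime(const GeneratorState& s, std::chrono::seconds interval) {
    return s.sinceLastBatch >= interval;
}
```

With slow clients the time-based check keeps returning true while nothing has completed, so work piles up. With fast clients it returns false even though every unit is done, so clients sit idle until the interval expires.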
On 12/11/2011, at 8:36 AM, James Newell wrote:
> Hi Guys,
>
> We've got a bit of a design flaw at the moment… The way we need the Work Generator and Assimilator to work is:
>
> The Work Generator creates the initial work units. When a work unit has been completed BOINC runs the Assimilator. The Assimilator looks up the mapping (jobId, islandId and workunitId) for the work unit being assimilated. Then it needs to get the job and island objects so it can mark the work unit as completed. When the Work Generator finds that enough of the work units are completed it will generate new work units.
>
> The issue is that currently in the Work Generator, jobs, islands and work units are stored privately in memory, and there is no way for the Assimilator to access the Work Generator's jobs. Even more problematic, (I assume) the Assimilator and Work Generator run in different processes on the server and so don't share any memory. Consequently, just putting a getter on the Work Generator won't help at all.
>
> If someone could put their hand up to tackle this issue that would be great. If it's not sorted, nothing more than the initial work units will get generated, and our project won't be much better than running our problems on our own machines.
>
> Firstly, confirm that with BOINC the WorkGenerator and Assimilator are run in different processes (pretty certain they are, but it would save you a whole heap of work if they're not).
>
> Secondly solve the problem!
>
> Personally I think that the best way of solving the problem would be to persist data to our database (Jobs, Islands and WorkUnits) so they can be accessed regardless of which process they're created in. If anyone can come up with a hack to get this working before Monday or thinks of a better way please reply all!!!
>
> The Assimilator would work something like this:
> _mapper.getJob(mapping.getJobId()).getIsland(mapping.getIslandId()).getWorkUnit(mapping.getWuId()).markCompleted();
>
> Changes to the WorkGenerator would be something like:
> move app lookup etc to the mapper
> remove the new jobs loop (the jobs will be pulled from the db for the process loop and know their state (e.g. new, running, paused, finished, deleted) and can act accordingly in the process method)
>
> Changes to the mapper would be something like:
> Mapper<Factory>(factory, projectPath="/srv/.….") - get the app name, app id and create the job config (with path handler etc) to pass to the job
> getJob(id) : Job - fetch job data from db and construct objects
> getJobs() : vector<Job> - fetch job data from db and construct objects
> deleteJob(job) - remove job, island and work unit records from DB and files (in, out, solutions, objectives etc) from disk
> remove getNewJobs() method
>
> Changes to the job config would be something like:
> add app name and id
>
> Changes to the job would be something like:
> Job(config, id, state, … )
> getIsland(id) : Island - fetch island data from db and construct objects (could just proxy to the mapper Mapper.getIsland(job, islandId) : Island)
> getIslands() : vector<Island> - fetch island data from db and construct objects (could just proxy to the mapper Mapper.getIslands(job) : vector<Island>)
> createIsland() : Island - persist island data in database
> //getBestIndividuals() : Population
> //getBestObjectiveValues() : vector<double>
>
> Changes to the island would be something like:
> Island(config, job, id)
> getWorkUnit(id) : WorkUnit - fetch work unit data from db and construct objects (could just proxy to the mapper Mapper.getWorkUnit(job, island, wuId) : WorkUnit)
> getWorkUnits() : vector<WorkUnit> - fetch work unit data from db and construct objects (could just proxy to the mapper Mapper.getWorkUnits(job, island) : vector<WorkUnit>)
> getSentCount() : int - iterate work units to count
> getCompletedCount() : int - iterate work units to count
> createWorkUnit() : WorkUnit - create boinc work unit, call create_work() and persist work unit data in database
>
> Changes to the work unit would be something like:
> WorkUnit(config, island, wuId, boincId, sent_datetime, completed_datetime)
> isSent()
> isCompleted()
> markSent() - update database
> markCompleted() - update database
> //getSentIndividuals() : Population
> //getCompletedIndividuals() : Population
>
> Yep quite a big job. Please comment.
>
> Takers?
>
> James
>
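As a rough illustration of the persistence idea in the quoted message, the sketch below routes both sides through one shared store instead of the WorkGenerator's private memory. Everything here is an assumption made for the example: `Store`, `Mapping`, `assimilate` and `readyForNextBatch` are hypothetical names, and the `std::map` is only a stand-in for the database tables. A real version would need storage that both processes can reach, which an in-process map is not.

```cpp
#include <cassert>
#include <map>
#include <tuple>

// One row of the (hypothetical) work-unit table.
struct WorkUnitRow {
    bool sent;
    bool completed;
};

// Row key: (jobId, islandId, wuId), matching the mapping in the proposal.
typedef std::tuple<int, int, int> Key;

// Stand-in for the database; a std::map is used only to keep the sketch
// self-contained.
class Store {
public:
    // Record a newly created work unit as sent but not yet completed.
    void insert(int jobId, int islandId, int wuId) {
        WorkUnitRow row = {true, false};
        rows_[std::make_tuple(jobId, islandId, wuId)] = row;
    }

    // Returns false if no such work unit is known.
    bool markCompleted(int jobId, int islandId, int wuId) {
        std::map<Key, WorkUnitRow>::iterator it =
            rows_.find(std::make_tuple(jobId, islandId, wuId));
        if (it == rows_.end()) {
            return false;
        }
        it->second.completed = true;
        return true;
    }

    // Count completed work units for one island by iterating the rows.
    int completedCount(int jobId, int islandId) const {
        int n = 0;
        for (std::map<Key, WorkUnitRow>::const_iterator it = rows_.begin();
             it != rows_.end(); ++it) {
            if (std::get<0>(it->first) == jobId &&
                std::get<1>(it->first) == islandId && it->second.completed) {
                ++n;
            }
        }
        return n;
    }

private:
    std::map<Key, WorkUnitRow> rows_;
};

// The ids the Assimilator looks up for the work unit being assimilated.
struct Mapping {
    int jobId;
    int islandId;
    int wuId;
};

// Assimilator side: mark the work unit completed in the shared store.
bool assimilate(Store& store, const Mapping& m) {
    return store.markCompleted(m.jobId, m.islandId, m.wuId);
}

// WorkGenerator side: decide from the stored state, not private memory.
bool readyForNextBatch(const Store& store, int jobId, int islandId, int required) {
    return store.completedCount(jobId, islandId) >= required;
}
```

The point of the sketch is only the data flow: the Assimilator writes completion state, the WorkGenerator reads it, and neither needs a reference to the other's objects.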