Actual source code: Map.hh

  1: #ifndef included_ALE_Map_hh
  2: #define included_ALE_Map_hh

  4: #ifndef  included_ALE_Sifter_hh
  5: #include <Sifter.hh>
  6: #endif


  9: // Concepts.
 10: // -----------
 11: // Because of the use of templating and generic programming techniques,
 12: // many fundamental ALE types cannot be defined by making progressively specific
 13: // class declarations within a given hierarchy. Instead, they must satisfy certain
 14: // Concept requirements that make them acceptable inputs to various Algorithms.
 15: // This reflects a shortcoming of the current compiler technology that some defining
 16: // features of Concepts have to be specified in documentation, rather than within the
 17: // language.  Sometimes, however, conceptual types can be viewed as themselves
 18: // encapsulating Algorithms acting on other types implementing certain concepts.
 19: // This allows to define the structure of these algorithmic types using generic
 20: // programming techniques available in C++, for example.


 23: // Atlas & Sec
 24: // -----------------------
 25: // In the past we have considered the Atlas concept, which, for the given
 26: // Sifter, chart and ind types computed the assignment of indices
 27: // to the points in the underlying Sieve (relative to a given chart).
 28: // This mimics a system of local coordinate systems on a manifold or some
 29: // such space.

 31: // Essentially, Atlas can be thought of as a Sifter, with the
 32: // underlying Sifter's points (base or cap?) in the cap, charts in the base
 33: // and indices being the color on the edges.  However, any type that responds
 34: // to the fundamental requests -- setting the number of indices over a point
 35: // (fiber dimension), (re)ordering the indices after some dimension modifications
 36: // have been made, and retrieving the indices -- is an Atlas.

 38: // An object that assigns data of a given type to the (point, chart) pairs of an Atlas
 39: // is called its section or Sec.  If an Atlas is viewed as a discrete model of a structure
 40: // bundle over the point space, a Sec is a discrete model of a section of that bundle.
 41: // Sec is required to define a restrict method, which, given a (point,chart) pair returns
 42: // a data iterator.

 44: // If the Sifter underling the Atlas is a Sieve, we assume that to each
 45: // covering arrow p --> q (within the same chart), corresponds a mapping of the data d(p) <-- d(q),
 46: // reflecting the idea that d(q) can be 'restricted' to d(p) within the same chart (or perhaps
 47: // in any chart?).  This sort of behavior is certainly impossible to guarantee through
 48: // an interface specification, but it remains a conceptual requirement on Sec that
 49: // the data over p are somehow "included" in the data over q.  This "inclusion" is partly specified
 50: // in the Atlas (e.g., by ensuring that the indices over p are included in those over q),
 51: // and partly in Sec itself.


 54: // Map & ParMap concepts.
 55: // -----------------------
 56: // A Map is thought of as a type of mapping from a Sec class to another Sec class.
 57: // Maps and ParMaps can be viewed as algorithmic types acting on the input/output Sec types.
 58: // Most importantly, a Map must advertise the Atlases of the input and output Sec
 59: // types.  Furthermore, a Map acts essentially as a Sec relative to the output
 60: // atlas: giving a fixed input Sec S, a Map extends the Sec interface by defining restrictions
 61: // to output  (point,chart) pairs relative to S.  Alternatively, a Map can return an output Sec T,
 62: // containing the result of applying the map to S, which can be queried independently.

 64: // With these features of a Map hardly define any implementation structure (rather an interface),
 65: // since the particular behavior of restrictions is unconstrained by this specification.
 66: // Particular Maps can impose further constraints on the input/output atlases
 67: // and expand the interface, so long as they conform to the Map concept outlined above.
 68: // For example, we can require that each output chart data depend only on the explicitly specified
 69: // input charts' data.  This is done by specifying the 'lightcone' Sifter or a Sieve, connecting
 70: // the input charts to the output charts they depend on.  Taking the cone (or the closure, if it's
 71: // a Sieve) of a given output chart in the lightcone returns all of the input charts necessary
 72: // to compute the data over the output chart.

 74: // The specification of a Map's lightcone is very necessary to enable preallocation of internal
 75: // data structures, such as the matrix storage, for linear maps.  In the distributed context
 76: // it also enables the setup of communication structures.  This behavior is encapsulated in
 77: // a ParMap, which is itself a conceptual type that extends Map.  ParMap encapsulates (among other things)
 78: // three conceptual Map objects: Gather,Scatter and the Transform. A ParMap is then an algorithm that orchestrates
 79: // the action of other maps.

 81: // ***
 82: //   The ParMap algorithm is a most clear illustration of the locazation principle underlying Sieves and computation over
 83: // them: restrict-compute-assemble.  To compute the action of a ParMap on a distributed Sec the necessary data must be
 84: // communicate to and from the processes requiring and holding the data.  The total overlap of a processes domain with the
 85: // rest of the communicator is naturally covered by the individual overlaps indexed by the remote ranks.
 86: //   To communicate the local input Sec data to the remote processes, the Sec is first restricted to each of the overlap pieces,
 87: // forming another Sec, whose charts are the overlaps indexed by indices, and whose points are the (in_point,in_chart)
 88: // pairs on which in the input Sec is defined.  This Sec can be viewed as multisheeted coverings of the overlap porition
 89: // of the input Sec, and the multiplexing process forming the new Sec will be called Scatter.  It is a map between two Secs
 90: // with the input atlas and the new rank-indexed atlas.
 91: //   After Scatter maps the input Sec to a multisheeted covering, the multisheeted data are communicated to the processes according
 92: // to the rank in each chart. This can be viewed as a map between two such Secs -- communication is certainly a mapping in
 93: // the distributed context -- done 'locally' over each chart.  Thus, the data from the input Sec are first localized onto
 94: // each rank-chart, then mapped, and finally must be assembled.
 95: //   To obtain a Sec over the local domain, the Scatter process must be reversed, using the a map called Gather.  Gather
 96: // takes in the multisheeted covering of the overlap obtained after the communication phase, and obtains a single
 97: // (in_point,in_chart) data point from the collection of all such points over all the rank charts.  During this reduction
 98: // the overlap portion of the communicated input Sec is unified with the local data over the same (in_point,in_chart) pair,
 99: // completing the assembly of the input Sec.
100: //   Once the input data have been communicate to the consuming processes, Transform locally maps the data from the input
101: // Sec to the output Sec.  The Gather/Scatter maps involved in the communication stage depend on the GatherAtlas, which
102: // describes the multisheeted covering of InAtlas.  Gather/Scatter maps can in principle operate on a Sec of any Atlas,
103: // unifying the data over the same point in different charts, producing a Sec over a single-charted atlas.  Different
104: // implementations of Gather/Scatter lead to different multiplexing/reduction procedures, while the atlas structure stays
105: // the same.  Gather/Scatter maps can be used locally as well and need not act on the result of a communication.
106: // ***

108: // The Gather output atlas has the same structure as the ParMap input atlas,
109: // while the Gather input atlas  -- GatherAtlas -- combines the ParMap input (point,chart) pairs into
110: // the source and puts the communicator rank in the target. The Scatter input/output atlases have the
111: // structure of the output/input atlases of Gather respectively. The Transform atlases have the same
112: // structure as the ParMap atlases.
113: //                   (Gather input)                  (Gather output == Transform input)
114: 
115: //                           index                            index
116: //             (point,chart) -----> rank     <==>       point -----> chart

118: //                   (Scatter output)                (Scatter input == Transform input)

120: // GatherAtlas is constructed from ParMap's input atlas using the lightcone.
121: // The Gather input/Scatter output atlas essentially applies the idea of a chart recursively:
122: // Transform input charts are distributed among different processes.  Given a single process, its overlap
123: // with other processes can be indexed by their remote communicator ranks.  All (point_in,chart_in) pairs shared
124: // with a given rank are part of a single rank-chart.  This way a single in-chart is "blown up"
125: // into a "multisheeted" covering by rank-charts; each (point_in,chart_in) pair becomes a rank-point within
126: // one or many rank-charts.

128: // The data over this rank-atlas are essentially the data in the send/receive buffers,
129: // and the Scatter map is responsible for (multiplexing) packing and moving the data from the input Sec into the rank-Sec
130: // encapsulating these buffers. Once this has been done, ParMap executes the communication code (send/recv),
131: // and the rank-multisheeted data are transfered to the required processes.

133: //                                     Scatter                                     send/recv
134: //                                                                    ... rank_0
135: //              point_in --> chart_in    ==>      (point_in,chart_in) --> rank_k      ==>
136: //                                                                    ... rank_K

138: // Then Gather reduces the data over a single (point_in,chart_in) pair in all of the rank-charts.  This can be thought
139: // of as gluing all of the partial sections over the overlaps with remote processes into a single "remote" Sec and then
140: // gluing it with the "local" Sec.  Once the remote data have been assimilated into the local input Sec, Transform does
141: // its thing.


144: //                                      rank_0    Gather                        Transform
145: //                                 ...
146: //             (point_in,chart_in) -->  rank_n     ==>    point_in --> chart_in    ==>      point_out --> chart_out
147: //                                 ...
148: //                                      rank_N

150: // Observe that the structure of the GatherAtlas is essentially the same as the structure of the
151: // Overlap Sifter in ParDelta, therefore the Overlap code can be reused.  However, that code is not customizable,
152: // while we may want to allow the GatherAtlas  constructor the flexibility to massage the atlas (e.g., to keep only
153: // s single rank for a given (point,chart) pair, thereby implementing the 'owner' concept).  Making GatherAtlas a class
154: // a class will allow this flexibility by exposing the input atlas computation method to overloading.
155: // The prototypical GatherAtlas object will be implemented to keep all of the ranks in the remote overlap under
156: // a given (point_in, chart_in) pair.  Custom GatherAtlas objects may prune that so that the number and amount
157: // of data sent/recv'd by ParMap is only as required.  Here we assume that the overlap is small and computed only once
158: // or infrequently, while ParMap mappings are frequent.

160: 


163: //
164: // Atlas, Sec and Map classes
165: //
166: namespace ALE {
167:   namespace X {

169:     // We require that any class implementing the Atlas concept extending Sifter.
170:     // FIX: should Atlas depend on a Sieve type?  A Sifter type?
171:     template <typename Ind_, typename Point_, typename Chart_>
172:     class Atlas : public ASifter<Ind_, Point_, Chart_, SifterDef::multiColor> {
173:     public:
174:       //
175:       // Encapsulated types
176:       //
177:       typedef Point_                                 point_type;
178:       typedef Chart_                                 chart_type;
179:       typedef Ind_                                   ind_type;
180:       typedef ALE::pair<ind_type, ind_type>          index_type;
181:       //
182:       typedef ASifter<index_type, point_type, chart_type, SifterDef::multiColor> sifter_type;
183:     public:
184:       //
185:       // Basic interface
186:       //
187:       Atlas(MPI_Comm comm = PETSC_COMM_SELF, const int& debug = 0) : sifter_type(comm, debug) {};
188:       virtual ~Atlas(){};
189:       //
190:       // Extended interface
191:       //
192:       ind_type size(){
193:         ind_type sz = 0;
194:         // Here we simply look at each chart's cone and add up the sizes
195:         // In fact, we assume that within a chart indices are contiguous,
196:         // so we simply add up the offset to the size of the last point of each chart's cone and sum them up.
197:         baseSequence base = this->base();
198:         for(typename baseSequence::iterator bitor = base->begin(); bitor != base->end(); bitor++) {
199:           index_type ii = this->cone(*bitor)->rbegin()->color();
200:           sz += ii.first + ii.second;
201:         }
202:         return sz;
203:       };
204:       ind_type size(const chart_type& c) {
205:         // Here we simply look at the chart's cone and add up the sizes.
206:         // In fact, we assume that within a chart indices are contiguous,
207:         // so we simply return the sum of the offset to the size of the chart's last point.
208:         index_type ii = this->cone(c).rbegin()->color();
209:         ind_type sz = ii.first + ii.second;
210:         return sz;
211:       };
212:       ind_type size(const chart_type& c, const point_type& p) {
213:         // Here we assume that at most a single arrow between p and c exists
214:         arrowSequence arrows = this->arrows(p,c);
215:         ind_type sz = 0;
216:         if(arrows.begin() != arrows.end()) {
217:           sz = arrows.begin()->first;
218:         }
219:         return sz;
220:       };
221: 
222:       ind_type offset(const chart_type& c) {
223:         // We assume that within a chart indices are contiguous, so the offset of the chart
224:         // is the offset of the first element in its cone.
225:         ind_type off = this->cone(c).begin()->color().first;
226:         return off;
227:       };
228:       ind_type offset(const chart_type& c, const point_type& p) {
229:         // Here we assume that at most a single arrow between p and c exists
230:         arrowSequence arrows = this->arrows(p,c);
231:         // CONTINUE: what's the offset in case p is not in c
232:         ind_type sz = 0;
233:         if(arrows.begin() != arrows.end()) {
234:           sz = arrows.begin()->first;
235:         }
236:         return sz;
237:       };
238: 
239:     };// class Atlas

241: 

243:     // FIX: should Sec depend on a Sieve?  Perhaps Atlas should encapsulate a Sieve type?
244:     template <typename Data_, typename Atlas_>
245:     class Sec {
246:     public:
247:       //
248:       // Encapsulated types
249:       //
250:       typedef Atlas_                           atlas_type;
251:       typedef Data_                            data_type;
252:       typedef typename atlas_type::point_type  point_type;
253:       typedef typename atlas_type::chart_type  chart_type;
254:       typedef typename atlas_type::index_type  index_type;
255:       //
256:       // Perhaps the most important incapsulated type: sequence of data elements over a sequence of AtlasArrows.
257:       // of the sequence.
258:       template <typename AtlasArrowSequence_>
259:       class DataSequence {
260:       // The AtlasArrowSequence_ encodes the arrows over a chart or a chart-point pair.
261:       // The crucial assumption is that the begin() and rbegin() of the AtlasArrowSequence_ contain the extremal indices
262:       public:
263:         //
264:         // Encapsulated types
265:         //
266:         typedef AtlasArrowSequence_ atlas_arrow_sequence_type;
267:         //
268:         // Encapsulated iterators
269:         class iterator {
270:         public:
271:           typedef std::input_iterator_tag     iterator_category;
272:           typedef data_type                   value_type;
273:           typedef int                         difference_type;
274:           typedef value_type*                 pointer;
275:           typedef value_type&                 reference;
276:         protected:
277:           // Encapsulates a data_type pointer
278:           data_type* _ptr;
279:         public:
280:           iterator(const iterator& iter) : _ptr(iter._ptr) {};
281:           iterator(data_type* ptr)       : _ptr(ptr) {};
282:           data_type&         operator*(){return *(this->_ptr);};
283:           virtual iterator   operator++() {++this->_ptr; return *this;};
284:           virtual iterator   operator++(int n) {iterator tmp(this->_ptr); ++this->_ptr; return tmp;};
285:           virtual bool       operator==(const iterator& iter) const {return this->_ptr == iter._ptr;};
286:           virtual bool       operator!=(const iterator& iter) const {return this->_ptr != iter._ptr;};
287:         };
288:         //
289:         class reverse_iterator {
290:         public:
291:           typedef std::input_iterator_tag     iterator_category;
292:           typedef data_type                   value_type;
293:           typedef int                         difference_type;
294:           typedef value_type*                 pointer;
295:           typedef value_type&                 reference;
296:         protected:
297:           // Encapsulates a data_type pointer
298:           data_type* _ptr;
299:         public:
300:           reverse_iterator(const reverse_iterator& iter) : _ptr(iter._ptr) {};
301:           reverse_iterator(data_type* ptr)       : _ptr(ptr) {};
302:           data_type&                operator*(){return *(this->_ptr);};
303:           virtual reverse_iterator  operator++() {--this->_ptr; return *this;};
304:           virtual reverse_iterator  operator++(int n) {reverse_iterator tmp(this->_ptr); --this->_ptr; return tmp;};
305:           virtual bool              operator==(const reverse_iterator& iter) const {return this->_ptr == iter._ptr;};
306:           virtual bool              operator!=(const reverse_iterator& iter) const {return this->_ptr != iter._ptr;};
307:         };
308:       protected:
309:         const atlas_arrow_sequence_type& _arrows;
310:         data_type *_base_ptr;
311:         index      _size;
312:       public:
313:         //
314:         // Basic interface
315:         DataSequence(data_type *arr, const atlas_arrow_sequence_type& arrows) : _arrows(arrows) {
316:           // We immediately calculate the base pointer into the array and the size of the data sequence.
317:           // To compute the index of the base pointer look at the beginning of the _arrows sequence.
318:           this->_base_ptr = arr + this->_arrows->begin()->color().first;
319:           // To compute the total size of the array segement, we look at the end of the _arrows sequence.
320:           ALE::pair<index_type, index_type> ii = this->_arrows->rbegin()->color();
321:           this->_size = ii.first + ii.second;
322:         };
323:        ~DataSequence(){};
324:         //
325:         // Extended interface
326:         index_type size() {return this->_size;};
327:         iterator begin()  {return iterator(this->_base_ptr);};
328:         iterator end()    {return iterator(this->_base_ptr+this->_size+1);};
329:         iterator rbegin() {return reverse_iterator(this->_base_ptr+this->_size);};
330:         iterator rend()   {return reverse_iterator(this->_base_ptr-1);};
331:       }; // class Sec::DataSequence
332:     protected:
333:       Obj<atlas_type> _atlas;
334:       data_type*      _data;
335:       bool            _allocated;
336:     public:
337:       //
338:       // Basic interface
339:       //
340:       Sec(const Obj<atlas_type> atlas, const (data_type*)& data) {this->setAtlas(atlas, false); this->_data = data;};
341:       Sec(const Obj<atlas_type> atlas = Obj<atlas_type>()) {this->_data = NULL; this->setAtlas(atlas);};
342:      ~Sec(){if((this->_data != NULL)&&(this->_allocated)) {PetscFree(this->_data);CHKERROR(ierr, "Error in PetscFree");}};
343:       //
344:       // Extended interface
345:       //
346:       void setAtlas(const Obj<atlas_type>& atlas, bool allocate = true) {
347:         if(!this->_atlas.isNull()) {
348:           throw ALE::Exception("Cannot reset nonempty atlas");
349:         }
350:         else {
351:           if(atlas.isNull()) {
352:             throw ALE::Exception("Cannot set a nonempty atlas");
353:           }
354:           else {
355:             this->_atlas = atlas;
356:             this->_allocated = allocate;
357:             if(allocate) {
358:               // Allocate data
359:               PetscMalloc(this->_atlas->size()*sizeof(data_type), &this->_data);CHKERROR(ierr, "Error in PetscMalloc");
360:             }
361:           }
362:         }
363:       };// setAtlas()
364:       Obj<atlas_type> getAtlas() {return this->_atlas;};
365:       Obj<atlas_type> atlas()    {return getAtlas();};
366:       //
367:       DataSequence<typename atlas_type::coneSequence>
368:       restrict(const chart_type& chart) {
369:         return DataSequence<typename atlas_type::coneSequence>(this->_data, this->_atlas->cone(chart));
370:       };
371:       DataSequence<typename atlas_type::arrowSequence>
372:       restrict(const chart_type& chart, const point_type& point) {
373:         return DataSequence<typename atlas_type::coneSequence>(this->_data, this->_atlas->arrows(point, chart));
374:       };
375:     };// class Sec


378:     template <typename Data_, typename Atlas_>
379:     class ArraySec : public Sec<Data_, Atlas_> {
380:     public:
381:       //
382:       // Encapsulated types
383:       //
384:       typedef Sec<Data_,Atlas_>                sec_type;
385:       typedef Atlas_                           atlas_type;
386:       typedef Data_                            data_type;
387:       typedef typename atlas_type::point_type  point_type;
388:       typedef typename atlas_type::chart_type  chart_type;
389:       typedef typename atlas_type::index_type  index_type;
390:       //
391:       // Basic interface
392:       //
393:       ArraySec(const Obj<atlas_type> atlas, const (data_type*)& data) {this->setAtlas(atlas, false); this->_data = data;};
394:       ArraySec(const Obj<atlas_type> atlas = Obj<atlas_type>()) {this->_data = NULL; this->setAtlas(atlas);};
395:      ~ArraySec(){if((this->_data != NULL)&&(this->_allocated)) {PetscFree(this->_data);CHKERROR(ierr, "Error in PetscFree");}};
396:       //
397:       // Extended interface
398:       //
399:       data_type*
400:       restrict(const chart_type& chart) {
401:         return this->_data + this->_atlas->offset(chart);
402:       };
403:       data_type*
404:       restrict(const chart_type& chart, const point_type& point) {
405:         return this->_data + this->_atlas->offset(chart,point);
406:       };
407:     }; // class ArraySec


410:     // Overlap is a container class that declares GatherAtlas and ScatterAtlas types as well as
411:     // defines their construction procedures.
412:     // InAtlas_ and OutAtlas_ are Atlases, Lightcone_ is a Sifter.
413:     // Lightcone connects the charts of some ParMap's OutAtlas_ to the ParMap's InAtlas_.
414:     // GatherAtlas and ScatterAtlas have (point_in,chart_in) pairs as points and process ranks as charts.
415:     // Preconditions: InAtlas, Lightcone, and OutAtlas share communicator; we require that chart type be Point
416:     // FIX: should GatherAtlas/ScatterAtlas depend on an underlying Topology Sieve?
417:     template <typename Data_, typename InAtlas_, typename OutAtlas_, typename Lightcone_>
418:     class Overlap {
419:     public:
420:       // encapsulated types
421:       typedef Data_                                                 data_type;
422:       typedef InAtlas_                                              in_atlas_type;
423:       typedef Lightcone_                                            lightcone_type;
424:       typedef OutAtlas_                                             out_atlas_type;
425:       typedef MPI_Int                                               chart_type;
426:       typedef ALE::pair<in_atlas_type::point_type, chart_type>      point_type;
427:       typedef typename in_atlas_type::ind_type                      ind_type;
428:       typedef typename in_atlas_type::index_type                    index_type;
429:       //
430:       typedef typename Atlas<point_type, chart_type, ind_type>      gather_scatter_atlas_type;
431:     protected:
432:       Obj<in_atlas_type>             _in_atlas;
433:       Obj<out_atlas_type>            _out_atlas;
434:       Obj<lightcone_type>            _lightcone;
435:       //
436:       Obj<gather_scatter_atlas_type> _gather_atlas;
437:       Obj<gather_scatter_atlas_type> _scatter_atlas;
438:     public:
439:       //
440:       Overlap(const Obj<in_atlas_type>& in_atlas,const Obj<out_atlas_type>& out_atlas, const Obj<lightcone_type>& lightcone = Obj<lightcone_type>()) : atlas_type(in_atlas->comm()), _in_atlas(in_atlas),_out_atlas(out_atlas),_lightcone(lightcone) {
441:         this->computeAtlases(this->_gather_atlas, this->_scatter_atlas);
442:       };
443:      ~Overlap(){};
444:       //
445:       // Extended interface
446:       Obj<in_atlas_type>  inAtlas()  {return this->_in_atlas();};
447:       Obj<out_atlas_type> outAtlas() {return this->_out_atlas();};
448:       Obj<lightcone_type> lightcone(){return this->_lightcone();}
449:       //
450:       void computeAtlases(Obj<gather_atlas_type> gather_atlas, Obj<scatter_atlas_type> scatter_atlas) {
451:         // This function computes the gather and scatter atlases necessary for exchanging and fusing the input data lying over the
452:         // overlap with the remote processes into the local data over the overlap points.
453:         // The Lightcone sifter encodes the dependence between input and output charts of some map: each out-chart in the base
454:         // of the Lightcone depends on the in-chart in its Lightcone cone (depends for the computation of the map values).
455:         // A Null Lightcone Obj is interpreted as an identity Lightcone, hence the if-else dichotomy in the code below.
456:         //

458:         // In order to retrieve the remote points that OutAtlas depends on, we compute the overlap of the local bases of InAtlas
459:         // restricted to a suitable subset of Lightcone's cap.  The idea is that at most the charts the in the cap of the Lightcone
460:         // are required locally by the OutAtlas.
461:         // Furthermore, at most the cone of OutAtlas' base in Lightcone is required, which is the set we use to compute the overlap.
462:         // If the Lightcone Obj is Null, we take all of the OutAtlas base as the base of the overlap in InAtlas.
463:         //
464:         if(gather_atlas.isNull()) {
465:           gather_atlas  = gather_atlas_type(this->_in_atlas->comm());
466:         }
467:         if(scatter_atlas.isNull()){
468:           scatter_atlas = scatter_atlas_type(this->_in_atlas->comm());
469:         }
470:         //
471:         if(!lightcone.isNull()) {
472:           // Take all of out-charts & compute their lightcone closure; this will give all of the in-charts required locally
473:           typename out_atlas_type::capSequence out_base = this->_out_atlas->base();
474:           typename lightcone_type::coneSet in_charts = this->_lightcone->closure(out_base);
475:           // Now we compute the "overlap" of in_atlas with these local in-charts; this will be the gather atlas
476:           this->__pullbackAtlases(in_charts, gather_atlas, scatter_atlas);

478:         }// if(!lightcone.isNull())
479:         else {
480:           // FIX: handle the Null lightcone case
481:         }
482:       };// computeAtlases()

484:     protected:
485:       // Internal type definitions to ensure compatibility with the legacy code in the parallel subroutines
486:       typedef ALE::Point                                Point;
487:       typedef int                                            int32_t;
488:       typedef ALE::pair<int32_t, int32_t>                    int_pair;
489:       typedef ALE::set<std::pair<int32_t, int32_t> >         int_pair_set;
490:       typedef ALE::map<int32_t,int32_t>                      int__int;
491:       typedef ALE::map<Point, int32_t>                       Point__int;
492:       typedef ALE::map<Point, std::pair<int32_t,int32_t> >   Point__int_int;
493:       typedef ALE::map<Point, int_pair_set>                  Point__int_pair_set;

495:       template <typename Sequence>
496:       svoid __determinePointOwners(const Obj<Sequence>& points, int32_t *LeaseData, Point__int& owner) {
498:         // The Sequence points will be referred to as 'base' throughout, although it may in fact represent a cap.
499:         MPI_Comm comm = this->comm();
500:         MPI_Int  rank = this->commRank();
501:         MPI_Int  size = this->commSize();

503:         // We need to partition global nodes among lessors, which we do by global prefix
504:         // First we determine the extent of global prefices and the bounds on the indices with each global prefix.
505:         int minGlobalPrefix = 0;
506:         // Determine the local extent of global domains
507:         for(typename Sequence::iterator point_itor = points->begin(); point_itor != points->end(); point_itor++) {
508:           Point p = (*point_itor);
509:           if((p.prefix < 0) && (p.prefix < minGlobalPrefix)) {
510:             minGlobalPrefix = p.prefix;
511:           }
512:         }
513:         int MinGlobalPrefix;
514:         MPI_Allreduce(&minGlobalPrefix, &MinGlobalPrefix, 1, MPIU_INT, MPI_MIN, comm);
515:         CHKERROR(ierr, "Error in MPI_Allreduce");
516: 
517:         int__int BaseLowerBound, BaseUpperBound; // global quantities computed from the local quantities below
518:         int__int BaseMaxSize;                    // the maximum size of the global base index space by global prefix
519:         int__int BaseSliceScale, BaseSliceSize, BaseSliceOffset;
520: 
521:         if(MinGlobalPrefix < 0) { // if we actually do have global base points
522:           // Determine the upper and lower bounds on the indices of base points with each global prefix.
523:           // We use maps to keep track of these quantities with different global prefices.
524:           int__int baseLowerBound, baseUpperBound; // local quantities
525:           // Initialize local bound maps with the upper below lower so we can later recognize omitted prefices.
526:           for(int d = -1; d >= MinGlobalPrefix; d--) {
527:             baseLowerBound[d] = 0; baseUpperBound[d] = -1;
528:           }
529:           // Compute local bounds
530:           for(typename Sequence::iterator point_itor = points->begin(); point_itor != points->end(); point_itor++) {
531:             Point p = (*point_itor);
532:             int d = p.prefix;
533:             int i = p.index;
534:             if(d < 0) { // it is indeed a global prefix
535:               if (i < baseLowerBound[d]) {
536:                 baseLowerBound[d] = i;
537:               }
538:               if (i > baseUpperBound[d]) {
539:                 baseUpperBound[d] = i;
540:               }
541:             }
542:           }
543:           // Compute global bounds
544:           for(int d = -1; d >= MinGlobalPrefix; d--){
545:             int lowerBound, upperBound, maxSize;
546:             MPI_Allreduce(&baseLowerBound[d],&lowerBound,1,MPIU_INT,MPI_MIN,comm);
547:             CHKERROR(ierr, "Error in MPI_Allreduce");
548:             MPI_Allreduce(&baseUpperBound[d],&upperBound,1,MPIU_INT,MPI_MAX,comm);
549:             CHKERROR(ierr, "Error in MPI_Allreduce");
550:             maxSize = upperBound - lowerBound + 1;
551:             if(maxSize > 0) { // there are actually some indices in this global prefix
552:               BaseLowerBound[d] = lowerBound;
553:               BaseUpperBound[d] = upperBound;
554:               BaseMaxSize[d]    = maxSize;
555: 
556:               // Each processor (at least potentially) owns a slice of the base indices with each global indices.
557:               // The size of the slice with global prefix d is BaseMaxSize[d]/size + 1 (except if rank == size-1,
558:               // where the slice size can be smaller; +1 is for safety).
559: 
560:               // For a non-empty domain d we compute and store the slice size in BaseSliceScale[d] (the 'typical' slice size) and
561:               // BaseSliceSize[d] (the 'actual' slice size, which only differs from 'typical' for processor with rank == size -1 ).
562:               // Likewise, each processor has to keep track of the index offset for each slice it owns and stores it in BaseSliceOffset[d].
563:               BaseSliceScale[d]  = BaseMaxSize[d]/size + 1;
564:               BaseSliceSize[d]   = BaseSliceScale[d];
565:               if (rank == size-1) {
566:                 BaseSliceSize[d] =  BaseMaxSize[d] - BaseSliceScale[d]*(size-1);
567:               }
568:               BaseSliceSize[d]   = PetscMax(1,BaseSliceSize[d]);
569:               BaseSliceOffset[d] = BaseLowerBound[d] + BaseSliceScale[d]*rank;
570:             }// for(int d = -1; d >= MinGlobalPrefix; d--){
571:           }
572:         }// if(MinGlobalDomain < 0)
573: 
574:         for (typename Sequence::iterator point_itor = points->begin(); point_itor != points->end(); point_itor++) {
575:           Point p = (*point_itor);
576:           // Determine which slice p falls into
577:           // ASSUMPTION on Point type
578:           int d = p.prefix;
579:           int i = p.index;
580:           int proc;
581:           if(d < 0) { // global domain -- determine the owner by which slice p falls into
582:             proc = (i-BaseLowerBound[d])/BaseSliceScale[d];
583:           }
584:           else { // local domain -- must refer to a rank within the comm
585:             if(d >= size) {
586:               throw ALE::Exception("Local domain outside of comm size");
587:             }
588:             proc = d;
589:           }
590:           owner[p]     = proc;
591:           LeaseData[2*proc+1] = 1;                 // processor owns at least one of ours (i.e., the number of leases from proc is 1)
592:           LeaseData[2*proc]++;                     // count of how many we lease from proc
593:         }

595:         // Base was empty
596:         if(points->begin() == points->end()) {
597:           for(int p = 0; p < size; p++) {
598:             LeaseData[2*p+0] = 0;
599:             LeaseData[2*p+1] = 0;
600:           }
601:         }
602:       }; // __determinePointOwners()


605:       template <typename BaseSequence_>
606:       void __pullbackAtlases(const BaseSequence& pointsB, Obj<gather_atlas_type>& gather_atlas, Obj<scatter_atlas_type>& scatter_atlas){
607:         typedef typename in_atlas_type::traits::baseSequence Sequence;
608:         MPI_Comm       comm = _graphA->comm();
609:         int            size = _graphA->commSize();
610:         int            rank = _graphA->commRank();
611:         PetscObject    petscObj = _graphA->petscObj();
612:         PetscMPIInt    tag1, tag2, tag3, tag4, tag5, tag6;
614:         // The bases we are going to work with
615:         Obj<Sequence> pointsA = _graphA->base();
616:         Obj<Sequence> pointsB = _graphB->base();

618:         // We MUST have the same sellers for points in A and B (same point owner determination)
619:         int *BuyDataA; // 2 ints per processor: number of A base points we buy and number of sales (0 or 1).
620:         int *BuyDataB; // 2 ints per processor: number of B base points we buy and number of sales (0 or 1).
621:         PetscMalloc2(2*size,int,&BuyDataA,2*size,int,&BuyDataB);CHKERROR(ierr, "Error in PetscMalloc");
622:         PetscMemzero(BuyDataA, 2*size * sizeof(int));CHKERROR(ierr, "Error in PetscMemzero");
623:         PetscMemzero(BuyDataB, 2*size * sizeof(int));CHKERROR(ierr, "Error in PetscMemzero");
624:         // Map from points to the process managing its bin (seller)
625:         Point__int ownerA, ownerB;

627:         // determine owners of each base node and save it in a map
628:         __determinePointOwners(_graphA, pointsA, BuyDataA, ownerA);
629:         __determinePointOwners(_graphB, pointsB, BuyDataB, ownerB);

631:         int  msgSize = 3;   // A point is 2 ints, and the cone size is 1
632:         int  BuyCountA = 0; // The number of sellers with which this process (A buyer) communicates
633:         int  BuyCountB = 0; // The number of sellers with which this process (B buyer) communicates
634:         int *BuySizesA;     // The number of A points to buy from each seller
635:         int *BuySizesB;     // The number of B points to buy from each seller
636:         int *SellersA;      // The process for each seller of A points
637:         int *SellersB;      // The process for each seller of B points
638:         int *offsetsA = new int[size];
639:         int *offsetsB = new int[size];
640:         for(int p = 0; p < size; ++p) {
641:           BuyCountA += BuyDataA[2*p+1];
642:           BuyCountB += BuyDataB[2*p+1];
643:         }
644:         PetscMalloc2(BuyCountA,int,&BuySizesA,BuyCountA,int,&SellersA);CHKERROR(ierr, "Error in PetscMalloc");
645:         PetscMalloc2(BuyCountB,int,&BuySizesB,BuyCountB,int,&SellersB);CHKERROR(ierr, "Error in PetscMalloc");
646:         for(int p = 0, buyNumA = 0, buyNumB = 0; p < size; ++p) {
647:           if (BuyDataA[2*p+1]) {
648:             SellersA[buyNumA]    = p;
649:             BuySizesA[buyNumA++] = BuyDataA[2*p];
650:           }
651:           if (BuyDataB[2*p+1]) {
652:             SellersB[buyNumB]    = p;
653:             BuySizesB[buyNumB++] = BuyDataB[2*p];
654:           }
655:           if (p == 0) {
656:             offsetsA[p] = 0;
657:             offsetsB[p] = 0;
658:           } else {
659:             offsetsA[p] = offsetsA[p-1] + msgSize*BuyDataA[2*(p-1)];
660:             offsetsB[p] = offsetsB[p-1] + msgSize*BuyDataB[2*(p-1)];
661:           }
662:         }

664:         // All points are bought from someone
665:         int32_t *BuyPointsA; // (point, coneSize) for each A point boung from a seller
666:         int32_t *BuyPointsB; // (point, coneSize) for each B point boung from a seller
667:         PetscMalloc2(msgSize*pointsA->size(),int32_t,&BuyPointsA,msgSize*pointsB->size(),int32_t,&BuyPointsB);CHKERROR(ierr,"Error in PetscMalloc");
668:         for (typename Sequence::iterator p_itor = pointsA->begin(); p_itor != pointsA->end(); p_itor++) {
669:           BuyPointsA[offsetsA[ownerA[*p_itor]]++] = (*p_itor).prefix;
670:           BuyPointsA[offsetsA[ownerA[*p_itor]]++] = (*p_itor).index;
671:           BuyPointsA[offsetsA[ownerA[*p_itor]]++] = _graphA->cone(*p_itor)->size();
672:         }
673:         for (typename Sequence::iterator p_itor = pointsB->begin(); p_itor != pointsB->end(); p_itor++) {
674:           BuyPointsB[offsetsB[ownerB[*p_itor]]++] = (*p_itor).prefix;
675:           BuyPointsB[offsetsB[ownerB[*p_itor]]++] = (*p_itor).index;
676:           BuyPointsB[offsetsB[ownerB[*p_itor]]++] = _graphB->cone(*p_itor)->size();
677:         }
678:         for(int b = 0, o = 0; b < BuyCountA; ++b) {
679:           if (offsetsA[SellersA[b]] - o != msgSize*BuySizesA[b]) {
680:             throw ALE::Exception("Invalid A point size");
681:           }
682:           o += msgSize*BuySizesA[b];
683:         }
684:         for(int b = 0, o = 0; b < BuyCountB; ++b) {
685:           if (offsetsB[SellersB[b]] - o != msgSize*BuySizesB[b]) {
686:             throw ALE::Exception("Invalid B point size");
687:           }
688:           o += msgSize*BuySizesB[b];
689:         }
690:         delete [] offsetsA;
691:         delete [] offsetsB;

693:         int  SellCountA;     // The number of A point buyers with which this process (seller) communicates
694:         int  SellCountB;     // The number of B point buyers with which this process (seller) communicates
695:         int *SellSizesA;     // The number of A points to sell to each buyer
696:         int *SellSizesB;     // The number of B points to sell to each buyer
697:         int *BuyersA;        // The process for each A point buyer
698:         int *BuyersB;        // The process for each B point buyer
699:         int  MaxSellSizeA;   // The maximum number of messages to be sold to any A point buyer
700:         int  MaxSellSizeB;   // The maximum number of messages to be sold to any B point buyer
701:         int32_t *SellPointsA = PETSC_NULL; // The points and cone sizes from all buyers
702:         int32_t *SellPointsB = PETSC_NULL; // The points and cone sizes from all buyers
703:         PetscMaxSum(comm, BuyDataA, &MaxSellSizeA, &SellCountA);CHKERROR(ierr,"Error in PetscMaxSum");
704:         PetscMaxSum(comm, BuyDataB, &MaxSellSizeB, &SellCountB);CHKERROR(ierr,"Error in PetscMaxSum");
705:         PetscMalloc2(SellCountA,int,&SellSizesA,SellCountA,int,&BuyersA);CHKERROR(ierr, "Error in PetscMalloc");
706:         PetscMalloc2(SellCountB,int,&SellSizesB,SellCountB,int,&BuyersB);CHKERROR(ierr, "Error in PetscMalloc");
707:         for(int s = 0; s < SellCountA; s++) {
708:           SellSizesA[s] = MaxSellSizeA;
709:           BuyersA[s]    = MPI_ANY_SOURCE;
710:         }
711:         for(int s = 0; s < SellCountB; s++) {
712:           SellSizesB[s] = MaxSellSizeB;
713:           BuyersB[s]    = MPI_ANY_SOURCE;
714:         }

716:         if (debug) {
717:           ostringstream txt;

719:           for(int s = 0; s < BuyCountA; s++) {
720:             txt << "BuySizesA["<<s<<"]: "<<BuySizesA[s]<<" from seller "<<SellersA[s]<< std::endl;
721:           }
722:           for(int p = 0; p < (int) pointsA->size(); p++) {
723:             txt << "["<<rank<<"]: BuyPointsA["<<p<<"]: ("<<BuyPointsA[p*msgSize]<<", "<<BuyPointsA[p*msgSize+1]<<") coneSize "<<BuyPointsA[p*msgSize+2]<<std::endl;
724:           }
725:           for(int s = 0; s < BuyCountB; s++) {
726:             txt << "BuySizesB["<<s<<"]: "<<BuySizesB[s]<<" from seller "<<SellersB[s]<< std::endl;
727:           }
728:           for(int p = 0; p < (int) pointsB->size(); p++) {
729:             txt << "["<<rank<<"]: BuyPointsB["<<p<<"]: ("<<BuyPointsB[p*msgSize]<<", "<<BuyPointsB[p*msgSize+1]<<") coneSize "<<BuyPointsB[p*msgSize+2]<<std::endl;
730:           }
731:           PetscSynchronizedPrintf(comm, txt.str().c_str()); CHKERROR(ierr, "Error in PetscSynchronizedPrintf");
732:           PetscSynchronizedFlush(comm);CHKERROR(ierr,"Error in PetscSynchronizedFlush");
733:         }

735:         // First tell sellers which points we want to buy
736:         //   SellSizes, Buyers, and SellPoints are output
737:         PetscObjectGetNewTag(petscObj, &tag1);CHKERROR(ierr, "Failed on PetscObjectGetNewTag");
738:         commCycle(comm, tag1, msgSize, BuyCountA, BuySizesA, SellersA, BuyPointsA, SellCountA, SellSizesA, BuyersA, &SellPointsA);
739:         PetscObjectGetNewTag(petscObj, &tag2);CHKERROR(ierr, "Failed on PetscObjectGetNewTag");
740:         commCycle(comm, tag2, msgSize, BuyCountB, BuySizesB, SellersB, BuyPointsB, SellCountB, SellSizesB, BuyersB, &SellPointsB);

742:         if (debug) {
743:           ostringstream txt;

745:           if (!rank) {txt << "Unsquished" << std::endl;}
746:           for(int p = 0; p < SellCountA*MaxSellSizeA; p++) {
747:             txt << "["<<rank<<"]: SellPointsA["<<p<<"]: ("<<SellPointsA[p*msgSize]<<", "<<SellPointsA[p*msgSize+1]<<") coneSize "<<SellPointsA[p*msgSize+2]<<std::endl;
748:           }
749:           for(int p = 0; p < SellCountB*MaxSellSizeB; p++) {
750:             txt << "["<<rank<<"]: SellPointsB["<<p<<"]: ("<<SellPointsB[p*msgSize]<<", "<<SellPointsB[p*msgSize+1]<<") coneSize "<<SellPointsB[p*msgSize+2]<<std::endl;
751:           }
752:           PetscSynchronizedPrintf(comm, txt.str().c_str());CHKERROR(ierr, "Error in PetscSynchronizedPrintf");
753:           PetscSynchronizedFlush(comm);CHKERROR(ierr,"Error in PetscSynchronizedFlush");
754:         }

756:         // Since we gave maximum sizes, we need to squeeze SellPoints
757:         for(int s = 0, offset = 0; s < SellCountA; s++) {
758:           if (offset != s*MaxSellSizeA*msgSize) {
759:             PetscMemmove(&SellPointsA[offset], &SellPointsA[s*MaxSellSizeA*msgSize], SellSizesA[s]*msgSize*sizeof(int32_t));CHKERROR(ierr,"Error in PetscMemmove");
760:           }
761:           offset += SellSizesA[s]*msgSize;
762:         }
763:         for(int s = 0, offset = 0; s < SellCountB; s++) {
764:           if (offset != s*MaxSellSizeB*msgSize) {
765:             PetscMemmove(&SellPointsB[offset], &SellPointsB[s*MaxSellSizeB*msgSize], SellSizesB[s]*msgSize*sizeof(int32_t));CHKERROR(ierr,"Error in PetscMemmove");
766:           }
767:           offset += SellSizesB[s]*msgSize;
768:         }

770:         if (debug) {
771:           ostringstream txt;
772:           int SellSizeA = 0, SellSizeB = 0;

774:           if (!rank) {txt << "Squished" << std::endl;}
775:           for(int s = 0; s < SellCountA; s++) {
776:             SellSizeA += SellSizesA[s];
777:             txt << "SellSizesA["<<s<<"]: "<<SellSizesA[s]<<" from buyer "<<BuyersA[s]<< std::endl;
778:           }
779:           for(int p = 0; p < SellSizeA; p++) {
780:             txt << "["<<rank<<"]: SellPointsA["<<p<<"]: ("<<SellPointsA[p*msgSize]<<", "<<SellPointsA[p*msgSize+1]<<") coneSize "<<SellPointsA[p*msgSize+2]<<std::endl;
781:           }
782:           for(int s = 0; s < SellCountB; s++) {
783:             SellSizeB += SellSizesB[s];
784:             txt << "SellSizesB["<<s<<"]: "<<SellSizesB[s]<<" from buyer "<<BuyersB[s]<< std::endl;
785:           }
786:           for(int p = 0; p < SellSizeB; p++) {
787:             txt << "["<<rank<<"]: SellPointsB["<<p<<"]: ("<<SellPointsB[p*msgSize]<<", "<<SellPointsB[p*msgSize+1]<<") coneSize "<<SellPointsB[p*msgSize+2]<<std::endl;
788:           }
789:           PetscSynchronizedPrintf(comm, txt.str().c_str());CHKERROR(ierr, "Error in PetscSynchronizedPrintf");
790:           PetscSynchronizedFlush(comm);CHKERROR(ierr,"Error in PetscSynchronizedFlush");
791:         }

793:         // Map from A base points to (B process, B coneSize) pairs
794:         Point__int_pair_set BillOfSaleAtoB;
795:         // Map from B base points to (A process, A coneSize) pairs
796:         Point__int_pair_set BillOfSaleBtoA;

798:         // Find the A points being sold to B buyers and record the B cone size
799:         for(int s = 0, offset = 0; s < SellCountA; s++) {
800:           for(int m = 0; m < SellSizesA[s]; m++) {
801:             Point point = Point(SellPointsA[offset], SellPointsA[offset+1]);
802:             // Just insert the point
803:             int size = BillOfSaleAtoB[point].size();
804:             // Avoid unused variable warning
805:             if (!size) offset += msgSize;
806:           }
807:         }
808:         for(int s = 0, offset = 0; s < SellCountB; s++) {
809:           for(int m = 0; m < SellSizesB[s]; m++) {
810:             Point point = Point(SellPointsB[offset], SellPointsB[offset+1]);

812:             if (BillOfSaleAtoB.find(point) != BillOfSaleAtoB.end()) {
813:               BillOfSaleAtoB[point].insert(int_pair(BuyersB[s], SellPointsB[offset+2]));
814:             }
815:             offset += msgSize;
816:           }
817:         }
818:         // Find the B points being sold to A buyers and record the A cone size
819:         for(int s = 0, offset = 0; s < SellCountB; s++) {
820:           for(int m = 0; m < SellSizesB[s]; m++) {
821:             Point point = Point(SellPointsB[offset], SellPointsB[offset+1]);
822:             // Just insert the point
823:             int size = BillOfSaleBtoA[point].size();
824:             // Avoid unused variable warning
825:             if (!size) offset += msgSize;
826:           }
827:         }
828:         for(int s = 0, offset = 0; s < SellCountA; s++) {
829:           for(int m = 0; m < SellSizesA[s]; m++) {
830:             Point point = Point(SellPointsA[offset], SellPointsA[offset+1]);

832:             if (BillOfSaleBtoA.find(point) != BillOfSaleBtoA.end()) {
833:               BillOfSaleBtoA[point].insert(int_pair(BuyersA[s], SellPointsA[offset+2]));
834:             }
835:             offset += msgSize;
836:           }
837:         }
838:         // Calculate number of B buyers for A base points
839:         for(int s = 0, offset = 0; s < SellCountA; s++) {
840:           for(int m = 0; m < SellSizesA[s]; m++) {
841:             Point point = Point(SellPointsA[offset], SellPointsA[offset+1]);

843:             SellPointsA[offset+2] = BillOfSaleAtoB[point].size();
844:             offset += msgSize;
845:           }
846:         }
847:         // Calculate number of A buyers for B base points
848:         for(int s = 0, offset = 0; s < SellCountB; s++) {
849:           for(int m = 0; m < SellSizesB[s]; m++) {
850:             Point point = Point(SellPointsB[offset], SellPointsB[offset+1]);

852:             SellPointsB[offset+2] = BillOfSaleBtoA[point].size();
853:             offset += msgSize;
854:           }
855:         }

857:         // Tell A buyers how many B buyers there were (contained in BuyPointsA)
858:         // Tell B buyers how many A buyers there were (contained in BuyPointsB)
859:         PetscObjectGetNewTag(petscObj, &tag3);CHKERROR(ierr, "Failed on PetscObjectGetNewTag");
860:         commCycle(comm, tag3, msgSize, SellCountA, SellSizesA, BuyersA, SellPointsA, BuyCountA, BuySizesA, SellersA, &BuyPointsA);
861:         PetscObjectGetNewTag(petscObj, &tag4);CHKERROR(ierr, "Failed on PetscObjectGetNewTag");
862:         commCycle(comm, tag4, msgSize, SellCountB, SellSizesB, BuyersB, SellPointsB, BuyCountB, BuySizesB, SellersB, &BuyPointsB);

864:         if (debug) {
865:           ostringstream txt;
866:           int BuySizeA = 0, BuySizeB = 0;

868:           if (!rank) {txt << "Got other B and A buyers" << std::endl;}
869:           for(int s = 0; s < BuyCountA; s++) {
870:             BuySizeA += BuySizesA[s];
871:             txt << "BuySizesA["<<s<<"]: "<<BuySizesA[s]<<" from seller "<<SellersA[s]<< std::endl;
872:           }
873:           for(int p = 0; p < BuySizeA; p++) {
874:             txt << "["<<rank<<"]: BuyPointsA["<<p<<"]: ("<<BuyPointsA[p*msgSize]<<", "<<BuyPointsA[p*msgSize+1]<<") B buyers "<<BuyPointsA[p*msgSize+2]<<std::endl;
875:           }
876:           for(int s = 0; s < BuyCountB; s++) {
877:             BuySizeB += BuySizesB[s];
878:             txt << "BuySizesB["<<s<<"]: "<<BuySizesB[s]<<" from seller "<<SellersB[s]<< std::endl;
879:           }
880:           for(int p = 0; p < BuySizeB; p++) {
881:             txt << "["<<rank<<"]: BuyPointsB["<<p<<"]: ("<<BuyPointsB[p*msgSize]<<", "<<BuyPointsB[p*msgSize+1]<<") A buyers "<<BuyPointsB[p*msgSize+2]<<std::endl;
882:           }
883:           PetscSynchronizedPrintf(comm, txt.str().c_str());CHKERROR(ierr, "Error in PetscSynchronizedPrintf");
884:           PetscSynchronizedFlush(comm);CHKERROR(ierr,"Error in PetscSynchronizedFlush");
885:         }

887:         int      BuyConesSizeA  = 0;
888:         int      BuyConesSizeB  = 0;
889:         int      SellConesSizeA = 0;
890:         int      SellConesSizeB = 0;
891:         int     *BuyConesSizesA;  // The number of A points to buy from each seller
892:         int     *BuyConesSizesB;  // The number of B points to buy from each seller
893:         int     *SellConesSizesA; // The number of A points to sell to each buyer
894:         int     *SellConesSizesB; // The number of B points to sell to each buyer
895:         int32_t *SellConesA;      // The (rank, B cone size) for each A point from all other B buyers
896:         int32_t *SellConesB;      // The (rank, A cone size) for each B point from all other A buyers
897:         int32_t *overlapInfoA = PETSC_NULL; // The (rank, B cone size) for each A point from all other B buyers
898:         int32_t *overlapInfoB = PETSC_NULL; // The (rank, A cone size) for each B point from all other A buyers
899:         PetscMalloc2(BuyCountA,int,&BuyConesSizesA,SellCountA,int,&SellConesSizesA);CHKERROR(ierr, "Error in PetscMalloc");
900:         PetscMalloc2(BuyCountB,int,&BuyConesSizesB,SellCountB,int,&SellConesSizesB);CHKERROR(ierr, "Error in PetscMalloc");
901:         for(int s = 0, offset = 0; s < SellCountA; s++) {
902:           SellConesSizesA[s] = 0;

904:           for(int m = 0; m < SellSizesA[s]; m++) {
905:             SellConesSizesA[s] += SellPointsA[offset+2];
906:             offset             += msgSize;
907:           }
908:           SellConesSizeA += SellConesSizesA[s];
909:         }
910:         for(int s = 0, offset = 0; s < SellCountB; s++) {
911:           SellConesSizesB[s] = 0;

913:           for(int m = 0; m < SellSizesB[s]; m++) {
914:             SellConesSizesB[s] += SellPointsB[offset+2];
915:             offset             += msgSize;
916:           }
917:           SellConesSizeB += SellConesSizesB[s];
918:         }

920:         for(int b = 0, offset = 0; b < BuyCountA; b++) {
921:           BuyConesSizesA[b] = 0;

923:           for(int m = 0; m < BuySizesA[b]; m++) {
924:             BuyConesSizesA[b] += BuyPointsA[offset+2];
925:             offset            += msgSize;
926:           }
927:           BuyConesSizeA += BuyConesSizesA[b];
928:         }
929:         for(int b = 0, offset = 0; b < BuyCountB; b++) {
930:           BuyConesSizesB[b] = 0;

932:           for(int m = 0; m < BuySizesB[b]; m++) {
933:             BuyConesSizesB[b] += BuyPointsB[offset+2];
934:             offset            += msgSize;
935:           }
936:           BuyConesSizeB += BuyConesSizesB[b];
937:         }

939:         int cMsgSize = 2;
940:         PetscMalloc2(SellConesSizeA*cMsgSize,int32_t,&SellConesA,SellConesSizeB*cMsgSize,int32_t,&SellConesB);CHKERROR(ierr, "Error in PetscMalloc");
941:         for(int s = 0, offset = 0, cOffset = 0, SellConeSize = 0; s < SellCountA; s++) {
942:           for(int m = 0; m < SellSizesA[s]; m++) {
943:             Point point(SellPointsA[offset],SellPointsA[offset+1]);

945:             for(typename int_pair_set::iterator p_iter = BillOfSaleAtoB[point].begin(); p_iter != BillOfSaleAtoB[point].end(); ++p_iter) {
946:               SellConesA[cOffset+0] = (*p_iter).first;
947:               SellConesA[cOffset+1] = (*p_iter).second;
948:               cOffset += cMsgSize;
949:             }
950:             offset += msgSize;
951:           }
952:           if (cOffset - cMsgSize*SellConeSize != cMsgSize*SellConesSizesA[s]) {
953:             throw ALE::Exception("Nonmatching sizes");
954:           }
955:           SellConeSize += SellConesSizesA[s];
956:         }
957:         for(int s = 0, offset = 0, cOffset = 0, SellConeSize = 0; s < SellCountB; s++) {
958:           for(int m = 0; m < SellSizesB[s]; m++) {
959:             Point point(SellPointsB[offset],SellPointsB[offset+1]);

961:             for(typename int_pair_set::iterator p_iter = BillOfSaleBtoA[point].begin(); p_iter != BillOfSaleBtoA[point].end(); ++p_iter) {
962:               SellConesB[cOffset+0] = (*p_iter).first;
963:               SellConesB[cOffset+1] = (*p_iter).second;
964:               cOffset += cMsgSize;
965:             }
966:             offset += msgSize;
967:           }
968:           if (cOffset - cMsgSize*SellConeSize != cMsgSize*SellConesSizesB[s]) {
969:             throw ALE::Exception("Nonmatching sizes");
970:           }
971:           SellConeSize += SellConesSizesB[s];
972:         }

974:         // Then send A buyers a (rank, cone size) for all B buyers of the same points
975:         PetscObjectGetNewTag(petscObj, &tag5);CHKERROR(ierr, "Failed on PetscObjectGetNewTag");
976:         commCycle(comm, tag5, cMsgSize, SellCountA, SellConesSizesA, BuyersA, SellConesA, BuyCountA, BuyConesSizesA, SellersA, &overlapInfoA);
977:         PetscObjectGetNewTag(petscObj, &tag6);CHKERROR(ierr, "Failed on PetscObjectGetNewTag");
978:         commCycle(comm, tag6, cMsgSize, SellCountB, SellConesSizesB, BuyersB, SellConesB, BuyCountB, BuyConesSizesB, SellersB, &overlapInfoB);

980:         // Finally build the A-->B overlap sifter
981:         //   (remote rank) ---(base A overlap point, remote cone size, local cone size)---> (base A overlap point)
982:         for(int b = 0, offset = 0, cOffset = 0; b < BuyCountA; b++) {
983:           for(int m = 0; m < BuySizesA[b]; m++) {
984:             Point p(BuyPointsA[offset],BuyPointsA[offset+1]);

986:             for(int n = 0; n < BuyPointsA[offset+2]; n++) {
987:               int neighbor = overlapInfoA[cOffset+0];
988:               int coneSize = overlapInfoA[cOffset+1];

990:               // Record the point, size of the cone over p coming in from neighbor, and going out to the neighbor for the arrow color
991:               overlap->addArrow(neighbor, ALE::pair<int,Point>(0, p), ALE::pair<Point,ALE::pair<int,int> >(p, ALE::pair<int,int>(coneSize, _graphA->cone(p)->size())) );
992:               cOffset += cMsgSize;
993:             }
994:             offset += msgSize;
995:           }
996:         }

998:         // Finally build the B-->A overlap sifter
999:         //   (remote rank) ---(base B overlap point, remote cone size, local cone size)---> (base B overlap point)
1000:         for(int b = 0, offset = 0, cOffset = 0; b < BuyCountB; b++) {
1001:           for(int m = 0; m < BuySizesB[b]; m++) {
1002:             Point p(BuyPointsB[offset],BuyPointsB[offset+1]);

1004:             for(int n = 0; n < BuyPointsB[offset+2]; n++) {
1005:               int neighbor = overlapInfoB[cOffset+0];
1006:               int coneSize = overlapInfoB[cOffset+1];

1008:               // Record the point, size of the cone over p coming in from neighbor, and going out to the neighbor for the arrow color
1009:               overlap->addArrow(neighbor, ALE::pair<int,Point>(1, p), ALE::pair<Point,ALE::pair<int,int> >(p, ALE::pair<int,int>(coneSize, _graphB->cone(p)->size())) );
1010:               cOffset += cMsgSize;
1011:             }
1012:             offset += msgSize;
1013:           }
1014:         }
1015:       }; // __pullbackAtlases()

1017:     public:
1018:     }; // class Overlap

1020:     template <typename Data_, typename Map_, typename Overlap_, typename Fusion_>
1021:     class ParMap { // class ParMap
1022:     public:
1023:       //
1024:       // Encapsulated types
1025:       //
1026:       //  Overlap is an object that encapsulates GatherAtlas & ScatterAtlas objects, while Fusion encapsulates and the corresponding
1027:       // Gather & Scatter map objects. The idea is that Gather & Scatter depend on the structure of the corresponding atlases, but
1028:       // do not necessarily control their construction.  Therefore the two types of objects and/or their constructors can be overloaded
1029:       // separately.
1030:       //  For convenience Overlap encapsulates the data it is constructed from: two atlases, InAtlas and OutAtlas, and Lightcone,
1031:       // which is a Sifter.
1032:       // InAtlas refers to the input Sec (the argument of the ParMap), while OutAtlas refers to the output Sec (the result of ParMap).
1033:       // InAtlas_ and OutAtlas_ have arrows from points to charts decorated with indices into Data_ storage of the input/output Sec
1034:       // respectively.
1035:       //  GatherAtlas is an Atlas lifting InAtlas into a rank-indexed covering by overlaps with remote processes.
1036:       // The overlap is computed accroding to Lightcone, which connects the charts of OutAtlas_ to InAtlas_, which implies the two
1037:       // atlases have charts of the same type.  The idea is that data dependencies are initially expressed at the chart level;
1038:       // a refinement of this can be achieved by subclassing (or implementing a new) GatherAtlas.
1039:       //  Gather is  a Map that reduce a Sec over GatherAtlas with data over each ((in_point,in_chart),rank) pair to a Sec over InAtlas,
1040:       // with data over each (in_point, in_chart) pair.  Scatter maps in the opposite direction by "multiplexing" the data onto a
1041:       // rank-indexed covering.
1042:       //  Map is a Map sending an InAtlas Sec obtained from Gather, into an OutAtlas Sec.
1043:       //
1044:       typedef Data_                                            data_type;
1045:       typedef Overlap_                                         overlap_type;
1046:       typedef typename overlap_type::in_atlas_type             in_atlas_type;
1047:       typedef typename overlap_type::out_atlas_type            out_atlas_type;
1048:       typedef typename lightcone_type::out_atlas_type          lightcone_type;
1049:       //
1050:       typedef Fusion_                                          fusion_type;
1051:       typedef typename fusion_type::gather_type                gather_type;
1052:       typedef typename fusion_type::scatter_type               scatter_type;
1053:       typedef Map_                                             map_type;
1054:       //
1055:     protected:
1056:       int                             _debug;
1057:       //
1058:       Obj<map_type>                   _map;
1059:       Obj<overlap_type>               _overlap_type;
1060:       Obj<fusion_type>                _fusion;
1061:     protected:
1062:       void __init(MPI_Comm comm) {
1064:         this->_comm = comm;
1065:         MPI_Comm_rank(this->_comm, &this->_commRank);CHKERROR(ierr, "Error in MPI_Comm_rank");
1066:         MPI_Comm_size(this->_comm, &this->_commSize);CHKERROR(ierr, "Error in MPI_Comm_rank");
1067:       };
1068:     public:
1069:       //
1070:       // Basic interface
1071:       //
1072:       ParMap(const Obj<map_type>& map, const Obj<overlap_type>& overlap, const Obj<fusion_type>& fusion)
1073:         : _debug(0), _map(map), _overlap(overlap), _fusion(fusion) {};
1074:      ~ParMap(){};
1075:       //
1076:       // Extended interface
1077:       //
1078:       int  getDebug() {return this->_debug;}
1079:       void setDebug(const int& d) {this->_debug = d;};
1080:       MPI_Comm comm() {return this->_comm;};
1081:       MPI_Int  commRank() {return this->_commRank;};
1082:       MPI_Int  commSize() {return this->_commSize;};
1083:     }; // class ParMap
1084: 

1086: } // namespace X
1087: } // namespace ALE

1089: #endif