[Nauty] Parallelizing generation

Tomas Peitl peitl at ac.tuwien.ac.at
Tue Jan 19 04:00:23 AEDT 2021


Many thanks, this helped a lot!

Looks like this was really the problem. In the case I mentioned earlier, 
it took 10 minutes just to get to the splitting point, and only a couple 
of seconds afterwards (the splitting level was 9).

With -X-2 or -X-3 I get an almost 10-fold speedup with the same splitting.

I will continue to experiment with this, but I'm confident this will 
solve everything one way or another. Thanks!

Best,
Tomas

On 18.01.21 11:54, Brendan McKay wrote:
> Hi Tomas,
>
> Compile the program with SPLITTEST defined.  Then run it with 
> non-trivial splitting.  It will report the splitting level, the number 
> of nodes on that level, and the time.  For example, with parameters "8 
> 8 2/3" I get
>
>    >Z 25102903 splitting cases at level 6; cpu=1.43 sec
>
> The splitting level does not depend on res/mod but you have to specify 
> something for this feature to work. (I should fix that.)
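
For concreteness, enabling the diagnostic might look like the lines below. This is a sketch only: the exact list of source files genbg links against comes from nauty's own makefile, which is authoritative, and `genbg_split` is just an illustrative output name.

```shell
# Rebuild genbg with the SPLITTEST diagnostic compiled in.
# Consult nauty's makefile for the exact files genbg is built from;
# the list below is a plausible example, not gospel.
gcc -o genbg_split -O3 -DSPLITTEST \
    genbg.c gtools.c nauty1.c nautil1.c naugraph1.c schreier.c naurng.c

# Some res/mod must be supplied for the report to appear; which one
# doesn't matter, since the splitting level does not depend on it.
./genbg_split 8 8 0/3
```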
>
> There is an undocumented option -X#, where # is an integer, that moves 
> the default splitting level up by #. The value of # can be negative.  
> For example, "-X-1 8 8 2/3" gives
>
>    >Z 834126 splitting cases at level 5; cpu=0.09 sec
>
> You can use -X to experiment with different splitting levels to see if 
> there is one suiting your application.  Note that if the number of 
> splitting cases is not much larger than mod, the parts probably won't 
> be of similar sizes.
>
> Brendan.
>
> On 18/1/21 9:09 pm, Tomas Peitl wrote:
>> Hi Brendan,
>>
>> Thanks for the detailed explanation!
>>
>> How is the splitting level determined? Can I set it manually, or is 
>> it simply the first level that has size >= mod? Can I diagnose this, 
>> e.g. see where the splitting occurred, and possibly count how much 
>> time was spent on pruning and generation before and after it?
>>
>> It seems that the space is split into roughly even parts, so a 
>> plausible explanation is indeed that generating the shared tree 
>> takes too long. On the other hand, I'm not using that many cores 
>> (30-50), and I already see behavior suggesting it may take as much 
>> as 300 seconds for the shared tree, which I can't quite believe. 
>> (This is for bicoloured 12 + 11 vertices with pruning and a 
>> matching on the 12 vertices, using up to 32 cores; the output has 
>> ~140k graphs, and running times and output sizes are roughly even 
>> across parts.)
>>
>> Best,
>> Tomas
>>
>> On 16.01.21 04:03, Brendan McKay wrote:
>>> Hi Tomas,
>>>
>>> I suspect you are using parameters for which the splitting is not good.
>>>
>>> Consider the case of generating all bicoloured graphs with 8 
>>> vertices on each side, in total 14,685,630,688 graphs.
>>>
>>> One part:  3872 sec
>>> 10 parts:  392.9 sec each (factor 9.85)
>>> 100 parts: 41.0 sec each (factor 94.4)
>>> 1000 parts: 5.33 sec each (factor 726)
>>> 10000 parts: 1.74 sec each (factor 2220)
>>>
>>> This is close to linear speedup up to around 1000 parts.  I'll 
>>> explain how it is done.  Remember that a part is specified as a pair 
>>> res/mod, where 0<=res<=mod-1.
>>>
>>> The generation has the form of a tree, with output graphs as 
>>> leaves.  Draw a line across the tree at some level (the splitting 
>>> level) and number the nodes on that level modulo mod (starting at 
>>> 0).  Part res/mod corresponds to the portion of the tree descended 
>>> from those nodes numbered res.
>>>
>>> Note that the entire tree must be generated as far as the splitting 
>>> level, which is an overhead for every part. In the example above, it 
>>> is about 1.35 seconds, which is why the parallelism gets poor when 
>>> the time per part isn't much more than the overhead. Check that the 
>>> time per part is close to 1.35 + 3870/N for N parts.
>>>
>>> In the case of different classes of bipartite graphs, various things 
>>> can go wrong.  If the shape of the tree is very irregular, the size 
>>> of the parts can vary a lot, which matters if your application 
>>> relies on waiting for the slowest part. If the tree is strongly 
>>> pruned in the levels above the splitting level, it could mean that 
>>> the splitting overhead becomes a larger fraction of the total. 
>>> Sometimes there is no splitting level that gives both acceptable 
>>> evenness of splitting and acceptable overhead.
>>>
>>> I have used two splitting levels together on occasion. This feature 
>>> is not in the code but it is something to try if changing the 
>>> splitting level is not enough to solve the problem.
>>>
>>> Brendan.
>>>
>>>
>>> On 16/1/21 2:15 am, Tomas Peitl wrote:
>>>> Hi everyone,
>>>>
>>>> I've got a question about parallelization in generators such as 
>>>> genbg (I'm using a modified version of genbg provided by Brendan 
>>>> some time ago, but that shouldn't make a difference I hope).
>>>>
>>>> I noticed that the speedup from parallelization using the res/mod 
>>>> feature pretty much plateaus at around 5-6 cores, and it doesn't 
>>>> really make too much sense to use more than that. Is this more or 
>>>> less correct, or am I missing something? To be fair, the speedup 
>>>> does grow a bit even beyond 6 cores (depends on the particular case 
>>>> I guess), but the dependence on the number of jobs looks more 
>>>> logarithmic than linear (didn't really try to fit a curve). So if 
>>>> you have more than one generation task, it makes sense to use fewer 
>>>> cores per task, and instead run several tasks in parallel.
>>>>
>>>> I have some aggressive custom pruning, but on the other hand each 
>>>> job generates roughly the same number of results, and the running 
>>>> times are also more or less uniformly distributed, so the splitting 
>>>> of the search space seems OK.
>>>>
>>>> How does the res/mod feature work? Does each job still have to 
>>>> generate everything, just with the advantage of quickly skipping 
>>>> over graphs that belong to other jobs? Or can it run in "O(N/j)"?
>>>>
>>>> Many thanks,
>>>> Tomas
>>>>
>>>> _______________________________________________
>>>> Nauty mailing list
>>>> Nauty at anu.edu.au
>>>> http://mailman.anu.edu.au/mailman/listinfo/nauty