[Nauty] Canonically labeled output of nauty/Traces actually byte-wise unique?

Tue Feb 20 12:50:23 AEDT 2024

Hi Johann,

The sparsegraph structure for a labelled graph can be non-unique in 
three ways.
1. The neighbours for a vertex can be listed in any order.
2. The neighbour lists in the e array might be in any order.
3. The neighbour lists in the e array can have gaps between them.

The sortlists_sg function corrects issue 1 but not issues 2 and 3.

However, I believe that the canonical output of nauty has none of these 
issues.
That is, the neighbour lists are always sorted and in the natural order 
without gaps.
I suggest you add code to check that and let me know if you find an 
exception.

As for Traces, probably the same is true but I'll consult Adolfo.

There is a function "long hashgraph_sg(sparsegraph *sg, long key)" in 
nautil.c
that should give the same value regardless of issues 1-3. If you call it 
with
several different values of key, the combined result is unlikely to give you
the same values for two different graphs, but this is not guaranteed as the
method is very far from cryptographic quality.

For an application where very high quality signatures are needed, I would do
it like this: (1) sortlists_sg, (2) copy the neighbour lists from the e 
array one
at a time into a new array with a marker (say -1) between them.  (3) Compute
a hash signature from the new array. I use the SHA1 function that comes 
in the
libcrypto library that is part of OpenSSL, which gives 160 bits.

Brendan.

On 20/2/2024 9:49 am, Johann Birnick via Nauty wrote:
> Hello,
>
> for my application of nauty/Traces, I'm building a 'database' of some 
> data of some graphs in the form of a hashtable. That is, to store data 
> attached to some graph, my plan is to canonically label it and then 
> take the hash of the canonically labeled graph and then store some 
> data (think: graph invariants) of the graph in a hashtable.
>
> For this, it would be important to know if the canonically labeled 
> graph which is returned from say Traces is guaranteed to be unique up 
> to actual byte representation?
>
> More precisely: Take two isomorphic graphs. For each of them, 1. call 
> Traces and get a `sparsegraph` object which is the canonically labeled 
> graph, and then 2. compute the hash of this sparsegraph. Will the 
> hashes be the same?
>
> (It doesn't seem like it, as in the example programs always 
> `aresame_sg` is called, which seems to do some nontrivial things such 
> as skipping certain bytes.)
>
> If not, then are there perhaps special cases in which it will be 
> guaranteed? For example, say my two graphs really use the same number 
> of vertices (i.e. have no dead vertices of degree 0) and same number 
> of edges. Will it work in this special case?
>
> Thank you very much in advance!
>
> Johann
>
> _______________________________________________
> Nauty mailing list
> Nauty at anu.edu.au
> https://mailman.anu.edu.au/mailman/listinfo/nauty