[Nauty-list] shortg outputs

William Rummler w.a.rummler at gmail.com
Wed Jan 4 11:37:15 EST 2012


Hi Ken,

I believe that (1) does always hold.

Overly detailed comments on that (skip if you'd like): Looking at the
SORTCOMMAND macros at the top of shortg.c, the sort process either (a)
is passed the flag -u (in addition to producing unique outputs, this
flag stabilizes the sort); (b) it's given the third "index" field as a
key via "-k 3" (the index is based on the number of graphs read by
shortg so far and will break canonical-label ties in favor of the
graphs that come first in the input); or (c) it doesn't need any
special flags to maintain a stable sort when the "k" flag is off and
the "v" flag is on due to the format of the lines fed to the sort
process: "canonical_label num_read". However, it seems that if the "v"
flag is off, the "d" flag is on, and the "k" flag is also off, there
is not necessarily going to be a stable sort. It may not matter if
you're not using the "k" flag, but there it is.

Moving on: I think I see what you mean by (2). For example, here's the
output of geng -b 4:

C?
CC
CE
CF
CQ
CU
C]

If you shortg -k these guys, you get

C?
CC
CE
CF
CU  <--- moved up
CQ  <--- moved down
C]

... the reason for this being that the graphs are essentially being
sorted by their canonical labels; in this case, CQ's canonical label
(C`) is lexicographically greater than CU's canonical label (CR).

You would like this to not happen, if my interpretation is correct.
You would want (in this example) the latter output to equal the
former. There are probably several ways to do this (and maybe several
better ways to do this than what I'm about to offer), but one way to
do it (that, to its credit, avoids the potential need to re-compute
canonical labels for tens of millions of graphs if one tries to script
a workaround (there may be a good way like this, though)) is to
directly modify shortg.c to use a data structure that can keep track
of when each canonical label was first encountered. You can use this
data structure to insert another field in the lines fed to the sort
sub-process that will allow it to maintain the input file order of
canonical labels.

Toward this end, I've attached a modified version of shortg.c (called
shortg2.c; please backup your original shortg.c before copying this
one onto it and re-make'ing your nauty install) that you may want to
examine/use to see if it helps you. I ran a few small tests with what
I thought were representative flag options to my own re-make'd shortg,
and it seems to work; for instance, it produces the "right" (at least
desired) output for the example I gave above.

If I'm not mistaken, this also stabilizes the "kv"-off "d"-on sort
process that I mentioned way back at the beginning of this monograph,
er, email.

Finally, a disclaimer: I hope this helps and that I haven't
communicated anything wrong, but it's been a long time since I've
delved into nauty's source like this. If appropriate, hopefully others
can offer corrections/comments to what I've said here. And I'll try to
clarify if I can, just let me know.

"Good nauty, and good luck."


- William

ps. This issue (2) may be a "new flag" feature request for the next
nauty's shortg?




On Fri, Dec 30, 2011 at 7:00 PM,  <nauty-list-request at cs.anu.edu.au> wrote:
> Send Nauty-list mailing list submissions to
>        nauty-list at cs.anu.edu.au
>
> To subscribe or unsubscribe via the World Wide Web, visit
>        http://dcsmail.anu.edu.au/cgi-bin/mailman/listinfo/nauty-list
> or, via email, send a message with subject or body 'help' to
>        nauty-list-request at cs.anu.edu.au
>
> You can reach the person managing the list at
>        nauty-list-owner at cs.anu.edu.au
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Nauty-list digest..."
>
>
> Today's Topics:
>
>   1. shortg outputs (Kenneth Joseph Ryan)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 30 Dec 2011 18:23:44 -0500
> From: Kenneth Joseph Ryan <kjryan at bgsu.edu>
> Subject: [Nauty-list] shortg outputs
> To: "nauty-list at cs.anu.edu.au" <nauty-list at cs.anu.edu.au>
> Message-ID:
>        <E76D862B01CC7845BF120D9591635FA0376E0180EF at MAIL4.bgsu.edu>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi,
>
> I use shortg -k to reduce files of graphs. The input files in our applications can be rather large (tens of millions of bipartite undirected graphs all with the same number of vertices), and the reductions are typically an order of magnitude or more.
>
> For a new method to our problem type, each input file will be presorted by a "secondary criterion."  I ultimately need to produce an output file which
>
> 1)      contains the isomorphic copy within each class that occurs "first" in the input file and
>
> 2)      the output file needs to preserve input file order.
> At this point, I realize that I could use the class output produced by shortg -vu and then extract the needed subset of input graphs through a separate program, but am hoping to find a direct, more timely solution because the reductions are so large.
>
> I focused on reading the manual and then even tried my hand at reading nauty code but with no luck, so I moved onto simple examples. Through simple examples I already know that 2) does not happen directly, but I have not constructed an example where 1) did not hold. Does 1) always hold? Is there a setting (or some slick approach) such that I can get 1) AND 2) directly?
>
> Thanks in advance for your time,
> ----
> Ken Ryan, Associate Professor
> Applied Statistics and Operations Research
> Bowling Green State University
> Bowling Green, Ohio 43403-0267
> Phone: (419)372-2958
> Fax: (419)372-2875
> Email: kjryan at bgsu.edu<mailto:kjryan at bgsu.edu>
> Web: http://www.cba.bgsu.edu/faculty_staff/ryan/
>
>
>
> ----
> Ken Ryan, Associate Professor
> Applied Statistics and Operations Research
> Bowling Green State University
> Bowling Green, Ohio 43403-0267
> Phone: (419)372-2958
> Fax: (419)372-2875
> Email: kjryan at bgsu.edu
> Web: http://www.cba.bgsu.edu/faculty_staff/ryan/
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: <http://dcsmail.anu.edu.au/pipermail/nauty-list/attachments/20111230/c83d3fd6/attachment.html>
>
> ------------------------------
>
> _______________________________________________
> Nauty-list mailing list
> Nauty-list at cs.anu.edu.au
> http://dcsmail.anu.edu.au/cgi-bin/mailman/listinfo/nauty-list
>
>
> End of Nauty-list Digest, Vol 56, Issue 2
> *****************************************
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shortg2.c
Type: text/x-csrc
Size: 23154 bytes
Desc: not available
Url : http://mailman.anu.edu.au/mailman/private/nauty/attachments/20120103/e1fc648c/attachment.bin 


More information about the Nauty mailing list