[Nauty-list] shortg with a large number of inputs
Ken Ryan
kjryan at bgsu.edu
Mon Oct 15 09:00:19 EST 2007
Thanks, Brendan.
Temporary files are being written to the new directory that I
specified using your instructions, so it appears that your fix worked.
I wanted to verify (for myself and the mailing list) that my sort now
handles larger files, but instead first ran into large file problems
while trying larger problems. Unfortunately, I am not able to rerun
the case where that sort error manifested since I do not recall the
exact inputs I used with my C program. Before getting your response,
I ended up solving that case (discussed below) by partitioning the job
further and using shortg on each of the smaller files or sub-pieces.
I am working with bipartite undirected graphs with N+k vertices and
Nk/2 edges. For example, with N=256 and k=16, the problem (mentioned
above) had 282,959 input graphs (and 186,595 isomorphism classes). I
ran my C code (complied with large file support), and the resulting
human readable file of all these input graphs in dreadnaut format was
2.7GB. The final step would have been to run detrog followed by
shortg, but dretog gave an error due to input file size. I think that
an answer to one of the following will improve my approach:
1) Do you have a makefile which compiles gtools applications for large file
support?
2) In a different format which I use to store the graphs, the 282,959
input graphs take up only 0.5GB. It is when I represent them as the
needed graphs with N+k vertices that they total 2.7GB. However, after
dretog the resulting file in g6 format would be (much) less than
2.7GB. While I have the current graph in memory in my program is it
possible to write it out in g6 format and by-pass writing the large
2.7GB file?
OR
3) Some else that I'm missing?
Regards,
Ken
ps A thanks to Keith for a response. My system does have GNU sort.
At 03:49 AM 10/13/2007, Brendan McKay wrote:
>Hi Ken,
>
>I never heard of any such problem caused by pipes and doubt if
>that is the correct explanation. My theory is that the sort
>process is running out of temporary disk space.
>
>The usual behaviour of sort on unix-like systems is to create
>temporary files in /tmp. If /tmp has insufficient space left,
>sort will crash (to check: "df -m /tmp" answers in megabytes).
>
>A proper fix to this problem would be to have an option on shortg
>to specify the place to put temporary files. This has been on my
>list of things to do for a long time but I just moved it higher.
>
>Meanwhile, you can try two things:
>1. Define the environment variable TMPDIR before running shortg
> to be the name of a directory with enough space. If such a
> directory is "dir" (use a full path), this is done by:
> In bash: export TMPDIR=dir
> In tcsh: setenv TMPDIR dir
> This method will of course only work if your sort program obeys
> the TMPDIR variable.
>2. Modify shortg.c like this (about line 92):
>#define SORTCOMMAND SORTPROG,SORTPROG,"-T","dir","-u","+0","-1"
>#define VSORTCOMMAND1 SORTPROG,SORTPROG,"-T","dir"
>#define VSORTCOMMAND2 SORTPROG,SORTPROG,"-T","dir","+0","-1","+2"
>where dir is as in choice #1. Then remake it.
>
>In both #1 and #2 you should be able to use the current directory
>under the name '.' .
>
>Please let me know if this proves to be the solution.
>
>Brendan.
>
>
>* Ken Ryan <kjryan at bgsu.edu> [071013 01:32]:
> >
> > I observed that if a large set of graphs are given as an input shortg
> > crashes. shortg calls the unix sort utility and passes an input of graphs
> > in canonical form to the unix sort utility. If the number of graphs passed
> > to the unix sort utility is large then the unix sort utility crashes and
> > that causes shortg to crash. I was told by the system administrator that
> > there is a bottleneck due to the pipe being used to pass the input of
> > graphs to the unix sort utility. I was wondering if there is a fix to this
> > situation. If needed, I will provide additional details ...
> >
> > I really appreciate your time and help on my last few questions.
> >
> > Regards,
> >
> > Ken Ryan, Assistant Professor
> > Applied Statistics and Operations Research
> > Bowling Green State University
> > Bowling Green, Ohio 43403-0267
> > Phone: (419)372-2958
> > Fax: (419)372-2875
> > Email: kjryan at bgsu.edu
> > Web: http://www.cba.bgsu.edu/faculty_staff/ryan/
> >
> >
> > _______________________________________________
> > Nauty-list mailing list
> > Nauty-list at cs.anu.edu.au
> > http://dcsmail.anu.edu.au/cgi-bin/mailman/listinfo/nauty-list
Ken Ryan, Assistant Professor
Applied Statistics and Operations Research
Bowling Green State University
Bowling Green, Ohio 43403-0267
Phone: (419)372-2958
Fax: (419)372-2875
Email: kjryan at bgsu.edu
Web: http://www.cba.bgsu.edu/faculty_staff/ryan/
More information about the Nauty
mailing list