[LINK] Historical programming-language groups disappearing from Google

Stephen Loosley StephenLoosley at outlook.com
Thu Jul 30 15:15:08 AEST 2020


Historical programming-language groups disappearing from Google

Posted July 28, 2020 by corbet
https://lwn.net/Articles/827233/

As Alex McDonald notes in this support request, Google has recently banned the old Usenet groups comp.lang.forth and comp.lang.lisp from the Google Groups system.

"Of specific concern is the archive. These are some of the oldest groups on Usenet, and the depth & breadth of the historical material that has just disappeared from the internet, on two seminal programming languages, is huge and highly damaging. These are the history and collective memories of two communities that are being expunged, and it's not great, since there is no other comprehensive archive after Google's purchase of Dejanews around 20 years ago."

Perhaps Google can be convinced to restore the content, but it also seems that some of this material could benefit from a more stable archive.

Comments:

Historical programming-language groups disappearing from Google
Posted Jul 28, 2020 15:23 UTC (Tue) by auc (subscriber, #45914) [Link]

I used to read comp.lang.lisp for a couple of years around 2004 and it was an amazing experience. The complete disappearance of such archives would be a great loss.

Historical programming-language groups disappearing from Google
Posted Jul 28, 2020 16:44 UTC (Tue) by nix (subscriber, #2304) [Link]

This looks like ridiculous overreaction from some braindead automation. I thought Google was supposed to have the best AI in the world, but whenever they let that AI take any decisions that affect real people it seems to make disastrous mistakes, usually with no obvious human checking before disaster is inflicted and no recourse other than signal-boosting via the press, since Google always does this with no humans in the loop and no appeals procedure (not that it's clear who could be consulted in the case of historical archives!). This is... not a sensible way to work.
Deleting historical archives because of spam is even less sensible: it's not like more spam is going to materialize in the past history of comp.lang.lisp.

Historical programming-language groups disappearing from Google
Posted Jul 28, 2020 17:51 UTC (Tue) by farnz (subscriber, #17727) [Link]

I suspect that it's happening because Google Groups is two products merged together:

  * The old "dejanews" Usenet archives, which I'll call "Usenet" throughout this comment.
  * Google's in-house forum system, which I'll call "forum" throughout this comment.

It makes sense to remove entire forum groups which have always had a spam problem, and where the forum group owner isn't willing to use Google's tools to moderate it and keep it spam-free; after all, if you've created a forum group for the purpose of spamming, or if you simply gave up the moment spammers found you, there's probably not much non-spam in the group. This is doubly true since the tools have been there since the forum group was created, and advertised to you as the forum group creator; AIUI, Google has reached out to the owner of record for such forum groups and asked them to clean up, so anything left is something that nobody still cares about.

However, that analysis ignores Usenet. Usenet predates Google's spam handling tools (after all, it predates Google), and has never had good tools for dealing with spam problems. Further, because there's no creator or owner on Google's systems for any given Usenet group, there's no-one to reach out to, so there's no-one who can (e.g.) close the group to new posts and clean up history, like there is for forums. Thus, unlike with forum groups, Google has no way to contact someone and say "hey, this group is spammy, please fix".

All it takes is someone designing an AI setup to clean out forum groups that are zero signal, and then running it on both Usenet and forum groups to get into this situation; chances are high that nobody involved in this decision has even realised that the two things are different, because they were merged together a long time ago.


Historical programming-language groups disappearing from Google
Posted Jul 30, 2020 1:38 UTC (Thu) by Max.Hyre (subscriber, #1054) [Link]

> Usenet predates Google's spam handling tools

Usenet predates spam. There are still a few fogies (read: me) who remember the first spam. :-(

Historical programming-language groups disappearing from Google
Posted Jul 28, 2020 17:11 UTC (Tue) by craigmaloney (guest, #117695) [Link]

This looks like it spans other groups as well. comp.sys.sinclair also has been "banned".

Historical programming-language groups disappearing from Google
Posted Jul 28, 2020 19:30 UTC (Tue) by Lennie (subscriber, #49641) [Link]

This sounds like something the Internet Archive would be a good fit for.
(I hope that, despite their current legal issues, the data they have is still safe and remains so.)

Historical programming-language groups disappearing from Google
Posted Jul 29, 2020 9:30 UTC (Wed) by t-v (guest, #112111) [Link]

https://archive.org/details/usenet-comp.lang
but I would not know how complete that is.

Historical programming-language groups disappearing from Google
Posted Jul 28, 2020 19:58 UTC (Tue) by beshr (subscriber, #133204) [Link]

I don't understand how "banning" them would help with the spam problem, if that's what it is. Can they be convinced to at least provide dumps to archive.org?

Historical programming-language groups disappearing from Google
Posted Jul 28, 2020 20:00 UTC (Tue) by readv_ (guest, #140452) [Link]

Going to go out on a limb here and assume that this is associated with the large number of spam posters leaving nefarious links across multiple groups over the past 10 years.
This is also a good way for the 'goog' to roll this up and clean out their storage and archive a lot of the data, leaving us in a lurch if we're searching for anything older than what they will allow us to view. (Sans Int.Archives)
Besides, Stack Overflow and Reddit have become the de facto choice for most people. Be wary though: SO will soon shred that apart if they go enterprise.
/me starts thinking of an archive strategy.

Historical programming-language groups disappearing from Google
Posted Jul 28, 2020 21:43 UTC (Tue) by leromarinvit (subscriber, #56850) [Link]

> This is also a good way for the 'goog' to roll this up and clean out their storage and archive a lot of the data, leaving us in a lurch if we're searching for anything older than what they will allow us to view. (Sans Int.Archives)
Compared to the gazillions of YouTube videos, their entire Usenet archive must be peanuts, so I doubt that's a concern. Or if it is, it doesn't seem like a very sensible one.
I guess it's only a few dozen terabytes compressed (but I can't find any stats excluding binaries). Presumably they don't archive the binary groups anyway, for various obvious reasons.

Historical programming-language groups disappearing from Google
Posted Jul 28, 2020 23:26 UTC (Tue) by readv_ (guest, #140452) [Link]

Send Sundar an email?

they're both still in use
Posted Jul 28, 2020 22:40 UTC (Tue) by gus3 (subscriber, #61103) [Link]

Forth is definitely still in active use, every time a *BSD or OpenIndiana system boots.
And of course, the gajillions of Lisp dialects, including Emacs Lisp.
I sure hope this situation is a machine screw-up and not a conscious human decision.

they're both still in use
Posted Jul 29, 2020 8:39 UTC (Wed) by Wol (subscriber, #4433) [Link]

Isn't at least one major BIOS written in Forth?
Cheers,
Wol

Open Firmware uses Forth
Posted Jul 29, 2020 12:04 UTC (Wed) by dkg (subscriber, #55359) [Link]

Yep. Open Firmware (aka IEEE 1275), which boots all the old PowerPC Macintosh machines, Sun SPARC, OLPC, and more, is a Forth interpreter.

Open Firmware uses Forth
Posted Jul 29, 2020 13:05 UTC (Wed) by Wol (subscriber, #4433) [Link]

Actually, I was talking about AMI, or Phoenix, or one of that lot, i.e. one of the x86 BIOSes.
Cheers,
Wol

Open Firmware uses Forth
Posted Jul 29, 2020 15:46 UTC (Wed) by nix (subscriber, #2304) [Link]

Most unlikely. x86 pre-EFI BIOSes are all repeatedly-hacked horrors from the early 80s, as I understand it, and back then it was raw assembler or nothing.

Open Firmware uses Forth
Posted Jul 29, 2020 23:16 UTC (Wed) by Wol (subscriber, #4433) [Link]

I've lost my copy of Starting Forth, sadly (I used to own a Jupiter Ace), but I'm pretty certain Forth dates from the 70s. And I would have thought writing a BIOS in it would be both very efficient, and (relatively) easy. It lends itself easily to writing assembler primitives, and then using a higher level construct to link them together.
Forth had the reputation of creating executables that beat assembler for compactness ...
Cheers,
Wol

Historical programming-language groups disappearing from Google
Posted Jul 29, 2020 0:16 UTC (Wed) by atai (subscriber, #10977) [Link]

Google, managing the world's information
old information too?

Historical programming-language groups disappearing from Google
Posted Jul 29, 2020 2:28 UTC (Wed) by connert (guest, #140463) [Link]

I've moved the original support request to the correct community; hopefully I can also get some more information on this.

Historical programming-language groups disappearing from Google
Posted Jul 29, 2020 5:17 UTC (Wed) by bokr (subscriber, #58369) [Link]

What if UNESCO declared Usenet to be a World Heritage legacy?
IMO libraries should not be burned helter-skelter, even if mistakenly sold into unrestricted private ownership.
But why would Google burn Usenet rather than offer it for sale, if they want to get rid of it?
Why wouldn't they recognize the great goodwill value for themselves of just transferring the archives to the FSF (Free Software Foundation)?
From Wikipedia:
_______________________________________________________________________________________________________________
UNESCO has 193 member states and 11 associate members. Based in Paris, France, most of its field offices are "cluster" offices that cover three or more countries; national and regional offices also exist.
UNESCO seeks to build a culture of peace and inclusive knowledge societies through information and communication. To that end, it pursues its objectives through five major program areas: education, natural sciences, social/human sciences, culture and communication/information. It sponsors projects related to literacy, technical training, education, the advancement of science, promoting independent media and freedom of the press, preserving regional and cultural history, and promoting cultural diversity. UNESCO assists in translating and disseminating world literature, establishing international cooperation agreements to secure "World Heritage Sites" of cultural and natural importance, preserving human rights, and bridging the worldwide digital divide. It also launched and leads the Education For All movement and lifelong learning.

Historical programming-language groups disappearing from Google
Posted Jul 29, 2020 12:00 UTC (Wed) by gray_-_wolf (subscriber, #131074) [Link]

The link gives `Sorry, this page can't be found.`, so was it deleted? Or is it not supposed to be readable by the public?

Historical programming-language groups disappearing from Google
Posted Jul 29, 2020 15:29 UTC (Wed) by sumanah (subscriber, #59891) [Link]

I'm having the same issue.

Historical programming-language groups disappearing from Google
Posted Jul 29, 2020 15:39 UTC (Wed) by corbet (editor, #1) [Link]

Weird...it works for me still...

Historical programming-language groups disappearing from Google
Posted Jul 29, 2020 15:47 UTC (Wed) by jake (editor, #205) [Link]

> I'm having the same issue.
Me three; I thought it might be related to this comment: https://lwn.net/Articles/827293/ but dunno ...
jake

Historical programming-language groups disappearing from Google
Posted Jul 29, 2020 13:32 UTC (Wed) by dtnameh (guest, #140476) [Link]

Would it be possible to move them to archive.org?

Historical programming-language groups disappearing from Google
Posted Jul 29, 2020 15:58 UTC (Wed) by oldtomas (guest, #72579) [Link]

The decline of the Library of Alexandria [1].
[1] https://en.wikipedia.org/wiki/Library_of_Alexandria