[LINK] More to the world than physics [was: Google's WiFi bungle]

Thu May 20 09:46:49 AEST 2010

There are only two things to say here and then I'm done.  First, I often 
enough prefaced my remarks with the stated assumption that Google have 
the resources to render even basic wifi data as identifiable; on this, 
Craig and I appear to be in agreement, for he said "admittedly, google 
ARE experts at extracting signal from noise".  Second, they admitted 
themselves that they collected much more than basic wifi data, they 
apologised, and they undertook to destroy that data, which rather 
confirms that what they collected was indeed personal information. 

Craig Sanders wrote:
> On Thu, May 20, 2010 at 05:33:13AM +1000, Stephen Wilson wrote:
>> If only physics were the best or only model for understanding the world, 
>> you'd be OK.  But it's not, and your shrill protestations and ongoing 
>> category errors prove my point that there is a chasm between the worlds 
>> of technologists and privacy policy makers.  
>
> now it's technologists vs privacy policy makers.  please make up your
> mind and quit shifting the goal posts.
>
> if you'll recall - the original dispute was that I, and others, said
> this whole wifi "issue" was just a media beat-up. which it is. and it
> has the usual elements of witch-hunt and rabble-rousing.
>
>
>> But you're wrong to treat concepts like private and public like       
>> physical properties.                                                  
>
> 1. i'm not treating them like physical properties, but as matters of
> objective fact.
>
> 2. i'm not wrong in doing that.  it is demonstrably true.
>
>
>> Yet again, I bring you back to a point of law: the operable term is
>> "personal information", and not "public" or "private".
>
> and, as i've said several times now, you need to first prove that there
> was any "personal information" in the packets received.
>
> you're constantly making the *assumption* that there was.  and then
> immediately judging google to be guilty of privacy infringement.
>
> this may seem quaint and old-fashioned but in my view, people (and
> corporations) ought to be judged on what they've actually done, not on
> what a hysterical and ignorant mob assumes they might have done.
>
>
>> If a data set contains information about a person, where their        
>> identity is apparent or can be readily determined, then that          
>> data stream is called "personal information" and it's subject to      
>> information privacy law.
>
> and *IF* is the operative word here.
>
> you're posing a question, and then leaping immediately to the conclusion
> ("guilty") without even bothering to answer it.
>
>
>
>> Collecting (and hanging on to) payload data is very different from 
>> collecting wifi addresses because of the issue of primary purpose.  
>
> you're wrongly assuming that "collecting payload data" and "collecting
> wifi addresses" are two different actions. they're not. they're one
> action, scanning: you scan, you get a whole load of data, then you
> filter out the stuff you don't want.
>
>
>> To go back to your examples of mail servers collecting personal       
>> information, that collection is intrinsic to how the e-mail system
>> works, and I wouldn't think it was unjustifiable.                     
>>
>> But if a mail service operator then put that personal information     
>> to another unrelated purpose, without informing the individuals       
>> concerned, then they may have breached information privacy law.       
>
> yes, of course.
>
> and there's another "if".
>
> so, the next thing you need to prove (after proving that there was
> "personal information" in it) is that google used, or intended to use,
> any collected information.
>
> you can't just *assume* that they did - especially when that assumption
> is contrary to the way that wifi scanning actually works.
>
>
> 1. you're assuming that when you scan, you just get the minimal data that
> you want and have to go out of your way to gather any extra data.
>
> if only things worked that way. my job, and that of many other IT
> people, would be MUCH simpler.
>
> in reality, when you scan, you get everything and you have to take extra
> steps to get rid of the junk that you don't want.
>
> in reality, extracting the signal you want from the noise you don't is
> always the hardest part of the job, and it's always tedious, error-prone
> and pretty much impossible to get right first time. it's always an
> iterative process of incrementally stripping back more and more junk to
> leave behind the target data.
>
> which brings us to:
>
> 2. it's not in the least bit unusual in ANY data acquisition job
> (whether you're talking about getting data from sensors or from a
> scientific instrument or from web-server log files or from scanning for
> wireless networks etc) to separate the tasks of data gathering and data
> processing. in fact, it's normal and SOP to do things that way.
>
> which is why it does not surprise me in the least, nor do i think it
> unusually suspicious, for google to have done that. i'd be surprised if
> they did it any other way.
>
> it's like a video recorder, you press "RECORD" and it records everything
> that was broadcast. later you go back and cut out or skip the ads.
>
>
>
>
>
> what i'm seeing in this mess is a whole lot of people making wrong
> assumptions that the technology works contrary to the way that it
> actually does, then leaping to conclusions about what the extra data
> contained ("ooh. it's not just SSIDs or MAC addresses, so it *must* be
> personal information"), then they assume that google used or intended
> to use this assumed personal info, and finally they automatically judge
> google guilty of breaching various privacy laws around the world. case
> closed.
>
> i'm appalled at how easily otherwise rational people are sucked into
> both trial-by-media and trial-by-political-grandstanding just because
> there's technology involved.
>
> i'm especially appalled that it's happening on LINK where we've seen and
> discussed this phenomenon many times over the years.
>
>
>
> BTW, if google were trying to extract "personal information" from the
> recorded data then it would also be subject to the points i made above
> about extracting signal from noise. while it's possible that there *may*
> have been *some* personal information in the data, it would have been
> minuscule compared to the rest of the garbage - and even more difficult
> to extract than purely technical information like SSID or MAC address
> because it's subjective and not easily identifiable.
>
> (admittedly, google ARE experts at extracting signal from noise. still
> doesn't mean it would be easy...and getting useful data from random
> network packets is nowhere near as simple as getting it from formatted
> html, pdf, text, and other document types)
>
> any data set gathered by google is not going to be conveniently ordered
> and easily accessible - it's more like a garbage truck full of several
> streets' worth of domestic garbage that may or may not have a few
> fragments of someone's carelessly discarded bank statements hidden in
> it. if you can find them, and clean off the rotting tomato, then you
> might end up with enough fragments to assemble together in order, and
> then, yes, you might have some personal information about someone. or
> you might just end up with a bit of smudged paper with a bank logo on
> it.
>
> which is why i keep making the point that you have to prove that there
> actually was any personal information in the data that google gathered.
> it's extremely unlikely that there was, OR that there would have been
> anywhere near enough to have been worth the effort of extracting it.
>
>
>
>> Which is basically what the Buzz fuss was all about. 
>
> which is the root cause of this wifi hysteria - Buzz had serious
> privacy problems so anything google does is a privacy infringement
> whether it actually is or not.
>
> craig
>