[LINK] How filters work (was Re: Possible Letter to Conroy (the real one))

Sat Mar 21 08:10:39 AEDT 2009

Just catching up on this...

On 19/03/2009, at 6:06 PM, stephen at melbpc.org.au wrote:
> Eg, "(The filter) will apparently be a dedicated box rather than simply
> filtering software. [snip]
>
> Also, if everything has to pass through a single box, and that box gets
> attacked and goes down, you can kiss your connection goodbye .."

People often imagine that all Internet traffic has to pass through a  
filter box for it to work. This is not the case. It's certainly the  
simplest architecture, and may work for the smallest ISPs, but it  
doesn't scale well for the reasons outlined.

There are two other approaches...

1. Assuming we're still looking at the core aim of filtering "the ACMA  
blacklist" or something similar, i.e. a specific list of URLs... The  
first cut can be by IP address, and it can be done in a router using  
the routing table. The router doesn't have the URLs, just the IP  
addresses associated with their domains. The small proportion of the  
traffic intended for those IPs is routed to the filter box, where the  
packets are opened up to look at the URLs to see whether they're  
passed on or blocked.

In this case, the majority of the traffic is routed as normal -- the  
router always has to make a decision about where to send every packet  
anyway -- but only a little bit of traffic is routed through a box  
which does the harder work of analysis and decision-making. There can  
be multiples of those boxes to spread the load.
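
As a rough illustration of that two-stage idea, here's a minimal Python sketch. It's a toy model, not how any real router or filter product is configured. The names (SUSPECT_IPS, BLOCKED_URLS, router_next_hop, filter_box_verdict, handle_request) and the example addresses and URLs are all made up for illustration.

```python
import ipaddress

# Hypothetical blacklist entry; 192.0.2.0/24 is a documentation-only range.
BLOCKED_URLS = {"http://bad.example/banned/page.html"}
# The router only knows the IP addresses that host something on the list.
SUSPECT_IPS = {ipaddress.ip_address("192.0.2.10")}


def router_next_hop(dst_ip: str) -> str:
    """Stage 1: the router looks only at the destination IP address."""
    if ipaddress.ip_address(dst_ip) in SUSPECT_IPS:
        return "filter-box"
    return "normal-route"


def filter_box_verdict(host: str, path: str) -> str:
    """Stage 2: the filter box rebuilds the URL and checks the list."""
    url = f"http://{host.lower()}{path}"
    return "block" if url in BLOCKED_URLS else "pass"


def handle_request(dst_ip: str, host: str, path: str) -> str:
    """Most traffic never reaches stage 2 at all."""
    if router_next_hop(dst_ip) == "normal-route":
        return "pass"
    return filter_box_verdict(host, path)
```

Note that a request for a different page on the same suspect IP still takes the detour through the filter box, but it's passed on once the URL turns out not to be on the list.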

2. Pass-by filtering is another technique, and this is what's used  
within China by the Great Firewall. There's a diagram of one vendor's  
device at http://www.business-concepts.co.uk/internet_filtering_8e6/8e6pass_by.jpg

All traffic is routed normally. That traffic is monitored passively to  
look out for banned content -- pretty much anything you want, like  
URLs, keywords, phrases, what have you. None of the traffic actually  
passes through this magic box, so in high-traffic situations the worst  
that happens is that the box can't keep up with monitoring everything.  
The traffic itself still flows.
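
To make the "can't keep up" point concrete, here's a small Python sketch of a passive monitor with a limited backlog. It's an assumption-laden toy: the PassiveMonitor class, its capacity parameter and the BANNED list are invented for illustration and say nothing about how any real device is built.

```python
from collections import deque

# Example terms only, borrowed from this discussion.
BANNED = ("democracy", "freedom")


class PassiveMonitor:
    """Sees copies of packets; the originals are forwarded regardless."""

    def __init__(self, capacity: int):
        self.queue = deque()      # copies waiting to be inspected
        self.capacity = capacity  # hypothetical limit on the backlog
        self.missed = 0           # copies dropped because we were too busy

    def mirror(self, payload: str) -> None:
        """Receive a copy of a packet from the network tap."""
        if len(self.queue) >= self.capacity:
            self.missed += 1      # overloaded: this one goes unexamined
        else:
            self.queue.append(payload)

    def inspect_one(self):
        """Check the oldest queued copy; return the matched term or None."""
        if not self.queue:
            return None
        payload = self.queue.popleft().lower()
        for term in BANNED:
            if term in payload:
                return term
        return None
```

When the backlog is full, the monitor simply loses sight of some packets. Nothing is delayed or dropped on the real path, which is why overload here means missed content rather than a lost connection.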

If the magic box detects banned content, it notes the source and  
destination IP addresses (i.e. the addresses of each end of that  
connection) and fires at each of them three RST packets which are  
crafted to look like they came from the other end. These packets cause  
the connection to reset, i.e. "hang up".
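
The shape of that reset trick can be sketched in a few lines of Python. This only builds plain data records and sends nothing. The Segment type and forged_resets function are hypothetical, and a real TCP reset would also need matching port and sequence numbers, which this sketch leaves out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A heavily simplified stand-in for a TCP segment."""
    src: str
    dst: str
    flags: str
    payload: str = ""


def forged_resets(seen: Segment, count: int = 3) -> list:
    """Build the resets aimed at both ends of the offending connection."""
    # Sent to the receiver, claiming to come from the original sender.
    to_receiver = Segment(src=seen.src, dst=seen.dst, flags="RST")
    # Sent to the original sender, claiming to come from the receiver.
    to_sender = Segment(src=seen.dst, dst=seen.src, flags="RST")
    return [to_receiver] * count + [to_sender] * count
```

Each end receives what looks like a hang-up from its peer, so neither side has any obvious sign that a third party was involved.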

If you were watching this happen, say while sending an email  
containing bad words like "democracy" and "freedom", then you'd just  
see the connection suddenly fail, as if there'd been a network glitch.

As I wrote in "The Great Firewall of China: how it works, how to  
bypass it" in August 2008...
http://stilgherrian.com/politics/the-great-firewall-of-china-how-it-works-how-to-bypass-it/
or
http://is.gd/1aZG

     Researchers at the ConceptDoppler project have found that it can
     disrupt Internet traffic within China that even mentions touchy
     subjects. Imagine your truck encountering random checkpoints. If
     it contains banned concepts like “news blackout” (新闻封) or
     “gerontocracy” (老人政治) your delivery is simply burned, never to
     be seen again.

     ConceptDoppler says the banned words still get through 28% of the
     time, and the blocking can’t keep up with heavy Internet traffic.
     But even partial blocking encourages self-censorship through the
     perception that you’re being watched. Perhaps that’s even more
     effective because it discourages offline conversation too.

I also wrote:

     To avoid content filtering, just speak in code. Learn to say
     “duck-breeding club” rather than “student dissident meeting”.

One example of this is the grass-mud horse.

     http://technology.timesonline.co.uk/tol/news/tech_and_web/the_web/article5858267.ece

Stil

-- 
Stilgherrian http://stilgherrian.com/
Internet, IT and Media Consulting, Sydney, Australia
mobile +61 407 623 600
fax +61 2 9516 5630
Twitter: stilgherrian
Skype: stilgherrian
ABN 25 231 641 421