# Anti-censorship Solution for Russia on PAC-Script
## Censorship in Russia
Censorship in Russia plagues the freedoms of
[information](https://en.wikipedia.org/wiki/Freedom_of_information) and [speech](https://en.wikipedia.org/wiki/Freedom_of_speech),
slowly building an analogue of [China's Golden Shield](https://en.wikipedia.org/wiki/Golden_Shield_Project).
For good or bad, it blocks
[Mein Kampf](https://en.wikipedia.org/wiki/Mein_Kampf),
[lolicon](https://en.wikipedia.org/wiki/Lolicon) (rarely distinguishing it from hentai) and
[critics of Putin](http://www.reuters.com/article/2014/03/13/us-russia-internet-idUSBREA2C21L20140313).
Seeing how the Russian government [distorts TV](https://therussianreader.wordpress.com/2015/11/22/russian-truckers-strike-dagestan/) and blocks the Internet, I decided to write an anti-censorship extension for Chromium.

I believe freedom of information is a virtue and important __information mustn't be blocked based on political or other subjective views__.
## Technical Titbits
```javascript
// Pseudocode for the core check inside the PAC script.
if (Is_subdomain_of(host, blocked_hosts))
  return 'use proxy';
```
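For context, here is a minimal sketch of how such a check could sit inside a PAC script. `FindProxyForURL(url, host)` is the standard PAC entry point; the contents of `blocked_hosts`, the proxy address `proxy.example.net:3128`, and this deliberately naive body of `Is_subdomain_of` are assumptions made for illustration, not the extension's actual code.

```javascript
// Hypothetical data: a tiny stand-in for the real list of blocked hosts.
var blocked_hosts = ['example.com', 'blocked.example.org'];

// Naive check: true if `host` equals a listed host or is a subdomain of one.
// It scans the whole list, so it only illustrates the semantics, not the speed.
function Is_subdomain_of(host, hosts) {
  for (var i = 0; i < hosts.length; i++) {
    var blocked = hosts[i];
    if (host === blocked || host.endsWith('.' + blocked)) return true;
  }
  return false;
}

// PAC entry point: the browser calls this function for every request.
function FindProxyForURL(url, host) {
  if (Is_subdomain_of(host, blocked_hosts))
    return 'PROXY proxy.example.net:3128'; // hypothetical proxy address
  return 'DIRECT';
}
```

With this sketch, `www.example.com` goes through the proxy, while `notexample.com` is fetched directly because the match requires a leading dot.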

`Is_subdomain_of` has to be very fast because this check runs on every request. Memory consumption matters too.

The naive solution is to keep an array of blocked IPs and check whether the host resolves to one of them.
You can search it with `indexOf`, binary search, etc.
The shortcoming of every IP-based solution is that some providers resolve blocked hosts to wrong IPs, so we eventually need a list of hosts.
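The two IP lookups mentioned above can be sketched roughly as follows. The IP values and the function names `isBlockedIndexOf` and `isBlockedBinary` are hypothetical; in a real PAC script the `ip` argument would typically come from the standard PAC helper `dnsResolve(host)`.

```javascript
// Hypothetical sample of blocked IPs, sorted as strings so binary search works.
var blocked_ips = ['203.0.113.5', '192.0.2.10', '198.51.100.7'].sort();

// Linear scan: simple, but the cost grows with the length of the list.
function isBlockedIndexOf(ip) {
  return blocked_ips.indexOf(ip) !== -1;
}

// Binary search over the sorted array, using the same string ordering as sort().
function isBlockedBinary(ip) {
  var lo = 0;
  var hi = blocked_ips.length - 1;
  while (lo <= hi) {
    var mid = (lo + hi) >> 1;
    if (blocked_ips[mid] === ip) return true;
    if (blocked_ips[mid] < ip) lo = mid + 1;
    else hi = mid - 1;
  }
  return false;
}
```

Both functions answer the same question. They differ only in how many comparisons a hit or a miss costs.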

I tested several approaches and plotted the [results](./benchmark/Output.txt) in the following chart:

![Host Lookup Chart: Time-Memory, Hits-Misses](./chart/host-lookup-chart.png)
* __IPs indexOf__ The blocked IP is looked up with `indexOf`.
* __IPs binary__ The blocked IP is looked up with binary search. For some reason, miss time increased slightly.
* __IPs switch__ Simply `switch(Blocked_IP) { case1: ... caseN: return true }`. Works even better than binary search. Magic.
* __Hosts switch__ A radix trie built on `switch`. Comparable to __IPs switch__.
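To make the two `switch` variants more concrete, here is a hand-written sketch. It is only an approximation under stated assumptions: the IPs and hosts are hypothetical, the function names are invented for illustration, and the host version branches on whole domain labels (top-level domain first), which simplifies a true radix trie. In practice such code would presumably be generated from the block list, and the benchmarked code may branch differently.

```javascript
// "IPs switch": every blocked IP becomes a case label.
function isBlockedIpSwitch(ip) {
  switch (ip) {
    case '192.0.2.10':
    case '198.51.100.7':
      return true;
  }
  return false;
}

// "Hosts switch": nested switches walk the host name label by label,
// starting from the top-level domain, like a trie keyed on labels.
function isBlockedHostSwitch(host) {
  var labels = host.split('.').reverse();
  switch (labels[0]) {
    case 'com':
      switch (labels[1]) {
        case 'example':
          return true; // example.com and all of its subdomains
      }
      break;
    case 'org':
      switch (labels[1]) {
        case 'example':
          switch (labels[2]) {
            case 'blocked':
              return true; // blocked.example.org and its subdomains
          }
      }
      break;
  }
  return false;
}
```

Because the walk starts from the top-level domain, a subdomain such as `www.example.com` matches as soon as the `example` label under `com` is reached, without scanning the remaining labels.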