runet-censorship-bypass/pac-performance-analyses
2015-12-23 13:13:46 +05:00

PAC-Script Performance Analysis

Warning: the results of this experiment have not been implemented in the extension yet.

Somewhere in a PAC script you may want a check like this:

if (Is_subdomain_of( host, blocked_hosts ))
  return 'use proxy';

Is_subdomain_of has to be very fast, because this check is executed on every request.
You should watch memory consumption too.
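
To make the check concrete, here is a minimal, unoptimized sketch of where it sits in a PAC script. FindProxyForURL is the standard PAC entry point; the blocklist contents, the proxy address, and the linear-scan body of Is_subdomain_of are assumptions made for illustration and are not one of the benchmarked variants.

```javascript
// Hypothetical blocklist; a real one holds thousands of entries.
var blocked_hosts = ['archive.org', 'example.com'];

// Naive baseline: true if host equals a blocked host or is a subdomain of one.
function Is_subdomain_of(host, hosts) {
  for (var i = 0; i < hosts.length; i++) {
    var blocked = hosts[i];
    if (host === blocked) return true;
    // Require a dot before the suffix so "notarchive.org" does not match.
    var suffix = '.' + blocked;
    if (host.length > suffix.length &&
        host.lastIndexOf(suffix) === host.length - suffix.length) {
      return true;
    }
  }
  return false;
}

// Standard PAC entry point, called by the browser for every request.
function FindProxyForURL(url, host) {
  if (Is_subdomain_of(host, blocked_hosts)) {
    return 'PROXY proxy.example.net:3128'; // placeholder proxy address
  }
  return 'DIRECT';
}
```

This scan costs time proportional to the size of the list on every request, which is why the faster lookups are worth measuring.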

The naive solution is to keep an array of blocked IPs and check whether the host resolves to one of them.
You can do the lookup with indexOf, binary search, etc.
The shortcoming of every IP-based solution is that some providers resolve blocked hosts to the wrong IPs, so we eventually need a list of hosts.
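
As a rough sketch of the IP-based approach with binary search: the addresses below are documentation-range placeholders, and ipToInt and isBlockedIp are hypothetical helper names, not the code used in the benchmark.

```javascript
// Convert dotted IPv4 text to an unsigned 32-bit integer.
function ipToInt(ip) {
  var p = ip.split('.');
  return ((+p[0] << 24) | (+p[1] << 16) | (+p[2] << 8) | +p[3]) >>> 0;
}

// Hypothetical blocklist, converted and sorted numerically once at load time.
var blocked_ips = ['203.0.113.7', '192.0.2.10', '198.51.100.1']
  .map(ipToInt)
  .sort(function (a, b) { return a - b; });

// Classic binary search over the sorted integer array.
function isBlockedIp(ip) {
  var target = ipToInt(ip);
  var lo = 0;
  var hi = blocked_ips.length - 1;
  while (lo <= hi) {
    var mid = (lo + hi) >>> 1;
    if (blocked_ips[mid] === target) return true;
    if (blocked_ips[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return false;
}
```

Inside a PAC script the address would typically come from the built-in dnsResolve(host), which can fail to resolve, so the result should be checked before it is passed to isBlockedIp.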

I tested several solutions and plotted the results in the following chart:

Host Lookup Chart: Time-Memory, Hits-Misses

  • IPs indexOf: the blocked IP is searched for with indexOf.
  • IPs binary: the blocked IP is searched for with binary search. For some reason, the miss time increased slightly.
  • IPs switch: simply switch(Blocked_IP) { case 1: ... case N: return true }. Works even better than binary search. Magic.
  • Hosts switch: a radix trie built on switch statements. Comparable to IPs switch.
  • Hosts reversed binary: binary search on hosts, but hosts are kept in reversed form: "gro.evihcra" instead of "archive.org". It shouldn't really affect anything, but it does.
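
One way the reversed-hosts binary search could look is sketched below. Once a host is reversed, every parent domain becomes a prefix that ends on a label boundary, so each such prefix can be checked with an exact binary search. The helper names and the sample list are assumptions, and the benchmarked implementation may differ.

```javascript
// Reverse a string: "archive.org" -> "gro.evihcra".
function reverse(s) {
  return s.split('').reverse().join('');
}

// Hypothetical blocklist, stored reversed and sorted once at load time.
var blocked_reversed = ['archive.org', 'example.com'].map(reverse).sort();

// Exact-match binary search in a sorted array of strings.
function hasExact(sorted, key) {
  var lo = 0;
  var hi = sorted.length - 1;
  while (lo <= hi) {
    var mid = (lo + hi) >>> 1;
    if (sorted[mid] === key) return true;
    if (sorted[mid] < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return false;
}

// True if host is a blocked host or a subdomain of one.
function isSubdomainOfReversed(host, sorted) {
  var r = reverse(host);
  // Try each prefix that ends just before a dot (a parent domain).
  var dot = r.indexOf('.');
  while (dot !== -1) {
    if (hasExact(sorted, r.substring(0, dot))) return true;
    dot = r.indexOf('.', dot + 1);
  }
  // Finally try the full host itself.
  return hasExact(sorted, r);
}
```

Each lookup performs one binary search per label of the host, so the cost grows with the logarithm of the list size rather than with the list size itself.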