runet-censorship-bypass/pac-performance-analyses
2016-05-08 21:14:08 +05:00

PAC-Script Performance Analysis

Somewhere in your PAC script you may want a check like this:

if (Is_subdomain_of( host, blocked_hosts ))
  return 'use proxy';

Is_subdomain_of has to be very fast, because this check is executed on every request.
You should watch memory consumption too.
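
To show where the check sits, here is a minimal sketch of a complete PAC script. The blocklist, the proxy address and the naive linear Is_subdomain_of are placeholders made up for illustration; this is the slow baseline, not one of the measured solutions.

```javascript
// Hypothetical blocklist and proxy address, for illustration only.
var blocked_hosts = ['archive.org', 'example.com'];
var PROXY = 'PROXY proxy.example:3128';

// Naive baseline: true if host equals a blocked host
// or ends with '.' + blocked host. Linear in the list size.
function Is_subdomain_of(host, hosts) {
  for (var i = 0; i < hosts.length; i++) {
    var h = hosts[i];
    if (host === h || host.slice(-(h.length + 1)) === '.' + h) {
      return true;
    }
  }
  return false;
}

// Entry point that the browser calls for every request.
function FindProxyForURL(url, host) {
  if (Is_subdomain_of(host, blocked_hosts)) {
    return PROXY;
  }
  return 'DIRECT';
}
```

Because FindProxyForURL runs for each request, the cost of Is_subdomain_of is paid every time, which is why the lookup structure matters.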

The naive solution is to keep an array of blocked IPs and check whether the host resolves to one of them.
You can do the lookup with indexOf, binary search, etc.
The shortcoming of every IP-based solution is that some providers resolve blocked hosts to the wrong IPs, so eventually we need a list of hosts.
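
The two IP lookups mentioned above could look roughly like this. It is a sketch only: the IPs are made-up documentation addresses and the function names are hypothetical. In a real PAC script the ip argument would come from dnsResolve(host).

```javascript
// Hypothetical blocked IPs, sorted as strings so that binary search works.
var blocked_ips = ['10.0.0.1', '192.0.2.7', '198.51.100.23', '203.0.113.5'].sort();

// Linear scan through the array.
function isBlockedIndexOf(ip) {
  return blocked_ips.indexOf(ip) !== -1;
}

// Binary search over the sorted array, using the same string
// ordering that Array.prototype.sort() uses by default.
function isBlockedBinary(ip) {
  var lo = 0;
  var hi = blocked_ips.length - 1;
  while (lo <= hi) {
    var mid = (lo + hi) >> 1;
    if (blocked_ips[mid] === ip) {
      return true;
    }
    if (blocked_ips[mid] < ip) {
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return false;
}
```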

I have tested different solutions and plotted the results in the following chart:

Host Lookup Chart: Time-Memory, Hits-Misses

  • IPs indexOf: the blocked IP is searched for with indexOf.
  • IPs binary: the blocked IP is searched for with binary search. For some reason the hit time is slightly increased.
  • IPs switch: simply switch(Blocked_IP) { case1: ... caseN: return true }. Works even better than binary search. Magic.
  • Hosts switch: a radix trie built on switch statements. Comparable to IPs switch.
  • Hosts reversed binary: binary search on hosts, but the hosts are kept in reversed form: "gro.evihcra" instead of "archive.org". This shouldn't really affect anything, but it does, maybe because I also use dnsDomainIs instead of ===.
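
As a rough illustration of two of the variants above, here is a sketch of a switch-based IP check and a binary search over reversed hosts. It is not the benchmarked code: the lists and function names are hypothetical, and the host lookup compares with === on each reversed suffix instead of calling dnsDomainIs, which exists only inside the PAC environment.

```javascript
// "IPs switch": one case label per blocked IP (made-up addresses).
function isBlockedSwitch(ip) {
  switch (ip) {
    case '192.0.2.7':
    case '198.51.100.23':
    case '203.0.113.5':
      return true;
  }
  return false;
}

function reverse(s) {
  return s.split('').reverse().join('');
}

// "Hosts reversed binary": hosts stored reversed and sorted,
// e.g. 'archive.org' is kept as 'gro.evihcra'.
var reversed_hosts = ['archive.org', 'example.com'].map(reverse).sort();

// Exact-match binary search in a sorted array of strings.
function hasExact(sorted, key) {
  var lo = 0;
  var hi = sorted.length - 1;
  while (lo <= hi) {
    var mid = (lo + hi) >> 1;
    if (sorted[mid] === key) {
      return true;
    }
    if (sorted[mid] < key) {
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return false;
}

// Tries every prefix of the reversed host that ends at a label
// boundary, i.e. the host itself and each of its parent domains.
function isSubdomainOfReversed(host, sorted) {
  var rev = reverse(host);
  var end = rev.indexOf('.');
  while (end !== -1) {
    if (hasExact(sorted, rev.slice(0, end))) {
      return true;
    }
    end = rev.indexOf('.', end + 1);
  }
  return hasExact(sorted, rev);
}
```

Checking only at label boundaries keeps "notarchive.org" from matching "archive.org", at the cost of one binary search per label in the host name.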