Reorganise English tokenizer exceptions (as discussed in #718)

Add logic to generate exceptions that follow a consistent pattern (like
verbs and pronouns) and allow certain tokens to be excluded explicitly.
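
A minimal sketch of the idea, not the code from this commit: exceptions for a
regular pattern (here pronoun + "'ll", with and without the apostrophe) are
generated in a loop, and forms that collide with ordinary words are dropped
through an explicit exclude set. The names `ORTH`, `LEMMA`, `PRONOUNS`,
`EXCLUDE` and `generate_exceptions` are assumptions made for illustration.

```python
# Hypothetical illustration; keys and word lists are assumptions,
# not taken from the actual diff.
ORTH = "orth"
LEMMA = "lemma"

PRONOUNS = ["i", "you", "he", "she", "it", "we", "they"]

# Generated strings that are also ordinary words and must not be split.
EXCLUDE = {"ill", "Ill", "hell", "Hell", "shell", "Shell", "well", "Well"}


def generate_exceptions(pronouns, exclude):
    """Build pronoun + 'll exceptions, skipping explicitly excluded tokens."""
    exceptions = {}
    for pron in pronouns:
        # Cover lowercase and title-case spellings of each pronoun.
        for form in (pron, pron.title()):
            # Variants with and without the apostrophe.
            for suffix in ("'ll", "ll"):
                exceptions[form + suffix] = [
                    {ORTH: form},
                    {ORTH: suffix, LEMMA: "will"},
                ]
    # Remove the forms listed for explicit exclusion.
    return {key: value for key, value in exceptions.items() if key not in exclude}


EXCEPTIONS = generate_exceptions(PRONOUNS, EXCLUDE)
```

With this layout, adding another regular pattern means adding one more loop
rather than writing every entry by hand, and ambiguous strings such as "shell"
stay unsplit because they are listed in the exclude set.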
Ines Montani 2017-01-03 18:17:57 +01:00
parent 1b82756cc7
commit b19cfcc144

File diff suppressed because it is too large.