💫 Tidy up and auto-format .py files (#2983)

<!--- Provide a general summary of your changes in the title. -->

## Description
- [x] Use [`black`](https://github.com/ambv/black) to auto-format all `.py` files.
- [x] Update flake8 config to exclude very large files (lemmatization tables etc.)
- [x] Update code to be compatible with flake8 rules
- [x] Fix various small bugs, inconsistencies and messy stuff in the language data
- [x] Update docs to explain new code style (`black`, `flake8`, when to use `# fmt: off` and `# fmt: on` and what `# noqa` means)

Once #2932 is merged, which auto-formats and tidies up the CLI, we'll be able to run `flake8 spacy` and actually get meaningful results.

At the moment, the code style and linting aren't applied automatically, but I'm hoping that the new [GitHub Actions](https://github.com/features/actions) will let us auto-format pull requests and post comments with relevant linting information.

### Types of change
enhancement, code style

## Checklist
<!--- Before you submit the PR, go over this checklist and make sure you can
tick off all the boxes. [] -> [x] -->
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
Ines Montani 2018-11-30 17:03:03 +01:00 committed by GitHub
parent 852bc2ac16
commit eddeb36c96
268 changed files with 25626 additions and 17854 deletions

.flake8
View File

@ -1,4 +1,13 @@
[flake8]
ignore = E203, E266, E501, W503
ignore = E203, E266, E501, E731, W503
max-line-length = 80
select = B,C,E,F,W,T4,B9
exclude =
.env,
.git,
__pycache__,
lemmatizer.py,
lookup.py,
_tokenizer_exceptions_list.py,
spacy/lang/fr/lemmatizer,
spacy/lang/nb/lemmatizer

View File

@ -186,13 +186,99 @@ sure your test passes and reference the issue in your commit message.
## Code conventions
Code should loosely follow [pep8](https://www.python.org/dev/peps/pep-0008/).
Regular line length is **80 characters**, with some tolerance for lines up to
90 characters if the alternative would be worse — for instance, if your list
comprehension comes to 82 characters, it's better not to split it over two lines.
You can also use a linter like [`flake8`](https://pypi.python.org/pypi/flake8)
or [`frosted`](https://pypi.python.org/pypi/frosted). Just keep in mind that
it won't work very well for `.pyx` files and will complain about Cython syntax
like `<int*>` or `cimport`.
As of `v2.1.0`, spaCy uses [`black`](https://github.com/ambv/black) for code
formatting and [`flake8`](http://flake8.pycqa.org/en/latest/) for linting its
Python modules. If you've built spaCy from source, you'll already have both
tools installed.
**⚠️ Note that formatting and linting are currently only possible for Python
modules in `.py` files, not Cython modules in `.pyx` and `.pxd` files.**
### Code formatting
[`black`](https://github.com/ambv/black) is an opinionated Python code
formatter, optimised to produce readable code and small diffs. You can run
`black` from the command-line, or via your code editor. For example, if you're
using [Visual Studio Code](https://code.visualstudio.com/), you can add the
following to your `settings.json` to use `black` for formatting and auto-format
your files on save:
```json
{
    "python.formatting.provider": "black",
    "[python]": {
        "editor.formatOnSave": true
    }
}
```
[See here](https://github.com/ambv/black#editor-integration) for the full
list of available editor integrations.
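To give a sense of the changes `black` makes, here's a small made-up before-and-after (the snippet is hypothetical and not taken from the codebase): string quotes are normalised to double quotes, and literals that exceed the line length are split onto one item per line with trailing commas.
```python
# Before formatting (hypothetical example, all on one long single-quoted line):
# data = {'name': 'Ada', 'tags': ['tok2vec', 'parser', 'ner'], 'meta': {'lang': 'en', 'version': '2.1.0'}}

# After running `black`:
data = {
    "name": "Ada",
    "tags": ["tok2vec", "parser", "ner"],
    "meta": {"lang": "en", "version": "2.1.0"},
}
```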
#### Disabling formatting
There are a few cases where auto-formatting doesn't improve readability. For
example, in some of the language data files like `tag_map.py`, or in
the tests that construct `Doc` objects from lists of words and other labels.
Wrapping a block in `# fmt: off` and `# fmt: on` lets you disable formatting
for that particular code. Here's an example:
```python
# fmt: off
text = "I look forward to using Thingamajig. I've been told it will make my life easier..."
heads = [1, 0, -1, -2, -1, -1, -5, -1, 3, 2, 1, 0, 2, 1, -3, 1, 1, -3, -7]
deps = ["nsubj", "ROOT", "advmod", "prep", "pcomp", "dobj", "punct", "",
        "nsubjpass", "aux", "auxpass", "ROOT", "nsubj", "aux", "ccomp",
        "poss", "nsubj", "ccomp", "punct"]
# fmt: on
```
### Code linting
[`flake8`](http://flake8.pycqa.org/en/latest/) is a tool for enforcing code
style. It scans one or more files and outputs errors and warnings. This feedback
can help you stick to general standards and conventions, and can be very useful
for spotting potential mistakes and inconsistencies in your code. The most
important things to watch out for are syntax errors and undefined names, but you
also want to keep an eye on unused declared variables or repeated
(i.e. overwritten) dictionary keys. If your code was formatted with `black`
(see above), you shouldn't see any formatting-related warnings.
The [`.flake8`](.flake8) config defines the configuration we use for this
codebase. For example, we're not super strict about the line length, and we're
excluding very large files like lemmatization and tokenizer exception tables.
Ideally, running the following command from within the repo directory should
not return any errors or warnings:
```bash
flake8 spacy
```
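For illustration, here's a small made-up snippet (not from the codebase) showing the kinds of problems `flake8` reports and the codes it uses for them:
```python
import os  # F401: `os` is imported but never used


def word_count(text):
    seen = set()  # F841: local variable is assigned but never used
    return len(text.split())


# Referencing an undefined name, e.g. `word_counts("hi")`, would be reported as F821.
```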
#### Disabling linting
Sometimes, you explicitly want to write code that's not compatible with our
rules. For example, a module's `__init__.py` might import a function so other
modules can import it from there, but `flake8` will complain about an unused
import. And although it's generally discouraged, there might be cases where it
makes sense to use a bare `except`.
To ignore a given line, you can add a comment like `# noqa: F401`, specifying
the code of the error or warning you want to ignore. It's also possible to
ignore several comma-separated codes at once, e.g. `# noqa: E731,E123`. Here
are some examples:
```python
# The imported class isn't used in this file, but imported here, so it can be
# imported *from* here by another module.
from .submodule import SomeClass  # noqa: F401

try:
    do_something()
except:  # noqa: E722
    # This bare except is justified, for some specific reason
    do_something_else()
```
### Python conventions

View File

@ -35,41 +35,49 @@ import subprocess
import argparse
HASH_FILE = 'cythonize.json'
HASH_FILE = "cythonize.json"
def process_pyx(fromfile, tofile, language_level='-2'):
print('Processing %s' % fromfile)
def process_pyx(fromfile, tofile, language_level="-2"):
print("Processing %s" % fromfile)
try:
from Cython.Compiler.Version import version as cython_version
from distutils.version import LooseVersion
if LooseVersion(cython_version) < LooseVersion('0.19'):
raise Exception('Require Cython >= 0.19')
if LooseVersion(cython_version) < LooseVersion("0.19"):
raise Exception("Require Cython >= 0.19")
except ImportError:
pass
flags = ['--fast-fail', language_level]
if tofile.endswith('.cpp'):
flags += ['--cplus']
flags = ["--fast-fail", language_level]
if tofile.endswith(".cpp"):
flags += ["--cplus"]
try:
try:
r = subprocess.call(['cython'] + flags + ['-o', tofile, fromfile],
env=os.environ) # See Issue #791
r = subprocess.call(
["cython"] + flags + ["-o", tofile, fromfile], env=os.environ
) # See Issue #791
if r != 0:
raise Exception('Cython failed')
raise Exception("Cython failed")
except OSError:
# There are ways of installing Cython that don't result in a cython
# executable on the path, see gh-2397.
r = subprocess.call([sys.executable, '-c',
'import sys; from Cython.Compiler.Main import '
'setuptools_main as main; sys.exit(main())'] + flags +
['-o', tofile, fromfile])
r = subprocess.call(
[
sys.executable,
"-c",
"import sys; from Cython.Compiler.Main import "
"setuptools_main as main; sys.exit(main())",
]
+ flags
+ ["-o", tofile, fromfile]
)
if r != 0:
raise Exception('Cython failed')
raise Exception("Cython failed")
except OSError:
raise OSError('Cython needs to be installed')
raise OSError("Cython needs to be installed")
def preserve_cwd(path, func, *args):
@ -89,12 +97,12 @@ def load_hashes(filename):
def save_hashes(hash_db, filename):
with open(filename, 'w') as f:
with open(filename, "w") as f:
f.write(json.dumps(hash_db))
def get_hash(path):
return hashlib.md5(open(path, 'rb').read()).hexdigest()
return hashlib.md5(open(path, "rb").read()).hexdigest()
def hash_changed(base, path, db):
@ -109,25 +117,27 @@ def hash_add(base, path, db):
def process(base, filename, db):
root, ext = os.path.splitext(filename)
if ext in ['.pyx', '.cpp']:
if hash_changed(base, filename, db) or not os.path.isfile(os.path.join(base, root + '.cpp')):
preserve_cwd(base, process_pyx, root + '.pyx', root + '.cpp')
hash_add(base, root + '.cpp', db)
hash_add(base, root + '.pyx', db)
if ext in [".pyx", ".cpp"]:
if hash_changed(base, filename, db) or not os.path.isfile(
os.path.join(base, root + ".cpp")
):
preserve_cwd(base, process_pyx, root + ".pyx", root + ".cpp")
hash_add(base, root + ".cpp", db)
hash_add(base, root + ".pyx", db)
def check_changes(root, db):
res = False
new_db = {}
setup_filename = 'setup.py'
hash_add('.', setup_filename, new_db)
if hash_changed('.', setup_filename, db):
setup_filename = "setup.py"
hash_add(".", setup_filename, new_db)
if hash_changed(".", setup_filename, db):
res = True
for base, _, files in os.walk(root):
for filename in files:
if filename.endswith('.pxd'):
if filename.endswith(".pxd"):
hash_add(base, filename, new_db)
if hash_changed(base, filename, db):
res = True
@ -150,8 +160,10 @@ def run(root):
save_hashes(db, HASH_FILE)
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='Cythonize pyx files into C++ files as needed')
parser.add_argument('root', help='root directory')
if __name__ == "__main__":
parser = argparse.ArgumentParser(
description="Cythonize pyx files into C++ files as needed"
)
parser.add_argument("root", help="root directory")
args = parser.parse_args()
run(args.root)

View File

@ -15,12 +15,13 @@ _unset = object()
class Reddit(object):
"""Stream cleaned comments from Reddit."""
pre_format_re = re.compile(r'^[\`\*\~]')
post_format_re = re.compile(r'[\`\*\~]$')
url_re = re.compile(r'\[([^]]+)\]\(%%URL\)')
link_re = re.compile(r'\[([^]]+)\]\(https?://[^\)]+\)')
def __init__(self, file_path, meta_keys={'subreddit': 'section'}):
pre_format_re = re.compile(r"^[\`\*\~]")
post_format_re = re.compile(r"[\`\*\~]$")
url_re = re.compile(r"\[([^]]+)\]\(%%URL\)")
link_re = re.compile(r"\[([^]]+)\]\(https?://[^\)]+\)")
def __init__(self, file_path, meta_keys={"subreddit": "section"}):
"""
file_path (unicode / Path): Path to archive or directory of archives.
meta_keys (dict): Meta data key included in the Reddit corpus, mapped
@ -45,28 +46,30 @@ class Reddit(object):
continue
comment = ujson.loads(line)
if self.is_valid(comment):
text = self.strip_tags(comment['body'])
yield {'text': text}
text = self.strip_tags(comment["body"])
yield {"text": text}
def get_meta(self, item):
return {name: item.get(key, 'n/a') for key, name in self.meta.items()}
return {name: item.get(key, "n/a") for key, name in self.meta.items()}
def iter_files(self):
for file_path in self.files:
yield file_path
def strip_tags(self, text):
text = self.link_re.sub(r'\1', text)
text = text.replace('&gt;', '>').replace('&lt;', '<')
text = self.pre_format_re.sub('', text)
text = self.post_format_re.sub('', text)
text = re.sub(r'\s+', ' ', text)
text = self.link_re.sub(r"\1", text)
text = text.replace("&gt;", ">").replace("&lt;", "<")
text = self.pre_format_re.sub("", text)
text = self.post_format_re.sub("", text)
text = re.sub(r"\s+", " ", text)
return text.strip()
def is_valid(self, comment):
return comment['body'] is not None \
and comment['body'] != '[deleted]' \
and comment['body'] != '[removed]'
return (
comment["body"] is not None
and comment["body"] != "[deleted]"
and comment["body"] != "[removed]"
)
def main(path):
@ -75,8 +78,9 @@ def main(path):
print(ujson.dumps(comment))
if __name__ == '__main__':
if __name__ == "__main__":
import socket
try:
BrokenPipeError
except NameError:
@ -85,6 +89,7 @@ if __name__ == '__main__':
plac.call(main)
except BrokenPipeError:
import os, sys
# Python flushes standard streams on exit; redirect remaining output
# to devnull to avoid another BrokenPipeError at shutdown
devnull = os.open(os.devnull, os.O_WRONLY)

View File

@ -11,7 +11,10 @@ ujson>=1.35
dill>=0.2,<0.3
regex==2018.01.10
requests>=2.13.0,<3.0.0
pathlib==1.0.1; python_version < "3.4"
# Development dependencies
pytest>=4.0.0,<5.0.0
pytest-timeout>=1.3.0,<2.0.0
mock>=2.0.0,<3.0.0
pathlib==1.0.1; python_version < "3.4"
black==18.9b0
flake8>=3.5.0,<3.6.0

View File

@ -14,8 +14,7 @@ from thinc.api import uniqued, wrap, noop
from thinc.api import with_square_sequences
from thinc.linear.linear import LinearModel
from thinc.neural.ops import NumpyOps, CupyOps
from thinc.neural.util import get_array_module, copy_array
from thinc.neural._lsuv import svd_orthonormal
from thinc.neural.util import get_array_module
from thinc.neural.optimizers import Adam
from thinc import describe
@ -33,36 +32,36 @@ try:
except:
torch = None
VECTORS_KEY = 'spacy_pretrained_vectors'
VECTORS_KEY = "spacy_pretrained_vectors"
def cosine(vec1, vec2):
xp = get_array_module(vec1)
norm1 = xp.linalg.norm(vec1)
norm2 = xp.linalg.norm(vec2)
if norm1 == 0. or norm2 == 0.:
if norm1 == 0.0 or norm2 == 0.0:
return 0
else:
return vec1.dot(vec2) / (norm1 * norm2)
def create_default_optimizer(ops, **cfg):
learn_rate = util.env_opt('learn_rate', 0.001)
beta1 = util.env_opt('optimizer_B1', 0.8)
beta2 = util.env_opt('optimizer_B2', 0.8)
eps = util.env_opt('optimizer_eps', 0.00001)
L2 = util.env_opt('L2_penalty', 1e-6)
max_grad_norm = util.env_opt('grad_norm_clip', 5.)
optimizer = Adam(ops, learn_rate, L2=L2, beta1=beta1,
beta2=beta2, eps=eps)
learn_rate = util.env_opt("learn_rate", 0.001)
beta1 = util.env_opt("optimizer_B1", 0.8)
beta2 = util.env_opt("optimizer_B2", 0.8)
eps = util.env_opt("optimizer_eps", 0.00001)
L2 = util.env_opt("L2_penalty", 1e-6)
max_grad_norm = util.env_opt("grad_norm_clip", 5.0)
optimizer = Adam(ops, learn_rate, L2=L2, beta1=beta1, beta2=beta2, eps=eps)
optimizer.max_grad_norm = max_grad_norm
optimizer.device = ops.device
return optimizer
@layerize
def _flatten_add_lengths(seqs, pad=0, drop=0.):
def _flatten_add_lengths(seqs, pad=0, drop=0.0):
ops = Model.ops
lengths = ops.asarray([len(seq) for seq in seqs], dtype='i')
lengths = ops.asarray([len(seq) for seq in seqs], dtype="i")
def finish_update(d_X, sgd=None):
return ops.unflatten(d_X, lengths, pad=pad)
@ -74,14 +73,15 @@ def _flatten_add_lengths(seqs, pad=0, drop=0.):
def _zero_init(model):
def _zero_init_impl(self, X, y):
self.W.fill(0)
model.on_data_hooks.append(_zero_init_impl)
if model.W is not None:
model.W.fill(0.)
model.W.fill(0.0)
return model
@layerize
def _preprocess_doc(docs, drop=0.):
def _preprocess_doc(docs, drop=0.0):
keys = [doc.to_array(LOWER) for doc in docs]
ops = Model.ops
# The dtype here matches what thinc is expecting -- which differs per
@ -89,11 +89,12 @@ def _preprocess_doc(docs, drop=0.):
# is fixed on Thinc's side.
lengths = ops.asarray([arr.shape[0] for arr in keys], dtype=numpy.int_)
keys = ops.xp.concatenate(keys)
vals = ops.allocate(keys.shape) + 1.
vals = ops.allocate(keys.shape) + 1.0
return (keys, vals, lengths), None
@layerize
def _preprocess_doc_bigrams(docs, drop=0.):
def _preprocess_doc_bigrams(docs, drop=0.0):
unigrams = [doc.to_array(LOWER) for doc in docs]
ops = Model.ops
bigrams = [ops.ngrams(2, doc_unis) for doc_unis in unigrams]
@ -104,27 +105,29 @@ def _preprocess_doc_bigrams(docs, drop=0.):
# is fixed on Thinc's side.
lengths = ops.asarray([arr.shape[0] for arr in keys], dtype=numpy.int_)
keys = ops.xp.concatenate(keys)
vals = ops.asarray(ops.xp.concatenate(vals), dtype='f')
vals = ops.asarray(ops.xp.concatenate(vals), dtype="f")
return (keys, vals, lengths), None
@describe.on_data(_set_dimensions_if_needed,
lambda model, X, y: model.init_weights(model))
@describe.on_data(
_set_dimensions_if_needed, lambda model, X, y: model.init_weights(model)
)
@describe.attributes(
nI=Dimension("Input size"),
nF=Dimension("Number of features"),
nO=Dimension("Output size"),
nP=Dimension("Maxout pieces"),
W=Synapses("Weights matrix",
lambda obj: (obj.nF, obj.nO, obj.nP, obj.nI)),
b=Biases("Bias vector",
lambda obj: (obj.nO, obj.nP)),
pad=Synapses("Pad",
W=Synapses("Weights matrix", lambda obj: (obj.nF, obj.nO, obj.nP, obj.nI)),
b=Biases("Bias vector", lambda obj: (obj.nO, obj.nP)),
pad=Synapses(
"Pad",
lambda obj: (1, obj.nF, obj.nO, obj.nP),
lambda M, ops: ops.normal_init(M, 1.)),
lambda M, ops: ops.normal_init(M, 1.0),
),
d_W=Gradient("W"),
d_pad=Gradient("pad"),
d_b=Gradient("b"))
d_b=Gradient("b"),
)
class PrecomputableAffine(Model):
def __init__(self, nO=None, nI=None, nF=None, nP=None, **kwargs):
Model.__init__(self, **kwargs)
@ -133,9 +136,10 @@ class PrecomputableAffine(Model):
self.nI = nI
self.nF = nF
def begin_update(self, X, drop=0.):
Yf = self.ops.gemm(X,
self.W.reshape((self.nF*self.nO*self.nP, self.nI)), trans2=True)
def begin_update(self, X, drop=0.0):
Yf = self.ops.gemm(
X, self.W.reshape((self.nF * self.nO * self.nP, self.nI)), trans2=True
)
Yf = Yf.reshape((Yf.shape[0], self.nF, self.nO, self.nP))
Yf = self._add_padding(Yf)
@ -146,15 +150,16 @@ class PrecomputableAffine(Model):
Xf = Xf.reshape((Xf.shape[0], self.nF * self.nI))
self.d_b += dY.sum(axis=0)
dY = dY.reshape((dY.shape[0], self.nO*self.nP))
dY = dY.reshape((dY.shape[0], self.nO * self.nP))
Wopfi = self.W.transpose((1, 2, 0, 3))
Wopfi = self.ops.xp.ascontiguousarray(Wopfi)
Wopfi = Wopfi.reshape((self.nO*self.nP, self.nF * self.nI))
dXf = self.ops.gemm(dY.reshape((dY.shape[0], self.nO*self.nP)), Wopfi)
Wopfi = Wopfi.reshape((self.nO * self.nP, self.nF * self.nI))
dXf = self.ops.gemm(dY.reshape((dY.shape[0], self.nO * self.nP)), Wopfi)
# Reuse the buffer
dWopfi = Wopfi; dWopfi.fill(0.)
dWopfi = Wopfi
dWopfi.fill(0.0)
self.ops.gemm(dY, Xf, out=dWopfi, trans1=True)
dWopfi = dWopfi.reshape((self.nO, self.nP, self.nF, self.nI))
# (o, p, f, i) --> (f, o, p, i)
@ -163,6 +168,7 @@ class PrecomputableAffine(Model):
if sgd is not None:
sgd(self._mem.weights, self._mem.gradient, key=self.id)
return dXf.reshape((dXf.shape[0], self.nF, self.nI))
return Yf, backward
def _add_padding(self, Yf):
@ -171,7 +177,7 @@ class PrecomputableAffine(Model):
def _backprop_padding(self, dY, ids):
# (1, nF, nO, nP) += (nN, nF, nO, nP) where IDs (nN, nF) < 0
mask = ids < 0.
mask = ids < 0.0
mask = mask.sum(axis=1)
d_pad = dY * mask.reshape((ids.shape[0], 1, 1))
self.d_pad += d_pad.sum(axis=0)
@ -179,33 +185,36 @@ class PrecomputableAffine(Model):
@staticmethod
def init_weights(model):
'''This is like the 'layer sequential unit variance', but instead
"""This is like the 'layer sequential unit variance', but instead
of taking the actual inputs, we randomly generate whitened data.
Why's this all so complicated? We have a huge number of inputs,
and the maxout unit makes guessing the dynamics tricky. Instead
we set the maxout weights to values that empirically result in
whitened outputs given whitened inputs.
'''
if (model.W**2).sum() != 0.:
"""
if (model.W ** 2).sum() != 0.0:
return
ops = model.ops
xp = ops.xp
ops.normal_init(model.W, model.nF * model.nI, inplace=True)
ids = ops.allocate((5000, model.nF), dtype='f')
ids = ops.allocate((5000, model.nF), dtype="f")
ids += xp.random.uniform(0, 1000, ids.shape)
ids = ops.asarray(ids, dtype='i')
tokvecs = ops.allocate((5000, model.nI), dtype='f')
tokvecs += xp.random.normal(loc=0., scale=1.,
size=tokvecs.size).reshape(tokvecs.shape)
ids = ops.asarray(ids, dtype="i")
tokvecs = ops.allocate((5000, model.nI), dtype="f")
tokvecs += xp.random.normal(loc=0.0, scale=1.0, size=tokvecs.size).reshape(
tokvecs.shape
)
def predict(ids, tokvecs):
# nS ids. nW tokvecs. Exclude the padding array.
hiddens = model(tokvecs[:-1]) # (nW, f, o, p)
vectors = model.ops.allocate((ids.shape[0], model.nO * model.nP), dtype='f')
vectors = model.ops.allocate((ids.shape[0], model.nO * model.nP), dtype="f")
# need nS vectors
hiddens = hiddens.reshape((hiddens.shape[0] * model.nF, model.nO * model.nP))
hiddens = hiddens.reshape(
(hiddens.shape[0] * model.nF, model.nO * model.nP)
)
model.ops.scatter_add(vectors, ids.flatten(), hiddens)
vectors = vectors.reshape((vectors.shape[0], model.nO, model.nP))
vectors += model.b
@ -238,7 +247,8 @@ def link_vectors_to_models(vocab):
if vectors.data.size != 0:
print(
"Warning: Unnamed vectors -- this won't allow multiple vectors "
"models to be loaded. (Shape: (%d, %d))" % vectors.data.shape)
"models to be loaded. (Shape: (%d, %d))" % vectors.data.shape
)
ops = Model.ops
for word in vocab:
if word.orth in vectors.key2row:
@ -254,28 +264,31 @@ def link_vectors_to_models(vocab):
def PyTorchBiLSTM(nO, nI, depth, dropout=0.2):
if depth == 0:
return layerize(noop())
model = torch.nn.LSTM(nI, nO//2, depth, bidirectional=True, dropout=dropout)
model = torch.nn.LSTM(nI, nO // 2, depth, bidirectional=True, dropout=dropout)
return with_square_sequences(PyTorchWrapperRNN(model))
def Tok2Vec(width, embed_size, **kwargs):
pretrained_vectors = kwargs.get('pretrained_vectors', None)
cnn_maxout_pieces = kwargs.get('cnn_maxout_pieces', 2)
subword_features = kwargs.get('subword_features', True)
conv_depth = kwargs.get('conv_depth', 4)
bilstm_depth = kwargs.get('bilstm_depth', 0)
pretrained_vectors = kwargs.get("pretrained_vectors", None)
cnn_maxout_pieces = kwargs.get("cnn_maxout_pieces", 2)
subword_features = kwargs.get("subword_features", True)
conv_depth = kwargs.get("conv_depth", 4)
bilstm_depth = kwargs.get("bilstm_depth", 0)
cols = [ID, NORM, PREFIX, SUFFIX, SHAPE, ORTH]
with Model.define_operators({'>>': chain, '|': concatenate, '**': clone,
'+': add, '*': reapply}):
norm = HashEmbed(width, embed_size, column=cols.index(NORM),
name='embed_norm')
with Model.define_operators(
{">>": chain, "|": concatenate, "**": clone, "+": add, "*": reapply}
):
norm = HashEmbed(width, embed_size, column=cols.index(NORM), name="embed_norm")
if subword_features:
prefix = HashEmbed(width, embed_size//2, column=cols.index(PREFIX),
name='embed_prefix')
suffix = HashEmbed(width, embed_size//2, column=cols.index(SUFFIX),
name='embed_suffix')
shape = HashEmbed(width, embed_size//2, column=cols.index(SHAPE),
name='embed_shape')
prefix = HashEmbed(
width, embed_size // 2, column=cols.index(PREFIX), name="embed_prefix"
)
suffix = HashEmbed(
width, embed_size // 2, column=cols.index(SUFFIX), name="embed_suffix"
)
shape = HashEmbed(
width, embed_size // 2, column=cols.index(SHAPE), name="embed_shape"
)
else:
prefix, suffix, shape = (None, None, None)
if pretrained_vectors is not None:
@ -284,28 +297,29 @@ def Tok2Vec(width, embed_size, **kwargs):
if subword_features:
embed = uniqued(
(glove | norm | prefix | suffix | shape)
>> LN(Maxout(width, width*5, pieces=3)), column=cols.index(ORTH))
>> LN(Maxout(width, width * 5, pieces=3)),
column=cols.index(ORTH),
)
else:
embed = uniqued(
(glove | norm)
>> LN(Maxout(width, width*2, pieces=3)), column=cols.index(ORTH))
(glove | norm) >> LN(Maxout(width, width * 2, pieces=3)),
column=cols.index(ORTH),
)
elif subword_features:
embed = uniqued(
(norm | prefix | suffix | shape)
>> LN(Maxout(width, width*4, pieces=3)), column=cols.index(ORTH))
>> LN(Maxout(width, width * 4, pieces=3)),
column=cols.index(ORTH),
)
else:
embed = norm
convolution = Residual(
ExtractWindow(nW=1)
>> LN(Maxout(width, width*3, pieces=cnn_maxout_pieces))
)
tok2vec = (
FeatureExtracter(cols)
>> with_flatten(
embed
>> convolution ** conv_depth, pad=conv_depth
>> LN(Maxout(width, width * 3, pieces=cnn_maxout_pieces))
)
tok2vec = FeatureExtracter(cols) >> with_flatten(
embed >> convolution ** conv_depth, pad=conv_depth
)
if bilstm_depth >= 1:
tok2vec = tok2vec >> PyTorchBiLSTM(width, width, bilstm_depth)
@ -316,7 +330,7 @@ def Tok2Vec(width, embed_size, **kwargs):
def reapply(layer, n_times):
def reapply_fwd(X, drop=0.):
def reapply_fwd(X, drop=0.0):
backprops = []
for i in range(n_times):
Y, backprop = layer.begin_update(X, drop=drop)
@ -334,12 +348,14 @@ def reapply(layer, n_times):
return dX
return Y, reapply_bwd
return wrap(reapply_fwd, layer)
def asarray(ops, dtype):
def forward(X, drop=0.):
def forward(X, drop=0.0):
return ops.asarray(X, dtype=dtype), None
return layerize(forward)
@ -347,7 +363,7 @@ def _divide_array(X, size):
parts = []
index = 0
while index < len(X):
parts.append(X[index:index + size])
parts.append(X[index : index + size])
index += size
return parts
@ -356,7 +372,7 @@ def get_col(idx):
if idx < 0:
raise IndexError(Errors.E066.format(value=idx))
def forward(X, drop=0.):
def forward(X, drop=0.0):
if isinstance(X, numpy.ndarray):
ops = NumpyOps()
else:
@ -377,7 +393,7 @@ def doc2feats(cols=None):
if cols is None:
cols = [ID, NORM, PREFIX, SUFFIX, SHAPE, ORTH]
def forward(docs, drop=0.):
def forward(docs, drop=0.0):
feats = []
for doc in docs:
feats.append(doc.to_array(cols))
@ -389,13 +405,14 @@ def doc2feats(cols=None):
def print_shape(prefix):
def forward(X, drop=0.):
def forward(X, drop=0.0):
return X, lambda dX, **kwargs: dX
return layerize(forward)
@layerize
def get_token_vectors(tokens_attrs_vectors, drop=0.):
def get_token_vectors(tokens_attrs_vectors, drop=0.0):
tokens, attrs, vectors = tokens_attrs_vectors
def backward(d_output, sgd=None):
@ -405,17 +422,17 @@ def get_token_vectors(tokens_attrs_vectors, drop=0.):
@layerize
def logistic(X, drop=0.):
def logistic(X, drop=0.0):
xp = get_array_module(X)
if not isinstance(X, xp.ndarray):
X = xp.asarray(X)
# Clip to range (-10, 10)
X = xp.minimum(X, 10., X)
X = xp.maximum(X, -10., X)
Y = 1. / (1. + xp.exp(-X))
X = xp.minimum(X, 10.0, X)
X = xp.maximum(X, -10.0, X)
Y = 1.0 / (1.0 + xp.exp(-X))
def logistic_bwd(dY, sgd=None):
dX = dY * (Y * (1-Y))
dX = dY * (Y * (1 - Y))
return dX
return Y, logistic_bwd
@ -424,12 +441,13 @@ def logistic(X, drop=0.):
def zero_init(model):
def _zero_init_impl(self, X, y):
self.W.fill(0)
model.on_data_hooks.append(_zero_init_impl)
return model
@layerize
def preprocess_doc(docs, drop=0.):
def preprocess_doc(docs, drop=0.0):
keys = [doc.to_array([LOWER]) for doc in docs]
ops = Model.ops
lengths = ops.asarray([arr.shape[0] for arr in keys])
@ -439,31 +457,32 @@ def preprocess_doc(docs, drop=0.):
def getitem(i):
def getitem_fwd(X, drop=0.):
def getitem_fwd(X, drop=0.0):
return X[i], None
return layerize(getitem_fwd)
def build_tagger_model(nr_class, **cfg):
embed_size = util.env_opt('embed_size', 2000)
if 'token_vector_width' in cfg:
token_vector_width = cfg['token_vector_width']
embed_size = util.env_opt("embed_size", 2000)
if "token_vector_width" in cfg:
token_vector_width = cfg["token_vector_width"]
else:
token_vector_width = util.env_opt('token_vector_width', 96)
pretrained_vectors = cfg.get('pretrained_vectors')
subword_features = cfg.get('subword_features', True)
with Model.define_operators({'>>': chain, '+': add}):
if 'tok2vec' in cfg:
tok2vec = cfg['tok2vec']
token_vector_width = util.env_opt("token_vector_width", 96)
pretrained_vectors = cfg.get("pretrained_vectors")
subword_features = cfg.get("subword_features", True)
with Model.define_operators({">>": chain, "+": add}):
if "tok2vec" in cfg:
tok2vec = cfg["tok2vec"]
else:
tok2vec = Tok2Vec(token_vector_width, embed_size,
tok2vec = Tok2Vec(
token_vector_width,
embed_size,
subword_features=subword_features,
pretrained_vectors=pretrained_vectors)
softmax = with_flatten(Softmax(nr_class, token_vector_width))
model = (
tok2vec
>> softmax
pretrained_vectors=pretrained_vectors,
)
softmax = with_flatten(Softmax(nr_class, token_vector_width))
model = tok2vec >> softmax
model.nI = None
model.tok2vec = tok2vec
model.softmax = softmax
@ -471,10 +490,10 @@ def build_tagger_model(nr_class, **cfg):
@layerize
def SpacyVectors(docs, drop=0.):
def SpacyVectors(docs, drop=0.0):
batch = []
for doc in docs:
indices = numpy.zeros((len(doc),), dtype='i')
indices = numpy.zeros((len(doc),), dtype="i")
for i, word in enumerate(doc):
if word.orth in doc.vocab.vectors.key2row:
indices[i] = doc.vocab.vectors.key2row[word.orth]
@ -486,12 +505,11 @@ def SpacyVectors(docs, drop=0.):
def build_text_classifier(nr_class, width=64, **cfg):
depth = cfg.get('depth', 2)
nr_vector = cfg.get('nr_vector', 5000)
pretrained_dims = cfg.get('pretrained_dims', 0)
with Model.define_operators({'>>': chain, '+': add, '|': concatenate,
'**': clone}):
if cfg.get('low_data') and pretrained_dims:
depth = cfg.get("depth", 2)
nr_vector = cfg.get("nr_vector", 5000)
pretrained_dims = cfg.get("pretrained_dims", 0)
with Model.define_operators({">>": chain, "+": add, "|": concatenate, "**": clone}):
if cfg.get("low_data") and pretrained_dims:
model = (
SpacyVectors
>> flatten_add_lengths
@ -505,41 +523,35 @@ def build_text_classifier(nr_class, width=64, **cfg):
return model
lower = HashEmbed(width, nr_vector, column=1)
prefix = HashEmbed(width//2, nr_vector, column=2)
suffix = HashEmbed(width//2, nr_vector, column=3)
shape = HashEmbed(width//2, nr_vector, column=4)
prefix = HashEmbed(width // 2, nr_vector, column=2)
suffix = HashEmbed(width // 2, nr_vector, column=3)
shape = HashEmbed(width // 2, nr_vector, column=4)
trained_vectors = (
FeatureExtracter([ORTH, LOWER, PREFIX, SUFFIX, SHAPE, ID])
>> with_flatten(
trained_vectors = FeatureExtracter(
[ORTH, LOWER, PREFIX, SUFFIX, SHAPE, ID]
) >> with_flatten(
uniqued(
(lower | prefix | suffix | shape)
>> LN(Maxout(width, width+(width//2)*3)),
column=0
)
>> LN(Maxout(width, width + (width // 2) * 3)),
column=0,
)
)
if pretrained_dims:
static_vectors = (
SpacyVectors
>> with_flatten(Affine(width, pretrained_dims))
static_vectors = SpacyVectors >> with_flatten(
Affine(width, pretrained_dims)
)
# TODO Make concatenate support lists
vectors = concatenate_lists(trained_vectors, static_vectors)
vectors_width = width*2
vectors_width = width * 2
else:
vectors = trained_vectors
vectors_width = width
static_vectors = None
tok2vec = (
vectors
>> with_flatten(
tok2vec = vectors >> with_flatten(
LN(Maxout(width, vectors_width))
>> Residual(
(ExtractWindow(nW=1) >> LN(Maxout(width, width*3)))
) ** depth, pad=depth
)
>> Residual((ExtractWindow(nW=1) >> LN(Maxout(width, width * 3)))) ** depth,
pad=depth,
)
cnn_model = (
tok2vec
@ -550,13 +562,10 @@ def build_text_classifier(nr_class, width=64, **cfg):
>> zero_init(Affine(nr_class, width, drop_factor=0.0))
)
linear_model = (
_preprocess_doc
>> LinearModel(nr_class)
)
linear_model = _preprocess_doc >> LinearModel(nr_class)
model = (
(linear_model | cnn_model)
>> zero_init(Affine(nr_class, nr_class*2, drop_factor=0.0))
>> zero_init(Affine(nr_class, nr_class * 2, drop_factor=0.0))
>> logistic
)
model.tok2vec = tok2vec
@ -566,9 +575,9 @@ def build_text_classifier(nr_class, width=64, **cfg):
@layerize
def flatten(seqs, drop=0.):
def flatten(seqs, drop=0.0):
ops = Model.ops
lengths = ops.asarray([len(seq) for seq in seqs], dtype='i')
lengths = ops.asarray([len(seq) for seq in seqs], dtype="i")
def finish_update(d_X, sgd=None):
return ops.unflatten(d_X, lengths, pad=0)
@ -583,14 +592,14 @@ def concatenate_lists(*layers, **kwargs): # pragma: no cover
"""
if not layers:
return noop()
drop_factor = kwargs.get('drop_factor', 1.0)
drop_factor = kwargs.get("drop_factor", 1.0)
ops = layers[0].ops
layers = [chain(layer, flatten) for layer in layers]
concat = concatenate(*layers)
def concatenate_lists_fwd(Xs, drop=0.):
def concatenate_lists_fwd(Xs, drop=0.0):
drop *= drop_factor
lengths = ops.asarray([len(X) for X in Xs], dtype='i')
lengths = ops.asarray([len(X) for X in Xs], dtype="i")
flat_y, bp_flat_y = concat.begin_update(Xs, drop=drop)
ys = ops.unflatten(flat_y, lengths)

View File

@ -1,16 +1,17 @@
# inspired from:
# https://python-packaging-user-guide.readthedocs.org/en/latest/single_source_version/
# https://github.com/pypa/warehouse/blob/master/warehouse/__about__.py
# fmt: off
__title__ = 'spacy-nightly'
__version__ = '2.1.0a3'
__summary__ = 'Industrial-strength Natural Language Processing (NLP) with Python and Cython'
__uri__ = 'https://spacy.io'
__author__ = 'Explosion AI'
__email__ = 'contact@explosion.ai'
__license__ = 'MIT'
__title__ = "spacy-nightly"
__version__ = "2.1.0a3"
__summary__ = "Industrial-strength Natural Language Processing (NLP) with Python and Cython"
__uri__ = "https://spacy.io"
__author__ = "Explosion AI"
__email__ = "contact@explosion.ai"
__license__ = "MIT"
__release__ = False
__download_url__ = 'https://github.com/explosion/spacy-models/releases/download'
__compatibility__ = 'https://raw.githubusercontent.com/explosion/spacy-models/master/compatibility.json'
__shortcuts__ = 'https://raw.githubusercontent.com/explosion/spacy-models/master/shortcuts-v2.json'
__download_url__ = "https://github.com/explosion/spacy-models/releases/download"
__compatibility__ = "https://raw.githubusercontent.com/explosion/spacy-models/master/compatibility.json"
__shortcuts__ = "https://raw.githubusercontent.com/explosion/spacy-models/master/shortcuts-v2.json"

View File

@ -6,7 +6,6 @@ import sys
import ujson
import itertools
import locale
import os
from thinc.neural.util import copy_array
@ -31,9 +30,9 @@ except ImportError:
cupy = None
try:
from thinc.neural.optimizers import Optimizer
from thinc.neural.optimizers import Optimizer # noqa: F401
except ImportError:
from thinc.neural.optimizers import Adam as Optimizer
from thinc.neural.optimizers import Adam as Optimizer # noqa: F401
pickle = pickle
copy_reg = copy_reg

View File

@ -12,8 +12,15 @@ _html = {}
IS_JUPYTER = is_in_jupyter()
def render(docs, style='dep', page=False, minify=False, jupyter=IS_JUPYTER,
options={}, manual=False):
def render(
docs,
style="dep",
page=False,
minify=False,
jupyter=IS_JUPYTER,
options={},
manual=False,
):
"""Render displaCy visualisation.
docs (list or Doc): Document(s) to visualise.
@ -25,8 +32,10 @@ def render(docs, style='dep', page=False, minify=False, jupyter=IS_JUPYTER,
manual (bool): Don't parse `Doc` and instead expect a dict/list of dicts.
RETURNS (unicode): Rendered HTML markup.
"""
factories = {'dep': (DependencyRenderer, parse_deps),
'ent': (EntityRenderer, parse_ents)}
factories = {
"dep": (DependencyRenderer, parse_deps),
"ent": (EntityRenderer, parse_ents),
}
if style not in factories:
raise ValueError(Errors.E087.format(style=style))
if isinstance(docs, (Doc, Span, dict)):
@ -37,16 +46,18 @@ def render(docs, style='dep', page=False, minify=False, jupyter=IS_JUPYTER,
renderer, converter = factories[style]
renderer = renderer(options=options)
parsed = [converter(doc, options) for doc in docs] if not manual else docs
_html['parsed'] = renderer.render(parsed, page=page, minify=minify).strip()
html = _html['parsed']
_html["parsed"] = renderer.render(parsed, page=page, minify=minify).strip()
html = _html["parsed"]
if jupyter: # return HTML rendered by IPython display()
from IPython.core.display import display, HTML
return display(HTML(html))
return html
def serve(docs, style='dep', page=True, minify=False, options={}, manual=False,
port=5000):
def serve(
docs, style="dep", page=True, minify=False, options={}, manual=False, port=5000
):
"""Serve displaCy visualisation.
docs (list or Doc): Document(s) to visualise.
@ -58,11 +69,13 @@ def serve(docs, style='dep', page=True, minify=False, options={}, manual=False,
port (int): Port to serve visualisation.
"""
from wsgiref import simple_server
render(docs, style=style, page=page, minify=minify, options=options,
manual=manual)
httpd = simple_server.make_server('0.0.0.0', port, app)
prints("Using the '{}' visualizer".format(style),
title="Serving on port {}...".format(port))
render(docs, style=style, page=page, minify=minify, options=options, manual=manual)
httpd = simple_server.make_server("0.0.0.0", port, app)
prints(
"Using the '{}' visualizer".format(style),
title="Serving on port {}...".format(port),
)
try:
httpd.serve_forever()
except KeyboardInterrupt:
@ -72,11 +85,10 @@ def serve(docs, style='dep', page=True, minify=False, options={}, manual=False,
def app(environ, start_response):
# headers and status need to be bytes in Python 2, see #1227
headers = [(b_to_str(b'Content-type'),
b_to_str(b'text/html; charset=utf-8'))]
start_response(b_to_str(b'200 OK'), headers)
res = _html['parsed'].encode(encoding='utf-8')
# Headers and status need to be bytes in Python 2, see #1227
headers = [(b_to_str(b"Content-type"), b_to_str(b"text/html; charset=utf-8"))]
start_response(b_to_str(b"200 OK"), headers)
res = _html["parsed"].encode(encoding="utf-8")
return [res]
@ -89,11 +101,10 @@ def parse_deps(orig_doc, options={}):
doc = Doc(orig_doc.vocab).from_bytes(orig_doc.to_bytes())
if not doc.is_parsed:
user_warning(Warnings.W005)
if options.get('collapse_phrases', False):
if options.get("collapse_phrases", False):
for np in list(doc.noun_chunks):
np.merge(tag=np.root.tag_, lemma=np.root.lemma_,
ent_type=np.root.ent_type_)
if options.get('collapse_punct', True):
np.merge(tag=np.root.tag_, lemma=np.root.lemma_, ent_type=np.root.ent_type_)
if options.get("collapse_punct", True):
spans = []
for word in doc[:-1]:
if word.is_punct or not word.nbor(1).is_punct:
@ -103,23 +114,31 @@ def parse_deps(orig_doc, options={}):
while end < len(doc) and doc[end].is_punct:
end += 1
span = doc[start:end]
spans.append((span.start_char, span.end_char, word.tag_,
word.lemma_, word.ent_type_))
spans.append(
(span.start_char, span.end_char, word.tag_, word.lemma_, word.ent_type_)
)
for start, end, tag, lemma, ent_type in spans:
doc.merge(start, end, tag=tag, lemma=lemma, ent_type=ent_type)
if options.get('fine_grained'):
words = [{'text': w.text, 'tag': w.tag_} for w in doc]
if options.get("fine_grained"):
words = [{"text": w.text, "tag": w.tag_} for w in doc]
else:
words = [{'text': w.text, 'tag': w.pos_} for w in doc]
words = [{"text": w.text, "tag": w.pos_} for w in doc]
arcs = []
for word in doc:
if word.i < word.head.i:
arcs.append({'start': word.i, 'end': word.head.i,
'label': word.dep_, 'dir': 'left'})
arcs.append(
{"start": word.i, "end": word.head.i, "label": word.dep_, "dir": "left"}
)
elif word.i > word.head.i:
arcs.append({'start': word.head.i, 'end': word.i,
'label': word.dep_, 'dir': 'right'})
return {'words': words, 'arcs': arcs}
arcs.append(
{
"start": word.head.i,
"end": word.i,
"label": word.dep_,
"dir": "right",
}
)
return {"words": words, "arcs": arcs}
def parse_ents(doc, options={}):
@ -128,10 +147,11 @@ def parse_ents(doc, options={}):
doc (Doc): Document do parse.
RETURNS (dict): Generated entities keyed by text (original text) and ents.
"""
ents = [{'start': ent.start_char, 'end': ent.end_char, 'label': ent.label_}
for ent in doc.ents]
ents = [
{"start": ent.start_char, "end": ent.end_char, "label": ent.label_}
for ent in doc.ents
]
if not ents:
user_warning(Warnings.W006)
title = (doc.user_data.get('title', None)
if hasattr(doc, 'user_data') else None)
return {'text': doc.text, 'ents': ents, 'title': title}
title = doc.user_data.get("title", None) if hasattr(doc, "user_data") else None
return {"text": doc.text, "ents": ents, "title": title}

View File

@ -10,7 +10,8 @@ from ..util import minify_html, escape_html
class DependencyRenderer(object):
"""Render dependency parses as SVGs."""
style = 'dep'
style = "dep"
def __init__(self, options={}):
"""Initialise dependency renderer.
@ -19,18 +20,16 @@ class DependencyRenderer(object):
arrow_spacing, arrow_width, arrow_stroke, distance, offset_x,
color, bg, font)
"""
self.compact = options.get('compact', False)
self.word_spacing = options.get('word_spacing', 45)
self.arrow_spacing = options.get('arrow_spacing',
12 if self.compact else 20)
self.arrow_width = options.get('arrow_width',
6 if self.compact else 10)
self.arrow_stroke = options.get('arrow_stroke', 2)
self.distance = options.get('distance', 150 if self.compact else 175)
self.offset_x = options.get('offset_x', 50)
self.color = options.get('color', '#000000')
self.bg = options.get('bg', '#ffffff')
self.font = options.get('font', 'Arial')
self.compact = options.get("compact", False)
self.word_spacing = options.get("word_spacing", 45)
self.arrow_spacing = options.get("arrow_spacing", 12 if self.compact else 20)
self.arrow_width = options.get("arrow_width", 6 if self.compact else 10)
self.arrow_stroke = options.get("arrow_stroke", 2)
self.distance = options.get("distance", 150 if self.compact else 175)
self.offset_x = options.get("offset_x", 50)
self.color = options.get("color", "#000000")
self.bg = options.get("bg", "#ffffff")
self.font = options.get("font", "Arial")
def render(self, parsed, page=False, minify=False):
"""Render complete markup.
@ -43,14 +42,15 @@ class DependencyRenderer(object):
# Create a random ID prefix to make sure parses don't receive the
# same ID, even if they're identical
id_prefix = random.randint(0, 999)
rendered = [self.render_svg('{}-{}'.format(id_prefix, i), p['words'], p['arcs'])
for i, p in enumerate(parsed)]
rendered = [
self.render_svg("{}-{}".format(id_prefix, i), p["words"], p["arcs"])
for i, p in enumerate(parsed)
]
if page:
content = ''.join([TPL_FIGURE.format(content=svg)
for svg in rendered])
content = "".join([TPL_FIGURE.format(content=svg) for svg in rendered])
markup = TPL_PAGE.format(content=content)
else:
markup = ''.join(rendered)
markup = "".join(rendered)
if minify:
return minify_html(markup)
return markup
@ -65,19 +65,25 @@ class DependencyRenderer(object):
"""
self.levels = self.get_levels(arcs)
self.highest_level = len(self.levels)
self.offset_y = self.distance/2*self.highest_level+self.arrow_stroke
self.width = self.offset_x+len(words)*self.distance
self.height = self.offset_y+3*self.word_spacing
self.offset_y = self.distance / 2 * self.highest_level + self.arrow_stroke
self.width = self.offset_x + len(words) * self.distance
self.height = self.offset_y + 3 * self.word_spacing
self.id = render_id
words = [self.render_word(w['text'], w['tag'], i)
for i, w in enumerate(words)]
arcs = [self.render_arrow(a['label'], a['start'],
a['end'], a['dir'], i)
for i, a in enumerate(arcs)]
content = ''.join(words) + ''.join(arcs)
return TPL_DEP_SVG.format(id=self.id, width=self.width,
height=self.height, color=self.color,
bg=self.bg, font=self.font, content=content)
words = [self.render_word(w["text"], w["tag"], i) for i, w in enumerate(words)]
arcs = [
self.render_arrow(a["label"], a["start"], a["end"], a["dir"], i)
for i, a in enumerate(arcs)
]
content = "".join(words) + "".join(arcs)
return TPL_DEP_SVG.format(
id=self.id,
width=self.width,
height=self.height,
color=self.color,
bg=self.bg,
font=self.font,
content=content,
)
def render_word(self, text, tag, i):
"""Render individual word.
@ -87,12 +93,11 @@ class DependencyRenderer(object):
i (int): Unique ID, typically word index.
RETURNS (unicode): Rendered SVG markup.
"""
y = self.offset_y+self.word_spacing
x = self.offset_x+i*self.distance
y = self.offset_y + self.word_spacing
x = self.offset_x + i * self.distance
html_text = escape_html(text)
return TPL_DEP_WORDS.format(text=html_text, tag=tag, x=x, y=y)
def render_arrow(self, label, start, end, direction, i):
"""Render indivicual arrow.
@ -103,20 +108,30 @@ class DependencyRenderer(object):
i (int): Unique ID, typically arrow index.
RETURNS (unicode): Rendered SVG markup.
"""
level = self.levels.index(end-start)+1
x_start = self.offset_x+start*self.distance+self.arrow_spacing
level = self.levels.index(end - start) + 1
x_start = self.offset_x + start * self.distance + self.arrow_spacing
y = self.offset_y
x_end = (self.offset_x+(end-start)*self.distance+start*self.distance
- self.arrow_spacing*(self.highest_level-level)/4)
y_curve = self.offset_y-level*self.distance/2
x_end = (
self.offset_x
+ (end - start) * self.distance
+ start * self.distance
- self.arrow_spacing * (self.highest_level - level) / 4
)
y_curve = self.offset_y - level * self.distance / 2
if self.compact:
y_curve = self.offset_y-level*self.distance/6
y_curve = self.offset_y - level * self.distance / 6
if y_curve == 0 and len(self.levels) > 5:
y_curve = -self.distance
arrowhead = self.get_arrowhead(direction, x_start, y, x_end)
arc = self.get_arc(x_start, y, y_curve, x_end)
return TPL_DEP_ARCS.format(id=self.id, i=i, stroke=self.arrow_stroke,
head=arrowhead, label=label, arc=arc)
return TPL_DEP_ARCS.format(
id=self.id,
i=i,
stroke=self.arrow_stroke,
head=arrowhead,
label=label,
arc=arc,
)
def get_arc(self, x_start, y, y_curve, x_end):
"""Render individual arc.
@ -141,13 +156,22 @@ class DependencyRenderer(object):
end (int): X-coordinate of arrow end point.
RETURNS (unicode): Definition of the arrow head path ('d' attribute).
"""
if direction == 'left':
pos1, pos2, pos3 = (x, x-self.arrow_width+2, x+self.arrow_width-2)
if direction == "left":
pos1, pos2, pos3 = (x, x - self.arrow_width + 2, x + self.arrow_width - 2)
else:
pos1, pos2, pos3 = (end, end+self.arrow_width-2,
end-self.arrow_width+2)
arrowhead = (pos1, y+2, pos2, y-self.arrow_width, pos3,
y-self.arrow_width)
pos1, pos2, pos3 = (
end,
end + self.arrow_width - 2,
end - self.arrow_width + 2,
)
arrowhead = (
pos1,
y + 2,
pos2,
y - self.arrow_width,
pos3,
y - self.arrow_width,
)
return "M{},{} L{},{} {},{}".format(*arrowhead)
def get_levels(self, arcs):
@ -157,30 +181,44 @@ class DependencyRenderer(object):
args (list): Individual arcs and their start, end, direction and label.
RETURNS (list): Arc levels sorted from lowest to highest.
"""
levels = set(map(lambda arc: arc['end'] - arc['start'], arcs))
levels = set(map(lambda arc: arc["end"] - arc["start"], arcs))
return sorted(list(levels))
class EntityRenderer(object):
"""Render named entities as HTML."""
style = 'ent'
style = "ent"
def __init__(self, options={}):
"""Initialise dependency renderer.
options (dict): Visualiser-specific options (colors, ents)
"""
colors = {'ORG': '#7aecec', 'PRODUCT': '#bfeeb7', 'GPE': '#feca74',
'LOC': '#ff9561', 'PERSON': '#aa9cfc', 'NORP': '#c887fb',
'FACILITY': '#9cc9cc', 'EVENT': '#ffeb80', 'LAW': '#ff8197',
'LANGUAGE': '#ff8197', 'WORK_OF_ART': '#f0d0ff',
'DATE': '#bfe1d9', 'TIME': '#bfe1d9', 'MONEY': '#e4e7d2',
'QUANTITY': '#e4e7d2', 'ORDINAL': '#e4e7d2',
'CARDINAL': '#e4e7d2', 'PERCENT': '#e4e7d2'}
colors.update(options.get('colors', {}))
self.default_color = '#ddd'
colors = {
"ORG": "#7aecec",
"PRODUCT": "#bfeeb7",
"GPE": "#feca74",
"LOC": "#ff9561",
"PERSON": "#aa9cfc",
"NORP": "#c887fb",
"FACILITY": "#9cc9cc",
"EVENT": "#ffeb80",
"LAW": "#ff8197",
"LANGUAGE": "#ff8197",
"WORK_OF_ART": "#f0d0ff",
"DATE": "#bfe1d9",
"TIME": "#bfe1d9",
"MONEY": "#e4e7d2",
"QUANTITY": "#e4e7d2",
"ORDINAL": "#e4e7d2",
"CARDINAL": "#e4e7d2",
"PERCENT": "#e4e7d2",
}
colors.update(options.get("colors", {}))
self.default_color = "#ddd"
self.colors = colors
self.ents = options.get('ents', None)
self.ents = options.get("ents", None)
def render(self, parsed, page=False, minify=False):
"""Render complete markup.
@ -190,14 +228,14 @@ class EntityRenderer(object):
minify (bool): Minify HTML markup.
RETURNS (unicode): Rendered HTML markup.
"""
rendered = [self.render_ents(p['text'], p['ents'],
p.get('title', None)) for p in parsed]
rendered = [
self.render_ents(p["text"], p["ents"], p.get("title", None)) for p in parsed
]
if page:
docs = ''.join([TPL_FIGURE.format(content=doc)
for doc in rendered])
docs = "".join([TPL_FIGURE.format(content=doc) for doc in rendered])
markup = TPL_PAGE.format(content=docs)
else:
markup = ''.join(rendered)
markup = "".join(rendered)
if minify:
return minify_html(markup)
return markup
@ -209,18 +247,18 @@ class EntityRenderer(object):
spans (list): Individual entity spans and their start, end and label.
title (unicode or None): Document title set in Doc.user_data['title'].
"""
markup = ''
markup = ""
offset = 0
for span in spans:
label = span['label']
start = span['start']
end = span['end']
label = span["label"]
start = span["start"]
end = span["end"]
entity = text[start:end]
fragments = text[offset:start].split('\n')
fragments = text[offset:start].split("\n")
for i, fragment in enumerate(fragments):
markup += fragment
if len(fragments) > 1 and i != len(fragments)-1:
markup += '</br>'
if len(fragments) > 1 and i != len(fragments) - 1:
markup += "</br>"
if self.ents is None or label.upper() in self.ents:
color = self.colors.get(label.upper(), self.default_color)
markup += TPL_ENT.format(label=label, text=entity, bg=color)

View File

@ -2,7 +2,7 @@
from __future__ import unicode_literals
# setting explicit height and max-width: none on the SVG is required for
# Setting explicit height and max-width: none on the SVG is required for
# Jupyter to render it properly in a cell
TPL_DEP_SVG = """

View File

@ -8,13 +8,17 @@ import inspect
def add_codes(err_cls):
"""Add error codes to string messages via class attribute names."""
class ErrorsWithCodes(object):
def __getattribute__(self, code):
msg = getattr(err_cls, code)
return '[{code}] {msg}'.format(code=code, msg=msg)
return "[{code}] {msg}".format(code=code, msg=msg)
return ErrorsWithCodes()
# fmt: off
@add_codes
class Warnings(object):
W001 = ("As of spaCy v2.0, the keyword argument `path=` is deprecated. "
@ -275,6 +279,7 @@ class Errors(object):
" can only be part of one entity, so make sure the entities you're "
"setting don't overlap.")
@add_codes
class TempErrors(object):
T001 = ("Max length currently 10 for phrase matching")
@ -292,55 +297,57 @@ class TempErrors(object):
"(pretrained_dims) but not the new name (pretrained_vectors).")
# fmt: on
class ModelsWarning(UserWarning):
pass
WARNINGS = {
'user': UserWarning,
'deprecation': DeprecationWarning,
'models': ModelsWarning,
"user": UserWarning,
"deprecation": DeprecationWarning,
"models": ModelsWarning,
}
def _get_warn_types(arg):
if arg == '': # don't show any warnings
if arg == "": # don't show any warnings
return []
if not arg or arg == 'all': # show all available warnings
if not arg or arg == "all": # show all available warnings
return WARNINGS.keys()
return [w_type.strip() for w_type in arg.split(',')
if w_type.strip() in WARNINGS]
return [w_type.strip() for w_type in arg.split(",") if w_type.strip() in WARNINGS]
def _get_warn_excl(arg):
if not arg:
return []
return [w_id.strip() for w_id in arg.split(',')]
return [w_id.strip() for w_id in arg.split(",")]
SPACY_WARNING_FILTER = os.environ.get('SPACY_WARNING_FILTER')
SPACY_WARNING_TYPES = _get_warn_types(os.environ.get('SPACY_WARNING_TYPES'))
SPACY_WARNING_IGNORE = _get_warn_excl(os.environ.get('SPACY_WARNING_IGNORE'))
SPACY_WARNING_FILTER = os.environ.get("SPACY_WARNING_FILTER")
SPACY_WARNING_TYPES = _get_warn_types(os.environ.get("SPACY_WARNING_TYPES"))
SPACY_WARNING_IGNORE = _get_warn_excl(os.environ.get("SPACY_WARNING_IGNORE"))
def user_warning(message):
_warn(message, 'user')
_warn(message, "user")
def deprecation_warning(message):
_warn(message, 'deprecation')
_warn(message, "deprecation")
def models_warning(message):
_warn(message, 'models')
_warn(message, "models")
def _warn(message, warn_type='user'):
def _warn(message, warn_type="user"):
"""
message (unicode): The message to display.
category (Warning): The Warning to show.
"""
w_id = message.split('[', 1)[1].split(']', 1)[0] # get ID from string
w_id = message.split("[", 1)[1].split("]", 1)[0] # get ID from string
if warn_type in SPACY_WARNING_TYPES and w_id not in SPACY_WARNING_IGNORE:
category = WARNINGS[warn_type]
stack = inspect.stack()[-1]

View File

@ -21,295 +21,272 @@ GLOSSARY = {
# POS tags
# Universal POS Tags
# http://universaldependencies.org/u/pos/
'ADJ': 'adjective',
'ADP': 'adposition',
'ADV': 'adverb',
'AUX': 'auxiliary',
'CONJ': 'conjunction',
'CCONJ': 'coordinating conjunction',
'DET': 'determiner',
'INTJ': 'interjection',
'NOUN': 'noun',
'NUM': 'numeral',
'PART': 'particle',
'PRON': 'pronoun',
'PROPN': 'proper noun',
'PUNCT': 'punctuation',
'SCONJ': 'subordinating conjunction',
'SYM': 'symbol',
'VERB': 'verb',
'X': 'other',
'EOL': 'end of line',
'SPACE': 'space',
"ADJ": "adjective",
"ADP": "adposition",
"ADV": "adverb",
"AUX": "auxiliary",
"CONJ": "conjunction",
"CCONJ": "coordinating conjunction",
"DET": "determiner",
"INTJ": "interjection",
"NOUN": "noun",
"NUM": "numeral",
"PART": "particle",
"PRON": "pronoun",
"PROPN": "proper noun",
"PUNCT": "punctuation",
"SCONJ": "subordinating conjunction",
"SYM": "symbol",
"VERB": "verb",
"X": "other",
"EOL": "end of line",
"SPACE": "space",
# POS tags (English)
# OntoNotes 5 / Penn Treebank
# https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
'.': 'punctuation mark, sentence closer',
',': 'punctuation mark, comma',
'-LRB-': 'left round bracket',
'-RRB-': 'right round bracket',
'``': 'opening quotation mark',
'""': 'closing quotation mark',
"''": 'closing quotation mark',
':': 'punctuation mark, colon or ellipsis',
'$': 'symbol, currency',
'#': 'symbol, number sign',
'AFX': 'affix',
'CC': 'conjunction, coordinating',
'CD': 'cardinal number',
'DT': 'determiner',
'EX': 'existential there',
'FW': 'foreign word',
'HYPH': 'punctuation mark, hyphen',
'IN': 'conjunction, subordinating or preposition',
'JJ': 'adjective',
'JJR': 'adjective, comparative',
'JJS': 'adjective, superlative',
'LS': 'list item marker',
'MD': 'verb, modal auxiliary',
'NIL': 'missing tag',
'NN': 'noun, singular or mass',
'NNP': 'noun, proper singular',
'NNPS': 'noun, proper plural',
'NNS': 'noun, plural',
'PDT': 'predeterminer',
'POS': 'possessive ending',
'PRP': 'pronoun, personal',
'PRP$': 'pronoun, possessive',
'RB': 'adverb',
'RBR': 'adverb, comparative',
'RBS': 'adverb, superlative',
'RP': 'adverb, particle',
'TO': 'infinitival to',
'UH': 'interjection',
'VB': 'verb, base form',
'VBD': 'verb, past tense',
'VBG': 'verb, gerund or present participle',
'VBN': 'verb, past participle',
'VBP': 'verb, non-3rd person singular present',
'VBZ': 'verb, 3rd person singular present',
'WDT': 'wh-determiner',
'WP': 'wh-pronoun, personal',
'WP$': 'wh-pronoun, possessive',
'WRB': 'wh-adverb',
'SP': 'space',
'ADD': 'email',
'NFP': 'superfluous punctuation',
'GW': 'additional word in multi-word expression',
'XX': 'unknown',
'BES': 'auxiliary "be"',
'HVS': 'forms of "have"',
".": "punctuation mark, sentence closer",
",": "punctuation mark, comma",
"-LRB-": "left round bracket",
"-RRB-": "right round bracket",
"``": "opening quotation mark",
'""': "closing quotation mark",
"''": "closing quotation mark",
":": "punctuation mark, colon or ellipsis",
"$": "symbol, currency",
"#": "symbol, number sign",
"AFX": "affix",
"CC": "conjunction, coordinating",
"CD": "cardinal number",
"DT": "determiner",
"EX": "existential there",
"FW": "foreign word",
"HYPH": "punctuation mark, hyphen",
"IN": "conjunction, subordinating or preposition",
"JJ": "adjective",
"JJR": "adjective, comparative",
"JJS": "adjective, superlative",
"LS": "list item marker",
"MD": "verb, modal auxiliary",
"NIL": "missing tag",
"NN": "noun, singular or mass",
"NNP": "noun, proper singular",
"NNPS": "noun, proper plural",
"NNS": "noun, plural",
"PDT": "predeterminer",
"POS": "possessive ending",
"PRP": "pronoun, personal",
"PRP$": "pronoun, possessive",
"RB": "adverb",
"RBR": "adverb, comparative",
"RBS": "adverb, superlative",
"RP": "adverb, particle",
"TO": "infinitival to",
"UH": "interjection",
"VB": "verb, base form",
"VBD": "verb, past tense",
"VBG": "verb, gerund or present participle",
"VBN": "verb, past participle",
"VBP": "verb, non-3rd person singular present",
"VBZ": "verb, 3rd person singular present",
"WDT": "wh-determiner",
"WP": "wh-pronoun, personal",
"WP$": "wh-pronoun, possessive",
"WRB": "wh-adverb",
"SP": "space",
"ADD": "email",
"NFP": "superfluous punctuation",
"GW": "additional word in multi-word expression",
"XX": "unknown",
"BES": 'auxiliary "be"',
"HVS": 'forms of "have"',
# POS Tags (German)
# TIGER Treebank
# http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/annotation/tiger_introduction.pdf
'$(': 'other sentence-internal punctuation mark',
'$,': 'comma',
'$.': 'sentence-final punctuation mark',
'ADJA': 'adjective, attributive',
'ADJD': 'adjective, adverbial or predicative',
'APPO': 'postposition',
'APPR': 'preposition; circumposition left',
'APPRART': 'preposition with article',
'APZR': 'circumposition right',
'ART': 'definite or indefinite article',
'CARD': 'cardinal number',
'FM': 'foreign language material',
'ITJ': 'interjection',
'KOKOM': 'comparative conjunction',
'KON': 'coordinate conjunction',
'KOUI': 'subordinate conjunction with "zu" and infinitive',
'KOUS': 'subordinate conjunction with sentence',
'NE': 'proper noun',
'NNE': 'proper noun',
'PAV': 'pronominal adverb',
'PROAV': 'pronominal adverb',
'PDAT': 'attributive demonstrative pronoun',
'PDS': 'substituting demonstrative pronoun',
'PIAT': 'attributive indefinite pronoun without determiner',
'PIDAT': 'attributive indefinite pronoun with determiner',
'PIS': 'substituting indefinite pronoun',
'PPER': 'non-reflexive personal pronoun',
'PPOSAT': 'attributive possessive pronoun',
'PPOSS': 'substituting possessive pronoun',
'PRELAT': 'attributive relative pronoun',
'PRELS': 'substituting relative pronoun',
'PRF': 'reflexive personal pronoun',
'PTKA': 'particle with adjective or adverb',
'PTKANT': 'answer particle',
'PTKNEG': 'negative particle',
'PTKVZ': 'separable verbal particle',
'PTKZU': '"zu" before infinitive',
'PWAT': 'attributive interrogative pronoun',
'PWAV': 'adverbial interrogative or relative pronoun',
'PWS': 'substituting interrogative pronoun',
'TRUNC': 'word remnant',
'VAFIN': 'finite verb, auxiliary',
'VAIMP': 'imperative, auxiliary',
'VAINF': 'infinitive, auxiliary',
'VAPP': 'perfect participle, auxiliary',
'VMFIN': 'finite verb, modal',
'VMINF': 'infinitive, modal',
'VMPP': 'perfect participle, modal',
'VVFIN': 'finite verb, full',
'VVIMP': 'imperative, full',
'VVINF': 'infinitive, full',
'VVIZU': 'infinitive with "zu", full',
'VVPP': 'perfect participle, full',
'XY': 'non-word containing non-letter',
"$(": "other sentence-internal punctuation mark",
"$,": "comma",
"$.": "sentence-final punctuation mark",
"ADJA": "adjective, attributive",
"ADJD": "adjective, adverbial or predicative",
"APPO": "postposition",
"APPR": "preposition; circumposition left",
"APPRART": "preposition with article",
"APZR": "circumposition right",
"ART": "definite or indefinite article",
"CARD": "cardinal number",
"FM": "foreign language material",
"ITJ": "interjection",
"KOKOM": "comparative conjunction",
"KON": "coordinate conjunction",
"KOUI": 'subordinate conjunction with "zu" and infinitive',
"KOUS": "subordinate conjunction with sentence",
"NE": "proper noun",
"NNE": "proper noun",
"PAV": "pronominal adverb",
"PROAV": "pronominal adverb",
"PDAT": "attributive demonstrative pronoun",
"PDS": "substituting demonstrative pronoun",
"PIAT": "attributive indefinite pronoun without determiner",
"PIDAT": "attributive indefinite pronoun with determiner",
"PIS": "substituting indefinite pronoun",
"PPER": "non-reflexive personal pronoun",
"PPOSAT": "attributive possessive pronoun",
"PPOSS": "substituting possessive pronoun",
"PRELAT": "attributive relative pronoun",
"PRELS": "substituting relative pronoun",
"PRF": "reflexive personal pronoun",
"PTKA": "particle with adjective or adverb",
"PTKANT": "answer particle",
"PTKNEG": "negative particle",
"PTKVZ": "separable verbal particle",
"PTKZU": '"zu" before infinitive',
"PWAT": "attributive interrogative pronoun",
"PWAV": "adverbial interrogative or relative pronoun",
"PWS": "substituting interrogative pronoun",
"TRUNC": "word remnant",
"VAFIN": "finite verb, auxiliary",
"VAIMP": "imperative, auxiliary",
"VAINF": "infinitive, auxiliary",
"VAPP": "perfect participle, auxiliary",
"VMFIN": "finite verb, modal",
"VMINF": "infinitive, modal",
"VMPP": "perfect participle, modal",
"VVFIN": "finite verb, full",
"VVIMP": "imperative, full",
"VVINF": "infinitive, full",
"VVIZU": 'infinitive with "zu", full',
"VVPP": "perfect participle, full",
"XY": "non-word containing non-letter",
# Noun chunks
'NP': 'noun phrase',
'PP': 'prepositional phrase',
'VP': 'verb phrase',
'ADVP': 'adverb phrase',
'ADJP': 'adjective phrase',
'SBAR': 'subordinating conjunction',
'PRT': 'particle',
'PNP': 'prepositional noun phrase',
"NP": "noun phrase",
"PP": "prepositional phrase",
"VP": "verb phrase",
"ADVP": "adverb phrase",
"ADJP": "adjective phrase",
"SBAR": "subordinating conjunction",
"PRT": "particle",
"PNP": "prepositional noun phrase",
# Dependency Labels (English)
# ClearNLP / Universal Dependencies
# https://github.com/clir/clearnlp-guidelines/blob/master/md/specifications/dependency_labels.md
'acomp': 'adjectival complement',
'advcl': 'adverbial clause modifier',
'advmod': 'adverbial modifier',
'agent': 'agent',
'amod': 'adjectival modifier',
'appos': 'appositional modifier',
'attr': 'attribute',
'aux': 'auxiliary',
'auxpass': 'auxiliary (passive)',
'cc': 'coordinating conjunction',
'ccomp': 'clausal complement',
'complm': 'complementizer',
'conj': 'conjunct',
'cop': 'copula',
'csubj': 'clausal subject',
'csubjpass': 'clausal subject (passive)',
'dep': 'unclassified dependent',
'det': 'determiner',
'dobj': 'direct object',
'expl': 'expletive',
'hmod': 'modifier in hyphenation',
'hyph': 'hyphen',
'infmod': 'infinitival modifier',
'intj': 'interjection',
'iobj': 'indirect object',
'mark': 'marker',
'meta': 'meta modifier',
'neg': 'negation modifier',
'nmod': 'modifier of nominal',
'nn': 'noun compound modifier',
'npadvmod': 'noun phrase as adverbial modifier',
'nsubj': 'nominal subject',
'nsubjpass': 'nominal subject (passive)',
'num': 'number modifier',
'number': 'number compound modifier',
'oprd': 'object predicate',
'obj': 'object',
'obl': 'oblique nominal',
'parataxis': 'parataxis',
'partmod': 'participal modifier',
'pcomp': 'complement of preposition',
'pobj': 'object of preposition',
'poss': 'possession modifier',
'possessive': 'possessive modifier',
'preconj': 'pre-correlative conjunction',
'prep': 'prepositional modifier',
'prt': 'particle',
'punct': 'punctuation',
'quantmod': 'modifier of quantifier',
'rcmod': 'relative clause modifier',
'root': 'root',
'xcomp': 'open clausal complement',
"acomp": "adjectival complement",
"advcl": "adverbial clause modifier",
"advmod": "adverbial modifier",
"agent": "agent",
"amod": "adjectival modifier",
"appos": "appositional modifier",
"attr": "attribute",
"aux": "auxiliary",
"auxpass": "auxiliary (passive)",
"cc": "coordinating conjunction",
"ccomp": "clausal complement",
"complm": "complementizer",
"conj": "conjunct",
"cop": "copula",
"csubj": "clausal subject",
"csubjpass": "clausal subject (passive)",
"dep": "unclassified dependent",
"det": "determiner",
"dobj": "direct object",
"expl": "expletive",
"hmod": "modifier in hyphenation",
"hyph": "hyphen",
"infmod": "infinitival modifier",
"intj": "interjection",
"iobj": "indirect object",
"mark": "marker",
"meta": "meta modifier",
"neg": "negation modifier",
"nmod": "modifier of nominal",
"nn": "noun compound modifier",
"npadvmod": "noun phrase as adverbial modifier",
"nsubj": "nominal subject",
"nsubjpass": "nominal subject (passive)",
"num": "number modifier",
"number": "number compound modifier",
"oprd": "object predicate",
"obj": "object",
"obl": "oblique nominal",
"parataxis": "parataxis",
"partmod": "participal modifier",
"pcomp": "complement of preposition",
"pobj": "object of preposition",
"poss": "possession modifier",
"possessive": "possessive modifier",
"preconj": "pre-correlative conjunction",
"prep": "prepositional modifier",
"prt": "particle",
"punct": "punctuation",
"quantmod": "modifier of quantifier",
"rcmod": "relative clause modifier",
"root": "root",
"xcomp": "open clausal complement",
# Dependency labels (German)
# TIGER Treebank
# http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/annotation/tiger_introduction.pdf
# currently missing: 'cc' (comparative complement) because of conflict
# with English labels
'ac': 'adpositional case marker',
'adc': 'adjective component',
'ag': 'genitive attribute',
'ams': 'measure argument of adjective',
'app': 'apposition',
'avc': 'adverbial phrase component',
'cd': 'coordinating conjunction',
'cj': 'conjunct',
'cm': 'comparative conjunction',
'cp': 'complementizer',
'cvc': 'collocational verb construction',
'da': 'dative',
'dh': 'discourse-level head',
'dm': 'discourse marker',
'ep': 'expletive es',
'hd': 'head',
'ju': 'junctor',
'mnr': 'postnominal modifier',
'mo': 'modifier',
'ng': 'negation',
'nk': 'noun kernel element',
'nmc': 'numerical component',
'oa': 'accusative object',
'oc': 'clausal object',
'og': 'genitive object',
'op': 'prepositional object',
'par': 'parenthetical element',
'pd': 'predicate',
'pg': 'phrasal genitive',
'ph': 'placeholder',
'pm': 'morphological particle',
'pnc': 'proper noun component',
'rc': 'relative clause',
're': 'repeated element',
'rs': 'reported speech',
'sb': 'subject',
"ac": "adpositional case marker",
"adc": "adjective component",
"ag": "genitive attribute",
"ams": "measure argument of adjective",
"app": "apposition",
"avc": "adverbial phrase component",
"cd": "coordinating conjunction",
"cj": "conjunct",
"cm": "comparative conjunction",
"cp": "complementizer",
"cvc": "collocational verb construction",
"da": "dative",
"dh": "discourse-level head",
"dm": "discourse marker",
"ep": "expletive es",
"hd": "head",
"ju": "junctor",
"mnr": "postnominal modifier",
"mo": "modifier",
"ng": "negation",
"nk": "noun kernel element",
"nmc": "numerical component",
"oa": "accusative object",
"oc": "clausal object",
"og": "genitive object",
"op": "prepositional object",
"par": "parenthetical element",
"pd": "predicate",
"pg": "phrasal genitive",
"ph": "placeholder",
"pm": "morphological particle",
"pnc": "proper noun component",
"rc": "relative clause",
"re": "repeated element",
"rs": "reported speech",
"sb": "subject",
# Named Entity Recognition
# OntoNotes 5
# https://catalog.ldc.upenn.edu/docs/LDC2013T19/OntoNotes-Release-5.0.pdf
'PERSON': 'People, including fictional',
'NORP': 'Nationalities or religious or political groups',
'FACILITY': 'Buildings, airports, highways, bridges, etc.',
'FAC': 'Buildings, airports, highways, bridges, etc.',
'ORG': 'Companies, agencies, institutions, etc.',
'GPE': 'Countries, cities, states',
'LOC': 'Non-GPE locations, mountain ranges, bodies of water',
'PRODUCT': 'Objects, vehicles, foods, etc. (not services)',
'EVENT': 'Named hurricanes, battles, wars, sports events, etc.',
'WORK_OF_ART': 'Titles of books, songs, etc.',
'LAW': 'Named documents made into laws.',
'LANGUAGE': 'Any named language',
'DATE': 'Absolute or relative dates or periods',
'TIME': 'Times smaller than a day',
'PERCENT': 'Percentage, including "%"',
'MONEY': 'Monetary values, including unit',
'QUANTITY': 'Measurements, as of weight or distance',
'ORDINAL': '"first", "second", etc.',
'CARDINAL': 'Numerals that do not fall under another type',
"PERSON": "People, including fictional",
"NORP": "Nationalities or religious or political groups",
"FACILITY": "Buildings, airports, highways, bridges, etc.",
"FAC": "Buildings, airports, highways, bridges, etc.",
"ORG": "Companies, agencies, institutions, etc.",
"GPE": "Countries, cities, states",
"LOC": "Non-GPE locations, mountain ranges, bodies of water",
"PRODUCT": "Objects, vehicles, foods, etc. (not services)",
"EVENT": "Named hurricanes, battles, wars, sports events, etc.",
"WORK_OF_ART": "Titles of books, songs, etc.",
"LAW": "Named documents made into laws.",
"LANGUAGE": "Any named language",
"DATE": "Absolute or relative dates or periods",
"TIME": "Times smaller than a day",
"PERCENT": 'Percentage, including "%"',
"MONEY": "Monetary values, including unit",
"QUANTITY": "Measurements, as of weight or distance",
"ORDINAL": '"first", "second", etc.',
"CARDINAL": "Numerals that do not fall under another type",
# Named Entity Recognition
# Wikipedia
# http://www.sciencedirect.com/science/article/pii/S0004370212000276
# https://pdfs.semanticscholar.org/5744/578cc243d92287f47448870bb426c66cc941.pdf
'PER': 'Named person or family.',
'MISC': ('Miscellaneous entities, e.g. events, nationalities, '
'products or works of art'),
"PER": "Named person or family.",
"MISC": "Miscellaneous entities, e.g. events, nationalities, products or works of art",
}
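For orientation (not part of the diff): the mapping above is the data behind `spacy.explain()`. A minimal sketch of how the reformatted entries are looked up, assuming the dict is exposed as `GLOSSARY` in `spacy.glossary` as in the released package:

```python
# Minimal sketch: looking up the glossary entries above via spacy.explain().
# Assumes the dict is exposed as spacy.glossary.GLOSSARY, as in released spaCy.
import spacy
from spacy.glossary import GLOSSARY

print(spacy.explain("VBZ"))     # verb, 3rd person singular present
print(spacy.explain("PWS"))     # substituting interrogative pronoun
print(GLOSSARY.get("acomp"))    # adjectival complement
```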

View File

@ -16,16 +16,18 @@ from ...util import update_exc, add_lookups
class ArabicDefaults(Language.Defaults):
lex_attr_getters = dict(Language.Defaults.lex_attr_getters)
lex_attr_getters.update(LEX_ATTRS)
lex_attr_getters[LANG] = lambda text: 'ar'
lex_attr_getters[NORM] = add_lookups(Language.Defaults.lex_attr_getters[NORM], BASE_NORMS)
lex_attr_getters[LANG] = lambda text: "ar"
lex_attr_getters[NORM] = add_lookups(
Language.Defaults.lex_attr_getters[NORM], BASE_NORMS
)
tokenizer_exceptions = update_exc(BASE_EXCEPTIONS, TOKENIZER_EXCEPTIONS)
stop_words = STOP_WORDS
suffixes = TOKENIZER_SUFFIXES
class Arabic(Language):
lang = 'ar'
lang = "ar"
Defaults = ArabicDefaults
__all__ = ['Arabic']
__all__ = ["Arabic"]

View File

@ -10,11 +10,11 @@ Example sentences to test spaCy and its language models.
sentences = [
"نال الكاتب خالد توفيق جائزة الرواية العربية في معرض الشارقة الدولي للكتاب",
"أين تقع دمشق ؟"
"أين تقع دمشق ؟",
"كيف حالك ؟",
"هل يمكن ان نلتقي على الساعة الثانية عشرة ظهرا ؟",
"ماهي أبرز التطورات السياسية، الأمنية والاجتماعية في العالم ؟",
"هل بالإمكان أن نلتقي غدا؟",
"هناك نحو 382 مليون شخص مصاب بداء السكَّري في العالم",
"كشفت دراسة حديثة أن الخيل تقرأ تعبيرات الوجه وتستطيع أن تتذكر مشاعر الناس وعواطفهم"
"كشفت دراسة حديثة أن الخيل تقرأ تعبيرات الوجه وتستطيع أن تتذكر مشاعر الناس وعواطفهم",
]

View File

@ -2,7 +2,8 @@
from __future__ import unicode_literals
from ...attrs import LIKE_NUM
_num_words = set("""
_num_words = set(
"""
صفر
واحد
إثنان
@ -52,9 +53,11 @@ _num_words = set("""
مليون
مليار
مليارات
""".split())
""".split()
)
_ordinal_words = set("""
_ordinal_words = set(
"""
اول
أول
حاد
@ -69,20 +72,21 @@ _ordinal_words = set("""
ثامن
تاسع
عاشر
""".split())
""".split()
)
def like_num(text):
"""
check if text resembles a number
Check if text resembles a number
"""
if text.startswith(('+', '-', '±', '~')):
if text.startswith(("+", "-", "±", "~")):
text = text[1:]
text = text.replace(',', '').replace('.', '')
text = text.replace(",", "").replace(".", "")
if text.isdigit():
return True
if text.count('/') == 1:
num, denom = text.split('/')
if text.count("/") == 1:
num, denom = text.split("/")
if num.isdigit() and denom.isdigit():
return True
if text in _num_words:
@ -92,6 +96,4 @@ def like_num(text):
return False
LEX_ATTRS = {
LIKE_NUM: like_num
}
LEX_ATTRS = {LIKE_NUM: like_num}
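A quick behavioural sketch of the reformatted `like_num` above (not part of the commit; assumes the module path `spacy.lang.ar.lex_attrs` imported by the `__init__.py` shown earlier):

```python
# Behavioural sketch of the like_num logic shown above.
# Assumes it is importable as spacy.lang.ar.lex_attrs.like_num.
from spacy.lang.ar.lex_attrs import like_num

print(like_num("10"))      # True: plain digits
print(like_num("+3,5"))    # True: leading sign stripped, separators removed
print(like_num("1/2"))     # True: numerator and denominator are digits
print(like_num("مليون"))   # True: listed in _num_words ("million")
print(like_num("دمشق"))    # False: not a number word
```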

View File

@ -1,15 +1,20 @@
# coding: utf8
from __future__ import unicode_literals
from ..punctuation import TOKENIZER_INFIXES
from ..char_classes import LIST_PUNCT, LIST_ELLIPSES, LIST_QUOTES, CURRENCY
from ..char_classes import QUOTES, UNITS, ALPHA, ALPHA_LOWER, ALPHA_UPPER
from ..char_classes import UNITS, ALPHA_UPPER
_suffixes = (LIST_PUNCT + LIST_ELLIPSES + LIST_QUOTES +
[r'(?<=[0-9])\+',
_suffixes = (
LIST_PUNCT
+ LIST_ELLIPSES
+ LIST_QUOTES
+ [
r"(?<=[0-9])\+",
# Arabic is written from Right-To-Left
r'(?<=[0-9])(?:{})'.format(CURRENCY),
r'(?<=[0-9])(?:{})'.format(UNITS),
r'(?<=[{au}][{au}])\.'.format(au=ALPHA_UPPER)])
r"(?<=[0-9])(?:{})".format(CURRENCY),
r"(?<=[0-9])(?:{})".format(UNITS),
r"(?<=[{au}][{au}])\.".format(au=ALPHA_UPPER),
]
)
TOKENIZER_SUFFIXES = _suffixes
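For context (not in the diff): the reformatted suffix list is what gets compiled into the tokenizer's suffix pattern. A small sketch using `spacy.util.compile_suffix_regex`, the helper spaCy uses for this step:

```python
# Sketch: compiling the suffix entries above into a suffix-matching regex.
from spacy.lang.ar.punctuation import TOKENIZER_SUFFIXES
from spacy.util import compile_suffix_regex

suffix_regex = compile_suffix_regex(TOKENIZER_SUFFIXES)
# A unit or currency symbol directly after a digit is matched as a suffix:
print(suffix_regex.search("30km").group())   # 'km'
print(suffix_regex.search("100$").group())   # '$'
```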

View File

@ -1,7 +1,8 @@
# coding: utf8
from __future__ import unicode_literals
STOP_WORDS = set("""
STOP_WORDS = set(
"""
من
نحو
لعل
@ -388,4 +389,5 @@ STOP_WORDS = set("""
وإن
ولو
يا
""".split())
""".split()
)

View File

@ -1,21 +1,23 @@
# coding: utf8
from __future__ import unicode_literals
from ...symbols import ORTH, LEMMA, TAG, NORM, PRON_LEMMA
import re
from ...symbols import ORTH, LEMMA
_exc = {}
# time
# Time
for exc_data in [
{LEMMA: "قبل الميلاد", ORTH: "ق.م"},
{LEMMA: "بعد الميلاد", ORTH: "ب. م"},
{LEMMA: "ميلادي", ORTH: ""},
{LEMMA: "هجري", ORTH: ".هـ"},
{LEMMA: "توفي", ORTH: ""}]:
{LEMMA: "توفي", ORTH: ""},
]:
_exc[exc_data[ORTH]] = [exc_data]
# scientific abv.
# Scientific abv.
for exc_data in [
{LEMMA: "صلى الله عليه وسلم", ORTH: "صلعم"},
{LEMMA: "الشارح", ORTH: "الشـ"},
@ -28,20 +30,20 @@ for exc_data in [
{LEMMA: "أنبأنا", ORTH: "أنا"},
{LEMMA: "أخبرنا", ORTH: "نا"},
{LEMMA: "مصدر سابق", ORTH: "م. س"},
{LEMMA: "مصدر نفسه", ORTH: "م. ن"}]:
{LEMMA: "مصدر نفسه", ORTH: "م. ن"},
]:
_exc[exc_data[ORTH]] = [exc_data]
# other abv.
# Other abv.
for exc_data in [
{LEMMA: "دكتور", ORTH: "د."},
{LEMMA: "أستاذ دكتور", ORTH: "أ.د"},
{LEMMA: "أستاذ", ORTH: "أ."},
{LEMMA: "بروفيسور", ORTH: "ب."}]:
{LEMMA: "بروفيسور", ORTH: "ب."},
]:
_exc[exc_data[ORTH]] = [exc_data]
for exc_data in [
{LEMMA: "تلفون", ORTH: "ت."},
{LEMMA: "صندوق بريد", ORTH: "ص.ب"}]:
for exc_data in [{LEMMA: "تلفون", ORTH: "ت."}, {LEMMA: "صندوق بريد", ORTH: "ص.ب"}]:
_exc[exc_data[ORTH]] = [exc_data]
TOKENIZER_EXCEPTIONS = _exc
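To make the structure concrete (a sketch, not part of the diff): each key above is a surface form the tokenizer must keep intact, mapped to a list of token dicts, here a single token with a lemma override.

```python
# Sketch of the data structure built above (dict keys are symbol IDs).
from spacy.symbols import ORTH, LEMMA
from spacy.lang.ar.tokenizer_exceptions import TOKENIZER_EXCEPTIONS

entry = TOKENIZER_EXCEPTIONS["ق.م"]   # the "before Christ" abbreviation above
print(entry[0][ORTH])                 # 'ق.م'
print(entry[0][LEMMA])                # 'قبل الميلاد'
```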

View File

@ -15,7 +15,7 @@ from ...util import update_exc
class BengaliDefaults(Language.Defaults):
lex_attr_getters = dict(Language.Defaults.lex_attr_getters)
lex_attr_getters[LANG] = lambda text: 'bn'
lex_attr_getters[LANG] = lambda text: "bn"
tokenizer_exceptions = update_exc(BASE_EXCEPTIONS, TOKENIZER_EXCEPTIONS)
tag_map = TAG_MAP
stop_words = STOP_WORDS
@ -26,8 +26,8 @@ class BengaliDefaults(Language.Defaults):
class Bengali(Language):
lang = 'bn'
lang = "bn"
Defaults = BengaliDefaults
__all__ = ['Bengali']
__all__ = ["Bengali"]

View File

@ -13,11 +13,9 @@ LEMMA_RULES = {
["গাছা", ""],
["গাছি", ""],
["ছড়া", ""],
["কে", ""],
["", ""],
["তে", ""],
["", ""],
["রা", ""],
["রে", ""],
@ -28,7 +26,6 @@ LEMMA_RULES = {
["গুলা", ""],
["গুলো", ""],
["গুলি", ""],
["কুল", ""],
["গণ", ""],
["দল", ""],
@ -45,7 +42,6 @@ LEMMA_RULES = {
["সকল", ""],
["মহল", ""],
["াবলি", ""], # আবলি
# Bengali digit representations
["", "0"],
["", "1"],
@ -58,11 +54,5 @@ LEMMA_RULES = {
["", "8"],
["", "9"],
],
"punct": [
["", "\""],
["", "\""],
["\u2018", "'"],
["\u2019", "'"]
]
"punct": [["", '"'], ["", '"'], ["\u2018", "'"], ["\u2019", "'"]],
}

View File

@ -6,63 +6,252 @@ from ...symbols import LEMMA, PRON_LEMMA
MORPH_RULES = {
"PRP": {
'': {LEMMA: PRON_LEMMA, 'PronType': 'Dem'},
'আমাকে': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Person': 'One', 'PronType': 'Prs', 'Case': 'Acc'},
'কি': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Gender': 'Neut', 'PronType': 'Int', 'Case': 'Acc'},
'সে': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Person': 'Three', 'PronType': 'Prs', 'Case': 'Nom'},
'কিসে': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Gender': 'Neut', 'PronType': 'Int', 'Case': 'Acc'},
'তাকে': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Person': 'Three', 'PronType': 'Prs', 'Case': 'Acc'},
'স্বয়ং': {LEMMA: PRON_LEMMA, 'Reflex': 'Yes', 'PronType': 'Ref'},
'কোনগুলো': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'Gender': 'Neut', 'PronType': 'Int', 'Case': 'Acc'},
'তুমি': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Person': 'Two', 'PronType': 'Prs', 'Case': 'Nom'},
'তুই': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Person': 'Two', 'PronType': 'Prs', 'Case': 'Nom'},
'তাদেরকে': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'Person': 'Three', 'PronType': 'Prs', 'Case': 'Acc'},
'আমরা': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'Person': 'One ', 'PronType': 'Prs', 'Case': 'Nom'},
'যিনি': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'PronType': 'Rel', 'Case': 'Nom'},
'আমাদেরকে': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'Person': 'One', 'PronType': 'Prs', 'Case': 'Acc'},
'কোন': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'PronType': 'Int', 'Case': 'Acc'},
'কারা': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'PronType': 'Int', 'Case': 'Acc'},
'তোমাকে': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Person': 'Two', 'PronType': 'Prs', 'Case': 'Acc'},
'তোকে': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Person': 'Two', 'PronType': 'Prs', 'Case': 'Acc'},
'খোদ': {LEMMA: PRON_LEMMA, 'Reflex': 'Yes', 'PronType': 'Ref'},
'কে': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'PronType': 'Int', 'Case': 'Acc'},
'যারা': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'PronType': 'Rel', 'Case': 'Nom'},
'যে': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'PronType': 'Rel', 'Case': 'Nom'},
'তোমরা': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'Person': 'Two', 'PronType': 'Prs', 'Case': 'Nom'},
'তোরা': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'Person': 'Two', 'PronType': 'Prs', 'Case': 'Nom'},
'তোমাদেরকে': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'Person': 'Two', 'PronType': 'Prs', 'Case': 'Acc'},
'তোদেরকে': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'Person': 'Two', 'PronType': 'Prs', 'Case': 'Acc'},
'আপন': {LEMMA: PRON_LEMMA, 'Reflex': 'Yes', 'PronType': 'Ref'},
'': {LEMMA: PRON_LEMMA, 'PronType': 'Dem'},
'নিজ': {LEMMA: PRON_LEMMA, 'Reflex': 'Yes', 'PronType': 'Ref'},
'কার': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'PronType': 'Int', 'Case': 'Acc'},
'যা': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Gender': 'Neut', 'PronType': 'Rel', 'Case': 'Nom'},
'তারা': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'Person': 'Three', 'PronType': 'Prs', 'Case': 'Nom'},
'আমি': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Person': 'One', 'PronType': 'Prs', 'Case': 'Nom'}
"": {LEMMA: PRON_LEMMA, "PronType": "Dem"},
"আমাকে": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Person": "One",
"PronType": "Prs",
"Case": "Acc",
},
"কি": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Gender": "Neut",
"PronType": "Int",
"Case": "Acc",
},
"সে": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Person": "Three",
"PronType": "Prs",
"Case": "Nom",
},
"কিসে": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Gender": "Neut",
"PronType": "Int",
"Case": "Acc",
},
"তাকে": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Person": "Three",
"PronType": "Prs",
"Case": "Acc",
},
"স্বয়ং": {LEMMA: PRON_LEMMA, "Reflex": "Yes", "PronType": "Ref"},
"কোনগুলো": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"Gender": "Neut",
"PronType": "Int",
"Case": "Acc",
},
"তুমি": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Person": "Two",
"PronType": "Prs",
"Case": "Nom",
},
"তুই": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Person": "Two",
"PronType": "Prs",
"Case": "Nom",
},
"তাদেরকে": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"Person": "Three",
"PronType": "Prs",
"Case": "Acc",
},
"আমরা": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"Person": "One ",
"PronType": "Prs",
"Case": "Nom",
},
"যিনি": {LEMMA: PRON_LEMMA, "Number": "Sing", "PronType": "Rel", "Case": "Nom"},
"আমাদেরকে": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"Person": "One",
"PronType": "Prs",
"Case": "Acc",
},
"কোন": {LEMMA: PRON_LEMMA, "Number": "Sing", "PronType": "Int", "Case": "Acc"},
"কারা": {LEMMA: PRON_LEMMA, "Number": "Plur", "PronType": "Int", "Case": "Acc"},
"তোমাকে": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Person": "Two",
"PronType": "Prs",
"Case": "Acc",
},
"তোকে": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Person": "Two",
"PronType": "Prs",
"Case": "Acc",
},
"খোদ": {LEMMA: PRON_LEMMA, "Reflex": "Yes", "PronType": "Ref"},
"কে": {LEMMA: PRON_LEMMA, "Number": "Sing", "PronType": "Int", "Case": "Acc"},
"যারা": {LEMMA: PRON_LEMMA, "Number": "Plur", "PronType": "Rel", "Case": "Nom"},
"যে": {LEMMA: PRON_LEMMA, "Number": "Sing", "PronType": "Rel", "Case": "Nom"},
"তোমরা": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"Person": "Two",
"PronType": "Prs",
"Case": "Nom",
},
"তোরা": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"Person": "Two",
"PronType": "Prs",
"Case": "Nom",
},
"তোমাদেরকে": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"Person": "Two",
"PronType": "Prs",
"Case": "Acc",
},
"তোদেরকে": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"Person": "Two",
"PronType": "Prs",
"Case": "Acc",
},
"আপন": {LEMMA: PRON_LEMMA, "Reflex": "Yes", "PronType": "Ref"},
"": {LEMMA: PRON_LEMMA, "PronType": "Dem"},
"নিজ": {LEMMA: PRON_LEMMA, "Reflex": "Yes", "PronType": "Ref"},
"কার": {LEMMA: PRON_LEMMA, "Number": "Sing", "PronType": "Int", "Case": "Acc"},
"যা": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Gender": "Neut",
"PronType": "Rel",
"Case": "Nom",
},
"তারা": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"Person": "Three",
"PronType": "Prs",
"Case": "Nom",
},
"আমি": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Person": "One",
"PronType": "Prs",
"Case": "Nom",
},
},
"PRP$": {
'আমার': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Person': 'One', 'PronType': 'Prs', 'Poss': 'Yes',
'Case': 'Nom'},
'মোর': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Person': 'One', 'PronType': 'Prs', 'Poss': 'Yes',
'Case': 'Nom'},
'মোদের': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'Person': 'One', 'PronType': 'Prs', 'Poss': 'Yes',
'Case': 'Nom'},
'তার': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Person': 'Three', 'PronType': 'Prs', 'Poss': 'Yes',
'Case': 'Nom'},
'তোমাদের': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'Person': 'Two', 'PronType': 'Prs', 'Poss': 'Yes',
'Case': 'Nom'},
'আমাদের': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'Person': 'One', 'PronType': 'Prs', 'Poss': 'Yes',
'Case': 'Nom'},
'তোমার': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Person': 'Two', 'PronType': 'Prs', 'Poss': 'Yes',
'Case': 'Nom'},
'তোর': {LEMMA: PRON_LEMMA, 'Number': 'Sing', 'Person': 'Two', 'PronType': 'Prs', 'Poss': 'Yes',
'Case': 'Nom'},
'তাদের': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'Person': 'Three', 'PronType': 'Prs', 'Poss': 'Yes',
'Case': 'Nom'},
'কাদের': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'PronType': 'Int', 'Case': 'Acc'},
'তোদের': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'Person': 'Two', 'PronType': 'Prs', 'Poss': 'Yes',
'Case': 'Nom'},
'যাদের': {LEMMA: PRON_LEMMA, 'Number': 'Plur', 'PronType': 'Int', 'Case': 'Acc'},
}
"আমার": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Person": "One",
"PronType": "Prs",
"Poss": "Yes",
"Case": "Nom",
},
"মোর": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Person": "One",
"PronType": "Prs",
"Poss": "Yes",
"Case": "Nom",
},
"মোদের": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"Person": "One",
"PronType": "Prs",
"Poss": "Yes",
"Case": "Nom",
},
"তার": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Person": "Three",
"PronType": "Prs",
"Poss": "Yes",
"Case": "Nom",
},
"তোমাদের": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"Person": "Two",
"PronType": "Prs",
"Poss": "Yes",
"Case": "Nom",
},
"আমাদের": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"Person": "One",
"PronType": "Prs",
"Poss": "Yes",
"Case": "Nom",
},
"তোমার": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Person": "Two",
"PronType": "Prs",
"Poss": "Yes",
"Case": "Nom",
},
"তোর": {
LEMMA: PRON_LEMMA,
"Number": "Sing",
"Person": "Two",
"PronType": "Prs",
"Poss": "Yes",
"Case": "Nom",
},
"তাদের": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"Person": "Three",
"PronType": "Prs",
"Poss": "Yes",
"Case": "Nom",
},
"কাদের": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"PronType": "Int",
"Case": "Acc",
},
"তোদের": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"Person": "Two",
"PronType": "Prs",
"Poss": "Yes",
"Case": "Nom",
},
"যাদের": {
LEMMA: PRON_LEMMA,
"Number": "Plur",
"PronType": "Int",
"Case": "Acc",
},
},
}

View File

@ -2,29 +2,45 @@
from __future__ import unicode_literals
from ..char_classes import LIST_PUNCT, LIST_ELLIPSES, LIST_QUOTES, LIST_ICONS
from ..char_classes import ALPHA_LOWER, ALPHA_UPPER, ALPHA, HYPHENS, QUOTES, UNITS
from ..char_classes import ALPHA_LOWER, ALPHA, HYPHENS, QUOTES, UNITS
_currency = r"\$|¢|£|€|¥|฿|৳"
_quotes = QUOTES.replace("'", '')
_list_punct = LIST_PUNCT + '। ॥'.strip().split()
_quotes = QUOTES.replace("'", "")
_list_punct = LIST_PUNCT + "। ॥".strip().split()
_prefixes = ([r'\+'] + _list_punct + LIST_ELLIPSES + LIST_QUOTES + LIST_ICONS)
_prefixes = [r"\+"] + _list_punct + LIST_ELLIPSES + LIST_QUOTES + LIST_ICONS
_suffixes = (_list_punct + LIST_ELLIPSES + LIST_QUOTES + LIST_ICONS +
[r'(?<=[0-9])\+',
r'(?<=°[FfCcKk])\.',
r'(?<=[0-9])(?:{})'.format(_currency),
r'(?<=[0-9])(?:{})'.format(UNITS),
r'(?<=[{}(?:{})])\.'.format('|'.join([ALPHA_LOWER, r'%²\-\)\]\+', QUOTES]), _currency)])
_suffixes = (
_list_punct
+ LIST_ELLIPSES
+ LIST_QUOTES
+ LIST_ICONS
+ [
r"(?<=[0-9])\+",
r"(?<=°[FfCcKk])\.",
r"(?<=[0-9])(?:{})".format(_currency),
r"(?<=[0-9])(?:{})".format(UNITS),
r"(?<=[{}(?:{})])\.".format(
"|".join([ALPHA_LOWER, r"%²\-\)\]\+", QUOTES]), _currency
),
]
)
_infixes = (LIST_ELLIPSES + LIST_ICONS +
[r'(?<=[0-9{zero}-{nine}])[+\-\*^=](?=[0-9{zero}-{nine}-])'.format(zero=u'০', nine=u'৯'),
r'(?<=[{a}]),(?=[{a}])'.format(a=ALPHA),
r'(?<=[{a}])[{h}](?={ae})'.format(a=ALPHA, h=HYPHENS, ae=u''),
_infixes = (
LIST_ELLIPSES
+ LIST_ICONS
+ [
r"(?<=[0-9{zero}-{nine}])[+\-\*^=](?=[0-9{zero}-{nine}-])".format(
zero="", nine=""
),
r"(?<=[{a}]),(?=[{a}])".format(a=ALPHA),
r"(?<=[{a}])[{h}](?={ae})".format(a=ALPHA, h=HYPHENS, ae=""),
r'(?<=[{a}])[?";:=,.]*(?:{h})(?=[{a}])'.format(a=ALPHA, h=HYPHENS),
r'(?<=[{a}"])[:<>=/](?=[{a}])'.format(a=ALPHA)])
r'(?<=[{a}"])[:<>=/](?=[{a}])'.format(a=ALPHA),
]
)
TOKENIZER_PREFIXES = _prefixes

View File

@ -2,7 +2,8 @@
from __future__ import unicode_literals
STOP_WORDS = set("""
STOP_WORDS = set(
"""
অতএব অথচ অথব অন অন অন অন অনতত অবধি অবশ অর অন অন অরধভ
আগ আগ আগ আছ আজ আদযভ আপন আপনি আব আমর আম আম আম আমি আর আরও
ইতি ইহ
@ -41,4 +42,5 @@ STOP_WORDS = set("""
রণ মন সঙ সঙ সব সব সমস সমরতি সময় সহ সহি তর ি পষ বয
হইত হইব হইয হওয হওয হওয হচ হত হত হত হন হব হব হয হয হযি হয হয হযি হয
হয় হল হল হল হল হল ি ি হয় হয় হয় হইয় হয়ি হয় হয়নি হয় হয়ত হওয় হওয় হওয়
""".split())
""".split()
)

View File

@ -11,7 +11,7 @@ TAG_MAP = {
"-LRB-": {POS: PUNCT, "PunctType": "brck", "PunctSide": "ini"},
"-RRB-": {POS: PUNCT, "PunctType": "brck", "PunctSide": "fin"},
"``": {POS: PUNCT, "PunctType": "quot", "PunctSide": "ini"},
"\"\"": {POS: PUNCT, "PunctType": "quot", "PunctSide": "fin"},
'""': {POS: PUNCT, "PunctType": "quot", "PunctSide": "fin"},
"''": {POS: PUNCT, "PunctType": "quot", "PunctSide": "fin"},
":": {POS: PUNCT},
"": {POS: SYM, "Other": {"SymType": "currency"}},
@ -42,7 +42,6 @@ TAG_MAP = {
"RBR": {POS: ADV, "Degree": "comp"},
"RBS": {POS: ADV, "Degree": "sup"},
"RP": {POS: PART},
"SYM": {POS: SYM},
"TO": {POS: PART, "PartType": "inf", "VerbForm": "inf"},
"UH": {POS: INTJ},
"VB": {POS: VERB, "VerbForm": "inf"},
@ -50,7 +49,13 @@ TAG_MAP = {
"VBG": {POS: VERB, "VerbForm": "part", "Tense": "pres", "Aspect": "prog"},
"VBN": {POS: VERB, "VerbForm": "part", "Tense": "past", "Aspect": "perf"},
"VBP": {POS: VERB, "VerbForm": "fin", "Tense": "pres"},
"VBZ": {POS: VERB, "VerbForm": "fin", "Tense": "pres", "Number": "sing", "Person": 3},
"VBZ": {
POS: VERB,
"VerbForm": "fin",
"Tense": "pres",
"Number": "sing",
"Person": 3,
},
"WDT": {POS: ADJ, "PronType": "int|rel"},
"WP": {POS: NOUN, "PronType": "int|rel"},
"WP$": {POS: ADJ, "Poss": "yes", "PronType": "int|rel"},

View File

@ -19,7 +19,8 @@ for exc_data in [
{ORTH: "কি.মি", LEMMA: "কিলোমিটার"},
{ORTH: "সে.মি.", LEMMA: "সেন্টিমিটার"},
{ORTH: "সে.মি", LEMMA: "সেন্টিমিটার"},
{ORTH: "মি.লি.", LEMMA: "মিলিলিটার"}]:
{ORTH: "মি.লি.", LEMMA: "মিলিলিটার"},
]:
_exc[exc_data[ORTH]] = [exc_data]

View File

@ -4,13 +4,6 @@ from __future__ import unicode_literals
from .tokenizer_exceptions import TOKENIZER_EXCEPTIONS
from .stop_words import STOP_WORDS
from .lex_attrs import LEX_ATTRS
# uncomment if files are available
# from .norm_exceptions import NORM_EXCEPTIONS
# from .tag_map import TAG_MAP
# from .morph_rules import MORPH_RULES
# uncomment if lookup-based lemmatizer is available
from .lemmatizer import LOOKUP
from ..tokenizer_exceptions import BASE_EXCEPTIONS
@ -19,46 +12,22 @@ from ...language import Language
from ...attrs import LANG, NORM
from ...util import update_exc, add_lookups
# Create a Language subclass
# Documentation: https://spacy.io/docs/usage/adding-languages
# This file should be placed in spacy/lang/ca (ISO code of language).
# Before submitting a pull request, make sure the remove all comments from the
# language data files, and run at least the basic tokenizer tests. Simply add the
# language ID to the list of languages in spacy/tests/conftest.py to include it
# in the basic tokenizer sanity tests. You can optionally add a fixture for the
# language's tokenizer and add more specific tests. For more info, see the
# tests documentation: https://github.com/explosion/spaCy/tree/master/spacy/tests
class CatalanDefaults(Language.Defaults):
lex_attr_getters = dict(Language.Defaults.lex_attr_getters)
lex_attr_getters[LANG] = lambda text: 'ca' # ISO code
# add more norm exception dictionaries here
lex_attr_getters[NORM] = add_lookups(Language.Defaults.lex_attr_getters[NORM], BASE_NORMS)
# overwrite functions for lexical attributes
lex_attr_getters[LANG] = lambda text: "ca"
lex_attr_getters[NORM] = add_lookups(
Language.Defaults.lex_attr_getters[NORM], BASE_NORMS
)
lex_attr_getters.update(LEX_ATTRS)
# add custom tokenizer exceptions to base exceptions
tokenizer_exceptions = update_exc(BASE_EXCEPTIONS, TOKENIZER_EXCEPTIONS)
# add stop words
stop_words = STOP_WORDS
# if available: add tag map
# tag_map = dict(TAG_MAP)
# if available: add morph rules
# morph_rules = dict(MORPH_RULES)
lemma_lookup = LOOKUP
class Catalan(Language):
lang = 'ca' # ISO code
Defaults = CatalanDefaults # set Defaults to custom language defaults
lang = "ca"
Defaults = CatalanDefaults
# set default export this allows the language class to be lazy-loaded
__all__ = ['Catalan']
__all__ = ["Catalan"]

View File

@ -5,7 +5,7 @@ from __future__ import unicode_literals
"""
Example sentences to test spaCy and its language models.
>>> from spacy.lang.es.examples import sentences
>>> from spacy.lang.ca.examples import sentences
>>> docs = nlp.pipe(sentences)
"""

View File

@ -1,33 +1,57 @@
# coding: utf8
from __future__ import unicode_literals
# import the symbols for the attrs you want to overwrite
from ...attrs import LIKE_NUM
# Overwriting functions for lexical attributes
# Documentation: https://localhost:1234/docs/usage/adding-languages#lex-attrs
# Most of these functions, like is_lower or like_url should be language-
# independent. Others, like like_num (which includes both digits and number
# words), requires customisation.
# Example: check if token resembles a number
_num_words = ['zero', 'un', 'dos', 'tres', 'quatre', 'cinc', 'sis', 'set',
'vuit', 'nou', 'deu', 'onze', 'dotze', 'tretze', 'catorze',
'quinze', 'setze', 'disset', 'divuit', 'dinou', 'vint',
'trenta', 'quaranta', 'cinquanta', 'seixanta', 'setanta', 'vuitanta', 'noranta',
'cent', 'mil', 'milió', 'bilió', 'trilió', 'quatrilió',
'gazilió', 'bazilió']
_num_words = [
"zero",
"un",
"dos",
"tres",
"quatre",
"cinc",
"sis",
"set",
"vuit",
"nou",
"deu",
"onze",
"dotze",
"tretze",
"catorze",
"quinze",
"setze",
"disset",
"divuit",
"dinou",
"vint",
"trenta",
"quaranta",
"cinquanta",
"seixanta",
"setanta",
"vuitanta",
"noranta",
"cent",
"mil",
"milió",
"bilió",
"trilió",
"quatrilió",
"gazilió",
"bazilió",
]
def like_num(text):
text = text.replace(',', '').replace('.', '')
if text.startswith(("+", "-", "±", "~")):
text = text[1:]
text = text.replace(",", "").replace(".", "")
if text.isdigit():
return True
if text.count('/') == 1:
num, denom = text.split('/')
if text.count("/") == 1:
num, denom = text.split("/")
if num.isdigit() and denom.isdigit():
return True
if text in _num_words:
@ -35,9 +59,4 @@ def like_num(text):
return False
# Create dictionary of functions to overwrite. The default lex_attr_getters are
# updated with this one, so only the functions defined here are overwritten.
LEX_ATTRS = {
LIKE_NUM: like_num
}
LEX_ATTRS = {LIKE_NUM: like_num}

View File

@ -2,9 +2,8 @@
from __future__ import unicode_literals
# Stop words
STOP_WORDS = set("""
STOP_WORDS = set(
"""
a abans ací ah així això al aleshores algun alguna algunes alguns alhora allà allí allò
als altra altre altres amb ambdues ambdós anar ans apa aquell aquella aquelles aquells
aquest aquesta aquestes aquests aquí
@ -53,4 +52,5 @@ un una unes uns us últim ús
va vaig vam van vas veu vosaltres vostra vostre vostres
""".split())
""".split()
)

View File

@ -5,14 +5,6 @@ from ..symbols import POS, ADV, NOUN, ADP, PRON, SCONJ, PROPN, DET, SYM, INTJ
from ..symbols import PUNCT, NUM, AUX, X, CONJ, ADJ, VERB, PART, SPACE, CCONJ
# Add a tag map
# Documentation: https://spacy.io/docs/usage/adding-languages#tag-map
# Universal Dependencies: http://universaldependencies.org/u/pos/all.html
# The keys of the tag map should be strings in your tag set. The dictionary must
# have an entry POS whose value is one of the Universal Dependencies tags.
# Optionally, you can also include morphological features or other attributes.
TAG_MAP = {
"ADV": {POS: ADV},
"NOUN": {POS: NOUN},
@ -32,5 +24,5 @@ TAG_MAP = {
"ADJ": {POS: ADJ},
"VERB": {POS: VERB},
"PART": {POS: PART},
"SP": {POS: SPACE}
"SP": {POS: SPACE},
}

View File

@ -1,8 +1,7 @@
# coding: utf8
from __future__ import unicode_literals
# import symbols if you need to use more, add them here
from ...symbols import ORTH, LEMMA, TAG, NORM, ADP, DET
from ...symbols import ORTH, LEMMA
_exc = {}
@ -25,27 +24,18 @@ for exc_data in [
{ORTH: "Srta.", LEMMA: "senyoreta"},
{ORTH: "núm", LEMMA: "número"},
{ORTH: "St.", LEMMA: "sant"},
{ORTH: "Sta.", LEMMA: "santa"}]:
{ORTH: "Sta.", LEMMA: "santa"},
]:
_exc[exc_data[ORTH]] = [exc_data]
# Times
_exc["12m."] = [
{ORTH: "12"},
{ORTH: "m.", LEMMA: "p.m."}]
_exc["12m."] = [{ORTH: "12"}, {ORTH: "m.", LEMMA: "p.m."}]
for h in range(1, 12 + 1):
for period in ["a.m.", "am"]:
_exc["%d%s" % (h, period)] = [
{ORTH: "%d" % h},
{ORTH: period, LEMMA: "a.m."}]
_exc["%d%s" % (h, period)] = [{ORTH: "%d" % h}, {ORTH: period, LEMMA: "a.m."}]
for period in ["p.m.", "pm"]:
_exc["%d%s" % (h, period)] = [
{ORTH: "%d" % h},
{ORTH: period, LEMMA: "p.m."}]
_exc["%d%s" % (h, period)] = [{ORTH: "%d" % h}, {ORTH: period, LEMMA: "p.m."}]
# To keep things clean and readable, it's recommended to only declare the
# TOKENIZER_EXCEPTIONS at the bottom:
TOKENIZER_EXCEPTIONS = _exc
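As a concrete illustration (not in the diff), the two loops above generate entries like these, splitting the number from the period token and normalising the lemma:

```python
# Sketch of a few generated entries from the loops above.
# ORTH/LEMMA are shown symbolically in the comments; the real dict keys are symbol IDs.
from spacy.lang.ca.tokenizer_exceptions import TOKENIZER_EXCEPTIONS

print(TOKENIZER_EXCEPTIONS["12m."])   # [{ORTH: "12"}, {ORTH: "m.", LEMMA: "p.m."}]
print(TOKENIZER_EXCEPTIONS["3p.m."])  # [{ORTH: "3"}, {ORTH: "p.m.", LEMMA: "p.m."}]
print(TOKENIZER_EXCEPTIONS["11am"])   # [{ORTH: "11"}, {ORTH: "am", LEMMA: "a.m."}]
```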

View File

@ -4,23 +4,23 @@ from __future__ import unicode_literals
import regex as re
re.DEFAULT_VERSION = re.VERSION1
merge_char_classes = lambda classes: '[{}]'.format('||'.join(classes))
split_chars = lambda char: list(char.strip().split(' '))
merge_chars = lambda char: char.strip().replace(' ', '|')
merge_char_classes = lambda classes: "[{}]".format("||".join(classes))
split_chars = lambda char: list(char.strip().split(" "))
merge_chars = lambda char: char.strip().replace(" ", "|")
_bengali = r'[\p{L}&&\p{Bengali}]'
_hebrew = r'[\p{L}&&\p{Hebrew}]'
_latin_lower = r'[\p{Ll}&&\p{Latin}]'
_latin_upper = r'[\p{Lu}&&\p{Latin}]'
_latin = r'[[\p{Ll}||\p{Lu}]&&\p{Latin}]'
_persian = r'[\p{L}&&\p{Arabic}]'
_russian_lower = r'[ёа-я]'
_russian_upper = r'[ЁА-Я]'
_sinhala = r'[\p{L}&&\p{Sinhala}]'
_tatar_lower = r'[әөүҗңһ]'
_tatar_upper = r'[ӘӨҮҖҢҺ]'
_greek_lower = r'[α-ωάέίόώήύ]'
_greek_upper = r'[Α-ΩΆΈΊΌΏΉΎ]'
_bengali = r"[\p{L}&&\p{Bengali}]"
_hebrew = r"[\p{L}&&\p{Hebrew}]"
_latin_lower = r"[\p{Ll}&&\p{Latin}]"
_latin_upper = r"[\p{Lu}&&\p{Latin}]"
_latin = r"[[\p{Ll}||\p{Lu}]&&\p{Latin}]"
_persian = r"[\p{L}&&\p{Arabic}]"
_russian_lower = r"[ёа-я]"
_russian_upper = r"[ЁА-Я]"
_sinhala = r"[\p{L}&&\p{Sinhala}]"
_tatar_lower = r"[әөүҗңһ]"
_tatar_upper = r"[ӘӨҮҖҢҺ]"
_greek_lower = r"[α-ωάέίόώήύ]"
_greek_upper = r"[Α-ΩΆΈΊΌΏΉΎ]"
_upper = [_latin_upper, _russian_upper, _tatar_upper, _greek_upper]
_lower = [_latin_lower, _russian_lower, _tatar_lower, _greek_lower]
@ -30,23 +30,27 @@ ALPHA = merge_char_classes(_upper + _lower + _uncased)
ALPHA_LOWER = merge_char_classes(_lower + _uncased)
ALPHA_UPPER = merge_char_classes(_upper + _uncased)
_units = ('km km² km³ m m² m³ dm dm² dm³ cm cm² cm³ mm mm² mm³ ha µm nm yd in ft '
'kg g mg µg t lb oz m/s km/h kmh mph hPa Pa mbar mb MB kb KB gb GB tb '
'TB T G M K % км км² км³ м м² м³ дм дм² дм³ см см² см³ мм мм² мм³ нм '
'кг г мг м/с км/ч кПа Па мбар Кб КБ кб Мб МБ мб Гб ГБ гб Тб ТБ тб'
'كم كم² كم³ م م² م³ سم سم² سم³ مم مم² مم³ كم غرام جرام جم كغ ملغ كوب اكواب')
_currency = r'\$ £ € ¥ ฿ US\$ C\$ A\$ ₽ ﷼'
_units = (
"km km² km³ m m² m³ dm dm² dm³ cm cm² cm³ mm mm² mm³ ha µm nm yd in ft "
"kg g mg µg t lb oz m/s km/h kmh mph hPa Pa mbar mb MB kb KB gb GB tb "
"TB T G M K % км км² км³ м м² м³ дм дм² дм³ см см² см³ мм мм² мм³ нм "
"кг г мг м/с км/ч кПа Па мбар Кб КБ кб Мб МБ мб Гб ГБ гб Тб ТБ тб"
"كم كم² كم³ م م² م³ سم سم² سم³ مم مم² مم³ كم غرام جرام جم كغ ملغ كوب اكواب"
)
_currency = r"\$ £ € ¥ ฿ US\$ C\$ A\$ ₽ ﷼"
# These expressions contain various unicode variations, including characters
# used in Chinese (see #1333, #1340, #1351) unless there are cross-language
# conflicts, spaCy's base tokenizer should handle all of those by default
_punct = r'… …… , : ; \! \? ¿ ؟ ¡ \( \) \[ \] \{ \} < > _ # \* & 。 · । ، ؛ ٪'
_punct = (
r"… …… , : ; \! \? ¿ ؟ ¡ \( \) \[ \] \{ \} < > _ # \* & 。 · । ، ؛ ٪"
)
_quotes = r'\' \'\' " ” “ `` ` ´ , „ » « 「 」 『 』 【 】 《 》 〈 〉'
_hyphens = '- — -- --- —— ~'
_hyphens = "- — -- --- —— ~"
# Various symbols like dingbats, but also emoji
# Details: https://www.compart.com/en/unicode/category/So
_other_symbols = r'[\p{So}]'
_other_symbols = r"[\p{So}]"
UNITS = merge_chars(_units)
CURRENCY = merge_chars(_currency)
@ -60,5 +64,5 @@ LIST_CURRENCY = split_chars(_currency)
LIST_QUOTES = split_chars(_quotes)
LIST_PUNCT = split_chars(_punct)
LIST_HYPHENS = split_chars(_hyphens)
LIST_ELLIPSES = [r'\.\.+', '…']
LIST_ELLIPSES = [r"\.\.+", "…"]
LIST_ICONS = [_other_symbols]
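A quick sketch (not in the diff) of what the three helper lambdas at the top of this file produce, which is why the space-separated string literals above are written the way they are:

```python
# Behaviour of the char_classes helpers shown above; the lambdas are copied
# verbatim from the diff so this sketch runs stand-alone.
merge_char_classes = lambda classes: "[{}]".format("||".join(classes))
split_chars = lambda char: list(char.strip().split(" "))
merge_chars = lambda char: char.strip().replace(" ", "|")

print(merge_chars("km km² m/s"))                    # 'km|km²|m/s' (regex alternation)
print(split_chars("… …… , : ;"))                    # ['…', '……', ',', ':', ';']
print(merge_char_classes([r"\p{Lu}", r"\p{Ll}"]))   # '[\p{Lu}||\p{Ll}]' (regex-module set syntax)
```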

View File

@ -20,9 +20,10 @@ from ...util import update_exc, add_lookups
class DanishDefaults(Language.Defaults):
lex_attr_getters = dict(Language.Defaults.lex_attr_getters)
lex_attr_getters.update(LEX_ATTRS)
lex_attr_getters[LANG] = lambda text: 'da'
lex_attr_getters[NORM] = add_lookups(Language.Defaults.lex_attr_getters[NORM],
BASE_NORMS, NORM_EXCEPTIONS)
lex_attr_getters[LANG] = lambda text: "da"
lex_attr_getters[NORM] = add_lookups(
Language.Defaults.lex_attr_getters[NORM], BASE_NORMS, NORM_EXCEPTIONS
)
tokenizer_exceptions = update_exc(BASE_EXCEPTIONS, TOKENIZER_EXCEPTIONS)
morph_rules = MORPH_RULES
infixes = TOKENIZER_INFIXES
@ -33,8 +34,8 @@ class DanishDefaults(Language.Defaults):
class Danish(Language):
lang = 'da'
lang = "da"
Defaults = DanishDefaults
__all__ = ['Danish']
__all__ = ["Danish"]

View File

@ -14,5 +14,5 @@ sentences = [
"Apple overvejer at købe et britisk startup for 1 milliard dollar",
"Selvkørende biler flytter forsikringsansvaret over på producenterne",
"San Francisco overvejer at forbyde udbringningsrobotter på fortov",
"London er en stor by i Storbritannien"
"London er en stor by i Storbritannien",
]

View File

@ -3,8 +3,8 @@ from __future__ import unicode_literals
from ...attrs import LIKE_NUM
# Source http://fjern-uv.dk/tal.php
# Source http://fjern-uv.dk/tal.php
_num_words = """nul
en et to tre fire fem seks syv otte ni ti
elleve tolv tretten fjorten femten seksten sytten atten nitten tyve
@ -19,8 +19,8 @@ enoghalvfems tooghalvfems treoghalvfems fireoghalvfems femoghalvfems seksoghalvf
million milliard billion billiard trillion trilliard
""".split()
# source http://www.duda.dk/video/dansk/grammatik/talord/talord.html
# Source: http://www.duda.dk/video/dansk/grammatik/talord/talord.html
_ordinal_words = """nulte
første anden tredje fjerde femte sjette syvende ottende niende tiende
elfte tolvte trettende fjortende femtende sekstende syttende attende nittende tyvende
@ -33,14 +33,15 @@ enogfirsindstyvende toogfirsindstyvende treogfirsindstyvende fireogfirsindstyven
enoghalvfemsindstyvende tooghalvfemsindstyvende treoghalvfemsindstyvende fireoghalvfemsindstyvende femoghalvfemsindstyvende seksoghalvfemsindstyvende syvoghalvfemsindstyvende otteoghalvfemsindstyvende nioghalvfemsindstyvende
""".split()
def like_num(text):
if text.startswith(('+', '-', '±', '~')):
if text.startswith(("+", "-", "±", "~")):
text = text[1:]
text = text.replace(',', '').replace('.', '')
text = text.replace(",", "").replace(".", "")
if text.isdigit():
return True
if text.count('/') == 1:
num, denom = text.split('/')
if text.count("/") == 1:
num, denom = text.split("/")
if num.isdigit() and denom.isdigit():
return True
if text.lower() in _num_words:
@ -49,6 +50,5 @@ def like_num(text):
return True
return False
LEX_ATTRS = {
LIKE_NUM: like_num
}
LEX_ATTRS = {LIKE_NUM: like_num}

View File

@ -11,53 +11,299 @@ from ...symbols import LEMMA, PRON_LEMMA
MORPH_RULES = {
"PRON": {
"jeg": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Sing", "Case": "Nom", "Gender": "Com"}, # Case=Nom|Gender=Com|Number=Sing|Person=1|PronType=Prs
"mig": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Sing", "Case": "Acc", "Gender": "Com"}, # Case=Acc|Gender=Com|Number=Sing|Person=1|PronType=Prs
"min": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Sing", "Poss": "Yes", "Gender": "Com"}, # Gender=Com|Number=Sing|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs
"mit": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Sing", "Poss": "Yes", "Gender": "Neut"}, # Gender=Neut|Number=Sing|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs
"vor": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Sing", "Poss": "Yes", "Gender": "Com"}, # Gender=Com|Number=Sing|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form
"vort": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Sing", "Poss": "Yes", "Gender": "Neut"}, # Gender=Neut|Number=Sing|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form
"du": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Two", "Number": "Sing", "Case": "Nom", "Gender": "Com"}, # Case=Nom|Gender=Com|Number=Sing|Person=2|PronType=Prs
"dig": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Two", "Number": "Sing", "Case": "Acc", "Gender": "Com"}, # Case=Acc|Gender=Com|Number=Sing|Person=2|PronType=Prs
"din": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Two", "Number": "Sing", "Poss": "Yes", "Gender": "Com"}, # Gender=Com|Number=Sing|Number[psor]=Sing|Person=2|Poss=Yes|PronType=Prs
"dit": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Two", "Number": "Sing", "Poss": "Yes", "Gender": "Neut"}, # Gender=Neut|Number=Sing|Number[psor]=Sing|Person=2|Poss=Yes|PronType=Prs
"han": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Case": "Nom", "Gender": "Com"}, # Case=Nom|Gender=Com|Number=Sing|Person=3|PronType=Prs
"hun": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Case": "Nom", "Gender": "Com"}, # Case=Nom|Gender=Com|Number=Sing|Person=3|PronType=Prs
"den": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Gender": "Com"}, # Case=Acc|Gender=Com|Number=Sing|Person=3|PronType=Prs, See note above.
"det": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Gender": "Neut"}, # Case=Acc|Gender=Neut|Number=Sing|Person=3|PronType=Prs See note above.
"ham": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Case": "Acc", "Gender": "Com"}, # Case=Acc|Gender=Com|Number=Sing|Person=3|PronType=Prs
"hende": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Case": "Acc", "Gender": "Com"}, # Case=Acc|Gender=Com|Number=Sing|Person=3|PronType=Prs
"sin": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Poss": "Yes", "Gender": "Com", "Reflex": "Yes"}, # Gender=Com|Number=Sing|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes
"sit": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Poss": "Yes", "Gender": "Neut", "Reflex": "Yes"}, # Gender=Neut|Number=Sing|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes
"vi": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Plur", "Case": "Nom", "Gender": "Com"}, # Case=Nom|Gender=Com|Number=Plur|Person=1|PronType=Prs
"os": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Plur", "Case": "Acc", "Gender": "Com"}, # Case=Acc|Gender=Com|Number=Plur|Person=1|PronType=Prs
"mine": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Plur", "Poss": "Yes"}, # Number=Plur|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs
"vore": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Plur", "Poss": "Yes"}, # Number=Plur|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form
"I": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Two", "Number": "Plur", "Case": "Nom", "Gender": "Com"}, # Case=Nom|Gender=Com|Number=Plur|Person=2|PronType=Prs
"jer": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Two", "Number": "Plur", "Case": "Acc", "Gender": "Com"}, # Case=Acc|Gender=Com|Number=Plur|Person=2|PronType=Prs
"dine": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Two", "Number": "Plur", "Poss": "Yes"}, # Number=Plur|Number[psor]=Sing|Person=2|Poss=Yes|PronType=Prs
"de": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Plur", "Case": "Nom"}, # Case=Nom|Number=Plur|Person=3|PronType=Prs
"dem": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Plur", "Case": "Acc"}, # Case=Acc|Number=Plur|Person=3|PronType=Prs
"sine": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Plur", "Poss": "Yes", "Reflex": "Yes"}, # Number=Plur|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes
"vores": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Poss": "Yes"}, # Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs
"De": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Two", "Case": "Nom", "Gender": "Com"}, # Case=Nom|Gender=Com|Person=2|Polite=Form|PronType=Prs
"Dem": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Two", "Case": "Acc", "Gender": "Com"}, # Case=Acc|Gender=Com|Person=2|Polite=Form|PronType=Prs
"Deres": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Two", "Poss": "Yes"}, # Person=2|Polite=Form|Poss=Yes|PronType=Prs
"jeres": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Two", "Poss": "Yes"}, # Number[psor]=Plur|Person=2|Poss=Yes|PronType=Prs
"sig": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Case": "Acc", "Reflex": "Yes"}, # Case=Acc|Person=3|PronType=Prs|Reflex=Yes
"hans": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Poss": "Yes"}, # Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs
"hendes": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Poss": "Yes"}, # Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs
"dens": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Poss": "Yes"}, # Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs
"dets": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Poss": "Yes"}, # Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs
"deres": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Poss": "Yes"}, # Number[psor]=Plur|Person=3|Poss=Yes|PronType=Prs
"jeg": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Sing",
"Case": "Nom",
"Gender": "Com",
}, # Case=Nom|Gender=Com|Number=Sing|Person=1|PronType=Prs
"mig": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Sing",
"Case": "Acc",
"Gender": "Com",
}, # Case=Acc|Gender=Com|Number=Sing|Person=1|PronType=Prs
"min": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Sing",
"Poss": "Yes",
"Gender": "Com",
}, # Gender=Com|Number=Sing|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs
"mit": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Sing",
"Poss": "Yes",
"Gender": "Neut",
}, # Gender=Neut|Number=Sing|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs
"vor": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Sing",
"Poss": "Yes",
"Gender": "Com",
}, # Gender=Com|Number=Sing|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form
"vort": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Sing",
"Poss": "Yes",
"Gender": "Neut",
}, # Gender=Neut|Number=Sing|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form
"du": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Two",
"Number": "Sing",
"Case": "Nom",
"Gender": "Com",
}, # Case=Nom|Gender=Com|Number=Sing|Person=2|PronType=Prs
"dig": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Two",
"Number": "Sing",
"Case": "Acc",
"Gender": "Com",
}, # Case=Acc|Gender=Com|Number=Sing|Person=2|PronType=Prs
"din": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Two",
"Number": "Sing",
"Poss": "Yes",
"Gender": "Com",
}, # Gender=Com|Number=Sing|Number[psor]=Sing|Person=2|Poss=Yes|PronType=Prs
"dit": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Two",
"Number": "Sing",
"Poss": "Yes",
"Gender": "Neut",
}, # Gender=Neut|Number=Sing|Number[psor]=Sing|Person=2|Poss=Yes|PronType=Prs
"han": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Case": "Nom",
"Gender": "Com",
}, # Case=Nom|Gender=Com|Number=Sing|Person=3|PronType=Prs
"hun": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Case": "Nom",
"Gender": "Com",
}, # Case=Nom|Gender=Com|Number=Sing|Person=3|PronType=Prs
"den": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Gender": "Com",
}, # Case=Acc|Gender=Com|Number=Sing|Person=3|PronType=Prs, See note above.
"det": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Gender": "Neut",
}, # Case=Acc|Gender=Neut|Number=Sing|Person=3|PronType=Prs See note above.
"ham": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Case": "Acc",
"Gender": "Com",
}, # Case=Acc|Gender=Com|Number=Sing|Person=3|PronType=Prs
"hende": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Case": "Acc",
"Gender": "Com",
}, # Case=Acc|Gender=Com|Number=Sing|Person=3|PronType=Prs
"sin": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Poss": "Yes",
"Gender": "Com",
"Reflex": "Yes",
}, # Gender=Com|Number=Sing|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes
"sit": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Poss": "Yes",
"Gender": "Neut",
"Reflex": "Yes",
}, # Gender=Neut|Number=Sing|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes
"vi": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Plur",
"Case": "Nom",
"Gender": "Com",
}, # Case=Nom|Gender=Com|Number=Plur|Person=1|PronType=Prs
"os": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Plur",
"Case": "Acc",
"Gender": "Com",
}, # Case=Acc|Gender=Com|Number=Plur|Person=1|PronType=Prs
"mine": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Plur",
"Poss": "Yes",
}, # Number=Plur|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs
"vore": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Plur",
"Poss": "Yes",
}, # Number=Plur|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form
"I": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Two",
"Number": "Plur",
"Case": "Nom",
"Gender": "Com",
}, # Case=Nom|Gender=Com|Number=Plur|Person=2|PronType=Prs
"jer": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Two",
"Number": "Plur",
"Case": "Acc",
"Gender": "Com",
}, # Case=Acc|Gender=Com|Number=Plur|Person=2|PronType=Prs
"dine": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Two",
"Number": "Plur",
"Poss": "Yes",
}, # Number=Plur|Number[psor]=Sing|Person=2|Poss=Yes|PronType=Prs
"de": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Plur",
"Case": "Nom",
}, # Case=Nom|Number=Plur|Person=3|PronType=Prs
"dem": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Plur",
"Case": "Acc",
}, # Case=Acc|Number=Plur|Person=3|PronType=Prs
"sine": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Plur",
"Poss": "Yes",
"Reflex": "Yes",
}, # Number=Plur|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes
"vores": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Poss": "Yes",
}, # Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs
"De": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Two",
"Case": "Nom",
"Gender": "Com",
}, # Case=Nom|Gender=Com|Person=2|Polite=Form|PronType=Prs
"Dem": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Two",
"Case": "Acc",
"Gender": "Com",
}, # Case=Acc|Gender=Com|Person=2|Polite=Form|PronType=Prs
"Deres": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Two",
"Poss": "Yes",
}, # Person=2|Polite=Form|Poss=Yes|PronType=Prs
"jeres": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Two",
"Poss": "Yes",
}, # Number[psor]=Plur|Person=2|Poss=Yes|PronType=Prs
"sig": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Case": "Acc",
"Reflex": "Yes",
}, # Case=Acc|Person=3|PronType=Prs|Reflex=Yes
"hans": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Poss": "Yes",
}, # Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs
"hendes": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Poss": "Yes",
}, # Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs
"dens": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Poss": "Yes",
}, # Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs
"dets": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Poss": "Yes",
}, # Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs
"deres": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Poss": "Yes",
}, # Number[psor]=Plur|Person=3|Poss=Yes|PronType=Prs
},
"VERB": {
"er": {LEMMA: "være", "VerbForm": "Fin", "Tense": "Pres"},
"var": {LEMMA: "være", "VerbForm": "Fin", "Tense": "Past"}
}
"var": {LEMMA: "være", "VerbForm": "Fin", "Tense": "Past"},
},
}
for tag, rules in MORPH_RULES.items():

View File

@ -516,7 +516,7 @@ _exc = {
"øjeåbner": "øjenåbner", # 1
"økonomiministerium": "økonomiministerie", # 1
"ørenring": "ørering", # 2
"øvehefte": "øvehæfte" # 1
"øvehefte": "øvehæfte", # 1
}

View File

@ -6,17 +6,26 @@ from ..char_classes import QUOTES, ALPHA, ALPHA_LOWER, ALPHA_UPPER
from ..punctuation import TOKENIZER_SUFFIXES
_quotes = QUOTES.replace("'", '')
_quotes = QUOTES.replace("'", "")
_infixes = (LIST_ELLIPSES + LIST_ICONS +
[r'(?<=[{}])\.(?=[{}])'.format(ALPHA_LOWER, ALPHA_UPPER),
r'(?<=[{a}])[,!?](?=[{a}])'.format(a=ALPHA),
_infixes = (
LIST_ELLIPSES
+ LIST_ICONS
+ [
r"(?<=[{}])\.(?=[{}])".format(ALPHA_LOWER, ALPHA_UPPER),
r"(?<=[{a}])[,!?](?=[{a}])".format(a=ALPHA),
r'(?<=[{a}"])[:<>=](?=[{a}])'.format(a=ALPHA),
r'(?<=[{a}]),(?=[{a}])'.format(a=ALPHA),
r'(?<=[{a}])([{q}\)\]\(\[])(?=[\{a}])'.format(a=ALPHA, q=_quotes),
r'(?<=[{a}])--(?=[{a}])'.format(a=ALPHA)])
r"(?<=[{a}]),(?=[{a}])".format(a=ALPHA),
r"(?<=[{a}])([{q}\)\]\(\[])(?=[\{a}])".format(a=ALPHA, q=_quotes),
r"(?<=[{a}])--(?=[{a}])".format(a=ALPHA),
]
)
_suffixes = [suffix for suffix in TOKENIZER_SUFFIXES if suffix not in ["'s", "'S", "s", "S", r"\'"]]
_suffixes = [
suffix
for suffix in TOKENIZER_SUFFIXES
if suffix not in ["'s", "'S", "s", "S", r"\'"]
]
_suffixes += [r"(?<=[^sSxXzZ])\'"]

View File

@ -3,7 +3,8 @@ from __future__ import unicode_literals
# Source: Handpicked by Jens Dahl Møllerhøj.
STOP_WORDS = set("""
STOP_WORDS = set(
"""
af aldrig alene alle allerede alligevel alt altid anden andet andre at
bag begge blandt blev blive bliver burde bør
@ -43,4 +44,5 @@ ud uden udover under undtagen
var ved vi via vil ville vore vores vær være været
øvrigt
""".split())
""".split()
)

View File

@ -51,73 +51,482 @@ for exc_data in [
{ORTH: "Tirs.", LEMMA: "tirsdag"},
{ORTH: "Ons.", LEMMA: "onsdag"},
{ORTH: "Fre.", LEMMA: "fredag"},
{ORTH: "Lør.", LEMMA: "lørdag"}]:
{ORTH: "Lør.", LEMMA: "lørdag"},
]:
_exc[exc_data[ORTH]] = [exc_data]
# Specified case only
for orth in [
"diam.", "ib.", "mia.", "mik.", "pers.", "A.D.", "A/S", "B.C.", "BK.",
"Dr.", "Boul.", "Chr.", "Dronn.", "H.K.H.", "H.M.", "Hf.", "i/s", "I/S",
"Kprs.", "L.A.", "Ll.", "m/s", "M/S", "Mag.", "Mr.", "Ndr.", "Ph.d.",
"Prs.", "Rcp.", "Sdr.", "Skt.", "Spl.", "Vg."]:
"diam.",
"ib.",
"mia.",
"mik.",
"pers.",
"A.D.",
"A/S",
"B.C.",
"BK.",
"Dr.",
"Boul.",
"Chr.",
"Dronn.",
"H.K.H.",
"H.M.",
"Hf.",
"i/s",
"I/S",
"Kprs.",
"L.A.",
"Ll.",
"m/s",
"M/S",
"Mag.",
"Mr.",
"Ndr.",
"Ph.d.",
"Prs.",
"Rcp.",
"Sdr.",
"Skt.",
"Spl.",
"Vg.",
]:
_exc[orth] = [{ORTH: orth}]
for orth in [
"aarh.", "ac.", "adj.", "adr.", "adsk.", "adv.", "afb.", "afd.", "afg.",
"afk.", "afs.", "aht.", "alg.", "alk.", "alm.", "amer.", "ang.", "ank.",
"anl.", "anv.", "arb.", "arr.", "att.", "bd.", "bdt.", "beg.", "begr.",
"beh.", "bet.", "bev.", "bhk.", "bib.", "bibl.", "bidr.", "bildl.",
"bill.", "biol.", "bk.", "bl.", "bl.a.", "borgm.", "br.", "brolægn.",
"bto.", "bygn.", "ca.", "cand.", "d.d.", "d.m.", "d.s.", "d.s.s.",
"d.y.", "d.å.", "d.æ.", "dagl.", "dat.", "dav.", "def.", "dek.", "dep.",
"desl.", "dir.", "disp.", "distr.", "div.", "dkr.", "dl.", "do.",
"dobb.", "dr.h.c", "dr.phil.", "ds.", "dvs.", "e.b.", "e.l.", "e.o.",
"e.v.t.", "eftf.", "eftm.", "egl.", "eks.", "eksam.", "ekskl.", "eksp.",
"ekspl.", "el.lign.", "emer.", "endv.", "eng.", "enk.", "etc.", "etym.",
"eur.", "evt.", "exam.", "f.eks.", "f.m.", "f.n.", "f.o.", "f.o.m.",
"f.s.v.", "f.t.", "f.v.t.", "f.å.", "fa.", "fakt.", "fam.", "ff.",
"fg.", "fhv.", "fig.", "filol.", "filos.", "fl.", "flg.", "fm.", "fmd.",
"fol.", "forb.", "foreg.", "foren.", "forf.", "fork.", "forr.", "fors.",
"forsk.", "forts.", "fr.", "fr.u.", "frk.", "fsva.", "fuldm.", "fung.",
"fx.", "fys.", "fær.", "g.d.", "g.m.", "gd.", "gdr.", "genuds.", "gl.",
"gn.", "gns.", "gr.", "grdl.", "gross.", "h.a.", "h.c.", "hdl.",
"henv.", "hhv.", "hj.hj.", "hj.spl.", "hort.", "hosp.", "hpl.", "hr.",
"hrs.", "hum.", "hvp.", "i.e.", "id.", "if.", "iflg.", "ifm.", "ift.",
"iht.", "ill.", "indb.", "indreg.", "inf.", "ing.", "inh.", "inj.",
"inkl.", "insp.", "instr.", "isl.", "istf.", "it.", "ital.", "iv.",
"jap.", "jf.", "jfr.", "jnr.", "j.nr.", "jr.", "jur.", "jvf.", "kap.",
"kbh.", "kem.", "kgl.", "kl.", "kld.", "knsp.", "komm.", "kons.",
"korr.", "kp.", "kr.", "kst.", "kt.", "ktr.", "kv.", "kvt.", "l.c.",
"lab.", "lat.", "lb.m.", "lb.nr.", "lejl.", "lgd.", "lic.", "lign.",
"lin.", "ling.merc.", "litt.", "loc.cit.", "lok.", "lrs.", "ltr.",
"m.a.o.", "m.fl.", "m.m.", "m.v.", "m.v.h.", "maks.", "md.", "mdr.",
"mdtl.", "mezz.", "mfl.", "m.h.p.", "m.h.t.", "mht.", "mill.", "mio.",
"modt.", "mrk.", "mul.", "mv.", "n.br.", "n.f.", "nb.", "nedenst.",
"nl.", "nr.", "nto.", "nuv.", "o/m", "o.a.", "o.fl.", "o.h.", "o.l.",
"o.lign.", "o.m.a.", "o.s.fr.", "obl.", "obs.", "odont.", "oecon.",
"off.", "ofl.", "omg.", "omkr.", "omr.", "omtr.", "opg.", "opl.",
"opr.", "org.", "orig.", "osv.", "ovenst.", "overs.", "ovf.", "p.a.",
"p.b.a", "p.b.v", "p.c.", "p.m.", "p.m.v.", "p.n.", "p.p.", "p.p.s.",
"p.s.", "p.t.", "p.v.a.", "p.v.c.", "pag.", "pass.", "pcs.", "pct.",
"pd.", "pens.", "pft.", "pg.", "pga.", "pgl.", "pinx.", "pk.", "pkt.",
"polit.", "polyt.", "pos.", "pp.", "ppm.", "pr.", "prc.", "priv.",
"prod.", "prof.", "pron.", "præd.", "præf.", "præt.", "psych.", "pt.",
"pæd.", "q.e.d.", "rad.", "red.", "ref.", "reg.", "regn.", "rel.",
"rep.", "repr.", "resp.", "rest.", "rm.", "rtg.", "russ.", "s.br.",
"s.d.", "s.f.", "s.m.b.a.", "s.u.", "s.å.", "sa.", "sb.", "sc.",
"scient.", "scil.", "sek.", "sekr.", "self.", "sem.", "shj.", "sign.",
"sing.", "sj.", "skr.", "slutn.", "sml.", "smp.", "snr.", "soc.",
"soc.dem.", "sp.", "spec.", "spm.", "spr.", "spsk.", "statsaut.", "st.",
"stk.", "str.", "stud.", "subj.", "subst.", "suff.", "sup.", "suppl.",
"sv.", "såk.", "sædv.", "t/r", "t.h.", "t.o.", "t.o.m.", "t.v.", "tbl.",
"tcp/ip", "td.", "tdl.", "tdr.", "techn.", "tekn.", "temp.", "th.",
"theol.", "tidl.", "tilf.", "tilh.", "till.", "tilsv.", "tjg.", "tkr.",
"tlf.", "tlgr.", "tr.", "trp.", "tsk.", "tv.", "ty.", "u/b", "udb.",
"udbet.", "ugtl.", "undt.", "v.f.", "vb.", "vedk.", "vedl.", "vedr.",
"vejl.", "vh.", "vha.", "vs.", "vsa.", "vær.", "zool.", "ø.lgd.",
"øvr.", "årg.", "årh."]:
"aarh.",
"ac.",
"adj.",
"adr.",
"adsk.",
"adv.",
"afb.",
"afd.",
"afg.",
"afk.",
"afs.",
"aht.",
"alg.",
"alk.",
"alm.",
"amer.",
"ang.",
"ank.",
"anl.",
"anv.",
"arb.",
"arr.",
"att.",
"bd.",
"bdt.",
"beg.",
"begr.",
"beh.",
"bet.",
"bev.",
"bhk.",
"bib.",
"bibl.",
"bidr.",
"bildl.",
"bill.",
"biol.",
"bk.",
"bl.",
"bl.a.",
"borgm.",
"br.",
"brolægn.",
"bto.",
"bygn.",
"ca.",
"cand.",
"d.d.",
"d.m.",
"d.s.",
"d.s.s.",
"d.y.",
"d.å.",
"d.æ.",
"dagl.",
"dat.",
"dav.",
"def.",
"dek.",
"dep.",
"desl.",
"dir.",
"disp.",
"distr.",
"div.",
"dkr.",
"dl.",
"do.",
"dobb.",
"dr.h.c",
"dr.phil.",
"ds.",
"dvs.",
"e.b.",
"e.l.",
"e.o.",
"e.v.t.",
"eftf.",
"eftm.",
"egl.",
"eks.",
"eksam.",
"ekskl.",
"eksp.",
"ekspl.",
"el.lign.",
"emer.",
"endv.",
"eng.",
"enk.",
"etc.",
"etym.",
"eur.",
"evt.",
"exam.",
"f.eks.",
"f.m.",
"f.n.",
"f.o.",
"f.o.m.",
"f.s.v.",
"f.t.",
"f.v.t.",
"f.å.",
"fa.",
"fakt.",
"fam.",
"ff.",
"fg.",
"fhv.",
"fig.",
"filol.",
"filos.",
"fl.",
"flg.",
"fm.",
"fmd.",
"fol.",
"forb.",
"foreg.",
"foren.",
"forf.",
"fork.",
"forr.",
"fors.",
"forsk.",
"forts.",
"fr.",
"fr.u.",
"frk.",
"fsva.",
"fuldm.",
"fung.",
"fx.",
"fys.",
"fær.",
"g.d.",
"g.m.",
"gd.",
"gdr.",
"genuds.",
"gl.",
"gn.",
"gns.",
"gr.",
"grdl.",
"gross.",
"h.a.",
"h.c.",
"hdl.",
"henv.",
"hhv.",
"hj.hj.",
"hj.spl.",
"hort.",
"hosp.",
"hpl.",
"hr.",
"hrs.",
"hum.",
"hvp.",
"i.e.",
"id.",
"if.",
"iflg.",
"ifm.",
"ift.",
"iht.",
"ill.",
"indb.",
"indreg.",
"inf.",
"ing.",
"inh.",
"inj.",
"inkl.",
"insp.",
"instr.",
"isl.",
"istf.",
"it.",
"ital.",
"iv.",
"jap.",
"jf.",
"jfr.",
"jnr.",
"j.nr.",
"jr.",
"jur.",
"jvf.",
"kap.",
"kbh.",
"kem.",
"kgl.",
"kl.",
"kld.",
"knsp.",
"komm.",
"kons.",
"korr.",
"kp.",
"kr.",
"kst.",
"kt.",
"ktr.",
"kv.",
"kvt.",
"l.c.",
"lab.",
"lat.",
"lb.m.",
"lb.nr.",
"lejl.",
"lgd.",
"lic.",
"lign.",
"lin.",
"ling.merc.",
"litt.",
"loc.cit.",
"lok.",
"lrs.",
"ltr.",
"m.a.o.",
"m.fl.",
"m.m.",
"m.v.",
"m.v.h.",
"maks.",
"md.",
"mdr.",
"mdtl.",
"mezz.",
"mfl.",
"m.h.p.",
"m.h.t.",
"mht.",
"mill.",
"mio.",
"modt.",
"mrk.",
"mul.",
"mv.",
"n.br.",
"n.f.",
"nb.",
"nedenst.",
"nl.",
"nr.",
"nto.",
"nuv.",
"o/m",
"o.a.",
"o.fl.",
"o.h.",
"o.l.",
"o.lign.",
"o.m.a.",
"o.s.fr.",
"obl.",
"obs.",
"odont.",
"oecon.",
"off.",
"ofl.",
"omg.",
"omkr.",
"omr.",
"omtr.",
"opg.",
"opl.",
"opr.",
"org.",
"orig.",
"osv.",
"ovenst.",
"overs.",
"ovf.",
"p.a.",
"p.b.a",
"p.b.v",
"p.c.",
"p.m.",
"p.m.v.",
"p.n.",
"p.p.",
"p.p.s.",
"p.s.",
"p.t.",
"p.v.a.",
"p.v.c.",
"pag.",
"pass.",
"pcs.",
"pct.",
"pd.",
"pens.",
"pft.",
"pg.",
"pga.",
"pgl.",
"pinx.",
"pk.",
"pkt.",
"polit.",
"polyt.",
"pos.",
"pp.",
"ppm.",
"pr.",
"prc.",
"priv.",
"prod.",
"prof.",
"pron.",
"præd.",
"præf.",
"præt.",
"psych.",
"pt.",
"pæd.",
"q.e.d.",
"rad.",
"red.",
"ref.",
"reg.",
"regn.",
"rel.",
"rep.",
"repr.",
"resp.",
"rest.",
"rm.",
"rtg.",
"russ.",
"s.br.",
"s.d.",
"s.f.",
"s.m.b.a.",
"s.u.",
"s.å.",
"sa.",
"sb.",
"sc.",
"scient.",
"scil.",
"sek.",
"sekr.",
"self.",
"sem.",
"shj.",
"sign.",
"sing.",
"sj.",
"skr.",
"slutn.",
"sml.",
"smp.",
"snr.",
"soc.",
"soc.dem.",
"sp.",
"spec.",
"spm.",
"spr.",
"spsk.",
"statsaut.",
"st.",
"stk.",
"str.",
"stud.",
"subj.",
"subst.",
"suff.",
"sup.",
"suppl.",
"sv.",
"såk.",
"sædv.",
"t/r",
"t.h.",
"t.o.",
"t.o.m.",
"t.v.",
"tbl.",
"tcp/ip",
"td.",
"tdl.",
"tdr.",
"techn.",
"tekn.",
"temp.",
"th.",
"theol.",
"tidl.",
"tilf.",
"tilh.",
"till.",
"tilsv.",
"tjg.",
"tkr.",
"tlf.",
"tlgr.",
"tr.",
"trp.",
"tsk.",
"tv.",
"ty.",
"u/b",
"udb.",
"udbet.",
"ugtl.",
"undt.",
"v.f.",
"vb.",
"vedk.",
"vedl.",
"vedr.",
"vejl.",
"vh.",
"vha.",
"vs.",
"vsa.",
"vær.",
"zool.",
"ø.lgd.",
"øvr.",
"årg.",
"årh.",
]:
_exc[orth] = [{ORTH: orth}]
capitalized = orth.capitalize()
_exc[capitalized] = [{ORTH: capitalized}]
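The loop above registers each abbreviation twice, once as written and once capitalised, so both forms survive tokenization. A minimal sketch of the effect, using the blank Danish language class (no model download needed); the sample strings are taken from the list above and the expected output is illustrative.

```python
# Minimal sketch: the abbreviations above stay single tokens in either case,
# because the loop adds both "ca."/"kr." and their capitalised variants.
from spacy.lang.da import Danish

nlp = Danish()
print([t.text for t in nlp("ca. 10 kr.")])  # expected: ['ca.', '10', 'kr.']
print([t.text for t in nlp("Ca. 10 kr.")])  # expected: ['Ca.', '10', 'kr.']
```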
@ -138,7 +547,8 @@ for exc_data in [
{ORTH: "ha'", LEMMA: "have", NORM: "have"},
{ORTH: "Ha'", LEMMA: "have", NORM: "have"},
{ORTH: "ik'", LEMMA: "ikke", NORM: "ikke"},
{ORTH: "Ik'", LEMMA: "ikke", NORM: "ikke"}]:
{ORTH: "Ik'", LEMMA: "ikke", NORM: "ikke"},
]:
_exc[exc_data[ORTH]] = [exc_data]
@ -147,11 +557,7 @@ for h in range(1, 31 + 1):
for period in ["."]:
_exc["%d%s" % (h, period)] = [{ORTH: "%d." % h}]
_custom_base_exc = {
"i.": [
{ORTH: "i", LEMMA: "i", NORM: "i"},
{ORTH: ".", TAG: PUNCT}]
}
_custom_base_exc = {"i.": [{ORTH: "i", LEMMA: "i", NORM: "i"}, {ORTH: ".", TAG: PUNCT}]}
_exc.update(_custom_base_exc)
TOKENIZER_EXCEPTIONS = _exc

View File

@ -18,9 +18,10 @@ from ...util import update_exc, add_lookups
class GermanDefaults(Language.Defaults):
lex_attr_getters = dict(Language.Defaults.lex_attr_getters)
lex_attr_getters[LANG] = lambda text: 'de'
lex_attr_getters[NORM] = add_lookups(Language.Defaults.lex_attr_getters[NORM],
NORM_EXCEPTIONS, BASE_NORMS)
lex_attr_getters[LANG] = lambda text: "de"
lex_attr_getters[NORM] = add_lookups(
Language.Defaults.lex_attr_getters[NORM], NORM_EXCEPTIONS, BASE_NORMS
)
tokenizer_exceptions = update_exc(BASE_EXCEPTIONS, TOKENIZER_EXCEPTIONS)
infixes = TOKENIZER_INFIXES
tag_map = TAG_MAP
@ -30,8 +31,8 @@ class GermanDefaults(Language.Defaults):
class German(Language):
lang = 'de'
lang = "de"
Defaults = GermanDefaults
__all__ = ['German']
__all__ = ["German"]

View File

@ -18,5 +18,5 @@ sentences = [
"San Francisco erwägt Verbot von Lieferrobotern",
"Autonome Fahrzeuge verlagern Haftpflicht auf Hersteller",
"Wo bist du?",
"Was ist die Hauptstadt von Deutschland?"
"Was ist die Hauptstadt von Deutschland?",
]

View File

@ -6,9 +6,7 @@ from __future__ import unicode_literals
# old vs. new spelling rules, and all possible cases.
_exc = {
"daß": "dass"
}
_exc = {"daß": "dass"}
NORM_EXCEPTIONS = {}
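A minimal sketch of how such a norm exception surfaces on tokens, using the blank German language class; the expected norm value assumes the lookup chain shown earlier in this diff (NORM_EXCEPTIONS, then BASE_NORMS) is in place.

```python
# Minimal sketch: the old spelling "daß" keeps its surface form but is
# normalised to "dass" via the NORM exception defined above.
from spacy.lang.de import German

nlp = German()
doc = nlp("daß du kommst")
print(doc[0].text, doc[0].norm_)  # expected: daß dass
```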

View File

@ -5,16 +5,21 @@ from ..char_classes import LIST_ELLIPSES, LIST_ICONS
from ..char_classes import QUOTES, ALPHA, ALPHA_LOWER, ALPHA_UPPER
_quotes = QUOTES.replace("'", '')
_quotes = QUOTES.replace("'", "")
_infixes = (LIST_ELLIPSES + LIST_ICONS +
[r'(?<=[{}])\.(?=[{}])'.format(ALPHA_LOWER, ALPHA_UPPER),
r'(?<=[{a}])[,!?](?=[{a}])'.format(a=ALPHA),
_infixes = (
LIST_ELLIPSES
+ LIST_ICONS
+ [
r"(?<=[{}])\.(?=[{}])".format(ALPHA_LOWER, ALPHA_UPPER),
r"(?<=[{a}])[,!?](?=[{a}])".format(a=ALPHA),
r'(?<=[{a}"])[:<>=](?=[{a}])'.format(a=ALPHA),
r'(?<=[{a}]),(?=[{a}])'.format(a=ALPHA),
r'(?<=[{a}])([{q}\)\]\(\[])(?=[\{a}])'.format(a=ALPHA, q=_quotes),
r'(?<=[{a}])--(?=[{a}])'.format(a=ALPHA),
r'(?<=[0-9])-(?=[0-9])'])
r"(?<=[{a}]),(?=[{a}])".format(a=ALPHA),
r"(?<=[{a}])([{q}\)\]\(\[])(?=[\{a}])".format(a=ALPHA, q=_quotes),
r"(?<=[{a}])--(?=[{a}])".format(a=ALPHA),
r"(?<=[0-9])-(?=[0-9])",
]
)
TOKENIZER_INFIXES = _infixes

View File

@ -2,7 +2,8 @@
from __future__ import unicode_literals
STOP_WORDS = set("""
STOP_WORDS = set(
"""
á a ab aber ach acht achte achten achter achtes ag alle allein allem allen
aller allerdings alles allgemeinen als also am an andere anderen andern anders
auch auf aus ausser außer ausserdem außerdem
@ -78,4 +79,5 @@ wollt wollte wollten worden wurde würde wurden würden
zehn zehnte zehnten zehnter zehntes zeit zu zuerst zugleich zum zunächst zur
zurück zusammen zwanzig zwar zwei zweite zweiten zweiter zweites zwischen
""".split())
""".split()
)

View File

@ -13,26 +13,37 @@ def noun_chunks(obj):
# measurement construction, the span is sometimes extended to the right of
# the NOUN. Example: "eine Tasse Tee" (a cup (of) tea) returns "eine Tasse Tee"
# and not just "eine Tasse", same for "das Thema Familie".
labels = ['sb', 'oa', 'da', 'nk', 'mo', 'ag', 'ROOT', 'root', 'cj', 'pd', 'og', 'app']
labels = [
"sb",
"oa",
"da",
"nk",
"mo",
"ag",
"ROOT",
"root",
"cj",
"pd",
"og",
"app",
]
doc = obj.doc # Ensure works on both Doc and Span.
np_label = doc.vocab.strings.add('NP')
np_label = doc.vocab.strings.add("NP")
np_deps = set(doc.vocab.strings.add(label) for label in labels)
close_app = doc.vocab.strings.add('nk')
close_app = doc.vocab.strings.add("nk")
rbracket = 0
for i, word in enumerate(obj):
if i < rbracket:
continue
if word.pos in (NOUN, PROPN, PRON) and word.dep in np_deps:
rbracket = word.i+1
rbracket = word.i + 1
# try to extend the span to the right
# to capture close apposition/measurement constructions
for rdep in doc[word.i].rights:
if rdep.pos in (NOUN, PROPN) and rdep.dep == close_app:
rbracket = rdep.i+1
rbracket = rdep.i + 1
yield word.left_edge.i, rbracket, np_label
SYNTAX_ITERATORS = {
'noun_chunks': noun_chunks
}
SYNTAX_ITERATORS = {"noun_chunks": noun_chunks}

View File

@ -62,5 +62,5 @@ TAG_MAP = {
"VVIZU": {POS: VERB, "VerbForm": "inf"},
"VVPP": {POS: VERB, "Aspect": "perf", "VerbForm": "part"},
"XY": {POS: X},
"_SP": {POS: SPACE}
"_SP": {POS: SPACE},
}

View File

@ -5,49 +5,41 @@ from ...symbols import ORTH, LEMMA, TAG, NORM, PRON_LEMMA
_exc = {
"auf'm": [
{ORTH: "auf", LEMMA: "auf"},
{ORTH: "'m", LEMMA: "der", NORM: "dem"}],
"auf'm": [{ORTH: "auf", LEMMA: "auf"}, {ORTH: "'m", LEMMA: "der", NORM: "dem"}],
"du's": [
{ORTH: "du", LEMMA: PRON_LEMMA, TAG: "PPER"},
{ORTH: "'s", LEMMA: PRON_LEMMA, TAG: "PPER", NORM: "es"}],
{ORTH: "'s", LEMMA: PRON_LEMMA, TAG: "PPER", NORM: "es"},
],
"er's": [
{ORTH: "er", LEMMA: PRON_LEMMA, TAG: "PPER"},
{ORTH: "'s", LEMMA: PRON_LEMMA, TAG: "PPER", NORM: "es"}],
{ORTH: "'s", LEMMA: PRON_LEMMA, TAG: "PPER", NORM: "es"},
],
"hinter'm": [
{ORTH: "hinter", LEMMA: "hinter"},
{ORTH: "'m", LEMMA: "der", NORM: "dem"}],
{ORTH: "'m", LEMMA: "der", NORM: "dem"},
],
"ich's": [
{ORTH: "ich", LEMMA: PRON_LEMMA, TAG: "PPER"},
{ORTH: "'s", LEMMA: PRON_LEMMA, TAG: "PPER", NORM: "es"}],
{ORTH: "'s", LEMMA: PRON_LEMMA, TAG: "PPER", NORM: "es"},
],
"ihr's": [
{ORTH: "ihr", LEMMA: PRON_LEMMA, TAG: "PPER"},
{ORTH: "'s", LEMMA: PRON_LEMMA, TAG: "PPER", NORM: "es"}],
{ORTH: "'s", LEMMA: PRON_LEMMA, TAG: "PPER", NORM: "es"},
],
"sie's": [
{ORTH: "sie", LEMMA: PRON_LEMMA, TAG: "PPER"},
{ORTH: "'s", LEMMA: PRON_LEMMA, TAG: "PPER", NORM: "es"}],
{ORTH: "'s", LEMMA: PRON_LEMMA, TAG: "PPER", NORM: "es"},
],
"unter'm": [
{ORTH: "unter", LEMMA: "unter"},
{ORTH: "'m", LEMMA: "der", NORM: "dem"}],
"vor'm": [
{ORTH: "vor", LEMMA: "vor"},
{ORTH: "'m", LEMMA: "der", NORM: "dem"}],
{ORTH: "'m", LEMMA: "der", NORM: "dem"},
],
"vor'm": [{ORTH: "vor", LEMMA: "vor"}, {ORTH: "'m", LEMMA: "der", NORM: "dem"}],
"wir's": [
{ORTH: "wir", LEMMA: PRON_LEMMA, TAG: "PPER"},
{ORTH: "'s", LEMMA: PRON_LEMMA, TAG: "PPER", NORM: "es"}],
"über'm": [
{ORTH: "über", LEMMA: "über"},
{ORTH: "'m", LEMMA: "der", NORM: "dem"}]
{ORTH: "'s", LEMMA: PRON_LEMMA, TAG: "PPER", NORM: "es"},
],
"über'm": [{ORTH: "über", LEMMA: "über"}, {ORTH: "'m", LEMMA: "der", NORM: "dem"}],
}
@ -162,21 +154,95 @@ for exc_data in [
{ORTH: "z.Zt.", LEMMA: "zur Zeit"},
{ORTH: "z.b.", LEMMA: "zum Beispiel"},
{ORTH: "zzgl.", LEMMA: "zuzüglich"},
{ORTH: "österr.", LEMMA: "österreichisch", NORM: "österreichisch"}]:
{ORTH: "österr.", LEMMA: "österreichisch", NORM: "österreichisch"},
]:
_exc[exc_data[ORTH]] = [exc_data]
for orth in [
"A.C.", "a.D.", "A.D.", "A.G.", "a.M.", "a.Z.", "Abs.", "adv.", "al.",
"B.A.", "B.Sc.", "betr.", "biol.", "Biol.", "ca.", "Chr.", "Cie.", "co.",
"Co.", "D.C.", "Dipl.-Ing.", "Dipl.", "Dr.", "e.g.", "e.V.", "ehem.",
"entspr.", "erm.", "etc.", "ev.", "G.m.b.H.", "geb.", "Gebr.", "gem.",
"h.c.", "Hg.", "hrsg.", "Hrsg.", "i.A.", "i.e.", "i.G.", "i.Tr.", "i.V.",
"Ing.", "jr.", "Jr.", "jun.", "jur.", "K.O.", "L.A.", "lat.", "M.A.",
"m.E.", "m.M.", "M.Sc.", "Mr.", "N.Y.", "N.Y.C.", "nat.", "o.a.",
"o.ä.", "o.g.", "o.k.", "O.K.", "p.a.", "p.s.", "P.S.", "pers.", "phil.",
"q.e.d.", "R.I.P.", "rer.", "sen.", "St.", "std.", "u.a.", "U.S.", "U.S.A.",
"U.S.S.", "Vol.", "vs.", "wiss."]:
"A.C.",
"a.D.",
"A.D.",
"A.G.",
"a.M.",
"a.Z.",
"Abs.",
"adv.",
"al.",
"B.A.",
"B.Sc.",
"betr.",
"biol.",
"Biol.",
"ca.",
"Chr.",
"Cie.",
"co.",
"Co.",
"D.C.",
"Dipl.-Ing.",
"Dipl.",
"Dr.",
"e.g.",
"e.V.",
"ehem.",
"entspr.",
"erm.",
"etc.",
"ev.",
"G.m.b.H.",
"geb.",
"Gebr.",
"gem.",
"h.c.",
"Hg.",
"hrsg.",
"Hrsg.",
"i.A.",
"i.e.",
"i.G.",
"i.Tr.",
"i.V.",
"Ing.",
"jr.",
"Jr.",
"jun.",
"jur.",
"K.O.",
"L.A.",
"lat.",
"M.A.",
"m.E.",
"m.M.",
"M.Sc.",
"Mr.",
"N.Y.",
"N.Y.C.",
"nat.",
"o.a.",
"o.ä.",
"o.g.",
"o.k.",
"O.K.",
"p.a.",
"p.s.",
"P.S.",
"pers.",
"phil.",
"q.e.d.",
"R.I.P.",
"rer.",
"sen.",
"St.",
"std.",
"u.a.",
"U.S.",
"U.S.A.",
"U.S.S.",
"Vol.",
"vs.",
"wiss.",
]:
_exc[orth] = [{ORTH: orth}]

View File

@ -21,9 +21,10 @@ from ...util import update_exc, add_lookups
class GreekDefaults(Language.Defaults):
lex_attr_getters = dict(Language.Defaults.lex_attr_getters)
lex_attr_getters.update(LEX_ATTRS)
lex_attr_getters[LANG] = lambda text: 'el' # ISO code
lex_attr_getters[LANG] = lambda text: "el" # ISO code
lex_attr_getters[NORM] = add_lookups(
Language.Defaults.lex_attr_getters[NORM], BASE_NORMS, NORM_EXCEPTIONS)
Language.Defaults.lex_attr_getters[NORM], BASE_NORMS, NORM_EXCEPTIONS
)
tokenizer_exceptions = update_exc(BASE_EXCEPTIONS, TOKENIZER_EXCEPTIONS)
stop_words = STOP_WORDS
tag_map = TAG_MAP
@ -37,15 +38,16 @@ class GreekDefaults(Language.Defaults):
lemma_rules = LEMMA_RULES
lemma_index = LEMMA_INDEX
lemma_exc = LEMMA_EXC
return GreekLemmatizer(index=lemma_index, exceptions=lemma_exc,
rules=lemma_rules)
return GreekLemmatizer(
index=lemma_index, exceptions=lemma_exc, rules=lemma_rules
)
class Greek(Language):
lang = 'el' # ISO code
lang = "el" # ISO code
Defaults = GreekDefaults # set Defaults to custom language defaults
# set default export this allows the language class to be lazy-loaded
__all__ = ['Greek']
__all__ = ["Greek"]

View File

@ -9,20 +9,20 @@ Example sentences to test spaCy and its language models.
"""
sentences = [
'''Η άνιση κατανομή του πλούτου και του εισοδήματος, η οποία έχει λάβει
τρομερές διαστάσεις, δεν δείχνει τάσεις βελτίωσης.''',
'''Ο στόχος της σύντομης αυτής έκθεσης είναι να συνοψίσει τα κυριότερα
συμπεράσματα των επισκοπήσεων κάθε μιας χώρας.''',
'''Μέχρι αργά χθες το βράδυ ο πλοιοκτήτης παρέμενε έξω από το γραφείο του
"""Η άνιση κατανομή του πλούτου και του εισοδήματος, η οποία έχει λάβει
τρομερές διαστάσεις, δεν δείχνει τάσεις βελτίωσης.""",
"""Ο στόχος της σύντομης αυτής έκθεσης είναι να συνοψίσει τα κυριότερα
συμπεράσματα των επισκοπήσεων κάθε μιας χώρας.""",
"""Μέχρι αργά χθες το βράδυ ο πλοιοκτήτης παρέμενε έξω από το γραφείο του
γενικού γραμματέα του υπουργείου, ενώ είχε μόνον τηλεφωνική επικοινωνία με
τον υπουργό.''',
'''Σύμφωνα με καλά ενημερωμένη πηγή, από την επεξεργασία του προέκυψε ότι
τον υπουργό.""",
"""Σύμφωνα με καλά ενημερωμένη πηγή, από την επεξεργασία του προέκυψε ότι
οι δράστες της επίθεσης ήταν δύο, καθώς και ότι προσέγγισαν και αποχώρησαν
από το σημείο με μοτοσικλέτα.''',
από το σημείο με μοτοσικλέτα.""",
"Η υποδομή καταλυμάτων στην Ελλάδα είναι πλήρης και ανανεώνεται συνεχώς.",
'''Το επείγον ταχυδρομείο (ήτοι το παραδοτέο εντός 48 ωρών το πολύ) μπορεί
"""Το επείγον ταχυδρομείο (ήτοι το παραδοτέο εντός 48 ωρών το πολύ) μπορεί
να μεταφέρεται αεροπορικώς μόνον εφόσον εφαρμόζονται οι κανόνες
ασφαλείας''',
''''Στις ορεινές περιοχές του νησιού οι χιονοπτώσεις και οι παγετοί είναι
περιορισμένοι ενώ στις παραθαλάσσιες περιοχές σημειώνονται σπανίως.'''
ασφαλείας""",
"""'Στις ορεινές περιοχές του νησιού οι χιονοπτώσεις και οι παγετοί είναι
περιορισμένοι ενώ στις παραθαλάσσιες περιοχές σημειώνονται σπανίως.""",
]

View File

@ -12,10 +12,19 @@ from ._verbs import VERBS
from ._lemma_rules import ADJECTIVE_RULES, NOUN_RULES, VERB_RULES, PUNCT_RULES
LEMMA_INDEX = {'adj': ADJECTIVES, 'adv': ADVERBS, 'noun': NOUNS, 'verb': VERBS}
LEMMA_INDEX = {"adj": ADJECTIVES, "adv": ADVERBS, "noun": NOUNS, "verb": VERBS}
LEMMA_RULES = {'adj': ADJECTIVE_RULES, 'noun': NOUN_RULES, 'verb': VERB_RULES,
'punct': PUNCT_RULES}
LEMMA_RULES = {
"adj": ADJECTIVE_RULES,
"noun": NOUN_RULES,
"verb": VERB_RULES,
"punct": PUNCT_RULES,
}
LEMMA_EXC = {'adj': ADJECTIVES_IRREG, 'noun': NOUNS_IRREG, 'det': DETS_IRREG, 'verb': VERBS_IRREG}
LEMMA_EXC = {
"adj": ADJECTIVES_IRREG,
"noun": NOUNS_IRREG,
"det": DETS_IRREG,
"verb": VERBS_IRREG,
}

View File

@ -1,6 +1,8 @@
# coding: utf8
from __future__ import unicode_literals
ADJECTIVES = set("""
ADJECTIVES = set(
"""
n-διάστατος µεταφυτρωτικός άβαθος άβαλτος άβαρος άβατος άβαφος άβγαλτος άβιος
άβλαπτος άβλεπτος άβολος άβουλος άβραστος άβρεχτος άβροχος άβυθος άγαμος
άγγιχτος άγδαρτος άγδυτος άγευστος άγιος άγλυκος άγλωσσος άγναθος άγναντος
@ -2438,4 +2440,5 @@ ADJECTIVES = set("""
όμορφος όνειος όξινος όρθιος όσιος όφκαιρος όψια όψιμος ύπανδρος ύπατος
ύπουλος ύπτιος ύστατος ύστερος ύψιστος ώριμος ώριος ἀγκυλωτός ἀκαταμέτρητος
ἄπειρος ἄτροπος ἐλαφρός ἐνεστώς ἐνυπόστατος ἔναυλος ἥττων ἰσχυρός ἵστωρ
""".split())
""".split()
)

View File

@ -32,5 +32,4 @@ ADJECTIVES_IRREG = {
"πολύς": ("πολύ",),
"πολλύ": ("πολύ",),
"πολλύς": ("πολύ",),
}

View File

@ -1,6 +1,8 @@
# coding: utf8
from __future__ import unicode_literals
ADVERBS = set("""
ADVERBS = set(
"""
άβλαβα άβολα άβουλα άγαν άγαρμπα άγγιχτα άγνωμα άγρια άγρυπνα άδηλα άδικα
άδοξα άθελα άθλια άκαιρα άκακα άκαμπτα άκαρδα άκαρπα άκεφα άκομψα άκοπα άκοσμα
άκρως άκυρα άλαλα άλιωτα άλλοθεν άλλοτε άλλως άλλωστε άλογα άλυπα άμεμπτα
@ -861,4 +863,5 @@ ADVERBS = set("""
ψυχραντικά ψωροπερήφανα ψόφια ψύχραιμα ωδικώς ωμά ωρίμως ωραία ωραιότατα
ωριαία ωριαίως ως ωσαύτως ωσεί ωφέλιμα ωφελίμως ωφελιμιστικά ωχρά όθε όθεν όλο
όμορφα όντως όξω όπισθεν όπου όπως όρθια όρτσα όσια όσο όχι όψιμα ύπερθεν
""".split())
""".split()
)

View File

@ -1,5 +1,8 @@
# coding: utf8
from __future__ import unicode_literals
DETS = set("""
DETS = set(
"""
ένας η ο το τη
""".split())
""".split()
)

View File

@ -8,5 +8,5 @@ DETS_IRREG = {
"τους": ("το",),
"τις": ("τη",),
"τα": ("το",),
"οι": ("ο","η"),
"οι": ("ο", "η"),
}

View File

@ -140,17 +140,7 @@ VERB_RULES = [
["ξουμε", "ζω"],
["ξετε", "ζω"],
["ξουν", "ζω"],
]
PUNCT_RULES = [
["", "\""],
["", "\""],
["\u2018", "'"],
["\u2019", "'"]
]
PUNCT_RULES = [["", '"'], ["", '"'], ["\u2018", "'"], ["\u2019", "'"]]

View File

@ -1,6 +1,8 @@
# coding: utf8
from __future__ import unicode_literals
NOUNS = set("""
NOUNS = set(
"""
-αλγία -βατώ -βατῶ -ούλα -πληξία -ώνυμο sofa table άβακας άβατο άβατον άβυσσος
άγανο άγαρ άγγελμα άγγελος άγγιγμα άγγισμα άγγλος άγημα άγιασμα άγιο φως
άγκλισμα άγκυρα άγμα άγνοια άγνωστος άγονο άγος άγουρος άγουσα άγρα άγρευμα
@ -6066,4 +6068,5 @@ NOUNS = set("""
ἐντευκτήριον ἐντόσθια ἐξοικείωσις ἐξοχή ἐξωκκλήσιον ἐπίσκεψις ἐπίσχεστρον
ἐρωτίς ἑρμηνεία ἔκθλιψις ἔκτισις ἔκτρωμα ἔπαλξις ἱππάρχας ἱππάρχης ἴς ἵππαρχος
ὑστερικός ὕστερον ὠάριον ὠοθήκη ὠοθηκῖτις ὠοθυλάκιον ὠορρηξία ὠοσκόπιον
""".split())
""".split()
)

View File

@ -1,6 +1,8 @@
# coding: utf8
from __future__ import unicode_literals
PARTICIPLES = set("""
PARTICIPLES = set(
"""
έρποντας έχοντας αβανιάζοντας αβγατισμένος αγαπημένος αγαπώντας αγγίζοντας
αγγιγμένος αγιασμένος αγιογραφώντας αγιοποιημένος αγιοποιώντας αγκαζαρισμένος
αγκιστρωμένος αγκυλωμένος αγκυροβολημένος αγλακώντας αγνοημένος αγνοούμενος
@ -941,4 +943,5 @@ PARTICIPLES = set("""
ψιλούμενος ψοφολογώντας ψυχογραφώντας ψυχολογημένος ψυχομαχώντας ψυχομαχώντας
ψυχορραγώντας ψυχρηλατώντας ψυχωμένος ψωμοζητώντας ψωμοζώντας ψωμωμένος
ωθηθείς ωθώντας ωραιοποιημένος ωραιοποιώντας ωρυόμενος ωτοσκοπώντας όντας
""".split())
""".split()
)

View File

@ -1,5 +1,8 @@
# coding: utf8
from __future__ import unicode_literals
PROPER_NAMES = set("""
PROPER_NAMES = set(
"""
άαχεν άβαρος άβδηρα άβελ άβιλα άβολα άγγελοι άγγελος άγιο πνεύμα
άγιοι τόποι άγιον όρος άγιος αθανάσιος άγιος αναστάσιος άγιος αντώνιος
άγιος αριστείδης άγιος βαρθολομαίος άγιος βασίλειος άγιος βασίλης
@ -641,4 +644,5 @@ PROPER_NAMES = set("""
ωρολόγιον ωρωπός ωσηέ όγκα όγκατα όγκι όθρυς όθων όιτα όλγα όλιβερ όλυμπος
όμουρα όμπιδος όνειρος όνο όρεγκον όσακι όσατο όσκαρ όσλο όταμα ότσου όφενμπαχ
όχιρα ύδρα ύδρος ύψιστος ώλενος ώρες ώρχους ώστιν ἀλεξανδρούπολις ἀμαλιούπολις
""".split())
""".split()
)

View File

@ -1,6 +1,8 @@
# coding: utf8
from __future__ import unicode_literals
VERBS = set("""
VERBS = set(
"""
'γγίζω άγομαι άγχομαι άγω άδω άπτομαι άπωσον άρχομαι άρχω άφτω έγκειται έκιοσε
έπομαι έρπω έρχομαι έστω έχω ήγγικεν ήθελε ίπταμαι ίσταμαι αίρομαι αίρω
αβαντάρω αβαντζάρω αβαντσάρω αβαράρω αβασκαίνω αβγατίζω αβγαταίνω αβγοκόβω
@ -1186,4 +1188,5 @@ VERBS = set("""
ωρύομαι ωτακουστώ ωτοσκοπώ ωφελούμαι ωφελώ ωχραίνω ωχριώ όζω όψομαι ἀδικῶ
ἀκροῶμαι ἀλέθω ἀμελῶ ἀναπτερυγιάζω ἀναπτερώνω ἀναπτερώνω ἀνασαίνω ἀναταράσσω
ἀναφτερουγίζω ἀναφτερουγιάζω ἀναφτερώνω ἀναχωρίζω ἀντιμετρῶ ἀράζω ἀφοδεύω
""".split())
""".split()
)

View File

@ -1,7 +1,6 @@
# coding: utf8
from __future__ import unicode_literals
VERBS_IRREG = {
"είσαι": ("είμαι",),
"είναι": ("είμαι",),
@ -196,5 +195,4 @@ VERBS_IRREG = {
"έφθασες": ("φτάνω",),
"έφθασε": ("φτάνω",),
"έφθασαν": ("φτάνω",),
}

View File

@ -1,34 +1,45 @@
# coding: utf8
from __future__ import unicode_literals
import re
import pickle
from gensim.corpora.wikicorpus import extract_pages
regex = re.compile(r'==={{(\w+)\|el}}===')
regex2 = re.compile(r'==={{(\w+ \w+)\|el}}===')
regex = re.compile(r"==={{(\w+)\|el}}===")
regex2 = re.compile(r"==={{(\w+ \w+)\|el}}===")
# get words based on the Wiktionary dump
# check only for specific parts
# ==={{κύριο όνομα|el}}===
expected_parts = ['μετοχή', 'ρήμα', 'επίθετο',
'επίρρημα', 'ουσιαστικό', 'κύριο όνομα', 'άρθρο']
expected_parts = [
"μετοχή",
"ρήμα",
"επίθετο",
"επίρρημα",
"ουσιαστικό",
"κύριο όνομα",
"άρθρο",
]
unwanted_parts = '''
unwanted_parts = """
{'αναγραμματισμοί': 2, 'σύνδεσμος': 94, 'απαρέμφατο': 1, 'μορφή άρθρου': 1, 'ένθημα': 1, 'μερική συνωνυμία': 57, 'ορισμός': 1, 'σημείωση': 3, 'πρόσφυμα': 3, 'ταυτόσημα': 8, 'χαρακτήρας': 51, 'μορφή επιρρήματος': 1, 'εκφράσεις': 22, 'ρηματικό σχήμα': 3, 'πολυλεκτικό επίρρημα': 2, 'μόριο': 35, 'προφορά': 412, 'ρηματική έκφραση': 15, 'λογοπαίγνια': 2, 'πρόθεση': 46, 'ρηματικό επίθετο': 1, 'κατάληξη επιρρημάτων': 10, 'συναφείς όροι': 1, 'εξωτερικοί σύνδεσμοι': 1, 'αρσενικό γένος': 1, 'πρόθημα': 169, 'κατάληξη': 3, 'υπώνυμα': 7, 'επιφώνημα': 197, 'ρηματικός τύπος': 1, 'συντομομορφή': 560, 'μορφή ρήματος': 68282, 'μορφή επιθέτου': 61779, 'μορφές': 71, 'ιδιωματισμός': 2, 'πολυλεκτικός όρος': 719, 'πολυλεκτικό ουσιαστικό': 180, 'παράγωγα': 25, 'μορφή μετοχής': 806, 'μορφή αριθμητικού': 3, 'άκλιτο': 1, 'επίθημα': 181, 'αριθμητικό': 129, 'συγγενικά': 94, 'σημειώσεις': 45, 'Ιδιωματισμός': 1, 'ρητά': 12, 'φράση': 9, 'συνώνυμα': 556, 'μεταφράσεις': 1, 'κατάληξη ρημάτων': 15, 'σύνθετα': 27, 'υπερώνυμα': 1, 'εναλλακτικός τύπος': 22, 'μορφή ουσιαστικού': 35122, 'επιρρηματική έκφραση': 12, 'αντώνυμα': 76, 'βλέπε': 7, 'μορφή αντωνυμίας': 51, 'αντωνυμία': 100, 'κλίση': 11, 'σύνθετοι τύποι': 1, 'παροιμία': 5, 'μορφή_επιθέτου': 2, 'έκφραση': 738, 'σύμβολο': 8, 'πολυλεκτικό επίθετο': 1, 'ετυμολογία': 867}
'''
"""
wiktionary_file_path = '/data/gsoc2018-spacy/spacy/lang/el/res/elwiktionary-latest-pages-articles.xml'
wiktionary_file_path = (
"/data/gsoc2018-spacy/spacy/lang/el/res/elwiktionary-latest-pages-articles.xml"
)
proper_names_dict={
'ουσιαστικό':'nouns',
'επίθετο':'adjectives',
'άρθρο':'dets',
'επίρρημα':'adverbs',
'κύριο όνομα': 'proper_names',
'μετοχή': 'participles',
'ρήμα': 'verbs'
proper_names_dict = {
"ουσιαστικό": "nouns",
"επίθετο": "adjectives",
"άρθρο": "dets",
"επίρρημα": "adverbs",
"κύριο όνομα": "proper_names",
"μετοχή": "participles",
"ρήμα": "verbs",
}
expected_parts_dict = {}
for expected_part in expected_parts:
@ -36,7 +47,7 @@ for expected_part in expected_parts:
other_parts = {}
for title, text, pageid in extract_pages(wiktionary_file_path):
if text.startswith('#REDIRECT'):
if text.startswith("#REDIRECT"):
continue
title = title.lower()
all_regex = regex.findall(text)
@ -47,20 +58,17 @@ for title, text, pageid in extract_pages(wiktionary_file_path):
for i in expected_parts_dict:
with open('_{0}.py'.format(proper_names_dict[i]), 'w') as f:
f.write('from __future__ import unicode_literals\n')
f.write('{} = set(\"\"\"\n'.format(proper_names_dict[i].upper()))
with open("_{0}.py".format(proper_names_dict[i]), "w") as f:
f.write("from __future__ import unicode_literals\n")
f.write('{} = set("""\n'.format(proper_names_dict[i].upper()))
words = sorted(expected_parts_dict[i])
line = ''
line = ""
to_write = []
for word in words:
if len(line + ' ' + word) > 79:
if len(line + " " + word) > 79:
to_write.append(line)
line = ''
line = ""
else:
line = line + ' ' + word
f.write('\n'.join(to_write))
f.write('\n\"\"\".split())')
line = line + " " + word
f.write("\n".join(to_write))
f.write('\n""".split())')

View File

@ -3,18 +3,18 @@ from __future__ import unicode_literals
from ....symbols import NOUN, VERB, ADJ, PUNCT
'''
Greek language lemmatizer applies the default rule based lemmatization
procedure with some modifications for better Greek language support.
The first modification is that it checks if the word for lemmatization is
already a lemma and if yes, it just returns it.
The second modification is about removing the base forms function which is
not applicable for Greek language.
'''
class GreekLemmatizer(object):
"""
The Greek lemmatizer applies the default rule-based lemmatization
procedure, with two modifications for better Greek support.
First, it checks whether the word is already a lemma and, if so,
returns it unchanged.
Second, it removes the base-forms step, which is not applicable
to Greek.
"""
@classmethod
def load(cls, path, index=None, exc=None, rules=None, lookup=None):
return cls(index, exc, rules, lookup)
@ -28,26 +28,29 @@ class GreekLemmatizer(object):
def __call__(self, string, univ_pos, morphology=None):
if not self.rules:
return [self.lookup_table.get(string, string)]
if univ_pos in (NOUN, 'NOUN', 'noun'):
univ_pos = 'noun'
elif univ_pos in (VERB, 'VERB', 'verb'):
univ_pos = 'verb'
elif univ_pos in (ADJ, 'ADJ', 'adj'):
univ_pos = 'adj'
elif univ_pos in (PUNCT, 'PUNCT', 'punct'):
univ_pos = 'punct'
if univ_pos in (NOUN, "NOUN", "noun"):
univ_pos = "noun"
elif univ_pos in (VERB, "VERB", "verb"):
univ_pos = "verb"
elif univ_pos in (ADJ, "ADJ", "adj"):
univ_pos = "adj"
elif univ_pos in (PUNCT, "PUNCT", "punct"):
univ_pos = "punct"
else:
return list(set([string.lower()]))
lemmas = lemmatize(string, self.index.get(univ_pos, {}),
lemmas = lemmatize(
string,
self.index.get(univ_pos, {}),
self.exc.get(univ_pos, {}),
self.rules.get(univ_pos, []))
self.rules.get(univ_pos, []),
)
return lemmas
def lemmatize(string, index, exceptions, rules):
string = string.lower()
forms = []
if (string in index):
if string in index:
forms.append(string)
return forms
forms.extend(exceptions.get(string, []))
@ -55,7 +58,7 @@ def lemmatize(string, index, exceptions, rules):
if not forms:
for old, new in rules:
if string.endswith(old):
form = string[:len(string) - len(old)] + new
form = string[: len(string) - len(old)] + new
if not form:
pass
elif form in index or not form.isalpha():

View File

@ -4,43 +4,100 @@ from __future__ import unicode_literals
from ...attrs import LIKE_NUM
_num_words = ['μηδέν', 'ένας', 'δυο', 'δυό', 'τρεις', 'τέσσερις', 'πέντε',
'έξι', 'εφτά', 'επτά', 'οκτώ', 'οχτώ',
'εννιά', 'εννέα', 'δέκα', 'έντεκα', 'ένδεκα', 'δώδεκα',
'δεκατρείς', 'δεκατέσσερις', 'δεκαπέντε', 'δεκαέξι', 'δεκαεπτά',
'δεκαοχτώ', 'δεκαεννέα', 'δεκαεννεα', 'είκοσι', 'τριάντα',
'σαράντα', 'πενήντα', 'εξήντα', 'εβδομήντα', 'ογδόντα',
'ενενήντα', 'εκατό', 'διακόσιοι', 'διακόσοι', 'τριακόσιοι',
'τριακόσοι', 'τετρακόσιοι', 'τετρακόσοι', 'πεντακόσιοι',
'πεντακόσοι', 'εξακόσιοι', 'εξακόσοι', 'εφτακόσιοι', 'εφτακόσοι',
'επτακόσιοι', 'επτακόσοι', 'οχτακόσιοι', 'οχτακόσοι',
'οκτακόσιοι', 'οκτακόσοι', 'εννιακόσιοι', 'χίλιοι', 'χιλιάδα',
'εκατομμύριο', 'δισεκατομμύριο', 'τρισεκατομμύριο', 'τετράκις',
'πεντάκις', 'εξάκις', 'επτάκις', 'οκτάκις', 'εννεάκις', 'ένα',
'δύο', 'τρία', 'τέσσερα', 'δις', 'χιλιάδες']
_num_words = [
"μηδέν",
"ένας",
"δυο",
"δυό",
"τρεις",
"τέσσερις",
"πέντε",
"έξι",
"εφτά",
"επτά",
"οκτώ",
"οχτώ",
"εννιά",
"εννέα",
"δέκα",
"έντεκα",
"ένδεκα",
"δώδεκα",
"δεκατρείς",
"δεκατέσσερις",
"δεκαπέντε",
"δεκαέξι",
"δεκαεπτά",
"δεκαοχτώ",
"δεκαεννέα",
"δεκαεννεα",
"είκοσι",
"τριάντα",
"σαράντα",
"πενήντα",
"εξήντα",
"εβδομήντα",
"ογδόντα",
"ενενήντα",
"εκατό",
"διακόσιοι",
"διακόσοι",
"τριακόσιοι",
"τριακόσοι",
"τετρακόσιοι",
"τετρακόσοι",
"πεντακόσιοι",
"πεντακόσοι",
"εξακόσιοι",
"εξακόσοι",
"εφτακόσιοι",
"εφτακόσοι",
"επτακόσιοι",
"επτακόσοι",
"οχτακόσιοι",
"οχτακόσοι",
"οκτακόσιοι",
"οκτακόσοι",
"εννιακόσιοι",
"χίλιοι",
"χιλιάδα",
"εκατομμύριο",
"δισεκατομμύριο",
"τρισεκατομμύριο",
"τετράκις",
"πεντάκις",
"εξάκις",
"επτάκις",
"οκτάκις",
"εννεάκις",
"ένα",
"δύο",
"τρία",
"τέσσερα",
"δις",
"χιλιάδες",
]
def like_num(text):
if text.startswith(('+', '-', '±', '~')):
if text.startswith(("+", "-", "±", "~")):
text = text[1:]
text = text.replace(',', '').replace('.', '')
text = text.replace(",", "").replace(".", "")
if text.isdigit():
return True
if text.count('/') == 1:
num, denom = text.split('/')
if text.count("/") == 1:
num, denom = text.split("/")
if num.isdigit() and denom.isdigit():
return True
if text.count('^') == 1:
num, denom = text.split('^')
if text.count("^") == 1:
num, denom = text.split("^")
if num.isdigit() and denom.isdigit():
return True
if text.lower() in _num_words or text.lower().split(' ')[0] in _num_words:
if text.lower() in _num_words or text.lower().split(" ")[0] in _num_words:
return True
if text in _num_words:
return True
return False
LEX_ATTRS = {
LIKE_NUM: like_num
}
LEX_ATTRS = {LIKE_NUM: like_num}
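A quick check of the `like_num` predicate above; the import path follows the module layout shown in this diff (the Greek `lex_attrs` module).

```python
# Quick check of like_num on a few strings; expected results in the comment.
from spacy.lang.el.lex_attrs import like_num

for text in ["δέκα", "10", "3/4", "10^2", "χίλιοι", "σπίτι"]:
    print(text, like_num(text))  # the first five should be True, "σπίτι" False
```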

View File

@ -3,8 +3,6 @@ from __future__ import unicode_literals
# These exceptions are used to add NORM values based on a token's ORTH value.
# Norms are only set if no alternative is provided in the tokenizer exceptions.
_exc = {

View File

@ -6,66 +6,91 @@ from ..char_classes import LIST_PUNCT, LIST_ELLIPSES, LIST_QUOTES, LIST_CURRENCY
from ..char_classes import LIST_ICONS, ALPHA_LOWER, ALPHA_UPPER, ALPHA, HYPHENS
from ..char_classes import QUOTES, CURRENCY
_units = ('km km² km³ m m² m³ dm dm² dm³ cm cm² cm³ mm mm² mm³ ha µm nm yd in ft '
'kg g mg µg t lb oz m/s km/h kmh mph hPa Pa mbar mb MB kb KB gb GB tb '
'TB T G M K км км² км³ м м² м³ дм дм² дм³ см см² см³ мм мм² мм³ нм '
'кг г мг м/с км/ч кПа Па мбар Кб КБ кб Мб МБ мб Гб ГБ гб Тб ТБ тб')
_units = (
"km km² km³ m m² m³ dm dm² dm³ cm cm² cm³ mm mm² mm³ ha µm nm yd in ft "
"kg g mg µg t lb oz m/s km/h kmh mph hPa Pa mbar mb MB kb KB gb GB tb "
"TB T G M K км км² км³ м м² м³ дм дм² дм³ см см² см³ мм мм² мм³ нм "
"кг г мг м/с км/ч кПа Па мбар Кб КБ кб Мб МБ мб Гб ГБ гб Тб ТБ тб"
)
def merge_chars(char): return char.strip().replace(' ', '|')
def merge_chars(char):
return char.strip().replace(" ", "|")
UNITS = merge_chars(_units)
_prefixes = (['\'\'', '§', '%', '=', r'\+[0-9]+%', # 90%
r'\'([0-9]){2}([\-]\'([0-9]){2})*', # '12'-13
r'\-([0-9]){1,9}\.([0-9]){1,9}', # -12.13
r'\'([Α-Ωα-ωίϊΐόάέύϋΰήώ]+)\'', # 'αβγ'
r'([Α-Ωα-ωίϊΐόάέύϋΰήώ]){1,3}\'', # αβγ'
r'http://www.[A-Za-z]+\-[A-Za-z]+(\.[A-Za-z]+)+(\/[A-Za-z]+)*(\.[A-Za-z]+)*',
r'[ΈΆΊΑ-Ωα-ωίϊΐόάέύϋΰήώ]+\*', # όνομα*
r'\$([0-9])+([\,\.]([0-9])+){0,1}',
] + LIST_PUNCT + LIST_ELLIPSES + LIST_QUOTES +
LIST_CURRENCY + LIST_ICONS)
_prefixes = (
[
"''",
"§",
"%",
"=",
r"\+[0-9]+%", # 90%
r"\'([0-9]){2}([\-]\'([0-9]){2})*", # '12'-13
r"\-([0-9]){1,9}\.([0-9]){1,9}", # -12.13
r"\'([Α-Ωα-ωίϊΐόάέύϋΰήώ]+)\'", # 'αβγ'
r"([Α-Ωα-ωίϊΐόάέύϋΰήώ]){1,3}\'", # αβγ'
r"http://www.[A-Za-z]+\-[A-Za-z]+(\.[A-Za-z]+)+(\/[A-Za-z]+)*(\.[A-Za-z]+)*",
r"[ΈΆΊΑ-Ωα-ωίϊΐόάέύϋΰήώ]+\*", # όνομα*
r"\$([0-9])+([\,\.]([0-9])+){0,1}",
]
+ LIST_PUNCT
+ LIST_ELLIPSES
+ LIST_QUOTES
+ LIST_CURRENCY
+ LIST_ICONS
)
_suffixes = (LIST_PUNCT + LIST_ELLIPSES + LIST_QUOTES + LIST_ICONS +
[r'(?<=[0-9])\+', # 12+
r'([0-9])+\'', # 12'
r'([A-Za-z])?\'', # a'
r'^([0-9]){1,2}\.', # 12.
r' ([0-9]){1,2}\.', # 12.
r'([0-9]){1}\) ', # 12)
r'^([0-9]){1}\)$', # 12)
r'(?<=°[FfCcKk])\.',
r'([0-9])+\&', # 12&
r'(?<=[0-9])(?:{})'.format(CURRENCY),
r'(?<=[0-9])(?:{})'.format(UNITS),
r'(?<=[0-9{}{}(?:{})])\.'.format(ALPHA_LOWER, r'²\-\)\]\+', QUOTES),
r'(?<=[{a}][{a}])\.'.format(a=ALPHA_UPPER),
r'(?<=[Α-Ωα-ωίϊΐόάέύϋΰήώ])\-', # όνομα-
r'(?<=[Α-Ωα-ωίϊΐόάέύϋΰήώ])\.',
r'^[Α-Ω]{1}\.',
r'\ [Α-Ω]{1}\.',
_suffixes = (
LIST_PUNCT
+ LIST_ELLIPSES
+ LIST_QUOTES
+ LIST_ICONS
+ [
r"(?<=[0-9])\+", # 12+
r"([0-9])+\'", # 12'
r"([A-Za-z])?\'", # a'
r"^([0-9]){1,2}\.", # 12.
r" ([0-9]){1,2}\.", # 12.
r"([0-9]){1}\) ", # 12)
r"^([0-9]){1}\)$", # 12)
r"(?<=°[FfCcKk])\.",
r"([0-9])+\&", # 12&
r"(?<=[0-9])(?:{})".format(CURRENCY),
r"(?<=[0-9])(?:{})".format(UNITS),
r"(?<=[0-9{}{}(?:{})])\.".format(ALPHA_LOWER, r"²\-\)\]\+", QUOTES),
r"(?<=[{a}][{a}])\.".format(a=ALPHA_UPPER),
r"(?<=[Α-Ωα-ωίϊΐόάέύϋΰήώ])\-", # όνομα-
r"(?<=[Α-Ωα-ωίϊΐόάέύϋΰήώ])\.",
r"^[Α-Ω]{1}\.",
r"\ [Α-Ω]{1}\.",
# πρώτος-δεύτερος , πρώτος-δεύτερος-τρίτος
r'[ΈΆΊΑΌ-Ωα-ωίϊΐόάέύϋΰήώ]+([\-]([ΈΆΊΑΌ-Ωα-ωίϊΐόάέύϋΰήώ]+))+',
r'([0-9]+)mg', # 13mg
r'([0-9]+)\.([0-9]+)m' # 1.2m
])
r"[ΈΆΊΑΌ-Ωα-ωίϊΐόάέύϋΰήώ]+([\-]([ΈΆΊΑΌ-Ωα-ωίϊΐόάέύϋΰήώ]+))+",
r"([0-9]+)mg", # 13mg
r"([0-9]+)\.([0-9]+)m", # 1.2m
]
)
_infixes = (LIST_ELLIPSES + LIST_ICONS +
[r'(?<=[0-9])[+\/\-\*^](?=[0-9])', # 1/2 , 1-2 , 1*2
r'([a-zA-Z]+)\/([a-zA-Z]+)\/([a-zA-Z]+)', # name1/name2/name3
r'([0-9])+(\.([0-9]+))*([\-]([0-9])+)+', # 10.9 , 10.9.9 , 10.9-6
r'([0-9])+[,]([0-9])+[\-]([0-9])+[,]([0-9])+', # 10,11,12
r'([0-9])+[ης]+([\-]([0-9])+)+', # 1ης-2
_infixes = (
LIST_ELLIPSES
+ LIST_ICONS
+ [
r"(?<=[0-9])[+\/\-\*^](?=[0-9])", # 1/2 , 1-2 , 1*2
r"([a-zA-Z]+)\/([a-zA-Z]+)\/([a-zA-Z]+)", # name1/name2/name3
r"([0-9])+(\.([0-9]+))*([\-]([0-9])+)+", # 10.9 , 10.9.9 , 10.9-6
r"([0-9])+[,]([0-9])+[\-]([0-9])+[,]([0-9])+", # 10,11,12
r"([0-9])+[ης]+([\-]([0-9])+)+", # 1ης-2
# 15/2 , 15/2/17 , 2017/2/15
r'([0-9]){1,4}[\/]([0-9]){1,2}([\/]([0-9]){0,4}){0,1}',
r'[A-Za-z]+\@[A-Za-z]+(\-[A-Za-z]+)*\.[A-Za-z]+', # abc@cde-fgh.a
r'([a-zA-Z]+)(\-([a-zA-Z]+))+', # abc-abc
r'(?<=[{}])\.(?=[{}])'.format(ALPHA_LOWER, ALPHA_UPPER),
r'(?<=[{a}]),(?=[{a}])'.format(a=ALPHA),
r"([0-9]){1,4}[\/]([0-9]){1,2}([\/]([0-9]){0,4}){0,1}",
r"[A-Za-z]+\@[A-Za-z]+(\-[A-Za-z]+)*\.[A-Za-z]+", # abc@cde-fgh.a
r"([a-zA-Z]+)(\-([a-zA-Z]+))+", # abc-abc
r"(?<=[{}])\.(?=[{}])".format(ALPHA_LOWER, ALPHA_UPPER),
r"(?<=[{a}]),(?=[{a}])".format(a=ALPHA),
r'(?<=[{a}])[?";:=,.]*(?:{h})(?=[{a}])'.format(a=ALPHA, h=HYPHENS),
r'(?<=[{a}"])[:<>=/](?=[{a}])'.format(a=ALPHA)])
r'(?<=[{a}"])[:<>=/](?=[{a}])'.format(a=ALPHA),
]
)
TOKENIZER_PREFIXES = _prefixes
TOKENIZER_SUFFIXES = _suffixes
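A hedged sketch of how these lists are typically consumed: spaCy compiles them into single regexes with its `util` helpers. The import path `spacy.lang.el.punctuation` is an assumption about this file's location.

```python
# Sketch: compile the suffix list into one regex and probe it with "12+",
# which should match the r"(?<=[0-9])\+" rule defined above.
from spacy.util import compile_suffix_regex
from spacy.lang.el.punctuation import TOKENIZER_SUFFIXES

suffix_re = compile_suffix_regex(TOKENIZER_SUFFIXES)
print(bool(suffix_re.search("12+")))  # expected: True
```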

View File

@ -1,13 +1,11 @@
# -*- coding: utf-8 -*-
# coding: utf8
from __future__ import unicode_literals
# Stop words
# Link to greek stop words: https://www.translatum.gr/forum/index.php?topic=3550.0?topic=3550.0
STOP_WORDS = set("""
STOP_WORDS = set(
"""
αδιάκοπα αι ακόμα ακόμη ακριβώς άλλα αλλά αλλαχού άλλες άλλη άλλην
άλλης αλλιώς αλλιώτικα άλλο άλλοι αλλοιώς αλλοιώτικα άλλον άλλος άλλοτε αλλού
άλλους άλλων άμα άμεσα αμέσως αν ανά ανάμεσα αναμεταξύ άνευ αντί αντίπερα αντίς
@ -89,4 +87,5 @@ STOP_WORDS = set("""
χωρίς χωριστά
ω ως ωσάν ωσότου ώσπου ώστε ωστόσο ωχ
""".split())
""".split()
)

View File

@ -8,18 +8,16 @@ def noun_chunks(obj):
"""
Detect base noun phrases. Works on both Doc and Span.
"""
# it follows the logic of the noun chunks finder of English language,
# It follows the logic of the English noun chunks finder,
# adjusted for some special characteristics of the Greek language.
# The obj tag corrects some dependency-parser mistakes;
# further improvements to the models will eliminate the need for this tag.
labels = ['nsubj', 'obj', 'iobj', 'appos', 'ROOT', 'obl']
labels = ["nsubj", "obj", "iobj", "appos", "ROOT", "obl"]
doc = obj.doc # Ensure works on both Doc and Span.
np_deps = [doc.vocab.strings.add(label) for label in labels]
conj = doc.vocab.strings.add('conj')
nmod = doc.vocab.strings.add('nmod')
np_label = doc.vocab.strings.add('NP')
conj = doc.vocab.strings.add("conj")
nmod = doc.vocab.strings.add("nmod")
np_label = doc.vocab.strings.add("NP")
seen = set()
for i, word in enumerate(obj):
if word.pos not in (NOUN, PROPN, PRON):
@ -31,16 +29,17 @@ def noun_chunks(obj):
if any(w.i in seen for w in word.subtree):
continue
flag = False
if (word.pos == NOUN):
if word.pos == NOUN:
# check for patterns such as γραμμή παραγωγής
for potential_nmod in word.rights:
if (potential_nmod.dep == nmod):
seen.update(j for j in range(
word.left_edge.i, potential_nmod.i + 1))
if potential_nmod.dep == nmod:
seen.update(
j for j in range(word.left_edge.i, potential_nmod.i + 1)
)
yield word.left_edge.i, potential_nmod.i + 1, np_label
flag = True
break
if (flag is False):
if flag is False:
seen.update(j for j in range(word.left_edge.i, word.i + 1))
yield word.left_edge.i, word.i + 1, np_label
elif word.dep == conj:
@ -56,6 +55,4 @@ def noun_chunks(obj):
yield word.left_edge.i, word.i + 1, np_label
SYNTAX_ITERATORS = {
'noun_chunks': noun_chunks
}
SYNTAX_ITERATORS = {"noun_chunks": noun_chunks}
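A hedged sketch of the nmod pattern mentioned in the comments above (e.g. "γραμμή παραγωγής", "production line"); it assumes a Greek model such as `el_core_news_sm` is installed, and the chunk boundaries depend on the parser's analysis.

```python
# Sketch: if the parser attaches "παραγωγής" to "γραμμή" as nmod, the chunk
# is extended to cover the whole phrase, as implemented above.
import spacy

nlp = spacy.load("el_core_news_sm")  # assumes this model is installed
doc = nlp("Η γραμμή παραγωγής σταμάτησε.")
print([chunk.text for chunk in doc.noun_chunks])
# expected (parser-dependent): ['Η γραμμή παραγωγής']
```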

File diff suppressed because it is too large

View File

@ -1,3 +1,4 @@
# coding: utf8
from __future__ import unicode_literals
from ...symbols import POS, ADV, NOUN, ADP, PRON, SCONJ, PROPN, DET, SYM, INTJ
@ -22,5 +23,5 @@ TAG_MAP = {
"AUX": {POS: AUX},
"SPACE": {POS: SPACE},
"DET": {POS: DET},
"X": {POS: X}
"X": {POS: X},
}

View File

@ -1,303 +1,132 @@
# -*- coding: utf-8 -*-
# coding: utf8
from __future__ import unicode_literals
from ...symbols import ORTH, LEMMA, NORM
_exc = {}
for token in ["Απ'", "ΑΠ'", "αφ'", "Αφ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "από", NORM: "από"}
]
_exc[token] = [{ORTH: token, LEMMA: "από", NORM: "από"}]
for token in ["Αλλ'", "αλλ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "αλλά", NORM: "αλλά"}
]
_exc[token] = [{ORTH: token, LEMMA: "αλλά", NORM: "αλλά"}]
for token in ["παρ'", "Παρ'", "ΠΑΡ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "παρά", NORM: "παρά"}
]
_exc[token] = [{ORTH: token, LEMMA: "παρά", NORM: "παρά"}]
for token in ["καθ'", "Καθ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "κάθε", NORM: "κάθε"}
]
_exc[token] = [{ORTH: token, LEMMA: "κάθε", NORM: "κάθε"}]
for token in ["κατ'", "Κατ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "κατά", NORM: "κατά"}
]
_exc[token] = [{ORTH: token, LEMMA: "κατά", NORM: "κατά"}]
for token in ["'ΣΟΥΝ", "'ναι", "'ταν", "'τανε", "'μαστε", "'μουνα", "'μουν"]:
_exc[token] = [
{ORTH: token, LEMMA: "είμαι", NORM: "είμαι"}
]
_exc[token] = [{ORTH: token, LEMMA: "είμαι", NORM: "είμαι"}]
for token in ["Επ'", "επ'", "εφ'", "Εφ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "επί", NORM: "επί"}
]
_exc[token] = [{ORTH: token, LEMMA: "επί", NORM: "επί"}]
for token in ["Δι'", "δι'"]:
_exc[token] = [
{ORTH: token, LEMMA: "δια", NORM: "δια"}
]
_exc[token] = [{ORTH: token, LEMMA: "δια", NORM: "δια"}]
for token in ["'χουν", "'χουμε", "'χαμε", "'χα", "'χε", "'χεις", "'χει"]:
_exc[token] = [
{ORTH: token, LEMMA: "έχω", NORM: "έχω"}
]
_exc[token] = [{ORTH: token, LEMMA: "έχω", NORM: "έχω"}]
for token in ["υπ'", "Υπ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "υπό", NORM: "υπό"}
]
_exc[token] = [{ORTH: token, LEMMA: "υπό", NORM: "υπό"}]
for token in ["Μετ'", "ΜΕΤ'", "'μετ"]:
_exc[token] = [
{ORTH: token, LEMMA: "μετά", NORM: "μετά"}
]
_exc[token] = [{ORTH: token, LEMMA: "μετά", NORM: "μετά"}]
for token in ["Μ'", "μ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "με", NORM: "με"}
]
_exc[token] = [{ORTH: token, LEMMA: "με", NORM: "με"}]
for token in ["Γι'", "ΓΙ'", "γι'"]:
_exc[token] = [
{ORTH: token, LEMMA: "για", NORM: "για"}
]
_exc[token] = [{ORTH: token, LEMMA: "για", NORM: "για"}]
for token in ["Σ'", "σ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "σε", NORM: "σε"}
]
_exc[token] = [{ORTH: token, LEMMA: "σε", NORM: "σε"}]
for token in ["Θ'", "θ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "θα", NORM: "θα"}
]
_exc[token] = [{ORTH: token, LEMMA: "θα", NORM: "θα"}]
for token in ["Ν'", "ν'"]:
_exc[token] = [
{ORTH: token, LEMMA: "να", NORM: "να"}
]
_exc[token] = [{ORTH: token, LEMMA: "να", NORM: "να"}]
for token in ["Τ'", "τ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "να", NORM: "να"}
]
_exc[token] = [{ORTH: token, LEMMA: "να", NORM: "να"}]
for token in ["'γω", "'σένα", "'μεις"]:
_exc[token] = [
{ORTH: token, LEMMA: "εγώ", NORM: "εγώ"}
]
_exc[token] = [{ORTH: token, LEMMA: "εγώ", NORM: "εγώ"}]
for token in ["Τ'", "τ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "το", NORM: "το"}
]
_exc[token] = [{ORTH: token, LEMMA: "το", NORM: "το"}]
for token in ["Φέρ'", "Φερ'", "φέρ'", "φερ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "φέρνω", NORM: "φέρνω"}
]
_exc[token] = [{ORTH: token, LEMMA: "φέρνω", NORM: "φέρνω"}]
for token in ["'ρθούνε", "'ρθουν", "'ρθει", "'ρθεί", "'ρθε", "'ρχεται"]:
_exc[token] = [
{ORTH: token, LEMMA: "έρχομαι", NORM: "έρχομαι"}
]
_exc[token] = [{ORTH: token, LEMMA: "έρχομαι", NORM: "έρχομαι"}]
for token in ["'πανε", "'λεγε", "'λεγαν", "'πε", "'λεγα"]:
_exc[token] = [
{ORTH: token, LEMMA: "λέγω", NORM: "λέγω"}
]
_exc[token] = [{ORTH: token, LEMMA: "λέγω", NORM: "λέγω"}]
for token in ["Πάρ'", "πάρ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "παίρνω", NORM: "παίρνω"}
]
_exc[token] = [{ORTH: token, LEMMA: "παίρνω", NORM: "παίρνω"}]
for token in ["μέσ'", "Μέσ'", "μεσ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "μέσα", NORM: "μέσα"}
]
_exc[token] = [{ORTH: token, LEMMA: "μέσα", NORM: "μέσα"}]
for token in ["Δέσ'", "Δεσ'", "δεσ'"]:
_exc[token] = [
{ORTH: token, LEMMA: "δένω", NORM: "δένω"}
]
_exc[token] = [{ORTH: token, LEMMA: "δένω", NORM: "δένω"}]
for token in ["'κανε", "Κάν'"]:
_exc[token] = [
{ORTH: token, LEMMA: "κάνω", NORM: "κάνω"}
]
_exc[token] = [{ORTH: token, LEMMA: "κάνω", NORM: "κάνω"}]
_other_exc = {
"κι": [
{ORTH: "κι", LEMMA: "και", NORM: "και"},
],
"Παίξ'": [
{ORTH: "Παίξ'", LEMMA: "παίζω", NORM: "παίζω"},
],
"Αντ'": [
{ORTH: "Αντ'", LEMMA: "αντί", NORM: "αντί"},
],
"ολ'": [
{ORTH: "ολ'", LEMMA: "όλος", NORM: "όλος"},
],
"ύστερ'": [
{ORTH: "ύστερ'", LEMMA: "ύστερα", NORM: "ύστερα"},
],
"'πρεπε": [
{ORTH: "'πρεπε", LEMMA: "πρέπει", NORM: "πρέπει"},
],
"Δύσκολ'": [
{ORTH: "Δύσκολ'", LEMMA: "δύσκολος", NORM: "δύσκολος"},
],
"'θελα": [
{ORTH: "'θελα", LEMMA: "θέλω", NORM: "θέλω"},
],
"'γραφα": [
{ORTH: "'γραφα", LEMMA: "γράφω", NORM: "γράφω"},
],
"'παιρνα": [
{ORTH: "'παιρνα", LEMMA: "παίρνω", NORM: "παίρνω"},
],
"'δειξε": [
{ORTH: "'δειξε", LEMMA: "δείχνω", NORM: "δείχνω"},
],
"όμουρφ'": [
{ORTH: "όμουρφ'", LEMMA: "όμορφος", NORM: "όμορφος"},
],
"κ'τσή": [
{ORTH: "κ'τσή", LEMMA: "κουτσός", NORM: "κουτσός"},
],
"μηδ'": [
{ORTH: "μηδ'", LEMMA: "μήδε", NORM: "μήδε"},
],
"κι": [{ORTH: "κι", LEMMA: "και", NORM: "και"}],
"Παίξ'": [{ORTH: "Παίξ'", LEMMA: "παίζω", NORM: "παίζω"}],
"Αντ'": [{ORTH: "Αντ'", LEMMA: "αντί", NORM: "αντί"}],
"ολ'": [{ORTH: "ολ'", LEMMA: "όλος", NORM: "όλος"}],
"ύστερ'": [{ORTH: "ύστερ'", LEMMA: "ύστερα", NORM: "ύστερα"}],
"'πρεπε": [{ORTH: "'πρεπε", LEMMA: "πρέπει", NORM: "πρέπει"}],
"Δύσκολ'": [{ORTH: "Δύσκολ'", LEMMA: "δύσκολος", NORM: "δύσκολος"}],
"'θελα": [{ORTH: "'θελα", LEMMA: "θέλω", NORM: "θέλω"}],
"'γραφα": [{ORTH: "'γραφα", LEMMA: "γράφω", NORM: "γράφω"}],
"'παιρνα": [{ORTH: "'παιρνα", LEMMA: "παίρνω", NORM: "παίρνω"}],
"'δειξε": [{ORTH: "'δειξε", LEMMA: "δείχνω", NORM: "δείχνω"}],
"όμουρφ'": [{ORTH: "όμουρφ'", LEMMA: "όμορφος", NORM: "όμορφος"}],
"κ'τσή": [{ORTH: "κ'τσή", LEMMA: "κουτσός", NORM: "κουτσός"}],
"μηδ'": [{ORTH: "μηδ'", LEMMA: "μήδε", NORM: "μήδε"}],
"'ξομολογήθηκε": [
{ORTH: "'ξομολογήθηκε", LEMMA: "εξομολογούμαι", NORM: "εξομολογούμαι"},
{ORTH: "'ξομολογήθηκε", LEMMA: "εξομολογούμαι", NORM: "εξομολογούμαι"}
],
"'μας": [
{ORTH: "'μας", LEMMA: "εμάς", NORM: "εμάς"},
],
"'ξερες": [
{ORTH: "'ξερες", LEMMA: "ξέρω", NORM: "ξέρω"},
],
"έφθασ'": [
{ORTH: "έφθασ'", LEMMA: "φθάνω", NORM: "φθάνω"},
],
"εξ'": [
{ORTH: "εξ'", LEMMA: "εκ", NORM: "εκ"},
],
"δώσ'": [
{ORTH: "δώσ'", LEMMA: "δίνω", NORM: "δίνω"},
],
"τίποτ'": [
{ORTH: "τίποτ'", LEMMA: "τίποτα", NORM: "τίποτα"},
],
"Λήξ'": [
{ORTH: "Λήξ'", LEMMA: "λήγω", NORM: "λήγω"},
],
"άσ'": [
{ORTH: "άσ'", LEMMA: "αφήνω", NORM: "αφήνω"},
],
"Στ'": [
{ORTH: "Στ'", LEMMA: "στο", NORM: "στο"},
],
"Δωσ'": [
{ORTH: "Δωσ'", LEMMA: "δίνω", NORM: "δίνω"},
],
"Βάψ'": [
{ORTH: "Βάψ'", LEMMA: "βάφω", NORM: "βάφω"},
],
"Αλλ'": [
{ORTH: "Αλλ'", LEMMA: "αλλά", NORM: "αλλά"},
],
"Αμ'": [
{ORTH: "Αμ'", LEMMA: "άμα", NORM: "άμα"},
],
"Αγόρασ'": [
{ORTH: "Αγόρασ'", LEMMA: "αγοράζω", NORM: "αγοράζω"},
],
"'φύγε": [
{ORTH: "'φύγε", LEMMA: "φεύγω", NORM: "φεύγω"},
],
"'φερε": [
{ORTH: "'φερε", LEMMA: "φέρνω", NORM: "φέρνω"},
],
"'φαγε": [
{ORTH: "'φαγε", LEMMA: "τρώω", NORM: "τρώω"},
],
"'σπαγαν": [
{ORTH: "'σπαγαν", LEMMA: "σπάω", NORM: "σπάω"},
],
"'σκασε": [
{ORTH: "'σκασε", LEMMA: "σκάω", NORM: "σκάω"},
],
"'σβηνε": [
{ORTH: "'σβηνε", LEMMA: "σβήνω", NORM: "σβήνω"},
],
"'ριξε": [
{ORTH: "'ριξε", LEMMA: "ρίχνω", NORM: "ρίχνω"},
],
"'κλεβε": [
{ORTH: "'κλεβε", LEMMA: "κλέβω", NORM: "κλέβω"},
],
"'κει": [
{ORTH: "'κει", LEMMA: "εκεί", NORM: "εκεί"},
],
"'βλεπε": [
{ORTH: "'βλεπε", LEMMA: "βλέπω", NORM: "βλέπω"},
],
"'βγαινε": [
{ORTH: "'βγαινε", LEMMA: "βγαίνω", NORM: "βγαίνω"},
]
"'μας": [{ORTH: "'μας", LEMMA: "εμάς", NORM: "εμάς"}],
"'ξερες": [{ORTH: "'ξερες", LEMMA: "ξέρω", NORM: "ξέρω"}],
"έφθασ'": [{ORTH: "έφθασ'", LEMMA: "φθάνω", NORM: "φθάνω"}],
"εξ'": [{ORTH: "εξ'", LEMMA: "εκ", NORM: "εκ"}],
"δώσ'": [{ORTH: "δώσ'", LEMMA: "δίνω", NORM: "δίνω"}],
"τίποτ'": [{ORTH: "τίποτ'", LEMMA: "τίποτα", NORM: "τίποτα"}],
"Λήξ'": [{ORTH: "Λήξ'", LEMMA: "λήγω", NORM: "λήγω"}],
"άσ'": [{ORTH: "άσ'", LEMMA: "αφήνω", NORM: "αφήνω"}],
"Στ'": [{ORTH: "Στ'", LEMMA: "στο", NORM: "στο"}],
"Δωσ'": [{ORTH: "Δωσ'", LEMMA: "δίνω", NORM: "δίνω"}],
"Βάψ'": [{ORTH: "Βάψ'", LEMMA: "βάφω", NORM: "βάφω"}],
"Αλλ'": [{ORTH: "Αλλ'", LEMMA: "αλλά", NORM: "αλλά"}],
"Αμ'": [{ORTH: "Αμ'", LEMMA: "άμα", NORM: "άμα"}],
"Αγόρασ'": [{ORTH: "Αγόρασ'", LEMMA: "αγοράζω", NORM: "αγοράζω"}],
"'φύγε": [{ORTH: "'φύγε", LEMMA: "φεύγω", NORM: "φεύγω"}],
"'φερε": [{ORTH: "'φερε", LEMMA: "φέρνω", NORM: "φέρνω"}],
"'φαγε": [{ORTH: "'φαγε", LEMMA: "τρώω", NORM: "τρώω"}],
"'σπαγαν": [{ORTH: "'σπαγαν", LEMMA: "σπάω", NORM: "σπάω"}],
"'σκασε": [{ORTH: "'σκασε", LEMMA: "σκάω", NORM: "σκάω"}],
"'σβηνε": [{ORTH: "'σβηνε", LEMMA: "σβήνω", NORM: "σβήνω"}],
"'ριξε": [{ORTH: "'ριξε", LEMMA: "ρίχνω", NORM: "ρίχνω"}],
"'κλεβε": [{ORTH: "'κλεβε", LEMMA: "κλέβω", NORM: "κλέβω"}],
"'κει": [{ORTH: "'κει", LEMMA: "εκεί", NORM: "εκεί"}],
"'βλεπε": [{ORTH: "'βλεπε", LEMMA: "βλέπω", NORM: "βλέπω"}],
"'βγαινε": [{ORTH: "'βγαινε", LEMMA: "βγαίνω", NORM: "βγαίνω"}],
}
_exc.update(_other_exc)
@ -307,12 +136,14 @@ for h in range(1, 12 + 1):
for period in ["π.μ.", "πμ"]:
_exc["%d%s" % (h, period)] = [
{ORTH: "%d" % h},
{ORTH: period, LEMMA: "π.μ.", NORM: "π.μ."}]
{ORTH: period, LEMMA: "π.μ.", NORM: "π.μ."},
]
for period in ["μ.μ.", "μμ"]:
_exc["%d%s" % (h, period)] = [
{ORTH: "%d" % h},
{ORTH: period, LEMMA: "μ.μ.", NORM: "μ.μ."}]
{ORTH: period, LEMMA: "μ.μ.", NORM: "μ.μ."},
]
for exc_data in [
{ORTH: "ΑΓΡ.", LEMMA: "Αγροτικός", NORM: "Αγροτικός"},
@ -339,43 +170,228 @@ for exc_data in [
for orth in [
"$ΗΠΑ",
"Α'", "Α.Ε.", "Α.Ε.Β.Ε.", "Α.Ε.Ι.", "Α.Ε.Π.", "Α.Μ.Α.", "Α.Π.Θ.", "Α.Τ.", "Α.Χ.", "ΑΝ.", "Αγ.", "Αλ.", "Αν.",
"Αντ.", "Απ.",
"Β'", "Β)", "Β.Ζ.", "Β.Ι.Ο.", "Β.Κ.", "Β.Μ.Α.", "Βασ.",
"Γ'", "Γ)", "Γ.Γ.", "Γ.Δ.", "Γκ.",
"Δ.Ε.Η.", "Δ.Ε.Σ.Ε.", "Δ.Ν.", "Δ.Ο.Υ.", "Δ.Σ.", "Δ.Υ.", "ΔΙ.ΚΑ.Τ.Σ.Α.", "Δηλ.", "Διον.",
"Ε.Α.", "Ε.Α.Κ.", "Ε.Α.Π.", "Ε.Ε.", "Ε.Κ.", "Ε.ΚΕ.ΠΙΣ.", "Ε.Λ.Α.", "Ε.Λ.Ι.Α.", "Ε.Π.Σ.", "Ε.Π.Τ.Α.", "Ε.Σ.Ε.Ε.Κ.",
"Ε.Υ.Κ.", "ΕΕ.", "ΕΚ.", "ΕΛ.", "ΕΛ.ΑΣ.", "Εθν.", "Ελ.", "Εμ.", "Επ.", "Ευ.",
"Η'", "Η.Π.Α.",
"ΘΕ.", "Θεμ.", "Θεοδ.", "Θρ.",
"Ι.Ε.Κ.", "Ι.Κ.Α.", "Ι.Κ.Υ.", "Ι.Σ.Θ.", "Ι.Χ.", "ΙΖ'", "ΙΧ.",
"Κ.Α.Α.", "Κ.Α.Ε.", "Κ.Β.Σ.", "Κ.Δ.", "Κ.Ε.", "Κ.Ε.Κ.", "Κ.Ι.", "Κ.Κ.", "Κ.Ι.Θ.", "Κ.Ι.Θ.", "Κ.ΚΕΚ.", "Κ.Ο.",
"Κ.Π.Ρ.", "ΚΑΤ.", "ΚΚ.", "Καν.", "Καρ.", "Κατ.", "Κυρ.", "Κων.",
"Λ.Α.", "Λ.χ.", "Λ.Χ.", "Λεωφ.", "Λι.",
"Μ.Δ.Ε.", "Μ.Ε.Ο.", "Μ.Ζ.", "Μ.Μ.Ε.", "Μ.Ο.", "Μεγ.", "Μιλτ.", "Μιχ.",
"Ν.Δ.", "Ν.Ε.Α.", "Ν.Κ.", "Ν.Ο.", "Ν.Ο.Θ.", "Ν.Π.Δ.Δ.", "Ν.Υ.", "ΝΔ.", "Νικ.", "Ντ'", "Ντ.",
"Ο'", "Ο.Α.", "Ο.Α.Ε.Δ.", "Ο.Δ.", "Ο.Ε.Ε.", "Ο.Ε.Ε.Κ.", "Ο.Η.Ε.", "Ο.Κ.",
"Π.Δ.", "Π.Ε.Κ.Δ.Υ.", "Π.Ε.Π.", "Π.Μ.Σ.", "ΠΟΛ.", "Π.Χ.", "Παρ.", "Πλ.", "Πρ.",
"Σ.Δ.Ο.Ε.", "Σ.Ε.", "Σ.Ε.Κ.", "Σ.Π.Δ.Ω.Β.", "Σ.Τ.", "Σαβ.", "Στ.", "ΣτΕ.", "Στρ.",
"Τ.Α.", "Τ.Ε.Ε.", "Τ.Ε.Ι.", "ΤΡ.", "Τζ.", "Τηλ.",
"Υ.Γ.", "ΥΓ.", "ΥΠ.Ε.Π.Θ.",
"Φ.Α.Β.Ε.", "Φ.Κ.", "Φ.Σ.", "Φ.Χ.", "Φ.Π.Α.", "Φιλ.",
"Χ.Α.Α.", "ΧΡ.", "Χ.Χ.", "Χαρ.", "Χιλ.", "Χρ.",
"άγ.", "άρθρ.", "αι.", "αν.", "απ.", "αρ.", "αριθ.", "αριθμ.",
"β'", "βλ.",
"γ.γ.", "γεν.", "γραμμ.",
"δ.δ.", "δ.σ.", "δηλ.", "δισ.", "δολ.", "δρχ.",
"εκ.", "εκατ.", "ελ.",
"Α'",
"Α.Ε.",
"Α.Ε.Β.Ε.",
"Α.Ε.Ι.",
"Α.Ε.Π.",
"Α.Μ.Α.",
"Α.Π.Θ.",
"Α.Τ.",
"Α.Χ.",
"ΑΝ.",
"Αγ.",
"Αλ.",
"Αν.",
"Αντ.",
"Απ.",
"Β'",
"Β)",
"Β.Ζ.",
"Β.Ι.Ο.",
"Β.Κ.",
"Β.Μ.Α.",
"Βασ.",
"Γ'",
"Γ)",
"Γ.Γ.",
"Γ.Δ.",
"Γκ.",
"Δ.Ε.Η.",
"Δ.Ε.Σ.Ε.",
"Δ.Ν.",
"Δ.Ο.Υ.",
"Δ.Σ.",
"Δ.Υ.",
"ΔΙ.ΚΑ.Τ.Σ.Α.",
"Δηλ.",
"Διον.",
"Ε.Α.",
"Ε.Α.Κ.",
"Ε.Α.Π.",
"Ε.Ε.",
"Ε.Κ.",
"Ε.ΚΕ.ΠΙΣ.",
"Ε.Λ.Α.",
"Ε.Λ.Ι.Α.",
"Ε.Π.Σ.",
"Ε.Π.Τ.Α.",
"Ε.Σ.Ε.Ε.Κ.",
"Ε.Υ.Κ.",
"ΕΕ.",
"ΕΚ.",
"ΕΛ.",
"ΕΛ.ΑΣ.",
"Εθν.",
"Ελ.",
"Εμ.",
"Επ.",
"Ευ.",
"Η'",
"Η.Π.Α.",
"ΘΕ.",
"Θεμ.",
"Θεοδ.",
"Θρ.",
"Ι.Ε.Κ.",
"Ι.Κ.Α.",
"Ι.Κ.Υ.",
"Ι.Σ.Θ.",
"Ι.Χ.",
"ΙΖ'",
"ΙΧ.",
"Κ.Α.Α.",
"Κ.Α.Ε.",
"Κ.Β.Σ.",
"Κ.Δ.",
"Κ.Ε.",
"Κ.Ε.Κ.",
"Κ.Ι.",
"Κ.Κ.",
"Κ.Ι.Θ.",
"Κ.Ι.Θ.",
"Κ.ΚΕΚ.",
"Κ.Ο.",
"Κ.Π.Ρ.",
"ΚΑΤ.",
"ΚΚ.",
"Καν.",
"Καρ.",
"Κατ.",
"Κυρ.",
"Κων.",
"Λ.Α.",
"Λ.χ.",
"Λ.Χ.",
"Λεωφ.",
"Λι.",
"Μ.Δ.Ε.",
"Μ.Ε.Ο.",
"Μ.Ζ.",
"Μ.Μ.Ε.",
"Μ.Ο.",
"Μεγ.",
"Μιλτ.",
"Μιχ.",
"Ν.Δ.",
"Ν.Ε.Α.",
"Ν.Κ.",
"Ν.Ο.",
"Ν.Ο.Θ.",
"Ν.Π.Δ.Δ.",
"Ν.Υ.",
"ΝΔ.",
"Νικ.",
"Ντ'",
"Ντ.",
"Ο'",
"Ο.Α.",
"Ο.Α.Ε.Δ.",
"Ο.Δ.",
"Ο.Ε.Ε.",
"Ο.Ε.Ε.Κ.",
"Ο.Η.Ε.",
"Ο.Κ.",
"Π.Δ.",
"Π.Ε.Κ.Δ.Υ.",
"Π.Ε.Π.",
"Π.Μ.Σ.",
"ΠΟΛ.",
"Π.Χ.",
"Παρ.",
"Πλ.",
"Πρ.",
"Σ.Δ.Ο.Ε.",
"Σ.Ε.",
"Σ.Ε.Κ.",
"Σ.Π.Δ.Ω.Β.",
"Σ.Τ.",
"Σαβ.",
"Στ.",
"ΣτΕ.",
"Στρ.",
"Τ.Α.",
"Τ.Ε.Ε.",
"Τ.Ε.Ι.",
"ΤΡ.",
"Τζ.",
"Τηλ.",
"Υ.Γ.",
"ΥΓ.",
"ΥΠ.Ε.Π.Θ.",
"Φ.Α.Β.Ε.",
"Φ.Κ.",
"Φ.Σ.",
"Φ.Χ.",
"Φ.Π.Α.",
"Φιλ.",
"Χ.Α.Α.",
"ΧΡ.",
"Χ.Χ.",
"Χαρ.",
"Χιλ.",
"Χρ.",
"άγ.",
"άρθρ.",
"αι.",
"αν.",
"απ.",
"αρ.",
"αριθ.",
"αριθμ.",
"β'",
"βλ.",
"γ.γ.",
"γεν.",
"γραμμ.",
"δ.δ.",
"δ.σ.",
"δηλ.",
"δισ.",
"δολ.",
"δρχ.",
"εκ.",
"εκατ.",
"ελ.",
"θιν'",
"κ.", "κ.ά.", "κ.α.", "κ.κ.", "κ.λπ.", "κ.ο.κ.", "κ.τ.λ.", "κλπ.", "κτλ.", "κυβ.",
"κ.",
"κ.ά.",
"κ.α.",
"κ.κ.",
"κ.λπ.",
"κ.ο.κ.",
"κ.τ.λ.",
"κλπ.",
"κτλ.",
"κυβ.",
"λ.χ.",
"μ.", "μ.Χ.", "μ.μ.", "μιλ.",
"μ.",
"μ.Χ.",
"μ.μ.",
"μιλ.",
"ντ'",
"π.Χ.", "π.β.", "π.δ.", "π.μ.", "π.χ.",
"σ.", "σ.α.λ.", "σ.σ.", "σελ.", "στρ.",
"τ'ς", "τ.μ.", "τετ.", "τετρ.", "τηλ.", "τρισ.", "τόν.",
"π.Χ.",
"π.β.",
"π.δ.",
"π.μ.",
"π.χ.",
"σ.",
"σ.α.λ.",
"σ.σ.",
"σελ.",
"στρ.",
"τ'ς",
"τ.μ.",
"τετ.",
"τετρ.",
"τηλ.",
"τρισ.",
"τόν.",
"υπ.",
"χ.μ.", "χγρ.", "χιλ.", "χλμ."
"χ.μ.",
"χγρ.",
"χιλ.",
"χλμ.",
]:
_exc[orth] = [{ORTH: orth}]

View File

@ -16,15 +16,18 @@ from ...language import Language
from ...attrs import LANG, NORM
from ...util import update_exc, add_lookups
def _return_en(_):
return 'en'
return "en"
class EnglishDefaults(Language.Defaults):
lex_attr_getters = dict(Language.Defaults.lex_attr_getters)
lex_attr_getters.update(LEX_ATTRS)
lex_attr_getters[LANG] = _return_en
lex_attr_getters[NORM] = add_lookups(Language.Defaults.lex_attr_getters[NORM],
BASE_NORMS, NORM_EXCEPTIONS)
lex_attr_getters[NORM] = add_lookups(
Language.Defaults.lex_attr_getters[NORM], BASE_NORMS, NORM_EXCEPTIONS
)
tokenizer_exceptions = update_exc(BASE_EXCEPTIONS, TOKENIZER_EXCEPTIONS)
tag_map = TAG_MAP
stop_words = STOP_WORDS
@ -37,8 +40,8 @@ class EnglishDefaults(Language.Defaults):
class English(Language):
lang = 'en'
lang = "en"
Defaults = EnglishDefaults
__all__ = ['English']
__all__ = ["English"]

View File

@ -18,5 +18,5 @@ sentences = [
"Where are you?",
"Who is the president of France?",
"What is the capital of the United States?",
"When was Barack Obama born?"
"When was Barack Obama born?",
]

View File

@ -1,7 +1,7 @@
# coding: utf8
from __future__ import unicode_literals
from .lookup import LOOKUP
from .lookup import LOOKUP # noqa: F401
from ._adjectives import ADJECTIVES
from ._adjectives_irreg import ADJECTIVES_IRREG
from ._adverbs import ADVERBS
@ -13,10 +13,18 @@ from ._verbs_irreg import VERBS_IRREG
from ._lemma_rules import ADJECTIVE_RULES, NOUN_RULES, VERB_RULES, PUNCT_RULES
LEMMA_INDEX = {'adj': ADJECTIVES, 'adv': ADVERBS, 'noun': NOUNS, 'verb': VERBS}
LEMMA_INDEX = {"adj": ADJECTIVES, "adv": ADVERBS, "noun": NOUNS, "verb": VERBS}
LEMMA_EXC = {'adj': ADJECTIVES_IRREG, 'adv': ADVERBS_IRREG, 'noun': NOUNS_IRREG,
'verb': VERBS_IRREG}
LEMMA_EXC = {
"adj": ADJECTIVES_IRREG,
"adv": ADVERBS_IRREG,
"noun": NOUNS_IRREG,
"verb": VERBS_IRREG,
}
LEMMA_RULES = {'adj': ADJECTIVE_RULES, 'noun': NOUN_RULES, 'verb': VERB_RULES,
'punct': PUNCT_RULES}
LEMMA_RULES = {
"adj": ADJECTIVE_RULES,
"noun": NOUN_RULES,
"verb": VERB_RULES,
"punct": PUNCT_RULES,
}
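
A brief note on how these tables are consumed: the index, exception and rule dicts above feed spaCy's rule-based lemmatizer. A minimal usage sketch, assuming the spaCy v2.x `Lemmatizer` API and the usual `spacy.lang.en.lemmatizer` module path (neither is shown in this diff):

```python
from spacy.lemmatizer import Lemmatizer
from spacy.lang.en.lemmatizer import LEMMA_INDEX, LEMMA_EXC, LEMMA_RULES

# Wire the English tables into the rule-based lemmatizer (v2.x positional signature).
lemmatizer = Lemmatizer(LEMMA_INDEX, LEMMA_EXC, LEMMA_RULES)

# A regular suffix rule ("ies" -> "y") and an irregular exception
# ("better" -> "good"/"well", as listed in ADJECTIVES_IRREG above).
print(lemmatizer("ladies", "noun"))  # expected: ['lady']
print(lemmatizer("better", "adj"))   # expected: ['good', 'well']
```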

View File

@ -2,7 +2,8 @@
from __future__ import unicode_literals
ADJECTIVES = set("""
ADJECTIVES = set(
"""
.22-caliber .22-calibre .38-caliber .38-calibre .45-caliber .45-calibre 0 1 10
10-membered 100 1000 1000th 100th 101 101st 105 105th 10th 11 110 110th 115
115th 11th 12 120 120th 125 125th 12th 13 130 130th 135 135th 13th 14 140 140th
@ -2824,4 +2825,5 @@ zealous zenithal zero zeroth zestful zesty zig-zag zigzag zillion zimbabwean
zionist zippy zodiacal zoftig zoic zolaesque zonal zonary zoological zoonotic
zoophagous zoroastrian zygodactyl zygomatic zygomorphic zygomorphous zygotic
zymoid zymolytic zymotic
""".split())
""".split()
)

View File

@ -48,8 +48,7 @@ ADJECTIVES_IRREG = {
"bendier": ("bendy",),
"bendiest": ("bendy",),
"best": ("good",),
"better": ("good",
"well",),
"better": ("good", "well"),
"bigger": ("big",),
"biggest": ("big",),
"bitchier": ("bitchy",),
@ -289,10 +288,8 @@ ADJECTIVES_IRREG = {
"doughtiest": ("doughty",),
"dowdier": ("dowdy",),
"dowdiest": ("dowdy",),
"dowier": ("dowie",
"dowy",),
"dowiest": ("dowie",
"dowy",),
"dowier": ("dowie", "dowy"),
"dowiest": ("dowie", "dowy"),
"downer": ("downer",),
"downier": ("downy",),
"downiest": ("downy",),
@ -1494,5 +1491,5 @@ ADJECTIVES_IRREG = {
"zanier": ("zany",),
"zaniest": ("zany",),
"zippier": ("zippy",),
"zippiest": ("zippy",)
"zippiest": ("zippy",),
}

View File

@ -2,7 +2,8 @@
from __future__ import unicode_literals
ADVERBS = set("""
ADVERBS = set(
"""
'tween a.d. a.k.a. a.m. aback abaft abaxially abeam abed abjectly ably
abnormally aboard abominably aborad abortively about above aboveboard abreast
abroad abruptly absently absentmindedly absolutely abstemiously abstractedly
@ -540,4 +541,5 @@ wordlessly worriedly worryingly worse worst worthily worthlessly wrathfully
wretchedly wrong wrongfully wrongheadedly wrongly wryly yea yeah yearly
yearningly yesterday yet yieldingly yon yonder youthfully zealously zestfully
zestily zigzag
""".split())
""".split()
)

View File

@ -9,5 +9,5 @@ ADVERBS_IRREG = {
"farther": ("far",),
"further": ("far",),
"harder": ("hard",),
"hardest": ("hard",)
"hardest": ("hard",),
}

View File

@ -2,12 +2,7 @@
from __future__ import unicode_literals
ADJECTIVE_RULES = [
["er", ""],
["est", ""],
["er", "e"],
["est", "e"]
]
ADJECTIVE_RULES = [["er", ""], ["est", ""], ["er", "e"], ["est", "e"]]
NOUN_RULES = [
@ -19,7 +14,7 @@ NOUN_RULES = [
["ches", "ch"],
["shes", "sh"],
["men", "man"],
["ies", "y"]
["ies", "y"],
]
@ -31,13 +26,8 @@ VERB_RULES = [
["ed", "e"],
["ed", ""],
["ing", "e"],
["ing", ""]
["ing", ""],
]
PUNCT_RULES = [
["", "\""],
["", "\""],
["\u2018", "'"],
["\u2019", "'"]
]
PUNCT_RULES = [["", '"'], ["", '"'], ["\u2018", "'"], ["\u2019", "'"]]

View File

@ -2,7 +2,8 @@
from __future__ import unicode_literals
NOUNS = set("""
NOUNS = set(
"""
'hood .22 0 1 1-dodecanol 1-hitter 10 100 1000 10000 100000 1000000 1000000000
1000000000000 11 11-plus 12 120 13 14 144 15 1530s 16 17 1728 1750s 1760s 1770s
1780s 1790s 18 1820s 1830s 1840s 1850s 1860s 1870s 1880s 1890s 19 1900s 1920s
@ -7110,4 +7111,5 @@ zurvanism zweig zwieback zwingli zworykin zydeco zygnema zygnemales
zygnemataceae zygnematales zygocactus zygoma zygomatic zygomycetes zygomycota
zygomycotina zygophyllaceae zygophyllum zygoptera zygospore zygote zygotene
zyloprim zymase zymogen zymology zymolysis zymosis zymurgy zyrian
""".split())
""".split()
)

View File

@ -2,7 +2,8 @@
from __future__ import unicode_literals
VERBS = set("""
VERBS = set(
"""
aah abacinate abandon abase abash abate abbreviate abdicate abduce abduct
aberrate abet abhor abide abjure ablactate ablate abnegate abolish abominate
abort abound about-face abrade abrase abreact abridge abrogate abscise abscond
@ -912,4 +913,5 @@ wreck wrench wrest wrestle wrick wriggle wring wrinkle write writhe wrong x-ray
xerox yacht yack yak yammer yank yap yarn yarn-dye yaup yaw yawl yawn yawp yearn
yell yellow yelp yen yield yip yodel yoke yowl zap zero zest zigzag zinc zip
zipper zone zoom
""".split())
""".split()
)

View File

@ -4,22 +4,54 @@ from __future__ import unicode_literals
from ...attrs import LIKE_NUM
_num_words = ['zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven',
'eight', 'nine', 'ten', 'eleven', 'twelve', 'thirteen', 'fourteen',
'fifteen', 'sixteen', 'seventeen', 'eighteen', 'nineteen', 'twenty',
'thirty', 'forty', 'fifty', 'sixty', 'seventy', 'eighty', 'ninety',
'hundred', 'thousand', 'million', 'billion', 'trillion', 'quadrillion',
'gajillion', 'bazillion']
_num_words = [
"zero",
"one",
"two",
"three",
"four",
"five",
"six",
"seven",
"eight",
"nine",
"ten",
"eleven",
"twelve",
"thirteen",
"fourteen",
"fifteen",
"sixteen",
"seventeen",
"eighteen",
"nineteen",
"twenty",
"thirty",
"forty",
"fifty",
"sixty",
"seventy",
"eighty",
"ninety",
"hundred",
"thousand",
"million",
"billion",
"trillion",
"quadrillion",
"gajillion",
"bazillion",
]
def like_num(text):
if text.startswith(('+', '-', '±', '~')):
if text.startswith(("+", "-", "±", "~")):
text = text[1:]
text = text.replace(',', '').replace('.', '')
text = text.replace(",", "").replace(".", "")
if text.isdigit():
return True
if text.count('/') == 1:
num, denom = text.split('/')
if text.count("/") == 1:
num, denom = text.split("/")
if num.isdigit() and denom.isdigit():
return True
if text.lower() in _num_words:
@ -27,6 +59,4 @@ def like_num(text):
return False
LEX_ATTRS = {
LIKE_NUM: like_num
}
LEX_ATTRS = {LIKE_NUM: like_num}
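
For reference, the `LIKE_NUM` attribute getter above can be exercised on its own. This is a standalone sketch of the same logic with an abridged `_num_words` list (the real module lists every number word):

```python
# Abridged for illustration; the module above defines the full list.
_num_words = ["zero", "one", "two", "ten", "twenty", "hundred", "thousand", "million"]


def like_num(text):
    # Strip a leading sign, then thousands separators and decimal points.
    if text.startswith(("+", "-", "±", "~")):
        text = text[1:]
    text = text.replace(",", "").replace(".", "")
    if text.isdigit():
        return True
    # Simple fractions like "2/3".
    if text.count("/") == 1:
        num, denom = text.split("/")
        if num.isdigit() and denom.isdigit():
            return True
    # Spelled-out numbers, case-insensitive.
    if text.lower() in _num_words:
        return True
    return False


print(like_num("10,000"))  # True
print(like_num("-3.5"))    # True
print(like_num("2/3"))     # True
print(like_num("Twenty"))  # True
print(like_num("banana"))  # False
```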

View File

@ -6,66 +6,321 @@ from ...symbols import LEMMA, PRON_LEMMA
MORPH_RULES = {
"PRP": {
"I": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Sing", "Case": "Nom"},
"me": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Sing", "Case": "Acc"},
"I": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Sing",
"Case": "Nom",
},
"me": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Sing",
"Case": "Acc",
},
"you": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Two"},
"he": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Gender": "Masc", "Case": "Nom"},
"him": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Gender": "Masc", "Case": "Acc"},
"she": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Gender": "Fem", "Case": "Nom"},
"her": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Gender": "Fem", "Case": "Acc"},
"it": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Gender": "Neut"},
"we": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Plur", "Case": "Nom"},
"us": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Plur", "Case": "Acc"},
"they": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Plur", "Case": "Nom"},
"them": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Plur", "Case": "Acc"},
"mine": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Sing", "Poss": "Yes", "Reflex": "Yes"},
"his": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Gender": "Masc", "Poss": "Yes", "Reflex": "Yes"},
"hers": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Gender": "Fem", "Poss": "Yes", "Reflex": "Yes"},
"its": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Gender": "Neut", "Poss": "Yes", "Reflex": "Yes"},
"ours": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Plur", "Poss": "Yes", "Reflex": "Yes"},
"yours": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Two", "Number": "Plur", "Poss": "Yes", "Reflex": "Yes"},
"theirs": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Plur", "Poss": "Yes", "Reflex": "Yes"},
"myself": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Sing", "Case": "Acc", "Reflex": "Yes"},
"yourself": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Two", "Case": "Acc", "Reflex": "Yes"},
"himself": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Case": "Acc", "Gender": "Masc", "Reflex": "Yes"},
"herself": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Case": "Acc", "Gender": "Fem", "Reflex": "Yes"},
"itself": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Case": "Acc", "Gender": "Neut", "Reflex": "Yes"},
"themself": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Sing", "Case": "Acc", "Reflex": "Yes"},
"ourselves": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "One", "Number": "Plur", "Case": "Acc", "Reflex": "Yes"},
"yourselves": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Two", "Case": "Acc", "Reflex": "Yes"},
"themselves": {LEMMA: PRON_LEMMA, "PronType": "Prs", "Person": "Three", "Number": "Plur", "Case": "Acc", "Reflex": "Yes"}
"he": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Gender": "Masc",
"Case": "Nom",
},
"him": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Gender": "Masc",
"Case": "Acc",
},
"she": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Gender": "Fem",
"Case": "Nom",
},
"her": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Gender": "Fem",
"Case": "Acc",
},
"it": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Gender": "Neut",
},
"we": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Plur",
"Case": "Nom",
},
"us": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Plur",
"Case": "Acc",
},
"they": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Plur",
"Case": "Nom",
},
"them": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Plur",
"Case": "Acc",
},
"mine": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Sing",
"Poss": "Yes",
"Reflex": "Yes",
},
"his": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Gender": "Masc",
"Poss": "Yes",
"Reflex": "Yes",
},
"hers": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Gender": "Fem",
"Poss": "Yes",
"Reflex": "Yes",
},
"its": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Gender": "Neut",
"Poss": "Yes",
"Reflex": "Yes",
},
"ours": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Plur",
"Poss": "Yes",
"Reflex": "Yes",
},
"yours": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Two",
"Number": "Plur",
"Poss": "Yes",
"Reflex": "Yes",
},
"theirs": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Plur",
"Poss": "Yes",
"Reflex": "Yes",
},
"myself": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Sing",
"Case": "Acc",
"Reflex": "Yes",
},
"yourself": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Two",
"Case": "Acc",
"Reflex": "Yes",
},
"himself": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Case": "Acc",
"Gender": "Masc",
"Reflex": "Yes",
},
"herself": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Case": "Acc",
"Gender": "Fem",
"Reflex": "Yes",
},
"itself": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Case": "Acc",
"Gender": "Neut",
"Reflex": "Yes",
},
"themself": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Sing",
"Case": "Acc",
"Reflex": "Yes",
},
"ourselves": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "One",
"Number": "Plur",
"Case": "Acc",
"Reflex": "Yes",
},
"yourselves": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Two",
"Case": "Acc",
"Reflex": "Yes",
},
"themselves": {
LEMMA: PRON_LEMMA,
"PronType": "Prs",
"Person": "Three",
"Number": "Plur",
"Case": "Acc",
"Reflex": "Yes",
},
},
"PRP$": {
"my": {LEMMA: PRON_LEMMA, "Person": "One", "Number": "Sing", "PronType": "Prs", "Poss": "Yes"},
"my": {
LEMMA: PRON_LEMMA,
"Person": "One",
"Number": "Sing",
"PronType": "Prs",
"Poss": "Yes",
},
"your": {LEMMA: PRON_LEMMA, "Person": "Two", "PronType": "Prs", "Poss": "Yes"},
"his": {LEMMA: PRON_LEMMA, "Person": "Three", "Number": "Sing", "Gender": "Masc", "PronType": "Prs", "Poss": "Yes"},
"her": {LEMMA: PRON_LEMMA, "Person": "Three", "Number": "Sing", "Gender": "Fem", "PronType": "Prs", "Poss": "Yes"},
"its": {LEMMA: PRON_LEMMA, "Person": "Three", "Number": "Sing", "Gender": "Neut", "PronType": "Prs", "Poss": "Yes"},
"our": {LEMMA: PRON_LEMMA, "Person": "One", "Number": "Plur", "PronType": "Prs", "Poss": "Yes"},
"their": {LEMMA: PRON_LEMMA, "Person": "Three", "Number": "Plur", "PronType": "Prs", "Poss": "Yes"}
"his": {
LEMMA: PRON_LEMMA,
"Person": "Three",
"Number": "Sing",
"Gender": "Masc",
"PronType": "Prs",
"Poss": "Yes",
},
"her": {
LEMMA: PRON_LEMMA,
"Person": "Three",
"Number": "Sing",
"Gender": "Fem",
"PronType": "Prs",
"Poss": "Yes",
},
"its": {
LEMMA: PRON_LEMMA,
"Person": "Three",
"Number": "Sing",
"Gender": "Neut",
"PronType": "Prs",
"Poss": "Yes",
},
"our": {
LEMMA: PRON_LEMMA,
"Person": "One",
"Number": "Plur",
"PronType": "Prs",
"Poss": "Yes",
},
"their": {
LEMMA: PRON_LEMMA,
"Person": "Three",
"Number": "Plur",
"PronType": "Prs",
"Poss": "Yes",
},
},
"VBZ": {
"am": {LEMMA: "be", "VerbForm": "Fin", "Person": "One", "Tense": "Pres", "Mood": "Ind"},
"are": {LEMMA: "be", "VerbForm": "Fin", "Person": "Two", "Tense": "Pres", "Mood": "Ind"},
"is": {LEMMA: "be", "VerbForm": "Fin", "Person": "Three", "Tense": "Pres", "Mood": "Ind"},
"'re": {LEMMA: "be", "VerbForm": "Fin", "Person": "Two", "Tense": "Pres", "Mood": "Ind"},
"'s": {LEMMA: "be", "VerbForm": "Fin", "Person": "Three", "Tense": "Pres", "Mood": "Ind"},
"am": {
LEMMA: "be",
"VerbForm": "Fin",
"Person": "One",
"Tense": "Pres",
"Mood": "Ind",
},
"are": {
LEMMA: "be",
"VerbForm": "Fin",
"Person": "Two",
"Tense": "Pres",
"Mood": "Ind",
},
"is": {
LEMMA: "be",
"VerbForm": "Fin",
"Person": "Three",
"Tense": "Pres",
"Mood": "Ind",
},
"'re": {
LEMMA: "be",
"VerbForm": "Fin",
"Person": "Two",
"Tense": "Pres",
"Mood": "Ind",
},
"'s": {
LEMMA: "be",
"VerbForm": "Fin",
"Person": "Three",
"Tense": "Pres",
"Mood": "Ind",
},
},
"VBP": {
"are": {LEMMA: "be", "VerbForm": "Fin", "Tense": "Pres", "Mood": "Ind"},
"'re": {LEMMA: "be", "VerbForm": "Fin", "Tense": "Pres", "Mood": "Ind"},
"am": {LEMMA: "be", "VerbForm": "Fin", "Person": "One", "Tense": "Pres", "Mood": "Ind"},
"am": {
LEMMA: "be",
"VerbForm": "Fin",
"Person": "One",
"Tense": "Pres",
"Mood": "Ind",
},
},
"VBD": {
"was": {LEMMA: "be", "VerbForm": "Fin", "Tense": "Past", "Number": "Sing"},
"were": {LEMMA: "be", "VerbForm": "Fin", "Tense": "Past", "Number": "Plur"}
}
"were": {LEMMA: "be", "VerbForm": "Fin", "Tense": "Past", "Number": "Plur"},
},
}
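
`MORPH_RULES` is a two-level mapping: fine-grained tag → exact token text → morphological features. A quick lookup sketch, assuming the usual `spacy.lang.en.morph_rules` module path (not shown in this diff):

```python
from spacy.lang.en.morph_rules import MORPH_RULES

# Features assigned to "me" when the tagger emits the fine-grained tag PRP.
# The LEMMA key is an integer symbol ID, so it prints as a number, not "LEMMA".
print(MORPH_RULES["PRP"]["me"])
# e.g. {..., 'PronType': 'Prs', 'Person': 'One', 'Number': 'Sing', 'Case': 'Acc'}
```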

View File

@ -12,7 +12,6 @@ _exc = {
"plz": "please",
"pls": "please",
"thx": "thanks",
# US vs. UK spelling
"accessorise": "accessorize",
"accessorised": "accessorized",
@ -690,7 +689,7 @@ _exc = {
"globalising": "globalizing",
"glueing ": "gluing ",
"goin": "going",
"goin'":"going",
"goin'": "going",
"goitre": "goiter",
"goitres": "goiters",
"gonorrhoea": "gonorrhea",
@ -1758,7 +1757,7 @@ _exc = {
"yoghourt": "yogurt",
"yoghourts": "yogurts",
"yoghurt": "yogurt",
"yoghurts": "yogurts"
"yoghurts": "yogurts",
}

View File

@ -3,8 +3,8 @@ from __future__ import unicode_literals
# Stop words
STOP_WORDS = set("""
STOP_WORDS = set(
"""
a about above across after afterwards again against all almost alone along
already also although always am among amongst amount an and another any anyhow
anyone anything anyway anywhere are around as at
@ -68,4 +68,5 @@ whither who whoever whole whom whose why will with within without would
yet you your yours yourself yourselves
'd 'll 'm 're 's 've
""".split())
""".split()
)

View File

@ -8,12 +8,21 @@ def noun_chunks(obj):
"""
Detect base noun phrases from a dependency parse. Works on both Doc and Span.
"""
labels = ['nsubj', 'dobj', 'nsubjpass', 'pcomp', 'pobj', 'dative', 'appos',
'attr', 'ROOT']
labels = [
"nsubj",
"dobj",
"nsubjpass",
"pcomp",
"pobj",
"dative",
"appos",
"attr",
"ROOT",
]
doc = obj.doc # Ensure works on both Doc and Span.
np_deps = [doc.vocab.strings.add(label) for label in labels]
conj = doc.vocab.strings.add('conj')
np_label = doc.vocab.strings.add('NP')
conj = doc.vocab.strings.add("conj")
np_label = doc.vocab.strings.add("NP")
seen = set()
for i, word in enumerate(obj):
if word.pos not in (NOUN, PROPN, PRON):
@ -24,8 +33,8 @@ def noun_chunks(obj):
if word.dep in np_deps:
if any(w.i in seen for w in word.subtree):
continue
seen.update(j for j in range(word.left_edge.i, word.i+1))
yield word.left_edge.i, word.i+1, np_label
seen.update(j for j in range(word.left_edge.i, word.i + 1))
yield word.left_edge.i, word.i + 1, np_label
elif word.dep == conj:
head = word.head
while head.dep == conj and head.head.i < head.i:
@ -34,10 +43,8 @@ def noun_chunks(obj):
if head.dep in np_deps:
if any(w.i in seen for w in word.subtree):
continue
seen.update(j for j in range(word.left_edge.i, word.i+1))
yield word.left_edge.i, word.i+1, np_label
seen.update(j for j in range(word.left_edge.i, word.i + 1))
yield word.left_edge.i, word.i + 1, np_label
SYNTAX_ITERATORS = {
'noun_chunks': noun_chunks
}
SYNTAX_ITERATORS = {"noun_chunks": noun_chunks}

View File

@ -11,7 +11,7 @@ TAG_MAP = {
"-LRB-": {POS: PUNCT, "PunctType": "brck", "PunctSide": "ini"},
"-RRB-": {POS: PUNCT, "PunctType": "brck", "PunctSide": "fin"},
"``": {POS: PUNCT, "PunctType": "quot", "PunctSide": "ini"},
"\"\"": {POS: PUNCT, "PunctType": "quot", "PunctSide": "fin"},
'""': {POS: PUNCT, "PunctType": "quot", "PunctSide": "fin"},
"''": {POS: PUNCT, "PunctType": "quot", "PunctSide": "fin"},
":": {POS: PUNCT},
"$": {POS: SYM, "Other": {"SymType": "currency"}},
@ -51,7 +51,13 @@ TAG_MAP = {
"VBG": {POS: VERB, "VerbForm": "part", "Tense": "pres", "Aspect": "prog"},
"VBN": {POS: VERB, "VerbForm": "part", "Tense": "past", "Aspect": "perf"},
"VBP": {POS: VERB, "VerbForm": "fin", "Tense": "pres"},
"VBZ": {POS: VERB, "VerbForm": "fin", "Tense": "pres", "Number": "sing", "Person": 3},
"VBZ": {
POS: VERB,
"VerbForm": "fin",
"Tense": "pres",
"Number": "sing",
"Person": 3,
},
"WDT": {POS: ADJ, "PronType": "int|rel"},
"WP": {POS: NOUN, "PronType": "int|rel"},
"WP$": {POS: ADJ, "Poss": "yes", "PronType": "int|rel"},

View File

@ -5,103 +5,143 @@ from ...symbols import ORTH, LEMMA, TAG, NORM, PRON_LEMMA
_exc = {}
_exclude = ["Ill", "ill", "Its", "its", "Hell", "hell", "Shell", "shell",
"Shed", "shed", "were", "Were", "Well", "well", "Whore", "whore"]
_exclude = [
"Ill",
"ill",
"Its",
"its",
"Hell",
"hell",
"Shell",
"shell",
"Shed",
"shed",
"were",
"Were",
"Well",
"well",
"Whore",
"whore",
]
# Pronouns
for pron in ["i"]:
for orth in [pron, pron.title()]:
_exc[orth + "'m"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "'m", LEMMA: "be", NORM: "am", TAG: "VBP", "tenspect": 1, "number": 1}]
{
ORTH: "'m",
LEMMA: "be",
NORM: "am",
TAG: "VBP",
"tenspect": 1,
"number": 1,
},
]
_exc[orth + "m"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "m", LEMMA: "be", TAG: "VBP", "tenspect": 1, "number": 1 }]
{ORTH: "m", LEMMA: "be", TAG: "VBP", "tenspect": 1, "number": 1},
]
_exc[orth + "'ma"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "'m", LEMMA: "be", NORM: "am"},
{ORTH: "a", LEMMA: "going to", NORM: "gonna"}]
{ORTH: "a", LEMMA: "going to", NORM: "gonna"},
]
_exc[orth + "ma"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "m", LEMMA: "be", NORM: "am"},
{ORTH: "a", LEMMA: "going to", NORM: "gonna"}]
{ORTH: "a", LEMMA: "going to", NORM: "gonna"},
]
for pron in ["i", "you", "he", "she", "it", "we", "they"]:
for orth in [pron, pron.title()]:
_exc[orth + "'ll"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "'ll", LEMMA: "will", NORM: "will", TAG: "MD"}]
{ORTH: "'ll", LEMMA: "will", NORM: "will", TAG: "MD"},
]
_exc[orth + "ll"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "ll", LEMMA: "will", NORM: "will", TAG: "MD"}]
{ORTH: "ll", LEMMA: "will", NORM: "will", TAG: "MD"},
]
_exc[orth + "'ll've"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "'ll", LEMMA: "will", NORM: "will", TAG: "MD"},
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"}]
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"},
]
_exc[orth + "llve"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "ll", LEMMA: "will", NORM: "will", TAG: "MD"},
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"}]
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"},
]
_exc[orth + "'d"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "'d", LEMMA: "would", NORM: "would", TAG: "MD"}]
{ORTH: "'d", LEMMA: "would", NORM: "would", TAG: "MD"},
]
_exc[orth + "d"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "d", LEMMA: "would", NORM: "would", TAG: "MD"}]
{ORTH: "d", LEMMA: "would", NORM: "would", TAG: "MD"},
]
_exc[orth + "'d've"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "'d", LEMMA: "would", NORM: "would", TAG: "MD"},
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"}]
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"},
]
_exc[orth + "dve"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "d", LEMMA: "would", NORM: "would", TAG: "MD"},
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"}]
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"},
]
for pron in ["i", "you", "we", "they"]:
for orth in [pron, pron.title()]:
_exc[orth + "'ve"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"}]
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"},
]
_exc[orth + "ve"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"}]
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"},
]
for pron in ["you", "we", "they"]:
for orth in [pron, pron.title()]:
_exc[orth + "'re"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "'re", LEMMA: "be", NORM: "are"}]
{ORTH: "'re", LEMMA: "be", NORM: "are"},
]
_exc[orth + "re"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "re", LEMMA: "be", NORM: "are", TAG: "VBZ"}]
{ORTH: "re", LEMMA: "be", NORM: "are", TAG: "VBZ"},
]
for pron in ["he", "she", "it"]:
for orth in [pron, pron.title()]:
_exc[orth + "'s"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "'s", NORM: "'s"}]
{ORTH: "'s", NORM: "'s"},
]
_exc[orth + "s"] = [
{ORTH: orth, LEMMA: PRON_LEMMA, NORM: pron, TAG: "PRP"},
{ORTH: "s"}]
{ORTH: "s"},
]
# W-words, relative pronouns, prepositions etc.
@ -110,63 +150,71 @@ for word in ["who", "what", "when", "where", "why", "how", "there", "that"]:
for orth in [word, word.title()]:
_exc[orth + "'s"] = [
{ORTH: orth, LEMMA: word, NORM: word},
{ORTH: "'s", NORM: "'s"}]
{ORTH: "'s", NORM: "'s"},
]
_exc[orth + "s"] = [
{ORTH: orth, LEMMA: word, NORM: word},
{ORTH: "s"}]
_exc[orth + "s"] = [{ORTH: orth, LEMMA: word, NORM: word}, {ORTH: "s"}]
_exc[orth + "'ll"] = [
{ORTH: orth, LEMMA: word, NORM: word},
{ORTH: "'ll", LEMMA: "will", NORM: "will", TAG: "MD"}]
{ORTH: "'ll", LEMMA: "will", NORM: "will", TAG: "MD"},
]
_exc[orth + "ll"] = [
{ORTH: orth, LEMMA: word, NORM: word},
{ORTH: "ll", LEMMA: "will", NORM: "will", TAG: "MD"}]
{ORTH: "ll", LEMMA: "will", NORM: "will", TAG: "MD"},
]
_exc[orth + "'ll've"] = [
{ORTH: orth, LEMMA: word, NORM: word},
{ORTH: "'ll", LEMMA: "will", NORM: "will", TAG: "MD"},
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"}]
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"},
]
_exc[orth + "llve"] = [
{ORTH: orth, LEMMA: word, NORM: word},
{ORTH: "ll", LEMMA: "will", NORM: "will", TAG: "MD"},
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"}]
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"},
]
_exc[orth + "'re"] = [
{ORTH: orth, LEMMA: word, NORM: word},
{ORTH: "'re", LEMMA: "be", NORM: "are"}]
{ORTH: "'re", LEMMA: "be", NORM: "are"},
]
_exc[orth + "re"] = [
{ORTH: orth, LEMMA: word, NORM: word},
{ORTH: "re", LEMMA: "be", NORM: "are"}]
{ORTH: "re", LEMMA: "be", NORM: "are"},
]
_exc[orth + "'ve"] = [
{ORTH: orth, LEMMA: word, NORM: word},
{ORTH: "'ve", LEMMA: "have", TAG: "VB"}]
{ORTH: "'ve", LEMMA: "have", TAG: "VB"},
]
_exc[orth + "ve"] = [
{ORTH: orth, LEMMA: word},
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"}]
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"},
]
_exc[orth + "'d"] = [
{ORTH: orth, LEMMA: word, NORM: word},
{ORTH: "'d", NORM: "'d"}]
{ORTH: "'d", NORM: "'d"},
]
_exc[orth + "d"] = [
{ORTH: orth, LEMMA: word, NORM: word},
{ORTH: "d"}]
_exc[orth + "d"] = [{ORTH: orth, LEMMA: word, NORM: word}, {ORTH: "d"}]
_exc[orth + "'d've"] = [
{ORTH: orth, LEMMA: word, NORM: word},
{ORTH: "'d", LEMMA: "would", NORM: "would", TAG: "MD"},
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"}]
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"},
]
_exc[orth + "dve"] = [
{ORTH: orth, LEMMA: word, NORM: word},
{ORTH: "d", LEMMA: "would", NORM: "would", TAG: "MD"},
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"}]
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"},
]
# Verbs
@ -186,27 +234,32 @@ for verb_data in [
{ORTH: "sha", LEMMA: "shall", NORM: "shall", TAG: "MD"},
{ORTH: "should", NORM: "should", TAG: "MD"},
{ORTH: "wo", LEMMA: "will", NORM: "will", TAG: "MD"},
{ORTH: "would", NORM: "would", TAG: "MD"}]:
{ORTH: "would", NORM: "would", TAG: "MD"},
]:
verb_data_tc = dict(verb_data)
verb_data_tc[ORTH] = verb_data_tc[ORTH].title()
for data in [verb_data, verb_data_tc]:
_exc[data[ORTH] + "n't"] = [
dict(data),
{ORTH: "n't", LEMMA: "not", NORM: "not", TAG: "RB"}]
{ORTH: "n't", LEMMA: "not", NORM: "not", TAG: "RB"},
]
_exc[data[ORTH] + "nt"] = [
dict(data),
{ORTH: "nt", LEMMA: "not", NORM: "not", TAG: "RB"}]
{ORTH: "nt", LEMMA: "not", NORM: "not", TAG: "RB"},
]
_exc[data[ORTH] + "n't've"] = [
dict(data),
{ORTH: "n't", LEMMA: "not", NORM: "not", TAG: "RB"},
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"}]
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"},
]
_exc[data[ORTH] + "ntve"] = [
dict(data),
{ORTH: "nt", LEMMA: "not", NORM: "not", TAG: "RB"},
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"}]
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"},
]
for verb_data in [
@ -214,17 +267,14 @@ for verb_data in [
{ORTH: "might", NORM: "might", TAG: "MD"},
{ORTH: "must", NORM: "must", TAG: "MD"},
{ORTH: "should", NORM: "should", TAG: "MD"},
{ORTH: "would", NORM: "would", TAG: "MD"}]:
{ORTH: "would", NORM: "would", TAG: "MD"},
]:
verb_data_tc = dict(verb_data)
verb_data_tc[ORTH] = verb_data_tc[ORTH].title()
for data in [verb_data, verb_data_tc]:
_exc[data[ORTH] + "'ve"] = [
dict(data),
{ORTH: "'ve", LEMMA: "have", TAG: "VB"}]
_exc[data[ORTH] + "'ve"] = [dict(data), {ORTH: "'ve", LEMMA: "have", TAG: "VB"}]
_exc[data[ORTH] + "ve"] = [
dict(data),
{ORTH: "ve", LEMMA: "have", TAG: "VB"}]
_exc[data[ORTH] + "ve"] = [dict(data), {ORTH: "ve", LEMMA: "have", TAG: "VB"}]
for verb_data in [
@ -235,17 +285,20 @@ for verb_data in [
{ORTH: "were", LEMMA: "be", NORM: "were"},
{ORTH: "have", NORM: "have"},
{ORTH: "has", LEMMA: "have", NORM: "has"},
{ORTH: "dare", NORM: "dare"}]:
{ORTH: "dare", NORM: "dare"},
]:
verb_data_tc = dict(verb_data)
verb_data_tc[ORTH] = verb_data_tc[ORTH].title()
for data in [verb_data, verb_data_tc]:
_exc[data[ORTH] + "n't"] = [
dict(data),
{ORTH: "n't", LEMMA: "not", NORM: "not", TAG: "RB"}]
{ORTH: "n't", LEMMA: "not", NORM: "not", TAG: "RB"},
]
_exc[data[ORTH] + "nt"] = [
dict(data),
{ORTH: "nt", LEMMA: "not", NORM: "not", TAG: "RB"}]
{ORTH: "nt", LEMMA: "not", NORM: "not", TAG: "RB"},
]
# Other contractions with trailing apostrophe
@ -256,7 +309,8 @@ for exc_data in [
{ORTH: "nothin", LEMMA: "nothing", NORM: "nothing"},
{ORTH: "nuthin", LEMMA: "nothing", NORM: "nothing"},
{ORTH: "ol", LEMMA: "old", NORM: "old"},
{ORTH: "somethin", LEMMA: "something", NORM: "something"}]:
{ORTH: "somethin", LEMMA: "something", NORM: "something"},
]:
exc_data_tc = dict(exc_data)
exc_data_tc[ORTH] = exc_data_tc[ORTH].title()
for data in [exc_data, exc_data_tc]:
@ -272,7 +326,8 @@ for exc_data in [
{ORTH: "cause", LEMMA: "because", NORM: "because"},
{ORTH: "em", LEMMA: PRON_LEMMA, NORM: "them"},
{ORTH: "ll", LEMMA: "will", NORM: "will"},
{ORTH: "nuff", LEMMA: "enough", NORM: "enough"}]:
{ORTH: "nuff", LEMMA: "enough", NORM: "enough"},
]:
exc_data_apos = dict(exc_data)
exc_data_apos[ORTH] = "'" + exc_data_apos[ORTH]
for data in [exc_data, exc_data_apos]:
@ -285,81 +340,69 @@ for h in range(1, 12 + 1):
for period in ["a.m.", "am"]:
_exc["%d%s" % (h, period)] = [
{ORTH: "%d" % h},
{ORTH: period, LEMMA: "a.m.", NORM: "a.m."}]
{ORTH: period, LEMMA: "a.m.", NORM: "a.m."},
]
for period in ["p.m.", "pm"]:
_exc["%d%s" % (h, period)] = [
{ORTH: "%d" % h},
{ORTH: period, LEMMA: "p.m.", NORM: "p.m."}]
{ORTH: period, LEMMA: "p.m.", NORM: "p.m."},
]
# Rest
_other_exc = {
"y'all": [
{ORTH: "y'", LEMMA: PRON_LEMMA, NORM: "you"},
{ORTH: "all"}],
"yall": [
{ORTH: "y", LEMMA: PRON_LEMMA, NORM: "you"},
{ORTH: "all"}],
"y'all": [{ORTH: "y'", LEMMA: PRON_LEMMA, NORM: "you"}, {ORTH: "all"}],
"yall": [{ORTH: "y", LEMMA: PRON_LEMMA, NORM: "you"}, {ORTH: "all"}],
"how'd'y": [
{ORTH: "how", LEMMA: "how"},
{ORTH: "'d", LEMMA: "do"},
{ORTH: "'y", LEMMA: PRON_LEMMA, NORM: "you"}],
{ORTH: "'y", LEMMA: PRON_LEMMA, NORM: "you"},
],
"How'd'y": [
{ORTH: "How", LEMMA: "how", NORM: "how"},
{ORTH: "'d", LEMMA: "do"},
{ORTH: "'y", LEMMA: PRON_LEMMA, NORM: "you"}],
{ORTH: "'y", LEMMA: PRON_LEMMA, NORM: "you"},
],
"not've": [
{ORTH: "not", LEMMA: "not", TAG: "RB"},
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"}],
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"},
],
"notve": [
{ORTH: "not", LEMMA: "not", TAG: "RB"},
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"}],
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"},
],
"Not've": [
{ORTH: "Not", LEMMA: "not", NORM: "not", TAG: "RB"},
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"}],
{ORTH: "'ve", LEMMA: "have", NORM: "have", TAG: "VB"},
],
"Notve": [
{ORTH: "Not", LEMMA: "not", NORM: "not", TAG: "RB"},
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"}],
{ORTH: "ve", LEMMA: "have", NORM: "have", TAG: "VB"},
],
"cannot": [
{ORTH: "can", LEMMA: "can", TAG: "MD"},
{ORTH: "not", LEMMA: "not", TAG: "RB"}],
{ORTH: "not", LEMMA: "not", TAG: "RB"},
],
"Cannot": [
{ORTH: "Can", LEMMA: "can", NORM: "can", TAG: "MD"},
{ORTH: "not", LEMMA: "not", TAG: "RB"}],
{ORTH: "not", LEMMA: "not", TAG: "RB"},
],
"gonna": [
{ORTH: "gon", LEMMA: "go", NORM: "going"},
{ORTH: "na", LEMMA: "to", NORM: "to"}],
{ORTH: "na", LEMMA: "to", NORM: "to"},
],
"Gonna": [
{ORTH: "Gon", LEMMA: "go", NORM: "going"},
{ORTH: "na", LEMMA: "to", NORM: "to"}],
"gotta": [
{ORTH: "got"},
{ORTH: "ta", LEMMA: "to", NORM: "to"}],
"Gotta": [
{ORTH: "Got", NORM: "got"},
{ORTH: "ta", LEMMA: "to", NORM: "to"}],
"let's": [
{ORTH: "let"},
{ORTH: "'s", LEMMA: PRON_LEMMA, NORM: "us"}],
{ORTH: "na", LEMMA: "to", NORM: "to"},
],
"gotta": [{ORTH: "got"}, {ORTH: "ta", LEMMA: "to", NORM: "to"}],
"Gotta": [{ORTH: "Got", NORM: "got"}, {ORTH: "ta", LEMMA: "to", NORM: "to"}],
"let's": [{ORTH: "let"}, {ORTH: "'s", LEMMA: PRON_LEMMA, NORM: "us"}],
"Let's": [
{ORTH: "Let", LEMMA: "let", NORM: "let"},
{ORTH: "'s", LEMMA: PRON_LEMMA, NORM: "us"}]
{ORTH: "'s", LEMMA: PRON_LEMMA, NORM: "us"},
],
}
_exc.update(_other_exc)
@ -402,8 +445,6 @@ for exc_data in [
{ORTH: "Goin'", LEMMA: "go", NORM: "going"},
{ORTH: "goin", LEMMA: "go", NORM: "going"},
{ORTH: "Goin", LEMMA: "go", NORM: "going"},
{ORTH: "Mt.", LEMMA: "Mount", NORM: "Mount"},
{ORTH: "Ak.", LEMMA: "Alaska", NORM: "Alaska"},
{ORTH: "Ala.", LEMMA: "Alabama", NORM: "Alabama"},
@ -456,15 +497,47 @@ for exc_data in [
{ORTH: "Tenn.", LEMMA: "Tennessee", NORM: "Tennessee"},
{ORTH: "Va.", LEMMA: "Virginia", NORM: "Virginia"},
{ORTH: "Wash.", LEMMA: "Washington", NORM: "Washington"},
{ORTH: "Wis.", LEMMA: "Wisconsin", NORM: "Wisconsin"}]:
{ORTH: "Wis.", LEMMA: "Wisconsin", NORM: "Wisconsin"},
]:
_exc[exc_data[ORTH]] = [exc_data]
for orth in [
"'d", "a.m.", "Adm.", "Bros.", "co.", "Co.", "Corp.", "D.C.", "Dr.", "e.g.",
"E.g.", "E.G.", "Gen.", "Gov.", "i.e.", "I.e.", "I.E.", "Inc.", "Jr.",
"Ltd.", "Md.", "Messrs.", "Mo.", "Mont.", "Mr.", "Mrs.", "Ms.", "p.m.",
"Ph.D.", "Rep.", "Rev.", "Sen.", "St.", "vs."]:
"'d",
"a.m.",
"Adm.",
"Bros.",
"co.",
"Co.",
"Corp.",
"D.C.",
"Dr.",
"e.g.",
"E.g.",
"E.G.",
"Gen.",
"Gov.",
"i.e.",
"I.e.",
"I.E.",
"Inc.",
"Jr.",
"Ltd.",
"Md.",
"Messrs.",
"Mo.",
"Mont.",
"Mr.",
"Mrs.",
"Ms.",
"p.m.",
"Ph.D.",
"Rep.",
"Rev.",
"Sen.",
"St.",
"vs.",
]:
_exc[orth] = [{ORTH: orth}]
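
Each entry above maps a surface form to the list of sub-tokens it should be split into, or keeps it as a single token for the abbreviations at the end. A small sketch of the effect on a blank English pipeline:

```python
import spacy

# spacy.blank("en") builds the English tokenizer, including the exceptions above.
nlp = spacy.blank("en")

# Expected, given the exceptions above: "I'm" -> "I", "'m"; "gonna" -> "gon", "na";
# "3p.m." -> "3", "p.m."; "Mt." stays a single token with its period attached.
print([t.text for t in nlp("I'm gonna leave at 3p.m. near Mt. Hood.")])
```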

View File

@ -30,8 +30,9 @@ for name, tag, patterns in [
("Facebook", "ORG", [[{LOWER: "facebook"}]]),
("Blizzard", "ORG", [[{LOWER: "blizzard"}]]),
("Ubuntu", "ORG", [[{LOWER: "ubuntu"}]]),
("YouTube", "PRODUCT", [[{LOWER: "youtube"}]]),]:
ENTITY_RULES.append({ENT_ID: name, 'attrs': {ENT_TYPE: tag}, 'patterns': patterns})
("YouTube", "PRODUCT", [[{LOWER: "youtube"}]]),
]:
ENTITY_RULES.append({ENT_ID: name, "attrs": {ENT_TYPE: tag}, "patterns": patterns})
FALSE_POSITIVES = [
@ -46,5 +47,5 @@ FALSE_POSITIVES = [
[{ORTH: "Yay"}],
[{ORTH: "Ahh"}],
[{ORTH: "Yea"}],
[{ORTH: "Bah"}]
[{ORTH: "Bah"}],
]

View File

@ -16,8 +16,10 @@ from ...util import update_exc, add_lookups
class SpanishDefaults(Language.Defaults):
lex_attr_getters = dict(Language.Defaults.lex_attr_getters)
lex_attr_getters[LANG] = lambda text: 'es'
lex_attr_getters[NORM] = add_lookups(Language.Defaults.lex_attr_getters[NORM], BASE_NORMS)
lex_attr_getters[LANG] = lambda text: "es"
lex_attr_getters[NORM] = add_lookups(
Language.Defaults.lex_attr_getters[NORM], BASE_NORMS
)
tokenizer_exceptions = update_exc(BASE_EXCEPTIONS, TOKENIZER_EXCEPTIONS)
tag_map = TAG_MAP
stop_words = STOP_WORDS
@ -26,8 +28,8 @@ class SpanishDefaults(Language.Defaults):
class Spanish(Language):
lang = 'es'
lang = "es"
Defaults = SpanishDefaults
__all__ = ['Spanish']
__all__ = ["Spanish"]

View File

@ -18,5 +18,5 @@ sentences = [
"El gato come pescado",
"Veo al hombre con el telescopio",
"La araña come moscas",
"El pingüino incuba en su nido"
"El pingüino incuba en su nido",
]

View File

@ -2,7 +2,8 @@
from __future__ import unicode_literals
STOP_WORDS = set("""
STOP_WORDS = set(
"""
actualmente acuerdo adelante ademas además adrede afirmó agregó ahi ahora ahí
al algo alguna algunas alguno algunos algún alli allí alrededor ambos ampleamos
antano antaño ante anterior antes apenas aproximadamente aquel aquella aquellas
@ -81,4 +82,5 @@ va vais valor vamos van varias varios vaya veces ver verdad verdadera verdadero
vez vosotras vosotros voy vuestra vuestras vuestro vuestros
ya yo
""".split())
""".split()
)

View File

@ -8,18 +8,20 @@ def noun_chunks(obj):
doc = obj.doc
if not len(doc):
return
np_label = doc.vocab.strings.add('NP')
left_labels = ['det', 'fixed', 'neg'] #['nunmod', 'det', 'appos', 'fixed']
right_labels = ['flat', 'fixed', 'compound', 'neg']
stop_labels = ['punct']
np_label = doc.vocab.strings.add("NP")
left_labels = ["det", "fixed", "neg"] # ['nunmod', 'det', 'appos', 'fixed']
right_labels = ["flat", "fixed", "compound", "neg"]
stop_labels = ["punct"]
np_left_deps = [doc.vocab.strings.add(label) for label in left_labels]
np_right_deps = [doc.vocab.strings.add(label) for label in right_labels]
stop_deps = [doc.vocab.strings.add(label) for label in stop_labels]
token = doc[0]
while token and token.i < len(doc):
if token.pos in [PROPN, NOUN, PRON]:
left, right = noun_bounds(doc, token, np_left_deps, np_right_deps, stop_deps)
yield left.i, right.i+1, np_label
left, right = noun_bounds(
doc, token, np_left_deps, np_right_deps, stop_deps
)
yield left.i, right.i + 1, np_label
token = right
token = next_token(token)
@ -31,7 +33,7 @@ def is_verb_token(token):
def next_token(token):
try:
return token.nbor()
except:
except IndexError:
return None
@ -42,16 +44,20 @@ def noun_bounds(doc, root, np_left_deps, np_right_deps, stop_deps):
left_bound = token
right_bound = root
for token in root.rights:
if (token.dep in np_right_deps):
left, right = noun_bounds(doc, token, np_left_deps, np_right_deps, stop_deps)
if list(filter(lambda t: is_verb_token(t) or t.dep in stop_deps,
doc[left_bound.i: right.i])):
if token.dep in np_right_deps:
left, right = noun_bounds(
doc, token, np_left_deps, np_right_deps, stop_deps
)
if list(
filter(
lambda t: is_verb_token(t) or t.dep in stop_deps,
doc[left_bound.i : right.i],
)
):
break
else:
right_bound = right
return left_bound, right_bound
SYNTAX_ITERATORS = {
'noun_chunks': noun_chunks
}
SYNTAX_ITERATORS = {"noun_chunks": noun_chunks}

View File

@ -4,7 +4,7 @@ from __future__ import unicode_literals
from ...symbols import POS, PUNCT, SYM, ADJ, NUM, DET, ADV, ADP, X, VERB
from ...symbols import NOUN, PROPN, PART, INTJ, SPACE, PRON, SCONJ, AUX, CONJ
# fmt: off
TAG_MAP = {
"ADJ___": {"morph": "_", POS: ADJ},
"ADJ__AdpType=Prep": {"morph": "AdpType=Prep", POS: ADJ},
@ -29,7 +29,7 @@ TAG_MAP = {
"ADP__AdpType=Preppron|Gender=Fem|Number=Sing": {"morph": "AdpType=Preppron|Gender=Fem|Number=Sing", POS: ADP},
"ADP__AdpType=Preppron|Gender=Masc|Number=Plur": {"morph": "AdpType=Preppron|Gender=Masc|Number=Plur", POS: ADP},
"ADP__AdpType=Preppron|Gender=Masc|Number=Sing": {"morph": "AdpType=Preppron|Gender=Masc|Number=Sing", POS: ADP},
"ADP": { POS: ADP},
"ADP": {POS: ADP},
"ADV___": {"morph": "_", POS: ADV},
"ADV__AdpType=Prep": {"morph": "AdpType=Prep", POS: ADV},
"ADV__AdpType=Preppron|Gender=Masc|Number=Sing": {"morph": "AdpType=Preppron|Gender=Masc|Number=Sing", POS: ADV},
@ -135,7 +135,7 @@ TAG_MAP = {
"DET__Number=Sing|PronType=Ind": {"morph": "Number=Sing|PronType=Ind", POS: DET},
"DET__PronType=Int": {"morph": "PronType=Int", POS: DET},
"DET__PronType=Rel": {"morph": "PronType=Rel", POS: DET},
"DET": { POS: DET},
"DET": {POS: DET},
"INTJ___": {"morph": "_", POS: INTJ},
"NOUN___": {"morph": "_", POS: NOUN},
"NOUN__AdvType=Tim": {"morph": "AdvType=Tim", POS: NOUN},
@ -307,3 +307,4 @@ TAG_MAP = {
"X___": {"morph": "_", POS: X},
"_SP": {"morph": "_", POS: SPACE},
}
# fmt: on

View File

@ -1,17 +1,12 @@
# coding: utf8
from __future__ import unicode_literals
from ...symbols import ORTH, LEMMA, TAG, NORM, ADP, DET, PRON_LEMMA
from ...symbols import ORTH, LEMMA, NORM, PRON_LEMMA
_exc = {
"pal": [
{ORTH: "pa", LEMMA: "para"},
{ORTH: "l", LEMMA: "el", NORM: "el"}],
"pala": [
{ORTH: "pa", LEMMA: "para"},
{ORTH: "la", LEMMA: "la", NORM: "la"}]
"pal": [{ORTH: "pa", LEMMA: "para"}, {ORTH: "l", LEMMA: "el", NORM: "el"}],
"pala": [{ORTH: "pa", LEMMA: "para"}, {ORTH: "la", LEMMA: "la", NORM: "la"}],
}
@ -24,32 +19,50 @@ for exc_data in [
{ORTH: "Ud.", LEMMA: PRON_LEMMA, NORM: "usted"},
{ORTH: "Vd.", LEMMA: PRON_LEMMA, NORM: "usted"},
{ORTH: "Uds.", LEMMA: PRON_LEMMA, NORM: "ustedes"},
{ORTH: "Vds.", LEMMA: PRON_LEMMA, NORM: "ustedes"}]:
{ORTH: "Vds.", LEMMA: PRON_LEMMA, NORM: "ustedes"},
]:
_exc[exc_data[ORTH]] = [exc_data]
# Times
_exc["12m."] = [
{ORTH: "12"},
{ORTH: "m.", LEMMA: "p.m."}]
_exc["12m."] = [{ORTH: "12"}, {ORTH: "m.", LEMMA: "p.m."}]
for h in range(1, 12 + 1):
for period in ["a.m.", "am"]:
_exc["%d%s" % (h, period)] = [
{ORTH: "%d" % h},
{ORTH: period, LEMMA: "a.m."}]
_exc["%d%s" % (h, period)] = [{ORTH: "%d" % h}, {ORTH: period, LEMMA: "a.m."}]
for period in ["p.m.", "pm"]:
_exc["%d%s" % (h, period)] = [
{ORTH: "%d" % h},
{ORTH: period, LEMMA: "p.m."}]
_exc["%d%s" % (h, period)] = [{ORTH: "%d" % h}, {ORTH: period, LEMMA: "p.m."}]
for orth in [
"a.C.", "a.J.C.", "apdo.", "Av.", "Avda.", "Cía.", "etc.", "Gob.", "Gral.",
"Ing.", "J.C.", "Lic.", "m.n.", "no.", "núm.", "P.D.", "Prof.", "Profa.",
"q.e.p.d.", "S.A.", "S.L.", "s.s.s.", "Sr.", "Sra.", "Srta."]:
"a.C.",
"a.J.C.",
"apdo.",
"Av.",
"Avda.",
"Cía.",
"etc.",
"Gob.",
"Gral.",
"Ing.",
"J.C.",
"Lic.",
"m.n.",
"no.",
"núm.",
"P.D.",
"Prof.",
"Profa.",
"q.e.p.d.",
"S.A.",
"S.L.",
"s.s.s.",
"Sr.",
"Sra.",
"Srta.",
]:
_exc[orth] = [{ORTH: orth}]

View File

@ -12,11 +12,14 @@ from .tag_map import TAG_MAP
from .punctuation import TOKENIZER_SUFFIXES
from .lemmatizer import LEMMA_RULES, LEMMA_INDEX, LEMMA_EXC
class PersianDefaults(Language.Defaults):
lex_attr_getters = dict(Language.Defaults.lex_attr_getters)
lex_attr_getters.update(LEX_ATTRS)
lex_attr_getters[NORM] = add_lookups(Language.Defaults.lex_attr_getters[NORM], BASE_NORMS)
lex_attr_getters[LANG] = lambda text: 'fa'
lex_attr_getters[NORM] = add_lookups(
Language.Defaults.lex_attr_getters[NORM], BASE_NORMS
)
lex_attr_getters[LANG] = lambda text: "fa"
tokenizer_exceptions = update_exc(TOKENIZER_EXCEPTIONS)
lemma_rules = LEMMA_RULES
lemma_index = LEMMA_INDEX
@ -27,8 +30,8 @@ class PersianDefaults(Language.Defaults):
class Persian(Language):
lang = 'fa'
lang = "fa"
Defaults = PersianDefaults
__all__ = ['Persian']
__all__ = ["Persian"]

View File

@ -12,8 +12,8 @@ Example sentences to test spaCy and its language models.
sentences = [
"این یک جمله نمونه می باشد.",
"قرار ما، امروز ساعت ۲:۳۰ بعدازظهر هست!"
"قرار ما، امروز ساعت ۲:۳۰ بعدازظهر هست!",
"دیروز علی به من ۲۰۰۰.۱﷼ پول نقد داد.",
"چطور می‌توان از تهران به کاشان رفت؟"
"حدود ۸۰٪ هوا از نیتروژن تشکیل شده است."
"چطور می‌توان از تهران به کاشان رفت؟",
"حدود ۸۰٪ هوا از نیتروژن تشکیل شده است.",
]

View File

@ -10,23 +10,13 @@ from ._verbs_exc import VERBS_EXC
from ._lemma_rules import ADJECTIVE_RULES, NOUN_RULES, VERB_RULES, PUNCT_RULES
LEMMA_INDEX = {
'adj': ADJECTIVES,
'noun': NOUNS,
'verb': VERBS
}
LEMMA_INDEX = {"adj": ADJECTIVES, "noun": NOUNS, "verb": VERBS}
LEMMA_RULES = {
'adj': ADJECTIVE_RULES,
'noun': NOUN_RULES,
'verb': VERB_RULES,
'punct': PUNCT_RULES
"adj": ADJECTIVE_RULES,
"noun": NOUN_RULES,
"verb": VERB_RULES,
"punct": PUNCT_RULES,
}
LEMMA_EXC = {
'adj': ADJECTIVES_EXC,
'noun': NOUNS_EXC,
'verb': VERBS_EXC
}
LEMMA_EXC = {"adj": ADJECTIVES_EXC, "noun": NOUNS_EXC, "verb": VERBS_EXC}

Some files were not shown because too many files have changed in this diff.