spaCy/spacy
Matthew Honnibal 04395ffa49 Bring English tag_map in line with UD Treebank
I wrote a small script to read the UD English training data and check
that our tag map and morph rules were resulting in the best POS map.
This hadn't been done for some time, and there have been various changes
to the UD schema since it was last done. After these changes we should
see much better agreement between our POS assignments and the UD POS
tags.
2019-03-21 13:53:44 +01:00
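The script mentioned in the commit message is not part of this listing. As a rough sketch of the kind of check it describes, the following reads CoNLL-U lines, finds the most frequent UPOS tag for each fine-grained tag, and reports where a tag map disagrees. Everything here is an assumption made for illustration: the function names, the simplified `{tag: pos}` shape of the tag map, and the toy sample rows, which are not taken from the UD English treebank.

```python
from collections import Counter, defaultdict


def read_conllu_tags(lines):
    """Yield (xpos, upos) pairs from CoNLL-U formatted lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank separators and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token row
        token_id = cols[0]
        # Skip multiword token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in token_id or "." in token_id:
            continue
        yield cols[4], cols[3]  # XPOS is column 5, UPOS is column 4


def best_pos_by_tag(pairs):
    """Map each fine-grained tag to the UPOS it most often co-occurs with."""
    counts = defaultdict(Counter)
    for xpos, upos in pairs:
        counts[xpos][upos] += 1
    return {tag: c.most_common(1)[0][0] for tag, c in counts.items()}


def find_disagreements(tag_map, best):
    """Return {tag: (mapped_pos, treebank_pos)} where the two differ."""
    return {
        tag: (tag_map.get(tag), upos)
        for tag, upos in best.items()
        if tag_map.get(tag) != upos
    }


# Toy rows in CoNLL-U layout (hypothetical, for illustration only).
SAMPLE = [
    "# text = give up the plan",
    "1\tgive\tgive\tVERB\tVB\t_\t0\troot\t_\t_",
    "2\tup\tup\tADP\tRP\t_\t1\tcompound:prt\t_\t_",
    "3\tthe\tthe\tDET\tDT\t_\t4\tdet\t_\t_",
    "4\tplan\tplan\tNOUN\tNN\t_\t1\tobj\t_\t_",
]

# Hypothetical, simplified tag map: fine-grained tag -> coarse POS string.
TOY_TAG_MAP = {"VB": "VERB", "RP": "PART", "DT": "DET", "NN": "NOUN"}

best = best_pos_by_tag(read_conllu_tags(SAMPLE))
disagreements = find_disagreements(TOY_TAG_MAP, best)
for tag, (mapped, observed) in sorted(disagreements.items()):
    print(f"{tag}: tag map says {mapped}, data mostly has {observed}")
```

On a real treebank the same functions could be fed the lines of the training file instead of `SAMPLE`; picking the single most frequent UPOS per tag is one simple reading of "best POS map" and ignores cases that morph rules would handle per word.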
cli Merge pull request #3441 from explosion/fix/cli-ud-scripts 2019-03-20 12:19:15 +01:00
data Make spacy/data a package 2017-03-18 20:04:22 +01:00
displacy 💫 Fix displaCy support for RTL languages (#3393) 2019-03-11 18:52:50 +01:00
lang Bring English tag_map in line with UD Treebank 2019-03-21 13:53:44 +01:00
matcher Add actual deprecation warning for n_threads (resolves #3410) 2019-03-15 16:38:44 +01:00
pipeline Tidy up references to n_threads and fix default 2019-03-15 16:24:26 +01:00
syntax Improve beam search defaults 2019-03-17 21:47:45 +01:00
tests Merging conversion scripts for conll formats (#3405) 2019-03-15 18:14:46 +01:00
tokens Fix similarity calculation if vectors are on GPU (#3440) 2019-03-20 12:09:59 +01:00
__init__.pxd * Seems to be working after refactor. Need to wire up more POS tag features, and wire up save/load of POS tags. 2014-10-24 02:23:42 +11:00
__init__.py Fix formatting (hopefully also restarts build properly) 2019-03-20 09:55:45 +01:00
__main__.py Update __main__.py 2019-03-20 09:43:26 +01:00
_align.pyx Improve alignment around quotes 2018-08-16 01:04:34 +02:00
_ml.py Revert changes to optimizer default hyper-params (WIP) (#3415) 2019-03-16 21:39:02 +01:00
about.py Set version to 2.1.1 2019-03-20 00:59:45 +01:00
attrs.pxd Fix LANG symbol 2018-02-17 18:10:50 +01:00
attrs.pyx Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" 2018-03-27 19:23:02 +02:00
compat.py Tidy up and improve docs and docstrings (#3370) 2019-03-08 11:42:26 +01:00
errors.py Fix formatting (hopefully also restarts build properly) 2019-03-20 09:55:45 +01:00
glossary.py 💫 Tidy up and auto-format .py files (#2983) 2018-11-30 17:03:03 +01:00
gold.pxd Add support for sent_start to GoldParse 2017-08-25 20:03:14 -05:00
gold.pyx Fix jsonl to json conversion (#3419) 2019-03-17 22:12:54 +01:00
language.py Merge pull request #3416 from explosion/feature/improve-beam 2019-03-16 18:42:18 +01:00
lemmatizer.py Tidy up and improve docs and docstrings (#3370) 2019-03-08 11:42:26 +01:00
lexeme.pxd 💫 Support lexical attributes in retokenizer attrs (closes #2390) (#3325) 2019-02-24 21:13:51 +01:00
lexeme.pyx Tidy up property code style (#3391) 2019-03-11 15:59:09 +01:00
morphology.pxd Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" 2018-03-27 19:23:02 +02:00
morphology.pyx 💫 Fix interaction of lemmatizer and tokenizer exceptions (#3388) 2019-03-11 01:31:21 +01:00
parts_of_speech.pxd Add support for Universal Dependencies v2.0 2017-03-03 13:17:34 +01:00
parts_of_speech.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
scorer.py 💫 Tidy up and auto-format .py files (#2983) 2018-11-30 17:03:03 +01:00
strings.pxd Try to fix StringStore clean up (see #1506) 2017-11-11 03:11:27 +03:00
strings.pyx 💫 Make serialization methods consistent (#3385) 2019-03-10 19:16:45 +01:00
structs.pxd Make NORM a token attribute (#3029) 2018-12-08 10:49:10 +01:00
symbols.pxd Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" 2018-03-27 19:23:02 +02:00
symbols.pyx Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" 2018-03-27 19:23:02 +02:00
tokenizer.pxd Disable tokenizer cache for special-cases. Fixes #1250 2017-10-24 16:08:05 +02:00
tokenizer.pyx Add actual deprecation warning for n_threads (resolves #3410) 2019-03-15 16:38:44 +01:00
typedefs.pxd Work on changing StringStore to return hashes. 2017-05-28 12:36:27 +02:00
typedefs.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
util.py Auto-format [ci skip] 2019-03-11 17:10:50 +01:00
vectors.pyx Update Vectors.find docs [ci skip] 2019-03-16 17:10:57 +01:00
vocab.pxd 💫 Small efficiency fixes to tokenizer (#2587) 2018-07-24 23:35:54 +02:00
vocab.pyx Tidy up property code style (#3391) 2019-03-11 15:59:09 +01:00