Commit Graph

15955 Commits

Author SHA1 Message Date
Matthew Honnibal
5b89e2454c * Improve error-reporting in tagger 2015-08-27 10:26:36 +02:00
Matthew Honnibal
f0a7c99554 * Relax rule-requirement in lemmatizer 2015-08-27 10:26:19 +02:00
Matthew Honnibal
b6b1e1aa12 * Add link for Finnish model 2015-08-27 10:26:02 +02:00
Matthew Honnibal
0af139e183 * Tagger training now working. Still need to test load/save of model. Morphology still broken. 2015-08-27 09:16:11 +02:00
Matthew Honnibal
320ced276a * Add tagger training script 2015-08-27 09:15:41 +02:00
Matthew Honnibal
56c4e07a59 Update gazetteer.json 2015-08-27 08:53:48 +10:00
Matthew Honnibal
c07eea8563 * Comment out old doc tests for now 2015-08-26 19:23:04 +02:00
Matthew Honnibal
884251801e * Mark space tests as requiring model 2015-08-26 19:22:50 +02:00
Matthew Honnibal
ff9db9f3ae * Fix serializer tests for new attr scheme 2015-08-26 19:22:26 +02:00
Matthew Honnibal
658c4a3930 * Mark test_inital as requiring models 2015-08-26 19:22:06 +02:00
Matthew Honnibal
1302d35dff * Rework interfaces in vocab 2015-08-26 19:21:46 +02:00
Matthew Honnibal
2d521768a3 * Store Morphology class in Vocab 2015-08-26 19:21:03 +02:00
Matthew Honnibal
d30029979e * Avoid import of morphology in spans 2015-08-26 19:20:46 +02:00
Matthew Honnibal
119c0f8c3f * Hack out morphology stuff from tokenizer, while morphology being reimplemented. 2015-08-26 19:20:11 +02:00
Matthew Honnibal
b4faf551f5 * Refactor language-independent tagger class 2015-08-26 19:19:21 +02:00
Matthew Honnibal
a3d5e6c0dd * Reform constructor and save/load workflow in parser model 2015-08-26 19:19:01 +02:00
Matthew Honnibal
1d7f2d3abc * Hack on morphology structs 2015-08-26 19:18:36 +02:00
Matthew Honnibal
f8f2f4e545 * Temporarily add PUNC name to parts_of_specch dictionary, until better solution 2015-08-26 19:18:19 +02:00
Matthew Honnibal
008b02b035 * Comment out enums in Morpohlogy for now 2015-08-26 19:17:35 +02:00
Matthew Honnibal
378729f81a * Hack Morphology class towards usability 2015-08-26 19:17:21 +02:00
Matthew Honnibal
430affc347 * Fix missing n_patterns property in Matcher class. Fix from_dir method 2015-08-26 19:17:02 +02:00
Matthew Honnibal
3acf60df06 * Add missing properties in Lexeme class 2015-08-26 19:16:28 +02:00
Matthew Honnibal
76996f4145 * Hack on generic Language class. Still needs work for morphology, defaults, etc 2015-08-26 19:16:09 +02:00
Matthew Honnibal
e2ef78b29c * Gut pos.pyx module, since functionality moved to spacy/tagger.pyx 2015-08-26 19:15:42 +02:00
Matthew Honnibal
c4d8754385 * Specify LOCAL_DATA_DIR global in spacy.en.__init__.py 2015-08-26 19:15:07 +02:00
Matthew Honnibal
c2d8edd0bd * Add PROB attribute in attrs.pxd 2015-08-26 19:14:19 +02:00
Matthew Honnibal
dc13edd7cb * Refactor init_model to accomodate other languages 2015-08-26 19:14:05 +02:00
Matthew Honnibal
494da25872 * Refactor for more universal spacy 2015-08-26 19:13:50 +02:00
Matthew Honnibal
c5a27d1821 * Move lemmatizer to spacy 2015-08-25 15:47:08 +02:00
Matthew Honnibal
82217c6ec6 * Generalize lemmatizer 2015-08-25 15:46:19 +02:00
Matthew Honnibal
8083a07c3e * Use language base class 2015-08-25 15:37:30 +02:00
Matthew Honnibal
f2f699ac18 * Add language base class 2015-08-25 15:37:17 +02:00
jxs8172
85f01c5e16 Add contributor agreement. Add exception to 'it' so that 'its' and 'Its' isn't generated (its =/= it's) 2015-08-24 18:20:06 -04:00
Matthew Honnibal
25f29232ca Merge pull request #86 from vsolovyov/fix-c-ext-in-setuppy: Correctly pass link_args in c_ext() in setup.py 2015-08-24 20:18:49 +10:00
Vsevolod Solovyov
bbdb973398 Add contributor agreement for vsolovyov 2015-08-24 13:09:23 +03:00
Vsevolod Solovyov
39cfe28f33 Correctly pass link_args in c_ext() in setup.py 2015-08-24 12:52:05 +03:00
Matthew Honnibal
5dd76be446 * Split EnPosTagger up into base class and subclass 2015-08-24 05:25:55 +02:00
Matthew Honnibal
bbf07ac253 * Cut down init_model to work on more languages 2015-08-24 01:05:20 +02:00
Matthew Honnibal
5d5922dbfa * Begin laying out morphological features 2015-08-24 01:04:30 +02:00
Matthew Honnibal
6f1743692a * Work on language-independent refactoring 2015-08-23 20:49:18 +02:00
Matthew Honnibal
3879d28457 * Fix https for url detection 2015-08-23 02:40:35 +02:00
Matthew Honnibal
aa12b374c0 * Remove old doc tests 2015-08-22 22:12:55 +02:00
Matthew Honnibal
692a8d3e3c * Begin rewriting twitter_filter examples 2015-08-22 22:12:26 +02:00
Matthew Honnibal
f9a6bea746 * Ignore keys and other things 2015-08-22 22:12:07 +02:00
Matthew Honnibal
ffbf9e9ca5 * Remove docs 2015-08-22 22:11:14 +02:00
Matthew Honnibal
dcc8fadc7e * Add gazetteer-matcher 2015-08-22 22:10:43 +02:00
Matthew Honnibal
890d6aa216 * Remove old docs 2015-08-22 22:06:30 +02:00
Matthew Honnibal
cad0cca4e3 * Tmp 2015-08-22 22:04:34 +02:00
jxs8172
5876248109 Add missing we've and hardcoded 's and 'S 2015-08-21 22:57:47 -04:00
jxs8172
a5e0a0073b Add a script to generate the specials.json file, to take care of handling uppercase and missing apostrophe contractions 2015-08-21 22:39:33 -04:00