Commit Graph

8864 Commits

Author SHA1 Message Date
Matthew Honnibal
950ce36660 * Update init model 2015-09-06 17:51:30 +02:00
Matthew Honnibal
4f765eee79 Merge branch 'gaz' of https://github.com/honnibal/spaCy into gaz 2015-09-06 14:07:43 +02:00
Matthew Honnibal
7e4fea67d3 * Fix bug in token subtree, introduced by duplication of L/R code in Stateclass. Need to consolidate the two methods. 2015-09-06 10:48:36 +02:00
Matthew Honnibal
571b6eda88 * Upd tests 2015-09-06 05:40:10 +02:00
Matthew Honnibal
5edac11225 * Wrap self.parse in nogil, and break if an invalid move is predicted. The invalid break is a work-around that papers over likely bugs, but we can't easily break in the nogil block, and otherwise we'll get an infinite loop. Need to set this as an error flag. 2015-09-06 04:15:00 +02:00
Matthew Honnibal
fd1eeb3102 * Add POS attribute support in get_attr 2015-09-06 04:13:03 +02:00
Matthew Honnibal
534e3dda3c * More work on language independent parsing 2015-08-28 03:44:54 +02:00
Matthew Honnibal
c2307fa9ee * More work on language-generic parsing 2015-08-28 02:02:33 +02:00
Matthew Honnibal
86c4a8e3e2 * Work on new morphology organization 2015-08-27 23:11:51 +02:00
Matthew Honnibal
5b89e2454c * Improve error-reporting in tagger 2015-08-27 10:26:36 +02:00
Matthew Honnibal
f0a7c99554 * Relax rule-requirement in lemmatizer 2015-08-27 10:26:19 +02:00
Matthew Honnibal
b6b1e1aa12 * Add link for Finnish model 2015-08-27 10:26:02 +02:00
Matthew Honnibal
0af139e183 * Tagger training now working. Still need to test load/save of model. Morphology still broken. 2015-08-27 09:16:11 +02:00
Matthew Honnibal
320ced276a * Add tagger training script 2015-08-27 09:15:41 +02:00
Matthew Honnibal
56c4e07a59 Update gazetteer.json 2015-08-27 08:53:48 +10:00
Matthew Honnibal
c07eea8563 * Comment out old doc tests for now 2015-08-26 19:23:04 +02:00
Matthew Honnibal
884251801e * Mark space tests as requiring model 2015-08-26 19:22:50 +02:00
Matthew Honnibal
ff9db9f3ae * Fix serializer tests for new attr scheme 2015-08-26 19:22:26 +02:00
Matthew Honnibal
658c4a3930 * Mark test_inital as requiring models 2015-08-26 19:22:06 +02:00
Matthew Honnibal
1302d35dff * Rework interfaces in vocab 2015-08-26 19:21:46 +02:00
Matthew Honnibal
2d521768a3 * Store Morphology class in Vocab 2015-08-26 19:21:03 +02:00
Matthew Honnibal
d30029979e * Avoid import of morphology in spans 2015-08-26 19:20:46 +02:00
Matthew Honnibal
119c0f8c3f * Hack out morphology stuff from tokenizer, while morphology being reimplemented. 2015-08-26 19:20:11 +02:00
Matthew Honnibal
b4faf551f5 * Refactor language-independent tagger class 2015-08-26 19:19:21 +02:00
Matthew Honnibal
a3d5e6c0dd * Reform constructor and save/load workflow in parser model 2015-08-26 19:19:01 +02:00
Matthew Honnibal
1d7f2d3abc * Hack on morphology structs 2015-08-26 19:18:36 +02:00
Matthew Honnibal
f8f2f4e545 * Temporarily add PUNC name to parts_of_specch dictionary, until better solution 2015-08-26 19:18:19 +02:00
Matthew Honnibal
008b02b035 * Comment out enums in Morpohlogy for now 2015-08-26 19:17:35 +02:00
Matthew Honnibal
378729f81a * Hack Morphology class towards usability 2015-08-26 19:17:21 +02:00
Matthew Honnibal
430affc347 * Fix missing n_patterns property in Matcher class. Fix from_dir method 2015-08-26 19:17:02 +02:00
Matthew Honnibal
3acf60df06 * Add missing properties in Lexeme class 2015-08-26 19:16:28 +02:00
Matthew Honnibal
76996f4145 * Hack on generic Language class. Still needs work for morphology, defaults, etc 2015-08-26 19:16:09 +02:00
Matthew Honnibal
e2ef78b29c * Gut pos.pyx module, since functionality moved to spacy/tagger.pyx 2015-08-26 19:15:42 +02:00
Matthew Honnibal
c4d8754385 * Specify LOCAL_DATA_DIR global in spacy.en.__init__.py 2015-08-26 19:15:07 +02:00
Matthew Honnibal
c2d8edd0bd * Add PROB attribute in attrs.pxd 2015-08-26 19:14:19 +02:00
Matthew Honnibal
dc13edd7cb * Refactor init_model to accomodate other languages 2015-08-26 19:14:05 +02:00
Matthew Honnibal
494da25872 * Refactor for more universal spacy 2015-08-26 19:13:50 +02:00
Matthew Honnibal
c5a27d1821 * Move lemmatizer to spacy 2015-08-25 15:47:08 +02:00
Matthew Honnibal
82217c6ec6 * Generalize lemmatizer 2015-08-25 15:46:19 +02:00
Matthew Honnibal
8083a07c3e * Use language base class 2015-08-25 15:37:30 +02:00
Matthew Honnibal
f2f699ac18 * Add language base class 2015-08-25 15:37:17 +02:00
jxs8172
85f01c5e16 Add contributor agreement. Add exception to 'it' so that 'its' and 'Its' isn't generated (its =/= it's) 2015-08-24 18:20:06 -04:00
Matthew Honnibal
25f29232ca Merge pull request #86 from vsolovyov/fix-c-ext-in-setuppy
Correctly pass link_args in c_ext() in setup.py
2015-08-24 20:18:49 +10:00
Vsevolod Solovyov
bbdb973398 Add contributor agreement for vsolovyov 2015-08-24 13:09:23 +03:00
Vsevolod Solovyov
39cfe28f33 Correctly pass link_args in c_ext() in setup.py 2015-08-24 12:52:05 +03:00
Matthew Honnibal
5dd76be446 * Split EnPosTagger up into base class and subclass 2015-08-24 05:25:55 +02:00
Matthew Honnibal
bbf07ac253 * Cut down init_model to work on more languages 2015-08-24 01:05:20 +02:00
Matthew Honnibal
5d5922dbfa * Begin laying out morphological features 2015-08-24 01:04:30 +02:00
Matthew Honnibal
6f1743692a * Work on language-independent refactoring 2015-08-23 20:49:18 +02:00
Matthew Honnibal
3879d28457 * Fix https for url detection 2015-08-23 02:40:35 +02:00