Commit Graph

1808 Commits

Author SHA1 Message Date
Matthew Honnibal
238b2f533b * Add lemma rules 2015-09-06 17:55:53 +02:00
Matthew Honnibal
c9f2082e3c * Fix compilation error in en/tag_map.json 2015-09-06 17:54:51 +02:00
Matthew Honnibal
9eae9837c4 * Fix morphology look up 2015-09-06 17:53:39 +02:00
Matthew Honnibal
6427a3fcac * Temporarily import flag attributes in matcher 2015-09-06 17:53:12 +02:00
Matthew Honnibal
7cc56ada6e * Temporarily add py_set_flag attribute in Lexeme 2015-09-06 17:52:51 +02:00
Matthew Honnibal
e35bb36be7 * Ensure Lexeme.check_flag returns a boolean value 2015-09-06 17:52:32 +02:00
Matthew Honnibal
d1eea2d865 * Update train.py for language-generic spaCy 2015-09-06 17:51:48 +02:00
Matthew Honnibal
950ce36660 * Update init model 2015-09-06 17:51:30 +02:00
Matthew Honnibal
534e3dda3c * More work on language independent parsing 2015-08-28 03:44:54 +02:00
Matthew Honnibal
c2307fa9ee * More work on language-generic parsing 2015-08-28 02:02:33 +02:00
Matthew Honnibal
86c4a8e3e2 * Work on new morphology organization 2015-08-27 23:11:51 +02:00
Matthew Honnibal
5b89e2454c * Improve error-reporting in tagger 2015-08-27 10:26:36 +02:00
Matthew Honnibal
f0a7c99554 * Relax rule-requirement in lemmatizer 2015-08-27 10:26:19 +02:00
Matthew Honnibal
b6b1e1aa12 * Add link for Finnish model 2015-08-27 10:26:02 +02:00
Matthew Honnibal
0af139e183 * Tagger training now working. Still need to test load/save of model. Morphology still broken. 2015-08-27 09:16:11 +02:00
Matthew Honnibal
320ced276a * Add tagger training script 2015-08-27 09:15:41 +02:00
Matthew Honnibal
c07eea8563 * Comment out old doc tests for now 2015-08-26 19:23:04 +02:00
Matthew Honnibal
884251801e * Mark space tests as requiring model 2015-08-26 19:22:50 +02:00
Matthew Honnibal
ff9db9f3ae * Fix serializer tests for new attr scheme 2015-08-26 19:22:26 +02:00
Matthew Honnibal
658c4a3930 * Mark test_inital as requiring models 2015-08-26 19:22:06 +02:00
Matthew Honnibal
1302d35dff * Rework interfaces in vocab 2015-08-26 19:21:46 +02:00
Matthew Honnibal
2d521768a3 * Store Morphology class in Vocab 2015-08-26 19:21:03 +02:00
Matthew Honnibal
d30029979e * Avoid import of morphology in spans 2015-08-26 19:20:46 +02:00
Matthew Honnibal
119c0f8c3f * Hack out morphology stuff from tokenizer, while morphology being reimplemented. 2015-08-26 19:20:11 +02:00
Matthew Honnibal
b4faf551f5 * Refactor language-independent tagger class 2015-08-26 19:19:21 +02:00
Matthew Honnibal
a3d5e6c0dd * Reform constructor and save/load workflow in parser model 2015-08-26 19:19:01 +02:00
Matthew Honnibal
1d7f2d3abc * Hack on morphology structs 2015-08-26 19:18:36 +02:00
Matthew Honnibal
f8f2f4e545 * Temporarily add PUNC name to parts_of_specch dictionary, until better solution 2015-08-26 19:18:19 +02:00
Matthew Honnibal
008b02b035 * Comment out enums in Morpohlogy for now 2015-08-26 19:17:35 +02:00
Matthew Honnibal
378729f81a * Hack Morphology class towards usability 2015-08-26 19:17:21 +02:00
Matthew Honnibal
430affc347 * Fix missing n_patterns property in Matcher class. Fix from_dir method 2015-08-26 19:17:02 +02:00
Matthew Honnibal
3acf60df06 * Add missing properties in Lexeme class 2015-08-26 19:16:28 +02:00
Matthew Honnibal
76996f4145 * Hack on generic Language class. Still needs work for morphology, defaults, etc 2015-08-26 19:16:09 +02:00
Matthew Honnibal
e2ef78b29c * Gut pos.pyx module, since functionality moved to spacy/tagger.pyx 2015-08-26 19:15:42 +02:00
Matthew Honnibal
c4d8754385 * Specify LOCAL_DATA_DIR global in spacy.en.__init__.py 2015-08-26 19:15:07 +02:00
Matthew Honnibal
c2d8edd0bd * Add PROB attribute in attrs.pxd 2015-08-26 19:14:19 +02:00
Matthew Honnibal
dc13edd7cb * Refactor init_model to accomodate other languages 2015-08-26 19:14:05 +02:00
Matthew Honnibal
494da25872 * Refactor for more universal spacy 2015-08-26 19:13:50 +02:00
Matthew Honnibal
c5a27d1821 * Move lemmatizer to spacy 2015-08-25 15:47:08 +02:00
Matthew Honnibal
82217c6ec6 * Generalize lemmatizer 2015-08-25 15:46:19 +02:00
Matthew Honnibal
8083a07c3e * Use language base class 2015-08-25 15:37:30 +02:00
Matthew Honnibal
f2f699ac18 * Add language base class 2015-08-25 15:37:17 +02:00
Matthew Honnibal
5dd76be446 * Split EnPosTagger up into base class and subclass 2015-08-24 05:25:55 +02:00
Matthew Honnibal
bbf07ac253 * Cut down init_model to work on more languages 2015-08-24 01:05:20 +02:00
Matthew Honnibal
5d5922dbfa * Begin laying out morphological features 2015-08-24 01:04:30 +02:00
Matthew Honnibal
6f1743692a * Work on language-independent refactoring 2015-08-23 20:49:18 +02:00
Matthew Honnibal
3879d28457 * Fix https for url detection 2015-08-23 02:40:35 +02:00
Matthew Honnibal
aa12b374c0 * Remove old doc tests 2015-08-22 22:12:55 +02:00
Matthew Honnibal
692a8d3e3c * Begin rewriting twitter_filter examples 2015-08-22 22:12:26 +02:00
Matthew Honnibal
f9a6bea746 * Ignore keys and other things 2015-08-22 22:12:07 +02:00