14893 Commits

Author SHA1 Message Date
Matthew Honnibal
0e24d099a1 * Fix L/R edge bug, by ensuring l_edge and r_edge are preset, and fixing the way the edge update in del_arc. Bugs keep arising here because the edges are absolute positions, where everything else is relative. I'm also not 100% convinced that del_arc is handled correctly. Do we need to update the parents? 2015-09-09 03:40:44 +02:00
Matthew Honnibal
83d1a1e512 * Fix lemmatizer tests 2015-09-08 15:39:43 +02:00
Matthew Honnibal
2be3620333 * Save morphological analyses in a cache 2015-09-08 15:39:24 +02:00
Matthew Honnibal
1def5a6cbe * Fix print statements in matcher 2015-09-08 15:38:19 +02:00
Matthew Honnibal
64d71f8893 * Fix lemmatizer 2015-09-08 15:38:03 +02:00
Matthew Honnibal
b2e82e55f6 * Create POS model dir in training script 2015-09-08 15:36:23 +02:00
Matthew Honnibal
623329b19a Merge branch 'master' of ssh://github.com/honnibal/spaCy into develop 2015-09-08 14:27:01 +02:00
Matthew Honnibal
62a01dd41d * Fix issue #92: lexemes.bin read error on 32-bit platforms. 2015-09-08 14:23:58 +02:00
Matthew Honnibal
55ed3b3a63 Merge pull request #85 from NSchrading/master: Add a script to generate the specials.json file 2015-09-07 09:05:19 +10:00
Matthew Honnibal
ef58607a99 * Add spacy.it 2015-09-06 22:10:37 +02:00
Matthew Honnibal
2154a54f6b * Add spacy.de 2015-09-06 21:56:47 +02:00
Matthew Honnibal
a03e2a0b65 * Remove old docs files 2015-09-06 20:20:55 +02:00
Matthew Honnibal
fc8f7b123d * Mark a matcher test as requiring the model 2015-09-06 20:19:51 +02:00
Matthew Honnibal
f6ec5bf1b0 * Use empty tag map in vocab if none supplied 2015-09-06 20:19:27 +02:00
Matthew Honnibal
4f8e38271d * Fix merge errors in lexeme.pxd 2015-09-06 20:19:08 +02:00
Matthew Honnibal
5ad4527c42 * Rename Deutsch to German 2015-09-06 20:18:58 +02:00
Matthew Honnibal
86c888667f * Merge in changes from de branch 2015-09-06 19:49:28 +02:00
Matthew Honnibal
d2fc104a26 * Begin merge of Gazetteer and DE branches 2015-09-06 19:45:15 +02:00
Matthew Honnibal
dbf8dce109 Merge branch 'gaz' of ssh://github.com/honnibal/spaCy into gaz 2015-09-06 18:44:14 +02:00
Matthew Honnibal
577418986a * Add draft Italian stuff 2015-09-06 18:44:10 +02:00
Matthew Honnibal
80a66c0159 * Add draft Finnish stuff 2015-09-06 18:43:44 +02:00
Matthew Honnibal
b3703836f9 * Add en lemma rules 2015-09-06 17:56:11 +02:00
Matthew Honnibal
238b2f533b * Add lemma rules 2015-09-06 17:55:53 +02:00
Matthew Honnibal
c9f2082e3c * Fix compilation error in en/tag_map.json 2015-09-06 17:54:51 +02:00
Matthew Honnibal
9eae9837c4 * Fix morphology look up 2015-09-06 17:53:39 +02:00
Matthew Honnibal
6427a3fcac * Temporarily import flag attributes in matcher 2015-09-06 17:53:12 +02:00
Matthew Honnibal
7cc56ada6e * Temporarily add py_set_flag attribute in Lexeme 2015-09-06 17:52:51 +02:00
Matthew Honnibal
e35bb36be7 * Ensure Lexeme.check_flag returns a boolean value 2015-09-06 17:52:32 +02:00
Matthew Honnibal
d1eea2d865 * Update train.py for language-generic spaCy 2015-09-06 17:51:48 +02:00
Matthew Honnibal
950ce36660 * Update init model 2015-09-06 17:51:30 +02:00
Matthew Honnibal
4f765eee79 Merge branch 'gaz' of https://github.com/honnibal/spaCy into gaz 2015-09-06 14:07:43 +02:00
Matthew Honnibal
7e4fea67d3 * Fix bug in token subtree, introduced by duplication of L/R code in Stateclass. Need to consolidate the two methods. 2015-09-06 10:48:36 +02:00
Matthew Honnibal
571b6eda88 * Upd tests 2015-09-06 05:40:10 +02:00
Matthew Honnibal
5edac11225 * Wrap self.parse in nogil, and break if an invalid move is predicted. The invalid break is a work-around that papers over likely bugs, but we can't easily break in the nogil block, and otherwise we'll get an infinite loop. Need to set this as an error flag. 2015-09-06 04:15:00 +02:00
Matthew Honnibal
fd1eeb3102 * Add POS attribute support in get_attr 2015-09-06 04:13:03 +02:00
Matthew Honnibal
534e3dda3c * More work on language independent parsing 2015-08-28 03:44:54 +02:00
Matthew Honnibal
c2307fa9ee * More work on language-generic parsing 2015-08-28 02:02:33 +02:00
Matthew Honnibal
86c4a8e3e2 * Work on new morphology organization 2015-08-27 23:11:51 +02:00
Matthew Honnibal
5b89e2454c * Improve error-reporting in tagger 2015-08-27 10:26:36 +02:00
Matthew Honnibal
f0a7c99554 * Relax rule-requirement in lemmatizer 2015-08-27 10:26:19 +02:00
Matthew Honnibal
b6b1e1aa12 * Add link for Finnish model 2015-08-27 10:26:02 +02:00
Matthew Honnibal
0af139e183 * Tagger training now working. Still need to test load/save of model. Morphology still broken. 2015-08-27 09:16:11 +02:00
Matthew Honnibal
320ced276a * Add tagger training script 2015-08-27 09:15:41 +02:00
Matthew Honnibal
56c4e07a59 Update gazetteer.json 2015-08-27 08:53:48 +10:00
Matthew Honnibal
c07eea8563 * Comment out old doc tests for now 2015-08-26 19:23:04 +02:00
Matthew Honnibal
884251801e * Mark space tests as requiring model 2015-08-26 19:22:50 +02:00
Matthew Honnibal
ff9db9f3ae * Fix serializer tests for new attr scheme 2015-08-26 19:22:26 +02:00
Matthew Honnibal
658c4a3930 * Mark test_inital as requiring models 2015-08-26 19:22:06 +02:00
Matthew Honnibal
1302d35dff * Rework interfaces in vocab 2015-08-26 19:21:46 +02:00
Matthew Honnibal
2d521768a3 * Store Morphology class in Vocab 2015-08-26 19:21:03 +02:00