Commit Graph

1069 Commits

Author SHA1 Message Date
Matthew Honnibal
dfdd4f2d60 Merge branch 'develop' of https://github.com/honnibal/spaCy into develop 2015-09-10 15:23:06 +02:00
Matthew Honnibal
e285ca7d6c * Load serializer freqs in vocab 2015-09-10 15:22:48 +02:00
Matthew Honnibal
f7fdcce1f9 Merge branch 'develop' of https://github.com/honnibal/spaCy into develop 2015-09-10 14:52:47 +02:00
Matthew Honnibal
85c3fec1d1 * Fix morphology loading 2015-09-10 14:52:23 +02:00
Matthew Honnibal
7c660c5efc * Use dict.get in lemmatizer 2015-09-10 14:51:39 +02:00
Matthew Honnibal
094440f9f5 Merge branch 'develop' of ssh://github.com/honnibal/spaCy into develop 2015-09-10 14:51:17 +02:00
Matthew Honnibal
c3f773cd63 * Fix Lexeme.check_flag 2015-09-10 14:51:05 +02:00
Matthew Honnibal
90da3a695d * Load lemmatizer from disk in Vocab.from_dir 2015-09-10 14:49:10 +02:00
Matthew Honnibal
e7e529edf4 * Fix Lexeme.check_flag 2015-09-10 14:45:43 +02:00
Matthew Honnibal
9e7bfe8449 * Fix space at end of merged token 2015-09-10 14:45:17 +02:00
Matthew Honnibal
f634191e27 * Fix vocab read/write 2015-09-10 14:44:38 +02:00
Matthew Honnibal
31ccf494e6 Merge branch 'develop' of https://github.com/honnibal/spaCy into develop 2015-09-09 14:33:38 +02:00
Matthew Honnibal
a7f4b26c8c * Tmp 2015-09-09 14:33:26 +02:00
Matthew Honnibal
07686470a9 * Don't consider a coordinated NP a base chunk 2015-09-09 14:32:28 +02:00
Matthew Honnibal
d9f1fc2112 * Add deprecation warning for unused load_vectors argument. 2015-09-09 14:31:09 +02:00
Matthew Honnibal
0b527fbdc8 * Set POS tag in morphology 2015-09-09 14:30:24 +02:00
Matthew Honnibal
07c09a0e1b * Fix attribute getters and setters in Lexeme 2015-09-09 14:29:22 +02:00
Matthew Honnibal
d6561988cf * Fix lexemes.bin 2015-09-09 11:49:51 +02:00
Matthew Honnibal
c301bebd33 Merge branch 'master' of https://github.com/honnibal/spaCy into develop 2015-09-09 10:55:39 +02:00
Matthew Honnibal
0e24d099a1 * Fix L/R edge bug, by ensuring l_edge and r_edge are preset, and fixing the way the edge update in del_arc. Bugs keep arising here because the edges are absolute positions, where everything else is relative. I'm also not 100% convinced that del_arc is handled correctly. Do we need to update the parents? 2015-09-09 03:40:44 +02:00
Matthew Honnibal
2be3620333 * Save morphological analyses in a cache 2015-09-08 15:39:24 +02:00
Matthew Honnibal
1def5a6cbe * Fix print statements in matcher 2015-09-08 15:38:19 +02:00
Matthew Honnibal
64d71f8893 * Fix lemmatizer 2015-09-08 15:38:03 +02:00
Matthew Honnibal
623329b19a Merge branch 'master' of ssh://github.com/honnibal/spaCy into develop 2015-09-08 14:27:01 +02:00
Matthew Honnibal
62a01dd41d * Fix issue #92: lexemes.bin read error on 32-bit platforms. 2015-09-08 14:23:58 +02:00
Matthew Honnibal
ef58607a99 * Add spacy.it 2015-09-06 22:10:37 +02:00
Matthew Honnibal
2154a54f6b * Add spacy.de 2015-09-06 21:56:47 +02:00
Matthew Honnibal
f6ec5bf1b0 * Use empty tag map in vocab if none supplied 2015-09-06 20:19:27 +02:00
Matthew Honnibal
4f8e38271d * Fix merge errors in lexeme.pxd 2015-09-06 20:19:08 +02:00
Matthew Honnibal
86c888667f * Merge in changes from de branch 2015-09-06 19:49:28 +02:00
Matthew Honnibal
d2fc104a26 * Begin merge of Gazetteer and DE branches 2015-09-06 19:45:15 +02:00
Matthew Honnibal
dbf8dce109 Merge branch 'gaz' of ssh://github.com/honnibal/spaCy into gaz 2015-09-06 18:44:14 +02:00
Matthew Honnibal
9eae9837c4 * Fix morphology look up 2015-09-06 17:53:39 +02:00
Matthew Honnibal
6427a3fcac * Temporarily import flag attributes in matcher 2015-09-06 17:53:12 +02:00
Matthew Honnibal
7cc56ada6e * Temporarily add py_set_flag attribute in Lexeme 2015-09-06 17:52:51 +02:00
Matthew Honnibal
e35bb36be7 * Ensure Lexeme.check_flag returns a boolean value 2015-09-06 17:52:32 +02:00
Matthew Honnibal
7e4fea67d3 * Fix bug in token subtree, introduced by duplication of L/R code in Stateclass. Need to consolidate the two methods. 2015-09-06 10:48:36 +02:00
Matthew Honnibal
5edac11225 * Wrap self.parse in nogil, and break if an invalid move is predicted. The invalid break is a work-around that papers over likely bugs, but we can't easily break in the nogil block, and otherwise we'll get an infinite loop. Need to set this as an error flag. 2015-09-06 04:15:00 +02:00
Matthew Honnibal
fd1eeb3102 * Add POS attribute support in get_attr 2015-09-06 04:13:03 +02:00
Matthew Honnibal
534e3dda3c * More work on language independent parsing 2015-08-28 03:44:54 +02:00
Matthew Honnibal
c2307fa9ee * More work on language-generic parsing 2015-08-28 02:02:33 +02:00
Matthew Honnibal
86c4a8e3e2 * Work on new morphology organization 2015-08-27 23:11:51 +02:00
Matthew Honnibal
5b89e2454c * Improve error-reporting in tagger 2015-08-27 10:26:36 +02:00
Matthew Honnibal
f0a7c99554 * Relax rule-requirement in lemmatizer 2015-08-27 10:26:19 +02:00
Matthew Honnibal
0af139e183 * Tagger training now working. Still need to test load/save of model. Morphology still broken. 2015-08-27 09:16:11 +02:00
Matthew Honnibal
1302d35dff * Rework interfaces in vocab 2015-08-26 19:21:46 +02:00
Matthew Honnibal
2d521768a3 * Store Morphology class in Vocab 2015-08-26 19:21:03 +02:00
Matthew Honnibal
d30029979e * Avoid import of morphology in spans 2015-08-26 19:20:46 +02:00
Matthew Honnibal
119c0f8c3f * Hack out morphology stuff from tokenizer, while morphology being reimplemented. 2015-08-26 19:20:11 +02:00
Matthew Honnibal
b4faf551f5 * Refactor language-independent tagger class 2015-08-26 19:19:21 +02:00