Commit Graph

20 Commits

Author SHA1 Message Date
Matthew Honnibal
68f479e821 * Rename Doc.data to Doc.c 2015-11-04 00:15:14 +11:00
Matthew Honnibal
6727a46bb5 * Fix Issue #118: Matcher behaves unpredictably when matches overlap. 2015-10-19 16:45:32 +11:00
Matthew Honnibal
c99285b8b9 * Clean up C++ usage in spacy/matcher.pyx 2015-10-18 17:20:50 +11:00
Matthew Honnibal
20fd36a0f7 * Very scrappy, likely buggy first-cut pickle implementation, to work on Issue #125: allow pickle for Apache Spark. The current implementation sends stuff to temp files, and does almost nothing to ensure all modifiable state is actually preserved. The Language() instance is a deep tree of extension objects, and if pickling during training, some of the C-data state is hard to preserve. 2015-10-13 13:44:41 +11:00
Matthew Honnibal
85ce36ab11 * Refactor symbols, so that frequency rank can be derived from the orth id of a word. 2015-10-13 13:44:39 +11:00
Matthew Honnibal
801d55a6d9 * Fix phrase matcher 2015-10-09 02:00:45 +11:00
Matthew Honnibal
1def5a6cbe * Fix print statements in matcher 2015-09-08 15:38:19 +02:00
Matthew Honnibal
86c888667f * Merge in changes from de branch 2015-09-06 19:49:28 +02:00
Matthew Honnibal
d2fc104a26 * Begin merge of Gazetteer and DE branches 2015-09-06 19:45:15 +02:00
Matthew Honnibal
6427a3fcac * Temporarily import flag attributes in matcher 2015-09-06 17:53:12 +02:00
Matthew Honnibal
430affc347 * Fix missing n_patterns property in Matcher class. Fix from_dir method 2015-08-26 19:17:02 +02:00
Matthew Honnibal
6f1743692a * Work on language-independent refactoring 2015-08-23 20:49:18 +02:00
Matthew Honnibal
cad0cca4e3 * Tmp 2015-08-22 22:04:34 +02:00
Matthew Honnibal
9f65879991 * Fix shape attr bug, and fix handling of false positive matches 2015-08-06 17:28:14 +02:00
Matthew Honnibal
383dfabd67 * Fix matcher setting of entities 2015-08-06 16:27:01 +02:00
Matthew Honnibal
cd7d1682cd * Fix loading of gazetteer.json file 2015-08-06 16:08:25 +02:00
Matthew Honnibal
5737115e1e * Work on gazetteer matching 2015-08-06 14:33:21 +02:00
Matthew Honnibal
9c1724ecae * Gazetteer stuff working, now need to wire up to API 2015-08-06 00:35:40 +02:00
Matthew Honnibal
5bc0e83f9a * Reimplement matching in Cython, instead of Python. 2015-08-05 01:05:54 +02:00
Matthew Honnibal
4c87a696b3 * Add draft dfa matcher, in Python. Passing tests. 2015-08-04 15:55:28 +02:00