18 Commits

Author SHA1 Message Date
Matthew Honnibal
f4809e562f * Allow json to be used as a fallback if ujson is not available 2015-07-25 18:11:36 +02:00
Matthew Honnibal
2ae0b439b2 * Fix space check in gold.pyx 2015-07-14 00:10:27 +02:00
Matthew Honnibal
89a91ad726 * Add SPACE part-of-speech tag, and train tagger to assign it. Also train tagger not to make whitespace an entity 2015-07-09 13:30:41 +02:00
Matthew Honnibal
43ef5ddea5 * Ensure root label is spelled ROOT, for backwards compatibility 2015-06-23 04:14:03 +02:00
Matthew Honnibal
46fb24e9fd * Add cycle-checking code in gold.pyx 2015-06-23 00:02:22 +02:00
Matthew Honnibal
b643cb3d5c * Allow training documents to be filtered in gold.pyx 2015-06-12 02:42:08 +02:00
Matthew Honnibal
00a0dfcb59 * Avoid shipping the spacy.munge package 2015-06-08 00:54:13 +02:00
Matthew Honnibal
89b8775887 * Fix output from _min_edit_path when inputs match. 2015-06-06 05:58:53 +02:00
Matthew Honnibal
ae653b850a * Remove unused import from gold.pyx 2015-06-03 06:07:15 +02:00
Matthew Honnibal
a513ec500f * Have oracle functions take a struct instead of a Python object 2015-06-02 20:01:06 +02:00
Matthew Honnibal
87d6551d19 * Allow gold parse to cut non-projective arcs 2015-05-31 01:11:56 +02:00
Matthew Honnibal
9e39a206da * Fix efficiency of JSON reading, by using ujson instead of stream 2015-05-30 17:54:52 +02:00
Matthew Honnibal
76300bbb1b * Use updated JSON format, with sentences below paragraphs. Allows use of gold preprocessing flag. 2015-05-30 01:25:46 +02:00
Matthew Honnibal
b76bbbd12c * Read json files recursively from a directory, instead of requiring a single .json file 2015-05-29 03:52:55 +02:00
Matthew Honnibal
7a2725bca4 * Read input json in a streaming way 2015-05-27 19:13:11 +02:00
Matthew Honnibal
6016ee83a6 * Fix reading of NER in gold.pyx 2015-05-27 03:17:50 +02:00
Matthew Honnibal
3593babd35 * Add functions for Levenshtein distance alignment 2015-05-24 21:50:48 +02:00
Matthew Honnibal
fc75210941 * Move spacy.syntax.conll to spacy.gold 2015-05-24 21:35:02 +02:00