Commit Graph

5306 Commits

Author SHA1 Message Date
Daylen Yang
1692c2df3c Fix get_lang_class parsing
We want the get_lang_class to return "en" for both "en" and "en_glove_cc_300_1m_vectors". Changed the split rule to "_" so that this happens.
2016-05-16 14:38:20 -07:00
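A minimal sketch of the split rule described in the commit message above, assuming the language code is always the part of the name before the first underscore. The helper name `get_lang_code` is hypothetical and is not the spaCy implementation; it only illustrates why splitting on "_" gives "en" for both inputs.

```python
def get_lang_code(name):
    """Hypothetical helper: return the language-code prefix of a model name."""
    # Split on the first underscore only; a name with no underscore
    # (e.g. "en") is returned unchanged.
    return name.split("_", 1)[0]


print(get_lang_code("en"))                          # en
print(get_lang_code("en_glove_cc_300_1m_vectors"))  # en
```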
Matthew Honnibal
17137f5c0c * Fix issue #372: mistake in Lexeme rich comparison 2016-05-12 12:58:57 +02:00
Matthew Honnibal
cc8bf62208 * Fix Issue #360: Tokenizer failed when the infix regex matched the start of the string while trying to tokenize multi-infix tokens. 2016-05-09 13:23:47 +02:00
Matthew Honnibal
eab2376547 * Allow longer ellipses to be treated as a single token, e.g. Hello......there 2016-05-09 13:22:53 +02:00
Matthew Honnibal
c61ee8f9fa * Increment version 2016-05-09 13:20:00 +02:00
Matthew Honnibal
f6ef64f02c * Update changelog in preparation for 0.101.0 release 2016-05-09 12:57:07 +02:00
Matthew Honnibal
5d86c30f0b * Fix Issue #367: Missing has_vector property on Doc and Span objects 2016-05-09 12:36:14 +02:00
Wolfgang Seeker
7b78239436 add fix for German noun chunk iterator (issue #365) 2016-05-06 01:41:26 +02:00
Matthew Honnibal
8c0888d6cb * Fix error in span.sent 2016-05-06 00:28:05 +02:00
Matthew Honnibal
bb94022975 * Fix Issue #365: Error introduced during noun phrase chunking, due to use of corrected PRON/PROPN/etc tags. 2016-05-06 00:21:05 +02:00
Matthew Honnibal
41342ca79b Merge branch 'master' of ssh://github.com/spacy-io/spaCy 2016-05-06 00:17:58 +02:00
Matthew Honnibal
26095f9722 * Add span.sent property, re Issue #366 2016-05-06 00:17:38 +02:00
Wolfgang Seeker
dbf8f5f3ec fix bug in StateC.set_break() 2016-05-05 15:15:34 +02:00
Wolfgang Seeker
3c44b5dc1a call deprojectivization after parsing 2016-05-05 15:10:36 +02:00
Matthew Honnibal
472f576b82 * Deprojectivize German parses 2016-05-05 15:01:10 +02:00
Matthew Honnibal
9bbd6cf031 * Work on Chinese support 2016-05-05 11:39:12 +02:00
Matthew Honnibal
a6a25166ba * Remove print from test 2016-05-05 11:10:59 +02:00
Matthew Honnibal
c4c55d9005 Merge branch 'master' of ssh://github.com/spacy-io/spaCy 2016-05-05 01:33:36 +02:00
Matthew Honnibal
e31df66d26 * Fix Issue #361: Lexemes didn't have rich comparison. 2016-05-05 01:32:26 +02:00
Matthew Honnibal
7441ca30ee * Add tests for Issue #361: Lexeme rich comparison 2016-05-05 01:31:58 +02:00
Matthew Honnibal
02d0fe242c Make latest release note the end of the readme 2016-05-05 00:26:16 +10:00
Matthew Honnibal
4f46c0f398 Fix code format in README.rst 2016-05-05 00:25:19 +10:00
Matthew Honnibal
886bf55bd9 Fix list formatting 2016-05-05 00:18:25 +10:00
Matthew Honnibal
1b8b888a57 Update readme with release notes for v0.100.8 2016-05-05 00:16:13 +10:00
Matthew Honnibal
72564213e3 * Add test for Issue #309 2016-05-04 16:00:28 +02:00
Matthew Honnibal
76f1d871da Merge branch 'master' of ssh://github.com/spacy-io/spaCy 2016-05-04 15:54:00 +02:00
Matthew Honnibal
519366f677 * Fix Issue #351: Indices off when leading whitespace 2016-05-04 15:53:36 +02:00
Matthew Honnibal
b4bfc6ae55 * Add test for Issue #351: Indices off when leading whitespace 2016-05-04 15:53:17 +02:00
Matthew Honnibal
76021cb853 * Fix bug in Doc.text, introduced by a862edc 2016-05-04 11:02:16 +02:00
Matthew Honnibal
1822bb4ff1 Merge pull request #359 from wbwseeker/reorganize_tests
Fix German noun chunker
2016-05-04 18:15:17 +10:00
Wolfgang Seeker
e4ea2bea01 fix whitespace 2016-05-04 07:40:38 +02:00
Wolfgang Seeker
5bf2fd1f78 make the code less cryptic 2016-05-03 17:19:05 +02:00
Wolfgang Seeker
a06fca9fdf German noun chunk iterator now doesn't return tokens more than once 2016-05-03 16:58:59 +02:00
Wolfgang Seeker
fd8019ec92 fix typo in german_noun_chunks 2016-05-03 15:53:30 +02:00
Wolfgang Seeker
7825b75548 add tests for German noun chunker 2016-05-03 15:01:28 +02:00
Matthew Honnibal
24337175df * Register zh package in setup.py 2016-05-03 14:36:59 +02:00
Wolfgang Seeker
7b246c13cb reformulate noun chunk tests for English 2016-05-03 14:24:35 +02:00
Wolfgang Seeker
1786331cd8 add model sanity test 2016-05-03 12:51:47 +02:00
Matthew Honnibal
1f1532142f * Fix cost calculation on non-monotonic oracle 2016-05-03 00:21:08 +02:00
Matthew Honnibal
377a624046 Merge pull request #358 from wbwseeker/german_lemmatizer_dummy
German lemmatizer dummy
2016-05-03 07:38:26 +10:00
Wolfgang Seeker
92bfbebeec remove unnecessary imports 2016-05-02 17:33:22 +02:00
Wolfgang Seeker
857454ffa0 fix indentation -.- 2016-05-02 17:10:41 +02:00
Matthew Honnibal
308a28c26c * Whitespace 2016-05-02 16:08:11 +02:00
Matthew Honnibal
29a114e645 * Don't assign 0-valued tags in Doc.from_array 2016-05-02 16:07:50 +02:00
Matthew Honnibal
c1c11a8ae0 * Fix formatting on serializer tests 2016-05-02 16:07:21 +02:00
Wolfgang Seeker
dae6bc05eb define German dummy lemmatizer until morphology is done 2016-05-02 16:04:53 +02:00
Matthew Honnibal
6e1f1c4b9e Merge pull request #357 from wbwseeker/german_ner
German ner
2016-05-02 23:39:34 +10:00
Wolfgang Seeker
b6b96b233c don't require read_json_file to expect particular annotations 2016-05-02 15:29:30 +02:00
Matthew Honnibal
902a389d85 * Fix merge conflict in test_parse 2016-05-02 15:28:07 +02:00
Matthew Honnibal
276fbe9996 * Fix assignment of iterator on Doc object 2016-05-02 15:26:24 +02:00