Commit Graph

8397 Commits

Author SHA1 Message Date
dejanmarich
71c261d58b Update stop_words.py: Added more words 2018-02-23 10:31:01 +01:00
Matthew Honnibal
dd3ebe4931 Merge pull request #2019 from explosion/feature/better-gold: Make Levenshtein alignment faster, bug fixes to parser, add UD parsing script 2018-02-23 04:41:26 +01:00
Matthew Honnibal
3e6c1111b7 Remove obsolete test 2018-02-23 03:22:07 +01:00
Matthew Honnibal
6b30dbd736 Merge pull request #1999 from explosion/feature/better-faster-matcher: Improved Matcher engine 2018-02-22 21:50:05 +01:00
Matthew Honnibal
331904fa9c Merge branch 'master' of https://github.com/explosion/spaCy into feature/better-faster-matcher 2018-02-22 21:47:10 +01:00
Matthew Honnibal
a4fdec524a Merge branch 'master' of https://github.com/explosion/spaCy into feature/better-gold 2018-02-22 21:44:28 +01:00
Matthew Honnibal
23236340f4 Update CoNLL script. Don't preset SBD. Set batch size to 8, avoid writing twice 2018-02-22 21:35:50 +01:00
Matthew Honnibal
a26e399f84 Update conllu script 2018-02-22 19:43:54 +01:00
ines
9c8a0f6eba Version-lock msgpack-python (see #2015) 2018-02-22 19:42:03 +01:00
Matthew Honnibal
50817dc9ad Improve parser oracle around sentence breaks. 2018-02-22 19:22:26 +01:00
Matthew Honnibal
001e2ec6d6 Refactor CoNLL training script 2018-02-22 16:00:34 +01:00
ines
8c09850354 Version-lock msgpack-python (see #2015) 2018-02-22 13:25:52 +01:00
Feng Niu
1c60384bed return on empty doc 2018-02-21 15:39:04 -08:00
Feng Niu
7eb1cd100b unbound doc var 2018-02-21 15:05:37 -08:00
Feng Niu
8df75b229c fix unbound vars in es.syntax_iterators 2018-02-21 13:11:17 -08:00
Feng Niu
a5981914a6 contributor file 2018-02-21 13:05:57 -08:00
alldefector
4244e285c2 Fix Spanish noun_chunks failure caused by typo 2018-02-21 12:43:21 -08:00
Matthew Honnibal
6a27a4f77c Set accelerating batch size in CONLL train script 2018-02-21 21:02:41 +01:00
Matthew Honnibal
661873ee4c Randomize the rebatch size in parser 2018-02-21 21:02:07 +01:00
Matthew Honnibal
0872cf611d Don't lower-case lemmas of proper nouns 2018-02-21 16:01:16 +01:00
Matthew Honnibal
a0ddb803fd Make error when no label found more helpful 2018-02-21 16:00:59 +01:00
Matthew Honnibal
ea2fc5d45f Improve length and freq cutoffs in parser 2018-02-21 16:00:38 +01:00
Matthew Honnibal
e5757d4bf0 Add labels property to parser 2018-02-21 16:00:00 +01:00
Matthew Honnibal
4dc0fc9954 Replace labels that didn't make freq cutoff 2018-02-21 15:59:22 +01:00
Matthew Honnibal
eff4ae809a Fix nonproj label filter 2018-02-21 15:59:04 +01:00
Matthew Honnibal
97164b1763 Fix conllu script 2018-02-21 14:46:54 +01:00
Matthew Honnibal
24fb2c246f Add script to do conllu training 2018-02-21 13:53:59 +01:00
Matthew Honnibal
e624405cda Temporarily remove cutoff when filtering labels in nonproj 2018-02-21 13:53:40 +01:00
Matthew Honnibal
f466f0186e Use new alignment implementation in GoldParse 2018-02-20 21:16:35 +01:00
Matthew Honnibal
c0734ba526 Make alignment work with strings 2018-02-20 17:51:49 +01:00
Matthew Honnibal
8180c84a98 Add tests for new Levenshtein alignment 2018-02-20 17:32:25 +01:00
Matthew Honnibal
f46bf2a7e9 Build _align.pyx 2018-02-20 17:32:13 +01:00
Matthew Honnibal
930c980570 Add improved Levenshtein alignment implementation 2018-02-20 17:31:56 +01:00
Ines Montani
14e7e0f12a Merge pull request #2000 from jimregan/polish-tag-map: Polish tag map 2018-02-18 19:05:58 +01:00
Matthew Honnibal
667a4141f7 Merge pull request #2002 from jimregan/prepcase: missing PrepCase attribute 2018-02-18 18:29:16 +01:00
Jim O'Regan
664407de5d missing PrepCase attribute 2018-02-18 14:46:12 +00:00
Matthew Honnibal
68727922cc Merge pull request #2001 from jimregan/animacy-morphology: fix typo/missing here too 2018-02-18 15:42:03 +01:00
Jim O'Regan
95f0673fbc fix typo/missing here too 2018-02-18 14:38:27 +00:00
Matthew Honnibal
2bccad8815 Fix incorrect matcher test 2018-02-18 14:56:12 +01:00
Matthew Honnibal
530172d57a Merge branch 'master' of https://github.com/explosion/spaCy into feature/better-faster-matcher 2018-02-18 14:40:42 +01:00
Matthew Honnibal
c9eeceba00 Merge branch 'master' of https://github.com/explosion/spaCy 2018-02-18 14:18:06 +01:00
Matthew Honnibal
cf0e320f2b Add doc.is_sentenced attribute, re #1959 2018-02-18 14:16:55 +01:00
ines
29106ec740 Add "new" tag to is_currency [ci skip] 2018-02-18 14:16:26 +01:00
ines
ca2fcad5a3 Add v2.1 tag to new arguments [ci skip] 2018-02-18 14:15:18 +01:00
ines
64f97adef1 Document new Matcher.pipe keyword args [ci skip] (see 1cf774bdc1) 2018-02-18 14:13:58 +01:00
Matthew Honnibal
1e5aeb4eec Merge pull request #1987 from thomasopsomer/span-sent: Make span.sent work when only manual / custom sbd 2018-02-18 14:05:37 +01:00
Matthew Honnibal
1cf774bdc1 Add output options return_matches and as_tuples to Matcher 2018-02-18 14:00:45 +01:00
Matthew Honnibal
dd9b0945af Fix inconsistencies in the symbols table 2018-02-18 13:51:31 +01:00
Matthew Honnibal
66496ac8e1 Set version to v2.1.0.dev0 2018-02-18 13:48:39 +01:00
Matthew Honnibal
eb3040ce46 Merge pull request #1891 from fucking-signup/master: Fix issue #1889 2018-02-18 13:47:47 +01:00