Commit Graph

11774 Commits

Author SHA1 Message Date
Matthew Honnibal
a1be01185c Fix array out of bounds error in Span 2018-02-28 12:27:09 +01:00
Matthew Honnibal
aff5f007b3 Improve fabfile 2018-02-28 12:09:53 +01:00
Matthew Honnibal
80e9c6bac7 Improve fabfile 2018-02-28 12:07:19 +01:00
Matthew Honnibal
7cf6b1c7a4 Improve fabfile 2018-02-28 12:04:38 +01:00
Matthew Honnibal
64e53f1b1b Improve fabfile 2018-02-28 12:01:52 +01:00
Matthew Honnibal
bf6a52a986 Merge branch 'master' of https://github.com/explosion/spaCy 2018-02-28 11:56:30 +01:00
Matthew Honnibal
7521f85014 Improve fabfile 2018-02-28 11:56:09 +01:00
Matthew Honnibal
31dcb4af2e Merge pull request #2040 from thomasopsomer/fix-lemma: Make token.lemma property return hash instead of unicode 2018-02-28 11:35:42 +01:00
Matthew Honnibal
1b840f1ac1 Try to fix fabfile 2018-02-28 03:30:44 +01:00
Matthew Honnibal
aa96f769d2 Try to fix fabfile 2018-02-28 03:28:22 +01:00
Matthew Honnibal
c5bc0eadc8 Fix fab test 2018-02-28 02:22:57 +01:00
Matthew Honnibal
eaef36e4a5 Fix fab install command 2018-02-28 01:56:14 +01:00
Matthew Honnibal
67cd2d42b0 Fix fab install command 2018-02-28 01:51:21 +01:00
Matthew Honnibal
d322c0ae8b Have fab env create with correct Python 2018-02-28 01:45:19 +01:00
Matthew Honnibal
7ade5160ca Fix test command 2018-02-27 23:48:00 +01:00
Matthew Honnibal
60567ae646 Install pytest if necessary in fabfile 2018-02-27 23:42:22 +01:00
Matthew Honnibal
fd816bbd1b Fix env command in fabfile 2018-02-27 23:29:48 +01:00
Matthew Honnibal
071a2fbd02 Add buildkite script to trigger training 2018-02-27 23:27:27 +01:00
Thomas Opsomer
8df9e52829 lemma property to return hash instead of unicode 2018-02-27 19:50:01 +01:00
Matthew Honnibal
54ebdacc17 Fix missing import 2018-02-27 18:20:39 +01:00
Matthew Honnibal
bd22899fb3 Merge branch 'master' of https://github.com/explosion/spaCy 2018-02-27 18:04:24 +01:00
Matthew Honnibal
74d5d398f8 Improve fabfile, removing fabtools dependency 2018-02-27 18:03:57 +01:00
Ines Montani
35634352fe Merge pull request #2025 from dejanmarich/patch-1: Update stop_words.py for Croatian language 2018-02-26 18:22:32 +01:00
Matthew Honnibal
7441fce7ba Fix undefined variable in conllu script 2018-02-26 14:59:56 +01:00
Matthew Honnibal
14f729c72a Add subtok label to parser 2018-02-26 12:26:35 +01:00
Matthew Honnibal
7137ad8b0b Make label filtering clearer for projectivisation 2018-02-26 12:02:01 +01:00
Matthew Honnibal
b8d52cb285 Fix inconsistent label freq cutoff for projectivisation 2018-02-26 12:01:44 +01:00
Matthew Honnibal
7b66ec896a Revert "Revert "Improve parser oracle around sentence breaks."" This reverts commit 36e481c584. 2018-02-26 10:57:37 +01:00
Matthew Honnibal
36e481c584 Revert "Improve parser oracle around sentence breaks." This reverts commit 50817dc9ad. 2018-02-26 10:53:55 +01:00
Matthew Honnibal
f0478635df Fix Japanese tokenizer flag 2018-02-26 10:32:12 +01:00
Matthew Honnibal
5faae803c6 Add option to not use Janome for Japanese tokenization 2018-02-26 09:39:46 +01:00
Matthew Honnibal
9b406181cd Add Chinese.Defaults.use_jieba setting, for UD 2018-02-25 15:12:38 +01:00
Matthew Honnibal
9ccd0c643b Add Vietnamese 2018-02-25 15:00:46 +01:00
Matthew Honnibal
d4fdb97c87 Fix alignment for words with spaces 2018-02-25 14:55:00 +01:00
Matthew Honnibal
9e960d24fc Refactor conllu script, fix interface, generalize 2018-02-25 14:54:47 +01:00
Matthew Honnibal
551c93fe01 Shuffle data after each epoch. Improve script 2018-02-25 13:35:32 +01:00
Matthew Honnibal
bdb0174571 Update conllu training script 2018-02-25 13:12:39 +01:00
Matthew Honnibal
e09070eca7 Refactor conllu script 2018-02-25 12:50:29 +01:00
Matthew Honnibal
44e496a82e Refactor conllu script 2018-02-25 12:48:22 +01:00
Matthew Honnibal
c388833ca6 Minibatch by number of tokens, support other vectors, refactor CoNLL printing 2018-02-25 10:38:06 +01:00
Matthew Honnibal
dd78ef066a Unset data size limit in conll script 2018-02-24 18:14:57 +01:00
Matthew Honnibal
6d2c1ef52c Fix SP tag in generic tag map 2018-02-24 16:04:56 +01:00
Matthew Honnibal
8adeea3746 Generalize conllu script. Now handling Chinese (maybe badly) 2018-02-24 16:04:27 +01:00
Matthew Honnibal
5cc3bd1c1d Update alignment tests 2018-02-24 16:03:58 +01:00
Matthew Honnibal
6138439469 Fix many-to-one alignment 2018-02-24 16:03:50 +01:00
Matthew Honnibal
4890ee1732 Fix scoring of tokenization for punct 2018-02-24 10:32:32 +01:00
Matthew Honnibal
12b39f87da Move cython declarations in matcher.pyx 2018-02-24 10:32:18 +01:00
Matthew Honnibal
329b14c9e6 Clean up conllu script 2018-02-24 10:31:53 +01:00
Matthew Honnibal
01d1b7abdf Support many-to-one alignment in GoldParse 2018-02-24 10:17:01 +01:00
Matthew Honnibal
7865746574 Support many-to-one alignment 2018-02-24 02:09:53 +01:00