Commit Graph

3694 Commits

Author SHA1 Message Date
Matthew Honnibal
22189e60db Use unicode literals in train_ud 2016-11-25 17:45:45 -06:00
Matthew Honnibal
bc0a202c9c Fix unicode problem in nonproj module 2016-11-25 17:29:17 -06:00
Matthew Honnibal
da5f0cce36 Fix train_ud script, which trains models from the Universal Dependencies format. 2016-11-25 11:19:33 -06:00
Matthew Honnibal
6dd3b94fa6 Filter out deprecated attributes when reading special-case tokenization rules. 2016-11-25 09:57:18 -06:00
Matthew Honnibal
e879c79b8c Merge branch 'master' of https://github.com/explosion/spaCy 2016-11-25 09:18:28 -06:00
Matthew Honnibal
a335c6dcc2 Exclude morphs from deprecated token attributes for now 2016-11-25 16:17:32 +01:00
Matthew Honnibal
f799a07f25 Merge branch 'master' of https://github.com/explosion/spaCy 2016-11-25 09:16:43 -06:00
Matthew Honnibal
159e8c46e1 Merge old training fixes with newer state 2016-11-25 09:16:36 -06:00
Matthew Honnibal
6c1b2c0c2e Merge branch 'master' of ssh://github.com/explosion/spaCy 2016-11-25 16:15:08 +01:00
Matthew Honnibal
846e80f2f4 Exclude morphs from deprecated token attributes for now 2016-11-25 16:14:54 +01:00
Matthew Honnibal
664f2dd1c0 Allow dep to be None in scorer, for missing labels. 2016-11-25 09:02:49 -06:00
Matthew Honnibal
39341598bb Fix NER label calculation 2016-11-25 09:02:22 -06:00
Matthew Honnibal
ca773a1f53 Tweak arc_eager n_gold to deal with negative costs, and improve error message. 2016-11-25 09:01:52 -06:00
Matthew Honnibal
a2f55e7015 Pass cfg through loading, for training. 2016-11-25 09:01:20 -06:00
Matthew Honnibal
608d8f5421 Pass cfg through parser, and have is_valid default to 1, not 0 when resetting state 2016-11-25 09:00:21 -06:00
Matthew Honnibal
cc7e607a8a Fix gold.pyx for 1.0 2016-11-25 08:57:59 -06:00
Matthew Honnibal
314bc8d34f Fix train script for 1.0 2016-11-25 08:57:37 -06:00
root
080d29e092 Fix train.py for 1.0 2016-11-25 08:55:33 -06:00
Ines Montani
ada007cb73 Fix formatting for consistency 2016-11-25 15:53:40 +01:00
Ines Montani
19f27cc6ef Use consistent entity tables across docs 2016-11-25 15:48:50 +01:00
Matthew Honnibal
6652f2a135 Test #656, #624: special case rules for tokenizer with attributes. 2016-11-25 12:44:13 +01:00
Matthew Honnibal
1e0f566d95 Fix #656, #624: Support arbitrary token attributes when adding special-case rules. 2016-11-25 12:43:24 +01:00
Matthew Honnibal
87613edf8f Add set_struct_attr staticmethod to token 2016-11-25 12:41:47 +01:00
Matthew Honnibal
fb69aa648f Merge branch 'master' of ssh://github.com/explosion/spaCy 2016-11-25 11:35:44 +01:00
Matthew Honnibal
9a03a3f85e Add get_struct_attr staticmethod to Token, to match Lexeme.get_struct_attr. 2016-11-25 11:35:17 +01:00
Matthew Honnibal
53d8ca8f51 Add spacy.attrs.intify_attrs function, to normalize strings in token attribute dictionaries. 2016-11-25 11:34:30 +01:00
Ines Montani
e0c7a22f09 Add usage workflow for entity recognizer 2016-11-25 02:30:31 +01:00
Ines Montani
c8e69b98cc Update tutorial tags 2016-11-25 02:30:31 +01:00
Ines Montani
bf65d070ef Add CodePen embed mixin 2016-11-25 02:30:31 +01:00
Ines Montani
3092efdbeb Update CONTRIBUTORS.md 2016-11-24 22:08:36 +01:00
Ines Montani
6f7835bb70 Add tutorial 2016-11-24 19:25:21 +01:00
Ines Montani
427e942e84 Ignore temporary files 2016-11-24 19:21:27 +01:00
Matthew Honnibal
1f247959f3 Merge pull request #658 from pokey/master
Add noun_chunks to Span
2016-11-24 23:33:57 +11:00
Matthew Honnibal
b8c4f5ea76 Allow German noun chunks to work on Span
Update the German noun chunks iterator, so that it also works on Span objects.
2016-11-24 23:30:15 +11:00
Pokey Rule
3e3bda142d Add noun_chunks to Span 2016-11-24 10:47:20 +00:00
Ines Montani
a98da29232 Update CONTRIBUTORS.md 2016-11-24 11:37:08 +01:00
Matthew Honnibal
09f68bc641 Fix Issue #639: stop words in language class not used. This patch is messy, but it's better not to change too much until the language data loading can be properly refactored. 2016-11-24 00:13:55 +01:00
Matthew Honnibal
48e1dc29d4 Fix default path loading. 2016-11-23 23:48:55 +01:00
Matthew Honnibal
e01c1875ee Work on test for #615 2016-11-23 23:48:41 +01:00
Matthew Honnibal
1b77932ba5 Merge pull request #654 from ExplodingCabbage/patch-1
Fix syntax mistake
2016-11-24 09:31:36 +11:00
ExplodingCabbage
6c4f488e89 Fix syntax mistake 2016-11-23 15:12:45 +00:00
Matthew Honnibal
60eb2343ce Only try to load vectors if they exist. 2016-11-23 13:50:24 +01:00
Matthew Honnibal
618ac36093 Fix use of path argument in Language.__init__. Needs to be keyword arg, not positional. 2016-11-23 13:26:34 +01:00
Ines Montani
a7b5fba132 Merge pull request #642 from ExplodingCabbage/specify-data-path
Let --data-path be specified when running download.py scripts
2016-11-23 13:05:03 +01:00
Ines Montani
ede2baba19 Merge pull request #647 from wjt/patch-1
Fix typos in docs
2016-11-21 14:33:50 +01:00
Will Thompson
e896466dcf docs: processing-text: fix missing line wrap 2016-11-21 10:43:16 +00:00
Will Thompson
1adc96f0a6 docs: fix "installaton" typo 2016-11-21 10:37:57 +00:00
Matthew Honnibal
ba2cd3d1e7 Merge pull request #646 from ExplodingCabbage/ignore-more-stuff
Ignore entire data folder
2016-11-21 09:52:18 +11:00
Matthew Honnibal
d0c999e0ad Add config.py for paddle example 2016-11-20 23:24:51 +01:00
Matthew Honnibal
605144398b Merge branch 'master' of https://github.com/explosion/spaCy 2016-11-20 23:23:59 +01:00