Matthew Honnibal
|
1f6c37c6f5
|
Fix create_tokenizer when nlp is None
|
2016-11-26 12:36:04 +01:00 |
|
Ines Montani
|
38f5ad4bfb
|
Merge pull request #660 from jsmootiv/patch-2
Minor typos Fix
|
2016-11-26 12:23:48 +01:00 |
|
Jimi Smoot
|
8373115cbd
|
Minor typos
|
2016-11-25 18:22:52 -08:00 |
|
Matthew Honnibal
|
c7889492f9
|
Fix model saving error for Python 3
|
2016-11-25 18:04:30 -06:00 |
|
Matthew Honnibal
|
22189e60db
|
Use unicode literals in train_ud
|
2016-11-25 17:45:45 -06:00 |
|
Matthew Honnibal
|
bc0a202c9c
|
Fix unicode problem in nonproj module
|
2016-11-25 17:29:17 -06:00 |
|
Matthew Honnibal
|
da5f0cce36
|
Fix train_ud script, which trains models from the Universal Dependencies format.
|
2016-11-25 11:19:33 -06:00 |
|
Matthew Honnibal
|
6dd3b94fa6
|
Filter out deprecated attributes when reading special-case tokenization rules.
|
2016-11-25 09:57:18 -06:00 |
|
Matthew Honnibal
|
e879c79b8c
|
Merge branch 'master' of https://github.com/explosion/spaCy
|
2016-11-25 09:18:28 -06:00 |
|
Matthew Honnibal
|
a335c6dcc2
|
Exclude morphs from deprecated token attributes for now
|
2016-11-25 16:17:32 +01:00 |
|
Matthew Honnibal
|
f799a07f25
|
Merge branch 'master' of https://github.com/explosion/spaCy
|
2016-11-25 09:16:43 -06:00 |
|
Matthew Honnibal
|
159e8c46e1
|
Merge old training fixes with newer state
|
2016-11-25 09:16:36 -06:00 |
|
Matthew Honnibal
|
6c1b2c0c2e
|
Merge branch 'master' of ssh://github.com/explosion/spaCy
|
2016-11-25 16:15:08 +01:00 |
|
Matthew Honnibal
|
846e80f2f4
|
Exclude morphs from deprecated token attributes for now
|
2016-11-25 16:14:54 +01:00 |
|
Matthew Honnibal
|
664f2dd1c0
|
Allow dep to be None in scorer, for missing labels.
|
2016-11-25 09:02:49 -06:00 |
|
Matthew Honnibal
|
39341598bb
|
Fix NER label calculation
|
2016-11-25 09:02:22 -06:00 |
|
Matthew Honnibal
|
ca773a1f53
|
Tweak arc_eager n_gold to deal with negative costs, and improve error message.
|
2016-11-25 09:01:52 -06:00 |
|
Matthew Honnibal
|
a2f55e7015
|
Pass cfg through loading, for training.
|
2016-11-25 09:01:20 -06:00 |
|
Matthew Honnibal
|
608d8f5421
|
Pass cfg through parser, and have is_valid default to 1, not 0 when resetting state
|
2016-11-25 09:00:21 -06:00 |
|
Matthew Honnibal
|
cc7e607a8a
|
Fix gold.pyx for 1.0
|
2016-11-25 08:57:59 -06:00 |
|
Matthew Honnibal
|
314bc8d34f
|
Fix train script for 1.0
|
2016-11-25 08:57:37 -06:00 |
|
root
|
080d29e092
|
Fix train.py for 1.0
|
2016-11-25 08:55:33 -06:00 |
|
Ines Montani
|
ada007cb73
|
Fix formatting for consistency
|
2016-11-25 15:53:40 +01:00 |
|
Ines Montani
|
19f27cc6ef
|
Use consistent entity tables across docs
|
2016-11-25 15:48:50 +01:00 |
|
Matthew Honnibal
|
6652f2a135
|
Test #656, #624: special case rules for tokenizer with attributes.
|
2016-11-25 12:44:13 +01:00 |
|
Matthew Honnibal
|
1e0f566d95
|
Fix #656, #624: Support arbitrary token attributes when adding special-case rules.
|
2016-11-25 12:43:24 +01:00 |
|
Matthew Honnibal
|
87613edf8f
|
Add set_struct_attr staticmethod to token
|
2016-11-25 12:41:47 +01:00 |
|
Matthew Honnibal
|
fb69aa648f
|
Merge branch 'master' of ssh://github.com/explosion/spaCy
|
2016-11-25 11:35:44 +01:00 |
|
Matthew Honnibal
|
9a03a3f85e
|
Add get_struct_attr staticmethod to Token, to match Lexeme.get_struct_attr.
|
2016-11-25 11:35:17 +01:00 |
|
Matthew Honnibal
|
53d8ca8f51
|
Add spacy.attrs.intify_attrs function, to normalize strings in token attribute dictionaries.
|
2016-11-25 11:34:30 +01:00 |
|
Ines Montani
|
e0c7a22f09
|
Add usage workflow for entity recognizer
|
2016-11-25 02:30:31 +01:00 |
|
Ines Montani
|
c8e69b98cc
|
Update tutorial tags
|
2016-11-25 02:30:31 +01:00 |
|
Ines Montani
|
bf65d070ef
|
Add CodePen embed mixin
|
2016-11-25 02:30:31 +01:00 |
|
Ines Montani
|
3092efdbeb
|
Update CONTRIBUTORS.md
|
2016-11-24 22:08:36 +01:00 |
|
Ines Montani
|
6f7835bb70
|
Add tutorial
|
2016-11-24 19:25:21 +01:00 |
|
Ines Montani
|
427e942e84
|
Ignore temporary files
|
2016-11-24 19:21:27 +01:00 |
|
Ines Montani
|
d21ad01840
|
Add emoticons
|
2016-11-24 19:13:00 +01:00 |
|
dafnevk
|
d8c7ac203a
|
Added nl module for Dutch
|
2016-11-24 16:39:49 +01:00 |
|
dafnevk
|
3db8b0d322
|
Added language class and some language data (with some TODOs) for Dutch
|
2016-11-24 15:56:38 +01:00 |
|
Ines Montani
|
4dcfafde02
|
Add line breaks
|
2016-11-24 14:57:37 +01:00 |
|
Ines Montani
|
6247c005a2
|
Add test for tokenizer regular expressions
|
2016-11-24 13:51:59 +01:00 |
|
Ines Montani
|
de747e39e7
|
Reformat language data
|
2016-11-24 13:51:32 +01:00 |
|
Matthew Honnibal
|
1f247959f3
|
Merge pull request #658 from pokey/master
Add noun_chunks to Span
|
2016-11-24 23:33:57 +11:00 |
|
Matthew Honnibal
|
b8c4f5ea76
|
Allow German noun chunks to work on Span
Update the German noun chunks iterator, so that it also works on Span objects.
|
2016-11-24 23:30:15 +11:00 |
|
Pokey Rule
|
3e3bda142d
|
Add noun_chunks to Span
|
2016-11-24 10:47:20 +00:00 |
|
Ines Montani
|
a98da29232
|
Update CONTRIBUTORS.md
|
2016-11-24 11:37:08 +01:00 |
|
Janneke van der Zwaan
|
83daade0e4
|
Add directory and initial (empty) files for language Dutch
|
2016-11-24 09:45:41 +01:00 |
|
Matthew Honnibal
|
09f68bc641
|
Fix Issue #639: stop words in language class not used. This patch is messy, but it's better not to change too much until the language data loading can be properly refactored.
|
2016-11-24 00:13:55 +01:00 |
|
Matthew Honnibal
|
48e1dc29d4
|
Fix default path loading.
|
2016-11-23 23:48:55 +01:00 |
|
Matthew Honnibal
|
e01c1875ee
|
Work on test for #615
|
2016-11-23 23:48:41 +01:00 |
|