Alex
|
95836abee1
|
Update CONTRIBUTORS.md
|
2017-10-13 21:02:19 +07:00 |
|
Alex
|
ce00405afc
|
Create yuukos.md
|
2017-10-13 21:00:15 +07:00 |
|
yuukos
|
6fb9d75bd2
|
fixed test with creating tokenizer
|
2017-10-13 15:51:03 +07:00 |
|
yuukos
|
a229b6e0de
|
added tests for Russian language
added tests for creating a Russian Language instance and a Russian tokenizer
|
2017-10-13 14:04:37 +07:00 |
|
yuukos
|
622b6d6270
|
updated Russian tokenizer
moved the attempt to import pymorph into __init__
|
2017-10-13 13:57:29 +07:00 |
|
ines
|
bfd9506f1d
|
Update extensions docs and add resources
|
2017-10-13 00:18:13 +02:00 |
|
ines
|
5f5d6897e8
|
Increment version
|
2017-10-13 00:18:02 +02:00 |
|
ines
|
9fd68334ab
|
Add validate command docs
|
2017-10-12 23:36:48 +02:00 |
|
Matthew Honnibal
|
cf6da9301a
|
Update lemmatizer test
|
2017-10-12 22:50:52 +02:00 |
|
Matthew Honnibal
|
9b90d235d1
|
Fix tag check in lemmatizer
|
2017-10-12 22:50:43 +02:00 |
|
Matthew Honnibal
|
dc01acd821
|
Escape encoding in validate function
|
2017-10-12 22:23:21 +02:00 |
|
Matthew Honnibal
|
27b927259a
|
Add locale_escape compat function
|
2017-10-12 22:22:04 +02:00 |
|
Matthew Honnibal
|
e72603f39f
|
Merge pull request #1416 from explosion/feature/cli-validate
💫 Add "validate" command to CLI
|
2017-10-12 21:45:20 +02:00 |
|
Matthew Honnibal
|
cb0e727c54
|
Merge pull request #1415 from IamJeffG/fix-alpha-example-train-ner-standalone
Bugfix example script train_ner_standalone.py, fails after training
|
2017-10-12 21:44:28 +02:00 |
|
ines
|
9c6de3dcfa
|
Merge branch 'develop' into feature/cli-validate
|
2017-10-12 21:44:28 +02:00 |
|
Jeffrey Gerard
|
5ba970b495
|
minor cleanup
|
2017-10-12 12:34:46 -07:00 |
|
Matthew Honnibal
|
462caf835a
|
Fix SBD test
|
2017-10-12 21:18:22 +02:00 |
|
Jeffrey Gerard
|
39d3cbfdba
|
Bugfix example script train_ner_standalone.py, fails after training
|
2017-10-12 11:39:12 -07:00 |
|
ines
|
fff1028391
|
Add validate CLI command
|
2017-10-12 20:05:06 +02:00 |
|
yuukos
|
f81dd284eb
|
updated spacy/__init__.py
registered Russian language via set_lang_class
|
2017-10-12 22:28:34 +07:00 |
|
yuukos
|
7b9491679f
|
added Russian language support
|
2017-10-12 22:24:20 +07:00 |
|
yuukos
|
2a78f4d634
|
updated .gitignore file
added an exclusion for PyCharm's idea directory
|
2017-10-12 22:23:19 +07:00 |
|
Matthew Honnibal
|
908f44c3fe
|
Disable history features by default
|
2017-10-12 14:56:11 +02:00 |
|
Matthew Honnibal
|
a955843684
|
Increase default number of epochs
|
2017-10-12 13:13:01 +02:00 |
|
Matthew Honnibal
|
cecfcc7711
|
Set default hyper params back to 'slow' settings
|
2017-10-12 13:12:26 +02:00 |
|
Ines Montani
|
37aa523a8e
|
Merge pull request #1408 from explosion/feature/dot-underscore
💫 Custom attributes via Doc._, Token._ and Span._
|
2017-10-11 18:35:56 +02:00 |
|
Matthew Honnibal
|
40dbc85ffa
|
Merge pull request #1413 from explosion/feature/lemmatizer
💫 Integrate lookup lemmatization (9+ languages)
|
2017-10-11 17:54:36 +02:00 |
|
ines
|
8ce6f96180
|
Don't make copies of language data components
|
2017-10-11 15:34:55 +02:00 |
|
Ines Montani
|
a06b84e7cc
|
Merge pull request #1407 from hscspring/patch-6
Update training.jade
|
2017-10-11 14:25:38 +02:00 |
|
ines
|
eac9e99086
|
Update docs on adding lemmatization to languages
|
2017-10-11 14:21:15 +02:00 |
|
ines
|
51519251c2
|
Fix underscore method test
|
2017-10-11 13:34:19 +02:00 |
|
ines
|
c6ae49e8bf
|
Fix formatting
|
2017-10-11 13:34:11 +02:00 |
|
ines
|
453c47ca24
|
Add German lemmatizer tests
|
2017-10-11 13:27:26 +02:00 |
|
ines
|
15fe0fd82d
|
Fix tests
|
2017-10-11 13:27:18 +02:00 |
|
ines
|
6dd14dc342
|
Add lookup lemmas to tokens without POS tags
|
2017-10-11 13:27:10 +02:00 |
|
ines
|
9620c1a640
|
Add lemma_lookup to Language defaults
|
2017-10-11 13:26:05 +02:00 |
|
ines
|
9fd471372a
|
Add lookup lemmatizer to lemmatizer as lookup() method
|
2017-10-11 13:25:51 +02:00 |
|
ines
|
e0ff145a8b
|
Merge branch 'develop' into feature/dot-underscore
|
2017-10-11 11:57:05 +02:00 |
|
ines
|
c1d6d43c83
|
Merge branch 'develop' into feature/lemmatizer
|
2017-10-11 11:56:35 +02:00 |
|
Ines Montani
|
ffc2fef13c
|
Merge pull request #1411 from raphael0202/issue_1078
Resolve issue #1078 by simplifying URL pattern
|
2017-10-11 11:54:57 +02:00 |
|
Raphaël Bournhonesque
|
3452d6ce52
|
Resolve issue #1078 by simplifying URL pattern
- avoid catastrophic backtracking
- reduce character range of host name, domain name and TLD identifier
|
2017-10-11 11:24:00 +02:00 |
|
Matthew Honnibal
|
17c467e0ab
|
Avoid clobbering existing lemmas
|
2017-10-11 03:33:06 -05:00 |
|
Matthew Honnibal
|
807e109f2b
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-10-11 02:47:59 -05:00 |
|
Matthew Honnibal
|
6e552c9d83
|
Prune number of non-projective labels more aggressively
|
2017-10-11 02:46:44 -05:00 |
|
Matthew Honnibal
|
76fe24f44d
|
Improve embedding defaults
|
2017-10-11 09:44:17 +02:00 |
|
Matthew Honnibal
|
188f620046
|
Improve parser defaults
|
2017-10-11 09:43:48 +02:00 |
|
Matthew Honnibal
|
acba2e1051
|
Fix metadata in training
|
2017-10-11 08:55:52 +02:00 |
|
Matthew Honnibal
|
74c2c6a58c
|
Add default name and lang to meta
|
2017-10-11 08:49:12 +02:00 |
|
Matthew Honnibal
|
3814a161e6
|
Avoid clobbering preset lemmas
|
2017-10-11 08:41:03 +02:00 |
|
Matthew Honnibal
|
fd47f8e89f
|
Fix failing test
|
2017-10-11 08:38:34 +02:00 |
|