Ines Montani
aa876884f0
Revert "Revert "Merge remote-tracking branch 'origin/master'""
This reverts commit fb9d3bb022.
2017-01-09 13:28:13 +01:00
Matthew Honnibal
3679fb43a3
Fix loading of lemmatizer
2016-12-18 17:34:09 +01:00
Ines Montani
b11d8cd3db
Merge remote-tracking branch 'origin/organize-language-data' into organize-language-data
2016-12-18 16:57:12 +01:00
Ines Montani
753068f1d5
Use base language data as default
2016-12-18 16:55:25 +01:00
Ines Montani
bcc1d50d09
Remove trailing whitespace
2016-12-18 16:54:52 +01:00
Matthew Honnibal
44f4f008bd
Wire up lemmatizer rules for English
2016-12-18 15:50:09 +01:00
Matthew Honnibal
296d33a4fc
Merge branch 'master' of ssh://github.com/explosion/spaCy
2016-11-26 12:36:18 +01:00
Matthew Honnibal
1f6c37c6f5
Fix create_tokenizer when nlp is None
2016-11-26 12:36:04 +01:00
Matthew Honnibal
c7889492f9
Fix model saving error for Python 3
2016-11-25 18:04:30 -06:00
Matthew Honnibal
159e8c46e1
Merge old training fixes with newer state
2016-11-25 09:16:36 -06:00
Matthew Honnibal
a2f55e7015
Pass cfg through loading, for training.
2016-11-25 09:01:20 -06:00
Matthew Honnibal
09f68bc641
Fix Issue #639 : stop words in language class not used. This patch is messy, but it's better not to change too much until the language data loading can be properly refactored.
2016-11-24 00:13:55 +01:00
Matthew Honnibal
48e1dc29d4
Fix default path loading.
2016-11-23 23:48:55 +01:00
ExplodingCabbage
6c4f488e89
Fix syntax mistake
2016-11-23 15:12:45 +00:00
Matthew Honnibal
60eb2343ce
Only try to load vectors if they exist.
2016-11-23 13:50:24 +01:00
Matthew Honnibal
618ac36093
Fix use of path argument in Language.__init__. Needs to be keyword arg, not positional.
2016-11-23 13:26:34 +01:00
Mark Amery
fbe19680a6
Fix another bug related to Language.__init__'s path parameter
2016-11-20 20:31:34 +00:00
Mark Amery
b0a07c21a0
Fix `path` param of Language.__init__ always being ignored
There was an explicitly-declared `path` keyword argument, so 'path'
would never be present in `**overrides`. This line just overwrote
any manually-specified value the user might've passed to the `path`
parameter.
2016-11-20 16:29:57 +00:00
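The behaviour described in b0a07c21a0 follows from how Python binds arguments: a name declared as a keyword parameter is never also collected into `**overrides`. Below is a minimal sketch of that mechanism. The functions `init_buggy` and `init_fixed` are hypothetical stand-ins, and the overwriting line is an assumption about what such a line could look like, not the code from the commit.

```python
# Hypothetical stand-ins for a constructor with a declared `path` keyword
# argument plus **overrides; this is not spaCy's actual code.

def init_buggy(path=True, **overrides):
    # `path` is a declared parameter, so Python binds the caller's value to it
    # and never places a 'path' key in **overrides.
    # Assumed form of the offending line: the lookup always falls back to the
    # default and overwrites whatever the caller passed.
    path = overrides.get("path", True)
    return path


def init_fixed(path=True, **overrides):
    # Use the bound parameter directly instead of looking it up in overrides.
    return path


def collected_overrides(path=True, **overrides):
    # Helper that exposes what actually ends up in **overrides.
    return overrides


print(init_buggy(path="/tmp/model"))           # True
print(init_fixed(path="/tmp/model"))           # /tmp/model
print(collected_overrides(path="/tmp/model"))  # {}
```

In this sketch the caller's value is lost only because of the overwriting line; removing it is enough for the declared parameter to take effect.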
Matthew Honnibal
22647c2423
Check that patterns aren't null before compiling regex for tokenizer
2016-11-02 20:35:29 +01:00
Matthew Honnibal
f7fee6c24b
Check for class-defined make_docs method before assigning one provided as an argument
2016-11-02 19:57:13 +01:00
Matthew Honnibal
b86f8af0c1
Fix doc strings
2016-11-01 12:25:36 +01:00
Matthew Honnibal
cb49189477
Remove dead code
2016-10-26 13:11:07 +02:00
Matthew Honnibal
150e02d72e
Fix Issue #566
2016-10-23 20:19:01 +02:00
Matthew Honnibal
739213a8af
Fix create_pipeline keyword argument.
2016-10-23 14:24:16 +02:00
Matthew Honnibal
5ec32f5d97
Fix loading of GloVe vectors, to address Issue #541
2016-10-20 18:27:48 +02:00
Matthew Honnibal
d4aaf2752c
Fix issue #535 : Pipeline elements added even when data not installed.
2016-10-19 19:55:19 +02:00
Matthew Honnibal
1b651db9c5
Fix parser creation in Language class.
2016-10-18 19:36:44 +02:00
Matthew Honnibal
45a6f9b9c7
Fix loading of tagger.
2016-10-18 19:33:04 +02:00
Matthew Honnibal
7d5212f131
Refactor defaults
2016-10-18 16:18:25 +02:00
Matthew Honnibal
f787cd29fe
Refactor the pipeline classes to make them more consistent, and remove the redundant blank() constructor.
2016-10-16 21:34:57 +02:00
Matthew Honnibal
ca51f3b77e
Use DependencyParser and EntityRecognizer in the Language class.
2016-10-16 17:58:12 +02:00
Matthew Honnibal
a81c5a7abf
Fix name of labels keyword to 'actions'.
2016-10-16 12:00:27 +02:00
Matthew Honnibal
8a6b35d266
Delay binding in MakeDoc
2016-10-16 11:41:55 +02:00
Matthew Honnibal
08e9134760
Change default value of path to True
2016-10-15 14:12:54 +02:00
Matthew Honnibal
6d8cb515ac
Break the tokenization stage out of the pipeline into a function 'make_doc'. This allows all pipeline methods to have the same signature.
2016-10-14 17:38:29 +02:00
Matthew Honnibal
41f88ce938
Fix dep model loading in parser
2016-10-12 20:26:38 +02:00
Matthew Honnibal
0e2bedc373
Fix default labels for parser and NER
2016-10-12 19:12:40 +02:00
Matthew Honnibal
847a4a4182
Refactor Language, dropping Language.blank() method.
2016-10-12 13:45:58 +02:00
Matthew Honnibal
ea23b64cc8
Refactor training, with new spacy.train module. Defaults still a little awkward.
2016-10-09 12:24:24 +02:00
Matthew Honnibal
eceeaefe53
Fix defaults for Parser and Entity, adding a blank= argument.
2016-09-30 19:56:06 +02:00
Matthew Honnibal
e382e48d9f
Temporarily patch handling of default templates for tagger. Need to move these to language_data.
2016-09-27 13:21:28 +02:00
Matthew Honnibal
b14b9b096b
Return None if /deps directory not present, instead of trying to load the parser.
2016-09-26 18:48:03 +02:00
Matthew Honnibal
0b2d7ae9d6
Fix Entity creation
2016-09-26 15:41:22 +02:00
Matthew Honnibal
2debc4e0a2
Add .blank() method to Parser. Start housing default dep labels and entity types within the Defaults class.
2016-09-26 11:57:54 +02:00
Matthew Honnibal
722199acb8
Add spacy.blank() method, that doesn't load data. Don't try to load data if path is falsey
2016-09-26 11:07:46 +02:00
Matthew Honnibal
7db956133e
Move tokenizer data for German into spacy.de.language_data
2016-09-25 15:37:33 +02:00
Matthew Honnibal
95aaea0d3f
Refactor so that the tokenizer data is read from Python data, rather than from disk
2016-09-25 14:49:53 +02:00
Matthew Honnibal
fd58f7655a
Python 3 compatible basestring
2016-09-24 22:16:43 +02:00
Matthew Honnibal
fd65cf6cbb
Finish refactoring data loading
2016-09-24 20:26:17 +02:00
Matthew Honnibal
83e364188c
Mostly finished loading refactoring. Design is in place, but doesn't work yet.
2016-09-24 15:42:01 +02:00