Matthew Honnibal
59d655e8d0
Fix model init from jsonl
2018-07-04 01:30:40 +02:00
Matthew Honnibal
1e38bea6e9
Save vectors init
2018-07-03 23:55:04 +02:00
Matthew Honnibal
6692833887
Fix init_model
2018-07-03 23:24:11 +02:00
Matthew Honnibal
4a38a26cb5
Fix init_model
2018-07-03 22:57:11 +02:00
Matthew Honnibal
019d09e3c3
Fix init model
2018-07-03 22:16:44 +02:00
Matthew Honnibal
2543f8c93a
Support .npz vectors in init-model command
2018-07-03 21:42:16 +02:00
Matthew Honnibal
86aad11939
Fix init_model arg
2018-07-03 17:00:42 +02:00
Matthew Honnibal
eff42d36e3
Fix init model command
2018-07-03 16:32:23 +02:00
Matthew Honnibal
97487122ea
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-07-03 15:44:37 +02:00
Matthew Honnibal
6a89faf12e
Add support for jsonl-formatted lexical attributes to init-model command.
2018-07-03 12:22:56 +02:00
Matthew Honnibal
2ec2192000
Revert #1389 : Don't overrule rules when lemma exception is present
2018-06-29 19:43:02 +02:00
Matthew Honnibal
01ace9734d
Make pipeline work on empty docs
2018-06-29 19:21:38 +02:00
Matthew Honnibal
a1b05048d0
Fix tagger when doc is empty
2018-06-29 16:05:40 +02:00
Matthew Honnibal
3786942ff1
Fix tagger when docs are empty
2018-06-29 15:13:45 +02:00
ines
526be40823
Add test for 46d8a66
2018-06-29 14:33:12 +02:00
ines
f08c871adf
Fix typo in Language.from_disk
2018-06-29 14:32:16 +02:00
Matthew Honnibal
46d8a66fef
Fix tokenizer serialization if token_match is None
2018-06-29 14:24:46 +02:00
Matthew Honnibal
e0860bcfb3
Fix bug when docs are empty
2018-06-29 13:56:29 +02:00
Matthew Honnibal
a4d2b0c293
Fix bug when docs are empty
2018-06-29 13:44:25 +02:00
Matthew Honnibal
c83fccfe2a
Fix output of best model
2018-06-25 23:05:56 +02:00
Matthew Honnibal
5a65418c40
Fix handling of unseen labels in tagger
2018-06-25 22:28:59 +02:00
Matthew Honnibal
5b56aad4c2
Fix handling of unseen labels in tagger
2018-06-25 22:24:54 +02:00
Matthew Honnibal
3aabf621a3
Fix handling of unknown tags in tagger update
2018-06-25 22:01:02 +02:00
Matthew Honnibal
69c900f003
Fix init-model if no vectors provided
2018-06-25 18:26:02 +02:00
Matthew Honnibal
664f89327a
Fix init-model if no vectors provided
2018-06-25 17:58:45 +02:00
Matthew Honnibal
c4698f5712
Don't collate model unless training succeeds
2018-06-25 16:36:42 +02:00
Matthew Honnibal
24dfbb8a28
Fix model collation
2018-06-25 14:35:24 +02:00
Matthew Honnibal
62237755a4
Import shutil
2018-06-25 13:40:17 +02:00
Matthew Honnibal
a040fca99e
Import json into cli.train
2018-06-25 11:50:37 +02:00
Matthew Honnibal
2c703d99c2
Fix collation of best models
2018-06-25 01:21:34 +02:00
Matthew Honnibal
9d6a1c57f2
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-06-24 23:40:06 +02:00
Matthew Honnibal
2c80b7c013
Collate best model after training
2018-06-24 23:39:52 +02:00
ines
778e5f4da3
Merge branch 'master' into develop
2018-06-11 00:38:04 +02:00
himkt
57311d5d47
replace janome with mecab in the documentation and the test ( #2415 )
* Add links to Reddit data (see #2401 )
* replace janome with mecab in the documentation and the test
* add the assignment
2018-06-11 00:33:13 +02:00
Nour Shalabi
a169b79092
Additions to Arabic stop words. ( #2422 )
* Additions to Arabic stop words.
* Create nourshalabi.md
2018-06-08 02:33:23 +02:00
ines
a0017e4909
Merge branch 'master' into develop
2018-05-30 14:10:47 +02:00
ines
b8ef9c1000
Fix model names in conftest (see #2379 )
2018-05-30 14:10:20 +02:00
ines
4a62486340
Merge branch 'master' into develop
2018-05-30 13:01:01 +02:00
Maciej
c7d53348d7
Fix bug in CLI iob and ner converter ( #2392 ) ( fixes #2385 )
* issue_2385 add tests for iob_to_biluo converter function
* issue_2385 fix and modify iob_to_biluo function to accept either iob or biluo tags in cli.converter
* issue_2385 add test to fix b char bug
* add contributor agreement
* fill contributor agreement
2018-05-30 12:28:44 +02:00
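The conversion described in commit c7d53348d7 above (accepting either IOB or BILUO tags) can be sketched roughly as follows. This is only an illustration under assumptions, not the code from `cli.converter`: the function name `iob_to_biluo_sketch` is hypothetical, and the input is assumed to be a flat list of tag strings such as `B-PER`, `I-PER` and `O`.

```python
def iob_to_biluo_sketch(tags):
    """Convert a list of IOB tags to BILUO (hypothetical helper, not spaCy's).

    Tags that are already BILUO (L-/U- prefixes) are passed through unchanged.
    """
    out = []
    for i, tag in enumerate(tags):
        if tag == "O":
            out.append(tag)
            continue
        # Split on the first hyphen only, so the label itself is left intact.
        prefix, label = tag.split("-", 1)
        if prefix in ("L", "U"):
            out.append(tag)
            continue
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues if the next tag is inside or ends the same label.
        continues = next_tag in ("I-" + label, "L-" + label)
        if prefix == "B":
            out.append(tag if continues else "U-" + label)
        else:  # prefix == "I"
            out.append(tag if continues else "L-" + label)
    return out
```

For example, `iob_to_biluo_sketch(["B-PER", "I-PER", "O", "B-LOC"])` marks the last token of the two-token entity with `L-` and the single-token entity with `U-`, while input that is already BILUO comes back unchanged.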
ines
3c3a175018
Merge branch 'master' into develop
2018-05-28 18:37:09 +02:00
ansgar-t
9732988951
escape html in displacy.render ( #2378 ) ( closes #2361 )
## Description
Fix for issue #2361 :
replace &, <, >, " with &amp;, &lt;, &gt;, &quot; before rendering svg
## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [ ] I ran the tests, and all new and existing tests passed.
(As discussed in the comments to #2361 )
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2018-05-28 18:36:41 +02:00
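The escaping described in commit 9732988951 above can be sketched as follows. This is a minimal illustration rather than the actual displaCy code; the helper name `escape_html_sketch` is hypothetical.

```python
def escape_html_sketch(text):
    """Replace &, <, > and " with HTML entities (hypothetical helper)."""
    # Escape the ampersand first so the entities added below
    # are not escaped a second time.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    text = text.replace('"', "&quot;")
    return text
```

The order of the replacements matters: handling `&` last would turn an already inserted `&lt;` into `&amp;lt;`. On Python 3, the standard library's `html.escape` covers the same characters (and, by default, the single quote as well).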
ines
f7103babd9
Only overwrite warnings filter if set explicitly ( resolves #2369 )
This way, pre-defined warning filters are respected and users are still able to use the fine-grained warning settings if they like.
2018-05-26 18:44:15 +02:00
ines
330c039106
Merge branch 'master' into develop
2018-05-26 18:30:52 +02:00
James Messinger
4515e96e90
Better formatting for `spacy train` CLI ( #2357 )
* Better formatting for `spacy train` CLI
Changed to use fixed-spaces rather than tabs to align table headers and data.
### Before:
```
Itn. P.Loss N.Loss UAS NER P. NER R. NER F. Tag % Token %
0 4618.857 2910.004 76.172 79.645 67.987 88.732 88.261 100.000 4436.9 6376.4
1 4671.972 3764.812 74.481 78.046 62.374 82.680 88.377 100.000 4672.2 6227.1
2 4742.756 3673.473 71.994 77.380 63.966 84.494 90.620 100.000 4298.0 5983.9
```
### After:
```
Itn.  Dep Loss  NER Loss  UAS     NER P.  NER R.  NER F.  Tag %   Token %  CPU WPS  GPU WPS
0     4618.857  2910.004  76.172  79.645  67.987  88.732  88.261  100.000  4436.9   6376.4
1     4671.972  3764.812  74.481  78.046  62.374  82.680  88.377  100.000  4672.2   6227.1
2     4742.756  3673.473  71.994  77.380  63.966  84.494  90.620  100.000  4298.0   5983.9
```
* Added contributor file
2018-05-25 13:08:45 +02:00
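The fixed-space alignment described in commit 4515e96e90 above can be sketched as follows. This is an illustration under assumptions, not the code in `spacy train`: the names `HEADER` and `format_row` and the uniform column width of 10 are hypothetical, and the values are passed as strings to keep their original precision.

```python
HEADER = ("Itn.", "Dep Loss", "NER Loss", "UAS", "NER P.", "NER R.",
          "NER F.", "Tag %", "Token %", "CPU WPS", "GPU WPS")


def format_row(cells, width=10):
    """Pad every cell with spaces to a fixed width (hypothetical helper)."""
    # ljust pads with spaces, so the columns line up regardless of the
    # reader's tab settings.
    return "".join(str(cell).ljust(width) for cell in cells).rstrip()


row = ("0", "4618.857", "2910.004", "76.172", "79.645", "67.987",
       "88.732", "88.261", "100.000", "4436.9", "6376.4")
print(format_row(HEADER))
print(format_row(row))
```

Because the header and the data rows go through the same function, each value starts at the same character offset as its column title.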
Aristo Rinjuang
432ede04af
adding more words and rephrasing ( #2351 )
* adding more words and rephrasing
* adding a contributor
* tokenizer bugs solved
2018-05-24 11:40:57 +02:00
Jani Monoses
ec62cadf4c
Updates to Romanian support ( #2354 )
* Add back Romanian in conftest
* Romanian lex_attr
* More tokenizer exceptions for Romanian
* Add tests for some Romanian tokenizer exceptions
2018-05-24 11:40:00 +02:00
Matthew Honnibal
5d281cf302
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-05-22 20:50:59 +02:00
Matthew Honnibal
ce458c2428
Fix spacy requirement constraint in package template
2018-05-22 20:50:46 +02:00
Ines Montani
862da5e793
Support pipeline factories via entry points ( #2348 )
2018-05-22 18:29:45 +02:00
Matthew Honnibal
d5af38f80c
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-05-21 17:42:55 +02:00