Matthew Honnibal
a040fca99e
Import json into cli.train
2018-06-25 11:50:37 +02:00
Matthew Honnibal
2c703d99c2
Fix collation of best models
2018-06-25 01:21:34 +02:00
Matthew Honnibal
9d6a1c57f2
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-06-24 23:40:06 +02:00
Matthew Honnibal
2c80b7c013
Collate best model after training
2018-06-24 23:39:52 +02:00
Matthew Honnibal
5435b071b9
Add make clean command
2018-06-24 23:39:34 +02:00
ines
778e5f4da3
Merge branch 'master' into develop
2018-06-11 00:38:04 +02:00
himkt
57311d5d47
replace janome with mecab in the documentation and the test ( #2415 )
...
* Add links to Reddit data (see #2401 )
* replace janome with mecab in the documentation and the test
* add the assignment
2018-06-11 00:33:13 +02:00
ines
effb55d591
Adjust formatting [ci skip]
2018-06-11 00:29:13 +02:00
Nathan Breit
ba6d2cf393
Add EpiTator to Universe ( #2429 )
2018-06-11 00:24:13 +02:00
Daniel Ruf
d6d688914f
chore: cache dependencies ( #2418 )
...
* chore: cache dependencies
* chore: add CLA
2018-06-11 00:22:41 +02:00
himkt
1a568f2e08
fix incorrect documentation ( #2423 )
2018-06-11 00:21:06 +02:00
Bohdan Moskalevskyi
d66292f767
fix UD data file extensions ( #2425 )
...
* fix UD data files extension
* add contributor agreement for msklvsk
2018-06-08 14:26:11 +02:00
Nour Shalabi
a169b79092
Additions to Arabic stop words. ( #2422 )
...
* Additions to Arabic stop words.
* Create nourshalabi.md
2018-06-08 02:33:23 +02:00
Matthew Honnibal
12f09313b1
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-06-02 17:10:34 +02:00
Matthew Honnibal
4f19fe0f3a
Add Makefile
2018-06-02 17:10:15 +02:00
Ines Montani
3f2e3cbd27
Add links to Reddit data (see #2401 )
2018-05-31 16:22:43 +02:00
ines
a0017e4909
Merge branch 'master' into develop
2018-05-30 14:10:47 +02:00
ines
b8ef9c1000
Fix model names in conftest (see #2379 )
2018-05-30 14:10:20 +02:00
ines
0baaf836cf
Update formatting [ci skip]
2018-05-30 13:32:49 +02:00
ines
3913e18201
Add self-attentive-parser to universe (see #59 )
2018-05-30 13:31:28 +02:00
ines
4a62486340
Merge branch 'master' into develop
2018-05-30 13:01:01 +02:00
Maciej
c7d53348d7
Fix bug in CLI iob and ner converter ( #2392 ) ( fixes #2385 )
...
* issue_2385 add tests for iob_to_biluo converter function
* issue_2385 fix and modify iob_to_biluo function to accept either iob or biluo tags in cli.converter
* issue_2385 add test to fix b char bug
* add contributor agreement
* fill contributor agreement
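
The conversion described above, which accepts either IOB or BILUO tags, can be sketched roughly as follows. This is a minimal illustration and not the code from `cli.converter`; the function body and the look-ahead rule are assumptions made for the example.

```python
def iob_to_biluo(tags):
    """Convert a list of IOB tags to BILUO (illustrative sketch only).

    Tags that are already in BILUO form pass through unchanged.
    """
    out = []
    for i, tag in enumerate(tags):
        # "O", "L-..." and "U-..." carry no IOB ambiguity, so keep them as-is.
        if not tag.startswith(("B-", "I-")):
            out.append(tag)
            continue
        label = tag[2:]
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- or L- with the same label.
        continues = next_tag in ("I-" + label, "L-" + label)
        if tag.startswith("B-"):
            # A "B-" tag with no continuation is a single-token entity.
            out.append(tag if continues else "U-" + label)
        else:
            # An "I-" tag with no continuation is the last token of its entity.
            out.append(tag if continues else "L-" + label)
    return out


converted = iob_to_biluo(["B-PER", "I-PER", "O", "B-LOC"])
```

Checking the two-character prefix (`"B-"`, `"I-"`) rather than the first character alone keeps labels that merely begin with the letter B from being mistaken for a begin tag.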
2018-05-30 12:28:44 +02:00
ines
605c663a4c
Fix HTML merger examples (see #2390 )
2018-05-30 12:22:32 +02:00
ines
3c3a175018
Merge branch 'master' into develop
2018-05-28 18:37:09 +02:00
ansgar-t
9732988951
escape html in displacy.render ( #2378 ) ( closes #2361 )
...
## Description
Fix for issue #2361 :
Replace &, <, >, " with &amp; , &lt; , &gt; , &quot; before rendering the SVG.
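
As a rough illustration of the replacement described above, the escaping step might look like the sketch below. The function name `escape_html` is an assumption for this example and is not necessarily what the patch uses.

```python
def escape_html(text):
    """Replace &, <, > and " with their HTML entities (illustrative sketch)."""
    # The ampersand has to be handled first. Otherwise the "&" in the
    # entities produced by the later replacements would be escaped again.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    text = text.replace('"', "&quot;")
    return text


escaped = escape_html('a < b & "c" > d')
```

Python's standard library offers `html.escape`, which covers the same characters and additionally escapes single quotes by default.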
## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [ ] I ran the tests, and all new and existing tests passed.
(As discussed in the comments to #2361 )
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2018-05-28 18:36:41 +02:00
ines
d0b16aa014
Update list of languages
2018-05-26 18:56:26 +02:00
ines
f7103babd9
Only overwrite warnings filter if set explicitly ( resolves #2369 )
...
This way, pre-defined warning filters are respected and users are still able to use the fine-grained warning settings if they like.
2018-05-26 18:44:15 +02:00
ines
330c039106
Merge branch 'master' into develop
2018-05-26 18:30:52 +02:00
Samuel Pouyt
d85494bfae
Added agreement ( #2374 )
2018-05-26 18:19:08 +02:00
Samuel Pouyt
5f988b8e9c
Update _custom.jade ( #2372 )
...
Based on the docs and on trying it out, the `en` or `[lang]` argument appears to be missing from the `spacy model-init` command.
2018-05-26 18:17:12 +02:00
ines
d84a830d79
Merge branch 'master' of https://github.com/explosion/spaCy
2018-05-26 17:57:05 +02:00
ines
fb923b31ea
Fix bad HTML example (see #2376 ) and turn it into section on matcher + components
...
Avoid problems caused by merging while matching (e.g. index errors). Creating a Matcher component also better reflects the recommended best practices.
2018-05-26 17:57:02 +02:00
James Messinger
4515e96e90
Better formatting for spacy train CLI ( #2357 )
...
* Better formatting for `spacy train` CLI
Changed to use fixed-spaces rather than tabs to align table headers and data.
### Before:
```
Itn. P.Loss N.Loss UAS NER P. NER R. NER F. Tag % Token %
0 4618.857 2910.004 76.172 79.645 67.987 88.732 88.261 100.000 4436.9 6376.4
1 4671.972 3764.812 74.481 78.046 62.374 82.680 88.377 100.000 4672.2 6227.1
2 4742.756 3673.473 71.994 77.380 63.966 84.494 90.620 100.000 4298.0 5983.9
```
### After:
```
Itn.  Dep Loss  NER Loss  UAS     NER P.  NER R.  NER F.  Tag %   Token %  CPU WPS  GPU WPS
0     4618.857  2910.004  76.172  79.645  67.987  88.732  88.261  100.000  4436.9   6376.4
1     4671.972  3764.812  74.481  78.046  62.374  82.680  88.377  100.000  4672.2   6227.1
2     4742.756  3673.473  71.994  77.380  63.966  84.494  90.620  100.000  4298.0   5983.9
```
* Added contributor file
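
The idea of aligning the table with space padding instead of tabs can be sketched as below. This is a simplified illustration, not the code from the `spacy train` CLI; the helper `format_row` and the column widths are assumptions.

```python
def format_row(values, widths):
    """Left-align each value in a fixed-width column padded with spaces."""
    cells = (str(value).ljust(width) for value, width in zip(values, widths))
    # Strip the padding after the last column so lines carry no trailing spaces.
    return "".join(cells).rstrip()


# Hypothetical widths: each is at least two characters wider than its longest cell.
widths = [6, 10, 10, 8]
header = ["Itn.", "Dep Loss", "NER Loss", "UAS"]
rows = [
    [0, "4618.857", "2910.004", "76.172"],
    [1, "4671.972", "3764.812", "74.481"],
]

lines = [format_row(header, widths)] + [format_row(row, widths) for row in rows]
print("\n".join(lines))
```

Because every cell is padded to a known width, the columns line up regardless of the terminal's tab-stop settings.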
2018-05-25 13:08:45 +02:00
Shantam Raj
592834183a
corrected spelling ( #2359 )
...
changed **interpretted** to **interpreted**
2018-05-24 13:29:52 +02:00
ines
8adb967e0c
Fix from source quickstart instructions for Windows
...
See: https://stackoverflow.com/a/50478036/6400719
2018-05-24 12:42:16 +02:00
Aristo Rinjuang
432ede04af
adding more words and rephrasing ( #2351 )
...
* adding more words and rephrasing
* adding a contributor
* tokenizer bugs solved
2018-05-24 11:40:57 +02:00
Jani Monoses
ec62cadf4c
Updates to Romanian support ( #2354 )
...
* Add back Romanian in conftest
* Romanian lex_attr
* More tokenizer exceptions for Romanian
* Add tests for some Romanian tokenizer exceptions
2018-05-24 11:40:00 +02:00
Matthew Honnibal
5d281cf302
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-05-22 20:50:59 +02:00
Matthew Honnibal
ce458c2428
Fix spacy requirement constraint in package template
2018-05-22 20:50:46 +02:00
Ines Montani
862da5e793
Support pipeline factories via entry points ( #2348 )
2018-05-22 18:29:45 +02:00
Matthew Honnibal
94ad2d66b6
Require thinc 6.11.2
2018-05-21 19:26:28 +02:00
Matthew Honnibal
d5af38f80c
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-05-21 17:42:55 +02:00
Matthew Honnibal
ee33de8652
Fix unpickling of NER parser
2018-05-21 17:42:40 +02:00
Shantam Raj
1a4682dd0b
Update _training.jade ( #2340 )
...
* Update _training.jade
Correcting grammar. Replacing "The" with "To".
* Create armsp.md
* Update armsp.md
2018-05-21 11:09:33 +02:00
ines
f9dbcac8e4
Merge branch 'master' into develop
2018-05-21 02:29:29 +02:00
cclauss
f7dcaa1f6b
Simplify is_config() and normalize_string_keys() ( #2305 )
...
* Simplify is_config() and normalize_string_keys()
* Use `in` to avoid the nested `and`s and `or`s.
* Dict comprehension directly tracks with the doc string
* Keep more basic loop in normalize_string_keys
* Whitespace
2018-05-21 01:54:35 +02:00
Ines Montani
cae4457c38
💫 Add .similarity warnings for no vectors and option to exclude warnings ( #2197 )
...
* Add logic to filter out warning IDs via environment variable
Usage: SPACY_WARNING_EXCLUDE=W001,W007
* Add warnings for empty vectors
* Add warning if no word vectors are used in .similarity methods
For example, if only tensors are available in small models – should hopefully clear up some confusion around this
* Capture warnings in tests
* Rename SPACY_WARNING_EXCLUDE to SPACY_WARNING_IGNORE
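
A minimal sketch of filtering warning IDs via the environment variable described above might look like this. Only the variable name `SPACY_WARNING_IGNORE` and the comma-separated ID format come from the commit message; the helper names and the message format are assumptions, and this is not spaCy's actual implementation.

```python
import os
import warnings


def get_ignored_ids(env=os.environ):
    """Parse a comma-separated list of warning IDs from SPACY_WARNING_IGNORE."""
    raw = env.get("SPACY_WARNING_IGNORE", "")
    return {item.strip() for item in raw.split(",") if item.strip()}


def warn_unless_ignored(warning_id, message, env=os.environ):
    """Emit a UserWarning unless its ID is listed in SPACY_WARNING_IGNORE.

    Returns True if the warning was emitted and False if it was filtered out.
    """
    if warning_id in get_ignored_ids(env):
        return False
    warnings.warn("[{}] {}".format(warning_id, message), UserWarning)
    return True
```

Under these assumptions, running a script with `SPACY_WARNING_IGNORE=W001,W007` set in the environment would suppress both IDs while leaving all other warnings visible.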
2018-05-21 01:22:38 +02:00
ines
ff1082d8e4
Add version tag in CLI docs [ci skip]
2018-05-21 01:17:49 +02:00
Matthew Honnibal
b096b22c20
Merge pull request #2247 from skrcode/1480
...
1480 - Implement Fast-Text vectors with subword features
2018-05-21 01:16:21 +02:00
Matthew Honnibal
f3b4f6a4ec
Merge setup.py
2018-05-20 23:21:00 +02:00