Sofie Van Landeghem
82347110f5
Default empty KB in EL component ( #5872 )
...
* EL field documentation
* documentation consistent with docs
* default empty KB, initialize vocab separately
* formatting
* add test for changing the default entity vector length
* update comment
2020-08-04 14:34:09 +02:00
Adriane Boyd
b7e3018d97
Recalculate alignment if tokenization differs ( #5868 )
...
* Recalculate alignment if tokenization differs
* Refactor cached alignment data
2020-08-04 14:31:32 +02:00
Adriane Boyd
c62fd878a3
Allow Doc.char_span to snap to token boundaries ( #5849 )
...
* Allow Doc.char_span to snap to token boundaries
Add a `mode` option to allow `Doc.char_span` to snap to token
boundaries. The `mode` options:
* `strict`: character offsets must match token boundaries (default, same as
before)
* `inside`: all tokens completely within the character span
* `outside`: all tokens at least partially covered by the character span
Add a new helper function `token_by_char` that returns the token
corresponding to a character position in the text. Update
`token_by_start` and `token_by_end` to use `token_by_char` for more
efficient searching.
* Remove unused import
* Rename mode to alignment_mode
Rename `mode` to `alignment_mode` with the options
`strict`/`contract`/`expand`. Any unrecognized modes are silently
converted to `strict`.
2020-08-04 13:36:32 +02:00
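
The `alignment_mode` behaviour described in #5849 can be illustrated with a small pure-Python sketch. This is not spaCy's implementation: the function names `token_by_char` and `snap_char_span` and the `(start_char, end_char)` tuple representation are assumptions chosen for illustration, and how the real helper treats whitespace between tokens is not stated in the commit. The sketch only mirrors the three documented modes and the fallback of unrecognized modes to `strict`.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

Token = Tuple[int, int]  # hypothetical (start_char, end_char) pair per token


def token_by_char(tokens: List[Token], char_idx: int) -> int:
    """Return the index of the token covering char_idx, or -1 if none does.

    Uses binary search over the sorted token start offsets.
    """
    starts = [s for s, _ in tokens]
    i = bisect_right(starts, char_idx) - 1
    if i >= 0 and char_idx < tokens[i][1]:
        return i
    return -1


def snap_char_span(
    tokens: List[Token], start: int, end: int, alignment_mode: str = "strict"
) -> Optional[Tuple[int, int]]:
    """Map a character span to a (first_token, end_token_exclusive) pair.

    strict:   offsets must fall exactly on token boundaries
    contract: keep only tokens completely inside the character span
    expand:   keep every token at least partially covered by the span
    """
    # Unrecognized modes fall back to "strict", as the commit message describes.
    if alignment_mode not in ("strict", "contract", "expand"):
        alignment_mode = "strict"
    if alignment_mode == "expand":
        hits = [i for i, (s, e) in enumerate(tokens) if s < end and e > start]
    else:
        hits = [i for i, (s, e) in enumerate(tokens) if s >= start and e <= end]
    if not hits:
        return None
    first, last = hits[0], hits[-1]
    if alignment_mode == "strict":
        # Both ends must coincide with token boundaries.
        if tokens[first][0] != start or tokens[last][1] != end:
            return None
    return first, last + 1


# "I like New York" tokenized on whitespace
tokens = [(0, 1), (2, 6), (7, 10), (11, 15)]
strict_result = snap_char_span(tokens, 7, 13, "strict")      # "New Yo": no match
contract_result = snap_char_span(tokens, 7, 13, "contract")  # only "New"
expand_result = snap_char_span(tokens, 7, 13, "expand")      # "New York"
```

With the character span `7..13` ("New Yo"), `strict` rejects the span, `contract` shrinks it to the token "New", and `expand` grows it to cover "New York".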
Adriane Boyd
b841248589
Add Span index boundary checks ( #5861 )
...
* Add Span index boundary checks
* Return Span-specific IndexError in all cases
* Simplify and fix if/else
2020-08-04 13:35:25 +02:00
svlandeg
17797764fa
update EL docs
2020-08-03 22:51:35 +02:00
svlandeg
d862975f31
remove todo comment
2020-08-03 22:07:24 +02:00
svlandeg
c66481a699
remove old alignment information in top-level
2020-08-03 22:06:35 +02:00
svlandeg
01f9c1d06e
add Alignment section to Example
2020-08-03 19:38:39 +02:00
svlandeg
f846245936
update gold.align explanation in linguistic features
2020-08-03 18:15:36 +02:00
svlandeg
35946783c4
add some links and titles
2020-08-03 14:55:06 +02:00
Adriane Boyd
cd59979ab4
Fix span boundary handling in Spanish noun_chunks ( #5860 )
2020-08-03 13:53:15 +02:00
Ines Montani
934447a611
Merge pull request #5855 from svlandeg/fix/cli-debug
2020-08-03 13:09:20 +02:00
svlandeg
2ab8c3f780
adding example dictionary formats to data-formats.md
2020-08-02 19:11:01 +02:00
Ines Montani
4c055f0aa7
Add init CLI and init config ( #5854 )
...
* Add init CLI and init config draft
* Improve config validation
* Auto-format
* Don't export anything in debug config
* Update docs
2020-08-02 15:18:30 +02:00
svlandeg
6f4e46ee93
Merge remote-tracking branch 'upstream/develop' into fix/cli-debug
...
# Conflicts:
# pyproject.toml
# requirements.txt
# setup.cfg
2020-08-01 18:38:59 +02:00
Ines Montani
e393ebd78b
Merge pull request #5851 from explosion/feature/better-pipe-analysis
2020-08-01 14:20:27 +02:00
Ines Montani
b40f44419b
Simplify pipe analysis
...
- remove unused code
- don't print by default
- integrate attrs info into analysis output
2020-08-01 13:40:06 +02:00
Ines Montani
93144bde97
Update code block style [ci skip]
2020-07-31 18:55:55 +02:00
Ines Montani
98c6a85c8b
Update docs [ci skip]
2020-07-31 18:55:38 +02:00
Ines Montani
b68c53858c
Remove global
2020-07-31 18:37:58 +02:00
Ines Montani
30a76fcf6f
Integrate and simplify pipe analysis
2020-07-31 18:34:35 +02:00
svlandeg
c376c2e122
add docs & examples for debug_model
2020-07-31 18:19:17 +02:00
svlandeg
9b719dfb1a
use divider in between steps
2020-07-31 18:06:48 +02:00
svlandeg
51ffc4a166
rename pipe_name to component
2020-07-31 17:58:55 +02:00
svlandeg
878327d38e
printing final predictions by default to False
2020-07-31 17:36:32 +02:00
Ines Montani
2d955fbf98
Fix linting [ci skip]
2020-07-31 17:05:28 +02:00
Ines Montani
e9e8fa2466
Update docs and types
2020-07-31 17:02:54 +02:00
Ines Montani
dab31426e1
Pin to latest Thinc
2020-07-31 17:00:14 +02:00
svlandeg
cc2f58a1b0
use data_validation context manager
2020-07-31 16:49:42 +02:00
Adriane Boyd
ac14ce7c30
Prefer earlier spans in EntityRuler ( #5843 )
...
Similar to #4414 , update the sorting in EntityRuler to prefer the first
span in overlapping spans.
2020-07-31 16:09:32 +02:00
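
The overlap handling in #5843 can be sketched as a sort followed by a greedy filter. The commit message only says that the first span is preferred among overlapping spans. The rule that longer spans are considered before shorter ones, the helper name `filter_overlaps`, and the `(label, start, end)` tuple layout are assumptions made for this illustration, not a description of the actual `EntityRuler` code.

```python
from typing import List, Tuple

Match = Tuple[str, int, int]  # hypothetical (label, start_token, end_token)


def filter_overlaps(matches: List[Match]) -> List[Match]:
    """Keep non-overlapping matches, preferring longer and then earlier spans."""
    # Sort by length (descending); among equal lengths, the earlier start wins.
    ordered = sorted(matches, key=lambda m: (m[2] - m[1], -m[1]), reverse=True)
    seen_tokens = set()
    kept = []
    for label, start, end in ordered:
        # Skip any match that shares a token with an already accepted match.
        if all(i not in seen_tokens for i in range(start, end)):
            kept.append((label, start, end))
            seen_tokens.update(range(start, end))
    # Return the surviving matches in document order.
    return sorted(kept, key=lambda m: m[1])


matches = [("A", 1, 3), ("B", 0, 2), ("C", 4, 5)]
result = filter_overlaps(matches)
```

Here "A" and "B" have the same length and overlap on token 1, so the earlier span "B" is kept and "A" is dropped, while "C" is unaffected.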
svlandeg
5fa3235d06
set DATA_VALIDATION to False for debug_model (upgrade thinc)
2020-07-31 15:21:01 +02:00
svlandeg
a52e1f99ff
revert commits that should have been on different, local branch
2020-07-31 15:10:20 +02:00
svlandeg
08d3c36c20
bugfix in train CLI
2020-07-31 15:03:43 +02:00
svlandeg
d5d7fe5968
set DATA_VALIDATION to False for debug_model (upgrade thinc)
2020-07-31 14:19:10 +02:00
svlandeg
35dd91a671
bugfix in train CLI
2020-07-31 14:18:27 +02:00
Ines Montani
6365837ca9
Merge pull request #5833 from explosion/feature/scorer-adjustments
2020-07-31 14:00:39 +02:00
Ines Montani
5a221f79c2
Revert "Remove keyword-only from Scorer API docs" [ci skip]
...
This reverts commit 7a6ac47dc1.
2020-07-31 14:00:21 +02:00
Ines Montani
160f1a5f94
Update docs [ci skip]
2020-07-31 13:26:39 +02:00
Adriane Boyd
9b509aa87f
Move Language.evaluate scorer config to new arg
...
Move `Language.evaluate` scorer config from `component_cfg` to separate
argument `scorer_cfg`.
2020-07-31 11:05:16 +02:00
Adriane Boyd
901801b33b
Fix default arguments in DependencyParser.score
2020-07-31 10:55:44 +02:00
Adriane Boyd
9d79916792
Merge branch 'develop' into feature/scorer-adjustments
2020-07-31 10:48:14 +02:00
svlandeg
355cc6b921
Merge branch 'fix/cli-debug' into nightly.spacy.io
2020-07-31 10:18:48 +02:00
Sofie Van Landeghem
ca491722ad
The Parser is now a Pipe (2) ( #5844 )
...
* moving syntax folder to _parser_internals
* moving nn_parser and transition_system
* move nn_parser and transition_system out of internals folder
* moving nn_parser code into transition_system file
* rename transition_system to transition_parser
* moving parser_model and _state to ml
* move _state back to internals
* The Parser now inherits from Pipe!
* small code fixes
* removing unnecessary imports
* remove link_vectors_to_models
* transition_system to internals folder
* little bit more cleanup
* newlines
2020-07-30 23:30:54 +02:00
svlandeg
0b23594953
pipe_name instead of section in debug_model
2020-07-30 20:06:28 +02:00
svlandeg
abe91cff92
update documentation of debug config
2020-07-30 19:12:42 +02:00
holubvl3
d16c0f2c3a
Create holubvl3 ( #5845 )
...
* Create holubvl3
* Rename holubvl3 to holubvl3.md
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2020-07-30 17:40:31 +02:00
Rahul Gupta
f76fae0e8d
English: adds ordinal numbers ( #5830 )
2020-07-29 20:22:47 +02:00
Ines Montani
3449c45fd9
Update docs [ci skip]
2020-07-29 19:48:26 +02:00
Ines Montani
9c80cb673d
Update docs [ci skip]
2020-07-29 19:41:34 +02:00
Ines Montani
9f69afdd1e
Update docs [ci skip]
2020-07-29 19:09:44 +02:00