Kevin Humphreys
b393525b50
Merge branch 'rapidfuzz' of https://github.com/kwhumphreys/spaCy into rapidfuzz
2022-09-14 15:56:18 -07:00
Kevin Humphreys
b7599dfb2f
fuzzy match only on oov tokens
2022-09-14 15:54:05 -07:00
Kevin Humphreys
a6d26a0195
switch to polyleven
...
(Python package)
2022-09-12 16:45:51 -07:00
Kevin Humphreys
568a843c09
revert changes added for fuzzy param
2022-09-12 16:45:51 -07:00
Kevin Humphreys
3591a69d35
switch to FUZZYn predicates
...
use Levenshtein distance.
remove fuzzy param.
remove rapidfuzz_capi.
2022-09-12 16:45:51 -07:00
Kevin Humphreys
974e5f9902
case fix
2022-09-12 16:45:51 -07:00
Kevin Humphreys
e636f4941b
simplify fuzzy sets
2022-09-12 16:45:51 -07:00
Kevin Humphreys
9c0f9368a9
handle fuzzy sets
2022-09-12 16:45:51 -07:00
Kevin Humphreys
0859e391c6
remove unnecessary dependency
2022-09-12 16:45:50 -07:00
Kevin Humphreys
ee25d434b6
tidying
2022-09-12 16:45:50 -07:00
Kevin Humphreys
3dba984db9
fix type properly
2022-09-12 16:45:50 -07:00
Kevin Humphreys
63f5e1331d
add fuzzy attribute list
2022-09-12 16:45:50 -07:00
Kevin Humphreys
594674db92
add FUZZY predicate
2022-09-12 16:45:50 -07:00
Kevin Humphreys
426f3349d4
fix type
2022-09-12 16:45:50 -07:00
Kevin Humphreys
3a63ad1913
include rapidfuzz_capi
...
not yet used
2022-09-12 16:45:50 -07:00
Kevin Humphreys
66e9fdd246
add fuzzy param to EntityMatcher
2022-09-12 16:45:50 -07:00
Kevin Humphreys
dacfb57b03
enable fuzzy matching
2022-09-12 16:45:50 -07:00
Kevin Humphreys
59021f7d25
switch to polyleven
...
(Python package)
2022-08-29 21:42:10 +02:00
Kevin Humphreys
a8a4d86bae
revert changes added for fuzzy param
2022-08-29 18:28:17 +02:00
Kevin Humphreys
43948f731b
switch to FUZZYn predicates
...
use Levenshtein distance.
remove fuzzy param.
remove rapidfuzz_capi.
2022-08-29 18:10:42 +02:00
Kevin Humphreys
ecd0455acd
case fix
2022-08-29 15:49:15 +02:00
Kevin Humphreys
ecebb5b145
simplify fuzzy sets
2022-08-29 12:49:14 +02:00
Kevin Humphreys
9bdccf94e5
handle fuzzy sets
2022-08-29 10:58:50 +02:00
Kevin Humphreys
b189f25aaa
remove unnecessary dependency
2022-08-29 10:58:11 +02:00
Kevin Humphreys
c03394810b
tidying
2022-08-26 02:06:05 +02:00
Kevin Humphreys
c017de997a
fix type properly
2022-08-26 01:30:44 +02:00
Kevin Humphreys
78699ab0ce
add fuzzy attribute list
2022-08-26 00:10:53 +02:00
Kevin Humphreys
3dc5b9c7be
add FUZZY predicate
2022-08-24 17:54:42 +02:00
Kevin Humphreys
9600fe1d99
fix type
2022-08-24 15:04:09 +02:00
Kevin Humphreys
ee985a382e
include rapidfuzz_capi
...
not yet used
2022-08-24 13:13:54 +02:00
Kevin Humphreys
b617382dc6
add fuzzy param to EntityMatcher
2022-08-24 13:13:27 +02:00
Kevin Humphreys
1f2e57eca4
enable fuzzy matching
2022-08-22 17:02:47 +02:00
Sofie Van Landeghem
6e20842370
dev docs: numeric comparators ( #11334 )
...
* add section on numeric comparators
* edit
* prettier
* Update extra/DEVELOPER_DOCS/Code Conventions.md
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* note on typing imports
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-08-22 15:52:53 +02:00
Adriane Boyd
f55bb7470d
Clean up warnings in the test suite ( #11331 )
2022-08-22 12:04:30 +02:00
Paul O'Leary McCann
0f07defe2c
Remove reference to voting on issue ( #11335 )
...
Not clear which issue this refers to, we don't suggest this for any
other issues, and we don't use votes in general.
2022-08-22 11:29:05 +02:00
Adriane Boyd
04c6e5cb95
Improve floret vectors display in pipeline docs ( #11343 )
2022-08-22 11:28:13 +02:00
Adriane Boyd
3e4cf1bbe1
Check for . in factory names ( #11336 )
2022-08-19 09:52:12 +02:00
Adriane Boyd
09b3118b26
Add uk pipelines to website ( #11332 )
2022-08-18 14:04:57 +02:00
Sofie Van Landeghem
cab263791f
include span_ruler for default warning filter ( #11333 )
2022-08-17 19:55:54 +02:00
Peter Baumgartner
db7b9938a4
Docs: displaCy documentation - data types, parse_{deps,ents,spans}, spans example ( #10950 )
...
* add in spans example and parse references
* rm autoformatter
* rm extra ents copy
* TypedDict draft
* type fixes
* restore non-documentation files
* docs update
* fix spans example
* fix hyperlinks
* add parse example
* example fix + argument fix
* fix api arg in docs
* fix bad variable replacement
* fix spacing in style
* fix spacing on table
* fix spacing on table
* rm temp files
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-08-16 11:23:34 -04:00
Adriane Boyd
ed4ad309e6
Fix Dutch noun chunks to skip overlapping spans ( #11275 )
...
* Add test for overlapping noun chunks
* Skip overlapping noun chunks
* Update spacy/tests/lang/nl/test_noun_chunks.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-08-10 09:49:08 +02:00
Paul O'Leary McCann
231a17817d
Clean up automated label-based issue handling ( #11284 )
...
* Clean up automated label-based issue handling
1. upgrade tiangolo/issue-manager to latest
2. move needs-more-info to tiangolo
3. change needs-more-info close time to 7 days
4. delete old needs-more-info config
* Use old, longer message
* Fix label name
2022-08-09 14:50:50 +02:00
Adriane Boyd
e700358ba0
Add W605 to the errors raised by flake8 in the CI ( #11283 )
2022-08-09 12:15:13 +02:00
Adriane Boyd
fc4246558b
Fix regex invalid escape sequences ( #11276 )
2022-08-09 10:59:36 +02:00
stefawolf
23749cfc91
adding spans to doc_annotation in Example.to_dict ( #11261 )
...
* adding spans to doc_annotation in Example.to_dict
* to_dict compatible with from_dict: tuples instead of spans
* use strings for label and kb_id
* Simplify test
* Update data formats docs
Co-authored-by: Stefanie Wolf <stefanie.wolf@vitecsoftware.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-08-05 12:26:38 +02:00
Adriane Boyd
b07708d5d0
Support full prerelease versions in the compat table ( #11228 )
...
* Support full prerelease versions in the compat table
* Fix types
2022-08-04 15:14:19 +02:00
Jules Belveze
cd09614ab2
chore: add 'concepCy' to spacy universe ( #11255 )
...
* chore: add 'concepCy' to spacy universe
* docs: add 'slogan' to concepCy
2022-08-04 15:42:38 +09:00
Lj Miranda
d993df41e5
Update docs for pipeline initialize() methods ( #11221 )
...
* Update documentation for dependency parser
* Update documentation for trainable_lemmatizer
* Update documentation for entity_linker
* Update documentation for ner
* Update documentation for morphologizer
* Update documentation for senter
* Update documentation for spancat
* Update documentation for tagger
* Update documentation for textcat
* Update documentation for tok2vec
* Run prettier on edited files
* Apply similar changes in transformer docs
* Remove need to say annotated example explicitly
I removed the need to say "Must contain at least one annotated Example"
because it's often a given that Examples will contain some gold-standard
annotation.
* Run prettier on transformer docs
2022-08-03 16:53:02 +02:00
Adriane Boyd
d0578c2ede
Add scorer to textcat API docs config settings ( #11263 )
2022-08-03 16:41:20 +02:00
Paul O'Leary McCann
2d89dd9db8
Update natto-py version spec ( #11222 )
...
* Update natto-py version spec
* Update setup.cfg
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-07-28 07:45:02 +02:00