Updated example config

richardpaulhudson 2022-11-04 12:36:14 +01:00
parent dcfc810033
commit f97d6e6826
2 changed files with 11 additions and 8 deletions

@@ -1782,8 +1782,11 @@ cdef class Doc:
 the fact that we are hashing short affixes and searching for small groups of characters. The calling code is responsible
 for ensuring that lengths being passed in cannot exceed 63 and hence that resulting values with a maximum of four-byte
 character widths can never exceed 255.
-"""
+Note that this method performs no data validation itself as it expects the calling code will already have done so, and
+that the behaviour of the code may be erratic if the supplied parameters do not conform to expectations.
+"""
 # Work out lengths
 cdef int p_lengths_l = strlen(<char*> p_lengths)
 cdef int s_lengths_l = strlen(<char*> s_lengths)
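
The bound in the docstring comes down to simple arithmetic: 63 characters at up to four bytes each is 252 bytes, which still fits in an unsigned byte. Since the method does no validation itself, a caller-side check could be sketched roughly as below. `check_affix_lengths` and the two constants are hypothetical names used for illustration and are not part of spaCy; the lower bound of 1 is also an assumption.

```python
MAX_AFFIX_LENGTH = 63    # upper limit stated in the docstring
MAX_BYTES_PER_CHAR = 4   # widest character width mentioned in the docstring


def check_affix_lengths(lengths):
    """Hypothetical caller-side check; returns the worst-case byte count."""
    for length in lengths:
        # Assumption: affix lengths are positive; only the upper bound
        # of 63 comes from the docstring.
        if not 1 <= length <= MAX_AFFIX_LENGTH:
            raise ValueError(
                f"affix length {length} is outside 1..{MAX_AFFIX_LENGTH}"
            )
    # Worst case: the longest requested affix consists of four-byte characters.
    return max(lengths, default=0) * MAX_BYTES_PER_CHAR


# The largest permitted length still fits in an unsigned byte.
worst_case = MAX_AFFIX_LENGTH * MAX_BYTES_PER_CHAR
assert worst_case == 252 and worst_case <= 255
```

Running a check like this before the call keeps the unchecked Cython code on inputs it was written for.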

@@ -177,18 +177,18 @@ updated).
 > ```ini
 > [model]
 > @architectures = "spacy.RichMultiHashEmbed.v1"
-> width = 64
-> attrs = ["LOWER","SHAPE"]
-> rows = [2000,1000]
+> width = ${components.tok2vec.model.encode:width}
+> attrs = ["LOWER","SHAPE","SPACY"]
+> rows = [5000,2500,50]
 > include_static_vectors = "False"
 > case_sensitive = "False"
 > pref_lengths = [2, 3, 5]
-> pref_rows = [2000,2000,2000]
+> pref_rows = [10000, 10000, 10000]
 > suff_lengths = [2, 3, 4, 5]
-> suff_rows = [2000,2000,2000,2000]
-> suff_search_chars = "aeiouäöüyß"
+> suff_rows = [10000, 10000,10000,10000]
+> suff_search_chars = "aeiouäöüß"
 > suff_search_lengths = [2, 3]
-> suff_search_rows = [2000,2000]
+> suff_search_rows = [10000,10000]
 > ```
 Construct an embedding layer with the features of
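
In the updated example, each list of lengths or attributes has exactly as many entries as its companion `*rows` list (three `attrs` and three `rows`, four `suff_lengths` and four `suff_rows`, and so on). Assuming that pairing is a requirement, which the diff itself does not state, a quick consistency check over the new values might look like the sketch below. The `PAIRS` table and `mismatched_pairs` helper are hypothetical and use plain Python data rather than spaCy's config loader.

```python
# New values copied from the updated example config.
example = {
    "attrs": ["LOWER", "SHAPE", "SPACY"],
    "rows": [5000, 2500, 50],
    "pref_lengths": [2, 3, 5],
    "pref_rows": [10000, 10000, 10000],
    "suff_lengths": [2, 3, 4, 5],
    "suff_rows": [10000, 10000, 10000, 10000],
    "suff_search_lengths": [2, 3],
    "suff_search_rows": [10000, 10000],
}

# Assumed pairing: each key on the left needs one row count per entry.
PAIRS = [
    ("attrs", "rows"),
    ("pref_lengths", "pref_rows"),
    ("suff_lengths", "suff_rows"),
    ("suff_search_lengths", "suff_search_rows"),
]


def mismatched_pairs(settings):
    """Return the (key, rows_key) pairs whose lists differ in length."""
    return [
        (key, rows_key)
        for key, rows_key in PAIRS
        if len(settings[key]) != len(settings[rows_key])
    ]


assert mismatched_pairs(example) == []
```

A check of this kind would flag a change that adds an attribute such as `"SPACY"` to `attrs` without adding the matching entry to `rows`.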