spaCy/website/docs/api/morphology.md
Paul O'Leary McCann ba6a37d358
Document Assigned Attributes of Pipeline Components (#9041)
2021-09-01 12:09:39 +02:00

title: Morphology
tag: class
source: spacy/morphology.pyx

Store the possible morphological analyses for a language, and index them by hash. To save space on each token, tokens only know the hash of their morphological analysis, so queries of morphological attributes are delegated to this class. See MorphAnalysis for the container storing a single morphological analysis.
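The idea of indexing analyses by hash can be illustrated with a small pure-Python sketch. It is not spaCy's implementation: `ToyMorphTable` and `toy_hash` are hypothetical names, and the hash function is a stand-in chosen only because it is deterministic.

```python
import hashlib
from typing import Dict


def toy_hash(text: str) -> int:
    """Deterministic 64-bit hash of a string (a stand-in, not spaCy's hash)."""
    digest = hashlib.blake2b(text.encode("utf8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


class ToyMorphTable:
    """Hypothetical table that maps a hash to its FEATS string."""

    def __init__(self) -> None:
        self._analyses: Dict[int, str] = {}

    def add(self, feats: str) -> int:
        # Store the analysis once and hand back its key.
        key = toy_hash(feats)
        self._analyses.setdefault(key, feats)
        return key

    def get(self, key: int) -> str:
        # Unknown keys resolve to an empty analysis in this sketch.
        return self._analyses.get(key, "")


table = ToyMorphTable()
# Each "token" keeps only an integer key, not the analysis itself.
token_keys = [
    table.add("Number=Sing"),
    table.add("Number=Plur"),
    table.add("Number=Sing"),
]
```

Tokens that share an analysis share a key, so the table holds each distinct analysis only once, however many tokens refer to it.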

Morphology.__init__

Create a Morphology object.

Example

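from spacy.morphology import Morphology
from spacy.strings import StringStore

# The morphology table needs a string store; an empty one is enough here.
strings = StringStore()
morphology = Morphology(strings)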
Name Description
strings The string store. StringStore

Morphology.add

Insert a morphological analysis in the morphology table, if not already present. The morphological analysis may be provided in the Universal Dependencies FEATS format as a string or in the tag map dictionary format. Returns the hash of the new analysis.

Example

feats = "Feat1=Val1|Feat2=Val2"
hash = nlp.vocab.morphology.add(feats)
assert hash == nlp.vocab.strings[feats]
Name Description
features The morphological features. Union[Dict, str]

Morphology.get

Get the FEATS string for the hash of the morphological analysis.

Example

feats = "Feat1=Val1|Feat2=Val2"
hash = nlp.vocab.morphology.add(feats)
assert nlp.vocab.morphology.get(hash) == feats

Name Description
morph The hash of the morphological analysis. int

Morphology.feats_to_dict

Convert a string FEATS representation to a dictionary of features and values in the same format as the tag map.

Example

from spacy.morphology import Morphology
d = Morphology.feats_to_dict("Feat1=Val1|Feat2=Val2")
assert d == {"Feat1": "Val1", "Feat2": "Val2"}
Name Description
feats The morphological features in Universal Dependencies FEATS format. str
RETURNS The morphological features as a dictionary. Dict[str, str]
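The conversion can be approximated in plain Python. The following is a minimal sketch rather than spaCy's code: `feats_to_dict_sketch` is a hypothetical name, and treating `"_"` (the CoNLL-U placeholder for "no features") as empty is an assumption of this example.

```python
from typing import Dict

FEATURE_SEP = "|"  # separates one feature from the next
FIELD_SEP = "="    # separates a field name from its value(s)


def feats_to_dict_sketch(feats: str) -> Dict[str, str]:
    """Parse a FEATS string into a {field: values} dictionary."""
    # Assumption: an empty string or "_" means there are no features.
    if not feats or feats == "_":
        return {}
    result: Dict[str, str] = {}
    for feature in feats.split(FEATURE_SEP):
        # Split on the first "=" only; multiple values stay comma-joined.
        field, _, values = feature.partition(FIELD_SEP)
        result[field] = values
    return result
```

Multi-valued features stay as a single comma-separated string, which matches the `Dict[str, str]` return type documented above.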

Morphology.dict_to_feats

Convert a dictionary of features and values to a string FEATS representation.

Example

from spacy.morphology import Morphology
f = Morphology.dict_to_feats({"Feat1": "Val1", "Feat2": "Val2"})
assert f == "Feat1=Val1|Feat2=Val2"
Name Description
feats_dict The morphological features as a dictionary. Dict[str, str]
RETURNS The morphological features in Universal Dependencies FEATS format. str
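The reverse direction is a join over the dictionary items. In this sketch, `dict_to_feats_sketch` is a hypothetical name, and sorting the features by field name (case-insensitively, following the Universal Dependencies ordering convention) is an assumption of the example rather than documented behaviour.

```python
from typing import Dict


def dict_to_feats_sketch(feats_dict: Dict[str, str]) -> str:
    """Serialize a {field: values} dictionary as a FEATS string."""
    # Assumption: order features alphabetically by field, ignoring case.
    items = sorted(feats_dict.items(), key=lambda item: item[0].lower())
    return "|".join(f"{field}={values}" for field, values in items)
```

Sorting makes the output independent of dictionary insertion order, so two dictionaries with the same content serialize to the same string.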

Attributes

Name Description
FEATURE_SEP The FEATS feature separator. Default is |. str
FIELD_SEP The FEATS field separator. Default is =. str
VALUE_SEP The FEATS value separator. Default is ,. str

MorphAnalysis

Stores a single morphological analysis.
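As a rough mental model of how the container behaves, the sketch below mimics the methods documented on this page in pure Python. `ToyMorphAnalysis` is a hypothetical class, not spaCy's implementation. It assumes the default separators `|`, `=` and `,`, and it expands each multi-valued feature into one feature/value pair per value.

```python
from typing import Iterator, List


class ToyMorphAnalysis:
    """Hypothetical pure-Python stand-in for a single analysis."""

    FEATURE_SEP = "|"
    FIELD_SEP = "="
    VALUE_SEP = ","

    def __init__(self, feats: str) -> None:
        self.feats = feats
        # Expand "Feat1=Val1,Val2" into "Feat1=Val1" and "Feat1=Val2".
        self.pairs: List[str] = []
        for feature in filter(None, feats.split(self.FEATURE_SEP)):
            field, _, values = feature.partition(self.FIELD_SEP)
            for value in values.split(self.VALUE_SEP):
                self.pairs.append(f"{field}{self.FIELD_SEP}{value}")

    def __contains__(self, pair: str) -> bool:
        return pair in self.pairs

    def __iter__(self) -> Iterator[str]:
        return iter(self.pairs)

    def __len__(self) -> int:
        return len(self.pairs)

    def __str__(self) -> str:
        return self.feats

    def get(self, field: str) -> List[str]:
        # Collect the values of every pair that belongs to this field.
        prefix = field + self.FIELD_SEP
        return [p[len(prefix):] for p in self.pairs if p.startswith(prefix)]


morph = ToyMorphAnalysis("Feat1=Val1,Val2|Feat2=Val2")
```

In this model, membership tests, iteration and length all operate on the expanded feature/value pairs, which is why a two-feature string can have a length of three.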

MorphAnalysis.__init__

Initialize a MorphAnalysis object from a Universal Dependencies FEATS string or a dictionary of morphological features.

Example

from spacy.tokens import MorphAnalysis

feats = "Feat1=Val1|Feat2=Val2"
m = MorphAnalysis(nlp.vocab, feats)
Name Description
vocab The vocab. Vocab
features The morphological features. Union[Dict[str, str], str]

MorphAnalysis.__contains__

Whether a feature/value pair is in the analysis.

Example

feats = "Feat1=Val1,Val2|Feat2=Val2"
morph = MorphAnalysis(nlp.vocab, feats)
assert "Feat1=Val1" in morph
Name Description
RETURNS Whether the feature/value pair is in the analysis. bool

MorphAnalysis.__iter__

Iterate over the feature/value pairs in the analysis.

Example

feats = "Feat1=Val1,Val3|Feat2=Val2"
morph = MorphAnalysis(nlp.vocab, feats)
assert list(morph) == ["Feat1=Val1", "Feat1=Val3", "Feat2=Val2"]
Name Description
YIELDS A feature/value pair in the analysis. str

MorphAnalysis.__len__

Returns the number of feature/value pairs in the analysis. A feature with several values counts once per value.

Example

feats = "Feat1=Val1,Val2|Feat2=Val2"
morph = MorphAnalysis(nlp.vocab, feats)
assert len(morph) == 3
Name Description
RETURNS The number of feature/value pairs in the analysis. int

MorphAnalysis.__str__

Returns the morphological analysis in the Universal Dependencies FEATS string format.

Example

feats = "Feat1=Val1,Val2|Feat2=Val2"
morph = MorphAnalysis(nlp.vocab, feats)
assert str(morph) == feats
Name Description
RETURNS The analysis in the Universal Dependencies FEATS format. str

MorphAnalysis.get

Retrieve the values of a feature by its field name.

Example

feats = "Feat1=Val1,Val2"
morph = MorphAnalysis(nlp.vocab, feats)
assert morph.get("Feat1") == ["Val1", "Val2"]
Name Description
field The field to retrieve. str
RETURNS A list of the individual values for the field. List[str]

MorphAnalysis.to_dict

Produce a dict representation of the analysis, in the same format as the tag map.

Example

feats = "Feat1=Val1,Val2|Feat2=Val2"
morph = MorphAnalysis(nlp.vocab, feats)
assert morph.to_dict() == {"Feat1": "Val1,Val2", "Feat2": "Val2"}
Name Description
RETURNS The dict representation of the analysis. Dict[str, str]

MorphAnalysis.from_id

Create a morphological analysis from a given hash ID.

Example

feats = "Feat1=Val1|Feat2=Val2"
hash = nlp.vocab.strings[feats]
morph = MorphAnalysis.from_id(nlp.vocab, hash)
assert str(morph) == feats
Name Description
vocab The vocab. Vocab
key The hash of the features string. int