1
1
mirror of https://github.com/explosion/spaCy.git synced 2025-01-24 08:14:15 +03:00
spaCy/website/docs/api/morphology.md
Paul O'Leary McCann ba6a37d358
Document Assigned Attributes of Pipeline Components ()
* Add textcat docs

* Add NER docs

* Add Entity Linker docs

* Add assigned fields docs for the tagger

This also adds a preamble, since there wasn't one.

* Add morphologizer docs

* Add dependency parser docs

* Update entityrecognizer docs

This is a little weird because `Doc.ents` is the only thing assigned to,
but it's actually a bidirectional property.

* Add token fields for entityrecognizer

* Fix section name

* Add entity ruler docs

* Add lemmatizer docs

* Add sentencizer/recognizer docs

* Update website/docs/api/entityrecognizer.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/api/entityruler.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/api/tagger.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/api/entityruler.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update type for Doc.ents

This was `Tuple[Span, ...]` everywhere but `Tuple[Span]` seems to be
correct.

* Run prettier

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Run prettier

* Add transformers section

This basically just moves and renames the "custom attributes" section
from the bottom of the page to be consistent with "assigned attributes"
on other pages.

I looked at moving the paragraph just above the section into the
section, but it includes the unrelated registry additions, so it seemed
better to leave it unchanged.

* Make table header consistent

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-09-01 12:09:39 +02:00

255 lines
9.4 KiB
Markdown

---
title: Morphology
tag: class
source: spacy/morphology.pyx
---
Store the possible morphological analyses for a language, and index them by
hash. To save space on each token, tokens only know the hash of their
morphological analysis, so queries of morphological attributes are delegated to
this class. See [`MorphAnalysis`](/api/morphology#morphanalysis) for the
container storing a single morphological analysis.
## Morphology.\_\_init\_\_ {#init tag="method"}
Create a `Morphology` object.
> #### Example
>
> ```python
> from spacy.morphology import Morphology
>
> morphology = Morphology(strings)
> ```
| Name | Description |
| --------- | --------------------------------- |
| `strings` | The string store. ~~StringStore~~ |
## Morphology.add {#add tag="method"}
Insert a morphological analysis in the morphology table, if not already present.
The morphological analysis may be provided in the Universal Dependencies
[FEATS](https://universaldependencies.org/format.html#morphological-annotation)
format as a string or in the tag map dictionary format. Returns the hash of the
new analysis.
> #### Example
>
> ```python
> feats = "Feat1=Val1|Feat2=Val2"
> hash = nlp.vocab.morphology.add(feats)
> assert hash == nlp.vocab.strings[feats]
> ```
| Name | Description |
| ---------- | ------------------------------------------------ |
| `features` | The morphological features. ~~Union[Dict, str]~~ |
## Morphology.get {#get tag="method"}
> #### Example
>
> ```python
> feats = "Feat1=Val1|Feat2=Val2"
> hash = nlp.vocab.morphology.add(feats)
> assert nlp.vocab.morphology.get(hash) == feats
> ```
Get the
[FEATS](https://universaldependencies.org/format.html#morphological-annotation)
string for the hash of the morphological analysis.
| Name | Description |
| ------- | ----------------------------------------------- |
| `morph` | The hash of the morphological analysis. ~~int~~ |
## Morphology.feats_to_dict {#feats_to_dict tag="staticmethod"}
Convert a string
[FEATS](https://universaldependencies.org/format.html#morphological-annotation)
representation to a dictionary of features and values in the same format as the
tag map.
> #### Example
>
> ```python
> from spacy.morphology import Morphology
> d = Morphology.feats_to_dict("Feat1=Val1|Feat2=Val2")
> assert d == {"Feat1": "Val1", "Feat2": "Val2"}
> ```
| Name | Description |
| ----------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
| `feats` | The morphological features in Universal Dependencies [FEATS](https://universaldependencies.org/format.html#morphological-annotation) format. ~~str~~ |
| **RETURNS** | The morphological features as a dictionary. ~~Dict[str, str]~~ |
## Morphology.dict_to_feats {#dict_to_feats tag="staticmethod"}
Convert a dictionary of features and values to a string
[FEATS](https://universaldependencies.org/format.html#morphological-annotation)
representation.
> #### Example
>
> ```python
> from spacy.morphology import Morphology
> f = Morphology.dict_to_feats({"Feat1": "Val1", "Feat2": "Val2"})
> assert f == "Feat1=Val1|Feat2=Val2"
> ```
| Name | Description |
| ------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
| `feats_dict` | The morphological features as a dictionary. ~~Dict[str, str]~~ |
| **RETURNS** | The morphological features in Universal Dependencies [FEATS](https://universaldependencies.org/format.html#morphological-annotation) format. ~~str~~ |
## Attributes {#attributes}
| Name | Description |
| ------------- | ------------------------------------------------------------------------------------------------------------------------------ |
| `FEATURE_SEP` | The [FEATS](https://universaldependencies.org/format.html#morphological-annotation) feature separator. Default is `|`. ~~str~~ |
| `FIELD_SEP` | The [FEATS](https://universaldependencies.org/format.html#morphological-annotation) field separator. Default is `=`. ~~str~~ |
| `VALUE_SEP` | The [FEATS](https://universaldependencies.org/format.html#morphological-annotation) value separator. Default is `,`. ~~str~~ |
## MorphAnalysis {#morphanalysis tag="class" source="spacy/tokens/morphanalysis.pyx"}
Stores a single morphological analysis.
### MorphAnalysis.\_\_init\_\_ {#morphanalysis-init tag="method"}
Initialize a MorphAnalysis object from a Universal Dependencies
[FEATS](https://universaldependencies.org/format.html#morphological-annotation)
string or a dictionary of morphological features.
> #### Example
>
> ```python
> from spacy.tokens import MorphAnalysis
>
> feats = "Feat1=Val1|Feat2=Val2"
> m = MorphAnalysis(nlp.vocab, feats)
> ```
| Name | Description |
| ---------- | ---------------------------------------------------------- |
| `vocab` | The vocab. ~~Vocab~~ |
| `features` | The morphological features. ~~Union[Dict[str, str], str]~~ |
### MorphAnalysis.\_\_contains\_\_ {#morphanalysis-contains tag="method"}
Whether a feature/value pair is in the analysis.
> #### Example
>
> ```python
> feats = "Feat1=Val1,Val2|Feat2=Val2"
> morph = MorphAnalysis(nlp.vocab, feats)
> assert "Feat1=Val1" in morph
> ```
| Name | Description |
| ----------- | --------------------------------------------- |
| **RETURNS** | A feature/value pair in the analysis. ~~str~~ |
### MorphAnalysis.\_\_iter\_\_ {#morphanalysis-iter tag="method"}
Iterate over the feature/value pairs in the analysis.
> #### Example
>
> ```python
> feats = "Feat1=Val1,Val3|Feat2=Val2"
> morph = MorphAnalysis(nlp.vocab, feats)
> assert list(morph) == ["Feat1=Va1", "Feat1=Val3", "Feat2=Val2"]
> ```
| Name | Description |
| ---------- | --------------------------------------------- |
| **YIELDS** | A feature/value pair in the analysis. ~~str~~ |
### MorphAnalysis.\_\_len\_\_ {#morphanalysis-len tag="method"}
Returns the number of features in the analysis.
> #### Example
>
> ```python
> feats = "Feat1=Val1,Val2|Feat2=Val2"
> morph = MorphAnalysis(nlp.vocab, feats)
> assert len(morph) == 3
> ```
| Name | Description |
| ----------- | ----------------------------------------------- |
| **RETURNS** | The number of features in the analysis. ~~int~~ |
### MorphAnalysis.\_\_str\_\_ {#morphanalysis-str tag="method"}
Returns the morphological analysis in the Universal Dependencies
[FEATS](https://universaldependencies.org/format.html#morphological-annotation)
string format.
> #### Example
>
> ```python
> feats = "Feat1=Val1,Val2|Feat2=Val2"
> morph = MorphAnalysis(nlp.vocab, feats)
> assert str(morph) == feats
> ```
| Name | Description |
| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
| **RETURNS** | The analysis in the Universal Dependencies [FEATS](https://universaldependencies.org/format.html#morphological-annotation) format. ~~str~~ |
### MorphAnalysis.get {#morphanalysis-get tag="method"}
Retrieve values for a feature by field.
> #### Example
>
> ```python
> feats = "Feat1=Val1,Val2"
> morph = MorphAnalysis(nlp.vocab, feats)
> assert morph.get("Feat1") == ["Val1", "Val2"]
> ```
| Name | Description |
| ----------- | ------------------------------------------------ |
| `field` | The field to retrieve. ~~str~~ |
| **RETURNS** | A list of the individual features. ~~List[str]~~ |
### MorphAnalysis.to_dict {#morphanalysis-to_dict tag="method"}
Produce a dict representation of the analysis, in the same format as the tag
map.
> #### Example
>
> ```python
> feats = "Feat1=Val1,Val2|Feat2=Val2"
> morph = MorphAnalysis(nlp.vocab, feats)
> assert morph.to_dict() == {"Feat1": "Val1,Val2", "Feat2": "Val2"}
> ```
| Name | Description |
| ----------- | ----------------------------------------------------------- |
| **RETURNS** | The dict representation of the analysis. ~~Dict[str, str]~~ |
### MorphAnalysis.from_id {#morphanalysis-from_id tag="classmethod"}
Create a morphological analysis from a given hash ID.
> #### Example
>
> ```python
> feats = "Feat1=Val1|Feat2=Val2"
> hash = nlp.vocab.strings[feats]
> morph = MorphAnalysis.from_id(nlp.vocab, hash)
> assert str(morph) == feats
> ```
| Name | Description |
| ------- | ---------------------------------------- |
| `vocab` | The vocab. ~~Vocab~~ |
| `key` | The hash of the features string. ~~int~~ |