From 14df00ae989f7332535cb3f74cebed2125aecc91 Mon Sep 17 00:00:00 2001
From: Adriane Boyd
Date: Tue, 21 Jul 2020 10:33:46 +0200
Subject: [PATCH] Add Morphology and MorphAnalsysis API docs

Add initial draft of `Morphology` and `MorphAnalysis` API docs.
---
 website/docs/api/morphanalysis.md | 154 ++++++++++++++++++++++++++++
 website/docs/api/morphology.md    | 165 ++++++++++++++++++++++++++++++
 website/meta/sidebars.json        |   2 +
 3 files changed, 321 insertions(+)
 create mode 100644 website/docs/api/morphanalysis.md
 create mode 100644 website/docs/api/morphology.md

diff --git a/website/docs/api/morphanalysis.md b/website/docs/api/morphanalysis.md
new file mode 100644
index 000000000..7d883c86b
--- /dev/null
+++ b/website/docs/api/morphanalysis.md
@@ -0,0 +1,154 @@
+---
+title: MorphAnalysis
+tag: class
+source: spacy/tokens/morphanalysis.pyx
+---
+
+Stores a single morphological analysis.
+
+
+## MorphAnalysis.\_\_init\_\_ {#init tag="method"}
+
+Initialize a MorphAnalysis object from a UD FEATS string or a dictionary of
+morphological features.
+
+> #### Example
+>
+> ```python
+> from spacy.tokens import MorphAnalysis
+>
+> feats = "Feat1=Val1|Feat2=Val2"
+> m = MorphAnalysis(nlp.vocab, feats)
+> ```
+
+| Name        | Type               | Description                   |
+| ----------- | ------------------ | ----------------------------- |
+| `vocab`     | `Vocab`            | The vocab.                    |
+| `features`  | `Union[Dict, str]` | The morphological features.   |
+| **RETURNS** | `MorphAnalysis`    | The newly constructed object. |
+
+
+## MorphAnalysis.\_\_contains\_\_ {#contains tag="method"}
+
+Whether a feature/value pair is in the analysis.
+
+> #### Example
+>
+> ```python
+> feats = "Feat1=Val1,Val2|Feat2=Val2"
+> morph = MorphAnalysis(nlp.vocab, feats)
+> assert "Feat1=Val1" in morph
+> ```
+
+| Name        | Type   | Description                                                  |
+| ----------- | ------ | ------------------------------------------------------------ |
+| **RETURNS** | `bool` | Whether the feature/value pair is contained in the analysis. |
+
+
+## MorphAnalysis.\_\_iter\_\_ {#iter tag="method"}
+
+Iterate over the feature/value pairs in the analysis.
+
+> #### Example
+>
+> ```python
+> feats = "Feat1=Val1|Feat2=Val2"
+> morph = MorphAnalysis(nlp.vocab, feats)
+> for feat in morph:
+>     print(feat)
+> ```
+
+| Name       | Type  | Description                           |
+| ---------- | ----- | ------------------------------------- |
+| **YIELDS** | `str` | A feature/value pair in the analysis. |
+
+
+## MorphAnalysis.\_\_len\_\_ {#len tag="method"}
+
+Returns the number of feature/value pairs in the analysis.
+
+> #### Example
+>
+> ```python
+> feats = "Feat1=Val1,Val2|Feat2=Val2"
+> morph = MorphAnalysis(nlp.vocab, feats)
+> assert len(morph) == 3
+> ```
+
+| Name        | Type  | Description                                        |
+| ----------- | ----- | -------------------------------------------------- |
+| **RETURNS** | `int` | The number of feature/value pairs in the analysis. |
+
+
+## MorphAnalysis.\_\_str\_\_ {#str tag="method"}
+
+Returns the morphological analysis in the UD FEATS string format.
+
+> #### Example
+>
+> ```python
+> feats = "Feat1=Val1,Val2|Feat2=Val2"
+> morph = MorphAnalysis(nlp.vocab, feats)
+> assert str(morph) == feats
+> ```
+
+| Name        | Type  | Description                      |
+| ----------- | ----- | -------------------------------- |
+| **RETURNS** | `str` | The analysis in UD FEATS format. |
+
+
+## MorphAnalysis.get {#get tag="method"}
+
+Retrieve the feature/value pairs for a given field.
+
+> #### Example
+>
+> ```python
+> feats = "Feat1=Val1,Val2"
+> morph = MorphAnalysis(nlp.vocab, feats)
+> assert morph.get("Feat1") == ["Feat1=Val1", "Feat1=Val2"]
+> ```
+
+| Name        | Type   | Description                        |
+| ----------- | ------ | ---------------------------------- |
+| `field`     | `str`  | The field to retrieve.             |
+| **RETURNS** | `list` | A list of the individual features. |
+
+
+## MorphAnalysis.to_dict {#to_dict tag="method"}
+
+Produce a dict representation of the analysis, in the same format as the tag
+map.
+
+> #### Example
+>
+> ```python
+> feats = "Feat1=Val1,Val2|Feat2=Val2"
+> morph = MorphAnalysis(nlp.vocab, feats)
+> assert morph.to_dict() == {"Feat1": "Val1,Val2", "Feat2": "Val2"}
+> ```
+
+| Name        | Type   | Description                              |
+| ----------- | ------ | ---------------------------------------- |
+| **RETURNS** | `dict` | The dict representation of the analysis. |
+
+
+## MorphAnalysis.from_id {#from_id tag="classmethod"}
+
+Create a morphological analysis from a given hash ID.
+
+> #### Example
+>
+> ```python
+> feats = "Feat1=Val1|Feat2=Val2"
+> hash = nlp.vocab.strings[feats]
+> morph = MorphAnalysis.from_id(nlp.vocab, hash)
+> assert str(morph) == feats
+> ```
+
+| Name    | Type    | Description                      |
+| ------- | ------- | -------------------------------- |
+| `vocab` | `Vocab` | The vocab.                       |
+| `key`   | `int`   | The hash of the features string. |
+
+
diff --git a/website/docs/api/morphology.md b/website/docs/api/morphology.md
new file mode 100644
index 000000000..ad279bff7
--- /dev/null
+++ b/website/docs/api/morphology.md
@@ -0,0 +1,165 @@
+---
+title: Morphology
+tag: class
+source: spacy/morphology.pyx
+---
+
+Store the possible morphological analyses for a language, and index them
+by hash. To save space on each token, tokens only know the hash of their
+morphological analysis, so queries of morphological attributes are delegated to
+this class.
+
+
+## Morphology.\_\_init\_\_ {#init tag="method"}
+
+Create a Morphology object using the tag map, lemmatizer and exceptions.
+
+> #### Example
+>
+> ```python
+> from spacy.morphology import Morphology
+>
+> morphology = Morphology(strings, tag_map, lemmatizer)
+> ```
+
+| Name         | Type              | Description                                                                                                  |
+| ------------ | ----------------- | ------------------------------------------------------------------------------------------------------------ |
+| `strings`    | `StringStore`     | The string store.                                                                                            |
+| `tag_map`    | `Dict[str, Dict]` | The tag map.                                                                                                 |
+| `lemmatizer` | `Lemmatizer`      | The lemmatizer.                                                                                              |
+| `exc`        | `Dict[str, Dict]` | A dictionary of exceptions in the format `{tag: {orth: {"POS": "X", "Feat1": "Val1", "Feat2": "Val2", ...}}}`. |
+| **RETURNS**  | `Morphology`      | The newly constructed object.                                                                                |
+
+
+## Morphology.add {#add tag="method"}
+
+Insert a morphological analysis in the morphology table, if not already
+present. The morphological analysis may be provided in the UD FEATS format as a
+string or in the tag map dictionary format. Returns the hash of the new
+analysis.
+
+> #### Example
+>
+> ```python
+> feats = "Feat1=Val1|Feat2=Val2"
+> hash = nlp.vocab.morphology.add(feats)
+> assert hash == nlp.vocab.strings[feats]
+> ```
+
+| Name       | Type               | Description                 |
+| ---------- | ------------------ | --------------------------- |
+| `features` | `Union[Dict, str]` | The morphological features. |
+
+
+## Morphology.get {#get tag="method"}
+
+Get the FEATS string for the hash of the morphological analysis.
+
+> #### Example
+>
+> ```python
+> feats = "Feat1=Val1|Feat2=Val2"
+> hash = nlp.vocab.morphology.add(feats)
+> assert nlp.vocab.morphology.get(hash) == feats
+> ```
+
+| Name    | Type  | Description                             |
+| ------- | ----- | --------------------------------------- |
+| `morph` | `int` | The hash of the morphological analysis. |
+
+
+## Morphology.load_tag_map {#load_tag_map tag="method"}
+
+Replace the current tag map with the provided tag map.
+
+| Name      | Type              | Description  |
+| --------- | ----------------- | ------------ |
+| `tag_map` | `Dict[str, Dict]` | The tag map. |
+
+
+## Morphology.load_morph_exceptions {#load_morph_exceptions tag="method"}
+
+Replace the current morphological exceptions with the provided exceptions.
+
+| Name          | Type              | Description                   |
+| ------------- | ----------------- | ----------------------------- |
+| `morph_rules` | `Dict[str, Dict]` | The morphological exceptions. |
+
+
+## Morphology.add_special_case {#add_special_case tag="method"}
+
+Add a special-case rule to the morphological analyzer. Tokens whose tag and
+orth match the rule will receive the specified properties.
+
+> #### Example
+>
+> ```python
+> attrs = {"POS": "DET", "Definite": "Def"}
+> morphology.add_special_case("DT", "the", attrs)
+> ```
+
+| Name       | Type   | Description                                    |
+| ---------- | ------ | ---------------------------------------------- |
+| `tag_str`  | `str`  | The fine-grained tag.                          |
+| `orth_str` | `str`  | The token text.                                |
+| `attrs`    | `dict` | The features to assign for this token and tag. |
+
+
+## Morphology.exc {#exc tag="property"}
+
+The current morphological exceptions.
+
+| Name        | Type   | Description                                         |
+| ----------- | ------ | --------------------------------------------------- |
+| **RETURNS** | `dict` | The current dictionary of morphological exceptions. |
+
+
+## Morphology.lemmatize {#lemmatize tag="method"}
+
+TODO
+
+
+## Morphology.feats_to_dict {#feats_to_dict tag="staticmethod"}
+
+Convert a string FEATS representation to a dictionary of features and values in
+the same format as the tag map.
+
+> #### Example
+>
+> ```python
+> from spacy.morphology import Morphology
+> d = Morphology.feats_to_dict("Feat1=Val1|Feat2=Val2")
+> assert d == {"Feat1": "Val1", "Feat2": "Val2"}
+> ```
+
+| Name        | Type   | Description                                                        |
+| ----------- | ------ | ------------------------------------------------------------------ |
+| `feats`     | `str`  | The morphological features in Universal Dependencies FEATS format. |
+| **RETURNS** | `dict` | The morphological features as a dictionary.                        |
+
+
+## Morphology.dict_to_feats {#dict_to_feats tag="staticmethod"}
+
+Convert a dictionary of features and values to a string FEATS representation.
+
+> #### Example
+>
+> ```python
+> from spacy.morphology import Morphology
+> f = Morphology.dict_to_feats({"Feat1": "Val1", "Feat2": "Val2"})
+> assert f == "Feat1=Val1|Feat2=Val2"
+> ```
+
+| Name         | Type             | Description                                                        |
+| ------------ | ---------------- | ------------------------------------------------------------------ |
+| `feats_dict` | `Dict[str, str]` | The morphological features as a dictionary.                        |
+| **RETURNS**  | `str`            | The morphological features in Universal Dependencies FEATS format. |
+
+
+## Attributes {#attributes}
+
+| Name          | Type  | Description                                  |
+| ------------- | ----- | -------------------------------------------- |
+| `FEATURE_SEP` | `str` | The FEATS feature separator. Default is `|`. |
+| `FIELD_SEP`   | `str` | The FEATS field separator. Default is `=`.   |
+| `VALUE_SEP`   | `str` | The FEATS value separator. Default is `,`.   |
diff --git a/website/meta/sidebars.json b/website/meta/sidebars.json
index 3fed561d0..1357c9d62 100644
--- a/website/meta/sidebars.json
+++ b/website/meta/sidebars.json
@@ -70,6 +70,7 @@
     { "text": "Token", "url": "/api/token" },
     { "text": "Span", "url": "/api/span" },
     { "text": "Lexeme", "url": "/api/lexeme" },
+    { "text": "MorphAnalysis", "url": "/api/morphanalysis" },
     { "text": "Example", "url": "/api/example" },
     { "text": "DocBin", "url": "/api/docbin" }
 ]
@@ -102,6 +103,7 @@
     { "text": "StringStore", "url": "/api/stringstore" },
     { "text": "Vectors", "url": "/api/vectors" },
     { "text": "Lookups", "url": "/api/lookups" },
+    { "text": "Morphology", "url": "/api/morphology" },
     { "text": "KnowledgeBase", "url": "/api/kb" },
     { "text": "Scorer", "url": "/api/scorer" },
     { "text": "Corpus", "url": "/api/corpus" }
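
Note on the FEATS conversions documented in `morphology.md` above: the behaviour of `feats_to_dict`, `dict_to_feats` and the three separator attributes can be illustrated without spaCy. The following is a minimal plain-Python sketch, not spaCy's implementation. The helper `expand_pairs` is hypothetical, and sorting fields alphabetically in `dict_to_feats` is an assumption based on the usual UD convention rather than something this patch states.

```python
# Illustrative stand-in only; none of this is spaCy code.
FEATURE_SEP = "|"  # separates "Field=Value" entries
FIELD_SEP = "="    # separates a field from its value(s)
VALUE_SEP = ","    # separates multiple values of one field


def feats_to_dict(feats: str) -> dict:
    """Split a FEATS string into {field: values}; multiple values stay comma-joined."""
    if not feats:
        return {}
    return dict(pair.split(FIELD_SEP, 1) for pair in feats.split(FEATURE_SEP))


def dict_to_feats(feats_dict: dict) -> str:
    """Join {field: values} into a FEATS string (assumption: fields sorted by name)."""
    return FEATURE_SEP.join(
        f"{field}{FIELD_SEP}{values}" for field, values in sorted(feats_dict.items())
    )


def expand_pairs(feats: str) -> list:
    """Hypothetical helper: one "Field=Value" string per individual value."""
    pairs = []
    for field, values in feats_to_dict(feats).items():
        for value in values.split(VALUE_SEP):
            pairs.append(f"{field}{FIELD_SEP}{value}")
    return pairs


feats = "Feat1=Val1,Val2|Feat2=Val2"
as_dict = feats_to_dict(feats)
round_trip = dict_to_feats(as_dict)
pairs = expand_pairs(feats)
```

Under these assumptions, `pairs` has three entries for `Feat1=Val1,Val2|Feat2=Val2`, which is consistent with the `len(morph) == 3` example in `morphanalysis.md`, and `round_trip` reproduces the input string.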