mirror of https://github.com/explosion/spaCy.git
synced 2025-04-27 04:13:41 +03:00

Update docs [ci skip]

parent dd84577a98
commit f31c4462ca
@@ -82,6 +82,14 @@ check whether a [`Doc`](/api/doc) object has been parsed with the
 `doc.is_parsed` attribute, which returns a boolean value. If this attribute is
 `False`, the default sentence iterator will raise an exception.
 
+<Infobox title="Dependency label scheme" emoji="📖">
+
+For a list of the syntactic dependency labels assigned by spaCy's models across
+different languages, see the label schemes documented in the
+[models directory](/models).
+
+</Infobox>
+
 ### Noun chunks {#noun-chunks}
 
 Noun chunks are "base noun phrases" – flat phrases that have a noun as their
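As an editorial aside (not part of the commit): if no dependency parser has run, sentence boundaries can instead come from the rule-based `sentencizer` component, so the exception mentioned above never triggers. A minimal sketch, assuming spaCy v3's string-based `nlp.add_pipe`; the example text is made up:

```python
import spacy

# A blank pipeline has no parser, so iterating `doc.sents` would normally
# raise an error. The rule-based "sentencizer" sets sentence boundaries
# without any statistical model.
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")

doc = nlp("This is a sentence. This is another sentence.")
print([sent.text for sent in doc.sents])
```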

@@ -288,11 +296,45 @@ for token in doc:
 | their | `ADJ` | `poss` | requests |
 | requests | `NOUN` | `dobj` | submit |
 
-<Infobox title="Dependency label scheme" emoji="📖">
-
-For a list of the syntactic dependency labels assigned by spaCy's models across
-different languages, see the label schemes documented in the
-[models directory](/models).
-
+The dependency parse can be a useful tool for **information extraction**,
+especially when combined with other predictions like
+[named entities](#named-entities). The following example extracts money and
+currency values, i.e. entities labeled as `MONEY`, and then uses the dependency
+parse to find the noun phrase they are referring to – for example `"Net income"`
+→ `"$9.4 million"`.
+
+```python
+### {executable="true"}
+import spacy
+
+nlp = spacy.load("en_core_web_sm")
+# Merge noun phrases and entities for easier analysis
+nlp.add_pipe("merge_entities")
+nlp.add_pipe("merge_noun_chunks")
+
+TEXTS = [
+    "Net income was $9.4 million compared to the prior year of $2.7 million.",
+    "Revenue exceeded twelve billion dollars, with a loss of $1b.",
+]
+for doc in nlp.pipe(TEXTS):
+    for token in doc:
+        if token.ent_type_ == "MONEY":
+            # We have an attribute and direct object, so check for subject
+            if token.dep_ in ("attr", "dobj"):
+                subj = [w for w in token.head.lefts if w.dep_ == "nsubj"]
+                if subj:
+                    print(subj[0], "-->", token)
+            # We have a prepositional object with a preposition
+            elif token.dep_ == "pobj" and token.head.dep_ == "prep":
+                print(token.head.head, "-->", token)
+```
+
+<Infobox title="Combining models and rules" emoji="📖">
+
+For more examples of how to write rule-based information extraction logic that
+takes advantage of the model's predictions produced by the different components,
+see the usage guide on
+[combining models and rules](/usage/rule-based-matching#models-rules).
+
 </Infobox>
 
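For a taste of the rule-based side referenced in that infobox, here is a minimal `Matcher` sketch (an editorial illustration, not part of the commit). It uses a blank pipeline, so no trained model is needed; the pattern name and example text are made up:

```python
import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)
# Token pattern: the words "net" + "income" in any casing
matcher.add("NET_INCOME", [[{"LOWER": "net"}, {"LOWER": "income"}]])

doc = nlp("Net income was $9.4 million compared to the prior year.")
for match_id, start, end in matcher(doc):
    print(doc[start:end].text)  # Net income
```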

@@ -545,7 +587,7 @@ identifier from a knowledge base (KB). You can create your own
 [train a new Entity Linking model](/usage/training#entity-linker) using that
 custom-made KB.
 
-### Accessing entity identifiers {#entity-linking-accessing}
+### Accessing entity identifiers {#entity-linking-accessing model="entity linking"}
 
 The annotated KB identifier is accessible as either a hash value or as a string,
 using the attributes `ent.kb_id` and `ent.kb_id_` of a [`Span`](/api/span)
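A minimal sketch of that hash/string duality (an editorial aside, not part of the commit): here the KB identifier is set by hand on a blank pipeline rather than predicted by a trained entity linker, reusing the Q-IDs from the surrounding example:

```python
import spacy
from spacy.tokens import Span

nlp = spacy.blank("en")
doc = nlp("Ada Lovelace was born in London")
# Attach the KB identifier manually – normally the entity linker predicts it
doc.ents = [Span(doc, 0, 2, label="PERSON", kb_id="Q7259")]

ent = doc.ents[0]
# kb_id_ is the string, kb_id the corresponding hash value
print(ent.text, ent.kb_id_, ent.kb_id)
```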

@@ -571,15 +613,6 @@ print(ent_ada_1)  # ['Lovelace', 'PERSON', 'Q7259']
 print(ent_london_5)  # ['London', 'GPE', 'Q84']
 ```
-
-| Text     | ent_type\_ | ent_kb_id\_ |
-| -------- | ---------- | ----------- |
-| Ada      | `"PERSON"` | `"Q7259"`   |
-| Lovelace | `"PERSON"` | `"Q7259"`   |
-| was      | -          | -           |
-| born     | -          | -           |
-| in       | -          | -           |
-| London   | `"GPE"`    | `"Q84"`     |
 
 ## Tokenization {#tokenization}
 
 Tokenization is the task of splitting a text into meaningful segments, called
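Tokenization can be tried without a trained pipeline, since spaCy's tokenizer is rule-based. A small sketch (an editorial aside, not part of the commit), using the sentence from spaCy's own introductory docs:

```python
import spacy

# A blank English pipeline still ships the full rule-based tokenizer,
# including exceptions like "Let's" -> "Let", "'s" and "N.Y." kept whole.
nlp = spacy.blank("en")
doc = nlp("Let's go to N.Y.!")
print([token.text for token in doc])  # ['Let', "'s", 'go', 'to', 'N.Y.', '!']
```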