mirror of https://github.com/explosion/spaCy.git
synced 2024-12-26 01:46:28 +03:00

Update text, examples, typos, wording and formatting

parent f8185b8e11
commit 69bda9aed7
@@ -4,7 +4,7 @@ include ../../_includes/_mixins

 p
     | As of v2.0, spaCy comes with a built-in visualization suite. For more
-    | info and examples, see the usage workflow on
+    | info and examples, see the usage guide on
     | #[+a("/docs/usage/visualizers") visualizing spaCy].

@@ -2,6 +2,8 @@

 include ../../_includes/_mixins

++under-construction
+
 +h(2, "comparison") Feature comparison

 p

@@ -79,7 +79,7 @@ p Find all token sequences matching the supplied patterns on the #[code Doc].
     | #[+api("matcher#add") #[code add]]. This allows you to define custom
     | actions per pattern within the same matcher. For example, you might only
     | want to merge some entity types, and set custom flags for other matched
-    | patterns. For more details and examples, see the usage workflow on
+    | patterns. For more details and examples, see the usage guide on
     | #[+a("/docs/usage/rule-based-matching") rule-based matching].

 +h(2, "pipe") Matcher.pipe

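The "custom actions per pattern" mentioned in this hunk are on_match callbacks
passed to Matcher.add. A minimal sketch of that pattern against the spaCy v2
API (the merge_ents name and the example pattern are illustrative, not from
the commit):

    import spacy
    from spacy.matcher import Matcher

    nlp = spacy.load('en')
    matcher = Matcher(nlp.vocab)

    def merge_ents(matcher, doc, i, matches):
        # called once per match; i indexes into the full list of
        # (match_id, start, end) tuples found in the doc
        match_id, start, end = matches[i]
        doc[start:end].merge()  # merge the matched tokens into one

    # each pattern can get its own callback within the same matcher
    matcher.add('SF', merge_ents, [{'LOWER': 'san'}, {'LOWER': 'francisco'}])
    matches = matcher(nlp(u'I like San Francisco'))
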
@@ -175,7 +175,7 @@ p

 p
     | Add a special-case tokenization rule. This mechanism is also used to add
-    | custom tokenizer exceptions to the language data. See the usage workflow
+    | custom tokenizer exceptions to the language data. See the usage guide
     | on #[+a("/docs/usage/adding-languages#tokenizer-exceptions") adding languages]
     | for more details and examples.

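For reference, the special-case mechanism documented here looks like this in
use (a sketch against the spaCy v2 API):

    import spacy
    from spacy.attrs import ORTH, LEMMA

    nlp = spacy.load('en')
    # split the single string "gimme" into two tokens, attaching the
    # lemma "give" to the first one
    nlp.tokenizer.add_special_case(u'gimme',
        [{ORTH: u'gim', LEMMA: u'give'}, {ORTH: u'me'}])
    assert [t.text for t in nlp(u'gimme that')] == [u'gim', u'me', u'that']
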
@@ -34,7 +34,7 @@ p

 +infobox
     | For more details on the language-specific data, see the
-    | usage workflow on #[+a("/docs/usage/adding-languages") adding languages].
+    | usage guide on #[+a("/docs/usage/adding-languages") adding languages].

 +h(2, "special-cases") Adding special case tokenization rules

@@ -201,7 +201,7 @@ p

 +infobox
     | For more details and examples, see the
-    | #[+a("/docs/usage/visualizers") usage workflow on visualizing spaCy]. You
+    | #[+a("/docs/usage/visualizers") usage guide on visualizing spaCy]. You
     | can also test displaCy in our #[+a(DEMOS_URL + "/displacy", true) online demo].

 +h(2, "disabling") Disabling the parser

@@ -248,7 +248,7 @@ p

 p
     | For more details and examples, see the
-    | #[+a("/docs/usage/visualizers") usage workflow on visualizing spaCy].
+    | #[+a("/docs/usage/visualizers") usage guide on visualizing spaCy].

 +code("Named Entity example").
     import spacy

@@ -4,7 +4,8 @@ include ../../_includes/_mixins

 p
     | The following examples and code snippets give you an overview of spaCy's
-    | functionality and its usage.
+    | functionality and its usage. If you're new to spaCy, make sure to check
+    | out the #[+a("/docs/usage/spacy-101") spaCy 101 guide].

 +h(2, "models") Install models and process text

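The section this hunk opens covers the standard install-and-load steps; they
look roughly like this (spaCy v2):

    # shell: python -m spacy download en

    import spacy

    nlp = spacy.load('en')    # load the downloaded English model
    doc = nlp(u'Hello, world. Here are two sentences.')
    print([sent.text for sent in doc.sents])
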
@@ -80,13 +81,13 @@ p

 +code.
     doc = nlp(u'San Francisco considers banning sidewalk delivery robots')
-    ents = [(e.text, e.start_char, e.end_char, e.label_) for e in doc.ents]
+    ents = [(ent.text, ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]
     assert ents == [(u'San Francisco', 0, 13, u'GPE')]

     from spacy.tokens import Span
     doc = nlp(u'Netflix is hiring a new VP of global policy')
     doc.ents = [Span(doc, 0, 1, label=doc.vocab.strings[u'ORG'])]
-    ents = [(e.start_char, e.end_char, e.label_) for ent in doc.ents]
+    ents = [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]
     assert ents == [(0, 7, u'ORG')]

 +infobox

@@ -95,6 +96,42 @@ p

 +h(2, "displacy") Visualize a dependency parse and named entities in your browser
 +tag-model("dependency parse", "NER")

++aside
+    .u-text-center(style="overflow: auto").
+        <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" class="o-svg" viewBox="270 35 125 240" width="400" height="150" style="max-width: none; color: #fff; background: #1a1e23; font-family: inherit; font-size: 2rem">
+            <text fill="currentColor" text-anchor="middle" y="222.0">
+                <tspan style="font-weight: bold" fill="currentColor" x="50">This</tspan>
+                <tspan dy="2em" class="u-color-theme" style="font-weight: bold" fill="currentColor" x="50">DT</tspan>
+            </text>
+            <text fill="currentColor" text-anchor="middle" y="222.0">
+                <tspan style="font-weight: bold" fill="currentColor" x="225">is</tspan>
+                <tspan dy="2em" class="u-color-theme" style="font-weight: bold" fill="currentColor" x="225">VBZ</tspan>
+            </text>
+            <text fill="currentColor" text-anchor="middle" y="222.0">
+                <tspan style="font-weight: bold" fill="currentColor" x="400">a</tspan>
+                <tspan dy="2em" class="u-color-theme" style="font-weight: bold" fill="currentColor" x="400">DT</tspan>
+            </text>
+            <text fill="currentColor" text-anchor="middle" y="222.0">
+                <tspan style="font-weight: bold" fill="currentColor" x="575">sentence.</tspan>
+                <tspan dy="2em" class="u-color-theme" style="font-weight: bold" fill="currentColor" x="575">NN</tspan>
+            </text>
+            <path id="arrow-0-0" stroke-width="2px" d="M70,177.0 C70,89.5 220.0,89.5 220.0,177.0" fill="none" stroke="currentColor"/>
+            <text dy="1.25em" style="font-size: 0.9em; letter-spacing: 2px">
+                <textPath xlink:href="#arrow-0-0" startOffset="50%" fill="currentColor" text-anchor="middle">nsubj</textPath>
+            </text>
+            <path d="M70,179.0 L62,167.0 78,167.0" fill="currentColor"/>
+            <path id="arrow-0-1" stroke-width="2px" d="M420,177.0 C420,89.5 570.0,89.5 570.0,177.0" fill="none" stroke="currentColor"/>
+            <text dy="1.25em" style="font-size: 0.9em; letter-spacing: 2px">
+                <textPath xlink:href="#arrow-0-1" startOffset="50%" fill="currentColor" text-anchor="middle">det</textPath>
+            </text>
+            <path d="M420,179.0 L412,167.0 428,167.0" fill="currentColor"/>
+            <path id="arrow-0-2" stroke-width="2px" d="M245,177.0 C245,2.0 575.0,2.0 575.0,177.0" fill="none" stroke="currentColor"/>
+            <text dy="1.25em" style="font-size: 0.9em; letter-spacing: 2px">
+                <textPath xlink:href="#arrow-0-2" startOffset="50%" fill="currentColor" text-anchor="middle">attr</textPath>
+            </text>
+            <path d="M575.0,179.0 L583.0,167.0 567.0,167.0" fill="currentColor"/>
+        </svg>
+
 +code.
     from spacy import displacy

@@ -158,7 +195,7 @@ p

     pattern1 = [{'ORTH': 'Google'}, {'UPPER': 'I'}, {'ORTH': '/'}, {'UPPER': 'O'}]
     pattern2 = [[{'ORTH': emoji, 'OP': '+'}] for emoji in ['😀', '😂', '🤣', '😍']]
     matcher.add('GoogleIO', None, pattern1) # match "Google I/O" or "Google i/o"
-    matcher.add('HAPPY', set_sentiment, pattern2) # match one or more happy emoji
+    matcher.add('HAPPY', set_sentiment, *pattern2) # match one or more happy emoji
     matches = nlp(LOTS_OF TEXT)

 +infobox

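The *pattern2 change in this hunk is a real bug fix: Matcher.add takes each
pattern as a separate positional argument, so a list of several patterns has
to be unpacked. A minimal sketch of the difference (spaCy v2 API):

    from spacy.matcher import Matcher
    from spacy.vocab import Vocab

    matcher = Matcher(Vocab())
    patterns = [[{'ORTH': e, 'OP': '+'}] for e in ['😀', '😂']]
    # matcher.add('HAPPY', None, patterns)   # wrong: one malformed pattern
    matcher.add('HAPPY', None, *patterns)    # right: one pattern per argument
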
@@ -141,7 +141,7 @@ p

     html = displacy.render(doc, style='ent', page=True,
                            options={'ents': ['EVENT']})

-    | For more info and examples, see the usage workflow on
+    | For more info and examples, see the usage guide on
     | #[+a("/docs/usage/visualizers") visualizing spaCy].

 p

@@ -151,7 +151,7 @@ p

 +infobox("Custom models with pipeline components")
     | For more details and an example of how to package a sentiment model
-    | with a custom pipeline component, see the usage workflow on
+    | with a custom pipeline component, see the usage guide on
     | #[+a("/docs/usage/language-processing-pipeline#example2") language processing pipelines].

 +h(3, "models-building") Building the model package

@@ -16,59 +16,67 @@ include ../../_includes/_mixins

 +table(["Name", "Description", "Needs model"])
     +row
         +cell #[strong Tokenization]
-        +cell
+        +cell Segmenting text into words, punctuations marks etc.
         +cell #[+procon("con")]

     +row
-        +cell #[strong Part-of-speech Tagging]
-        +cell
+        +cell #[strong Part-of-speech] (POS) #[strong Tagging]
+        +cell Assigning word types to tokens, like verb or noun.
         +cell #[+procon("pro")]

     +row
         +cell #[strong Dependency Parsing]
         +cell
+            | Assigning syntactic dependency labels, i.e. the relations between
+            | individual tokens.
         +cell #[+procon("pro")]

     +row
-        +cell #[strong Sentence Boundary Detection]
-        +cell
+        +cell #[strong Sentence Boundary Detection] (SBD)
+        +cell Finding and segmenting individual sentences.
         +cell #[+procon("pro")]

     +row
         +cell #[strong Named Entity Recongition] (NER)
         +cell
+            | Labelling named "real-world" objects, like persons, companies or
+            | locations.
         +cell #[+procon("pro")]

     +row
         +cell #[strong Rule-based Matching]
         +cell
+            | Finding sequences of tokens based on their texts and linguistic
+            | annotations, similar to regular expressions.
         +cell #[+procon("con")]

     +row
         +cell #[strong Similarity]
         +cell
+            | Comparing words, text spans and documents and how similar they
+            | are to each other.
         +cell #[+procon("pro")]

     +row
         +cell #[strong Training]
-        +cell
+        +cell Updating and improving a statistical model's predictions.
         +cell #[+procon("neutral")]

     +row
         +cell #[strong Serialization]
-        +cell
+        +cell Saving objects to files or byte strings.
         +cell #[+procon("neutral")]

 +h(2, "annotations") Linguistic annotations

 p
-    | spaCy provides a variety of linguistic annotations to give you insights
-    | into a text's grammatical structure. This includes the word types,
-    | i.e. the parts of speech, and how the words are related to each other.
-    | For example, if you're analysing text, it makes a huge difference
-    | whether a noun is the subject of a sentence, or the object – or whether
-    | "google" is used as a verb, or refers to the website or company in a
-    | specific context.
+    | spaCy provides a variety of linguistic annotations to give you
+    | #[strong insights into a text's grammatical structure]. This includes the
+    | word types, like the parts of speech, and how the words are related to
+    | each other. For example, if you're analysing text, it makes a huge
+    | difference whether a noun is the subject of a sentence, or the object –
+    | or whether "google" is used as a verb, or refers to the website or
+    | company in a specific context.

 p
     | Once you've downloaded and installed a #[+a("/docs/usage/models") model],

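To make the annotations described in that paragraph concrete, a short sketch
(spaCy v2; the exact output depends on the model):

    import spacy

    nlp = spacy.load('en')
    doc = nlp(u'Apple is looking at buying a U.K. startup')
    for token in doc:
        # word type (part of speech) and relation to the head token
        print(token.text, token.pos_, token.dep_, token.head.text)
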
@@ -223,6 +231,15 @@ include _spacy-101/_training
         | Segment text, and create #[code Doc] objects with the discovered
         | segment boundaries.

+    +row
+        +cell #[+api("matcher") #[code Matcher]]
+        +cell
+            | Match sequences of tokens, based on pattern rules, similar to
+            | regular expressions.
+
++h(3, "architecture-pipeline") Pipeline components
+
++table(["Name", "Description"])
     +row
         +cell #[+api("tagger") #[code Tagger]]
         +cell Annotate part-of-speech tags on #[code Doc] objects.

@@ -237,15 +254,13 @@ include _spacy-101/_training
         | Annotate named entities, e.g. persons or products, on #[code Doc]
         | objects.

-    +row
-        +cell #[+api("matcher") #[code Matcher]]
-        +cell
-            | Match sequences of tokens, based on pattern rules, similar to
-            | regular expressions.
-
-+h(3, "architecture-other") Other
++h(3, "architecture-other") Other classes

 +table(["Name", "Description"])
+    +row
+        +cell #[+api("binder") #[code Binder]]
+        +cell
+
     +row
         +cell #[+api("goldparse") #[code GoldParse]]
         +cell Collection for training annotations.

@@ -1,7 +1,7 @@
 include ../../_includes/_mixins

 p
-    | This workflow describes how to train new statistical models for spaCy's
+    | This guide describes how to train new statistical models for spaCy's
     | part-of-speech tagger, named entity recognizer and dependency parser.
     | Once the model is trained, you can then
     | #[+a("/docs/usage/saving-loading") save and load] it.

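The workflow in question boils down to repeated calls to nlp.update; a
condensed sketch of the spaCy v2 training loop (the ANIMAL label and the
one-example dataset are illustrative only):

    import random
    import spacy

    TRAIN_DATA = [(u'Horses are too tall', {'entities': [(0, 6, 'ANIMAL')]})]

    nlp = spacy.blank('en')
    ner = nlp.create_pipe('ner')
    nlp.add_pipe(ner)
    ner.add_label('ANIMAL')

    optimizer = nlp.begin_training()
    for i in range(10):
        random.shuffle(TRAIN_DATA)
        for text, annotations in TRAIN_DATA:
            # update the model's weights from one annotated example
            nlp.update([text], [annotations], sgd=optimizer, drop=0.35)
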
@@ -61,7 +61,7 @@ p

 p.o-inline-list
     +button(gh("spaCy", "examples/training/train_new_entity_type.py"), true, "secondary") Full example
-    +button("/docs/usage/training-ner", false, "secondary") Usage Workflow
+    +button("/docs/usage/training-ner", false, "secondary") Usage guide

 +h(2, "train-dependency") Training the dependency parser

@@ -8,6 +8,20 @@ p

 +h(2, "features") New features

+p
+    | This section contains an overview of the most important
+    | #[strong new features and improvements]. The #[+a("/docs/api") API docs]
+    | include additional deprecation notes. New methods and functions that
+    | were introduced in this version are marked with a #[+tag-new(2)] tag.
+
+p
+    | To help you make the most of v2.0, we also
+    | #[strong re-wrote almost all of the usage guides and API docs], and added
+    | more real-world examples. If you're new to spaCy, or just want to brush
+    | up on some NLP basics and the details of the library, check out
+    | the #[+a("/docs/usage/spacy-101") spaCy 101 guide] that explains the most
+    | important concepts with examples and illustrations.
+
 +h(3, "features-pipelines") Improved processing pipelines

 +aside-code("Example").

@@ -97,9 +111,6 @@ p
     | complex regular expressions. The language data has also been tidied up
     | and simplified. spaCy now also supports simple lookup-based lemmatization.

-+image
-    include ../../assets/img/docs/language_data.svg
-
 +infobox
     | #[strong API:] #[+api("language") #[code Language]]
     | #[strong Code:] #[+src(gh("spaCy", "spacy/lang")) spacy/lang]

@@ -126,10 +137,18 @@ p
     | #[strong API:] #[+api("matcher") #[code Matcher]]
     | #[strong Usage:] #[+a("/docs/usage/rule-based-matching") Rule-based matching]

-+h(3, "features-models") Neural network models for English, German, French and Spanish
++h(3, "features-models") Neural network models for English, German, French, Spanish and multi-language NER
+
++aside-code("Example", "bash").
+    python -m spacy download en    # default English model
+    python -m spacy download de    # default German model
+    python -m spacy download fr    # default French model
+    python -m spacy download es    # default Spanish model
+    python -m spacy download xx_ent_web_md    # multi-language NER

 +infobox
     | #[strong Details:] #[+src(gh("spacy-models")) spacy-models]
+    | #[+a("/docs/api/language-models") Languages]
     | #[strong Usage:] #[+a("/docs/usage/models") Models]

 +h(2, "incompat") Backwards incompatibilities

@@ -147,6 +166,10 @@ p
         +cell #[code spacy.orth]
         +cell #[code spacy.lang.xx.lex_attrs]

+    +row
+        +cell #[code cli.model]
+        +cell -
+
     +row
         +cell #[code Language.save_to_directory]
         +cell #[+api("language#to_disk") #[code Language.to_disk]]

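Given the Language.save_to_directory → Language.to_disk replacement listed in
this table, migrating existing code looks roughly like this (v2 API):

    import spacy

    nlp = spacy.load('en')
    nlp.to_disk('/tmp/my_model')         # was: nlp.save_to_directory(...)
    nlp = spacy.load('/tmp/my_model')    # load the saved model back
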
@@ -58,6 +58,11 @@ p
     | The argument #[code options] lets you specify a dictionary of settings
     | to customise the layout, for example:

++aside("Important note")
+    | There's currently a known issue with the #[code compact] mode for long
+    | sentences with arrow spacing. If the spacing is larger than the arc
+    | itself, it'll cause the arc and its label to flip.
+
 +table(["Name", "Type", "Description", "Default"])
     +row
         +cell #[code compact]

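For reference, an options dictionary like the one this table documents is
passed as follows (a sketch; compact is the real option name, the example
text is arbitrary):

    import spacy
    from spacy import displacy

    nlp = spacy.load('en')
    doc = nlp(u'This is a sentence.')
    # serve the dependency visualization in compact mode
    displacy.serve(doc, style='dep', options={'compact': True})
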
@@ -330,11 +335,12 @@ p
     | It's certainly possible to just have your server return the markup.
     | But outputting raw, unsanitised HTML is risky and makes your app vulnerable to
     | #[+a("https://en.wikipedia.org/wiki/Cross-site_scripting") cross-site scripting]
-    | (XSS). All your user needs to do is find a way to make spaCy return one
-    | token #[code <script src="malicious-code.js"><script>].
-    | Instead of relying on the server to render and sanitize HTML, you
-    | can do this on the client in JavaScript. displaCy.js creates
-    | the markup as DOM nodes and will never insert raw HTML.
+    | (XSS). All your user needs to do is find a way to make spaCy return text
+    | like #[code <script src="malicious-code.js"><script>], which
+    | is pretty easy in NER mode. Instead of relying on the server to render
+    | and sanitise HTML, you can do this on the client in JavaScript.
+    | displaCy.js creates the markup as DOM nodes and will never insert raw
+    | HTML.

 p
     | The #[code parse_deps] function takes a #[code Doc] object and returns
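
The sentence the hunk cuts off is describing displacy.parse_deps, which
produces the data displaCy.js renders client-side; a sketch of that flow
(spaCy v2 API):

    import json
    import spacy
    from spacy import displacy

    nlp = spacy.load('en')
    doc = nlp(u'This is a sentence.')
    # dict with 'words' and 'arcs' keys, ready to be served as JSON
    # for displaCy.js to render as DOM nodes in the browser
    parsed = displacy.parse_deps(doc)
    print(json.dumps(parsed))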