| ---
 | ||
| title: Styleguide
 | ||
| section: styleguide
 | ||
| search_exclude: true
 | ||
| menu:
 | ||
|   - ['Logo', 'logo']
 | ||
|   - ['Colors', 'colors']
 | ||
|   - ['Typography', 'typography']
 | ||
|   - ['Elements', 'elements']
 | ||
|   - ['Components', 'components']
 | ||
|   - ['Markdown Reference', 'markdown']
 | ||
|   - ['Editorial', 'editorial']
 | ||
| sidebar:
 | ||
|   - label: Styleguide
 | ||
|     items:
 | ||
|       - text: ''
 | ||
|         url: '/styleguide'
 | ||
|   - label: Resources
 | ||
|     items:
 | ||
|       - text: Website Source
 | ||
|         url: https://github.com/explosion/spacy/tree/master/website
 | ||
|       - text: Contributing Guide
 | ||
|         url: https://github.com/explosion/spaCy/blob/master/CONTRIBUTING.md
 | ||
| ---
 | ||
| 
 | ||
| The [spacy.io](https://spacy.io) website is implemented using
 | ||
| [Next.js](https://nextjs.org) with
 | ||
| [Remark](https://github.com/remarkjs/remark) and [MDX](https://mdxjs.com/). This
 | ||
| allows authoring content in **straightforward Markdown** without the usual
 | ||
| limitations. Standard elements can be overwritten with powerful
 | ||
| [React](http://reactjs.org/) components and wherever Markdown syntax isn't
 | ||
| enough, JSX components can be used.
 | ||
| 
 | ||
| > #### Contributing to the site
 | ||
| >
 | ||
| > The docs can always use another example or more detail, and they should always
 | ||
| > be up to date and not misleading. We always appreciate a
 | ||
| > [pull request](https://github.com/explosion/spaCy/pulls). To quickly find the
 | ||
| > correct file to edit, simply click on the "Suggest edits" button at the bottom
 | ||
| > of a page.
 | ||
| >
 | ||
| > For more details on editing the site locally, see the installation
 | ||
| > instructions and markdown reference below.
 | ||
| 
 | ||
| ## Logo {id="logo",source="website/src/images/logo.svg"}
 | ||
| 
 | ||
| If you would like to use the spaCy logo on your site, please get in touch and
 | ||
| ask us first. However, if you want to show support and tell others that your
 | ||
| project is using spaCy, you can grab one of our
 | ||
| [spaCy badges](/usage/spacy-101#faq-project-with-spacy).
 | ||
| 
 | ||
| <Logos />
 | ||
| 
 | ||
| ## Colors {id="colors"}
 | ||
| 
 | ||
| <Colors />
 | ||
| 
 | ||
| ### Patterns
 | ||
| 
 | ||
| <Patterns />
 | ||
| 
 | ||
| ## Typography {id="typography"}
 | ||
| 
 | ||
| > #### Markdown
 | ||
| >
 | ||
| > ```markdown
 | ||
| > ## Headline 2
 | ||
| >
 | ||
| > ## Headline 2 {id="some_id"}
 | ||
| >
 | ||
| > ## Headline 2 {id="some_id" tag="method"}
 | ||
| > ```
 | ||
| >
 | ||
| > #### JSX
 | ||
| >
 | ||
| > ```jsx
 | ||
| > <H2>Headline 2</H2>
 | ||
| > <H2 id="some_id">Headline 2</H2>
 | ||
| > <H2 id="some_id" tag="method">Headline 2</H2>
 | ||
| > ```
 | ||
| 
 | ||
| Headlines are set in
 | ||
| [HK Grotesk](http://cargocollective.com/hanken/HK-Grotesk-Open-Source-Font) by
 | ||
| Hanken Design. All other body text uses the best-matching default system font
 | ||
| to provide a "native" reading experience. All code uses the
 | ||
| [JetBrains Mono](https://www.jetbrains.com/lp/mono/) typeface by JetBrains.
 | ||
| 
 | ||
| <Infobox title="Important note" variant="warning">
 | ||
| 
 | ||
| Level 2 headings are automatically wrapped in `<section>` elements at compile
 | ||
| time, using a custom
 | ||
| [Markdown transformer](https://github.com/explosion/spaCy/tree/master/website/plugins/remark-wrap-section.js).
 | ||
| This makes it easier to highlight the section that's currently in the viewport
 | ||
| in the sidebar menu.
 | ||
| 
 | ||
| </Infobox>
 | ||
| 
 | ||
| <div>
 | ||
|   <H2>Headline 2</H2>
 | ||
|   <H3>Headline 3</H3>
 | ||
|   <H4>Headline 4</H4>
 | ||
|   <H5>Headline 5</H5>
 | ||
|   <Label>Label</Label>
 | ||
| </div>
 | ||
| 
 | ||
| ---
 | ||
| 
 | ||
| The following optional attributes can be set on the headline to modify it, for
 | ||
| example to add a tag for the documented type, or to mark features that were
 | ||
| introduced in a specific version or that require statistical models to be loaded.
 | ||
| Tags are also available as standalone `<Tag />` components.
 | ||
| 
 | ||
| | Argument  | Example                    | Result                                    |
 | ||
| | --------- | -------------------------- | ----------------------------------------- |
 | ||
| | `tag`     | `{tag="method"}`           | <Tag>method</Tag>                         |
 | ||
| | `version` | `{version="3"}`            | <Tag variant="new">3</Tag>                |
 | ||
| | `model`   | `{model="tagger, parser"}` | <Tag variant="model">tagger, parser</Tag> |
 | ||
| | `hidden`  | `{hidden="true"}`          |                                           |
 | ||
| 
 | ||
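As a rough illustration of how such a trailing attribute block could be read, here is a minimal Python sketch. It is not the site's actual plugin: the function name `parse_heading_attributes` and both regular expressions are assumptions made for this example only.

```python
import re

# Hypothetical helper, not the site's real plugin: split a Markdown headline
# into its text and the optional trailing {key="value"} attribute block.
ATTR_BLOCK = re.compile(r"^(?P<text>.*?)\s*\{(?P<attrs>[^{}]*)\}\s*$")
ATTR_PAIR = re.compile(r'(\w+)="([^"]*)"')


def parse_heading_attributes(line):
    """Return (headline text, attribute dict) for a single headline line."""
    # Drop the leading "#" markers and surrounding whitespace.
    content = line.lstrip("#").strip()
    match = ATTR_BLOCK.match(content)
    if match is None:
        # No attribute block: the whole line is headline text.
        return content, {}
    # Pairs may be separated by spaces or commas; only key="value" is read.
    attrs = dict(ATTR_PAIR.findall(match.group("attrs")))
    return match.group("text"), attrs


text, attrs = parse_heading_attributes('## Headline 2 {id="some_id" tag="method"}')
print(text, attrs)
```

Values are taken verbatim from inside the quotes, so a value such as `tagger, parser` stays a single string.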
| ## Elements {id="elements"}
 | ||
| 
 | ||
| ### Links {id="links"}
 | ||
| 
 | ||
| > #### Markdown
 | ||
| >
 | ||
| > ```markdown
 | ||
| > [I am a link](https://spacy.io)
 | ||
| > ```
 | ||
| >
 | ||
| > #### JSX
 | ||
| >
 | ||
| > ```jsx
 | ||
| > <Link to="https://spacy.io">I am a link</Link>
 | ||
| > ```
 | ||
| 
 | ||
| Special link styles are used depending on the link URL.
 | ||
| 
 | ||
| - [I am a regular external link](https://explosion.ai)
 | ||
| - [I am a link to the documentation](/api/doc)
 | ||
| - [I am a link to an architecture](/api/architectures#HashEmbedCNN)
 | ||
| - [I am a link to a model](/models/en#en_core_web_sm)
 | ||
| - [I am a link to GitHub](https://github.com/explosion/spaCy)
 | ||
| 
 | ||
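The five examples above suggest that the style is chosen from the URL alone. The following Python sketch shows one way such a decision could be made. The category names and the function `classify_link` are hypothetical, and the rules are inferred from the example links, not taken from the site's `Link` component.

```python
from urllib.parse import urlparse


def classify_link(url):
    """Guess a link style from the URL (assumed rules, for illustration only)."""
    parsed = urlparse(url)
    if parsed.netloc:
        # Absolute URLs point away from the site.
        host = parsed.netloc.lower()
        if host == "github.com" or host.endswith(".github.com"):
            return "github"
        return "external"
    # Relative URLs are internal; the path prefix decides the style.
    if parsed.path.startswith("/api/architectures"):
        return "architecture"
    if parsed.path.startswith("/api"):
        return "api"
    if parsed.path.startswith("/models"):
        return "model"
    return "internal"


for url in ["https://explosion.ai", "/api/doc", "/models/en#en_core_web_sm"]:
    print(url, "->", classify_link(url))
```

The more specific `/api/architectures` prefix is checked before the general `/api` prefix, otherwise architecture links would be classified as plain API links.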
| ### Abbreviations {id="abbr"}
 | ||
| 
 | ||
| > #### JSX
 | ||
| >
 | ||
| > ```jsx
 | ||
| > <Abbr title="Explanation">Abbreviation</Abbr>
 | ||
| > ```
 | ||
| 
 | ||
| Some text with <Abbr title="Explanation here">an abbreviation</Abbr>. On small
 | ||
| screens, I collapse and the explanation text is displayed next to the
 | ||
| abbreviation.
 | ||
| 
 | ||
| ### Tags {id="tags"}
 | ||
| 
 | ||
| > ```jsx
 | ||
| > <Tag>method</Tag>
 | ||
| > <Tag variant="new">4</Tag>
 | ||
| > <Tag variant="model">tagger, parser</Tag>
 | ||
| > ```
 | ||
| 
 | ||
| Tags can be used together with headlines, or next to properties across the
 | ||
| documentation, and combined with tooltips to provide additional information. An
 | ||
| optional `variant` argument can be used for special tags. `variant="new"` makes
 | ||
| the tag take a version number to mark new features. Using the component,
 | ||
| visibility of this tag can later be toggled once the feature isn't considered
 | ||
| new anymore. Setting `variant="model"` takes a description of model capabilities
 | ||
| and can be used to mark features that require a respective model to be
 | ||
| installed.
 | ||
| 
 | ||
| <p>
 | ||
|   <Tag>method</Tag>
 | ||
|   <Tag variant="new">4</Tag>
 | ||
|   <Tag variant="model">tagger, parser</Tag>
 | ||
| </p>
 | ||
| 
 | ||
| ### Buttons {id="buttons"}
 | ||
| 
 | ||
| > ```jsx
 | ||
| > <Button to="#" variant="primary">Primary small</Button>
 | ||
| > <Button to="#" variant="secondary">Secondary small</Button>
 | ||
| > ```
 | ||
| 
 | ||
| Link buttons come in two variants, `primary` and `secondary`, and two sizes, with
 | ||
| an optional `large` size modifier. Since they're mostly used as enhanced links,
 | ||
| the buttons are implemented as styled links instead of native button elements.
 | ||
| 
 | ||
| <p>
 | ||
| <Button to="#" variant="primary">Primary small</Button>
 | ||
| 
 | ||
| {' '}
 | ||
| 
 | ||
| <Button to="#" variant="secondary">Secondary small</Button>
 | ||
| </p>
 | ||
| 
 | ||
| <p>
 | ||
| <Button to="#" variant="primary" large>Primary large</Button>
 | ||
| 
 | ||
| {' '}
 | ||
| 
 | ||
| <Button to="#" variant="secondary" large>Secondary large</Button>
 | ||
| </p>
 | ||
| 
 | ||
| ## Components {id="components"}
 | ||
| 
 | ||
| ### Table {id="table"}
 | ||
| 
 | ||
| > #### Markdown
 | ||
| >
 | ||
| > ```markdown
 | ||
| > | Header 1 | Header 2 |
 | ||
| > | -------- | -------- |
 | ||
| > | Column 1 | Column 2 |
 | ||
| > ```
 | ||
| >
 | ||
| > #### JSX
 | ||
| >
 | ||
| > ```markup
 | ||
| > <Table>
 | ||
| >     <Tr><Th>Header 1</Th><Th>Header 2</Th></Tr>
 | ||
| >     <Tr><Td>Column 1</Td><Td>Column 2</Td></Tr>
 | ||
| > </Table>
 | ||
| > ```
 | ||
| 
 | ||
| Tables are used to present data and API documentation. Certain keywords can be
 | ||
| used to mark a footer row with a distinct style, for example to visualize the
 | ||
| return values of a documented function.
 | ||
| 
 | ||
| | Header 1    | Header 2 | Header 3 | Header 4 |
 | ||
| | ----------- | -------- | :------: | -------: |
 | ||
| | Column 1    | Column 2 | Column 3 | Column 4 |
 | ||
| | Column 1    | Column 2 | Column 3 | Column 4 |
 | ||
| | Column 1    | Column 2 | Column 3 | Column 4 |
 | ||
| | Column 1    | Column 2 | Column 3 | Column 4 |
 | ||
| | **RETURNS** | Column 2 | Column 3 | Column 4 |
 | ||
| 
 | ||
| Tables also support optional "divider" rows that are typically used to denote
 | ||
| keyword-only arguments in API documentation. To turn a row into a dividing
 | ||
| headline, it should only include content in its first cell, and its value should
 | ||
| be italicized:
 | ||
| 
 | ||
| > #### Markdown
 | ||
| >
 | ||
| > ```markdown
 | ||
| > | Header 1 | Header 2 | Header 3 |
 | ||
| > | -------- | -------- | -------- |
 | ||
| > | Column 1 | Column 2 | Column 3 |
 | ||
| > | _Hello_  |          |          |
 | ||
| > | Column 1 | Column 2 | Column 3 |
 | ||
| > ```
 | ||
| 
 | ||
| | Header 1 | Header 2 | Header 3 |
 | ||
| | -------- | -------- | -------- |
 | ||
| | Column 1 | Column 2 | Column 3 |
 | ||
| | _Hello_  |          |          |
 | ||
| | Column 1 | Column 2 | Column 3 |
 | ||
| 
 | ||
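The divider rule (content only in the first cell, and that content italicized) can be expressed in a few lines. This Python sketch is an illustration under assumptions: `split_row` and `is_divider_row` are hypothetical helpers, only the `_text_` italic form is recognized, and escaped pipes inside cells are not handled.

```python
import re

# Matches a cell whose whole content is italicized with underscores.
ITALIC = re.compile(r"^_[^_]+_$")


def split_row(row):
    """Split one Markdown table row into stripped cell strings."""
    return [cell.strip() for cell in row.strip().strip("|").split("|")]


def is_divider_row(row):
    """True if only the first cell has content and that content is italic."""
    first, *rest = split_row(row)
    return bool(ITALIC.match(first)) and all(cell == "" for cell in rest)


print(is_divider_row("| _Hello_  |          |          |"))
print(is_divider_row("| Column 1 | Column 2 | Column 3 |"))
```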
| ### Type Annotations {id="type-annotations"}
 | ||
| 
 | ||
| > #### Markdown
 | ||
| >
 | ||
| > ```markdown
 | ||
| > ~~Model[List[Doc], Floats2d]~~
 | ||
| > ```
 | ||
| >
 | ||
| > #### JSX
 | ||
| >
 | ||
| > ```markup
 | ||
| > <TypeAnnotation>Model[List[Doc], Floats2d]</TypeAnnotation>
 | ||
| > ```
 | ||
| 
 | ||
| Type annotations are special inline code blocks that are used to describe Python
 | ||
| types in the [type hints](https://docs.python.org/3/library/typing.html) format.
 | ||
| The special component will split the type, apply syntax highlighting and link
 | ||
| all types that specify links in `meta/type-annotations.json`. Types can link to
 | ||
| internal or external documentation pages. To make it easy to represent the type
 | ||
| annotations in Markdown, the rendering "hijacks" the `~~` tags that would
 | ||
| typically be converted to a `<del>` element – but in this case, text surrounded
 | ||
| by `~~` becomes a type annotation.
 | ||
| 
 | ||
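To make the "split the type and link known names" step more concrete, here is a small Python sketch. It only approximates the idea: the helper names are hypothetical, and the `links` dictionary stands in for the contents of `meta/type-annotations.json`, whose real structure is not shown here.

```python
import re

# Text wrapped in ~~ is treated as a type annotation.
TYPE_SPAN = re.compile(r"~~(.+?)~~")
# A type name (optionally dotted) or one of the bracket/comma separators.
TOKEN = re.compile(r"[A-Za-z_][\w.]*|[\[\],]")


def extract_type_annotations(text):
    """Return every ~~...~~ annotation found in a piece of Markdown text."""
    return TYPE_SPAN.findall(text)


def split_type(annotation):
    """Split an annotation into type names and punctuation tokens."""
    return TOKEN.findall(annotation)


def linked_types(annotation, links):
    """Map each type name that has a known link to its URL."""
    return {tok: links[tok] for tok in split_type(annotation) if tok in links}


# Assumed example mapping, standing in for the real JSON file.
links = {"Doc": "/api/doc"}
annotation = extract_type_annotations("Some text ~~Model[List[Doc], Floats2d]~~")[0]
print(split_type(annotation))
print(linked_types(annotation, links))
```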
| - ~~Dict[str, List[Union[Doc, Span]]]~~
 | ||
| - ~~Model[List[Doc], List[numpy.ndarray]]~~
 | ||
| 
 | ||
| Type annotations support a special visual style in tables and will render as a
 | ||
| separate row, under the cell text. This allows the API docs to display complex
 | ||
| types without taking up too much space in the cell. The type annotation should
 | ||
| always be the **last element** in the row.
 | ||
| 
 | ||
| > #### Markdown
 | ||
| >
 | ||
| > ```markdown
 | ||
| > | Header 1 | Header 2               |
 | ||
| > | -------- | ---------------------- |
 | ||
| > | Column 1 | Column 2 ~~List[Doc]~~ |
 | ||
| > ```
 | ||
| 
 | ||
| | Name                    | Description                                                                                                                                                                 |
 | ||
| | ----------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | ||
| | `vocab`                 | The shared vocabulary. ~~Vocab~~                                                                                                                                            |
 | ||
| | `model`                 | The Thinc [`Model`](https://thinc.ai/docs/api-model) wrapping the transformer. ~~Model[List[Doc], FullTransformerBatch]~~                                                   |
 | ||
| | `set_extra_annotations` | Function that takes a batch of `Doc` objects and transformer outputs and can set additional annotations on the `Doc`. ~~Callable[[List[Doc], FullTransformerBatch], None]~~ |
 | ||
| 
 | ||
| ### List {id="list"}
 | ||
| 
 | ||
| > #### Markdown
 | ||
| >
 | ||
| > ```markdown
 | ||
| > 1. One
 | ||
| > 2. Two
 | ||
| > ```
 | ||
| >
 | ||
| > #### JSX
 | ||
| >
 | ||
| > ```markup
 | ||
| > <Ol>
 | ||
| >     <Li>One</Li>
 | ||
| >     <Li>Two</Li>
 | ||
| > </Ol>
 | ||
| > ```
 | ||
| 
 | ||
| Lists are available as bulleted and numbered. Markdown lists are transformed
 | ||
| automatically.
 | ||
| 
 | ||
| - I am a bulleted list
 | ||
| - I have nice bullets
 | ||
| - Lorem ipsum dolor
 | ||
| - consectetur adipiscing elit
 | ||
| 
 | ||
| 1. I am an ordered list
 | ||
| 2. I have nice numbers
 | ||
| 3. Lorem ipsum dolor
 | ||
| 4. consectetur adipiscing elit
 | ||
| 
 | ||
| ### Aside {id="aside"}
 | ||
| 
 | ||
| > #### Markdown
 | ||
| >
 | ||
| > ```markdown
 | ||
| > > #### Aside title
 | ||
| > >
 | ||
| > > This is aside text.
 | ||
| > ```
 | ||
| >
 | ||
| > #### JSX
 | ||
| >
 | ||
| > ```jsx
 | ||
| > <Aside title="Aside title">This is aside text.</Aside>
 | ||
| > ```
 | ||
| 
 | ||
| Asides can be used to display additional notes and content in the right-hand
 | ||
| column. Asides can contain text, code and other elements if needed. Visually,
 | ||
| asides are moved to the side on the X-axis, and displayed at the same level they
 | ||
| were inserted. On small screens, they collapse and are rendered in their
 | ||
| original position, in between the text.
 | ||
| 
 | ||
| To make them easier to use in Markdown, paragraphs formatted as blockquotes will
 | ||
| turn into asides by default. Level 4 headlines (with a leading `####`) will
 | ||
| become aside titles.
 | ||
| 
 | ||
| ### Code Block {id="code-block"}
 | ||
| 
 | ||
| > #### Markdown
 | ||
| >
 | ||
| > ````markdown
 | ||
| > ```python {title="This is a title"}
 | ||
| > import spacy
 | ||
| > ```
 | ||
| > ````
 | ||
| >
 | ||
| > #### JSX
 | ||
| >
 | ||
| > ```jsx
 | ||
| > <CodeBlock title="This is a title" lang="python">
 | ||
| >   import spacy
 | ||
| > </CodeBlock>
 | ||
| > ```
 | ||
| 
 | ||
| Code blocks use the [Prism](http://prismjs.com/) syntax highlighter with a
 | ||
| custom theme. The language can be set individually on each block, and defaults
 | ||
| to raw text with no highlighting. An optional title can be added by appending
 | ||
| `{title="..."}` to the opening fence, after the language name. Indented code
 | ||
| blocks are treated as plain text, with their whitespace preserved.
 | ||
| 
 | ||
| ```python {title="Using spaCy"}
 | ||
| import spacy
 | ||
| nlp = spacy.load("en_core_web_sm")
 | ||
| doc = nlp("This is a sentence.")
 | ||
| for token in doc:
 | ||
|     print(token.text, token.pos_)
 | ||
| ```
 | ||
| 
 | ||
| Code blocks can also specify an optional range of line numbers to highlight by
 | ||
| adding `{highlight="..."}` to the opening fence. Acceptable ranges are spans like
 | ||
| `5-7`, but also `5-7,10` or `5-7,10,13-14`.
 | ||
| 
 | ||
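The range syntax is simple enough to expand by hand. The following Python sketch shows how a value such as `5-7,10,13-14` could be turned into individual line numbers. The function name `parse_highlight_ranges` is an assumption for this example and is not taken from the site's code.

```python
def parse_highlight_ranges(spec):
    """Expand a spec like "5-7,10,13-14" into a sorted list of line numbers."""
    lines = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        # "5-7" gives start="5", end="7"; a single "10" gives end="".
        start, _, end = part.partition("-")
        first = int(start)
        last = int(end) if end else first
        lines.update(range(first, last + 1))
    return sorted(lines)


print(parse_highlight_ranges("5-7,10,13-14"))
```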
| > #### Markdown
 | ||
| >
 | ||
| > ````markdown
 | ||
| > ```python {title="This is a title",highlight="1-2"}
 | ||
| > import spacy
 | ||
| > nlp = spacy.load("en_core_web_sm")
 | ||
| > ```
 | ||
| > ````
 | ||
| 
 | ||
| ```python {title="Using the matcher",highlight="5-7"}
 | ||
| import spacy
 | ||
| from spacy.matcher import Matcher
 | ||
| 
 | ||
| nlp = spacy.load("en_core_web_sm")
 | ||
| matcher = Matcher(nlp.vocab)
 | ||
| pattern = [{"LOWER": "hello"}, {"IS_PUNCT": True}, {"LOWER": "world"}]
 | ||
| matcher.add("HelloWorld", [pattern])
 | ||
| doc = nlp("Hello, world! Hello world!")
 | ||
| matches = matcher(doc)
 | ||
| ```
 | ||
| 
 | ||
| Adding `{executable="true"}` to the opening fence turns the code into an executable
 | ||
| block, powered by [Binder](https://mybinder.org) and
 | ||
| [Juniper](https://github.com/ines/juniper). If JavaScript is disabled, the
 | ||
| interactive widget falls back to a regular code block.
 | ||
| 
 | ||
| > #### Markdown
 | ||
| >
 | ||
| > ````markdown
 | ||
| > ```python {executable="true"}
 | ||
| > import spacy
 | ||
| > nlp = spacy.load("en_core_web_sm")
 | ||
| > ```
 | ||
| > ````
 | ||
| 
 | ||
| ```python {executable="true"}
 | ||
| import spacy
 | ||
| nlp = spacy.load("en_core_web_sm")
 | ||
| doc = nlp("This is a sentence.")
 | ||
| for token in doc:
 | ||
|     print(token.text, token.pos_)
 | ||
| ```
 | ||
| 
 | ||
| If a code block only contains a URL to a GitHub file, the raw file contents are
 | ||
| embedded automatically and syntax highlighting is applied. The link to the
 | ||
| original file is shown at the top of the widget.
 | ||
| 
 | ||
| > #### Markdown
 | ||
| >
 | ||
| > ````markdown
 | ||
| > ```python
 | ||
| > https://github.com/...
 | ||
| > ```
 | ||
| > ````
 | ||
| >
 | ||
| > #### JSX
 | ||
| >
 | ||
| > ```jsx
 | ||
| > <GitHubCode url="https://github.com/..." lang="python" />
 | ||
| > ```
 | ||
| 
 | ||
| ```python
 | ||
| https://github.com/explosion/spaCy/tree/master/spacy/language.py
 | ||
| ```
 | ||
| 
 | ||
| ### Infobox {id="infobox"}
 | ||
| 
 | ||
| > #### JSX
 | ||
| >
 | ||
| > ```jsx
 | ||
| > <Infobox title="Information">Regular infobox</Infobox>
 | ||
| > <Infobox title="Important note" variant="warning">This is a warning.</Infobox>
 | ||
| > <Infobox title="Be careful!" variant="danger">This is dangerous.</Infobox>
 | ||
| > ```
 | ||
| 
 | ||
| Infoboxes can be used to add notes, updates, warnings or additional information
 | ||
| to a page or section. Semantically, they're implemented and interpreted as an
 | ||
| `aside` element. Infoboxes can take an optional `title` argument, as well as an
 | ||
| optional `variant` (either `"warning"` or `"danger"`).
 | ||
| 
 | ||
| <Infobox title="This is an infobox">
 | ||
| 
 | ||
| If needed, an infobox can contain regular text, `inline code`, lists and other
 | ||
| blocks.
 | ||
| 
 | ||
| </Infobox>
 | ||
| 
 | ||
| <Infobox title="This is a warning" variant="warning">
 | ||
| 
 | ||
| If needed, an infobox can contain regular text, `inline code`, lists and other
 | ||
| blocks.
 | ||
| 
 | ||
| </Infobox>
 | ||
| 
 | ||
| <Infobox title="This is dangerous" variant="danger">
 | ||
| 
 | ||
| If needed, an infobox can contain regular text, `inline code`, lists and other
 | ||
| blocks.
 | ||
| 
 | ||
| </Infobox>
 | ||
| 
 | ||
| ### Accordion {id="accordion"}
 | ||
| 
 | ||
| > #### JSX
 | ||
| >
 | ||
| > ```jsx
 | ||
| > <Accordion title="This is an accordion">
 | ||
| >   Accordion content goes here.
 | ||
| > </Accordion>
 | ||
| > ```
 | ||
| 
 | ||
| Accordions are collapsible sections that are mostly used for lengthy tables,
 | ||
| like the tag and label annotation schemes for different languages. They all need
 | ||
| to be presented – but chances are the user doesn't actually care about _all_ of
 | ||
| them, especially not at the same time. So it's fairly reasonable to hide them
 | ||
| behind a click. This particular implementation was inspired by the amazing
 | ||
| [Inclusive Components blog](https://inclusive-components.design/collapsible-sections/).
 | ||
| 
 | ||
| <Accordion title="This is an accordion">
 | ||
| 
 | ||
| Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque enim ante,
 | ||
| pretium a orci eget, varius dignissim augue. Nam eu dictum mauris, id tincidunt
 | ||
| nisi. Integer commodo pellentesque tincidunt. Nam at turpis finibus tortor
 | ||
| gravida sodales tincidunt sit amet est. Nullam euismod arcu in tortor auctor,
 | ||
| sit amet dignissim justo congue.
 | ||
| 
 | ||
| </Accordion>
 | ||
| 
 | ||
| ## Markdown reference {id="markdown"}
 | ||
| 
 | ||
| All page content and page meta live in the `.mdx` files in the `/docs`
 | ||
| directory. The frontmatter block at the top of each file defines the page title
 | ||
| and other settings like the sidebar menu.
 | ||
| 
 | ||
| ````markdown
 | ||
| ---
 | ||
| title: Page title
 | ||
| ---
 | ||
| 
 | ||
| ## Headline starting a section {id="some_id"}
 | ||
| 
 | ||
| This is a regular paragraph with a [link](https://spacy.io) and **bold text**.
 | ||
| 
 | ||
| > #### This is an aside title
 | ||
| >
 | ||
| > This is aside text.
 | ||
| 
 | ||
| ### Subheadline
 | ||
| 
 | ||
| | Header 1 | Header 2 |
 | ||
| | -------- | -------- |
 | ||
| | Column 1 | Column 2 |
 | ||
| 
 | ||
| ```python {title="Code block title",highlight="2-3"}
 | ||
| import spacy
 | ||
| nlp = spacy.load("en_core_web_sm")
 | ||
| doc = nlp("Hello world")
 | ||
| ```
 | ||
| 
 | ||
| <Infobox title="Important note" variant="warning">
 | ||
| 
 | ||
| This is content in the infobox.
 | ||
| 
 | ||
| </Infobox>
 | ||
| ````
 | ||
| 
 | ||
| In addition to the native markdown elements, you can use the components
 | ||
| [`<Infobox />`][infobox], [`<Accordion />`][accordion], [`<Abbr />`][abbr] and
 | ||
| [`<Tag />`][tag] via their JSX syntax.
 | ||
| 
 | ||
| [infobox]: https://spacy.io/styleguide#infobox
 | ||
| [accordion]: https://spacy.io/styleguide#accordion
 | ||
| [abbr]: https://spacy.io/styleguide#abbr
 | ||
| [tag]: https://spacy.io/styleguide#tags
 | ||
| 
 | ||
| ## Editorial {id="editorial"}
 | ||
| 
 | ||
| - "spaCy" should always be spelled with a lowercase "s" and a capital "C",
 | ||
|   unless it specifically refers to the Python package or Python import `spacy`
 | ||
|   (in which case it should be formatted as code).
 | ||
|   - ✅ spaCy is a library for advanced NLP in Python.
 | ||
|   - ❌ Spacy is a library for advanced NLP in Python.
 | ||
|   - ✅ First, you need to install the `spacy` package from pip.
 | ||
| - Mentions of code, like function names, classes, variable names etc. in inline
 | ||
|   text should be formatted as `code`.
 | ||
|   - ✅ "Calling the `nlp` object on a text returns a `Doc`."
 | ||
| - Objects that have pages in the [API docs](/api) should be linked – for
 | ||
|   example, [`Doc`](/api/doc) or [`Language.to_disk`](/api/language#to_disk). The
 | ||
|   mentions should still be formatted as code within the link. Links pointing to
 | ||
|   the API docs will automatically receive a little icon. However, if a paragraph
 | ||
|   includes many references to the API, the links can easily get messy. In that
 | ||
|   case, we typically only link the first mention of an object and not any
 | ||
|   subsequent ones.
 | ||
|   - ✅ The [`Span`](/api/span) and [`Token`](/api/token) objects are views of a
 | ||
|     [`Doc`](/api/doc). [`Span.as_doc`](/api/span#as_doc) creates a `Doc` object
 | ||
|     from a `Span`.
 | ||
|   - ❌ The [`Span`](/api/span) and [`Token`](/api/token) objects are views of a
 | ||
|     [`Doc`](/api/doc). [`Span.as_doc`](/api/span#as_doc) creates a
 | ||
|     [`Doc`](/api/doc) object from a [`Span`](/api/span).
 | ||
| - Other things we format as code are: references to trained pipeline packages
 | ||
|   like `en_core_web_sm` or file names like `code.py` or `meta.json`.
 | ||
|   - ✅ After training, the `config.cfg` is saved to disk.
 | ||
| - [Type annotations](#type-annotations) are a special type of code formatting,
 | ||
|   expressed by wrapping the text in `~~` instead of backticks. The result looks
 | ||
|   like this: ~~List[Doc]~~. All references to known types will be linked
 | ||
|   automatically.
 | ||
|   - ✅ The model has the input type ~~List[Doc]~~ and it outputs a
 | ||
|     ~~List[Array2d]~~.
 | ||
| - We try to keep links meaningful but short.
 | ||
|   - ✅ For details, see the usage guide on
 | ||
|     [training with custom code](/usage/training#custom-code).
 | ||
|   - ❌ For details, see
 | ||
|     [the usage guide on training with custom code](/usage/training#custom-code).
 | ||
|   - ❌ For details, see the usage guide on training with custom code
 | ||
|     [here](/usage/training#custom-code).
 |