spaCy/website/index.jade

//- 💫 LANDING PAGE

include _includes/_mixins

+landing-header
    h1.c-landing__title.u-heading-0
        | Industrial-Strength#[br]
        | Natural Language#[br]
        | Processing

    h2.c-landing__title.o-block.u-heading-3
        span.u-text-label.u-text-label--light in Python

+grid.o-content.c-landing__blocks
    +grid-col("third").c-landing__card.o-card.o-grid.o-grid--space
        +h(3) Fastest in the world
        p
            |  spaCy excels at large-scale information extraction tasks.
            |  It's written from the ground up in carefully memory-managed
            |  Cython. Independent research has confirmed that spaCy is
            |  the fastest in the world.  If your application needs to
            |  process entire web dumps, spaCy is the library you want to
            |  be using.

        +button("/usage/facts-figures", true, "primary")
            |  Facts & figures

    +grid-col("third").c-landing__card.o-card.o-grid.o-grid--space
        +h(3) Get things done
        p
            |  spaCy is designed to help you do real work — to build real
            |  products, or gather real insights. The library respects
            |  your time, and tries to avoid wasting it. It's easy to
            |  install, and its API is simple and productive. We like to
            |  think of spaCy as the Ruby on Rails of Natural Language
            |  Processing.

        +button("/usage", true, "primary")
            |  Get started

    +grid-col("third").c-landing__card.o-card.o-grid.o-grid--space
        +h(3) Deep learning
        p
            |  spaCy is the best way to prepare text for deep learning.
            |  It interoperates seamlessly with TensorFlow, PyTorch,
            |  scikit-learn, Gensim and the rest of Python's awesome AI
            |  ecosystem. With spaCy, you can easily construct linguistically
            |  sophisticated statistical models for a variety of NLP problems.

        +button("/usage/training", true, "primary")
            |  Read more

.o-content
    +grid
        +grid-col("two-thirds")
            +terminal("lightning_tour.py", "More examples", "/usage/spacy-101#lightning-tour").
                # Install: pip install spacy && spacy download en
                import spacy

                # Load English tokenizer, tagger, parser, NER and word vectors
                nlp = spacy.load('en')

                # Process a document, of any size
                text = open('war_and_peace.txt').read()
                doc = nlp(text)

                # Find named entities, phrases and concepts
                for entity in doc.ents:
                    print(entity.text, entity.label_)

                # Determine semantic similarities
                doc1 = nlp(u'the fries were gross')
                doc2 = nlp(u'worst fries ever')
                doc1.similarity(doc2)

                # Hook in your own deep learning models
                nlp.add_pipe(load_my_model(), before='parser')

        +grid-col("third")
            +h(2) Features
            +list
                +item Non-destructive #[strong tokenization]
                +item #[strong Named entity] recognition
                +item Support for #[strong #{LANG_COUNT}+ languages]
                +item #[strong #{MODEL_COUNT} statistical models] for #{MODEL_LANG_COUNT} languages
                +item Pre-trained #[strong word vectors]
                +item Easy #[strong deep learning] integration
                +item Part-of-speech tagging
                +item Labelled dependency parsing
                +item Syntax-driven sentence segmentation
                +item Built in #[strong visualizers] for syntax and NER
                +item Convenient string-to-hash mapping
                +item Export to numpy data arrays
                +item Efficient binary serialization
                +item Easy #[strong model packaging] and deployment
                +item State-of-the-art speed
                +item Robust, rigorously evaluated accuracy

+landing-banner("Convolutional neural network models", "New in v2.0")
    p
        |  spaCy v2.0 features new neural models for #[strong tagging],
        |  #[strong parsing] and #[strong entity recognition]. The models have
        |  been designed and implemented from scratch specifically for spaCy, to
        |  give you an unmatched balance of speed, size and accuracy. A novel
        |  bloom embedding strategy with subword features is used to support
        |  huge vocabularies in tiny tables. Convolutional layers with residual
        |  connections, layer normalization and maxout non-linearity are used,
        |  giving much better efficiency than the standard BiLSTM solution.
        |  Finally, the parser and NER use an imitation learning objective to
        |  deliver accuracy in-line with the latest research systems,
        |  even when  evaluated from raw text. With these innovations, spaCy
        |  v2.0's models are #[strong 10&times; smaller],
        |  #[strong 20% more accurate], and #[strong even cheaper to run] than
        |  the previous generation.

    .o-block-small.u-text-right
        +button("/models", true, "secondary-light") Download models

+landing-logos("spaCy is trusted by", logos)
    +button(gh("spacy") + "/stargazers", false, "secondary", "small")
        |  and many more

+landing-logos("Featured on", features).o-block-small

+landing-banner("Prodigy: Radically efficient machine teaching", "From the makers of spaCy")
    p
        |  Prodigy is an #[strong annotation tool] so efficient that data scientists can
        |  do the annotation themselves, enabling a new level of rapid
        |  iteration. Whether you're working on entity recognition, intent
        |  detection or image classification, Prodigy can help you
        |  #[strong train and evaluate] your models faster. Stream in your own examples or
        |  real-world data from live APIs, update your model in real-time and
        |  chain models together to build more complex systems.

    .o-block-small.u-text-right
        +button("https://prodi.gy", true, "secondary-light") Try it out

.o-content
    +grid
        +grid-col("half")
            +h(2) Benchmarks

            p
                |  In 2015, independent researchers from Emory University and
                |  Yahoo! Labs showed that spaCy offered the
                |  #[strong fastest syntactic parser in the world] and that its
                |  accuracy was #[strong within 1% of the best] available
                |  (#[+a("https://aclweb.org/anthology/P/P15/P15-1038.pdf") Choi et al., 2015]).
                |  spaCy v2.0, released in 2017, is more accurate than any of
                |  the systems Choi et al. evaluated.

            .o-inline-list
                +button("/usage/facts-figures#benchmarks", true, "secondary") See details

        +grid-col("half")
            include usage/_facts-figures/_benchmarks-choi-2015
Update website 2016-10-03 21:19:13 +03:00			`//- 💫 LANDING PAGE`
Replace website with new version 2016-03-31 17:24:48 +03:00
Update website 2016-10-03 21:19:13 +03:00			`include _includes/_mixins`
Replace website with new version 2016-03-31 17:24:48 +03:00
Update to new website 2016-10-31 21:04:15 +03:00			`+landing-header`
			`h1.c-landing__title.u-heading-0`
			`\| Industrial-Strength#[br]`
			`\| Natural Language#[br]`
			`\| Processing`

Update global config and landing page 2017-10-03 15:28:18 +03:00			`h2.c-landing__title.o-block.u-heading-3`
			`span.u-text-label.u-text-label--light in Python`

			`+grid.o-content.c-landing__blocks`
			`+grid-col("third").c-landing__card.o-card.o-grid.o-grid--space`
			`+h(3) Fastest in the world`
			`p`
			`\| spaCy excels at large-scale information extraction tasks.`
			`\| It's written from the ground up in carefully memory-managed`
			`\| Cython. Independent research has confirmed that spaCy is`
			`\| the fastest in the world. If your application needs to`
			`\| process entire web dumps, spaCy is the library you want to`
			`\| be using.`

			`+button("/usage/facts-figures", true, "primary")`
			`\| Facts & figures`

			`+grid-col("third").c-landing__card.o-card.o-grid.o-grid--space`
			`+h(3) Get things done`
			`p`
			`\| spaCy is designed to help you do real work — to build real`
			`\| products, or gather real insights. The library respects`
			`\| your time, and tries to avoid wasting it. It's easy to`
			`\| install, and its API is simple and productive. We like to`
			`\| think of spaCy as the Ruby on Rails of Natural Language`
			`\| Processing.`

			`+button("/usage", true, "primary")`
			`\| Get started`

			`+grid-col("third").c-landing__card.o-card.o-grid.o-grid--space`
			`+h(3) Deep learning`
			`p`
			`\| spaCy is the best way to prepare text for deep learning.`
			`\| It interoperates seamlessly with TensorFlow, PyTorch,`
Update website for v2.0 2017-11-07 16:48:17 +03:00			`\| scikit-learn, Gensim and the rest of Python's awesome AI`
			`\| ecosystem. With spaCy, you can easily construct linguistically`
			`\| sophisticated statistical models for a variety of NLP problems.`
Update global config and landing page 2017-10-03 15:28:18 +03:00
Update website for v2.0 2017-11-07 16:48:17 +03:00			`+button("/usage/training", true, "primary")`
Update global config and landing page 2017-10-03 15:28:18 +03:00			`\| Read more`
Add option for "latest news" on landing page 2016-11-07 04:14:43 +03:00
Update to new website 2016-10-31 21:04:15 +03:00			`.o-content`
Replace website with new version 2016-03-31 17:24:48 +03:00			`+grid`
Update to new website 2016-10-31 21:04:15 +03:00			`+grid-col("two-thirds")`
Update lightning tour 2017-11-01 21:49:36 +03:00			`+terminal("lightning_tour.py", "More examples", "/usage/spacy-101#lightning-tour").`
Change python -m spacy to spacy Reflects latest change to entry point or auto-alias 2017-08-14 13:41:06 +03:00			`# Install: pip install spacy && spacy download en`
Update to new website 2016-10-31 21:04:15 +03:00			`import spacy`

			`# Load English tokenizer, tagger, parser, NER and word vectors`
			`nlp = spacy.load('en')`

			`# Process a document, of any size`
			`text = open('war_and_peace.txt').read()`
			`doc = nlp(text)`

Update lightning tour 2017-11-01 21:49:36 +03:00			`# Find named entities, phrases and concepts`
			`for entity in doc.ents:`
			`print(entity.text, entity.label_)`
Update to new website 2016-10-31 21:04:15 +03:00
Update lightning tour 2017-11-01 21:49:36 +03:00			`# Determine semantic similarities`
Update to new website 2016-10-31 21:04:15 +03:00			`doc1 = nlp(u'the fries were gross')`
			`doc2 = nlp(u'worst fries ever')`
			`doc1.similarity(doc2)`

Update lightning tour 2017-11-01 21:49:36 +03:00			`# Hook in your own deep learning models`
			`nlp.add_pipe(load_my_model(), before='parser')`

Update to new website 2016-10-31 21:04:15 +03:00			`+grid-col("third")`
			`+h(2) Features`
			`+list`
			`+item Non-destructive #[strong tokenization]`
Update feature list 2017-10-24 22:38:53 +03:00			`+item #[strong Named entity] recognition`
Update global config and landing page 2017-10-03 15:28:18 +03:00			`+item Support for #[strong #{LANG_COUNT}+ languages]`
			`+item #[strong #{MODEL_COUNT} statistical models] for #{MODEL_LANG_COUNT} languages`
Update to new website 2016-10-31 21:04:15 +03:00			`+item Pre-trained #[strong word vectors]`
Update global config and landing page 2017-10-03 15:28:18 +03:00			`+item Easy #[strong deep learning] integration`
Update to new website 2016-10-31 21:04:15 +03:00			`+item Part-of-speech tagging`
			`+item Labelled dependency parsing`
Update global config and landing page 2017-10-03 15:28:18 +03:00			`+item Syntax-driven sentence segmentation`
			`+item Built in #[strong visualizers] for syntax and NER`
Update docs to reflect StringStore changes 2017-05-28 19:19:11 +03:00			`+item Convenient string-to-hash mapping`
Update to new website 2016-10-31 21:04:15 +03:00			`+item Export to numpy data arrays`
			`+item Efficient binary serialization`
Update global config and landing page 2017-10-03 15:28:18 +03:00			`+item Easy #[strong model packaging] and deployment`
Update to new website 2016-10-31 21:04:15 +03:00			`+item State-of-the-art speed`
			`+item Robust, rigorously evaluated accuracy`

Update global config and landing page 2017-10-03 15:28:18 +03:00			`+landing-banner("Convolutional neural network models", "New in v2.0")`
			`p`
			`\| spaCy v2.0 features new neural models for #[strong tagging],`
			`\| #[strong parsing] and #[strong entity recognition]. The models have`
			`\| been designed and implemented from scratch specifically for spaCy, to`
			`\| give you an unmatched balance of speed, size and accuracy. A novel`
			`\| bloom embedding strategy with subword features is used to support`
			`\| huge vocabularies in tiny tables. Convolutional layers with residual`
			`\| connections, layer normalization and maxout non-linearity are used,`
			`\| giving much better efficiency than the standard BiLSTM solution.`
			`\| Finally, the parser and NER use an imitation learning objective to`
			`\| deliver accuracy in-line with the latest research systems,`
			`\| even when evaluated from raw text. With these innovations, spaCy`
			`\| v2.0's models are #[strong 10× smaller],`
Update v2 details 2017-11-06 23:15:36 +03:00			`\| #[strong 20% more accurate], and #[strong even cheaper to run] than`
			`\| the previous generation.`
Update global config and landing page 2017-10-03 15:28:18 +03:00
			`.o-block-small.u-text-right`
			`+button("/models", true, "secondary-light") Download models`

			`+landing-logos("spaCy is trusted by", logos)`
			`+button(gh("spacy") + "/stargazers", false, "secondary", "small")`
			`\| and many more`

			`+landing-logos("Featured on", features).o-block-small`

			`+landing-banner("Prodigy: Radically efficient machine teaching", "From the makers of spaCy")`
			`p`
			`\| Prodigy is an #[strong annotation tool] so efficient that data scientists can`
			`\| do the annotation themselves, enabling a new level of rapid`
			`\| iteration. Whether you're working on entity recognition, intent`
			`\| detection or image classification, Prodigy can help you`
			`\| #[strong train and evaluate] your models faster. Stream in your own examples or`
			`\| real-world data from live APIs, update your model in real-time and`
			`\| chain models together to build more complex systems.`

			`.o-block-small.u-text-right`
			`+button("https://prodi.gy", true, "secondary-light") Try it out`
Update to new website 2016-10-31 21:04:15 +03:00
Update global config and landing page 2017-10-03 15:28:18 +03:00			`.o-content`
			`+grid`
			`+grid-col("half")`
			`+h(2) Benchmarks`
Update to new website 2016-10-31 21:04:15 +03:00
			`p`
Update global config and landing page 2017-10-03 15:28:18 +03:00			`\| In 2015, independent researchers from Emory University and`
			`\| Yahoo! Labs showed that spaCy offered the`
			`\| #[strong fastest syntactic parser in the world] and that its`
			`\| accuracy was #[strong within 1% of the best] available`
			`\| (#[+a("https://aclweb.org/anthology/P/P15/P15-1038.pdf") Choi et al., 2015]).`
			`\| spaCy v2.0, released in 2017, is more accurate than any of`
			`\| the systems Choi et al. evaluated.`
Update to new website 2016-10-31 21:04:15 +03:00
Update global config and landing page 2017-10-03 15:28:18 +03:00			`.o-inline-list`
			`+button("/usage/facts-figures#benchmarks", true, "secondary") See details`

			`+grid-col("half")`
			`include usage/_facts-figures/_benchmarks-choi-2015`