679 Commits

Author SHA1 Message Date
Ines Montani
a31e9e1cd5 Update training docs [ci skip] 2019-09-12 15:32:39 +02:00
Ines Montani
b544dcb3c5 Document debug-data [ci skip] 2019-09-12 15:26:20 +02:00
Ines Montani
c0a4cab178 Update "Adding languages" docs [ci skip] 2019-09-12 14:53:06 +02:00
Ines Montani
e7c20ad1d2 Update colors entry points docs [ci skip] 2019-09-12 12:59:10 +02:00
Ines Montani
7b59a919e6 Update entry points docs [ci skip] 2019-09-12 12:52:06 +02:00
Sofie Van Landeghem
0b4b4f1819 Documentation for Entity Linking (#4065)
* document token ent_kb_id

* document span kb_id

* update pipeline documentation

* prior and context weights as bools instead

* entitylinker api documentation

* drop for both models

* finish entitylinker documentation

* small fixes

* documentation for KB

* candidate documentation

* links to api pages in code

* small fix

* frequency examples as counts for consistency

* consistent documentation about tensors returned by predict

* add entity linking to usage 101

* add entity linking infobox and KB section to 101

* entity-linking in linguistic features

* small typo corrections

* training example and docs for entity_linker

* predefined nlp and kb

* revert back to similarity encodings for simplicity (for now)

* set prior probabilities to 0 when excluded

* code clean up

* bugfix: deleting kb ID from tokens when entities were removed

* refactor train el example to use either model or vocab

* pretrain_kb example for example kb generation

* add to training docs for KB + EL example scripts

* small fixes

* error numbering

* ensure the language of vocab and nlp stay consistent across serialization

* equality with =

* avoid conflict in errors file

* add error 151

* final adjustments to the train scripts - consistency

* update of goldparse documentation

* small corrections

* push commit

* typo fix

* add candidate API to kb documentation

* update API sidebar with EntityLinker and KnowledgeBase

* remove EL from 101 docs

* remove entity linker from 101 pipelines / rephrase

* custom el model instead of existing model

* set version to 2.2 for EL functionality

* update documentation for 2 CLI scripts
2019-09-12 11:38:34 +02:00
Sofie Van Landeghem
6b012cebff Make pos/tag distinction more clear in docs (#4246)
* make distinction between tag and pos more prominent in docs

* out of the 101
2019-09-06 10:31:21 +02:00
adrianeboyd
8fe7bdd0fa Improve token pattern checking without validation (#4105)
* Fix typo in rule-based matching docs

* Improve token pattern checking without validation

Add more detailed token pattern checks without full JSON pattern validation and
provide more detailed error messages.

Addresses #4070 (also related: #4063, #4100).

* Check whether top-level attributes in patterns and attr for PhraseMatcher are
  in token pattern schema

* Check whether attribute value types are supported in general (as opposed to
  per attribute with full validation)

* Report various internal error types (OverflowError, AttributeError, KeyError)
  as ValueError with standard error messages

* Check for tagger/parser in PhraseMatcher pipeline for attributes TAG, POS,
  LEMMA, and DEP

* Add error messages with relevant details on how to use validate=True or nlp()
  instead of nlp.make_doc()

* Support attr=TEXT for PhraseMatcher

* Add NORM to schema

* Expand tests for pattern validation, Matcher, PhraseMatcher, and EntityRuler

* Remove unnecessary .keys()

* Rephrase error messages

* Add another type check to Matcher

Add another type check to Matcher for more understandable error messages
in some rare cases.

* Support phrase_matcher_attr=TEXT for EntityRuler

* Don't use spacy.errors in examples and bin scripts

* Fix error code

* Auto-format

Also try to get Azure pipelines to finally start a build :(

* Update errors.py


Co-authored-by: Ines Montani <ines@ines.io>
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2019-08-21 14:00:37 +02:00
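Note on 8fe7bdd0fa: the change that reports internal error types (OverflowError, AttributeError, KeyError) as ValueError can be illustrated with a minimal sketch. This is not spaCy's implementation; `ATTR_IDS`, `lookup_attr`, and the message text are hypothetical names chosen for illustration.

```python
# Hypothetical attribute table; the names are illustrative, not spaCy's schema.
ATTR_IDS = {"ORTH": 1, "LOWER": 2, "TEXT": 3}


def lookup_attr(pattern_key):
    """Resolve a pattern key, re-raising internal errors as ValueError."""
    try:
        # .upper() raises AttributeError for non-strings;
        # the dict lookup raises KeyError for unknown names.
        return ATTR_IDS[pattern_key.upper()]
    except (AttributeError, KeyError, OverflowError) as err:
        raise ValueError(
            "Invalid pattern attribute {!r} ({})".format(
                pattern_key, type(err).__name__
            )
        ) from err
```

Callers then only need to handle one exception type, and the original error stays available through exception chaining.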
Ines Montani
3134a9b6e0 Add section on expanding regex match to token boundaries (see #4158) [ci skip] 2019-08-21 12:53:31 +02:00
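Note on 3134a9b6e0: the idea of expanding a regex match to token boundaries can be sketched without spaCy. The function below is a hypothetical illustration that assumes whitespace-delimited tokens; it is not the code from the docs section this commit adds.

```python
import re


def expand_to_token_boundaries(text, pattern):
    """Return (start, end) character spans widened to whole tokens."""
    # Character offsets of each token; whitespace splitting is an
    # assumption of this sketch, not how a real tokenizer works.
    tokens = [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]
    spans = []
    for match in re.finditer(pattern, text):
        if match.start() == match.end():
            continue  # skip empty matches
        # Keep every token that overlaps the character-level match.
        overlapping = [
            (s, e) for s, e in tokens if s < match.end() and e > match.start()
        ]
        if overlapping:
            spans.append((overlapping[0][0], overlapping[-1][1]))
    return spans
```

For example, matching `New York` in "I like New Yorkers a lot" stops in the middle of "Yorkers", and the expanded span covers "New Yorkers".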
Ines Montani
66aba2d676 Improve regex matching docs [ci skip] 2019-08-19 13:59:41 +02:00
Sofie Van Landeghem
cc66f47893 Make enabling/disabling jupyter mode more explicit (#4144)
* make enabling/disabling jupyter mode more explicit

* markup fix
2019-08-19 11:53:34 +02:00
Ines Montani
e520eb3f6c Make visualized NER examples more clear (closes #4104) [ci skip] 2019-08-18 16:29:29 +02:00
Ines Montani
1362f793cf Improve docs on phrase pattern attributes (closes #4100) [ci skip] 2019-08-11 11:13:49 +02:00
Ines Montani
8b4a0fabbb Adjust docs example [ci skip] 2019-08-07 00:46:47 +02:00
adrianeboyd
69aca7d839 Add validate option to EntityRuler (#4089)
* Add validate option to EntityRuler

* Add validate to EntityRuler, passed to Matcher and PhraseMatcher

* Add validate to usage and API docs

* Update website/docs/usage/rule-based-matching.md

Co-Authored-By: Ines Montani <ines@ines.io>

* Update website/docs/usage/rule-based-matching.md

Co-Authored-By: Ines Montani <ines@ines.io>
2019-08-07 00:40:53 +02:00
Ines Montani
4ae320e5c2 Use consistent casing for entity ruler patterns (see #4063) [ci skip] 2019-08-06 12:20:22 +02:00
Ines Montani
223bde5cf6 Improve docs on matcher attributes [ci skip] (closes #4063) 2019-08-06 12:13:42 +02:00
Ines Montani
2bfae0b167 Auto-format 2019-08-06 12:13:31 +02:00
Ines Montani
bd39e5e630 Add "Processing text" section [ci skip] 2019-07-25 17:38:03 +02:00
Ines Montani
a5e3d2f318 Improve section on disabling pipes [ci skip] 2019-07-25 14:25:34 +02:00
Ines Montani
02e444ec7c Add section on special tokenizer component [ci skip] 2019-07-25 14:25:03 +02:00
Ines Montani
1fa6d6ba55 Improve consistency of docs examples [ci skip] 2019-07-25 14:24:56 +02:00
Ines Montani
1167c303a0 Fix typos [ci skip] 2019-07-19 13:08:18 +02:00
Ines Montani
c3ead02ea5 Adjust wording [ci skip] 2019-07-17 16:06:25 +02:00
Ines Montani
1d5ff3e455 Add infobox 2019-07-17 15:29:36 +02:00
Ines Montani
114cb18892 Improve wording 2019-07-17 15:27:53 +02:00
Ines Montani
7522beef9e Add "Things to try" prompts 2019-07-17 15:25:02 +02:00
Ines Montani
9f02e3c027 Adjust example
Not actually supported in this alignment interpretation
2019-07-17 15:13:50 +02:00
Ines Montani
1ea472468a Add usage docs for aligning tokenization 2019-07-17 15:08:33 +02:00
pmbaumgartner
9a86d95ea2 fix custom attribute links 2019-07-14 20:23:54 -04:00
Ines Montani
ebe58e7fa1 Document gold.docs_to_json [ci skip] 2019-07-10 10:27:33 +02:00
Ines Montani
881f5bc401 Auto-format 2019-07-10 10:27:29 +02:00
Ines Montani
d361e380b8 Fix matcher callback example (closes #3862) 2019-06-26 14:47:26 +02:00
Alejandro Alcalde
4866a7ee9e Changed learning rate by its param name. (#3855)
* Changed learning rate by its param name.

I've been searching for a while for how the learning rate parameter was named. With `beta1` and `beta2` it's easy, as they are marked as code, but learning rate wasn't. I think writing the actual parameter name would be helpful.

* Signing SCA
2019-06-20 10:29:20 +02:00
Ramanan Balakrishnan
eb12703d10 minor fix to broken link in documentation (#3819) [ci skip] 2019-06-04 11:15:35 +02:00
Ines Montani
0c74506c9c Fix typos in docs (closes #3802) [ci skip] 2019-06-01 11:35:01 +02:00
mak
89379a7fa4 Corrected example model URL in requirements.txt (#3786)
The URL used to show how to add a model to the requirements.txt had the old release path (excl. explosion).
2019-05-29 10:51:55 +02:00
Aaron Kub
719a15f23d fixing regex matcher examples (#3708) (#3719) 2019-05-10 14:23:52 +02:00
张晓飞
ba1ff00370 update response after calling add_pipe (#3661)
* update response after calling add_pipe

The print_info component is appended last, so it should be shown at the end of the pipeline

* Create henry860916.md
2019-05-01 12:02:18 +02:00
Ramiro Gómez
8ee4100f8f Remove dangling M (#3657)
I assume this is a typo. Sorry if it has a meaning that I'm not aware of.
2019-04-29 19:44:43 +02:00
Amit Chaudhary
167d63af31 Fix broken link to Dive Into Python 3 website (#3656)
* Fix broken link to Dive Into Python 3 website

* Sign spaCy Contributor Agreement
2019-04-29 19:44:00 +02:00
Ivan Tham
fa94f83697 Improve redundant variable name (#3643)
* Improve redundant variable name

* Apply suggestions from code review

Co-Authored-By: pickfire <pickfire@riseup.net>
2019-04-26 16:50:14 +02:00
Ines Montani
0dce4585b1 Add course to 101 2019-04-19 15:59:51 +02:00
Ines Montani
38395d9518 Merge branch 'spacy.io' 2019-04-19 15:26:20 +02:00
Ines Montani
7ac5bb0a7b Update landing and feature overview 2019-04-19 15:23:08 +02:00
fizban99
f2f2df6e78 entity types for colors should be in uppercase (#3599)
although the text indicates the entity types should be in lowercase, the sample code shows uppercase, which is the correct format.
2019-04-17 11:22:56 +02:00
Ines Montani
9e7deeaf48 Remove Datacamp 2019-04-13 17:46:32 +02:00
Ines Montani
2f0f439c54 Remove non-existent example (closes #3533) 2019-04-03 09:59:17 +02:00
Ines Montani
200d8bdb3c Merge branch 'spacy.io' [ci skip] 2019-03-23 16:46:34 +01:00
Ines Montani
06bf130890 💫 Add better and serializable sentencizer (#3471)
* Add better serializable sentencizer component

* Replace default factory

* Add tests

* Tidy up

* Pass test

* Update docs
2019-03-23 15:45:02 +01:00
Ines Montani
b532386a60 Fix typo [ci skip] 2019-03-22 18:36:17 +01:00
Ines Montani
5073ce63fd Merge branch 'spacy.io' [ci skip] 2019-03-22 15:17:11 +01:00
Ines Montani
0712efc6b3 Update version requirements [ci skip] 2019-03-21 10:23:54 +01:00
Ines Montani
d4eed4a84f Add note on unicode build to troubleshooting guide (see #3421) [ci skip] 2019-03-19 10:27:02 +01:00
Ines Montani
a611b32fbf Update model docs [ci skip] 2019-03-17 11:48:18 +01:00
Ines Montani
cbcba699dd Fix missing ids 2019-03-14 17:56:53 +01:00
Ines Montani
4cfe4aa224 Fix small issues in the docs [ci skip] 2019-03-12 22:57:15 +01:00
Ines Montani
ba7eb2d131 Update section [ci skip] 2019-03-12 16:18:34 +01:00
Ines Montani
cecc31b765 Don't auto-slugify accordion links [ci skip] 2019-03-12 15:30:49 +01:00
Ines Montani
72fb324d95 Add vector training script to bin [ci skip] 2019-03-12 12:07:56 +01:00
Ines Montani
3abf0e6b9f Replace dev-resources links with real examples 2019-03-12 12:07:40 +01:00
Ines Montani
59c0620487 Auto-format 2019-03-12 12:07:11 +01:00
Ines Montani
7c05ca01e8 💫 Support mutable default values for extension attributes (#3389)
* Support mutable default values in extensions

* Update documentation
2019-03-11 12:50:44 +01:00
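Note on 7c05ca01e8: supporting mutable default values usually means giving each object its own copy of the default instead of sharing one instance. A minimal sketch of that idea follows; the `Extensions` class and its methods are hypothetical and do not mirror spaCy's internals.

```python
import copy


class Extensions:
    """Toy registry of extension attributes with per-object defaults."""

    def __init__(self):
        self._defaults = {}  # attribute name -> registered default
        self._values = {}    # (object id, attribute name) -> stored value

    def set_extension(self, name, default):
        self._defaults[name] = default

    def get(self, obj_id, name):
        key = (obj_id, name)
        if key not in self._values:
            # Copy the default so that mutating one object's value
            # does not leak into other objects.
            self._values[key] = copy.deepcopy(self._defaults[name])
        return self._values[key]


ext = Extensions()
ext.set_extension("tags", [])
ext.get(1, "tags").append("x")  # only object 1 is affected
```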
Ines Montani
8dbf1e9037 Also fix #3387 on develop 2019-03-10 23:36:28 +01:00
Ines Montani
9a8f169e5c Update v2-1.md 2019-03-10 18:58:51 +01:00
Ines Montani
296446a1c8 Tidy up and improve docs and docstrings (#3370)
## Description
* tidy up and adjust Cython code to code style
* improve docstrings and make calling `help()` nicer
* add URLs to new docs pages to docstrings wherever possible, mostly to user-facing objects
* fix various typos and inconsistencies in docs

### Types of change
enhancement, docs

## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2019-03-08 11:42:26 +01:00
Ines Montani
48a206a95f Fix displaCy visualizations in docs (closes #3357) [ci skip] 2019-03-06 13:20:44 +01:00
Ines Montani
c478a2ccb6 Update backwards incompat [ci skip] 2019-02-27 11:56:56 +01:00
Ines Montani
1b6238101a Add table explaining training metrics [closes #2644] 2019-02-25 10:03:43 +01:00
Ines Montani
62b558ab72 💫 Support lexical attributes in retokenizer attrs (closes #2390) (#3325)
* Fix formatting and whitespace

* Add support for lexical attributes (closes #2390)

* Document lexical attribute setting during retokenization

* Assign variable outside of nested loop
2019-02-24 21:13:51 +01:00
Ines Montani
aa52305461 Improve pipeline model and meta example [ci skip] 2019-02-24 18:45:39 +01:00
Ines Montani
df19e2bff6 💫 Allow setting of custom attributes during retokenization (closes #3314) (#3324)
## Description

This PR adds the ability to override custom extension attributes during merging. This will only work for attributes that are writable, i.e. attributes registered with a default value like `default=False` or attributes that have both a getter *and* a setter implemented.

```python
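import spacy
from spacy.tokens import Token

# Assumption: a blank English pipeline is enough here, since only the tokenizer is needed
nlp = spacy.blank("en")
Token.set_extension('is_musician', default=False)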

doc = nlp("I like David Bowie.")
with doc.retokenize() as retokenizer:
    attrs = {"LEMMA": "David Bowie", "_": {"is_musician": True}}
    retokenizer.merge(doc[2:4], attrs=attrs)

assert doc[2].text == "David Bowie"
assert doc[2].lemma_ == "David Bowie"
assert doc[2]._.is_musician
```

### Types of change
enhancement

## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2019-02-24 18:38:47 +01:00
Ines Montani
403b9cd58b Add docs on adding to existing tokenizer rules [ci skip] 2019-02-24 18:35:19 +01:00
Ines Montani
383e2e1f12 Update Python versions [ci skip] 2019-02-24 11:49:45 +01:00
Ines Montani
b624cb4b89 Update v2-1.md 2019-02-24 11:49:27 +01:00
Ines Montani
0fc908d7a5 Add note on merging speed in v2.1 (see #3300) [ci skip] 2019-02-21 12:34:18 +01:00
Ines Montani
236aa94ded Update v2-1.md 2019-02-21 12:33:56 +01:00
Sofie
9a478b6db8 Clean up of char classes, few tokenizer fixes and faster default French tokenizer (#3293)
* splitting up latin unicode interval

* removing hyphen as infix for French

* adding failing test for issue 1235

* test for issue #3002 which now works

* partial fix for issue #2070

* keep the hyphen as infix for French (as it was)

* restore french expressions with hyphen as infix (as it was)

* added succeeding unit test for Issue #2656

* Fix issue #2822 with custom Italian exception

* Fix issue #2926 by allowing numbers right before infix /

* remove duplicate

* remove xfail for Issue #2179 fixed by Matt

* adjust documentation and remove reference to regex lib
2019-02-20 22:10:13 +01:00
Ines Montani
57ae71ea95 Add docs on serializing the pipeline (see #3289) [ci skip] 2019-02-18 14:13:29 +01:00
Ines Montani
38e4422c0d Improve matcher example (resolves #3287) 2019-02-18 13:26:37 +01:00
Ines Montani
660cfe44c5 Fix formatting 2019-02-18 13:26:22 +01:00
Ines Montani
212ff359ef Fix links [ci skip] 2019-02-17 22:25:50 +01:00
Ines Montani
e597110d31 💫 Update website (#3285)
## Description

The new website is implemented using [Gatsby](https://www.gatsbyjs.org) with [Remark](https://github.com/remarkjs/remark) and [MDX](https://mdxjs.com/). This allows authoring content in **straightforward Markdown** without the usual limitations. Standard elements can be overwritten with powerful [React](http://reactjs.org/) components and wherever Markdown syntax isn't enough, JSX components can be used. Hopefully, this update will also make it much easier to contribute to the docs. Once this PR is merged, I'll implement auto-deployment via [Netlify](https://netlify.com) on a specific branch (to avoid building the website on every PR). There's a bunch of other cool stuff that the new setup will allow us to do – including writing front-end tests, service workers, offline support, implementing a search and so on.

This PR also includes various new docs pages and content.
Resolves #3270. Resolves #3222. Resolves #2947. Resolves #2837.


### Types of change
enhancement

## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2019-02-17 19:31:19 +01:00
ines
3f4fd2c5d5 Update usage documentation 2017-10-03 14:26:20 +02:00
Reza Gharibi
0461b82158 Fix typos 2017-09-27 03:56:20 +03:30
Reza Gharibi
fa1844b132 Fix typo 2017-09-27 03:55:54 +03:30
Reza Gharibi
b5dd7e7cc4 Fix typo 2017-09-27 03:55:28 +03:30
Ines Montani
b8e81daccf Fix typo (closes #1312) 2017-09-14 12:49:59 +02:00
ines
d15775c3ad Fix typos and commands in alpha docs 2017-08-21 13:40:11 +02:00
ines
3c33003078 Port over typo corrections from #1245 2017-08-20 12:00:17 +02:00
ines
a29f132ffd Change python -m spacy to spacy
Reflects latest change to entry point or auto-alias
2017-08-14 13:04:48 +02:00
Nikolai Kruglikov
08e443e083 Fix small typo in documentation 2017-08-14 12:19:04 +02:00
ines
ab8ffbaab7 Add text classification to v2 overview 2017-07-22 17:56:51 +02:00
ines
0fb89dd204 Add text classification usage guide template 2017-07-22 17:56:07 +02:00
ines
d05ab1b3a0 Add text classification to 101 overview and change order 2017-07-22 17:55:53 +02:00
Jarle Mathiesen
f20533ec0c fix small typo 2017-06-24 12:31:33 +02:00
Savva Kolbachev
800a8faff4 Changed the capital of Lithuania to Vilnius
Hi,
There is a typo about the capital of Lithuania.

Vilnius is the capital of Lithuania https://en.wikipedia.org/wiki/Vilnius
Ljubljana is the capital of Slovenia https://en.wikipedia.org/wiki/Ljubljana
2017-06-12 23:27:00 +03:00
Ines Montani
57f64b9e1c Merge pull request #1124 from v3t3a/patch-3
docs - Fix url error for Displacy Ent visualizer
2017-06-12 21:20:32 +02:00
Ines Montani
b2a28028cf Merge pull request #1115 from v3t3a/patch-2
docs - Add read() method when opening file (Lightning tour)
2017-06-12 21:19:25 +02:00
Vetea
eae1f7b19c Fix url error for Displacy Ent visualizer 2017-06-12 14:30:02 +02:00
ines
49026a1346 Fix typos in example (see #1105) 2017-06-08 19:15:50 +02:00
Vetea
cc3aee1189 Add read() method when opening file
Add read() method for 

to avoid :
```TypeError: Argument 'string' has incorrect type (expected str, got _io.TextIOWrapper)```

Test with:
spaCy : v2.0.0 Alpha
python : 3.5.2+ (default, Sep 22 2016, 12:18:14)
2017-06-08 11:27:09 +02:00
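Note on cc3aee1189: the TypeError quoted in this commit comes from passing a file object where a string is expected. The sketch below reproduces the pattern with the standard library only; `process` is a hypothetical stand-in for the text-processing call, and `io.StringIO` stands in for the file returned by `open()`.

```python
import io


def process(text):
    """Hypothetical stand-in for a function that only accepts str."""
    if not isinstance(text, str):
        raise TypeError("expected str, got {}".format(type(text).__name__))
    return text.split()


# io.StringIO stands in for a file object returned by open().
file_obj = io.StringIO("Hello world")
tokens = process(file_obj.read())  # pass the contents, not the file object
```

Passing `file_obj` itself instead of `file_obj.read()` raises the TypeError.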
ines
6b799bac54 Fix formatting and details 2017-06-06 14:37:49 +02:00
ines
fd9ae0f0e0 Update v2 comparison table 2017-06-05 16:39:11 +02:00
ines
a3f9745a14 Update similarity usage guide and examples 2017-06-05 15:37:33 +02:00
ines
fd35d910b8 Update v2 docs and benchmarks 2017-06-05 14:13:38 +02:00
ines
040553ca59 Update architecture and features table 2017-06-05 13:33:01 +02:00
ines
505d43b832 Update norms example 2017-06-04 23:33:26 +02:00
ines
f8e93b6d0a Update norms example 2017-06-04 23:24:29 +02:00
ines
a857b2b511 Update norms example 2017-06-04 23:21:37 +02:00
ines
47d066b293 Add under construction 2017-06-04 23:17:54 +02:00
ines
e9816daa6a Add details on syntax iterators 2017-06-04 23:16:33 +02:00
ines
990cb81556 Add info on syntax iterators 2017-06-04 21:47:22 +02:00
ines
e4eb33daf7 Add links to production use guide 2017-06-04 20:56:58 +02:00
ines
63cd539d04 Add more details on model packages and requirements.txt (see #1099) 2017-06-04 20:52:10 +02:00
ines
97ff83d163 Fix docs on model loading 2017-06-04 20:44:59 +02:00
ines
b6002db797 Add v2 label 2017-06-04 18:53:03 +02:00
ines
468ff1a7dd Update v2 docs and add benchmarks stub 2017-06-04 15:34:28 +02:00
Matthew Honnibal
23fd6b1782 Add intro narrative for v2 2017-06-04 15:10:37 +02:00
ines
3419ecbfdd Update docs on model shortcut links 2017-06-04 13:55:00 +02:00
ines
586e901143 Add v2 intro stub 2017-06-04 13:42:37 +02:00
ines
4f8f62d9b3 Merge branch 'v2-docs-edits' into develop 2017-06-04 13:40:58 +02:00
ines
809903dcad Fix link and update wording 2017-06-04 13:29:20 +02:00
ines
22dd18c364 Remove redundant CPU commands 2017-06-04 13:29:13 +02:00
ines
1d6377218a Update architecture blurb and move other info 2017-06-04 13:28:58 +02:00
ines
7a66c9f039 Fix formatting 2017-06-04 13:14:00 +02:00
Matthew Honnibal
f2c4a9f690 Edits to spacy-101 page 2017-06-04 13:10:27 +02:00
Matthew Honnibal
aca53b95e1 Link architecture blurb 2017-06-04 13:10:06 +02:00
Matthew Honnibal
64ca5123bb Add Architecture 101 blurb 2017-06-04 13:09:19 +02:00
Matthew Honnibal
e77ed953f4 Update GPU instructions 2017-06-04 12:03:22 +02:00
ines
1d3b012e56 Update adding languages docs and add 101 2017-06-03 23:54:23 +02:00
ines
a3715a81d5 Update adding languages guide 2017-06-03 22:16:38 +02:00
ines
ec6d2bc81d Add table of contents mixin 2017-06-03 22:16:26 +02:00
ines
9acf8686f7 Update note on compact mode issues 2017-06-03 13:31:16 +02:00
ines
c60431357d Port over docs typo corrections 2017-06-03 11:31:30 +02:00
ines
c6dc2fafc0 Add Spanish and move example sentences to meta 2017-06-01 17:49:56 +02:00
ines
b577ed79ee Move social image logic out to function and move files 2017-06-01 14:27:44 +02:00
ines
5e60b09dcd Fix custom tokenizer example 2017-06-01 13:02:50 +02:00
ines
8274dffad6 Update NER training draft 2017-06-01 12:51:36 +02:00
ines
04fac3f52a Add NER training example code 2017-06-01 12:47:47 +02:00
ines
7f5e7e7320 Fix typo 2017-06-01 12:47:36 +02:00
ines
4a927154d8 Update v2 docs 2017-06-01 11:56:32 +02:00
ines
03bbb96db8 Remove outdated examples 2017-06-01 11:56:02 +02:00
ines
789e69b73f Update training guide 2017-06-01 11:53:23 +02:00
ines
2f40d6e7e7 Add training 101 2017-06-01 11:53:16 +02:00
ines
abed463bbb Update serialization 101 2017-06-01 11:52:58 +02:00
ines
72380c952a Update training section in NER guide and add links 2017-06-01 11:52:49 +02:00
ines
22b1f72870 Add spaCy 101 intro 2017-05-31 12:44:09 +02:00
ines
a18b95ca12 Update docs on testing 2017-05-31 12:43:40 +02:00
ines
981196c181 Fix typo 2017-05-31 11:34:31 +02:00
ines
f86289566a Update new in v2 section and add note on Matcher acceptors 2017-05-30 13:53:06 +02:00
ines
ce4e45d0bb Update 101 intro 2017-05-29 22:15:06 +02:00
ines
687ed28340 Update processing pipelines guide 2017-05-29 14:21:00 +02:00
ines
d5992f408f Update note on vocab consistency 2017-05-29 14:14:26 +02:00
ines
a2134951f2 Update 101 and add note on pipeline order and tensors 2017-05-29 11:45:32 +02:00
ines
17b635eaab Update alpha docs note and fix typo 2017-05-29 11:09:24 +02:00
ines
fbe105f1eb Add note on L in long integers in Python 2 2017-05-29 11:05:05 +02:00
ines
9d74810f6f Update examples 2017-05-29 01:09:52 +02:00
ines
42cf414138 Update Matcher example 2017-05-29 01:09:52 +02:00
ines
00b2094dc3 Fix typos, long integers and tests 2017-05-29 01:09:52 +02:00
ines
d71c6db76e Add missing Chainer install for GPU if building spaCy from source 2017-05-28 23:34:59 +02:00
ines
e0f9ccdaa3 Update texts and rename vectorizer to tensorizer 2017-05-28 23:26:13 +02:00
ines
606879b217 Update hash strings examples 2017-05-28 19:42:44 +02:00
ines
c7b57ea314 Update docs and change integer IDs to hash values 2017-05-28 19:25:34 +02:00
ines
738b4f7187 Add quickstart options and docs for GPU 2017-05-28 19:20:11 +02:00
ines
4c00cb8c8b Update 101 and add community/FAQ and table of contents 2017-05-28 18:45:49 +02:00
ines
8a148b6563 Fix code, links and formatting 2017-05-28 18:29:16 +02:00
ines
414193e9ba Update docs to reflect StringStore changes 2017-05-28 18:19:11 +02:00
ines
69bda9aed7 Update text, examples, typos, wording and formatting 2017-05-28 16:41:01 +02:00
ines
f8185b8e11 Rename vocab-stringsotre to vocab 2017-05-28 16:37:14 +02:00
ines
10d05c2b92 Fix typos, wording and formatting 2017-05-28 01:30:12 +02:00
ines
db116cbeda Update tokenization 101 and add illustration 2017-05-28 00:22:40 +02:00
ines
b03fb2d7b0 Update 101 and usage docs 2017-05-28 00:22:40 +02:00
ines
ae11c8d60f Add emoji sentiment to lightning tour matcher example 2017-05-27 20:02:20 +02:00
ines
22bf5f63bf Update Matcher docs and add social media analysis example 2017-05-27 17:58:18 +02:00
ines
0d33ead507 Fix initialisation of Doc in lightning tour example 2017-05-27 17:58:06 +02:00
ines
e05bcd6aa8 Update docs to reflect flattened model meta.json
Don't use "setup" key and instead, keep "lang" on root level and add
"pipeline".
2017-05-27 17:57:46 +02:00
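Note on e05bcd6aa8: the flattened layout described here (no "setup" key, "lang" on the root level, plus a "pipeline" entry) can be sketched as a small conversion. The function name, the sample values, and the assumption that "lang" previously lived under "setup" are illustrative guesses, not taken from the actual meta.json files.

```python
def flatten_meta(meta, pipeline):
    """Return a copy of meta without "setup", with root-level lang and pipeline."""
    flat = {key: value for key, value in meta.items() if key != "setup"}
    # Assumption: the old layout stored the language code under "setup".
    flat["lang"] = meta.get("setup", {}).get("lang")
    flat["pipeline"] = list(pipeline)
    return flat


# Hypothetical old-style meta; the values are made up for illustration.
old_meta = {"name": "example_model", "setup": {"lang": "en"}}
new_meta = flatten_meta(old_meta, ["tagger", "parser"])
```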
ines
1b982f0838 Update train command and add docs on hyperparameters 2017-05-26 14:02:38 +02:00
ines
93ee5c4a52 Update serialization info 2017-05-26 13:22:45 +02:00
ines
f122d82f29 Update usage docs and add "under construction" 2017-05-26 13:17:48 +02:00
ines
286c3d0719 Update usage and 101 docs 2017-05-26 12:46:29 +02:00
ines
6d76c1ea16 Add 101 for Vocab, Lexeme and StringStore 2017-05-26 12:45:01 +02:00
ines
9063654a1a Add Training 101 stub 2017-05-25 11:18:02 +02:00
ines
b2324be3e9 Fix typos, text, examples and formatting 2017-05-25 11:17:21 +02:00
ines
dcb10da615 Update and fix lightning tour examples 2017-05-25 11:15:56 +02:00
ines
4b5540cc63 Rewrite examples in lightning tour 2017-05-25 01:58:33 +02:00
ines
87c976e04c Update model tag 2017-05-25 01:58:22 +02:00
ines
fe2b0b8b8d Update migrating docs 2017-05-25 00:56:35 +02:00
ines
709ea58990 Tidy up workflows 2017-05-25 00:56:16 +02:00
ines
d122bbc908 Rewrite custom tokenizer docs 2017-05-25 00:30:21 +02:00
ines
0f48fb1f97 Rename processing text to production use and remove linear feature scheme 2017-05-25 00:10:33 +02:00
ines
419d265ff0 Add section on disabling pipeline components 2017-05-25 00:10:06 +02:00
ines
9efa662345 Update dependency parse docs and add note on disabling parser 2017-05-25 00:09:51 +02:00
ines
9337866dae Add aside to pipeline 101 table 2017-05-24 22:46:18 +02:00
ines
c25f3133ca Update section on new v2.0 features 2017-05-24 20:54:37 +02:00
ines
f4658ff053 Rewrite usage workflow on saving and loading 2017-05-24 20:54:02 +02:00
ines
764bfa3239 Add section on using displaCy in a web app 2017-05-24 20:53:43 +02:00
ines
4f396236f6 Update saving and loading docs 2017-05-24 19:25:49 +02:00
ines
8aaed8bea7 Add pipelines 101 and rewrite pipelines workflow 2017-05-24 19:25:13 +02:00
ines
54885b5e88 Add serialization 101 2017-05-24 19:24:40 +02:00
ines
8b86b08bed Update usage workflows 2017-05-24 11:59:08 +02:00
ines
10afb3c796 Tidy up and merge usage pages 2017-05-24 00:37:47 +02:00
ines
990a70732a Move installation troubleshooting to installation docs 2017-05-24 00:37:21 +02:00
ines
697d3d7cb3 Fix links to CLI docs 2017-05-24 00:36:38 +02:00
ines
4fb5fb7218 Update v2 docs 2017-05-23 23:40:04 +02:00
ines
e6d88dfe08 Add features table to 101 2017-05-23 23:38:33 +02:00
ines
7ef7f0b42c Add linguistic annotations 101 content 2017-05-23 23:37:51 +02:00
ines
9ed6b48a49 Update dependency parse workflow 2017-05-23 23:34:39 +02:00
ines
fe24267948 Update usage docs meta and navigation 2017-05-23 23:19:20 +02:00
ines
af348025ec Update word vectors & similarity workflow 2017-05-23 23:19:09 +02:00
ines
b6c62baab3 Update What's new in v2 docs 2017-05-23 23:18:53 +02:00
ines
b6209e2427 Update POS tagging workflow 2017-05-23 23:18:08 +02:00
ines
43258d6b0a Update NER workflow 2017-05-23 23:17:57 +02:00
ines
61cf2bba55 Fix code example 2017-05-23 23:17:37 +02:00
ines
1c06ef3542 Update spaCy architecture 2017-05-23 23:17:25 +02:00
ines
a433e5012a Update adding languages docs 2017-05-23 23:16:44 +02:00
ines
3523715d52 Add spaCy 101 components 2017-05-23 23:16:31 +02:00
ines
3aff883434 Add displaCy examples to lightning tour 2017-05-23 23:15:39 +02:00
ines
6ef09d7ed8 Change save_to_directory to to_disk 2017-05-23 23:15:31 +02:00
ines
e6acd3bbf2 Fix matcher tests and matcher docs 2017-05-23 11:36:02 +02:00
ines
f497cf60b2 Update formatting 2017-05-23 11:32:25 +02:00
ines
4cd26bcb83 Update docs on rule-based matching and add examples 2017-05-22 19:04:02 +02:00
ines
701cba1524 Update models documentation with notes 2017-05-22 18:53:14 +02:00
ines
a23f487b06 Tidy up displaCy and add "manual" option
Also don't require title in EntityRenderer
2017-05-22 18:48:20 +02:00
ines
aa9c3bd464 Fix formatting 2017-05-22 13:55:01 +02:00
ines
d5a6a9a6a9 Use string values for attrs in Matcher docs 2017-05-22 13:54:45 +02:00
ines
cc569a348d Add quickstart widget to models and update docs
Add global variable for models and generate all model listings
programmatically
2017-05-21 20:55:52 +02:00
ines
924e8506de Move Defaults subclass to module scope (necessary for pickling) 2017-05-20 19:02:27 +02:00
ines
4ed6a36622 Update docstrings and API docs for Matcher 2017-05-20 14:43:10 +02:00
ines
39f36539f6 Update docstrings and API docs for Matcher 2017-05-20 14:32:34 +02:00
ines
b218c1964a Update "What's new in v2.0" docs 2017-05-20 14:00:41 +02:00
ines
e10c48210d Update Matcher API and workflow to reflect new API
on_match is now the second positional argument, to easily allow a
variable number of patterns while keeping the method clean and readable.
2017-05-20 12:59:03 +02:00
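Note on e10c48210d: placing `on_match` second is what lets a variable number of patterns follow it as positional arguments. A minimal sketch of that argument order is shown below; `ToyMatcher` is a hypothetical class for illustration and does no matching.

```python
class ToyMatcher:
    """Toy rule registry showing the argument order; not spaCy's Matcher."""

    def __init__(self):
        self.rules = {}

    def add(self, key, on_match, *patterns):
        # on_match comes second so that any number of patterns can follow it.
        self.rules[key] = (on_match, list(patterns))


matcher = ToyMatcher()
# One key, no callback, two alternative patterns.
matcher.add("HELLO", None, [{"LOWER": "hello"}], [{"LOWER": "hi"}])
```

If the callback came last instead, it would have to be passed as a keyword argument whenever more than one pattern is given.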
ines
9edc7fb0ba Update Matcher API docs 2017-05-20 12:27:22 +02:00
ines
5163a4513e Update API docs 2017-05-20 01:43:48 +02:00
ines
784347160d Rewrite rule-based matching workflow 2017-05-20 01:38:55 +02:00
ines
7f9539da27 Fix old download command and formatting 2017-05-20 01:38:43 +02:00
ines
476b8209fe Update docs with new Jupyter auto-detection 2017-05-18 14:58:17 +02:00
ines
11f52b8b83 Add headline to installation details and move aside 2017-05-17 12:04:03 +02:00
ines
533bb63816 Implement quickstart widget 2017-05-17 12:04:03 +02:00
ines
9df9a87d03 Add visualizer usage example 2017-05-17 12:04:03 +02:00
ines
6364a9be9d Add What's new and spaCy 101 stubs 2017-05-17 12:04:03 +02:00
ines
f4ae1e8750 Add section on adding titles to documents 2017-05-17 12:04:03 +02:00
ines
02a4841e7b Move CLI docs to API reference 2017-05-17 12:04:03 +02:00
ines
accf05b0a9 Update visualizers docs 2017-05-15 14:37:01 +02:00
ines
6d7986b7bc Update docs 2017-05-15 01:46:33 +02:00
ines
c6e8d55dcb Update NER workflow with new displaCy 2017-05-15 01:42:11 +02:00
ines
860a60e251 Fix explanation 2017-05-15 01:31:11 +02:00
ines
5c044cb670 Add visualizers usage docs 2017-05-15 01:25:18 +02:00
ines
3d37564a09 Remove resources from navigation for now
Not sure what to do with this page... maybe merge it with something
else?
2017-05-14 23:29:58 +02:00
ines
b462076d80 Merge load_lang_class and get_lang_class 2017-05-14 01:31:10 +02:00
ines
144161c58c Update links to dev resources 2017-05-13 21:23:02 +02:00
ines
0095d5322b Update adding languages docs 2017-05-13 18:54:10 +02:00
ines
1d94c0e98a Update table of contents 2017-05-13 15:42:51 +02:00
ines
a48e21755e Add section on testing language tokenizers 2017-05-13 15:39:27 +02:00
ines
2f54fefb5d Update adding languages docs 2017-05-13 14:54:58 +02:00
ines
3665acc0de Update adding languages docs 2017-05-13 12:39:36 +02:00
ines
3454f2aca8 Update showcase 2017-05-13 03:32:03 +02:00
ines
67726d1837 Update data model docs 2017-05-13 03:10:56 +02:00
ines
915b50c736 Update adding languages docs 2017-05-13 03:10:50 +02:00
ines
c4d2c3cac7 Update adding languages docs 2017-05-12 15:38:17 +02:00
Ines Montani
fb96f88b59 Update info on CoNLL format and include link 2017-04-27 14:36:08 +02:00
M. Z. Ferdous (Imran)
c9f9203d5f fix typo, CONLL format
tried to google about connlu format. Saw there is conll format, not connlu.
2017-04-27 16:48:54 +06:00
ines
5aa49971f9 Add French example to models docs 2017-04-27 12:08:47 +02:00
ines
100846bed3 Fix typo in model list 2017-04-26 21:40:17 +02:00
ines
4eacd72bc3 Move list of models to own file 2017-04-26 20:50:27 +02:00
ines
c2006166d3 Update list of available models and info 2017-04-26 16:03:41 +02:00
ines
e6bdf5bc5c Update adding language / training docs (see #966)
Add data examples and more info on training and CLI commands
2017-04-26 14:01:19 +02:00
ines
ae2b77db1b Fix info on naming conventions 2017-04-26 14:01:19 +02:00
Julien Chaumond
f997bceb07 Make object of the deep learning tutorial clearer
This is a great tutorial, but I think it is weirdly explained in the current form. The largest part of the code is about implementing the actual sentiment analysis model, not about counting entities. (which is not even present in the `deep_learning_keras.py` script in `examples`)
2017-04-24 11:55:41 +02:00
ines
2bfec1a4f8 Add note on languages with non-latin characters (see #996) 2017-04-23 15:58:40 +02:00
ines
2ab394d655 Fix whitespace 2017-04-17 01:45:00 +02:00
ines
7f776258f0 Add link to API docs 2017-04-17 01:41:46 +02:00
ines
c6c3162c50 Fix lightning tour example (closes #889) 2017-04-17 00:00:30 +02:00
ines
de5062711b Update adding languages workflow to reflect changes in __init__.py 2017-04-16 22:26:46 +02:00
ines
e4dd645c37 Update link 2017-04-16 20:37:46 +02:00
ines
dea79224ed Remove saving & loading docs and link to new workflow 2017-04-16 20:37:45 +02:00
ines
c365795bf6 Update navigation 2017-04-16 20:37:45 +02:00
ines
5bbbb7674b Add training examples to tutorials 2017-04-16 20:37:45 +02:00
ines
17e9743388 Add saving & loading models docs 2017-04-16 20:37:45 +02:00
ines
b15bdb5279 Update training docs 2017-04-16 20:37:45 +02:00
ines
5cb17b9f33 Add NER training docs 2017-04-16 20:37:45 +02:00
ines
d29c825ca4 Update docs for package command 2017-04-16 13:37:24 +02:00
ines
cf558e37c3 Update adding languages docs with new commands 2017-04-13 13:52:11 +02:00
Sohil
328678c7e9 Extra brace ")" creating error
There is an extra closing brace `)` which causes an error when running the example.
2017-04-13 17:12:28 +05:30
ines
1f501af602 Add file name shadowing module issue to troubleshooting guide (see #953) 2017-04-07 16:21:32 +02:00
ines
2f38c1d77f Add documentation for new convert and model commands 2017-04-07 13:27:55 +02:00
ines
f33c4cbae1 Add --no-cache-dir error to troubleshooting docs (see #958) 2017-04-07 10:22:18 +02:00
ines
d6bbc3ffcd Fix formatting 2017-04-07 10:22:18 +02:00
ines
2c36a61ec5 Add spacyr to libraries 2017-04-03 18:12:38 +02:00
ines
e210496f78 Update Windows compiler docs 2017-03-29 10:35:20 +02:00
ines
13df2d6a60 Add documentation for spaCy's JSON format 2017-03-26 15:56:15 +02:00
ines
5901c8f7f0 Update spacy train CLI documentation 2017-03-26 15:33:48 +02:00
ines
afd839f64b Add pip and conda badges to installation docs 2017-03-26 14:11:31 +02:00
ines
9a481c9f42 Add "Troubleshooting" section 2017-03-26 13:42:36 +02:00
ines
d4a86b6394 Update formatting 2017-03-26 13:42:19 +02:00
ines
1dae97b2f6 Fix typos 2017-03-26 11:14:44 +02:00
ines
fa6e3cefbb Simplify package command docs 2017-03-21 11:35:29 +01:00
ines
49bbfdaac1 Add info on CLI to docs on own models 2017-03-21 11:25:01 +01:00
ines
09b24bc5a9 Add docs for package command 2017-03-21 11:19:21 +01:00
ines
81b28ca606 Update models docs with info on retraining own models 2017-03-20 18:01:55 +01:00
ines
ef5e261387 Add spacy_api project by @kootenpv to showcase 2017-03-19 12:49:40 +01:00
ines
fa1f2040a5 Use correct code block language 2017-03-18 18:19:50 +01:00
ines
ff277140f9 Add CLI docs 2017-03-18 15:24:50 +01:00
ines
e635e1f6f4 Update docs to reflect new commands 2017-03-18 15:24:42 +01:00
ines
e9d8d756fc Fix typo in pytest flags 2017-03-18 15:24:20 +01:00
ines
3926ffdb70 Update models docs 2017-03-17 19:26:37 +01:00
ines
76c0ea6cc6 Update models docs 2017-03-17 17:01:16 +01:00
ines
b322f31521 Update models docs 2017-03-17 16:09:56 +01:00
ines
7f25f64acc Update lightning tour 2017-03-17 13:11:00 +01:00
ines
e461fafd14 Update example 2017-03-16 23:23:35 +01:00
ines
f4df9463f2 Fix wording 2017-03-16 22:21:46 +01:00
ines
08b0fb62cc Update models docs 2017-03-16 22:09:43 +01:00
ines
0b5c664b04 Update resources 2017-03-16 21:59:26 +01:00
ines
807139ae61 Update installation docs and add models quickstart aside 2017-03-16 21:53:44 +01:00
ines
ec75c781b9 Add docs page for models 2017-03-16 21:53:31 +01:00
ines
4c53eed35a Remove sputnik from dependencies and docs 2017-03-15 17:39:25 +01:00
ines
758335452d Update installation instructions and fix formatting 2017-03-08 11:36:00 +01:00
ines
004c4c9566 Update installation docs
Include conda and virtualenv info for pip, add instructions for
downloading models manually and add details and fab commands to
"Compile from source" section.
2017-03-07 18:52:22 +01:00
yalei
27c0e6226b Edit example code
The original code forgot to import the `random` module and the `EntityRecognizer` module.
2017-03-07 18:07:40 +08:00
ines
2b07ab7db4 Add feature scheme to API docs (see #857, #739) 2017-02-24 18:26:32 +01:00
ines
8ddad178f6 Add book and tutorial 2017-02-24 18:26:32 +01:00
John Gamboa
e31894b800 Fixes example 3 of entity recognition (see issue #832) 2017-02-16 11:19:53 +01:00
Stefan Bunk
2bf19d4735 Fix error in pipeline loading documentation
The cell for the `vocab` parameter is not displayed, making it seem as if the explanation belongs to the previous param.
2017-02-10 12:06:55 +01:00
Stefan Bunk
e972b2fa87 Fix error in matching documentation
LOWER and IS_PUNCT are members of `spacy` and not of the `Matcher` class.
2017-02-07 16:52:01 +01:00
Matthew Honnibal
9aaa2c5633 Fix entity recognition example (closes #803) 2017-02-05 11:23:12 +01:00
Ines Montani
651bf411e0 Add tutorial 2017-01-26 13:48:38 +01:00
Ines Montani
da3aca4020 Fix formatting 2017-01-26 13:48:29 +01:00
Kevin Gao
7ec710af0e Fix Custom Tokenizer docs
- Fix mismatched quotations
- Make it more clear where ORTH, LEMMA, and POS symbols come from
- Make strings consistent
- Fix lemma_ assertion s/-PRON-/me/
2017-01-17 10:38:14 -08:00
Jason Kessler
9fa6f9fb40 Origin of spacy.matcher attributes
Make it clear that Matcher attributes live in spacy.matcher.attrs.
2017-01-16 13:31:35 -06:00
Ines Montani
57919566b8 Add Jupyter notebooks repo to resources list 2017-01-05 20:50:08 +01:00
Ines Montani
e3d84572f2 Fix ents input format example 2017-01-01 12:28:37 +01:00
Guy Rosin
acdd2fc9a6 Tiny code typo 2016-12-31 14:53:05 +02:00
Ines Montani
b7becaec85 Fix typo 2016-12-25 15:23:32 +01:00
Ines Montani
207555fae7 Fix spelling 2016-12-23 21:36:01 +01:00
Ines Montani
48b03b4001 Fix formatting and wording 2016-12-23 14:36:03 +01:00
Ines Montani
cc051ddc15 Add resources page to usage docs 2016-12-23 14:36:03 +01:00
Ines Montani
d1a2846750 Document DET_LEMMA 2016-12-21 18:18:35 +01:00
aikramer2
349143faa2 update to training doc 2016-12-20 12:01:16 -08:00
Ines Montani
a2525c76ee Reformat word frequencies section in "adding languages" workflow 2016-12-19 17:18:38 +01:00
Ines Montani
d0c15730c4 Fix link 2016-12-19 13:09:45 +01:00
Ines Montani
a9c0e77b80 Fix typo 2016-12-19 13:09:45 +01:00
Ines Montani
fa65c6b54c Add "Adding languages" workflow (closes #562) 2016-12-18 23:54:19 +01:00
Ines Montani
1cddb7da36 Add "Part-of-speech tagging" workflow (closes #581) 2016-12-18 23:54:19 +01:00
Ines Montani
ac597b58f6 Update showcase 2016-12-18 23:54:18 +01:00
Ines Montani
614ca6fb41 Split annotation specs into files so they can be included in different places 2016-12-18 17:42:10 +01:00
Ines Montani
ce8bf08223 Fix formatting 2016-12-18 17:40:20 +01:00
jaspb
3d7f81ddf5 added 'en' to spacy.load(..) 2016-12-10 19:18:13 +00:00
Tobias Macey
1d768d6510 Fixed minor typo
The word `motto` was missing the second `t`.
2016-12-01 06:08:33 -05:00
Jimi Smoot
8373115cbd Minor typos 2016-11-25 18:22:52 -08:00
Ines Montani
ada007cb73 Fix formatting for consistency 2016-11-25 15:53:40 +01:00