451 Commits

Author SHA1 Message Date
Ines Montani
1ea472468a Add usage docs for aligning tokenization 2019-07-17 15:08:33 +02:00
pmbaumgartner
9a86d95ea2 fix custom attribute links 2019-07-14 20:23:54 -04:00
Ines Montani
ebe58e7fa1 Document gold.docs_to_json [ci skip] 2019-07-10 10:27:33 +02:00
Ines Montani
881f5bc401 Auto-format 2019-07-10 10:27:29 +02:00
Ines Montani
d361e380b8 Fix matcher callback example (closes #3862) 2019-06-26 14:47:26 +02:00
Alejandro Alcalde
4866a7ee9e Changed learning rate by its param name. (#3855)
* Changed learning rate by its param name.

I've been searching for a while for how the learning rate parameter was named. With `beta1` and `beta2` it's easy, as they are marked as code, but the learning rate wasn't. I think writing the actual parameter name would be helpful.

* Signing SCA
2019-06-20 10:29:20 +02:00
Ramanan Balakrishnan
eb12703d10 minor fix to broken link in documentation (#3819) [ci skip] 2019-06-04 11:15:35 +02:00
Ines Montani
0c74506c9c Fix typos in docs (closes #3802) [ci skip] 2019-06-01 11:35:01 +02:00
mak
89379a7fa4 Corrected example model URL in requirements.txt (#3786)
The URL used to show how to add a model to the requirements.txt had the old release path (excl. explosion).
2019-05-29 10:51:55 +02:00
Aaron Kub
719a15f23d fixing regex matcher examples (#3708) (#3719) 2019-05-10 14:23:52 +02:00
张晓飞
ba1ff00370 update response after calling add_pipe (#3661)
* update response after calling add_pipe

The print_info component is appended last, so it needs to be shown at the end of the pipeline

* Create henry860916.md
2019-05-01 12:02:18 +02:00
Ramiro Gómez
8ee4100f8f Remove dangling M (#3657)
I assume this is a typo. Sorry if it has a meaning that I'm not aware of.
2019-04-29 19:44:43 +02:00
Amit Chaudhary
167d63af31 Fix broken link to Dive Into Python 3 website (#3656)
* Fix broken link to Dive Into Python 3 website

* Sign spaCy Contributor Agreement
2019-04-29 19:44:00 +02:00
Ivan Tham
fa94f83697 Improve redundant variable name (#3643)
* Improve redundant variable name

* Apply suggestions from code review

Co-Authored-By: pickfire <pickfire@riseup.net>
2019-04-26 16:50:14 +02:00
Ines Montani
0dce4585b1 Add course to 101 2019-04-19 15:59:51 +02:00
Ines Montani
38395d9518 Merge branch 'spacy.io' 2019-04-19 15:26:20 +02:00
Ines Montani
7ac5bb0a7b Update landing and feature overview 2019-04-19 15:23:08 +02:00
fizban99
f2f2df6e78 entity types for colors should be in uppercase (#3599)
although the text indicates the entity types should be in lowercase, the sample code shows uppercase, which is the correct format.
2019-04-17 11:22:56 +02:00
Ines Montani
9e7deeaf48 Remove Datacamp 2019-04-13 17:46:32 +02:00
Ines Montani
2f0f439c54 Remove non-existent example (closes #3533) 2019-04-03 09:59:17 +02:00
Ines Montani
200d8bdb3c Merge branch 'spacy.io' [ci skip] 2019-03-23 16:46:34 +01:00
Ines Montani
06bf130890 💫 Add better and serializable sentencizer (#3471)
* Add better serializable sentencizer component

* Replace default factory

* Add tests

* Tidy up

* Pass test

* Update docs
2019-03-23 15:45:02 +01:00
Ines Montani
b532386a60 Fix typo [ci skip] 2019-03-22 18:36:17 +01:00
Ines Montani
5073ce63fd Merge branch 'spacy.io' [ci skip] 2019-03-22 15:17:11 +01:00
Ines Montani
0712efc6b3 Update version requirements [ci skip] 2019-03-21 10:23:54 +01:00
Ines Montani
d4eed4a84f Add note on unicode build to troubleshooting guide (see #3421) [ci skip] 2019-03-19 10:27:02 +01:00
Ines Montani
a611b32fbf Update model docs [ci skip] 2019-03-17 11:48:18 +01:00
Ines Montani
cbcba699dd Fix missing ids 2019-03-14 17:56:53 +01:00
Ines Montani
4cfe4aa224 Fix small issues in the docs [ci skip] 2019-03-12 22:57:15 +01:00
Ines Montani
ba7eb2d131 Update section [ci skip] 2019-03-12 16:18:34 +01:00
Ines Montani
cecc31b765 Don't auto-slugify accordion links [ci skip] 2019-03-12 15:30:49 +01:00
Ines Montani
72fb324d95 Add vector training script to bin [ci skip] 2019-03-12 12:07:56 +01:00
Ines Montani
3abf0e6b9f Replace dev-resources links with real examples 2019-03-12 12:07:40 +01:00
Ines Montani
59c0620487 Auto-format 2019-03-12 12:07:11 +01:00
Ines Montani
7c05ca01e8 💫 Support mutable default values for extension attributes (#3389)
* Support mutable default values in extensions

* Update documentation
2019-03-11 12:50:44 +01:00
Ines Montani
8dbf1e9037 Also fix #3387 on develop 2019-03-10 23:36:28 +01:00
Ines Montani
9a8f169e5c Update v2-1.md 2019-03-10 18:58:51 +01:00
Ines Montani
296446a1c8 Tidy up and improve docs and docstrings (#3370)
## Description
* tidy up and adjust Cython code to code style
* improve docstrings and make calling `help()` nicer
* add URLs to new docs pages to docstrings wherever possible, mostly to user-facing objects
* fix various typos and inconsistencies in docs

### Types of change
enhancement, docs

## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2019-03-08 11:42:26 +01:00
Ines Montani
48a206a95f Fix displaCy visualizations in docs (closes #3357) [ci skip] 2019-03-06 13:20:44 +01:00
Ines Montani
c478a2ccb6 Update backwards incompat [ci skip] 2019-02-27 11:56:56 +01:00
Ines Montani
1b6238101a Add table explaining training metrics [closes #2644] 2019-02-25 10:03:43 +01:00
Ines Montani
62b558ab72 💫 Support lexical attributes in retokenizer attrs (closes #2390) (#3325)
* Fix formatting and whitespace

* Add support for lexical attributes (closes #2390)

* Document lexical attribute setting during retokenization

* Assign variable outside of nested loop
2019-02-24 21:13:51 +01:00
Ines Montani
aa52305461 Improve pipeline model and meta example [ci skip] 2019-02-24 18:45:39 +01:00
Ines Montani
df19e2bff6 💫 Allow setting of custom attributes during retokenization (closes #3314) (#3324)
## Description

This PR adds the ability to override custom extension attributes during merging. This will only work for attributes that are writable, i.e. attributes registered with a default value like `default=False` or attributes that have both a getter *and* a setter implemented.

```python
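import spacy
from spacy.tokens import Token

# Assumption: a blank English pipeline stands in for `nlp`, which the snippet leaves undefined
nlp = spacy.blank("en")

# Register a writable custom attribute with a default value
Token.set_extension('is_musician', default=False)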

doc = nlp("I like David Bowie.")
with doc.retokenize() as retokenizer:
    attrs = {"LEMMA": "David Bowie", "_": {"is_musician": True}}
    retokenizer.merge(doc[2:4], attrs=attrs)

assert doc[2].text == "David Bowie"
assert doc[2].lemma_ == "David Bowie"
assert doc[2]._.is_musician
```

### Types of change
enhancement

## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2019-02-24 18:38:47 +01:00
Ines Montani
403b9cd58b Add docs on adding to existing tokenizer rules [ci skip] 2019-02-24 18:35:19 +01:00
Ines Montani
383e2e1f12 Update Python versions [ci skip] 2019-02-24 11:49:45 +01:00
Ines Montani
b624cb4b89 Update v2-1.md 2019-02-24 11:49:27 +01:00
Ines Montani
0fc908d7a5 Add note on merging speed in v2.1 (see #3300) [ci skip] 2019-02-21 12:34:18 +01:00
Ines Montani
236aa94ded Update v2-1.md 2019-02-21 12:33:56 +01:00
Sofie
9a478b6db8 Clean up of char classes, few tokenizer fixes and faster default French tokenizer (#3293)
* splitting up latin unicode interval

* removing hyphen as infix for French

* adding failing test for issue 1235

* test for issue #3002 which now works

* partial fix for issue #2070

* keep the hyphen as infix for French (as it was)

* restore french expressions with hyphen as infix (as it was)

* added succeeding unit test for Issue #2656

* Fix issue #2822 with custom Italian exception

* Fix issue #2926 by allowing numbers right before infix /

* splitting up latin unicode interval

* removing hyphen as infix for French

* adding failing test for issue 1235

* test for issue #3002 which now works

* partial fix for issue #2070

* keep the hyphen as infix for French (as it was)

* restore french expressions with hyphen as infix (as it was)

* added succeeding unit test for Issue #2656

* Fix issue #2822 with custom Italian exception

* Fix issue #2926 by allowing numbers right before infix /

* remove duplicate

* remove xfail for Issue #2179 fixed by Matt

* adjust documentation and remove reference to regex lib
2019-02-20 22:10:13 +01:00
Ines Montani
57ae71ea95 Add docs on serializing the pipeline (see #3289) [ci skip] 2019-02-18 14:13:29 +01:00
Ines Montani
38e4422c0d Improve matcher example (resolves #3287) 2019-02-18 13:26:37 +01:00
Ines Montani
660cfe44c5 Fix formatting 2019-02-18 13:26:22 +01:00
Ines Montani
212ff359ef Fix links [ci skip] 2019-02-17 22:25:50 +01:00
Ines Montani
e597110d31 💫 Update website (#3285)
## Description

The new website is implemented using [Gatsby](https://www.gatsbyjs.org) with [Remark](https://github.com/remarkjs/remark) and [MDX](https://mdxjs.com/). This allows authoring content in **straightforward Markdown** without the usual limitations. Standard elements can be overwritten with powerful [React](http://reactjs.org/) components and wherever Markdown syntax isn't enough, JSX components can be used. Hopefully, this update will also make it much easier to contribute to the docs. Once this PR is merged, I'll implement auto-deployment via [Netlify](https://netlify.com) on a specific branch (to avoid building the website on every PR). There's a bunch of other cool stuff that the new setup will allow us to do – including writing front-end tests, service workers, offline support, implementing a search and so on.

This PR also includes various new docs pages and content.
Resolves #3270. Resolves #3222. Resolves #2947. Resolves #2837.


### Types of change
enhancement

## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2019-02-17 19:31:19 +01:00
ines
3f4fd2c5d5 Update usage documentation 2017-10-03 14:26:20 +02:00
Reza Gharibi
0461b82158 Fix typos 2017-09-27 03:56:20 +03:30
Reza Gharibi
fa1844b132 Fix typo 2017-09-27 03:55:54 +03:30
Reza Gharibi
b5dd7e7cc4 Fix typo 2017-09-27 03:55:28 +03:30
Ines Montani
b8e81daccf Fix typo (closes #1312) 2017-09-14 12:49:59 +02:00
ines
d15775c3ad Fix typos and commands in alpha docs 2017-08-21 13:40:11 +02:00
ines
3c33003078 Port over typo corrections from #1245 2017-08-20 12:00:17 +02:00
ines
a29f132ffd Change python -m spacy to spacy
Reflects latest change to entry point or auto-alias
2017-08-14 13:04:48 +02:00
Nikolai Kruglikov
08e443e083 Fix small typo in documentation 2017-08-14 12:19:04 +02:00
ines
ab8ffbaab7 Add text classification to v2 overview 2017-07-22 17:56:51 +02:00
ines
0fb89dd204 Add text classification usage guide template 2017-07-22 17:56:07 +02:00
ines
d05ab1b3a0 Add text classification to 101 overview and change order 2017-07-22 17:55:53 +02:00
Jarle Mathiesen
f20533ec0c fix small typo 2017-06-24 12:31:33 +02:00
Savva Kolbachev
800a8faff4 Changed the capital of Lithuania to Vilnius
Hi,
There is a typo about the capital of Lithuania.

Vilnius is the capital of Lithuania https://en.wikipedia.org/wiki/Vilnius
Ljubljana is the capital of Slovenia https://en.wikipedia.org/wiki/Ljubljana
2017-06-12 23:27:00 +03:00
Ines Montani
57f64b9e1c Merge pull request #1124 from v3t3a/patch-3
docs - Fix url error for Displacy Ent visualizer
2017-06-12 21:20:32 +02:00
Ines Montani
b2a28028cf Merge pull request #1115 from v3t3a/patch-2
docs - Add read() method when opening file (Lightning tour)
2017-06-12 21:19:25 +02:00
Vetea
eae1f7b19c Fix url error for Displacy Ent visualizer 2017-06-12 14:30:02 +02:00
ines
49026a1346 Fix typos in example (see #1105) 2017-06-08 19:15:50 +02:00
Vetea
cc3aee1189 Add read() method when opening file
Add read() method for 

to avoid :
```TypeError: Argument 'string' has incorrect type (expected str, got _io.TextIOWrapper)```

Test with:
spaCy : v2.0.0 Alpha
python : 3.5.2+ (default, Sep 22 2016, 12:18:14)
2017-06-08 11:27:09 +02:00
ines
6b799bac54 Fix formatting and details 2017-06-06 14:37:49 +02:00
ines
fd9ae0f0e0 Update v2 comparison table 2017-06-05 16:39:11 +02:00
ines
a3f9745a14 Update similarity usage guide and examples 2017-06-05 15:37:33 +02:00
ines
fd35d910b8 Update v2 docs and benchmarks 2017-06-05 14:13:38 +02:00
ines
040553ca59 Update architecture and features table 2017-06-05 13:33:01 +02:00
ines
505d43b832 Update norms example 2017-06-04 23:33:26 +02:00
ines
f8e93b6d0a Update norms example 2017-06-04 23:24:29 +02:00
ines
a857b2b511 Update norms example 2017-06-04 23:21:37 +02:00
ines
47d066b293 Add under construction 2017-06-04 23:17:54 +02:00
ines
e9816daa6a Add details on syntax iterators 2017-06-04 23:16:33 +02:00
ines
990cb81556 Add info on syntax iterators 2017-06-04 21:47:22 +02:00
ines
e4eb33daf7 Add links to production use guide 2017-06-04 20:56:58 +02:00
ines
63cd539d04 Add more details on model packages and requirements.txt (see #1099) 2017-06-04 20:52:10 +02:00
ines
97ff83d163 Fix docs on model loading 2017-06-04 20:44:59 +02:00
ines
b6002db797 Add v2 label 2017-06-04 18:53:03 +02:00
ines
468ff1a7dd Update v2 docs and add benchmarks stub 2017-06-04 15:34:28 +02:00
Matthew Honnibal
23fd6b1782 Add intro narrative for v2 2017-06-04 15:10:37 +02:00
ines
3419ecbfdd Update docs on model shortcut links 2017-06-04 13:55:00 +02:00
ines
586e901143 Add v2 intro stub 2017-06-04 13:42:37 +02:00
ines
4f8f62d9b3 Merge branch 'v2-docs-edits' into develop 2017-06-04 13:40:58 +02:00
ines
809903dcad Fix link and update wording 2017-06-04 13:29:20 +02:00
ines
22dd18c364 Remove redundant CPU commands 2017-06-04 13:29:13 +02:00
ines
1d6377218a Update architecture blurb and move other info 2017-06-04 13:28:58 +02:00
ines
7a66c9f039 Fix formatting 2017-06-04 13:14:00 +02:00
Matthew Honnibal
f2c4a9f690 Edits to spacy-101 page 2017-06-04 13:10:27 +02:00
Matthew Honnibal
aca53b95e1 Link architecture blurb 2017-06-04 13:10:06 +02:00
Matthew Honnibal
64ca5123bb Add Architecture 101 blurb 2017-06-04 13:09:19 +02:00
Matthew Honnibal
e77ed953f4 Update GPU instructions 2017-06-04 12:03:22 +02:00
ines
1d3b012e56 Update adding languages docs and add 101 2017-06-03 23:54:23 +02:00
ines
a3715a81d5 Update adding languages guide 2017-06-03 22:16:38 +02:00
ines
ec6d2bc81d Add table of contents mixin 2017-06-03 22:16:26 +02:00
ines
9acf8686f7 Update note on compact mode issues 2017-06-03 13:31:16 +02:00
ines
c60431357d Port over docs typo corrections 2017-06-03 11:31:30 +02:00
ines
c6dc2fafc0 Add Spanish and move example sentences to meta 2017-06-01 17:49:56 +02:00
ines
b577ed79ee Move social image logic out to function and move files 2017-06-01 14:27:44 +02:00
ines
5e60b09dcd Fix custom tokenizer example 2017-06-01 13:02:50 +02:00
ines
8274dffad6 Update NER training draft 2017-06-01 12:51:36 +02:00
ines
04fac3f52a Add NER training example code 2017-06-01 12:47:47 +02:00
ines
7f5e7e7320 Fix typo 2017-06-01 12:47:36 +02:00
ines
4a927154d8 Update v2 docs 2017-06-01 11:56:32 +02:00
ines
03bbb96db8 Remove outdated examples 2017-06-01 11:56:02 +02:00
ines
789e69b73f Update training guide 2017-06-01 11:53:23 +02:00
ines
2f40d6e7e7 Add training 101 2017-06-01 11:53:16 +02:00
ines
abed463bbb Update serialization 101 2017-06-01 11:52:58 +02:00
ines
72380c952a Update training section in NER guide and add links 2017-06-01 11:52:49 +02:00
ines
22b1f72870 Add spaCy 101 intro 2017-05-31 12:44:09 +02:00
ines
a18b95ca12 Update docs on testing 2017-05-31 12:43:40 +02:00
ines
981196c181 Fix typo 2017-05-31 11:34:31 +02:00
ines
f86289566a Update new in v2 section and add note on Matcher acceptors 2017-05-30 13:53:06 +02:00
ines
ce4e45d0bb Update 101 intro 2017-05-29 22:15:06 +02:00
ines
687ed28340 Update processing pipelines guide 2017-05-29 14:21:00 +02:00
ines
d5992f408f Update note on vocab consistency 2017-05-29 14:14:26 +02:00
ines
a2134951f2 Update 101 and add note on pipeline order and tensors 2017-05-29 11:45:32 +02:00
ines
17b635eaab Update alpha docs note and fix typo 2017-05-29 11:09:24 +02:00
ines
fbe105f1eb Add note on L in long integers in Python 2 2017-05-29 11:05:05 +02:00
ines
9d74810f6f Update examples 2017-05-29 01:09:52 +02:00
ines
42cf414138 Update Matcher example 2017-05-29 01:09:52 +02:00
ines
00b2094dc3 Fix typos, long integers and tests 2017-05-29 01:09:52 +02:00
ines
d71c6db76e Add missing Chainer install for GPU if building spaCy from source 2017-05-28 23:34:59 +02:00
ines
e0f9ccdaa3 Update texts and rename vectorizer to tensorizer 2017-05-28 23:26:13 +02:00
ines
606879b217 Update hash strings examples 2017-05-28 19:42:44 +02:00
ines
c7b57ea314 Update docs and change integer IDs to hash values 2017-05-28 19:25:34 +02:00
ines
738b4f7187 Add quickstart options and docs for GPU 2017-05-28 19:20:11 +02:00
ines
4c00cb8c8b Update 101 and add community/FAQ and table of contents 2017-05-28 18:45:49 +02:00
ines
8a148b6563 Fix code, links and formatting 2017-05-28 18:29:16 +02:00
ines
414193e9ba Update docs to reflect StringStore changes 2017-05-28 18:19:11 +02:00
ines
69bda9aed7 Update text, examples, typos, wording and formatting 2017-05-28 16:41:01 +02:00
ines
f8185b8e11 Rename vocab-stringsotre to vocab 2017-05-28 16:37:14 +02:00
ines
10d05c2b92 Fix typos, wording and formatting 2017-05-28 01:30:12 +02:00
ines
db116cbeda Update tokenization 101 and add illustration 2017-05-28 00:22:40 +02:00
ines
b03fb2d7b0 Update 101 and usage docs 2017-05-28 00:22:40 +02:00
ines
ae11c8d60f Add emoji sentiment to lightning tour matcher example 2017-05-27 20:02:20 +02:00
ines
22bf5f63bf Update Matcher docs and add social media analysis example 2017-05-27 17:58:18 +02:00
ines
0d33ead507 Fix initialisation of Doc in lightning tour example 2017-05-27 17:58:06 +02:00
ines
e05bcd6aa8 Update docs to reflect flattened model meta.json
Don't use "setup" key and instead, keep "lang" on root level and add
"pipeline".
2017-05-27 17:57:46 +02:00
ines
1b982f0838 Update train command and add docs on hyperparameters 2017-05-26 14:02:38 +02:00