206 Commits

Author SHA1 Message Date
Jeffrey Gerard
b6ebedd09c Document Tokenizer(token_match) and clarify tokenizer_pseudo_code
Closes #835

In the `tokenizer_pseudo_code` I put the `special_cases` kwarg
before `find_prefix` because this now matches the order the args
are used in the pseudocode, and it also matches spacy's actual code.
2017-09-25 13:13:25 -07:00
Matthew Honnibal
9177313063 Merge pull request #1352 from hscspring/patch-5
Update customizing-tokenizer.jade
2017-09-22 16:11:49 +02:00
Yam
54855f0eee Update customizing-tokenizer.jade 2017-09-22 12:15:48 +08:00
Yam
6f450306c3 Update customizing-tokenizer.jade
Update some code:
- `me` -> `-PRON`
- `TAG` -> `POS`
- `create_tokenizer` function
2017-09-22 10:53:22 +08:00
Yam
425c09488d Update word-vectors-similarities.jade
add
```
import spacy
nlp = spacy.load('en')
```
2017-09-22 08:56:34 +08:00
Kevin Marsh
e3738aba0d Fix broken tutorial link on website 2017-08-15 21:50:09 +01:00
Delirious Lettuce
d3b03f0544 Fix typos:
* `auxillary` -> `auxiliary`
* `consistute` -> `constitute`
* `earlist` -> `earliest`
* `prefered` -> `preferred`
* `direcory` -> `directory`
* `reuseable` -> `reusable`
* `idiosyncracies` -> `idiosyncrasies`
* `enviroment` -> `environment`
* `unecessary` -> `unnecessary`
* `yesteday` -> `yesterday`
* `resouces` -> `resources`
2017-08-06 21:31:39 -06:00
Gideon Dresdner
7e98a3613c improve pipe, tee, izip explanation
Use an example from an old issue https://github.com/explosion/spaCy/issues/172#issuecomment-183963403.
2017-08-06 13:21:45 +02:00
ines
7c4bf9994d Add note on requirements and preventing model re-downloads (closes #1143) 2017-07-22 15:40:12 +02:00
ines
d7560047c5 Fix version 2017-07-22 15:24:33 +02:00
ines
b22b18a019 Add notes on spacy.explain() to annotation docs 2017-07-22 15:02:15 +02:00
ines
e3f23f9d91 Use latest available version in examples 2017-07-22 14:57:51 +02:00
Jorge Paredes
fadacd0d47 Fix broken URL
The link to **custom named entities** was broken.
2017-07-16 10:06:32 -05:00
lgenerknol
2b219caf0d .../cli/#foo is 404
https://spacy.io/docs/usage/cli/#package is a 404.
Changed to https://spacy.io/docs/usage/cli#package

A larger fix to deal with trailing slashes is definitely possible.
2017-07-12 13:12:24 -04:00
lgenerknol
6cf2690943 Missing markup char
Frontend displayed: 
```
 If start_idx and do not mark[...]
```
Note the missing "end_idx" after 'and'.
2017-07-12 11:06:16 -04:00
val314159
19d4706f69 make this work in python2.7 2017-07-07 13:18:17 -07:00
Callum Kift
dfaeee1f37 fixed bug in training ner documentation and example 2017-06-30 09:56:33 +02:00
Nathan Glenn
81166c3d56 fix confusing typo
This document describes the `Vocab` class, not the `Span` class.
2017-06-21 19:22:30 +02:00
Bart Broere
e4a45ae55f Very minor documentation fix 2017-06-12 12:28:51 +02:00
ines
6ef04afdc8 Update docs with Spanish model 2017-06-06 12:49:25 +02:00
Pascal van Kooten
e66cd9cc70 for easy copy & paste 2017-06-05 20:41:28 +02:00
ines
1e918b871c Remove infoboxes 2017-06-01 17:53:47 +02:00
ines
ab83dd5d25 Fix lightning tour example 2017-06-01 17:53:41 +02:00
Ines Montani
88ca82bfa6 Merge pull request #1081 from yuvalpinter/patch-2
Fixed link
2017-05-23 16:58:45 +02:00
Yuval Pinter
cb418c7aef Fixed span example error
Span as written gives empty text.
2017-05-23 10:54:13 -04:00
Yuval Pinter
68b387ffc3 Fixed link
link to Doc API documentation fixed
2017-05-23 10:46:17 -04:00
Matthew Honnibal
c282167310 Merge pull request #1076 from raphael0202/patch-1
Deleting (legacy?) whitespace attribute in doc
2017-05-23 11:03:29 +02:00
Matthew Honnibal
7669b9f923 Merge pull request #1077 from raphael0202/patch-2
Add Token.orth and Token.orth_ description in doc
2017-05-23 11:00:27 +02:00
Yuval Pinter
af3d121ec9 extend suffixes from first to last
reverse suffix list in `tokenizer_pseudo_code()` so the order of returned tokens matches input order
2017-05-22 10:56:03 -04:00
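The suffix reversal described in the commit above can be illustrated with a simplified sketch. The function `tokenizer_sketch`, the two helper functions, and their punctuation sets are hypothetical stand-ins written for this example and are not spaCy's actual implementation. Suffixes are peeled off from the outside in, so they are collected in reverse order and have to be reversed before being appended.

```python
def find_prefix(substring):
    """Hypothetical helper: length of a leading punctuation prefix, or None."""
    if len(substring) > 1 and substring[0] in '("\'':
        return 1
    return None


def find_suffix(substring):
    """Hypothetical helper: index where a trailing punctuation suffix starts, or None."""
    if len(substring) > 1 and substring[-1] in ')"\'!?.,':
        return len(substring) - 1
    return None


def tokenizer_sketch(text, special_cases, find_prefix, find_suffix):
    tokens = []
    for substring in text.split(' '):
        suffixes = []
        while substring:
            if substring in special_cases:
                tokens.extend(special_cases[substring])
                substring = ''
            elif find_prefix(substring) is not None:
                split = find_prefix(substring)
                tokens.append(substring[:split])
                substring = substring[split:]
            elif find_suffix(substring) is not None:
                split = find_suffix(substring)
                # The outermost suffix is found first, so this list ends up
                # in reverse reading order.
                suffixes.append(substring[split:])
                substring = substring[:split]
            else:
                tokens.append(substring)
                substring = ''
        # Reverse so that the tokens come out in the same order as the input.
        tokens.extend(reversed(suffixes))
    return tokens


result = tokenizer_sketch('(hi!) there', {}, find_prefix, find_suffix)
print(result)  # ['(', 'hi', '!', ')', 'there']
```

Without the `reversed()` call, the same input would yield `')'` before `'!'`, which no longer matches the order of the characters in the text.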
Raphaël Bournhonesque
a330287304 Add Token.orth and Token.orth_ description in doc 2017-05-19 21:17:31 +02:00
Raphaël Bournhonesque
7e4f31c362 Deleting (legacy?) whitespace attribute
token.whitespace raises an AttributeError
2017-05-19 21:12:41 +02:00
Niko Rebenich
d40b083934 Print list comprehension
Turn the generator expression into a list comprehension before printing
2017-05-18 14:50:43 -07:00
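The motivation for the change above can be shown with a short, generic sketch. The variable names are made up for illustration and are not taken from the documentation that the commit edited. Printing a generator expression shows only the generator object, while a list comprehension is evaluated immediately and prints its items.

```python
words = ["apple", "banana", "cherry"]

# A generator expression is lazy: printing it shows the object, not the values.
lengths_gen = (len(w) for w in words)
gen_repr = repr(lengths_gen)
print(gen_repr.startswith("<generator object"))  # True

# A list comprehension is evaluated eagerly, so the values are printed.
lengths_list = [len(w) for w in words]
print(lengths_list)  # [5, 6, 6]
```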
ines
35795c88c4 Add quickstart.js widget 2017-05-17 18:22:04 +02:00
ines
e506811a93 Update description 2017-05-13 03:27:50 +02:00
Phaninder Pasupula
953f638aa5 Update _data.json 2017-05-08 11:48:05 +01:00
ines
fac3566aac Add descriptions to POS tagging scheme 2017-05-03 20:11:02 +02:00
ines
1570b83ee5 Add spacy.explain() note to NER annotation scheme 2017-05-03 20:11:02 +02:00
ines
219369bb7d Add detailed docs for dependency label annotations 2017-05-03 20:11:02 +02:00
ines
f9384b0fbd Update alpha languages and add aside for tokenizer dependencies 2017-05-03 09:58:31 +02:00
Yasuaki Uechi
0e7a9b9fac Add Japanese to 'Alpha support' section 2017-05-03 13:56:45 +09:00
Ines Montani
fb96f88b59 Update info on CoNLL format and include link 2017-04-27 14:36:08 +02:00
M. Z. Ferdous (Imran)
c9f9203d5f fix typo, CONLL format
Tried to google the connlu format and found that the format is called conll, not connlu.
2017-04-27 16:48:54 +06:00
ines
5aa49971f9 Add French example to models docs 2017-04-27 12:08:47 +02:00
ines
034ec5710b Fix typo and add Norwegian to alpha languages 2017-04-27 11:24:21 +02:00
ines
100846bed3 Fix typo in model list 2017-04-26 21:40:17 +02:00
ines
375edf0bb5 Add list of models and include French 2017-04-26 20:50:27 +02:00
ines
4eacd72bc3 Move list of models to own file 2017-04-26 20:50:27 +02:00
ines
c2006166d3 Update list of available models and info 2017-04-26 16:03:41 +02:00
ines
e6bdf5bc5c Update adding language / training docs (see #966)
Add data examples and more info on training and CLI commands
2017-04-26 14:01:19 +02:00
ines
ae2b77db1b Fix info on naming conventions 2017-04-26 14:01:19 +02:00