Samuel Kane
06a1846379
fix(util): fix decaying function output (#3495)
* fix(util): fix decaying function output
* fix(util): better test and adhere to code standards
* fix(util): correct variable name, pytestify test, update website text
2019-03-28 13:24:47 +01:00
Bharat Raghunathan
1db3e47509
DOC: Update tokenizer docs to include default value for batch_size in pipe (#3492)
2019-03-28 12:48:02 +01:00
Ines Montani
9e14b2b69f
Add Estonian to docs [ci skip] (closes #3482)
2019-03-25 18:01:54 +01:00
Ines Montani
21ade53ef7
Merge branch 'master' into spacy.io
2019-03-25 13:05:00 +01:00
Ines Montani
db938ab0e3
Update favicon (closes #3475) [ci skip]
2019-03-25 13:04:47 +01:00
Ines Montani
c8c1baaea8
Update binderVersion
2019-03-25 12:17:03 +01:00
Ines Montani
200d8bdb3c
Merge branch 'spacy.io' [ci skip]
2019-03-23 16:46:34 +01:00
Ines Montani
1e5b917d75
Fix formatting [ci skip]
2019-03-23 16:45:50 +01:00
Matthew Honnibal
6c783f8045
Bug fixes and options for TextCategorizer (#3472)
* Fix code for bag-of-words feature extraction
The _ml.py module had two redundant copies of a function to extract unigram
bag-of-words features, one of which had a bug that set values to 0.
Another function allowed extraction of bigram features. Replace all three
with a new function that supports arbitrary ngram sizes and also allows
control of which attribute is used (e.g. ORTH, LOWER, etc.).
* Support 'bow' architecture for TextCategorizer
This allows efficient ngram bag-of-words models, which are better when
the classifier needs to run quickly, especially when the texts are long.
Pass architecture="bow" to use it. The extra arguments ngram_size and
attr are also available, e.g. ngram_size=2 means unigram and bigram
features will be extracted.
* Fix size limits in train_textcat example
* Explain architectures better in docs
2019-03-23 16:44:44 +01:00
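The ngram bag-of-words extraction described in the TextCategorizer commit above (#3472) can be illustrated with a minimal sketch. This is not the code from `_ml.py`; the function name `extract_ngrams` and its signature are hypothetical, and plain strings stand in for spaCy tokens. It only shows the idea that `ngram_size=2` yields both unigram and bigram features, and that an attribute getter (comparable to choosing ORTH or LOWER) controls which form of each token is counted.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple


def extract_ngrams(
    tokens: Sequence[str],
    ngram_size: int = 1,
    attr: Callable[[str], str] = str,
) -> Counter:
    """Count every n-gram of length 1..ngram_size (hypothetical helper).

    `attr` maps each token to the value that should be counted, e.g.
    `str` for the verbatim text or `str.lower` for a lowercased form.
    """
    keys = [attr(token) for token in tokens]
    counts: Counter = Counter()
    for n in range(1, ngram_size + 1):
        # Slide a window of width n over the token keys.
        for start in range(len(keys) - n + 1):
            ngram: Tuple[str, ...] = tuple(keys[start:start + n])
            counts[ngram] += 1
    return counts


# Illustrative usage: unigrams and bigrams over lowercased tokens.
features = extract_ngrams(["The", "cat", "the"], ngram_size=2, attr=str.lower)
```

Counting features this way needs only one pass per ngram size, which fits the commit's point that bag-of-words models suit long texts and classifiers that need to run quickly.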
Ines Montani
5944cf10c7
Add blog post to v2.1 page
2019-03-23 16:34:23 +01:00
Ines Montani
ffebdad08d
Add cheat sheet to spaCy 101
2019-03-23 16:32:55 +01:00
Ines Montani
06bf130890
💫 Add better and serializable sentencizer (#3471)
* Add better serializable sentencizer component
* Replace default factory
* Add tests
* Tidy up
* Pass test
* Update docs
2019-03-23 15:45:02 +01:00
Ines Montani
dcd6e06c47
Improve landing example [ci skip]
2019-03-22 19:02:15 +01:00
Ines Montani
a841324034
Update landing example [ci skip]
2019-03-22 18:50:00 +01:00
Ines Montani
b532386a60
Fix typo [ci skip]
2019-03-22 18:36:17 +01:00
Ines Montani
d8533f0149
Update Binder [ci skip]
2019-03-22 18:16:46 +01:00
Christos Aridas
9cee3f702a
Add missing space in landing page (#3462) [ci skip]
2019-03-22 15:17:35 +01:00
Ines Montani
5073ce63fd
Merge branch 'spacy.io' [ci skip]
2019-03-22 15:17:11 +01:00
Ines Montani
0712efc6b3
Update version requirements [ci skip]
2019-03-21 10:23:54 +01:00
Ines Montani
764359c952
Merge branch 'master' into spacy.io
2019-03-20 17:24:28 +01:00
Ines Montani
dac8f8ff99
Update Span.__init__ docs (see #3445) [ci skip]
2019-03-20 17:24:17 +01:00
Ines Montani
f7b5ff7907
Move netlify.toml to root
2019-03-19 14:40:14 +01:00
Ines Montani
c6ee030721
Fix docsearch
2019-03-19 14:38:49 +01:00
Ines Montani
0155083e01
Update netlify.toml
2019-03-19 14:07:00 +01:00
Ines Montani
d4eed4a84f
Add note on unicode build to troubleshooting guide (see #3421) [ci skip]
2019-03-19 10:27:02 +01:00
Ines Montani
42d4b818e4
Redirect Netlify URL
2019-03-19 10:17:56 +01:00
Ines Montani
1ee97bc282
Add page title fallback, just in case
2019-03-18 18:58:55 +01:00
Ines Montani
728ae7651b
Fix universe page titles if no separate title is set
2019-03-18 18:58:46 +01:00
Ines Montani
a20d3772fd
Fix responsive landing
2019-03-18 16:24:52 +01:00
Ines Montani
08284f3a11
💫 v2.1.0 launch updates (only merge on launch!) (#3414)
* Update README.md
* Use production docsearch [ci skip]
* Add option to exclude pages from search
2019-03-18 16:07:26 +01:00
Ines Montani
a611b32fbf
Update model docs [ci skip]
2019-03-17 11:48:18 +01:00
Matthew Honnibal
62afa64a8d
Expose batch size and length caps on CLI for pretrain (#3417)
Add and document CLI options for batch size, max doc length, min doc length for `spacy pretrain`.
Also improve CLI output.
Closes #3216
2019-03-16 21:38:45 +01:00
Ines Montani
2c5dd4d602
Update Vectors.find docs [ci skip]
2019-03-16 17:10:57 +01:00
Ines Montani
fa0f501165
Use dev DocSearch index
2019-03-15 14:48:38 +01:00
Ines Montani
8af7d01382
Fix general-purpose IDs
2019-03-15 14:48:26 +01:00
Ines Montani
cbcba699dd
Fix missing ids
2019-03-14 17:56:53 +01:00
Ines Montani
cffe63ea24
Fix :target padding for ids
2019-03-14 17:41:02 +01:00
Ines Montani
51b7b88acf
Generate active sidebar heading (h0) at compile time
2019-03-14 17:20:51 +01:00
Ines Montani
4ab1871a75
Add search-exclude classes
2019-03-14 16:51:29 +01:00
Ines Montani
59bbf85986
Add id to body
2019-03-14 16:51:18 +01:00
Ines Montani
6e07750dd8
Fix class name
2019-03-14 11:52:31 +01:00
Ines Montani
a0813b93e0
Server-side render is-active for crawler
2019-03-14 11:46:27 +01:00
Ines Montani
39ace04b55
Fix active style
2019-03-14 11:46:13 +01:00
Ines Montani
4cfe4aa224
Fix small issues in the docs [ci skip]
2019-03-12 22:57:15 +01:00
Ines Montani
ba7eb2d131
Update section [ci skip]
2019-03-12 16:18:34 +01:00
Ines Montani
cecc31b765
Don't auto-slugify accordion links [ci skip]
2019-03-12 15:30:49 +01:00
Ines Montani
d842d5698e
Tidy up website and add eslint config [ci skip]
2019-03-12 15:21:58 +01:00
Ines Montani
72fb324d95
Add vector training script to bin [ci skip]
2019-03-12 12:07:56 +01:00
Ines Montani
3abf0e6b9f
Replace dev-resources links with real examples
2019-03-12 12:07:40 +01:00
Ines Montani
59c0620487
Auto-format
2019-03-12 12:07:11 +01:00