Ines Montani
012f4820cb
Keep infixes of punctuation + hyphens as one token (see #801)
2017-02-02 16:22:40 +01:00
Ines Montani
1219a5f513
Add = to tokenizer prefixes
2017-02-02 16:21:11 +01:00
Ines Montani
ff04748eb6
Add missing emoticon
2017-02-02 16:21:00 +01:00
Ines Montani
13a4ab37e0
Add regression test for #801
2017-02-02 15:33:52 +01:00
Raphaël Bournhonesque
85f951ca99
Add tokenizer exceptions for French
2017-02-02 08:36:16 +01:00
Matthew Honnibal
16ce7409e4
Merge branch 'master' of https://github.com/explosion/spaCy
2017-01-31 13:27:34 -06:00
Matthew Honnibal
80aa4e114b
Fix Keras deep learning example
2017-01-31 13:27:13 -06:00
Ines Montani
ad0e4e4532
Merge pull request #794 from ematvey/count_by_doc_update
Small `Doc.count_by` documentation update
2017-01-31 20:11:47 +01:00
Matvey Ezhov
32a22291bc
Small `Doc.count_by` documentation update
Current example doesn't work
2017-01-31 19:18:45 +03:00
Ines Montani
e4875834fe
Fix formatting
2017-01-31 15:19:33 +01:00
Ines Montani
c304834e45
Add missing import
2017-01-31 15:18:30 +01:00
Ines Montani
626ac282fe
Merge pull request #793 from latkins/master
Added regression test for Issue #792.
2017-01-31 15:16:23 +01:00
Ines Montani
e6465b9ca3
Parametrize test cases and mark as xfail
2017-01-31 15:14:42 +01:00
latkins
e4c84321a5
Added regression test for Issue #792.
2017-01-31 13:47:42 +00:00
Matthew Honnibal
6c665b81df
Fix redundant == TAG in from_array conditional
2017-01-31 00:46:21 +11:00
Matthew Honnibal
3ea0df6ba7
Merge pull request #782 from raphael0202/dep_version
Specify version number for ujson and plac
2017-01-29 05:32:45 +11:00
Raphaël Bournhonesque
0c2e5539ce
Specify version number for ujson and plac
The required version was specified for plac in requirements.txt but not in setup.py, which could cause a conflicting version error.
Similarly, set the version of ujson in requirements.txt to be the same as in setup.py.
2017-01-28 18:38:14 +01:00
Matthew Honnibal
afd622fe04
Merge branch 'master' of ssh://github.com/explosion/spaCy
2017-01-27 12:28:30 +01:00
Matthew Honnibal
ab70f6e18d
Update NER training example
2017-01-27 12:27:10 +01:00
Ines Montani
651bf411e0
Add tutorial
2017-01-26 13:48:38 +01:00
Ines Montani
da3aca4020
Fix formatting
2017-01-26 13:48:29 +01:00
Ines Montani
baa6be8180
Update latest news to last blog post
2017-01-26 13:47:45 +01:00
Ines Montani
bdafb514c5
Update version
2017-01-26 13:47:32 +01:00
Ines Montani
19501f3340
Add regression test for #775
2017-01-25 13:16:52 +01:00
Ines Montani
209c37bbcf
Exclude "shell" and "Shell" from English tokenizer exceptions (resolves #775)
2017-01-25 13:15:02 +01:00
Ines Montani
a3c92e1bf6
Update README.rst
2017-01-25 10:48:09 +01:00
Ines Montani
c784b49d33
Merge pull request #772 from raphael0202/french-support
Add French tokenization support
2017-01-24 14:27:16 +01:00
Raphaël Bournhonesque
1be9c0e724
Add fr tokenization unit tests
2017-01-24 10:57:37 +01:00
Raphaël Bournhonesque
1faaf698ca
Add infixes and abbreviation exceptions (fr)
2017-01-24 10:57:37 +01:00
Raphaël Bournhonesque
cf8474401b
Remove unused import statement
2017-01-24 10:57:37 +01:00
Raphaël Bournhonesque
902f136f18
Add support for elision in French
2017-01-24 10:57:37 +01:00
Ines Montani
199ae10690
Update CONTRIBUTORS.md
2017-01-23 21:36:53 +01:00
Ines Montani
55c9c62abc
Use relative import
2017-01-23 21:27:49 +01:00
Ines Montani
0967eb07be
Add regression test for #768
2017-01-23 21:25:46 +01:00
Ines Montani
6baa98f774
Merge pull request #769 from raphael0202/spacy-768
Allow zero-width 'infix' token
2017-01-23 21:24:33 +01:00
Raphaël Bournhonesque
dce8f5515e
Allow zero-width 'infix' token
2017-01-23 18:28:01 +01:00
Ines Montani
5f6f48e734
Add regression test for #759
2017-01-20 15:11:48 +01:00
Ines Montani
09ecc39b4e
Fix multi-line string of NUM_WORDS (resolves #759)
2017-01-20 15:11:48 +01:00
Magnus Burton
69eab727d7
Added loops to handle contractions with verbs
2017-01-19 14:08:52 +01:00
Matthew Honnibal
be26085277
Fix missing import
Closes #755
2017-01-19 22:03:52 +11:00
Ines Montani
94ddfb2304
Merge pull request #750 from oiwah/span-doc-typofix-patch
Documentation Typo Fix: start_char description in the span API
2017-01-18 09:46:19 +01:00
Hidekazu Oiwa
7806ebafd2
Fix the span doc typo
Fix the typo in the span API doc.
It explains the `end` of the span as the `start_char` description.
2017-01-17 20:37:14 -08:00
Matthew Honnibal
300650a6f8
Merge pull request #749 from sudowork/custom-tokenizer-docs
Fix Custom Tokenizer docs
2017-01-18 11:39:43 +11:00
Kevin Gao
7ec710af0e
Fix Custom Tokenizer docs
- Fix mismatched quotations
- Make it more clear where ORTH, LEMMA, and POS symbols come from
- Make strings consistent
- Fix lemma_ assertion s/-PRON-/me/
2017-01-17 10:38:14 -08:00
Ines Montani
dbe8dafb52
Fix logo width and height to avoid link overlap in Safari (resolves #748)
2017-01-17 17:56:34 +01:00
Ines Montani
ee45619307
Fix formatting
2017-01-17 17:55:59 +01:00
Ines Montani
7e36568d5b
Fix title to accommodate sputnik
2017-01-17 00:51:09 +01:00
Ines Montani
d704cfa60d
Fix typo
2017-01-16 21:30:33 +01:00
Ines Montani
fb482ff049
Fix typo
2017-01-16 21:30:23 +01:00
Ines Montani
b50c499c04
Fix consistency
2017-01-16 20:44:31 +01:00