5945 Commits

Author SHA1 Message Date
Ines Montani
c784b49d33 Merge pull request #772 from raphael0202/french-support
Add French tokenization support
2017-01-24 14:27:16 +01:00
Raphaël Bournhonesque
1be9c0e724 Add fr tokenization unit tests 2017-01-24 10:57:37 +01:00
Raphaël Bournhonesque
1faaf698ca Add infixes and abbreviation exceptions (fr) 2017-01-24 10:57:37 +01:00
Raphaël Bournhonesque
cf8474401b Remove unused import statement 2017-01-24 10:57:37 +01:00
Raphaël Bournhonesque
902f136f18 Add support for elision in French 2017-01-24 10:57:37 +01:00
Ines Montani
199ae10690 Update CONTRIBUTORS.md 2017-01-23 21:36:53 +01:00
Ines Montani
55c9c62abc Use relative import 2017-01-23 21:27:49 +01:00
Ines Montani
0967eb07be Add regression test for #768 2017-01-23 21:25:46 +01:00
Ines Montani
6baa98f774 Merge pull request #769 from raphael0202/spacy-768
Allow zero-width 'infix' token
2017-01-23 21:24:33 +01:00
Raphaël Bournhonesque
dce8f5515e Allow zero-width 'infix' token 2017-01-23 18:28:01 +01:00
Ines Montani
5f6f48e734 Add regression test for #759 2017-01-20 15:11:48 +01:00
Ines Montani
09ecc39b4e Fix multi-line string of NUM_WORDS (resolves #759) 2017-01-20 15:11:48 +01:00
Magnus Burton
69eab727d7 Added loops to handle contractions with verbs 2017-01-19 14:08:52 +01:00
Matthew Honnibal
be26085277 Fix missing import
Closes #755
2017-01-19 22:03:52 +11:00
Ines Montani
94ddfb2304 Merge pull request #750 from oiwah/span-doc-typofix-patch
Documentation Typo Fix: start_char description in the span API
2017-01-18 09:46:19 +01:00
Hidekazu Oiwa
7806ebafd2 Fix the span doc typo
Fix the typo in the span API doc.
It explains the `end` of the span as the `start_char` description.
2017-01-17 20:37:14 -08:00
Matthew Honnibal
300650a6f8 Merge pull request #749 from sudowork/custom-tokenizer-docs
Fix Custom Tokenizer docs
2017-01-18 11:39:43 +11:00
Kevin Gao
7ec710af0e Fix Custom Tokenizer docs
- Fix mismatched quotations
- Make it more clear where ORTH, LEMMA, and POS symbols come from
- Make strings consistent
- Fix lemma_ assertion s/-PRON-/me/
2017-01-17 10:38:14 -08:00
Ines Montani
dbe8dafb52 Fix logo width and height to avoid link overlap in Safari (resolves #748) 2017-01-17 17:56:34 +01:00
Ines Montani
ee45619307 Fix formatting 2017-01-17 17:55:59 +01:00
Ines Montani
7e36568d5b Fix title to accommodate sputnik 2017-01-17 00:51:09 +01:00
Ines Montani
d704cfa60d Fix typo 2017-01-16 21:30:33 +01:00
Ines Montani
fb482ff049 Fix typo 2017-01-16 21:30:23 +01:00
Ines Montani
b50c499c04 Fix consistency 2017-01-16 20:44:31 +01:00
Ines Montani
8a615e8961 Simplify and update pull request template 2017-01-16 20:43:52 +01:00
Ines Montani
5909804a61 Merge pull request #747 from JasonKessler/patch-1
Clarify Rule-Based Workflow Docs
2017-01-16 20:39:27 +01:00
Jason Kessler
9fa6f9fb40 Origin of spacy.matcher attributes
Make it clear that Matcher attributes live in spacy.matcher.attrs.
2017-01-16 13:31:35 -06:00
Ines Montani
842155e3ae Merge pull request #746 from jktong/patch-1
Correct typo "chldren" in doc.jade
2017-01-16 17:58:37 +01:00
jktong
df0aeff379 Correct typo "chldren" in doc.jade 2017-01-16 09:34:59 -05:00
Ines Montani
64e142f460 Update about.py 2017-01-16 14:23:08 +01:00
Matthew Honnibal
63adcb8141 Merge branch 'master' of ssh://github.com/explosion/spaCy 2017-01-16 14:02:12 +01:00
Matthew Honnibal
e889cd698e Increment version 2017-01-16 14:01:35 +01:00
Ines Montani
5e3793f711 Update README.rst 2017-01-16 14:00:56 +01:00
Matthew Honnibal
e7f8e13cf3 Make Token hashable. Fixes #743 2017-01-16 13:27:57 +01:00
Matthew Honnibal
2c60d0cb1e Test #743: Tokens unhashable. 2017-01-16 13:27:26 +01:00
Matthew Honnibal
48c712f1c1 Merge branch 'master' of ssh://github.com/explosion/spaCy 2017-01-16 13:18:06 +01:00
Matthew Honnibal
7ccf490c73 Increment version 2017-01-16 13:17:58 +01:00
Matthew Honnibal
d4e6d4c1c4 Use new thinc 2017-01-16 13:17:14 +01:00
Ines Montani
50878ef598 Exclude "were" and "Were" from tokenizer exceptions and add regression test (resolves #744) 2017-01-16 13:10:38 +01:00
Ines Montani
e053c7693b Fix formatting 2017-01-16 13:09:52 +01:00
Ines Montani
116c675c3c Merge pull request #742 from oroszgy/hu_tokenizer_fix
Improved Hungarian tokenizer
2017-01-14 23:52:44 +01:00
Gyorgy Orosz
92345b6a41 Further numeric test. 2017-01-14 22:44:19 +01:00
Gyorgy Orosz
b4df202bfa Better error handling 2017-01-14 22:24:58 +01:00
Ines Montani
853130bcf8 Update installation instructions (see #727) 2017-01-14 22:12:42 +01:00
Gyorgy Orosz
b03a46792c Better error handling 2017-01-14 22:09:29 +01:00
Gyorgy Orosz
a45f22913f Added further abbreviations present in the Szeged corpus 2017-01-14 22:08:55 +01:00
Ines Montani
a3e3df3e33 Clean up fabfile 2017-01-14 21:30:38 +01:00
Ines Montani
332ce2d758 Update README.md 2017-01-14 21:12:11 +01:00
Ines Montani
c77698af25 Update CONTRIBUTING.md 2017-01-14 21:02:35 +01:00
Ines Montani
8dcb1c183d Update CONTRIBUTING.md 2017-01-14 21:01:46 +01:00