Commit Graph

4062 Commits

Author SHA1 Message Date
Ines Montani
8a74129cdf Modernize and merge tokenizer tests for prefixes/suffixes/infixes 2017-01-05 13:13:12 +01:00
Ines Montani
0e65dca9a5 Modernize and merge tokenizer tests for exception and emoticons 2017-01-05 13:11:31 +01:00
Ines Montani
34c47bb20d Fix formatting 2017-01-05 13:10:51 +01:00
Ines Montani
2e72683baa Add missing docstrings 2017-01-05 13:10:21 +01:00
Ines Montani
da10a049a6 Add unicode declarations 2017-01-05 13:09:48 +01:00
Ines Montani
58adae8774 Remove unused file 2017-01-05 13:09:22 +01:00
Ines Montani
c6e5a5349d Move regression test for #360 into own file 2017-01-04 00:49:31 +01:00
Ines Montani
8279993a6f Modernize and merge tokenizer tests for punctuation 2017-01-04 00:49:20 +01:00
Ines Montani
550630df73 Update tokenizer tests for contractions 2017-01-04 00:48:42 +01:00
Ines Montani
109f202e8f Update conftest fixture 2017-01-04 00:48:21 +01:00
Ines Montani
ee6b49b293 Modernize tokenizer tests for emoticons 2017-01-04 00:47:59 +01:00
Ines Montani
f09b5a5dfd Modernize tokenizer tests for infixes 2017-01-04 00:47:42 +01:00
Ines Montani
59059fed27 Move regression test for #351 to own file 2017-01-04 00:47:11 +01:00
Ines Montani
667051375d Modernize tokenizer tests for whitespace 2017-01-04 00:46:35 +01:00
Ines Montani
aafc894285 Modernize tokenizer tests for contractions
Use @pytest.mark.parametrize.
2017-01-03 23:02:21 +01:00
Ines Montani
1d237664af Add lowercase lemma to tokenizer exceptions 2017-01-03 23:02:21 +01:00
Ines Montani
dd7cd44ba5 Update README.rst 2017-01-03 21:27:25 +01:00
Ines Montani
d677db6277 Change "Multi-language support" to amber for spaCy 2017-01-03 21:24:35 +01:00
Ines Montani
6f51609b5e Use yellow color for neutral pro/con icon 2017-01-03 21:24:14 +01:00
Ines Montani
84a87951eb Fix typos 2017-01-03 18:27:43 +01:00
Ines Montani
35b39f53c3 Reorganise English tokenizer exceptions (as discussed in #718)
Add logic to generate exceptions that follow a consistent pattern (like
verbs and pronouns) and allow certain tokens to be excluded explicitly.
2017-01-03 18:26:09 +01:00
Ines Montani
fb9d3bb022 Revert "Merge remote-tracking branch 'origin/master'"
This reverts commit d3b181cdf1, reversing
changes made to b19cfcc144.
2017-01-03 18:21:36 +01:00
Ines Montani
461cbb99d8 Revert "Reorganise English tokenizer exceptions (as discussed in #718)"
This reverts commit b19cfcc144.
2017-01-03 18:21:29 +01:00
Ines Montani
d3b181cdf1 Merge remote-tracking branch 'origin/master'
# Conflicts:
#	spacy/en/tokenizer_exceptions.py
2017-01-03 18:20:01 +01:00
Ines Montani
b19cfcc144 Reorganise English tokenizer exceptions (as discussed in #718)
Add logic to generate exceptions that follow a consistent pattern (like
verbs and pronouns) and allow certain tokens to be excluded explicitly.
2017-01-03 18:17:57 +01:00
Ines Montani
4fc4d3d0e3 Update PULL_REQUEST_TEMPLATE.md 2017-01-03 15:41:16 +01:00
Ines Montani
1bd53bbf89 Fix typos (resolves #718) 2017-01-03 11:26:21 +01:00
Matthew Honnibal
9b48bd161b Merge pull request #700 from oroszgy/tokenization_w_exception_patterns
Tokenization with exception patterns
2017-01-03 09:56:37 +11:00
Ines Montani
1b82756cc7 Tidy up and fix formatting and consistency 2017-01-02 00:29:24 +01:00
Ines Montani
614f95f3bf Remove help cursor from API links 2017-01-02 00:29:08 +01:00
Ines Montani
87c7496065 Use better chat window icons with more compact markup 2017-01-01 13:25:28 +01:00
Ines Montani
a1a4b253a1 Add Gitter chat widget component to docs 2017-01-01 12:46:01 +01:00
Ines Montani
78e54b375f Move scripts to own file 2017-01-01 12:45:37 +01:00
Ines Montani
134e115d9c Bump version 2017-01-01 12:45:17 +01:00
Ines Montani
4acd026cb6 Add missing documentation to mixins 2017-01-01 12:43:43 +01:00
Ines Montani
e3d84572f2 Fix ents input format example 2017-01-01 12:28:37 +01:00
Ines Montani
a9a7cddf5b Update icons and remove unused SVG meta 2017-01-01 03:18:51 +01:00
Ines Montani
cd0da315d5 Bump version 2017-01-01 03:18:36 +01:00
Ines Montani
3ca8de4666 Use rem value for top/bottom card padding
Fix rendering / interpretation error in Firefox
2017-01-01 03:18:08 +01:00
Ines Montani
2afbf6b6c0 Add missing closing tag for symbol 2017-01-01 03:17:43 +01:00
Ines Montani
d845ab3d20 Add Gitter room to social meta 2017-01-01 03:17:29 +01:00
Ines Montani
b4f6b1da9e Merge pull request #716 from guyrosin/patch-1
Tiny code typo
2016-12-31 14:19:25 +01:00
Guy Rosin
acdd2fc9a6 Tiny code typo 2016-12-31 14:53:05 +02:00
Ines Montani
505f31f2bf Update README.rst 2016-12-31 10:24:24 +01:00
Matthew Honnibal
fde53be3b4 Move whole token match inside _split_affixes. 2016-12-30 17:11:50 -06:00
Matthew Honnibal
3ba7c167a8 Fix URL tests 2016-12-30 17:10:08 -06:00
Matthew Honnibal
9936a1b9b5 Merge branch 'tokenization_w_exception_patterns' of https://github.com/oroszgy/spaCy.hu into oroszgy-tokenization_w_exception_patterns 2016-12-30 14:53:40 -06:00
Magnus Burton
56e2219b65 Added Swedish city abbreviations 2016-12-30 21:17:34 +01:00
Magnus Burton
e935c950d8 Added months and days as abbreviations for Swedish 2016-12-30 21:08:44 +01:00
Matthew Honnibal
3e8d9c772e Test interaction of token_match and punctuation
Check that the new token_match function applies after punctuation is split off.
2016-12-31 00:52:17 +11:00