4017 Commits

Author SHA1 Message Date
Ines Montani
d3b181cdf1 Merge remote-tracking branch 'origin/master'
# Conflicts:
#	spacy/en/tokenizer_exceptions.py
2017-01-03 18:20:01 +01:00
Ines Montani
b19cfcc144 Reorganise English tokenizer exceptions (as discussed in #718)
Add logic to generate exceptions that follow a consistent pattern (like
verbs and pronouns) and allow certain tokens to be excluded explicitly.
2017-01-03 18:17:57 +01:00
Ines Montani
4fc4d3d0e3 Update PULL_REQUEST_TEMPLATE.md 2017-01-03 15:41:16 +01:00
Ines Montani
1bd53bbf89 Fix typos (resolves #718) 2017-01-03 11:26:21 +01:00
Matthew Honnibal
9b48bd161b Merge pull request #700 from oroszgy/tokenization_w_exception_patterns
Tokenization with exception patterns
2017-01-03 09:56:37 +11:00
Ines Montani
1b82756cc7 Tidy up and fix formatting and consistency 2017-01-02 00:29:24 +01:00
Ines Montani
614f95f3bf Remove help cursor from API links 2017-01-02 00:29:08 +01:00
Ines Montani
87c7496065 Use better chat window icons with more compact markup 2017-01-01 13:25:28 +01:00
Ines Montani
a1a4b253a1 Add Gitter chat widget component to docs 2017-01-01 12:46:01 +01:00
Ines Montani
78e54b375f Move scripts to own file 2017-01-01 12:45:37 +01:00
Ines Montani
134e115d9c Bump version 2017-01-01 12:45:17 +01:00
Ines Montani
4acd026cb6 Add missing documentation to mixins 2017-01-01 12:43:43 +01:00
Ines Montani
e3d84572f2 Fix ents input format example 2017-01-01 12:28:37 +01:00
Ines Montani
a9a7cddf5b Update icons and remove unused SVG meta 2017-01-01 03:18:51 +01:00
Ines Montani
cd0da315d5 Bump version 2017-01-01 03:18:36 +01:00
Ines Montani
3ca8de4666 Use rem value for top/bottom card padding
Fix rendering / interpretation error in Firefox
2017-01-01 03:18:08 +01:00
Ines Montani
2afbf6b6c0 Add missing closing tag for symbol 2017-01-01 03:17:43 +01:00
Ines Montani
d845ab3d20 Add Gitter room to social meta 2017-01-01 03:17:29 +01:00
Ines Montani
b4f6b1da9e Merge pull request #716 from guyrosin/patch-1
Tiny code typo
2016-12-31 14:19:25 +01:00
Guy Rosin
acdd2fc9a6 Tiny code typo 2016-12-31 14:53:05 +02:00
Ines Montani
505f31f2bf Update README.rst 2016-12-31 10:24:24 +01:00
Matthew Honnibal
fde53be3b4 Move whole token match inside _split_affixes. 2016-12-30 17:11:50 -06:00
Matthew Honnibal
3ba7c167a8 Fix URL tests 2016-12-30 17:10:08 -06:00
Matthew Honnibal
9936a1b9b5 Merge branch 'tokenization_w_exception_patterns' of https://github.com/oroszgy/spaCy.hu into oroszgy-tokenization_w_exception_patterns 2016-12-30 14:53:40 -06:00
Matthew Honnibal
3e8d9c772e Test interaction of token_match and punctuation
Check that the new token_match function applies after punctuation is split off.
2016-12-31 00:52:17 +11:00
Matthew Honnibal
623d94e14f Whitespace 2016-12-31 00:30:28 +11:00
Ines Montani
9d39e7853a Merge pull request #713 from petterhh/patch-1
Add PART to tag map
2016-12-28 18:51:09 +01:00
Petter Hohle
f112e7754e Add PART to tag map
16 of the 17 PoS tags in the UD tag set are added; PART is missing.
2016-12-28 18:39:01 +01:00
Ines Montani
14295f9302 Update README.rst 2016-12-28 00:55:00 +01:00
Ines Montani
9f24eb3fd9 Update CONTRIBUTORS.md 2016-12-28 00:25:07 +01:00
Ines Montani
d1585959d9 Add Hungarian to alpha support overview 2016-12-27 22:31:41 +01:00
Ines Montani
decb7437ea Update README.rst 2016-12-27 22:19:19 +01:00
Ines Montani
e80dad8616 Update version 2016-12-27 22:18:48 +01:00
Matthew Honnibal
f62db78dc3 Increment version 2016-12-27 21:11:22 +01:00
Matthew Honnibal
cade536d1e Merge branch 'master' of ssh://github.com/explosion/spaCy 2016-12-27 21:04:10 +01:00
Matthew Honnibal
ce4539dafd Allow the vocabulary to grow to 10,000, to prevent cold-start problem. 2016-12-27 21:03:45 +01:00
Ines Montani
ad3669cef5 Merge pull request #703 from magnusburton/master
Added Swedish abbreviations
2016-12-27 01:01:49 +01:00
Ines Montani
223142d3d3 Update CONTRIBUTORS.md 2016-12-27 00:49:26 +01:00
Ines Montani
78f754dd9a Merge pull request #705 from oroszgy/hu_tokenizer
Initial support for Hungarian
2016-12-27 00:48:13 +01:00
Gyorgy Orosz
ef8f3103f2 Merge branch 'hu_tokenizer' of github.com:oroszgy/spaCy into hu_tokenizer 2016-12-26 22:39:17 +01:00
Gyorgy Orosz
ade7487ff8 Accepted contributor agreement. 2016-12-26 22:37:02 +01:00
Ines Montani
b7becaec85 Fix typo 2016-12-25 15:23:32 +01:00
Ines Montani
6dd8ae1b0d Update README.md 2016-12-25 14:43:40 +01:00
Ines Montani
f6f6e028ea Make links detect target automatically and replace false with null for no attribute
New version of Harp would render attribute=false as attribute="false",
while attribute=null renders element without attribute.
2016-12-24 12:24:04 +01:00
Ines Montani
b893126c12 Use link mixin instead of plain link markup 2016-12-24 12:22:52 +01:00
Ines Montani
8785706039 Reformat stop words for better readability 2016-12-24 00:58:40 +01:00
Gyorgy Orosz
45e045a87b Unicode/UTF8 compatibility for Python2 2016-12-24 00:21:00 +01:00
Gyorgy Orosz
72b61b6d03 Typo fix. 2016-12-24 00:10:29 +01:00
Gyorgy Orosz
3a9be4d485 Updated token exception handling mechanism to allow the usage of arbitrary functions as token exception matchers. 2016-12-23 23:49:34 +01:00
Ines Montani
207555fae7 Fix spelling 2016-12-23 21:36:01 +01:00