10988 Commits

Author SHA1 Message Date
Magnus Burton
19c0ce745a Added swedish lemma rules 2017-02-04 17:53:32 +01:00
Ines Montani
cf529f4774 Merge pull request #806 from wallinm1/fix/swedish-tokenizer-exceptions
Fix issue #805
2017-02-04 17:40:40 +01:00
Michael Wallin
d25556bf80 [issue 805] Fix issue 2017-02-04 16:22:21 +02:00
Michael Wallin
35100c8bdd [issue 805] Add regression test and the required fixture 2017-02-04 16:21:34 +02:00
ines
a44da8fb34 Update language models and alpha support overview 2017-02-04 13:49:05 +01:00
Ines Montani
708cd37a2e Update README.rst 2017-02-04 13:42:46 +01:00
Ines Montani
ff91be6d17 Update CONTRIBUTORS.md 2017-02-04 13:41:21 +01:00
ines
0ab353b0ca Add line breaks to Finnish stop words for better readability 2017-02-04 13:40:25 +01:00
Ines Montani
3431e7b86f Merge pull request #804 from wallinm1/finnish-alpha-support
Alpha support for Finnish
2017-02-04 13:37:08 +01:00
Michael Wallin
55b1e5e682 [finnish] Add contributor file 2017-02-04 13:54:10 +02:00
Michael Wallin
1a1952afa5 [finnish] Add initial tests for tokenizer 2017-02-04 13:54:10 +02:00
Michael Wallin
f9bb25d1cf [finnish] Reformat and correct stop words 2017-02-04 13:54:10 +02:00
Michael Wallin
73f66ec570 Add preliminary support for Finnish 2017-02-04 13:54:10 +02:00
Ines Montani
932aaba7de Update CONTRIBUTORS.md 2017-02-03 10:55:42 +01:00
Ines Montani
65d6202107 Merge pull request #802 from Tpt/fr-tokenizer
Adds more French tokenizer exceptions
2017-02-03 10:52:20 +01:00
Tpt
75a74857bb Adds more French tokenizer exceptions 2017-02-03 13:45:18 +04:00
Ines Montani
afc6365388 Update regression test for #801 to match current expected behaviour 2017-02-02 16:23:05 +01:00
Ines Montani
012f4820cb Keep infixes of punctuation + hyphens as one token (see #801) 2017-02-02 16:22:40 +01:00
Ines Montani
1219a5f513 Add = to tokenizer prefixes 2017-02-02 16:21:11 +01:00
Ines Montani
ff04748eb6 Add missing emoticon 2017-02-02 16:21:00 +01:00
Ines Montani
13a4ab37e0 Add regression test for #801 2017-02-02 15:33:52 +01:00
Raphaël Bournhonesque
85f951ca99 Add tokenizer exceptions for French 2017-02-02 08:36:16 +01:00
Matthew Honnibal
16ce7409e4 Merge branch 'master' of https://github.com/explosion/spaCy 2017-01-31 13:27:34 -06:00
Matthew Honnibal
80aa4e114b Fix x keras deep learning example 2017-01-31 13:27:13 -06:00
Ines Montani
ad0e4e4532 Merge pull request #794 from ematvey/count_by_doc_update
Small `Doc.count_by` documentation update
2017-01-31 20:11:47 +01:00
Matvey Ezhov
32a22291bc Small Doc.count_by documentation update
Current example doesn't work
2017-01-31 19:18:45 +03:00
Ines Montani
e4875834fe Fix formatting 2017-01-31 15:19:33 +01:00
Ines Montani
c304834e45 Add missing import 2017-01-31 15:18:30 +01:00
Ines Montani
626ac282fe Merge pull request #793 from latkins/master
Added regression test for Issue #792.
2017-01-31 15:16:23 +01:00
Ines Montani
e6465b9ca3 Parametrize test cases and mark as xfail 2017-01-31 15:14:42 +01:00
latkins
e4c84321a5 Added regression test for Issue #792. 2017-01-31 13:47:42 +00:00
Matthew Honnibal
6c665b81df Fix redundant == TAG in from_array conditional 2017-01-31 00:46:21 +11:00
Matthew Honnibal
3ea0df6ba7 Merge pull request #782 from raphael0202/dep_version
Specify version number for ujson and plac
2017-01-29 05:32:45 +11:00
Raphaël Bournhonesque
0c2e5539ce Specify version number for ujson and plac
The required version was specified for plac in requirements.txt but not in setup.py, which could cause a conflicting version error.
Similarly, set the version of ujson in requirements.txt to be the same as in setup.py
2017-01-28 18:38:14 +01:00
Matthew Honnibal
afd622fe04 Merge branch 'master' of ssh://github.com/explosion/spaCy 2017-01-27 12:28:30 +01:00
Matthew Honnibal
ab70f6e18d Update NER training example 2017-01-27 12:27:10 +01:00
Ines Montani
651bf411e0 Add tutorial 2017-01-26 13:48:38 +01:00
Ines Montani
da3aca4020 Fix formatting 2017-01-26 13:48:29 +01:00
Ines Montani
baa6be8180 Update latest news to last blog post 2017-01-26 13:47:45 +01:00
Ines Montani
bdafb514c5 Update version 2017-01-26 13:47:32 +01:00
Ines Montani
19501f3340 Add regression test for #775 2017-01-25 13:16:52 +01:00
Ines Montani
209c37bbcf Exclude "shell" and "Shell" from English tokenizer exceptions (resolves #775) 2017-01-25 13:15:02 +01:00
Ines Montani
a3c92e1bf6 Update README.rst 2017-01-25 10:48:09 +01:00
Ines Montani
c784b49d33 Merge pull request #772 from raphael0202/french-support
Add French tokenization support
2017-01-24 14:27:16 +01:00
Raphaël Bournhonesque
1be9c0e724 Add fr tokenization unit tests 2017-01-24 10:57:37 +01:00
Raphaël Bournhonesque
1faaf698ca Add infixes and abbreviation exceptions (fr) 2017-01-24 10:57:37 +01:00
Raphaël Bournhonesque
cf8474401b Remove unused import statement 2017-01-24 10:57:37 +01:00
Raphaël Bournhonesque
902f136f18 Add support for elision in French 2017-01-24 10:57:37 +01:00
Ines Montani
199ae10690 Update CONTRIBUTORS.md 2017-01-23 21:36:53 +01:00
Ines Montani
55c9c62abc Use relative import 2017-01-23 21:27:49 +01:00