Matthew Honnibal
|
a510858f5a
|
* Pretty-print specials.json, and add the em dash
|
2015-10-09 11:07:45 +02:00 |
|
Matthew Honnibal
|
49600a44a8
|
* Fix trailing comma in lemma_rules.json
|
2015-10-09 11:06:57 +02:00 |
|
Matthew Honnibal
|
0e92e8574a
|
* Fix pos tag in em-dash in specials
|
2015-10-09 11:06:37 +02:00 |
|
Matthew Honnibal
|
d341443282
|
* Remove em-dash from lemma rules. Handle instead in specials.
|
2015-10-09 10:27:13 +02:00 |
|
Matthew Honnibal
|
b6047afe4c
|
* Fix punctuation lemma rules, to resolve Issue #130
|
2015-10-09 10:25:37 +02:00 |
|
Matthew Honnibal
|
393a13d1af
|
* Add unicode em dash to specials.json, so that we can control what POS tag it gets. This way we can prevent sentence boundary detection errors, to address Issue #130.
|
2015-10-09 19:24:33 +11:00 |
|
Matthew Honnibal
|
1490feda29
|
* Make generate_specials pretty-print the specials.json file
|
2015-10-09 19:23:47 +11:00 |
|
Matthew Honnibal
|
1842a53e73
|
* Lemmatize smart quotes as plain quotes
|
2015-10-09 19:09:36 +11:00 |
|
Matthew Honnibal
|
2d9e5bf566
|
* Allow punctuation to be lemmatized
|
2015-10-09 19:02:42 +11:00 |
|
Matthew Honnibal
|
5332c0b697
|
* Add support for punctuation lemmatization, to handle unicode characters. This should help in addressing Issue #130
|
2015-10-09 18:54:40 +11:00 |
|
Matthew Honnibal
|
b71ba2eed5
|
* Add tests for unicode punctuation character lemmatization
|
2015-10-09 18:43:14 +11:00 |
|
Yubing (Tom) Dong
|
9a6811acc4
|
Merge remote-tracking branch 'upstream/master'
|
2015-10-08 22:53:02 -07:00 |
|
Henning Peters
|
0e13f18ea4
|
remove compile warning noise
|
2015-10-09 07:23:39 +02:00 |
|
Matthew Honnibal
|
c5b2c4ead8
|
* Don't build old license page
|
2015-10-09 14:58:45 +11:00 |
|
Matthew Honnibal
|
4bae38128d
|
* Remove license page from website in repo
|
2015-10-09 14:58:34 +11:00 |
|
Matthew Honnibal
|
00c1992503
|
* Mark tests that require models
|
2015-10-09 14:48:14 +11:00 |
|
Matthew Honnibal
|
dea40cfec3
|
* Mark tests that require models
|
2015-10-09 14:37:48 +11:00 |
|
Matthew Honnibal
|
5031440c35
|
* Mark tests that require models
|
2015-10-09 14:29:28 +11:00 |
|
Matthew Honnibal
|
76936a3456
|
* Mark tests that require models
|
2015-10-09 14:19:07 +11:00 |
|
Matthew Honnibal
|
7b340912d4
|
* Mark tests that require models
|
2015-10-09 14:09:26 +11:00 |
|
Matthew Honnibal
|
20b8c3e281
|
* Mark tests that require models
|
2015-10-09 13:58:01 +11:00 |
|
Matthew Honnibal
|
b125289f30
|
* Fix type declaration in asciied function
|
2015-10-09 13:46:57 +11:00 |
|
Matthew Honnibal
|
9ff288c7bb
|
* Update tests, after removal of spacy.en.attrs
|
2015-10-09 13:37:25 +11:00 |
|
Matthew Honnibal
|
c64fd472b0
|
* Fix travis.yml
|
2015-10-09 12:58:08 +11:00 |
|
Matthew Honnibal
|
f2374ecfb6
|
Merge branch 'master' of ssh://github.com/honnibal/spaCy
|
2015-10-09 12:48:34 +11:00 |
|
Matthew Honnibal
|
5af4b62fe7
|
* Filter out phrases that consist of common, lower-case words.
|
2015-10-09 12:47:43 +11:00 |
|
Matthew Honnibal
|
4bbc8f45c6
|
* Fix multi word matcher
|
2015-10-09 02:02:37 +11:00 |
|
Matthew Honnibal
|
801d55a6d9
|
* Fix phrase matcher
|
2015-10-09 02:00:45 +11:00 |
|
Matthew Honnibal
|
7b23442543
|
Merge pull request #133 from pquentin/patch-2
Fix typo
|
2015-10-08 21:47:04 +11:00 |
|
Quentin Pradet
|
1a71706c05
|
Fix typo
|
2015-10-08 14:22:23 +04:00 |
|
Matthew Honnibal
|
b3a70e6375
|
* Clean up unnecessary try/except block
|
2015-10-08 14:34:11 +11:00 |
|
Matthew Honnibal
|
4513bed175
|
* Avoid compiling unused files
|
2015-10-08 14:00:34 +11:00 |
|
Matthew Honnibal
|
e3e8994368
|
* Patch italian tag map
|
2015-10-08 14:00:13 +11:00 |
|
Matthew Honnibal
|
2d68f75b6a
|
* Fix identity tag map
|
2015-10-08 13:59:56 +11:00 |
|
Matthew Honnibal
|
5890682ed1
|
* Fix multi_word_matches script
|
2015-10-08 13:59:32 +11:00 |
|
Matthew Honnibal
|
a83253b455
|
Merge pull request #129 from chrisdubois/patch-1
Fix size of allocation when creating a pattern
|
2015-10-08 12:04:41 +11:00 |
|
Matthew Honnibal
|
6ea1601e93
|
* Add script to train models off the UD treebanks. Note that the UD data is restricted to research purposes only, and should only be used to train models for academic experiments.
|
2015-10-08 12:01:08 +11:00 |
|
Chris DuBois
|
e095faa785
|
Add contributor.
|
2015-10-07 17:55:46 -07:00 |
|
chrisdubois
|
cc47b8ad6a
|
Fix size of allocation when creating a pattern
Each pattern object currently contains two AttrValues rather than just one.
|
2015-10-07 10:32:55 -07:00 |
|
Yubing (Tom) Dong
|
0f601b8b75
|
Update docstring of Doc.__getitem__
|
2015-10-07 01:27:28 -07:00 |
|
Yubing (Tom) Dong
|
3fd3bc79aa
|
Refactor to remove duplicate slicing logic
|
2015-10-07 01:25:35 -07:00 |
|
Yubing (Tom) Dong
|
97685aecb7
|
Add slicing support to Span
|
2015-10-06 02:45:49 -07:00 |
|
Yubing (Tom) Dong
|
5cc2f2b01a
|
Test simple indexing for Span
|
2015-10-06 02:41:46 -07:00 |
|
Yubing (Tom) Dong
|
ef2af20cd3
|
Make Doc's slicing behavior conform to Python conventions
|
2015-10-06 02:41:28 -07:00 |
|
Yubing (Tom) Dong
|
2fc33e8024
|
Allow step=1 when slicing a Doc
|
2015-10-06 00:57:05 -07:00 |
|
Yubing (Tom) Dong
|
73566899bf
|
Add Doc slicing tests
|
2015-10-06 00:57:01 -07:00 |
|
Matthew Honnibal
|
b228a8f4a6
|
* Remove spacy/en/attrs
|
2015-10-06 16:20:46 +11:00 |
|
Matthew Honnibal
|
693677fd8d
|
* Prepare to remove en/attrs file, now that moving to symbols.pyx
|
2015-10-06 16:20:13 +11:00 |
|
Matthew Honnibal
|
63bd17135f
|
* Whitespace
|
2015-10-06 10:37:07 +11:00 |
|
Matthew Honnibal
|
e7c31f7eae
|
* Tweak information extraction example
|
2015-10-06 10:35:49 +11:00 |
|