Matthew Honnibal
|
393a13d1af
|
* Add unicode em dash to specials.json, so that we can control what POS tag it gets. This way we can prevent sentence boundary detection errors, to address Issue #130.
|
2015-10-09 19:24:33 +11:00 |
|
Matthew Honnibal
|
1490feda29
|
* Make generate_specials pretty-print the specials.json file
|
2015-10-09 19:23:47 +11:00 |
|
Matthew Honnibal
|
1842a53e73
|
* Lemmatize smart quotes as plain quotes
|
2015-10-09 19:09:36 +11:00 |
|
Matthew Honnibal
|
2d9e5bf566
|
* Allow punctuation to be lemmatized
|
2015-10-09 19:02:42 +11:00 |
|
Matthew Honnibal
|
5332c0b697
|
* Add support for punctuation lemmatization, to handle unicode characters. This should help in addressing Issue #130
|
2015-10-09 18:54:40 +11:00 |
|
Matthew Honnibal
|
b71ba2eed5
|
* Add tests for unicode punctuation character lemmatization
|
2015-10-09 18:43:14 +11:00 |
|
Matthew Honnibal
|
c5b2c4ead8
|
* Don't build old license page
|
2015-10-09 14:58:45 +11:00 |
|
Matthew Honnibal
|
4bae38128d
|
* Remove license page from website in repo
|
2015-10-09 14:58:34 +11:00 |
|
Matthew Honnibal
|
00c1992503
|
* Mark tests that require models
|
2015-10-09 14:48:14 +11:00 |
|
Matthew Honnibal
|
dea40cfec3
|
* Mark tests that require models
|
2015-10-09 14:37:48 +11:00 |
|
Matthew Honnibal
|
5031440c35
|
* Mark tests that require models
|
2015-10-09 14:29:28 +11:00 |
|
Matthew Honnibal
|
76936a3456
|
* Mark tests that require models
|
2015-10-09 14:19:07 +11:00 |
|
Matthew Honnibal
|
7b340912d4
|
* Mark tests that require models
|
2015-10-09 14:09:26 +11:00 |
|
Matthew Honnibal
|
20b8c3e281
|
* Mark tests that require models
|
2015-10-09 13:58:01 +11:00 |
|
Matthew Honnibal
|
b125289f30
|
* Fix type declaration in asciied function
|
2015-10-09 13:46:57 +11:00 |
|
Matthew Honnibal
|
9ff288c7bb
|
* Update tests, after removal of spacy.en.attrs
|
2015-10-09 13:37:25 +11:00 |
|
Matthew Honnibal
|
c64fd472b0
|
* Fix travis.yml
|
2015-10-09 12:58:08 +11:00 |
|
Matthew Honnibal
|
f2374ecfb6
|
Merge branch 'master' of ssh://github.com/honnibal/spaCy
|
2015-10-09 12:48:34 +11:00 |
|
Matthew Honnibal
|
5af4b62fe7
|
* Filter out phrases that consist of common, lower-case words.
|
2015-10-09 12:47:43 +11:00 |
|
Matthew Honnibal
|
4bbc8f45c6
|
* Fix multi word matcher
|
2015-10-09 02:02:37 +11:00 |
|
Matthew Honnibal
|
801d55a6d9
|
* Fix phrase matcher
|
2015-10-09 02:00:45 +11:00 |
|
Matthew Honnibal
|
7b23442543
|
Merge pull request #133 from pquentin/patch-2
Fix typo
|
2015-10-08 21:47:04 +11:00 |
|
Quentin Pradet
|
1a71706c05
|
Fix typo
|
2015-10-08 14:22:23 +04:00 |
|
Matthew Honnibal
|
b3a70e6375
|
* Clean up unnecessary try/except block
|
2015-10-08 14:34:11 +11:00 |
|
Matthew Honnibal
|
4513bed175
|
* Avoid compiling unused files
|
2015-10-08 14:00:34 +11:00 |
|
Matthew Honnibal
|
e3e8994368
|
* Patch italian tag map
|
2015-10-08 14:00:13 +11:00 |
|
Matthew Honnibal
|
2d68f75b6a
|
* Fix identity tag map
|
2015-10-08 13:59:56 +11:00 |
|
Matthew Honnibal
|
5890682ed1
|
* Fix multi_word_matches script
|
2015-10-08 13:59:32 +11:00 |
|
Matthew Honnibal
|
a83253b455
|
Merge pull request #129 from chrisdubois/patch-1
Fix size of allocation when creating a pattern
|
2015-10-08 12:04:41 +11:00 |
|
Matthew Honnibal
|
6ea1601e93
|
* Add script to train models off the UD treebanks. Note that the UD data is restricted to research purposes only, and should only be used to train models for academic experiments.
|
2015-10-08 12:01:08 +11:00 |
|
Chris DuBois
|
e095faa785
|
Add contributor.
|
2015-10-07 17:55:46 -07:00 |
|
chrisdubois
|
cc47b8ad6a
|
Fix size of allocation when creating a pattern
Each pattern object currently contains two AttrValues rather than just one.
|
2015-10-07 10:32:55 -07:00 |
|
Matthew Honnibal
|
b228a8f4a6
|
* Remove spacy/en/attrs
|
2015-10-06 16:20:46 +11:00 |
|
Matthew Honnibal
|
693677fd8d
|
* Prepare to remove en/attrs file, now that we are moving to symbols.pyx
|
2015-10-06 16:20:13 +11:00 |
|
Matthew Honnibal
|
63bd17135f
|
* Whitespace
|
2015-10-06 10:37:07 +11:00 |
|
Matthew Honnibal
|
e7c31f7eae
|
* Tweak information extraction example
|
2015-10-06 10:35:49 +11:00 |
|
Matthew Honnibal
|
c503654ec1
|
* Update bin/parser/train for printing output.
|
2015-10-06 10:35:22 +11:00 |
|
Matthew Honnibal
|
3d9f41c2c9
|
* Add LookupError for better error reporting in Vocab
|
2015-10-06 10:34:59 +11:00 |
|
Matthew Honnibal
|
ecc5281b36
|
* Remove en/pos.pyx, as the tagger code now lives in spacy/tagger.pyx
|
2015-10-06 10:12:08 +11:00 |
|
Matthew Honnibal
|
e4ba8a4b5a
|
* Add multi word matching code
|
2015-10-06 09:06:52 +11:00 |
|
Matthew Honnibal
|
262c215b55
|
examples/information_extraction.py
* Add very simple information extraction snippet.
|
2015-10-01 22:27:57 +10:00 |
|
Matthew Honnibal
|
fd72b8b282
|
* Add a test for Issue #118: Matcher behaves unpredictably with overlapping entities
|
2015-10-01 16:21:00 +10:00 |
|
Matthew Honnibal
|
73928001ed
|
* Set details(open=true) on docs while we redesign
|
2015-09-30 11:48:15 +10:00 |
|
Matthew Honnibal
|
04c92d4f89
|
* Update comparisons
|
2015-09-29 23:07:00 +10:00 |
|
Matthew Honnibal
|
e7a7f3bd63
|
* Fix indentation error in API docs.
|
2015-09-29 23:05:04 +10:00 |
|
Matthew Honnibal
|
bf4d30c5b6
|
* Fix test failures in test_api
|
2015-09-29 23:04:20 +10:00 |
|
Matthew Honnibal
|
87e6186828
|
* Rename _seq to doc attribute in Span
|
2015-09-29 23:03:55 +10:00 |
|
Matthew Honnibal
|
ab694b0364
|
* Fix open-bounded slice indices.
|
2015-09-29 23:03:09 +10:00 |
|
Matthew Honnibal
|
e562f504ee
|
* Fix license metadata in setup.py
|
2015-09-29 23:02:37 +10:00 |
|
Matthew Honnibal
|
69f0c2cd26
|
* Fix typo in README
|
2015-09-29 23:02:08 +10:00 |
|