Matthew Honnibal
0d3bf0d4eb
Merge branch 'master' of https://github.com/explosion/spaCy
2018-03-24 17:31:49 +01:00
dejanmarich
ccd1c04c63
Update stop_words.py
...
Added more words
2018-03-24 17:31:24 +01:00
ines
f1446b0257
Port over Turkish changes
2018-03-24 17:31:07 +01:00
DuyguA
cd604878a4
quick typo fix
2018-03-24 17:26:35 +01:00
Matthew Honnibal
406548b976
Support .gz and .tar.gz files in spacy init-model
2018-03-24 17:18:32 +01:00
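The commit above only states that `spacy init-model` accepts `.gz` and `.tar.gz` inputs. As a rough illustration of what such handling can look like, here is a minimal standard-library sketch. The helper name `open_text_maybe_compressed` and the choice to read the first regular file in an archive are assumptions made for this example, not spaCy's actual implementation.

```python
import gzip
import io
import tarfile
from pathlib import Path


def open_text_maybe_compressed(path):
    """Return a text-mode file object for a plain, .gz, or .tar.gz file.

    Hypothetical helper for illustration only; not taken from spaCy.
    """
    path = Path(path)
    name = path.name
    if name.endswith((".tar.gz", ".tgz")):
        # Assumption: the archive holds one relevant file, so take the
        # first regular member and read it fully into memory.
        with tarfile.open(str(path), "r:gz") as archive:
            for member in archive:
                if member.isfile():
                    data = archive.extractfile(member).read()
                    return io.StringIO(data.decode("utf8"))
        raise ValueError("No regular file found in archive: {}".format(path))
    if name.endswith(".gz"):
        # gzip.open in "rt" mode decompresses and decodes on the fly.
        return gzip.open(str(path), "rt", encoding="utf8")
    return path.open("r", encoding="utf8")
```

The `.tar.gz` check has to come before the `.gz` check, since every `.tar.gz` name also ends in `.gz`.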
Jim O'Regan
efe037e8be
more exceptions
2018-03-24 00:05:27 +00:00
Matthew Honnibal
e3be3d65b3
Version as v2.0.10.dev0
2018-03-15 17:31:22 +01:00
ines
f3f8bfc367
Add built-in factories for merge_entities and merge_noun_chunks
...
Allows adding those components to the pipeline out-of-the-box if they're defined in a model's meta.json. Also allows usage as nlp.add_pipe(nlp.create_pipe('merge_entities')).
2018-03-15 17:16:54 +01:00
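The commit body above describes two ways a built-in factory can be used: automatically, when a component is listed in a model's meta.json, and explicitly, via `nlp.add_pipe(nlp.create_pipe('merge_entities'))`. The following is a toy, spaCy-free sketch of that factory-registry pattern. `ToyLanguage`, its `factories` dict, and the placeholder `merge_entities` function are all hypothetical stand-ins written for this example and do not reproduce spaCy's internals.

```python
# Toy stand-ins; none of these names are spaCy's real internals.
def merge_entities(doc):
    """Placeholder component: a real one would merge entity spans."""
    doc.append("merge_entities ran")
    return doc


class ToyLanguage:
    # Built-in factories map a component name to a callable that builds it.
    factories = {"merge_entities": lambda nlp, **cfg: merge_entities}

    def __init__(self, meta=None):
        self.pipeline = []
        # Components named in the model's meta are created from factories.
        for name in (meta or {}).get("pipeline", []):
            self.add_pipe(self.create_pipe(name), name=name)

    def create_pipe(self, name, **cfg):
        if name not in self.factories:
            raise KeyError("No factory for component: {}".format(name))
        return self.factories[name](self, **cfg)

    def add_pipe(self, component, name=None):
        self.pipeline.append((name or component.__name__, component))

    def __call__(self, doc):
        # Run each component in order, passing the result along.
        for _, component in self.pipeline:
            doc = component(doc)
        return doc
```

In this sketch, `ToyLanguage(meta={"pipeline": ["merge_entities"]})` mirrors the meta.json route, while `nlp.add_pipe(nlp.create_pipe("merge_entities"))` mirrors the explicit call quoted in the commit body.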
alldefector
f4e5904fc2
Fix Spanish noun_chunks failure caused by typo
2018-03-14 17:03:17 +01:00
Thomas Opsomer
fbf48b3f9f
lemma property to return hash instead of unicode
2018-03-14 17:03:00 +01:00
Matthew Honnibal
8cefc58abc
Fix Vectors pickling
2018-03-14 16:59:37 +01:00
Matthew Honnibal
307aefe131
Increment version to v2.0.9
2018-02-22 17:07:53 +01:00
Ines Montani
14e7e0f12a
Merge pull request #2000 from jimregan/polish-tag-map
...
Polish tag map
2018-02-18 19:05:58 +01:00
Jim O'Regan
664407de5d
missing PrepCase attribute
2018-02-18 14:46:12 +00:00
Jim O'Regan
95f0673fbc
fix typo/missing here too
2018-02-18 14:38:27 +00:00
Matthew Honnibal
cf0e320f2b
Add doc.is_sentenced attribute, re #1959
2018-02-18 14:16:55 +01:00
Matthew Honnibal
1e5aeb4eec
Merge pull request #1987 from thomasopsomer/span-sent
...
Make span.sent work when only manual / custom sbd
2018-02-18 14:05:37 +01:00
Matthew Honnibal
1cf774bdc1
Add output options return_matches and as_tuples to Matcher
2018-02-18 14:00:45 +01:00
Matthew Honnibal
dd9b0945af
Fix inconsistencies in the symbols table
2018-02-18 13:51:31 +01:00
Matthew Honnibal
66496ac8e1
Set version to v2.1.0.dev0
2018-02-18 13:48:39 +01:00
Matthew Honnibal
eb3040ce46
Merge pull request #1891 from fucking-signup/master
...
Fix issue #1889
2018-02-18 13:47:47 +01:00
ines
6bba1db4cc
Drop six and related hacks as a dependency
2018-02-18 13:29:56 +01:00
Matthew Honnibal
b30b09192a
Merge pull request #1665 from jimregan/animacy
...
typo in "inan", add "nhum"
2018-02-18 13:26:53 +01:00
Matthew Honnibal
1b3c98e01b
Set version to v2.0.8
2018-02-18 12:16:31 +01:00
Matthew Honnibal
f9f46e5a07
Revert matcher fixes from GregDubbin
2018-02-18 10:59:28 +01:00
Matthew Honnibal
86405e4ad1
Fix CLI for multitask objectives
2018-02-18 10:59:11 +01:00
Matthew Honnibal
a34749b2bf
Add multitask objectives options to train CLI
2018-02-17 22:03:54 +01:00
Matthew Honnibal
8f06903e09
Fix multitask objectives
2018-02-17 18:41:36 +01:00
Matthew Honnibal
d1246c95fb
Fix model loading when using multitask objectives
2018-02-17 18:11:36 +01:00
Matthew Honnibal
262d0a3148
Fix overwriting of lexical attributes when loading vectors during training
2018-02-17 18:11:11 +01:00
Matthew Honnibal
c0caf7cf27
Fix LANG symbol
2018-02-17 18:10:50 +01:00
Matthew Honnibal
0bf2f6be29
Add missing symbol for LANG attr. Fixes inconsistent numeric ID
2018-02-17 17:37:02 +01:00
Matthew Honnibal
97a228a4ce
Increment to v2.0.8.dev0
2018-02-17 16:54:36 +01:00
Aaron Marquez
ea571e8325
Merge branch 'master' into issue-1959
2018-02-16 15:14:09 -08:00
Matthew Honnibal
7d5c720fc3
Fix multitask objective when no pipeline provided
2018-02-15 23:50:21 +01:00
Aaron Marquez
f0d3672e17
Changed loading EN model
2018-02-15 14:28:38 -08:00
Aaron Marquez
3765d84d57
Fix issue #1959
2018-02-15 12:51:49 -08:00
Aaron Marquez
7ba4111554
Add test for issue-1959
2018-02-15 12:46:22 -08:00
Matthew Honnibal
59b7cf9db8
Add get_beam_parse method in ArcEager, for Prodigy
2018-02-15 21:03:16 +01:00
Matthew Honnibal
3e541de440
Merge branch 'master' of https://github.com/explosion/spaCy
2018-02-15 21:02:55 +01:00
Thomas Opsomer
5d24a81c0b
add test for span.sent when doc not parsed
2018-02-15 16:59:16 +01:00
Thomas Opsomer
deab391cbf
correct check on sent_start & raise if no boundaries
2018-02-15 16:58:30 +01:00
Matthew Honnibal
4cb861e080
Merge pull request #1968 from DuyguA/is_currency
...
New lexical feature is_currency
2018-02-15 12:13:36 +01:00
Thomas Opsomer
b902731313
Find span sentence when only sentence boundaries (no parser)
2018-02-14 22:18:54 +01:00
Claudiu-Vlad Ursache
e28de12cbd
Ensure files opened in from_disk are closed
...
Fixes [issue 1706](https://github.com/explosion/spaCy/issues/1706).
2018-02-13 20:49:43 +01:00
Johannes Dollinger
012e874d09
Add contributor agreement for emulbreh
2018-02-13 13:40:33 +01:00
Johannes Dollinger
bf94c13382
Don't fix random seeds on import
2018-02-13 12:42:23 +01:00
Matthew Honnibal
d7c9b53120
Pass kwargs into pipeline components during begin_training
2018-02-12 10:18:39 +01:00
4altinok
ca8728035d
added new lex feat to token
2018-02-11 18:55:48 +01:00
4altinok
edd7202a06
added new symbol
2018-02-11 18:55:32 +01:00
4altinok
ed1ac2969e
added new lexical feat to lexeme
2018-02-11 18:51:48 +01:00
4altinok
94fb0b75e3
code for is_currency
2018-02-11 18:51:32 +01:00
4altinok
3deef1497a
removed 18 and replaced 18 with is_currency
2018-02-11 18:51:09 +01:00
4altinok
471d3c9e23
added lex test for is_currency
2018-02-11 18:50:50 +01:00
ines
c63e99da8a
Fix typo in glossary (resolves #1964)
...
Co-Authored-By: SThomasP <sthomasp@users.noreply.github.com>
2018-02-10 11:58:41 +01:00
Lyndon White
6ee5dff51c
Make Python 3.4 compat module loading (fix #1733)
2018-02-09 23:03:35 +08:00
Matthew Honnibal
e361b4f82b
Fix #1929: Incorrect NER when pre-set sentence boundaries.
2018-02-08 15:25:41 +01:00
Matthew Honnibal
fd9fd275c5
Make test for #1945 more precise
2018-02-07 02:06:11 +01:00
Matthew Honnibal
c087a14380
Merge branch 'master' of https://github.com/explosion/spaCy
2018-02-07 01:29:39 +01:00
Matthew Honnibal
76d89b2180
Add test for #1945: PhraseMatcher regression
2018-02-07 01:29:23 +01:00
Ines Montani
0954e15dda
Merge pull request #1913 from ohenrik/nb_syntax_iterator
...
Norwegian Language (nb) - Added french syntax iterator with explanation
2018-02-06 04:59:07 +01:00
Ole Henrik Skogstrøm
251a7805fe
Copied French syntax iterator to simplify future changes
2018-02-05 14:45:05 +01:00
Matthew Honnibal
2e7391e627
Merge pull request #1916 from tokestermw/bug/fix-not-passing-in-model-cfg-in-nlp
...
Bug/fix not passing in model cfg in nlp
2018-02-05 01:19:40 +01:00
Ali Zarezade
9df9da34a3
Fix init_model issue
...
Fixing issue #1928
2018-02-03 17:21:34 +03:30
Matthew Honnibal
ebe84e45e5
Increment version to 2.0.7
2018-02-02 03:39:16 +01:00
Matthew Honnibal
e4b1f57599
Increment version
2018-02-02 02:33:23 +01:00
Matthew Honnibal
069531c351
Merge branch 'master' of https://github.com/explosion/spaCy
2018-02-02 02:32:58 +01:00
Matthew Honnibal
f74a802d09
Test and fix #1919: Error resuming training
2018-02-02 02:32:40 +01:00
ines
f1d3deffac
Add Russian example sentences (see #1107)
2018-02-01 20:09:40 +01:00
Matthew Honnibal
6b1126c312
Merge branch 'master' of https://github.com/explosion/spaCy
2018-02-01 02:57:52 +01:00
ines
3c1fb9d02d
Make validate command fail more gracefully if version not found
...
Mostly relevant during development when working with .dev versions
2018-01-31 22:06:28 +01:00
Motoki Wu
54062b7326
added tests for issue #1915
2018-01-30 18:30:19 -08:00
Motoki Wu
f4a7d1a423
make sure to pass in **cfg to each component when training
2018-01-30 18:29:54 -08:00
ines
4046823699
Only check component in factories if string (see #1911)
2018-01-30 16:29:07 +01:00
ines
ce10d320c4
Fix component check in self.factories (see #1911)
2018-01-30 16:09:37 +01:00
Ole Henrik Skogstrøm
e40465487c
Added French syntax iterator with explanation
2018-01-30 15:44:29 +01:00
ines
8901814248
Improve error handling if pipeline component is not callable (resolves #1911)
...
Also add help message if user accidentally calls nlp.add_pipe() with a string of a built-in component name.
2018-01-30 15:43:03 +01:00
Matthew Honnibal
a437ba87a3
Set release=True
2018-01-29 21:26:04 +01:00
Adam Binford
9238749aaf
Removed test to avoid network requests
2018-01-29 14:48:20 -05:00
Adam Binford
1a2c2f7d7f
Fixed auto linking after download and added simple test to check
2018-01-29 14:25:21 -05:00
Matthew Honnibal
cb7110c22e
Merge pull request #1882 from ohenrik/nb_lemma_and_tag_map
...
Add norwegian bokmål ('nb') lemmatizer and tag_map
2018-01-29 18:18:50 +01:00
Matthew Honnibal
0c1e7f0c86
Merge pull request #1893 from azarezade/master
...
Add Persian language
2018-01-29 18:18:33 +01:00
Matthew Honnibal
cbdab75b36
Increment version
2018-01-28 23:46:22 +01:00
Matthew Honnibal
512e6adb08
Merge pull request #1896 from thomasopsomer/fix-sent
...
Fix sentence boundaries serialization (issue #1834)
2018-01-28 21:18:51 +01:00
Matthew Honnibal
f5b1ad4100
Limit parser model size, to hopefully reduce memory during CI tests
2018-01-28 21:00:32 +01:00
Thomas Opsomer
515e25910e
fix sent_start in serialization
2018-01-28 19:50:42 +01:00
Thomas Opsomer
45d62561f7
add test for the issue
2018-01-28 19:49:56 +01:00
ines
6d978e5c35
Don't use deprecated Doc.merge call in displaCy
...
As reported here: https://stackoverflow.com/a/48464412/6400719
2018-01-27 11:25:05 +01:00
Ali Zarezade
bb6bd3d8ae
add persian language
2018-01-27 13:27:26 +03:30
Ali Zarezade
d195675db5
add persian language
2018-01-27 13:21:38 +03:30
Kit
4b42267ba3
Fix issue #1889
2018-01-25 23:17:22 +01:00
Kit
52ef51f36e
Add test for issue #1889
2018-01-25 22:56:48 +01:00
Ole Henrik Skogstrøm
8e2c9f2475
Cleaned up nb tag_map comments
2018-01-25 11:09:28 +01:00
Ole Henrik Skogstrøm
1107e89fcf
Updated doc string on nb tag_map module
2018-01-25 11:08:28 +01:00
Matthew Honnibal
6a8cb905aa
Merge pull request #1876 from GregDubbin/master
...
Pattern matcher fixes
2018-01-24 16:38:11 +01:00
Matthew Honnibal
38b260e0c3
Merge pull request #1879 from azarezade/master
...
Add Persian character and symbols
2018-01-24 16:34:22 +01:00
Matthew Honnibal
edb71a280e
Add test for #1883: Unpickling Matcher
2018-01-24 15:42:33 +01:00
Matthew Honnibal
2ad050e668
Fix unpickling of Matcher. Also store correct data in matcher._patterns
2018-01-24 15:42:11 +01:00
Ole Henrik Skogstrøm
4058a7d579
Fix æøå characters in lemmatizer
2018-01-24 14:03:14 +01:00
Ole Henrik Skogstrøm
42248f423f
Updated tag map
2018-01-24 13:50:33 +01:00