Matthew Honnibal
|
a6a2159969
|
Add slot for text categories to Doc
|
2017-07-22 00:34:15 +02:00 |
|
Matthew Honnibal
|
374ab3ecfb
|
Increment alpha version
|
2017-07-22 00:32:49 +02:00 |
|
Matthew Honnibal
|
289f23df51
|
Test beam parsing
|
2017-07-20 15:03:10 +02:00 |
|
Matthew Honnibal
|
3da1063b36
|
Add beam decoding to parser, to allow NER uncertainties
|
2017-07-20 15:02:55 +02:00 |
|
Matthew Honnibal
|
0ca5832427
|
Improve negative example handling in NER oracle
|
2017-07-20 00:18:49 +02:00 |
|
Matthew Honnibal
|
a231b56d40
|
Add text-classification hook to pipeline
|
2017-07-20 00:18:15 +02:00 |
|
Matthew Honnibal
|
7ea50182a5
|
Add support for text-classification labels to GoldParse
|
2017-07-20 00:17:47 +02:00 |
|
Matthew Honnibal
|
727481377e
|
Add text-classifier thinc models
|
2017-07-20 00:17:17 +02:00 |
|
Matthew Honnibal
|
f014138c11
|
Fix parser tests
|
2017-07-20 00:16:52 +02:00 |
|
Ines Montani
|
c91642efd5
|
Port over changes from #1168
|
2017-07-01 11:43:54 +02:00 |
|
Jim Regan
|
d81ceb0cd5
|
Merge branch 'develop' into polish
|
2017-06-26 22:42:27 +01:00 |
|
Jim O'Regan
|
2f84c73585
|
a start
|
2017-06-26 22:40:04 +01:00 |
|
Jim O'Regan
|
28d7f0a672
|
reference
|
2017-06-26 22:38:28 +01:00 |
|
Matthew Honnibal
|
91e52543ef
|
Merge pull request #1118 from Gregory-Howard/patch-2
Update _tokenizer_exceptions_list (adding cities)
|
2017-06-20 11:16:07 +02:00 |
|
Matthew Honnibal
|
8ea785e01a
|
Merge pull request #1119 from oroszgy/patch-3
Fixed conllu converter
|
2017-06-20 11:14:41 +02:00 |
|
Tpt
|
7745b3ae04
|
Adds noun chunks to French syntax iterators
|
2017-06-12 15:29:58 +02:00 |
|
Tpt
|
57e8254f63
|
Adds function to extract French noun chunks
|
2017-06-12 15:20:49 +02:00 |
|
György Orosz
|
62dbf9025c
|
Fixed conllu converter
|
2017-06-09 22:53:56 +02:00 |
|
Grégory Howard
|
cd974b32b7
|
Update _tokenizer_exceptions_list (adding cities)
|
2017-06-09 17:58:18 +02:00 |
|
ines
|
34a2eecb17
|
Add simple "naughty strings" test (see #1107)
|
2017-06-06 17:43:51 +02:00 |
|
ines
|
045574a936
|
Update package name and increment version
|
2017-06-05 20:41:30 +02:00 |
|
Matthew Honnibal
|
1f5874a927
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-06-05 20:20:00 +02:00 |
|
ines
|
03db56f48c
|
Detect spaCy version and add package title
Package title allows customised package names (like spacy-nightly)
|
2017-06-05 20:11:02 +02:00 |
|
Matthew Honnibal
|
c0d90f52f7
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-06-05 19:20:13 +02:00 |
|
ines
|
cc9c5dc7a3
|
Fix noun chunks test
|
2017-06-05 16:39:04 +02:00 |
|
Matthew Honnibal
|
836bfa2d0f
|
Add factory for experimental SimilarityHook component
|
2017-06-05 15:40:22 +02:00 |
|
Matthew Honnibal
|
d59fa32df1
|
Add experimental SimilarityHook component
|
2017-06-05 15:40:03 +02:00 |
|
Matthew Honnibal
|
5489b49203
|
Remove print statement
|
2017-06-05 13:20:41 +02:00 |
|
Matthew Honnibal
|
fc4204a12a
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-06-05 13:13:23 +02:00 |
|
Matthew Honnibal
|
2479cde446
|
Support disable keyword in Language.__init__
|
2017-06-05 13:13:07 +02:00 |
|
ines
|
ea167e14db
|
Fix model package loading from link
|
2017-06-05 13:10:49 +02:00 |
|
ines
|
dd6dc4c120
|
Update spacy.load() helper functions
|
2017-06-05 13:02:31 +02:00 |
|
Matthew Honnibal
|
b4cdd05466
|
Add vectors.pyx in setup
|
2017-06-05 12:45:29 +02:00 |
|
Matthew Honnibal
|
280d419529
|
Add pickle method for vectors
|
2017-06-05 12:36:04 +02:00 |
|
Matthew Honnibal
|
30369d580f
|
Start testing Vectors class
|
2017-06-05 12:32:49 +02:00 |
|
Matthew Honnibal
|
eb7cbb62c2
|
Flesh out Vectors class
|
2017-06-05 12:32:08 +02:00 |
|
ines
|
51d7414e94
|
Make sure sents are a list
|
2017-06-05 12:30:13 +02:00 |
|
Matthew Honnibal
|
ebb6c49cd5
|
Make alignment case-insensitive for gold
|
2017-06-04 20:26:42 -05:00 |
|
Matthew Honnibal
|
fc4dd62e84
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-06-04 20:19:05 -05:00 |
|
Matthew Honnibal
|
8f8f90b46b
|
Disable labeller if not parsing
|
2017-06-04 20:18:54 -05:00 |
|
Matthew Honnibal
|
c52fde40f4
|
Improve train CLI
|
2017-06-04 20:18:37 -05:00 |
|
Matthew Honnibal
|
a053b1218e
|
Fix item counting during training
|
2017-06-04 20:18:20 -05:00 |
|
Matthew Honnibal
|
b3b5521625
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-06-04 20:17:18 -05:00 |
|
Matthew Honnibal
|
9bc4a26213
|
Add option of data augmentation noise
|
2017-06-04 20:16:57 -05:00 |
|
Matthew Honnibal
|
7b2ede783d
|
Add SP tag to tag map if missing
|
2017-06-04 20:16:30 -05:00 |
|
ines
|
a0f4592f0a
|
Update tests
|
2017-06-05 02:26:13 +02:00 |
|
ines
|
3e105bcd36
|
Update tests
|
2017-06-05 02:09:27 +02:00 |
|
Matthew Honnibal
|
516798e9fc
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-06-05 01:35:21 +02:00 |
|
Matthew Honnibal
|
193bf913c0
|
Set is_tagged=True after tagging
|
2017-06-05 01:35:07 +02:00 |
|
ines
|
078232932c
|
Fix tokenizer fixture scope
|
2017-06-05 01:06:34 +02:00 |
|
Matthew Honnibal
|
58be0e1f6f
|
Update tests
|
2017-06-04 16:35:06 -05:00 |
|
Matthew Honnibal
|
b78cc318c3
|
Fix loading of morphology exceptions
|
2017-06-04 16:34:32 -05:00 |
|
Matthew Honnibal
|
bb98d45a63
|
Fix tests
|
2017-06-04 16:00:44 -05:00 |
|
Matthew Honnibal
|
55d0621532
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-06-04 15:53:25 -05:00 |
|
Matthew Honnibal
|
5b9f116aca
|
Update tests
|
2017-06-04 15:53:17 -05:00 |
|
Matthew Honnibal
|
2a3bd5ee90
|
Fix fetching of noun chunk iterator
|
2017-06-04 15:53:05 -05:00 |
|
Matthew Honnibal
|
3680c51b8f
|
Avoid clobbering preset POS tags
|
2017-06-04 15:52:42 -05:00 |
|
Matthew Honnibal
|
939e8ed567
|
Add lookup properties for components in Language
|
2017-06-04 15:52:09 -05:00 |
|
Matthew Honnibal
|
e28f90b672
|
Fix syntax iterators
|
2017-06-04 15:51:50 -05:00 |
|
ines
|
8a29308d0b
|
Remove unused imports
|
2017-06-04 22:39:29 +02:00 |
|
Ines Montani
|
112c5787eb
|
Merge pull request #1101 from oroszgy/hu_tokenizer_fix
More robust Hungarian tokenizer.
|
2017-06-04 22:37:51 +02:00 |
|
ines
|
96867a24ae
|
Fix typo
|
2017-06-04 22:36:40 +02:00 |
|
ines
|
f432bb4b48
|
Fix fixture scopes
|
2017-06-04 22:34:31 +02:00 |
|
Matthew Honnibal
|
6d0356e6cc
|
Whitespace
|
2017-06-04 14:55:24 -05:00 |
|
Matthew Honnibal
|
8a683a4494
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-06-04 21:53:56 +02:00 |
|
Matthew Honnibal
|
92ae36f84e
|
Improve way noun chunks iterator is looked up
|
2017-06-04 21:53:39 +02:00 |
|
ines
|
9254a3dd78
|
Import and add Spanish syntax iterators
|
2017-06-04 21:42:15 +02:00 |
|
ines
|
7db1a0e83e
|
Make sure printed values are always strings
|
2017-06-04 21:27:20 +02:00 |
|
Matthew Honnibal
|
51e1541ddb
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-06-04 14:26:29 -05:00 |
|
Matthew Honnibal
|
add9a33782
|
Return False for vocab.has_vector
|
2017-06-04 14:26:14 -05:00 |
|
Matthew Honnibal
|
675f448313
|
Fix vector linkage on Doc
|
2017-06-04 14:25:30 -05:00 |
|
Matthew Honnibal
|
f4662e9218
|
Fix vector linkage for token
|
2017-06-04 14:19:58 -05:00 |
|
ines
|
070e026ed9
|
Ensure path on read_json
|
2017-06-04 20:44:37 +02:00 |
|
ines
|
e1e73936b1
|
Raise correct error
|
2017-06-04 20:44:27 +02:00 |
|
ines
|
848e47669e
|
Fix typo
|
2017-06-04 20:44:15 +02:00 |
|
ines
|
c4614c02a2
|
Fix dev resources URL
|
2017-06-04 15:45:50 +02:00 |
|
ines
|
a66cf24ee8
|
xfail tokenizer serialization tests for now
Tests pass locally, but not on Travis – needs more investigation
|
2017-06-04 13:58:20 +02:00 |
|
ines
|
7b7d46b64e
|
Fix typo and success message
|
2017-06-04 13:45:50 +02:00 |
|
ines
|
90d117f378
|
Update version
|
2017-06-04 13:41:16 +02:00 |
|
Matthew Honnibal
|
7ca215bc26
|
Resolve lex_attr_getters conflict
|
2017-06-03 16:12:01 -05:00 |
|
Matthew Honnibal
|
21eef90dbc
|
Support specifying which GPU
|
2017-06-03 16:10:23 -05:00 |
|
Matthew Honnibal
|
d0e42f9275
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-06-03 15:30:32 -05:00 |
|
Matthew Honnibal
|
8a17b99b1c
|
Use NORM attribute, not LOWER
|
2017-06-03 15:30:16 -05:00 |
|
ines
|
4c643d74c5
|
Add norm exceptions to other Language classes
|
2017-06-03 22:29:21 +02:00 |
|
ines
|
fa7e576c57
|
Change order of exception dicts
|
2017-06-03 21:52:06 +02:00 |
|
Matthew Honnibal
|
3f5c85d8de
|
Reorder setting of lex attrs, to avoid clobbering
|
2017-06-03 14:47:55 -05:00 |
|
Matthew Honnibal
|
aeb7520133
|
Make norm use lower-case
|
2017-06-03 14:47:38 -05:00 |
|
Matthew Honnibal
|
de3954843e
|
Populate norm exceptions with lower-case
|
2017-06-03 14:47:12 -05:00 |
|
Matthew Honnibal
|
f6955a459c
|
Fix prev commit
|
2017-06-03 14:38:37 -05:00 |
|
Matthew Honnibal
|
468ca6c760
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-06-03 14:33:51 -05:00 |
|
Matthew Honnibal
|
c647a0d33e
|
Fix training counter for gold preprocessing
|
2017-06-03 14:33:39 -05:00 |
|
ines
|
e47eef5e03
|
Update German tokenizer exceptions and tests
|
2017-06-03 21:07:44 +02:00 |
|
ines
|
d77c2cc8bb
|
Add tests for English norm exceptions
|
2017-06-03 20:59:50 +02:00 |
|
ines
|
0d6fa8b241
|
Add German norm exceptions
|
2017-06-03 20:54:18 +02:00 |
|
ines
|
5bd311c77e
|
Fix update of norm exceptions
|
2017-06-03 20:54:09 +02:00 |
|
Matthew Honnibal
|
94e063ae2a
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-06-03 13:31:40 -05:00 |
|
Matthew Honnibal
|
fea1144e6d
|
Set max batch size in evaluate
|
2017-06-03 13:31:33 -05:00 |
|
Matthew Honnibal
|
805495af27
|
Fix off-by-one in number of tags
|
2017-06-03 13:29:23 -05:00 |
|
Matthew Honnibal
|
e62f46d39f
|
Clarify gold.pyx slightly
|
2017-06-03 13:28:52 -05:00 |
|
Matthew Honnibal
|
43353b5413
|
Improve train CLI script
|
2017-06-03 13:28:20 -05:00 |
|
ines
|
746653880c
|
Add English norm exceptions to lex_attrs
|
2017-06-03 20:27:28 +02:00 |
|
ines
|
095eeeb12f
|
Update English tokenizer exceptions and add norms
|
2017-06-03 20:27:16 +02:00 |
|
ines
|
e5d426406a
|
Add base norm exceptions
|
2017-06-03 20:27:05 +02:00 |
|
ines
|
4c2bbc3ccc
|
Add add_lookups util function
|
2017-06-03 19:44:47 +02:00 |
|
ines
|
05fe6758a7
|
Set lexeme attributes for tokenizer special cases
|
2017-06-03 19:44:39 +02:00 |
|
ines
|
3152ee5ca2
|
Update serialization tests for tokenizer
|
2017-06-03 17:05:28 +02:00 |
|
ines
|
7c919aeb09
|
Make sure serializers and deserializers are ordered
|
2017-06-03 17:05:09 +02:00 |
|
ines
|
1ebd0d3f27
|
Add assert_packed_msg_equal util function
|
2017-06-03 17:04:30 +02:00 |
|
ines
|
de974f7bef
|
Add serializer tests for tokenizer
|
2017-06-03 13:26:34 +02:00 |
|
ines
|
0153b66a86
|
Return self in Tokenizer.from_bytes
|
2017-06-03 13:26:13 +02:00 |
|
ines
|
82154a1861
|
Add letter spacing to arrow label
|
2017-06-03 13:25:41 +02:00 |
|
ines
|
32c6f05de9
|
Adjust spacing and sizing in compact mode
|
2017-06-03 13:25:32 +02:00 |
|
ines
|
cc8c8617a4
|
Shut down displaCy server on KeyboardInterrupt
|
2017-06-03 13:24:56 +02:00 |
|
ines
|
70fbba7d08
|
Clone Doc to never merge punctuation on original Doc
|
2017-06-03 13:24:43 +02:00 |
|
ines
|
459a1e8470
|
Fix whitespace
|
2017-06-03 11:31:18 +02:00 |
|
ines
|
5109bba910
|
Port over fix from #1070
|
2017-06-03 11:31:11 +02:00 |
|
ines
|
d21459f87d
|
Update serializer tests
|
2017-06-02 21:42:26 +02:00 |
|
ines
|
6669583f4e
|
Use OrderedDict
|
2017-06-02 21:07:56 +02:00 |
|
ines
|
2f1025a94c
|
Port over Spanish changes from #1096
|
2017-06-02 19:09:58 +02:00 |
|
ines
|
d86e7cde93
|
Add entity recognizer to parser serialization tests
|
2017-06-02 18:40:06 +02:00 |
|
ines
|
0051c05964
|
Add tests for serializing parser
|
2017-06-02 18:37:19 +02:00 |
|
ines
|
fdd0923be4
|
Translate model=True in exclude to lower_model and upper_model
|
2017-06-02 18:37:07 +02:00 |
|
ines
|
cef547a9f0
|
Add serialization tests for tensorizer
|
2017-06-02 18:18:30 +02:00 |
|
ines
|
924c58bde3
|
Fix serialization of optional elements
|
2017-06-02 18:18:17 +02:00 |
|
ines
|
f74a45c1fe
|
Remove unnecessary argument
|
2017-06-02 18:17:46 +02:00 |
|
ines
|
43b4d63f85
|
Add serialization tests for tagger
|
2017-06-02 17:29:34 +02:00 |
|
ines
|
1b593bbd6d
|
Fix encoding on tagger serialization
|
2017-06-02 17:29:21 +02:00 |
|
Matthew Honnibal
|
5f4d328e2c
|
Fix serialization of tag_map in NeuralTagger
|
2017-06-02 10:18:37 -05:00 |
|
Matthew Honnibal
|
ed6f575e06
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-06-02 04:26:39 -05:00 |
|
ines
|
acd65c00f6
|
Add serialization tests for StringStore and Vocab
|
2017-06-02 10:57:42 +02:00 |
|
ines
|
41a6adf1f6
|
Initialise Vocab length correctly
|
2017-06-02 10:57:25 +02:00 |
|
ines
|
53b82f972a
|
Add strings to Vocab in init, instead of StringStore
|
2017-06-02 10:57:06 +02:00 |
|
ines
|
023f38bdd4
|
Fix return value of Vocab.from_bytes
|
2017-06-02 10:56:40 +02:00 |
|
ines
|
9692c98f57
|
Add test utils for temp file and temp dir
|
2017-06-02 10:56:09 +02:00 |
|
Matthew Honnibal
|
c650bc481c
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-06-01 13:03:57 -05:00 |
|
Matthew Honnibal
|
307d615c5f
|
Fix serialization for tagger when tag_map has changed
|
2017-06-01 12:18:36 -05:00 |
|
Matthew Honnibal
|
1d18cedae8
|
Fiddle with msgpack bytes vs unicode
|
2017-06-01 10:48:43 -05:00 |
|
ines
|
7a2380f617
|
Rename "nn_tagger" to "tagger"
|
2017-06-01 17:37:53 +02:00 |
|
ines
|
e5ae6ccf4e
|
Fix typo
|
2017-06-01 16:46:15 +02:00 |
|
ines
|
a3e4f91f4a
|
Only load vocab if it exists
|
2017-06-01 14:38:35 +02:00 |
|
Matthew Honnibal
|
d310b0aab3
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-06-01 04:58:03 -05:00 |
|
Matthew Honnibal
|
3ff7d7fcef
|
Merge for updated requirements
|
2017-06-01 04:57:47 -05:00 |
|
Matthew Honnibal
|
5eae3b9a1e
|
Fix to/from disk in tagger
|
2017-06-01 04:55:49 -05:00 |
|
ines
|
d5c8d2f5fd
|
Update about.py and increment version
|
2017-06-01 11:52:24 +02:00 |
|
Matthew Honnibal
|
4c97371051
|
Fixes for thinc 6.7
|
2017-06-01 04:22:16 -05:00 |
|
Matthew Honnibal
|
53d00a0371
|
Move weight serialization to Thinc
|
2017-06-01 03:04:36 -05:00 |
|
Matthew Honnibal
|
ae8010b526
|
Move weight serialization to Thinc
|
2017-06-01 02:56:12 -05:00 |
|
Gyorgy Orosz
|
f0c3b09242
|
More robust Hungarian tokenizer.
|
2017-05-31 22:28:40 +02:00 |
|
Matthew Honnibal
|
c8a58cfcf8
|
Fix Python2/3 load bug
|
2017-05-31 15:21:44 -05:00 |
|
Matthew Honnibal
|
99982684b0
|
Fix normalize_string_keys function
|
2017-05-31 14:08:16 -05:00 |
|
Matthew Honnibal
|
67ade63fc4
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-05-31 08:28:42 -05:00 |
|
Matthew Honnibal
|
490b38e6bb
|
Fix reference to thinc copy_array util
|
2017-05-31 08:25:21 -05:00 |
|
Matthew Honnibal
|
9805e0e369
|
Fix vocab pickling
|
2017-05-31 08:25:01 -05:00 |
|
Matthew Honnibal
|
6c51cd77b4
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-05-31 15:06:56 +02:00 |
|
Matthew Honnibal
|
8dfb9546f0
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-05-31 07:21:14 -05:00 |
|
Matthew Honnibal
|
480ef8bfc8
|
Add compat function to normalize dict keys
|
2017-05-31 07:14:29 -05:00 |
|
Matthew Honnibal
|
92f9e5cc9a
|
Silence env_opt, and fix serialization for GPU
|
2017-05-31 07:14:11 -05:00 |
|
Matthew Honnibal
|
0561df2a9d
|
Fix tokenizer serialization
|
2017-05-31 14:12:38 +02:00 |
|
Matthew Honnibal
|
4a398c15b7
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-05-31 13:44:16 +02:00 |
|
Matthew Honnibal
|
097ab9c6e4
|
Fix transition system to/from disk
|
2017-05-31 13:44:00 +02:00 |
|
Matthew Honnibal
|
b1469d3360
|
Fix string serialisation
|
2017-05-31 13:43:44 +02:00 |
|
Matthew Honnibal
|
e9419072e7
|
Fix tokenizer serialisation
|
2017-05-31 13:43:31 +02:00 |
|
Matthew Honnibal
|
33e5ec737f
|
Fix to/from disk methods
|
2017-05-31 13:43:10 +02:00 |
|
ines
|
5e1c361270
|
Update tests README with info on model tests
|
2017-05-31 12:22:58 +02:00 |
|
Matthew Honnibal
|
fe28602f2e
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-05-31 11:43:56 +02:00 |
|
Matthew Honnibal
|
66af019d5d
|
Fix serialization of tokenizer
|
2017-05-31 11:43:40 +02:00 |
|
Ines Montani
|
e6cf3c7e1c
|
Merge pull request #1093 from oroszgy/hu_emoji_fix
Fixed emoji handling for Hungarian
|
2017-05-31 11:33:24 +02:00 |
|
Matthew Honnibal
|
e98eff275d
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-05-31 10:29:15 +02:00 |
|
Matthew Honnibal
|
53a3824334
|
Fix mistake in ner feature
|
2017-05-31 03:01:02 +02:00 |
|
Matthew Honnibal
|
8a693c2605
|
Write binary file during training
|
2017-05-31 02:59:18 +02:00 |
|
Matthew Honnibal
|
498ad85309
|
Try using tensor for vector/similarity methods
|
2017-05-30 23:35:17 +02:00 |
|
Matthew Honnibal
|
a131981f3b
|
Work on vectors
|
2017-05-30 23:34:50 +02:00 |
|
Matthew Honnibal
|
6937e311a4
|
Update doc tests
|
2017-05-30 23:34:23 +02:00 |
|
Matthew Honnibal
|
cc911feab2
|
Fix bug in NER state
|
2017-05-30 22:12:19 +02:00 |
|
Gyorgy Orosz
|
8c0b4b850e
|
Fixed emoji handling for Hungarian
|
2017-05-30 21:34:46 +02:00 |
|
Matthew Honnibal
|
be4a640f0c
|
Fix arc eager label costs for uint64
|
2017-05-30 20:37:58 +02:00 |
|
Matthew Honnibal
|
b127645afc
|
Fix test_misc merge conflict
|
2017-05-29 18:31:44 -05:00 |
|
Matthew Honnibal
|
e0e8eae7c7
|
Tweak package test
|
2017-05-29 18:30:42 -05:00 |
|
Matthew Honnibal
|
11840ff5dd
|
Store tag map before normalizing props
|
2017-05-29 17:53:48 -05:00 |
|
Matthew Honnibal
|
b92a89f87b
|
Make it easier to reference embedding tables
|
2017-05-29 17:53:29 -05:00 |
|
Matthew Honnibal
|
293d1b425b
|
Serialize in consistent order
|
2017-05-29 17:53:06 -05:00 |
|
Matthew Honnibal
|
9bf22a94aa
|
Fix tag set serialisation
|
2017-05-29 17:52:36 -05:00 |
|
Matthew Honnibal
|
2a061e2777
|
Fix serialisation, for reals this time
|
2017-05-29 17:52:08 -05:00 |
|
ines
|
20a7003c0d
|
Update model fixtures and reorganise tests
|
2017-05-29 22:14:31 +02:00 |
|
ines
|
795fe43a4d
|
Add load_test_model function with importorskip()
Loads model only if it can be imported, i.e. if it's installed as a
package.
|
2017-05-29 22:11:31 +02:00 |
|
ines
|
ad3c8b3ad9
|
Fix formatting
|
2017-05-29 22:10:50 +02:00 |
|
ines
|
6e3937efc5
|
Check for arguments of model markers to specify models to test
Lets user set --models --en for only English models
|
2017-05-29 22:10:16 +02:00 |
|
Matthew Honnibal
|
35d981241f
|
Fix model deserialization
|
2017-05-29 14:46:31 -05:00 |
|
Matthew Honnibal
|
5b29f227ae
|
Fix serialization
|
2017-05-29 14:35:53 -05:00 |
|
Matthew Honnibal
|
1e6df0a2a1
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-05-29 14:30:12 -05:00 |
|
ines
|
08382f21e3
|
Pass model meta to nlp object in load_model
|
2017-05-29 20:44:11 +02:00 |
|
ines
|
6145fe6a93
|
Catch all kwargs on Language
|
2017-05-29 20:43:48 +02:00 |
|
ines
|
0d7d50fe22
|
Add __version__ to __init__.py
|
2017-05-29 20:43:24 +02:00 |
|
Matthew Honnibal
|
6522ea6c8b
|
More serialization fixes. Still broken
|
2017-05-29 13:23:47 -05:00 |
|
Matthew Honnibal
|
9c9ee24411
|
Fix broken lambda scoping in Python 2
|
2017-05-29 13:23:28 -05:00 |
|
Matthew Honnibal
|
f1acdaab55
|
Fix serialization of weight offsets
|
2017-05-29 13:23:11 -05:00 |
|
Matthew Honnibal
|
c044e9c21c
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-05-29 08:41:02 -05:00 |
|
Matthew Honnibal
|
aa4c33914b
|
Work on serialization
|
2017-05-29 08:40:45 -05:00 |
|
ines
|
9e83a17e95
|
Use new model templates
|
2017-05-29 15:27:24 +02:00 |
|
ines
|
567485a818
|
Fix and document model loading with pipeline and overrides
|
2017-05-29 14:10:10 +02:00 |
|
Matthew Honnibal
|
deac7eb01c
|
Fix for serialization
|
2017-05-29 13:54:18 +02:00 |
|
Matthew Honnibal
|
04c32aa091
|
Fix for serialization
|
2017-05-29 13:53:32 +02:00 |
|
Matthew Honnibal
|
a1960c2d09
|
Fix for serialization
|
2017-05-29 13:47:42 +02:00 |
|
Matthew Honnibal
|
7b06bb896e
|
Fix for serialization
|
2017-05-29 13:42:55 +02:00 |
|
Matthew Honnibal
|
74235587ef
|
Fix to serialization
|
2017-05-29 13:40:31 +02:00 |
|
Matthew Honnibal
|
59f355d525
|
Fixes for serialization
|
2017-05-29 13:38:20 +02:00 |
|
Matthew Honnibal
|
920887f4e4
|
Specify order of vocab deserialization
|
2017-05-29 13:04:40 +02:00 |
|
Matthew Honnibal
|
f4aafca222
|
Merge changes to test_misc
|
2017-05-29 12:26:02 +02:00 |
|
Matthew Honnibal
|
a318f0cae1
|
Add to/from disk/bytes methods for tokenizer
|
2017-05-29 12:24:41 +02:00 |
|
Matthew Honnibal
|
ff26aa6c37
|
Work on to/from bytes/disk serialization methods
|
2017-05-29 11:45:45 +02:00 |
|
ines
|
df920ba0e7
|
Add tests for displaCy and util functions and fix util typo
|
2017-05-29 10:51:19 +02:00 |
|
ines
|
c5714d4fb2
|
xfail matcher test for now until setting norm via Span.merge works
|
2017-05-29 10:51:02 +02:00 |
|
Matthew Honnibal
|
6b019b0540
|
Update to/from bytes methods
|
2017-05-29 10:14:20 +02:00 |
|
Matthew Honnibal
|
c91b121aeb
|
Move serialization functions to util
|
2017-05-29 10:13:42 +02:00 |
|
Matthew Honnibal
|
1fa2bfb600
|
Add model_to_bytes and model_from_bytes helpers. Probably belong in thinc.
|
2017-05-29 09:27:04 +02:00 |
|
Matthew Honnibal
|
6dad4117ad
|
Work on serialization for models
|
2017-05-29 01:37:57 +02:00 |
|
ines
|
7b1ddcc04d
|
Add test for vocab serialization
|
2017-05-29 01:09:52 +02:00 |
|
ines
|
00b2094dc3
|
Fix typos, long integers and tests
|
2017-05-29 01:09:52 +02:00 |
|
ines
|
804dbb8d25
|
Add StringStore test for API docs
|
2017-05-29 01:09:52 +02:00 |
|
Matthew Honnibal
|
6cd5730ee7
|
Fix lex struct setters for strings
|
2017-05-29 01:05:09 +02:00 |
|
Matthew Honnibal
|
2edd96ce47
|
Draft Vocab to/from disk/bytes
|
2017-05-28 23:34:12 +02:00 |
|
Matthew Honnibal
|
4ddff020c3
|
Fix compile error
|
2017-05-28 23:30:40 +02:00 |
|
Matthew Honnibal
|
6d3caeadd2
|
Fix type check for long
|
2017-05-28 23:22:45 +02:00 |
|
Matthew Honnibal
|
92dbf28c1e
|
Hack a fixture in the vectors tests, for xfail
|
2017-05-28 20:28:32 +02:00 |
|
Matthew Honnibal
|
9239f06ed3
|
Fix German noun chunks iterator
|
2017-05-28 20:13:03 +02:00 |
|
Matthew Honnibal
|
fd9b6722a9
|
Fix noun chunks iterator for new stringstore
|
2017-05-28 20:12:10 +02:00 |
|
ines
|
414193e9ba
|
Update docs to reflect StringStore changes
|
2017-05-28 18:19:11 +02:00 |
|
Matthew Honnibal
|
7996d21717
|
Fixes for new StringStore
|
2017-05-28 11:09:27 -05:00 |
|
Matthew Honnibal
|
8a24c60c1e
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-05-28 08:12:05 -05:00 |
|
Matthew Honnibal
|
bc97bc292c
|
Fix __call__ method
|
2017-05-28 08:11:58 -05:00 |
|
Matthew Honnibal
|
5cf47b847b
|
Handle iob with no tag in converter
|
2017-05-28 08:11:39 -05:00 |
|
Matthew Honnibal
|
fe11564b8e
|
Finish stringstore change. Also xfail vectors tests
|
2017-05-28 15:10:22 +02:00 |
|
Matthew Honnibal
|
b007a2b0d3
|
Update stringstore tests
|
2017-05-28 14:08:09 +02:00 |
|
Matthew Honnibal
|
84e66ca6d4
|
WIP on stringstore change. 27 failures
|
2017-05-28 14:06:40 +02:00 |
|
Matthew Honnibal
|
fe4a746300
|
Accommodate symbols in new string scheme
|
2017-05-28 13:03:16 +02:00 |
|
Matthew Honnibal
|
f51e6a6c16
|
Adjust lexeme sizing for attr_t being 64 bit
|
2017-05-28 12:51:09 +02:00 |
|
Matthew Honnibal
|
a5606c3eda
|
Work on changing StringStore to return hashes.
|
2017-05-28 12:36:27 +02:00 |
|
Matthew Honnibal
|
39293ab2ee
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-05-28 11:46:57 +02:00 |
|
Matthew Honnibal
|
dd052572d4
|
Update arc eager for SBD changes
|
2017-05-28 11:46:51 +02:00 |
|
Matthew Honnibal
|
3ea98e2043
|
Remove vector member from lexeme
|
2017-05-28 11:46:24 +02:00 |
|
Matthew Honnibal
|
2445707f3c
|
Re-delegate vectors to vocab
|
2017-05-28 11:46:10 +02:00 |
|
Matthew Honnibal
|
6863d01361
|
Remove vectors from lexeme
|
2017-05-28 11:45:48 +02:00 |
|
Matthew Honnibal
|
15f6efc127
|
Remove vectors from vocab
|
2017-05-28 11:45:32 +02:00 |
|
Matthew Honnibal
|
c1263a844b
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-05-27 18:32:57 -05:00 |
|
Matthew Honnibal
|
9e711c3476
|
Divide d_loss by batch size
|
2017-05-27 18:32:46 -05:00 |
|
Matthew Honnibal
|
b082f76494
|
Randomize pipeline order during training
|
2017-05-27 18:32:21 -05:00 |
|
Matthew Honnibal
|
a1d4c97fb7
|
Improve correctness of minibatching
|
2017-05-27 17:59:00 -05:00 |
|
ines
|
84189c1cab
|
Add 'xx' language ID for multi-language support
Allows models to specify their language ID as 'xx'.
|
2017-05-28 00:58:59 +02:00 |
|
ines
|
33e332e67c
|
Remove unused export
|
2017-05-28 00:57:59 +02:00 |
|
ines
|
c1983621fb
|
Update util functions for model loading
|
2017-05-28 00:22:40 +02:00 |
|