Matthew Honnibal
5b329acbf2
Fix vectors_length property in vocab
2017-08-22 19:00:27 +02:00
Matthew Honnibal
6a94648373
Fix serialization
2017-08-19 21:27:35 +02:00
Matthew Honnibal
1157294434
Improve vector handling
2017-08-19 20:35:33 +02:00
Matthew Honnibal
93fb8b64e9
Fix vector loading
2017-08-19 19:52:25 +02:00
Matthew Honnibal
49a615e7d9
Create Vectors object in Vocab
2017-08-19 18:50:16 +02:00
Matthew Honnibal
2993b54fff
Load vectors in vocab
2017-08-18 20:46:56 +02:00
Matthew Honnibal
add9a33782
Return False for vocab.has_vector
2017-06-04 14:26:14 -05:00
ines
05fe6758a7
Set lexeme attributes for tokenizer special cases
2017-06-03 19:44:39 +02:00
ines
41a6adf1f6
Initialise Vocab length correctly
2017-06-02 10:57:25 +02:00
ines
53b82f972a
Add strings to Vocab in init, instead of StringStore
2017-06-02 10:57:06 +02:00
ines
023f38bdd4
Fix return value of Vocab.from_bytes
2017-06-02 10:56:40 +02:00
Matthew Honnibal
307d615c5f
Fix serialization for tagger when tag_map has changed
2017-06-01 12:18:36 -05:00
Matthew Honnibal
9805e0e369
Fix vocab pickling
2017-05-31 08:25:01 -05:00
Matthew Honnibal
a131981f3b
Work on vectors
2017-05-30 23:34:50 +02:00
Matthew Honnibal
9bf22a94aa
Fix tag set serialisation
2017-05-29 17:52:36 -05:00
Matthew Honnibal
920887f4e4
Specify order of vocab deserialization
2017-05-29 13:04:40 +02:00
Matthew Honnibal
6b019b0540
Update to/from bytes methods
2017-05-29 10:14:20 +02:00
Matthew Honnibal
6dad4117ad
Work on serialization for models
2017-05-29 01:37:57 +02:00
Matthew Honnibal
2edd96ce47
Draft Vocab to/from disk/bytes
2017-05-28 23:34:12 +02:00
Matthew Honnibal
fe11564b8e
Finish stringstore change. Also xfail vectors tests
2017-05-28 15:10:22 +02:00
Matthew Honnibal
fe4a746300
Accommodate symbols in new string scheme
2017-05-28 13:03:16 +02:00
Matthew Honnibal
a5606c3eda
Work on changing StringStore to return hashes.
2017-05-28 12:36:27 +02:00
Matthew Honnibal
39293ab2ee
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-05-28 11:46:57 +02:00
Matthew Honnibal
15f6efc127
Remove vectors from vocab
2017-05-28 11:45:32 +02:00
ines
c8543c8237
Fix formatting and docstrings and remove deprecated function
2017-05-28 00:22:40 +02:00
ines
251346b59f
Fix typos and formatting
2017-05-21 14:18:46 +02:00
ines
d82ae9a585
Change "function" to "callable" in docs
2017-05-21 13:17:40 +02:00
ines
f0cc642bb9
Update docstrings and API docs for Vocab
2017-05-20 14:00:41 +02:00
Matthew Honnibal
793430aa7a
Get spaCy train command working with neural network
...
* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab
2017-05-17 12:04:50 +02:00
Matthew Honnibal
9e167b7bb6
Strip serializer from code
2017-05-09 17:28:50 +02:00
ines
e1efd589c3
Fix json imports and use ujson
2017-04-15 12:13:34 +02:00
ines
c05ec4b89a
Add compat functions and remove old workarounds
...
Add ensure_path util function to handle checking instance of path
2017-04-15 12:11:16 +02:00
ines
d24589aa72
Clean up imports, unused code, whitespace, docstrings
2017-04-15 12:05:47 +02:00
ines
561f2a3eb4
Use consistent formatting for docstrings
2017-04-15 11:59:21 +02:00
Matthew Honnibal
d013aba7b5
Merge branch 'master' of https://github.com/explosion/spaCy
2017-03-17 18:30:53 +01:00
Matthew Honnibal
854cfce7cf
Make vocabs more compatible across versions
...
Previously, symbols were inserted into the string-store
before strings were loaded. This meant that adding a symbol
would invalidate saved models. We now make sure that strings
are loaded faithfully, so that compatibility is maintained.
2017-03-17 18:29:04 +01:00
Matthew Honnibal
1cc841e600
Merge branch 'master' of https://github.com/explosion/spaCy
2017-03-17 08:18:11 -05:00
Matthew Honnibal
4bfc55b532
Auto-add words to vocab when loading vectors
...
When calling vocab.load_vectors_from_bin_loc, ensure that missing
entries are added to the vocab. Otherwise, loading vectors into an
empty vocab object resulted in no vectors being added.
2017-03-17 08:15:59 -05:00
Matthew Honnibal
4382f175b3
Squelch compiler warnings
2017-03-11 12:44:43 -06:00
Matthew Honnibal
d814892805
Hackish pickle support for Vocab.
2017-03-07 20:25:12 +01:00
ines
aa92d4e9b5
Fix unicode regex for Python 2 (see #834)
2017-02-16 23:49:54 +01:00
ines
85d249d451
Revert "Revert "Merge pull request #836 from raphael0202/load_vectors (closes #834)""
...
This reverts commit ea05f78660.
2017-02-16 23:26:25 +01:00
ines
ea05f78660
Revert "Merge pull request #836 from raphael0202/load_vectors (closes #834)"
...
This reverts commit 7d8c9eee7f, reversing
changes made to f6b69babcc.
2017-02-16 15:27:12 +01:00
Raphaël Bournhonesque
e17dc2db75
Remove useless import
2017-02-16 12:10:24 +01:00
Raphaël Bournhonesque
3fd2742649
load_vectors should accept arbitrary space characters as word tokens
...
Fix bug #834
2017-02-16 12:08:30 +01:00
Daniel Hershcovich
99eb494a82
Fix #737: support loading word vectors with " " as a word
2017-01-12 17:00:14 +02:00
Daniel Hershcovich
8e603cc917
Avoid "True if ... else False"
2017-01-11 11:18:22 +02:00
Matthew Honnibal
cade536d1e
Merge branch 'master' of ssh://github.com/explosion/spaCy
2016-12-27 21:04:10 +01:00
Matthew Honnibal
ce4539dafd
Allow the vocabulary to grow to 10,000, to prevent cold-start problem.
2016-12-27 21:03:45 +01:00
Ines Montani
8978806ea6
Allow Vocab to load without serializer_freqs
2016-12-21 18:05:23 +01:00