8600 Commits

Author SHA1 Message Date
ines
db15902e84 Tidy up 2017-10-23 10:38:21 +02:00
ines
3f0a157b33 Fix typo 2017-10-23 10:38:13 +02:00
ines
a31f048b4d Fix formatting 2017-10-23 10:38:06 +02:00
Ines Montani
0ed0c41bad Merge pull request #1448 from jerbob92/feature/fix-training-new-entity-type-example
Fix #1444: fix training new entity type example
2017-10-22 15:43:33 +02:00
Ines Montani
00fc2db7ef Merge pull request #1449 from jerbob92/feature/add-jerbob92-to-contributors
Add myself to contributors
2017-10-22 15:43:05 +02:00
Jeroen Bobbeldijk
5c7c08c2e3 Add myself to contributors 2017-10-22 15:35:46 +02:00
Jeroen Bobbeldijk
84c6c20d1c Fix #1444: fix pipeline logic and wrong parameter in update call 2017-10-22 15:18:36 +02:00
mayukh18
80edc905f7 added a few Bengali pronouns 2017-10-22 13:16:39 +05:30
Matthew Honnibal
490ad3eaf0 Check that empty strings are handled. Closes #1242 2017-10-21 00:52:14 +02:00
Matthew Honnibal
8f8bccecb9 Patch deserialisation for invalid loads, to avoid model failure 2017-10-21 00:51:42 +02:00
Ramanan Balakrishnan
d2fe56a577 Add LCA matrix for spans and docs 2017-10-20 23:58:00 +05:30
Matthew Honnibal
d8391b1c4d Fix #1434: Matcher failed on ending ? if no token 2017-10-20 16:49:36 +02:00
Matthew Honnibal
fec53f09f7 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-10-20 16:28:34 +02:00
Matthew Honnibal
f111b228e0 Fix re-parsing of previously parsed text
If a Doc object had been previously parsed, it was possible for
invalid parses to be added. There were two problems:

1) The parse was only being partially erased
2) The RightArc action was able to create a 1-cycle.

This patch fixes both errors, and avoids resetting the parse if one is
present. In theory this might allow a better parse to be predicted by
running the parser twice.

Closes #1253.
2017-10-20 16:27:36 +02:00
Matthew Honnibal
1036798155 Make parser consistent if maxout==1 2017-10-20 16:24:16 +02:00
Matthew Honnibal
3faf9189a2 Make parser hidden shape consistent even if maxout==1 2017-10-20 16:23:31 +02:00
Matthew Honnibal
9010a1a060 Create vectors correctly 2017-10-20 14:19:46 +02:00
Matthew Honnibal
33229b1c9e Remove print statement 2017-10-20 14:19:29 +02:00
Matthew Honnibal
cfae54c507 Make change to Vectors.__init__ 2017-10-20 14:19:04 +02:00
Matthew Honnibal
ebecaddb76 Make 'data_or_width' two keyword args in Vectors.__init__
Previously the data and width options were one argument in Vectors,
which meant you couldn't say vectors = Vectors(strings, width=300).
It's better to have two keywords.
2017-10-20 14:17:15 +02:00
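The 'data_or_width' change described in the entry above can be illustrated with a minimal sketch. `VectorTable` is a hypothetical stand-in written for this example, not spaCy's actual `Vectors` class, and its storage (plain Python lists) is an assumption made to keep the sketch self-contained. It only shows why two separate keywords read more clearly than one overloaded argument.

```python
class VectorTable:
    """Hypothetical stand-in for illustration; not spaCy's Vectors class."""

    def __init__(self, strings, data=None, width=0):
        self.strings = list(strings)
        if data is None:
            # Only a width was given: allocate one zero row per string.
            data = [[0.0] * width for _ in self.strings]
        elif data and width and len(data[0]) != width:
            # Both were given and they disagree.
            raise ValueError("width does not match the row length of data")
        self.data = data
        # Prefer the width implied by the data when rows are present.
        self.width = len(data[0]) if data else width


# With separate keywords, either form is unambiguous at the call site.
empty = VectorTable(["apple", "pear"], width=300)
filled = VectorTable(["apple"], data=[[0.1, 0.2, 0.3]])
```

With a single combined argument, the constructor would have to guess from the value's type whether it received a table or an integer width, which is what the two-keyword form avoids.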
Matthew Honnibal
49895fbef6 Rename 'SP' special tag to '_SP'
Renaming the tag with an underscore lets us add it to the tag map
without worrying that we'll change the sequence of tags, which throws
off the tag-to-ID mapping. For instance, if we inserted a 'SP' tag,
the "VERB" tag is pushed to a different class ID, and the model is all
messed up.
2017-10-20 14:01:12 +02:00
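The tag-to-ID problem described in the entry above can be reproduced with a small sketch. It assumes that class IDs are assigned by sorting the tag names, which the commit message implies but does not state; `build_tag_ids` and the four-tag list are hypothetical and exist only for this example.

```python
def build_tag_ids(tags):
    """Assign class IDs by sorted tag name (an assumption of this sketch)."""
    return {tag: i for i, tag in enumerate(sorted(tags))}


base_tags = ["ADJ", "NOUN", "PUNCT", "VERB"]

ids_base = build_tag_ids(base_tags)
# 'SP' sorts before 'VERB', so inserting it shifts VERB to a new ID.
ids_with_sp = build_tag_ids(base_tags + ["SP"])
# '_' sorts after the uppercase ASCII letters, so '_SP' lands at the end
# and every existing tag keeps its ID.
ids_with_underscore = build_tag_ids(base_tags + ["_SP"])

print(ids_base["VERB"], ids_with_sp["VERB"], ids_with_underscore["VERB"])
```

Under that sorting assumption, a model trained against the base mapping would still line up with the '_SP' mapping, but not with the 'SP' one.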
Matthew Honnibal
506cf2eb13 Remove cpdef enum, to avoid too much code generation 2017-10-20 14:00:23 +02:00
Matthew Honnibal
6218af0105 Remove cpdef enum, to avoid too much code generation 2017-10-20 13:59:57 +02:00
Matthew Honnibal
92ac9316b5 Fix initialization of vectors, to address serialization problem 2017-10-20 13:59:24 +02:00
Ramanan Balakrishnan
0726946563 cleanup to_array implementation using fixes on master 2017-10-20 17:09:37 +05:30
Ines Montani
2a0ab6fafa Merge pull request #1435 from ramananbalakrishnan/update_to_array
Support single value for attribute list in doc.to_array
2017-10-20 13:21:48 +02:00
ines
108f1f786e Update symbols and document missing token attributes (see #1439) 2017-10-20 13:08:44 +02:00
ines
4acab77a8a Add missing symbol for LAW entities (resolves #1427) 2017-10-20 13:07:57 +02:00
Matthew Honnibal
dbc276e3b2 Fix 'toupper()' -> 'upper()' 2017-10-20 13:02:13 +02:00
Matthew Honnibal
b101736555 Fix precomputed layer 2017-10-20 12:14:52 +02:00
Matthew Honnibal
7a46792376 Fix compile error
Closures not allowed in cpdef
2017-10-20 11:53:47 +02:00
Matthew Honnibal
658536b5ce Fix to_array compile error 2017-10-20 11:35:10 +02:00
Matthew Honnibal
c0799430a7 Make small changes to Doc.to_array
* Change type-check logic to 'hasattr' (Python type-checking is brittle)
* Small 'house style' edits, mostly making code more terse.
2017-10-20 11:17:00 +02:00
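The 'hasattr' check mentioned in the entry above, combined with the single-value and string support added in the neighbouring to_array commits, can be sketched as follows. `normalize_attrs` is a hypothetical helper written for this example and is not taken from the spaCy source. Strings are handled first because in Python 3 they are themselves iterable.

```python
def normalize_attrs(attr_ids):
    """Return a list of attributes from a single value or an iterable.

    Hypothetical helper for illustration; not spaCy's implementation.
    """
    if isinstance(attr_ids, str):
        # A lone attribute name such as "LEMMA" is one attribute,
        # not a sequence of characters.
        return [attr_ids]
    if hasattr(attr_ids, "__iter__"):
        # Duck-typed check: lists, tuples and generators all pass.
        return list(attr_ids)
    # Anything else (for example an integer attribute ID) is a single value.
    return [attr_ids]


single = normalize_attrs(74)
named = normalize_attrs("LEMMA")
several = normalize_attrs(("LEMMA", "POS"))
```

Checking for `__iter__` rather than a concrete type means a caller can pass any iterable without the helper needing to know about it in advance.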
Ramanan Balakrishnan
d44a079fe3 Update documentation on doc.to_array 2017-10-20 14:25:38 +05:30
Ramanan Balakrishnan
fbccc8c87d Update documentation on doc.to_array 2017-10-20 14:23:48 +05:30
Ramanan Balakrishnan
5941aa96a1 Support strings for attribute list in doc.to_array 2017-10-20 11:59:34 +05:30
Ramanan Balakrishnan
b3ab124fc5 Support strings for attribute list in doc.to_array 2017-10-20 11:46:57 +05:30
Matthew Honnibal
45b41fcec8 Merge pull request #1441 from johnhaley81/patch-1
Fix Keras install in keras_parikeh_entailment README
2017-10-20 03:09:38 +02:00
Matthew Honnibal
64658e02e5 Implement fancier initialisation for precomputed layer 2017-10-20 03:07:45 +02:00
Matthew Honnibal
827cd8a883 Fix support of maxout pieces in parser 2017-10-20 03:07:17 +02:00
Matthew Honnibal
a8850b4282 Remove redundant PrecomputableMaxouts class 2017-10-19 20:27:34 +02:00
Matthew Honnibal
a17a1b60c7 Clean up redundant PrecomputableMaxouts class 2017-10-19 20:26:37 +02:00
Matthew Honnibal
b00d0a2c97 Fix bias in parser 2017-10-19 18:42:11 +02:00
John Haley
989814c4b6 Create johnhaley81.md 2017-10-19 09:11:16 -07:00
John Haley
44c61fde25 Fix Keras install in keras_parikeh_entailment
The master branch of Keras doesn't work with this example anymore so this pins Keras to version 1.2.2 for this example.
2017-10-19 08:56:28 -07:00
Matthew Honnibal
b54b4b8a97 Make parser_maxout_pieces hyper-param work 2017-10-19 13:45:18 +02:00
Matthew Honnibal
03a215c5fd Make PrecomputableAffines work 2017-10-19 13:44:49 +02:00
Ramanan Balakrishnan
7b9b1be44c Support single value for attribute list in doc.to_array 2017-10-19 17:00:41 +05:30
Matthew Honnibal
61bc203f3f Merge pull request #1438 from explosion/feature/fast-parser
💫 Improve runtime CPU efficiency of parser/NER
2017-10-19 02:42:21 +02:00
Matthew Honnibal
15e5a04a8d Clean up more depth=0 conditional code 2017-10-19 01:48:43 +02:00