Commit Graph

14453 Commits

Author SHA1 Message Date
svlandeg
98ae77a682 unit test on number of candidates generated 2019-03-22 11:36:45 +01:00
svlandeg
9a46c431c3 store entity hash instead of pointer 2019-03-22 11:36:45 +01:00
svlandeg
9819dca80e create candidate object from entry pointer (not fully functional yet) 2019-03-22 11:36:45 +01:00
svlandeg
a9074e0886 check the length of entities and probabilities vector + unit test 2019-03-22 11:36:45 +01:00
svlandeg
d133ffaff9 correct size, not counting dummy elements in the vector 2019-03-22 11:36:45 +01:00
svlandeg
33f8a0fe2e check and unit test in case prior probs exceed 1 2019-03-22 11:36:45 +01:00
svlandeg
b55baaa1dc avoid value 0 in preshmap and helpful user warnings 2019-03-22 11:36:45 +01:00
svlandeg
20a7b7b1c0 raising error when adding alias for unknown entity + unit test 2019-03-22 11:36:45 +01:00
svlandeg
8843f9279c use StringStore 2019-03-22 11:36:45 +01:00
svlandeg
51560bf0ed bugfix adding aliases 2019-03-22 11:36:45 +01:00
svlandeg
c4ba942765 get candidates by alias 2019-03-22 11:36:45 +01:00
svlandeg
151b855cc8 adding and retrieving aliases 2019-03-22 11:36:45 +01:00
svlandeg
cf34113250 very minimal KB functionality working 2019-03-22 11:36:44 +01:00
svlandeg
af281c5466 adding aliases per entity in the KB 2019-03-22 11:36:44 +01:00
svlandeg
f77b99c103 fix compile errors 2019-03-22 11:36:44 +01:00
svlandeg
27483f9080 add pyx and separate method to add aliases 2019-03-22 11:36:44 +01:00
svlandeg
feb71e15fd hash the entity name 2019-03-22 11:36:44 +01:00
svlandeg
839dafa104 documented some comments and todos 2019-03-22 11:36:44 +01:00
svlandeg
7f37737878 kb snippet, draft by Matt (wip) 2019-03-22 11:36:44 +01:00
svlandeg
735fc2a735 annotate kb_id through ents in doc 2019-03-22 11:36:44 +01:00
svlandeg
d849eb2455 adding kb_id as field to token, el as nlp pipeline component 2019-03-22 11:34:46 +01:00
Matthew Honnibal
d811c97da1 Fix test that caused pytest to choke on Python3 2019-03-22 10:28:51 +01:00
Matthew Honnibal
a2ad9832e5 Add failing test for #3356 2019-03-22 02:42:37 +01:00
svlandeg
4820b43313 use nlp's vocab for stringstore 2019-03-21 23:17:25 +01:00
Matthew Honnibal
7ec64a36fd
Merge pull request #3455 from explosion/bugfix/fix-en-tag-map
💫 Bring English tag_map in line with UD Treebank
2019-03-21 21:19:30 +01:00
svlandeg
6e2433b95e select candidate with highest prior probabiity 2019-03-21 18:55:01 +01:00
svlandeg
24a0c4a8d4 name per entity 2019-03-21 18:20:57 +01:00
svlandeg
d0c763ba44 minimal EL pipe 2019-03-21 17:33:25 +01:00
svlandeg
26afa4800f ensure no candidates are returned for unknown aliases 2019-03-21 15:24:40 +01:00
Matthew Honnibal
c66bd61e88 Fix lemmas 2019-03-21 14:22:12 +01:00
Matthew Honnibal
04395ffa49 Bring English tag_map in line with UD Treebank
I wrote a small script to read the UD English training data and check
that our tag map and morph rules were resulting in the best POS map.
This hadn't been done for some time, and there have been various changes
to the UD schema since it has been done. After these changes we should
see much better agreement between our POS assignments and the UD POS
tags.
2019-03-21 13:53:44 +01:00
svlandeg
a5d5a05930 Entity class 2019-03-21 13:32:21 +01:00
svlandeg
6ba4079f7c property getters and keep track of KB internally 2019-03-21 13:26:12 +01:00
svlandeg
34969dddeb unit test on number of candidates generated 2019-03-21 12:48:59 +01:00
svlandeg
0ff4ce6c59 store entity hash instead of pointer 2019-03-21 12:31:02 +01:00
Ines Montani
375fbf3586 Update v2-1.md 2019-03-21 12:29:08 +01:00
Ines Montani
9394ca1f29 Update index.md 2019-03-21 10:24:55 +01:00
Ines Montani
0c82a5ddb2 Merge branch 'master' of https://github.com/explosion/spaCy 2019-03-21 10:23:56 +01:00
Ines Montani
0712efc6b3 Update version requirements [ci skip] 2019-03-21 10:23:54 +01:00
svlandeg
81a9030ab7 create candidate object from entry pointer (not fully functional yet) 2019-03-21 00:04:06 +01:00
Matthew Honnibal
4e3ed2ea88 Add -t2v argument to train_textcat script 2019-03-20 23:05:42 +01:00
Ines Montani
764359c952 Merge branch 'master' into spacy.io 2019-03-20 17:24:28 +01:00
Ines Montani
dac8f8ff99 Update Span.__init__ docs (see #3445) [ci skip] 2019-03-20 17:24:17 +01:00
Matthew Honnibal
c7f26abe5f
Merge pull request #3434 from Bharat123rox/narrow-unicode
Raise Error for a narrow unicode build of Python
2019-03-20 12:19:52 +01:00
Matthew Honnibal
1c8ff59185
Merge pull request #3441 from explosion/fix/cli-ud-scripts
💫 Move UD scripts to bin
2019-03-20 12:19:15 +01:00
Matthew Honnibal
72889a16d5 Fix similarity calculation if vectors are on GPU (#3440) 2019-03-20 12:09:59 +01:00
Matthew Honnibal
1612990e88 Implement cosine loss for spacy pretrain. Make default 2019-03-20 11:06:58 +00:00
Ines Montani
ae5b4d0e84 Fix formatting (hopefully also restarts build properly) 2019-03-20 09:55:45 +01:00
Ines Montani
6abc1ddb26 Update __main__.py 2019-03-20 09:43:26 +01:00
Bharat123Rox
f2547f02d6 Made changes suggested by @ines 2019-03-20 07:43:19 +05:30