svlandeg
a48241e9a2
use nlp's vocab for stringstore
2019-03-22 11:36:45 +01:00
svlandeg
1ee0e78fd7
select candidate with highest prior probability
2019-03-22 11:36:45 +01:00
svlandeg
7b708ab8a4
name per entity
2019-03-22 11:36:45 +01:00
svlandeg
c593607ce2
minimal EL pipe
2019-03-22 11:36:45 +01:00
svlandeg
c71123dd0c
ensure no candidates are returned for unknown aliases
2019-03-22 11:36:45 +01:00
svlandeg
b6c3255a9f
Entity class
2019-03-22 11:36:45 +01:00
svlandeg
1289cd6e8f
property getters and keep track of KB internally
2019-03-22 11:36:45 +01:00
svlandeg
98ae77a682
unit test on number of candidates generated
2019-03-22 11:36:45 +01:00
svlandeg
9a46c431c3
store entity hash instead of pointer
2019-03-22 11:36:45 +01:00
svlandeg
9819dca80e
create candidate object from entry pointer (not fully functional yet)
2019-03-22 11:36:45 +01:00
svlandeg
a9074e0886
check the length of entities and probabilities vector + unit test
2019-03-22 11:36:45 +01:00
svlandeg
d133ffaff9
correct size, not counting dummy elements in the vector
2019-03-22 11:36:45 +01:00
svlandeg
33f8a0fe2e
check and unit test in case prior probs exceed 1
2019-03-22 11:36:45 +01:00
svlandeg
b55baaa1dc
avoid value 0 in preshmap and helpful user warnings
2019-03-22 11:36:45 +01:00
svlandeg
20a7b7b1c0
raising error when adding alias for unknown entity + unit test
2019-03-22 11:36:45 +01:00
svlandeg
8843f9279c
use StringStore
2019-03-22 11:36:45 +01:00
svlandeg
51560bf0ed
bugfix adding aliases
2019-03-22 11:36:45 +01:00
svlandeg
c4ba942765
get candidates by alias
2019-03-22 11:36:45 +01:00
svlandeg
151b855cc8
adding and retrieving aliases
2019-03-22 11:36:45 +01:00
svlandeg
cf34113250
very minimal KB functionality working
2019-03-22 11:36:44 +01:00
svlandeg
af281c5466
adding aliases per entity in the KB
2019-03-22 11:36:44 +01:00
svlandeg
f77b99c103
fix compile errors
2019-03-22 11:36:44 +01:00
svlandeg
27483f9080
add pyx and separate method to add aliases
2019-03-22 11:36:44 +01:00
svlandeg
feb71e15fd
hash the entity name
2019-03-22 11:36:44 +01:00
svlandeg
839dafa104
documented some comments and todos
2019-03-22 11:36:44 +01:00
svlandeg
7f37737878
kb snippet, draft by Matt (wip)
2019-03-22 11:36:44 +01:00
svlandeg
735fc2a735
annotate kb_id through ents in doc
2019-03-22 11:36:44 +01:00
svlandeg
d849eb2455
adding kb_id as field to token, el as nlp pipeline component
2019-03-22 11:34:46 +01:00
Matthew Honnibal
d811c97da1
Fix test that caused pytest to choke on Python3
2019-03-22 10:28:51 +01:00
Matthew Honnibal
a2ad9832e5
Add failing test for #3356
2019-03-22 02:42:37 +01:00
Matthew Honnibal
7ec64a36fd
Merge pull request #3455 from explosion/bugfix/fix-en-tag-map
💫 Bring English tag_map in line with UD Treebank
2019-03-21 21:19:30 +01:00
Matthew Honnibal
c66bd61e88
Fix lemmas
2019-03-21 14:22:12 +01:00
Matthew Honnibal
04395ffa49
Bring English tag_map in line with UD Treebank
I wrote a small script to read the UD English training data and check
that our tag map and morph rules were resulting in the best POS map.
This hadn't been done for some time, and there have been various changes
to the UD schema since it was last done. After these changes we should
see much better agreement between our POS assignments and the UD POS
tags.
2019-03-21 13:53:44 +01:00
Ines Montani
0c82a5ddb2
Merge branch 'master' of https://github.com/explosion/spaCy
2019-03-21 10:23:56 +01:00
Ines Montani
0712efc6b3
Update version requirements [ci skip]
2019-03-21 10:23:54 +01:00
Matthew Honnibal
4e3ed2ea88
Add -t2v argument to train_textcat script
2019-03-20 23:05:42 +01:00
Ines Montani
764359c952
Merge branch 'master' into spacy.io
2019-03-20 17:24:28 +01:00
Ines Montani
dac8f8ff99
Update Span.__init__ docs (see #3445) [ci skip]
2019-03-20 17:24:17 +01:00
Matthew Honnibal
c7f26abe5f
Merge pull request #3434 from Bharat123rox/narrow-unicode
Raise Error for a narrow unicode build of Python
2019-03-20 12:19:52 +01:00
Matthew Honnibal
1c8ff59185
Merge pull request #3441 from explosion/fix/cli-ud-scripts
💫 Move UD scripts to bin
2019-03-20 12:19:15 +01:00
Matthew Honnibal
72889a16d5
Fix similarity calculation if vectors are on GPU (#3440)
2019-03-20 12:09:59 +01:00
Matthew Honnibal
1612990e88
Implement cosine loss for spacy pretrain. Make default
2019-03-20 11:06:58 +00:00
Ines Montani
ae5b4d0e84
Fix formatting (hopefully also restarts build properly)
2019-03-20 09:55:45 +01:00
Ines Montani
6abc1ddb26
Update __main__.py
2019-03-20 09:43:26 +01:00
Bharat123Rox
f2547f02d6
Made changes suggested by @ines
2019-03-20 07:43:19 +05:30
Ines Montani
7400c7f8a7
Move UD scripts to bin
2019-03-20 01:19:34 +01:00
Ines Montani
685fff40cf
Revert "Add --always-link flag to cli.download (see #3435)"
This reverts commit 583a566843.
2019-03-20 01:03:40 +01:00
Matthew Honnibal
6cfbb2d34e
Merge branch 'master' of https://github.com/explosion/spaCy
2019-03-20 00:59:54 +01:00
Matthew Honnibal
5a53e9358a
Set version to 2.1.1
2019-03-20 00:59:45 +01:00
Matthew Honnibal
02d7b41893
Fix GPU installation. Closes #3437
2019-03-20 00:59:27 +01:00