Commit Graph

15337 Commits

Author SHA1 Message Date
svlandeg
feb71e15fd hash the entity name 2019-03-22 11:36:44 +01:00
svlandeg
839dafa104 documented some comments and todos 2019-03-22 11:36:44 +01:00
svlandeg
7f37737878 kb snippet, draft by Matt (wip) 2019-03-22 11:36:44 +01:00
svlandeg
735fc2a735 annotate kb_id through ents in doc 2019-03-22 11:36:44 +01:00
svlandeg
d849eb2455 adding kb_id as field to token, el as nlp pipeline component 2019-03-22 11:34:46 +01:00
Matthew Honnibal
d811c97da1 Fix test that caused pytest to choke on Python3 2019-03-22 10:28:51 +01:00
Matthew Honnibal
a2ad9832e5 Add failing test for #3356 2019-03-22 02:42:37 +01:00
svlandeg
4820b43313 use nlp's vocab for stringstore 2019-03-21 23:17:25 +01:00
Matthew Honnibal
7ec64a36fd Merge pull request #3455 from explosion/bugfix/fix-en-tag-map
💫 Bring English tag_map in line with UD Treebank
2019-03-21 21:19:30 +01:00
svlandeg
6e2433b95e select candidate with highest prior probability 2019-03-21 18:55:01 +01:00
svlandeg
24a0c4a8d4 name per entity 2019-03-21 18:20:57 +01:00
svlandeg
d0c763ba44 minimal EL pipe 2019-03-21 17:33:25 +01:00
svlandeg
26afa4800f ensure no candidates are returned for unknown aliases 2019-03-21 15:24:40 +01:00
Matthew Honnibal
c66bd61e88 Fix lemmas 2019-03-21 14:22:12 +01:00
Matthew Honnibal
04395ffa49 Bring English tag_map in line with UD Treebank
I wrote a small script to read the UD English training data and check
that our tag map and morph rules were resulting in the best POS map.
This hadn't been done for some time, and there have been various changes
to the UD schema since it has been done. After these changes we should
see much better agreement between our POS assignments and the UD POS
tags.
2019-03-21 13:53:44 +01:00
svlandeg
a5d5a05930 Entity class 2019-03-21 13:32:21 +01:00
svlandeg
6ba4079f7c property getters and keep track of KB internally 2019-03-21 13:26:12 +01:00
svlandeg
34969dddeb unit test on number of candidates generated 2019-03-21 12:48:59 +01:00
svlandeg
0ff4ce6c59 store entity hash instead of pointer 2019-03-21 12:31:02 +01:00
Ines Montani
375fbf3586 Update v2-1.md 2019-03-21 12:29:08 +01:00
Ines Montani
9394ca1f29 Update index.md 2019-03-21 10:24:55 +01:00
Ines Montani
0c82a5ddb2 Merge branch 'master' of https://github.com/explosion/spaCy 2019-03-21 10:23:56 +01:00
Ines Montani
0712efc6b3 Update version requirements [ci skip] 2019-03-21 10:23:54 +01:00
svlandeg
81a9030ab7 create candidate object from entry pointer (not fully functional yet) 2019-03-21 00:04:06 +01:00
Matthew Honnibal
4e3ed2ea88 Add -t2v argument to train_textcat script 2019-03-20 23:05:42 +01:00
Ines Montani
764359c952 Merge branch 'master' into spacy.io 2019-03-20 17:24:28 +01:00
Ines Montani
dac8f8ff99 Update Span.__init__ docs (see #3445) [ci skip] 2019-03-20 17:24:17 +01:00
Matthew Honnibal
c7f26abe5f Merge pull request #3434 from Bharat123rox/narrow-unicode
Raise Error for a narrow unicode build of Python
2019-03-20 12:19:52 +01:00
Matthew Honnibal
1c8ff59185 Merge pull request #3441 from explosion/fix/cli-ud-scripts
💫 Move UD scripts to bin
2019-03-20 12:19:15 +01:00
Matthew Honnibal
72889a16d5 Fix similarity calculation if vectors are on GPU (#3440) 2019-03-20 12:09:59 +01:00
Matthew Honnibal
1612990e88 Implement cosine loss for spacy pretrain. Make default 2019-03-20 11:06:58 +00:00
Ines Montani
ae5b4d0e84 Fix formatting (hopefully also restarts build properly) 2019-03-20 09:55:45 +01:00
Ines Montani
6abc1ddb26 Update __main__.py 2019-03-20 09:43:26 +01:00
Bharat123Rox
f2547f02d6 Made changes suggested by @ines 2019-03-20 07:43:19 +05:30
Ines Montani
7400c7f8a7 Move UD scripts to bin 2019-03-20 01:19:34 +01:00
Ines Montani
685fff40cf Revert "Add --always-link flag to cli.download (see #3435)"
This reverts commit 583a566843.
2019-03-20 01:03:40 +01:00
Matthew Honnibal
6cfbb2d34e Merge branch 'master' of https://github.com/explosion/spaCy 2019-03-20 00:59:54 +01:00
Matthew Honnibal
5a53e9358a Set version to 2.1.1 2019-03-20 00:59:45 +01:00
Matthew Honnibal
02d7b41893 Fix GPU installation. Closes #3437 2019-03-20 00:59:27 +01:00
Ines Montani
583a566843 Add --always-link flag to cli.download (see #3435) 2019-03-19 22:03:27 +01:00
svlandeg
b7ca3de358 check the length of entities and probabilities vector + unit test 2019-03-19 21:55:10 +01:00
svlandeg
7402bb4c06 correct size, not counting dummy elements in the vector 2019-03-19 21:50:32 +01:00
svlandeg
f0decf98f1 check and unit test in case prior probs exceed 1 2019-03-19 21:43:48 +01:00
svlandeg
2f2f821648 avoid value 0 in preshmap and helpful user warnings 2019-03-19 21:35:24 +01:00
Bharat123Rox
b5f077dcf4 Sign the Contributor Agreement and update details 2019-03-19 23:07:54 +05:30
Bharat123Rox
6db1ddd9c7 Raise ValueError for narrow unicode build 2019-03-19 23:02:58 +05:30
svlandeg
19d3a2f9aa raising error when adding alias for unknown entity + unit test 2019-03-19 17:39:35 +01:00
svlandeg
1d20f19208 use StringStore 2019-03-19 16:43:23 +01:00
svlandeg
1fba7219fb bugfix adding aliases 2019-03-19 16:15:38 +01:00
svlandeg
c62cca3368 get candidates by alias 2019-03-19 15:51:56 +01:00