5727 Commits

Author SHA1 Message Date
svlandeg
6ba4079f7c property getters and keep track of KB internally 2019-03-21 13:26:12 +01:00
svlandeg
34969dddeb unit test on number of candidates generated 2019-03-21 12:48:59 +01:00
svlandeg
0ff4ce6c59 store entity hash instead of pointer 2019-03-21 12:31:02 +01:00
svlandeg
81a9030ab7 create candidate object from entry pointer (not fully functional yet) 2019-03-21 00:04:06 +01:00
svlandeg
b7ca3de358 check the length of entities and probabilities vector + unit test 2019-03-19 21:55:10 +01:00
svlandeg
7402bb4c06 correct size, not counting dummy elements in the vector 2019-03-19 21:50:32 +01:00
svlandeg
f0decf98f1 check and unit test in case prior probs exceed 1 2019-03-19 21:43:48 +01:00
svlandeg
2f2f821648 avoid value 0 in preshmap and helpful user warnings 2019-03-19 21:35:24 +01:00
svlandeg
19d3a2f9aa raising error when adding alias for unknown entity + unit test 2019-03-19 17:39:35 +01:00
svlandeg
1d20f19208 use StringStore 2019-03-19 16:43:23 +01:00
svlandeg
1fba7219fb bugfix adding aliases 2019-03-19 16:15:38 +01:00
svlandeg
c62cca3368 get candidates by alias 2019-03-19 15:51:56 +01:00
svlandeg
a4d876d471 adding and retrieving aliases 2019-03-18 17:50:01 +01:00
svlandeg
a14fb54b17 very minimal KB functionality working 2019-03-18 17:27:51 +01:00
svlandeg
5ac7edf53c adding aliases per entity in the KB 2019-03-18 12:38:40 +01:00
svlandeg
3945fd21b0 fix compile errors 2019-03-18 10:31:01 +01:00
svlandeg
56b55e3bcd add pyx and separate method to add aliases 2019-03-15 16:05:23 +01:00
svlandeg
dc603fb85e hash the entity name 2019-03-15 15:00:53 +01:00
svlandeg
b6bac49444 documented some comments and todos 2019-03-15 11:37:24 +01:00
svlandeg
097e5f3da1 kb snippet, draft by Matt (wip) 2019-03-15 11:17:35 +01:00
svlandeg
5f002e9ced annotate kb_id through ents in doc 2019-03-14 16:31:46 +01:00
svlandeg
173d45ec5f adding kb_id as field to token, el as nlp pipeline component 2019-03-06 19:34:18 +01:00
Ines Montani
23f6ebf0f3 Add missing " (closes #3343) 2019-02-27 16:37:03 +01:00
Ines Montani
533b580c19 Add test for stray print statements in languages (see #3342) 2019-02-27 16:04:30 +01:00
Ines Montani
48a2046d1c Remove stray print statement (closes #3342) 2019-02-27 15:35:04 +01:00
Ines Montani
07d7c0a1af Fix whitespace 2019-02-27 15:34:21 +01:00
Ines Montani
9b62639d19 Auto-format [ci skip] 2019-02-27 14:24:55 +01:00
Matthew Honnibal
656edcb984 Set version to v2.1.0a10 2019-02-27 12:26:13 +01:00
Matthew Honnibal
f1d77eb140 💫 Improve handling of missing NER tags (closes #2603) (#3341)
* Improve handling of missing NER tags

GoldParse can accept missing NER tags, if entities is provided
in BILUO format (rather than as spans). Missing tags can be provided
as None values.

Fix bug that occurred when first tag was a None value. Closes #2603.

* Document specification of missing NER tags.
2019-02-27 12:06:32 +01:00
Ines Montani
e359bdd0e3 Auto-format 2019-02-27 11:56:45 +01:00
Matthew Honnibal
4a3371acd5 Make doc[0].is_sent_start == True (closes #2869) (#3340)
* Make doc[0] have sent_start True. Closes #2869

* Document that doc[0].is_sent_start defaults True.
2019-02-27 11:17:17 +01:00
Matthew Honnibal
2d3ce89b78 Improve matcher tests re issue #3328 2019-02-27 10:25:56 +01:00
Matthew Honnibal
8d6954e0e7 Fix matcher bug #3328 2019-02-27 10:25:39 +01:00
Ines Montani
aadf586789 Add xfailing test for #3331 2019-02-25 22:33:30 +01:00
Matthew Honnibal
3cdd3eb518 Set version to v2.1.0a9 2019-02-25 21:55:19 +01:00
Matthew Honnibal
b449be0f04 Add comment re issue #3170 2019-02-25 21:24:03 +01:00
Matthew Honnibal
9ccd6a3062 Fix head-outside-sentence bug. Fixes #3170 2019-02-25 21:21:44 +01:00
Matthew Honnibal
f2fae1f186 Add batch size argument to Language.evaluate(). Closes #3263 2019-02-25 19:30:33 +01:00
Ines Montani
f135d663f7 Update conftest.py 2019-02-25 15:55:29 +01:00
Ines Montani
76ce8b2662 Merge branch 'master' into develop 2019-02-25 15:54:55 +01:00
Julia Makogon
f1c3108d52 Fixing pymorphy2 dependency issue (#3329) (closes #3327)
* Classes for Ukrainian; small fix in Russian.

* Contributor agreement

* pymorphy2 initialization split for ru and uk (#3327)

* stop-words fixed

* Unit-tests updated
2019-02-25 15:48:17 +01:00
Ines Montani
1a735e0f1f Add regression test for #3328 2019-02-25 10:12:58 +01:00
Ines Montani
dfbed07d3b Remove unused temp errors 2019-02-24 22:26:08 +01:00
Ines Montani
62b558ab72 💫 Support lexical attributes in retokenizer attrs (closes #2390) (#3325)
* Fix formatting and whitespace

* Add support for lexical attributes (closes #2390)

* Document lexical attribute setting during retokenization

* Assign variable outside of nested loop
2019-02-24 21:13:51 +01:00
Ines Montani
a48deb4081 Merge regression tests 2019-02-24 21:03:39 +01:00
Ines Montani
8f6c193a4d Delete _test_issue1622.py 2019-02-24 20:33:31 +01:00
Ines Montani
c8e967c78d Try include previously segfaulting test 2019-02-24 20:32:46 +01:00
Ines Montani
328b589deb Merge regression tests 2019-02-24 20:31:38 +01:00
Ines Montani
3bc53905cc Remove print statements from test 2019-02-24 20:31:15 +01:00
Ines Montani
1ae0df3da9 Un-x-fail passing test 2019-02-24 20:24:15 +01:00