Jeffrey Gerard
884ba168a8
Capture more noun chunks
2017-08-23 21:18:53 -07:00
Matthew Honnibal
23a55b40ca
Default to English noun chunks iterator if no lang set
2017-07-22 14:15:25 +02:00
Francisco Aranda
5b385e7d78
feat(spanish model): add the spanish noun chunker
2017-06-02 08:14:06 +02:00
Matthew Honnibal
60703cede5
Ensure noun chunks can't be nested. Closes #955
2017-04-23 17:56:39 +02:00
ines
0739ae7b76
Tidy up and fix formatting and imports
2017-04-15 13:05:15 +02:00
Matthew Honnibal
cc36c308f4
Fix noun_chunk rules around coordination
...
Closes #693 .
2017-04-07 17:06:40 +02:00
Matthew Honnibal
b8c4f5ea76
Allow German noun chunks to work on Span
...
Update the German noun chunks iterator, so that it also works on Span objects.
2016-11-24 23:30:15 +11:00
Pokey Rule
3e3bda142d
Add noun_chunks to Span
2016-11-24 10:47:20 +00:00
Matthew Honnibal
a44763af0e
Fix Issue #469 : Incorrectly cased root label in noun chunk iterator
2016-09-27 13:13:01 +02:00
Matthew Honnibal
13fad36e49
* Cosmetic change to english noun chunks iterator -- use enumerate instead of range loop
2016-05-20 10:11:05 +02:00
Wolfgang Seeker
7b78239436
add fix for German noun chunk iterator (issue #365 )
2016-05-06 01:41:26 +02:00
Matthew Honnibal
bb94022975
* Fix Issue #365 : Error introduced during noun phrase chunking, due to use of corrected PRON/PROPN/etc tags.
2016-05-06 00:21:05 +02:00
Wolfgang Seeker
e4ea2bea01
fix whitespace
2016-05-04 07:40:38 +02:00
Wolfgang Seeker
5bf2fd1f78
make the code less cryptic
2016-05-03 17:19:05 +02:00
Wolfgang Seeker
a06fca9fdf
German noun chunk iterator now doesn't return tokens more than once
2016-05-03 16:58:59 +02:00
Matthew Honnibal
508fd1f6dc
* Refactor noun chunk iterators, so that they're simple functions. Install the iterator when the Doc is created, but allow users to write to the noun_chunk_iterator attribute. The iterator functions accept an object and yield (int start, int end, int label) triples.
2016-05-02 14:25:10 +02:00
Wolfgang Seeker
b98cc3266d
bugfix: iterators now reset properly when called a second time
2016-04-15 17:49:16 +02:00
Wolfgang Seeker
80bea62842
bugfix in unit test
2016-04-08 16:46:44 +02:00
Wolfgang Seeker
5e2e8e951a
add baseclass DocIterator for iterators over documents
...
add classes for English and German noun chunks
the respective iterators are set for the document when created by the parser
as they depend on the annotation scheme of the parsing model
2016-03-16 15:53:35 +01:00