spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-07-16 19:22:34 +03:00

Matthew Honnibal 4262f231c5 Fix conversion of older CoNLL parsing files	2020-09-12 18:20:18 +02:00

There are a billion "CoNLL" formats, depending on the tool producing them. The Stanford v3.3 converter has a few quirks that the CoNLL-X conversion wasn't handling:

* Sentences may have extra spacing in between the newlines
* The coarse-grained POS is the same as the fine-grained POS, so we need a tag map to get the coarse-grained POS.

Needing the tag map is particularly unfortunate, it feels like something that should be patched on the source data? Adding the extra option may be confusing to people, especially since it overwrites the corpus tag.
project	Fix handling of existing asset without checksum [ci skip]	2020-09-12 17:02:53 +02:00
templates	Fix learn rate for non-transformer	2020-09-04 21:22:50 +02:00
__init__.py	"model" terminology consistency in docs	2020-09-03 13:13:03 +02:00
_util.py	Merge remote-tracking branch 'upstream/develop' into feature/cli-config	2020-09-12 14:44:40 +02:00
convert.py	Fix conversion of older CoNLL parsing files	2020-09-12 18:20:18 +02:00
debug_config.py	Update docs links in codebase	2020-09-04 12:58:50 +02:00
debug_data.py	Renaming gold & annotation_setter (#6042)	2020-09-09 10:31:03 +02:00
debug_model.py	string_to_list to parse comma-separated string into a list	2020-09-12 14:43:22 +02:00
download.py	Update docs links in codebase	2020-09-04 12:58:50 +02:00
evaluate.py	Renaming gold & annotation_setter (#6042)	2020-09-09 10:31:03 +02:00
info.py	Update docs links in codebase	2020-09-04 12:58:50 +02:00
init_config.py	string_to_list to parse comma-separated string into a list	2020-09-12 14:43:22 +02:00
init_model.py	Fix reading in GloVe vectors	2020-09-12 17:31:18 +02:00
package.py	Support overwriting name on spacy package	2020-09-11 11:38:28 +02:00
pretrain.py	Update docs links in codebase	2020-09-04 12:58:50 +02:00
profile.py	Update docs links in codebase	2020-09-04 12:58:50 +02:00
train.py	Renaming gold & annotation_setter (#6042)	2020-09-09 10:31:03 +02:00
validate.py	Update docs links in codebase	2020-09-04 12:58:50 +02:00
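
The latest commit on convert.py mentions two quirks of Stanford-style CoNLL output: extra spacing between sentences, and a coarse-grained POS column that merely repeats the fine-grained tag. The sketch below illustrates one way such input could be handled. It is not the code from convert.py: the function names `split_sentences` and `read_tokens`, the `EXAMPLE_TAG_MAP` dictionary, and the assumption of tab-separated CoNLL-X columns (ID, FORM, LEMMA, CPOSTAG, POSTAG, ...) are all illustrative choices.

```python
import re

# Hypothetical tag map for illustration only: fine-grained tag -> coarse-grained POS.
EXAMPLE_TAG_MAP = {"DT": "DET", "NN": "NOUN", "VBZ": "VERB"}


def split_sentences(conll_text):
    """Split CoNLL text into sentence blocks, tolerating extra spacing between them."""
    # A separator is a newline, any run of whitespace-only lines, and a final newline.
    blocks = re.split(r"\n\s*\n", conll_text.strip())
    return [block for block in blocks if block.strip()]


def read_tokens(block, tag_map=None):
    """Return (word, coarse_pos, fine_tag) tuples for one sentence block."""
    tokens = []
    for line in block.split("\n"):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        # Assumed CoNLL-X layout: FORM is column 2, CPOSTAG column 4, POSTAG column 5.
        word, coarse, fine = cols[1], cols[3], cols[4]
        if tag_map is not None:
            # Overwrite the corpus coarse tag with the mapped value when one exists.
            coarse = tag_map.get(fine, coarse)
        tokens.append((word, coarse, fine))
    return tokens
```

Passing `tag_map=None` keeps the corpus tags untouched, which mirrors the concern in the commit message that a tag map overwrites what the corpus provides; whether that should be the default is a design decision the sketch does not settle.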