spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-07-15 18:52:29 +03:00

Matthew Honnibal 609c0ba557 Fix accidentally quadratic runtime in Example.split_sents (#5464)	2020-05-20 18:48:18 +02:00

* Tidy up train-from-config a bit
* Fix accidentally quadratic perf in TokenAnnotation.brackets

When we're reading in the gold data, we had a nested loop where we looped over the brackets for each token, looking for brackets that start on that word. This is accidentally quadratic, because we have one bracket per word (for the POS tags). So we had O(N**2) behaviour here that ended up being pretty slow.

To solve this I'm indexing the brackets by their starting word on the TokenAnnotation object, and having a property to provide the previous view.

Fixes
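
The change described in the commit message above can be illustrated with a minimal sketch. This is not spaCy's actual code: the function names and the `(start, end, label)` tuple layout for brackets are assumptions made for illustration only. It contrasts the per-token scan over all brackets with an index keyed by each bracket's starting word, plus a helper that rebuilds the flat list (the "previous view").

```python
from collections import defaultdict


def brackets_per_token_naive(n_tokens, brackets):
    """Hypothetical 'before': scan every bracket for every token.

    With roughly one bracket per token this is O(N**2).
    """
    result = []
    for i in range(n_tokens):
        result.append(
            [(end, label) for start, end, label in brackets if start == i]
        )
    return result


def index_brackets_by_start(brackets):
    """Hypothetical 'after': group brackets by starting token in one pass."""
    by_start = defaultdict(list)
    for start, end, label in brackets:
        by_start[start].append((end, label))
    return by_start


def brackets_per_token_indexed(n_tokens, brackets):
    """Look up each token's brackets in the index: O(N + B) overall."""
    by_start = index_brackets_by_start(brackets)
    return [by_start.get(i, []) for i in range(n_tokens)]


def flat_brackets(by_start):
    """Rebuild a flat (start, end, label) list from the index."""
    return [
        (start, end, label)
        for start in sorted(by_start)
        for end, label in by_start[start]
    ]


# Illustrative data only: one POS-like bracket per token plus one phrase.
example = [(0, 0, "DT"), (1, 1, "NN"), (0, 1, "NP"), (2, 2, "VBZ")]
naive = brackets_per_token_naive(3, example)
indexed = brackets_per_token_indexed(3, example)
```

Both functions return the same per-token grouping; only the amount of work differs. Keeping a flat view available, as the commit message describes, means callers that expect the old list shape would not need to change.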
converters	Tidy up and fix issues	2020-02-18 15:17:03 +01:00
__init__.py	Remove symlinks, data dir and related stuff	2020-02-18 17:20:17 +01:00
convert.py	Add convert CLI option to merge CoNLL-U subtokens (#4722)	2020-01-29 17:44:25 +01:00
debug_data.py	Fix formatting and update docs for v2.2.4	2020-03-09 11:17:20 +01:00
download.py	Remove symlinks, data dir and related stuff	2020-02-18 17:20:17 +01:00
evaluate.py	Update morphologizer (#5108)	2020-04-02 14:46:32 +02:00
info.py	Remove symlinks, data dir and related stuff	2020-02-18 17:20:17 +01:00
init_model.py	Simplify warnings	2020-02-28 12:20:23 +01:00
package.py	Modernize plac commands for Python 3 (#4836)	2020-01-01 13:15:46 +01:00
pretrain.py	Tidy up and auto-format	2020-02-28 11:57:41 +01:00
profile.py	Update spaCy for thinc 8.0.0 (#4920)	2020-01-29 17:06:46 +01:00
train_from_config.py	Fix accidentally quadratic runtime in Example.split_sents (#5464)	2020-05-20 18:48:18 +02:00
train.py	Feature toggle_pipes (#5378)	2020-05-18 22:27:10 +02:00
validate.py	Remove symlinks, data dir and related stuff	2020-02-18 17:20:17 +01:00