spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-09-23 12:36:46 +03:00

Author	SHA1	Message	Date
Matthew Honnibal	2a8137aba9	Merge pull request #5518 from svlandeg/fix/pretrain-docs Pretrain fixes	2020-05-29 19:20:20 +02:00
svlandeg	291483157d	prevent loading a pretrained Tok2Vec layer AND pretrained components	2020-05-29 17:38:33 +02:00
Adriane Boyd	e1b7cbd197	Remove MorphAnalysis __str__ and __repr__	2020-05-29 14:33:47 +02:00
svlandeg	04ba37b667	fix description	2020-05-29 13:52:39 +02:00
svlandeg	5f0a91cf37	fix conv-depth parameter	2020-05-29 09:56:29 +02:00
Ines Montani	4fd087572a	WIP: improve model version deps	2020-05-28 12:51:37 +02:00
Matthw Honnibal	58750b06f8	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2020-05-27 22:18:36 +02:00
Matthew Honnibal	aecd1437cc	Merge pull request #5508 from adrianeboyd/bugfix/tag-map-sp-tag Prefer _SP over SP for default tag map space attrs	2020-05-27 20:39:40 +02:00
Matthew Honnibal	e7ac12b598	Merge pull request #5514 from adrianeboyd/bugfix/load-vector-name Improve vector name loading from model meta	2020-05-27 20:39:23 +02:00
Adriane Boyd	25de2a2191	Improve vector name loading from model meta	2020-05-27 14:48:54 +02:00
adrianeboyd	aad0610a85	Map NR to PROPN (#5512)	2020-05-26 22:30:53 +02:00
Sofie Van Landeghem	f00488ab30	Update train_intent_parser.py	2020-05-26 16:41:39 +02:00
Adriane Boyd	b6b5908f5e	Prefer _SP over SP for default tag map space attrs If `_SP` is already in the tag map, use the mapping from `_SP` instead of `SP` so that `SP` can be a valid non-space tag. (Chinese has a non-space tag `SP` which was overriding the mapping of `_SP` to `SPACE`.)	2020-05-26 14:57:13 +02:00
Matthew Honnibal	b0c0271a48	Merge pull request #5506 from adrianeboyd/bugfix/pl-lemmatizer-lookup-loading Fix Polish lemmatizer for deserialized models	2020-05-26 12:31:25 +02:00
Matthew Honnibal	a44d51a3d8	Merge pull request #5496 from explosion/docs/unicode-str unicode -> str consistency	2020-05-26 10:30:37 +02:00
Adriane Boyd	1eed101be9	Fix Polish lemmatizer for deserialized models Restructure Polish lemmatizer not to depend on lookups data in `__init__` since the lemmatizer is initialized before the lookups data is loaded from a saved model. The lookups tables are accessed first in `__call__` instead once the data is available.	2020-05-26 09:56:12 +02:00
adrianeboyd	69897b45d8	Handle spacy.pex renaming in Makefile (#5503)	2020-05-25 16:39:22 +02:00
adrianeboyd	c9c7b135c0	Update Makefile for v2.3.0 (#5502)	2020-05-25 15:24:24 +02:00
Ines Montani	24ef6680fa	Merge pull request #5499 from adrianeboyd/chore/bump-version-deps-v2.3.0	2020-05-25 13:25:45 +02:00
Ines Montani	ade4767e06	Merge pull request #5498 from adrianeboyd/bugfix/phrasematcher-unpickle-new-api	2020-05-25 13:25:07 +02:00
Adriane Boyd	3f727bc539	Switch to v2.3.0.dev0	2020-05-25 12:57:20 +02:00
Adriane Boyd	736f3cb5af	Bump version and deps for v2.3.0 * spacy to v2.3.0 * thinc to v7.4.1 * spacy-lookups-data to v0.3.2	2020-05-25 12:03:49 +02:00
Rajat	8b8efa1b42	update spacy universe with my project (#5497) * added contextualSpellCheck in spacy universe meta * removed extra formatting by code * updated with permanent links * run json linter used by spacy * filled SCA * updated the description	2020-05-25 11:30:23 +02:00
Adriane Boyd	e06ca7ea24	Switch to new add API in PhraseMatcher unpickle	2020-05-25 11:22:47 +02:00
Ines Montani	1a15896ba9	unicode -> str consistency [ci skip]	2020-05-24 18:51:10 +02:00
Ines Montani	262d306eaa	unicode -> str consistency	2020-05-24 17:23:00 +02:00
Ines Montani	5d3806e059	unicode -> str consistency	2020-05-24 17:20:58 +02:00
Ines Montani	cf156ed2f4	Merge pull request #5495 from explosion/fix/simplify-is-package	2020-05-24 15:42:55 +02:00
Ines Montani	387c7aba15	Update test	2020-05-24 14:55:16 +02:00
Ines Montani	f9786d765e	Simplify is_package check	2020-05-24 14:48:56 +02:00
Sofie Van Landeghem	ae1c179f3a	Remove the nested quote	2020-05-23 17:58:19 +02:00
Ines Montani	15d3a0ac3a	Merge pull request #5491 from explosion/chore/rename-pipe-analysis	2020-05-23 12:41:54 +02:00
Matthw Honnibal	2d9de8684d	Support use_pytorch_for_gpu_memory config	2020-05-22 23:10:40 +02:00
Jannis	aa53ce6996	Documentation Typo Fix (#5492) * Fix typo Change 'realize' to 'realise' * Add contributer agreement	2020-05-22 19:50:26 +02:00
Ines Montani	4465cad6c5	Rename spacy.analysis to spacy.pipe_analysis	2020-05-22 17:42:06 +02:00
Ines Montani	25d6ed3fb8	Merge pull request #5489 from explosion/feature/connected-components	2020-05-22 17:40:11 +02:00
Ines Montani	841c05b47b	Merge pull request #5490 from explosion/fix/remove-jsonschema	2020-05-22 17:39:54 +02:00
Ines Montani	569a65b60e	Auto-format	2020-05-22 16:55:42 +02:00
Ines Montani	d844528c5f	Add test for is_compatible_model	2020-05-22 16:55:15 +02:00
Ines Montani	12b7be1d98	Remove jsonschema from dependencies	2020-05-22 16:49:26 +02:00
Matthew Honnibal	7a73a9dcf6	Merge pull request #5488 from explosion/feature/better-model-compat Better model compatibility and validation	2020-05-22 16:44:29 +02:00
Matthew Honnibal	f7f6df7275	Move to spacy.analysis	2020-05-22 16:43:18 +02:00
Matthew Honnibal	78d79d94ce	Guess set_annotations=True in nlp.update During `nlp.update`, components can be passed a boolean set_annotations to indicate whether they should assign annotations to the `Doc`. This needs to be called if downstream components expect to use the annotations during training, e.g. if we wanted to use tagger features in the parser. Components can specify their assignments and requirements, so we can figure out which components have these inter-dependencies. After figuring this out, we can guess whether to pass set_annotations=True. We could also call set_annotations=True always, or even just have this as the only behaviour. The downside of this is that it would require the `Doc` objects to be created afresh to avoid problematic modifications. One approach would be to make a fresh copy of the `Doc` objects within `nlp.update()`, so that we can write to the objects without any problems. If we do that, we can drop this logic and also drop the `set_annotations` mechanism. I would be fine with that approach, although it runs the risk of introducing some performance overhead, and we'll have to take care to copy all extension attributes etc.	2020-05-22 15:55:45 +02:00
Ines Montani	6728747f71	Merge pull request #5486 from explosion/fix/compat-py2	2020-05-22 15:47:21 +02:00
Ines Montani	6e6db6afb6	Better model compatibility and validation	2020-05-22 15:42:46 +02:00
Matthew Honnibal	f6078d866a	Merge pull request #5121 from adrianeboyd/bugfix/revert-token-match Revert token_match priority changes from #4374 and extend token match options	2020-05-22 14:42:51 +02:00
Ines Montani	c685ee734a	Fix compat for v2.x branch	2020-05-22 14:22:36 +02:00
Ines Montani	65c7e82de2	Auto-format and remove 2.3 feature [ci skip]	2020-05-22 13:50:30 +02:00
Matthew Honnibal	8cb16c7120	Merge pull request #5485 from adrianeboyd/bugfix/retokenizer-merge-0-length-5450 Disallow merging 0-length spans	2020-05-22 13:28:35 +02:00
Adriane Boyd	e4a1b5dab1	Rename to url_match Rename to `url_match` and update docs.	2020-05-22 12:41:03 +02:00
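
Note on commit 78d79d94ce ("Guess set_annotations=True in nlp.update"): the message describes inferring, from each component's declared assignments and requirements, whether a component needs to write its annotations during training. The following is a minimal sketch of that idea only, not the code from the commit. The `Component` class, the `guess_set_annotations` function, and the attribute strings such as "token.tag" are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Component:
    """Hypothetical stand-in for a pipeline component (not spaCy's class)."""
    name: str
    assigns: Set[str] = field(default_factory=set)   # attributes it writes
    requires: Set[str] = field(default_factory=set)  # attributes it reads


def guess_set_annotations(pipeline: List[Component]) -> Dict[str, bool]:
    """For each component, guess True if any later component requires
    an attribute that this component assigns."""
    guesses = {}
    for i, comp in enumerate(pipeline):
        needed_later: Set[str] = set()
        for later in pipeline[i + 1:]:
            needed_later |= later.requires
        guesses[comp.name] = bool(comp.assigns & needed_later)
    return guesses


# Assumed example pipeline: the parser uses tagger features.
pipeline = [
    Component("tagger", assigns={"token.tag"}),
    Component("parser", assigns={"token.dep", "token.head"}, requires={"token.tag"}),
    Component("ner", assigns={"doc.ents"}),
]
guesses = guess_set_annotations(pipeline)
```

Under these assumptions only the tagger would be asked to set annotations, since the parser is the only component that declares a requirement. The alternative raised in the commit message, always setting annotations on fresh copies of the `Doc` objects, would make this dependency check unnecessary at the cost of copying.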

11792 Commits
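
Note on commit b6b5908f5e ("Prefer _SP over SP for default tag map space attrs"): the lookup order described in the message can be sketched roughly as below. This is an illustration under stated assumptions, not the code from the commit. The helper name `space_attrs` is hypothetical, and the example tag map, including the `PART` value given to `SP`, is made up for the demonstration.

```python
from typing import Dict, Optional

TagMap = Dict[str, Dict[str, str]]


def space_attrs(tag_map: TagMap) -> Optional[Dict[str, str]]:
    """Hypothetical helper: pick the attrs to use for whitespace tokens.

    If `_SP` is present it takes priority, so that `SP` stays free to be
    an ordinary non-space tag. Otherwise fall back to `SP` (or None).
    """
    if "_SP" in tag_map:
        return tag_map["_SP"]
    return tag_map.get("SP")


# Assumed tag map where `SP` is a regular tag and `_SP` marks whitespace.
tag_map_with_both = {"SP": {"POS": "PART"}, "_SP": {"POS": "SPACE"}}
# Assumed tag map that only defines `SP`.
tag_map_sp_only = {"SP": {"POS": "SPACE"}}

attrs_both = space_attrs(tag_map_with_both)
attrs_sp_only = space_attrs(tag_map_sp_only)
```

With both tags present, the whitespace attrs come from `_SP`, so the `SP` entry no longer overrides the mapping to `SPACE`.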