spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-09-03 02:44:56 +03:00

Author	SHA1	Message	Date
Ines Montani	c8e967c78d	Try include previously segfaulting test	2019-02-24 20:32:46 +01:00
Ines Montani	328b589deb	Merge regression tests	2019-02-24 20:31:38 +01:00
Ines Montani	3bc53905cc	Remove print statements from test	2019-02-24 20:31:15 +01:00
Ines Montani	1ae0df3da9	Un-x-fail passing test	2019-02-24 20:24:15 +01:00
Ines Montani	399a5803d0	Tidy up tests [ci skip]	2019-02-24 19:02:16 +01:00
Ines Montani	aa52305461	Improve pipeline model and meta example [ci skip]	2019-02-24 18:45:39 +01:00
Ines Montani	2011563c51	Update docstrings [ci skip]	2019-02-24 18:39:59 +01:00
Ines Montani	df19e2bff6	💫 Allow setting of custom attributes during retokenization (closes #3314) (#3324) ## Description This PR adds the ability to override custom extension attributes during merging. This only works for attributes that are writable, i.e. attributes registered with a default value like `default=False` or attributes that have both a getter and a setter implemented. ```python Token.set_extension('is_musician', default=False) doc = nlp("I like David Bowie.") with doc.retokenize() as retokenizer: attrs = {"LEMMA": "David Bowie", "_": {"is_musician": True}} retokenizer.merge(doc[2:4], attrs=attrs) assert doc[2].text == "David Bowie" assert doc[2].lemma_ == "David Bowie" assert doc[2]._.is_musician ``` ### Types of change enhancement ## Checklist - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information.	2019-02-24 18:38:47 +01:00
Ines Montani	403b9cd58b	Add docs on adding to existing tokenizer rules [ci skip]	2019-02-24 18:35:19 +01:00
Ines Montani	1ea1bc98e7	Document regex utilities [ci skip]	2019-02-24 18:34:10 +01:00
Ines Montani	cd4bc6757b	Update README.md [ci skip]	2019-02-24 17:40:01 +01:00
Matthew Honnibal	1f7c56cd93	Fix parser.add_label()	2019-02-24 16:53:22 +01:00
Matthew Honnibal	893aa40d73	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2019-02-24 16:43:01 +01:00
Matthew Honnibal	5882d82915	Set version to v2.1.0a9.dev2	2019-02-24 16:42:06 +01:00
Matthew Honnibal	0367f864fe	Fix handling of added labels. Resolves #3189	2019-02-24 16:41:41 +01:00
Matthew Honnibal	4dc57d9e15	Update train_new_entity_type example	2019-02-24 16:41:03 +01:00
Matthew Honnibal	d74dbde828	Fix order of actions when labels added to parser When labels were added to the parser or NER, we weren't loading back the classes in the correct order. Re issue #3189	2019-02-24 16:36:29 +01:00
Matthew Honnibal	7ac0f9626c	Update rehearsal example	2019-02-24 16:17:41 +01:00
Ines Montani	6de81ae310	Fix formatting of errors	2019-02-24 15:11:28 +01:00
Ines Montani	d8f69d592f	Tidy up retokenizer tests	2019-02-24 14:14:11 +01:00
Ines Montani	723e27cb8c	Tidy up tests	2019-02-24 14:11:23 +01:00
Ines Montani	2982f82934	Auto-format	2019-02-24 14:09:15 +01:00
Ines Montani	09bf08b3c3	Update redirects [ci skip]	2019-02-24 13:37:50 +01:00
Ines Montani	dceca3264d	Tidy up package.json [ci skip]	2019-02-24 13:37:41 +01:00
Ines Montani	3ef4da3503	Update and auto-format README [ci skip]	2019-02-24 13:12:13 +01:00
Ines Montani	46ec5cdccc	Update TextCategorizer docs	2019-02-24 13:11:57 +01:00
Ines Montani	c03cb1cc63	Improve built-in component API docs	2019-02-24 13:11:49 +01:00
Ines Montani	235a0e948e	Tidy up CI config	2019-02-24 12:07:33 +01:00
Ines Montani	b570a1e203	Exclude website branch from CI	2019-02-24 11:52:16 +01:00
Ines Montani	383e2e1f12	Update Python versions [ci skip]	2019-02-24 11:49:45 +01:00
Ines Montani	b624cb4b89	Update v2-1.md	2019-02-24 11:49:27 +01:00
Matthew Honnibal	909a9d9932	Set version to v2.1.0a9.dev1	2019-02-23 13:10:42 +01:00
Matthew Honnibal	55bb3cc482	Require thinc 7.0.2	2019-02-23 13:10:09 +01:00
Matthew Honnibal	981cb89194	Fix f-score calculation if zero	2019-02-23 12:45:41 +01:00
Matthew Honnibal	6b0008afc6	Clean up TextCategorizer slightly	2019-02-23 12:28:06 +01:00
Matthew Honnibal	d13b9373bf	Improve initialization for mutually textcat	2019-02-23 12:27:45 +01:00
Matthew Honnibal	5063d999e5	Set architecture in textcat example	2019-02-23 11:57:59 +01:00
Matthew Honnibal	e9dd5943b9	Support exclusive_classes setting for textcat models	2019-02-23 11:57:16 +01:00
Matthew Honnibal	ce1e4eace2	Default to former TextCategorizer model * Keep TextCategorizer default model same as v2.0 * Add option 'architecture' that allows "simple_cnn" to switch to simpler model. * Add option exclusive_classes, defaulting to False. If set to True, the model treats classes as mutually exclusive, i.e. only one class can be true per instance.	2019-02-23 11:55:16 +01:00
Matthew Honnibal	829c9091a4	Set version to v2.1.0a9.dev0	2019-02-21 17:13:34 +01:00
Matthew Honnibal	d396a69c7b	More fixes for issue #3112	2019-02-21 17:12:23 +01:00
Ines Montani	80bdcb99c5	Fix escaping of HTML in displacy ENT (closes #2728)	2019-02-21 14:30:39 +01:00
Ines Montani	250e88ef55	Fix docs example (see #2728)	2019-02-21 14:22:06 +01:00
Ines Montani	0fc908d7a5	Add note on merging speed in v2.1 (see #3300) [ci skip]	2019-02-21 12:34:18 +01:00
Ines Montani	236aa94ded	Update v2-1.md	2019-02-21 12:33:56 +01:00
Matthew Honnibal	7d529ebdfb	Set version to v2.1.0a8	2019-02-21 12:09:34 +01:00
Matthew Honnibal	7cbdcaddf3	Ensure new setuptools before building sdist	2019-02-21 12:08:41 +01:00
Matthew Honnibal	f75be6e7be	Set version to v2.1.0a8.dev1	2019-02-21 11:57:06 +01:00
Matthew Honnibal	c5f947f194	Fix regex deprecation warnings	2019-02-21 11:56:47 +01:00
Matthew Honnibal	7f02464494	Set version to v2.1.0a8.dev0	2019-02-21 11:42:23 +01:00

9728 Commits