spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-09-16 00:52:38 +03:00

Author	SHA1	Message	Date
Ines Montani	e703301129	Update universe [ci skip]	2019-06-02 13:55:55 +02:00
Ines Montani	892e72451f	Update universe [ci skip]	2019-06-02 12:58:12 +02:00
Ines Montani	42de5be90c	Tidy up universe [ci skip]	2019-06-02 12:38:48 +02:00
Nirant	638caba9b5	Add multiple packages to universe.json (#3809) [ci skip] * Add multiple packages to universe.json Added following packages: NLPArchitect, NLPRe, Chatterbot, alibi, NeuroNER * Auto-format * Update slogan (probably just copy-paste mistake) * Adjust formatting * Update tags / categories	2019-06-02 12:35:52 +02:00
Germán	86eb817b74	Overwrites default getter for like_num in Spanish by adding _num_words and like_num to lex_attrs.py (#3810) (closes #3803) * (#3803) Spanish like_num returning false for number-like token * (#3803) Spanish like_num now returning True for number-like token	2019-06-02 12:22:57 +02:00
Nirant	d4d1eab5e1	Add Baderlab/saber to universe.json (#3806)	2019-06-01 17:36:40 +02:00
Nirant	a5d92a3035	Create NirantK.md (#3807) [ci skip]	2019-06-01 17:36:06 +02:00
Ines Montani	6be7d07315	Update UNIVERSE.md	2019-06-01 16:37:06 +02:00
Ines Montani	09e78b52cf	Improve E024 text for incorrect GoldParse (closes #3558)	2019-06-01 14:37:27 +02:00
Ines Montani	0c74506c9c	Fix typos in docs (closes #3802) [ci skip]	2019-06-01 11:35:01 +02:00
Nipun Sadvilkar	1f13005751	Incorrect Token attribute ent_iob_ description (#3800) * Incorrect Token attribute ent_iob_ description * Add spaCy contributor agreement	2019-05-31 16:50:45 +02:00
Ramanan Balakrishnan	26c37c5a4d	fix all references to BILUO annotation format (#3797)	2019-05-31 12:19:19 +02:00
Ines Montani	a7fd42d937	Make jsonschema dependency optional (#3784)	2019-05-30 14:34:58 +02:00
svlandeg	268a52ead7	experimenting with cosine sim for negative examples (not OK yet)	2019-05-29 16:07:53 +02:00
mak	89379a7fa4	Corrected example model URL in requirements.txt (#3786) The URL used to show how to add a model to the requirements.txt had the old release path (excl. explosion).	2019-05-29 10:51:55 +02:00
svlandeg	a761929fa5	context encoder combining sentence and article	2019-05-28 18:14:49 +02:00
Ines Montani	a8416c46f7	Use string name in setup.py Hopefully this will trick GitHub's parser into recognising it as a Python package and show us the dependents / "used by" statistics 🤞	2019-05-28 17:11:39 +02:00
svlandeg	992fa92b66	refactor again to clusters of entities and cosine similarity	2019-05-28 00:05:22 +02:00
svlandeg	8c4aa076bc	small fixes	2019-05-27 14:29:38 +02:00
Ujwal Narayan	ed7be3f64c	Update norm_exceptions.py (#3778) * Update norm_exceptions.py Extended the Currency set to include Franc, Indian Rupee, Bangladeshi Taka, Korean Won, Mexican Dollar, and Egyptian Pound * Fix formatting [ci skip]	2019-05-27 11:52:52 +02:00
svlandeg	cfc27d7ff9	using Tok2Vec instead	2019-05-26 23:39:46 +02:00
svlandeg	abf9af81c9	learn rate and epochs	2019-05-24 22:04:25 +02:00
estr4ng7d	604acb6ace	Marathi Language Support (#3767) * Adding Marathi language details and folder to it * Adding few changes and running tests * Adding few changes and running tests * Update __init__.py mh -> mr * Rename spacy/lang/mh/__init__.py to spacy/lang/mr/__init__.py * mh -> mr	2019-05-24 14:29:42 +02:00
Ines Montani	7634812172	Document Language.evaluate	2019-05-24 14:06:36 +02:00
Ines Montani	45e6855550	Update Language.update docs	2019-05-24 14:06:26 +02:00
Ines Montani	b78a8dc1d2	Update Scorer and add API docs	2019-05-24 14:06:04 +02:00
svlandeg	86ed771e0b	adding local sentence encoder	2019-05-23 16:59:11 +02:00
svlandeg	4392c01b7b	obtain sentence for each mention	2019-05-23 15:37:05 +02:00
svlandeg	97241a3ed7	upsampling and batch processing	2019-05-22 23:40:10 +02:00
svlandeg	1a16490d20	update per entity	2019-05-22 12:46:40 +02:00
svlandeg	eb08bdb11f	hidden width for encoders	2019-05-21 23:42:46 +02:00
svlandeg	7b13e3d56f	undersampling negatives	2019-05-21 18:35:10 +02:00
svlandeg	2fa3fac851	fix concat bp and more efficient batch calls	2019-05-21 13:43:59 +02:00
svlandeg	0a15ee4541	fix in bp call	2019-05-20 23:54:55 +02:00
svlandeg	89e322a637	small fixes	2019-05-20 17:20:39 +02:00
Ujwal Narayan	4d550a3055	Enhancing Kannada language Resources (#3755) * Updated stop_words.py Added more stopwords * Create ujwal-narayan.md Enhancing Kannada language resources	2019-05-20 12:56:10 +02:00
svlandeg	7edb2e1711	fix convolution layer	2019-05-20 11:58:48 +02:00
svlandeg	dd691d0053	debugging	2019-05-17 17:44:11 +02:00
svlandeg	400b19353d	simplify architecture and larger-scale test runs	2019-05-17 01:51:18 +02:00
Ines Montani	321c9f5acc	Fix lex_id docs (closes #3743)	2019-05-16 23:15:58 +02:00
svlandeg	d51bffe63b	clean up code	2019-05-16 18:36:15 +02:00
svlandeg	b5470f3d75	various tests, architectures and experiments	2019-05-16 18:25:34 +02:00
svlandeg	9ffe5437ae	calculate gradient for entity encoding	2019-05-15 02:23:08 +02:00
svlandeg	2713abc651	implement loss function using dot product and prob estimate per candidate cluster	2019-05-14 22:55:56 +02:00
BreakBB	ed18a6efbd	Add check for callable to 'Language.replace_pipe' to fix #3737 (#3741)	2019-05-14 16:59:31 +02:00
svlandeg	09ed446b20	different architecture / settings	2019-05-14 08:37:52 +02:00
svlandeg	4142e8dd1b	train and predict per article (saving time for doc encoding)	2019-05-13 17:02:34 +02:00
svlandeg	3b81b00954	evaluating on dev set during training	2019-05-13 14:26:04 +02:00
Ines Montani	8baff1c7c0	💫 Improve introspection of custom extension attributes (#3729) * Add custom __dir__ to Underscore (see #3707) * Make sure custom extension methods keep their docstrings (see #3707) * Improve tests * Prepend note on partial to docstring (see #3707) * Remove print statement * Handle cases where docstring is None	2019-05-12 00:53:11 +02:00
Ines Montani	f96af8526a	Merge branch 'spacy.io' [ci skip]	2019-05-11 23:03:56 +02:00

10319 Commits