spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-09-18 10:02:40 +03:00

Author	SHA1	Message	Date
Ines Montani	45e6855550	Update Language.update docs	2019-05-24 14:06:26 +02:00
Ines Montani	b78a8dc1d2	Update Scorer and add API docs	2019-05-24 14:06:04 +02:00
svlandeg	86ed771e0b	adding local sentence encoder	2019-05-23 16:59:11 +02:00
svlandeg	4392c01b7b	obtain sentence for each mention	2019-05-23 15:37:05 +02:00
svlandeg	97241a3ed7	upsampling and batch processing	2019-05-22 23:40:10 +02:00
svlandeg	1a16490d20	update per entity	2019-05-22 12:46:40 +02:00
svlandeg	eb08bdb11f	hidden with for encoders	2019-05-21 23:42:46 +02:00
svlandeg	7b13e3d56f	undersampling negatives	2019-05-21 18:35:10 +02:00
svlandeg	2fa3fac851	fix concat bp and more efficient batch calls	2019-05-21 13:43:59 +02:00
svlandeg	0a15ee4541	fix in bp call	2019-05-20 23:54:55 +02:00
svlandeg	89e322a637	small fixes	2019-05-20 17:20:39 +02:00
Ujwal Narayan	4d550a3055	Enhancing Kannada language Resources (#3755) * Updated stop_words.py Added more stopwords * Create ujwal-narayan.md Enhancing Kannada language resources	2019-05-20 12:56:10 +02:00
svlandeg	7edb2e1711	fix convolution layer	2019-05-20 11:58:48 +02:00
svlandeg	dd691d0053	debugging	2019-05-17 17:44:11 +02:00
svlandeg	400b19353d	simplify architecture and larger-scale test runs	2019-05-17 01:51:18 +02:00
Ines Montani	321c9f5acc	Fix lex_id docs (closes #3743)	2019-05-16 23:15:58 +02:00
svlandeg	d51bffe63b	clean up code	2019-05-16 18:36:15 +02:00
svlandeg	b5470f3d75	various tests, architectures and experiments	2019-05-16 18:25:34 +02:00
svlandeg	9ffe5437ae	calculate gradient for entity encoding	2019-05-15 02:23:08 +02:00
svlandeg	2713abc651	implement loss function using dot product and prob estimate per candidate cluster	2019-05-14 22:55:56 +02:00
BreakBB	ed18a6efbd	Add check for callable to 'Language.replace_pipe' to fix #3737 (#3741)	2019-05-14 16:59:31 +02:00
svlandeg	09ed446b20	different architecture / settings	2019-05-14 08:37:52 +02:00
svlandeg	4142e8dd1b	train and predict per article (saving time for doc encoding)	2019-05-13 17:02:34 +02:00
svlandeg	3b81b00954	evaluating on dev set during training	2019-05-13 14:26:04 +02:00
Ines Montani	8baff1c7c0	💫 Improve introspection of custom extension attributes (#3729) * Add custom __dir__ to Underscore (see #3707) * Make sure custom extension methods keep their docstrings (see #3707) * Improve tests * Prepend note on partial to docstring (see #3707) * Remove print statement * Handle cases where docstring is None	2019-05-12 00:53:11 +02:00
Ines Montani	f96af8526a	Merge branch 'spacy.io' [ci skip]	2019-05-11 23:03:56 +02:00
Matthew Honnibal	3aceeeaaeb	Set version to v2.1.4	2019-05-11 22:57:53 +02:00
Ines Montani	aea1c93a05	Replace cytoolz.partition_all with util.minibatch	2019-05-11 21:12:09 +02:00
Ines Montani	0bf6441863	Fix .iob converter (closes #3620)	2019-05-11 19:15:26 +02:00
Matthew Honnibal	f6e9394aa5	Fix push-tag script	2019-05-11 19:04:35 +02:00
Matthew Honnibal	a5159ddcf5	Set version to v2.1.4.dev1	2019-05-11 19:03:51 +02:00
Ines Montani	7534f7cb44	Fix return value of Language.update (closes #3692)	2019-05-11 18:40:19 +02:00
Ines Montani	503b8c85f1	Add TWiML podcast to universe [ci skip]	2019-05-11 17:48:22 +02:00
Ines Montani	0daf2422a3	Auto-format	2019-05-11 17:48:07 +02:00
Ines Montani	6b3a79ac96	Call rmtree and copytree with strings (closes #3713)	2019-05-11 15:48:35 +02:00
devforfu	21af12eb53	Make "text" key in JSONL format optional when "tokens" key is provided (#3721) * Fix issue with forcing text key when it is not required * Extending the docs to reflect the new behavior	2019-05-11 15:41:29 +02:00
Ines Montani	6cfa1e1f47	Fix DependencyParser.predict docs (resolves #3561)	2019-05-11 15:37:54 +02:00
Ines Montani	25f5592d57	Improve Token.prob and Lexeme.prob docs (resolves #3701)	2019-05-11 15:23:41 +02:00
Aaron Kub	719a15f23d	fixing regex matcher examples (#3708) (#3719)	2019-05-10 14:23:52 +02:00
Luca Dorigo	82d034f976	Update glossary.py to match information found in documentation (#3704) (closes ##3679) * Update glossary.py to match information found in documentation I used regexes to add any dependency tag that was in the documentation but not in the glossary. Solves #3679 👍 * Adds forgotten colon	2019-05-10 14:23:20 +02:00
Wannaphong Phatthiyaphaibun	5a14a13f64	fix thai bug (#3693) fix tokenize for pythainlp	2019-05-10 14:21:34 +02:00
Luca Dorigo	2663f4133c	Submit contributor agreement (#3705)	2019-05-10 14:19:18 +02:00
Ines Montani	65b55f1aaa	Add version tag to `--base-model` argument (closes #3720)	2019-05-10 14:06:47 +02:00
svlandeg	b6d788064a	some first experiments with different architectures and metrics	2019-05-10 12:53:14 +02:00
svlandeg	9d089c0410	grouping clusters of instances per doc+mention	2019-05-09 18:11:49 +02:00
svlandeg	c6ca8649d7	first stab at model - not functional yet	2019-05-09 17:23:19 +02:00
richardpaulhudson	a1e07f0d14	Request to include Holmes in spaCy Universe (#3685) * Request to add Holmes to spaCy Universe Dear spaCy team, I would be grateful if you would consider my Python library Holmes for inclusion in the spaCy Universe. Holmes transforms the syntactic structures delivered by spaCy into semantic structures that, together with various other techniques including ontological matching and word embeddings, serve as the basis for information extraction. Holmes supports several use cases including chatbot, structured search, topic matching and supervised document classification. I had the basic idea for Holmes around 15 years ago and now spaCy has made it possible to build an implementation that is stable and fast enough to actually be of use - thank you! At present Holmes supports English and German (I am based in Munich) but could easily be extended to support any other language with a spaCy model. * Added	2019-05-08 02:42:03 +02:00
Ines Montani	505c9e0e19	Add util.filter_spans helper (#3686)	2019-05-08 02:33:40 +02:00
svlandeg	9f33732b96	using entity descriptions and article texts as input embedding vectors for training	2019-05-07 16:03:42 +02:00
F0rge1cE	dd1e6b0bc6	Fix offset bug in loading pre-trained word2vec. (#3689) * Fix offset bug in loading pre-trained word2vec. * add contributor agreement	2019-05-06 23:00:38 +02:00

10295 Commits