spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-09-24 04:56:43 +03:00

Author	SHA1	Message	Date
Matthew Honnibal	a7aa49c419	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2018-05-16 23:20:51 +02:00
Matthew Honnibal	a0b8a26655	Fix missing regex requirement	2018-05-16 23:19:01 +02:00
Matthew Honnibal	74d5c625b3	Use rising beam update prob	2018-05-16 20:11:59 +02:00
Matthew Honnibal	544ae7f1db	Merge branch 'develop' into feature/refactor-parser	2018-05-16 02:06:49 +02:00
Matthew Honnibal	d1b27fe5aa	Revert "Improve dynamic oracle when values are missing in parse" This reverts commit `f56bd4736b`.	2018-05-16 00:31:52 +02:00
Matthew Honnibal	8661218fe8	Refactor parser (#2308) * Work on refactoring greedy parser * Compile updated parser * Fix refactored parser * Update test * Fix refactored parser * Fix refactored parser * Readd beam search after refactor * Fix beam search after refactor * Fix parser * Fix beam parsing * Support oracle segmentation in ud-train CLI command * Avoid relying on final gold check in beam search * Add a keyword argument sink to GoldParse * Bug fixes to beam search after refactor * Avoid importing fused token symbol in ud-run-test, untl that's added * Avoid importing fused token symbol in ud-run-test, untl that's added * Don't modify Token in global scope * Fix error in beam gradient calculation * Default to beam_update_prob 1 * Set a more aggressive threshold on the max violn update * Disable some tests to figure out why CI fails * Disable some tests to figure out why CI fails * Add some diagnostics to travis.yml to try to figure out why build fails * Tell Thinc to link against system blas on Travis * Point thinc to libblas on Travis * Try running sudo=true for travis * Unhack travis.sh * Restore beam_density argument for parser beam * Require thinc 6.11.1.dev16 * Revert hacks to tests * Revert hacks to travis.yml * Update thinc requirement * Fix parser model loading * Fix size limits in training data * Add missing name attribute for parser * Fix appveyor for Windows	2018-05-15 22:17:29 +02:00
Matthew Honnibal	f3790bdeec	Fix appveyor for Windows	2018-05-15 21:16:39 +02:00
Matthew Honnibal	83acaa0358	Add missing name attribute for parser	2018-05-15 19:01:53 +02:00
Matthew Honnibal	f328c195ca	Fix size limits in training data	2018-05-15 19:01:41 +02:00
Matthew Honnibal	8446b35ce0	Fix parser model loading	2018-05-15 18:43:46 +02:00
Matthew Honnibal	dc1a479fbd	Merge branch 'develop' into feature/refactor-parser	2018-05-15 18:39:21 +02:00
Matthew Honnibal	13faf4e1ea	Update thinc requirement	2018-05-15 18:35:11 +02:00
Matthew Honnibal	546dd99cdf	Merge master into develop -- mostly Arabic and website	2018-05-15 18:14:28 +02:00
Matthew Honnibal	e3fdfba164	Revert hacks to travis.yml	2018-05-15 18:00:24 +02:00
Matthew Honnibal	5664ab7e6c	Revert hacks to tests	2018-05-15 18:00:09 +02:00
Matthew Honnibal	4dd1fb3c7b	Require thinc 6.11.1.dev16	2018-05-15 17:56:07 +02:00
Matthew Honnibal	7b9195657b	Restore beam_density argument for parser beam	2018-05-15 17:55:11 +02:00
Matthew Honnibal	581d318971	Fix conftest	2018-05-15 00:54:45 +02:00
Tahar Zanouda	00417794d3	Add Arabic language (#2314) * added support for Arabic lang * added Arabic language support * updated conftest	2018-05-15 00:27:19 +02:00
Jani Monoses	0e08e49e87	Lemmatizer ro (#2319) * Add Romanian lemmatizer lookup table. Adapted from http://www.lexiconista.com/datasets/lemmatization/ by replacing cedillas with commas (ș and ț). The original dataset is licensed under the Open Database License. * Fix one blatant issue in the Romanian lemmatizer * Romanian examples file * Add ro_tokenizer in conftest * Add Romanian lemmatizer test	2018-05-12 15:20:04 +02:00
vishnumenon	ae3719ece5	Fix the code for FACILITIY entities (#2324) * Fix the code for FACILITIY entities As far as I can tell, the default models all use "FAC" rather than "FACILITY" * Added my Contributor Agreement * Rename vishnumenon to vishnumenon.md	2018-05-12 15:19:17 +02:00
Matthew Honnibal	625ee6c464	Unhack travis.sh	2018-05-10 18:16:11 +02:00
Matthew Honnibal	299621b747	Try running sudo=true for travis	2018-05-10 18:11:11 +02:00
Matthew Honnibal	603907926f	Point thinc to libblas on Travis	2018-05-10 18:06:37 +02:00
Matthew Honnibal	1b294f4798	Tell Thinc to link against system blas on Travis	2018-05-10 18:03:44 +02:00
Matthew Honnibal	c261b5b996	Add some diagnostics to travis.yml to try to figure out why build fails	2018-05-10 17:10:44 +02:00
Matthew Honnibal	887631ca25	Disable some tests to figure out why CI fails	2018-05-10 16:42:01 +02:00
Matthew Honnibal	902a172cb7	Disable some tests to figure out why CI fails	2018-05-10 16:30:07 +02:00
Matthew Honnibal	614d45ea58	Set a more aggressive threshold on the max violn update	2018-05-10 15:38:24 +02:00
Matthew Honnibal	8e8724b55b	Default to beam_update_prob 1	2018-05-10 15:38:02 +02:00
Jani Monoses	42b34832e4	Update Romanian stopword list (#2316) * Contributor agreement for janimo * Update Romanian stopword list Include the correct spellings of all the words already in the repo that are using cedillas (ş and ţ) instead of commas (ș and ț). Add another unrelated spelling fix. See https://github.com/stopwords-iso/stopwords-ro/pull/1 and https://github.com/stopwords-iso/stopwords-ro/pull/2	2018-05-10 12:16:56 +02:00
Lucas Abbade	18af53014f	Adding my contributor agreement (#2315) * Create LRAbbade.md * Update LRAbbade.md	2018-05-09 21:25:05 +02:00
Lucas Abbade	be7fdc59d1	Update lex_attrs.py (#2307) * Update lex_attrs.py Fixed spelling mistakes of some numbers (according to Brazilian Portuguese). * Update lex_attrs.py As requested, I've included the correct spelling for both Brazilian Portuguese and Portuguese Portuguese. I will advise however, that the two are separated in the future. Brazilian Portuguese is a very different language from the original one, although most of the writing is unified, the way people talk in both countries is radically different. Keeping both languages as one may lead to bigger issues in the future, especially when it comes to spell checking.	2018-05-09 20:49:31 +02:00
mauryaland	5368ba028a	Update stop_words.py for French language (#2310) * Add contraction forms of some common stopwords All the stopwords added contain the apostrophe " ' " or " ’ ". * Adds contributor agreement mauryaland * Update mauryaland.md	2018-05-09 12:04:38 +02:00
Matthew Honnibal	a61fd60681	Fix error in beam gradient calculation	2018-05-09 02:44:09 +02:00
Matthew Honnibal	a6ae1ee6f7	Don't modify Token in global scope	2018-05-09 00:43:00 +02:00
Matthew Honnibal	f94f721f40	Avoid importing fused token symbol in ud-run-test, untl that's added	2018-05-09 00:28:03 +02:00
Matthew Honnibal	659ec5b975	Avoid importing fused token symbol in ud-run-test, untl that's added	2018-05-08 19:40:33 +02:00
Matthew Honnibal	4cb0494bef	Bug fixes to beam search after refactor	2018-05-08 13:48:50 +02:00
Matthew Honnibal	5ed71973b3	Add a keyword argument sink to GoldParse	2018-05-08 13:48:32 +02:00
Matthew Honnibal	8cfe326f87	Avoid relying on final gold check in beam search	2018-05-08 13:48:19 +02:00
Matthew Honnibal	fc4dd49b77	Support oracle segmentation in ud-train CLI command	2018-05-08 13:47:45 +02:00
Matthew Honnibal	c49e44349a	Fix beam parsing	2018-05-08 02:53:24 +02:00
Matthew Honnibal	99649d114d	Fix parser	2018-05-08 00:27:26 +02:00
Matthew Honnibal	8a82367a9d	Fix beam search after refactor	2018-05-08 00:20:33 +02:00
Matthew Honnibal	5a0f26be0c	Readd beam search after refactor	2018-05-08 00:19:52 +02:00
ines	7a3599c21a	Fix formatting and consistency	2018-05-07 23:02:11 +02:00
ines	37facf9b4d	Add config for no-response [ci skip]	2018-05-07 22:04:54 +02:00
ines	ac25bc4016	Add docs section on sentence segmentation [ci skip]	2018-05-07 21:25:20 +02:00
ines	14148cd147	Fix formatting and wording	2018-05-07 21:24:35 +02:00

9030 Commits