Commit Graph

  • 832896ea6c * Add html to gazetteer Matthew Honnibal 2015-08-06 16:36:54 +0200
  • 5c3c962038 * Add html to gazetteer Matthew Honnibal 2015-08-06 16:34:51 +0200
  • 10d869d102 * Don't allow conjunction between NPs in base NP chunks Matthew Honnibal 2015-08-06 16:31:53 +0200
  • 8b8df851ca * Fix print statement in test_merge Matthew Honnibal 2015-08-06 16:28:31 +0200
  • 383dfabd67 * Fix matcher setting of entities Matthew Honnibal 2015-08-06 16:27:01 +0200
  • 91a94e152b * Make initial gazetteer Matthew Honnibal 2015-08-06 16:10:04 +0200
  • 2767979135 * Update matcher tests Matthew Honnibal 2015-08-06 16:09:28 +0200
  • 59c3bf60a6 * Ensure entity recognizer doesn't over-write preset types Matthew Honnibal 2015-08-06 16:09:08 +0200
  • cd7d1682cd * Fix loading of gazetteer.json file Matthew Honnibal 2015-08-06 16:08:25 +0200
  • 9c667b7f15 * Set a value in attrs.pxd on the first flag, to reduce bugs Matthew Honnibal 2015-08-06 16:08:04 +0200
  • c263577424 * Fix lower attribute in lexeme.pxd Matthew Honnibal 2015-08-06 16:07:41 +0200
  • 3ecacb9635 * Copy gazetteer file in init_model Matthew Honnibal 2015-08-06 16:07:23 +0200
  • faf75dfcb9 * Update matcher tests Matthew Honnibal 2015-08-06 14:33:35 +0200
  • 5737115e1e * Work on gazetteer matching Matthew Honnibal 2015-08-06 14:33:21 +0200
  • 9c1724ecae * Gazetteer stuff working, now need to wire up to API Matthew Honnibal 2015-08-06 00:35:40 +0200
  • 47db3067a0 * Compile spacy.matcher Matthew Honnibal 2015-08-05 23:48:11 +0200
  • 5bc0e83f9a * Reimplement matching in Cython, instead of Python. Matthew Honnibal 2015-08-05 01:05:54 +0200
  • 4c87a696b3 * Add draft dfa matcher, in Python. Passing tests. Matthew Honnibal 2015-08-04 15:55:28 +0200
  • eb7138c761 * Add attr relation in base NP detection Matthew Honnibal 2015-08-01 00:34:40 +0200
  • 4988356cf0 * Fix dependency type bug from merged tokens Matthew Honnibal 2015-08-01 00:33:24 +0200
  • af84669306 * Add smart-quote possessive marker to tokenizer Matthew Honnibal 2015-07-30 05:12:48 +0200
  • 78a9068319 * Fix spacy attr on merged tokens Matthew Honnibal 2015-07-30 04:25:58 +0200
  • 430e2edb96 * Fix noun_chunks issue Matthew Honnibal 2015-07-30 03:51:50 +0200
  • 9590968fc1 * Fix negative indices in Span Matthew Honnibal 2015-07-30 02:30:24 +0200
  • 74d8cb3980 * Add noun_chunks iterator, and fix left/right child setting in Doc.merge Matthew Honnibal 2015-07-30 02:29:49 +0200
  • d153f18969 * Fix negative indices on spans Matthew Honnibal 2015-07-29 22:36:03 +0200
  • 2bcb58456d * Workon docs for v0.89 Matthew Honnibal 2015-07-29 22:34:10 +0200
  • e7a4ee21e0 Merge fc0780600e into bb910cff92 Kyle McDonald 2015-07-29 18:16:11 +0000
  • 320836e346 * Move string description further down for token, and highlght that it includes trailing whitespace Matthew Honnibal 2015-07-28 21:05:08 +0200
  • d17a15ae66 * Add test to check parse is being deserialized properly Matthew Honnibal 2015-07-28 21:04:00 +0200
  • b5132bed7d * Set left and right children when loading parse from byte string Matthew Honnibal 2015-07-28 21:03:18 +0200
  • 6609fcf4b2 * Make mem and vocab python-visible in Doc Matthew Honnibal 2015-07-28 20:46:59 +0200
  • d42fe2e694 * Add unicode_literals to strings.pyx Matthew Honnibal 2015-07-28 16:15:53 +0200
  • bb910cff92 * Fix Python3 problem in align_raw Matthew Honnibal 2015-07-28 16:06:53 +0200
  • dcafb181b9 * Fix Python3 problem in align_raw Matthew Honnibal 2015-07-28 15:52:10 +0200
  • c609ea18f0 * Increment version in download script Matthew Honnibal 2015-07-28 15:22:17 +0200
  • 9c4d0aae62 * Switch to better Python2/3 compatible unicode handling Matthew Honnibal 2015-07-28 14:45:37 +0200
  • 7606d9936f * Python3 correction for GoldParse Matthew Honnibal 2015-07-28 14:44:53 +0200
  • ddc1a5cfe5 * Fix training under python3 Matthew Honnibal 2015-07-28 14:09:30 +0200
  • a8bbd7312c * Hackishly patch long dependencies problem Matthew Honnibal 2015-07-28 00:14:29 +0200
  • bb583f7f09 * Hackishly patch long dependencies problem Matthew Honnibal 2015-07-27 23:14:33 +0200
  • b96bf9b8cc Merge branch 'master' of ssh://github.com/honnibal/spaCy Matthew Honnibal 2015-07-27 22:57:48 +0200
  • aa7a964a4f * Add a type declaration for doc.from_array Matthew Honnibal 2015-07-27 22:57:22 +0200
  • 9034f8a1cf * Update test_docs Matthew Honnibal 2015-07-27 22:15:19 +0200
  • 25a8774f42 * Fix regression in packer Matthew Honnibal 2015-07-27 21:53:38 +0200
  • 174ed1ad20 * Tighten the frequency filter in init_model Matthew Honnibal 2015-07-27 21:44:51 +0200
  • 1601e488ee * Fix bug in decoding non-ascii characters Matthew Honnibal 2015-07-27 21:43:58 +0200
  • 6deb1e84b6 * Upd serialization tests Matthew Honnibal 2015-07-27 21:25:48 +0200
  • 6a95409cd2 * Fix type on bits Matthew Honnibal 2015-07-27 21:16:49 +0200
  • a296d72b54 * Fix en/attrs Matthew Honnibal 2015-07-27 21:16:33 +0200
  • 45460f505c * Fix data type on read32 in BitArray Matthew Honnibal 2015-07-27 21:12:13 +0200
  • 3d43f49f69 * Revert prev change Matthew Honnibal 2015-07-27 10:58:15 +0200
  • 6b586cdad4 * Change lexemes.bin format. Add a header specifying size of LexemeC and number of lexemes, and don't have the redundant orth information. Matthew Honnibal 2015-07-27 08:31:51 +0200
  • 6047f2aa35 * Fix path to freqs.txt Matthew Honnibal 2015-07-27 02:22:35 +0200
  • 4a0f40ec2d * Ensure data is packaged in vocab Matthew Honnibal 2015-07-27 02:14:36 +0200
  • af6ed18f2a * Ensure we don't use orth_encode on OOV words. Matthew Honnibal 2015-07-27 02:12:01 +0200
  • 912511f0aa * Update prebuild command, for shell bug Matthew Honnibal 2015-07-27 01:52:04 +0200
  • b532f4eaa2 * Ensure serialize is packaged. Matthew Honnibal 2015-07-27 01:51:37 +0200
  • 8535d872e8 * Set is_oov property in get_flags Matthew Honnibal 2015-07-27 01:51:24 +0200
  • 0f4d0d51ab * Test is_oov property Matthew Honnibal 2015-07-27 01:50:34 +0200
  • 8e4c69ee8c * Add is_oov property, and fix up handling of attributes Matthew Honnibal 2015-07-27 01:50:06 +0200
  • fc268f03eb * Assert against null pointer exceptions in vocab Matthew Honnibal 2015-07-27 01:00:10 +0200
  • 2b5cde87fd * Add prebuild command, to test clean builds Matthew Honnibal 2015-07-26 22:40:04 +0200
  • 0368889d6c * Support gzipped frequencies in init_model Matthew Honnibal 2015-07-26 22:39:22 +0200
  • 62da5eb338 * Inc version Matthew Honnibal 2015-07-26 22:22:54 +0200
  • b997b1122b * Mark test_io as requiring the model Matthew Honnibal 2015-07-26 21:36:22 +0200
  • 0f093fdb30 * Fix get_by_orth for py3 Matthew Honnibal 2015-07-26 19:26:41 +0200
  • ceeda5a739 * Fix get_by_orth for py3 Matthew Honnibal 2015-07-26 18:39:27 +0200
  • 5c9b8d05e4 * Upd test_docs Matthew Honnibal 2015-07-26 17:41:13 +0200
  • 609f729cc5 * Fix infix test Matthew Honnibal 2015-07-26 17:32:55 +0200
  • 3cfe3d8c1c * Revert bad infix change Matthew Honnibal 2015-07-26 17:32:37 +0200
  • 460b4c3207 * Add more infix tests Matthew Honnibal 2015-07-26 17:30:34 +0200
  • bd608559bc * Fix infix-period tokenization Matthew Honnibal 2015-07-26 17:14:52 +0200
  • 94f314c271 * Fix tokenization of email addresses. Matthew Honnibal 2015-07-26 16:38:08 +0200
  • 48a4d15264 * Test token properties Matthew Honnibal 2015-07-26 16:37:39 +0200
  • 6bb96c122d * Host IS_ flags in attrs.pxd, and add properties for them on Token and Lexeme objects Matthew Honnibal 2015-07-26 16:37:16 +0200
  • eeaea25f0c * Check oov_prob file is present Matthew Honnibal 2015-07-26 16:36:38 +0200
  • 847c08e411 * Unhack serialization api tests Matthew Honnibal 2015-07-26 16:23:41 +0200
  • c4f20847da * Fix init_model for travis tests Matthew Honnibal 2015-07-26 14:03:30 +0200
  • 09312b9353 * Fix init_model for travis tests Matthew Honnibal 2015-07-26 13:55:47 +0200
  • 3a4c2a3276 * Update doctests Matthew Honnibal 2015-07-26 13:04:18 +0200
  • 2b2032d1a0 * Update doctests Matthew Honnibal 2015-07-26 12:57:59 +0200
  • 90ad717dc4 * Update default freq thresholds in init_model Matthew Honnibal 2015-07-26 01:41:17 +0200
  • 6c01e01f12 * Fix some casing problems in specials.json Matthew Honnibal 2015-07-26 01:38:29 +0200
  • 6a5e035a48 * Ensure data files are copied for tokenizer in init_model Matthew Honnibal 2015-07-26 01:36:19 +0200
  • ab93898ac6 * Make heuristics more explicit in init_model Matthew Honnibal 2015-07-26 00:22:19 +0200
  • 7eb2446082 * Return empty lexeme on empty string Matthew Honnibal 2015-07-26 00:18:30 +0200
  • 1b5d1da2a7 * Allow an OOV probability to be specified in get_lex_props Matthew Honnibal 2015-07-26 00:03:43 +0200
  • cd6e25132b * Allow an OOV probability to be specified in get_lex_props Matthew Honnibal 2015-07-26 00:01:46 +0200
  • 5c04dcd7c1 * Fix init_model Matthew Honnibal 2015-07-25 23:33:02 +0200
  • fd525f0675 * Pass OOV probability around Matthew Honnibal 2015-07-25 23:29:51 +0200
  • 5b6bf4d4a6 * Remove probability cap on lexicon Matthew Honnibal 2015-07-25 23:05:51 +0200
  • c62eb110c0 * Fix merge conflict in init_model Matthew Honnibal 2015-07-25 23:04:30 +0200
  • 0301472d15 * Fix init_model Matthew Honnibal 2015-07-25 22:56:35 +0200
  • 3fe14b8ed6 * Fix CFile for Python2 Matthew Honnibal 2015-07-25 22:55:53 +0200
  • 8e800adfbc * Fix init_model Matthew Honnibal 2015-07-25 22:54:08 +0200
  • 5f183098e4 Merge branch 'master' of ssh://github.com/honnibal/spaCy Matthew Honnibal 2015-07-25 22:37:04 +0200
  • 65f3ce6c52 * Require preshed 0.41 Matthew Honnibal 2015-07-25 22:36:43 +0200
  • 6076213c16 * Fix init_model script Matthew Honnibal 2015-07-25 22:35:52 +0200
  • 1a99eb69da Merge branch 'master' of https://github.com/honnibal/spaCy Matthew Honnibal 2015-07-25 22:19:48 +0200