Miroslav Stampar | 1e7f2d6da2 | Implements #1215 | 2015-04-06 22:07:22 +02:00
Miroslav Stampar | 25b23750e8 | Bug fix for crawling over non-80 port | 2015-03-12 11:49:52 +01:00
Miroslav Stampar | 9f4a32ca2b | Automatically checking for sitemap existence in case of --crawl | 2015-01-20 10:03:35 +01:00
Miroslav Stampar | 45bdefd29b | Update of copyright | 2015-01-06 15:02:16 +01:00
Miroslav Stampar | 605b126758 | Patch for an Issue #976 | 2014-11-26 13:38:21 +01:00
Miroslav Stampar | 1a8b58fca6 | Minor update | 2014-11-20 16:42:06 +01:00
Miroslav Stampar | f8a8cbf9a6 | Storing crawling results to a temporary file (for eventual further processing) | 2014-11-20 16:29:17 +01:00
Miroslav Stampar | 34aed7cde0 | Bug fix (now it's possible to use multiple parsed requests without mixing associated headers) | 2014-10-22 13:49:29 +02:00
Miroslav Stampar | 4e3a4eb0ff | Added a prompt for choosing a number of threads when in crawling mode | 2014-10-10 12:09:08 +02:00
Bernardo Damele | 43a4e85749 | updated copyright | 2014-01-13 17:24:49 +00:00
stamparm | 2bfdac5ebc | Minor update for crawler | 2013-04-30 18:32:46 +02:00
stamparm | ebe8ee3500 | Fix for crawler and redirection case | 2013-04-30 18:08:26 +02:00
stamparm | 3c110b3620 | Minor bug fix | 2013-04-30 16:40:16 +02:00
stamparm | 8c9da95343 | Style and consistency update (url -> URL) | 2013-04-09 11:48:42 +02:00
stamparm | 3948b527dd | Update for an Issue #429 | 2013-04-09 11:36:33 +02:00
stamparm | 91054099aa | Minor style update | 2013-04-09 10:42:58 +02:00
Bernardo Damele | 4b9d8ed673 | reverted a previous commit as not all distributions create a link file /usr/bin/python2 to the Python interpreter | 2013-02-14 11:32:17 +00:00
Bernardo Damele | a67ef4117f | make sure to use Python 2 interpreter when default system Python is version 3 | 2013-02-14 11:25:04 +00:00
Bernardo Damele | a43202f3c0 | updated copyright | 2013-01-18 14:07:51 +00:00
Miroslav Stampar | ca3d35a878 | Some PEP8 related style cleaning | 2013-01-10 13:18:44 +01:00
Miroslav Stampar | bf5544903b | Minor style update | 2013-01-09 16:10:26 +01:00
Miroslav Stampar | 3d4f381ab5 | Patch for an Issue #169 | 2013-01-09 15:22:21 +01:00
Miroslav Stampar | cb91729913 | Fix for an Issue #324 (crawling when HTML is not well-formed) | 2012-12-27 20:55:37 +01:00
Miroslav Stampar | 712cf4e4db | Fix for an Issue #316 | 2012-12-20 20:55:59 +01:00
Miroslav Stampar | c2c4601d6e | Minor restyling | 2012-12-20 11:06:52 +01:00
Miroslav Stampar | 974407396e | Doing some more style updating (capitalization of exception classes; using _ is enough for private members - __ is used in Python specific methods) | 2012-12-06 14:14:19 +01:00
Miroslav Stampar | baccbd6f48 | Implementation for an Issue #283 | 2012-12-06 11:57:57 +01:00
Miroslav Stampar | b6650add46 | Introducing 'new style classes' (idea from Pull request #284) | 2012-12-06 10:42:53 +01:00
Miroslav Stampar | 2de52927f3 | Code refactoring (especially Google search code) | 2012-10-30 18:38:10 +01:00
Miroslav Stampar | 87ecf205cb | More work for Issue #66 | 2012-07-14 17:01:04 +02:00
Bernardo Damele | 162da75a04 | modified homepage address | 2012-07-12 18:38:03 +01:00
jekil | c39e5a85ba | Removed $id$ tags | 2012-06-27 20:56:43 +02:00
Miroslav Stampar | 6c4bd84d18 | minor fix (turning back the functionality of kb.suppressResumeInfo) | 2012-06-25 16:19:51 +00:00
Miroslav Stampar | d2dd47fb23 | some more refactoring | 2012-06-14 13:52:56 +00:00
Miroslav Stampar | b3bd4144f5 | removing of unused imports together with some general code refactoring | 2012-02-22 10:40:11 +00:00
Miroslav Stampar | 95f89ab63a | updating copyright date | 2012-01-11 14:59:46 +00:00
Miroslav Stampar | 29f502fe29 | some refactoring | 2011-12-28 16:27:17 +00:00
Miroslav Stampar | bdc724cb46 | minor bug fix | 2011-12-20 10:34:28 +00:00
Miroslav Stampar | ef987c6954 | adding compatibility support for using --crawl and --forms together | 2011-10-29 09:32:20 +00:00
Miroslav Stampar | f5e45bf113 | quick fix for a bug reported by jovon.itwaru@gmail.com | 2011-07-11 08:54:39 +00:00
Bernardo Damele | aedcf8c8d7 | Changed homepage address | 2011-07-07 20:10:03 +00:00
Miroslav Stampar | e00cf81f7e | minor update | 2011-06-24 19:50:13 +00:00
Miroslav Stampar | e9286ddd5b | fix for a bug reported by g@brindi.si (UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 47: ordinal not in range(128)) | 2011-06-24 19:24:11 +00:00
Miroslav Stampar | eaa2a4202f | changing to: --crawl=CRAWLDEPTH | 2011-06-24 05:40:03 +00:00
Miroslav Stampar | dfc02d8c3c | sorry Bernardo, i hope your mobile is turned off :))) | 2011-06-20 22:47:24 +00:00
Miroslav Stampar | 2a4a284a29 | crawler fix (skip binary files) | 2011-06-20 22:41:38 +00:00
Miroslav Stampar | 20bb1a685b | really minor update | 2011-06-20 21:57:53 +00:00
Miroslav Stampar | 812cd2f19b | minor update | 2011-06-20 21:47:03 +00:00
Miroslav Stampar | e8ac7414f2 | bug fix | 2011-06-20 21:36:15 +00:00
Miroslav Stampar | d6062e8fc9 | minor fix for crawler and far less message overlaps in future | 2011-06-20 21:18:12 +00:00
Miroslav Stampar | 8968c708a0 | minor update | 2011-06-20 14:27:24 +00:00
Miroslav Stampar | 17fac6f67f | minor update | 2011-06-20 13:53:39 +00:00
Miroslav Stampar | 4d1fa5596b | added support for --scope in --crawl mode | 2011-06-20 12:37:51 +00:00
Miroslav Stampar | 42746cc706 | bug fix | 2011-06-20 12:18:46 +00:00
Miroslav Stampar | cda39ca350 | minor update | 2011-06-20 11:46:23 +00:00
Miroslav Stampar | 07e2c72943 | adding Beautifulsoup (BSD) into extras; adding --crawl to options | 2011-06-20 11:32:30 +00:00