Commit Graph

73 Commits

Author SHA1 Message Date
Miroslav Stampar
354a2ce249 'chardet' heuristic engine added to the project 2011-04-18 13:38:46 +00:00
Miroslav Stampar
5e70eac98c fix for a "popular" typo 'iso-5889-1' reported by David Guimaraes 2011-04-16 06:44:29 +00:00
Miroslav Stampar
0387654166 update of copyright string (until year) 2011-04-15 12:33:18 +00:00
Miroslav Stampar
265fa52600 minor code cosmetics 2011-04-04 18:24:16 +00:00
Miroslav Stampar
018b6b9430 fix for a charset encoding reported by Kirill 2011-04-04 18:20:09 +00:00
Bernardo Damele
c3b54cc222 Cosmetics 2011-04-01 16:40:28 +00:00
Miroslav Stampar
557ed7d665 minor fix for a invalid charset reported by Kirill 2011-03-31 14:39:01 +00:00
Miroslav Stampar
762397854e fix for a bug reported by Kirill (unknown charset '8859-1') 2011-03-24 09:27:19 +00:00
Miroslav Stampar
d79fae724c minor refactoring 2011-03-24 09:16:21 +00:00
Miroslav Stampar
cbfb10cbd1 fix of a minor bug reported by syssecurity7@googlemail.com (missing iso-8858...) 2011-03-21 16:43:46 +00:00
Miroslav Stampar
154d947c62 minor update 2011-03-07 10:15:41 +00:00
Miroslav Stampar
17c39fe231 fix for that non-HTML stuff 2011-02-22 11:32:55 +00:00
Miroslav Stampar
535eb9f3eb implementation of referer feature 2011-02-11 23:07:03 +00:00
Miroslav Stampar
ddf23ba7cc refactoring 2011-01-30 11:36:03 +00:00
Miroslav Stampar
b98cbeee04 page for handling binary files 2011-01-27 22:00:34 +00:00
Miroslav Stampar
f6f4b5e9dd bug fix for charset used in inference for pages retrieved with --null-connection 2011-01-20 11:01:01 +00:00
Miroslav Stampar
041abb56e2 you can't believe how much man can learn when having good testing points 2011-01-17 13:59:22 +00:00
Miroslav Stampar
34d13be0d3 minor update regarding default page encoding 2011-01-17 10:23:37 +00:00
Bernardo Damele
1c86ec374e Code refactoring and cosmetics 2011-01-07 15:41:09 +00:00
Miroslav Stampar
aa81ed4033 implementation of a feature suggested by pan@knownsec.com (usage of charset type from http-equiv attribute in case when charset is not defined in headers) 2011-01-04 15:49:20 +00:00
Miroslav Stampar
eb11f5b2e0 minor update 2011-01-04 13:07:12 +00:00
Miroslav Stampar
c1dc73d0a1 minor, just in case update related to the previous commit 2011-01-04 12:56:55 +00:00
Miroslav Stampar
709a7d156b fix for a bug reported by shaohua pan (UnicodeDecodeError: 'ascii' codec can't decode...) 2011-01-04 12:51:51 +00:00
Miroslav Stampar
08ccbf2c1e important fix for a bug reported by x <deep_freeze@mail.ru> (along with normal fixes, getUnicode now uses kb.pageEncoding) 2011-01-03 22:02:58 +00:00
Miroslav Stampar
d1f5c1d7b7 now when we "decode page" based on a charset, sanitizeAsciiString only brings unneeded filtering 2010-12-29 15:10:42 +00:00
Miroslav Stampar
b472b96f92 bug fix, refactoring and improved extractErrorMessage capabilities 2010-12-25 10:16:20 +00:00
Bernardo Damele
5d37df6104 Ugly code to set the cookies when got them from a 302 redirect too 2010-12-03 17:41:10 +00:00
Miroslav Stampar
5abbea4a9f fix for a bug reported by nightman (unknown charset 'null') 2010-11-17 09:57:32 +00:00
Miroslav Stampar
fda8752dca revert of some HTTP headers handling 2010-11-08 13:26:45 +00:00
Bernardo Damele
78d7b17483 More replacements for refactoring.
Minor layout adjustments.
Alignment of conffile/optiondict/cmdline parameters.
2010-11-08 12:36:48 +00:00
Bernardo Damele
4d81da6bc8 Cosmetics 2010-11-07 16:23:03 +00:00
Miroslav Stampar
f1f7e0bfe0 fix for "unknown charset 'en_us'" (reported by ToR) 2010-11-04 13:56:01 +00:00
Miroslav Stampar
6adee3792a removed all trailing spaces from blank lines 2010-11-03 10:08:27 +00:00
Miroslav Stampar
861706fb31 fix for bug reported by ToR (unknown charset 'utf-8, text/html') 2010-11-02 18:01:10 +00:00
Bernardo Damele
3eda4510e2 Properly encode the cookie 2010-10-31 11:26:33 +00:00
Miroslav Stampar
5a38ac7ea9 important update regarding (Bug #209) - probably more will be needed 2010-10-29 16:11:50 +00:00
Miroslav Stampar
4f7f20b94f sorry, cosmetics 2010-10-14 23:18:29 +00:00
Bernardo Damele
1674142d82 Minor cosmetic fixes 2010-10-14 15:28:54 +00:00
Miroslav Stampar
8b48833136 large commit with copyright header modifications 2010-10-14 14:41:14 +00:00
Miroslav Stampar
53289c6a42 fix for bug reported by Marek Sarvas (unicode) 2010-09-09 14:03:45 +00:00
Miroslav Stampar
6a6ff09c9a fix for a bug reported by Marek Sarvas 2010-07-26 08:11:28 +00:00
Miroslav Stampar
48a67d6d51 fix for "unknown charset 'windows-874'" reported by Phat R. 2010-07-15 08:44:42 +00:00
Miroslav Stampar
0d08903bc3 some charset fix up 2010-06-30 12:09:33 +00:00
Bernardo Damele
9bce22683b Minor bug fix and adjustment to deal with Keep-Alive also against Google (-g) 2010-06-11 10:08:19 +00:00
Miroslav Stampar
36953221f8 few quick changes 2010-06-10 11:34:17 +00:00
Miroslav Stampar
eaef068c90 major bug fix (different HTTP content charsets are now properly handled) 2010-06-09 14:40:36 +00:00
Miroslav Stampar
12a5ec9f3d more unicode refactoring 2010-06-02 12:45:40 +00:00
Miroslav Stampar
ac6ce478a0 just removing unneded and possible future source of confusion 2010-05-28 14:19:12 +00:00
Miroslav Stampar
a3db3c03c1 str() -> unicode() 2010-05-28 13:05:02 +00:00
Miroslav Stampar
5d5ebd49b6 introducing regex caching mechanism 2010-05-21 14:42:59 +00:00