Commit Graph

24 Commits

Author SHA1 Message Date
Lonami Exo
59a1a6aef2 Stop working with bytes on the markdown parser 2018-01-07 16:19:41 +01:00
Lonami Exo
ec4ca5dbfc More consistent with asyncio branch (style/small fixes)
Like passing an extra (invalid) dt parameter when serializing
a datetime, and handling more errors in the TcpClient class.
2018-01-05 18:31:48 +01:00
Lonami Exo
605c103f29 Add unparse markdown method 2017-11-26 17:16:59 +01:00
Lonami Exo
57a70d0d47 Document the extensions/ module 2017-11-26 17:14:28 +01:00
Lonami Exo
9767774147 Fix import in markdown parser not being relative 2017-11-17 15:57:48 +01:00
Lonami Exo
346c5bb303 Add method to md parser to extract text surrounded by entities 2017-11-16 19:13:13 +01:00
Lonami Exo
e5deaf5db8 Fix c4e07cf, md parsing adding unfinished entity at wrong offset 2017-11-16 19:07:53 +01:00
Lonami Exo
c4e07cff57 Fix unfinished markdown delimiters being stripped away 2017-11-10 11:44:27 +01:00
Lonami Exo
cb3f20db65 Clean up markdown parsing since tuples aren't used anymore 2017-11-10 11:41:49 +01:00
Lonami Exo
7d75eebdab Make markdown parser use only Telegram's MessageEntity's 2017-11-10 11:07:36 +01:00
Lonami Exo
83af705cc8 Add more comments to the markdown parser 2017-11-06 11:32:40 +01:00
Lonami Exo
3a2c3a9497 Fix URL regex for markdown was greedy (fix-up) 2017-11-06 11:22:58 +01:00
Lonami Exo
07ece83aba Fix overlapping markdown entities being skipped 2017-11-06 10:37:22 +01:00
Lonami Exo
4f80429215 Work on byte level when parsing markdown
Reasoning: instead of encoding every character one by one as we
encounter them and using half their length as the correct offset,
we can simply encode the whole string at once as utf-16le and
work with that directly.
2017-11-06 10:29:32 +01:00
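
The reasoning in 4f80429215 can be illustrated with a minimal sketch. It is not the parser's actual code: the names `utf16_len`, `find_aligned` and `first_bold_span` are hypothetical, and only the `**` delimiter is handled. It assumes, as Telegram's MessageEntity does, that offsets and lengths are counted in UTF-16 code units, so a byte position in the utf-16le encoding divided by two gives the offset directly. Note that the later commit 59a1a6aef2 in this log moves the parser away from bytes again.

```python
def utf16_len(text: str) -> int:
    """Length of `text` in UTF-16 code units (not Python characters)."""
    return len(text.encode('utf-16-le')) // 2


def find_aligned(raw: bytes, needle: bytes, start: int = 0) -> int:
    """Like bytes.find, but only accept matches on a 2-byte boundary.

    A match at an odd byte position would straddle two code units,
    so it is skipped and the search continues.
    """
    pos = raw.find(needle, start)
    while pos != -1 and pos % 2:
        pos = raw.find(needle, pos + 1)
    return pos


def first_bold_span(text: str):
    """Hypothetical helper: strip the first **...** pair from `text`.

    Returns (stripped_text, offset, length), with offset and length in
    UTF-16 code units, or None if there is no complete pair.
    """
    raw = text.encode('utf-16-le')          # encode the whole string once
    delim = '**'.encode('utf-16-le')

    open_at = find_aligned(raw, delim)
    if open_at == -1:
        return None
    close_at = find_aligned(raw, delim, open_at + len(delim))
    if close_at == -1:
        return None                         # unfinished delimiter: leave as-is

    inner = raw[open_at + len(delim):close_at]
    stripped = raw[:open_at] + inner + raw[close_at + len(delim):]
    # Byte positions are halved to get UTF-16 code unit counts.
    return stripped.decode('utf-16-le'), open_at // 2, len(inner) // 2


if __name__ == '__main__':
    # The emoji takes two UTF-16 code units, so the entity starts at 3,
    # whereas indexing by Python characters would give 2.
    print(first_bold_span('\U0001F600 **hi** there'))
```

Working on the encoded bytes avoids tracking a separate running offset for characters outside the Basic Multilingual Plane, at the cost of having to keep every slice aligned to two bytes.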
Viktor Oreshkin
49eb281251 Proper offset calculation for markdown (#407)
Dan suca
If Dan shared it with Traitor I'll not have to spend my time on this
Not a, sorry for not letting you sleep
k thx bye
Will this stay in history?
2017-11-06 00:17:22 +01:00
Lonami Exo
82cac4836c Fix markdown URL parsing using character index instead of offset 2017-10-30 11:15:53 +01:00
Lonami Exo
0a14aa1bc6 Remove additional check when calculating emoji length
This special check treated some emojis as 3 characters long, but
this shouldn't actually have been done. It was likely due to the old
regex matching more things as emoji than it should (which would
have counted as 2 too, making up for 1+3 from the new is_emoji()).
2017-10-30 10:56:39 +01:00
Lonami Exo
2609bd9bd1 Use constants and allow empty URL regex when parsing markdown 2017-10-29 18:21:21 +01:00
Lonami Exo
d47a9f83d0 Fix some special cases which are not treated as emojis (offset 1) 2017-10-29 17:07:37 +01:00
Lonami Exo
bcaa8007a3 Fix inline URL matching swallowing all parse entities 2017-10-29 16:43:30 +01:00
Lonami Exo
f5fafc6a27 Enhance emoji detection 2017-10-29 16:41:30 +01:00
Lonami Exo
368269cb11 Add ability to parse inline URLs 2017-10-29 16:33:10 +01:00
Lonami Exo
9600a9ea0b Fix markdown parsing failing if delimiter was last character 2017-10-28 19:17:18 +02:00
Lonami Exo
5adec2e1ab Initial attempt at parsing Markdown-like syntax 2017-10-28 19:06:41 +02:00