Lonami Exo
83d9d1d78e
Fix markdown parser not inverting delimiters dict
2018-02-16 20:30:19 +01:00
Lonami Exo
75d99fbb53
Fix HTML entity parsing failing when needing surrogates
2018-02-15 11:52:46 +01:00
Lonami Exo
59a1a6aef2
Stop working with bytes on the markdown parser
2018-01-07 16:19:41 +01:00
Lonami Exo
ec4ca5dbfc
More consistent with asyncio branch (style/small fixes)
Like passing an extra (invalid) dt parameter when serializing
a datetime, and handling more errors in the TcpClient class.
2018-01-05 18:31:48 +01:00
Lonami Exo
605c103f29
Add unparse markdown method
2017-11-26 17:16:59 +01:00
Lonami Exo
57a70d0d47
Document the extensions/ module
2017-11-26 17:14:28 +01:00
Lonami Exo
9767774147
Fix import in markdown parser not being relative
2017-11-17 15:57:48 +01:00
Lonami Exo
346c5bb303
Add method to md parser to extract text surrounded by entities
2017-11-16 19:13:13 +01:00
Lonami Exo
e5deaf5db8
Fix c4e07cf, md parsing adding unfinished entity at wrong offset
2017-11-16 19:07:53 +01:00
Lonami Exo
c4e07cff57
Fix unfinished markdown delimiters being stripped away
2017-11-10 11:44:27 +01:00
Lonami Exo
cb3f20db65
Clean up markdown parsing since tuples aren't used anymore
2017-11-10 11:41:49 +01:00
Lonami Exo
7d75eebdab
Make markdown parser use only Telegram's MessageEntity's
2017-11-10 11:07:36 +01:00
Lonami Exo
83af705cc8
Add more comments to the markdown parser
2017-11-06 11:32:40 +01:00
Lonami Exo
3a2c3a9497
Fix URL regex for markdown being greedy (fix-up)
2017-11-06 11:22:58 +01:00
Lonami Exo
07ece83aba
Fix overlapping markdown entities being skipped
2017-11-06 10:37:22 +01:00
Lonami Exo
4f80429215
Work on byte level when parsing markdown
Reasoning: instead of encoding every character one by one as we
encounter them to use half their length as the correct offset,
we can simply encode the whole string at once as utf-16le and
work with that directly.
2017-11-06 10:29:32 +01:00
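The reasoning in the commit above can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper name find_utf16 is hypothetical. It assumes that offsets are counted in UTF-16 code units, so the string is encoded once as utf-16le and scanned two bytes at a time.

```python
def find_utf16(text: str, needle: str) -> int:
    """Return the offset of needle in text, in UTF-16 code units, or -1.

    Hypothetical helper for illustration only.
    """
    # Encode the whole string once instead of character by character.
    data = text.encode('utf-16-le')
    target = needle.encode('utf-16-le')
    # Step two bytes at a time so matches stay aligned to code units.
    for i in range(0, len(data) - len(target) + 1, 2):
        if data[i:i + len(target)] == target:
            return i // 2  # two bytes per code unit
    return -1


text = '\U0001F600 **bold**'
# The emoji is one Python character but two UTF-16 code units,
# so the two results differ by one.
print(text.index('**'), find_utf16(text, '**'))
```

The character index and the code-unit offset only diverge when the text contains characters outside the Basic Multilingual Plane, which is why emoji are the typical trigger for offset bugs.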
Viktor Oreshkin
49eb281251
Proper offset calculation for markdown (#407)
Dan suca
If Dan shared it with Traitor I'll not have to spend my time on this
Not a, sorry for not letting you sleep
k thx bye
Will this stay in history?
2017-11-06 00:17:22 +01:00
Lonami Exo
82cac4836c
Fix markdown URL parsing using character index instead of offset
2017-10-30 11:15:53 +01:00
Lonami Exo
0a14aa1bc6
Remove additional check when calculating emoji length
This special check treated some emoji as 3 characters long, but
this shouldn't actually have been done. It was likely due to the old
regex matching more things as emoji than it should (which would
have counted as 2 too, making up for 1+3 from the new is_emoji()).
2017-10-30 10:56:39 +01:00
Lonami Exo
2609bd9bd1
Use constants and allow empty URL regex when parsing markdown
2017-10-29 18:21:21 +01:00
Lonami Exo
d47a9f83d0
Fix some special cases which are not treated as emojis (offset 1)
2017-10-29 17:07:37 +01:00
Lonami Exo
bcaa8007a3
Fix inline URL matching swallowing all parse entities
2017-10-29 16:43:30 +01:00
Lonami Exo
f5fafc6a27
Enhance emoji detection
2017-10-29 16:41:30 +01:00
Lonami Exo
368269cb11
Add ability to parse inline URLs
2017-10-29 16:33:10 +01:00
Lonami Exo
9600a9ea0b
Fix markdown parsing failing if delimiter was last character
2017-10-28 19:17:18 +02:00
Lonami Exo
5adec2e1ab
Initial attempt at parsing Markdown-like syntax
2017-10-28 19:06:41 +02:00