Alexander | d9cc462106 | Merge branch 'simd/rgba-convert' into simd/5.3.x | 2018-10-17 14:52:37 +03:00
Alexander | b646ac278f | Merge branch 'simd/resample' into simd/5.3.x | 2018-10-17 14:52:32 +03:00
Alexander | dd99b65d78 | Merge branch 'simd/filters' into simd/5.3.x | 2018-10-17 14:52:26 +03:00
Alexander | 32f3dff6f5 | Merge branch 'simd/box-blur' into simd/5.3.x | 2018-10-17 14:51:58 +03:00
Alexander | 0fc8680360 | Merge branch 'simd/alpha-composite' into simd/5.3.x | 2018-10-17 14:51:45 +03:00
Alexander | 8c38010f7d | Speedup other 2L convertions | 2018-10-05 13:57:28 +03:00
Alexander | 87385595ce | RGB → L 2.2 times faster | 2018-10-05 13:57:28 +03:00
Alexander | ca27f8197b | fix rounding and speedup a bit | 2018-10-05 13:57:28 +03:00
Alexander | 7c7d7018b1 | use 16bit arithmetics | 2018-10-05 13:57:28 +03:00
Alexander | 7f2b368e85 | sse4 version (still 1.4x faster than previous avx2 implementation) | 2018-10-05 13:57:28 +03:00
Alexander | 89ddb0d95a | use float div instead of gather | 2018-10-05 13:57:28 +03:00
homm | a92659f65c | fix RGBa → RGBA conversion on AVX2 | 2018-10-05 13:57:28 +03:00
homm | a880dd08e9 | RGBa → RGBA convert using gather | 2018-10-05 13:57:28 +03:00
homm | 880fede485 | avx2 implementation | 2018-10-05 13:57:28 +03:00
homm | 096aaa1e6c | faster implementation | 2018-10-05 13:57:28 +03:00
homm | fdef92c60a | sse4 implementation | 2018-10-05 13:57:28 +03:00
Alexander | adc2e0302d | move files | 2018-10-05 13:55:10 +03:00
Alexander | ef1692649d | add parentheses around var declarations | 2018-10-05 13:55:10 +03:00
Alexander | 80a64c013e | optimize coefficients loading for horizontal pass; wtf is xmax / 2; optimize coefficients loading for vertical pass | 2018-10-05 13:55:10 +03:00
homm | b7b3b26483 | SIMD resample: unrolled SSE4 & AVX2 | 2018-10-05 13:55:10 +03:00
Alexander | ff5ed4f6d5 | move files | 2018-10-05 13:52:49 +03:00
Alexander | 1713b71c0a | fix memory access for: 3x3f_u8, 3x3i_4u8, 5x5i_4u8 | 2018-10-05 13:52:49 +03:00
Alexander | 94ea64c416 | 5x5i_4u8 AVX2 | 2018-10-05 13:52:49 +03:00
Alexander | 3da294ca21 | advanced 5x5i_4u8 SSE4 | 2018-10-05 13:52:49 +03:00
Alexander | 96b367c571 | 5x5i_4u8 SSE4 | 2018-10-05 13:52:49 +03:00
Alexander | 7bd48c8f63 | finish 3x3i_4u8 | 2018-10-05 13:52:49 +03:00
Alexander | cb68d00256 | avx2 version | 2018-10-05 13:52:49 +03:00
Alexander | 3b7b833f45 | rearrange operations | 2018-10-05 13:52:49 +03:00
Alexander | c4085db81e | reduce number of registers | 2018-10-05 13:52:49 +03:00
Alexander | 0b3550c24f | Rearrange instruction for speedup | 2018-10-05 13:52:49 +03:00
Alexander | e4c9528d55 | better loading | 2018-10-05 13:52:49 +03:00
Alexander | 8695387f05 | better macros | 2018-10-05 13:52:49 +03:00
Alexander | 3e8574ae26 | 3x3i | 2018-10-05 13:52:49 +03:00
Alexander | 44c56befbd | move ImagingFilterxxx functions to separate files | 2018-10-05 13:52:49 +03:00
Alexander | 98bed5abae | fix offset | 2018-10-05 13:52:49 +03:00
Alexander | db69139906 | 5x5 single channel SSE4 (tests failed) | 2018-10-05 13:52:49 +03:00
Alexander | cdde46ae17 | consider last pixel in AVX | 2018-10-05 13:52:49 +03:00
Alexander | 5ca47243f8 | unroll AVX (with no profit) | 2018-10-05 13:52:49 +03:00
Alexander | c30554ca64 | Macros for AVX | 2018-10-05 13:52:49 +03:00
Alexander | 0d36fd05ee | unroll AVX 2 times | 2018-10-05 13:52:49 +03:00
Alexander | 3c3623265c | First AVX try | 2018-10-05 13:52:49 +03:00
Alexander | ee7158d8d5 | 3x3 SSE4 singleband: 2 lines | 2018-10-05 13:52:49 +03:00
Alexander | 9966e832e0 | reuse loaded values | 2018-10-05 13:52:49 +03:00
Alexander | 32c372a616 | faster 3x3 singleband SSE4 | 2018-10-05 13:52:49 +03:00
Alexander | 86c8aac6f8 | 3x3 SSE4 singleband | 2018-10-05 13:52:49 +03:00
Alexander | bef019f9cf | use macros in 3x3 | 2018-10-05 13:52:49 +03:00
Alexander | 78e99deaef | use macros | 2018-10-05 13:52:49 +03:00
Alexander | 328bf4593e | rearrange 3x3 filter to match 5x5 | 2018-10-05 13:52:48 +03:00
Alexander | 8a351e1e31 | improve locality in 5x5 filter | 2018-10-05 13:52:48 +03:00
Alexander | 9c8a9014c4 | a bit faster 5x5 filter | 2018-10-05 13:52:48 +03:00