From b5bad9489d7bddd18d6a28375a98525fe148d7a8 Mon Sep 17 00:00:00 2001 From: homm Date: Sat, 30 Apr 2016 01:16:00 +0300 Subject: [PATCH 01/21] fix list (+6 squashed commits) Squashed commits: [c45b871] update for Pillow-SIMD 3.4.0 [bedd83f] no alpha compositing in this release [e8fe730] update results for latest version add Skia results [a16ff97] add SIMD changes [82ffbd6] fix readme (+4 squashed commits) Squashed commits: [85677f9] fix error [f44ebb1] update results for unrolled implementation [83968c3] fix #4 [cd73c51] update link (+11 squashed commits) Squashed commits: [5882178] correct spelling [a0e5956] Why Pillow-SIMD is even faster [108e72e] Why Pillow itself is so fast [e8eeda1] spelling fixes [e816e9c] spelling [d2eefef] methodology, why not contributed [2e55786] installation and conclusion [9f6415e] more info [67e55b7] more benchmarks test files [471d4c5] remove spaces [904d89d] add performance tests [4fe17fe] simple readme --- CHANGES.SIMD.rst | 67 +++++++++++++++++ README.md | 183 +++++++++++++++++++++++++++++++++++++++++++++++ setup.py | 2 +- 3 files changed, 251 insertions(+), 1 deletion(-) create mode 100644 CHANGES.SIMD.rst create mode 100644 README.md diff --git a/CHANGES.SIMD.rst b/CHANGES.SIMD.rst new file mode 100644 index 000000000..374bdc7f6 --- /dev/null +++ b/CHANGES.SIMD.rst @@ -0,0 +1,67 @@ +Changelog (Pillow-SIMD) +======================= + +3.3.0.post1 +----------- + +Alpha compositing +~~~~~~~~~~~~~~~~~ + +- SSE4 and AVX2 fixed-point full loading implementation. + Up to 4.6x faster. + +3.3.0.post0 +----------- + +Resampling +~~~~~~~~~~ + +- SSE4 and AVX2 fixed-point full loading horizontal pass. +- SSE4 and AVX2 fixed-point full loading vertical pass. + +Convertion +~~~~~~~~~~ + +- RGBA -> RGBa SSE4 and AVX2 fixed-point full loading implementations. + Up to 2.6x faster. +- RGBa -> RGBA AVX2 implementation using gather instructions. + Up to 5x faster. 
+ + +3.2.0.post3 +----------- + +Resampling +~~~~~~~~~~ + +- SSE4 and AVX2 float full loading horizontal pass. +- SSE4 float full loading vertical pass. + + +3.2.0.post2 +----------- + +Resampling +~~~~~~~~~~ + +- SSE4 and AVX2 float full loading horizontal pass. +- SSE4 float per-pixel loading vertical pass. + + +2.9.0.post1 +----------- + +Resampling +~~~~~~~~~~ + +- SSE4 and AVX2 float per-pixel loading horizontal pass. +- SSE4 float per-pixel loading vertical pass. +- SSE4: Up to 2x for downscaling. Up to 3.5x for upscaling. +- AVX2: Up to 2.7x for downscaling. Up to 3.5x for upscaling. + + +Box blur +~~~~~~~~ + +- Simple SSE4 fixed-point implementations with per-pixel loading. +- Up to 2.1x faster. diff --git a/README.md b/README.md new file mode 100644 index 000000000..8f325257f --- /dev/null +++ b/README.md @@ -0,0 +1,183 @@ +# Pillow-SIMD + +Pillow-SIMD is "following" Pillow fork (which is PIL fork itself). + +For more information about original Pillow, please +[read the documentation][original-docs], +[check the changelog][original-changelog] and +[find out how to contribute][original-contribute]. + + +## Why SIMD + +There are many ways to improve the performance of image processing. +You can use better algorithms for the same task, you can make better +implementation for current algorithms, or you can use more processing unit +resources. It is perfect when you can just use more efficient algorithm like +when gaussian blur based on convolutions [was replaced][gaussian-blur-changes] +by sequential box filters. But a number of such improvements are very limited. +It is also very tempting to use more processor unit resources +(via parallelization) when they are available. But it is handier just +to make things faster on the same resources. And that is where SIMD works better. + +SIMD stands for "single instruction, multiple data". This is a way to perform +same operations against the huge amount of homogeneous data. 
+Modern CPU have different SIMD instructions sets like +MMX, SSE-SSE4, AVX, AVX2, AVX512, NEON. + +Currently, Pillow-SIMD can be [compiled](#installation) with SSE4 (default) +and AVX2 support. + + +## Status + +[![Uploadcare][uploadcare.logo]][uploadcare.com] + +Pillow-SIMD can be used in production. Pillow-SIMD has been operating on +[Uploadcare][uploadcare.com] servers for more than 1 year. +Uploadcare is SAAS for image storing and processing in the cloud +and the main sponsor of Pillow-SIMD project. + +Currently, following operations are accelerated: + +- Resize (convolution-based resampling): SSE4, AVX2 +- Gaussian and box blur: SSE4 +- Alpha composition: SSE4, AVX2 +- RGBA → RGBa (alpha premultiplication): SSE4, AVX2 +- RGBa → RGBA (division by alpha): AVX2 + +See [CHANGES](CHANGES.SIMD.rst). + + +## Benchmarks + +The numbers in the table represent processed megapixels of source RGB 2560x1600 +image per second. For example, if resize of 2560x1600 image is done +in 0.5 seconds, the result will be 8.2 Mpx/s. 
+ +- Skia 53 +- ImageMagick 6.9.3-8 Q8 x86_64 +- Pillow 3.3.0 +- Pillow-SIMD 3.3.0.post1 + +Operation | Filter | IM | Pillow| SIMD SSE4| SIMD AVX2| Skia 53 +------------------------|---------|------|-------|----------|----------|-------- +**Resize to 16x16** | Bilinear| 41.37| 337.12| 571.67| 903.40| 809.49 + | Bicubic | 20.58| 185.79| 305.72| 552.85| 453.10 + | Lanczos | 14.17| 113.27| 189.19| 355.40| 292.57 +**Resize to 320x180** | Bilinear| 29.46| 209.06| 366.33| 558.57| 592.76 + | Bicubic | 15.75| 124.43| 224.91| 353.53| 327.68 + | Lanczos | 10.80| 82.25| 153.10| 244.22| 196.92 +**Resize to 1920x1200** | Bilinear| 17.80| 55.87| 131.27| 152.11| 192.30 + | Bicubic | 9.99| 43.64| 90.20| 112.34| 112.84 + | Lanczos | 6.95| 34.51| 72.55| 103.16| 104.76 +**Resize to 7712x4352** | Bilinear| 2.54| 6.71| 16.06| 20.33| 20.58 + | Bicubic | 1.60| 5.51| 12.65| 16.46| 16.52 + | Lanczos | 1.09| 4.62| 9.84| 13.38| 12.05 +**Blur** | 1px | 6.60| 16.94| 35.16| | + | 10px | 2.28| 16.94| 35.47| | + | 100px | 0.34| 16.93| 35.53| | + + +### Some conclusion + +Pillow is always faster than ImageMagick. And Pillow-SIMD is faster +than Pillow in 2—2.5 times. In general, Pillow-SIMD with AVX2 always +**8-20 times faster** than ImageMagick and almost equal to the Skia results, +high-speed graphics library used in Chromium. + +### Methodology + +All tests were performed on Ubuntu 14.04 64-bit running on +Intel Core i5 4258U with AVX2 CPU on the single thread. + +ImageMagick performance was measured with command-line tool `convert` with +`-verbose` and `-bench` arguments. I use command line because +I need to test the latest version and this is the easiest way to do that. + +All operations produce exactly the same results. +Resizing filters compliance: + +- PIL.Image.BILINEAR == Triangle +- PIL.Image.BICUBIC == Catrom +- PIL.Image.LANCZOS == Lanczos + +In ImageMagick, the radius of gaussian blur is called sigma and the second +parameter is called radius. 
In fact, there should not be additional parameters +for *gaussian blur*, because if the radius is too small, this is *not* +gaussian blur anymore. And if the radius is big this does not give any +advantages but makes operation slower. For the test, I set the radius +to sigma × 2.5. + +Following script was used for testing: +https://gist.github.com/homm/f9b8d8a84a57a7e51f9c2a5828e40e63 + + +## Why Pillow itself is so fast + +There are no cheats. High-quality resize and blur methods are used for all +benchmarks. Results are almost pixel-perfect. The difference is only effective +algorithms. Resampling in Pillow was rewritten in version 2.7 with +minimal usage of floating point numbers, precomputed coefficients and +cache-awareness transposition. + + +## Why Pillow-SIMD is even faster + +Because of SIMD, of course. There are some ideas how to achieve even better +performance. + +- **Efficient work with memory** Currently, each pixel is read from + memory to the SSE register, while every SSE register can handle + four pixels at once. +- **Integer-based arithmetic** Experiments show that integer-based arithmetic + does not affect the quality and increases the performance of non-SIMD code + up to 50%. +- **Aligned pixels allocation** Well-known that the SIMD load and store + commands work better with aligned memory. + + +## Why do not contribute SIMD to the original Pillow + +Well, it's not that simple. First of all, Pillow supports a large number +of architectures, not only x86. But even for x86 platforms, Pillow is often +distributed via precompiled binaries. To integrate SIMD in precompiled binaries +we need to do runtime checks of CPU capabilities. +To compile the code with runtime checks we need to pass `-mavx2` option +to the compiler. However this automatically activates all `if (__AVX2__)` +and below conditions. And SIMD instructions under such conditions exist +even in standard C library and they do not have any runtime checks. 
+Currently, I don't know how to allow SIMD instructions in the code +but *do not allow* such instructions without runtime checks. + + +## Installation + +In general, you need to do `pip install pillow-simd` as always and if you +are using SSE4-capable CPU everything should run smoothly. +Do not forget to remove original Pillow package first. + +If you want the AVX2-enabled version, you need to pass the additional flag to C +compiler. The easiest way to do that is define `CC` variable while compilation. + +```bash +$ pip uninstall pillow +$ CC="cc -mavx2" pip install -U --force-reinstall pillow-simd +``` + + +## Contributing to Pillow-SIMD + +Pillow-SIMD and Pillow are two separate projects. +Please submit bugs and improvements not related to SIMD to +[original Pillow][original-issues]. All bugs and fixes in Pillow +will appear in next Pillow-SIMD version automatically. + + + [original-docs]: http://pillow.readthedocs.io/ + [original-issues]: https://github.com/python-pillow/Pillow/issues/new + [original-changelog]: https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst + [original-contribute]: https://github.com/python-pillow/Pillow/blob/master/.github/CONTRIBUTING.md + [gaussian-blur-changes]: http://pillow.readthedocs.io/en/3.2.x/releasenotes/2.7.0.html#gaussian-blur-and-unsharp-mask + [uploadcare.com]: https://uploadcare.com/?utm_source=github&utm_medium=description&utm_campaign=pillow-simd + [uploadcare.logo]: https://ucarecdn.com/dc4b8363-e89f-402f-8ea8-ce606664069c/-/preview/ diff --git a/setup.py b/setup.py index 15d81e465..1ae577ba4 100755 --- a/setup.py +++ b/setup.py @@ -767,7 +767,7 @@ try: setup(name=NAME, version=PILLOW_VERSION, description='Python Imaging Library (Fork)', - long_description=_read('README.rst').decode('utf-8'), + long_description=_read('README.md').decode('utf-8'), author='Alex Clark (Fork Author)', author_email='aclark@aclark.net', url='http://python-pillow.org', From aaa9974e0d06da58a286688cdfde3fba91460d6a Mon Sep 17 
00:00:00 2001 From: homm Date: Tue, 4 Oct 2016 06:19:54 +0300 Subject: [PATCH 02/21] compiller flag --- setup.py | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/setup.py b/setup.py index 1ae577ba4..eb8bb43a0 100755 --- a/setup.py +++ b/setup.py @@ -630,7 +630,8 @@ class pil_build_ext(build_ext): exts = [(Extension("PIL._imaging", files, libraries=libs, - define_macros=defs))] + define_macros=defs, + extra_compile_args=['-msse4']))] # # additional libraries From 06b502f732f84c1fabd6fa072b9a2741c139fa6e Mon Sep 17 00:00:00 2001 From: homm Date: Tue, 4 Oct 2016 13:05:41 +0300 Subject: [PATCH 03/21] pypi readme updated changelog --- CHANGES.SIMD.rst | 16 ++++++++++++++++ PyPI.rst | 6 ++++++ setup.py | 4 ++-- 3 files changed, 24 insertions(+), 2 deletions(-) create mode 100644 PyPI.rst diff --git a/CHANGES.SIMD.rst b/CHANGES.SIMD.rst index 374bdc7f6..20ca9b570 100644 --- a/CHANGES.SIMD.rst +++ b/CHANGES.SIMD.rst @@ -1,6 +1,22 @@ Changelog (Pillow-SIMD) ======================= +3.4.1.post0 +----------- + + - A lot of optimizations in resampling including 16-bit + intermediate color representation and heavy unrolling. 
+ +3.3.2.post0 +----------- + + - Maintenance release + +3.3.0.post2 +----------- + +- Fixed error in RGBa -> RGBA convertion + 3.3.0.post1 ----------- diff --git a/PyPI.rst b/PyPI.rst new file mode 100644 index 000000000..f1b79994f --- /dev/null +++ b/PyPI.rst @@ -0,0 +1,6 @@ + +`Pillow-SIMD repo and readme ` + +`Pillow-SIMD changelog ` + +`Pillow documentation ` diff --git a/setup.py b/setup.py index eb8bb43a0..2cbaac5b0 100755 --- a/setup.py +++ b/setup.py @@ -768,10 +768,10 @@ try: setup(name=NAME, version=PILLOW_VERSION, description='Python Imaging Library (Fork)', - long_description=_read('README.md').decode('utf-8'), + long_description=_read('PyPI.rst').decode('utf-8'), author='Alex Clark (Fork Author)', author_email='aclark@aclark.net', - url='http://python-pillow.org', + url='https://github.com/uploadcare/pillow-simd', classifiers=[ "Development Status :: 6 - Mature", "Topic :: Multimedia :: Graphics", From c5bb0305f39ab25edf550ebb3463fba807e568b0 Mon Sep 17 00:00:00 2001 From: homm Date: Tue, 4 Oct 2016 15:25:52 +0300 Subject: [PATCH 04/21] update readme --- README.md | 60 +++++++++++++++++++++++-------------------------------- 1 file changed, 25 insertions(+), 35 deletions(-) diff --git a/README.md b/README.md index 8f325257f..d4ce868b4 100644 --- a/README.md +++ b/README.md @@ -57,23 +57,23 @@ in 0.5 seconds, the result will be 8.2 Mpx/s. 
- Skia 53 - ImageMagick 6.9.3-8 Q8 x86_64 -- Pillow 3.3.0 -- Pillow-SIMD 3.3.0.post1 +- Pillow 3.4.1 +- Pillow-SIMD 3.4.1.post0 Operation | Filter | IM | Pillow| SIMD SSE4| SIMD AVX2| Skia 53 ------------------------|---------|------|-------|----------|----------|-------- -**Resize to 16x16** | Bilinear| 41.37| 337.12| 571.67| 903.40| 809.49 - | Bicubic | 20.58| 185.79| 305.72| 552.85| 453.10 - | Lanczos | 14.17| 113.27| 189.19| 355.40| 292.57 -**Resize to 320x180** | Bilinear| 29.46| 209.06| 366.33| 558.57| 592.76 - | Bicubic | 15.75| 124.43| 224.91| 353.53| 327.68 - | Lanczos | 10.80| 82.25| 153.10| 244.22| 196.92 -**Resize to 1920x1200** | Bilinear| 17.80| 55.87| 131.27| 152.11| 192.30 - | Bicubic | 9.99| 43.64| 90.20| 112.34| 112.84 - | Lanczos | 6.95| 34.51| 72.55| 103.16| 104.76 -**Resize to 7712x4352** | Bilinear| 2.54| 6.71| 16.06| 20.33| 20.58 - | Bicubic | 1.60| 5.51| 12.65| 16.46| 16.52 - | Lanczos | 1.09| 4.62| 9.84| 13.38| 12.05 +**Resize to 16x16** | Bilinear| 41.37| 317.28| 1282.85| 1601.85| 809.49 + | Bicubic | 20.58| 174.85| 712.95| 900.65| 453.10 + | Lanczos | 14.17| 117.58| 438.60| 544.89| 292.57 +**Resize to 320x180** | Bilinear| 29.46| 195.21| 863.40| 1057.81| 592.76 + | Bicubic | 15.75| 118.79| 503.75| 504.76| 327.68 + | Lanczos | 10.80| 79.59| 312.05| 384.92| 196.92 +**Resize to 1920x1200** | Bilinear| 17.80| 68.39| 215.15| 268.29| 192.30 + | Bicubic | 9.99| 49.23| 170.41| 210.62| 112.84 + | Lanczos | 6.95| 37.71| 130.00| 162.57| 104.76 +**Resize to 7712x4352** | Bilinear| 2.54| 8.38| 22.81| 29.17| 20.58 + | Bicubic | 1.60| 6.57| 18.23| 23.94| 16.52 + | Lanczos | 1.09| 5.20| 14.90| 20.40| 12.05 **Blur** | 1px | 6.60| 16.94| 35.16| | | 10px | 2.28| 16.94| 35.47| | | 100px | 0.34| 16.93| 35.53| | @@ -82,9 +82,9 @@ Operation | Filter | IM | Pillow| SIMD SSE4| SIMD AVX2| Skia 53 ### Some conclusion Pillow is always faster than ImageMagick. And Pillow-SIMD is faster -than Pillow in 2—2.5 times. 
In general, Pillow-SIMD with AVX2 always -**8-20 times faster** than ImageMagick and almost equal to the Skia results, -high-speed graphics library used in Chromium. +than Pillow in 4—5 times. In general, Pillow-SIMD with AVX2 always +**16-40 times faster** than ImageMagick and overperforms Skia, +high-speed graphics library used in Chromium, up to 2 times. ### Methodology @@ -119,36 +119,26 @@ There are no cheats. High-quality resize and blur methods are used for all benchmarks. Results are almost pixel-perfect. The difference is only effective algorithms. Resampling in Pillow was rewritten in version 2.7 with minimal usage of floating point numbers, precomputed coefficients and -cache-awareness transposition. +cache-awareness transposition. This result was improved in 3.3 & 3.4 with +integer-only arithmetics and other optimizations. ## Why Pillow-SIMD is even faster -Because of SIMD, of course. There are some ideas how to achieve even better -performance. - -- **Efficient work with memory** Currently, each pixel is read from - memory to the SSE register, while every SSE register can handle - four pixels at once. -- **Integer-based arithmetic** Experiments show that integer-based arithmetic - does not affect the quality and increases the performance of non-SIMD code - up to 50%. -- **Aligned pixels allocation** Well-known that the SIMD load and store - commands work better with aligned memory. +Because of SIMD, of course. But this is not all. Heavy loops unrolling, +specific instructions, which not available for scalar. ## Why do not contribute SIMD to the original Pillow -Well, it's not that simple. First of all, Pillow supports a large number +Well, that's not simple. First of all, Pillow supports a large number of architectures, not only x86. But even for x86 platforms, Pillow is often distributed via precompiled binaries. To integrate SIMD in precompiled binaries we need to do runtime checks of CPU capabilities. 
To compile the code with runtime checks we need to pass `-mavx2` option -to the compiler. However this automatically activates all `if (__AVX2__)` -and below conditions. And SIMD instructions under such conditions exist -even in standard C library and they do not have any runtime checks. -Currently, I don't know how to allow SIMD instructions in the code -but *do not allow* such instructions without runtime checks. +to the compiler. But with that option compiller will inject AVX instructions +enev for SSE functions, because every SSE instruction has AVX equivalent. +So there is no easy way to compile such library, especially with setuptools. ## Installation From 110a362f8a9f01742e6efee2e3ea09d7d0a8879c Mon Sep 17 00:00:00 2001 From: homm Date: Tue, 4 Oct 2016 18:35:38 +0300 Subject: [PATCH 05/21] package name --- setup.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/setup.py b/setup.py index 2cbaac5b0..6c2e78e73 100755 --- a/setup.py +++ b/setup.py @@ -134,7 +134,7 @@ except (ImportError, OSError): # pypy emits an oserror _tkinter = None -NAME = 'Pillow' +NAME = 'Pillow-SIMD' PILLOW_VERSION = get_version() JPEG_ROOT = None JPEG2K_ROOT = None From ffd69a9593c69e3a2aebaed2a1907cb87ab921bb Mon Sep 17 00:00:00 2001 From: homm Date: Tue, 4 Oct 2016 19:22:42 +0300 Subject: [PATCH 06/21] fix readmes --- CHANGES.SIMD.rst | 6 +++--- PyPI.rst | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/CHANGES.SIMD.rst b/CHANGES.SIMD.rst index 20ca9b570..cd9c68493 100644 --- a/CHANGES.SIMD.rst +++ b/CHANGES.SIMD.rst @@ -4,13 +4,13 @@ Changelog (Pillow-SIMD) 3.4.1.post0 ----------- - - A lot of optimizations in resampling including 16-bit - intermediate color representation and heavy unrolling. +- A lot of optimizations in resampling including 16-bit + intermediate color representation and heavy unrolling. 
3.3.2.post0 ----------- - - Maintenance release +- Maintenance release 3.3.0.post2 ----------- diff --git a/PyPI.rst b/PyPI.rst index f1b79994f..e63270f75 100644 --- a/PyPI.rst +++ b/PyPI.rst @@ -1,6 +1,6 @@ -`Pillow-SIMD repo and readme ` +`Pillow-SIMD repo and readme `_ -`Pillow-SIMD changelog ` +`Pillow-SIMD changelog `_ -`Pillow documentation ` +`Pillow documentation `_ From 56ab1cfba198db62d3d387b7cddc3da5e02ccd29 Mon Sep 17 00:00:00 2001 From: homm Date: Wed, 5 Oct 2016 01:28:17 +0300 Subject: [PATCH 07/21] clarify Following fork --- README.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/README.md b/README.md index d4ce868b4..acfc3fad4 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,10 @@ # Pillow-SIMD Pillow-SIMD is "following" Pillow fork (which is PIL fork itself). +"Following" means than Pillow-SIMD versions are 100% compatible +drop-in replacement for Pillow with the same version number. +For example, `Pillow-SIMD 3.2.0.post3` is drop-in replacement for +`Pillow 3.2.0` and `Pillow-SIMD 3.4.1.post0` for `Pillow 3.4.1`. For more information about original Pillow, please [read the documentation][original-docs], From 44618e606273bbeee58e65f5ec78b17cbdef55a4 Mon Sep 17 00:00:00 2001 From: homm Date: Thu, 6 Oct 2016 14:08:23 +0300 Subject: [PATCH 08/21] update versions in readme --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index acfc3fad4..e37d51f53 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ Pillow-SIMD is "following" Pillow fork (which is PIL fork itself). "Following" means than Pillow-SIMD versions are 100% compatible drop-in replacement for Pillow with the same version number. For example, `Pillow-SIMD 3.2.0.post3` is drop-in replacement for -`Pillow 3.2.0` and `Pillow-SIMD 3.4.1.post0` for `Pillow 3.4.1`. +`Pillow 3.2.0` and `Pillow-SIMD 3.3.3.post0` for `Pillow 3.3.3`. 
For more information about original Pillow, please [read the documentation][original-docs], @@ -62,7 +62,7 @@ in 0.5 seconds, the result will be 8.2 Mpx/s. - Skia 53 - ImageMagick 6.9.3-8 Q8 x86_64 - Pillow 3.4.1 -- Pillow-SIMD 3.4.1.post0 +- Pillow-SIMD 3.4.1.post1 Operation | Filter | IM | Pillow| SIMD SSE4| SIMD AVX2| Skia 53 ------------------------|---------|------|-------|----------|----------|-------- From 2612c9967d8a252a594d18e5a8c94b9ff0b3f0bd Mon Sep 17 00:00:00 2001 From: homm Date: Thu, 6 Oct 2016 14:11:02 +0300 Subject: [PATCH 09/21] changelog --- CHANGES.SIMD.rst | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/CHANGES.SIMD.rst b/CHANGES.SIMD.rst index cd9c68493..2fb14c9bc 100644 --- a/CHANGES.SIMD.rst +++ b/CHANGES.SIMD.rst @@ -1,6 +1,12 @@ Changelog (Pillow-SIMD) ======================= +3.4.1.post1 +----------- + +- Critical memory error for some combinations of source/destinatnion + sizes is fixed. + 3.4.1.post0 ----------- From 582ed2d87b5d0eea18f70c4ef3a33196fdf509d0 Mon Sep 17 00:00:00 2001 From: Elijah Date: Fri, 7 Oct 2016 16:54:22 +0500 Subject: [PATCH 10/21] Rewritten the Pillow-SIMD readme --- README.md | 154 +++++++++++++++++++++++++++--------------------------- 1 file changed, 77 insertions(+), 77 deletions(-) diff --git a/README.md b/README.md index e37d51f53..8b799abd8 100644 --- a/README.md +++ b/README.md @@ -1,12 +1,12 @@ # Pillow-SIMD -Pillow-SIMD is "following" Pillow fork (which is PIL fork itself). -"Following" means than Pillow-SIMD versions are 100% compatible -drop-in replacement for Pillow with the same version number. -For example, `Pillow-SIMD 3.2.0.post3` is drop-in replacement for -`Pillow 3.2.0` and `Pillow-SIMD 3.3.3.post0` for `Pillow 3.3.3`. +Pillow-SIMD is "following" the Pillow fork (which is a PIL's fork itself). +"Following" here means than Pillow-SIMD versions are 100% compatible +drop-in replacements for Pillow of the same version. 
+For example, `Pillow-SIMD 3.2.0.post3` is a drop-in replacement for +`Pillow 3.2.0`, and `Pillow-SIMD 3.3.3.post0` — for `Pillow 3.3.3`. -For more information about original Pillow, please +For more information on the original Pillow, please refer to: [read the documentation][original-docs], [check the changelog][original-changelog] and [find out how to contribute][original-contribute]. @@ -14,35 +14,32 @@ For more information about original Pillow, please ## Why SIMD -There are many ways to improve the performance of image processing. -You can use better algorithms for the same task, you can make better -implementation for current algorithms, or you can use more processing unit -resources. It is perfect when you can just use more efficient algorithm like -when gaussian blur based on convolutions [was replaced][gaussian-blur-changes] -by sequential box filters. But a number of such improvements are very limited. -It is also very tempting to use more processor unit resources -(via parallelization) when they are available. But it is handier just -to make things faster on the same resources. And that is where SIMD works better. +There are multiple ways to tweak image processing performance. +To name a few, such ways can be: utilizing better algorithms, optimizing existing implementations, +using more processing power and/or resources. +One of the great examples of using a more efficient algorithm is [replacing][gaussian-blur-changes] +a convolution-based Gaussian blur with a sequential-box one. +Such examples are rather rare, though. It is also known, that certain processes might be optimized +by using parallel processing to run the respective routines. +But a more practical key to optimizations might be making things work faster +using the resources at hand. For instance, SIMD computing might be the case. -SIMD stands for "single instruction, multiple data". This is a way to perform -same operations against the huge amount of homogeneous data. 
-Modern CPU have different SIMD instructions sets like -MMX, SSE-SSE4, AVX, AVX2, AVX512, NEON. +SIMD stands for "single instruction, multiple data" and its essence is +in performing the same operation on multiple data points simultaneously +by using multiple processing elements. +Common CPU SIMD instruction sets are MMX, SSE-SSE4, AVX, AVX2, AVX512, NEON. -Currently, Pillow-SIMD can be [compiled](#installation) with SSE4 (default) -and AVX2 support. +Currently, Pillow-SIMD can be [compiled](#installation) with SSE4 (default) and AVX2 support. ## Status +Pillow-SIMD project is production-ready for you to start building SIMD-enabled image processing systems. +The project is supported by Uploadcare, a SAAS for cloud-based image storing and processing. [![Uploadcare][uploadcare.logo]][uploadcare.com] +In fact, Uploadcare itself has been running Pillow-SIMD for about a year now. -Pillow-SIMD can be used in production. Pillow-SIMD has been operating on -[Uploadcare][uploadcare.com] servers for more than 1 year. -Uploadcare is SAAS for image storing and processing in the cloud -and the main sponsor of Pillow-SIMD project. - -Currently, following operations are accelerated: +The following Uploadcare image operations are currently SIMD-accelerated: - Resize (convolution-based resampling): SSE4, AVX2 - Gaussian and box blur: SSE4 @@ -50,20 +47,25 @@ Currently, following operations are accelerated: - RGBA → RGBa (alpha premultiplication): SSE4, AVX2 - RGBa → RGBA (division by alpha): AVX2 -See [CHANGES](CHANGES.SIMD.rst). +See [CHANGES](CHANGES.SIMD.rst) for more information. + ## Benchmarks -The numbers in the table represent processed megapixels of source RGB 2560x1600 -image per second. For example, if resize of 2560x1600 image is done -in 0.5 seconds, the result will be 8.2 Mpx/s. +In order for you to clearly assess the productivity of implementing SIMD computing into Pillow image processing, +we ran a number of benchmarks. 
The respective results can be found in the table below. +The numbers represent processing rates in megapixels per second (Mpx/s). +For instance, the rate at which a 2560x1600 RGB image is processed in 0.5 seconds equals to 8.2 Mpx/s. +Here are the instruments we've been up to during the benchmarks: - Skia 53 - ImageMagick 6.9.3-8 Q8 x86_64 - Pillow 3.4.1 - Pillow-SIMD 3.4.1.post1 +Now, let's proceed to the numbers (the more — the better): + Operation | Filter | IM | Pillow| SIMD SSE4| SIMD AVX2| Skia 53 ------------------------|---------|------|-------|----------|----------|-------- **Resize to 16x16** | Bilinear| 41.37| 317.28| 1282.85| 1601.85| 809.49 @@ -83,89 +85,87 @@ Operation | Filter | IM | Pillow| SIMD SSE4| SIMD AVX2| Skia 53 | 100px | 0.34| 16.93| 35.53| | -### Some conclusion +### A brief conclusion -Pillow is always faster than ImageMagick. And Pillow-SIMD is faster -than Pillow in 4—5 times. In general, Pillow-SIMD with AVX2 always -**16-40 times faster** than ImageMagick and overperforms Skia, -high-speed graphics library used in Chromium, up to 2 times. +The results show that Pillow is generally faster than ImageMagick, +Pillow-SIMD, in turn, is even faster than the original Pillow by the factor of 4-5. +In general, Pillow-SIMD with AVX2 is always **16 to 40 times faster** than +ImageMagick and outperforms Skia, the high-speed graphics library used in Chromium. ### Methodology -All tests were performed on Ubuntu 14.04 64-bit running on -Intel Core i5 4258U with AVX2 CPU on the single thread. - -ImageMagick performance was measured with command-line tool `convert` with -`-verbose` and `-bench` arguments. I use command line because -I need to test the latest version and this is the easiest way to do that. - -All operations produce exactly the same results. +All rates were measured using the following setup: Ubuntu 14.04 64-bit, +single-thread AVX2-enabled intel i5 4258U CPU. 
+ImageMagick performance was measured with the `convert` command-line tool +followed by `-verbose` and `-bench` arguments. +Such approach was used because there's usually a need in testing +the latest software versions and command-line is the easiest way to do that. +All the routines involved with the testing procedure produced identic results. Resizing filters compliance: - PIL.Image.BILINEAR == Triangle - PIL.Image.BICUBIC == Catrom - PIL.Image.LANCZOS == Lanczos -In ImageMagick, the radius of gaussian blur is called sigma and the second -parameter is called radius. In fact, there should not be additional parameters -for *gaussian blur*, because if the radius is too small, this is *not* -gaussian blur anymore. And if the radius is big this does not give any -advantages but makes operation slower. For the test, I set the radius -to sigma × 2.5. +In ImageMagick, Gaussian blur operation invokes two parameters: +the first is called 'radius' and the second is called 'sigma'. +In fact, in order for the blur operation to be Gaussian, there should be no additional parameters. +When the radius value is too small the blur procedure ceases to be Gaussian and +if the value is excessively big the operation gets slowed down with zero benefits in exchange. +For the benchmarking purposes, the radius was set to sigma × 2.5. -Following script was used for testing: +Following script was used for the benchmarking procedure: https://gist.github.com/homm/f9b8d8a84a57a7e51f9c2a5828e40e63 ## Why Pillow itself is so fast -There are no cheats. High-quality resize and blur methods are used for all -benchmarks. Results are almost pixel-perfect. The difference is only effective -algorithms. Resampling in Pillow was rewritten in version 2.7 with -minimal usage of floating point numbers, precomputed coefficients and -cache-awareness transposition. This result was improved in 3.3 & 3.4 with +No cheats involved. We've used identical high-quality resize and blur methods for the benchmark. 
+Outcomes produced by different libraries are in almost pixel-perfect agreement. +The difference in measured rates is only provided with the performance of every involved algorithm. +Resampling for Pillow 2.7 was rewritten with minimal usage of floating point calculations, +precomputed coefficients and cache-awareness transposition. +The results were further improved in versions 3.3 & 3.4 by utilizing integer-only arithmetics and other optimizations. - ## Why Pillow-SIMD is even faster -Because of SIMD, of course. But this is not all. Heavy loops unrolling, -specific instructions, which not available for scalar. +Because of the SIMD computing, of course. But there's more to it: +heavy loops unrolling, specific instructions, which aren't available for **scalar WTF**. ## Why do not contribute SIMD to the original Pillow -Well, that's not simple. First of all, Pillow supports a large number -of architectures, not only x86. But even for x86 platforms, Pillow is often -distributed via precompiled binaries. To integrate SIMD in precompiled binaries -we need to do runtime checks of CPU capabilities. -To compile the code with runtime checks we need to pass `-mavx2` option -to the compiler. But with that option compiller will inject AVX instructions -enev for SSE functions, because every SSE instruction has AVX equivalent. +Well, it's not that simple. First of all, the original Pillow supports +a large number of architectures, not just x86. +But even for x86 platforms, Pillow is often distributed via precompiled binaries. +In order for us to integrate SIMD into the precompiled binaries +we'd need to execute runtime CPU capabilities checks. +To compile the code this way we need to pass the `-mavx2` option to the compiler. +But with the option included, a compiler will inject AVX instructions even +for SSE functions (i.e. interchange them) since every SSE instruction has its AVX equivalent. So there is no easy way to compile such library, especially with setuptools. 
 ## Installation

-In general, you need to do `pip install pillow-simd` as always and if you
-are using SSE4-capable CPU everything should run smoothly.
-Do not forget to remove original Pillow package first.
-
-If you want the AVX2-enabled version, you need to pass the additional flag to C
-compiler. The easiest way to do that is define `CC` variable while compilation.
+If there's a copy of the original Pillow installed, it has to be removed first.
+In general, you need to run `pip install pillow-simd`,
+and if you're using SSE4-capable CPU everything should run smoothly.
+If you'd like to install the AVX2-enabled version,
+you need to pass the additional flag to a C compiler.
+The easiest way to do so is to define the `CC` variable **while -> during?** the compilation.

 ```bash
 $ pip uninstall pillow
 $ CC="cc -mavx2" pip install -U --force-reinstall pillow-simd
 ```

-
 ## Contributing to Pillow-SIMD

-Pillow-SIMD and Pillow are two separate projects.
-Please submit bugs and improvements not related to SIMD to
-[original Pillow][original-issues]. All bugs and fixes in Pillow
-will appear in next Pillow-SIMD version automatically.
+Please be aware that Pillow-SIMD and Pillow are two separate projects.
+Please submit bugs and improvements not related to SIMD to the [original Pillow][original-issues].
+All bugfixes to the original Pillow will then be transferred to the next Pillow-SIMD version automatically.


   [original-docs]: http://pillow.readthedocs.io/

From b3592c864ad268c10e105f9c735838d9a82afed1 Mon Sep 17 00:00:00 2001
From: Elijah
Date: Sat, 8 Oct 2016 14:51:37 +0500
Subject: [PATCH 11/21] Updated according to the review

---
 README.md | 38 +++++++++++++++++---------------------
 1 file changed, 17 insertions(+), 21 deletions(-)

diff --git a/README.md b/README.md
index 8b799abd8..a5dd20c74 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Pillow-SIMD

-Pillow-SIMD is "following" the Pillow fork (which is a PIL's fork itself).
+Pillow-SIMD is "following" Pillow (which is a PIL's fork itself).
 "Following" here means than Pillow-SIMD versions are 100% compatible
 drop-in replacements for Pillow of the same version.
 For example, `Pillow-SIMD 3.2.0.post3` is a drop-in replacement for
@@ -18,7 +18,8 @@ There are multiple ways to tweak image processing performance.
 To name a few, such ways can be: utilizing better algorithms, optimizing existing implementations,
 using more processing power and/or resources.
 One of the great examples of using a more efficient algorithm is [replacing][gaussian-blur-changes]
-a convolution-based Gaussian blur with a sequential-box one.
+a convolution-based Gaussian blur with a sequential-box one.
+
 Such examples are rather rare, though. It is also known, that certain processes might be optimized
 by using parallel processing to run the respective routines.
 But a more practical key to optimizations might be making things work faster
@@ -29,17 +30,17 @@ in performing the same operation on multiple data points simultaneously
 by using multiple processing elements.
 Common CPU SIMD instruction sets are MMX, SSE-SSE4, AVX, AVX2, AVX512, NEON.

-Currently, Pillow-SIMD can be [compiled](#installation) with SSE4 (default) and AVX2 support.
+Currently, Pillow-SIMD can be [compiled](#installation) with SSE4 (default) or AVX2 support.


 ## Status

-Pillow-SIMD project is production-ready for you to start building SIMD-enabled image processing systems.
+Pillow-SIMD project is production-ready.
 The project is supported by Uploadcare, a SAAS for cloud-based image storing and processing.
 [![Uploadcare][uploadcare.logo]][uploadcare.com]
-In fact, Uploadcare itself has been running Pillow-SIMD for about a year now.
+In fact, Uploadcare has been running Pillow-SIMD for about two years now.
-The following Uploadcare image operations are currently SIMD-accelerated:
+The following image operations are currently SIMD-accelerated:

 - Resize (convolution-based resampling): SSE4, AVX2
 - Gaussian and box blur: SSE4
@@ -54,18 +55,16 @@ See [CHANGES](CHANGES.SIMD.rst) for more information.
 ## Benchmarks

 In order for you to clearly assess the productivity of implementing SIMD computing into Pillow image processing,
-we ran a number of benchmarks. The respective results can be found in the table below.
+we ran a number of benchmarks. The respective results can be found in the table below (the more — the better).
 The numbers represent processing rates in megapixels per second (Mpx/s).
 For instance, the rate at which a 2560x1600 RGB image is processed in 0.5 seconds equals to 8.2 Mpx/s.
-Here are the instruments we've been up to during the benchmarks:
+Here is the list of libraries and their versions we've been up to during the benchmarks:

 - Skia 53
 - ImageMagick 6.9.3-8 Q8 x86_64
 - Pillow 3.4.1
 - Pillow-SIMD 3.4.1.post1

-Now, let's proceed to the numbers (the more — the better):
-
 Operation               | Filter  |   IM | Pillow| SIMD SSE4| SIMD AVX2| Skia 53
 ------------------------|---------|------|-------|----------|----------|--------
 **Resize to 16x16**     | Bilinear| 41.37| 317.28|   1282.85|   1601.85|  809.49
@@ -87,7 +86,7 @@ Operation               | Filter  |   IM | Pillow| SIMD SSE4| SIMD AVX2| Skia 53

 ### A brief conclusion

-The results show that Pillow is generally faster than ImageMagick,
+The results show that Pillow is always faster than ImageMagick,
 Pillow-SIMD, in turn, is even faster than the original Pillow by the factor of 4-5.
 In general, Pillow-SIMD with AVX2 is always **16 to 40 times faster** than
 ImageMagick and outperforms Skia, the high-speed graphics library used in Chromium.
@@ -95,7 +94,7 @@ ImageMagick and outperforms Skia, the high-speed graphics library used in Chromi
 ### Methodology

 All rates were measured using the following setup: Ubuntu 14.04 64-bit,
-single-thread AVX2-enabled intel i5 4258U CPU.
+single-thread AVX2-enabled Intel i5 4258U CPU.
 ImageMagick performance was measured with the `convert` command-line tool
 followed by `-verbose` and `-bench` arguments.
 Such approach was used because there's usually a need in testing
@@ -112,7 +111,7 @@ the first is called 'radius' and the second is called 'sigma'.
 In fact, in order for the blur operation to be Gaussian, there should be no additional parameters.
 When the radius value is too small the blur procedure ceases to be Gaussian and
 if the value is excessively big the operation gets slowed down with zero benefits in exchange.
-For the benchmarking purposes, the radius was set to sigma × 2.5.
+For the benchmarking purposes, the radius was set to `sigma × 2.5`.

 Following script was used for the benchmarking procedure:
 https://gist.github.com/homm/f9b8d8a84a57a7e51f9c2a5828e40e63
@@ -123,15 +122,11 @@ https://gist.github.com/homm/f9b8d8a84a57a7e51f9c2a5828e40e63
 No cheats involved. We've used identical high-quality resize and blur methods for the benchmark.
 Outcomes produced by different libraries are in almost pixel-perfect agreement.
 The difference in measured rates is only provided with the performance of every involved algorithm.
-Resampling for Pillow 2.7 was rewritten with minimal usage of floating point calculations,
-precomputed coefficients and cache-awareness transposition.
-The results were further improved in versions 3.3 & 3.4 by utilizing
-integer-only arithmetics and other optimizations.

 ## Why Pillow-SIMD is even faster

 Because of the SIMD computing, of course. But there's more to it:
-heavy loops unrolling, specific instructions, which aren't available for **scalar WTF**.
+heavy loops unrolling, specific instructions, which aren't available for scalar data types.
 ## Why do not contribute SIMD to the original Pillow

@@ -149,12 +144,13 @@ So there is no easy way to compile such library, especially with setuptools.

 ## Installation

-If there's a copy of the original Pillow installed, it has to be removed first.
-In general, you need to run `pip install pillow-simd`,
+If there's a copy of the original Pillow installed, it has to be removed first
+with `$ pip uninstall -y pillow`.
+The installation itself is simple just as running `$ pip install pillow-simd`,
 and if you're using SSE4-capable CPU everything should run smoothly.
 If you'd like to install the AVX2-enabled version,
 you need to pass the additional flag to a C compiler.
-The easiest way to do so is to define the `CC` variable **while -> during?** the compilation.
+The easiest way to do so is to define the `CC` variable during the compilation.

 ```bash
 $ pip uninstall pillow

From 5ef7b7392ce5f336ecd57c4a063b6e9b533a938d Mon Sep 17 00:00:00 2001
From: homm
Date: Thu, 13 Oct 2016 16:37:32 +0300
Subject: [PATCH 12/21] fix markup

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index a5dd20c74..5742d40d4 100644
--- a/README.md
+++ b/README.md
@@ -37,7 +37,9 @@ Currently, Pillow-SIMD can be [compiled](#installation) with SSE4 (default) or A

 Pillow-SIMD project is production-ready.
 The project is supported by Uploadcare, a SAAS for cloud-based image storing and processing.
+
 [![Uploadcare][uploadcare.logo]][uploadcare.com]
+
 In fact, Uploadcare has been running Pillow-SIMD for about two years now.
 The following image operations are currently SIMD-accelerated:

From dc148338c786ea1c7529edc515545a3c8997f1b3 Mon Sep 17 00:00:00 2001
From: Alexander
Date: Mon, 27 Feb 2017 02:38:20 +0300
Subject: [PATCH 13/21] replace benchmarks with link to pillow-perf

---
 README.md | 75 ++++++++++---------------------------------------------
 1 file changed, 13 insertions(+), 62 deletions(-)

diff --git a/README.md b/README.md
index 5742d40d4..be983a291 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
 # Pillow-SIMD

-Pillow-SIMD is "following" Pillow (which is a PIL's fork itself).
-"Following" here means than Pillow-SIMD versions are 100% compatible
+Pillow-SIMD is "following" [Pillow][original-docs].
+Pillow-SIMD versions are 100% compatible
 drop-in replacements for Pillow of the same version.
 For example, `Pillow-SIMD 3.2.0.post3` is a drop-in replacement for
 `Pillow 3.2.0`, and `Pillow-SIMD 3.3.3.post0` — for `Pillow 3.3.3`.
@@ -53,71 +53,17 @@ The following image operations are currently SIMD-accelerated:
 See [CHANGES](CHANGES.SIMD.rst) for more information.


-
 ## Benchmarks

-In order for you to clearly assess the productivity of implementing SIMD computing into Pillow image processing,
-we ran a number of benchmarks. The respective results can be found in the table below (the more — the better).
-The numbers represent processing rates in megapixels per second (Mpx/s).
-For instance, the rate at which a 2560x1600 RGB image is processed in 0.5 seconds equals to 8.2 Mpx/s.
-Here is the list of libraries and their versions we've been up to during the benchmarks:
-
-- Skia 53
-- ImageMagick 6.9.3-8 Q8 x86_64
-- Pillow 3.4.1
-- Pillow-SIMD 3.4.1.post1
-
-Operation               | Filter  |   IM | Pillow| SIMD SSE4| SIMD AVX2| Skia 53
-------------------------|---------|------|-------|----------|----------|--------
-**Resize to 16x16**     | Bilinear| 41.37| 317.28|   1282.85|   1601.85|  809.49
-                        | Bicubic | 20.58| 174.85|    712.95|    900.65|  453.10
-                        | Lanczos | 14.17| 117.58|    438.60|    544.89|  292.57
-**Resize to 320x180**   | Bilinear| 29.46| 195.21|    863.40|   1057.81|  592.76
-                        | Bicubic | 15.75| 118.79|    503.75|    504.76|  327.68
-                        | Lanczos | 10.80|  79.59|    312.05|    384.92|  196.92
-**Resize to 1920x1200** | Bilinear| 17.80|  68.39|    215.15|    268.29|  192.30
-                        | Bicubic |  9.99|  49.23|    170.41|    210.62|  112.84
-                        | Lanczos |  6.95|  37.71|    130.00|    162.57|  104.76
-**Resize to 7712x4352** | Bilinear|  2.54|   8.38|     22.81|     29.17|   20.58
-                        | Bicubic |  1.60|   6.57|     18.23|     23.94|   16.52
-                        | Lanczos |  1.09|   5.20|     14.90|     20.40|   12.05
-**Blur**                | 1px     |  6.60|  16.94|     35.16|          |
-                        | 10px    |  2.28|  16.94|     35.47|          |
-                        | 100px   |  0.34|  16.93|     35.53|          |
-
-
-### A brief conclusion
+Tons of tests can be found on the [Pillow Performance][pillow-perf-page] page.
+There are benchmarks against different versions of Pillow and Pillow-SIMD
+as well as ImageMagick, Skia, OpenCV and IPP.

 The results show that Pillow is always faster than ImageMagick,
-Pillow-SIMD, in turn, is even faster than the original Pillow by the factor of 4-5.
+Pillow-SIMD, in turn, is even faster than the original Pillow by the factor of 4-6.
 In general, Pillow-SIMD with AVX2 is always **16 to 40 times faster** than
 ImageMagick and outperforms Skia, the high-speed graphics library used in Chromium.

-### Methodology
-
-All rates were measured using the following setup: Ubuntu 14.04 64-bit,
-single-thread AVX2-enabled Intel i5 4258U CPU.
-ImageMagick performance was measured with the `convert` command-line tool
-followed by `-verbose` and `-bench` arguments.
-Such approach was used because there's usually a need in testing
-the latest software versions and command-line is the easiest way to do that.
-All the routines involved with the testing procedure produced identic results.
-Resizing filters compliance:
-
-- PIL.Image.BILINEAR == Triangle
-- PIL.Image.BICUBIC == Catrom
-- PIL.Image.LANCZOS == Lanczos
-
-In ImageMagick, Gaussian blur operation invokes two parameters:
-the first is called 'radius' and the second is called 'sigma'.
-In fact, in order for the blur operation to be Gaussian, there should be no additional parameters.
-When the radius value is too small the blur procedure ceases to be Gaussian and
-if the value is excessively big the operation gets slowed down with zero benefits in exchange.
-For the benchmarking purposes, the radius was set to `sigma × 2.5`.
-
-Following script was used for the benchmarking procedure:
-https://gist.github.com/homm/f9b8d8a84a57a7e51f9c2a5828e40e63

 ## Why Pillow itself is so fast

@@ -125,6 +71,7 @@ No cheats involved. We've used identical high-quality resize and blur methods fo
 Outcomes produced by different libraries are in almost pixel-perfect agreement.
 The difference in measured rates is only provided with the performance of every involved algorithm.

+
 ## Why Pillow-SIMD is even faster

 Because of the SIMD computing, of course. But there's more to it:
@@ -159,6 +106,7 @@ $ pip uninstall pillow
 $ CC="cc -mavx2" pip install -U --force-reinstall pillow-simd
 ```

+
 ## Contributing to Pillow-SIMD

 Please be aware that Pillow-SIMD and Pillow are two separate projects.
@@ -166,10 +114,13 @@ Please submit bugs and improvements not related to SIMD to the [original Pillow]
 All bugfixes to the original Pillow will then be transferred to the next Pillow-SIMD version automatically.
-  [original-docs]: http://pillow.readthedocs.io/
+  [original-homepage]: https://python-pillow.org/
+  [original-docs]: https://pillow.readthedocs.io/
   [original-issues]: https://github.com/python-pillow/Pillow/issues/new
   [original-changelog]: https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst
   [original-contribute]: https://github.com/python-pillow/Pillow/blob/master/.github/CONTRIBUTING.md
-  [gaussian-blur-changes]: http://pillow.readthedocs.io/en/3.2.x/releasenotes/2.7.0.html#gaussian-blur-and-unsharp-mask
+  [gaussian-blur-changes]: https://pillow.readthedocs.io/en/3.2.x/releasenotes/2.7.0.html#gaussian-blur-and-unsharp-mask
+  [pillow-perf-page]: https://python-pillow.org/pillow-perf/
+  [pillow-perf-repo]: https://github.com/python-pillow/pillow-perf
   [uploadcare.com]: https://uploadcare.com/?utm_source=github&utm_medium=description&utm_campaign=pillow-simd
   [uploadcare.logo]: https://ucarecdn.com/dc4b8363-e89f-402f-8ea8-ce606664069c/-/preview/

From 135da8613e67d51f32afd1a9c41c589906d98e8b Mon Sep 17 00:00:00 2001
From: Alexander
Date: Mon, 27 Feb 2017 02:51:52 +0300
Subject: [PATCH 14/21] for resizing

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index be983a291..0136c201f 100644
--- a/README.md
+++ b/README.md
@@ -59,7 +59,7 @@ Tons of tests can be found on the [Pillow Performance][pillow-perf-page] page.
 There are benchmarks against different versions of Pillow and Pillow-SIMD
 as well as ImageMagick, Skia, OpenCV and IPP.

-The results show that Pillow is always faster than ImageMagick,
+The results show that for resizing Pillow is always faster than ImageMagick,
 Pillow-SIMD, in turn, is even faster than the original Pillow by the factor of 4-6.
 In general, Pillow-SIMD with AVX2 is always **16 to 40 times faster** than
 ImageMagick and outperforms Skia, the high-speed graphics library used in Chromium.
From 8bd80ceeec5c76d1b31cf5bb68d1450ff09df6da Mon Sep 17 00:00:00 2001
From: Alexander
Date: Thu, 17 Aug 2017 02:11:57 +0300
Subject: [PATCH 15/21] remove original README.rst

---
 README.rst | 77 ------------------------------------------------------
 1 file changed, 77 deletions(-)
 delete mode 100644 README.rst

diff --git a/README.rst b/README.rst
deleted file mode 100644
index b88a103b0..000000000
--- a/README.rst
+++ /dev/null
@@ -1,77 +0,0 @@
-Pillow
-======
-
-Python Imaging Library (Fork)
------------------------------
-
-Pillow is the friendly PIL fork by `Alex Clark and Contributors `_. PIL is the Python Imaging Library by Fredrik Lundh and Contributors.
-
-.. start-badges
-
-.. list-table::
-    :stub-columns: 1
-
-    * - docs
-      - |docs|
-    * - tests
-      - |linux| |macos| |windows| |coverage|
-    * - package
-      - |zenodo| |version|
-    * - social
-      - |gitter| |twitter|
-
-.. |docs| image:: https://readthedocs.org/projects/pillow/badge/?version=latest
-   :target: https://pillow.readthedocs.io/?badge=latest
-   :alt: Documentation Status
-
-.. |linux| image:: https://img.shields.io/travis/python-pillow/Pillow/master.svg?label=Linux%20build
-   :target: https://travis-ci.org/python-pillow/Pillow
-   :alt: Travis CI build status (Linux)
-
-.. |macos| image:: https://img.shields.io/travis/python-pillow/pillow-wheels/latest.svg?label=macOS%20build
-   :target: https://travis-ci.org/python-pillow/pillow-wheels
-   :alt: Travis CI build status (macOS)
-
-.. |windows| image:: https://img.shields.io/appveyor/ci/python-pillow/Pillow/master.svg?label=Windows%20build
-   :target: https://ci.appveyor.com/project/python-pillow/Pillow
-   :alt: AppVeyor CI build status (Windows)
-
-.. |coverage| image:: https://coveralls.io/repos/python-pillow/Pillow/badge.svg?branch=master&service=github
-   :target: https://coveralls.io/github/python-pillow/Pillow?branch=master
-   :alt: Code coverage
-
-.. |zenodo| image:: https://zenodo.org/badge/17549/python-pillow/Pillow.svg
-   :target: https://zenodo.org/badge/latestdoi/17549/python-pillow/Pillow
-
-.. |version| image:: https://img.shields.io/pypi/v/pillow.svg
-   :target: https://pypi.org/project/Pillow/
-   :alt: Latest PyPI version
-
-.. |gitter| image:: https://badges.gitter.im/python-pillow/Pillow.svg
-   :target: https://gitter.im/python-pillow/Pillow?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge
-   :alt: Join the chat at https://gitter.im/python-pillow/Pillow
-
-.. |twitter| image:: https://img.shields.io/badge/tweet-on%20Twitter-00aced.svg
-   :target: https://twitter.com/PythonPillow
-   :alt: Follow on https://twitter.com/PythonPillow
-
-.. end-badges
-
-
-
-More Information
-----------------
-
-- `Documentation `_
-
-  - `Installation `_
-  - `Handbook `_
-
-- `Contribute `_
-
-  - `Issues `_
-  - `Pull requests `_
-
-- `Changelog `_
-
-  - `Pre-fork `_

From 4144e8bd822eedaa614403ecd1f16f5a59e40fca Mon Sep 17 00:00:00 2001
From: Alexander
Date: Wed, 4 Oct 2017 21:54:16 +0300
Subject: [PATCH 16/21] version

---
 CHANGES.SIMD.rst    | 16 ++++++++++++++--
 src/PIL/_version.py |  2 +-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/CHANGES.SIMD.rst b/CHANGES.SIMD.rst
index 2fb14c9bc..d1a2df404 100644
--- a/CHANGES.SIMD.rst
+++ b/CHANGES.SIMD.rst
@@ -1,6 +1,18 @@
 Changelog (Pillow-SIMD)
 =======================

+4.3.0.post0
+-----------
+
+- Float-based filters, single-band: 3x3 SSE4, 5x5 SSE4
+- Float-based filters, multi-band: 3x3 SSE4 & AVX2, 5x5 SSE4
+- Int-based filters, multi-band: 3x3 SSE4 & AVX2, 5x5 SSE4 & AVX2
+- Box blur: fast path for radius < 1
+- Alpha composite: fast div approximation
+- Color conversion: RGB to L SSE4, fast div in RGBa to RGBA
+- Resampling: optimized coefficients loading
+- Split and get_channel: SSE4
+
 3.4.1.post1
 -----------

@@ -21,7 +33,7 @@ Changelog (Pillow-SIMD)
 3.3.0.post2
 -----------

-- Fixed error in RGBa -> RGBA convertion
+- Fixed error in RGBa -> RGBA conversion

 3.3.0.post1
 -----------
@@ -41,7 +53,7 @@ Resampling
 - SSE4 and AVX2 fixed-point full loading horizontal pass.
 - SSE4 and AVX2 fixed-point full loading vertical pass.

-Convertion
+Conversion
 ~~~~~~~~~~

 - RGBA -> RGBa SSE4 and AVX2 fixed-point full loading implementations.
diff --git a/src/PIL/_version.py b/src/PIL/_version.py
index b5e4f0d75..eee9c701d 100644
--- a/src/PIL/_version.py
+++ b/src/PIL/_version.py
@@ -1,2 +1,2 @@
 # Master version for Pillow
-__version__ = '5.3.0'
+__version__ = '5.3.0.post0'

From 4dd3b6b10dd2ea8b528fd3189e4c25a7034a4774 Mon Sep 17 00:00:00 2001
From: Alexander
Date: Wed, 4 Oct 2017 22:15:44 +0300
Subject: [PATCH 17/21] fix typo

---
 CHANGES.SIMD.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CHANGES.SIMD.rst b/CHANGES.SIMD.rst
index d1a2df404..b5cce2787 100644
--- a/CHANGES.SIMD.rst
+++ b/CHANGES.SIMD.rst
@@ -16,7 +16,7 @@ Changelog (Pillow-SIMD)
 3.4.1.post1
 -----------

-- Critical memory error for some combinations of source/destinatnion
+- Critical memory error for some combinations of source/destination
   sizes is fixed.

 3.4.1.post0

From 18451ed6f368e077dff3ee59e7362be1b797da00 Mon Sep 17 00:00:00 2001
From: Alexander
Date: Fri, 13 Apr 2018 00:05:34 +0300
Subject: [PATCH 18/21] Update readme

---
 README.md | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 0136c201f..b112e0b7d 100644
--- a/README.md
+++ b/README.md
@@ -40,7 +40,7 @@ The project is supported by Uploadcare, a SAAS for cloud-based image storing and

 [![Uploadcare][uploadcare.logo]][uploadcare.com]

-In fact, Uploadcare has been running Pillow-SIMD for about two years now.
+In fact, Uploadcare has been running Pillow-SIMD for about three years now.
 The following image operations are currently SIMD-accelerated:

@@ -48,9 +48,10 @@ The following image operations are currently SIMD-accelerated:
 - Gaussian and box blur: SSE4
 - Alpha composition: SSE4, AVX2
 - RGBA → RGBa (alpha premultiplication): SSE4, AVX2
-- RGBa → RGBA (division by alpha): AVX2
-
-See [CHANGES](CHANGES.SIMD.rst) for more information.
+- RGBa → RGBA (division by alpha): SSE4, AVX2
+— RGB → L (grayscale): SSE4
+- 3x3 and 5x5 kernel filters: SSE4, AVX2
+- Split and get_channel: SSE4


 ## Benchmarks
@@ -120,7 +121,7 @@ All bugfixes to the original Pillow will then be transferred to the next Pillow-
   [original-changelog]: https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst
   [original-contribute]: https://github.com/python-pillow/Pillow/blob/master/.github/CONTRIBUTING.md
   [gaussian-blur-changes]: https://pillow.readthedocs.io/en/3.2.x/releasenotes/2.7.0.html#gaussian-blur-and-unsharp-mask
-  [pillow-perf-page]: https://python-pillow.org/pillow-perf/
+  [pillow-perf-page]: https://python-pillow.github.io/pillow-perf/
   [pillow-perf-repo]: https://github.com/python-pillow/pillow-perf
   [uploadcare.com]: https://uploadcare.com/?utm_source=github&utm_medium=description&utm_campaign=pillow-simd
   [uploadcare.logo]: https://ucarecdn.com/dc4b8363-e89f-402f-8ea8-ce606664069c/-/preview/

From 8278cd0abacfc50e33b615d792b0ab432e8100ad Mon Sep 17 00:00:00 2001
From: Alexander
Date: Tue, 17 Apr 2018 15:35:10 +0300
Subject: [PATCH 19/21] fix mark

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index b112e0b7d..598f37593 100644
--- a/README.md
+++ b/README.md
@@ -49,7 +49,7 @@ The following image operations are currently SIMD-accelerated:
 - Alpha composition: SSE4, AVX2
 - RGBA → RGBa (alpha premultiplication): SSE4, AVX2
 - RGBa → RGBA (division by alpha): SSE4, AVX2
-— RGB → L (grayscale): SSE4
+- RGB → L (grayscale): SSE4
 - 3x3 and 5x5 kernel filters: SSE4, AVX2
 - Split and get_channel: SSE4


From 8e3d76590547ade0ae21b30bb6327820ec70d3b4 Mon Sep 17 00:00:00 2001
From: Alexander
Date: Thu, 24 May 2018 14:26:52 +0300
Subject: [PATCH 20/21] Update Uploadcare logo in readme

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 598f37593..21b0eca66 100644
--- a/README.md
+++ b/README.md
@@ -124,4 +124,4 @@ All bugfixes to the original Pillow will then be transferred to the next Pillow-
   [pillow-perf-page]: https://python-pillow.github.io/pillow-perf/
   [pillow-perf-repo]: https://github.com/python-pillow/pillow-perf
   [uploadcare.com]: https://uploadcare.com/?utm_source=github&utm_medium=description&utm_campaign=pillow-simd
-  [uploadcare.logo]: https://ucarecdn.com/dc4b8363-e89f-402f-8ea8-ce606664069c/-/preview/
+  [uploadcare.logo]: https://ucarecdn.com/74c4d283-f7cf-45d7-924c-fc77345585af/uploadcare.svg

From 39cb82bed14d0934e62976252e4b90dbf35d529a Mon Sep 17 00:00:00 2001
From: Alexander
Date: Wed, 17 Oct 2018 17:15:02 +0300
Subject: [PATCH 21/21] Exclude tests from package

---
 MANIFEST.in | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MANIFEST.in b/MANIFEST.in
index 40b2ef5d7..38850bf3a 100644
--- a/MANIFEST.in
+++ b/MANIFEST.in
@@ -15,6 +15,7 @@ graft depends
 graft winbuild
 graft docs
 prune docs/_static
+prune Tests

 # build/src control detritus
 exclude .appveyor.yml