Do not represent scalar tags as 1-element tuples. Keep tag
type and count information in TiffTags.TAGS. Normalize data in
ImageFileDirectory.__setitem__: wrap and unwrap tuples as needed,
convert rationals to floats. (To ensure consistency, make the "tags"
attribute private.) Interpret byte data as a series of integers rather
than a bytearray (which should only map to the "undefined" type). On
Python 3, if a str is assigned to an "undefined" tag, encode it as ASCII.
Note that a large number of tags have been removed from TiffTags.TAGS
because I do not have time to figure out the type and count of each of
them. They should be restored before this gets merged in.
This obviously breaks backwards compatibility in a lot of ways...
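A rough sketch of the kind of normalization described above, for Python 3. It is not the actual ImageFileDirectory code: the TagDirectory class, the layout of the TAGS table, and the "kind" strings are all assumptions made for illustration. Only the tag numbers are real TIFF/EXIF IDs.

```python
from fractions import Fraction

# Hypothetical tag table: tag number -> (name, kind, count).
# count=None means "variable length". This layout is an assumption,
# not the real structure of TiffTags.TAGS.
TAGS = {
    256: ("ImageWidth", "int", 1),
    258: ("BitsPerSample", "int", None),
    282: ("XResolution", "rational", 1),
    37500: ("MakerNote", "undefined", 1),
}


class TagDirectory(object):
    """Hypothetical stand-in for a directory that normalizes on assignment."""

    def __init__(self):
        # Private storage, so every write goes through __setitem__.
        self._tags = {}

    def __setitem__(self, tag, value):
        name, kind, count = TAGS[tag]

        if kind == "undefined":
            # A str assigned to an "undefined" tag is encoded as ASCII.
            if isinstance(value, str):
                value = value.encode("ascii")
            self._tags[tag] = bytes(value)
            return

        if isinstance(value, (bytes, bytearray)):
            # Byte data is read as a series of integers.
            values = tuple(bytearray(value))
        elif isinstance(value, (tuple, list)):
            values = tuple(value)
        else:
            values = (value,)

        # Rationals become floats; everything else here is an integer.
        convert = float if kind == "rational" else int
        values = tuple(convert(v) for v in values)

        if count == 1:
            if len(values) != 1:
                raise ValueError("%s expects exactly one value" % name)
            self._tags[tag] = values[0]  # unwrap: scalar tags are not tuples
        else:
            self._tags[tag] = values     # wrap: multi-value tags stay tuples

    def __getitem__(self, tag):
        return self._tags[tag]


ifd = TagDirectory()
ifd[256] = (640,)            # stored as the scalar 640
ifd[258] = 8                 # stored as the tuple (8,)
ifd[282] = Fraction(72, 1)   # stored as the float 72.0
ifd[37500] = "abc"           # stored as b"abc"
```

Keeping type and count in one table means the wrapping rule lives in a single place instead of being repeated by every caller.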
Fix UnicodeDecodeError ("ascii codec cannot decode byte") when saving a TIFF image
The problem occurred when saving TIFF images that contain non-ASCII characters in their metadata
Manually merged with master by wiredfool
cf. #1191.
Only TiffImagePlugin and OLEFileIO still rely on (their own) DEBUG flag.
I left TiffImagePlugin as it is because I hope #1059 gets merged in
first, and OLEFileIO because it uses its own logic.
Untested, as usual.
We regularly use this format to store 32-bit floats, and I would like to see it handled by clean Pillow installations without having to add it on every system I use.
There are two main issues fixed with this commit:
* bytes vs. str: All file, image, and palette data are now handled as
bytes. A new _binary module consolidates the hacks needed to do this
across Python versions. tostring/fromstring methods have been renamed to
tobytes/frombytes, but the Python 2.6/2.7 versions alias them to the old
names for compatibility. Users should move to tobytes/frombytes.
One other potentially-breaking change is that text data in image files
(such as tags, comments) are now explicitly handled with a specific
character encoding in mind. This works well with the Unicode str in
Python 3, but may trip up old code expecting a straight byte-for-byte
translation to a Python string. This also required a change to Gohlke's
tags tests (in Tests/test_file_png.py) to expect Unicode strings from
the code.
* True div vs. floor div: Many division operations used the "/" operator
to do floor division, which is now the "//" operator in Python 3. These
were fixed.
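To illustrate the two points above, here is a minimal sketch. The helper names i16le, o16le, and bytes_per_row are picked for illustration and are not claimed to match what the _binary module contains. The latin-1 encoding is likewise just an example choice.

```python
from __future__ import division  # "/" is true division on Python 2 as well

import struct


def i16le(data, offset=0):
    """Read an unsigned little-endian 16-bit integer from bytes."""
    return struct.unpack_from("<H", data, offset)[0]


def o16le(value):
    """Write an unsigned little-endian 16-bit integer as bytes."""
    return struct.pack("<H", value)


# File data stays bytes from end to end.
header = o16le(0x4949) + o16le(42)
magic = i16le(header, 2)

# Text found in a file is decoded with an explicit encoding,
# rather than being treated as a byte-for-byte Python string.
comment = b"caf\xe9".decode("latin-1")


def bytes_per_row(width, bits_per_pixel):
    """Round up to whole bytes; "//" keeps the result an integer."""
    return (width * bits_per_pixel + 7) // 8


stride = bytes_per_row(10, 1)  # floor division: stays an int
ratio = 7 / 2                  # true division: a float
```

With "/" in bytes_per_row the result would be a float on Python 3, which breaks anything that uses it as a size or an index.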
As of this commit, on the first pass, I have one failing test (improper
handling of a slice object in a C module, test_imagepath.py) in Python 3,
and three that I haven't tried running yet (test_imagegl,
test_imagegrab, and test_imageqt). I also haven't tested anything on
Windows. All but the three skipped tests run flawlessly against Pythons
2.6 and 2.7.
These are, I guess, a few things the Python devs were just fed up with.
* "while 1" is now "while True"
* Types are compared with isinstance instead of ==
* Sort a list in one go with sorted()
My own twist is to also replace type('') with str, type(()) with tuple,
type([]) with list, type(1) with int, and type(5000.0) with float.
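A small sketch showing the three replacements side by side. The function flatten_sizes is made up for illustration and does not come from the code base.

```python
def flatten_sizes(items):
    """Collect the numbers from a nested sequence and return them sorted."""
    sizes = []
    queue = list(items)
    while True:                                  # was: while 1
        if not queue:
            break
        item = queue.pop()
        if isinstance(item, (tuple, list)):      # was: type(item) == type(())
            queue.extend(item)
        elif isinstance(item, (int, float)):     # was: type(item) == type(1)
            sizes.append(item)
    return sorted(sizes)                         # was: sizes.sort(); return sizes


result = flatten_sizes([3, (1, [2.5]), "x"])
```

Unlike a comparison against type(()), isinstance also accepts subclasses, so a namedtuple would take the tuple branch here.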
In py3k, imports are absolute unless using the "from . import" syntax.
This commit also solves a recursive import between Image, ImageColor, and
ImagePalette by delay-importing ImagePalette in Image.
I'm not too keen on this commit because the syntax is ugly. I might go back
and prefer the prettier "from PIL import".
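The delayed import can be sketched with a throwaway package. Everything here (demo_pkg, image.py, palette.py, new_palette) is hypothetical and only mirrors the shape of the problem. It is not the real Image/ImagePalette code.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Hypothetical package: "palette" imports "image" at module level, while
# "image" needs "palette" only inside one function.
SOURCES = {
    "__init__.py": "",
    "image.py": """
        MODES = ("L", "P", "RGB")

        def new_palette():
            # Delayed import: resolved on first call, not at import time.
            from . import palette
            return palette.Palette()
    """,
    "palette.py": """
        from . import image

        class Palette(object):
            modes = image.MODES
    """,
}


def build_demo_package(root, name="demo_pkg"):
    """Write the hypothetical package into root and return its name."""
    pkg_dir = os.path.join(root, name)
    os.mkdir(pkg_dir)
    for filename, body in SOURCES.items():
        with open(os.path.join(pkg_dir, filename), "w") as handle:
            handle.write(textwrap.dedent(body))
    return name


tmp_root = tempfile.mkdtemp()
pkg_name = build_demo_package(tmp_root)
sys.path.insert(0, tmp_root)
importlib.invalidate_caches()

image = importlib.import_module(pkg_name + ".image")
palette_obj = image.new_palette()  # first call triggers the import of palette
```

Because image.py no longer imports palette at module level, either module can be imported first without hitting a half-initialized partner.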
What's really going on is that map() and filter() return iterators in py3k.
I've just gone ahead and turned them all into list comprehensions, because
I find them much easier to read.
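For illustration, with made-up data (the widths list is not from the code base):

```python
widths = [10, 0, 25, 7]

# On Python 3 these are lazy iterators: no len(), no indexing, and they
# can only be consumed once.
doubled_iter = map(lambda w: w * 2, widths)
nonzero_iter = filter(None, widths)

# The list comprehensions give real lists on both Python 2 and Python 3.
doubled = [w * 2 for w in widths]
nonzero = [w for w in widths if w]
```

Wrapping each call in list() would also have worked. The comprehensions were chosen here purely for readability.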