By interleaving little and big-endian entries we make sure entries
exist for both cases. Some additional entries created when the
big-endian was missing. I am not sure of what entry to create for the
big-endian, 4-bit case (what is the order of the two entries within the
byte?).
Some programs generate SamplesPerPixel entries in ExtraSamples instead
of SamplesPerPixel-3, cf. #1227. This is a stopgap measure to support
them. One could also decide to add generic code to always support
having SamplesPerPixel entries (by dropping the first 3).
To have the old API that always returns tuples, and fractions as pairs,
set the `legacy_api` attribute of the IFD to True.
This should alleviate concerns about backwards compatibility.
Do not represent scalar tags as 1-element tuples. Keep tag
type and count information in TiffTags.TAGS. Normalize data in
ImageFileDirectory.__setitem__: wrap and unwrap tuples as needed,
convert rationals to floats. (To ensure consistency, make the "tags"
attribute private.) Interpret byte data as a series of integers rather
than a bytearray (which should only map to the "undefined" type). On
Python3, if a str is assigned to an "undefined" tag, encode it as ASCII.
Note that a large number of tags have been removed from TiffTags.TAGS
because I do not have time to figure out the type and count of each of
them. They should be restored before this gets merged in.
This obviously breaks backwards compatibility in a lot of ways...
Fix for UnicodeDecodeError: ascii codec cannot decode byte while saving a TIFF image
Problem occured while saving TIFF images that contain non-ascii characters in metadata
Manually merged with master by wiredfool
cf. #1191.
Only TiffImagePlugin and OLEFileIO still rely on (their own) DEBUG flag.
I left TiffImagePlugin as it is because I hope #1059 gets merged in
first, and OLEFileIO because it uses its own logic.
Untested, as usual.
We regularly use this format to store 32bit floats and I would like to see it handled by clean Pillow installations without having to add it on every system I use.
There are two main issues fixed with this commit:
* bytes vs. str: All file, image, and palette data are now handled as
bytes. A new _binary module consolidates the hacks needed to do this
across Python versions. tostring/fromstring methods have been renamed to
tobytes/frombytes, but the Python 2.6/2.7 versions alias them to the old
names for compatibility. Users should move to tobytes/frombytes.
One other potentially-breaking change is that text data in image files
(such as tags, comments) are now explicitly handled with a specific
character encoding in mind. This works well with the Unicode str in
Python 3, but may trip up old code expecting a straight byte-for-byte
translation to a Python string. This also required a change to Gohlke's
tags tests (in Tests/test_file_png.py) to expect Unicode strings from
the code.
* True div vs. floor div: Many division operations used the "/" operator
to do floor division, which is now the "//" operator in Python 3. These
were fixed.
As of this commit, on the first pass, I have one failing test (improper
handling of a slice object in a C module, test_imagepath.py) in Python 3,
and three that that I haven't tried running yet (test_imagegl,
test_imagegrab, and test_imageqt). I also haven't tested anything on
Windows. All but the three skipped tests run flawlessly against Pythons
2.6 and 2.7.
This is, I guess, a few things the Python devs were just fed up with.
* "while 1" is now "while True"
* Types are compared with isinstance instead of ==
* Sort a list in one go with sorted()
My own twist is to also replace type('') with str, type(()) with tuple,
type([]) with list, type(1) with int, and type(5000.0) with float.