Merge branch 'winbuild' of https://github.com/wiredfool/Pillow into winbuild

This commit is contained in:
wiredfool 2014-06-05 00:14:39 +00:00
commit ef3ac73cfc
23 changed files with 695 additions and 101 deletions

@ -1,18 +1,15 @@
language: python language: python
# for python-qt4
virtualenv:
system_site_packages: true
notifications: notifications:
irc: "chat.freenode.net#pil" irc: "chat.freenode.net#pil"
python: python:
- "pypy"
- 2.6 - 2.6
- 2.7 - 2.7
- 3.2 - 3.2
- 3.3 - 3.3
- "pypy" - 3.4
install: install:
- "sudo apt-get -qq install libfreetype6-dev liblcms2-dev python-qt4 ghostscript libffi-dev cmake" - "sudo apt-get -qq install libfreetype6-dev liblcms2-dev python-qt4 ghostscript libffi-dev cmake"
@ -42,3 +39,4 @@ after_success:
matrix: matrix:
allow_failures: allow_failures:
- python: "pypy" - python: "pypy"
- python: 3.4

@ -4,6 +4,18 @@ Changelog (Pillow)
2.5.0 (unreleased) 2.5.0 (unreleased)
------------------ ------------------
- Added Image.close, context manager support.
[wiredfool]
- Added support for 16 bit PGM files.
[wiredfool]
- Updated OleFileIO to version 0.30 from upstream
[hugovk]
- Added support for additional TIFF floating point format
[Hijackal]
- Have the tempfile use a suffix with a dot - Have the tempfile use a suffix with a dot
[wiredfool] [wiredfool]

@ -92,7 +92,7 @@ except ImportError:
from PIL import ImageMode from PIL import ImageMode
from PIL._binary import i8, o8 from PIL._binary import i8, o8
from PIL._util import isPath, isStringType from PIL._util import isPath, isStringType, deferred_error
import os, sys import os, sys
@ -497,6 +497,35 @@ class Image:
_makeself = _new # compatibility _makeself = _new # compatibility
# Context Manager Support
def __enter__(self):
return self
def __exit__(self, *args):
self.close()
def close(self):
"""
Closes the file pointer, if possible.
This operation will destroy the image core and release its memory.
The image data will be unusable afterward.
This function is only required to close images that have not
had their file read and closed by the
:py:meth:`~PIL.Image.Image.load` method.
"""
try:
self.fp.close()
except Exception as msg:
if DEBUG:
print("Error closing: %s" % msg)
# Instead of simply setting to None, we're setting up a
# deferred error that will better explain that the core image
# object is gone.
self.im = deferred_error(ValueError("Operation on closed image"))
def _copy(self): def _copy(self):
self.load() self.load()
self.im = self.im.copy() self.im = self.im.copy()
@ -642,7 +671,8 @@ class Image:
Allocates storage for the image and loads the pixel data. In Allocates storage for the image and loads the pixel data. In
normal cases, you don't need to call this method, since the normal cases, you don't need to call this method, since the
Image class automatically loads an opened image when it is Image class automatically loads an opened image when it is
accessed for the first time. accessed for the first time. This method will close the file
associated with the image.
:returns: An image access object. :returns: An image access object.
""" """
@ -2074,10 +2104,11 @@ def open(fp, mode="r"):
""" """
Opens and identifies the given image file. Opens and identifies the given image file.
This is a lazy operation; this function identifies the file, but the This is a lazy operation; this function identifies the file, but
actual image data is not read from the file until you try to process the file remains open and the actual image data is not read from
the data (or call the :py:meth:`~PIL.Image.Image.load` method). the file until you try to process the data (or call the
See :py:func:`~PIL.Image.new`. :py:meth:`~PIL.Image.Image.load` method). See
:py:func:`~PIL.Image.new`.
:param file: A filename (string) or a file object. The file object :param file: A filename (string) or a file object. The file object
must implement :py:meth:`~file.read`, :py:meth:`~file.seek`, and must implement :py:meth:`~file.read`, :py:meth:`~file.seek`, and
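The pattern that `close()` introduces in this diff (context manager support, plus swapping the image core for an object that raises on use) can be tried in isolation. The following is a minimal sketch under stated assumptions: `FakeImage` and `DeferredError` are hypothetical stand-ins written for illustration, not Pillow's `Image` class or the actual `PIL._util.deferred_error` implementation, which this diff does not show.

```python
import io


class DeferredError(object):
    """Hypothetical stand-in: raises the stored exception on any attribute access."""

    def __init__(self, ex):
        self.ex = ex

    def __getattr__(self, name):
        # Only reached for attributes that are not found normally,
        # i.e. everything except `ex`.
        raise self.ex


class FakeImage(object):
    """Hypothetical class that mimics the close()/context-manager pattern."""

    def __init__(self, fp):
        self.fp = fp
        self.im = bytearray(b"pixels")  # placeholder for the image core

    def __enter__(self):
        return self

    def __exit__(self, *args):
        self.close()

    def close(self):
        try:
            self.fp.close()
        except Exception:
            pass  # closing is best effort, as in the diff
        self.im = DeferredError(ValueError("Operation on closed image"))


with FakeImage(io.BytesIO(b"data")) as img:
    size = len(img.im)   # the core is usable inside the block

try:
    img.im.copy()        # any use after close raises the stored error
except ValueError as exc:
    message = str(exc)
```

Compared with setting the attribute to `None`, the caller gets a `ValueError` that names the cause instead of an `AttributeError` on `NoneType`.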

@ -89,8 +89,8 @@ try:
except ImportError as ex: except ImportError as ex:
# Allow error import for doc purposes, but error out when accessing # Allow error import for doc purposes, but error out when accessing
# anything in core. # anything in core.
from _util import import_err from _util import deferred_error
_imagingcms = import_err(ex) _imagingcms = deferred_error(ex)
from PIL._util import isStringType from PIL._util import isStringType
core = _imagingcms core = _imagingcms

@ -1,27 +1,22 @@
OleFileIO_PL OleFileIO_PL
============ ============
[OleFileIO_PL](http://www.decalage.info/python/olefileio) is a Python module to read [Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format)](http://en.wikipedia.org/wiki/Compound_File_Binary_Format), such as Microsoft Office documents, Image Composer and FlashPix files, Outlook messages, ... [OleFileIO_PL](http://www.decalage.info/python/olefileio) is a Python module to parse and read [Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format)](http://en.wikipedia.org/wiki/Compound_File_Binary_Format), such as Microsoft Office documents, Image Composer and FlashPix files, Outlook messages, StickyNotes, several Microscopy file formats ...
This is an improved version of the OleFileIO module from [PIL](http://www.pythonware.com/products/pil/index.htm), the excellent Python Imaging Library, created and maintained by Fredrik Lundh. The API is still compatible with PIL, but I have improved the internal implementation significantly, with new features, bugfixes and a more robust design. This is an improved version of the OleFileIO module from [PIL](http://www.pythonware.com/products/pil/index.htm), the excellent Python Imaging Library, created and maintained by Fredrik Lundh. The API is still compatible with PIL, but since 2005 I have improved the internal implementation significantly, with new features, bugfixes and a more robust design.
As far as I know, this module is now the most complete and robust Python implementation to read MS OLE2 files, portable on several operating systems. (please tell me if you know other similar Python modules) As far as I know, this module is now the most complete and robust Python implementation to read MS OLE2 files, portable on several operating systems. (please tell me if you know other similar Python modules)
WARNING: THIS IS (STILL) WORK IN PROGRESS. OleFileIO_PL can be used as an independent module or with PIL. The goal is to have it integrated into [Pillow](http://python-imaging.github.io/), the friendly fork of PIL.
Main improvements over PIL version of OleFileIO: OleFileIO\_PL is mostly meant for developers. If you are looking for tools to analyze OLE files or to extract data, then please also check [python-oletools](http://www.decalage.info/python/oletools), which are built upon OleFileIO_PL.
------------------------------------------------
- Better compatibility with Python 2.6 (also compatible with Python 3.0+)
- Support for files larger than 6.8MB
- Robust: many checks to detect malformed files
- Improved API
- New features: metadata extraction, stream/storage timestamps
- Added setup.py and install.bat to ease installation
News News
---- ----
Follow all updates and news on Twitter: <https://twitter.com/decalage2>
- **2014-02-04 v0.30**: now compatible with Python 3.x, thanks to Martin Panter who did most of the hard work.
- 2013-07-24 v0.26: added methods to parse stream/storage timestamps, improved listdir to include storages, fixed parsing of direntry timestamps - 2013-07-24 v0.26: added methods to parse stream/storage timestamps, improved listdir to include storages, fixed parsing of direntry timestamps
- 2013-05-27 v0.25: improved metadata extraction, properties parsing and exception handling, fixed [issue #12](https://bitbucket.org/decalage/olefileio_pl/issue/12/error-when-converting-timestamps-in-ole) - 2013-05-27 v0.25: improved metadata extraction, properties parsing and exception handling, fixed [issue #12](https://bitbucket.org/decalage/olefileio_pl/issue/12/error-when-converting-timestamps-in-ole)
- 2013-05-07 v0.24: new features to extract metadata (get\_metadata method and OleMetadata class), improved getproperties to convert timestamps to Python datetime - 2013-05-07 v0.24: new features to extract metadata (get\_metadata method and OleMetadata class), improved getproperties to convert timestamps to Python datetime
@ -34,47 +29,224 @@ News
- 2009-12-10 v0.19: fixed support for 64 bits platforms (thanks to Ben G. and Martijn for reporting the bug) - 2009-12-10 v0.19: fixed support for 64 bits platforms (thanks to Ben G. and Martijn for reporting the bug)
- see changelog in source code for more info. - see changelog in source code for more info.
Download: Download
--------- --------
The archive is available on [the project page](https://bitbucket.org/decalage/olefileio_pl/downloads). The archive is available on [the project page](https://bitbucket.org/decalage/olefileio_pl/downloads).
Features
--------
How to use this module: - Parse and read any OLE file such as Microsoft Office 97-2003 legacy document formats (Word .doc, Excel .xls, PowerPoint .ppt, Visio .vsd, Project .mpp), Image Composer and FlashPix files, Outlook messages, StickyNotes, Zeiss AxioVision ZVI files, Olympus FluoView OIB files, ...
----------------------- - List all the streams and storages contained in an OLE file
- Open streams as files
- Parse and read property streams, containing metadata of the file
- Portable, pure Python module, no dependency
See sample code at the end of the module, and also docstrings.
Here are a few examples: Main improvements over the original version of OleFileIO in PIL:
----------------------------------------------------------------
- Compatible with Python 3.x and 2.6+
- Many bug fixes
- Support for files larger than 6.8MB
- Support for 64 bits platforms and big-endian CPUs
- Robust: many checks to detect malformed files
- Runtime option to choose if malformed files should be parsed or raise exceptions
- Improved API
- Metadata extraction, stream/storage timestamps (e.g. for document forensics)
- Can open file-like objects
- Added setup.py and install.bat to ease installation
- More convenient slash-based syntax for stream paths
How to use this module
----------------------
OleFileIO_PL can be used as an independent module or with PIL. The main functions and methods are explained below.
For more information, see also the file **OleFileIO_PL.html**, sample code at the end of the module itself, and docstrings within the code.
### About the structure of OLE files ###
An OLE file can be seen as a mini file system or a Zip archive: It contains **streams** of data that look like files embedded within the OLE file. Each stream has a name. For example, the main stream of a MS Word document containing its text is named "WordDocument".
An OLE file can also contain **storages**. A storage is a folder that contains streams or other storages. For example, a MS Word document with VBA macros has a storage called "Macros".
Special streams can contain **properties**. A property is a specific value that can be used to store information such as the metadata of a document (title, author, creation date, etc). Property stream names usually start with the character '\x05'.
For example, a typical MS Word document may look like this:
\x05DocumentSummaryInformation (stream)
\x05SummaryInformation (stream)
WordDocument (stream)
Macros (storage)
PROJECT (stream)
PROJECTwm (stream)
VBA (storage)
Module1 (stream)
ThisDocument (stream)
_VBA_PROJECT (stream)
dir (stream)
ObjectPool (storage)
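To make the file-system analogy concrete, the example layout can be modelled with nested dictionaries and flattened into per-stream paths. This is only an illustrative sketch: the dictionary model and the `list_streams` helper are assumptions made here, not how OleFileIO_PL stores directory entries internally.

```python
# Storages are dicts, streams are None. This models the example layout only.
word_doc = {
    '\x05DocumentSummaryInformation': None,
    '\x05SummaryInformation': None,
    'WordDocument': None,
    'Macros': {
        'PROJECT': None,
        'PROJECTwm': None,
        'VBA': {
            'Module1': None,
            'ThisDocument': None,
            '_VBA_PROJECT': None,
            'dir': None,
        },
    },
    'ObjectPool': {},
}


def list_streams(storage, prefix=()):
    """Return every stream as a list of path components (hypothetical helper)."""
    paths = []
    for name in sorted(storage):
        entry = storage[name]
        if entry is None:
            paths.append(list(prefix) + [name])
        else:
            paths.extend(list_streams(entry, prefix + (name,)))
    return paths


streams = list_streams(word_doc)
```

Each stream ends up as a list of names from the root down, for example `['Macros', 'VBA', 'ThisDocument']`.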
### Import OleFileIO_PL ###
:::python :::python
import OleFileIO_PL import OleFileIO_PL
# Test if a file is an OLE container: As of version 0.30, the code has been changed to be compatible with Python 3.x. As a consequence, compatibility with Python 2.5 or older is not provided anymore. However, a copy of v0.26 is available as OleFileIO_PL2.py. If your application needs to be compatible with Python 2.5 or older, you may use the following code to load the old version when needed:
:::python
try:
import OleFileIO_PL
except ImportError:
import OleFileIO_PL2 as OleFileIO_PL
If you think OleFileIO_PL should stay compatible with Python 2.5 or older, please [contact me](http://decalage.info/contact).
### Test if a file is an OLE container ###
Use isOleFile to check whether the first bytes of the file contain the magic number of OLE files before opening it. isOleFile returns True if it is an OLE file, False otherwise (new in v0.16).
:::python
assert OleFileIO_PL.isOleFile('myfile.doc') assert OleFileIO_PL.isOleFile('myfile.doc')
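The idea behind this check can be sketched without the library. The helper below is hypothetical (`looks_like_ole` is not part of OleFileIO_PL) and works on an already opened binary file object; the signature bytes are the `MAGIC` value from the module source in this commit.

```python
import io

# Signature found at the start of OLE files. The hex and octal spellings
# that appear in the module source are the same eight bytes.
MAGIC = b'\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1'
assert MAGIC == b'\320\317\021\340\241\261\032\341'


def looks_like_ole(fileobj):
    """Hypothetical helper: compare the first bytes of a binary file object to MAGIC."""
    header = fileobj.read(len(MAGIC))
    return header == MAGIC


ole_like = looks_like_ole(io.BytesIO(MAGIC + b'\x00' * 504))
not_ole = looks_like_ole(io.BytesIO(b'PK\x03\x04 zip data'))
```

A matching signature only shows that the file claims to be an OLE container; the rest of the structure can still be malformed.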
# Open an OLE file from disk:
### Open an OLE file from disk ###
Create an OleFileIO object with the file path as parameter:
:::python
ole = OleFileIO_PL.OleFileIO('myfile.doc') ole = OleFileIO_PL.OleFileIO('myfile.doc')
# Get list of streams: ### Open an OLE file from a file-like object ###
This is useful if the file is not on disk, e.g. already stored in a string or as a file-like object.
:::python
ole = OleFileIO_PL.OleFileIO(f)
For example the code below reads a file into a string, then uses BytesIO to turn it into a file-like object.
:::python
import io
data = open('myfile.doc', 'rb').read()
f = io.BytesIO(data) # or StringIO.StringIO for Python 2.x
ole = OleFileIO_PL.OleFileIO(f)
### How to handle malformed OLE files ###
By default, the parser is configured to be as robust and permissive as possible, so that most malformed OLE files can still be parsed. Only fatal errors will raise an exception. It is possible to tell the parser to be more strict in order to raise exceptions for files that do not fully conform to the OLE specifications, using the raise_defects option (new in v0.14):
:::python
ole = OleFileIO_PL.OleFileIO('myfile.doc', raise_defects=OleFileIO_PL.DEFECT_INCORRECT)
Once parsing is done, the non-fatal issues that were detected are available as a list in the parsing_issues attribute of the OleFileIO object (new in v0.25):
:::python
print('Non-fatal issues raised during parsing:')
if ole.parsing_issues:
for exctype, msg in ole.parsing_issues:
print('- %s: %s' % (exctype.__name__, msg))
else:
print('None')
### Syntax for stream and storage path ###
Two different syntaxes are allowed for methods that need or return the path of streams and storages:
1) Either a **list of strings** including all the storages from the root up to the stream/storage name. For example a stream called "WordDocument" at the root will have ['WordDocument'] as full path. A stream called "ThisDocument" located in the storage "Macros/VBA" will be ['Macros', 'VBA', 'ThisDocument']. This is the original syntax from PIL. While hard to read and not very convenient, this syntax works in all cases.
2) Or a **single string with slashes** to separate storage and stream names (similar to the Unix path syntax). The previous examples would be 'WordDocument' and 'Macros/VBA/ThisDocument'. This syntax is easier, but may fail if a stream or storage name contains a slash. (new in v0.15)
Both are case-insensitive.
Switching between the two is easy:
:::python
slash_path = '/'.join(list_path)
list_path = slash_path.split('/')
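Code that must accept both syntaxes can normalise to the list form first. This is a small sketch with hypothetical helpers (`to_list_path` and `same_entry` are not library functions), and the `lower()` comparison is only an approximation of the case-insensitive matching described above.

```python
def to_list_path(path):
    """Accept either syntax and return the list form (hypothetical helper)."""
    if isinstance(path, str):
        return path.split('/')
    return list(path)


def same_entry(a, b):
    """Compare two paths case-insensitively, whichever syntax each one uses."""
    return ([p.lower() for p in to_list_path(a)] ==
            [p.lower() for p in to_list_path(b)])


list_path = to_list_path('Macros/VBA/ThisDocument')
slash_path = '/'.join(['Macros', 'VBA', 'ThisDocument'])
match = same_entry('macros/vba/thisdocument', ['Macros', 'VBA', 'ThisDocument'])
```

As noted above, the split is only safe when no stream or storage name contains a slash.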
### Get the list of streams ###
listdir() returns a list of all the streams contained in the OLE file, including those stored in storages. Each stream is listed itself as a list, as described above.
:::python
print(ole.listdir()) print(ole.listdir())
# Test if known streams/storages exist: Sample result:
:::python
[['\x01CompObj'], ['\x05DocumentSummaryInformation'], ['\x05SummaryInformation']
, ['1Table'], ['Macros', 'PROJECT'], ['Macros', 'PROJECTwm'], ['Macros', 'VBA',
'Module1'], ['Macros', 'VBA', 'ThisDocument'], ['Macros', 'VBA', '_VBA_PROJECT']
, ['Macros', 'VBA', 'dir'], ['ObjectPool'], ['WordDocument']]
As an option it is possible to choose if storages should also be listed, with or without streams (new in v0.26):
:::python
ole.listdir(streams=False, storages=True)
### Test if known streams/storages exist: ###
exists(path) checks if a given stream or storage exists in the OLE file (new in v0.16).
:::python
if ole.exists('worddocument'): if ole.exists('worddocument'):
print("This is a Word document.") print("This is a Word document.")
print("size :", ole.get_size('worddocument'))
if ole.exists('macros/vba'): if ole.exists('macros/vba'):
print("This document seems to contain VBA macros.") print("This document seems to contain VBA macros.")
# Extract the "Pictures" stream from a PPT file:
if ole.exists('Pictures'): ### Read data from a stream ###
openstream(path) opens a stream as a file-like object.
The following example extracts the "Pictures" stream from a PPT file:
:::python
pics = ole.openstream('Pictures') pics = ole.openstream('Pictures')
data = pics.read() data = pics.read()
f = open('Pictures.bin', 'wb')
f.write(data)
f.close()
# Extract metadata (new in v0.24) - see source code for all attributes:
### Get information about a stream/storage ###
Several methods can provide the size, type and timestamps of a given stream/storage:
get_size(path) returns the size of a stream in bytes (new in v0.16):
:::python
s = ole.get_size('WordDocument')
get_type(path) returns the type of a stream/storage, as one of the following constants: STGTY\_STREAM for a stream, STGTY\_STORAGE for a storage, STGTY\_ROOT for the root entry, and False for a non existing path (new in v0.15).
:::python
t = ole.get_type('WordDocument')
get\_ctime(path) and get\_mtime(path) return the creation and modification timestamps of a stream/storage, as a Python datetime object with UTC timezone. Please note that these timestamps are only present if the application that created the OLE file explicitly stored them, which is rarely the case. When not present, these methods return None (new in v0.26).
:::python
c = ole.get_ctime('WordDocument')
m = ole.get_mtime('WordDocument')
The root storage is a special case: You can get its creation and modification timestamps using the OleFileIO.root attribute (new in v0.26):
:::python
c = ole.root.getctime()
m = ole.root.getmtime()
### Extract metadata ###
get_metadata() will check if standard property streams exist, parse all the properties they contain, and return an OleMetadata object with the found properties as attributes (new in v0.24).
:::python
meta = ole.get_metadata() meta = ole.get_metadata()
print('Author:', meta.author) print('Author:', meta.author)
print('Title:', meta.title) print('Title:', meta.title)
@ -82,29 +254,67 @@ Here are a few examples:
# print all metadata: # print all metadata:
meta.dump() meta.dump()
# Close the OLE file: Available attributes include:
codepage, title, subject, author, keywords, comments, template,
last_saved_by, revision_number, total_edit_time, last_printed, create_time,
last_saved_time, num_pages, num_words, num_chars, thumbnail,
creating_application, security, codepage_doc, category, presentation_target,
bytes, lines, paragraphs, slides, notes, hidden_slides, mm_clips,
scale_crop, heading_pairs, titles_of_parts, manager, company, links_dirty,
chars_with_spaces, unused, shared_doc, link_base, hlinks, hlinks_changed,
version, dig_sig, content_type, content_status, language, doc_version
See the source code of the OleMetadata class for more information.
### Parse a property stream ###
getproperties(path) can be used to parse any property stream that is not handled by get\_metadata. It returns a dictionary indexed by integers. Each integer is the index of the property, pointing to its value. For example in the standard property stream '\x05SummaryInformation', the document title is property #2, and the subject is #3.
:::python
p = ole.getproperties('specialprops')
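The shape of the returned dictionary can be illustrated without opening a file. In this sketch the values are made up and `SUMMARY_NAMES` is a hypothetical lookup table; only the two property indexes (2 for the title, 3 for the subject) come from the text above.

```python
# Shape of a getproperties() result: integer property IDs mapped to values.
# The values here are invented for illustration.
props = {2: 'Quarterly report', 3: 'Finance'}

# Hypothetical lookup table for readable names.
SUMMARY_NAMES = {2: 'title', 3: 'subject'}

named = {SUMMARY_NAMES.get(pid, 'property_%d' % pid): value
         for pid, value in props.items()}
```

Indexes missing from the table keep a generic `property_N` name, so unknown properties are not lost.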
By default, as in the original PIL version, timestamp properties are converted into a number of seconds since Jan 1, 1601. With the option convert\_time, you can obtain more convenient Python datetime objects (UTC timezone). If some time properties should not be converted (such as total editing time in '\x05SummaryInformation'), the list of indexes can be passed as no_conversion (new in v0.25):
:::python
p = ole.getproperties('specialprops', convert_time=True, no_conversion=[10])
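For values left in the default form, the conversion from seconds since Jan 1, 1601 to a datetime is simple arithmetic. The helper below is a hypothetical sketch of that arithmetic, not the library's own conversion code.

```python
import datetime

# Start of the epoch mentioned above for unconverted timestamp properties.
EPOCH_1601 = datetime.datetime(1601, 1, 1, 0, 0, 0)


def seconds_since_1601_to_datetime(seconds):
    """Hypothetical helper: turn seconds since Jan 1, 1601 into a naive UTC datetime."""
    return EPOCH_1601 + datetime.timedelta(seconds=seconds)


# 11644473600 seconds separate Jan 1, 1601 from the Unix epoch (Jan 1, 1970).
unix_epoch = seconds_since_1601_to_datetime(11644473600)
```

The result carries no tzinfo; it is meant to be read as UTC.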
### Close the OLE file ###
Unless your application is a simple script that terminates after processing an OLE file, do not forget to close each OleFileIO object after parsing to close the file on disk. (new in v0.22)
:::python
ole.close() ole.close()
# Work with a file-like object (e.g. StringIO) instead of a file on disk: ### Use OleFileIO_PL as a script ###
data = open('myfile.doc', 'rb').read()
f = io.BytesIO(data)
ole = OleFileIO_PL.OleFileIO(f)
print(ole.listdir())
ole.close()
OleFileIO_PL can also be used as a script from the command-line to display the structure of an OLE file and its metadata, for example:
It can also be used as a script from the command-line to display the structure of an OLE file, for example:
OleFileIO_PL.py myfile.doc OleFileIO_PL.py myfile.doc
You can use the option -c to check that all streams can be read fully, and -d to generate very verbose debugging information.
## Real-life examples ##
A real-life example: [using OleFileIO_PL for malware analysis and forensics](http://blog.gregback.net/2011/03/using-remnux-for-forensic-puzzle-6/). A real-life example: [using OleFileIO_PL for malware analysis and forensics](http://blog.gregback.net/2011/03/using-remnux-for-forensic-puzzle-6/).
How to contribute: See also [this paper](https://computer-forensics.sans.org/community/papers/gcfa/grow-forensic-tools-taxonomy-python-libraries-helpful-forensic-analysis_6879) about python tools for forensics, which features OleFileIO_PL.
------------------
About Python 2 and 3
--------------------
OleFileIO\_PL used to support only Python 2.x. As of version 0.30, the code has been changed to be compatible with Python 3.x. As a consequence, compatibility with Python 2.5 or older is not provided anymore. However, a copy of v0.26 is available as OleFileIO_PL2.py. See above the "import" section for a workaround.
If you think OleFileIO_PL should stay compatible with Python 2.5 or older, please [contact me](http://decalage.info/contact).
How to contribute
-----------------
The code is available in [a Mercurial repository on bitbucket](https://bitbucket.org/decalage/olefileio_pl). You may use it to submit enhancements or to report any issue. The code is available in [a Mercurial repository on bitbucket](https://bitbucket.org/decalage/olefileio_pl). You may use it to submit enhancements or to report any issue.
If you would like to help us improve this module, or simply provide feedback, you may also send an e-mail to decalage(at)laposte.net. You can help in many ways: If you would like to help us improve this module, or simply provide feedback, please [contact me](http://decalage.info/contact). You can help in many ways:
- test this module on different platforms / Python versions - test this module on different platforms / Python versions
- find and report bugs - find and report bugs
@ -112,12 +322,12 @@ If you would like to help us improve this module, or simply provide feedback, yo
- write unittest test cases - write unittest test cases
- provide tricky malformed files - provide tricky malformed files
How to report bugs: How to report bugs
------------------- ------------------
To report a bug, for example a normal file which is not parsed correctly, please use the [issue reporting page](https://bitbucket.org/decalage/olefileio_pl/issues?status=new&status=open), or send an e-mail with an attachment containing the debugging output of OleFileIO_PL. To report a bug, for example a normal file which is not parsed correctly, please use the [issue reporting page](https://bitbucket.org/decalage/olefileio_pl/issues?status=new&status=open), or if you prefer to do it privately, use this [contact form](http://decalage.info/contact). Please provide all the information about the context and how to reproduce the bug.
For this, launch the following command : If possible, please attach the debugging output of OleFileIO_PL. To generate it, launch the following command :
OleFileIO_PL.py -d -c file >debug.txt OleFileIO_PL.py -d -c file >debug.txt
@ -126,7 +336,7 @@ License
OleFileIO_PL is open-source. OleFileIO_PL is open-source.
OleFileIO_PL changes are Copyright (c) 2005-2013 by Philippe Lagadec. OleFileIO_PL changes are Copyright (c) 2005-2014 by Philippe Lagadec.
The Python Imaging Library (PIL) is The Python Imaging Library (PIL) is

@ -2,11 +2,12 @@
# -*- coding: latin-1 -*- # -*- coding: latin-1 -*-
""" """
OleFileIO_PL: OleFileIO_PL:
Module to read Microsoft OLE2 files (also called Structured Storage or Module to read Microsoft OLE2 files (also called Structured Storage or
Microsoft Compound Document File Format), such as Microsoft Office Microsoft Compound Document File Format), such as Microsoft Office
documents, Image Composer and FlashPix files, Outlook messages, ... documents, Image Composer and FlashPix files, Outlook messages, ...
This version is compatible with Python 2.6+ and 3.x
version 0.26 2013-07-24 Philippe Lagadec - http://www.decalage.info version 0.30 2014-02-04 Philippe Lagadec - http://www.decalage.info
Project website: http://www.decalage.info/python/olefileio Project website: http://www.decalage.info/python/olefileio
@ -16,25 +17,30 @@ See: http://www.pythonware.com/products/pil/index.htm
The Python Imaging Library (PIL) is The Python Imaging Library (PIL) is
Copyright (c) 1997-2005 by Secret Labs AB Copyright (c) 1997-2005 by Secret Labs AB
Copyright (c) 1995-2005 by Fredrik Lundh Copyright (c) 1995-2005 by Fredrik Lundh
OleFileIO_PL changes are Copyright (c) 2005-2013 by Philippe Lagadec OleFileIO_PL changes are Copyright (c) 2005-2014 by Philippe Lagadec
See source code and LICENSE.txt for information on usage and redistribution. See source code and LICENSE.txt for information on usage and redistribution.
WARNING: THIS IS (STILL) WORK IN PROGRESS. WARNING: THIS IS (STILL) WORK IN PROGRESS.
""" """
from __future__ import print_function # Starting with OleFileIO_PL v0.30, only Python 2.6+ and 3.x are supported
# This import enables print() as a function rather than a keyword
# (main requirement to be compatible with Python 3.x)
# The comment on the line below should be printed on Python 2.5 or older:
from __future__ import print_function # This version of OleFileIO_PL requires Python 2.6+ or 3.x.
__author__ = "Philippe Lagadec, Fredrik Lundh (Secret Labs AB)" __author__ = "Philippe Lagadec, Fredrik Lundh (Secret Labs AB)"
__date__ = "2013-07-24" __date__ = "2014-02-04"
__version__ = '0.26' __version__ = '0.30'
#--- LICENSE ------------------------------------------------------------------ #--- LICENSE ------------------------------------------------------------------
# OleFileIO_PL is an improved version of the OleFileIO module from the # OleFileIO_PL is an improved version of the OleFileIO module from the
# Python Imaging Library (PIL). # Python Imaging Library (PIL).
# OleFileIO_PL changes are Copyright (c) 2005-2013 by Philippe Lagadec # OleFileIO_PL changes are Copyright (c) 2005-2014 by Philippe Lagadec
# #
# The Python Imaging Library (PIL) is # The Python Imaging Library (PIL) is
# Copyright (c) 1997-2005 by Secret Labs AB # Copyright (c) 1997-2005 by Secret Labs AB
@ -133,9 +139,14 @@ __version__ = '0.26'
# of a directory entry or a storage/stream # of a directory entry or a storage/stream
# - fixed parsing of direntry timestamps # - fixed parsing of direntry timestamps
# 2013-07-24 PL: - new options in listdir to list storages and/or streams # 2013-07-24 PL: - new options in listdir to list storages and/or streams
# 2014-02-04 v0.30 PL: - upgraded code to support Python 3.x by Martin Panter
# - several fixes for Python 2.6 (xrange, MAGIC)
# - reused i32 from Pillow's _binary
#----------------------------------------------------------------------------- #-----------------------------------------------------------------------------
# TODO (for version 1.0): # TODO (for version 1.0):
# + isOleFile should accept file-like objects like open
# + fix how all the methods handle unicode str and/or bytes as arguments
# + add path attrib to _OleDirEntry, set it once and for all in init or # + add path attrib to _OleDirEntry, set it once and for all in init or
# append_kids (then listdir/_list can be simplified) # append_kids (then listdir/_list can be simplified)
# - TESTS with Linux, MacOSX, Python 1.5.2, various files, PIL, ... # - TESTS with Linux, MacOSX, Python 1.5.2, various files, PIL, ...
@ -220,17 +231,26 @@ __version__ = '0.26'
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
import io import io
import sys import sys
from PIL import _binary
import struct, array, os.path, datetime import struct, array, os.path, datetime
#[PL] Define explicitly the public API to avoid private objects in pydoc: #[PL] Define explicitly the public API to avoid private objects in pydoc:
__all__ = ['OleFileIO', 'isOleFile', 'MAGIC'] __all__ = ['OleFileIO', 'isOleFile', 'MAGIC']
# For Python 3.x, need to redefine long as int:
if str is not bytes: if str is not bytes:
long = int long = int
# Need to make sure we use xrange both on Python 2 and 3.x:
try:
# on Python 2 we need xrange:
iterrange = xrange
except:
# no xrange, for Python 3 it was renamed as range:
iterrange = range
#[PL] workaround to fix an issue with array item size on 64 bits systems: #[PL] workaround to fix an issue with array item size on 64 bits systems:
if array.array('L').itemsize == 4: if array.array('L').itemsize == 4:
# on 32 bits platforms, long integers in an array are 32 bits: # on 32 bits platforms, long integers in an array are 32 bits:
@ -281,8 +301,7 @@ def set_debug_mode(debug_mode):
else: else:
debug = debug_pass debug = debug_pass
#TODO: convert this to hex MAGIC = b'\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1'
MAGIC = b'\320\317\021\340\241\261\032\341'
#[PL]: added constants for Sector IDs (from AAF specifications) #[PL]: added constants for Sector IDs (from AAF specifications)
MAXREGSECT = 0xFFFFFFFA; # maximum SECT MAXREGSECT = 0xFFFFFFFA; # maximum SECT
@ -362,9 +381,39 @@ def isOleFile (filename):
return False return False
i8 = _binary.i8 if bytes is str:
i16 = _binary.i16le # version for Python 2.x
i32 = _binary.i32le def i8(c):
return ord(c)
else:
# version for Python 3.x
def i8(c):
return c if c.__class__ is int else c[0]
#TODO: replace i16 and i32 with more readable struct.unpack equivalent?
def i16(c, o = 0):
"""
Converts a 2-bytes (16 bits) string to an integer.
c: string containing bytes to convert
o: offset of bytes to convert in string
"""
return i8(c[o]) | (i8(c[o+1])<<8)
def i32(c, o = 0):
"""
Converts a 4-bytes (32 bits) string to an integer.
c: string containing bytes to convert
o: offset of bytes to convert in string
"""
## return int(ord(c[o])+(ord(c[o+1])<<8)+(ord(c[o+2])<<16)+(ord(c[o+3])<<24))
## # [PL]: added int() because "<<" gives long int since Python 2.4
# copied from Pillow's _binary:
return i8(c[o]) | (i8(c[o+1])<<8) | (i8(c[o+2])<<16) | (i8(c[o+3])<<24)
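On the TODO about `struct.unpack`: one possible form of the equivalents is sketched below. The names `i16_struct` and `i32_struct` are hypothetical and are not part of this commit; the sketch assumes the same semantics as the shift-based versions, that is, unsigned little-endian integers read at a byte offset.

```python
import struct


def i16_struct(c, o=0):
    """Read an unsigned little-endian 16-bit integer at offset o."""
    return struct.unpack_from('<H', c, o)[0]


def i32_struct(c, o=0):
    """Read an unsigned little-endian 32-bit integer at offset o."""
    return struct.unpack_from('<I', c, o)[0]


sample = b'\x01\x02\x03\x04\x05\x06'
low16 = i16_struct(sample)       # bytes 01 02 -> 0x0201
word32 = i32_struct(sample, 2)   # bytes 03 04 05 06 -> 0x06050403
```

Because `struct.unpack_from` takes bytes on both Python 2.6+ and 3.x, this form would not need a separate `i8` helper per version.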
def _clsid(clsid): def _clsid(clsid):
@ -373,7 +422,9 @@ def _clsid(clsid):
clsid: string of length 16. clsid: string of length 16.
""" """
assert len(clsid) == 16 assert len(clsid) == 16
if clsid == bytearray(16): # if clsid is only made of null bytes, return an empty string:
# (PL: why not simply return the string with zeroes?)
if not clsid.strip(b"\0"):
return "" return ""
return (("%08X-%04X-%04X-%02X%02X-" + "%02X" * 6) % return (("%08X-%04X-%04X-%02X%02X-" + "%02X" * 6) %
((i32(clsid, 0), i16(clsid, 4), i16(clsid, 6)) + ((i32(clsid, 0), i16(clsid, 4), i16(clsid, 6)) +
@ -902,18 +953,22 @@ class _OleDirectoryEntry:
def __eq__(self, other): def __eq__(self, other):
"Compare entries by name" "Compare entries by name"
return self.name == other.name return self.name == other.name
def __lt__(self, other): def __lt__(self, other):
"Compare entries by name" "Compare entries by name"
return self.name < other.name return self.name < other.name
#TODO: replace by the same function as MS implementation ?
# (order by name length first, then case-insensitive order)
def __ne__(self, other): def __ne__(self, other):
return not self.__eq__(other) return not self.__eq__(other)
def __le__(self, other): def __le__(self, other):
return self.__eq__(other) or self.__lt__(other) return self.__eq__(other) or self.__lt__(other)
# Reflected __lt__() and __le__() will be used for __gt__() and __ge__() # Reflected __lt__() and __le__() will be used for __gt__() and __ge__()
#TODO: replace by the same function as MS implementation ?
# (order by name length first, then case-insensitive order)
def dump(self, tab = 0): def dump(self, tab = 0):
"Dump this entry, and all its subentries (for debug purposes only)" "Dump this entry, and all its subentries (for debug purposes only)"
@ -1046,7 +1101,7 @@ class OleFileIO:
#TODO: if larger than 1024 bytes, this could be the actual data => BytesIO #TODO: if larger than 1024 bytes, this could be the actual data => BytesIO
self.fp = open(filename, "rb") self.fp = open(filename, "rb")
# old code fails if filename is not a plain string: # old code fails if filename is not a plain string:
#if isPath(filename): #if isinstance(filename, (bytes, basestring)):
# self.fp = open(filename, "rb") # self.fp = open(filename, "rb")
#else: #else:
# self.fp = filename # self.fp = filename
@ -1133,7 +1188,7 @@ class OleFileIO:
) = struct.unpack(fmt_header, header1) ) = struct.unpack(fmt_header, header1)
debug( struct.unpack(fmt_header, header1)) debug( struct.unpack(fmt_header, header1))
if self.Sig != b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1': if self.Sig != MAGIC:
# OLE signature should always be present # OLE signature should always be present
self._raise_defect(DEFECT_FATAL, "incorrect OLE signature") self._raise_defect(DEFECT_FATAL, "incorrect OLE signature")
if self.clsid != bytearray(16): if self.clsid != bytearray(16):
@ -1385,7 +1440,7 @@ class OleFileIO:
if self.csectDif != nb_difat: if self.csectDif != nb_difat:
raise IOError('incorrect DIFAT') raise IOError('incorrect DIFAT')
isect_difat = self.sectDifStart isect_difat = self.sectDifStart
for i in range(nb_difat): for i in iterrange(nb_difat):
debug( "DIFAT block %d, sector %X" % (i, isect_difat) ) debug( "DIFAT block %d, sector %X" % (i, isect_difat) )
#TODO: check if corresponding FAT SID = DIFSECT #TODO: check if corresponding FAT SID = DIFSECT
sector_difat = self.getsect(isect_difat) sector_difat = self.getsect(isect_difat)
@ -1494,7 +1549,7 @@ class OleFileIO:
#self.direntries = [] #self.direntries = []
# We start with a list of "None" object # We start with a list of "None" object
self.direntries = [None] * max_entries self.direntries = [None] * max_entries
## for sid in range(max_entries): ## for sid in iterrange(max_entries):
## entry = fp.read(128) ## entry = fp.read(128)
## if not entry: ## if not entry:
## break ## break
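
Review note on the `#TODO: replace i16 and i32 with more readable struct.unpack equivalent?` comment in the hunk above: a minimal sketch of what that replacement might look like, assuming the inputs are always byte strings and the values are little-endian and unsigned, as the shift-based versions imply. This is an illustration only, not the code upstream OleFileIO uses.

```python
import struct


def i16(c, o=0):
    """Read a little-endian unsigned 16-bit integer from byte string c at offset o."""
    return struct.unpack_from("<H", c, o)[0]


def i32(c, o=0):
    """Read a little-endian unsigned 32-bit integer from byte string c at offset o."""
    return struct.unpack_from("<I", c, o)[0]
```

Unlike the shift-based versions, these raise `struct.error` rather than `IndexError` when fewer bytes remain than requested, so callers that catch `IndexError` would need adjusting.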

View File

@ -63,7 +63,11 @@ class PpmImageFile(ImageFile.ImageFile):
c = self.fp.read(1) c = self.fp.read(1)
if not c or c in b_whitespace: if not c or c in b_whitespace:
break break
if c > b'\x7f':
raise ValueError("Expected ASCII value, found binary")
s = s + c s = s + c
if (len(s) > 9):
raise ValueError("Expected int, got > 9 digits")
return s return s
def _open(self): def _open(self):
@ -96,6 +100,17 @@ class PpmImageFile(ImageFile.ImageFile):
ysize = s ysize = s
if mode == "1": if mode == "1":
break break
elif ix == 2:
# maxgrey
if s > 255:
if not mode == 'L':
raise ValueError("Too many colors for band: %s" % s)
if s < 2**16:
self.mode = 'I'
rawmode = 'I;16B'
else:
self.mode = 'I'
rawmode = 'I;32B'
self.size = xsize, ysize self.size = xsize, ysize
self.tile = [("raw", self.tile = [("raw",
@ -116,6 +131,11 @@ def _save(im, fp, filename):
rawmode, head = "1;I", b"P4" rawmode, head = "1;I", b"P4"
elif im.mode == "L": elif im.mode == "L":
rawmode, head = "L", b"P5" rawmode, head = "L", b"P5"
elif im.mode == "I":
if im.getextrema()[1] < 2**16:
rawmode, head = "I;16B", b"P5"
else:
rawmode, head = "I;32B", b"P5"
elif im.mode == "RGB": elif im.mode == "RGB":
rawmode, head = "RGB", b"P6" rawmode, head = "RGB", b"P6"
elif im.mode == "RGBA": elif im.mode == "RGBA":
@ -123,8 +143,15 @@ def _save(im, fp, filename):
else: else:
raise IOError("cannot write mode %s as PPM" % im.mode) raise IOError("cannot write mode %s as PPM" % im.mode)
fp.write(head + ("\n%d %d\n" % im.size).encode('ascii')) fp.write(head + ("\n%d %d\n" % im.size).encode('ascii'))
if head != b"P4": if head == b"P6":
fp.write(b"255\n") fp.write(b"255\n")
if head == b"P5":
if rawmode == "L":
fp.write(b"255\n")
elif rawmode == "I;16B":
fp.write(b"65535\n")
elif rawmode == "I;32B":
fp.write(b"2147483648\n")
ImageFile._save(im, fp, [("raw", (0,0)+im.size, 0, (rawmode, 0, 1))]) ImageFile._save(im, fp, [("raw", (0,0)+im.size, 0, (rawmode, 0, 1))])
# ALTERNATIVE: save via builtin debug function # ALTERNATIVE: save via builtin debug function
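
Review note on the `maxgrey` handling added in `_open` above: the branch logic can be read as a small pure function. The sketch below is a restatement for illustration, assuming the header's maximum value has already been parsed to an `int`; the function name `modes_for_maxval` is hypothetical and does not exist in Pillow, and returning the image mode unchanged as the raw mode for 8-bit data is an assumption based on the surrounding plugin code.

```python
def modes_for_maxval(maxval, mode="L"):
    """Return (image mode, raw mode) for a parsed PGM/PPM maximum value.

    Mirrors the branches in the hunk: values above 255 are only
    accepted for greyscale ("L") data and are promoted to mode "I".
    """
    if maxval <= 255:
        # 8-bit data: keep the mode chosen from the magic number (assumption).
        return mode, mode
    if mode != "L":
        raise ValueError("Too many colors for band: %s" % maxval)
    if maxval < 2 ** 16:
        return "I", "I;16B"  # 16-bit big-endian samples
    return "I", "I;32B"      # 32-bit big-endian samples
```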

View File

@ -146,6 +146,7 @@ OPEN_INFO = {
(II, 0, 1, 2, (1,), ()): ("1", "1;IR"), (II, 0, 1, 2, (1,), ()): ("1", "1;IR"),
(II, 0, 1, 1, (8,), ()): ("L", "L;I"), (II, 0, 1, 1, (8,), ()): ("L", "L;I"),
(II, 0, 1, 2, (8,), ()): ("L", "L;IR"), (II, 0, 1, 2, (8,), ()): ("L", "L;IR"),
(II, 0, 3, 1, (32,), ()): ("F", "F;32F"),
(II, 1, 1, 1, (1,), ()): ("1", "1"), (II, 1, 1, 1, (1,), ()): ("1", "1"),
(II, 1, 1, 2, (1,), ()): ("1", "1;R"), (II, 1, 1, 2, (1,), ()): ("1", "1;R"),
(II, 1, 1, 1, (8,), ()): ("L", "L"), (II, 1, 1, 1, (8,), ()): ("L", "L"),

View File

@ -15,7 +15,7 @@ else:
def isDirectory(f): def isDirectory(f):
return isPath(f) and os.path.isdir(f) return isPath(f) and os.path.isdir(f)
class import_err(object): class deferred_error(object):
def __init__(self, ex): def __init__(self, ex):
self.ex = ex self.ex = ex
def __getattr__(self, elt): def __getattr__(self, elt):

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@ -58,6 +58,11 @@ for file in files:
)) ))
result = out.read() result = out.read()
result_lines = result.splitlines()
if len(result_lines):
if result_lines[0] == "ignore_all_except_last_line":
result = result_lines[-1]
# Extract any ignore patterns # Extract any ignore patterns
ignore_pats = ignore_re.findall(result) ignore_pats = ignore_re.findall(result)
result = ignore_re.sub('', result) result = ignore_re.sub('', result)
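
Review note on the `ignore_all_except_last_line` marker added to the test runner above: the new behaviour, isolated as a standalone function for illustration. The name `filter_result` is hypothetical; the runner in this commit applies the same steps inline to `result`.

```python
def filter_result(result):
    """Keep only the last line of output when the first line is the marker.

    Any other output, including empty output, is returned unchanged.
    """
    lines = result.splitlines()
    if lines and lines[0] == "ignore_all_except_last_line":
        return lines[-1]
    return result
```

This is what lets `test_debug` in `Tests/test_olefileio.py` print verbose dump output and still be judged only on its final `ok` line.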

View File

@ -12,3 +12,25 @@ def test_sanity():
assert_equal(im.mode, "RGB") assert_equal(im.mode, "RGB")
assert_equal(im.size, (128, 128)) assert_equal(im.size, (128, 128))
assert_equal(im.format, "PPM") assert_equal(im.format, "PPM")
def test_16bit_pgm():
im = Image.open('Tests/images/16_bit_binary.pgm')
im.load()
assert_equal(im.mode, 'I')
assert_equal(im.size, (20,100))
tgt = Image.open('Tests/images/16_bit_binary_pgm.png')
assert_image_equal(im, tgt)
def test_16bit_pgm_write():
im = Image.open('Tests/images/16_bit_binary.pgm')
im.load()
f = tempfile('temp.pgm')
assert_no_exception(lambda: im.save(f, 'PPM'))
reloaded = Image.open(f)
assert_image_equal(im, reloaded)

View File

@ -128,3 +128,14 @@ def test_12bit_rawmode():
print (im2.getpixel((0,2))) print (im2.getpixel((0,2)))
assert_image_equal(im, im2) assert_image_equal(im, im2)
def test_32bit_float():
# Issue 614, specific 32 bit float format
path = 'Tests/images/10ct_32bit_128.tiff'
im = Image.open(path)
im.load()
assert_equal(im.getpixel((0,0)), -0.4526388943195343)
assert_equal(im.getextrema(), (-3.140936851501465, 3.140684127807617))

View File

@ -2,6 +2,8 @@ from tester import *
from PIL import Image from PIL import Image
import os
def test_sanity(): def test_sanity():
im = lena() im = lena()
@ -9,3 +11,17 @@ def test_sanity():
pix = im.load() pix = im.load()
assert_equal(pix[0, 0], (223, 162, 133)) assert_equal(pix[0, 0], (223, 162, 133))
def test_close():
im = Image.open("Images/lena.gif")
assert_no_exception(lambda: im.close())
assert_exception(ValueError, lambda: im.load())
assert_exception(ValueError, lambda: im.getpixel((0,0)))
def test_contextmanager():
fn = None
with Image.open("Images/lena.gif") as im:
fn = im.fp.fileno()
assert_no_exception(lambda: os.fstat(fn))
assert_exception(OSError, lambda: os.fstat(fn))

162
Tests/test_olefileio.py Normal file
View File

@ -0,0 +1,162 @@
from __future__ import print_function
from tester import *
import datetime
import PIL.OleFileIO as OleFileIO
def test_isOleFile_false():
# Arrange
non_ole_file = "Tests/images/flower.jpg"
# Act
is_ole = OleFileIO.isOleFile(non_ole_file)
# Assert
assert_false(is_ole)
def test_isOleFile_true():
# Arrange
ole_file = "Tests/images/test-ole-file.doc"
# Act
is_ole = OleFileIO.isOleFile(ole_file)
# Assert
assert_true(is_ole)
def test_exists_worddocument():
# Arrange
ole_file = "Tests/images/test-ole-file.doc"
ole = OleFileIO.OleFileIO(ole_file)
# Act
exists = ole.exists('worddocument')
# Assert
assert_true(exists)
ole.close()
def test_exists_no_vba_macros():
# Arrange
ole_file = "Tests/images/test-ole-file.doc"
ole = OleFileIO.OleFileIO(ole_file)
# Act
exists = ole.exists('macros/vba')
# Assert
assert_false(exists)
ole.close()
def test_get_type():
# Arrange
ole_file = "Tests/images/test-ole-file.doc"
ole = OleFileIO.OleFileIO(ole_file)
# Act
type = ole.get_type('worddocument')
# Assert
assert_equal(type, OleFileIO.STGTY_STREAM)
ole.close()
def test_get_size():
# Arrange
ole_file = "Tests/images/test-ole-file.doc"
ole = OleFileIO.OleFileIO(ole_file)
# Act
size = ole.get_size('worddocument')
# Assert
assert_greater(size, 0)
ole.close()
def test_get_rootentry_name():
# Arrange
ole_file = "Tests/images/test-ole-file.doc"
ole = OleFileIO.OleFileIO(ole_file)
# Act
root = ole.get_rootentry_name()
# Assert
assert_equal(root, "Root Entry")
ole.close()
def test_meta():
# Arrange
ole_file = "Tests/images/test-ole-file.doc"
ole = OleFileIO.OleFileIO(ole_file)
# Act
meta = ole.get_metadata()
# Assert
assert_equal(meta.author, b"Laurence Ipsum")
assert_equal(meta.num_pages, 1)
ole.close()
def test_gettimes():
# Arrange
ole_file = "Tests/images/test-ole-file.doc"
ole = OleFileIO.OleFileIO(ole_file)
root_entry = ole.direntries[0]
# Act
ctime = root_entry.getctime()
mtime = root_entry.getmtime()
# Assert
assert_is_instance(ctime, type(None))
assert_is_instance(mtime, datetime.datetime)
assert_equal(ctime, None)
assert_equal(mtime.year, 2014)
ole.close()
def test_listdir():
# Arrange
ole_file = "Tests/images/test-ole-file.doc"
ole = OleFileIO.OleFileIO(ole_file)
# Act
dirlist = ole.listdir()
# Assert
assert_in(['WordDocument'], dirlist)
ole.close()
def test_debug():
# Arrange
print("ignore_all_except_last_line")
ole_file = "Tests/images/test-ole-file.doc"
ole = OleFileIO.OleFileIO(ole_file)
meta = ole.get_metadata()
# Act
OleFileIO.set_debug_mode(True)
ole.dumpdirectory()
meta.dump()
OleFileIO.set_debug_mode(False)
ole.dumpdirectory()
meta.dump()
# Assert
# No assert, just check they run ok
print("ok")
ole.close()
# End of file

View File

@ -94,6 +94,50 @@ def assert_deep_equal(a, b, msg=None):
assert_equal(a, b, msg) assert_equal(a, b, msg)
def assert_greater(a, b, msg=None):
if a > b:
success()
else:
failure(msg or "%r unexpectedly not greater than %r" % (a, b))
def assert_greater_equal(a, b, msg=None):
if a >= b:
success()
else:
failure(
msg or "%r unexpectedly not greater than or equal to %r" % (a, b))
def assert_less(a, b, msg=None):
if a < b:
success()
else:
failure(msg or "%r unexpectedly not less than %r" % (a, b))
def assert_less_equal(a, b, msg=None):
if a <= b:
success()
else:
failure(
msg or "%r unexpectedly not less than or equal to %r" % (a, b))
def assert_is_instance(a, b, msg=None):
if isinstance(a, b):
success()
else:
failure(msg or "got %r, expected %r" % (type(a), b))
def assert_in(a, b, msg=None):
if a in b:
success()
else:
failure(msg or "%r unexpectedly not in %r" % (a, b))
def assert_match(v, pattern, msg=None): def assert_match(v, pattern, msg=None):
import re import re
if re.match(pattern, v): if re.match(pattern, v):

View File

@ -39,7 +39,7 @@ Why a fork?
PIL is not setuptools compatible. Please see `this Image-SIG post`_ for a more PIL is not setuptools compatible. Please see `this Image-SIG post`_ for a more
detailed explanation. Also, PIL's current bi-yearly (or greater) release detailed explanation. Also, PIL's current bi-yearly (or greater) release
schedule is too infrequent to accomodate the large number and frequency of schedule is too infrequent to accommodate the large number and frequency of
issues reported. issues reported.
.. _this Image-SIG post: https://mail.python.org/pipermail/image-sig/2010-August/006480.html .. _this Image-SIG post: https://mail.python.org/pipermail/image-sig/2010-August/006480.html
@ -52,7 +52,7 @@ What about PIL?
Prior to Pillow 2.0.0, very few image code changes were made. Pillow 2.0.0 Prior to Pillow 2.0.0, very few image code changes were made. Pillow 2.0.0
added Python 3 support and includes many bug fixes from many contributors. added Python 3 support and includes many bug fixes from many contributors.
As more time passes since the last PIL release, the likelyhood of a new PIL As more time passes since the last PIL release, the likelihood of a new PIL
release decreases. However, we've yet to hear an official "PIL is dead" release decreases. However, we've yet to hear an official "PIL is dead"
announcement. So if you still want to support PIL, please announcement. So if you still want to support PIL, please
`report issues here first`_, then `report issues here first`_, then

View File

@ -126,7 +126,7 @@ Identify Image Files
for infile in sys.argv[1:]: for infile in sys.argv[1:]:
try: try:
im = Image.open(infile) with Image.open(infile) as im:
print(infile, im.format, "%dx%d" % im.size, im.mode) print(infile, im.format, "%dx%d" % im.size, im.mode)
except IOError: except IOError:
pass pass

View File

@ -143,7 +143,7 @@ distribution. Otherwise, use whatever XCode you used to compile Python.)
The easiest way to install the prerequisites is via `Homebrew The easiest way to install the prerequisites is via `Homebrew
<http://mxcl.github.com/homebrew/>`_. After you install Homebrew, run:: <http://mxcl.github.com/homebrew/>`_. After you install Homebrew, run::
$ brew install libtiff libjpeg webp littlecms $ brew install libtiff libjpeg webp little-cms2
If you've built your own Python, then you should be able to install Pillow If you've built your own Python, then you should be able to install Pillow
using:: using::

View File

@ -136,9 +136,9 @@ ITU-R 709, using the D65 luminant) to the CIE XYZ color space:
.. automethod:: PIL.Image.Image.verify .. automethod:: PIL.Image.Image.verify
.. automethod:: PIL.Image.Image.fromstring .. automethod:: PIL.Image.Image.fromstring
.. deprecated:: 2.0
.. automethod:: PIL.Image.Image.load .. automethod:: PIL.Image.Image.load
.. automethod:: PIL.Image.Image.close
Attributes Attributes
---------- ----------