mirror of
				https://github.com/python-pillow/Pillow.git
				synced 2025-11-04 09:57:43 +03:00 
			
		
		
		
	
		
			
				
	
	
		
			128 lines
		
	
	
		
			4.2 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
			
		
		
	
	
			128 lines
		
	
	
		
			4.2 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
.. _file-handling:
 | 
						|
 | 
						|
File Handling in Pillow
 | 
						|
=======================
 | 
						|
 | 
						|
When opening a file as an image, Pillow requires a filename,
 | 
						|
pathlib.Path object, or a file-like object.  Pillow uses the filename
 | 
						|
or Path to open a file, so for the rest of this article, they will all
 | 
						|
be treated as a file-like object.
 | 
						|
 | 
						|
The first four of these items are equivalent, the last is dangerous
 | 
						|
and may fail::
 | 
						|
 | 
						|
    from PIL import Image
 | 
						|
    import io
 | 
						|
    import pathlib
 | 
						|
 | 
						|
    im = Image.open('test.jpg')
 | 
						|
 | 
						|
    im2 = Image.open(pathlib.Path('test.jpg'))
 | 
						|
 | 
						|
    f = open('test.jpg', 'rb')
 | 
						|
    im3 = Image.open(f)
 | 
						|
 | 
						|
    with open('test.jpg', 'rb') as f:
 | 
						|
        im4 = Image.open(io.BytesIO(f.read()))
 | 
						|
 | 
						|
    # Dangerous FAIL:
 | 
						|
    with open('test.jpg', 'rb') as f:
 | 
						|
        im5 = Image.open(f)
 | 
						|
    im5.load() # FAILS, closed file
 | 
						|
 | 
						|
The documentation specifies that the file will be closed after the
 | 
						|
``Image.Image.load()`` method is called.  This is an aspirational
 | 
						|
specification rather than an accurate reflection of the state of the
 | 
						|
code.
 | 
						|
 | 
						|
Pillow cannot in general close and reopen a file, so any access to
 | 
						|
that file needs to be prior to the close.
 | 
						|
 | 
						|
Issues
 | 
						|
------
 | 
						|
 | 
						|
The current open file handling is inconsistent at best:
 | 
						|
 | 
						|
* Most of the image plugins do not close the input file.
 | 
						|
* Multi-frame images behave badly when seeking through the file, as
 | 
						|
  it's legal to seek backward in the file until the last image is
 | 
						|
  read, and then it's not.
 | 
						|
* Using the file context manager to provide a file-like object to
 | 
						|
  Pillow is dangerous unless the context of the image is limited to
 | 
						|
  the context of the file.
 | 
						|
 | 
						|
Image Lifecycle
 | 
						|
---------------
 | 
						|
 | 
						|
* ``Image.open()`` called. Path-like objects are opened as a
 | 
						|
  file. Metadata is read from the open file. The file is left open for
 | 
						|
  further usage.
 | 
						|
 | 
						|
* ``Image.Image.load()`` when the pixel data from the image is
 | 
						|
  required, ``load()`` is called. The current frame is read into
 | 
						|
  memory. The image can now be used independently of the underlying
 | 
						|
  image file.
 | 
						|
 | 
						|
* ``Image.Image.seek()`` in the case of multi-frame images
 | 
						|
  (e.g. multipage TIFF and animated GIF) the image file left open so
 | 
						|
  that seek can load the appropriate frame.  When the last frame is
 | 
						|
  read, the image file is closed (at least in some image plugins), and
 | 
						|
  no more seeks can occur.
 | 
						|
 | 
						|
* ``Image.Image.close()`` Closes the file pointer and destroys the
 | 
						|
  core image object. This is used in the Pillow context manager
 | 
						|
  support. e.g.::
 | 
						|
 | 
						|
      with Image.open('test.jpg') as img:
 | 
						|
         ...  # image operations here.
 | 
						|
 | 
						|
 | 
						|
The lifecycle of a single frame image is relatively simple. The file
 | 
						|
must remain open until the ``load()`` or ``close()`` function is
 | 
						|
called.
 | 
						|
 | 
						|
Multi-frame images are more complicated. The ``load()`` method is not
 | 
						|
a terminal method, so it should not close the underlying file. The
 | 
						|
current behavior of ``seek()`` closing the underlying file on
 | 
						|
accessing the last frame is presumably a heuristic for closing the
 | 
						|
file after iterating through the entire sequence. In general, Pillow
 | 
						|
does not know if there are going to be any requests for additional
 | 
						|
data until the caller has explicitly closed the image.
 | 
						|
 | 
						|
 | 
						|
Complications
 | 
						|
-------------
 | 
						|
 | 
						|
* TiffImagePlugin has some code to pass the underlying file descriptor
 | 
						|
  into libtiff (if working on an actual file). Since libtiff closes
 | 
						|
  the file descriptor internally, it is duplicated prior to passing it
 | 
						|
  into libtiff.
 | 
						|
 | 
						|
* ``decoder.handles_eof`` This slightly misnamed flag indicates that
 | 
						|
  the decoder wants to be called with a 0 length buffer when reads are
 | 
						|
  done. Despite the comments in ``ImageFile.load()``, the only decoder
 | 
						|
  that actually uses this flag is the Jpeg2K decoder. The use of this
 | 
						|
  flag in Jpeg2K predated the change to the decoder that added the
 | 
						|
  pulls_fd flag, and is therefore not used.
 | 
						|
 | 
						|
* I don't think that there's any way to make this safe without
 | 
						|
  changing the lazy loading::
 | 
						|
 | 
						|
    # Dangerous FAIL:
 | 
						|
    with open('test.jpg', 'rb') as f:
 | 
						|
        im5 = Image.open(f)
 | 
						|
    im5.load() # FAILS, closed file
 | 
						|
 | 
						|
 | 
						|
Proposed File Handling
 | 
						|
----------------------
 | 
						|
 | 
						|
* ``Image.Image.load()`` should close the image file, unless there are
 | 
						|
  multiple frames.
 | 
						|
 | 
						|
* ``Image.Image.seek()`` should never close the image file.
 | 
						|
 | 
						|
* Users of the library should call ``Image.Image.close()`` on any
 | 
						|
  multi-frame image to ensure that the underlying file is closed.
 | 
						|
 |