mirror of
				https://github.com/python-pillow/Pillow.git
				synced 2025-10-25 05:01:26 +03:00 
			
		
		
		
	
		
			
				
	
	
		
			126 lines
		
	
	
		
			4.2 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
			
		
		
	
	
			126 lines
		
	
	
		
			4.2 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
| File Handling in Pillow
 | |
| =======================
 | |
| 
 | |
| When opening a file as an image, Pillow requires a filename,
 | |
| pathlib.Path object, or a file-like object.  Pillow uses the filename
 | |
| or Path to open a file, so for the rest of this article, they will all
 | |
| be treated as a file-like object. 
 | |
| 
 | |
| The first four of these items are equivalent, the last is dangerous
 | |
| and may fail::
 | |
| 
 | |
|     from PIL import Image
 | |
|     import io
 | |
|     import pathlib
 | |
|     
 | |
|     im = Image.open('test.jpg')
 | |
| 
 | |
|     im2 = Image.open(pathlib.Path('test.jpg'))
 | |
| 
 | |
|     f = open('test.jpg', 'rb')
 | |
|     im3 = Image.open(f)
 | |
|     
 | |
|     with open('test.jpg', 'rb') as f:
 | |
|         im4 = Image.open(io.BytesIO(f.read()))
 | |
| 
 | |
|     # Dangerous FAIL:
 | |
|     with open('test.jpg', 'rb') as f:
 | |
|         im5 = Image.open(f)
 | |
|     im5.load() # FAILS, closed file
 | |
| 
 | |
| The documentation specifies that the file will be closed after the
 | |
| ``Image.Image.load()`` method is called.  This is an aspirational
 | |
| specification rather than an accurate reflection of the state of the
 | |
| code. 
 | |
| 
 | |
| Pillow cannot in general close and reopen a file, so any access to
 | |
| that file needs to be prior to the close. 
 | |
| 
 | |
| Issues
 | |
| ------
 | |
| 
 | |
| The current open file handling is inconsistent at best:
 | |
| 
 | |
| * Most of the image plugins do not close the input file.
 | |
| * Multi-frame images behave badly when seeking through the file, as
 | |
|   it's legal to seek backward in the file until the last image is
 | |
|   read, and then it's not. 
 | |
| * Using the file context manager to provide a file-like object to
 | |
|   Pillow is dangerous unless the context of the image is limited to
 | |
|   the context of the file. 
 | |
| 
 | |
| Image Lifecycle
 | |
| ---------------
 | |
| 
 | |
| * ``Image.open()`` called. Path-like objects are opened as a
 | |
|   file. Metadata is read from the open file. The file is left open for
 | |
|   further usage.
 | |
| 
 | |
| * ``Image.Image.load()`` when the pixel data from the image is
 | |
|   required, ``load()`` is called. The current frame is read into
 | |
|   memory. The image can now be used independently of the underlying
 | |
|   image file. 
 | |
| 
 | |
| * ``Image.Image.seek()`` in the case of multi-frame images
 | |
|   (e.g. multipage TIFF and animated GIF) the image file left open so
 | |
|   that seek can load the appropriate frame.  When the last frame is
 | |
|   read, the image file is closed (at least in some image plugins), and
 | |
|   no more seeks can occur.
 | |
| 
 | |
| * ``Image.Image.close()`` Closes the file pointer and destroys the
 | |
|   core image object. This is used in the Pillow context manager
 | |
|   support. e.g.::
 | |
| 
 | |
|       with Image.open('test.jpg') as img:
 | |
|          ...  # image operations here. 
 | |
| 
 | |
| 
 | |
| The lifecycle of a single frame image is relatively simple. The file
 | |
| must remain open until the ``load()`` or ``close()`` function is
 | |
| called. 
 | |
| 
 | |
| Multi-frame images are more complicated. The ``load()`` method is not
 | |
| a terminal method, so it should not close the underlying file. The
 | |
| current behavior of ``seek()`` closing the underlying file on
 | |
| accessing the last frame is presumably a heuristic for closing the
 | |
| file after iterating through the entire sequence. In general, Pillow
 | |
| does not know if there are going to be any requests for additional
 | |
| data until the caller has explicitly closed the image. 
 | |
| 
 | |
| 
 | |
| Complications
 | |
| -------------
 | |
| 
 | |
| * TiffImagePlugin has some code to pass the underlying file descriptor
 | |
|   into libtiff (if working on an actual file). Since libtiff closes
 | |
|   the file descriptor internally, it is duplicated prior to passing it
 | |
|   into libtiff. 
 | |
| 
 | |
| * ``decoder.handles_eof`` This slightly misnamed flag indicates that
 | |
|   the decoder wants to be called with a 0 length buffer when reads are
 | |
|   done. Despite the comments in ``ImageFile.load()``, the only decoder
 | |
|   that actually uses this flag is the Jpeg2K decoder. The use of this
 | |
|   flag in Jpeg2K predated the change to the decoder that added the
 | |
|   pulls_fd flag, and is therefore not used.
 | |
| 
 | |
| * I don't think that there's any way to make this safe without
 | |
|   changing the lazy loading::
 | |
| 
 | |
|     # Dangerous FAIL:
 | |
|     with open('test.jpg', 'rb') as f:
 | |
|         im5 = Image.open(f)
 | |
|     im5.load() # FAILS, closed file
 | |
| 
 | |
| 
 | |
| Proposed File Handling
 | |
| ----------------------
 | |
| 
 | |
| * ``Image.Image.load()`` should close the image file, unless there are
 | |
|   multiple frames.
 | |
| 
 | |
| * ``Image.Image.seek()`` should never close the image file. 
 | |
| 
 | |
| * Users of the library should call ``Image.Image.close()`` on any
 | |
|   multi-frame image to ensure that the underlying file is closed. 
 | |
| 
 |