Version: | 3.0 |
---|---|
Date: | 2013-09-29 |
Author: | Stefan Schwarzer <sschwarzer@sschwarzer.net> |
This ftputil release adds support for Python 3.0 and up.
Python 2 and 3 are supported with the same source code. Also, the API, including its semantics, is the same. As in Python 3 code generally, ftputil 3.0 somewhat prefers unicode over byte strings. On the other hand, in line with the file system APIs of both Python 2 and 3, methods take either byte strings or unicode strings. Methods that take and return strings (for example, FTPHost.path.abspath or FTPHost.listdir) return the same string type they get.
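The "same string type in, same string type out" rule can be tried without an FTP server, because Python's own file system functions follow it. The following is a minimal sketch for Python 3 that uses only os and tempfile from the standard library, not ftputil itself; that FTPHost.path.abspath and FTPHost.listdir behave analogously is taken from the paragraph above, and the file name example.txt is made up for the illustration.

```python
import os
import tempfile

# Text arguments give a text result, byte-string arguments a byte-string result.
text_path = os.path.join("remote", "dir")
bytes_path = os.path.join(b"remote", b"dir")

with tempfile.TemporaryDirectory() as directory:
    # Create one (hypothetical) file so that the listing isn't empty.
    with open(os.path.join(directory, "example.txt"), "w"):
        pass
    # The names returned by os.listdir have the type of the argument.
    text_names = os.listdir(directory)
    bytes_names = os.listdir(os.fsencode(directory))

print(type(text_path).__name__, type(bytes_path).__name__)
print(text_names, bytes_names)
```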
Note
Both Python 2 and 3 have two "string" types, where one type represents a sequence of bytes and the other represents character (text) data.
Python version | Binary type | Text type | Default string literal type |
---|---|---|---|
2 | str | unicode | str (= binary type) |
3 | bytes | str | str (= text type) |
So both lines of Python have a str type, but in Python 2 it's the byte type and in Python 3 the text type. The str type is also what you get when you write a literal string without any prefixes. For example, "Python" is a binary string in Python 2 and a text (unicode) string in Python 3.
If this seems confusing, please read this description in the Python documentation for more details.
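To see the table in action, here is a small sketch that classifies the three literal forms. The helper kind is a made-up name for this illustration. The u prefix is accepted by Python 2 and by Python 3.3 and later, the b prefix by Python 2.6 and later.

```python
import sys

plain = "Python"     # no prefix: the type depends on the Python version
binary = b"Python"   # b prefix: always the binary type
text = u"Python"     # u prefix: always the text type


def kind(value):
    """Return "text" or "binary" for a string value."""
    text_type = type(u"")
    return "text" if isinstance(value, text_type) else "binary"


print("Python %d: plain=%s, b-prefixed=%s, u-prefixed=%s"
      % (sys.version_info[0], kind(plain), kind(binary), kind(text)))
```

Only the result for the unprefixed literal differs between Python 2 and 3.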
To make it easier to use the same code for Python 2 and 3, I decided to use the Python 3 features backported to Python 2.6. As a consequence, ftputil 3.0 doesn't work with Python 2.4 and 2.5.
Traditionally, "text mode" for FTP transfers meant translation to \r\n newlines, even for transfers between Unix clients and Unix servers. Since this is presumably neither the expected nor the desired behavior most of the time, the FTPHost.open method now has the API and semantics of the built-in open function in Python 3. If you want the same API for local files in Python 2.6 and 2.7, you can use the open function from the io module.
Thus, when opening remote files in binary mode, the new API does not accept an encoding argument. On the other hand, opening a file in text mode always implies an encoding step when writing and a decoding step when reading files. If the encoding argument isn't specified, it defaults to the value of locale.getpreferredencoding(False).
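The encoding rules can be tried on a local file with io.open, which has the API that FTPHost.open now follows. This is a sketch for Python 3 under that assumption; it doesn't contact an FTP server, and the file name and sample text are made up.

```python
import io
import locale
import os
import tempfile

sample = u"gr\u00fc\u00df dich"

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "example.txt")

    # Text mode: the written text is encoded ...
    with io.open(path, "w", encoding="UTF-8") as fobj:
        fobj.write(sample)
    # ... and the read data is decoded again.
    with io.open(path, "r", encoding="UTF-8") as fobj:
        decoded = fobj.read()
    # Binary mode: you get the raw bytes, without a decoding step.
    with io.open(path, "rb") as fobj:
        raw = fobj.read()

    # Binary mode doesn't accept an encoding argument.
    try:
        io.open(path, "rb", encoding="UTF-8")
    except ValueError as exc:
        binary_encoding_error = exc
    else:
        binary_encoding_error = None

# The encoding that applies if text mode is used without an encoding argument.
default_encoding = locale.getpreferredencoding(False)
print(repr(raw), default_encoding)
```

Since the default encoding depends on the environment, passing encoding explicitly is the safer choice for files that are exchanged between machines.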
Also as with Python 3's open builtin, opening a file in binary mode for reading will give you byte string data. If you write to a file opened in binary mode, you must write byte strings. Along the same lines, files opened in text mode will give you unicode strings when read, and require unicode strings to be passed to write operations.
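The type requirements for write operations can be sketched with in-memory streams from the io module. They only stand in for remote files opened with FTPHost.open here; the helper write_fails is a made-up name.

```python
import io

binary_file = io.BytesIO()   # stands in for a file opened in binary mode
text_file = io.StringIO()    # stands in for a file opened in text mode

# Matching types are accepted.
binary_file.write(b"bytes are fine")
text_file.write(u"text is fine")


def write_fails(fobj, data):
    """Return True if writing data to fobj raises a TypeError."""
    try:
        fobj.write(data)
    except TypeError:
        return True
    return False


# Mismatched types are rejected.
print(write_fails(binary_file, u"text"))
print(write_fails(text_file, b"bytes"))
```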
In earlier ftputil versions, most module names had a redundant ftp_ prefix. In ftputil 3.0, these prefixes are removed. Of the module names that are part of the public ftputil API, this affects only ftputil.error and ftputil.stat.
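If the same application code has to run with ftputil versions before and after the rename, one possible approach is to try the new module name first and fall back to the old one. The helper import_first below is a made-up name for this sketch and not part of ftputil.

```python
import importlib


def import_first(*module_names):
    """Import and return the first module from module_names that exists."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of these modules found: " + ", ".join(module_names))
```

With ftputil installed, `ftp_error = import_first("ftputil.error", "ftputil.ftp_error")` would give you the error module under either version; the same pattern applies to ftputil.stat and its old name with the ftp_ prefix. If you only need to support ftputil 3.0 and up, a plain `import ftputil.error` is simpler.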
In Python 2.2, file became an alias for open, and previous ftputil versions also had an FTPHost.file besides the FTPHost.open method. In Python 3.0, the file builtin was removed and the return values of the built-in open function are no longer file instances. Along the same lines, ftputil 3.0 also drops the FTPHost.file alias and requires FTPHost.open.
The FTPHost methods for downloading and uploading files (download, download_if_newer, upload and upload_if_newer) now always use binary mode; a mode argument is no longer needed or even allowed. Although this behavior makes downloads and uploads slightly less flexible, it should cover almost all use cases.
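With the mode argument gone, a call passes just the source and the target. The following sketch wraps download_if_newer in a made-up helper, fetch_if_newer. It assumes that host is an FTPHost instance (or any object with a compatible method) and that the method returns a true value if a download actually happened; please check both against the ftputil documentation.

```python
def fetch_if_newer(host, names):
    """Download each remote file in names if it's newer than the local copy.

    Return the list of names for which a download was reported.
    """
    fetched = []
    for name in names:
        # No mode argument here: the transfer always uses binary mode.
        if host.download_if_newer(name, name):
            fetched.append(name)
    return fetched
```

Since the helper only relies on the download_if_newer method, it can be tried with a small stand-in object before using it against a real server.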
If you really want to do a transfer involving files opened in text mode, you can still do:
    import ftputil.file_transfer

    ...

    with FTPHost.open("source.txt", "r", encoding="UTF-8") as source, \
         FTPHost.open("target.txt", "w", encoding="latin1") as target:
        ftputil.file_transfer.copyfileobj(source, target)
Note that it's not possible anymore to open one file in binary mode and the other file in text mode and transfer data between them with copyfileobj. For example, opening the source in binary mode will read byte strings, but a target file opened in text mode will only allow writing of unicode strings. Then again, I assume that the cases where you want a mixed binary/text mode transfer should be very rare.
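The type mismatch, and one way around it for the rare mixed case, can be sketched with in-memory streams. Here shutil.copyfileobj from the standard library stands in for ftputil.file_transfer.copyfileobj, and the streams stand in for remote files; both substitutions are assumptions made to keep the example runnable without a server.

```python
import io
import shutil

# Stands in for a source opened in binary mode ...
source = io.BytesIO(u"caf\u00e9\n".encode("UTF-8"))
# ... and for a target opened in text mode.
target = io.StringIO()

# Copying byte strings into a text-mode target fails.
try:
    shutil.copyfileobj(source, target)
except TypeError:
    mixed_copy_failed = True
else:
    mixed_copy_failed = False

# Decoding the binary source explicitly makes both sides text.
source.seek(0)
decoded_source = io.TextIOWrapper(source, encoding="UTF-8")
shutil.copyfileobj(decoded_source, target)
print(repr(target.getvalue()))
```

In most cases it's simpler to open both files in text mode, or both in binary mode, right away.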
Custom parsers, as described in the documentation, receive a text line for each directory entry in the methods ignores_line and parse_line. In previous ftputil versions, the line arguments were byte strings; now they're unicode strings.
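For a custom parser, the change mostly means that comparisons and string operations on the line should use text literals. The class below is a simplified stand-in to show this. It doesn't derive from ftputil's parser base class, and its parse_line returns only a name instead of the stat result a real parser has to return; the listing format and the time_shift parameter are assumptions for the sketch.

```python
class SketchParser(object):
    """Stand-in parser whose methods work on text (unicode) lines."""

    def ignores_line(self, line):
        # line is a text string, so compare it with text literals.
        return line.strip() == u"" or line.startswith(u"total ")

    def parse_line(self, line, time_shift=0.0):
        # Simplification: treat the last whitespace-separated field as the
        # name. This breaks for names that contain spaces.
        return line.split()[-1]


parser = SketchParser()
entry = u"-rw-r--r-- 1 user group 120 Sep 29 2013 notes.txt"
print(parser.ignores_line(u"total 14"), parser.parse_line(entry))
```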
If you aren't sure what this is about, this may help: If you never used the FTPHost.set_parser method, you can ignore this section. :-)
Note
In the root directory of the installed ftputil package is a script find_invalid_code.py which, given a start directory as argument, will scan that directory tree for code that may need to be fixed. However, this script uses very simple heuristics, so it may miss some problematic code or list perfectly valid code.
In particular, you may want to change the regular expression string HOST_REGEX for the names you usually use for FTPHost objects.
It's difficult to be more specific without knowing your application.
That said, best practices nowadays are:
Yes, I know that's not much more specific.
(What's meant here is, for example, that if you opened a remote file as text, the read data could be of byte string type in Python 2 and of unicode type in Python 3. Similarly, under Python 2 a text file opened for writing could accept both byte strings and unicode strings in the write* methods.)
Actually, I had at first thought of implementing this but dropped the idea because it has several problems:
For these reasons, I ended up choosing the same API semantics for Python 2 and 3.
There are two reasons:
I had considered this when I started adapting the ftputil source code for Python 3. On the other hand, although using 2to3 used to be the recommended approach for Python 3 support, even rather large projects have chosen the route of having one code base and using it unmodified for Python 2 and 3.
When I looked into this approach for ftputil 3.0, it quickly became obvious that it would be easier, and I found it worked out very well.