What’s New in Python 2.7

Author:A.M. Kuchling (amk at amk.ca)
Release:3.1a2
Date:April 04, 2009

This article explains the new features in Python 2.7. No release schedule has been decided yet for 2.7.

Other Language Changes

Some smaller changes made to the core Python language are:

  • The int() and long() types gained a bit_length method that returns the number of bits necessary to represent its argument in binary:

    >>> n = 37
    >>> bin(37)
    '0b100101'
    >>> n.bit_length()
    6
    >>> n = 2**123-1
    >>> n.bit_length()
    123
    >>> (n+1).bit_length()
    124
    

    (Contributed by Fredrik Johansson and Victor Stinner; issue 3439.)

  • Integers are now stored internally either in base 2**15 or in base 2**30, the base being determined at build time. Previously, they were always stored in base 2**15. Using base 2**30 gives significant performance improvements on 64-bit machines, but benchmark results on 32-bit machines have been mixed. Therefore, the default is to use base 2**30 on 64-bit machines and base 2**15 on 32-bit machines; on Unix, there’s a new configure option –enable-big-digits that can be used to override this default.

    Apart from the performance improvements this change should be invisible to end users, with one exception: for testing and debugging purposes there’s a new structseq sys.long_info that provides information about the internal format, giving the number of bits per digit and the size in bytes of the C type used to store each digit:

    >>> import sys
    >>> sys.long_info
    sys.long_info(bits_per_digit=30, sizeof_digit=4)
    

    (Contributed by Mark Dickinson; issue 4258.)

Optimizations

A few performance enhancements have been added:

  • The garbage collector now performs better when many objects are being allocated without deallocating any. A full garbage collection pass is only performed when the middle generation has been collected 10 times and when the number of survivor objects from the middle generation exceeds 10% of the number of objects in the oldest generation. The second condition was added to reduce the number of full garbage collections as the number of objects on the heap grows, avoiding quadratic performance when allocating very many objects. (Suggested by Martin von Loewis and implemented by Antoine Pitrou; issue 4074.)

  • The garbage collector tries to avoid tracking simple containers which can’t be part of a cycle. As of now, this is true for tuples and dicts containing atomic types (such as ints, strings, etc.). Transitively, a dict containing tuples of atomic types won’t be tracked either. This helps bring down the individual cost of each garbage collection, since it decreases the number of objects to be considered and traversed by the collector.

    To help diagnosing this optimization, a new function in the gc module, is_tracked(), returns True if a given instance is tracked by the garbage collector, False otherwise. (Contributed by Antoine Pitrou; issue 4688.)

New, Improved, and Deprecated Modules

As in every release, Python’s standard library received a number of enhancements and bug fixes. Here’s a partial list of the most notable changes, sorted alphabetically by module name. Consult the Misc/NEWS file in the source tree for a more complete list of changes, or look through the Subversion logs for all the details.

  • In Distutils, distutils.sdist.add_defaults now uses package_dir and data_files to feed MANIFEST.

  • It is not mandatory anymore to store clear text passwords in the .pypirc file when registering and uploading packages to PyPI. As long as the username is present in that file, the distutils package will prompt for the password if not present. (Added by tarek, with the initial contribution of Nathan Van Gheem; issue 4394.)

  • The bz2 module’s BZ2File now supports the context management protocol, so you can write with bz2.BZ2File(...) as f: .... (Contributed by Hagen Fuerstenau; issue 3860.)

  • A new Counter class in the collections module is useful for tallying data. Counter instances behave mostly like dictionaries but return zero for missing keys instead of raising a KeyError:

    >>> from collections import Counter
    >>> c=Counter()
    >>> for letter in 'here is a sample of english text':
    ...   c[letter] += 1
    ...
    >>> c
    Counter({' ': 6, 'e': 5, 's': 3, 'a': 2, 'i': 2, 'h': 2,
    'l': 2, 't': 2, 'g': 1, 'f': 1, 'm': 1, 'o': 1, 'n': 1,
    'p': 1, 'r': 1, 'x': 1})
    >>> c['e']
    5
    >>> c['z']
    0
    

    There are two additional Counter methods: most_common() returns the N most common elements and their counts, and elements() returns an iterator over the contained element, repeating each element as many times as its count:

    >>> c.most_common(5)
    [(' ', 6), ('e', 5), ('s', 3), ('a', 2), ('i', 2)]
    >>> c.elements() ->
       'a', 'a', ' ', ' ', ' ', ' ', ' ', ' ',
       'e', 'e', 'e', 'e', 'e', 'g', 'f', 'i', 'i',
       'h', 'h', 'm', 'l', 'l', 'o', 'n', 'p', 's',
       's', 's', 'r', 't', 't', 'x']
    

    Contributed by Raymond Hettinger; issue 1696199.

  • The gzip module’s GzipFile now supports the context management protocol, so you can write with gzip.GzipFile(...) as f: .... (Contributed by Hagen Fuerstenau; issue 3860.)

  • The io.FileIO class now raises an OSError when passed an invalid file descriptor. (Implemented by Benjamin Peterson; issue 4991.)

  • The pydoc module now has help for the various symbols that Python uses. You can now do help('<<') or help('@'), for example. (Contributed by David Laban; issue 4739.)

  • A new function in the subprocess module, check_output(), runs a command with a specified set of arguments and returns the command’s output as a string if the command runs without error, or raises a CalledProcessError exception otherwise.

    >>> subprocess.check_output(['df', '-h', '.'])
    'Filesystem     Size   Used  Avail Capacity  Mounted on\n
    /dev/disk0s2    52G    49G   3.0G    94%    /\n'
    
    >>> subprocess.check_output(['df', '-h', '/bogus'])
      ...
    subprocess.CalledProcessError: Command '['df', '-h', '/bogus']' returned non-zero exit status 1
    

    (Contributed by Gregory P. Smith.)

  • The is_zipfile() function in the zipfile module will now accept a file object, in addition to the path names accepted in earlier versions. (Contributed by Gabriel Genellina; issue 4756.)

ttk: Themed Widgets for Tk

Tcl/Tk 8.5 includes a set of themed widgets that re-implement basic Tk widgets but have a more customizable appearance and can therefore more closely resemble the native platform’s widgets. This widget set was originally called Tile, but was renamed to Ttk (for “themed Tk”) on being added to Tcl/Tck release 8.5.

XXX write a brief discussion and an example here.

The ttk module was written by Guilherme Polo and added in issue 2983. An alternate version called Tile.py, written by Martin Franklin and maintained by Kevin Walzer, was proposed for inclusion in issue 2618, but the authors argued that Guilherme Polo’s work was more comprehensive.

Build and C API Changes

Changes to Python’s build process and to the C API include:

  • If you use the .gdbinit file provided with Python, the “pyo” macro in the 2.7 version will now work when the thread being debugged doesn’t hold the GIL; the macro will now acquire it before printing. (Contributed by Victor Stinner; issue 3632.)
  • Py_AddPendingCall is now thread safe, letting any worker thread submit notifications to the main Python thread. This is particularly useful for asynchronous IO operations. (Contributed by Kristjan Valur Jonsson; issue 4293.)

Port-Specific Changes: Windows

  • The msvcrt module now contains some constants from the crtassem.h header file: CRT_ASSEMBLY_VERSION, VC_ASSEMBLY_PUBLICKEYTOKEN, and LIBRARIES_ASSEMBLY_NAME_PREFIX. (Contributed by David Cournapeau; issue 4365.)
  • The new _beginthreadex API is used to start threads, and the native thread-local storage functions are now used. (Contributed by Kristjan Valur Jonsson; issue 3582.)

Port-Specific Changes: Mac OS X

Porting to Python 2.7

This section lists previously described changes and other bugfixes that may require changes to your code:

To be written.

Acknowledgements

The author would like to thank the following people for offering suggestions, corrections and assistance with various drafts of this article: no one yet.