4.1 Your First Extension Type

In the spirit of example-based documentation, here's a minimal example of defining a new type in C:

#include <Python.h>

typedef struct {
	PyObject_HEAD
	/* Type-specific fields go here. */
} SimpleObject;

static PyTypeObject SimpleObjectType = {
	PyObject_HEAD_INIT(NULL)
	0,				/* ob_size        */
	"simpletype.Simple",		/* tp_name        */
	sizeof(SimpleObject),		/* tp_basicsize   */
	0,				/* tp_itemsize    */
	0,				/* tp_dealloc     */
	0,				/* tp_print       */
	0,				/* tp_getattr     */
	0,				/* tp_setattr     */
	0,				/* tp_compare     */
	0,				/* tp_repr        */
	0,				/* tp_as_number   */
	0,				/* tp_as_sequence */
	0,				/* tp_as_mapping  */
	0,				/* tp_hash        */
	0,				/* tp_call        */
	0,				/* tp_str         */
	0,				/* tp_getattro    */
	0,				/* tp_setattro    */
	0,				/* tp_as_buffer   */
	Py_TPFLAGS_DEFAULT,		/* tp_flags       */
	"Simple objects are simple.",	/* tp_doc         */
};

PyMODINIT_FUNC
initsimpletype(void) 
{
	PyObject* m;

	SimpleObjectType.tp_new = PyType_GenericNew;
	if (PyType_Ready(&SimpleObjectType) < 0)
		return;

	m = Py_InitModule3("simpletype", NULL,
			   "Example module that creates an extension type.");
	if (m == NULL)
		return;

	Py_INCREF(&SimpleObjectType);
	PyModule_AddObject(m, "Simple", (PyObject *)&SimpleObjectType);
}

As you might expect, the type Simple and its instances are pretty boring; you can print the type, instantiate it, and print its instances:

>>> import simpletype
>>> simpletype.Simple
<type 'simpletype.Simple'>
>>> simpletype.Simple()
<simpletype.Simple object at 0x1053b8>

But that's about it. Still, one has to start with something, and adding interesting behaviour can mostly be done in incremental and sometimes orthogonal steps from this example.

Let's consider the code. A fair proportion should be familiar from earlier examples - the basic layout, the creation of the module object, etc.

The first novelty is

typedef struct {
    PyObject_HEAD
    /* Type-specific fields go here. */
} SimpleObject;

Instances of the Simple type are, as far as C is concerned, SimpleObject objects. As mentioned previously, it must be possible to treat a pointer to any Python object as a pointer to a PyObject. This is the purpose of the PyObject_HEAD macro. In fact, PyObject is declared like this:

typedef struct _object {
	PyObject_HEAD
} PyObject;

So we can be sure that PyObject and SimpleObject start with the same fields in the same order. A macro is needed because PyObject contains extra fields in a debug build of Python. Note that there is no semicolon after the PyObject_HEAD macro; one is included in the macro definition. Be wary of adding one by accident; it's easy to do from habit, and your compiler might not complain, but someone else's probably will! (On Windows, MSVC is known to call this an error and refuse to compile the code.)
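The layout guarantee can be checked in isolation. The following is a minimal stand-alone sketch that deliberately does not use Python.h: LAYOUT_HEAD, LayoutBase, LayoutPoint and layout_headers_match are hypothetical names invented for illustration, standing in for PyObject_HEAD, PyObject and an extension struct. The real header contains different fields.

```c
#include <stddef.h>

/* Hypothetical stand-in for PyObject_HEAD.  As with the real macro, the
   trailing semicolon is part of the definition, so none follows its use. */
#define LAYOUT_HEAD \
    long head_refcnt; \
    const char *head_typename;

/* Plays the role of PyObject: nothing but the header. */
typedef struct {
    LAYOUT_HEAD
} LayoutBase;

/* Plays the role of SimpleObject, with one type-specific field added. */
typedef struct {
    LAYOUT_HEAD
    double x;
} LayoutPoint;

/* Returns 1 if every header field sits at the same offset in both
   structs, which is the property the pointer cast relies on. */
static int
layout_headers_match(void)
{
    return offsetof(LayoutBase, head_refcnt)
               == offsetof(LayoutPoint, head_refcnt)
        && offsetof(LayoutBase, head_typename)
               == offsetof(LayoutPoint, head_typename);
}
```

Because both structs are built from the same macro, adding a field to the header changes them together, which is why a macro is used rather than repeating the fields by hand.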

Almost every type you define will include type-specific fields where the comment indicates - an object that is all behaviour and no data is a strange beast. For a concrete example, here is the corresponding definition for standard Python integers:

typedef struct {
    PyObject_HEAD
    long ob_ival;
} PyIntObject;

Moving on, we come to the crunch -- the type object. In Python terms, the concept of "a new type" is synonymous with "a new instance of the type type." Earlier, it was claimed that all Python objects live on the heap. This is in fact only nearly true; in particular, type objects are often statically allocated, as here:

static PyTypeObject SimpleObjectType = {
    PyObject_HEAD_INIT(NULL)
    0,                         /*ob_size*/
    "simpletype.Simple",       /*tp_name*/
    sizeof(SimpleObject),      /*tp_basicsize*/
    0,                         /*tp_itemsize*/
    0,                         /*tp_dealloc*/
    0,                         /*tp_print*/
    0,                         /*tp_getattr*/
    0,                         /*tp_setattr*/
    0,                         /*tp_compare*/
    0,                         /*tp_repr*/
    0,                         /*tp_as_number*/
    0,                         /*tp_as_sequence*/
    0,                         /*tp_as_mapping*/
    0,                         /*tp_hash */
    0,                         /*tp_call*/
    0,                         /*tp_str*/
    0,                         /*tp_getattro*/
    0,                         /*tp_setattro*/
    0,                         /*tp_as_buffer*/
    Py_TPFLAGS_DEFAULT,        /*tp_flags*/
    "Simple objects are simple.", /* tp_doc */
};

If you go and look up the definition of PyTypeObject in Include/object.h you'll see that it has many more fields than the definition above. The remaining fields will be filled with zeros by the C compiler, and it is common practice to not specify them explicitly unless you need them.
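The zero-filling of trailing members is ordinary C behaviour and can be seen without Python at all. Here is a small stand-alone sketch; MiniType and its members are hypothetical and far smaller than the real PyTypeObject.

```c
#include <stddef.h>

/* Hypothetical miniature of a type structure; not the real PyTypeObject. */
typedef struct {
    const char *name;
    size_t basicsize;
    void (*dealloc)(void *);
    int (*print)(void *);
    long flags;
    const char *doc;
} MiniType;

/* Only the first two members are given.  C initializes every member
   that is left out to zero (NULL for pointers). */
static MiniType mini_type = {
    "demo.Simple",
    sizeof(long),
};

/* Returns 1 if all the members omitted above are zero or NULL. */
static int
mini_trailing_slots_empty(void)
{
    return mini_type.dealloc == NULL
        && mini_type.print == NULL
        && mini_type.flags == 0
        && mini_type.doc == NULL;
}
```

Since a zero slot generally means "not provided", stopping the initializer early is equivalent to writing out a long tail of zeros.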

This is sufficiently important that we're going to pick it apart still further:

static PyTypeObject SimpleObjectType = {
    PyObject_HEAD_INIT(NULL)

As noted above, defining a new type amounts to creating a new instance of the "type" type. Since such an instance is a Python object just as much as a Python integer is, it must start with the fields of a PyObject. Supplying initial values for those fields is the purpose of the PyObject_HEAD_INIT macro. The line

    PyObject_HEAD_INIT(NULL)

itself is something of a wart. What we would have liked to have written is:

    PyObject_HEAD_INIT(&PyType_Type)

(as SimpleObjectType is to be an instance of type type) but this isn't strictly conforming C and is rejected by some compilers. Fortunately the PyType_Ready function will fill this field out for us.

    0,                          /* ob_size */

The ob_size field of the header is not used; its presence in the type structure is a historical artifact that is maintained for binary compatibility with extension modules compiled for older versions of Python. Always set this field to zero.

    "simpletype.Simple",        /* tp_name */

The name of our type. This will appear in the default textual representation of our objects and in some error messages, for example:

>>> '' + simpletype.Simple()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: cannot concatenate 'str' and 'simpletype.Simple' objects

Note that the name is a dotted name that includes both the module name and the name of the type within the module. The module in this case is simpletype and the name of the type is Simple, so we set the type name to "simpletype.Simple".

Getting the name right is also vital for supporting pickling of instances of the custom type.

    sizeof(SimpleObject),  /* tp_basicsize */

This is so that Python knows how much memory to allocate when creating instances of our type.
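To make the role of the size field concrete, here is a simplified stand-alone model of a generic allocator. It is not how Python's allocator is written (the real one also sets up the object header, among other things), and SizeDemoType, SizeDemoObject and size_demo_alloc are hypothetical names.

```c
#include <stdlib.h>

/* Hypothetical miniature type record; only the size matters here. */
typedef struct {
    const char *name;
    size_t basicsize;   /* plays the role of tp_basicsize */
} SizeDemoType;

/* An instance layout with one type-specific field. */
typedef struct {
    long refcnt;
    double value;
} SizeDemoObject;

static SizeDemoType size_demo_type = {
    "demo.Sized",
    sizeof(SizeDemoObject),
};

/* Generic allocator: it knows nothing about SizeDemoObject, only the
   size recorded in the type.  Returns zeroed memory, or NULL on failure. */
static void *
size_demo_alloc(const SizeDemoType *type)
{
    return calloc(1, type->basicsize);
}

/* Allocate an instance, use its type-specific field, and free it.
   Returns 1 on success and 0 if the allocation failed. */
static int
size_demo_roundtrip(void)
{
    SizeDemoObject *obj = size_demo_alloc(&size_demo_type);
    int ok;

    if (obj == NULL)
        return 0;
    obj->refcnt = 1;
    obj->value = 2.5;
    ok = (obj->value == 2.5);
    free(obj);
    return ok;
}
```

If the recorded size were smaller than the struct, writing to the last field would run off the end of the allocation, which is why sizeof is applied to the instance struct and not to the type object.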

    0,                          /* tp_itemsize */

This has to do with variable length objects like lists and strings. Ignore this for now.

Skipping a number of type methods that we don't provide, we set the type flags to Py_TPFLAGS_DEFAULT.

    Py_TPFLAGS_DEFAULT,        /*tp_flags*/

All types should include this constant in their flags.

We provide a doc string for the type in tp_doc.

    "Simple objects are simple.", /* tp_doc */

So far we have used nothing but the default behaviour Python supplies for objects. The only area where the default is not what we want is instantiation: by default, a type cannot be instantiated. To change this, we need to supply a tp_new method. For the usual case of an instantiable type, Python provides a function, PyType_GenericNew, which we can use. Unfortunately, we can't just include it in the definition of SimpleObjectType, because on some platforms or compilers a structure member cannot be statically initialized with a function defined in another C module. Instead, we assign the tp_new slot in the module initialization function just before calling PyType_Ready():

	SimpleObjectType.tp_new = PyType_GenericNew;
	if (PyType_Ready(&SimpleObjectType) < 0)
		return;

The PyType_Ready() function performs some sanity checks and various pieces of book-keeping.
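The overall pattern is two-phase initialization: leave some fields empty in the static initializer, then complete them at run time before the type is used. Here is a toy stand-alone model of that pattern. It is only a sketch of the idea, not what PyType_Ready actually does internally, and ReadyDemoType, ready_demo_prepare and the other names are hypothetical.

```c
#include <stddef.h>

typedef struct ReadyDemoType ReadyDemoType;

struct ReadyDemoType {
    ReadyDemoType *type;                  /* like ob_type; NULL statically */
    const char *name;
    void *(*new_slot)(ReadyDemoType *);   /* like tp_new; NULL statically */
    int ready;
};

/* Stand-in for PyType_Type, the type of all types. */
static ReadyDemoType ready_demo_type_type = { NULL, "type", NULL, 0 };

/* Stand-in for PyType_GenericNew; imagine it lives in another library. */
static void *
ready_demo_generic_new(ReadyDemoType *type)
{
    static int dummy_instance;
    (void)type;                 /* unused in this toy constructor */
    return &dummy_instance;
}

/* Phase one: the static definition leaves both pointers empty. */
static ReadyDemoType ready_demo_simple = { NULL, "demo.Simple", NULL, 0 };

/* Toy bookkeeping step: one sanity check, then fill in the type pointer. */
static int
ready_demo_prepare(ReadyDemoType *t)
{
    if (t->name == NULL)
        return -1;
    if (t->type == NULL)
        t->type = &ready_demo_type_type;
    t->ready = 1;
    return 0;
}

/* Phase two: same order as the module initialization function. */
static int
ready_demo_init(void)
{
    ready_demo_simple.new_slot = ready_demo_generic_new;
    if (ready_demo_prepare(&ready_demo_simple) < 0)
        return -1;
    return 0;
}
```

The slot is assigned before the preparation call, matching the order used in the module initialization function.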

The only remaining unfamiliar code is the code that exposes the type object to Python:

	Py_INCREF(&SimpleObjectType);
	PyModule_AddObject(m, "Simple", (PyObject *)&SimpleObjectType);

That's it! All that remains is to put

from distutils.core import setup, Extension

setup(name="simpletype", version="1.0",
      ext_modules=[Extension("simpletype", ["simpletypemodule.c"])])

in a file called simpletype-setup.py, execute

$ python simpletype-setup.py build_ext -i

and play around with your first extension type in the interactive interpreter.

The remainder of this chapter is devoted to creating extension types with more (well, any) capabilities.

THIS DOCUMENT IS A DRAFT! Comments to mwh@python.net please.