Here's the complete source to a module that exposes one function
called func() that takes two arguments and returns their sum:

    #include <Python.h>

    static char second_doc[] =
    "This module is just a simple example. It provides one function: func().";

    static PyObject*
    second_func(PyObject *self, PyObject *args)
    {
        PyObject *a, *b;

        if (!PyArg_UnpackTuple(args, "func", 2, 2, &a, &b)) {
            return NULL;
        }

        return PyNumber_Add(a, b);
    }

    static char second_func_doc[] =
    "func(a, b)\n\
    \n\
    Return the sum of a and b.";

    static PyMethodDef second_methods[] = {
        {"func", second_func, METH_VARARGS, second_func_doc},
        {NULL, NULL}
    };

    PyMODINIT_FUNC
    initsecond(void)
    {
        Py_InitModule3("second", second_methods, second_doc);
    }

Although this is still fairly short, there's quite a bit to explain.
This module is essentially identical to the following Python module:

    def func(a, b):
        return a + b

Something that helps greatly when writing C extensions is an appreciation of the use and value of various conventions, or perhaps "patterns", both in the stylistic aspects of the source and in the design of the C API.
A large fraction of C functions that are exposed to Python are defined like this:

    static PyObject *
    <MODNAME>_<FUNCNAME>(PyObject *self, PyObject *args)
    {
        ...
    }

In this example the function func() is just a module-level
function and the self parameter will always be NULL. The
args parameter will contain a tuple of the arguments the
function was called with.
There's not much point in defining a function without allowing access to
it. This is the purpose of the second_methods array: it lists
the functions exposed to Python by the module.
Each entry in the array is a struct of type PyMethodDef and has
four fields, in order:
| field | meaning |
|---|---|
| ml_name | The name of the function as seen from Python. |
| ml_meth | A pointer to the C implementation of the function. |
| ml_flags | A flag indicating the calling convention the function uses. METH_VARARGS is the usual choice here; the alternatives will be explained in a later section. |
| ml_doc | The function's docstring, or NULL if you want to be evil and have a function with no docstring. |
The last entry in the array is indicated by a sentinel entry filled
with NULLs.
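
The effect of the ml_name and ml_doc fields can be observed from Python on any function implemented in C. The sketch below uses the built-in len as a stand-in for func(), on the assumption that the interpreter is CPython, where len is registered through a method table of this kind; the example module itself is not needed to run it.

```python
import types

# len stands in for second.func: in CPython it is a C function
# registered through a PyMethodDef entry (an assumption about the
# interpreter, not something the example module provides).
c_function = len

name = c_function.__name__   # taken from the ml_name field
doc = c_function.__doc__     # taken from the ml_doc field (None if it was NULL)

# Functions created from a method table have this built-in type.
is_builtin = isinstance(c_function, types.BuiltinFunctionType)
```

A function written in Python has a different type, which is one way to tell from the interpreter whether a given callable came from a method table.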
One of the conventions I mentioned earlier is that the name of the C function implementing a Python function is of the form MODNAME_FUNCNAME. As you can see from the above, you can call it anything you like, but it is sensible to keep some relation between the two names.
The second_methods array is passed as the second parameter to
Py_InitModule3. There are other ways to arrange for functions
to be available at module level, but this is the easiest - and
conventional - way.
Having covered (lightly) exposing functions to Python, we now turn to
second_func's implementation.
The first thing most functions should do is verify that the arguments
match expectations, both in number and type. This function accepts
arguments of any type, so we use the API function
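PyArg_UnpackTuple. PyArg_UnpackTuple takes a variable
number of arguments:
- the tuple of arguments to unpack (a METH_VARARGS function
  knows its args parameter is a tuple, so there is no need to
  check);
- the name of the function, for use in error messages;
- the minimum (min) and maximum (max) number of arguments the
  function accepts;
- max pointers to variables of type PyObject *.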
If a suitable number of arguments is passed, the variables whose
addresses were given to PyArg_UnpackTuple are filled in with the
passed arguments (if at least min but fewer than max arguments are
passed, the variables for the missing arguments are left alone, so
it's best to initialize them to NULL; this does not matter in our
example as both min and max are 2).
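
The behaviour just described can be modelled in a few lines of Python. This is only a sketch: unpack_tuple is a hypothetical name, the wording of the error message is not that of the real function, and None stands in for a variable the caller initialized to NULL.

```python
def unpack_tuple(args, name, min_args, max_args):
    """Hypothetical Python model of PyArg_UnpackTuple's count check."""
    count = len(args)
    if count < min_args or count > max_args:
        # The real function sets a TypeError and returns 0; raising
        # models that here. The message text is an illustration only.
        raise TypeError(
            f"{name} expected {min_args} to {max_args} arguments, got {count}"
        )
    # Slots for arguments that were not passed keep their initial
    # value, modelled as None (standing in for NULL).
    return tuple(args) + (None,) * (max_args - count)


# Exactly two arguments, as in the example module.
a, b = unpack_tuple((1, 2), "func", 2, 2)

# One optional argument left out: its slot is untouched.
first, second = unpack_tuple((1,), "pair", 1, 2)

# Too many arguments: the check fails.
try:
    unpack_tuple((1, 2, 3), "func", 2, 2)
except TypeError as exc:
    error = str(exc)
```

The second call shows why initializing the variables matters: without an initial value there would be nothing sensible in the slot for the missing argument.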
There is a related function, PyArg_ParseTuple, that can do more
sophisticated argument checking and processing, but
PyArg_UnpackTuple is more efficient in the cases where it
applies.
PyArg_UnpackTuple returns a true value on success and 0
on failure, as do the other PyArg_* functions. Slightly confusingly,
most other API functions that return an int signal an error by
returning -1 instead.
The convention for C API functions that return PyObject*s is
much clearer: NULL indicates failure, non-NULL (and by
implication, a valid pointer) indicates success. With a very
small number of (not accidental or historical) exceptions, a
NULL return is an indication that the called function set an
exception. Conversely, a non-NULL return indicates that no
exception is set.
These conventions apply just as strongly to C functions you write, and getting them wrong should be considered a severe bug in your code.
It is important to note (and tedious to live with) the fact that
essentially any API function can fail - MemoryError in
particular is almost impossible to rule out.
The foregoing has two notable impacts on the code in the example. The
first is that if PyArg_UnpackTuple returns 0, we know it
has set an exception, so we just return NULL. The other is that
PyNumber_Add follows the same convention our function must follow:
it returns NULL with an exception set on failure and a valid object
on success, so we can return its result directly without checking
it. This is no accident: by and large the calling conventions of the
Python/C API are designed to make programming with it convenient.
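
The NULL-means-exception convention can be watched in action without compiling anything, by calling PyNumber_Add through ctypes. This is a sketch that assumes a CPython interpreter in which ctypes.pythonapi is available; ctypes turns a NULL return with an exception set into an ordinary Python exception.

```python
import ctypes

# ctypes.pythonapi exposes the C API of the running CPython interpreter.
number_add = ctypes.pythonapi.PyNumber_Add
number_add.argtypes = [ctypes.py_object, ctypes.py_object]
number_add.restype = ctypes.py_object

# Success: a non-NULL result comes back as a normal Python object.
total = number_add(2, 3)

# Failure: PyNumber_Add returns NULL with an exception set, and ctypes
# re-raises that exception in the calling Python code.
try:
    number_add(2, None)
except TypeError as exc:
    failure = exc
else:
    failure = None
```

A C function that simply returns the result of PyNumber_Add, as second_func does, hands the same NULL-plus-exception state on to its own caller.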
To return to explaining what the example function actually does...
In simple terms, it receives two arguments and returns their sum. We've covered checking the number of arguments but have not yet discussed how Python objects are presented to the C code that deals with them.
All Python objects are allocated on the heap. They can all be
referred to through pointers to a PyObject. A PyObject just
contains a reference count and a pointer to a type object. So when a
function receives arguments of arbitrary type, as the example does, it
will be dealing with pointers to PyObject.
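
Both fields of a PyObject are visible from Python. The sketch below assumes CPython, where sys.getrefcount reports the reference count; the absolute number varies between interpreter versions, so only the change is examined. Thing is a hypothetical class used for illustration.

```python
import sys


class Thing:
    """Hypothetical class; any ordinary object would do."""


obj = Thing()

# The type pointer stored in the object is what type() reports.
kind = type(obj)

# Take a reading, create three more references, and read again.
before = sys.getrefcount(obj)
holders = [obj, obj, obj]   # the list holds three references to obj
after = sys.getrefcount(obj)

added = after - before
```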
The spec for the example function says that it returns the sum of the
arguments. The Python/C API provides a function to do this:
PyNumber_Add. This is part of what is sometimes termed the
"abstract objects layer" of the API: it allows you to operate on
objects in a similar fashion to the way interpreted Python code does.
In this case PyNumber_Add is equivalent to the +
operator in Python, as good a definition of the "sum" of two values as
any in Python. You can probably guess the names of the functions that
subtract and multiply two values.
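
Because PyNumber_Add follows the same rules as +, the result depends on the types of the arguments exactly as it would in Python code. The sketch below uses operator.add, the Python-level spelling of the + operator, as a stand-in for the C call; operator.sub and operator.mul play the same role for subtraction and multiplication.

```python
import operator

# The same call handles any pair of types that supports +.
int_sum = operator.add(2, 3)
str_sum = operator.add("spam", "eggs")
list_sum = operator.add([1], [2, 3])

# Unsupported combinations fail with TypeError, just as 1 + "one" does.
try:
    operator.add(1, "one")
except TypeError:
    mixed_failed = True
else:
    mixed_failed = False

# The counterparts for the other two operations mentioned above.
difference = operator.sub(7, 2)
product = operator.mul(6, 7)
```

This is why the example function does no type checking of its own: whatever + would accept or reject, the abstract layer accepts or rejects in the same way.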
Something that is notable by its absence from the example function above is explicit manipulation of reference counts. This is another example of the conventions of the Python/C API making programming convenient. It's not always so convenient.
THIS DOCUMENT IS A DRAFT! Comments to mwh@python.net please.