2.4 Handling Different Types of Data

Although we've covered a fair amount of ground already, we're still not in a position to write an actually useful extension. Part of the reason is that we've only handled raw PyObjects. One of the primary purposes of writing C extensions is interfacing with third-party APIs, and not many of them take PyObject*s!

As was mentioned earlier, all Python objects are allocated on the heap, and the first few bytes of every object can be interpreted as being those of a struct PyObject. In a release build, a PyObject just consists of a reference count and a pointer to a type object. The reference count we have seen. The type object determines almost every aspect of the behavior of the object.
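To make the layout concrete, here is a minimal standalone sketch of the idea. Every name in it (toy_type, toy_object, toy_int, toy_type_name) is a hypothetical stand-in invented for illustration; these are not CPython's real definitions, which live in Python.h and carry more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not CPython's definitions. */
typedef struct toy_type {
    const char *name;       /* e.g. "int" or "float" */
} toy_type;

/* The common header: a reference count and a pointer to a type object. */
typedef struct toy_object {
    long refcount;
    const toy_type *type;
} toy_object;

/* A concrete object starts with the common header, then adds its own data. */
typedef struct toy_int {
    toy_object head;
    long value;
} toy_int;

static const toy_type toy_int_type = { "int" };

/* Because the header comes first, any object pointer can be read through it. */
static const char *
toy_type_name(const void *obj)
{
    assert(obj != NULL);
    return ((const toy_object *)obj)->type->name;
}
```

Given a toy_int initialised as { { 1, &toy_int_type }, 42 }, calling toy_type_name on its address returns the name stored in toy_int_type, without the caller needing to know that the object is a toy_int.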

In many extension functions, you can use the argument processing function PyArg_ParseTuple to check the types of the arguments passed to your function and extract their values in a format more usual to C programs.

For instance, here's a module that implements type-specific addition functions (there's not much point to this, but never mind):

#include <Python.h>

static char third_doc[] = 
"This module is just a simple example.";

static PyObject*
third_addi(PyObject *self, PyObject *args)
{
	long a, b;
	
	if (!PyArg_ParseTuple(args, "ll:addi", &a, &b)) {
		return NULL;
	}

	return PyInt_FromLong(a + b);
}

static char third_addi_doc[] = 
"addi(a, b) -> int\n\
\n\
Return the sum of integers a and b.";

static PyObject*
third_addf(PyObject *self, PyObject *args)
{
	double a, b;
	
	if (!PyArg_ParseTuple(args, "dd:addf", &a, &b)) {
		return NULL;
	}

	return PyFloat_FromDouble(a + b);
}

static char third_addf_doc[] =
"addf(a, b) -> float\n\
\n\
Return the sum of floats a and b.";

static PyMethodDef third_methods[] = {
	{"addi", third_addi, METH_VARARGS, third_addi_doc},
	{"addf", third_addf, METH_VARARGS, third_addf_doc},
	{NULL, NULL}
};

PyMODINIT_FUNC
initthird(void)
{
	Py_InitModule3("third", third_methods, third_doc);
}

The main novelty here is the use of the PyArg_ParseTuple function. This function (like PyArg_UnpackTuple) takes a variable number of arguments:

  1. args: An argument tuple.
  2. format: A format string describing the expected arguments. It may end with a colon followed by the name of the function, which can appear in error messages.
  3. ...: A variety of arguments, the type and number of which are determined by format.

PyArg_ParseTuple is quite a complex function and no attempt to describe every intricacy is made here.

In common usage, each character of the format string indicates that the corresponding variadic argument is a pointer to a given C type. PyArg_ParseTuple attempts to convert the corresponding Python-level positional argument to that type and stores the result through the given pointer.

So, in the function third_addi in the above module, "ll" declares that the final two arguments are pointers to C variables of type long. Attempting to pass values that cannot be implicitly converted to Python integers - strings or files, say - will result in PyArg_ParseTuple setting an exception and returning false (0), which is why the function returns NULL in that case. Conversely, passing integers or instances of a user-defined class that defines an __int__ method stores the integer values obtained from these objects in the C variables a and b (Python integers are implemented using C longs, so long is the most natural type to use here).

Here's a list of a few of the simplest format codes:

Code  C type
"l"   long
"i"   int
"d"   double
"f"   float
"s"   char * (a C string)
"O"   PyObject *
"|"   (no C type) indicates that the following arguments are optional.

For more details, please see the description of PyArg_ParseTuple in the Python/C API documentation.
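The way a format string drives the conversion can be modelled in a few lines of standalone C. The function below, toy_parse, is a hypothetical and much-simplified stand-in written for this illustration, not CPython's implementation: it converts an array of C strings rather than a Python tuple, sets no exception, and understands only "l", "d", "s", "|" and a trailing ":name". It does follow the same return convention of true on success and false on failure.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>

/*
 * toy_parse: hypothetical, simplified model of PyArg_ParseTuple.
 * args/nargs play the role of the argument tuple; each format code
 * consumes one argument and one pointer from the variadic list.
 * Returns 1 on success and 0 on failure.
 */
static int
toy_parse(const char *const *args, int nargs, const char *format, ...)
{
    va_list targets;
    int used = 0;       /* arguments consumed so far */
    int optional = 0;   /* set once "|" has been seen */
    int ok = 1;
    const char *f;

    assert(format != NULL);
    va_start(targets, format);
    /* Stop at ":" because the rest is only the function name. */
    for (f = format; ok && *f != '\0' && *f != ':'; f++) {
        char *end;

        if (*f == '|') {
            optional = 1;
            continue;
        }
        if (used == nargs) {
            /* Out of arguments: acceptable only for optional ones, which
               keep whatever default the caller stored beforehand. */
            ok = optional;
            break;
        }
        switch (*f) {
        case 'l':
            *va_arg(targets, long *) = strtol(args[used], &end, 10);
            ok = (end != args[used] && *end == '\0');
            break;
        case 'd':
            *va_arg(targets, double *) = strtod(args[used], &end);
            ok = (end != args[used] && *end == '\0');
            break;
        case 's':
            *va_arg(targets, const char **) = args[used];
            break;
        default:
            ok = 0;     /* unknown format code */
            break;
        }
        used++;
    }
    va_end(targets);
    /* Leftover arguments count as an error too. */
    return ok && used == nargs;
}
```

With the two strings "2" and "3" as arguments, toy_parse(args, 2, "ll:addi", &a, &b) stores 2 and 3 in a and b and returns 1; replacing "3" with "spam" makes it return 0. With the format "s|l:repeat" and a single argument, the long variable is left at whatever default the caller assigned before the call, which is the usual idiom for optional arguments.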

A final note is that parsing the format string on each invocation of the function takes time. If the call to PyArg_ParseTuple isn't doing much - if you aren't taking advantage of the type checking and processing it provides - a call to PyArg_UnpackTuple is probably clearer and more efficient.