About... Prerequisites Examples Home Bugs Download Manual Links

Python tools for climate variability analysis.

What is PyClimate?

It is a Python package designed to accomplish some common tasks in the analysis of climate variability. It provides functions for simple I/O operations, operations on COARDS-compliant netCDF files, EOF analysis, SVD and CCA analysis of coupled data sets, some linear digital filters, kernel-based probability density function estimation, and access to the DCDFLIB.C library from Python.

PyClimate is not paid-Climate, it's free

Our users and we find minor bugs from time to time, so you should check the ERRATA page.

Latest events

2004 01 08
PyClimate 1.2.1 RELEASED
Several small bugs corrected. No results were affected; the bugs only caused the code to crash. See the ERRATA section.

2002 12 12
Added bug report section to the web site.

2002 12 09
PyClimate 1.2 RELEASED
( Change Log )

2002 11 15
A poster presentation at the AMIP International Workshop in Toulouse

2001 09 31
The internal calculations behind the EOF, SVD and CCA formulations used by PyClimate are explained in this document. (Updated: 2002 11 29)

2001 07 27
PyClimate version 1.1.1 released.
Because of changes in Python 2.1, the previous release, PyClimate 1.1, did not work correctly from that version on.

2001 06 13
PyClimate version 1.1 released.
( Change Log )

2001 03 05
PyClimate public presentation at the 9th International Python Conference.

2001 02 28
Talk at the Ernest Orlando Lawrence Berkeley National Laboratory. You can download the slides shown there.

2001 02 12
We were at the APMG 2001.

Who are the authors?

Jon Saenz - jsaenz@wm.lc.ehu.es
Jesus Fernandez - chus@wm.lc.ehu.es
Juan Zubillaga - wmpzuesj@lg.ehu.es
From the Department of Applied Physics II, Faculty of Sciences,
University of the Basque Country.

Under which license is it distributed?

After several years of using great GNU software for free, we felt morally obliged to distribute it under... the GNU General Public License, to give back to our partners in the Open Source movement just a small part of what we had received from them.

Can I contribute to its development?

Yes, it is an open project, but we are not granting CVS write access to any user. You can send us your routines; we will test them and may add them to the package in future versions. Of course, you will be cited as the original author!

In any case, if you find PyClimate a helpful tool for your research, we would appreciate it if you cited our work in your papers:
  • J. Saenz, J. Zubillaga and J. Fernandez (2002) Geophysical data analysis using Python, Computers and Geosciences, 28/4:457-465

Are there other mirrors of this software?

http://fisica.ehu.es/jsaenz/ (Europe)
http://starship.python.net/crew/jsaenz (USA)

Why did we write it?

We work in the field and we think that Python and NumPy are GREAT tools for atmospheric and oceanic data analysis. We used FORTRAN and C for years to analyze data, but now we mostly use Python. We are not saying that global circulation models should be written in Python alone, but for many of the routine tasks that climate analysts carry out, PyClimate and Python are sufficient, and it is not worth spending a lot of time writing equivalent C or FORTRAN programs that will be used only a few times. After creating several routines for our own research work, we decided to wrap them up as a Python package and share them with you, the users.

Which specific problems of the climate analyst can PyClimate solve?

During recent years, due to the existing concern about the detection and attribution of human-induced climate change, there is an increasing interest in a careful analysis of several instrumental data sets, as well as modelling results. Some of the data analysis relies heavily on eigenvalue techniques and matrix-oriented operations.

For instance:

  • The analysis of the joint spatial and temporal variability of scalar or vector fields such as geopotential height, precipitation or temperature over a wide area aims to identify the main modes of variability embedded in a data set that varies in both time and space. It is often performed by means of the Empirical Orthogonal Function (EOF) approach, based on the Karhunen-Loève decomposition of the joint temporal and spatial variability of the fields.
  • Similarly, in the analysis of the coupled variability of geophysical data sets, eigenvalue techniques like Canonical Correlation Analysis (CCA) or the Singular Value Decomposition (SVD) of the covariance matrices are standard tools.
  • Because both the data and the solutions are naturally expressed as matrices, the corresponding algorithms are simple to code with Numeric Python and its LinearAlgebra routines.
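As an illustration of how compact these matrix-oriented algorithms can be, here is a minimal sketch of an EOF decomposition and of the SVD of a cross-covariance matrix. It is a generic textbook formulation, not PyClimate's own routines, and it assumes modern NumPy (numpy.linalg) in place of Numeric Python and LinearAlgebra. The function names eof_analysis and coupled_svd and the synthetic data are hypothetical.

```python
import numpy as np


def eof_analysis(field):
    """EOF decomposition of a (time, space) array via SVD of the anomalies.

    Returns (pcs, eofs, variance_fraction). Generic sketch, not PyClimate's API.
    """
    field = np.asarray(field, dtype=float)
    # Remove the time mean at every grid point to obtain anomalies.
    anomalies = field - field.mean(axis=0)
    # Thin SVD: anomalies = u @ diag(s) @ vt
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt                # each row is an orthonormal spatial pattern
    pcs = u * s              # each column is the matching time series
    variance_fraction = s**2 / np.sum(s**2)
    return pcs, eofs, variance_fraction


def coupled_svd(left, right):
    """SVD of the cross-covariance matrix of two (time, space) fields."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    left_anom = left - left.mean(axis=0)
    right_anom = right - right.mean(axis=0)
    n_time = left_anom.shape[0]
    cross_cov = left_anom.T @ right_anom / (n_time - 1)
    u, s, vt = np.linalg.svd(cross_cov, full_matrices=False)
    squared_cov_fraction = s**2 / np.sum(s**2)
    # Rows of the returned pattern arrays are the paired singular vectors.
    return u.T, vt, squared_cov_fraction


# Synthetic data (hypothetical): one oscillating spatial pattern plus weak noise.
rng = np.random.default_rng(0)
time = np.arange(200)
pattern = np.sin(np.linspace(0.0, np.pi, 30))
field = np.outer(np.sin(2.0 * np.pi * time / 25.0), pattern)
field += 0.05 * rng.standard_normal(field.shape)

pcs, eofs, variance_fraction = eof_analysis(field)

# A second field that shares the same temporal signal on a smaller grid.
other = 2.0 * field[:, :10] + 0.05 * rng.standard_normal((200, 10))
left_patterns, right_patterns, scf = coupled_svd(field, other)
```

With this synthetic input the leading mode carries almost all of the variance, and the product pcs @ eofs reconstructs the anomaly field.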

On the other hand, atmospheric and oceanic data sets are usually distributed in widely differing formats. Each format offers some very interesting features, but together they make access to individual records and hyperslabs difficult to handle.

Just to name a few instances:
  • The use of netCDF files is very common (see, for instance, the NCEP/NCAR Reanalysis Homepage). Globally gridded data sets are usually distributed on regular latitude/longitude grids, which makes the so-called COARDS-compliant netCDF files very attractive.
  • However, the use of global spectral models is also very common in meteorological applications. In this case, the data is usually stored in the GRIB format endorsed by the World Meteorological Organization (WMO), which allows the storage and retrieval of four-dimensional spectral grids (time, vertical level, latitude and longitude) and provides extra features not currently supported by the netCDF interface.
  • Finally, just to mention a few other data formats, satellite data is often stored using the HDF interface, while instrumental surface data is very often distributed as compressed ASCII files (CRU 05 data set, GOSTA-8...) or even as binary packed data, like the GTOPO30 topography data set.

This means that a researcher in the atmospheric or oceanographic sciences routinely needs a way to access these data sets and convert them from one format to another in order to analyse them quantitatively. Thus, a flexible tool like Python with its Numeric extensions is very helpful.

  • We can directly read binary data from a binary file using file objects and convert it into its Numeric representation with fromstring().
  • We can call external programs like wgrib to dump the data records in a native representation and then access the records from the interpreter, storing them in a different format. Creating netCDF files from scratch using the Scientific.IO.NetCDF interface by Konrad Hinsen is much easier than using equivalent C or FORTRAN programs.
  • Similarly, the array-oriented nature of the netCDF COARDS interface makes it very easy to access entire arrays or selected hyperslabs of data in a COARDS-compliant netCDF file. Briefly stated, Python, Numeric Python and other modules/packages accessible from the Python interpreter make it very easy to work with data in very different and heterogeneous formats.
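The first and last points can be sketched as follows. This is only an illustration under stated assumptions: it uses modern NumPy, where numpy.frombuffer plays the role that fromstring() plays in Numeric, and the file layout (big-endian 32-bit floats ordered as time, latitude, longitude) and the file name are hypothetical.

```python
import os
import tempfile

import numpy as np

# Hypothetical layout: 4 time steps on a 3 x 5 grid of big-endian float32.
n_time, n_lat, n_lon = 4, 3, 5
original = np.arange(n_time * n_lat * n_lon, dtype=">f4")
original = original.reshape(n_time, n_lat, n_lon)

# Write a raw binary file to stand in for a distributed data set.
path = os.path.join(tempfile.mkdtemp(), "field.bin")
with open(path, "wb") as handle:
    handle.write(original.tobytes())

# Read the bytes back through a plain file object ...
with open(path, "rb") as handle:
    raw = handle.read()

# ... and reinterpret them as an array of known type and shape.
field = np.frombuffer(raw, dtype=">f4").reshape(n_time, n_lat, n_lon)

# A hyperslab is then just a slice: time steps 1-2, all latitudes,
# longitudes 2-3.
hyperslab = field[1:3, :, 2:4]
```

The same slicing syntax applies to variables read from a netCDF file through an array-oriented interface, which is what makes hyperslab access so convenient.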

Data used in the analysis of climate variability reflect different physical phenomena. This explains why temporal variations in the fields usually contain a range of different frequencies: they are broad-band signals. However, it is usually interesting to separate the different scales of "motion" in a geophysical field with the aim of separating different physical effects. Such is the case, for instance, when one is interested in separating the high-frequency (periods in the 2-10 day range) transients of the extratropical atmospheric flow, attributed to the baroclinic instability of the flow, from the so-called low-frequency variability (LFV) of the extratropical atmospheric circulation, associated with monthly and longer time scales. Similarly, it is of primary interest to analyse the interchange of Rossby waves between midlatitudes and tropical latitudes in the 6-25 day time scale, which allows one to separate the extratropical-only short-wavelength baroclinic systems from the extratropical motions and to attenuate the tropical-only motions associated with the Madden-Julian Oscillation and mixed Rossby-gravity waves, which are of primary importance in the tropical latitudes, but do not exist outside the tropics.
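To make the idea of separating time scales concrete, here is a small sketch that isolates the 6-25 day band of a synthetic daily series. Note that it masks Fourier coefficients, which is a different technique from the linear digital filters that PyClimate provides; it is used here only because it fits in a few lines. The function name fft_bandpass and the three synthetic components are hypothetical.

```python
import numpy as np


def fft_bandpass(series, min_period, max_period, dt=1.0):
    """Keep only Fourier components with periods in [min_period, max_period].

    Periods are in the same units as dt. Sketch only: a sharp spectral mask
    assumes a periodic, evenly sampled series without gaps.
    """
    series = np.asarray(series, dtype=float)
    freqs = np.fft.rfftfreq(series.size, d=dt)
    spectrum = np.fft.rfft(series)
    keep = (freqs >= 1.0 / max_period) & (freqs <= 1.0 / min_period)
    return np.fft.irfft(spectrum * keep, n=series.size)


# Synthetic daily series (hypothetical): 3-day, 12-day and 60-day components.
days = np.arange(360)
synoptic = np.sin(2.0 * np.pi * days / 3.0)
intermediate = 0.5 * np.sin(2.0 * np.pi * days / 12.0)
low_frequency = 2.0 * np.sin(2.0 * np.pi * days / 60.0)
series = synoptic + intermediate + low_frequency

# Only the 12-day component falls inside the 6-25 day band.
band = fft_bandpass(series, min_period=6.0, max_period=25.0)
```

Because every component completes a whole number of cycles in 360 days, the band-passed series matches the 12-day component to rounding error. Real data would show leakage at the ends of the record.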

In the ocean, it is important to distinguish between internal gravity waves and Rossby waves on the basis of their periods. Thus, it is very important to be able to perform digital filtering of multivariate data sets without considering in detail the structure of the field, which can be one-dimensional (the time series at a single measurement site), two-dimensional (the time series of a zonal average), three-dimensional (the time-varying global sea-surface temperature) or four-dimensional (the time-varying geopotential height field at several vertical levels and grid points in a latitude/longitude grid).
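A filter that ignores the structure of the field can be sketched by applying the weights along the first (time) axis only and letting array slicing take care of the remaining dimensions. The sketch below uses the standard Lanczos low-pass weights; it is not PyClimate's filter implementation, and the function names, cut-off and window length are hypothetical choices. It assumes modern NumPy.

```python
import numpy as np


def lanczos_lowpass_weights(cutoff, half_width):
    """Lanczos low-pass weights.

    cutoff     -- cut-off frequency in cycles per time step (0 < cutoff < 0.5)
    half_width -- number of weights on each side of the central one
    """
    k = np.arange(-half_width, half_width + 1)
    # Ideal low-pass response truncated to 2 * half_width + 1 terms ...
    ideal = 2.0 * cutoff * np.sinc(2.0 * cutoff * k)
    # ... smoothed with the Lanczos sigma factor to reduce Gibbs ripples.
    sigma = np.sinc(k / half_width)
    weights = ideal * sigma
    return weights / weights.sum()


def lowpass_along_time(data, cutoff, half_width):
    """Filter along axis 0 of an array with any number of dimensions.

    The first and last half_width time steps are lost.
    """
    data = np.asarray(data, dtype=float)
    weights = lanczos_lowpass_weights(cutoff, half_width)
    n_out = data.shape[0] - 2 * half_width
    filtered = np.zeros((n_out,) + data.shape[1:])
    # Weighted sum of shifted copies; works for any trailing shape.
    for k, weight in enumerate(weights):
        filtered += weight * data[k:k + n_out]
    return filtered


# One-dimensional case: a slow (50-day) plus a fast (3-day) oscillation.
days = np.arange(400)
slow = np.sin(2.0 * np.pi * days / 50.0)
fast = np.sin(2.0 * np.pi * days / 3.0)
smooth = lowpass_along_time(slow + fast, cutoff=0.1, half_width=20)

# Four-dimensional case: (time, level, latitude, longitude).
field = np.ones((120, 2, 3, 4))
smooth_field = lowpass_along_time(field, cutoff=0.1, half_width=20)
```

The same function handles the single-site series and the four-dimensional field without any change, which is the point made above.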

The generic nature of some of Numeric Python's array-oriented operations allows these operations to be coded easily for generically shaped arrays while achieving good performance in the computations, which is a very important requirement for this task. Consider, for instance, that the 12-hourly geopotential height data over the whole earth for a part of the Reanalysis period (1967-1998), on a regular 2.5 degree x 2.5 degree latitude/longitude grid with 6 vertical levels (1000, 850, 700, 500, 300 and 200 hPa surfaces), takes about 2.8 GB in a netCDF file, even after packing the floating point values into Int16 accuracy through an "offset plus scaling" approach.
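The size quoted above can be checked with a short calculation, and the "offset plus scaling" idea can be sketched in a few lines. Both parts rest on assumptions that are not in the text: complete calendar years from 1967 to 1998, a 73 x 144 grid that includes both poles, and the usual netCDF convention unpacked = packed * scale_factor + add_offset. The helper names and the sample values are hypothetical.

```python
from datetime import date

import numpy as np

# Assumption: complete years 1967 to 1998 inclusive, two fields per day.
n_days = (date(1999, 1, 1) - date(1967, 1, 1)).days
n_times = 2 * n_days
n_levels = 6
n_lat = int(180 / 2.5) + 1      # 73 latitudes, both poles included
n_lon = int(360 / 2.5)          # 144 longitudes
bytes_per_value = 2             # Int16
size_bytes = n_times * n_levels * n_lat * n_lon * bytes_per_value
size_gib = size_bytes / 2**30


def pack_int16(values):
    """Pack floats into int16 with an offset and a scale factor."""
    values = np.asarray(values, dtype=float)
    vmin, vmax = values.min(), values.max()
    add_offset = 0.5 * (vmin + vmax)
    # Map [vmin, vmax] onto [-32767, 32767].
    scale_factor = (vmax - vmin) / 65534.0
    if scale_factor == 0.0:     # constant field: any non-zero scale works
        scale_factor = 1.0
    packed = np.round((values - add_offset) / scale_factor).astype(np.int16)
    return packed, scale_factor, add_offset


def unpack_int16(packed, scale_factor, add_offset):
    """Recover approximate floats from the packed representation."""
    return packed * scale_factor + add_offset


# Hypothetical sample: heights in metres spanning a plausible range.
heights = np.linspace(4800.0, 5900.0, 1000)
packed, scale_factor, add_offset = pack_int16(heights)
restored = unpack_int16(packed, scale_factor, add_offset)
max_error = np.max(np.abs(restored - heights))
```

Under these assumptions the data alone come to roughly 2.9 x 10^9 bytes, in the same range as the figure quoted above. The packing halves the storage of 32-bit floats at the cost of a rounding error of at most half a scale step.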

Which platforms are supported?

Currently, we have been able to install the package using distutils on several UNIX machines running Linux, OSF, FreeBSD, IRIX and AIX. Unfortunately, we are unable to create a Windows distribution and will not produce one ourselves, so volunteers would be appreciated.
