[Python-au] Processing large amounts of data

Garth T Kidd garthk at gmail.com
Tue Jun 19 03:48:23 UTC 2007


Depends what kind of processing is involved, and how much bigger than
main memory the data is. Your solution might be as simple as a database
or a pile of flat files. Or, if your emphasis is processing and you have
to scale to dozens or thousands of nodes, implement MapReduce.
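For the flat-file end of that spectrum, the usual trick is to stream the
data one record at a time instead of loading it all, so memory use stays
constant regardless of file size. A minimal sketch (the file name and
column are just made up for the demo):

```python
import csv

def column_total(path, column):
    """Sum one column of a CSV, reading a single row at a time."""
    total = 0.0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):  # rows are yielded lazily, never all in memory
            total += float(row[column])
    return total

# Tiny demo file; the same loop handles files far larger than RAM.
with open("sales.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["item", "amount"])
    writer.writerows([["a", "1.5"], ["b", "2.5"]])

print(column_total("sales.csv", "amount"))  # 4.0
```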

MapReduce description: http://labs.google.com/papers/mapreduce.html

MapReduce summary on Wikipedia: http://en.wikipedia.org/wiki/MapReduce

Comment on a potential Python implementation:

A simple Python implementation:

There would also seem to be a more complicated, remote-capable version,
if only I could get the DNS lookup to resolve:
http://agentmine.com/blog/2005/11/30/mapreduce-in-python (referred to
by http://home.badc.rl.ac.uk/lawrence/blog/2006/03/03/mapreduce_and_pyro).
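The core of the idea fits in a few lines of Python: a mapper emits
(key, value) pairs, the framework groups values by key, and a reducer
combines each group. A single-machine sketch (function names are my own;
word count is the canonical example, not anything from the links above):

```python
from collections import defaultdict

def mapreduce(records, mapper, reducer):
    # Map phase: each record may emit any number of (key, value) pairs.
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce phase: combine all values collected for each key.
    return {key: reducer(key, values) for key, values in groups.items()}

def map_words(line):
    for word in line.split():
        yield word.lower(), 1

def reduce_counts(word, counts):
    return sum(counts)

lines = ["the quick brown fox", "the lazy dog"]
print(mapreduce(lines, map_words, reduce_counts))
# {'the': 2, 'quick': 1, 'brown': 1, 'fox': 1, 'lazy': 1, 'dog': 1}
```

The distributed versions work the same way; they just partition the
records across nodes for the map phase and shuffle the groups across
nodes before the reduce phase.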


On 18/06/07, Ian Bourke <ian.bourke at qbe.com> wrote:
> As a newbie to python, I was hoping that someone on this list could give me
> some advice on different approaches to processing large amounts of data in
> python, or where I can access information about this issue. To qualify "large
> amounts of data", I would say more than can fit in physical memory.
>  Regards
>  IanB
