m
寏Fc           @   sS  d  Z  d f  d     YZ d k Z d k l Z d e e f d     YZ [ [ d e f d     YZ d	 e f d
     YZ d e f d     YZ d k	 l
 Z
 d e
 e f d     YZ d e e e e f d     YZ d   Z d   Z d   Z d e
 f d     YZ [ [ [ d Z e e i d d d d e _ e e i d d d e _ d S(   s  Representation of distributions.

This is about giving a numeric semantics to dictionaries whose keys are the
putative values of some quantity: to each, associate an interval (whose edges
aren't specified but lie between the given key and its neighbours); the value
for a key is the integral of the distribution across the interval associated
with that key.

In caricature at least, the value of a key is the probability that a random
selection from the distribution would be closer to the given key than to any of
the other keys in the same dictionary: and the keys are roughly the odd 2n-iles,
where n is the number of keys, so have roughly equal weights.

When combining two such `numeric' values, we want to combine each key of one
with each key of the other, using the result as a key in the composite
dictionary.  This key gets a contribution to its value which is the product of
the values of the two keys combined to produce it: if it is the result of
several such combinations, its value will be a sum of such products.  Then take
the resulting big bok of weighted values, compute its odd 2n-iles for some
chosen n, use these as the keys of a dictionary in which weight[key] is the sum
of the big bok's values for keys which are closer to this key of weight than to
any other.
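
A minimal sketch of the first half of that recipe (pairing keys and summing products of weights), written in modern Python. The helper name `combine` and the plain-dict representation are assumptions made for illustration, not this module's API; the re-sampling of the result down to odd 2n-iles is deliberately left out.

```python
from collections import defaultdict

def combine(left, right, func):
    """Joint weight dictionary for func(x, y), x drawn from left, y from right."""
    result = defaultdict(float)
    for x, wx in left.items():
        for y, wy in right.items():
            # Each pairing contributes the product of its two weights;
            # pairings that land on the same key accumulate.
            result[func(x, y)] += wx * wy
    return dict(result)

# Two independent quantities, each equally likely to be 1 or 2, added together.
coin = {1: 0.5, 2: 0.5}
summed = combine(coin, coin, lambda a, b: a + b)
```

Here the key 3 arises from two pairings (1+2 and 2+1), so it collects twice the weight of either 2 or 4.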

Various classes with Weighted in their names provide the underlying
implementation for that; the class Sample packages this functionality up for
external consumption.

$Id: sample.py,v 1.38 2007/07/07 15:21:57 eddy Exp $
t   _baseWeightedc           B   s    t  Z d  Z d   Z d   Z RS(   s  Base class for weight dictionaries.

    A `weight dictionary' has `data points' as its keys and weights as the
    associated values.  It is to be read as a `distribution': typically, as a
    probability distribution (when the sum of the weights is one).  The presence
    of a given { key: value } within the dictionary indicates that there is some
    neighbourhood of the data point, key, in which there is a probability value
    of finding your random thing.  I aim to coerce things into a form where, to
    the best of my knowledge, the neighbourhood of each key lies between the
    neighbouring keys above and below.

    The various kinds of functionality layered together to make this are split
    from one another in this file to allow the possibility of re-use of some of
    the toolset by alloying with some other implementation of other parts of the
    toolset.  Each part's implementation assumes the exported functionality of
    the other parts, so only alloys analogous to Weighted are viable: instances
    of the component classes won't work.

    Sub-classes:

      statWeighted -- provides for statistical computations;
      joinWeighted -- provides for combining distributions (and transforming them);
      repWeighted -- integrating and rounding;
      _Weighted -- Object packaging a weights dictionary.

    These are all alloyed together to build the final class, Weighted, which is
    what Sample actually uses. c         C   s   t  |  i    S(   N(   t   maxt   selft   keys(   R   (    (    t(   /home/eddy/.sys/py/study/value/sample.pyt   high=   s    c         C   s   t  |  i    S(   N(   t   minR   R   (   R   (    (    R   t   low>   s    (   t   __name__t
   __module__t   __doc__R   R   (    (    (    R   R       s    	N(   s   Lazyt   curveWeightedc           B   s}   t  Z d  Z h  d d d <d d d <Z e e d d  Z d e f d     YZ d	   Z e	 i
 d
  Z e	 i
 d  Z RS(   s  Interpretation of the weights dictionary as a curve.

    This introduces an implicit curve described by the weight dictionary.  If
    there is only one key, we have no idea how broadly it should spread: an
    exact `delta function' is assumed.  Otherwise, the gaps between adjacent
    keys are used for interpolation: and the boundary keys are extrapolated
    outwards to an extent comparable with their extent inwards.

    For this version, the curve is piecewise constant - i.e. it's a sum of
    uniform distributions.  Each weight in self's dictionary is spread evenly
    across the interval between the mid-points between the weight's position and
    that of its nearest neighbour on either side.  The outer extreme of the
    first and last weights' intervals are as far beyond the weight as it is
    beyond its neighbour (so the end interval's outer tail is twice as long as
    its inward tail; evenly spaced weights in the ratio 3:2:2:...:2:2:3 will
    yield a true uniform distribution).
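
The 3:2:2:...:2:2:3 remark can be checked with a short sketch. The function names `cuts` and `densities` are hypothetical, chosen for this illustration, and the sketch omits any special-casing the real implementation may apply to the outermost boundaries.

```python
def cuts(keys):
    """Interval boundaries for a sorted list of at least two keys."""
    mids = [0.5 * (a + b) for a, b in zip(keys[:-1], keys[1:])]
    # Each end interval reaches as far beyond its key as the neighbour is away.
    bottom = 2 * keys[0] - keys[1]
    top = 2 * keys[-1] - keys[-2]
    return [bottom] + mids + [top]

def densities(weights):
    """Height of the piecewise-constant curve over each key's interval."""
    keys = sorted(weights)
    edges = cuts(keys)
    return [weights[k] / (hi - lo)
            for k, lo, hi in zip(keys, edges[:-1], edges[1:])]

edges = cuts([0, 1, 2, 3])
flat = densities({0: 3, 1: 2, 2: 2, 3: 3})
```

The end intervals are half as wide again as the inner ones, so the extra weight at each end is exactly what keeps the density level.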

    More sophisticated replacements for curveWeighted may be worth using in its
    place when alloying a modified Weighted for use by Sample.  The only thing
    you need to over-ride is (well, should be) interpolator; curveWeighted's
    clients *should* assume nothing about the curve (but repWeighted's rounding
    infrastructure may still be entangled).  Your interpolator must support
    .weigh(), .split(), .cross() with the same semantics as here and have a .cuts
    attribute with a suitable meaning. f-1.0i   i   f1.0f9.9999999999999995e-07c         C   s  t  |   d j  o | d j o+ | d j	 o |  i h  | d < q q| d j o |  i h  | d < q|  i h  | d | d d <| d | d d < n |  i i |  i i | } } h  } | d j	 o. | | d j  o | | d | |  i d <n | d j	 o. | | d j o | | d |  i d | <n t  |  d j o- x* | i   D] } | | d | | <qfWn | o |  i |  n y
 |  ` Wn t j
 o n Xy
 |  ` Wn t j
 o n Xd S(   s  Ensure self's weights stretch as far as low and high, if non-None.

        Arguments:
          low -- None, or a lower bound that self should reach
          high -- None, or an upper bound that self should reach
          share -- fraction of self's total weight available to reach bounds

        Note that if low or high is None, or is already within self's range, it
        will be ignored.
i   i   f3.0i    f0.5iN(   t   lenR   R   t   NoneR   t   addt   interpolatort   cutst   totalt   sharet   cutt   sumt   bokt
   sortedkeysR   t   kt   AttributeError(   R   R   R   R   R   R   R   R   (    (    R   t   reach`   s8    	   ;    
  
 t   _lazy_get_interpolator_c           B   s   t  Z d  Z d   Z d   d  Z d   d  Z d   d  Z d   Z d	   Z e	 i
 d
   e	 i
 d  d  Z e	 i
 d  Z d   Z e	 i d  Z RS(   s  Integration of a curve interpolated from a weights-dictionary.

        What should be happening here ?
        repWeighted (or its replacement, e.g. a Bezier interpolator) provides:
          between([low, high]) -- defaults, None, mean relevant infinity; yields weight
          weights(row) -- map(self.between, [ None ] + row, row + [ None ])
          carve(weights) -- yields tuple for which map(self.between, (), yield) = weights.
          round([estim]) -- yields string describing estim to self's accuracy.

        The meaning of a distribution is:
        we have { position: weight, ... }
        and ks = sortedkeys lists the positions in increasing order.
        What an entry { ks[i]: w } in the mapping means is that the
        total weight between (ks[i-1]+ks[i])/2 and (ks[i]+ks[i+1])/2 is w.
        It doesn't say anything about mean or median.

        Doing it piecewise linearly looks a pig.  So do it piecewise constant,
        with steps at the mid-points between adjacent weights. c         C   s:   |  i | i  |  _ t t | d  | i   |  _ d  S(   Nc         C   s   | |  S(   N(   t   wR   (   R   R   (    (    R   t   <lambda>   s    (   R   t   _lazy_get_interpolator___cutst   weigherR   R   t   tuplet   mapt   size(   R   R   t   ignored(    (    R   t   __init__   s    c         C   s   d |  | S(   Nf0.5(   t   at   b(   R$   R%   (    (    R   R      s    c         C   s"  t  |  d j  o+ | p f  Sn d | d } | | f Sn t d   t d   | d  | d   p t d | f  d	 | d | d
 d	 | d | d } } | d j o | d d j o
 d } n | d j  o | d d j o
 d } n | f t t | | d  | d   | f S(   Ni   f1.0i    c         C   s
   |  d j  S(   Ni    (   t   x(   R&   (    (    R   R      s    c         C   s   | |  S(   N(   t   yR&   (   R&   R'   (    (    R   R      s    ii   s   expected sorted dataf2.0i(
   R   t   rowR&   t   filterR    t   AssertionErrort   topt   botR   t   mean(   R   R(   R-   R,   R&   R+   (    (    R   t   __cuts   s     ;- 
 
c         C   s   |  | S(   N(   R$   R%   (   R$   R%   (    (    R   R      s    c         C   s   t  | |  i d  S(   Ni    (   t   reduceR   R   R!   (   R   R"   R   (    (    R   t   _lazy_get_total_   s    c         C   s   |  | S(   N(   R$   R%   (   R$   R%   (    (    R   R      s    c         C   s`  t  d   |  p
 t d  |  i t | | d  } |  i |  i	 } } g  | d d }	 } } x t | d  | d   D] } | | }
 | | | j o. |
 | d | | | d | | | }
 n x@ | |
 j o2 | |
 d | | d | } } } | | }
 q W| d j o* | | | | d | | | | } n |	 i |  q| Wt |	  S(	   s  Cuts the distribution into pieces in the proportions requested.

            Required argument, weights, is a list of non-negative values, having
            positive sum.  A scaling is applied to all entries in the list to
            make its sum equal to self.total().

            Returns a list, result, one entry shorter than weights, for which
            self.weigh(result) equals the re-scaled weights-list. c         C   s
   |  d j  S(   Ni    (   R&   (   R&   (    (    R   R      s    s   weights cannot be negativef0.0i    c         C   s   |  | S(   N(   R&   t   s(   R&   R1   (    (    R   R      s    ii   N(   R)   t   weightsR*   R   R   R/   R   t   scaleR   R!   R   t   loadt   anst   priort   iR    R   t   availt   appendR   (   R   R2   R   R4   R3   R7   R6   R   R   R5   R8   (    (    R   t   split   s$      
 . $ *c   
      C   s^  d g d t  |  |  i |  i }	 } } |  i p t |	  Sn t  |  i  d j  o |  i d } t  t
 |  i d d  |   } | t  |  j p( | | |  i d j p t d | f  | t  |  j  o6 |  i d | | j o | d |	 | <|	 | d <qT| |	 | <n;d } } d
 } y+ x$ | | | d j  o d | } q0WWn t j
 o n Xyxy | | } Wn t j
 o | d } n X| | | j  o- | | d | j p
 t d  d | } qq| | d | j  os | d
 j o | | } n | | j o6 |	 | | | | | | d | | | |	 | <n | d | } } qq| d
 j	 oO |	 | | | | d | | | d | | | |	 | <d
 d | } } n x; | | d | j o% |	 | | | d | |	 | <} qWqqWWnL t j
 o@ | t  |  j p& t d	 | t  |  | t  |  f  n Xt |	  S(   s*  Integrates self's distribution between positions in a sequence.

            Single argument, seq, is a sequence of positions in the
            distribution.  The sequence is presumed to be sorted.

            Returns a tuple of weights, result, one entry longer than the
            sequence, with each being the integral over the distribution between
            two bounds:
              result[0] -- from minus infinity to seq[0]
              result[1+i] -- from seq[i] to seq[1+i]
              result[-1] -- from seq[-1] to infinity
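
One way to read that contract, as a rough sketch rather than the method's actual single-pass algorithm: intersect every band with every piece of the piecewise-constant curve. The free function `weigh` and its `cuts`/`sizes` arguments are assumptions for illustration, and the sketch assumes distinct cuts (no delta functions).

```python
def weigh(cuts, sizes, seq):
    """Weight of a piecewise-constant curve in each band delimited by seq.

    cuts holds the sorted piece boundaries; sizes[i] is the weight spread
    evenly between cuts[i] and cuts[i + 1]; seq is a sorted list of positions.
    """
    bounds = [float("-inf")] + list(seq) + [float("inf")]
    result = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        band = 0.0
        for left, right, w in zip(cuts[:-1], cuts[1:], sizes):
            overlap = min(hi, right) - max(lo, left)
            if overlap > 0:
                # The piece contributes in proportion to the overlapping width.
                band += w * overlap / (right - left)
        result.append(band)
    return tuple(result)

parts = weigh([-1, 0.5, 1.5, 2.5, 4], [3, 2, 2, 3], [1.0, 2.0])
```

The result has one more entry than `seq`, and its entries sum to the curve's total weight.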
            f0.0i   i   i    c         C   s
   |  | j  S(   N(   R&   t   r(   R&   R;   (    (    R   R      s    s   mis-sorted positionsis    must have incremented i in errors2   algorithm exited loop surprisingly at %d/%d, %d/%dN(   R   t   seqR   R!   R   t   resultR4   R   R   t   weightR)   R7   R*   R1   R   t   lastt
   IndexErrort   stop(
   R   R<   R4   R   R?   R>   R7   RA   R1   R=   (    (    R   t   weigh   sX     +
 ";+
        6: />c         C   s  h  }	 x |  i  D] } d  |	 | <q Wx | i  D] } d  |	 | <q. W|	 i   }	 |	 i   |  i |	  | i |	  } } | d d j o | d j n o& | d d j o | d j n p t
  | d d !| d d !} } |	 d |	 d } | o t |  |  i d | | i d | } } }
 |	 | t |  t |  } } } xh | d j oV | d } |	 | } | | | } } | | | | | | |
 | | | <| | <qUWn t d   | |  } t |	  d t |  j o
 d j n p t
  h  } t |  d j o | d | |	 d d |	 d d <| d | d |	 d |	 d d <| d d !} xD t d	   |	 d d !|	 d d ! D] } | d | d | | <} qWt |  d j p t
  n5 t |	  d j p t
  d
 | |	 d |	 d d <| S(   Ni    ii   f0.050000000000000003c         C   s   |  | S(   N(   R&   R'   (   R&   R'   (    (    R   R   3  s    i   f3.0ic         C   s   d |  | S(   Nf0.5(   R$   R%   (   R$   R%   (    (    R   R   :  s    f1.0f0.5(   R   R   R   R   t   otherR   t   sortRB   t   met   youR*   R   R   R   R7   t   mR'   t   listt   pt   nt   hR    t   bitsR   (   R   RC   RE   R   R7   RG   RJ   RI   R   R   R'   RF   RK   RL   R   (    (    R   t   cross  sH    
  
  
S0$ 

65""$ c         C   s   |  | S(   N(   R$   R%   (   R$   R%   (    (    R   R   F  s    c         C   s   | | | | |   S(   N(   R   t   gRK   t   l(   RO   RK   R   RN   (    (    R   R   G  s    c         C   sR   |  i |  i } } t |  d j  o d Sn t | t | | d  | d |   S(   s  entropoid = integral(: p.log(p) :)

            Since p is piecewise constant, the integral is a sum of simple
            terms: each term is an integral between two entries in self.cuts, h
            apart, in which lies the matching weight, w, in self.size; this
            makes p = w/h over the interval, so e gets a contribution
            h.(log(w/h).w/h) = w.log(w/h).  This makes the integration easy.

            It is not immediately clear what to do with a delta function ...
            However, this datum is used via dispersal, which can handle that.

            Note, as documented for dispersal, that entropoid depends on the
            choice of unit of measurement for the quantity whose distribution
            self describes. i   i    ii   N(
   R   R   R!   R   t   sizR   R/   R   R    t   each(   R   R"   t   logR   RQ   R   RP   (    (    R   t   _lazy_get_entropoid_F  s
      c         C   sC   t  |  i  d j  o d Sn |  i } | | |  i  |  i | S(   s?
  Computes the dispersal (an analogue of entropy) of the distribution.

            The caricature of what we return is integral(: -p*log(p) :).

            However, the distribution, p, described by self is really a density
            (: p :{u*x: scalar x}) for some unit u (as used for measurement of
            the quantity whose distribution self describes); and integral(p) is
            dimensionless.  The dimensions of integral(p) are those of p's
            outputs times those of the integrating variable, i.e. u; so p's
            outputs must be of the same kind as 1/u.

            Thus log(p) isn't strictly meaningful; however, u*p is dimensionless
            and we can take its log, giving us integral(p*log(u*p)).  This is
            then meaningful, but depends on our unit, u.  With the choice of u
            made by our client, we compute this integral as self.entropoid.
            Using a different unit, w, in place of u will add
            log(w/u)*self.total to self.entropoid:
              integral(p*log(w*p)) = self.entropoid + log(w/u) * self.total

            Furthermore, self's distribution is meant to be understood as being
            independent of self.total, i.e. integral(p).  In general, scaling p
            down by a factor k also changes integral(: p*log(u*p) :), to
                integral(: log(u*p/k)*p/k :)
                 = (self.entropoid - self.total * log(k)) / k

            So we have to decide what unit to use and what overall scaling to
            apply.  For the overall scaling, a natural choice is k = self.total,
            so as to normalise p to yield self.total = 1.  If we replace our
            unit, u, used implicitly in computing self.entropoid, with some more
            apt unit w, this will give us, as integral(-log(w*p/k)*p/k),
                r = log(self.total*u/w) - self.entropoid/self.total

            The issue of choosing a sensible unit is, as ever, non-trivial.  I
            intuit that the dispersal should be translation-invariant;
            i.e. replacing p with (: p(x-z) ←x :) shouldn't change its
            dispersal, for constant z, e.g. an average of the distribution.
            Thus a sensible unit, w, must needs be obtained from the width of
            the distribution, in one guise or another.  The combination of scale
            invariance and translation invariance implies that the resulting
            dispersal will describe the *shape* of the distribution, rather than
            anything else.  See also the docs, below, of _lazy_get__unit_ and
            repWeighted.dispersor. i   i    N(   R   R   R!   R   R$   RR   t   _unitt	   entropoid(   R   R"   RR   R$   (    (    R   t   _lazy_get_dispersal_\  s
    *  	c         C   s   |  i S(   s
  A `width of the distribution' unit for use in normalising dispersal.

            This is still exploratory.

            A suitable unit must be independent of applying an overall scaling
            to self's weights or overall translation of self's cuts; a uniform
            scaling of self's cuts should scale the unit proportionately.  Thus
            the unit must be a `width' of the distribution, such as the total
            span or standard deviation.

            Various units present themselves as candidates.  For each, I've
            examined the theoretical value for uniform and for Gaussians; I've
            also examined the limiting behaviour of binomial distributions.
            I've tried the following:

              standard deviation -- well, sqrt(variance) anyhow.  Gives positive
              answers (which is good); uniform is log(12)/2 = 1.24 and a bit;
              Gaussian is log(2*pi)/2 + .5 = 1.41 and a bit; binomials tend to about
              1.4189 from below; Planck.mass gets 0.414ish.

              total width -- i.e. cut[-1] - cut[0].  Gives negative answers
              (bad); uniform is 0, Gaussian has no width, binomials tend to
              about -pi from above; Planck.mass gets -0.592ish.

              90% confidence interval -- i.e. difference between entries in
              self.split([.5,9,.5]).  Uniforms get -log(.9) = 0.105 and a bit;
              Gaussian has width 3.29 so gets log(sqrt(2*pi)/3.29) +.5 = 0.228
              and a bit; binomials approximately stabilise on approximately this
              last value; Planck.mass gets -0.584ish.

              50% confidence interval -- i.e. similar for self.split([1,2,1]).
              Uniforms get -log(.5) = log(2) = 0.69 and a bit; Gaussian has
              width 1.348 so gets log(sqrt(2*pi)/1.348) +.5 = 1.12 and a bit;
              binomials stabilise on slightly less than 1.12, oscillating among
              1.118 and 1.119 mostly; Planck.mass gets -0.2765ish.

            Generally:

              uniform is, wlog, .5 between -1 and 1; integral(p) = 1,
              integral(p*log(p)) = .5 *log(.5) * 2 = log(.5), dispersal is thus
              log(u/w) -log(.5) = log(2*u/w); its exp is simply the actual total
              width of the distribution divided by the unit we select.

              Gaussian is, wlog, p = (: exp(-x*x/2) ←x :) with
              total = integral(p) = sqrt(2*pi),
              entropoid = integral(p*log(p)) = -total * variance / 2 = -total/2, so
              dispersal = log(total*u/w) -entropoid/total = log(sqrt(2*pi)*u/w) +0.5.
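
These formulae can be tried numerically on a piecewise-constant curve. The sketch below is an illustration under stated assumptions: positions are treated as pure numbers (so u = 1), the unit w is taken to be the standard deviation, and the function name `dispersal` with its `cuts`/`sizes` arguments is hypothetical rather than the class's own interface.

```python
import math

def dispersal(cuts, sizes):
    """log(total/w) - entropoid/total, with w the standard deviation."""
    total = sum(sizes)
    pieces = list(zip(cuts[:-1], cuts[1:], sizes))
    # p = w/h on each piece, so integral(p*log(p)) is a sum of w*log(w/h).
    entropoid = sum(w * math.log(w / (b - a)) for a, b, w in pieces if w > 0)
    # First and second moments of a sum of uniform pieces.
    mean = sum(w * (a + b) / 2 for a, b, w in pieces) / total
    square = sum(w * (a * a + a * b + b * b) / 3 for a, b, w in pieces) / total
    deviation = math.sqrt(square - mean * mean)
    return math.log(total / deviation) - entropoid / total

uniform = dispersal([-1.0, 1.0], [1.0])    # density .5 between -1 and 1
rescaled = dispersal([0.0, 10.0], [7.0])   # another uniform, different scale
```

Both calls describe uniform distributions, so they should agree with each other and with log(12)/2, whatever the overall weight or the span.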

            Since I like +ve dispersals, I've settled on standard deviation ...
            N(   R   t
   _deviation(   R   R"   (    (    R   t   _lazy_get__unit_  s    2 c      	   C   s   |  i |  i } } d }	 } } | d } xy | d D]m } | d | d }
 } | |	 |
 | |
 | | d | |
 | | | | | | d f \ } }	 } } q6 W| | |	 | |	 d  S(   s   standard deviationf0.0i    i   i   i   N(   R   R   R!   R   RP   t   zerot   onet   twoR?   t   cR   t   sqrt(   R   R"   R]   R\   R   R?   RP   R[   RZ   RY   R   (    (    R   t   _lazy_get__deviation_  s     
 V(   R   R	   R
   R#   R   R0   R:   RB   RM   t   mathRR   RS   RV   RX   R]   R^   (    (    (    R   R      s    	 	B	/!1	6c         C   s
   |  i i S(   s  Integrates log(density) using the density as measure; a.k.a. entropy.

        This also performs scale-invariance normalisations; the result is
        actually integral(-log(w*p/k)*p/k) with k = integral(p) and w a unit
        chosen based on the width of the distribution, p. N(   R   R   t	   dispersal(   R   (    (    R   R`     s     c         C   sW   t  |   d j o d Sn |  i } | i d j o d Sn | i | | i | i  S(   s  Divide self by this to get a (dimensionless) value with zero entropoid.

        Continuing from interpolator's docs:

        Had we used w in place of u, self.entropoid would have been:
            integral(p*log(w*p)) = self.entropoid + log(w/u) * self.total
        which tells us that exp(self.entropoid/self.total), which is
        dimensionless, is proportional to u, the unit implicitly used in
        computing self.entropoid; as shown earlier when looking at entropoid's
        dependence on k, it's also proportional to self.total.  Thus
        u*self.total/exp(self.entropoid/self.total) is independent of self.total
        and u with the same dimensions as u, i.e. as the quantity whose
        distribution we're looking at.  Thus this quantity appears below as the
        `dispersor' of a Sample; dividing a Sample by its dispersor will give a
        Sample whose distribution has zero dispersal. i   i    N(   R   R   R   R7   R   t   expRU   (   R   Ra   R7   (    (    R   t	   dispersor  s      	 c         C   s   |  i } d | i d j o/ |  i d d t |  i     }  |  i } n_ d d | i d j oF t t |  i    d  } |  i d |  i d |  }  |  i } n |  i d | | i
 | i   S(   s:   Returns variant on self normalised to have zero entropoid.f1.0i    R3   f-0.5N(   R   R   R7   R   t   copyR   t   valuest   powR1   Ra   RU   (   R   Ra   R7   R1   (    (    R   t   disperse  s     	"(   R   R	   R
   t   tophatR   R   t   LazyR   R`   R_   Ra   Rb   Rf   (    (    (    R   R   D   s     ) O		t   repWeightedc           B   s\   t  Z d  Z d   Z e e d  Z d d  Z e d  Z d   Z d   Z	 e d  Z
 RS(	   sE   Base-class for rounding (whence representation) and integration.
    c         C   s   |  i i |  S(   N(   R   R   RB   R<   (   R   R<   (    (    R   R2     s    c         C   s   g  } | d j	 o | i |  n | d j	 o | i |  n |  i i |  } | d j o | d Sn | d j o | d Sn | d S(   s   Returns the weight associated with an interval.

        Arguments are the low and high bounds of the interval.  Either may be
        None, indicating an interval unbounded at that end. i    ii   N(   R(   R   R   R9   R   R   R   RB   (   R   R   R   R(   (    (    R   t   between  s         i   c         C   sk   d | j o
 d j  n o+ d d | } |  i i | | | g  Sn |  i i } | d | d f Sd S(   s  Bounds self's distribution.

        Optional argument, frac (default: 1), is the proportion of self's total
        weight which is to fall between the bounds; ignored (i.e. treated as 1)
        unless between 0 and 1.  Returns a 2-tuple (lo, hi) for which:

            apply(self.between, self.bounds(f)) == f * self.total()

        Thus, for instance, 95% of a distribution lies between the two entries
        in the .bounds(0.95) of the distribution; 2.5% lies below the first
        entry and 2.5% lies above the second. i    i   f0.5iN(   t   fract   gapR   R   R:   R   R   (   R   Rk   R   Rl   (    (    R   t   bounds!  s     c         C   sz   | d j  o t d   n | o) |  i i d g d g | d g  Sn* |  i i d g d g d | d g  Sd S(   sm  Subdivides distribution into n equal bands.

        First argument, n, is the number of bands into which to divide self's
        distribution.  Second argument, mid (default false), selects the
        mid-points of the bands, instead of their ends; this, in fact, still
        delivers the ends of bands but starts (and ends) with half-bands.

        Returns a tuple of sample points (n+1 of them if mid is false, n if it
        is true) for self's distribution: between any two adjacent entries in
        the tuple, self.between() finds weight self.total()/n.  If mid is false
        (default) the first and last entries in the tuple are the bottom and top
        of self's distribution's tails (which may be further apart than self's
        highest and lowest keys).
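
A sketch of how such sample points might be computed on a piecewise-constant curve, following the description above rather than the original code. The free functions `split` and `niles`, and their `cuts`/`sizes` arguments, are assumptions for illustration; the weight lists passed to `split` are chosen to give n+1 points when mid is false and n points (with half-bands at the ends) when it is true.

```python
def split(cuts, sizes, weights):
    """Positions carving the curve into bands in the given proportions."""
    scale = sum(sizes) / sum(weights)
    targets, running = [], 0.0
    for w in weights[:-1]:
        running += w * scale
        targets.append(running)        # cumulative weight below each position
    result = []
    i, below = 0, 0.0                  # current piece, weight in earlier pieces
    for t in targets:
        while i < len(sizes) - 1 and below + sizes[i] < t:
            below += sizes[i]
            i += 1
        frac = (t - below) / sizes[i] if sizes[i] else 0.0
        result.append(cuts[i] + frac * (cuts[i + 1] - cuts[i]))
    return tuple(result)

def niles(cuts, sizes, n, mid=False):
    if n < 1:
        raise ValueError("Can only subdivide range into positive number of parts")
    if mid:
        return split(cuts, sizes, [1] + [2] * (n - 1) + [1])
    return split(cuts, sizes, [0] + [1] * n + [0])

cuts, sizes = [-1, 0.5, 1.5, 2.5, 4], [3, 2, 2, 3]
halves = niles(cuts, sizes, 2)
median = niles(cuts, sizes, 1, mid=True)
```

With these inputs the curve is uniform from -1 to 4, so the two halves meet at 1.5, which is also the single mid-point returned for n = 1.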

        Contrast statWeighted.median(), which always returns a sample-point of
        the distribution, and joinWeighted.condense(), which uses sample-points
        for the equivalent of niles with mid = true.  Note that self's median
        can be obtained as the single entry in self.niles(1, true) or as the
        middle entry of self.niles(2). i   s6   Can only subdivide range into positive number of partsi   i    N(   RJ   t
   ValueErrort   midR   R   R:   (   R   RJ   Ro   (    (    R   t   niles8  s       )c         C   s!  |  i }
 t |
  } } x1 | d j o# |
 | d | j o | d } q W| } d } x | | j o | d j o | | j o Pn d } n6 | | j o
 d } n d | |
 | |
 | d } | d j  o  | d } | |  |
 | } q\ | |  |
 | } | d } q\ W| | j  o d |
 | |
 | d } n6 | d j o d |
 d |
 d } n |
 | d } | d j o d |
 | |
 | d }	 n2 | d j o d |
 d |
 d }	 n |
 | }	 |	 | j p t  | | j  o
 | } n |	 | j o
 | }	 n | | |	 f S(   s   Finds a slice of self, returns its weight and width.

        Arguments:
          total -- lower-bound on total weight of the slice.
          about -- slice will span this value
        i    i   ii   f0.5iN(   R   R   R(   R   t   hiR+   t   aboutt   loR>   R   t   signt   rightt   leftR*   (   R   R   Rr   Ru   R>   Rs   R+   Rt   Rq   Rv   R(   (    (    R   t	   __embraceR  sL     	 "   
 

    
 
 
c         C   s   | o
 | } n d } d } x' | d j o | d | d } } q  Wx' | d j  o | d | d } } qJ Wt d | d |  } t | |  d j o d | Sn | S(   s2   Returns a suitable power of 10 for examining what.i   i    i
   f10.0N(   t   hatt   whatt   decadeRe   R5   t   int(   R   Rx   Ry   R5   Rz   (    (    R   t   __unit  s      
     c         C   s  | d j o< y |  i   } WqI t t f j
 o |  i   } qI Xn t |   d j  o	 | Sn | o | d | j o	 | Sn d |  i   } | d j p t
 d | |  i   f  |  i | |  \ } }	 | d j  o d \ } } n d \ } } d \ } } } |  i t t |  |	   } |	 d j o. d	 t |  i  } | t d
 | d  } n d } | d j o
 d }
 nT t |  }
 |
 d  d j o( d |
 j o d |
 i d  d }
 n d | d }
 x| |	 j p | | j obt | | | |  } | | | | | | } } t! | | | d |  } | d j o$ h  d d <d d <| d } n | d j  o
 | } n | | | } | o t | |  | j  oK |
 o- |
 d d j p
 t
 |
  d |
 d }
 n t |  d j } Pn t |  | j o d } Pq$n |  i$ | d | | d |  } d
 | } qWt |  d j p |	 d j p
 | | j } y x | d j o | d } | d  } yc h  d d <d d <d d <d d <d d <d d <d d <d d <d d <| |
 }
 Wn t& j
 o d |
 }
 qXd } qWWn t' j
 o | d } n X| |
 } | o! | o | d  d | d } n | | S(    s?  Returns a rounding-string for estim.

        Argument, estim, is optional: if omitted, the distribution's median (if
        available) or mean (likewise) will be used.

        Result is a string representing this value to some accuracy, in %e-style
        format.  This implicitly represents an interval, given by `plus or minus
        a half in the last digit'.  This interval will contain estim.

        Normally, the interval denoted by the result string will contain less
        than half the weight of self's distribution and is the shortest such
        representation.  E.g. if self.between(3.05, 3.15) >= .5 >
        self.between(3.135, 3.145) then self.round(pi) will return '3.14'.

        That's impossible if half (or more) of self's weight sits at estim,
        i.e. self's half-width about estim is zero.  In this case, the result
        string will only give estim to as many significant digits as eight more
        than the number of sample-points of self; and if the next five digits
        would all have been 0, any trailing zeros will be elided from the ones
        given [for various sanity reasons].  Sugar: in this `exact' case, any
        exponent used will employ E rather than e (thus 1.2E1 for 12); and if no
        digits appear after the '.'  in an exact representation, the '.' is
        omitted. i   f10.0f0.5i    s2   Weights need to be positive for rounding algorithmt   -it    i   f0.10000000000000001i   i   s   1.t   es   %.0et   Et   0t   1t   2t   3t   4t   5t   6t   7t   8t   9t   .N(   R}   i(   R~   i   (   R~   i    i    ((   t   estimR   R   t   medianR   Rn   R-   R   R   t	   thresholdR*   Rd   t   _repWeighted__embraceR>   t   widtht   headRt   t   bodyt   tweakt   aimt   _repWeighted__unitR   t   abst   unitR   RA   Re   t   tinyt   tailt   strR:   R{   t   digt   cmpRo   t   adddotRj   R?   t   KeyErrorR@   (   R   R   R>   R   Rt   R   R   Ro   R   R   R   R   R   RA   R?   R   R   R   (    (    R   t   round  s        	 	& 
 
  $ 
	", 

 c  
 (   R   R	   R
   R2   R   Rj   Rm   Rp   R   R   R   (    (    (    R   Ri     s    		4	t   joinWeightedc           B   st   t  Z d  Z d   Z d e d   d  Z d   Z d   d  Z e d   d	  Z d
   Z	 e d  Z
 d   Z RS(   s  Interface-class defining how to stick distributions together.

    Provides apparatus for adding data to a distribution, re-sampling a
    distribution, obtaining a canonical distribution (i.e. one whose weights'
    neighbourhoods don't overlap) and for performing `cartesian product'
    operations (i.e. taking two distributions and obtaining the joint
    distribution for some combination of their parameters), including
    comparison. c         C   s   | |  _ d  S(   N(   t   detailR   t   _joinWeighted__detail(   R   R   (    (    R   R#   <  s    i   c         C   s
   |  d f S(   Ni   (   R&   (   R&   (    (    R   R   ?  s    c         C   s#  y | i   } WnR t j
 oF y t | |  } Wqe t t f j
 o | d f g } qe Xn Xx | D] \ } } | | } | d j  o t
 d | | | f  n t | t  o | i } n t | t  o |  i | | |  ql | d j	 o | |  } n |  | | |  | <ql Wd S(   s  Increment some of my keys.

        Arguments:

          weights -- a mapping (e.g. dictionary), for each key of which we'll be
          performing self[key] = self.get(key, 0) + weights[key], save that
          weights[key] may be scaled by scale and key may have been replaced
          with func(key) - see below.  If a sequence is given, it is read as a
          mapping with the sequence as .keys() and all values equal to 1; if
          anything other than a sequence or mapping is given, it is read as a
          single key with value 1.

          [scale=1] -- a scaling to apply to all values in weights

          [func=None] -- a callable which accepts keys of weights and returns
          keys for self.  Used to transform any keys of weights which aren't
          joinWeighted instances.

        If a key to be given to self (either a key of weights or func's output
        from such) is itself a joinWeighted, self.add recurses with
        self.add(key, weights[key] * scale, func).  If a key is a Sample, its
        .mirror is obtained: this is a joinWeighted and the same recursion is
        used.  Otherwise, (func's replacements for) keys should be scalars. i   i    s   Negative weightN(   R2   t   itemst   mitesR   R    t   oneacht	   TypeErrort   keyt   valR3   Rn   t
   isinstancet   Samplet   mirrorR   R   R   t   funcR   (   R   R2   R3   R   R   R   R   R   (    (    R   R   ?  s(         
    c         C   se   t  |   d j o t  |  j  n p
 t d  |  i |  i i | i  d t |  i | i  S(   s7   Pointwise product of two distributions: `intersection'.i   s    delta functions aren't nice hereR   N(	   R   R   RC   R*   t
   _weighted_R   RM   R   R   (   R   RC   (    (    R   RM   g  s     4c         C   s   |  | d S(   Nf2.0(   R$   R%   (   R$   R%   (    (    R   R   m  s    c   	      C   s   |  i p |  i   Sn |  i   \ } } g  } xJ | D]B } | | j o/ | | j o
 | j n o | i	 |  q7 q7 Wh  } | oT |  i i t | | d  | d   } x+ | D] } | d | d | | <} q Wn |  i | d t |  S(   s   Decomposes self.

        Argument, new, is a sorted sequence of keys to use.  Only keys in new
        which lie between self's bounds will actually be used. ii   i    R   N(   R   R   Rc   Rm   Rs   Rq   t   runt   newR   R9   R=   R   RB   R    R-   R4   R   R   (	   R   R   R-   R4   R   Rs   Rq   R=   R   (    (    R   t	   decomposem  s      
  +&  !	c         C   s
   |  i   S(   N(   R7   R   (   R7   (    (    R   R     s    c   	      C   s`  | d j o |  i } n t |   | j o |  Sn | d j o | d } n |  i   d | |  i h   g d } } } d | } | d j p
 t d  x |  i D] } |  | } xX | | j  oJ | d j o | | | | d | <} n | i |  i h    | } q W| d j o | | d | <| | } q q W|  i t | t d |    S(   s  Simplifies a messy distribution.

        Argument, count, is optional: it specifies the desired level of detail
        in the result (default: None).  None is taken to mean the level of
        detail specified for self when it was created.

        Returns a self._weighted_() whose keys are: the highest and lowest of
        self; and count-1 points in between, roughly evenly spaced as to self's
        weight between them.  The weight of each of these points is based on
        carving up self's weights according to who's nearest. i   f1.0f0.5i    s   Condensing degenerate weightingiN(   t   countR   R   R   R   R   R   t   stept   partsR?   Rl   R*   R   R   R   R9   R   R    t   middleR)   (	   R   R   R   R?   R   Rl   R   R   R   (    (    R   t   condense  s.    
    .

 
  c         C   st   |  i h   } x6 |  i   D]( \ } } | i | | | | d   q Wy | i	   SWn t
 j
 o | Sn Xd  S(   Nc         C   s   | | |   S(   N(   t   fR   t   j(   R   R   R   (    (    R   R     s    (   R   R   R5   R   R   R   R   t   dictR   t	   normaliset   ZeroDivisionError(   R   R   R   R   R   R5   (    (    R   t	   __combine  s       c         C   sh   | d  j oB y | i } Wn t j
 o |  i } qO Xt | |  i  } n |  i | |  i
 |  S(   N(   R   R   R   R   t   detR   R   R   t   _joinWeighted__combineR   R   (   R   R   R   R   R   (    (    R   t   combine  s      c         C   sd   |  i | t  } | i   d } | d | j  o
 d } n d } | d | j  o | Sn | d S(   Nf2.0ii    i   (   R   R   Ry   R   Rt   R   t   halfR%   (   R   Ry   R%   Rt   R   (    (    R   t   __cmp__  s     
 (   R   R	   R
   R#   R   R   RM   R   R   R   R   R   (    (    (    R   R   2  s    	(	/	

t   statWeightedc           B   sw   t  Z d  Z d   Z d   Z d   Z d   d  Z d   Z d   Z d   Z	 d	   Z
 d
   Z d   Z d   Z RS(   s*  Interface class providing statistical reading of weight dictionaries.

    Provides standard statistical functionality: presumes that instances behave
    as dictionaries, which must be arranged for by alloying this base with some
    other base-class providing that functionality (see _Weighted). c   
      C   s  |  p t d  n t |   d j o |  i   d Sn |  i }	 d t |	  } } d |  i   } } x6 | |  |	 | j o  | |  |	 | } | d } ql Wx: | |  |	 | d j o  | d } | |  |	 | } q W| d | j o |	 | Sn | | j o | d } n | d } |	 | |	 | } } yH |  | |	 | d |	 | d } |  | |	 | d |	 | d } Wn t j
 o n, X| | j  o | Sn | | j  o | Sn |  | |  | j  o | Sn | S(   sH  Takes the median of a distribution.

        Chooses one of the keys of the distribution, with the aim that at most
        half of the total weight of the distribution lies on either side of the
        given key's neighbourhood.  If two equally good keys present themselves
        (two neighbourhoods abut at the `true' median), a choice is made between
        them: this choice might legitimately be arbitrary.

        I should really work out how to generalise this to the n-iles
        (i.e. those points in the distribution which are to n as pentiles are to
        5 and the median is to 2): there are n-1 n-iles (though one could bump
        that up to n+1 by regarding the top and bottom of the distribution as
        `boundary' n-iles for all n).

        See, for comparison, joinWeighted.condense() and repWeighted.niles(). s$   Taking median of an empty populationi   i    f0.5N(   R   Rn   R   R   R   R(   Rs   Rq   R   R+   R,   R   t   hieRv   t   riteR@   (
   R   Rs   R+   R,   Rq   R   R   Rv   R   R(   (    (    R   R     sD       	  
  
"&    c         C   s}   | d d g } x_ |  i   D]Q \ } } x0 t |  D]" } | | | | | <| | } q7 W| | | | | <q Wt |  S(   s  Returns a tuple of moments of the given distribution.

        Argument, n, is the highest order for which moments are desired.
        Returns a tuple with 1+n entries: for i running from 0 to n, result[i]
        is the sum, over (key, val) in self.items(), of val * pow(key, i). i   i    N(	   RJ   R(   R   R   R   R   t   rangeR7   R   (   R   RJ   R   R7   R   R(   (    (    R   t   _momentsF  s       c         C   s   |  i   } | d j o |  Sn | p t d |  f  n d | d j o2 |  i d d t |  i     }  |  i   } n_ d d | d j oI t t |  i    d  } |  i d |  i d |  }  |  i   } n |  i d d |  S(   s?   Returns variant on self normalised to have .total() equal to 1.i   s'   Attempted to normalise zero-sum mappingf1.0i    R3   f-0.5N(	   R   R   R   R   Rc   R   Rd   Re   R1   (   R   R1   R   (    (    R   R   V  s       "c         C   s   |  | S(   N(   R&   R'   (   R&   R'   (    (    R   R   l  s    c         C   s   t  | |  i   d  S(   Ni    (   R/   R   R   Rd   (   R   R   (    (    R   R   l  s    c         C   s   |  i   f S(   N(   R   R   (   R   (    (    R   t   _totalm  s    c         C   s'   |  i d  \ } } | | d | f S(   Ni   f1.0(   R   R   t   normR   (   R   R   R   (    (    R   t   _meano  s    c         C   s   |  i   d S(   Ni   (   R   R   (   R   (    (    R   R-   s  s    c         C   sC   |  i d  \ } } } | d | } | | | d | | | f S(   Ni   f1.0(   R   R   R   R   t   squaresR-   (   R   R   R   R   R-   (    (    R   t	   _varianceu  s    c         C   s   |  i   d S(   Ni   (   R   R   (   R   (    (    R   t   variancez  s    c         C   st   f  } xa |  i   D]S \ } } | p | | j o | | g } } q | | j o | i |  q q Wt |  S(   N(   t   sameR   R   R   R   t   mostR9   R   (   R   R   R   R   R   (    (    R   t   modes|  s       c         C   s  |  i   } | p t d  n t |  d j  o
 | d Sn t |  t |  } } | i   | d o | | d Sn | d } | | d | | } } | | } xQ |  i |  i f D]= } |   d } | | j  o | Sn | | j o | Sq q W| S(   Ns   empty population has no modei   i    i   (   R   R   R(   Rn   R   RH   RJ   RD   Rs   Rq   R   R   R-   t   mideRo   (   R   Rs   R   Ro   RJ   Rq   R   R(   (    (    R   t   mode  s(      
 

   (   R   R	   R
   R   R   R   R   R   R   R-   R   R   R   R   (    (    (    R   R     s    	^								(   s   Objectt	   _Weightedc           B   s   t  Z d  Z d e i Z e i Z d d	  Z d
   Z d   Z d   Z	 d   Z
 d   Z d   Z d   Z d d  Z e i Z e e d  Z d   Z RS(   s2   Base-class providing a form of weight-dictionary. R   Rd   R   t   has_keyt   gett   updatet   cleari   c         O   s_   | i d  p t | d  t |  i | |  h  |  _ |  i | |  |  i |  i  d  S(   NRd   (   Ry   R   R*   t   applyR   t   _Weighted__upinitt   argst   _Weighted__weightsR   R2   R3   t   borrow(   R   R2   R3   R   Ry   (    (    R   R#     s
    	c         C   s    |  i   } | i   t |  S(   N(   R   R   R(   RD   R   (   R   R"   R(   (    (    R   t   _lazy_get_sortedkeys_  s    
c         C   s   |  i S(   N(   R   R   (   R   (    (    R   t   __repr__  s    c         C   s   t  |  i  S(   N(   R   R   R   (   R   (    (    R   t   __str__  s    c         C   s   t  |  i  S(   N(   R   R   R   (   R   (    (    R   t   __len__  s    c         C   sS   y |  i | } Wn t j
 o d Sn& X| d j p t d | t f  | Sd  S(   Ni    s   Negative weight(   R   R   R   R=   R   R*   R   (   R   R   R=   (    (    R   t   __getitem__  s      	 c         C   s|   t  |  } | d j  o t d | | f  n | d j o | |  i | <|  i   n" y |  | =Wn t j
 o n Xd  S(   Ni    s   Negative weight(   t   floatR   Rn   R   R   R   t   _Weighted__change_weightsR   (   R   R   R   (    (    R   t   __setitem__  s       c         C   s   |  i | =|  i   d  S(   N(   R   R   R   R   (   R   R   (    (    R   t   __delitem__  s    
R   R   c         C   s<   x5 | D]- } y t |  |  Wq t j
 o q Xq Wd  S(   N(   t	   volatilest   nomt   delattrR   R   (   R   R   R   (    (    R   t   __change_weights  s       c         C   s   | d j o% |  i   } | o d | } q2 n h  } | o | oK x |  i   D]6 \ } } | |  } | i | d  | | | | <qS Wq | d j o | i |  i  q x, |  i   D] \ } } | | | | <q Wn |  i |  S(   st  Copies a distribution.

        No arguments are required.  The new distribution is of the same class as
        self.  With no arguments, this distribution is identical to the
        original: the two optional arguments allow for its keys and values
        (respectively) to be different.

        Optional arguments:

          scale -- a scaling to apply to the values in the distribution.
          Default is 1./self.total(), to produce a unit-total result, unless
          self.total() is zero (when there's nothing to copy anyway).

          func -- a function to apply to the keys: must accept keys (sample
          points) of the existing distribution, yielding a key for the new
          distribution.  Default is, strictly, None: if func is None, the
          identity, lambda x: x, is (implicitly) used.

        see, e.g., negation and copying of Samples (below). f1.0i    i   N(   R3   R   R   R   R   R   R   R   R   t   vRK   R   R   R   t   _Weighted__obcopy(   R   R   R3   RK   R   R   R   R   (    (    R   Rc     s$       &  c         O   s   t  |  i | |  S(   s  Method for overriding by derived classes.

        Generates a new weight-book from the same creation args as a _Weighted.
        Derived classes will typically want to generate instances of themselves,
        which is what this does: if their __init__ has signature incompatible
        with that of _Weighted, they can override this method with something
        which works round that. N(   R   R   t	   __class__R   Ry   (   R   R   Ry   (    (    R   R     s     (   s   keyss   valuess   itemss   has_keys   gets   updates   clear(   s
   sortedkeyss   interpolator(   R   R	   R
   t   Objectt   _borrowed_value_R#   R   R   R   R   R   R   R   R   R   Rc   R   R   R   (    (    (    R   R     s    	
									*t   Weightedc           B   s)   t  Z e i Z e i Z d d d  Z RS(   Ni   i   c         O   s.   t  |  i | | f | |  |  i |  d  S(   N(	   R   R   t   _Weighted__weinitR2   R3   R   Ry   t   _Weighted__joinitR   (   R   R2   R3   R   R   Ry   (    (    R   R#     s    (   R   R	   R   R#   R   R   R   (    (    (    R   R     s   		c         C   sj   y= y t  |  |  SWn% t j
 o d t  |  |  Sn XWn& t j
 o t  t |   |  Sn Xd  S(   Nf1.0(   Re   t   thisRy   Rn   t   OverflowErrort   long(   R   Ry   (    (    R   t   _power  s       c         C   s4   y |  | SWn! t j
 o t |   | Sn Xd  S(   N(   R   Ry   R   R   (   R   Ry   (    (    R   t	   _multiply   s      c         C   sF   |  | } d | | |  j  o
 d j  n o | Sn |  t |  S(   s   Division straightener f-0.9375f0.9375N(   R   Ry   t   ratioR   (   R   Ry   R   (    (    R   t   _divide$  s
     
& 	R   c           B   s  t  Z d  Z h  d d <Z e i dC Z e e d  Z e i	 Z
 e d  Z	 d   Z d   Z d   Z e d	  Z d
   Z d   Z e d  Z d   Z d   d  Z d   d  Z d   d  Z e d  Z d   d  Z d   d  Z d   d  Z e d  d  Z e d  Z e Z e d  d  Z e Z  e! d   Z" d!   Z# d"   Z$ d#   Z% d$   Z& e& Z' d%   Z( e( d&  Z) e( d'  Z* [( d( Z+ e, d)  e, e  Ae, e  AZ- d*   Z. d+   Z/ d,   Z0 d-   Z1 d.   Z2 d/   d0  Z3 d1   Z4 d2   Z5 d3   Z6 d4   Z7 d5   Z8 d6   Z9 d7   Z: d8   Z; d9   Z< d:   Z= d;   Z> d<   Z? d=   Z@ d>   ZA d?   ZB d@ dA  ZC e dB  ZD RS(D   s(   Models numeric values by distributions. t   _strt   _reprt   bestc         C   s   | d  j o y	 | WnE t j
 o9 y t | i    } Wqj t j
 o d } qj Xn Xt |  } | d j  o
 d } n d } | | d j o
 | } q x+ | | d j o | | d } } q Wn | | |  S(   Ni   i    f-1.0f1.0f0.5(   R3   R   R2   R   R   Rd   t   totR   R   R'   t   klaz(   R   R2   R3   R   R   R'   (    (    R   R   <  s$     	   
 
  c         O   s,  y | d } Wn  t j
 o |  i | d <n- Xh  | d <} | i |  i  | i |  t | t	  o | f | } | i } n yo y | d } Wn% t j
 o t t |  i } n	 X| d =| p' y | i } Wq t j
 o q Xn Wnc t j
 oW g  |  _ | o? | i d d   d  j o& | i d d   d  j o t d  qn Xd   } y	 | WnE t t f j
 o3 | p h  | d <} n | |  g |  _ n% X| p
 | } n t | |  |  _ t |  i | |  |  i |  |  _ |  i i | i d d   | i d d    d  S(   Nt   lazy_aliasesR   R   R   s0   What kind of numeric Sample has no data at all ?c         C   sT   xM t  oE t |  t  o |  i }  q t |  t  o |  i   }  q |  Sq Wd S(   s(   Coerce a Sample or Weighted to a scalar.N(   t   TrueR   R%   R   R   R   R   (   R%   (    (    R   t   flattenj  s        i   (   Ry   R   R   R   t   _Sample__aliasR   R   R   R2   R   R   t   _Sample__weighR   R   R   R   R   t   _Sample__bestR   R   R   R   R    t   _Sample__upinitR   R   (   R   R2   R   Ry   R   R   R   R   (    (    R   R#   M  sL          	:	 	  
c         C   sy   t  |  d j  o  t |  i t | i     SnC y |  i } Wn t	 j
 o | |  _ n X| i
 |  |  _ d Sd S(   s  Take note of what looks like a distribution.

        Single argument should be a Weighted (or, at least, curveWeighted)
        object.  If this has more than one weight, it's interpreted as a
        distribution describing the value self is meant to represent; otherwise,
        its weight-point is merely used as candidate best estimate value.
        Return value is true precisely if the changes made by this call to
        __update invalidate self.best (and possibly other related values).

        When it *is* read as a distribution: if self has no distribution
        (e.g. because just removed by update()'s preamble for being merely best
        estimate), the new distribution is taken on board as self's
        distribution.  Otherwise, the two distributions are combined via
        point-wise product (see joinWeighted.cross): each is interpreted as
        giving someone's information about our value, and if one's experiments
        rule out some values the other's data permitted, they should be ruled out;
        more generally, the probability that our quantity has some value is the
        product of the two opinions' probabilities, give-or-take some
        normalisation of the results. i   i   N(   R   R2   R   R   t   _Sample__betterR   R   R   R   R   RM   (   R   R2   R   (    (    R   t   __update  s        c         G   sI   d |  i } } x2 | D]* } | | j o | i |  d } q q W| S(   s(  Support for update(): notice some new best estimates.

        Each arg (if any) should be someone's best estimate for self.  Returns a
        true value if at least one of these values wasn't already in .__best; in
        that case, caller should del self.best (at some point, possibly after
        doing assorted other things that might also necessitate this).

        If two sources agree *exactly* on best estimate, sooner suspect that
        they're both quoting a common source than believe them independent;
        hence the return value. i   N(   R   R   R  t   hitR%   R   t   itR9   (   R   R   R%   R  R  (    (    R   t   __better  s    
  c         C   s   d } t |  i  d j  oN x; |  i i   D]* } | |  i j o |  i i |  q, q, W|  ` d } nr t |  i  t |  i  j oR xO |  i i	   D]. \ } } | |  i j o
 | d j p Pq q W|  ` d } n t | t  o> |  i | i  o
 d } n |  i | i  o
 d } qn[ t | t  o |  i |  o
 d } qn- | d | d f |  i |  o
 d } n y |  i Wn6 t j
 o* | p t  |  i |  i  |  _ n X| o. y
 |  ` Wn t j
 o n X|  i   n d S(   sC   Implements the .observe() functionality of quantity.Quantity (q.v.)i   i   f2.2999999999999998f1.7N(   R   R  R   R   R   R   R   R  R9   R   R   R   RC   R   t   _Sample__updateR  R   R   R   R*   R   t   simplify(   R   RC   R  R   R   (    (    R   R     sL       
  	
 
   
  
 c         C   s#   |  i i |  |  _ |  i   d S(   s  Simplifies the distribution describing self.

        Argument, count, is optional (default: None).  If count is omitted or
        None, the distribution's view of how many sample points it should be
        using is used as count.  Otherwise, it specifies the number of sample
        points to retain in the simplification; if it exceeds the number of
        sample points in use, nothing changes. N(   R   R   R   R   t   _lazy_reset_(   R   R   (    (    R   R	    s     c         C   s   |  i i   |  _ d  S(   N(   R   R   R   (   R   (    (    R   R     s    c         C   s   |  i i   |  _ d  S(   N(   R   R   Rf   (   R   (    (    R   Rf     s    c         C   s   |  i i   } x@ | i   D]2 } | d d j p |  i |  o | | =q q W| o | |  i  | d <n t	 |  i
 |  i i |  f |  S(   s   Copies a sample, optionally transforming it.

        Optional argument, func, is a function to apply to the sample-points.
        Default is None: if func is None, the identity is used. it   _R   N(   R   t   dirRc   R   R   R   t   _lazy_ephemeral_R   R   R   t	   _sampler_R   (   R   R   R   R   (    (    R   Rc     s      ! c         O   s   t  |  i | |  S(   N(   R   R   R   R   Ry   (   R   R   Ry   (    (    R   R    s    c         C   s   |  | S(   N(   R&   R   (   R&   R   (    (    R   R   	  s    c         C   s   |  i | |  S(   N(   R   t   joinR   Ry   (   R   Ry   R   (    (    R   t   __add__	  s    c         C   s   |  | S(   N(   R&   R   (   R&   R   (    (    R   R   
  s    c         C   s   |  i | |  S(   N(   R   R  R   Ry   (   R   Ry   R   (    (    R   t   __sub__
  s    c         C   s   |  | S(   N(   R&   R   (   R&   R   (    (    R   R     s    c         C   s   |  i | |  S(   N(   R   R  R   Ry   (   R   Ry   R   (    (    R   t   __mod__  s    c         C   s   |  i | |  S(   N(   R   R  R   Ry   (   R   Ry   R   (    (    R   t   __mul__  s    c         C   s   | |  S(   N(   R   R&   (   R&   R   (    (    R   R     s    c         C   s   |  i | |  S(   N(   R   R  R   Ry   (   R   Ry   R   (    (    R   t   __radd__  s    c         C   s   | |  S(   N(   R   R&   (   R&   R   (    (    R   R     s    c         C   s   |  i | |  S(   N(   R   R  R   Ry   (   R   Ry   R   (    (    R   t   __rsub__  s    c         C   s   | |  S(   N(   R   R&   (   R&   R   (    (    R   R     s    c         C   s   |  i | |  S(   N(   R   R  R   Ry   (   R   Ry   R   (    (    R   t   __rmod__  s    c         C   s   | | |   S(   N(   RG   R   R&   (   R&   R   RG   (    (    R   R     s    c         C   s   |  i | |  S(   N(   R   R  R   Ry   (   R   Ry   R   (    (    R   t   __rmul__  s    c         C   s   yw y | i } Wn8 t j
 o, t | i d  t | i d  } } n- Xt | i   d  t | i   d  } } Wn0 t j
 o$ | p t	 d |  | f  q nG X| d j o
 | j n p | | d j  o t	 d |  | f  n |  i | |  S(   Ni    s   Dividing by zeros   Dividing by interval about 0(   Ry   R   R   R   R   R   R   Rs   Rq   R   R   R  R   (   R   Ry   R   Rs   Rq   R   (    (    R   t   __div__  s      *//c         C   s   | | |   S(   N(   t   dR   R&   (   R&   R   R  (    (    R   R   '  s    c         C   s   |  i } t | i   d  t | i   d  } } | d j o
 | j n p | | d j  o t d | |  f  n |  i
 | |  S(   Ni    s   Dividing by interval about 0(   R   R   R   R   R   R   Rs   Rq   R   Ry   R  R   (   R   Ry   R   Rs   Rq   R   (    (    R   t   __rdiv__'  s
    	+/c         C   s   |  i | |  S(   N(   R   R  R   Ry   (   R   Ry   R   (    (    R   t   __pow__1  s    c         C   s   |  i t  S(   N(   R   Rc   R   (   R   (    (    R   t   __abs__3  s    c         C   s   |  i S(   N(   R   R   (   R   (    (    R   R   6  s    c         C   s   |  i S(   N(   R   R   (   R   (    (    R   R   7  s    c         C   s   |  i i |  i  S(   N(   R   R   R   R   (   R   R"   (    (    R   t   _lazy_get__repr_8  s    c         C   sC   y |  i } Wn& t j
 o h  |  d <|  f Sn X| |  i f S(   Ni   (   Ry   R   R   R   R   (   Ry   R   (    (    R   t   extract<  s
      c         C   s@   | |  \ } } |  i |  i i | |  d | |  i |  S(   s  Combine with another Sample via a two-parameter function.

        First argument is the function, second is the other sample (or a plain
        number, which will be handled as if it were a single-point sample).  Do
        not pass more than two arguments.

        An intermediate distribution is built: for each point in each
        distribution (self and the other), the product of their weights gives
        the weight used for the result of applying the function to the two
        sample points.  The resulting distribution may then be somewhat
        simplified.  Its best estimate is obtained by applying the function to
        self's best estimate and that of the other sample. R   N(	   t   grabRy   R   R   R   R  R   R   R   (   R   R   Ry   R  R   R   (    (    R   R  A  s     c         C   s5   | |  \ } } t |  i |  p t |  i |  S(   N(   R  Ry   R   R   R   R   R   (   R   Ry   R  R   R   (    (    R   R   \  s    s1  Lazy hash value for samples.  Sub-optimal.

    Inconveniently, IIRC, the relationship between hashing and comparison
    requires equal keys to hash to the same value, making it hard (given the
    oddities of Sample comparison) to see any sensible dependence of the hash on
    the data of a Sample's distribution, even if samples weren't mutable, so all
    samples must have the same hash - making mappings using them as keys behave
    as linked lists (with a large hash-table overhead).  The easiest way to
    ensure that all samples have the same hash is to have them inherit its value
    from the base-class, Sample ... albeit hooking inheritance of a value into
    python's __hash__() idiom is here implemented by exploiting Lazy's fall-back
    hashing mechanism.

    Ideally, I'd replace this constant with something of form:

        def _lazy_get__lazy_hash_(self, ignored, base=...):
            return hash(self.__weigh) ^ base

    with base's default being the shared value below.  The problem is in
    deciding what Weighted.__hash__() should do ... R   c         C   s"   |  i d j o |  i j n S(   Ni    (   R   R   R   (   R   (    (    R   t   __nonzero__|  s    c         C   s   t  |  i  S(   N(   R   R   R-   (   R   (    (    R   t	   __float__~  s    c         C   s   t  |  i  S(   N(   R   R   R   (   R   (    (    R   t   __long__  s    c         C   s   t  |  i  S(   N(   R{   R   R   (   R   (    (    R   t   __int__  s    c         C   s   |  i S(   N(   R   t   _neg(   R   (    (    R   t   __neg__  s    c         C   s   |  S(   N(   R&   (   R&   (    (    R   R     s    c         C   s   |  i |  } |  | _ | S(   N(   R   Rc   t   negR=   R$  (   R   R"   R&  R=   (    (    R   t   _lazy_get__neg_  s    	c         C   sk   |  i } | o | i   n |  i St t |  d  \ } } | o | | Sn | | | | d d S(   Ni   i   f0.5(	   R   R  R%   RD   R-   t   divmodR   Ro   t   bit(   R   R"   R%   Ro   R)  (    (    R   t   _lazy_get_best_  s    	  c         C   s2   y |   SWn  t t f j
 o |  i Sn Xd  S(   N(   R   Rn   R   R   R   (   R   R   (    (    R   t   __call_or_best_  s      c         C   s   |  i |  i i  S(   N(   R   t   _Sample__call_or_best_R   R-   (   R   R"   (    (    R   t   _lazy_get_mean_  s    c         C   s   |  i |  i i  S(   N(   R   R,  R   R   (   R   R"   (    (    R   t   _lazy_get_mode_  s    c         C   s   |  i |  i i  S(   N(   R   R,  R   R   (   R   R"   (    (    R   t   _lazy_get_median_  s    c         C   s;   y |  i i   SWn# t t f j
 o |  i f Sn Xd  S(   N(   R   R   R   Rn   R   R   (   R   R"   (    (    R   t   _lazy_get_modes_  s      c         C   sQ   y |  i i   \ } |  _ } Wn+ t t f j
 o t d |  i f  n X| S(   Ns+   Seeking variance of degenerate distribution(   R   R   R   R   R-   t   varyRn   R   (   R   R"   R1  R   (    (    R   t   _lazy_get_variance_  s
     c         C   s   |  i d |  i d S(   Ni   i    (   R   t   span(   R   R"   (    (    R   t   _lazy_get_width_  s    c         C   s   |  i i   S(   N(   R   R   R   (   R   R"   (    (    R   t   _lazy_get_mirror_  s    c         C   s   |  |  i S(   N(   R   R   (   R   R"   (    (    R   t   _lazy_get_errors_  s    c         C   s   |  i i   S(   N(   R   R   R`   (   R   R"   (    (    R   RV     s    c         C   s   |  i i   S(   N(   R   R   Rb   (   R   R"   (    (    R   t   _lazy_get_dispersor_  s    c         C   s   |  i i   S(   N(   R   R   R   (   R   R"   (    (    R   t   _lazy_get_low_  s    c         C   s   |  i i   S(   N(   R   R   R   (   R   R"   (    (    R   t   _lazy_get_high_  s    c         C   s
   |  i   S(   N(   R   Rm   (   R   R"   (    (    R   t   _lazy_get_span_  s    i   c         C   s   |  i i |  S(   s  Returns upper and lower bounds on self's spread.

        Single argument, frac, is the fraction of self's distribution which
        should lie between the bounds returned; if greater than 1 or less than
        0, it is treated as its default, 1, which returns the upper and lower
        bounds of the distribution.  Thus 95% confidence bounds for self's value
        are returned by self.bounds(0.95). N(   R   R   Rm   Rk   (   R   Rk   (    (    R   Rm     s     c         C   s   |  i i | |  S(   s  Cuts self's distribution into n equal parts.

        First argument, n, is the number of parts into which to subdivide.
        Second argument, mid, controls whether you get n band-centres or 1+n
        band-ends.

        Returns a tuple of points in self's distribution; between any adjacent
        pair of these, 1/n of the distribution's weight lies.  If mid is true,
        there is weight 0.5/n to the left of the first entry in the tuple and
        the same to the right of the last, and there are n entries in the tuple.
        If mid is false (the default) the first and last entries are the nominal
        extremes of the distribution and there are 1+n entries in the tuple. N(   R   R   Rp   RJ   Ro   (   R   RJ   Ro   (    (    R   t	   fractiles  s     (   s   best(E   R   R	   R
   R   R   t   _unborrowable_attributes_R   R   R   R#   R  R  R  R   R	  R   Rf   Rc   R  R  R  R  R   R  R  R  R  R  R   R  t   __truediv__R  t   __rtruediv__R   R  R  R   R   R  t   _lazy_get__str_R  R  R   t   _Sample__why_lazy_hasht   hasht
   _lazy_hashR   R!  R"  R#  R%  R'  R*  R,  R-  R.  R/  R0  R2  R4  R5  R6  RV   R7  R8  R9  R:  Rm   R;  (    (    (    R   R   5  sz    	5			3								 																				s|  Note that one can do some surprising things with Sample()s; e.g.:
    >>> gr = (1 + Sample({5.**.5: 1, -(5.**.5): 1}))/2
    >>> gr
    0.
    >>> gr+1
    2.
    >>> gr**2
    0.
    >>> gr**2 > gr+1
    1

in which gr's weights are the roots of x*x=x+1 (and its .best is .5).
Notice that gr.copy(lambda x: x**2-x-1) and gr**2-gr-1 will have quite
different weight dictionaries!
R   i    R
   s$  Unit width zero-centred error bar.

Also known as 0 +/- .5, which can readily be used as a simple way to implement
a+/-b as a + 2*b*tophat.  For asymmetric error bars, use Sample.upward, which
has best estimate zero, like tophat, but is uniformly distributed on the
interval from zero to one.f-0.5f0.5(   R
   R    R_   t   study.snake.lazyRh   R   Ri   R   R   t   objectR   R   R   R   R   R   R   t	   _surpriseRg   t   upward(   Rh   R   R   R   R   R   R   R   R   R   R    Ri   RE  R_   R   (    (    R   t   ?   s.   "	  (t			 	