Why Python’s `random.choice` is bad when number of choices are smalls

Consider two snippets. Both pick 0 or 1 with equal probabilities.

a = random.choice([0,1])


a = 1 if random.random() < 0.5 else 0

Later is much faster roughly 10 times faster.

In [4]: %timeit random.choice([0,1])
834 ns ± 2.19 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [5]: %timeit 1 if random.random() < 0.5 else 0
101 ns ± 0.25 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

How about when choices are large. Writing so many if else are waste of time. One can use the following approximation.

a = int(random.random()*100)

For 100 choices, it will generate number between 0 and 100. Its performance is slightly better than choice.

In [15]: %timeit random.choice( range(100) )
1 µs ± 1.38 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [16]: %timeit int(random.random()*100)
224 ns ± 1.23 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


I thought if my choices are always in list; probably choice will be faster but I was wrong.

In [20]: choices = list(range(100))

In [21]: %timeit random.choice( choices )
752 ns ± 1.56 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

More importantly, DO NOT do this.

In [17]: choices = range(100)

In [18]: %timeit random.choice( choices )
798 ns ± 6.09 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Sampling uniformly on unit simplex

The script is available here https://github.com/dilawar/Scripts/blob/master/simplex_generate.py

sage: simplex_generate.py [-h] [--dimension DIMENSION] [--number NUMBER]
[--output OUTPUT] [--method METHOD]
Generate x1,x2,...,xk such that sum(x1,x2,...,xk)=1

optional arguments:
-h, --help show this help message and exit
Dimention of vector (default 3)
--number NUMBER, -N NUMBER
Total number of vectors to generate (-1 for infinity).
--output OUTPUT, -o OUTPUT
Output file
--method METHOD, -m METHOD
Method (uniform sampling|TODO)

It writes the sampled numbers to stdout. By default, it samples from 3d space.

Double exponential function

A rising and falling exponential function can approximate many observations. Here is the python function to generate one.

double-exponential function
Multiplication of a rising and falling exponential.


def doubleExp( a ):
    """Use a rising and falling exponential 
       tauR, tauF = 50, 30 
    amax = 100
       fr = 1.0 - math.exp( - a / tauR )
       ff = math.exp( - (a - amax) / tauF )
       return fr * ff


Compressed Sensing: A Python demo

I am having trouble with HTML format of blog post. Here is draft PDF PDF.

I recommend http://www.pyrunner.com/weblog/2016/05/26/compressed-sensing-python/ . This is much better written.

Can you solve $latex Ax=b$ for $latex x$ when number of rows are less than number of columns in (A) i.e there are more variables than equations? Such a system is called underdetermined system. For examples, give me 3 numbers whose average is 3. Or solve (x/3+y/3+z/3=3). You are right! There are many solutions to this problem!

If one adds more containts then one can hope to get a unique solution. Give me a solution which has minimum length (minimize (L_2)-norm)? Answer is 3,3,3 (hopefully). Give me a solution which is most sparse (least number of non-zero entries)? Answer is 9,0,0 (or 0,0,9).

Many signals are sparse (in some domain). Images are sparse in Fourier/Wavelets domain; neural acitivity is sparse in time-domain, activity of programming is sparse in spatial-domain. Can we solve under-determined (Ax=b) for sparse (x). Formally,

\min_x L_0(x), \quad \text{st} \quad Ax = b where (L_0) is the 0-norm i.e. number of non-zero entries.

This problem is hard to solve. (L_0) is not a convex function. In this case, one uses convex relaxation trick and replace the non-convex function with convex one.

\min_x L_1(x), \quad \text{st} \quad Ax = b

It is now well-understood that under certain assumptions on (A), we can recover (x) by solving this tractable convex-optimization problem rather than the original NP hard problem. Greedy strategy based solvers also exists which works even faster (though, according to Terry Tao blog, they do not guarantee the same performance as interior-point method based).

Somewhat formal statement

If (x_N) (size (N)) is (S-sparse) then an under-determined system (A_{kN}x_{N} =b_k) — where (b) is vector of (k) observations — can be solved exactly given that (A) (mesurement matrix) has following properties:

  • \left\Vert y\right\Vert^2 \sim = \left\Vert x \right\Vert ^2 where \(y = A x\) for any sparse signal \(x\) i.e application of measurements matrix does not change the length of sparse signal \(x\) ‘very much’. See Restricted Isometric Property in sec. 3.1 .

Lets there is a sparse signal (x) of length (N). Its sparse. We take a dot-product of x with a random vector (A_i) of size N i.e. (b_i = A_i * x). We make (k) such measurements. We can put all k (A_i) into a matrix (A) and represent the process by a linear system:

\begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_k \end{bmatrix} * x = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_k \end{bmatrix} \qquad(1)

Note that each (A_i) has the dimension (1 \times N). We can rewrite eq. 1 as following:

A x = b\qquad(2) where dimension of (A) are (k \times N) and (k << N). This is an under-determined system with the condition that (x) is sparse. Can we recover (x) (of size N) from b (of size k (<<N))?

Figure fig. 1 shows a real system. Compressed sensing says that (x) can be recovered by solving the following linear program.

\min_x \left\Vert x \right\Vert_1 \; \text{given} \; Ax = b\qquad(3)

A demonstration using Python

Data for fig. 2 is generated by script ./compressed_sensing.py and for fig. 3, data is generated by script ./magic_reconstruction.py. Both of these scripts also generate similar figures.

Dependencies In addition to scipy, numpy, and matplotlib, we also need pyCSalgo. It is available via pip: pip install pyCSalgo --user.

Code is available at github


  1. input : sparse signal x of size N
  2. Generate a random matrix (measurement matrix) \(A\) of size \(k\times N\).
  3. Make \(k\) measurements of \(x\) using \(A\). That is \(b = A x\). Note that each measurement of \(x\) (i.e. entry of \(b\)) is some linear combination of values of \(x\) given by \(A_i x\) where \(A_i\) is ith row.
  4. Now find \(x\) by solving \(\min_x \left\Vert x \right\Vert_1 \text{given}\; Ax=b\).

For step 4, we are using function l1eq_pd from Python library pyCSalgo. This is reimplementation of l1magic routine written in Matlab by Candes. When (k >> 2S) where (S) is the sparsity of (x), we can recover (x) quite well. In practice, (k \ge 4S) almost always gives exact result.

There are other algorithms available which works faster; they are based on greedy strategy. We are not discussing them here.

Sparse in time domain


Figure 1: x is the sparse signal of length 256. Using measurements matrix A, we made 124 random measurements of \(x\) namely b. The sparse solution of eq. 2 is the solution of $\min_x \left\Vert x \right\Vert_1 \text{given}\; Ax =b$.

Sparse in Fourier domain


Figure 3: CS solution when signal is sparse in Forier domain. Note that we took 200 samples for a singal of size 2000. The recovery is pretty good.

With error in measurements

This example shows that some error is added to measurements. The reconstructed signal is no longer ‘exact’ enough. But still, given that x was binary signal, we can recover x by thresholding function.


Mathematical definitions

Restricted isometric property (RIP)

Restricted isometric property (RIP) of a matrix (A) is defined as the following.

Given (\delta_s \in (0,1)), for any sub-matrix (A_s) of (A), and for any sparse vector (y), if following holds

(1-\delta_s) \left\Vert y \right\Vert^2 \leq \left\Vert A_s y \right\Vert ^2 \leq (1+\delta_s) \left\Vert y \right\Vert ^2

then matrix (A) is said to satisfy (s-)restricted isometric property with restricted isometric constant (\delta_s). Note that a matrix A with such a property does not change the length of signal (x) ‘very much’. This enables us to sense two different sparse signal (x_1) and (x_2) such that (A x_1) and (A x_2) are almost likely to be different.


To learn about compressed sensing, I recommend following articles

There are many many other great articles and papers on this subject. There are some dedicated webpages also.

Performance of “sorting dictionary by values” in python2, python3 and pypy

The script is hosted here http://github.com/dilawar/playground/raw/master/Python/test_dict_sorting.py . It is based on the work of https://writeonly.wordpress.com/2008/08/30/sorting-dictionaries-by-value-in-python-improved/

My script has been changed to accommodate python3 (iteritems is gone and replaced by items — not sure whether it is a fair replacement). For method names and how they are implemented, please refer to script or the blog post.

Following chart shows the comparison. PyPy does not boost up the performance for simple reason that dictionary sorted is not large enough. I’ve put it here just for making a point and PyPy can slow thing down on small size computation.

The fastest method is sbv6 which is based on PEP-0265 https://www.python.org/dev/peps/pep-0265/ is the fastest. Python3 always performing better than python2.




Thresholding numpy array, well sort of

Here is a test case

>>> import numpy as np
>>> a = np.array( [ 0.0, 1, 2, 0.2, 0.0, 0.0, 2, 3] )

I want to turn all non-zero elements of this array to 1. I can do it using np.where and numpy indexing.

>>>  a[ np.where( a != 0 ) ] = 1
>>> a
array([ 0.,  1.,  1.,  1.,  0.,  0.,  1.,  1.])

note: np.where returns the indices where the condition is true. e.g. if you want to change all 0 to -1.

>>> a[ np.where[ a == 0] ] = -1.0

That’s it. Checkout np.clip as well.

Comparision of various matplotlib/pylab plotting styles, version-1.(4, 5)

Global font size is set to 8 in all figures.


Style ‘ggplot’ is my first preference but the overlapping histogram does not have good contrast. So I settled for ‘seahorse-darkgrid’.