From here:

ftp://ftp.ebi.ac.uk/pub/databases/biomodels/releases/2017-06-26/

Skip to content
# Author: Dilawar

# Download signaling pathways in SBML format

# Stationary distribution of Markov Chains using Python

# Peak detection in signal using Python

# Why Pi matters?

# Compress/decompress a file using Huffman codes

# Double exponential function

# Compressed Sensing: A Python demo

# Somewhat formal statement

# A demonstration using Python

## Algorithm

## Sparse in time domain

## Sparse in Fourier domain

### With error in measurements

# Mathematical definitions

## Restricted isometric property (RIP)

# References

Graduate Student at National Center for Biological Sciences, Bangalore.

I always forget to add one special condition to transition matrix when computing stationary distribution namely that in addition to , one also has to ensure that .

def solveStationary( A ): """ x = xA where x is the answer x - xA = 0 x( I - A ) = 0 and sum(x) = 1 """ n = A.shape[0] a = np.eye( n ) - A a = np.vstack( (a.T, np.ones( n )) ) b = np.matrix( [0] * n + [ 1 ] ).T return np.linalg.lstsq( a, b )[0]

And thats it!

Usually people use `scipy.signal`

to detect peak in signal.

scipy.signal.find_peaks_cwt( vec )

which returns list of index where vec has maximas. However this does not work all

the time, especially when vector is noisy. I could not get it working very well to compute the modality of distributions.

This is the most sophisticated library (AFAIK): https://gist.github.com/sixtenbe/1178136

Its can be installed via pip:

$ pip install analytic_wfm

To use,

>>> import analytic_wfm >>> vec # this is the 1d array in which I want to detect peaks. >>> maxPeaks, minPeaks = analytic_wfm.peakdetect( vec, lookahead = 10 )

This returns maximum peaks and minimum peaks. There are various other functions

described in readme.txt file.

Here is an example where I used this library to compute modality of a distribution:

An excellant article by Steven Strogatz.

https://steven-strogatz.squarespace.com/s/why-pi-matters.pdf

I wrote a small python script which can be used to compress/decompress a text file. On large files (more than 5MB), one can reduce the size of the file by a factor of ~ 0.35 which is not far from the best compression ratio. Since it is in pure Python, it is slow. This script requires module `huffman`

and `bitarray`

.

pip install huffman bitarray --user

The script accepts two options `-c`

or compression and `-d`

for decompression. Following is help message produced by the script.

./compress -h usage: compress [-h] [--compress] [--decompress] file Compress/Decompress a file using Huffman codes. positional arguments: file File to compress/decompress. optional arguments: -h, --help show this help message and exit --compress, -c Compress file

This script is intended for Information Theory students who are looking for a python script to play with.

We read the input file and compute the frequency of each character (`Counter`

from standard python library does the job). We pass this frequency information to `huffman`

module and ask it to give a code-book (Huffman code). Next, we replace each character with code and save it to a binary file. We also add the code-book to the file as prefix. Adding code-book is necessary if we want to decompress it. The Huffman code gives the shorted binary code to the symbol which occurs most frequently, and longest to the one which occur most rarely. I ran the script (with `time`

command) on a large file (76MB).

time ./compress ~/Downloads/camk12_pp54_voxel2_diff--PP1--1e-13-_suOFF.dat -c Reading /home1/dilawars/Downloads/camk12_pp54_voxel2_diff--PP1--1e-13-_suOFF.dat (76 MB).. done. Generating codebook.. done. Compressing files .. done. Average codeword length : 2.924465 | Optimal average code length: 2.869759 Compressed files is written to /home1/dilawars/Downloads/camk12_pp54_voxel2_diff--PP1--1e-13-_suOFF.dat.dx | Original file size : 80169086 | Compressed file size : 29307774 | Compression ratio : 2.735421 real 0m40.917s user 0m40.822s sys 0m0.096s

`gzip`

takes less than a second to compress it (0.75 sec) and this script took more than 40 secs. With pypy, it took more time because bitarray is already a c-module.

The point of this script is demonstration, and not the efficiency. `gzip`

reduced the file to size 2878272 bytes. Ours (29307774 bytes) is not bad. Note that we are storing our codebook in plain text and not in binary format.

We also print the `Optimal average code length`

which is information theoretic optical average codeword length; no one can do better than this. And Huffman codes are quite close to this number.

On small file, you won’t see much reduction.

To decompress, we replace the codeword with the symbol and write to the output file.

File can be found on github.

A rising and falling exponential function can approximate many observations. Here is the python function to generate one.

def doubleExp( a ): """Use a rising and falling exponential """ tauR, tauF = 50, 30 amax = 100 fr = 1.0 - math.exp( - a / tauR ) ff = math.exp( - (a - amax) / tauF ) return fr * ff

I am having trouble with HTML format of blog post. Here is draft PDF PDF.

*I recommend http://www.pyrunner.com/weblog/2016/05/26/compressed-sensing-python/ . This is much better written.*

Can you solve $latex Ax=b$ for $latex x$ when number of rows are less than number of columns in (A) i.e there are more variables than equations? Such a system is called underdetermined system. For examples, give me 3 numbers whose average is 3. Or solve (x/3+y/3+z/3=3). You are right! There are many solutions to this problem!

If one adds more containts then one can hope to get a unique solution. Give me a solution which has minimum length (minimize (L_2)-norm)? Answer is 3,3,3 (hopefully). Give me a solution which is most sparse (least number of non-zero entries)? Answer is 9,0,0 (or 0,0,9).

Many signals are sparse (in some domain). Images are sparse in Fourier/Wavelets domain; neural acitivity is sparse in time-domain, activity of programming is sparse in spatial-domain. Can we solve under-determined (Ax=b) for sparse (x). Formally,

where (L_0) is the 0-norm i.e. number of non-zero entries.

This problem is **hard** to solve. (L_0) is not a convex function. In this case, one uses *convex relaxation* trick and replace the non-convex function with convex one.

It is now well-understood that under certain assumptions on (A), we can recover (x) by solving this tractable convex-optimization problem rather than the original NP hard problem. Greedy strategy based solvers also exists which works even faster (though, according to Terry Tao blog, they do not guarantee the same performance as interior-point method based).

If (x_N) (size (N)) is (S-sparse) then an under-determined system (A_{kN}x_{N} =b_k) — where (b) is vector of (k) observations — can be solved *exactly* given that (A) (**mesurement matrix**) has following properties:

- where \(y = A x\) for any sparse signal \(x\) i.e application of measurements matrix does not change the length of sparse signal \(x\) ‘very much’. See Restricted Isometric Property in sec.Â 3.1 .

Lets there is a sparse signal (x) of length (N). Its sparse. We take a dot-product of x with a random vector (A_i) of size N i.e. (b_i = A_i * x). We make (k) such measurements. We can put all k (A_i) into a matrix (A) and represent the process by a linear system:

Note that each (A_i) has the dimension (1 \times N). We can rewrite eq.Â 1 as following:

where dimension of (A) are (k \times N) and (k << N). This is an under-determined system with the condition that (x) is sparse. Can we recover (x) (of size N) from b (of size k (<<N))?

Figure fig.Â 1 shows a real system. Compressed sensing says that (x) can be recovered by solving the following linear program.

Data for fig.Â 2 is generated by script `./compressed_sensing.py`

and for fig.Â 3, data is generated by script `./magic_reconstruction.py`

. Both of these scripts also generate similar figures.

**Dependencies** In addition to `scipy`

, `numpy`

, and `matplotlib`

, we also need pyCSalgo. It is available via pip: `pip install pyCSalgo --user`

.

Code is available at github

**input**: sparse signal x of size N- Generate a random matrix (
**measurement matrix**) \(A\) of size \(k\times N\). - Make \(k\) measurements of \(x\) using \(A\). That is \(b = A x\). Note that each measurement of \(x\) (i.e. entry of \(b\)) is some linear combination of values of \(x\) given by \(A_i x\) where \(A_i\) is ith row.
- Now find \(x\) by solving \(\min_x \left\Vert x \right\Vert_1 \text{given}\; Ax=b\).

For step 4, we are using function `l1eq_pd`

from Python library `pyCSalgo`

. This is reimplementation of `l1magic`

routine written in Matlab by Candes. When (k >> 2S) where (S) is the sparsity of (x), we can recover (x) quite well. In practice, (k \ge 4S) almost always gives exact result.

There are other algorithms available which works faster; they are based on **greedy** strategy. We are not discussing them here.

This example shows that some error is added to measurements. The reconstructed signal is no longer ‘exact’ enough. But still, given that x was binary signal, we can recover x by thresholding function.

Restricted isometric property (RIP) of a matrix (A) is defined as the following.

Given (\delta_s \in (0,1)), for any sub-matrix (A_s) of (A), and for any sparse vector (y), if following holds

then matrix (A) is said to satisfy (s-)restricted isometric property with restricted isometric constant (\delta_s). *Note that a matrix A with such a property does not change the length of signal (x) ‘very much’*. This enables us to sense two different sparse signal (x_1) and (x_2) such that (A x_1) and (A x_2) are almost likely to be different.

To learn about compressed sensing, I recommend following articles

- A very good motivational article www.ams.org/samplings/math-history/hap7-pixel.pdf . (last accessed Wed 30 Aug 2017 11:14:35 AM IST)
- A nice tutorial using Matlab CodeProject article.
- Terry Tao Blog with tag Compressed sensing.

There are many many other great articles and papers on this subject. There are some dedicated webpages also.