Bruce Lee random walk

随机矩阵理论(I): 综述

随机矩阵理论自一开始就注重理论联系实际,随着时间发展已经深入到很多学科领域,如随机过程,PDE,数论,统计,图论,医药,生物,图像处理,神经科学,金融,信号处理,无线通信,甚至机器学习,深度学习等计算机科学中。因此,我们很难毫无疏漏地把RMT讲述清楚。下面,我将RMT分成几个不同部分来写,不定期地推出。本文是随机矩阵理论系列的第一篇文章,主要描述一些产生背景,发展过程和几个重要的定理定律及其证明方法。

什么是随机矩阵

A random matrix is a matrix-valued random variable—that is, a matrix some or all of whose elements are random variables.

For more details, see Persi Diaconis(2005), What is ... a random matrix? or Romain Couillet and Merouane Debbah(2011), Random Matrix Methods for Wireless Communications

预备知识

简史

For more details, see for example Yan Fyodorov

Distributions

the probability distribution supported on the interval [−R, R] the graph of whose probability density function f is a semicircle of radius R centered at (0, 0) and then suitably normalized (so that it is really a semi-ellipse)

for -R ≤ x ≤ R, and f(x)=0 if x > R.

This distribution arises as the limiting distribution of eigenvalues of many random symmetric matrices as the size of the matrix approaches infinity.

It is a scaled beta distribution, more precisely, if Y is beta distributed with parameters α = β = 3/2, then X = 2RY – R has the above Wigner semicircle distribution.

In the limit of R approaching zero, the Wigner semicircle distribution becomes a Dirac delta function.

In free probability theory, the role of Wigner's semicircle distribution is analogous to that of the normal distribution in classical probability theory. Namely, in free probability theory, the role of cumulants is occupied by “free cumulants”, whose relation to ordinary cumulants is simply that the role of the set of all partitions of a finite set in the theory of ordinary cumulants is replaced by the set of all noncrossing partitions of a finite set. Just as the cumulants of degree more than 2 of a probability distribution are all zero if and only if the distribution is normal, so also, the free cumulants of degree more than 2 of a probability distribution are all zero if and only if the distribution is Wigner’s semicircle distribution.

In number-theoretic literature, the Wigner distribution is sometimes called the Sato–Tate distribution.

# finite semi-circle law

using Plots,StatsBase
n=5;
#using Christoffel-Darboux formula
x=-1:0.001:1
x=x*sqrt(2*n)*1.3
pold=0*x#-1st Hermite polynomial
p=1+0*x#0th Hermite polynomial
k=p
for j=1:n
    pnew=(sqrt(2)*x.*p-sqrt(j-1)*pold)/sqrt(j)
    pold=p
    p=pnew
end
pnew=(sqrt(2)*x.*p-sqrt(n)*pold)/sqrt(n+1)
k=n*p.^2-sqrt(n*(n+1))*pnew.*pold#use Page 420 of Mehta
k=k.*exp(-x.^2)/sqrt(π)#correct normalization multiplied
#rescale on [-1,1] and the area is π/2
plot(x/sqrt(2*n),k*π/sqrt(2*n))

If X denotes a M×N random matrix whose entries are independent identically distributed random variables with mean 0 and variance σ2 < ∞, let

and let λ1, λ2, …, λM be the eigenvalues of YN (viewed as random variables). Finally, consider the random measure

Theorem. Assume that M, N → ∞ so that the ratio M/N → λ ∈ (0,+ ∞). Then µM → µ (in weak* topology in distribution), where

and

with

The Marčenko–Pastur law also arises as the free Poisson law in free probability theory, having rate 1/λ and jump size σ2.

The shift by √(2n) is used to keep the distributions centered at 0. The multiplication by (√2)n1/6 is used because the standard deviation of the distributions scales as n-1/6.

The cumulative distribution function of the Tracy–Widom distribution can be given as the Fredholm determinant

of the operator As on square integrable functions on the half line (s, ∞) with kernel given in terms of Airy functions Ai by

It can also be given as an integral

in terms of a solution of a Painlevé equation of type II

where q, called the Hastings–McLeod solution, satisfies the boundary condition

The distribution F2 is associated to unitary ensembles in random matrix theory. There are analogous Tracy–Widom distributions F1 and F4 for orthogonal (β = 1) and symplectic ensembles (β = 4) that are also expressible in terms of the same Painlevé transcendent q

and

In practical terms, Tracy–Widom is the crossover function between the two phases of weakly versus strongly coupled components in a system. It also appears in the distribution of the length of the longest increasing subsequence of random permutations (Baik, Deift & Johansson 1999), in current fluctuations of the asymmetric simple exclusion process (ASEP) with step initial condition (Johansson 2000, Tracy & Widom 2009), and in simplified mathematical models of the behavior of the longest common subsequence problem on random inputs.

The distribution F1 is of particular interest in multivariate statistics (Johnstone 2007, 2008, 2009). For a discussion of the universality of Fβ, β = 1, 2, and 4, see Deift (2007). For an application of F1 to inferring population structure from genetic data see Patterson, Price & Reich (2006). In 2017 it was proved that the distribution F is not infinitely divisible see Domínguez-Molina, J. Armando 2017.

For numerical approximations, see Edelman&Persson, Bornemann.

其它性质

using Plots,StatsBase
t = 10;n = 1000;dx = .1
x = -1.9:dx:1.9
v = zeros(n,t);w = zeros(n,t)
for j = 1:t
    #coin flipping example
    A = sign(randn(n))
    B = sign(randn(n))
    Q = qr(randn(n,n))[1]
    v[:,j] = A + B  #classical probability
    
    #calculate the asymptotic eigenvalue distribution of sums of random matrices which are asymptotically free
    w[:,j] = eigvals(diagm(A)+Q'*diagm(B)*Q)  #free probability with Haar unitary measure
end
histogram(w[:],normed=true,lab="experiment",palette=:blues)
plot!(x,(π*sqrt(4-x.^2)).^(-1),lab="theory",w=2)

Types

Gaussian measure with density:

where β =1 for GOE, β =2 for GUE and β =4 for GSE.

joint probability density for the eigenvalues λ1, λ2, …, λn of GOE/GUE/GSE:

In the case of GUE, the formula describes a determinantal point process with the sine kernel.

where the function V is called the potential.

softwares

应用

function LIS(;n=4,k=2,trials=100_000)
  expec=[]
  for i=1:trials
    A=randn(k,k)+im*randn(k,k)
    Q,R=qr(A) #QR algorithm does not guarantee nonnegative diagonal entries in R
    # obtain a Haar-distributed unitary matrix
    Q=Q*diagm(exp(im*2*pi*rand(k))) #a simple correction by randomly perturbing the phase
    #Q=Q*diagm(sign(diag(R)))
    expec=[expec;abs(trace(Q))^(2n)]
  end
  mean(expec)
end