Shih Yu Changand Yimin Wei
1Department of Applied Data Science,San Jose State University,San Jose,CA 95192,USA
2School of Mathematical Sciences and Shanghai Key Laboratory of Contemporary Applied Mathematics,Fudan University,Shanghai 200433,China
Abstract.Since Kilmer et al.introduced the new multiplication method between two third-order tensors around 2008 and third-order tensors with such multiplication structure are also called as T-product tensors,T-product tensors have been applied to many fields in science and engineering,such as low-rank tensor approximation,signal processing,image feature extraction,machine learning,computer vision,and the multi-view clustering problem,etc.However,there are very few works dedicated to exploring the behavior of random T-product tensors.This work considers the problem about the tail behavior of the unitarily invariant norm for the summation of random symmetric T-product tensors.Majorization and antisymmetric Kronecker product tools are main techniques utilized to establish inequalities for unitarily norms of multivariate T-product tensors.The Laplace transform method is integrated with these inequalities for unitarily norms of multivariate T-product tensors to provide us with Bernstein bound estimation of Ky Fan k-norm for functions of the symmetric random T-product tensors summation.Finally,we also apply T-product Bernstein inequality to bound Ky Fan norm of covariance T-product tensor induced by hypergraph signal processing.
Key words:T-product tensors,T-eigenvalues,T-singular values,Bernstein bound,Courant-Fischer theorem for T-product tensors.
Since Kilmer et al.introduced the new multiplication method between two thirdorder tensors(T-product tensors),many new algebraic properties about such new multiplication rule between two third-order tensors are investigated[9,11].These useful algebraic properties of T-product tensors have been discovered as powerful tools in many science and engineering fields[21,28].Although T-product tensors have attracted many practical applications,all of these applications of T-product tensors assume that T-product tensors under consideration are deterministic.This assumption is not practical in general scientific and engineering applications based on T-product tensors.In[4,5],the authors have tried to establish several new tail bounds for sums of random T-product tensors.These probability bounds characterize large-deviation behavior of the extreme T-eigenvalue of the sums of random T-product tensors(definitions about T-eigenvalues and T-singular values associated to T-product tensors are given in Section 2.1).The authors first apply Lapalace transform method and Lieb’s concavity theorem for T-product tensors obtained from the work[4]to build several inequalities based on random T-product tensors,then utilize these inequalities to generalize the classical bounds associated with the names Chernoff,and Bernstein from the scalar to the T-product tensor setting.Tail bounds for the norm of a sum of random rectangular T-product tensors are also derived from corollaries of random symmetric T-product tensors cases.The proof mechanism is also applied to T-product tensor valued martingales and T-product tensor-based Azuma,Hoeffding and McDiarmid inequalities are also derived[5].The random tensor and its applications in MRI and the tensor normal distribution can be found in[1,23].
In this work,we will apply majorization techniques to establish new Bernstein bounds based on the summation of random symmetric T-product tensors.Compared to the previous work studied in[4,5,14],we make following generalizations:(1)besides bounds related to extreme values of T-eigenvalues,we consider more general unitarily invariant norm for T-product tensors;(2)the bounds derived in[5]can only be applied to the identity map for the summation of random symmetric T-product tensors,this work can derive new bounds for any polynomial function raised by any power greater or equal than one for the summation of random symmetric T-product tensors.In order to drive these new bounds,we also establish Courant-Fischer minmax theorem for T-product tensors in Theorem 2.1 and marjoization relation for T-singular values in Lemma 4.1.Our main theorem is provided below:
Theorem 1.1(Generalized T-product tensor Bernstein bound).Consider a sequence{Xj∈Rm×m×p}of independent,random symmetric T-product tensors with random structure defined by Definition4.1.Let g be a polynomial function withdegree n and nonnegative coefficients a0,a1,···,anraised by power s≥1,i.e.,g(x)=(a0+a1x+···+anxn)swith s≥1.Suppose following condition is satisfied:
where t>0,and we also have
Then we have following inequality:
The rest of this paper is organized as follows.In Section 2,we review T-product tensors basic concepts and introduce a powerful scheme about antisymmetric Kronecker product for T-product tensors.In Section 3,we apply a majorization technique to prove T-product tensor norm inequalities.We then apply new derived T-product tensor norm inequalities to obtain random T-product tensor Bernstein bounds for the extreme T-eigenvalues and Ky Fank-norm in Section 4.Finally,concluding remarks are given by Section 6.
In this section,we will introduce fundamental facts about T-product tensors in Section 2.1.Several unitarily invariant norms about a T-product tensor are defined in Section 2.2.A powerful scheme about antisymmetric Kronecker product for T-product tensors will be provided by Section 2.3.
All third order tensors considered in this work will adopt T-product between two third order tensors multiplication.Basic definitions like identity,symmetric T-product tenor,inner product and computations can be found at[2,10,13,17,18,24,26].Notions about trace,T-positive definite(TPD)tensor,T-positive semidefinite(TPSD)tensor,SVD of a symmetric T-product tensor are provided by[27].
If a T-product tensorC∈Rm×m×pcan be diagonalized as
thej-th eigenvalue of the matrix Ciis called a T-eigenvalue[12,19],denoted byλi,j.We then define thedeterminantof a T-product tensorC∈Rm×m×p,represented by det(C),as
Similarly,singular values of each matrix Ciare T-singular values of the tensorC.
If a symmetric T-product tensorC∈Rm×m×pcan be expressed as the format shown by Eq.(2.1),the T-eigenvalues ofCwith respect to the matrix Ciare denoted asλi,ki,where 1≤ki≤m,and we assume thatλi,1≥λi,2≥···≥λi,m(including multiplicities).Then,λi,kiis theki-th largest T-eigenvalue associated to the matrix Ci.If we sort all T-eigenvalues ofCfrom the largest one to the smallest one,we use,a smallest integer between 1 tom×p(inclusive)associated withpgiven positive integersk1,k2,···,kpthat satisfies
Then,we will have the following Courant-Fischer theorem for T-product tensors.
Theorem 2.1.Given a symmetric T-product tensor C∈Rm×m×pand p positive integers k1,k2,···,kpwith1≤ki≤m,then we have
where λandare defined by Eqs.(2.3)and(2.4).
Proof.First,we have to express〈X,C★X〉by matrices of Ciand Xithrough the representation shown by Eq.(2.1).It is
We will just verify the first characterization ofThe other is similar.LetSibe the projection ofSto the space with dimensionkispanned by vi,1,···,vi,ki,for every xi∈Si,we can write
To verify that this is the maximum,letbe the projection ofTto the space with dimensionwith dimension+1,then the intersection ofSandis not empty.We have
Therefore,for all subspacesSof dimensions{k1,···,kp},we have
Thus,we complete the proof.
Given a symmetric T-product tensorCwith associated matrices Ciprovided by Eq.(2.1),next theorem is the representation of the summation of all the largestkiT-eigenvalues of Ciand the summation of all the smallestkiT-eigenvalues of Ci.
Theorem 2.2.Let C∈Rm×m×pbe a symmetric T-product tensor with associated matricesCiprovided by Eq.(2.1),and we sort T-eigenvalues of the matrixCias λi,1≥λi,2≥···≥λi,ki.Then,we have
and
whereUiare ki×m complex matrices.
Proof.From Theorem 5 in[27],we may assume that Ciare diagonal matrices,denoted as Di,since Ciare symmetric T-product matrices.Therefore,we have the expression
Then,we have
where P=(pj,l)is aki×mstochastic matrix.Then,we can concatenate an(m-ki)×mmatrix Q to the matrix P to make the following matrixas doubly stochastic from 2.C.1(4)from[16].Then,Eq.(2.12)can be expressed as
Given two lists of real numbers,[a1,···,an]and[b1,···,bn],we use[a1,···,an]?[b1,···,bn]to represent the following relationships:
holds for anykbetween 1 andn.From Eq.(2.13),we have
and 3.H.2.b from[16],we have
and
Finally,this theorem is proved by applyingto both sides of Eqs.(2.15)and(2.16)with respect to the indexi,and note that
respectively.
Let us represent the T-eigenvalues of a symmetric T-product tensorH∈Rm×m×pin decreasing order by the vector
wherem×pis the total number of T-eigenvalues.We useto represent a set of nonnegative(positive)real numbers.Let‖·‖ρbe a unitarily invariant tensor norm,i.e.,
whereUis any unitary tensor.Letbe the corresponding gauge function that satisfies Hlder’s inequality so that
Several popular norms can be treated as special cases of unitarily invariant tensor norm.The first one is Ky Fan likek-norm[6]for tensors.Fork∈{1,2,···,m×p},the Ky Fank-norm[6]for tensorsH∈Rm×m×p,denoted as‖H‖(k),is defined as:
Ifk=1,the Ky Fank-norm for tensors is the tensor operator norm,denoted as‖H‖.The second one is Schattenp-norm for tensors,denoted as‖H‖p,is defined as:
wherep≥1.Ifp=1,it is the trace norm.
In this section,we will discuss a machinery of antisymmetric Kronecker product for T-product tensors and this scheme will be used later for log-majorization results.Let H be anm×p-dimensional Hilbert space.For eachk∈N,let H?kdenote thek-fold Kronecker product of H,which is the(m×p)k-dimensional Hilbert space with respect to the inner product defined by
For X1,···,Xk∈H,we define X1∧···∧Xk∈H?kby
whereσruns over all permutations on{1,2,···,k}and sgnσ=±1 depending onσis even or odd.The subspace of H?kspanned by{X1∧···∧Xk},where Xi∈H,is named ask-fold antisymmetric Kronecker product of H and represented by H∧k.
For eachC∈Rm×m×pandk∈N,thek-fold Kronecker productC?k∈Rmk×mk×pkis given by
Because H∧kis invariant forC?k,the antisymmetric Kronecker product ofC∧kofCcan be defined asC∧k=C?|H∧k,then we have
We will provide the following lemmas about antisymmetric Kronecker product.
Lemma 2.1.Let A,B,C,E∈Rm×m×pbe T-product tensors,for any k∈{1,2,···,m×p},we have
The proof can be found at Lemma 3 in[3].
In this section,the majorization with integral average and log-majorization with integral average will be introduced in Section 3.1 and Section 3.2.These majorization results will be used to prove T-product tensor norm inequalities in Section 3.3.The basic concept about majorization and its applications can refer to[16].
Let Ω be aσ-compact metric space andνa probability measure on the Borelσ- field of Ω.LetC,Dτ∈Rm×m×pbe symmetric T-product tensors.We further assume that tensorsC,Dτare uniformly bounded in their norm forτ∈Ω.Letτ∈Ω→Dτbe a continuous function such that sup{‖Dτ‖:τ∈Ω}<∞.For notational convenience,we define the following relation:
Iffis a single variable function,the notationf(C)represents a tensor function with respect to the tensorC.
Theorem 3.1.LetΩ,ν,C,Dτbe defined as the beginning part of Section3.1,and f:R→[0,∞)be a non-decreasing convex function,we have following two equivalent statements:
where‖·‖ρis the unitarily invariant norm defined in Eq.(2.17).
Proof.We assume that the left statement of Eq.(3.2)is true and the functionfis a non-decreasing convex function.From Lemma 1 in[8],we have
From the convexity off,we also have
Then,we obtain
By applying Lemma 4.4.2 in[7]to both sides of
with gauge functionρ,we obtain
Therefore,the right statement of Eq.(3.2)is true from the left statement.
On the other hand,if the right statement of Eq.(3.2)is true,we select a functionf=max{x+c,0},wherecis a positive real constant satisfyingC+cI≥O,Dτ+cI≥Ofor allτ∈Ω,and tensorsC+cI,Dτ+cI.If the Ky Fank-norm at the right statement of Eq.(3.2)is applied,we have
Hence,
this is the left statement of Eq.(3.2).
Next theorem will provide a stronger version of Theorem 3.1 by removing weak majorization conditions.
Theorem 3.2.LetΩ,ν,C,Dτbe defined as the beginning part of Section3.1,and f:R→[0,∞)be a convex function,we have following two equivalent statements:
where‖·‖ρis the unitarily invariant norm defined in Eq.(2.17).
Proof.We assume that the left statement of Eq.(3.7)is true and the functionfis a convex function.Again,from Lemma 1 in[8],we have
then,
This proves the right statement of Eq.(3.7).
Now,we assume that the right statement of Eq.(3.7)is true.From Theorem 3.1,we already have
It is enough to prove
We define a functionf=max{c-x,0},wherecis a positive real constant satisfyingC≤cI,Dτ≤cIfor allτ∈Ω and tensorscI-C,cI-Dτ.If the trace norm is applied,i.e.,the sum of the absolute value of all eigenvalues of a symmetric T-product tensor,then the right statement of Eq.(3.7)becomes
The desired inequality
is established.
The purpose of this section is to consider log-majorization issues for unitarily invariant norms of TPSD T-product tensors.In this section,letC,Dτ∈Rm×m×pbe TPSD T-product tensors withm×pnonnegative T-eigenvalues by keeping notations with the same definitions as at the beginning of the Section 3.1.For notational convenience,we define the following relation for logarithm vector:
Theorem 3.3.Let C,Dτbe TPSD T-product tensors,f:(0,∞)→[0,∞)be a continuous function such that the mapping x→logf(ex)is convex onR,and g:(0,∞)→[0,∞)be a continuous function such that the mapping x→g(ex)is convex onR,then we have following three equivalent statements:
Proof.The roadmap of this proof is to prove equivalent statements between Eq.(3.11a)and Eq.(3.11b) first,followed by equivalent statements between Eq.(3.11a)and Eq.(3.11c).
Eq.(3.11a)=?Eq.(3.11b).
There are two cases to be discussed in this part of proof:C,Dτare TPD tensors,andC,Dτare TPSD T-product tensors.At the beginning,we consider the case thatC,Dτare TPD tensors.
SinceDτare positive,we can findε>0 such thatDτ≥εIfor allτ∈Ω.From Eq.(3.11a),the convexity of logf(ex)and Lemma 1 in[8],we have
Then,from Eq.(2.17),we obtain
From the functionfproperties,we can assume thatf(x)>0 for anyx>0.Then,we have following bounded and continuous maps on Ω:τ→logf(λi(Dτ))fori∈Because we haveν(Ω)=1 andσ-compactness of Ω,we haveandn∈N withsuch that
wherei∈{1,2,···m×p}.Moreover,
By taking the exponential at both sides of Eq.(3.14)and apply the gauge functionρ,we have
Similarly,by taking the exponential at both sides of Eq.(3.15),we have
From Lemma 2 in[8],we have
From Eqs.(3.16),(3.17)and(3.18),we have
Then,Eq.(3.11b)is proved from Eqs.(3.13)and(3.19).
Next,we consider thatC,Dτare TPSD T-product tensors.For anyδ>0,we have following log-majorization relation:
where∈δ>0 andk∈{1,2,···r}.Then,we can apply the previous case result about TPD tensors to TPD tensorsC+∈δIandDτ+δI,and get
Asδ→0,Eq.(3.21)will give us Eq.(3.11b)for TPSD T-product tensors.
Eq.(3.11a)?=Eq.(3.11b).
We consider TPD tensors at first phase by assuming thatDτare TPD T-product tensors for allτ∈Ω.We may also assume that the tensorCis a TPD T-product tensor.Since if this is a TPSD T-product tensor,i.e.,someλi=0,we always have following inequality valid:
If we applyf(x)=xpforp>0 and‖·‖ρas Ky Fank-norm in Eq.(3.11b),we have
From L’Hopital’s Rule,ifp→0,we have
and
whereτ∈Ω.Appling Eqs.(3.25)and(3.26)into Eq.(3.24)and takingp→0,we have
Therefore,Eq.(3.11a)is true for TPD tensors.
For TPSD T-product tensorsDτ,since Eq.(3.11b)is valid forDτ+δIfor anyδ>0,we can apply the previous case result about TPD tensors toDτ+δIand obtain
wherek∈{1,2,···,r}.Eq.(3.11a)is still true for TPSD T-product tensors asδ→0.
Eq.(3.11a) =?Eq.(3.11c).
IfC,Dτare TPD tensors,andDτ≥δIfor allτ∈Ω.From Eq.(3.11a),we have
If we apply Theorem 3.1 to logC,logDτwith functionf(x)=g(ex),wheregis used in Eq.(3.11c),Eq.(3.11c)is implied.
IfC,Dτare TPSD T-product tensors and anyδ>0,we can find∈δ∈(0,δ)to satisfy following:
Then,from TPD T-product tensor case,we have
Eq.(3.11c)is obtained by takingδ→0 in Eq.(3.31).
Eq.(3.11a)?=Eq.(3.11c).
Fork∈{1,2,···,r},if we applyg(x)=log(δ+x),whereδ>0,and Ky Fank-norm in Eq.(3.11c),we have
Then,we have following relation asδ→0:
Therefore,Eq.(3.11a)can be derived from Eq.(3.11c).
Next theorem will extend Theorem 3.3 to non-weak version.
Theorem 3.4.Let C,Dτbe TPSD T-product tensors with
for any p>0,f:(0,∞)→[0,∞)be a continuous function such that the mapping x→logf(ex)is convex onR,and g:(0,∞)→[0,∞)be a continuous function such that the mapping x→g(ex)is convex onR,then we have following three equivalent statements:
Proof.The proof plan is similar to the proof in Theorem 3.3.We prove the equivalence between Eq.(3.34a)and Eq.(3.34b) first,then prove the equivalence between Eq.(3.34a)and Eq.(3.34c).
Eq.(3.34a)=?Eq.(3.34b).
First,we assume thatC,Dτare TPD T-product tensors withDτ≥δIfor allτ∈Ω.The corresponding part of the proof in Theorem 3.3 about TPD tensorsC,Dτcan be applied here.
For case thatC,Dτare TPSD T-product tensors,we have
whereδn>0 andδn→0.Because we have
asn→∞;from Lemma 12 in[8],we can find a(n)withn≥n0such that≥···≥a(n)→→λ(C)and
wheren≥n0.
There are two situations for the functionfnear 0:f(0+)<∞andf(0+)=∞.For the case withf(0+)<∞,we have
whereτ∈Ω andn→∞.From Fatou-Lebesgue theorem,we then have
By takingn→∞in Eq.(3.37)and using Eqs.(3.38),(3.39),(3.40),we have Eq.(3.34b)for case thatf(0+)<∞.
For the case withf(0+)=∞,we assume thatRΩlog‖f(Dτ)‖ρdν(τ)<∞,(since the inequality in Eq.(3.34b)is always true forRΩlog‖f(Dτ)‖ρdν(τ)=∞).Sincefis decreasing on(0,∈)for some∈>0.We claim that the following relation is valid:there are two constantsa,b>0 such that
for allτ∈Ω andn≥n0.If Eq.(3.41)is valid and
from Lebesgue’s dominated convergence theorem,we also have Eq.(3.34b)for case thatf(0+)=∞by takingn→∞in Eq.(3.37).
Below,we will prove the claim stated by Eq.(3.41).By the uniform boundedness of tensorsDτ,there is a constantκ>0 such that
whereτ∈Ω andn≥n0.We may assume thatDτis TPD tensors because‖f(Dτ)‖ρ=∞,i.e.,Eq.(3.41)being true automatically,whenDτis TPSD T-product tensors.From SVD of symmetric T-product tensors,we have
Therefore,the claim in Eq.(3.41)follows by the triangle inequality for‖·‖ρandf(λj′(Dτ)+δn)<∞forλj′(Dτ)+δn≥∈.
Eq.(3.34a)?=Eq.(3.34b).
The weak majorization relation
is valid fork<m×pfrom Eq.(3.11a)=?Eq.(3.11b)in Theorem 3.3.We wish to prove that Eq.(3.44)becomes equal fork=m×p.It is equivalent to prove that
where det(·)is defined by Eq.(2.2).We can assume that
since Eq.(3.45)is true for
Then,Dτare TPD tensors.
If we scale tensorsC,DτasaC,aDτby somea>0,we can assumeDτ≤Iandλi(Dτ)≤1 for allτ∈Ω andi∈{1,2,···,m×p}.Then for anyp>0,we have
If we use tensor trace norm,represented by‖·‖1,as unitarily invariant tensor norm andf(x)=x-?for any?>0 in Eq.(3.34b),we obtain
Similar to Eqs.(3.25)and(3.26),we have following two relations as?→0:
and
From Eq.(3.47)and Lebesgue’s dominated convergence theorem,we have
Finally,we have Eq.(3.45)from Eqs.(3.49)and(3.52).
Eq.(3.34a)=?Eq.(3.34c).
First,we assume thatC,Dτare TPD tensors andDτ≥δIforτ∈Ω.From Eq.(3.34a),we can apply Theorem 3.2 to logC,logDτandf(x)=g(ex)to obtain Eq.(3.34c).
ForC,Dτare TPSD T-product tensors,we can choose a(n)and correspondingC(n)forn≥n0givenδn→0 withδn>0 as the proof in Eq.(3.34a)=?Eq.(3.34b).Since tensorsC(n),Dτ+δnIare TPD T-product tensors,we then have
Ifg(0+)<∞,Eq.(3.34c)is obtained from Eq.(3.53)by takingn→∞.On the other hand,ifg(0+)=∞,we can apply the argument similar to the portion aboutf(0+)=∞in the proof for Eq.(3.34a)=?Eq.(3.34b)to geta,b>0 such that
for allτ∈Ω andn≥n0.Since the case that
will have Eq.(3.34c),we only consider the case that
Then,we have Eq.(3.34c)from Eqs.(3.53),(3.54)and Lebesgue’s dominated convergence theorem.
Eq.(3.34a)?= Eq.(3.34c).
The weak majorization relation
is true from the implication from Eq.(3.11a)to Eq.(3.11c)in Theorem 3.3.We have to show that this relation becomes identity fork=m×p.If we apply‖·‖ρ=‖·‖1andg(x)=x-?for any?>0 in Eq.(3.34c),we have
Then,we will get
which will prove the identity for Eq.(3.55)whenk=m×p.The equality in=1will be proved by the following Lemma 3.1.
Lemma 3.1.Let Dτbe TPSD T-product tensors withRΩ‖D-pτ‖ρdν(τ)<∞for any p>0,then we have
The proof can be found at Lemma 6 in[3].
In this section,we will apply derived majorization inequalities for T-product tensors to multivariate T-product tensor norm inequalities which will be used to bound random T-product tensor concentration inequalities in later sections.We will begin to present a Lie-Trotter product formula for tensors.
The proof of this lemma can be found in Lemma 7 in[3].
Below,new multivariate norm inequalities for T-product tensors are provided according to previous majorization theorems.
Theorem 3.5.Let Ci∈Rm×m×pbe TPD tensors,where1≤i≤n,‖·‖ρbe a unitarily invaraint norm with corresponding gauge function ρ.For any continuous function f:(0,∞)→[0,∞)such that x→logf(ex)is convex onR,we have
where
For any continuous function g(0,∞)→[0,∞)such that x→g(ex)is convex onR,we have
Proof.From Hirschman interpolation theorem[22]andθ∈[0,1],we have
whereh(z)be uniformly bounded onand holomorphic onS.The termdβθ(t)is defined as:
LetH(z)be a uniformly bounded holomorphic function with values in Cm×m×p.Fix someθ∈[0,1]and letU,V∈Cm×m×pbe normalized tensors such that
If we defineh(z)ash(z)=〈U,H(z)★V〉,we have following bound:|h(z)|≤‖H(z)‖for allz∈S.From Hirschman interpolation theorem,we then have following interpolation theorem for tensor-valued function:
Let
Then the first term in the R.H.S.of Eq.(3.64)is zero sinceH(ιt)is a product of unitary tensors.Then we have
From Lemma 2.1,we have following relations:
If Eq.(3.65)is applied to∧kCifor 1≤k≤r,we have following log-majorization relation from Eqs.(3.66)and(3.67):
Moreover,we have the equality condition in Eq.(3.68)fork=rdue to following identies:
At this stage,we are ready to apply Theorem 3.4 for the log-majorization provided by Eq.(3.68)to get following facts:
and
From Lie product formula for tensors given by Lemma 3.2,we have
By settingθ→0 in Eqs.(3.70),(3.71)and using Lie product formula given by Eq.(3.72),we will get Eqs.(3.60)and(3.61).
The purpose of this section is to apply new derived T-product tensor norm inequalities to obtain random symmetric T-product tensor Bernstein bounds.In Section 4.1,Ky Fank-norm inequalities for T-product tensors will be provided and such Ky Fank-norm inequalities will be utilized to establish T-product tensor Bernstein bounds in Section 4.2 and Section 4.3.
We will present several lemmas required to prove Ky Fank-norm tail bounds.
We have following lemma about the majorization relation of T-singular values among T-product tensors summation.
Lemma 4.1.Given two symmetric T-product tensors C,D∈Rm×m×p.We have following majorization relation about T-singular values:
Proof.Since we have
where?is the operation to take the real part,and the equalities=1and=2come from Theorem 2.2.
We are ready to introduce the following two lemmas about Ky Fank-norm inequalities for the product of tensors(Lemma 4.2)and the summation of tensors(Lemma 4.3).
Lemma 4.2.Let Ci∈Rm×m×pbe symmetric T-product tensors and let pibe positivereal numbers satisfyingThen,we have
where s≥1and k∈{1,2,···,m×p}.
The proof can be found at Lemma 10 in[3].
Lemma 4.3.Let Ci∈Rm×m×pbe symmetric T-product tensors,then we have
where s≥1and k∈{1,2,···,m×p}.
The proof can be found at Lemma 11 in[3].
Now,we are ready to present our main theorem about Ky Fank-norm probability bound for a function of tensors summation.
Theorem 4.1.Consider a sequence{Xj∈Rm×m×p}of independent,random,symmetric T-product tensors.Let g(x)be a polynomial function with degree n and nonnegative coefficients a0,a1,···,anraised by power s≥1,then g(x)can be expressed as:
Suppose following condition is satisfied:
where t>0.Then,we have
Proof.Lett>0 be a parameter to be chosen later.Then
where≤1uses Markov’s inequality,≤2requires conditions provided by Eq.(4.6).
We can further bound the expectation term in Eq.(4.7)as
where≤3from Eq.(3.61)in Theorem 3.5,≤4is obtained from functiongdefinition and Lemma 4.3.Again,the expectation term in Eq.(4.9)can be further bounded by Lemma 4.2 as
Note that the final equality is obtained due to that the integrand is independent of the variableτand
Finally,this theorem is established from Eqs.(4.8),(4.9),and(4.10).
Remark 4.1.The condition provided by Eq.(4.6)can be achieved by normalizing tensorsXjthrough scaling.
In this section,we will present a tensor Bernstein bound for the maximum and the minimum T-eigenvalue for summation of random symmetric T-product tensors.We will provide the following definition to define a random structure for the T-product tensorX∈Rm×m×p.
Definition 4.1.Random structure for random symmetric T-product tensor X∈Rm×m×p
1.There are p Hermitian matrices with size m×m,denoted asX1,X2,···,Xp,obtained from Eq.(2.1).The entries for the matrixXiare denoted by(xij,k),where xij,kis a complex number.
2.For eachXi,the random variables xij,j,?xij,kfor j<k,and ?xij,kfor j<k,are independent.
3.For eachXi,the random variables xij,jfollow Gaussian distribution with zeromean and variance as
4.For eachXi,the random variablesfor j<k,and ?xij,kfor j<k,followGaussian distribution with zero mean and variance as
Following lemma is about the expectation of the largest T-eigenvalue of symmetric T-product tensor exp(γX),whereγis a real number.
Lemma 4.4.Given a random symmetric T-product tensor X∈Rm×m×psatisfying Definition4.1and any real number γ,we have
where λ1is the largest T-eigenvalue,and c1,c2are constants related to the bound of cumulative distribution function of the largest eigenvalue of the random Hermitian matrixX.
The proof can be found at Lemma 12 in[3].
We are ready to present our theorem about the maximum and the minimum of T-eigenvalue for the summation of random symmetric T-product tensors.
Theorem 4.2(T-product tensor Bernstein bound for T-eigenvalue).Consider a sequence{Xj∈Rm×m×p}of independent,random,symmetric T-product tensors with random structure defined by Definition4.1.Then we have following inequalities:given θ1>0,we have
and,given θ2<0,we have
TheΨfunction is defined by Eq.(4.11).
Proof.Since we have
where=1comes from that maximum singular value equals to the maximum absolute value of an T-eigenvalue and the maximum and the minimum of T-eigenvalue has same distribution due to the symmetry of random structure given by Definition 4.1;the inequality≤2comes from Theorem 4.1 whengis the identity function;the equality≤3comes from Lemma 4.4 and
due to TPD of exp(pjtXj);the inequality≤4is obtained by selectingpj=M.Therefore,we have Eq.(4.12).
For the minimum T-eigenvalue,we also have
where=1comes from Theorem 2.1;=2is true since the maximum singular value equals to the maximum absolute value of an T-eigenvalue and the maximum and the minimum of T-eigenvalue has same distribution due to the symmetry of random structure given by Definition 4.1;the inequality≤3comes from Theorem 4.1 again whengis an identity map;the equality≤4comes from Lemma 4.4 and
due to TPD of exp(pjtXj);the inequality≤5is obtained by selectingpj=M.Hence,we have Eq.(4.13).
In this section,we will present a generalized tensor Bernstein bound for Ky Fanknorm,and we will begin with a lemma to bound exponential of a random T-product tensor.
Lemma 4.5.Suppose that X∈Rm×m×pis a random symmetric T-product tensor that satisfies
where A is a fixed TPD tensor.Then,we have
where0<t<1.
The proof can be found at Lemma 13 in[3].
Lemma 4.6.Given a random symmetric T-product tensor X∈Rm×m×psatisfying Definition4.1,we have
where σ1is the largest T-singular value,and d1,d2are constants related to the upper bound of the largest eigenvalue of the random Hermitian matrixX.
The proof can be found at Lemma 14 in[3].
Following lemma is about Ky Fank-norm bound for the exponential of a random T-product tensor with subexponential constraints.
Lemma 4.7.Given a symmetric random T-product tensor X∈Rm×m×pwith random structure defined by Definition4.1and
where A is a TPD T-product tensor.Then,we have following bound about the expectation value of Ky Fan k-norm for the random T-product tensorexp(θX)
Proof.From Lemma 4.5,we have
whereσl(·)is thel-th largest T-singular value.
From Lemma 4.1,we have for two symmetric T-product tensorsAandB.Then,we can bound Eσ1(I+θX+as
where we use Φ(m,d1,d2)from Lemma 4.6 to bound Eσ1(X)in the last inequality.This Lemma is proved by multiplyingkat Eq.(4.22).
We are ready to prove our main theorem,Theorem 1.1,about the generalized T-product tensor Bernstein bound.
Proof.Since we have
where the inequality≤1comes from Theorem 4.1;the inequality≤2comes from Lemma 4.7;the inequality≤3is obtained by settingpj=M.
In this section,we will try to apply generalized T-product Bernstein Bound derived in Section 4.3 to bound Ky Fan norm of covariance T-product tensor induced by hypergraph signal processing.In[15],Marques et al.provide a comprehensive introduction to the spectral analysis and estimation of graph stationary processes based on graph signal processing(GSP).We extend their settings from vectors/matrices used in traditional GSP to hypergraph signal processing,where T-product tensors are applied to characterize hypergraph 3-uniform signals,i.e.,signals are represented by three(3)dimensional data array[25].
Let G=(N,E)be a directed hypergraph with nodes set N and directed edges set E such that if there exists a hyperedge among a set ofMnodes(i,j,k)∈E.We associate G with the hypergraph shift operator(HGSO)S,defined as an square T-product tensor with dimensionsm×m×pwhose entrys(i,j,k)/=0 if(i,j,k)∈E.We introduce a hypergraph filterH:Rm×1×p→Rm×1×p,de fined as a linear hypergraph signal operator with the form
wherehkare scaler coefficients.The covariance tensor of output signals X∈Rm×1×pafter filtering white input signals by hypergraph filter shown in Eq.(5.1)will be expressed as
where=1is true if HGSOSis a symmetric T-product tensor.The coefficients
It is shown by the work[20]that although the correlation information of signal is given by thedensetensor,the actual relation is easier to be described by the more sparse tensorS.Examples about relationships between the HGSO and the covariance tensorCX(h)include
·CX(h)=S-1,as in in conditionally independent Markov random fields;
·CX(h)=(I-S)-2,as in symmetric structural equation models with white exogenous inputs.
In the sequel,we will bound the Ky Fan norm for the covariance tensorCX(h)when h=[h0,h1].In random environment,suppose HGSOSis obtained by sample average as
which is the polynomial function
in Theorem 1.1.We assume that random sampled tensorsare identical distributed asX′are satisfy Eq.(1.1)and Eq.(1.2).Then we have following bound of Ky Fan norm for the covarainceCX[h0,h1]from Theorem 1.1:
wherea0=h20,a1=2h0h1,anda2=h21.The usefulness of Eq.(5.5)is that we can control Ky Fan norm for the covarianceCX[h0,h1]via graph filter parametersh0,h1,and this controllability is crucial in GSP system design.
This work extend previous work in[5]by making following generalizations via majorization techniques:(1)besides bounds related to extreme values of T-eigenvalues,this works considers more general unitarily invariant norm for T-product tensors;(2)this work derives new bounds for any polynomial function raised by any power greater or equal than one for the summation of random symmetric T-product tensors.We also establish the Courant-Fischer min-max theorem for T-product tensors and marjoization relation for T-singular values which are by-products of our procedure to prove the generalized random T-product Bernstein bounds.Eventually,we apply T-product Bernstein inequality to bound Ky Fan norm of covariance T-product tensor induced by hypergraph signal processing.
Possible future work about this research is to consider tail bounds behaviors for the summation of random symmetric T-product tensors equipped with other random structures different from random structure provided by Definition 4.1.
Acknowledgements
The helpful comments of the handling editor and the referee are gratefully acknowledged.Prof.Wei is supported by Innovation Program of Shanghai Municipal Education Commission and the National Natural Science Foundation of China under grant No.11771099.