Siyu LV Zhen WU
Abstract The authors prove a sufficient stochastic maximum principle for the optimal control of a forward-backward Markov regime switching jump diffusion system and show its connection to the dynamic programming principle. The result is applied to a cash flow valuation problem with terminal wealth constraint in a financial market. An explicit optimal strategy is obtained in this example.
Keywords Stochastic maximum principle, Dynamic programming principle, Forward-backward stochastic differential equation, Regime switching, Jump diffusion
The stochastic maximum principle is one of the principal approaches to solving optimal control problems. Its key idea is to derive a set of necessary conditions that must be satisfied by any optimal control; these necessary conditions become sufficient under certain convexity conditions (see [2, 7, 9, 12, 24]). These works concern the control of stochastic differential equations (SDEs for short). On the other hand, since the introduction of nonlinear backward stochastic differential equations (BSDEs for short, see [11]), stochastic maximum principles for optimal control problems driven by BSDEs or forward-backward SDEs (FBSDEs for short) have been studied by many authors (see [3, 14, 19–22]).
There is a very extensive literature on stochastic maximum principles for various types of optimal control problems. For jump diffusion processes, see [6]; for Markov regime switching diffusion processes, see [4]; and for Markov regime switching jump diffusion processes, see [23]. In these works, sufficient maximum principles for SDEs were developed. Stochastic maximum principles for forward-backward controlled systems with Poisson jumps or Markov chains were studied in [10] and [17], respectively. In this paper, we prove a sufficient stochastic maximum principle for the optimal control of a forward-backward Markov regime switching jump diffusion system. This work extends the results of [23], which only discussed the forward case.
Another important approach to forward-backward stochastic optimal control problems is dynamic programming. Peng [13] first obtained the generalized dynamic programming principle and introduced the generalized Hamilton-Jacobi-Bellman (HJB for short) equation. Shi [16] generalized the results of [13] by considering controlled FBSDEs with jumps. In this paper, we establish the connection between the maximum principle and the dynamic programming principle of Peng's type in the Markov regime switching jump diffusion context. Relations among the adjoint processes, the generalized Hamiltonian function, and the value function are given under certain differentiability conditions.
Finally, we use the sufficient maximum principle to discuss the cash flow valuation problem with terminal wealth constraint in a financial model. Using the Lagrange multiplier technique, the problem is converted to an unconstrained optimization problem. We prove that the system for this unconstrained problem is governed by a controlled FBSDE, which fits naturally into the framework of our paper. The explicit optimal strategy is then given in linear state feedback form by means of a delicate analysis.
The paper is organized as follows. The next section presents the system dynamics and the optimal control problem. In Section 3, we prove the sufficient stochastic maximum principle. Section 4 establishes the relationship between the maximum principle and the dynamic programming principle. In Section 5, we illustrate the use of the maximum principle by solving a cash flow valuation problem with terminal wealth constraint.
We now introduce the Poisson random measure. Denote R+ = [0, ∞) and by B(R+) the Borel σ-field of R+. Let E ⊂ R\{0} be a nonempty Borel set and B(E) the Borel σ-field generated by the open subsets O of E whose closure Ō does not contain the point 0. Suppose that N(dt, de) is a Poisson random measure on (R+ × E, B(R+) ⊗ B(E)) with compensator n(dt, de) = ν(de)dt, where ν(de) is the Lévy density of the jump size of the random measure N(dt, de) on (E, B(E)). In what follows, we write the compensated Poisson martingale measure as Ñ(dt, de) = N(dt, de) − ν(de)dt.
We assume that the Brownian motion, the Markov chain, and the Poisson random measure defined above are independent of each other. This assumption ensures that the integration by parts formula (see Lemma 3.1) and Itô's formula (see Lemma 4.1) hold for regime switching jump diffusions. Assume further that the initial market mode α(0) of the Markov chain is α_{i_0}.
where μ ∈ R is a given constant and are given functions with appropriate dimensions. Here we denote for notational simplicity.
Consider a performance criterion defined as
where l,g,h are given functions.
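To make the kind of dynamics used in this section concrete, the following Python sketch simulates one path of a scalar regime switching jump diffusion with an Euler scheme. It is only an illustration under stated assumptions: the linear coefficients b, sigma, gamma, the single fixed relative jump size, the Bernoulli approximation of the Poisson increment over a small step, and the name simulate_path are all hypothetical and are not the controlled system (2.1). The Brownian increment, the jump indicator and the regime switch are drawn independently, mirroring the independence assumption above.

```python
import math
import random


def simulate_path(x0, i0, Q, b, sigma, gamma, jump_rate,
                  T=1.0, n_steps=1000, seed=0):
    """Euler scheme for the illustrative (assumed) dynamics
        dX = b(a) X dt + sigma(a) X dW + gamma(a) X dN~,
    where a(t) is a Markov chain with generator Q (rows sum to zero)
    and N~ is a compensated Poisson process with intensity jump_rate.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    xs, modes = [x0], [i0]
    for _ in range(n_steps):
        x, i = xs[-1], modes[-1]
        # Brownian increment over [t, t + dt].
        dW = rng.gauss(0.0, math.sqrt(dt))
        # Bernoulli approximation of the Poisson increment (valid for small dt).
        dN = 1 if rng.random() < jump_rate * dt else 0
        dN_comp = dN - jump_rate * dt  # compensated increment
        xs.append(x * (1.0 + b[i] * dt + sigma[i] * dW + gamma[i] * dN_comp))
        # Regime switch: leave state i with probability about -Q[i][i] * dt.
        if rng.random() < -Q[i][i] * dt:
            others = [j for j in range(len(Q)) if j != i]
            weights = [Q[i][j] for j in others]
            modes.append(rng.choices(others, weights=weights, k=1)[0])
        else:
            modes.append(i)
    return xs, modes
```

With all noise switched off (zero generator, zero volatility, zero jump rate) the scheme reduces to deterministic compounding in the initial regime, which gives a simple sanity check of the implementation.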
Let θ denote (x, y, z) and R denote the set of all functions
Define the Hamiltonian
We also assume that the Hamiltonian H is differentiable with respect to θ.
Now we introduce an FBSDE satisfied by the adjoint processes ∈ R^4 × R^D (from now on, the argument t is sometimes suppressed for simplicity whenever no confusion arises):
where Diag(λ(t)) represents a diagonal matrix with the elements of λ(t) on the diagonal.
Theorem 3.1 Let u* ∈ U with a corresponding solution of (2.1), and suppose that there exists a solution of the corresponding adjoint equation (2.3) such that, for all u ∈ U,
Furthermore, we assume that the following conditions hold (to simplify the notation, in what follows we write
Condition 3 The functions g(x, α_i) and h(y) are convex for each α_i, i = 1, 2, ···, D. Then u* is an optimal control and the corresponding solution of (2.1) is the optimal state process.
To prove this theorem, we first need the following lemma on the integration by parts formula. Its proof is similar to that of Lemma 3.2 in [23], so we omit it.
Lemma 3.1 Suppose that Γ(j)(t), j = 1, 2, are processes defined by the following SDEs:
Proof of Theorem 3.1 For any u ∈ U and the corresponding state processes, by Condition 3,
Noting the initial value of φ*(t) in (2.3), we have
From (2.1), (2.3) and Lemma 3.1, we obtain that the above is equal to
By the definitions in (2.1) and (2.3) of Y(T) and ??(T), we have
From (2.1), (2.3) and Lemma 3.1, is equal to
Combining (3.1)–(3.5), we get
Then we show that the integrand on the right-hand side of the above equation is nonnegative. By Condition 1,
Then by Condition 2, is a convex function of θ and for all (θ, u),
Therefore, for all (θ, u),
Define
By (3.6)–(3.7), η(t, θ) ≥ 0 for all θ. Moreover,
Therefore,
That is,
Substituting these into (3.7) and using (3.6), we get
and conclude J(u(t)) − J(u*(t)) ≥ 0, which proves that u*(t) is optimal.
As in the pure diffusion case, the adjoint processes can be expressed in terms of derivatives of the value function. We first cast our optimal control problem into a Markovian framework and consider Markovian (feedback) controls, that is, controls u(t) of the form u(t, X(t), α(t)). To connect the stochastic maximum principle derived in the previous section with the dynamic programming principle of Peng's type (see [13]), we reduce the cost functional (2.2) to J(u(t)) = Y(0), which corresponds to g(x, α_i) = 0, h(y) = y and l(t, θ, u, α_i) = 0 for all i = 1, 2, ···, D.
Write J(t, x, α_i; u) = Y(t), where (t, x, α_i) represent the initial time and the initial states, i.e., X(t) = x, α(t) = α_i. Furthermore, we define
To proceed, we need the following Itô's formula for Markov regime-switching jump-diffusion processes, whose proof can be found in [23].
Lemma 4.1 Suppose that we are given a real-valued process X(t) satisfying the following SDE:
In the following, for technical reasons, we assume
and
In a similar way to [13] and by Lemma 4.1, we obtain that the value function V(t, x, α_i) satisfies the following HJB equation:
where the generalized Hamiltonian G associated with v ∈ C^{1,2}([0, T] × R) for each α_i is defined as
Now we present a theorem which establishes the relationship between our stochastic maximum principle and the dynamic programming principle of Peng’s type.
Theorem 4.1 Assume that for each α_i ∈ S. Let u* be the optimal control and be the corresponding optimal state processes. Then for all s ∈ [t, T], we have
Proof From the generalized dynamic programming principle (see [13, Theorem 3.1]), it is easy to obtain
In fact,because
where the last inequality is due to the property of the backward semigroup introduced in [13]. Therefore, all the inequalities above become equalities. In particular,
which gives(4.5)because of the definition
and
In view of the definition (4.2) of the generalized Hamiltonian G, substituting (4.5) and (4.7) into (4.6) implies the relation (4.3). Next, by the HJB equation (4.1), we have
Consequently,
Finally, applying Itô's formula to , we get
The first relation in (4.4) is obtained by solving the forward SDE in (2.3) directly. Hence, from (4.9), the processes given by (4.4) solve the adjoint equation (2.3) (noting the terminal value). The proof is complete.
In this section, we use the stochastic maximum principle to solve the cash flow valuation problem with terminal wealth constraint in a Markov regime-switching jump-diffusion financial model (see [1] for a similar problem formulation in the pure diffusion case).
Consider a simple financial market consisting of one risk-free asset and one risky asset. The risk-free asset's price S_0(t) is given by the following stochastic ordinary differential equation (ODE):
where r(t, α_i) > 0, i = 1, 2, ···, D, are bounded deterministic functions and can be regarded as the interest rates in different market modes. The risky asset's price process S(t) is described by the following SDE:
In what follows, we denote by u(t) the amount of the agent's wealth invested in the risky asset at time t. We call u(t) the portfolio of the agent; u(t) can then be seen as the other part of the control. One has
where we set B(t, α_i) = b(t, α_i) − r(t, α_i), i = 1, 2, ···, D.
Definition 5.1 A strategy pair (c(t), u(t)) is said to be admissible if u(t) and c(t)X(t) are square integrable. The set of all admissible strategies is denoted by A.
Definition 5.2 The cash flow valuation problem with terminal wealth constraint is the following stochastic optimization problem:
where the positive constant δ represents the weight. From (5.1), the constraint above is in fact Y(0) = x_0.
Using the Lagrange multiplier method, the problem can be reduced to the following unconstrained control problem:
where θ is the Lagrange multiplier.
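The mechanics of the Lagrange multiplier step can be seen in a static toy problem, sketched below in Python. The toy is an assumption made purely for illustration and is not the dynamic problem of this section: maximize Σ_k w_k c_k^γ/γ with γ ∈ (0, 1) (a power utility, which is a standard member of the HARA family) subject to the single linear constraint Σ_k d_k c_k = x_0. The names weights, prices and solve_toy are hypothetical. The first-order condition of the Lagrangian gives each c_k as a function of the multiplier θ, and inserting these into the constraint pins down θ.

```python
def objective(c, weights, gamma):
    """Weighted power utility sum_k w_k * c_k**gamma / gamma."""
    return sum(w * ck ** gamma / gamma for w, ck in zip(weights, c))


def solve_toy(weights, prices, x0, gamma):
    """Maximize the weighted power utility subject to sum_k d_k c_k = x0.

    Stationarity of the Lagrangian, w_k c_k**(gamma - 1) = theta * d_k,
    gives c_k = (theta * d_k / w_k)**e with e = 1 / (gamma - 1).
    Plugging this into the constraint determines theta in closed form.
    """
    e = 1.0 / (gamma - 1.0)
    s = sum(d * (d / w) ** e for w, d in zip(weights, prices))
    theta = (x0 / s) ** (gamma - 1.0)
    c = [(theta * d / w) ** e for w, d in zip(weights, prices)]
    return theta, c
```

The same two-stage pattern, first optimizing for a fixed multiplier and then choosing the multiplier so that the constraint holds, is what the Lagrange multiplier method relies on; the toy only removes the time and randomness.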
If we introduce a stochastic ODE as follows
then we can reformulate problem (5.4) as follows, where the FBSDE provides a natural setup:
Finally, we assume that the principal's utility function is of HARA (hyperbolic absolute risk aversion) type. That is,
We shall solve the above forward-backward Markov regime switching jump diffusion optimal control problem (5.6) using the sufficient stochastic maximum principle obtained in Section 3.
In this case, the Hamiltonian defined in Section 2 has the following form:
The adjoint equation (2.3) becomes
and
We immediately have
Now let (c*, u*) be a candidate for an optimal strategy, together with the corresponding solution of the FBSDE (5.2) and (5.5) and the corresponding solution of the adjoint FBSDE (5.7) and (5.8). The value of c* which maximizes H is
Substituting c*(t) from (5.9) into (5.8), the backward adjoint equation becomes
where p(t, α_i) and q(t, α_i), i = 1, 2, ···, D, are deterministic, differentiable functions which are to be determined. From (5.10), (p(t, α(t)), q(t, α(t))) must satisfy the terminal boundary condition p(T, α_i) = δ, q(T, α_i) = −δd, for each i = 1, 2, ···, D.
Comparing the coefficients with (5.10), we get
On the other hand, since the Hamiltonian H is linear in u, the coefficient of u should vanish at optimality, i.e.,
Substituting for ψ*(t) and from (5.12) into (5.13) and noting (5.11), we obtain
To obtain the expressions of the functions p(t, α(t)) and q(t, α(t)), we substitute for ??(t) from (5.11) and for u*(t) from (5.14) into the first relation in (5.12). This leads to a linear equation in X*(t). Setting the coefficients of X*(t) equal to zero, we get the following two systems of ODEs:
and
with the terminal boundary conditions
where
The existence and uniqueness of solutions to the above two systems of ODEs are evident, as both are linear with uniformly bounded coefficients. To obtain explicit solutions, we consider the following processes
and
Applying Itô's formula to we get
We then use Lemma 3.1 to expand the right-hand side of (5.21),
Comparing the dt-part in the above equation with the same part of dR(t) given by (5.20), we find that the function defined by (5.17) satisfies (5.15). By uniqueness, we conclude that it coincides with p(t, α_i). Similarly, we can verify that the function defined by (5.18) is the unique solution of (5.16).
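Linear terminal-value systems of this type are also easy to integrate numerically, which offers an independent check on any explicit representation. The Python sketch below assumes a generic regime-coupled form, dp_i/dt + a(t, i) p_i + Σ_j q_ij p_j = 0 with p_i(T) = δ, where (q_ij) is a generator matrix whose rows sum to zero. This form, the coefficient function a and the name solve_terminal_system are assumptions for illustration; they stand in for the actual coefficients of (5.15) and (5.16). The system is integrated backward from T with a classical fourth-order Runge-Kutta scheme in the reversed time τ = T − t.

```python
import math


def solve_terminal_system(a, Q, delta, T=1.0, n_steps=200):
    """Return [p_1(0), ..., p_D(0)] for the assumed linear system
        dp_i/dt + a(t, i) * p_i + sum_j Q[i][j] * p_j = 0,  p_i(T) = delta.
    In reversed time tau = T - t this reads dp/dtau = a * p + Q p.
    """
    D = len(Q)
    h = T / n_steps

    def rhs(tau, p):
        t = T - tau
        return [a(t, i) * p[i] + sum(Q[i][j] * p[j] for j in range(D))
                for i in range(D)]

    p = [delta] * D  # terminal condition at tau = 0
    for k in range(n_steps):
        tau = k * h
        k1 = rhs(tau, p)
        k2 = rhs(tau + h / 2, [p[i] + h / 2 * k1[i] for i in range(D)])
        k3 = rhs(tau + h / 2, [p[i] + h / 2 * k2[i] for i in range(D)])
        k4 = rhs(tau + h, [p[i] + h * k3[i] for i in range(D)])
        p = [p[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
             for i in range(D)]
    return p
```

Two limiting cases are useful checks: with a single regime and constant a the solution is δ·exp(a(T − t)), and with a ≡ 0 the constant vector δ solves the system because the rows of the generator sum to zero.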
To find the optimal Lagrange multiplier θ*, using the technique in [8], we insert (5.9) into the initial constraint Y(0) = x_0 and easily derive
Thus we get the optimal θ* (recalling the definition of the transition probabilities of the Markov chain)
By the definition of u*(t) in (5.14), u*(t) is linear in X*(t). This leads to a linear SDE with bounded coefficients for X*(t). So (c*, u*) defined by (5.9) and (5.14) is indeed an admissible strategy.
Theorem 5.1 The optimal strategy for the cash flow valuation problem with terminal wealth constraint (5.3) is given by (5.9) and (5.14) in linear state feedback form:
where θ* is given by (5.23), and p(t, α(t)) and q(t, α(t)) are given by (5.17) and (5.18), respectively.
There are several interesting problems that deserve further investigation. One is to consider the necessary part of the stochastic maximum principle. This requires the derivation of the corresponding variational equations, which can be obtained in a similar way to Tao and Wu [17]. The necessary stochastic maximum principle can then be achieved by means of duality analysis. On the other hand, the forward-backward regime switching jump diffusion system is assumed to be completely observable in this paper. A more realistic and interesting model is only partially observable. Studying partially observable optimal control problems will involve further difficulties, including complex filtering techniques. Finally, we have established the connection between the maximum principle and the dynamic programming principle under the assumption that the value function is smooth enough, which is a very strong restriction. It would be desirable to explore the relations among the adjoint processes, the Hamiltonian function and the value function in the language of viscosity solutions, without involving any derivatives of the value function. We shall study these problems in forthcoming papers.
Acknowledgement The authors would like to thank the anonymous referee for valuable comments, which led to a much better version of this paper.
Chinese Annals of Mathematics, Series B, 2018, Issue 5