Shunan Han,Peng Liu,Guang Huang
Aviation University of Air Force,Changchun 130000,China
Abstract: The existing methods for identifying recursive systematic convolutional encoders with high robustness require to test all the candidate generator matrixes in the search space exhaustively.With the increase of the codeword length and constraint length,the search space expands exponentially,and thus it limits the application of these methods in practice.To overcome the limitation,a novel identification method,which gets rid of exhaustive test,is proposed based on the cuckoo search algorithm by using soft-decision data.Firstly,by using soft-decision data,the probability that a parity check equation holds is derived.Thus,solving the parity check equations is converted to maximize the joint probability that parity check equations hold.Secondly,based on the standard cuckoo search algorithm,the established cost function is optimized.According to the final solution of the optimization problem,the generator matrix of recursive systematic convolutional code is estimated.Compared with the existing methods,our proposed method does not need to search for the generator matrix exhaustively and has high robustness.Additionally,it does not require the prior knowledge of the constraint length and is applicable in any modulation type.
Keywords: RSC code;blind identification;softdecision;cuckoo search algorithm
In the field of information interception,the analyses of communication protocols are the key procedures.For the analyses of protocols,identification of channel encoders is a necessary technology.As the sub-code of Turbo code[1],recursive systematic convolutional(RSC)code has been widely used in modern communication systems,such as wireless sensor networks[2]and power line communication systems [3].Therefore,the identification of RSC code has drawn much attention.Usually,influenced by intentional and unintentional interferences,the signal to noise ratio of the intercepted signal may be low,which leads to the high bit error rates after blind demodulation.Therefore,it is necessary to do research on identification methods with high robustness.For the identification of RSC code under high bit error rates,a method based on walsh-hadamard transform (WHT) is proposed in[4].With the prior knowledge of the constraint length of the encoder,the WHT-based method tests all the binary vectors having the same length as the parity check vector of the RSC code.The binary vector with the maximum statistic value is determined as the parity check vector and the generator matrix can be naturally derived from it.The computational complexity of the WHT-based method increases with the speed of 2 to the power of the length of the parity check vector.Consequently,the computation load is hardly tolerable in practice,especially when the constraint length is not known priori.The segmented WHT-based method is proposed in [5].The segmented WHT-based method occupies less computer storage room than the WHTbased method.However,its computation complexity is the same as the original one.An identification method based on the maximum likelihood estimation is proposed in [6].It also needs to test all the candidates in the vector space,thus the computational complexity is still very high.The above methods achieve the identification of RSC code by searching for the parity check vector exhaustively,and the data that they employ is a hard-decision bit stream.It is seen that by using soft-decision data,the estimation of the parity check vector of RSC code can be achieved in the way of maximizing the log-likelihood ratio of two hypotheses in [7].Subsequently,a more concise loglikelihood ratio is proposed in [8].A new statistic value defined as likelihood difference is constructed by using soft-decision data in[9].The three methods covert the identification of RSC code to a hypothesis testing problem,but it is still required to search for the parity check vector exhaustively.In[10],a finite set of possible RSC encoders is constructed.All the candidate encoders are tested according to the established different test statistic,and the candidate having the best statistic value is regarded as the identification result.These methods can only identify a RSC encoder in the finite set,and they require to test the candidates exhaustively.By using soft-decision data,the first method breaking the limitation of exhaustive search is proposed in[11],which is based on the expectationmaximization(EM)algorithm[12].However,it is not applicable in high bit error rates.An identification method based on the gradient search is proposed in[13].Compared with the method based on the EM algorithm,it has higher correct identification rates.Nevertheless,like other soft-decision data based methods,it still has the limitation of being applicable only in the cases of binary modulations.The convolutional neural network is used to distinguish Turbo code from block code and convolutional code in[14],but it does not discuss the identification of the generator matrix of a RSC encoder.In this paper,by using soft-decision data,a novel identification method is proposed based on the cuckoo search algorithm.The method does not need to search for the parity check vector of a RSC encoder exhaustively,and the prior knowledge of the constraint length is not required.In addition,it has higher robustness compared with the method in [13],and can be applied in the multiple modulation cases.
The contents are organized as follows.In section II,the identification model is established according to the parity check relation of RSC code.In section III,by using soft-decision data,solving the identification model is converted to minimize the derived cost function based on the cuckoo search algorithm.The experimental results are shown in section IV.Conclusions are given in section V.
The Figure 1 shows the structure of a parallel concatenated Turbo encoder.
Figure 1.Sturcture of a Turbo encoder.
As shown in Figure 1,RSC code with the code rate 1/2 is the sub-code of a Turbo code.Consequently,the identification of RSC code with 1/2 code rate is a key step for the identification of a Turbo code.Additionally,for the identification of RSC code with the code rate 1/n,it can be converted to the identification of several RSC codes with the code rate 1/2.So the identification of RSC code with the code rate 1/2 is discussed.
Figure 2.The flow chart of the optimizing procedure.
The task of identifying a RSC encoder is to estimate its generator matrixG(D).The generator matrix of RSC code with the code rate 1/2 can be denoted as
wheregi(D)=gi,0+gi,1D+···+gi,mDm,(i=1,2)is the generator polynomial.
For RSC code with the code rate 1/2,there exists an unique parity check matrixH(D)which is orthogonal to the generator matrix,i.e.,
Through combining(1)and(2),it can be obtained that
Therefore,the estimation of the generator matrix can be achieved by estimating the parity check matrix.
The relationship between the encoder input sequenceU(D) and the output sequenceC(D) can be expressed as
From(2)and(4),it is derived that
According to(5),the following parity check equation set can be established.
The solution of the parity check equation set is the parity check vector of the encoder,and it consists of coefficients of the generator polynomials.Through solving the equation set,the generator matrix can be estimated directly.
In this section,a novel scheme for solving parity check equations is presented.Firstly,according to the parity check equation set,shown in(6),a cost function is established.In the cost function,each unknown variable represents the probability that the corresponding coefficient of a generator polynomial equals 1.Thus,solving parity check equations is converted to minimize the cost function.Then,the standard cuckoo search(CS)algorithm is employed to optimize the cost function.Subsequently,the computational complexity of the scheme is analyzed.
In practice,bit errors in the received sequence lead to invalid equation in the parity check equation set.Since the number of invalid equations is less than that of valid equations,solving the parity check equation set is to find the vector that makes the number of valid equations largest,which is equal to maximize the joint probability that the equations hold,i.e.,
By using the soft-decision datar,which are the realistic output data of the correlator in demodulator,the estimation of the generator polynomials can be achieved through maximizing the posterior probabilities that the equations hold,i.e.,
Since
the above maximization problem can be converted the following minimization problem.
Theorem 1.[15] Let b1,b2,···,bq are independent variables in GF(2).Then,
Since all of the encoded bits and the coefficients of the generator polynomials are independent from each other,according to Theorem 1,the probabilityPcan be simplified further as
where the variablesη1,η2,···,η2(m+1)denote the probabilitiesP(g2,m=1),P(g1,m=1),···,P(g1,0=1)respectively.
Subsequently,the analytic expression of the posterior probabilityis derived.For any M-ary modulation,there exists a one-to-one mapping between transmitted symbols and binary bits.For instance,in quadrature phase shift keying (QPSK),a symbol corresponds to a combination of two bits.LetHp,(p=1,2,···,M)represent the different hypotheses of a transmitted symbol.Thus,the probabilitycan be expressed as
Furthermore,it can be derived that
If the prior probabilities of transmitted symbolsP(Hp),(p=1,2,···,M) are equal,then equation(14)can be simplified as
According to the hypothesisHp,the likelihood functioncan be obtained directly.For example,in QPSK modulation,the soft-decision datar=[r1;r2]have the following four possible relations with the noise under different hypotheses of the transmitted symbol.
wheren1,iandn2,irepresent the Gaussian white noise.
Consequently,combining (12),(13) and(15),we can derive the analytic expression of the cost function,as shown in (17).
Through minimizing the cost function,the estimation ofη1,η2,···,η2(m+1)are estimated.Then according to the following decision-rule,the coefficients of the generator polynomials are estimated.
The standard CS algorithm is used to minimize the cost function.The CS algorithm belong to a kind of meta-heuristic algorithms.According to different behaviors and phenomenon in reality,various kinds of meta-heuristic algorithms have been proposed,such as the genetic algorithm(GA),the particle swarm optimization (PSO) algorithm,the African vulture optimization algorithm(AVOA)[16]and the artificial gorilla troops optimization (GTO) algorithm [17].The CS algorithm combines the cuckoo breeding behavior and the L′evy flight [18].Compared with other metaheuristic algorithms,such as the GA and the PSO algorithm,it can find the global optima with higher success rates.Thanks to its superior performance,it has been widely used to solve many engineering problems.In[19],the CS algorithm is used to solve the inversion of magnetic flux leakage signals.A modified CS algorithm is employed to optimize the dispatch of economic power in [20].In [21],the CS algorithm is adopted to schedule the semiconductor testing facilities.The block length of an asymmetric turbo code is optimized by using the CS algorithm in [22].Additionally,the CS algorithm is also employed to solve some optimization problems in the fields of artificial networks[23]and energy systems[24].
The vector|η1,η2,···,η2(m+1)|represents a location of a nest in the CS algorithm.The steps of minimizing the cost function by using the standard CS are as follows.
Step 1:Initialize the nests.The generator polynomialsg1(D) andg2(D) have the same order,and therefore,we can assign the first two elements of the nests to be the constant 1.Since 0≤ηi ≤1,(3≤i ≤2(m+1)),other elements in the initial nests are drawn randomly from the standard uniform distribution on the interval[0,1].
Step 2: The fitness of the initial nests is evaluated,and the current best nest is found.
Step 3: New nests are generated via the L′evy flight.The best nest is retained.Based on the principle of the L′evy flight,new nests are generated according to the following equation.
wherextrepresents the nest in thet-th iteration andxt+1represents the new nest.αis the step controlling parameter,and in the paper,we assume it equals the constant 0.1.Srepresents the step size vector.Each elementλiin the vectorSis drawn from a L′evy distribution and can be expressed as
Step 4: The fitness of new nests is evaluated.If the fitness of the new nest is better than that of the corresponding old one,it replaces the old nest in the population.Through evaluation,the current best nest is found.
Step 5: A fraction of nests with lower fitness is replaced by random new nests.The cuckoo’s egg can be discovered by the host with the probabilitypa,and in this paper,we assumepa=0.25.It means that 25 percent of nests with lower fitness will be replaced by random new nests in each iteration.
Step 6: The steps from 3 to 5 are carried out iteratively,until the number of iterations reaches the largest value or the best fitness exceeds the pre-defined threshold.
The final best nest is determined as the best solution,and then according to(18),we can estimate the generator polynomials of the RSC code from the solution.
The flow chart of the optimizing procedure is shown in Figure 2.
Figure 3.The value of the cost function along with iterations.
The computation of the identification method concentrates on generating new nests through the L′evy flight and evaluating the fitness of new nests.An operation(addition or multiplication) between two variables is used in this analysis.Assuming that the number of nests in the population isMand the dimension of each nest isL,thus in each iteration,generating new nests requires 9MLoperations.Assuming that the number of parity check equations isN,evaluating the fitness of new nests requires 4NML(1+pa)operations.If the number of iterations isT,the computational complexity of our identification method isO(TNML(1+pa)).
The first experiment shows the effectiveness of our method.In this experiment,the RSC code is a(2,1,4)code whose generator matrix is.The number of parity check equations is 100.The modulation type is QPSK and the the signal to noise ratio(SNR) is 10dB.In the CS algorithm,the number of nests is 25.The number of iterations in the CS algorithm is 100.We assume the constraint length of the RSC code is known priori or not respectively.In the two cases,the value of the cost function in each iteration is shown in Figure 3.
Figure 4.Comparisons of the correct identification rates of two methods.
From Figure 3,it can be seen that the value of the cost function decreases along with the increase of the number of iterations and converges to the minimal point after some iterations.The final best solutions in the two cases are [1101101011] and[11011010110000] respectively.In both of the cases,the generator matrix can be estimated correctly.The experimental results indicate that our method does not require the prior knowledge about the constraint length.When the constraint length is not known priori,the constraint length can be assumed to be a large value.In this case,the minimal point is the combination of the true solution and a zero vector,as shown in the experimental result.Since all of the coefficients of the terms whose order are larger than the constraint length equal zero,the generator still can be estimated correctly.This means our method is also applicable even without the prior knowledge of the constraint length.
In the second experiment,the correct identification rates of our method and the method in [13] are compared under the same condition.Since the method in [13] is only applicable in the case of binary phase shift keying(BPSK),the modulation type is BPSK in this experiment.The SNR ranges from 0dB to 10dB and the difference between two adjacent SNR is 1dB.Other experimental conditions are the same as those in the first experiment.The simulation is carried out for one hundred times,and the correct identification rates of our method and the method in [13] are shown in Figure 4.
As shown in Figure 4,the correct identification rates by our method is much higher than that by the method in [13] under the same condition,especially in the medium SNR.The experimental results illustrated that compared with the method in[13],our method is more robust.The cost function established is in the real domain,and according to the decision-rule,shown in(18),only if the solution falls into the certain region near the global minimum point,the generator polynomial can be estimated correctly.Additionally,the CS algorithm has a stronger capability in global optimization.These reasons lead to the high robustness of our method.
In the third experiment,the correct identification rates by using the standard CS algorithm,the standard GA and the standard PSO algorithm are compared.The parameters of the GA algorithm are as follows.The number of individuals in the population is 50.The individuals are ranked via linear ranking method and the selective pressure equals 2.The rate of individuals to be selected for generating offspring is 0.8.The probability for mutation of each variable is 1.The rate of offspring to be inserted is 0.9 in each iteration.The number of iterations in the GA is 1000.As to the PSO algorithm,the number of individuals in the population is also 50.The inertance-weight of the speed is 0.9.The individual-learning factor equals 0.3 and the population-learning factor equals 0.9.The number of iterations in the PSO algorithm is also 1000.
The modulation type is QPSK.The SNR ranges from 5.5dB to 10dB and the difference between two adjacent SNR is 0.5dB.To make a fair comparison,the number of nests and iterations in the CS algorithm is 50 and 1000 respectively.Other experimental conditions are the same as those in the first experiment.Under the different SNR,the simulation experiment is carried out for one hundred times and the correct identification rates for the RSC encoder by using these three optimization algorithms are shown in Figure 5.
As indicated in Figure 5,the correct identification rates by using the CS algorithm is higher than that by using the GA or the PSO algorithm.Compared with the GA and the PSO algorithms,less parameters need to be set in the CS algorithm,and thus it can find the global optimum with a higher success rate.
Figure 5.Comparisons of the correct identification rates by using different optimization algorithms.
In the fourth experiment,the average and standard deviation of the cost function by using these three optimization algorithms are compared.The SNR is 10dB and other parameters are the same as the above.For a certain bit sequence of RSC code,we use the CS algorithm,the GA and the PSO algorithm to minimize the cost function respectively.Each optimization algorithm is run for two hundred times.The average and standard deviation of the cost function are presented in Table 1.
Table 1.The average and standard derivation of the cost function.
From Table 1,it can be seen that the average and standard deviation obtained by the CS algorithm are smaller than those by the GA and the PSO algorithm.The results also testify that the CS algorithm has a stronger capability in global optimization.
The fifth experiment illustrates the influence of the number of coefficients 1 in the generator matrix to the correct identification rates.In the experiment,the generator matrixes to be identified arerespectively.The SNR ranges from 5dB to 10dB,and the difference between two adjacent SNR is 0.5dB.The modulation type is QPSK.The number of iterations in the CS algorithm is 1000.Other conditions are the same as those in the first experiment.The correct identification rates in the two cases are shown in Figure 6.
From Figure 6,it can be seen that with the decrease of the number of coefficient 1 in the generator matrixes,the correct identification rate increases.The reason is that in the same SNR,the more the number of coefficient 1 is,the more invalid parity check equations there are.More invalid parity check equations would make the difference between the minimum value and the subminimum value of the cost function smaller.Consequently,the optimization algorithm converges to the local minimum more easily.
Figure 6.Comparison of the correct identification ratios for two different generator matrixes.
Figure 7.Comparisons of the correct identification rates in the cases of different number of equations.
In the sixth experiment,the influence of the number of parity check equations to the correct identification rates is discussed.The identification of RSC code with the generator matrixis considered in the cases that the number of parity check equations is 100 and 500 respectively.Other conditions are the same as those in the fifth experiment.The correct identification rates in the two cases are displayed in Figure 7.
As can be seen in Figure 7,the increase of the number of the parity check equations leads to the increase of the correct identification rate.As the increase of the number of parity check equations,there exist more valid equations in the equation set and therefore,the difference between the minimum value and the subminimum value of the cost function become larger.Consequently,the difficulty in searching for correct solutions decreases.In real application,as much available data as possible should be used to construct the cost function.
A novel method for identifying RSC encoders is proposed based on the cuckoo search algorithm.The method does not need to test all the possible binary vectors exhaustively for estimating the parity check vector,and has good robustness against bit errors.It also does not require the prior knowledge of the constraint length.Additionally,the method overcomes the limitation that the existing soft-decision data based identification methods are only applicable in BPSK modulation and can be applied in any modulation type.The limitation of our identification method is that the prior knowledge about the type of channel code and the start point of the received codewords is required.In our identification scheme,the cost function is established according to the parity check relation of RSC code.Therefore,our identification scheme can be extended further to solve other channel code identification problems with the requirements of solving parity check equations.As future work,we plan to modify the CS algorithm or hybridize it with other optimization algorithms to improve the capability in finding global optimum in higher dimensional search space.By using modified schemes,it is expected to achieve the identification of low density parity check(LDPC)code and Reed Solomon(RS)code with more robustness to noise.