Title |
A Study on Polynomial Neural Networks for Stabilized Deep Networks Structure |
Authors |
전필한(Jeon, Pil-Han) ; 김은후(Kim, Eun-Hu) ; 오성권(Oh, Sung-Kwun) |
DOI |
https://doi.org/10.5370/KIEE.2017.66.12.1772 |
Keywords |
Polynomial neural networks ; L2 regularization ; Least squares estimation ; Sum of Squared Coefficients (SSC) ; Deep network structure |
Abstract |
In this study, a design methodology for alleviating the overfitting problem of Polynomial Neural Networks (PNN) is realized with the aid of two techniques: L2 regularization and the Sum of Squared Coefficients (SSC). PNN is widely used as a mathematical modeling method, for example for identifying linear systems from input/output data and for regression-based prediction. PNN is an algorithm that obtains a preferred network structure by generating consecutive layers and nodes from multivariate polynomial subexpressions. It has far fewer nodes and adapts more flexibly than existing neural network algorithms. However, such algorithms are prone to overfitting because of noise sensitivity and excessive training during the generation of successive network layers. To alleviate this overfitting problem and to design the ensuing deep network structure effectively, two techniques are introduced: both SSC and L2 regularization are used in the consecutive generation of each layer and of each layer's nodes in order to construct the deep PNN structure. L2 regularization estimates minimum-size coefficients by adding a penalty term to the cost function. It is a representative method for reducing the influence of noise by flattening the solution space and shrinking the coefficients. The SSC technique minimizes the sum of squared coefficients of the polynomial instead of the squared error. As a result, the overfitting problem of the deep PNN structure is stabilized by the proposed method. This study points to the possibility of deep network structure design and big data processing, and the superiority of the network performance is shown through experiments. |
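The two techniques named in the abstract can be illustrated on a single polynomial node. The following is a minimal sketch, not the authors' implementation: it assumes a bivariate quadratic node, the standard closed-form ridge solution, and synthetic data. The function names `quadratic_features`, `fit_node`, and `ssc`, the penalty weight `lam`, and the choice to penalize the intercept along with the other coefficients are all assumptions made for illustration.

```python
import numpy as np


def quadratic_features(x1, x2):
    """Design matrix of a bivariate quadratic polynomial node (assumed form)."""
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])


def fit_node(X, y, lam=0.0):
    """Closed-form L2-regularized least squares: solve (X^T X + lam*I) w = X^T y.

    lam = 0 gives ordinary least squares estimation.
    """
    n_coef = X.shape[1]
    A = X.T @ X + lam * np.eye(n_coef)
    return np.linalg.solve(A, X.T @ y)


def ssc(coef):
    """Sum of Squared Coefficients of a fitted polynomial node."""
    return float(np.sum(coef**2))


# Synthetic noisy data, used only to make the sketch runnable.
rng = np.random.default_rng(0)
x1 = rng.uniform(-1, 1, 50)
x2 = rng.uniform(-1, 1, 50)
y = 1.0 + 2.0 * x1 - x2 + 0.5 * x1 * x2 + rng.normal(0, 0.1, 50)

X = quadratic_features(x1, x2)
coef_ls = fit_node(X, y, lam=0.0)  # plain least squares
coef_l2 = fit_node(X, y, lam=1.0)  # with L2 penalty

print("SSC without penalty:", ssc(coef_ls))
print("SSC with L2 penalty:", ssc(coef_l2))
```

In this sketch the penalty shrinks the coefficient vector, so the SSC of the regularized node is smaller than that of the unregularized one. How the paper combines SSC with node and layer selection is not specified in the abstract, so that step is left out here.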