Article Open Access March 18, 2023

The Efficiency of the Proposed Smoothing Method over the Classical Cubic Smoothing Spline Regression Model with Autocorrelated Residual

1 Department of Statistics, University of Abuja, Abuja, Nigeria
2 Department of Economic and Social Research, National Institute for Legislative and Democratic Studies, Abuja, Nigeria
Page(s): 19-37
Received: February 06, 2023
Revised: March 10, 2023
Accepted: March 16, 2023
Published: March 18, 2023
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright: Copyright © The Author(s), 2023. Published by Scientific Publications

Abstract

Spline smoothing is a technique used to filter out noise in time series observations when fitting nonparametric regression models. Its performance depends on the choice of the smoothing parameter. Most of the existing smoothing methods applied to time series data tend to overfit in the presence of autocorrelated errors. This study aims to determine the optimum performance value, goodness of fit, and model overfitting properties of the Proposed Smoothing Method (PSM), Generalized Maximum Likelihood (GML), Generalized Cross-Validation (GCV), and Unbiased Risk (UBR) smoothing parameter selection methods. A Monte Carlo experiment of 1,000 trials was carried out at three different sample sizes (20, 60, and 100) and three levels of autocorrelation (0.2, 0.5, and 0.8). The four smoothing methods' performances were estimated and compared using the Predictive Mean Squared Error (PMSE) criterion. The findings of the study revealed that, for a time series observation with autocorrelated errors: PSM provides the best-fit smoothing method for the model; the PSM does not overfit data at any of the autocorrelation levels considered; the optimum value of the PSM was attained at the weighted value of 0.04 when there is autocorrelation in the error term; and PSM performed better than the GCV, GML, and UBR smoothing methods at all the time series sizes considered (T = 20, 60, and 100). For the real-life data employed in the study, PSM proved to be the most efficient among the GCV, GML, PSM, and UBR smoothing methods compared. The study concluded that the PSM method provides the best fit as a smoothing method, works well at all autocorrelation levels (ρ = 0.2, 0.5, and 0.8), and does not overfit time-series observations. The study recommended the proposed smoothing method for time series observations with autocorrelation in the error term and for real-life econometric data.
This study can be applied to nonparametric regression, nonparametric forecasting, spatial, survival, and econometric observations.

1. Introduction

The smoothing spline is a spline consisting of piecewise third-order polynomials that pass through a set of control points. The second derivative of each polynomial is typically set to zero at the endpoints, since this provides a boundary condition that completes the system of m−2 equations. This produces a so-called "natural" cubic spline and leads to a simple tridiagonal system that can be solved efficiently to give the coefficients of the polynomials. The parameters are estimated by minimizing the residual sum of squares (RSS) plus a roughness penalty. A general measure of "fidelity to the observations" for a curve g is the residual sum of squares. If g is allowed to be any curve, unrestricted in functional form, then this distance can be reduced to zero by any g that interpolates the observations. Such a curve would not be acceptable because it is not unique and because the interpolation is dictated entirely by the structure of the data, [1]. The spline smoothing approach avoids implausible interpolation of the observations by balancing the competing tasks of producing a good fit to the data and producing a curve without too much rapid local variation. The main use of splines is interpolation, but they can also be used for parametric and nonparametric regression modeling. The most commonly used spline smoothing technique is the cubic spline. Spline smoothing offers an alternative to local polynomial regression, and it is also an attractive component of additive regression models. It is well known that correlation greatly affects the selection of the smoothing parameter, which is critical to the performance of the smoothing spline. The commonly used approach in time series analysis is the classical ARMA method, which assumes linear dependence on past values and past innovations.
Generalized Cross-Validation (GCV) and Generalized Maximum Likelihood (GML) are the most widely used spline smoothing methods for selecting an optimal value of the smoothing parameter, and they also serve as performance criteria for smoothing parameter selection. Many scholars have carried out research in this area; most of them found that time series data violate the assumed independence of regressors and error terms, which leads to an autocorrelation problem. Applying smoothing parameter estimators such as GCV and GML does not always solve this problem, because they occasionally fail to smooth adequately.

Over the last two decades, research on spline smoothing estimation methods has produced a vast amount of information and discoveries from researchers evaluating the efficiency and performance of the existing estimation techniques when autocorrelation is present in the error terms. In this research work, the proposed smoothing method is compared with three classical smoothing spline parameter selection techniques, with the intention of providing a robust smoothing parameter estimation method that alleviates the problem of overfitting models for time-series data with low, moderate, and high autocorrelation levels, as well as the problems with the smoothing methods' performance when different time series sample sizes are utilized.

Section 2 reviews the related literature. In Section 3, the cubic smoothing spline is discussed, together with methods of selecting the smoothing parameter, such as Generalized Cross-Validation, Generalized Maximum Likelihood, and Mallow's Cp criterion, and the performance evaluation criteria. The simulation study, results, and application are given in Sections 4 and 5. Finally, concluding remarks are presented in Section 6.

2. Literature Review

A lot of attention has been directed to studies on spline smoothing with autocorrelated errors. [2] made a comparison between GCV and REML and recommended both as good smoothing parameter selectors for small and medium-sized samples. [3, 4] applied the smoothing spline method to fit a curve to a noisy data set, where the selection of the smoothing parameter is essential. An improved Cp criterion for spline smoothing based on Stein's unbiased risk estimate was proposed to select the smoothing parameter; the resulting fitted curve is superior to and more stable than those from commonly used selection criteria, and possesses the same asymptotic optimality as Cp. [5] applied most of the data-driven smoothing parameter selection methods and compared them on large and small sample sizes. The parallel of Akaike's information criterion and Generalized Cross-Validation was recommended as the best selection criterion: for large samples, the GFAIC method would seem to be more appropriate, while for small samples they proposed the GCV criterion. [6] investigated two types of results that support the use of GCV for variable selection under the assumption of sparsity. The first type of result is based on the well-established links between GCV on the one hand and Mallows's Cp and the Stein Unbiased Risk Estimator on the other. The result states that GCV performs as well as Cp or SURE in a regularized or penalized least squares problem as an estimator of the prediction error, for penalties in the neighborhood of the optimal value. [7, 8, 9] investigated the behavior of the optimal values of gamma and rho to identify simple practical rules for choosing them; RGCV and modified GCV performed significantly better than GCV. Performance was defined in terms of the Sobolev error, which was shown by example to be more consistent with a visual assessment of the fit than the average squared error.
[10, 11, 12] discussed UBR and GCV for selecting the optimal knots in splines. The criteria for selecting the best model were based on the Mean Squared Error and R-square. The simulation was performed on a truncated spline function with errors generated from a Normal distribution, for varied sample sizes and error variances. The results of the simulation study showed that GCV estimates the knots more accurately than UBR. [13] considered nonparametric regression problems and developed a model-averaging procedure for smoothing spline regression. Model weights were estimated using a delete-one-out cross-validation procedure to minimize the prediction error. A simulation study was performed using a program written in R, providing a comparison of the well-known CV, GCV, and the proposed method. The model-averaging approach is straightforward to implement and gave reliable performance in simulations.

It is clear from the existing literature that the goodness-of-fit of smoothing splines for time series observations has not been investigated so far. This paper aims to present a goodness-of-fit test for time series observations using three classical cubic spline non-parametric regression functions.

3. Methodology

This section discusses the methodology applied in this research work.

3.1. Cubic Smoothing Spline Regression Model

The most common example of the smoothing spline is the cubic spline: it is the smoothing spline's functional form, a piecewise cubic function that interpolates the dataset while ensuring the smoothness of the fit. It consists of piecewise third-order polynomials that pass through a set of points, with continuous first and second derivatives, the order of continuity being (d−1), where d is the polynomial degree. The model with truncated power basis function b(x) transforms the variables Xi by applying the basis function b(x) and fits a model using these transformed variables, which adds nonlinearity to the model and enables the splines to fit smooth and flexible nonlinear functions. The spline smoothing model is written as;

y_i = f(t_i) + ε_i

Where; y_i is the response variable, f is an unknown smoothing function, t_i is the independent/predictor variable, and ε_i is a zero-mean autocorrelated stationary process.

The general cubic spline function is given as;

f(t) = at³ + bt² + ct + d + ε

Where; a, b, c, and d are real-number coefficients with a ≠ 0, t is the independent variable, ε is the error term, and d.f. = k − d − 1 (k = number of knots and d = degree of the cubic spline).

The cubic smoothing spline estimate of f is f̂, the minimizer, over all twice-differentiable functions f, of;

S(f) = Σ_{i=1}^{n} (y_i − f(t_i))² + λ ∫_a^b (f''(t))² dt

Where;

  • λ is a smoothing parameter,
  • The first term in equation (3) is the residual sum of squares, which measures fidelity to the data.
  • The roughness penalty in the second term of equation (3) is large when the integrated squared second derivative f''(t) is large.
  • If λ approaches 0, then f(t) simply interpolates the data set.
  • If λ is very large, then f(t) is chosen so that f''(t) is everywhere 0, which yields the overall linear least-squares fit to the observations.

If the values f(t_1), ..., f(t_n) are fixed, the roughness ∫_a^b (f''(t))² dt is minimized by a natural cubic spline; this solution can be written in terms of basis functions.
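The trade-off in criterion (3) can be illustrated numerically. The paper's computations were done in R; the following is an illustrative Python sketch, and it replaces the integral roughness penalty with a squared second-difference penalty (Whittaker–Henderson smoothing), so the smoother below is a stand-in for the exact natural cubic spline, not the paper's estimator.

```python
import numpy as np

def second_difference_matrix(n):
    """(n-2) x n matrix D2 of second differences, a discrete analogue of f''."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D

def penalized_smooth(y, lam):
    """Minimise ||y - f||^2 + lam * ||D2 f||^2; closed form f = (I + lam D2'D2)^-1 y."""
    n = len(y)
    D = second_difference_matrix(n)
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

t = np.linspace(0, 1, 50)
rng = np.random.default_rng(0)
y = np.sin(2 * np.pi * t) + rng.normal(0, 0.3, t.size)

f_rough = penalized_smooth(y, 1e-6)   # lam -> 0: nearly interpolates the data
f_smooth = penalized_smooth(y, 1e8)   # lam large: nearly a straight line
```

As λ → 0 the fit reproduces the data almost exactly; as λ grows the second differences are driven toward zero and the fit approaches the straight-line least-squares solution, mirroring the two limiting cases listed above.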

3.2. Generalized Cross-Validation (GCV) Estimation Method with an Autocorrelation Structure

The term Generalized Cross-Validation (GCV) was proposed by [14] and [17] as a replacement for Cross-Validation (CV); it is the most popular method for choosing the complexity of statistical models. The basic principle of cross-validation is to leave the data points out one at a time and to choose the value of λ under which the missing data points are best predicted by the remainder of the data. To be precise, let g_λ^(−i) be the smoothing spline determined from all the data except (t_i, y_i), using the value λ for the smoothing parameter. The cross-validation choice of λ is then the value of λ that minimizes;

CV(λ) = (1/n) Σ_{i=1}^{n} (y_i − ĝ_λ^(−i)(t_i))²

Equation (4) is similar to the test for regression model estimation [16]. Define a matrix A (λ) by;

A_ij(λ) = n^{-1} g(t_i, t_j)
CV(λ) = (1/n) Σ_{i=1}^{n} (y_i − ĝ(t_i))² / (1 − A_ii(λ))²

Wang, Meyer & Opsomer (2013) [15] also proposed the application of a related criterion, referred to as Generalized Cross-Validation, obtained from equation (6) by replacing A_ii(λ) with its mean value, n^{-1} tr A(λ); this gives the score;

GCV(λ) = n^{-1} RSS(λ) / (1 − n^{-1} tr A(λ))²

Where; RSS(λ) refers to the residual sum of squares. [17] also gave theoretical arguments to show that GCV should pick a near-optimal value of λ, in the sense of minimizing the mean squared error (MSE) at the design points. Published practical examples bear out its good performance [18]. The Generalized Cross-Validation technique is well known for its optimality properties [19]. For any given n × n influence matrix, we have;

(f̂_{n,λ}(t_1), f̂_{n,λ}(t_2), ..., f̂_{n,λ}(t_n))^T = S_λ y

Therefore W₀(λ) can be rewritten as;

W₀(λ) = Σ_{k=1}^{n} (Σ_j a_kj y_j − y_k)² / (1 − a_kk)²

Here, Generalized Cross-Validation is a modified form of Cross-Validation, the customary method for estimating the smoothing parameter. The GCV score is constructed by analogy with the CV score, which is obtained from the ordinary residuals by dividing them by 1 − (S_λ)_ii. The accepted device of GCV is to replace each factor 1 − (S_λ)_ii in Cross-Validation with the mean value 1 − n^{-1} trace(S_λ). Consequently, summing the squared residuals and dividing by the factor {1 − n^{-1} trace(S_λ)}², by analogy with conventional cross-validation, the GCV smoothing technique is written mathematically as;

GCV(λ) = (1/n) Σ_{k=1}^{n} (y_k − f̂_k(x_k))² / (1 − n^{-1} trace(S_λ))²
GCV(λ) = n^{-1} ‖(I − S_λ)y‖² / (n^{-1} trace(I − S_λ))²

Where; n is the number of observations in the data set, λ is the smoothing parameter, and (S_λ)_ii is the ith diagonal element of the smoothing matrix S_λ.
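The GCV score above is straightforward to evaluate for any linear smoother f̂ = S_λ y. The following is an illustrative sketch (in Python rather than the paper's R), with a second-difference penalised smoother standing in for the cubic smoothing-spline hat matrix; the score itself follows the formula in the text.

```python
import numpy as np

def second_diff(n):
    """(n-2) x n second-difference matrix."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D

def gcv_score(y, lam):
    """GCV(lam) = n^-1 RSS(lam) / (1 - n^-1 tr S_lam)^2 for f_hat = S_lam y."""
    n = len(y)
    D = second_diff(n)
    S = np.linalg.inv(np.eye(n) + lam * D.T @ D)   # stand-in hat matrix S_lam
    rss = np.sum((y - S @ y) ** 2)
    return (rss / n) / (1.0 - np.trace(S) / n) ** 2

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 60)
y = np.sin(2 * np.pi * t) + rng.normal(0, 0.3, t.size)

grid = 10.0 ** np.arange(-4, 5)           # candidate smoothing parameters
scores = np.array([gcv_score(y, lam) for lam in grid])
lam_gcv = grid[np.argmin(scores)]          # GCV choice of lambda
```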

The first research on cross-validation was conducted by [20] and was subsequently extended to the smoothing of the log periodogram [21]. The term Generalized Cross-Validation (GCV) was coined by [20]. As noted above, the GCV score is obtained from the CV score by replacing the factors 1 − (S_λ)_ii with the mean value 1 − n^{-1} tr(S_λ), so that the squared residuals are summed and divided by {1 − n^{-1} tr(S_λ)}². Now consider spline smoothing for non-parametric estimation of a regression function in a time series setting, and suppose that the response variable y_i is observed at time t_i, for i = 1, ..., n, with the y_i generated by a model of the form

y_i = f(t_i) + Z(t_i)

Where f(·) refers to the smoothing function and Z(t_i) refers to a zero-mean autocorrelated stationary process. The t_i are fixed but not necessarily uniformly spaced, with t_1 < ... < t_n.

If the Z(t_i) in (12) have a known correlation function, with Cov(Z(t_i), Z(t_j)) = σ²v_ij, a natural extension of the usual smoothing spline approach is to estimate f by the f̂ which minimizes;

(y − f)^T W (y − f) + λ ∫_a^b {f''(t)}² dt

Here, the minimization is over all suitably smooth functions f, with W = V^{-1}, V = [v_ij], y = (y_1, ..., y_n)^T, and f = (f(t_1), ..., f(t_n))^T. It has been proven that the minimizing f̂ remains a natural cubic spline with knots at the t_j. Also, if f̂ denotes the vector with ith element f̂(t_i), then there is a matrix S_λ such that f̂ = S_λ y, i.e. for fixed λ, the estimate is a linear function of y. This linearity suggests a close connection between spline smoothing and kernel smoothing, as shown explicitly in [22]. One approach to choosing the parameter λ is to minimize the generalized cross-validation score [17]. In the current setting, the natural extension of this approach is to minimize equation (13); this gives a technique for estimating g in the presence of a known autocorrelation structure. Concerning interval estimation of g, the Bayesian formulation presented by [22] extends to the correlated case: V is replaced by Silverman's inverse diagonal weighting matrix, which gives the posterior variance matrix, written as;

Var(f̂) = σ² A(λ) V

The minimization of GCV (λ) as proposed by [23] and [24] is written as;

GCV(λ) = (y − f̂)^T W (y − f̂) / [trace(I − S_λ)]²

Where; (S_λ)_ii is the ith diagonal element of the smoother matrix, W is derived from the correlation function, y = (y_1, ..., y_n)^T, and f = (f(t_1), ..., f(t_n))^T.
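A minimal sketch of the weighted criterion (13) under an assumed AR(1) correlation structure, again in Python with a discrete second-difference penalty standing in for the integral: the minimiser has the closed form f̂ = (W + λD'D)^{-1} W y with W = V^{-1}.

```python
import numpy as np

def ar1_correlation(n, rho):
    """AR(1) correlation matrix V with V[i, j] = rho^|i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def weighted_penalized_smooth(y, rho, lam):
    """Minimise (y - f)' W (y - f) + lam * ||D2 f||^2 with W = V^-1."""
    n = len(y)
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    W = np.linalg.inv(ar1_correlation(n, rho))   # W = V^-1
    return np.linalg.solve(W + lam * D.T @ D, W @ y)
```

With ρ = 0 the weight matrix reduces to the identity, and the ordinary (unweighted) penalised fit is recovered.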

3.3. Generalized Maximum Likelihood (GML) Estimation Method with Autocorrelation Structure

[25] proposed the GML technique for correlated data, which involves a single smoothing parameter. However, in the case of a bivariate model there are two smoothing parameters, which should be estimated along with the covariance parameters. Following a similar derivation, GML is given as;

GML(λ) = y^T(I − S_λ)y / [det⁺(I − S_λ)]^{1/(n−m)}

Where; det⁺(I − S_λ) refers to the product of the (n − m) non-zero eigenvalues of I − S_λ. [25] provided a Bayesian model for the GML method's general framework, from which posterior confidence intervals for a spline estimate can be calculated. Suppose that the data are generated by;

y_i = f(t_i) + ε_i,  i = 1, 2, ..., n,  t_i ∈ [0, 1]

Where; ε = (ε_1, ..., ε_n)^T ~ N(0, σ²W^{-1}), independent of f. Model (17) is usually referred to as a Bayesian model; it can also be viewed as a hierarchical model or a mixed-effects model. This Bayesian model is similar to the model illustrated by [19], though the residuals are correlated. Based on the justification of [19], it can be shown that;

lim_{n→∞} E(f(t) | y) = f̂(t) and lim_{n→∞} cov(f | y) = σ²W^{-1}

Where; f = (f(t_1), ..., f(t_n)), and a diffuse prior is assumed for the polynomial coefficients of degree smaller than m.

According to [25], the covariance matrix W^{-1} depends on a correlation parameter vector τ. Typical covariance structures are first-order autoregressive for time-series observations, compound symmetry or unstructured for repeated measurements, and spatial structures for spatial data. GML with an autocorrelation structure is therefore given by;

GML(λ) = y^T W(I − S_λ)y / [det⁺(W(I − S_λ))]^{1/(n−m)}

Where; det⁺(W(I − S_λ)) is the product of the (n − m) nonzero eigenvalues of W(I − S_λ), λ is the smoothing parameter, W is the correlation structure, S_λ is the smoother matrix, n = n₁ + n₂ is the number of observation pairs, and m is the number of zero eigenvalues.
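The GML score can be sketched numerically by assembling I − S_λ and taking the product of its nonzero eigenvalues for det⁺. The Python sketch below takes W as the identity (independent errors) and uses the second-difference penalised smoother as a stand-in for the cubic smoothing-spline matrix, so it only illustrates the mechanics of the criterion; here m = 2 counts the null-space (zero-eigenvalue) modes.

```python
import numpy as np

def gml_score(y, lam, m=2):
    """GML(lam) = y'(I - S)y / det+(I - S)^(1/(n-m)) with W = I (a simplification)."""
    n = len(y)
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    S = np.linalg.inv(np.eye(n) + lam * D.T @ D)
    A = np.eye(n) - S                        # I - S_lam (symmetric here)
    num = y @ A @ y
    eig = np.linalg.eigvalsh((A + A.T) / 2.0)
    nonzero = eig[eig > 1e-9]                # drop the m (numerically) zero eigenvalues
    det_plus_root = np.prod(nonzero ** (1.0 / (n - m)))   # det+^(1/(n-m))
    return num / det_plus_root
```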

3.4. Unbiased Risk (UBR) Estimation Method with Autocorrelation Structure

Unbiased Risk is also known as Mallow's Cp criterion; it was developed by [26] to evaluate the fit of regression models estimated by Ordinary Least Squares (OLS). It is used in model-selection situations where subsets of the explanatory variables are available to predict the outcome, in order to locate the best model associated with a subset of the independent variables. The smaller the value of Cp, the more accurate the model; Cp is written numerically as;

UBR(λ) = ‖(S_λ − I)y‖² / trace(I − S_λ)

[25] provides the UBR technique, which can be used effectively to choose a smoothing parameter for cubic spline smoothing with non-Gaussian data. It was developed using the Predictive Mean Square Error (PMSE).

The Unbiased Risk with Autocorrelation structure can be written mathematically as;

UBR(λ) = (1/n)‖W_k^{1/2}(I − S_λ)y‖² / [(1/n) trace(W_k^{-1}(I − S_λ))]²;  k = 0, 1, 2

Where; n is the number of measurements/observations {x_i, y_i}, W is the autocorrelation structure, λ is the smoothing parameter, and S_λ is the smoother matrix with ith diagonal element (S_λ)_ii.

3.5. Proposed Smoothing Method (PSM) with Autocorrelation Structure

A smoothing spline model is usually written as:

y_i = f(x_i) + ε_i

Where; y refers to the response variable, x refers to a predictor variable, f is the regression function, and ε_i is the error term.

There are several options to consider whenever model (22) is used for nonlinearity, including transformation of the observations and additive terms such as cubic splines and spline smoothing. This research work focuses on spline smoothing because it captures nonlinearity in the regression curve by introducing bends created by hinge functions, and the locations of these bends in the fit are called knots.

The primary purpose of traditional regression analysis is to minimize the residual sum of squares (RSS); the model with the minimum RSS is the preferred model. It is important to note that [27] proposed Cross-Validation (CV) as a technique for estimating the spline smoothing parameter, in which a leave-one-out residual is used instead of the RSS of customary simple regression.

In this manner, an improved spline smoothing technique is proposed by combining, with weights k and 1 − k, the properties and qualities of the UBR and GCV criteria [28], [29]. The combination of the two smoothing methods' quantities results in an optimally performing smoothing method whose model does not overfit time-series observations. The minimizer is the Proposed Smoothing Method (PSM) with autocorrelation structure, given as;

PSM = (k) × [overfitting and optimal knot detector] + (1 − k) × [best for forecasting non-Gaussian data]

PSM(λ) = k · (y − f̂)^T W(y − f̂) / [trace(I − S_λ)]² + (1 − k) · (1/n)‖W_g^{1/2}(I − S_λ)y‖² / [(1/n) trace(W_g^{-1}(I − S_λ))]²

The behavior of the minimized λ in the UBR and GCV techniques under the alternative value g = 1 as the optimum value of PSM yields;

PSM(λ) = k · (y − f̂)^T W(y − f̂) / [trace(I − S_λ)]² + (1 − k) · (1/n)‖W^{1/2}(I − S_λ)y‖² / [(1/n) trace(W^{-1}(I − S_λ))]²

The proposed method for estimating f is given in (27), subject to the condition that 0 < g < 1 is chosen, using the algorithm in Section 3.6 [30, 31, 32].

Where; n is the number of data points, k is the weighted value, W = V^{-1} is the correlation matrix for the error term, y = (y_1, ..., y_n)^T, f̂ = (f̂(t_1), ..., f̂(t_n))^T = S_λ y is the vector of fitted values, S_λ is the smoothing matrix, and ‖W^{1/2}(I − S_λ)y‖ is the Euclidean norm of the vector W^{1/2}(y − f̂).
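Since PSM is defined as a k-weighted hybrid of the GCV and UBR scores, it can be sketched directly from the two criteria. The Python sketch below uses the identity for W and the same stand-in second-difference smoother as earlier, so it illustrates only the weighting idea; the paper's reported optimum k = 0.04 is used as the default.

```python
import numpy as np

def psm_score(y, lam, k=0.04):
    """PSM(lam) = k * GCV(lam) + (1 - k) * UBR(lam), with W = I (a simplification)."""
    n = len(y)
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    S = np.linalg.inv(np.eye(n) + lam * D.T @ D)   # stand-in hat matrix
    rss = np.sum((y - S @ y) ** 2)
    gcv = (rss / n) / (1.0 - np.trace(S) / n) ** 2
    ubr = rss / np.trace(np.eye(n) - S)
    return k * gcv + (1.0 - k) * ubr
```

By construction, the hybrid score always lies between the pure GCV score (k = 1) and the pure UBR score (k = 0).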

3.6. Proposed Smoothing Method (PSM) Algorithm

Step 1: Read the simulated sample data (x_i, y_i) for i = 1, ..., T and, for each of the pre-selected smoothing parameters λ_1, ..., λ_t, calculate the respective set of smoothing spline estimates f_λ = (f̂_λ1, ..., f̂_λt).

Step 2: For the given λ, σ, and T, use the data in Step 1 to fit a curve and estimate one step ahead by linear extension, obtaining f(x_i) and f̂(x_i).

Step 3: Insert the weighted value (k) of the coefficients of GCV and UBR

Step 4: Obtain the predictive mean square error PMSE(f̂_λ) = Σ_{i=1}^{t} (f(x_i) − f̂(x_i))² for these points.

Step 5: Add all values of the PMSEs to get the resulting PSM value for the given λ and ρ.

Step 6: Repeat Steps 1–5 1,000 times.

3.7. Monte Carlo Simulation study

This part presents the outcome of a Monte Carlo simulation study. The study was conducted to assess the performance of the four smoothing techniques described in this research, namely GML, GCV, UBR, and PSM. The dataset was generated using a program written in R (version 3.2.3) for time-series sample sizes of 20, 60, and 100. The experiment was replicated 1,000 times for each of the sample sizes. The Predictive Mean Squared Error (PMSE), adjusted R-square, and predicted R-square were utilized to assess the smoothing techniques' quality and performance for each simulated dataset.

3.8. Equation used to generate the value in the data

The data generation study performed to assess and measure the performance of the four spline smoothing methods is given as;

y_t = 2 sin(πt) + ε_t,  T = 20, 60, and 100

Where; π = 180°, and ε_t ~ N(0, σ²W^{-1}) is a first-order autoregressive process AR(1) with a mean of 0, a standard deviation of 0.8, and autocorrelation levels (ρ) of 0.2, 0.5, and 0.8, with a 95% confidence limit. Note that ε_t = ρε_{t−1} + v_t and v_t ~ N(0, σ²).
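This data-generating process can be sketched as follows (in Python; the study itself used R). The original display of the deterministic part is garbled, so the sine term below, one half-cycle over the sample, is an assumption; the AR(1) recursion follows ε_t = ρε_{t−1} + v_t exactly as stated.

```python
import numpy as np

def ar1_errors(n, rho, sigma, rng):
    """AR(1) errors: e_t = rho * e_{t-1} + v_t, with v_t ~ N(0, sigma^2)."""
    e = np.empty(n)
    v = rng.normal(0.0, sigma, n)
    e[0] = v[0]
    for t in range(1, n):
        e[t] = rho * e[t - 1] + v[t]
    return e

def simulate_series(T, rho, sigma=0.8, seed=0):
    """y_t = signal + AR(1) noise; the half-cycle sine signal is an assumption."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, T + 1)
    signal = 2.0 * np.sin(np.pi * t / T)   # assumed reading of "2 sin(pi t)"
    return signal + ar1_errors(T, rho, sigma, rng)
```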

3.9. Experimental design and data generation

The experimental design adopted in this study is;

  • Three-time-series samples (T) of 20, 60, and 100 were considered in the data generation
  • Three autocorrelation levels were considered, i.e. ρ = 0.2, 0.5 and 0.8
  • One standard deviation value was considered, i.e. σ= 0.8
  • The dataset was simulated with 1,000 replications in each of the 3 × 3 × 4 × 1 = 36 combinations of the T's, ρ's, λ's, and σ's.

All the selected parameters in the experimental design are similar to the ones used in [33].

3.10. Smoothing Spline Assessment methods used in this Study

Efforts were made in this study to examine and compare the strength of the four spline smoothing estimators, namely; Generalized Cross-Validation (GCV), Unbiased risk (UBR), Generalized Maximum Likelihood (GML), and the Proposed Smoothing Method (PSM) developed by taking the weighted hybrid of GCV and UBR.

(i) Predictive Mean Square Error

A comparison was made to test the four estimation methods' effect and performance in the presence of autocorrelated errors. The estimation, and the comparison of the effect of different autocorrelation levels on the four estimation methods (Generalized Cross-Validation (GCV), Generalized Maximum Likelihood (GML), the Proposed Smoothing Method (PSM, 0 < k < 1), and Unbiased Risk (UBR)), was performed using code written in R-console. Data generation was carried out for each of the four methods (V, M, P, and U), and the evaluation and comparison of the four spline smoothing estimation methods were investigated by applying the asymptotic sampling properties of the criterion given as the Mean Square Prediction Error (MSPE).

The Predictive Mean Squared Error (PMSE) of a smoothing curve or model-fitting process, according to [26] and [4], is the expected value of the squared difference between the fitted value f̂(x_i) and the observed value f(x_i). It is utilized to estimate the performance and attributes of smoothing methods such as Cross-Validation, Generalized Cross-Validation, and Generalized Maximum Likelihood. The Predictive Mean Square Error (PMSE) is written numerically as;

PMSE(λ) = E[Σ_{i=1}^{n} (f(x_i) − f̂(x_i))²]

The Predictive Mean Square Error is usually separated into two parts: the first part is the sum of squared biases of the fitted values, and the second part is the sum of variances of the fitted observations.

Where;

f(x_i) = observed value

f̂(x_i) = predicted or estimated value

At each scenario of specification, for instance, time-series size (T) = 20, autocorrelation level (ρ) = 0.2, d.f. = 1, and standard deviation (σ) = 0.8, the smoothing methods were tested and compared using the asymptotic properties of the estimators based on the PMSE criterion.
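The PMSE computation itself is a short routine; the sketch below (Python) follows the display above, with the expectation replaced by an average over Monte Carlo replications.

```python
import numpy as np

def pmse(f_true, f_hat):
    """Sum of squared gaps between the true curve f(x_i) and the fit f_hat(x_i)."""
    f_true = np.asarray(f_true, dtype=float)
    f_hat = np.asarray(f_hat, dtype=float)
    return float(np.sum((f_true - f_hat) ** 2))

def monte_carlo_pmse(per_replication_pmses):
    """Average per-replication sums to estimate E[sum (f - f_hat)^2]."""
    return float(np.mean(per_replication_pmses))
```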

(ii) Test for Over-fitting in Spline Smoothing

In statistics, overfitting occurs when a model fails to fit additional data or to predict future observations reliably. PRESS and the predicted R-square are the simplest ways to detect overfitting in smoothing methods and models. The result may be interpreted by simply comparing the predicted R-square to the ordinary R-square and observing whether there is a great difference between the two. If there is a large difference between the two values, the model does not predict new observations as well as it fits the training data, and the model is likely overfitted. An overfit model has too many terms and begins to fit the random noise in the sample, and random noise cannot be predicted. The predicted R-square is a statistic that determines how well a model predicts the response for new observations. It is, in effect, a leave-one-out measure, computed by systematically removing each observation from the data set, re-estimating the regression model, and determining how well the model forecasts the removed observation. The predicted R-square is usually written mathematically as;

Pred. R-Square = [1 − (Predicted residual sum of squares (PRESS) / Sum of squares total)] × 100

While R-square, also known as the coefficient of determination, can be derived through;

R-Square = 1 − (Sum of squares error / Sum of squares total)
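For a linear smoother or regression fit ŷ = Sy, PRESS can be computed without any refitting, since the leave-one-out residual has the closed form e_i / (1 − S_ii). A Python sketch of the predicted R-square and ordinary R-square follows (dropping the ×100 percentage scaling).

```python
import numpy as np

def predicted_r_square(y, S):
    """Pred. R^2 = 1 - PRESS / SS_total, PRESS = sum (e_i / (1 - S_ii))^2."""
    y = np.asarray(y, dtype=float)
    resid = y - S @ y
    h = np.diag(S)                         # leverages S_ii
    press = np.sum((resid / (1.0 - h)) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / sst

def r_square(y, S):
    """Ordinary R^2 = 1 - SSE / SS_total for the same linear fit."""
    y = np.asarray(y, dtype=float)
    sse = np.sum((y - S @ y) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - sse / sst
```

A predicted R-square well below the ordinary R-square is the overfitting signal described above.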

(iii) Test for Goodness-of-fit for the Smoothing Methods

The goodness-of-fit of the smoothing methods explains how well the methods fit the simulated and real-life data. It also summarizes the differences between the observed value and predicted or estimated values. The Adjusted R-square was used to determine the best-fit smoothing methods. It is written mathematically as;

Adjusted R-Square = 1 − [(1 − R²) × (n − 1) / (n − p)]

Where; n = the number of observations and p = the number of parameters.
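The adjusted R-square formula as stated translates directly to code; a minimal Python sketch:

```python
def adjusted_r_square(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p), as stated in the text."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p)
```

The adjustment shrinks R² toward zero as the parameter count p grows relative to n, penalizing needlessly complex fits.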

4. Results and Discussion

Table 1, Table 2 and Table 3 present the summary fit results of the smoothing spline regression model and the model performance criteria, i.e. the PMSE, multiple R-square, adjusted R-square, and predicted R-square, based on the time-series period (T = 60), four degrees of smoothing (D.S. = 1, 2, 3, and 4), and autocorrelation level (ρ = 0.5). The results revealed that all the coefficients of the smoothing methods' parameters were significant (P-value < 0.001, < 0.01, and < 0.05).

The PMSE of the four smoothing techniques indicated that the Proposed Smoothing Method (PSM = 0.18) had the smallest PMSE of 0.757980 at T = 60, D.S. = 2, and ρ = 0.5. This was closely followed by UBR, with a PMSE of 1.017353 at T = 60, D.S. = 2, and ρ = 0.5, and then GML, with a PMSE of 1.300494 at T = 60, D.S. = 2, and ρ = 0.5. The result implies that the Proposed Smoothing Method (PSM = 0.18) performs better than the other smoothing methods at a time series size of T = 60 and rho = 0.5.

The adjusted R-square result showed that the Proposed Smoothing Method (PSM = 0.18) had the largest value of 0.8095 at T = 60, D.S. = 2, and ρ = 0.5, closely followed by the Proposed Smoothing Method (PSM = 0.20) with a value of 0.7879 at T = 20, D.S. = 2, and ρ = 0.5, and then the GCV smoothing method with 0.7828 at T = 60, D.S. = 2, and ρ = 0.5. It can be inferred from this result that the Proposed Smoothing Method (PSM = 0.18) provides the best fit to the time-series observations at a time-series size of T = 60 and rho = 0.5.

It can be seen from the results presented in Table 1, Table 2 and Table 3 that the difference between the multiple R-square and predictive R-square of the Proposed Smoothing Method was the least when compared to the other smoothing methods. At T = 60, D.S. = 1, 2, 3, and 4, and ρ = 0.5, the differences between the multiple R-square and predictive R-square were 0.3669, 0.0364, 0.4599, and 0.1759, respectively. This result shows that the Proposed Smoothing Method does not overfit the time series observations when the time-series size is 60 and rho = 0.5.

Figure 1, Figure 2 and Figure 3 below clearly show the comparisons of the behavior of the cubic smoothing spline selected by GCV, GML, and MCP for sample sizes 20, 60, and 100, respectively. It was observed that the observed value of Generalized Cross-Validation was closer to the fitted/estimated value when compared to the other methods.

5. Application

H0: The data are independently distributed, or the correlations in the population from which the samples are drawn are zero

H1: The data are not independently distributed; they exhibit serial correlation

Decision: autocorrelation exists in the model

H0: The observations are not stationary; there exists a unit root

H1: The observations are stationary; there is no unit root

Decision: The data is stationary, there is no unit root

Table 6 above presents the predictive mean square error of the real-life data on the Standard International Trade Classification (SITC) export and import price indices in Nigeria between 2001 and 2020. It was discovered that the Proposed Smoothing Method (PSM) had the least predictive mean square error (PMSE), a confirmation that it is the preferred smoothing method for simulated and real-life data. The result also presented the multiple, adjusted, and predictive R-square. It can be inferred from the adjusted R-square of the proposed smoothing method, 59.6%, that it has the best fit among the four smoothing methods.

The plot above presents the smoothing curve for the annual Standard International Trade Classification import price index in Nigeria from 1970 to 2020. The data were first tested for stationarity and autocorrelation. As can be seen from Figure 4, the proposed smoothing method, with optimal smoothing parameter λ = 0.062439908 at the weighted value k = 0.04, was applied to the residuals to detect disturbances or errors in the stationary part of the series. The PSM curve lies very close to the real-life data and provides a good fit.

The plot above presents the smoothing curve for the annual Standard International Trade Classification export price index in Nigeria from 1970 to 2018. The curve presented in Figure 4 indicates that the proposed smoothing method, with optimal smoothing parameter λ = 0.062439908 at the weighted value k = 0.04, smooths the residuals for disturbances or errors in the stationary part of the series. The PSM curve lies very close to the real-life data and provides a good fit.

6. Conclusion

The simulation and real-life results of this study provide insight into which smoothing method produces the best-fitting model for time-series observations, which method does not overfit the data, the optimum value of the proposed smoothing method, and the performance of the smoothing methods when autocorrelation is present in the error term.

The goodness-of-fit results revealed that the proposed smoothing method had the best-fitting model among the competing smoothing methods on both the simulated and the real-life data. The proposed smoothing method fitted without defect or shortcoming under the cubic spline functional form, with the highest adjusted R-square of 0.9618 at T = 20, D.S. = 4, ρ = 0.2, and the weight value k = 0.04.

The findings on the effect of autocorrelation in the error terms of the four smoothing methods considered showed that the proposed smoothing method (PSM) works well at all levels of autocorrelation (ρ = 0.2, 0.5, and 0.8). It also provided better estimates, proved preferable to GML, GCV, and UBR, and did not overfit a time-series observation with autocorrelation in the error term, with a predictive R-square of 0.6218. This result broadly agrees with [25] and [35] for GML but differs from [24, 36, 37, 38, 39].

The study of the optimum value of the Proposed Smoothing Method (PSM) indicated that it performs optimally at k = 0.04, with a predictive mean square error of 0.046857, a multiple R-square of 0.9678, an adjusted R-square of 0.9618, and a predictive R-square of 0.6428.

The results on the effect of sample size on the performance of the four smoothing methods show that the proposed smoothing method is computationally more efficient and consistent, and works well at all sample sizes (T = 20, 60, and 100) in the Monte Carlo experiment. The plots and results presented in the tables and in Figures 1–4 indicate that GML, GCV, and UBR showed signs of inefficiency at all the time-series sizes (T = 20, 60, and 100). This finding differs from [25, 35] and [40].

The results also showed the Proposed Smoothing Method (PSM) to be the most efficient of the four competing smoothing methods on the real-life data. This result disagrees with the finding of [41].

Conflicts of Interest: The authors declare no conflict of interest.

References

  1. Wahba, G. & Wang, Y. (1993). Behavior near Zero of the Distribution of GCV Smoothing Parameter Estimates for Splines. Statistics and Probability Letters, (25), 105 – 111.[CrossRef]
  2. Aydin, D. & Memmedli, M. (2011). Optimum Smoothing Parameter Selection for Penalized Least-squares in the Form of Linear Mixed Effect Models. Optimization, iFirst: 1–18.
  3. Chen, C.S. & Huang, H.C. (2011). An improved Cp Criterion for Spline Smoothing. Journal of Statistical Planning and Inference, 144(1), 445 – 471.[CrossRef]
  4. Adams, S.O., Ipinyomi, R.A. (2019). A Proposed Spline Smoothing Estimation Method for Time Series Observations. International Journal of Mathematics and Statistics Invention (IJMSI), 07(02), 18-25.
  5. Aydin, D., M. Memmedli, and R. E. Omay. (2013). Smoothing Parameter Selection for Nonparametric Regression using a Smoothing Spline. European Journal of Pure and Applied Mathematics, 6, 222–38.
  6. Jansen, Maarten (2015). Generalized Cross Validation in Variable Selection with and Without Shrinkage. Journal of Statistical Planning and Inference, 159, 90-104. https://doi.org/10.1016/j.jsp.2014.10.007[CrossRef]
  7. Lukas, M.A., De Hoog, F.R. & Anderssen R.S. (2016). Practical Use of Robust GCV and Modified GCV for Spline Smoothing. Computational Statistics. 31(1), 269-289.[CrossRef]
  8. Adams, S.O. (2021). An Improved Spline Smoothing Method for Estimation in the Presence of Autocorrelation Errors. The University of Ilorin.
  9. Adams, S.O., Balogun, P.O. (2020). Panel Data Analysis on Corporate Effective Tax Rates of Some Listed Large Firms in Nigeria. Dutch Journal of Finance and Management, 4(2), 1-9, 2542–4750. https://doi.org/10.21601/djfm/9345[CrossRef]
  10. Devi, A.R., Budiantara, I.N. & Vita-Ratnasari, V. (2018). Unbiased Risk and Cross-Validation Method for Selecting Optimal Knots in Multivariable Nonparametric Regression Spline Truncated (Case Study: The Unemployment Rate in Central Java, Indonesia, 2015). AIP Conference Proceedings 2021.[CrossRef]
  11. Adams, S.O., Ipinyomi, R.O., Yahaya, H.U. (2017). Smoothing Spline of ARMA Observations in the Presence Autocorrelation Error. European Journal of Statistics and Probability, 5(1), 1-8.
  12. Adams, S.O., Yahaya, H.U., Nasiru, M.O. (2017). Smoothing Parameter Estimation of the Generalized Cross Validation and Generalized Maximum Likelihood. IOSR Journal of Mathematics, 13(1), 41 – 44. https://doi:10.9790/5728-1301054144[CrossRef]
  13. Xu, L. & Zhou, J. (2019). A Model-Averaging Approach for Smoothing Spline Regression. Communications in Statistics - Simulation and Computation. 48(8), 2438 – 2451.[CrossRef]
  14. Wahba, G. (1977). Applications of Statistics in P. Krishnaiah Edition, A Survey of Some Smoothing Problems and the Method of Generalized Cross-Validation for solving them, Northern Holland, Amsterdam.
  15. Wang, H., Meyer M. C. & Opsomer J.D. (2013). Constrained Spline Regression in the Presence of AR(p) Errors. Journal of Nonparametric Statistics, 25, 809 – 827.[CrossRef]
  16. Cook, R.D. & Weisberg, S. (1982). Residuals and Influence in Regression, Journal of the American Statistical Association, 86, 328–332.[CrossRef]
  17. Craven P. & Wahba, G. (1979). Smoothing Noisy Data with Spline Functions, Numerical Mathematics, 31, 377 – 403.[CrossRef]
  18. Xiang, D. & Wahba, G. (1998). Approximate Smoothing Spline Methods for Large Data Sets in the Binary Case. Proceedings of the 1997 ASA Joint Statistical Meetings, Biometrics Section 94-98.
  19. Wahba, G. (1990). Spline Models for Observational Data. CBMS-NSF Regional Conference Series in Applied Mathematics, Philadelphia: SIAM 59.[CrossRef]
  20. Wahba, G. (1975). Optimal Convergence Properties of Variable Knot Kernel and Orthogonal Series Methods for Density Estimation, Annals of Statistics, 3, 15 – 29.[CrossRef]
  21. Wahba, G. (1980). Automatic Smoothing of the Log Periodogram. Journal of the American Statistical Association, 75, 122-132.[CrossRef]
  22. Silverman, B.W. (1984). Spline Smoothing: The Equivalent Variable Kernel Method. Annals of Statistics, 12(3), 898 – 916.[CrossRef]
  23. Wahba, G. (1983). Bayesian Confidence Intervals for the Cross-Validated Smoothing Spline. Journal of Royal Statistical Society (Series B), 45, 133-150.[CrossRef]
  24. Diggle, P.J. & Hutchinson, M.F (1998). On Spline Smoothing with Autocorrelated Errors. Australian Journal of Statistics, 31, 166 –182.[CrossRef]
  25. Yuedong, W. (1998). Smoothing Spline Models with Correlated Random Errors. Journal of American Statistical Association. 93(441), 341 – 348.[CrossRef]
  26. Mallows, C.L. (1973). Some Comments on Cp, Technometrics, 15(4), 661 – 675.[CrossRef]
  27. Wahba, G. (1979). Convergence Rates of Thin Plate Smoothing Splines when the Data are Noisy in T. Gasser and M. Rosenblatt. Smoothing Techniques for Curve Estimation, Springer-Verlag, New York.[CrossRef]
  28. Adams, S.O., Gayawan, E., Garba, M.K. (2009). Empirical Comparison of the Kruskal - Wallis Statistics and its Parametric Counterpart. Journal of Modern Mathematics and Statistics, 3(2), 38 – 42. Medwell Journal. https://doi:jmmstat.2009.38.42.
  29. Adams, S.O., Ipinyomi, R.A. (2019). A New Smoothing Method for Time Series Data in the Presence of Autocorrelated Error. Asian Journal of Probability and Statistics (AJPAS), 04(04), 1-19. https://doi.org/10.9734/ajpas/2019/v4i430121[CrossRef]
  30. Adams, S.O., Ipinyomi, R.A. (2020). On the Efficiency of the Weighted Generalized Cross Validation and Unbiased Risk Smoothing Method for Time Series Observations with Autocorrelated Error. International Journal of Academic and Applied Research, 04(07), 70-81.
  31. Adams, S.O., Yahaya, H.U. (2020). Comparative Study of GCV-MCP Hybrid Smoothing Methods for Predicting Time Series Observations. American Journal of Theoretical and Applied Statistics, 9(5), 219-227. https://doi:10.11648/j.ajtas.20200905.15[CrossRef]
  32. Adams, S.O., Obaromi, A.D, Alumbugu, A.I. (2021). The goodness of Fit test of an Autocorrelated Time Series Cubic Smoothing Spline Model. Journal of the Nigerian Society of Physical Sciences. 3(3), 191-200. https://doi.org/10.46481/jnsps.2021.265.[CrossRef]
  33. Wahba, G. (1985). A Comparison of GCV and GML for Choosing the Smoothing Parameters in the Generalized Spline Smoothing Problem. The Annals of Statistics 4, 1378 – 1402.[CrossRef]
  34. Daniel, C. (1973). One at a time Plans. Journal of American Statistical Association, 68 (342), 353 – 360.[CrossRef]
  35. Yuedong, W., Wensheng G. & Brown, M.B. (2000). Spline Smoothing for Bivariate Data with Application to Association between Hormones, Statistica Sinica, 10, 377 – 397.
  36. Hart, J. D. & Wehrly, T. E. (1986). Kernel Regression Estimation using Repeated Measurements Data. Journal of the American Statistical Association, 81, 1080 – 1088.[CrossRef]
  37. Altman, N.S. (1990). Kernel Smoothing of Data with Correlated Errors. Journal of the American Statistical Association. 85, 749–759.[CrossRef]
  38. Herrmann, E., Gasser, T. & Kneip, A. (1992). Choice of Bandwidth for Kernel Regression When Residuals are Correlated. Biometrika, 79(4), 783–795.[CrossRef]
  39. Krivobokova T. & Kauermann, G. (2007). A Note on Penalized Spline Smoothing with Correlated Errors. Journal of the American Statistical Association, 102, 1328 – 1337.[CrossRef]
  40. Kim, T., Park, B., Moon, M. & Kim, C. (2009). Using Bimodal Kernel for Inference in Nonparametric Regression with Correlated Errors. Journal of Multivariate Analysis, 100 (7), 1487 – 1497.[CrossRef]
  41. Carew, J. D., Wahba, G., Xie, X., Nordheim, E.V. & Meyerand M. E. (2003). Optimal Spline Smoothing of fMRI Time Series by Generalized Cross-Validation. NeuroImage, 18(4), 950 – 961.[CrossRef] [PubMed]

Cite This Article

Adams, S. O., & Asemota, O. J. (2023). The Efficiency of the Proposed Smoothing Method over the Classical Cubic Smoothing Spline Regression Model with Autocorrelated Residual. Journal of Mathematics Letters, 1(1), 19-37. https://doi.org/10.31586/jml.2023.618