DOI https://doi.org/10.18551/rjoas.2017-03.11
RANK-BASED ESTIMATION FOR ASYMMETRIC PRICE TRANSMISSION MODELLING
Acquah De-Graft Henry, Associate Professor Department of Agricultural Economics and Extension, University of Cape Coast,
Cape Coast, Ghana E-mail: [email protected]
ABSTRACT
This paper introduces and compares Rank-based estimation and the Ordinary Least Squares (OLS) method for estimating the Granger and Lee asymmetric price transmission model when the true data generating process is known. Monte Carlo simulation results indicate that, for normal data, the estimates of the coefficients of the asymmetric price transmission model derived from the least squares and the Rank-based estimation methods are accurate and equivalent or close to their true values regardless of sample size. For data with outliers, the least squares method is affected by the outliers and yields inaccurate estimates of the coefficients across the various sample sizes. Rank-based estimation remains robust to outliers in large samples and provides estimates of the coefficients that are accurate and nearly equivalent to their true values. The evidence from the Monte Carlo experimentation suggests that the proposed Rank-based estimation is likely to do no worse than OLS with normal data and promises to do better when the data contain outliers within the asymmetric price transmission modelling context.
KEY WORDS
Monte Carlo Simulation, Rank-based Estimation, Granger and Lee Asymmetry, Ordinary Least Squares Estimation, Outlier.
An econometric methodology for investigating asymmetric adjustments that relies on ordinary least squares (OLS) estimation was proposed by Granger and Lee (1989). Consequently, the OLS estimation of the Granger and Lee asymmetric model has been widely employed in modelling price asymmetry. However, OLS estimation is only optimal under certain assumptions, for example when the normality assumption on the error term and the other assumptions of the linear regression model hold. Under circumstances commonly encountered with economic data, such as contaminated data (data with outliers), some of the underlying assumptions under which OLS is efficient are violated.
In the presence of outliers, OLS may produce incorrect asymmetric adjustment coefficients, which form the basis for detecting asymmetry. In effect, wrong conclusions may be drawn from the Granger and Lee asymmetric model and its variants when the underlying assumptions of OLS are violated due to the presence of outliers. For example, Douglas (2010) finds outliers in price data used in asymmetric price transmission analysis. He detects asymmetry in the contaminated data but, on dropping the outliers, finds no evidence of asymmetry. Kind (2015) also notes that data used in agricultural price analysis from developing countries are more often found to have outliers and, as a result, estimates obtained from such contaminated data lose their value. In a spatial price analysis of selected agricultural commodities, Karfakis and Rapsomanikis (2007) noted that error correction modelling methods are not robust to the presence of outliers in the price series.
An alternative approach to estimating the Granger and Lee asymmetric model whilst concurrently accommodating outliers in the data is to employ Rank-based regression, which remains robust to the presence of outliers. This approach has been met with success in the estimation of linear models in the presence of outliers, as detailed in Jureckova (1971); Jaeckel (1972); McKean and Hettmansperger (1978); and Kloke and McKean (2015).
Rank-based regression estimation provides valuable information even if the data are contaminated with outliers, as is commonly encountered in price data used in asymmetric price transmission analysis. However, little effort has been made to compare the Rank-based estimation and OLS methods for asymmetric price transmission analysis in the presence of outliers. The purpose of this research is therefore to investigate, by use of Monte Carlo methods, the performance of the Rank-based regression and OLS methods in estimating the Granger and Lee asymmetry using data with and without outliers.
The paper is organised as follows. The introduction is followed by the methodology, which discusses the Granger and Lee asymmetric model, Ordinary Least Squares (OLS) and Rank-based regression. The results and discussion present Monte Carlo simulations of the Granger and Lee asymmetric model and demonstrate the ability of Ordinary Least Squares and Rank-based regression to recover the true values of the Granger and Lee asymmetric data generating process. Finally, the paper ends with a conclusion.
METHODS OF RESEARCH
Granger and Lee Asymmetry. Granger and Lee (1989) Asymmetric Error Correction Model data generating process can be specified as follows:
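$\Delta y_t = \beta_1 \Delta x_t + \beta_2^{+}(y - x)^{+}_{t-1} + \beta_2^{-}(y - x)^{-}_{t-1} + \varepsilon_{1,t}, \qquad \varepsilon_{1,t} \sim N(0, \delta^2)$ (1)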
If y and x are variables integrated of the order one processes [I (1)] that are cointegrated, then there exists a long run equilibrium relationship between y and x which is captured by a symmetric error correction term (y-x). Asymmetric adjustments can be introduced by decomposition of the error correction term into positive and negative components as follows:
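$(y - x)^{+}_{t} = \begin{cases} (y - x)_t, & \text{if } (y - x)_t > 0 \\ 0, & \text{otherwise} \end{cases}$ (2)
$(y - x)^{-}_{t} = \begin{cases} (y - x)_t, & \text{if } (y - x)_t < 0 \\ 0, & \text{otherwise} \end{cases}$ (3)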
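The decomposition in eqs. (2) and (3) can be illustrated with a minimal Python sketch. The function name `split_ect` and the sample deviations are hypothetical and serve only as an illustration; they are not taken from the study.

```python
def split_ect(ect):
    """Split an error correction term series into positive and negative parts.

    Follows eqs. (2)-(3): each component keeps the value when its sign
    condition holds and is zero otherwise.
    """
    ect_pos = [e if e > 0 else 0.0 for e in ect]
    ect_neg = [e if e < 0 else 0.0 for e in ect]
    return ect_pos, ect_neg


# Hypothetical deviations (y - x)_t, chosen only for illustration.
ect = [0.4, -1.2, 0.0, 2.5, -0.3]
ect_pos, ect_neg = split_ect(ect)
print(ect_pos)  # [0.4, 0.0, 0.0, 2.5, 0.0]
print(ect_neg)  # [0.0, -1.2, 0.0, 0.0, -0.3]
```

By construction the two components add back to the original term, so no information in the error correction term is lost by the split.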
Asymmetry is incorporated by allowing the speed of adjustment to differ for the positive and negative components of the Error Correction Term (ECT) since the long run relationship captured by the ECT is symmetric. The test for symmetry in eq. (1) is conducted by determining whether the coefficients ($\beta_2^{+}$ and $\beta_2^{-}$) are identical (that is, $H_0: \beta_2^{+} = \beta_2^{-}$). The Granger and Lee asymmetric error correction model in eq. (1) can be considered as a standard regression model as specified below:
$y = X\beta + \varepsilon_t, \qquad \varepsilon_t \sim \text{iid } N(0, \sigma^2)$ (4),
Where: y is the response variable and X is the matrix of explanatory variables, defined to include the asymmetric adjustment terms. Consequently, the parameters of the Granger and Lee asymmetric model can be estimated using either the Ordinary Least Squares method or the Rank-based technique. Empirical results are computed for the two techniques and compared.
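The paper does not state which statistic is used for the symmetry hypothesis. As an illustrative assumption, one common choice for a single linear restriction of this kind is a t-ratio on the difference between the two adjustment coefficients, using the estimated variances and covariance of the two estimates:

```latex
t = \frac{\hat{\beta}_2^{+} - \hat{\beta}_2^{-}}
         {\sqrt{\widehat{\operatorname{Var}}(\hat{\beta}_2^{+})
              + \widehat{\operatorname{Var}}(\hat{\beta}_2^{-})
              - 2\,\widehat{\operatorname{Cov}}(\hat{\beta}_2^{+}, \hat{\beta}_2^{-})}}
```

A large absolute value of this ratio would count as evidence against $H_0: \beta_2^{+} = \beta_2^{-}$, that is, as evidence of asymmetric adjustment.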
Ordinary Least Squares Estimation (OLS). Suppose a data set is made up of n observations $\{Y_i, X_i\}_{i=1}^{n}$. Each observation has an outcome variable $Y_i$ and a vector of p explanatory variables $X_i$. Here, the outcome variable is a linear function of the explanatory variables in a linear regression model. In this regard, the general model can be of the form:
$Y = f(X_1, X_2, X_3) + \varepsilon$ (5)
This could be re-written as:
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon$ (6),
Where: $\beta_i$, i = 0, 1, 2, 3 are unknown parameters. Eq. (6) is written in a matrix form as:
$Y_i = X_i^{T}\beta + \varepsilon_i$ (7)
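where $X = \begin{pmatrix} 1 & X_{11} & \cdots & X_{1p} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & X_{n1} & \cdots & X_{np} \end{pmatrix}$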
The column of ones incorporates the intercept term. Estimation of $\beta$ from eq. (7) involves minimizing the sum of squared errors as follows:
$\sum \varepsilon_i^2 = \varepsilon^{T}\varepsilon = (y - X\beta)^{T}(y - X\beta)$ (8)
Differentiating with respect to $\beta$ and setting the derivative to zero, we find that $\hat{\beta}$ satisfies:
$X^{T}X\hat{\beta} = X^{T}y$ (9)
Under the Gauss-Markov theorem, the OLS estimator $\hat{\beta}$ is the Best Linear Unbiased Estimator (BLUE) of $\beta$. It is given by the explicit formula:
$\hat{\beta} = (X^{T}X)^{-1}X^{T}y$ (10)
The matrix $(X^{T}X)^{-1}X^{T}$ is referred to as the Moore-Penrose pseudo-inverse of X (when X has full column rank). After estimation of $\beta$, the fitted values (or predicted values) from the regression will be:
$\hat{y} = X\hat{\beta} = X(X^{T}X)^{-1}X^{T}y$ (11)
Here $\hat{\beta}$ is unbiased and has variance $(X^{T}X)^{-1}\sigma^2$.
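From the above, to estimate the variance $\sigma^2$, we find that $E(\hat{\varepsilon}^{T}\hat{\varepsilon}) = \sigma^2(n - p)$. This suggests that the estimator $\hat{\sigma}^2 = \hat{\varepsilon}^{T}\hat{\varepsilon}/(n - p) = \mathrm{RSS}/(n - p)$ is an unbiased estimator of $\sigma^2$, where $n - p$ is the degrees of freedom of the model.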
It is assumed that the errors are independent and identically distributed (iid) with mean 0 and variance $\sigma^2$, thus $\varepsilon \sim N(0, \sigma^2 I)$. Since $y = X\beta + \varepsilon$, it follows that $y \sim N(X\beta, \sigma^2 I)$, which is a compact description of the regression model.
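The OLS steps above (the normal equations in eq. (9) and the unbiased variance estimate) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code used in the study: the helper names `solve_linear` and `ols` and the small data set are hypothetical, X is assumed to already contain the column of ones, and p is taken as the number of columns of X.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(X, y):
    """Return (beta_hat, sigma2_hat) for y = X beta + e.

    X is a list of rows that already includes the column of ones.
    """
    n, p = len(X), len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve_linear(xtx, xty)  # normal equations, eq. (9)
    resid = [yi - sum(b * v for b, v in zip(beta, row))
             for row, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - p)  # residual sum of squares / (n - p)
    return beta, sigma2


# Hypothetical data lying exactly on y = 1 + 2x, so the fit recovers (1, 2).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
beta_hat, sigma2_hat = ols(X, y)
print(beta_hat, sigma2_hat)
```

Solving the normal equations directly, rather than forming the inverse in eq. (10), gives the same estimate when X has full column rank and avoids an explicit matrix inversion.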
Rank-Based Regression. Consider a linear model of the form:
$Y_i = \alpha + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + \varepsilon_i$ for $i = 1, \ldots, n$. (12)
From eq. (12), the reduced form can be stated as:
$Y_i = \alpha + x_i^{T}\beta + \varepsilon_i$ for $i = 1, \ldots, n$. (13),
Where: $Y_i$ is a continuous response variable, $x_i$ is a vector of explanatory variables, $\alpha$ is the intercept parameter, $\beta$ is a vector of coefficients, and $\varepsilon_i$ is the error term. The errors are assumed to be iid with a continuous probability density function, f(x).
In matrix notation, eq. (13) can be stated as:
$Y = \alpha 1 + X\beta + \varepsilon$ (14),
Where: $Y = [Y_1, \ldots, Y_n]^{T}$ is an n x 1 vector of responses, 1 is an n x 1 vector of ones, $X = [x_1, \ldots, x_n]^{T}$ is an n x p design matrix, and $\varepsilon = [\varepsilon_1, \ldots, \varepsilon_n]^{T}$ is an n x 1 vector of error terms.
Comparing the least squares estimator and the rank estimator, the least squares estimator minimizes the Euclidean distance between Y and $\hat{Y}_{LS} = X\hat{\beta}_{LS}$. The R estimator, on the other hand, is obtained through Jaeckel's (1972) dispersion function, given by:
$D(\beta) = \|Y - X\beta\|_{\varphi}$ (15)
$D(\beta)$ is a convex function of $\beta$ and provides a robust measure of distance between Y and $X\beta$. $\|\cdot\|_{\varphi}$ is a pseudo-norm defined as:
$\|u\|_{\varphi} = \sum_{i=1}^{n} a(R(u_i))\,u_i$ (16)
In eq. (16), $R(u_i)$ is the rank of $u_i$ among $u_1, \ldots, u_n$, the scores are generated as $a(i) = \varphi\left(\frac{i}{n+1}\right)$, and $\varphi$ is a nondecreasing score function defined on the interval (0, 1). The R estimator of $\beta$ is defined as:
$\hat{\beta}_{\varphi} = \operatorname{Argmin} \|Y - X\beta\|_{\varphi}$ (17)
The estimator in eq. (17) is consistent and asymptotically normal, with distribution given by:
$\hat{\beta}_{\varphi} \sim N(\beta, \tau_{\varphi}^2 (X^{T}X)^{-1})$ (18),
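Where: $\tau_{\varphi}$ is the scale parameter. The intercept and slope estimates jointly follow an asymptotic normal distribution with the variance-covariance matrix:
$V_{\hat{\alpha}, \hat{\beta}_{\varphi}} = \begin{bmatrix} \kappa_n & -\tau_{\varphi}^2\, \bar{x}^{T}(X^{T}X)^{-1} \\ -\tau_{\varphi}^2 (X^{T}X)^{-1}\bar{x} & \tau_{\varphi}^2 (X^{T}X)^{-1} \end{bmatrix}$
Where: $\kappa_n = n^{-1}\tau_S^2 + \tau_{\varphi}^2\, \bar{x}^{T}(X^{T}X)^{-1}\bar{x}$. The vector $\bar{x}$ is the vector of column averages of X and $\tau_S$ is the scale parameter associated with the intercept.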
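To make the dispersion function in eqs. (15)-(17) concrete, the following is a minimal Python sketch for a single regressor. It rests on several assumptions that are not stated in the paper: Wilcoxon scores, $\varphi(u) = \sqrt{12}(u - 0.5)$, are used as the score function; the slope is found by a ternary search over a fixed interval, which is valid because the dispersion is convex; and the intercept is taken as the median of the residuals. The function names and the simulated data are hypothetical.

```python
import math
import random
import statistics


def wilcoxon_scores(n):
    """Scores a(i) = phi(i / (n + 1)) with phi(u) = sqrt(12) * (u - 0.5)."""
    return [math.sqrt(12.0) * (i / (n + 1) - 0.5) for i in range(1, n + 1)]


def dispersion(beta, x, y):
    """Jaeckel dispersion D(beta) = sum_i a(R(e_i)) * e_i for one regressor."""
    resid = [yi - beta * xi for xi, yi in zip(x, y)]
    scores = wilcoxon_scores(len(resid))
    # Sorting the residuals pairs the k-th smallest residual with score a(k).
    return sum(a * e for a, e in zip(scores, sorted(resid)))


def rank_fit(x, y, lo=-10.0, hi=10.0, iters=100):
    """Minimise the convex dispersion over [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if dispersion(m1, x, y) < dispersion(m2, x, y):
            hi = m2
        else:
            lo = m1
    beta = 0.5 * (lo + hi)
    # The dispersion does not depend on the intercept, so the intercept is
    # estimated afterwards as the median of the residuals.
    alpha = statistics.median(yi - beta * xi for xi, yi in zip(x, y))
    return alpha, beta


# Hypothetical data: y = 0.7 + 0.5 x + N(0, 1) noise, nine responses shifted by 20.
random.seed(0)
n = 200
x = [random.uniform(0.0, 10.0) for _ in range(n)]
y = [0.7 + 0.5 * xi + random.gauss(0.0, 1.0) for xi in x]
for i in range(9):
    y[i] += 20.0
alpha_hat, beta_hat = rank_fit(x, y)
print(round(alpha_hat, 2), round(beta_hat, 2))
```

Because each residual enters the dispersion weighted by a bounded score rather than by its own magnitude, a few very large residuals have limited influence on the minimiser. With several regressors a general-purpose convex optimiser would be needed in place of the one-dimensional search.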
RESULTS AND DISCUSSION
Comparison of Rank-based and Ordinary Least Squares Estimation. The Granger and Lee asymmetric error correction model data generating process is specified as follows:
$\Delta y_t = 0.7 + 0.5x_t - 0.25(y_t - x_t)^{+}_{t-1} - 0.75(y_t - x_t)^{-}_{t-1} + \varepsilon$ (19),
where $y_t$ and $x_t$ are generated as non-stationary variables that are integrated of order one. There exists a cointegrating relationship between y and x, which is defined by the error correction term $(y_t - x_t)_{t-1}$. The positive and negative components of the error correction term are denoted by $(y_t - x_t)^{+}_{t-1}$ and $(y_t - x_t)^{-}_{t-1}$. For the normal data, the errors are generated from a normal distribution with a mean of 0 and a variance of 1 [$\varepsilon \sim N(0, 1)$]. For the data with outliers, nine of the errors generated for the normal data are replaced with nine observations drawn from a normal distribution with a mean of 20 and a variance of 1 [$\varepsilon \sim N(20, 1)$].
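The data generating process and the contamination scheme can be sketched as a recursive simulation. The paper does not give the simulation code, so the sketch below rests on labelled assumptions: x is generated as a Gaussian random walk, the short-run regressor enters in first differences as in eq. (1), both series start at zero, and the nine contaminated errors are the first nine draws. The function name `simulate_ecm` is hypothetical.

```python
import random


def simulate_ecm(n, n_outliers=0, seed=0):
    """Simulate n steps of the asymmetric ECM; returns levels x, y and errors."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Assumption: the contaminated errors are the first n_outliers draws,
    # replaced by values from N(20, 1).
    for i in range(n_outliers):
        eps[i] = rng.gauss(20.0, 1.0)
    x, y = [0.0], [0.0]  # hypothetical starting levels
    for t in range(n):
        ect = y[-1] - x[-1]           # error correction term (y - x)_{t-1}
        dx = rng.gauss(0.0, 1.0)      # x is a random walk, hence I(1)
        dy = (0.7 + 0.5 * dx
              - 0.25 * max(ect, 0.0)  # positive component
              - 0.75 * min(ect, 0.0)  # negative component
              + eps[t])
        x.append(x[-1] + dx)
        y.append(y[-1] + dy)
    return x, y, eps


x, y, eps = simulate_ecm(150, n_outliers=9, seed=1)
print(len(x), len(y), [round(e, 1) for e in eps[:3]])
```

Repeating such a draw many times with different seeds, estimating the model on each draw, and averaging the estimates is the general shape of a Monte Carlo comparison of this kind.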
In order to investigate the performance of the Rank-based estimators and OLS in estimating the true values of the asymmetric price transmission model, 1000 regressions based on the Granger and Lee model specified in eq. (19) are estimated. The Monte Carlo experimentation is conducted under different sample sizes (50, 150 and 500) and asymmetry given by $(\beta_2^{+}, \beta_2^{-}) = (\beta_2, \beta_3) = (-0.25, -0.75)$ for the normal data as well as the data with outliers. This study assigns the asymmetric adjustment parameters $(\beta_2^{+}, \beta_2^{-})$ in the spirit of Cook et al. (1999, 2000) and Acquah (2012, 2013).
The results obtained from the Monte Carlo analysis for the normal data are reported in Table 1. Results of 1000 Monte Carlo simulations indicate that, for the data without outliers (normal data) with small and moderate sample sizes (50 and 150), the coefficient estimates of the asymmetric price transmission model derived from the Rank-based analysis are accurate and close to their true parameter values, while those derived from the least squares method are accurate and equal to their true parameter values to two decimal places. Noticeably, with the large sample size (500), the estimates derived from both the least squares method and the Rank-based analysis are accurate and equivalent to their true parameter values.
In summary, Table 1 illustrates that in the absence of outliers, the OLS and Rank-based analysis performed well, with the averaged estimates all nearly equivalent or close to their true values of $\beta_0 = 0.7$, $\beta_1 = 0.5$, $\beta_2 = -0.25$, $\beta_3 = -0.75$ regardless of differences in sample sizes.
The results are consistent with Chen, Tang, Lu and Tu (2014), who noted that in the absence of outliers, OLS and Rank-based methods performed well, with the averaged estimates all nearly identical to the true values in linear regression analysis. Similarly, Ryan (1997) notes that robust methods such as Rank-based estimation methods perform almost as well as OLS when the data are free from mistakes and influential data points.
Table 1 - Normal Data (Without Outliers)

| Sample Size | Properties of Data | Method | $\hat{\beta}_0$ | $\hat{\beta}_1$ | $\hat{\beta}_2$ | $\hat{\beta}_3$ |
|---|---|---|---|---|---|---|
| N=50 | Normal | OLS | 0.70 | 0.50 | -0.25 | -0.75 |
| N=50 | Normal | Rank-Estimation | 0.70 | 0.50 | -0.25 | -0.76 |
| N=150 | Normal | OLS | 0.70 | 0.50 | -0.25 | -0.75 |
| N=150 | Normal | Rank-Estimation | 0.70 | 0.50 | -0.25 | -0.74 |
| N=500 | Normal | OLS | 0.70 | 0.50 | -0.25 | -0.75 |
| N=500 | Normal | Rank-Estimation | 0.70 | 0.50 | -0.25 | -0.75 |

Based on 1000 Monte Carlo simulations.
The results obtained from the Monte Carlo analysis for the data with outliers are reported in Table 2. Results of 1000 Monte Carlo simulations indicate that the estimates of the coefficients of the asymmetric price transmission model derived from the Rank-based analysis are accurate and close to their true parameter values for the data with outliers in the large sample (500). Noticeably, as the sample size increases from small through moderate to large, the estimated coefficients of the asymmetric price transmission model tend to move closer to their true parameter values in the Rank-based regression analysis.
In the presence of outliers, the least squares method performed poorly, as shown in Table 2. In small, moderate and large samples of 50, 150 and 500 respectively, the ordinary least squares (OLS) estimator yields parameter estimates, most notably for the intercept and the negative adjustment coefficient, that differ markedly from the true parameter values of $\beta_0 = 0.7$, $\beta_1 = 0.5$, $\beta_2 = -0.25$, $\beta_3 = -0.75$ as defined in the data generating process.
From the foregoing discussion, the results of the Rank-based analysis are similar to those of least squares and close to their true values for the data without outliers. However, for the data with outliers, least squares is affected by outliers in small, moderate and large samples, whilst the Rank-based analysis remains robust to outliers in large samples.
Table 2 - Data with Outliers

| Sample Size | Properties of Data | Method | $\hat{\beta}_0$ | $\hat{\beta}_1$ | $\hat{\beta}_2$ | $\hat{\beta}_3$ |
|---|---|---|---|---|---|---|
| N=50 | With Outliers | OLS | 3.05 | 0.50 | -0.16 | -1.08 |
| N=50 | With Outliers | Rank-Estimation | 0.85 | 0.50 | -0.24 | -0.78 |
| N=150 | With Outliers | OLS | 1.00 | 0.47 | -0.44 | -2.01 |
| N=150 | With Outliers | Rank-Estimation | 0.71 | 0.50 | -0.26 | -0.87 |
| N=500 | With Outliers | OLS | 1.00 | 0.50 | -0.26 | -0.52 |
| N=500 | With Outliers | Rank-Estimation | 0.72 | 0.50 | -0.25 | -0.73 |

Based on 1000 Monte Carlo simulations.
The results are consistent with Ryan's (1997) assertion that robust methods such as Rank-based estimation methods perform much better than OLS when the data have outliers. Similarly, Chen, Tang, Lu and Tu (2014) noted that in the presence of outliers, classic linear models yield extremely large estimates that are un-interpretable, whilst in contrast, the rank regression model generated estimates close to their true values. Atikatu (2015), in an empirical study, also showed that rank-based methods were more robust in estimation when the distribution of the error term of the dataset was non-normal and also in the presence of outlying observations, while the OLS method was very non-robust. Results from that study indicated that 5% outlier-contamination was enough to cause some instability in the estimates of the OLS method.
CONCLUSION
The performance of Rank-based estimators has been investigated in asymmetric price transmission regression modelling. The findings suggest that Rank-based estimation yields results similar to those of OLS with normal data. However, when outliers are present in the data, least squares does not provide correct estimates of the coefficients of the asymmetric price transmission model in small, moderate and large samples. Rank-based estimation, on the other hand, is robust and provides correct estimates of the coefficients of the asymmetric price transmission model in large samples. The results of the simulation indicate that Rank-based estimation can be considered an alternative to the OLS technique in asymmetric price transmission estimation and may yield accurate results in large samples when the data contain outliers.
REFERENCES
1. Acquah, H. D. (2012). A bootstrap approach to testing for symmetry in the Granger and Lee Asymmetric Error Correction Model. RJOAS, 11(11), 33-36.
2. Acquah, H. D. (2013). Using bootstrap method to evaluate the power of tests for non-linearity in asymmetric price relationship. Journal of Economics and Behavioral Studies, 5(4), 237-241.
3. Atikatu, A. (2015). Rank-based adaptive method of estimating beta. Thesis submitted to the department of mathematics, Kwame Nkrumah University of Science and Technology
4. Chen, T., Tang, W., Lu, Y., & Tu, X. (2014). Rank regression: an alternative regression approach for data with outliers. Shanghai Archives of Psychiatry, 26(5), 310.
5. Cook, S., Holly, S., & Turner, P. (1999). The Power of tests for non-linearity: the case of Granger-Lee asymmetry. Economics Letters, 62, 155-159.
6. Cook, S., Holly, S., & Turner, P. (2000). The Power of Tests for Non-linearity: The Escribano-Pfann Model. Computational Economics, 15, 223-226.
7. Douglas, C. C. (2010). Do gasoline prices exhibit asymmetry? Not usually!. Energy Economics, 32(4), 918-925.
8. Granger, C. W. J., & Lee, T. H. (1989). Investigation of production, sales and inventory relationships using multicointegration and non-symmetric error correction models. Journal of Applied Econometrics, 4(S1), S145-S159.
9. Jaeckel, L. A. (1972). Estimating Regression Coefficients by Minimizing the Dispersion of the Residuals. The Annals of Mathematical Statistics, 43, 1449-1458.
10. Jureckova, J. (1971). Non-parametric Estimate of Regression Coefficients. The Annals of Mathematical Statistics, 42, 1328-1338.
11. Karfakis, P., & Rapsomanikis, G. (2007). Margins across time and space: threshold cointegration and spatial pricing applications to commodity markets in Tanzania. In workshop on staple food trade and market policy options for promoting development in eastern and southern Africa.
12. Kind, M. (2015). Analysis of Market Integration - An Alternative Approach. MSc Thesis submitted to the Agricultural Economics and Rural Policy Group, Wageningen University, the Netherlands.
13. Kloke, J. & McKean, J. W. (2015). Nonparametric Statistical Methods Using R. New York: CRC Press. ISBN-13:978-1-4398-7343-4.
14. McKean, J. W., & Hettmansperger, T. A. (1978). Robust Analysis of the General Linear Model Based on One Step R-estimates. Biometrika, 65(3), 571.
15. Ryan, T. P. (1997). Modern regression methods. New York, NY: John Wiley & Sons, Inc.