A NEW TRANSMUTED PROBABILITY MODEL: PROPERTIES AND APPLICATIONS
Khawar Javaid, Bilal Ahmad Para*
Department of Mathematical Sciences, Islamic University of Science and Technology, Kashmir
*Corresponding author
Abstract
In this article, we introduce a new three-parameter continuous probability model by extending the two-parameter log-logistic distribution using the quadratic rank transmutation map technique. We provide a comprehensive description of the statistical properties of the new model. Robust measures of skewness and kurtosis are derived along with the moment generating function, characteristic function, reliability function and hazard rate function. The model parameters are estimated by the maximum likelihood method, and the behavior of the estimates is examined through a Monte Carlo simulation. The applicability of the distribution is illustrated with two real-life data sets, and its fit is compared with that of the base distribution.
Keywords: Transmuted Probability Model, Survival Analysis, Reliability Measures, Monte Carlo Simulation.
1. Introduction
The quality of the procedures used in a statistical analysis relies greatly upon the assumed probability model or distribution. Consequently, significant effort has been directed towards the development of large classes of standard distributions, along with relevant statistical methodologies, to serve as models for a wide variety of real-world phenomena. However, many important situations exist where real data do not follow any of the classical or standard models. In the work that follows, we obtain a three-parameter Generalized Log-Logistic Distribution (GLLD) by utilizing the Quadratic Rank Transmutation Map (QRTM) technique proposed by Shaw and Buckley [1]. Transmutation has been an active area of research recently. Ashour and Eltehiwy [2] introduced a generalization of the exponentiated modified Weibull distribution using the transmutation technique. Aryal and Tsokos [3] introduced the transmuted extreme value distribution. Merovci [4, 5] studied the transmuted Lindley and Rayleigh distributions. Here we study the three-parameter GLLD and derive its characteristics and structural properties.
According to the Quadratic Rank Transmutation Map (QRTM) technique for generalization, the cumulative distribution function (CDF) must satisfy the relationship:
$$F_t(x) = (1+\lambda)F_b(x) - \lambda\,[F_b(x)]^2 \tag{1}$$
which upon differentiation yields,
$$f_t(x) = f_b(x)\,[1 + \lambda - 2\lambda F_b(x)] \tag{2}$$
where $f_b(x)$ and $f_t(x)$ are the probability density functions corresponding to $F_b(x)$ and $F_t(x)$ respectively and $|\lambda| < 1$. $F_b(x)$ is the CDF of the base distribution. If we put $\lambda = 0$, we get the base distribution.
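The map in (1) and (2) can be sketched generically in code. The following Python fragment is only an illustration under stated assumptions: the function names `transmute_cdf` and `transmute_pdf` are hypothetical, and the unit exponential base distribution is chosen purely as a simple example and is not used elsewhere in this paper.

```python
import math

def transmute_cdf(base_cdf, lam):
    """Quadratic rank transmutation of a base CDF, following equation (1)."""
    if not -1.0 < lam < 1.0:
        raise ValueError("lam must satisfy |lam| < 1")
    def cdf(x):
        p = base_cdf(x)
        return (1.0 + lam) * p - lam * p * p
    return cdf

def transmute_pdf(base_pdf, base_cdf, lam):
    """Density of the transmuted model, following equation (2)."""
    if not -1.0 < lam < 1.0:
        raise ValueError("lam must satisfy |lam| < 1")
    def pdf(x):
        return base_pdf(x) * (1.0 + lam - 2.0 * lam * base_cdf(x))
    return pdf

# Illustrative base distribution (an assumption for this sketch): unit exponential.
def exp_cdf(x):
    return 1.0 - math.exp(-x)

def exp_pdf(x):
    return math.exp(-x)

t_cdf = transmute_cdf(exp_cdf, 0.5)
t_pdf = transmute_pdf(exp_pdf, exp_cdf, 0.5)
```

With `lam = 0` both wrappers return the base functions unchanged, which mirrors the remark that $\lambda = 0$ recovers the base distribution.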
The log-logistic distribution is a continuous probability distribution particularly useful in dealing with survival data. It is specifically used as a parametric model for events whose rate increases initially and later diminishes, for example, the mortality rate from a certain cancer following diagnosis or treatment. The probability density function (pdf) of the two-parameter log-logistic distribution is given as:
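$$f(x;\alpha,\beta) = \frac{\alpha\beta(\alpha x)^{\beta-1}}{\left(1+(\alpha x)^{\beta}\right)^{2}}, \quad x>0,\ \alpha,\beta>0 \tag{3}$$
The corresponding cumulative distribution function (CDF) is given as:
$$F(x;\alpha,\beta) = \frac{(\alpha x)^{\beta}}{1+(\alpha x)^{\beta}} \tag{4}$$
where $\alpha$ is a scale parameter while $\beta$ is a shape parameter.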
The remainder of the paper is organized as follows. In Subsection 1.1, the three-parameter Generalized Log-Logistic Distribution is derived. The statistical properties of the generalized distribution, such as the moments, moment generating function, characteristic function, order statistics, quantile function and survival measures, are presented in Section 2. Maximum likelihood estimation of the distribution parameters is described in Section 3, which also contains a Monte Carlo simulation study. Robust measures of skewness and kurtosis are presented in Section 4. Section 5 deals with the applicability of the generalized distribution, which is illustrated with two real-life data sets.
1.1 Three-Parameter Generalized Log-Logistic Distribution (GLLD)
This section deals with the study of the three-parameter Generalized Log-Logistic Distribution. Using (1) and (4), the CDF of GLLD is obtained as follows:
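$$F_t(x) = (1+\lambda)F_b(x) - \lambda\,[F_b(x)]^2$$
$$\Rightarrow F_t(x) = (1+\lambda)\left[\frac{(\alpha x)^{\beta}}{1+(\alpha x)^{\beta}}\right] - \lambda\left[\frac{(\alpha x)^{\beta}}{1+(\alpha x)^{\beta}}\right]^{2}$$
After simplifying, we obtain the CDF of the three-parameter Generalized Log-Logistic Distribution as
$$F(x;\alpha,\beta,\lambda) = \frac{(\alpha x)^{2\beta} + (1+\lambda)(\alpha x)^{\beta}}{\left(1+(\alpha x)^{\beta}\right)^{2}}, \quad x,\alpha,\beta>0 \ \&\ -1<\lambda<1 \tag{5}$$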
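Hence, the pdf of GLLD with parameters $\alpha$, $\beta$ and $\lambda$ is obtained using (5) as follows:
$$f(x;\alpha,\beta,\lambda) = \frac{d}{dx}\left[\frac{(\alpha x)^{2\beta} + (1+\lambda)(\alpha x)^{\beta}}{\left(1+(\alpha x)^{\beta}\right)^{2}}\right]$$
$$\therefore f(x;\alpha,\beta,\lambda) = \frac{\alpha\beta(\alpha x)^{\beta-1}\left\{(1+\lambda)\left(1+(\alpha x)^{\beta}\right) - 2\lambda(\alpha x)^{\beta}\right\}}{\left(1+(\alpha x)^{\beta}\right)^{3}}, \quad x,\alpha,\beta>0 \ \&\ -1<\lambda<1 \tag{6}$$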
The CDF and pdf plots for (5) and (6), for different values of the parameters involved, are shown in Figures 1 and 2 respectively. The plots show that the distribution of the three-parameter generalized log-logistic random variable $X$ is right skewed.
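Expressions (5) and (6) translate directly into code. The sketch below is a minimal Python rendering; the function names are hypothetical, and `ll_cdf` is included only so that (5) can be compared with the transmutation of the base CDF (4).

```python
def ll_cdf(x, alpha, beta):
    """Base log-logistic CDF, equation (4)."""
    t = (alpha * x) ** beta
    return t / (1.0 + t)

def glld_cdf(x, alpha, beta, lam):
    """GLLD CDF, equation (5)."""
    t = (alpha * x) ** beta
    return (t * t + (1.0 + lam) * t) / (1.0 + t) ** 2

def glld_pdf(x, alpha, beta, lam):
    """GLLD pdf, equation (6)."""
    t = (alpha * x) ** beta
    bracket = (1.0 + lam) * (1.0 + t) - 2.0 * lam * t
    return alpha * beta * (alpha * x) ** (beta - 1.0) * bracket / (1.0 + t) ** 3

# Example evaluation at an arbitrary point and parameter setting.
x, alpha, beta, lam = 0.7, 1.3, 2.1, -0.4
print(glld_cdf(x, alpha, beta, lam), glld_pdf(x, alpha, beta, lam))
```

Setting `lam = 0` in either function returns the corresponding log-logistic expression.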
2. Statistical Properties of GLLD
This section deals with the structural properties of the three-parameter GLLD, namely its moments (non-central and central), moment generating function, characteristic function, order statistics, quantile function and survival measures. These are obtained and discussed in the sub-sections that follow.
2.1 Moments
Moments refer to a set of statistical parameters that are useful in measuring a distribution. They are the crucial measures to calculate mean, variance, skewness and kurtosis of the data. Skewness deals with symmetry of a distribution, or in more precise terms, the lack of symmetry of a distribution. Kurtosis enables us to measure the peakedness or flatness of a distribution. Another interpretation of kurtosis is concerned with the heavy or light-tailed nature of the data relative to a normal distribution.
Fig. 1: CDF plots of three-parameter GLLD
Fig. 2: pdf plots of three-parameter GLLD
Theorem 1.1 gives the $r$th non-central moment of the three-parameter GLLD.
Theorem 1.1: If a random variable $X$ follows GLLD with parameters $\alpha$, $\beta$ and $\lambda$ such that $\alpha,\beta > 0$ and $|\lambda| < 1$, then the $r$th non-central moment is given by
$$\mu'_r = \frac{(1+\lambda)}{\alpha^{r}}\,B\!\left(1+\frac{r}{\beta},\,1-\frac{r}{\beta}\right) - \frac{2\lambda}{\alpha^{r}}\,B\!\left(2+\frac{r}{\beta},\,1-\frac{r}{\beta}\right) \tag{7}$$
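Proof:
We know by the definition of the $r$th raw moment that
$$\mu'_r = E(X^{r}) = \int_{0}^{\infty} x^{r} f(x;\alpha,\beta,\lambda)\,dx$$
$$\Rightarrow \mu'_r = \int_{0}^{\infty} x^{r}\,\frac{\alpha\beta(\alpha x)^{\beta-1}\left\{(1+\lambda)\left(1+(\alpha x)^{\beta}\right) - 2\lambda(\alpha x)^{\beta}\right\}}{\left(1+(\alpha x)^{\beta}\right)^{3}}\,dx$$
$$\Rightarrow \mu'_r = (1+\lambda)\int_{0}^{\infty} x^{r}\,\frac{\alpha\beta(\alpha x)^{\beta-1}}{\left(1+(\alpha x)^{\beta}\right)^{2}}\,dx - 2\lambda\int_{0}^{\infty} x^{r}\,\frac{\alpha\beta(\alpha x)^{2\beta-1}}{\left(1+(\alpha x)^{\beta}\right)^{3}}\,dx$$
Put $(\alpha x)^{\beta} = t$; we obtain $x = t^{1/\beta}/\alpha$ and $\alpha\beta(\alpha x)^{\beta-1}dx = dt$. Also, as $x \to 0$, $t \to 0$ and as $x \to \infty$, $t \to \infty$. Hence
$$\mu'_r = \frac{(1+\lambda)}{\alpha^{r}}\int_{0}^{\infty}\frac{t^{r/\beta}}{(1+t)^{2}}\,dt - \frac{2\lambda}{\alpha^{r}}\int_{0}^{\infty}\frac{t^{1+r/\beta}}{(1+t)^{3}}\,dt$$
$$\Rightarrow \mu'_r = \frac{(1+\lambda)}{\alpha^{r}}\,B\!\left(1+\frac{r}{\beta},\,1-\frac{r}{\beta}\right) - \frac{2\lambda}{\alpha^{r}}\,B\!\left(2+\frac{r}{\beta},\,1-\frac{r}{\beta}\right)$$
where
$$B(a,b) = \int_{0}^{\infty}\frac{t^{a-1}}{(1+t)^{a+b}}\,dt = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}$$
$$\therefore \mu'_r = \frac{(1+\lambda)}{\alpha^{r}}\,\frac{\Gamma\!\left(1+\frac{r}{\beta}\right)\Gamma\!\left(1-\frac{r}{\beta}\right)}{\Gamma(2)} - \frac{2\lambda}{\alpha^{r}}\,\frac{\Gamma\!\left(2+\frac{r}{\beta}\right)\Gamma\!\left(1-\frac{r}{\beta}\right)}{\Gamma(3)}$$
Thus, the $r$th non-central moment is given by the expression
$$\mu'_r = \frac{(1+\lambda)}{\alpha^{r}}\,\Gamma\!\left(1+\frac{r}{\beta}\right)\Gamma\!\left(1-\frac{r}{\beta}\right) - \frac{\lambda}{\alpha^{r}}\,\Gamma\!\left(2+\frac{r}{\beta}\right)\Gamma\!\left(1-\frac{r}{\beta}\right) \tag{8}$$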
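Using expression (8), the first two raw moments for three-parameter GLLD can be easily obtained by setting $r = 1$ and $r = 2$. These are given by:
$$\mu'_1 = \frac{(1+\lambda)}{\alpha}\,\Gamma\!\left(1+\frac{1}{\beta}\right)\Gamma\!\left(1-\frac{1}{\beta}\right) - \frac{\lambda}{\alpha}\,\Gamma\!\left(2+\frac{1}{\beta}\right)\Gamma\!\left(1-\frac{1}{\beta}\right) \tag{9}$$
$$\mu'_2 = \frac{(1+\lambda)}{\alpha^{2}}\,\Gamma\!\left(1+\frac{2}{\beta}\right)\Gamma\!\left(1-\frac{2}{\beta}\right) - \frac{\lambda}{\alpha^{2}}\,\Gamma\!\left(2+\frac{2}{\beta}\right)\Gamma\!\left(1-\frac{2}{\beta}\right) \tag{10}$$
Besides, we know that variance is given by
$$\sigma^{2} = \mu'_2 - (\mu'_1)^{2}$$
Thus, the variance of the three-parameter GLLD is given by:
$$\sigma^{2} = \frac{1}{\alpha^{2}}\left\{(1+\lambda)\,\Gamma\!\left(1+\frac{2}{\beta}\right)\Gamma\!\left(1-\frac{2}{\beta}\right) - \lambda\,\Gamma\!\left(2+\frac{2}{\beta}\right)\Gamma\!\left(1-\frac{2}{\beta}\right) - \left[(1+\lambda)\,\Gamma\!\left(1+\frac{1}{\beta}\right)\Gamma\!\left(1-\frac{1}{\beta}\right) - \lambda\,\Gamma\!\left(2+\frac{1}{\beta}\right)\Gamma\!\left(1-\frac{1}{\beta}\right)\right]^{2}\right\} \tag{11}$$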
It is important to note that for the convergence of the $r$th moment, $\left(1 - \frac{r}{\beta}\right)$ in (8) must be greater than zero. In other words, the $r$th moment converges only if $\beta > r$. Thus, existence of the mean of the proposed distribution requires $\beta > 1$, and existence of the variance requires $\beta > 2$. Similarly, the moment-based skewness and kurtosis require $\beta > 3$ and $\beta > 4$ respectively. When these moment-based measures diverge, robust quantile-based measures are employed instead.
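The moment formula can be evaluated numerically with the gamma function from the Python standard library. The sketch below implements the gamma-function form of the $r$th raw moment derived from (7), namely $\alpha^{-r}\,\Gamma(1-r/\beta)\,[(1+\lambda)\Gamma(1+r/\beta) - \lambda\Gamma(2+r/\beta)]$, and enforces the existence condition $\beta > r$. The function names are hypothetical.

```python
import math

def glld_raw_moment(r, alpha, beta, lam):
    """r-th raw moment of the GLLD; finite only when beta > r."""
    if beta <= r:
        raise ValueError("the r-th moment exists only if beta > r")
    z = r / beta
    g = math.gamma
    return g(1.0 - z) * ((1.0 + lam) * g(1.0 + z) - lam * g(2.0 + z)) / alpha ** r

def glld_variance(alpha, beta, lam):
    """Variance as mu'_2 - (mu'_1)^2; requires beta > 2."""
    m1 = glld_raw_moment(1, alpha, beta, lam)
    m2 = glld_raw_moment(2, alpha, beta, lam)
    return m2 - m1 * m1

# Example: mean and variance for an arbitrary parameter setting with beta > 2.
print(glld_raw_moment(1, 1.0, 3.0, 0.5), glld_variance(1.0, 3.0, 0.5))
```

Because $\Gamma(2+z) = (1+z)\Gamma(1+z)$, the same quantity can also be written as $\alpha^{-r}\,\Gamma(1+r/\beta)\,\Gamma(1-r/\beta)\,(1-\lambda r/\beta)$, which gives a convenient cross-check of the implementation.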
2.2 Moment generating function (mgf) and characteristic function (cf)
This sub-section contains the derivation of the mgf and cf of the three-parameter GLLD. The following theorem gives the mgf and cf of the distribution under study.
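Theorem 3.2: If a random variable $X$ follows GLLD with parameters $\alpha$, $\beta$ and $\lambda$ such that $\alpha,\beta > 0$ and $|\lambda| < 1$, then the mgf denoted by $M_X(t)$ and the cf denoted by $\psi_X(t)$ have the following forms:
$$M_X(t) = \sum_{j=0}^{\infty}\frac{t^{j}}{j!\,\alpha^{j}}\left[(1+\lambda)\,\Gamma\!\left(1+\frac{j}{\beta}\right)\Gamma\!\left(1-\frac{j}{\beta}\right) - \lambda\,\Gamma\!\left(2+\frac{j}{\beta}\right)\Gamma\!\left(1-\frac{j}{\beta}\right)\right]$$
$$\psi_X(t) = \sum_{j=0}^{\infty}\frac{(it)^{j}}{j!\,\alpha^{j}}\left[(1+\lambda)\,\Gamma\!\left(1+\frac{j}{\beta}\right)\Gamma\!\left(1-\frac{j}{\beta}\right) - \lambda\,\Gamma\!\left(2+\frac{j}{\beta}\right)\Gamma\!\left(1-\frac{j}{\beta}\right)\right]$$
Proof:
We know from the definition of mgf that
$$M_X(t) = E(e^{tX}) = \int_{0}^{\infty} e^{tx} f(x;\alpha,\beta,\lambda)\,dx$$
$$\Rightarrow M_X(t) = \int_{0}^{\infty}\sum_{j=0}^{\infty}\frac{(tx)^{j}}{j!}\,f(x;\alpha,\beta,\lambda)\,dx = \sum_{j=0}^{\infty}\frac{t^{j}}{j!}\int_{0}^{\infty} x^{j} f(x;\alpha,\beta,\lambda)\,dx = \sum_{j=0}^{\infty}\frac{t^{j}}{j!}\,\mu'_j$$
From (8), we know
$$M_X(t) = \sum_{j=0}^{\infty}\frac{t^{j}}{j!\,\alpha^{j}}\left[(1+\lambda)\,\Gamma\!\left(1+\frac{j}{\beta}\right)\Gamma\!\left(1-\frac{j}{\beta}\right) - \lambda\,\Gamma\!\left(2+\frac{j}{\beta}\right)\Gamma\!\left(1-\frac{j}{\beta}\right)\right] \tag{12}$$
which is the required mgf of the three-parameter GLLD. Also, we know that
$$\psi_X(t) = E(e^{itX})$$
$$\Rightarrow \psi_X(t) = \sum_{j=0}^{\infty}\frac{(it)^{j}}{j!\,\alpha^{j}}\left[(1+\lambda)\,\Gamma\!\left(1+\frac{j}{\beta}\right)\Gamma\!\left(1-\frac{j}{\beta}\right) - \lambda\,\Gamma\!\left(2+\frac{j}{\beta}\right)\Gamma\!\left(1-\frac{j}{\beta}\right)\right] \tag{13}$$
which is the required cf of the three-parameter GLLD. Since $\mu'_j$ is finite only for $j < \beta$, the expansions (12) and (13) are to be read as formal series.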
2.3 Order Statistics
Stated in the simplest of terms, order statistics are sample values arranged in ascending order. If $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ denote the order statistics of a random sample $X_1, X_2, X_3, \ldots, X_n$ drawn from a continuous population having CDF $F_X(x)$ and pdf $f_X(x)$, then the pdf of the $r$th order statistic $X_{(r)}$ is given by:
$$f_r(x) = \frac{n!}{(r-1)!\,(n-r)!}\,f_X(x)\,[F_X(x)]^{r-1}\,[1-F_X(x)]^{n-r}, \quad \forall\, r = 1,2,\ldots,n$$
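Using (5) and (6), the pdf of the $r$th order statistic $X_{(r)}$ for the three-parameter GLLD is obtained as:
$$f_r(x) = \frac{n!}{(r-1)!\,(n-r)!}\;\frac{\alpha\beta(\alpha x)^{\beta-1}\left\{(1+\lambda)\left(1+(\alpha x)^{\beta}\right) - 2\lambda(\alpha x)^{\beta}\right\}}{\left(1+(\alpha x)^{\beta}\right)^{3}}\left[\frac{(\alpha x)^{2\beta} + (1+\lambda)(\alpha x)^{\beta}}{\left(1+(\alpha x)^{\beta}\right)^{2}}\right]^{r-1}\left[1 - \frac{(\alpha x)^{2\beta} + (1+\lambda)(\alpha x)^{\beta}}{\left(1+(\alpha x)^{\beta}\right)^{2}}\right]^{n-r} \tag{14}$$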
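For $r = n$, we get the pdf of the $n$th or the largest order statistic $X_{(n)}$ for the three-parameter GLLD, which is obtained as follows:
$$f_n(x) = \frac{n!}{(n-1)!\,(n-n)!}\;\frac{\alpha\beta(\alpha x)^{\beta-1}\left\{(1+\lambda)\left(1+(\alpha x)^{\beta}\right) - 2\lambda(\alpha x)^{\beta}\right\}}{\left(1+(\alpha x)^{\beta}\right)^{3}}\left[\frac{(\alpha x)^{2\beta} + (1+\lambda)(\alpha x)^{\beta}}{\left(1+(\alpha x)^{\beta}\right)^{2}}\right]^{n-1}\left[1 - \frac{(\alpha x)^{2\beta} + (1+\lambda)(\alpha x)^{\beta}}{\left(1+(\alpha x)^{\beta}\right)^{2}}\right]^{n-n}$$
$$\therefore f_n(x) = \frac{n\,\alpha\beta(\alpha x)^{\beta-1}\left\{(1+\lambda)\left(1+(\alpha x)^{\beta}\right) - 2\lambda(\alpha x)^{\beta}\right\}}{\left(1+(\alpha x)^{\beta}\right)^{3}}\left[\frac{(\alpha x)^{2\beta} + (1+\lambda)(\alpha x)^{\beta}}{\left(1+(\alpha x)^{\beta}\right)^{2}}\right]^{n-1} \tag{15}$$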
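Also, for $r = 1$, we get the pdf of the first or the smallest order statistic $X_{(1)}$ for the three-parameter GLLD, which is obtained as follows:
$$f_1(x) = \frac{n!}{(1-1)!\,(n-1)!}\;\frac{\alpha\beta(\alpha x)^{\beta-1}\left\{(1+\lambda)\left(1+(\alpha x)^{\beta}\right) - 2\lambda(\alpha x)^{\beta}\right\}}{\left(1+(\alpha x)^{\beta}\right)^{3}}\left[\frac{(\alpha x)^{2\beta} + (1+\lambda)(\alpha x)^{\beta}}{\left(1+(\alpha x)^{\beta}\right)^{2}}\right]^{1-1}\left[1 - \frac{(\alpha x)^{2\beta} + (1+\lambda)(\alpha x)^{\beta}}{\left(1+(\alpha x)^{\beta}\right)^{2}}\right]^{n-1}$$
$$\therefore f_1(x) = \frac{n\,\alpha\beta(\alpha x)^{\beta-1}\left\{(1+\lambda)\left(1+(\alpha x)^{\beta}\right) - 2\lambda(\alpha x)^{\beta}\right\}}{\left(1+(\alpha x)^{\beta}\right)^{3}}\left[1 - \frac{(\alpha x)^{2\beta} + (1+\lambda)(\alpha x)^{\beta}}{\left(1+(\alpha x)^{\beta}\right)^{2}}\right]^{n-1} \tag{16}$$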
Quite evidently, for $\lambda = 0$, the order statistics of the base distribution, i.e., the Log-Logistic Distribution, are obtained.
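The general order-statistic density can be evaluated by combining the GLLD pdf and CDF with the usual combinatorial factor. The following Python sketch is illustrative only; the function names are hypothetical.

```python
import math

def glld_cdf(x, alpha, beta, lam):
    """GLLD CDF, equation (5)."""
    t = (alpha * x) ** beta
    return (t * t + (1.0 + lam) * t) / (1.0 + t) ** 2

def glld_pdf(x, alpha, beta, lam):
    """GLLD pdf, equation (6)."""
    t = (alpha * x) ** beta
    bracket = (1.0 + lam) * (1.0 + t) - 2.0 * lam * t
    return alpha * beta * (alpha * x) ** (beta - 1.0) * bracket / (1.0 + t) ** 3

def glld_order_stat_pdf(x, r, n, alpha, beta, lam):
    """pdf of the r-th order statistic in a sample of size n, equation (14)."""
    if not 1 <= r <= n:
        raise ValueError("r must satisfy 1 <= r <= n")
    coef = math.factorial(n) / (math.factorial(r - 1) * math.factorial(n - r))
    F = glld_cdf(x, alpha, beta, lam)
    f = glld_pdf(x, alpha, beta, lam)
    return coef * f * F ** (r - 1) * (1.0 - F) ** (n - r)

# Densities of the smallest and largest observations in a sample of size 5.
print(glld_order_stat_pdf(0.8, 1, 5, 1.3, 2.1, 0.6),
      glld_order_stat_pdf(0.8, 5, 5, 1.3, 2.1, 0.6))
```

A useful sanity check is that the densities of all $n$ order statistics sum to $n f(x)$ at every point $x$.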
2.4. Quantile function and random number generation
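A prominent method for generating random numbers from a specified distribution is the inverse CDF method. This method generates random numbers from a particular distribution by equating the CDF of the distribution to a number $u$, where $u$ itself follows the continuous uniform distribution $U(0,1)$. Solving the equation yields the quantile function of the distribution. Employing this inverse CDF method, we proceed to obtain the quantile function of the three-parameter GLLD as follows:
$$F(x;\alpha,\beta,\lambda) = u$$
$$\Rightarrow \frac{(\alpha x)^{2\beta} + (1+\lambda)(\alpha x)^{\beta}}{\left(1+(\alpha x)^{\beta}\right)^{2}} = u$$
$$\Rightarrow (\alpha x)^{2\beta} + (1+\lambda)(\alpha x)^{\beta} = u\left(1+(\alpha x)^{\beta}\right)^{2}$$
After simplifying, we obtain a quadratic equation in $x^{\beta}$,
$$\alpha^{2\beta}(1-u)\,x^{2\beta} + \alpha^{\beta}(1+\lambda-2u)\,x^{\beta} - u = 0$$
$$\Rightarrow x^{\beta} = \frac{-\alpha^{\beta}(1+\lambda-2u) \pm \sqrt{\alpha^{2\beta}(1+\lambda)^{2} - 4u\lambda\alpha^{2\beta}}}{2\alpha^{2\beta}(1-u)} = \frac{-(1+\lambda-2u) \pm \sqrt{(1+\lambda)^{2} - 4u\lambda}}{2\alpha^{\beta}(1-u)}$$
$$\therefore x = \frac{1}{\alpha}\left[\frac{-(1+\lambda-2u) + \sqrt{(1+\lambda)^{2} - 4u\lambda}}{2(1-u)}\right]^{1/\beta} \tag{17}$$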
Equation (17) is the required quantile function of three-parameter GLLD. Note that the negative root of (17) has been discarded since x only takes values greater than 0. Equation (17) yields random numbers from three-parameter GLLD. For u = 0.25,0.50 and 0.75, the values of x obtained represent the first, second and third quartiles of the distribution, respectively. In a similar fashion, deciles and percentiles of different orders are obtained by assigning different values to u.
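The inverse CDF method described above can be sketched in a few lines of Python using only the standard library. The function names and the seed are assumptions made for this illustration.

```python
import math
import random

def glld_cdf(x, alpha, beta, lam):
    """GLLD CDF, equation (5)."""
    t = (alpha * x) ** beta
    return (t * t + (1.0 + lam) * t) / (1.0 + t) ** 2

def glld_quantile(u, alpha, beta, lam):
    """Quantile function (17); the positive root of the quadratic is used."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must lie in [0, 1)")
    root = math.sqrt((1.0 + lam) ** 2 - 4.0 * u * lam)
    t = (-(1.0 + lam - 2.0 * u) + root) / (2.0 * (1.0 - u))
    return t ** (1.0 / beta) / alpha

def glld_sample(n, alpha, beta, lam, rng=None):
    """Draw n GLLD variates by feeding uniform numbers through the quantile function."""
    rng = rng or random.Random()
    return [glld_quantile(rng.random(), alpha, beta, lam) for _ in range(n)]

# Quartiles and a small reproducible sample for an arbitrary parameter setting.
quartiles = [glld_quantile(u, 1.3, 2.1, -0.4) for u in (0.25, 0.50, 0.75)]
sample = glld_sample(5, 1.3, 2.1, -0.4, random.Random(2024))
print(quartiles, sample)
```

Applying `glld_cdf` to `glld_quantile(u, ...)` should return `u` up to rounding error, which is a quick way to confirm that the correct root has been retained.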
2.5. Survival measures of three-parameter GLLD
This sub-section deals with the survival measures of three-parameter GLLD such as the survival function and the hazard function. The survival function, also known as the survivorship function, refers to the probability that a life, system or a component will survive beyond a specified time. In mathematical terms, it happens to be the complement of the CDF and is given by:
S(x) = Pr(X > x) = 1 - F(x) (18)
Using (5) in (18), we obtain the survival function of three-parameter GLLD as follows:
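$$S(x;\alpha,\beta,\lambda) = \frac{\left(1+(\alpha x)^{\beta}\right)^{2} - (\alpha x)^{2\beta} - (1+\lambda)(\alpha x)^{\beta}}{\left(1+(\alpha x)^{\beta}\right)^{2}}$$
$$\Rightarrow S(x;\alpha,\beta,\lambda) = \frac{\left(1 + 2(\alpha x)^{\beta} + (\alpha x)^{2\beta}\right) - (\alpha x)^{2\beta} - (1+\lambda)(\alpha x)^{\beta}}{\left(1+(\alpha x)^{\beta}\right)^{2}}$$
$$\therefore S(x;\alpha,\beta,\lambda) = \frac{1 + (1-\lambda)(\alpha x)^{\beta}}{\left(1+(\alpha x)^{\beta}\right)^{2}}, \quad x,\alpha,\beta>0 \ \&\ -1<\lambda<1 \tag{19}$$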
The hazard function, also known as the hazard rate or failure rate or force of mortality, happens to be an important quantity used for the characterization of life phenomenon. Hazard function is defined as the conditional probability that a life, system or a component that survives up to a specified time, will undergo failure or succumb in the immediate, infinitesimally small interval of time that follows. In mathematical terms, the hazard rate or the hazard function is given by:
$$h(x) = \lim_{\Delta x \to 0}\frac{\Pr[x < X < x + \Delta x \mid X > x]}{\Delta x}$$
which upon simplification yields
$$h(x) = \frac{f(x)}{S(x)} \tag{20}$$
Using (6) and (19) in (20), we obtain the hazard function of three-parameter GLLD as follows:
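$$h(x;\alpha,\beta,\lambda) = \frac{\alpha\beta(\alpha x)^{\beta-1}\left\{(1+\lambda)\left(1+(\alpha x)^{\beta}\right) - 2\lambda(\alpha x)^{\beta}\right\}}{\left(1+(\alpha x)^{\beta}\right)\left\{1 + (1-\lambda)(\alpha x)^{\beta}\right\}}, \quad x,\alpha,\beta>0 \ \&\ |\lambda| < 1 \tag{21}$$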
The survival function and hazard function plots for (19) and (21), for different values of the parameters involved, are shown in Figures 3 and 4 respectively.
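Expressions (19) and (21) can be coded directly. The sketch below uses hypothetical function names and evaluates the hazard at a few points to show the rise-then-fall shape mentioned in the Introduction for the log-logistic case ($\lambda = 0$, $\beta > 1$).

```python
def glld_survival(x, alpha, beta, lam):
    """Survival function of the GLLD, equation (19)."""
    t = (alpha * x) ** beta
    return (1.0 + (1.0 - lam) * t) / (1.0 + t) ** 2

def glld_hazard(x, alpha, beta, lam):
    """Hazard function of the GLLD, equation (21)."""
    t = (alpha * x) ** beta
    bracket = (1.0 + lam) * (1.0 + t) - 2.0 * lam * t
    numerator = alpha * beta * (alpha * x) ** (beta - 1.0) * bracket
    return numerator / ((1.0 + t) * (1.0 + (1.0 - lam) * t))

# Hazard of the base model (lam = 0) with alpha = 1, beta = 2 is 2x / (1 + x^2).
for x in (0.1, 1.0, 50.0):
    print(x, glld_survival(x, 1.0, 2.0, 0.0), glld_hazard(x, 1.0, 2.0, 0.0))
```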
3. Maximum Likelihood Estimation
One of the most useful frameworks in parameter estimation is the Maximum Likelihood estimation (MLE). This method obtains the unknown population parameters by the virtue of likelihood maximization.
In this section, the parameters $\alpha$, $\beta$ and $\lambda$ of the three-parameter GLLD are estimated using the method of maximum likelihood estimation (MLE). The procedure is as follows: Consider a random sample $X_1, X_2, \ldots, X_n$ of size $n$ taken from the three-parameter GLLD. The likelihood function based on this sample is therefore given as:
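$$L(x \mid \alpha,\beta,\lambda) = \prod_{i=1}^{n}\frac{\alpha\beta(\alpha x_i)^{\beta-1}\left\{(1+\lambda)\left(1+(\alpha x_i)^{\beta}\right) - 2\lambda(\alpha x_i)^{\beta}\right\}}{\left(1+(\alpha x_i)^{\beta}\right)^{3}} \tag{22}$$
$$\Rightarrow L(x \mid \alpha,\beta,\lambda) = \frac{\alpha^{n\beta}\beta^{n}\prod_{i=1}^{n}x_i^{\beta-1}\prod_{i=1}^{n}\left\{(1+\lambda)\left(1+(\alpha x_i)^{\beta}\right) - 2\lambda(\alpha x_i)^{\beta}\right\}}{\prod_{i=1}^{n}\left(1+(\alpha x_i)^{\beta}\right)^{3}} \tag{23}$$
Taking logarithm on both sides of (23), we obtain the log-likelihood function as follows:
$$\log L = \log\left[\frac{\alpha^{n\beta}\beta^{n}\prod_{i=1}^{n}x_i^{\beta-1}\prod_{i=1}^{n}\left\{(1+\lambda)\left(1+(\alpha x_i)^{\beta}\right) - 2\lambda(\alpha x_i)^{\beta}\right\}}{\prod_{i=1}^{n}\left(1+(\alpha x_i)^{\beta}\right)^{3}}\right]$$
$$\Rightarrow \log L = n\beta\log\alpha + n\log\beta + (\beta-1)\sum_{i=1}^{n}\log x_i + \sum_{i=1}^{n}\log\left\{(1+\lambda)\left(1+(\alpha x_i)^{\beta}\right) - 2\lambda(\alpha x_i)^{\beta}\right\} - 3\sum_{i=1}^{n}\log\left(1+(\alpha x_i)^{\beta}\right) \tag{24}$$
which is the required log-likelihood function.
Fig. 3: Survival function plot for three-parameter GLLD
Fig. 4: Hazard rate function plot for three-parameter GLLD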
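The log-likelihood (24) is simple to evaluate for a given sample. The following Python sketch uses a hypothetical function name and a small made-up sample purely to show the computation; it is not the code used for the analyses in this paper.

```python
import math

def glld_loglik(alpha, beta, lam, data):
    """Log-likelihood of the GLLD, written term by term as in equation (24)."""
    n = len(data)
    sum_log_x = sum(math.log(x) for x in data)
    sum_log_mix = 0.0
    sum_log_den = 0.0
    for x in data:
        t = (alpha * x) ** beta
        sum_log_mix += math.log((1.0 + lam) * (1.0 + t) - 2.0 * lam * t)
        sum_log_den += math.log(1.0 + t)
    return (n * beta * math.log(alpha) + n * math.log(beta)
            + (beta - 1.0) * sum_log_x + sum_log_mix - 3.0 * sum_log_den)

# Made-up positive observations, used only to exercise the function.
toy_data = [0.5, 1.2, 2.0, 3.3]
print(glld_loglik(0.8, 1.7, 0.3, toy_data))
```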
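The MLEs of the parameters $\alpha$, $\beta$ and $\lambda$ of GLLD are obtained by differentiating the log-likelihood function (24) with respect to $\alpha$, $\beta$ and $\lambda$. The partial derivatives used for estimating the parameters are obtained as follows:
$$\frac{\partial}{\partial\alpha}\log L = \frac{n\beta}{\alpha} + \sum_{i=1}^{n}\left[\frac{(1+\lambda)\beta\alpha^{\beta-1}x_i^{\beta} - 2\lambda\beta\alpha^{\beta-1}x_i^{\beta}}{(1+\lambda)\left(1+(\alpha x_i)^{\beta}\right) - 2\lambda(\alpha x_i)^{\beta}}\right] - 3\sum_{i=1}^{n}\left[\frac{\beta\alpha^{\beta-1}x_i^{\beta}}{1+(\alpha x_i)^{\beta}}\right] = 0 \tag{25}$$
$$\frac{\partial}{\partial\beta}\log L = n\log\alpha + \frac{n}{\beta} + \sum_{i=1}^{n}\log x_i + \sum_{i=1}^{n}\left[\frac{(1+\lambda)(\alpha x_i)^{\beta}\log(\alpha x_i) - 2\lambda(\alpha x_i)^{\beta}\log(\alpha x_i)}{(1+\lambda)\left(1+(\alpha x_i)^{\beta}\right) - 2\lambda(\alpha x_i)^{\beta}}\right] - 3\sum_{i=1}^{n}\left[\frac{(\alpha x_i)^{\beta}\log(\alpha x_i)}{1+(\alpha x_i)^{\beta}}\right] = 0 \tag{26}$$
$$\frac{\partial}{\partial\lambda}\log L = \sum_{i=1}^{n}\left[\frac{1 - (\alpha x_i)^{\beta}}{(1+\lambda)\left(1+(\alpha x_i)^{\beta}\right) - 2\lambda(\alpha x_i)^{\beta}}\right] = 0 \tag{27}$$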
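Analytic derivatives such as (25), (26) and (27) are easy to get wrong in code, so a finite-difference comparison against the log-likelihood is a useful check before they are passed to an iterative solver. The sketch below is a minimal Python illustration; the function names, the made-up sample and the step size are all assumptions of this example.

```python
import math

def glld_loglik(alpha, beta, lam, data):
    """Log-likelihood (24), accumulated observation by observation."""
    total = 0.0
    for x in data:
        t = (alpha * x) ** beta
        total += (beta * math.log(alpha) + math.log(beta) + (beta - 1.0) * math.log(x)
                  + math.log((1.0 + lam) * (1.0 + t) - 2.0 * lam * t)
                  - 3.0 * math.log(1.0 + t))
    return total

def glld_score(alpha, beta, lam, data):
    """Left-hand sides of (25), (26), (27): partial derivatives of log L."""
    n = len(data)
    d_alpha = n * beta / alpha
    d_beta = n * math.log(alpha) + n / beta
    d_lam = 0.0
    for x in data:
        t = (alpha * x) ** beta
        mix = (1.0 + lam) * (1.0 + t) - 2.0 * lam * t
        dt_dalpha = beta * t / alpha              # equals beta * alpha**(beta-1) * x**beta
        dt_dbeta = t * math.log(alpha * x)
        weight = (1.0 - lam) / mix - 3.0 / (1.0 + t)
        d_alpha += weight * dt_dalpha
        d_beta += math.log(x) + weight * dt_dbeta
        d_lam += (1.0 - t) / mix
    return d_alpha, d_beta, d_lam

def numeric_score(alpha, beta, lam, data, h=1e-6):
    """Central finite differences of the log-likelihood, for comparison only."""
    theta = [alpha, beta, lam]
    grads = []
    for k in range(3):
        up, down = theta[:], theta[:]
        up[k] += h
        down[k] -= h
        grads.append((glld_loglik(*up, data) - glld_loglik(*down, data)) / (2.0 * h))
    return tuple(grads)

toy_data = [0.5, 1.2, 2.0, 3.3]   # made-up observations
print(glld_score(0.8, 1.7, 0.3, toy_data))
print(numeric_score(0.8, 1.7, 0.3, toy_data))
```

The two printed triples should agree closely; a root of the score vector can then be sought with any standard numerical optimizer.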
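The derivative equations (25), (26) and (27) cannot be solved analytically, and therefore the estimates of the parameters $\alpha$, $\beta$ and $\lambda$, denoted by $\hat\alpha$, $\hat\beta$ and $\hat\lambda$, are obtained by maximizing the log-likelihood function with iterative numerical methods such as the Newton-Raphson method. The second-order partial derivatives are computed to obtain Fisher's Information Matrix in the following manner:
$$I_X(\alpha,\beta,\lambda) = \begin{bmatrix} -E\left[\dfrac{\partial^{2}\log L}{\partial\alpha^{2}}\right] & -E\left[\dfrac{\partial^{2}\log L}{\partial\alpha\,\partial\beta}\right] & -E\left[\dfrac{\partial^{2}\log L}{\partial\alpha\,\partial\lambda}\right] \\[2ex] -E\left[\dfrac{\partial^{2}\log L}{\partial\beta\,\partial\alpha}\right] & -E\left[\dfrac{\partial^{2}\log L}{\partial\beta^{2}}\right] & -E\left[\dfrac{\partial^{2}\log L}{\partial\beta\,\partial\lambda}\right] \\[2ex] -E\left[\dfrac{\partial^{2}\log L}{\partial\lambda\,\partial\alpha}\right] & -E\left[\dfrac{\partial^{2}\log L}{\partial\lambda\,\partial\beta}\right] & -E\left[\dfrac{\partial^{2}\log L}{\partial\lambda^{2}}\right] \end{bmatrix} \tag{28}$$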
It can be shown that the three-parameter GLLD satisfies the regularity conditions and thereby the MLE vector $\hat\theta = (\hat\alpha,\hat\beta,\hat\lambda)^{T}$ is consistent as well as asymptotically normal, i.e., for large $n$, $(\hat\alpha,\hat\beta,\hat\lambda)^{T} - (\alpha,\beta,\lambda)^{T}$ is approximately normally distributed with mean vector $0$ and covariance matrix $I_X^{-1}(\alpha,\beta,\lambda)$. Fisher's Information matrix in (28) is calculated by virtue of the following approximation:
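$$I_X(\hat\alpha,\hat\beta,\hat\lambda) \approx \begin{bmatrix} -E\left(\dfrac{\partial^{2}\log L}{\partial\alpha^{2}}\right)\Big|_{(\hat\alpha,\hat\beta,\hat\lambda)} & -E\left(\dfrac{\partial^{2}\log L}{\partial\alpha\,\partial\beta}\right)\Big|_{(\hat\alpha,\hat\beta,\hat\lambda)} & -E\left(\dfrac{\partial^{2}\log L}{\partial\alpha\,\partial\lambda}\right)\Big|_{(\hat\alpha,\hat\beta,\hat\lambda)} \\[2ex] -E\left(\dfrac{\partial^{2}\log L}{\partial\beta\,\partial\alpha}\right)\Big|_{(\hat\alpha,\hat\beta,\hat\lambda)} & -E\left(\dfrac{\partial^{2}\log L}{\partial\beta^{2}}\right)\Big|_{(\hat\alpha,\hat\beta,\hat\lambda)} & -E\left(\dfrac{\partial^{2}\log L}{\partial\beta\,\partial\lambda}\right)\Big|_{(\hat\alpha,\hat\beta,\hat\lambda)} \\[2ex] -E\left(\dfrac{\partial^{2}\log L}{\partial\lambda\,\partial\alpha}\right)\Big|_{(\hat\alpha,\hat\beta,\hat\lambda)} & -E\left(\dfrac{\partial^{2}\log L}{\partial\lambda\,\partial\beta}\right)\Big|_{(\hat\alpha,\hat\beta,\hat\lambda)} & -E\left(\dfrac{\partial^{2}\log L}{\partial\lambda^{2}}\right)\Big|_{(\hat\alpha,\hat\beta,\hat\lambda)} \end{bmatrix} \tag{29}$$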
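where $\hat\alpha$, $\hat\beta$ and $\hat\lambda$ are the MLEs of $\alpha$, $\beta$ and $\lambda$ respectively. This approximation is useful in the construction of confidence intervals for the parameters of the three-parameter GLLD. Writing the confidence level as $100(1-\gamma)\%$ to avoid a clash with the parameter $\alpha$, the approximate $100(1-\gamma)\%$ confidence intervals for $\alpha$, $\beta$ and $\lambda$ are respectively given by:
$$\hat\alpha \pm z_{\gamma/2}\sqrt{I^{-1}_{11}(\hat\theta)}, \quad \hat\beta \pm z_{\gamma/2}\sqrt{I^{-1}_{22}(\hat\theta)} \quad \text{and} \quad \hat\lambda \pm z_{\gamma/2}\sqrt{I^{-1}_{33}(\hat\theta)} \tag{30}$$
where $I^{-1}_{kk}(\hat\theta)$ denotes the $k$th diagonal element of the inverse of the matrix in (29).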
3.1. Monte Carlo Simulation Study of ML Estimates
Monte Carlo simulation refers to a wide range of computational algorithms aimed at obtaining numerical results by repeated random sampling. This sub-section examines the behavior of the maximum likelihood estimates of the three-parameter GLLD for finite samples of size $n$. A Monte Carlo simulation study for different values of the parameters $\alpha$, $\beta$ and $\lambda$ is employed for this purpose, with random numbers generated using the quantile function (17) obtained earlier. The study considers the parameter combinations $(\alpha = 0.7, \beta = 0.5, \lambda = 0.4)$ and $(\alpha = 1.2, \beta = 0.8, \lambda = 0.5)$. For each combination and each sample size $n = 25, 75, 150, 200$ and $500$, 100 samples are generated. ML estimates are obtained for each sample, and their average bias, variance and MSE are calculated. The results are given in Table 1. The variance and MSE of the estimates of $\alpha$, $\beta$ and $\lambda$ generally decrease as the sample size $n$ increases, which is in line with the consistency of the ML estimators and indicates that the ML method performs well for estimating the parameters of the three-parameter GLLD.
Table 1: Average Bias, Variance and MSE for simulated results of MLEs (Set 1: $\alpha = 0.7, \beta = 0.5, \lambda = 0.4$; Set 2: $\alpha = 1.2, \beta = 0.8, \lambda = 0.5$)

| Sample size n | Parameter | Bias (Set 1) | Variance (Set 1) | MSE (Set 1) | Bias (Set 2) | Variance (Set 2) | MSE (Set 2) |
|---|---|---|---|---|---|---|---|
| 25 | α | 0.026299 | 2.645356 | 2.646048 | -0.15184 | 0.633681 | 0.656737 |
| 25 | β | 0.008205 | 0.009891 | 0.009958 | 0.012896 | 0.052401 | 0.052567 |
| 25 | λ | 0.284835 | 0.18187 | 0.263 | 0.160373 | 0.173618 | 0.199338 |
| 75 | α | -0.184323 | 0.841443 | 0.875418 | -0.16281 | 0.495852 | 0.522358 |
| 75 | β | -0.009204 | 0.003322 | 0.003407 | -0.04699 | 0.026425 | 0.028633 |
| 75 | λ | 0.237382 | 0.142089 | 0.198439 | 0.156228 | 0.153889 | 0.178296 |
| 150 | α | -0.357382 | 0.54484 | 0.672562 | -0.05974 | 0.366872 | 0.370441 |
| 150 | β | -0.020729 | 0.001617 | 0.002047 | -0.03842 | 0.012117 | 0.013593 |
| 150 | λ | 0.271836 | 0.148146 | 0.22204 | 0.067127 | 0.140693 | 0.145199 |
| 200 | α | -0.276202 | 0.540957 | 0.617245 | 0.037218 | 0.422209 | 0.423594 |
| 200 | β | -0.018463 | 0.001689 | 0.00203 | -0.00735 | 0.012184 | 0.012238 |
| 200 | λ | 0.241823 | 0.135421 | 0.193899 | 0.03889 | 0.128701 | 0.130213 |
| 500 | α | -0.377683 | 0.375987 | 0.518631 | 0.095593 | 0.332553 | 0.341691 |
| 500 | β | -0.01319 | 0.000977 | 0.001151 | -0.02082 | 0.006032 | 0.006466 |
| 500 | λ | 0.252524 | 0.10096 | 0.164728 | -0.01595 | 0.102887 | 0.103141 |
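The three columns reported for each estimate are linked by the identity MSE = Variance + Bias². As a small worked check, the Python sketch below applies this identity to the rows for the first parameter of the first combination (0.7, 0.5, 0.4) in Table 1; the variable names are hypothetical and the numbers are copied from the table.

```python
# (sample size, bias, variance, reported MSE) for the first parameter, first combination.
rows = [
    (25, 0.026299, 2.645356, 2.646048),
    (75, -0.184323, 0.841443, 0.875418),
    (150, -0.357382, 0.54484, 0.672562),
    (200, -0.276202, 0.540957, 0.617245),
    (500, -0.377683, 0.375987, 0.518631),
]

def mse_from_bias_variance(bias, variance):
    """Mean squared error decomposed as variance plus squared bias."""
    return variance + bias ** 2

discrepancies = []
for n, bias, variance, reported in rows:
    recomputed = mse_from_bias_variance(bias, variance)
    discrepancies.append(abs(recomputed - reported))
    print(n, round(recomputed, 6), reported)
```

The recomputed values agree with the tabulated MSEs up to rounding, and they show that the decline in MSE with $n$ for this parameter is driven by the variance term rather than the bias.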
4. Robust Skewness and Kurtosis Measures for three-parameter GLLD
This section deals with the study of skewness and kurtosis measures for the proposed distribution. Skewness and kurtosis both deal with the shape of the distribution with the former concerned with symmetry while latter with the tailedness and peakedness of the distribution. The effect of parameters on the skewness and kurtosis of the distribution is studied in this section by considering measures based on quantiles.
Bowley[6] proposed a coefficient of skewness based on quantiles which is well known in statistical literature and is one of the earliest measures of skewness. It is defined as the average of the first and third quartiles minus the median divided by half the interquartile range. It is given by:
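$$B = \frac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1} = \frac{Q\left(\frac{3}{4}\right) + Q\left(\frac{1}{4}\right) - 2Q\left(\frac{1}{2}\right)}{Q\left(\frac{3}{4}\right) - Q\left(\frac{1}{4}\right)} \tag{31}$$
Bowley's coefficient of skewness lies between $+1$ and $-1$.
Moors [7] proposed a robust alternative to the conventional measure of kurtosis in order to overcome the shortcomings of the latter. For many heavy-tailed distributions, the conventional measure is infinite and as such uninformative. The new measure of kurtosis based on quantiles, however, is less sensitive to outliers and exists even for distributions for which no moments are defined. The Moors' kurtosis based on octiles is given by:
$$M = \frac{(E_3 - E_1) + (E_7 - E_5)}{E_6 - E_2} = \frac{Q\left(\frac{3}{8}\right) - Q\left(\frac{1}{8}\right) + Q\left(\frac{7}{8}\right) - Q\left(\frac{5}{8}\right)}{Q\left(\frac{6}{8}\right) - Q\left(\frac{2}{8}\right)} \tag{32}$$
For distributions that are symmetric about 0, the Moors' kurtosis reduces to:
$$M = \frac{E_7 - E_5}{E_6} = \frac{Q\left(\frac{7}{8}\right) - Q\left(\frac{5}{8}\right)}{Q\left(\frac{6}{8}\right)} \tag{33}$$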
Table 2: Bowley's skewness for GLLD$(x;\alpha,\beta,\lambda)$ for different parameter combinations ($\alpha = 1.3$)

| λ | β = 0.7 | β = 1.4 | β = 1.9 | β = 2.6 | β = 3.3 | β = 4.4 | β = 5.6 |
|---|---|---|---|---|---|---|---|
| -0.9 | 0.62570 | 0.36985 | 0.28923 | 0.22568 | 0.18823 | 0.15298 | 0.13014 |
| -0.7 | 0.63217 | 0.36683 | 0.28241 | 0.21574 | 0.17642 | 0.13942 | 0.11543 |
| -0.6 | 0.63601 | 0.36633 | 0.28009 | 0.21192 | 0.17169 | 0.13383 | 0.10929 |
| -0.3 | 0.64848 | 0.36931 | 0.27877 | 0.20697 | 0.16455 | 0.12460 | 0.09870 |
| 0.3 | 0.64453 | 0.36414 | 0.27336 | 0.20143 | 0.15894 | 0.11895 | 0.09303 |
| 0.6 | 0.60956 | 0.33252 | 0.24491 | 0.17596 | 0.13538 | 0.09727 | 0.07260 |
| 0.7 | 0.59349 | 0.31787 | 0.23156 | 0.16380 | 0.12400 | 0.08664 | 0.06248 |
| 0.9 | 0.55777 | 0.28555 | 0.20196 | 0.13671 | 0.09850 | 0.06271 | 0.03959 |
Table 3: Moors' kurtosis for GLLD$(x;\alpha,\beta,\lambda)$ for different parameter combinations ($\alpha = 2.6$)

| λ | β = 0.7 | β = 1.4 | β = 1.9 | β = 2.6 | β = 3.3 | β = 4.4 | β = 5.6 |
|---|---|---|---|---|---|---|---|
| -0.9 | 3.02678 | 1.75277 | 1.56318 | 1.45452 | 1.40469 | 1.36648 | 1.34594 |
| -0.7 | 3.05154 | 1.74699 | 1.55687 | 1.45018 | 1.40247 | 1.36694 | 1.34849 |
| -0.6 | 3.06594 | 1.74405 | 1.55301 | 1.44675 | 1.39978 | 1.36525 | 1.34763 |
| -0.3 | 3.11493 | 1.73857 | 1.54241 | 1.43495 | 1.38840 | 1.35504 | 1.33856 |
| 0.3 | 3.07158 | 1.72426 | 1.53261 | 1.42809 | 1.38310 | 1.35111 | 1.33550 |
| 0.6 | 2.76317 | 1.64151 | 1.48248 | 1.39720 | 1.36142 | 1.33684 | 1.32545 |
| 0.7 | 2.60943 | 1.59528 | 1.45225 | 1.37657 | 1.34545 | 1.32469 | 1.31549 |
| 0.9 | 2.27908 | 1.48840 | 1.37979 | 1.32506 | 1.30423 | 1.29197 | 1.28772 |
For standard normal distribution, it is easy to compute that
E1 = -E7 = -1.15, E2 = -E6 = -0.67 and E3 = -E5 = -0.32
Therefore, M = 1.23. The centered Moors' coefficient is thus given by:
$$M = \frac{(E_3 - E_1) + (E_7 - E_5)}{E_6 - E_2} - 1.23 \tag{34}$$
Using R software, the values of Bowley's skewness and Moors' kurtosis for the three-parameter GLLD for different parameter values have been numerically calculated and tabulated in Tables 2 and 3 respectively. Both Bowley's skewness and Moors' kurtosis are decreasing functions of $\beta$ for a fixed value of the transmutation parameter $\lambda$. However, for a fixed value of the shape parameter $\beta$, both measures show increasing as well as decreasing behavior across different values of $\lambda$.
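The quantile-based measures in Tables 2 and 3 can be reproduced from the quantile function (17). The sketch below is a Python re-implementation written for illustration; it is not the R code used to produce the tables, and the function names are hypothetical.

```python
import math

def glld_quantile(u, alpha, beta, lam):
    """Quantile function of the GLLD, equation (17)."""
    root = math.sqrt((1.0 + lam) ** 2 - 4.0 * u * lam)
    t = (-(1.0 + lam - 2.0 * u) + root) / (2.0 * (1.0 - u))
    return t ** (1.0 / beta) / alpha

def bowley_skewness(alpha, beta, lam):
    """Bowley's coefficient based on the three quartiles."""
    q1, q2, q3 = (glld_quantile(p, alpha, beta, lam) for p in (0.25, 0.50, 0.75))
    return (q3 + q1 - 2.0 * q2) / (q3 - q1)

def moors_kurtosis(alpha, beta, lam):
    """Moors' kurtosis based on the octiles E1, E2, E3, E5, E6, E7."""
    e = {k: glld_quantile(k / 8.0, alpha, beta, lam) for k in (1, 2, 3, 5, 6, 7)}
    return ((e[3] - e[1]) + (e[7] - e[5])) / (e[6] - e[2])

# Last entries of Tables 2 and 3 (lambda = 0.9, beta = 5.6).
print(round(bowley_skewness(1.3, 5.6, 0.9), 5))
print(round(moors_kurtosis(2.6, 5.6, 0.9), 5))
```

Both measures are ratios of quantile differences, so they do not depend on the scale parameter $\alpha$; only $\beta$ and $\lambda$ affect the tabulated values.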
5. Applications of three-parameter GLLD
In this section, the performance of the proposed generalized log-logistic model is tested by comparing it with the base model. Two real-life data sets already available in the literature, one based on survival times and the other on strength data, are used to carry out the comparisons. The procedure involves computing the MLEs of the transmuted model as well as the base model for both data sets using R software. Various goodness-of-fit statistics for the two models are then calculated and compared. These statistics include AIC (Akaike's Information Criterion) provided by Akaike [8], AICC (AIC Corrected) and BIC (Bayesian Information Criterion) given by Schwarz [9]. AIC, AICC and BIC for a model with $k$ parameters fitted to $n$ observations are calculated as follows:
$$\mathrm{AIC} = 2k - 2\log L$$
$$\mathrm{AICC} = \mathrm{AIC} + \frac{2k(k+1)}{n-k-1}$$
$$\mathrm{BIC} = k\log n - 2\log L$$
Kolmogorov-Smirnov test is also carried out for testing model significance based on the two mentioned real-life data sets.
Data Set I: The data set reported by Efron [10] is analyzed for carrying out comparisons between the three-parameter GLLD and LLD. The observations represent the survival times of a group of patients suffering from head and neck cancer who were treated with radiotherapy. The data set is given in Table 4.
Table 4: Survival times of 58 patients suffering from head and neck cancer disease
6.53 7 10.42 14.48 16.10 22.70 34 41.55 42 45.28 49.40 53.62
63 64 83 84 91 108 112 129 133 133 139 140
140 146 149 154 157 160 160 165 146 149 154 157
160 160 165 173 176 218 225 241 248 273 277 297
405 417 420 440 523 583 594 1101 1146 1417
The MLEs, model functions alongside the standard errors based on the above data set are tabulated in Table 5.
Table 6 contains various goodness-of-fit measures for the models fitted to the data given in Table 4. The AIC, AICC and BIC values for the transmuted model (GLLD) are smaller than those for the base model (LLD), suggesting that the new model fits better. Furthermore, the KS p-value for GLLD (0.1211) exceeds 0.05, so the hypothesis that the data follow the GLLD is not rejected, whereas it is rejected for the LLD (p-value 0.0004).
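Table 5: MLEs with standard errors of parameters for GLLD and LLD for data set I

| Model | Model function | MLEs | Standard Error |
|---|---|---|---|
| Transmuted Model | $\dfrac{\alpha\beta(\alpha x)^{\beta-1}\{(1+\lambda)(1+(\alpha x)^{\beta}) - 2\lambda(\alpha x)^{\beta}\}}{(1+(\alpha x)^{\beta})^{3}}$ | $\hat\alpha = 0.01$, $\hat\beta = 1.55$, $\hat\lambda = -0.47$ | $SE(\hat\alpha) = 0.002$, $SE(\hat\beta) = 0.203$, $SE(\hat\lambda) = 0.344$ |
| Base Model | $\dfrac{\alpha\beta(\alpha x)^{\beta-1}}{(1+(\alpha x)^{\beta})^{2}}$ | $\hat\alpha = 0.01$, $\hat\beta = 1.52$ | $SE(\hat\alpha) = 0.002$, $SE(\hat\beta) = 0.196$ |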
Table 6: Goodness of fit measures for models fitted to data set I

| Model | -log L | AIC | AICC | BIC | KS Distance | KS p-value | LR Statistic |
|---|---|---|---|---|---|---|---|
| GLLD | 371.1943 | 748.3887 | 748.8331 | 754.5700 | 0.15548 | 0.1211 | 5.01905 |
| LLD | 373.7039 | 751.4077 | 751.6259 | 755.5286 | 0.26802 | 0.0004 | |
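The information criteria in Table 6 follow from the reported values of $-\log L$ together with the number of parameters $k$ and the sample size $n = 58$. The short Python sketch below recomputes them; the function name is hypothetical, and the likelihood ratio statistic is taken here as twice the difference of the two $-\log L$ values, which matches the reported figure up to rounding.

```python
import math

def information_criteria(neg_loglik, k, n):
    """Return (AIC, AICC, BIC) for a model with k parameters and n observations."""
    aic = 2.0 * k + 2.0 * neg_loglik
    aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)
    bic = k * math.log(n) + 2.0 * neg_loglik
    return aic, aicc, bic

# Values of -log L taken from Table 6 (data set I, n = 58).
glld = information_criteria(371.1943, k=3, n=58)
lld = information_criteria(373.7039, k=2, n=58)
lr_statistic = 2.0 * (373.7039 - 371.1943)

print([round(v, 4) for v in glld])
print([round(v, 4) for v in lld])
print(round(lr_statistic, 4))
```

The recomputed values agree with Table 6 to within the rounding of the reported $-\log L$.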
The GLLD and LLD plots fitted to the survival times of the 58 patients suffering from head and neck cancer disease are illustrated through Figure 5. The graphical overview of the empirical and theoretical (GLLD) CDFs and survival functions for data set I is illustrated through Figures 6 and 7 respectively.
Fig. 5: Curve fitting GLLD vs LLD for data set I
Data Set II: The data set reported by Lawless [11] is analyzed for carrying out comparisons between the three-parameter GLLD and LLD. The observations represent the number of cycles to failure for 25 100-cm specimens of yarn tested at a particular strain level. The data set is given in Table 7.
Table 7: Cycles to failure for 25 100-cm specimens of yarn at a specific strain level
15 20 38 42 61 76 86 98 121 146
149 157 175 176 180 180 198 220 224 251
175 176 180 180 198 653
Fig. 6: Empirical and Theoretical CDF for data set I
Fig. 7: Empirical and Theoretical Survival Function for data set I
The MLEs, model functions alongside the standard errors based on the above data set are tabulated in Table 8 below:
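Table 8: MLEs with standard errors of parameters for GLLD and LLD for data set II

| Model | Model function | MLEs | Standard Error |
|---|---|---|---|
| Transmuted Model | $\dfrac{\alpha\beta(\alpha x)^{\beta-1}\{(1+\lambda)(1+(\alpha x)^{\beta}) - 2\lambda(\alpha x)^{\beta}\}}{(1+(\alpha x)^{\beta})^{3}}$ | $\hat\alpha = 0.01$, $\hat\beta = 1.89$, $\hat\lambda = -0.56$ | $SE(\hat\alpha) = 0.004$, $SE(\hat\beta) = 0.450$, $SE(\hat\lambda) = 0.491$ |
| Base Model | $\dfrac{\alpha\beta(\alpha x)^{\beta-1}}{(1+(\alpha x)^{\beta})^{2}}$ | $\hat\alpha = 0.01$, $\hat\beta = 1.85$ | $SE(\hat\alpha) = 0.003$, $SE(\hat\beta) = 0.423$ |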
From Table 9, the AIC, AICC and BIC values for the transmuted model (GLLD) are smaller than those for the base model (LLD), suggesting that the new model fits better. Furthermore, the KS p-value for GLLD (0.346) exceeds 0.05, so the hypothesis that the data follow the GLLD is not rejected, whereas it is rejected for the LLD (p-value 0.01734). In other words, GLLD is a better fit than LLD for the data given in Table 7.
Table 9: Goodness of fit measures for models fitted to data set II

| Model | -log L | AIC | AICC | BIC | KS Distance | KS p-value | LR Statistic |
|---|---|---|---|---|---|---|---|
| GLLD | 154.2395 | 314.4790 | 315.6219 | 318.1356 | 0.18704 | 0.346 | 3.440033 |
| LLD | 155.9595 | 315.9191 | 316.4645 | 318.3568 | 0.30815 | 0.01734 | |
The GLLD and LLD curves fitted to the number of cycles to failure for 25 100-cm specimens of yarn tested at a particular strain level are shown in Figure 8. The empirical and theoretical (GLLD) CDFs and survival functions for data set II are shown in Figures 9 and 10 respectively.
Fig. 8: Curve fitting GLLD vs LLD for data set II (horizontal axis: Lifetimes)
Fig. 9: Empirical and Theoretical CDF for data set II
Fig. 10: Empirical and Theoretical Survival Function for data set II
6. Concluding Remarks
A new three-parameter transmuted probability model, namely the Generalized Log-Logistic Distribution (GLLD), is introduced by using the quadratic rank transmutation map technique. A comprehensive description of the statistical properties of the new model is provided. Robust measures of skewness and kurtosis of the proposed model have been derived along with the moment generating function, characteristic function, reliability function and hazard rate function. The model parameters are estimated by the maximum likelihood method, and the estimates are examined through a Monte Carlo simulation. The applicability of the distribution to real-life data is illustrated with two examples, in which the proposed model is compared with the base distribution.
Conflict of Interest
The authors declare that there is no conflict of interest.
References
[1] Shaw, W., and Buckley, I. (2007). The alchemy of probability distributions: Beyond Gram-Charlier expansions and a skew-kurtotic-normal distribution from a rank transmutation map. UCL Discovery Repository, pages 1-16.
[2] Ashour, S. K., and Eltehiwy, M. A. (2013). Transmuted exponentiated modified Weibull distribution. International Journal of Basic and Applied Sciences, 2 (3), 258-269.
[3] Aryal, G. R., and Tsokos, C. P. (2009). On the transmuted extreme value distribution with applications. Nonlinear Analysis: Theory, Methods and Applications, 71(12), 1401-1407.
[4] Merovci, F. (2013). Transmuted Lindley distribution. International Journal of Open Problem in Computer Science and Mathematics, 6(2), 63-72.
[5] Merovci, F. (2013). Transmuted Rayleigh distribution. Austrian Journal of Statistics, 42(1), 21-31.
[6] Bowley, A.L. (1920). Elements of Statistics. London: P.S. King & Son, Ltd.
[7] Moors, J. J. (1988). A quantile alternative for kurtosis. The Statistician, 37(1), 25.
[8] Akaike, H. (1974). A new look at the statistical model identification. Springer Series in Statistics, 215-222.
[9] Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461-464.
[10] Efron, B. (1988). Logistic regression, survival analysis and the Kaplan-Meier curve. Journal of the American Statistical Association, 83: 414-425.
[11] Lawless, J.F. (2003). Statistical models and methods for lifetime data. John Wiley and Sons, New York, USA.