The contribution of the maximum to the sum of excesses for testing max-domains of attraction

Journal of Statistical Planning and Inference 136 (2006) 1281–1301
www.elsevier.com/locate/jspi

Cláudia Neves a,1, Jan Picek b,2, M.I. Fraga Alves c,∗,1

a UIMA, Department of Mathematics, University of Aveiro, Portugal
b Department of Applied Mathematics, Technical University of Liberec, Czech Republic
c CEAUL, DEIO, Faculty of Sciences, University of Lisbon, Portugal

Received 1 July 2003; accepted 16 September 2004
Available online 26 November 2004

Abstract

We consider an i.i.d. sample, from an underlying distribution function with unknown shape, location and scale parameters, belonging to some max-domain of attraction. We study the performance of a test statistic which is merely a ratio between the maximum and the mean of the sample of the excesses above some random threshold. This scale/location invariant ratio turns out to be very useful in the construction of an asymptotically size $\alpha$ test for the null hypothesis that the distribution comes from the Gumbel domain of attraction. The test is based on the $k_n$ largest observations, where $k_n$ is any intermediate sequence of positive integers. Both power of the test and type I error probability are studied for finite sample sizes by simulation.
© 2004 Elsevier B.V. All rights reserved.

MSC: 62G10; 62G20; 62G32

Keywords: Generalized extreme value and generalized Pareto distributions; Consistency of a test; Semi-parametric approach; Regular variation; Simulation

∗ Corresponding author. Tel.: +351217500414; fax: +351217500081.
E-mail addresses: claudia@mat.ua.pt (C. Neves), jan.picek@vslib.cz (J. Picek), isabel.alves@fc.ul.pt (M.I. Fraga Alves).
1 Research partially supported by FCT/POCTI/FEDER.
2 Czech Republic Grant KJB3042303.
0378-3758/$-see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.jspi.2004.09.008

1. Introduction

Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed (i.i.d.) random variables (r.v.'s), with the same unknown distribution function (d.f.) $F$, and let $X_{1,n} \le X_{2,n} \le \cdots \le X_{n,n}$ be the associated order statistics (o.s.) after arranging the random sample in nondecreasing order. Due to their nature, semi-parametric models are never specified in detail by hand. Instead, the only assumption made is that $F$ is in the domain of attraction of an extreme value distribution (notation: $F \in \mathcal{D}(G_\gamma)$), i.e., there exist normalizing constants $a_n > 0$ and $b_n \in \mathbb{R}$ such that

$$\lim_{n\to\infty} P\{a_n^{-1}(X_{n,n} - b_n) \le x\} = G_\gamma(x) := \exp\left(-(1 + \gamma x)^{-1/\gamma}\right)$$

for all $x$ such that $1 + \gamma x > 0$ and with some extreme value index $\gamma \in \mathbb{R}$. Read $G_0(x)$ as $\exp(-\exp(-x))$ for all $x \in \mathbb{R}$. The fundamental paper of Gnedenko (1943) establishes that the Generalized Extreme Value (GEV) distribution in the von Mises parametrization $(G_\gamma)$ is a unified version of all possible non-degenerate weak limits of the maximum $X_{n,n}$, up to location/scale parameters. For $\gamma < 0$, $\gamma = 0$ and $\gamma > 0$, the $G_\gamma$ d.f. reduces to Weibull, Gumbel and Fréchet distributions, respectively.

The following necessary and sufficient condition for $F \in \mathcal{D}(G_\gamma)$ was established in de Haan (1984) (first order extended regular variation property):

$$\lim_{t\to\infty} \frac{U(tx) - U(t)}{a(t)} = D_\gamma(x) := \begin{cases} \dfrac{x^\gamma - 1}{\gamma}, & \gamma \ne 0,\\ \log x, & \gamma = 0 \end{cases} \tag{1}$$

for every $x > 0$ and some positive measurable function $a$, with $U$ standing for a quantile type function (q.f.) pertaining to $F$ defined by the generalized inverse

$$U(t) := \left(\frac{1}{1 - F}\right)^{\leftarrow}(t) = \inf\left\{x : F(x) \ge 1 - \frac{1}{t}\right\}.$$

Observe that the limit function $(x^\gamma - 1)/\gamma$ is the tail q.f. of the generalized Pareto (GP) distribution

$$F_\gamma(x) := 1 + \log G_\gamma(x) = 1 - (1 + \gamma x)^{-1/\gamma}$$

for $x \ge 0$ if $\gamma \ge 0$, and $0 \le x \le -1/\gamma$ if $\gamma < 0$.
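As an illustration of the definitions above, the following minimal Python sketch evaluates $G_\gamma$, the GP d.f. and the limit function $D_\gamma$ of (1). The function names `gev_cdf`, `gp_cdf` and `d_gamma` are hypothetical and chosen only for this example; the handling of points outside the support is an assumption made to keep the functions total.

```python
import math

def gev_cdf(x, gamma):
    """G_gamma(x) = exp(-(1 + gamma*x)^(-1/gamma)); G_0(x) = exp(-exp(-x))."""
    if gamma == 0.0:
        return math.exp(-math.exp(-x))
    z = 1.0 + gamma * x
    if z <= 0.0:
        # Assumption: extend the d.f. outside {1 + gamma*x > 0} by 0 or 1.
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-z ** (-1.0 / gamma))

def gp_cdf(x, gamma):
    """GP d.f. 1 - (1 + gamma*x)^(-1/gamma) on x >= 0 (x <= -1/gamma if gamma < 0)."""
    if x <= 0.0:
        return 0.0
    if gamma == 0.0:
        return 1.0 - math.exp(-x)
    z = 1.0 + gamma * x
    if z <= 0.0:
        return 1.0  # beyond the finite right endpoint when gamma < 0
    return 1.0 - z ** (-1.0 / gamma)

def d_gamma(x, gamma):
    """Limit function in (1): (x^gamma - 1)/gamma, read as log(x) for gamma = 0."""
    if gamma == 0.0:
        return math.log(x)
    return (x ** gamma - 1.0) / gamma

# Check the identity 1 + log G_gamma = GP d.f. and the tail quantile property
# gp_cdf(D_gamma(t)) = 1 - 1/t at a few illustrative points.
for g in (-0.5, 0.0, 0.5):
    assert abs(gp_cdf(1.0, g) - (1.0 + math.log(gev_cdf(1.0, g)))) < 1e-12
    assert abs(gp_cdf(d_gamma(4.0, g), g) - 0.75) < 1e-12
```

The loop at the end only verifies the two relations stated in the text at selected points; it is not a general proof.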
This fact reflects its exceptional role in Extreme Value Theory (cf. Pickands, 1975; Balkema and de Haan, 1974) and appeals to the appropriateness of classifying the tails of all possible distributions in $\mathcal{D}(G_\gamma)$ into three classes, discriminated by the tail index sign. For positive $\gamma$, the power-law behavior in the tail of the underlying distribution $F$ has important implications since it may suggest, for instance, the presence of infinite moments. Because the first-order condition (1) can be reformulated as $\lim_{t\to\infty} U(tx)/U(t) = x^\gamma$, for all $x > 0$, i.e. $U$ is $\gamma$-regularly varying at infinity (notation: $U \in RV_\gamma$), Karamata's Theorem for integration of regularly varying functions asserts that $E(X_1^+)^p$ is infinite for $p > 1/\gamma$, where $X_1^+ = \max(0, X_1)$. So, these heavy tailed distributions have infinite right endpoint and the existence of moments is related to the value of $\gamma$. The Fréchet domain of attraction contains distributions with polynomially decaying tails such as the Pareto, Cauchy, Student's and Fréchet itself. All d.f.'s belonging to $\mathcal{D}(G_\gamma)$ with $\gamma < 0$ (the Weibull domain of attraction) are light tailed distributions with finite right endpoint. Such domain of attraction encloses Uniform and Beta distributions. The intermediate case $\gamma = 0$ is of particular interest in many applied sciences where extremes are relevant, not only because of the simplicity of inference within the Gumbel domain $G_0$ but also for the great variety of distributions possessing an exponential tail whether having finite right endpoint or not. Normal, Gamma and Lognormal distributions can be found in the Gumbel domain. Taking all into consideration, the advantage of looking for the most propitious type of tail when fitting empirical distributions at high quantiles has become clear.
Effectively, separating statistical inference procedures according to the most suitable domain of attraction for the underlying d.f. $F$ has become a usual practice.

A test for Gumbel domain against Fréchet or Weibull max-domain has received the general designation of statistical choice of extreme domains of attraction (see e.g. Castillo et al., 1989; Hasofer and Wang, 1992; Fraga Alves and Gomes, 1996; Wang et al., 1996; Marohn, 1998a, b). Among these, Hasofer and Wang's may be pointed out as one of the most commonly used testing procedures. In particular, Reiss and Thomas (2001, p. 154) have incorporated it in the "XTREMES" software. This test is based on a location/scale invariant statistic, function of the excesses over a random threshold $X_{n-k,n}$. The asymptotic statements of the referred authors settle on a fixed $k$, whereas $n$ goes to infinity, bearing on results presented in Weissman (1978). Nevertheless, in the last part of the referred paper there is an attempt to extend the setup of the test, allowing $k$ to increase with the sample size $n$, albeit under heuristic arguments. Pursuing the same objective, Segers and Teugels (2000) have recently suggested a large sample test for the Gumbel domain hypothesis; after deriving the asymptotic distribution of Galton's ratio (enjoying the location/scale invariance property too) provided condition (1), the authors used Rao's test statistic (see e.g. Serfling, 1980) for simple null hypothesis in order to establish a decision rule. In the process, they were confronted with the need of blocking the original sample of size $n$ into $m$ subsamples, each of size $n_i$, $i = 1, \ldots, m$, also under pledge of largeness.

The present paper deals with the two-sided problem of testing Gumbel domain against Fréchet or Weibull domains, i.e.

$$F \in \mathcal{D}(G_0) \quad \text{versus} \quad F \in \mathcal{D}(G_\gamma)_{\gamma \ne 0}. \tag{2}$$

Considering $k$ upper order statistics in a way that these might present a satisfactory picture of the tail of $F$, we introduce a new test statistic which is simply the ratio between the maximum and the mean of the excesses above a random threshold $X_{n-k,n}$:

$$T_n(k) := \frac{X_{n,n} - X_{n-k,n}}{\frac{1}{k}\sum_{i=1}^{k} (X_{n-i+1,n} - X_{n-k,n})}, \tag{3}$$

where $k = k_n$ is a sequence of positive integers such that $k \to \infty$ and $k/n \to 0$ as the sample size $n$ tends to infinity, i.e. taking into account the increasing information about the right tail provided by the top data by enlarging the sample size, in a quite natural way. The exact distribution of $T_n(k)$ does not depend on location or scale parameters and its discriminant behavior towards heavy or light tailed distributions proves to be basically governed by the sample maximum. In addition, one-sided testing problems

$$F \in \mathcal{D}(G_0) \quad \text{versus} \quad F \in \mathcal{D}(G_\gamma)_{\gamma < 0} \quad (\text{or } F \in \mathcal{D}(G_\gamma)_{\gamma > 0}) \tag{4}$$

can also be treated by our results.

The outline of this paper is as follows. In Section 2, we present a new test criterion, accompanied by results about the kind of ratios underlying our study. In Section 3, proofs about the asymptotic properties of the test statistic are given, liable to $F \in \mathcal{D}(G_0)$, $F \in \{\mathcal{D}(G_\gamma) : \gamma < 0\}$ or to $F \in \{\mathcal{D}(G_\gamma) : \gamma > 0\}$, and subsequent rejection regions at an asymptotic level $\alpha$ for testing (2) or (4) are established. The test proves to be consistent. In Section 4, the exact performance of this test is evaluated, via simulation for a variety of models, in accordance with two main factors: the type I error probability and power of the test; comparisons with the Segers and Teugels', the Hasofer and Wang's and the likelihood ratio testing procedures will be carried out. Finally, Section 5 summarizes some concluding remarks and Section 6 is fully dedicated to a practical example.

2. A new test for Gumbel domain

As a starting point, let $X_1, X_2, \ldots, X_n$ be i.i.d. non-negative r.v.'s and define $S_n := X_1 + X_2 + \cdots + X_n$ and $R_n := X_{n,n}/S_n$. The preliminary intention here is to characterize the ratio $R_n$, indicating only roughly its familiar asymptotic properties. This will be done by means of results which exploit the intimate connection of the asymptotic behavior of $R_n$ with regular variation concepts. Such an approach will, inevitably, lead us to the order of finite moments of $F$. Specifically, Theorem 2 refers to the case of no finite first moment but finite moments of some fractional order, $E(X_1^p) < \infty$, $0 < p < 1$, while for the case of no finite moments of any order $p > 0$, $E(X_1^p) = \infty$ for all $p > 0$, Theorem 3 states an equivalence relation involving slowly varying right tails (notation: $1 - F \in RV_0$).

Throughout this paper, $\xrightarrow{as}$, $\xrightarrow{P}$ and $\xrightarrow{d}$ denote almost sure convergence, convergence in probability and convergence in distribution, respectively.

Theorem 1 (O'Brien, 1980). Let $X_1, X_2, \ldots$ be independent non-negative random variables with common d.f. $F$, then

$$R_n \xrightarrow{as} 0 \Leftrightarrow E(X_1) < \infty;$$
$$R_n \xrightarrow{P} 0 \Leftrightarrow E(X_1 I_{\{X_1 \le x\}}) \in RV_0.$$

Theorem 2 (Bingham and Teugels, 1981). Assume $X_1, X_2, \ldots$ are independent non-negative random variables with common d.f. $F$. The following are equivalent:
(i) $R_n \xrightarrow{d} R$, where $R$ is a non-degenerate r.v.;
(ii) $1 - F \in RV_{-\alpha}$, for some $\alpha \in (0, 1)$;
(iii) $E(1/R_n) \to 1/(1 - \alpha)$ as $n \to \infty$, $\alpha \in (0, 1)$.

Theorem 3 (Arov and Bobrov, 1960; Maller and Resnick, 1984). Let $X_1, X_2, \ldots$ be independent non-negative random variables with common d.f. $F$, then

$$R_n \xrightarrow{P} 1 \Leftrightarrow 1 - F \in RV_0.$$

Remark 4. A Borel–Cantelli argument yields $E(X_1) < \infty \Leftrightarrow n^{-1} X_{n,n} \xrightarrow{as} 0$ (see Embrechts et al., 1997, p. 432).
So we have that $n^{-1} X_{n,n} \xrightarrow{as} 0 \Leftrightarrow R_n \xrightarrow{as} 0$. Moreover, from Theorem 6 of Downey (1990), $E(X_1) < \infty$ implies $n^{-1} E(X_{n,n}) \to 0$ as $n \to \infty$.

Giving heed to the statistical choice of domain of attraction problem, we restrict the focus of our study to the tail of $F$. In this framework, the most important features under study comprise the excesses $\{X_{n-i+1,n} - X_{n-k,n}\}_{i=1}^{k}$ over the random threshold $X_{n-k,n}$. Namely, the relative contribution of the maximum of excesses to their sum can be written as

$$R_n(k) := \frac{X_{n,n} - X_{n-k,n}}{\sum_{i=1}^{k} (X_{n-i+1,n} - X_{n-k,n})}. \tag{5}$$

Under condition (1), the ratio (5) is approximately equal in distribution (notation: $\overset{d}{\sim}$) to the ratio of the maximum to the sum of $k$ independent r.v.'s $W_1^{(\gamma)}, \ldots, W_k^{(\gamma)}$ identically distributed as a r.v. $W^{(\gamma)}$ with GP($\gamma$) distribution ($\gamma \in \mathbb{R}$), i.e.

$$R_n(k) \overset{d}{\sim} \frac{(Y_{k,k}^{\gamma} - 1)/\gamma}{\sum_{i=1}^{k} (Y_i^{\gamma} - 1)/\gamma} \overset{d}{=} \frac{W_{k,k}^{(\gamma)}}{\sum_{i=1}^{k} W_i^{(\gamma)}}$$

or equivalently

$$T_n(k) \overset{d}{\sim} \frac{k\, W_{k,k}^{(\gamma)}}{\sum_{i=1}^{k} W_i^{(\gamma)}}, \tag{6}$$

where $\{Y_{i,n}\}_{i=1}^{n}$ are the o.s. of $Y_1, \ldots, Y_n$ i.i.d. r.v.'s with common Pareto d.f. $F(y) = 1 - y^{-1}$, $y > 1$. For $\gamma = 0$, take $(Y^\gamma - 1)/\gamma$ as $\log(Y)$. The statement in (6) may be checked by taking into consideration the first-order condition (1) with the equality $X_{i,n} \overset{d}{=} U(Y_{i,n})$ (notation $\overset{d}{=}$ for 'is distributed as') and using the fact that, for an intermediate sequence $k = k_n$, $Y_{n-k,n} \xrightarrow{P} \infty$ (see, e.g. Smirnov, 1952) to expand the scaled excesses for $i = 1, \ldots, k$. Then

$$\frac{X_{n-i+1,n} - X_{n-k,n}}{a(Y_{n-k,n})} \overset{d}{=} \frac{U(Y_{n-i+1,n}) - U(Y_{n-k,n})}{a(Y_{n-k,n})} = \frac{(Y_{n-i+1,n}/Y_{n-k,n})^{\gamma} - 1}{\gamma} + o_p(1) = \frac{Y_{k-i+1,k}^{\gamma} - 1}{\gamma} + o_p(1).$$

Remark 5. The truncated moments of a GP($\gamma$) distributed r.v. $W^{(\gamma)}$ take the form $E(W^{(\gamma)} I_{\{W^{(\gamma)} \le x\}}) = (E(Y^\gamma I_{\{Y^\gamma \le x\}}) + x^{-1/\gamma} - 1)/\gamma$, where $Y$ is a standard Pareto random variable.
Thus, considering $R_n(k)$ liable to an intermediate sequence of positive integers,
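
The statistic $T_n(k)$ in (3) and the ratio $R_n(k)$ in (5) can be computed directly from the $k$ largest observations. The following Python sketch is only illustrative: the function name `excess_ratios` and the simulated exponential sample are assumptions made for this example and are not part of the paper.

```python
import random

def excess_ratios(sample, k):
    """Return (R_n(k), T_n(k)) built from the excesses over X_{n-k,n}.

    R_n(k) is the maximum excess divided by the sum of the k excesses,
    and T_n(k) = k * R_n(k) is the maximum divided by the mean excess.
    """
    n = len(sample)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    xs = sorted(sample)                 # order statistics X_{1,n} <= ... <= X_{n,n}
    threshold = xs[n - k - 1]           # X_{n-k,n} (0-based index n-k-1)
    excesses = [x - threshold for x in xs[n - k:]]
    total = sum(excesses)
    if total <= 0:
        raise ValueError("all top excesses are zero (ties at the threshold)")
    r = excesses[-1] / total            # maximum excess over sum of excesses
    return r, k * r

# Usage on simulated data (illustrative choice: unit exponential sample,
# which lies in the Gumbel domain).
random.seed(1)
data = [random.expovariate(1.0) for _ in range(1000)]
r_nk, t_nk = excess_ratios(data, k=50)

# Location/scale invariance: a*x + b with a > 0 leaves both ratios unchanged
# up to floating-point rounding.
shifted = [3.0 * x + 10.0 for x in data]
r_shift, t_shift = excess_ratios(shifted, k=50)
assert abs(t_nk - t_shift) < 1e-8
```

By construction $1 \le T_n(k) \le k$, since the maximum excess is at least the mean and at most the sum of the excesses; values near $k$ indicate that the maximum dominates the sum.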