Testing significance of peaks in kernel density estimator by SiZer map


Proceedings of the 8th Professor Aleksander Zelias International Conference on Modelling and Forecasting of Socio-Economic Phenomena

Aleksandra Baszczyńska

Abstract

In kernel density estimation the researcher has to choose two parameters of the kernel method: the kernel function and the smoothing parameter, called the bandwidth. Special care is required in choosing the latter. Too small a bandwidth results in spurious peaks in the density estimator; too large a bandwidth oversmooths it. This paper presents a useful technique known as the SiZer map, which helps in determining whether peaks in the density estimator are significant or not. The kernel density estimator is viewed through different levels of smoothing. The SiZer map can be used by non-experts and speeds up the procedure of deciding which features are signal and which are noise. The procedure of testing hypotheses about significance of this type is described. The application of the SiZer map is illustrated by an analysis of carbon dioxide emission in countries, carried out by density function estimation.

Keywords: kernel density estimation, SiZer map, testing hypothesis
JEL Classification: C, C3

1. Introduction

Density estimation is one of the most widely used ways of identifying and describing the structure of data on the basis of a random sample. Nonparametric methods, especially kernel density estimation, are becoming more and more popular in the analysis of, among others, economic variables (Li and Racine, 2007). In kernel density estimation the researcher has to determine two parameters of the method: the kernel function and the smoothing parameter. Several kernel functions are presented in the literature, but the influence of this choice on the results of density estimation is regarded as not significant. The smoothing parameter, known as the bandwidth, determines the level of smoothing and plays an important role in the resulting estimator. Ways of choosing an appropriate value of the smoothing parameter are therefore considered in, for example, Silverman (1986).
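The effect of the bandwidth described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the helper names `gaussian_kde` and `count_peaks`, the synthetic two-component sample, the grid and the three bandwidth values are all assumptions chosen only for illustration.

```python
import numpy as np

def gaussian_kde(x_grid, sample, h):
    """Gaussian kernel density estimate on x_grid with bandwidth h."""
    u = (x_grid[:, None] - sample[None, :]) / h          # shape (grid, n)
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)    # standard normal density
    return kernel.sum(axis=1) / (len(sample) * h)

def count_peaks(values):
    """Number of interior local maxima of a curve sampled on a grid."""
    d = np.diff(values)
    return int(np.sum((d[:-1] > 0) & (d[1:] < 0)))

# Synthetic sample (assumption): two normal components centred at -2 and 2.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(-2, 1, 100), rng.normal(2, 1, 100)])
grid = np.linspace(-6, 6, 601)

# Illustrative bandwidths (assumption): very small, moderate, very large.
peaks = {h: count_peaks(gaussian_kde(grid, sample, h)) for h in (0.05, 0.5, 5.0)}
for h, k in peaks.items():
    print(f"h = {h}: {k} peak(s)")
```

With the smallest bandwidth the estimate follows individual observations and shows many spurious peaks, while the largest bandwidth merges the two components into a single bump. A single bandwidth therefore gives only one of several possible pictures of the data.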
The classical approach to kernel density estimation uses one value of the smoothing parameter, which results in a single estimated function. Even when a good choice of smoothing parameter is made, a misleading impression can be created by the bumps of the estimator. The problem of assessing whether these bumps are really there, and of avoiding spurious noise, should be considered in the data structure analysis. In technical analysis this problem means determining which structure is signal and which is noise.

The SiZer map is a graphical tool used in analyzing the visible features that represent important underlying structures through different levels of smoothing, which means that the kernel density function is estimated and analyzed for different values of the bandwidth. The idea of considering a family of smooths can be found in scale space theory in computer science. Chaudhuri and Marron (2000) explored this problem from a statistical point of view. A bump in the structure of a curve such as a density function is characterized by going up on one side and going down on the other. The bump is a zero crossing of the derivative, and it is statistically significant when the derivative estimate is significantly positive to the left and significantly negative to the right. The name SiZer stems from assessing the SIgnificant ZERo crossings of the derivative.

Compared with the classical approach there are two main differences. Firstly, SiZer studies a very wide range of bandwidths instead of looking at just one. Secondly, instead of focusing on a true underlying curve as in the classical approach, SiZer looks at the true curve viewed at varying bandwidths, which can lead to recovering the significant aspects of the underlying function for different levels of smoothing. The benefits are evident: it speeds up the process of deciding which features are really there and makes this type of inference readily doable by non-experts.

Footnote 1 (author affiliation): University of Łódź, Department of Statistical Methods, 90-214 Łódź, Rewolucji 1905 r. 41, Poland, albasz@uni.lodz.pl.

2. Kernel method

The kernel method can be applied in different areas: density estimation, regression estimation, classification and pattern recognition. In density function estimation the kernel method, known as the Parzen-Rosenblatt method, is one of the most widely used procedures for assessing the characteristic features of a random variable. A comprehensive review of kernel density methods can be found in Silverman (1986) and Li and Racine (2007).
The kernel density estimator is defined in the following way (Rosenblatt, 1956; Parzen, 1962):

$$\hat f_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right), \qquad (1)$$

where $X_1, X_2, \ldots, X_n$ is the random $n$-element sample, $h$ is the smoothing parameter and $K(\cdot)$ is the kernel function.

Kernel functions, which are in most cases density functions, are presented, among others, in Domański and Pruska (2000). The most widely used is the Gaussian kernel, which is the density function of the standard normal distribution. When this kernel is used in kernel density estimation, the number of zero crossings of the derivative estimate is always a decreasing function of the smoothing parameter. Because of this feature only the Gaussian kernel is used in the SiZer map.

In the classical approach to kernel density estimation the researcher has to decide which value of the smoothing parameter is appropriate for a particular estimation. The smoothing parameter controls, as in other nonparametric curve estimators (for example the histogram), the level of smoothness. A small value of $h$ leads to a jagged estimate, while a big value tends to produce an oversmoothed estimator. The literature presents several procedures for choosing this value, such as Silverman's rule of thumb, cross-validation, the plug-in method and their modifications. In the SiZer map a range of smoothing parameter values, instead of one value as in the classical approach, is taken into consideration.

3. Testing hypotheses in SiZer map

In the SiZer map we can consider not only one kernel density estimator constructed for a particular kernel function and a particular value of the smoothing parameter, but a family of density estimators with the Gaussian kernel function and a range of smoothing parameter values. The family of smooth curves is the following:

$$\left\{ \hat f_h(x) : h \in [h_{\min}, h_{\max}] \right\}, \qquad (2)$$

where $h_{\min} = 2B$, $B$ is the binwidth and $h_{\max} = x_{\max} - x_{\min}$. The case of $h_{\min} = 0$, $h_{\max} = \infty$ is also considered.

The family (2) represents different structures of the curve under different levels of smoothing and can be called the scale space surface, while $E \hat f_h(x)$ is the true curve viewed at different scales of resolution.

When a peak is observed, the sign of the derivative is positive before the peak, the derivative is equal to 0 at the point of maximum, and the derivative is negative after the peak. When a valley is observed, the sign of the derivative is negative before the valley, the derivative is equal to 0 at the point of minimum, and the derivative is positive after the valley. Hence, peaks and valleys are determined by zero crossings of the derivative. In the SiZer map the Gaussian kernel function is used:

$$K(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2},$$

because in kernel density estimation with this kernel function the number of zero crossings of the derivative (the number of peaks) decreases monotonically as the bandwidth increases (Silverman, 1986). Chaudhuri and Marron (2000) show that in kernel regression with the Gaussian kernel function the number of zero crossings of the $m$-th order derivative decreases monotonically as the bandwidth increases.

In SiZer the following hypotheses are considered:

$$H_0: \frac{\partial}{\partial x} E \hat f_h(x) = 0, \qquad (3)$$

$$H_1: \frac{\partial}{\partial x} E \hat f_h(x) \neq 0. \qquad (4)$$

If $H_0$ is rejected, there is evidence that $\frac{\partial}{\partial x} E \hat f_h(x)$ is positive or negative, according to the sign of $\frac{\partial}{\partial x} \hat f_h(x)$ (Chaudhuri and Marron, 2000). The test is done independently at each $(x, h)$ location in the scale space.

In the calculation of the quantile $q$ the following fact is used: if two locations $u_1$ and $u_2$ are sufficiently far apart relative to the bandwidth $h$, then $\hat f_h(u_1)$ and $\hat f_h(u_2)$ are approximately independent, which implies that $\hat f'_h(u_1)$ and $\hat f'_h(u_2)$ are approximately independent. The simultaneous confidence limit problem is then approximated by $m$ independent confidence intervals. The estimate for $m$ is calculated through the estimated effective sample size $ESS(x, h)$:

$$ESS(x, h) = \frac{\sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right)}{K(0)}. \qquad (5)$$

When the kernel is uniform, $ESS(x, h)$ is simply the number of data points in the window of width $h$ centered at $x$. For the Gaussian kernel the data points are downweighted according to the height of the kernel function. Next, $m$ is chosen to be the number of independent blocks (that is, of independent confidence intervals) of average size available from a dataset of size $n$:

$$m(h) = \frac{n}{\operatorname{avg}_{x} ESS(x, h)}. \qquad (6)$$

$ESS(x, h)$ can also be used to indicate where the smooth is based on sparse data, by highlighting the regions where $ESS(x, h)$ is small. Chaudhuri and Marron (1999) suggested the threshold $ESS(x, h) < 5$. Therefore, to avoid problems with small $ESS(x, h)$, the calculation of the block size is modified to:

$$m(h) = \frac{n}{\operatorname{avg}_{x \in D_h} ESS(x, h)}, \qquad (7)$$

where $D_h = \{x : ESS(x, h) \geq 5\}$ is the set of locations where the data are dense.

Assuming independence of the $m$ blocks of data, the approximate simultaneous quantile for a $(1 - \alpha) \cdot 100\%$ confidence interval is:

$$q(h) = \Phi^{-1}\left(\frac{1 + (1 - \alpha)^{1/m(h)}}{2}\right), \qquad (8)$$

where $\Phi$ is the standard normal distribution function.

For the derivative estimate $\hat f'_h(x)$ the confidence limits, depending on $h$, can be constructed:

$$\left(\hat f'_h(x) - q(h)\, \widehat{sd}\big(\hat f'_h(x)\big),\ \hat f'_h(x) + q(h)\, \widehat{sd}\big(\hat f'_h(x)\big)\right), \qquad (9)$$

where $q(h)$ is the appropriate quantile, and the calculation of $\widehat{sd}\big(\hat f'_h(x)\big)$ is based on the fact that $\hat f'_h(x)$ is an average of the derivative kernel functions:

$$\widehat{var}\big(\hat f'_h(x)\big) = \widehat{var}\left(\frac{1}{n} \sum_{i=1}^{n} \frac{1}{h^2} K'\left(\frac{x - X_i}{h}\right)\right) = \frac{1}{n}\, s^2\left(\frac{1}{h^2} K'\left(\frac{x - X_1}{h}\right), \ldots, \frac{1}{h^2} K'\left(\frac{x - X_n}{h}\right)\right),$$

where $s^2(k_1, \ldots, k_n)$ is the sample variance of $k_1, \ldots, k_n$.

In the SiZer map the horizontal axis is $x$ and the vertical axis is $h$. From the SiZer map it is possible to present the information, for given $x$ and $h$, about the positivity and negativity of the derivative of $E \hat f_h(x) = \int \frac{1}{h} K\left(\frac{x - u}{h}\right) f(u)\, du$. The following color codes are used:

1. blue: $\hat f_h(x)$ is significantly increasing (zero is less than the lower confidence limit),
2. red: $\hat f_h(x)$ is significantly decreasing (zero is greater than the upper confidence limit),
3. purple: $\hat f_h(x)$ is not significantly increasing or decreasing (zero is within the confidence limits),
4. grey: regions where the data are too sparse to make statements about significance (the effective sample size is less than 5).
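The testing procedure of this section can be sketched in Python. This is a minimal illustration under stated assumptions, not the Matlab code used in the paper: the function name `sizer_map`, the integer colour codes, the grid, the bandwidths and the synthetic sample are hypothetical, and the simultaneous quantile is taken in the form given by Chaudhuri and Marron (1999) as it is usually quoted.

```python
import numpy as np
from statistics import NormalDist

def sizer_map(sample, x_grid, bandwidths, alpha=0.05, ess_min=5.0):
    """Classify each (h, x) cell by the sign of the density derivative.

    Codes (an arbitrary convention chosen for this sketch):
      1 = significantly increasing (blue), -1 = significantly decreasing (red),
      0 = no significant slope (purple),    2 = data too sparse (grey).
    """
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    codes = np.zeros((len(bandwidths), len(x_grid)), dtype=int)
    for j, h in enumerate(bandwidths):
        u = (x_grid[:, None] - sample[None, :]) / h       # shape (grid, n)
        k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)      # Gaussian kernel K(u)
        dk = -u * k / h**2                                # K'(u) / h^2, one term per observation
        deriv = dk.mean(axis=1)                           # derivative estimate
        sd = dk.std(axis=1, ddof=1) / np.sqrt(n)          # sd of an average of n terms
        ess = k.sum(axis=1) * np.sqrt(2 * np.pi)          # sum of K(u) divided by K(0)
        dense = ess >= ess_min                            # locations with enough data
        if not dense.any():
            codes[j, :] = 2
            continue
        m = n / ess[dense].mean()                         # number of independent blocks
        q = NormalDist().inv_cdf((1 + (1 - alpha) ** (1 / m)) / 2)
        codes[j, deriv - q * sd > 0] = 1                  # zero below the lower limit
        codes[j, deriv + q * sd < 0] = -1                 # zero above the upper limit
        codes[j, ~dense] = 2                              # grey overrides everything
    return codes

# Synthetic example (assumption): a standard normal sample of size 500.
rng = np.random.default_rng(0)
sample = rng.normal(0.0, 1.0, 500)
grid = np.linspace(-3.0, 3.0, 61)
bandwidths = np.logspace(-2, 0, 9)        # from 0.01 to 1 on a log scale
codes = sizer_map(sample, grid, bandwidths)
print(codes[-1])                          # row for the largest bandwidth
```

For the largest bandwidth in this example the row of codes should read as increasing to the left of the mode and decreasing to the right, which is the pattern of a single significant peak. For the smallest bandwidths the tails contain too few observations and are marked as sparse.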

In the SiZer map a log scale is used for $h$ in the display (it gives smooths that are more equally spaced). The dotted white curves show effective window widths for each bandwidth, as intervals representing $\pm 2h$ ($\pm 2$ standard deviations of the Gaussian kernel). There is a variation of the SiZer map named the SiCon map (Significant CONvexity), in which statistical inference is based on the second derivative and regions of statistically significant curvature are highlighted. A special color code is used: cyan for significant concavity (downward curvature), orange for significant convexity (upward curvature) and green for no significant curvature.

4. Application of SiZer map

In the literature there are examples of using the SiZer map in the analysis of economic data (Zambom and Dias, 2012), medical data (Skrovseth, Bellika, Godtliebsen, 2012) or geochemical data (Rudge, 2008).

The application of the SiZer map is illustrated by the analysis of carbon dioxide emission in the countries of the world. The data were downloaded from the World Bank data bank (http://data.worldbank.org/topic/environment [5..4]). Total carbon dioxide emission (in thousand metric tons) is available for the countries of the world over a period of several decades. The last year was taken into account in the research. Samples of three different sizes were chosen, and on the basis of these samples the SiZer maps were obtained using codes in Matlab.

Figure 1 shows the results for the smallest sample: the kernel density estimator for different values of the smoothing parameter (top) and the SiZer map (bottom). In the SiZer map blue shows regions of significantly positive $\hat f'_h(x)$, red shows regions of significantly negative $\hat f'_h(x)$, purple shows regions where $\hat f_h(x)$ is not significantly increasing or decreasing, and grey shows regions where it is not possible to make inference. For large values of the bandwidth the density function significantly increases, then there is a region where SiZer is unable to distinguish, and then there is a region where the density function significantly decreases. For small values of the bandwidth the SiZer map is grey, which means that it is not possible to separate signal and noise. This situation is closely connected with the sample size. For such a small sample size the process of estimating the density function is rather difficult.
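The link between the sample size and the grey region can be checked with a small Python sketch. It does not use the carbon dioxide data: the skewed synthetic population, the sample sizes 20, 50 and 500, the grid, the bandwidths and the helper name `grey_fraction` are assumptions made only for illustration.

```python
import numpy as np

def grey_fraction(sample, x_grid, bandwidths, ess_min=5.0):
    """Share of (x, h) cells whose effective sample size is below ess_min."""
    sample = np.asarray(sample, dtype=float)
    sparse = 0
    for h in bandwidths:
        u = (x_grid[:, None] - sample[None, :]) / h
        ess = np.exp(-0.5 * u**2).sum(axis=1)   # Gaussian kernel: sum of K(u) / K(0)
        sparse += int(np.sum(ess < ess_min))
    return sparse / (len(bandwidths) * len(x_grid))

# Synthetic, right-skewed population (assumption, not the emission data).
rng = np.random.default_rng(1)
population = rng.lognormal(mean=0.0, sigma=1.0, size=5000)
x_grid = np.linspace(0.0, 10.0, 101)
bandwidths = np.logspace(-1.5, 1.0, 21)

# Nested samples: each larger sample contains the smaller ones.
fractions = {n: grey_fraction(population[:n], x_grid, bandwidths)
             for n in (20, 50, 500)}
for n, share in fractions.items():
    print(f"n = {n}: grey share = {share:.2f}")
```

Because the samples are nested, adding observations can only increase the effective sample size at every location, so the grey share cannot grow with the sample size. This matches the behaviour reported for the SiZer maps of the larger samples.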

[Figure 1: top panel "Family Overlay" (kernel density estimates for a range of bandwidths, plotted against x); bottom panel "Slope SiZer Map" (horizontal axis x, vertical axis log10(h)).]

Fig. 1. SiZer map for the smallest sample.

Figures 2-3 present SiZer maps for bigger sample sizes. It should be noted that as the sample size increases, the grey region becomes smaller.

[Figure 2: top panel "Family Overlay"; bottom panel "Slope SiZer Map" (horizontal axis x, vertical axis log10(h)).]

Fig. 2. SiZer map for the intermediate sample.

[Figure 3: top panel "Family Overlay"; bottom panel "Slope SiZer Map" (horizontal axis x, vertical axis log10(h)).]

Fig. 3. SiZer map for the largest sample.

Conclusion

The SiZer map is a very useful technique for determining the structure of data. It can be treated as a nonclassical method because of its multiple results. Taking into account not only one value of the smoothing parameter, as in the classical approach, but a range of values broadens the researcher's point of view. One issue should be underlined, however: the sample size. Too small a sample size prevents a detailed analysis of the structure of the data. Further research should be carried out to determine the influence of the sample size on the results of the SiZer map.

Acknowledgements

This work was supported by the project number DEC-//B/HS4/746 from the National Science Centre.

References

Chaudhuri, P., & Marron, J. S. (1999). SiZer for exploration of structures in curves. Journal of the American Statistical Association, 94, 807-823.

Chaudhuri, P., & Marron, J. S. (2000). Scale space view of curve estimation. The Annals of Statistics, 28, 408-428.

Domański, Cz., & Pruska, K. (2000). Nieklasyczne metody statystyczne. PWE, Warszawa.

Li, Q., & Racine, J. S. (2007). Nonparametric econometrics: Theory and practice. Princeton University Press, Princeton and Oxford.

Parzen, E. (1962). On estimation of a probability density function and mode. Ann. Math. Statist., 33.

Rosenblatt, M. (1956). Remarks on some nonparametric estimates of a density function. Ann. Math. Statist., 27.

Rudge, J. (2008). Finding peaks in geochemical distributions: A re-examination of the helium-continental crust correlation. Earth and Planetary Science Letters, 274, 179-188.

Silverman, B. W. (1986). Density estimation for statistics and data analysis. Chapman and Hall, London.

Skrovseth, S., Bellika, J., & Godtliebsen, F. (2012). Causality in scale space as an approach to change detection. Retrieved from http://www.plosone.org/article/fetchObject.action?uri=info%3Adoi%2F10.1371%2Fjournal.pone.0052253&representation=PDF.

Turner, L. (3). Exploring structure of curves using SiZer. Retrieved from http://www.stat.ubc.ca/~webmaste/howto/statsoftware/misc/sizer/paper.pdf.

Zambom, A. Z., & Dias, R. (2012). A review of kernel density estimation with applications to econometrics. Retrieved from http://arxiv.org/pdf/1212.2812.pdf.