Method for the imputation of the earnings variable in the Belgian LFS

Workshop on LFS methodology, Madrid, 10-11 May 2012
Astrid Depickere, Anja Termote, Pieter Vermeulen

Outline
1. Introduction
2. Imputation
3. Imputation method for Earnings variable in LFS
4. Implementation: different steps
5. General evaluation

Introduction
- The earnings variable in the Labour Force Survey (LFS) has a very high number of missing values (24.9% in 2011).

[Figure: Number of missings on the earnings variable in the LFS, in %, by year, 1999-2011]

- In 2009:
  - Some actions were undertaken to reduce the number of missings
  - Start of the imputation of the earnings variable

Imputation
Imputation = replacing missing values with credible data from a donor.
- What is credible data? Using what we know in order to say something about what we do not know.
- Donor?
  - Same source: borrowing information from the nonmissing observations to impute for the missing observations
  - External source: using information from another source to impute for the missing observations
- Imputation techniques:
  - Single imputation: generates a single replacement value for each missing data point.
  - Multiple imputation: creates several copies of the data set and imputes each copy with different plausible estimates of the missing values.

Imputation method for Earnings variable in LFS (1)
Regression imputation using an external source: the Structure of Earnings Survey (SES)
- Regression imputation (or conditional mean imputation) replaces missing values with predicted scores from a regression equation.
- We use the information from the SES about the effects of different personal and job characteristics on the wage level in order to predict a wage level for the missing observations in the LFS.
- Why SES (instead of LFS)?
  - Wage variables are measured better in the SES than in the LFS. Earnings are the core variables in the SES, whereas they are not in the LFS.
  - High number of missings in the LFS: a regression model estimated on the LFS would be insufficiently representative.

Imputation method for Earnings variable in LFS (2)
Some particular issues that needed to be resolved:
- Two-year gap between delivery of SES data and LFS data
  - Indexation on the basis of the Labour Cost Index
- SES is a yearly survey but does not always cover the entire market. Some sectors are included only once every four years (ESTAT year).
  - Coefficients for the missing years are derived on the basis of the last nonmissing year
- SES only measures gross wages, whereas net wages are needed for the LFS.
  - Applying a gross/net calculation (taking into account as much as possible the information in the LFS on the individual and his or her household)

Implementation: different steps (1)
Step 1: Obtain regression equation from SES
- SAS proc GLM
- Different models were compared
- Final model has an R-squared of 75%
- Only main effects, no interactions
- Regression parameters were converted into a formula for the prediction of a Gross Monthly Wage

loggmw = sex age age2 isco_3d pct_pt nace_2d isced_6cl region size

- Dependent variable = variable to be predicted
- Independent variables = predictors
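Step 1 can be illustrated with a minimal sketch, written in Python rather than SAS proc GLM and run on synthetic data. It assumes a much smaller model than the one on the slide (intercept, sex, age and age squared only). The donor data, the "true" coefficients and the names fit_ols, design_row and predict_gmw are all hypothetical. It only shows the principle: estimate a main-effects regression of the log gross monthly wage on the donor source, then turn the coefficients into a prediction formula.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def design_row(sex, age):
    # Main effects only: intercept, sex dummy, age and age squared
    # (age expressed in decades to keep the columns on a similar scale).
    return [1.0, float(sex), age / 10.0, (age / 10.0) ** 2]


# Synthetic "SES-like" donor data; the coefficients below are invented.
random.seed(2012)
donors = []
for _ in range(500):
    sex = random.randint(0, 1)
    age = random.uniform(20, 65)
    loggmw = 7.0 + 0.10 * sex + 0.30 * (age / 10) - 0.03 * (age / 10) ** 2
    donors.append((sex, age, loggmw + random.gauss(0, 0.1)))

coef = fit_ols([design_row(s, a) for s, a, _ in donors],
               [y for _, _, y in donors])


def predict_gmw(sex, age):
    """Predicted gross monthly wage: exponentiate the predicted log wage."""
    log_pred = sum(c * x for c, x in zip(coef, design_row(sex, age)))
    return math.exp(log_pred)


print(round(predict_gmw(sex=1, age=40)))
```

The categorical predictors on the slide (isco_3d, nace_2d, isced_6cl, region, size) would enter such a model as sets of dummy variables, which is what a class statement does in proc GLM.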

Implementation: different steps (2)
Step 2: Impute Wage variable in LFS
- Regression equation is applied
- Result = Gross Monthly Wage value for the missing observations in the LFS
- Apply indexation (by NACE_1d) obtained from the Labour Cost Index
Step 3: Prepare LFS dataset for Gross/Net calculation
- Update calculation according to legislative rules: the net wage is a function of the gross wage, the number of dependent persons, partnership and the employment position (and wage) of the partner
- Derive household variables
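The indexation in Step 2 can be sketched as follows, assuming it amounts to multiplying the predicted gross wage by the ratio of the Labour Cost Index in the LFS year to the index in the SES reference year, per NACE section. The slides do not give the exact formula, so this is only one plausible reading. The index values and the names LCI and index_wage are hypothetical.

```python
# Hypothetical Labour Cost Index levels per NACE section (1-digit level).
LCI = {
    "C": {2009: 100.0, 2011: 105.0},
    "G": {2009: 100.0, 2011: 103.0},
}


def index_wage(gross_wage, nace_1d, ses_year, lfs_year, lci=LCI):
    """Scale a wage predicted at SES-year level to the LFS reference year."""
    factor = lci[nace_1d][lfs_year] / lci[nace_1d][ses_year]
    return gross_wage * factor


# A wage of 2500 predicted from 2009 SES coefficients, moved to 2011 levels.
indexed = index_wage(2500.0, "C", 2009, 2011)
print(round(indexed, 2))
```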

Implementation: different steps (3)
Step 4: Determine Net Wage
- By applying the gross/net calculation, a Net Monthly Wage value is obtained (for all observations)
- Validation of the result: compare imputed values to observed values (for the nonmissing observations)
- The method not only serves as an imputation method, but can also be used for data editing (e.g. evaluation of outliers)
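The validation and data-editing use in Step 4 can be sketched as below. The example assumes that a predicted value is available for every respondent and that an observed wage is treated as suspicious when its ratio to the predicted wage falls outside a fixed band. The band (0.5 to 2.0), the sample values and the function name validate are hypothetical choices, not the rule used in the Belgian LFS.

```python
from statistics import mean


def validate(observed, imputed, low=0.5, high=2.0):
    """Compare model-based values with observed ones where both exist.

    Returns the mean difference (imputed minus observed) and the indices of
    observed values whose ratio to the imputed value lies outside [low, high].
    """
    pairs = [(o, p) for o, p in zip(observed, imputed) if o is not None]
    mean_diff = mean(p - o for o, p in pairs)
    flagged = [
        i for i, (o, p) in enumerate(zip(observed, imputed))
        if o is not None and not (low <= o / p <= high)
    ]
    return mean_diff, flagged


# None marks a missing observed wage; imputed holds the model value for everyone.
observed = [1500.0, None, 2100.0, 9000.0, 1800.0]
imputed = [1550.0, 1700.0, 2000.0, 2200.0, 1750.0]

mean_diff, flagged = validate(observed, imputed)
print(mean_diff, flagged)
```

Here the fourth respondent reports a wage roughly four times the predicted one and is flagged for review. That single case also dominates the mean difference, which is why outliers are worth inspecting before judging the bias.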

General evaluation
Effect of imputation on estimates (descriptive values):
- Bias remains very small => strong coherence between the sources
- Are the imputed (but biased) data of better quality than the original ones?

General evaluation (2)
Effect of imputation on variance and sampling error:
- Artificial reduction of variance: the true variance is underestimated
- Solution could lie in the use of a different technique:
  - Stochastic regression imputation
  - Multiple imputation
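The variance problem and the first proposed remedy can be illustrated with a small simulation. It is a sketch on synthetic data with a single predictor, not the LFS model, and all names and parameter values are hypothetical. Deterministic regression imputation places every imputed value exactly on the regression line. Stochastic regression imputation adds a random residual drawn with the estimated residual standard deviation.

```python
import random
from statistics import mean, pvariance

random.seed(42)
n = 2000
x = [random.gauss(0, 1) for _ in range(n)]
y_true = [1.0 + xi + random.gauss(0, 1) for xi in x]
missing = [random.random() < 0.4 for _ in range(n)]  # about 40% missing

# Fit a simple regression on the observed cases only.
xo = [xi for xi, m in zip(x, missing) if not m]
yo = [yi for yi, m in zip(y_true, missing) if not m]
mx, my = mean(xo), mean(yo)
slope = (sum((a - mx) * (b - my) for a, b in zip(xo, yo))
         / sum((a - mx) ** 2 for a in xo))
intercept = my - slope * mx
resid_sd = (sum((b - intercept - slope * a) ** 2 for a, b in zip(xo, yo))
            / (len(xo) - 2)) ** 0.5

# Deterministic imputation: the predicted value itself.
y_det = [intercept + slope * xi if m else yi
         for xi, yi, m in zip(x, y_true, missing)]

# Stochastic imputation: predicted value plus a random residual.
y_sto = [intercept + slope * xi + random.gauss(0, resid_sd) if m else yi
         for xi, yi, m in zip(x, y_true, missing)]

var_true = pvariance(y_true)
var_det = pvariance(y_det)
var_sto = pvariance(y_sto)
print(round(var_true, 2), round(var_det, 2), round(var_sto, 2))
```

In this setup the deterministically imputed data show a clearly smaller variance than the complete data, while the stochastic version restores most of the lost spread. Multiple imputation goes one step further by repeating such draws several times and combining the resulting estimates, so that the uncertainty due to imputation is reflected in the sampling error.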