Selection bias in innovation studies: A simple test

Selection bias in innovation studies: A simple test
Work in progress
Gaétan de Rassenfosse, University of Melbourne (MIAESR and IPRIA), Australia
Annelies Wastyn, KU Leuven, Belgium
IPTS Workshop, June 2011

Innovation production functions (IPF) are often used in innovation studies
They relate a firm's innovation output (I) to its research input (R): I = f(R).
They are used to study the impact of innovation policies, the contribution of innovation to productivity growth, etc.
Inventions are not observed, so patents are used as a proxy. Patents are an imperfect measure with well-known limitations (Jefferson, 1929; Pavitt, 1985; Griliches, 1990):
Not all inventions are patented.
The value of patents varies widely (the majority are worthless).
We look at a third shortcoming: patents are counted in a simple manner.

A patent protects an invention in one market
Priority filing (PF)
Second filing (SF) = extension of the protection to a foreign market
[Figure: world map. Picture downloaded from worldmapsphotos.com]

Firms have a variety of patenting routes available to them...
[Figure: filing routes of Inventions 1 to 8 across Belgium, France, US, EPO, WIPO and ROW.]
In theory: global count of priority filings.
In practice: patents are counted at one reference office.

Which is fine (but noisy)... unless the decision to select one office is not random
[Figure: relationship between firm size (small to large) and level of innovation (number of patents per year, low to high) for firm i, comparing the global count with the count at EPO.]

The objective of this research is twofold
1. Study whether the single office count leads to a selection bias, or whether it is just noise.
2. Propose a methodology to identify potential selection biases when the researcher is limited to information collected at one office.

AGENDA
0. Context
1. Motivations
2. The problem and proposed solution
3. Data
4. Empirical analysis
5. Conclusions

MOTIVATIONS
The single office count is very popular
Early evidence that it may lead to a selection bias

Single office count is a widespread practice
Random sample of 20 recent articles in A* or A journals that estimate IPF for European firms:
17 use single office count (mostly EPO)
1 uses two offices (EPO + national)
2 use a broader count
All articles provide very little information on the patent indicator used.

Risk of a selection bias
No study has explicitly looked at this question.
Some results suggest that the filing route is not random:
Seip (2010): large Dutch companies are much more likely than SMEs to file at the EPO
van Zeebroeck and van Pottelsberghe (2011) and Jensen, Thomson and Yong (2011): patent value affects the filing route
This raises the spectre of a selection bias.

PROBLEM AND PROPOSED SOLUTION
Exploit information on the mix between priority filings and second filings

The selection of patents may bias estimates of IPF
The true unobserved output for firm i is (in log-additive form):
$\ln(P_i) = \beta x_i + \varepsilon_i$ [IPF]
Only a fraction $\pi_i$ of the output is observed at the reference office:
$\ln(\pi_i) = \alpha x_i + u_i$
Hence the observed output $\tilde{P}_i$ can be written as:
$\ln(\tilde{P}_i) = \ln(\pi_i P_i) = \ln(\pi_i) + \ln(P_i) = x_i(\alpha + \beta) + \varepsilon_i + u_i$
Selection bias if $\alpha$ is different from 0.
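The argument on this slide can be checked with a small simulation. The sketch below is only an illustration under assumed values: the coefficients, noise levels, sample size and the intercept of the selection equation are all hypothetical, and a simple OLS regression on logs stands in for the count models used later in the talk. It shows that regressing the log of the observed output on x recovers the sum of the two slopes (alpha plus beta), so the estimate equals beta only when alpha is 0.

```python
import random


def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(alpha, beta=0.5, n=5000, seed=0):
    """Estimated slope of ln(observed output) on x for a given alpha.

    All numbers here are assumptions made for illustration only.
    """
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 4.0) for _ in range(n)]
    # True (unobserved) output: ln P = beta * x + eps
    ln_true = [beta * xi + rng.gauss(0.0, 0.5) for xi in x]
    # Fraction seen at the reference office: ln pi = const + alpha * x + u
    ln_pi = [-3.0 + alpha * xi + rng.gauss(0.0, 0.3) for xi in x]
    # Observed output: ln(pi * P) = ln pi + ln P
    ln_obs = [p + s for p, s in zip(ln_true, ln_pi)]
    return ols_slope(x, ln_obs)


slope_random = simulate(alpha=0.0)    # selection unrelated to x
slope_selected = simulate(alpha=0.3)  # selection increases with x
print(round(slope_random, 2), round(slope_selected, 2))
```

With alpha set to 0 the estimated slope stays close to the assumed beta of 0.5; with alpha set to 0.3 it moves towards 0.8, which is the bias the slide describes.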

Objective: looking for randomness in the selection
We would like to test whether π is random.
π is not observed, so direct inference is impossible.
The patenting process gives us one piece of information on the structural form of π: we know that the patents observed at the reference office are of two types. Priority filings are filed directly at the reference office; second filings are filed at the reference office at a later stage.

Objective: looking for randomness in the selection
The variable π can thus be expressed in a generic way as
$\pi_i = \pi_i^{PF} + (1 - \pi_i^{PF})\,\pi_i^{SF}$
The variable π depends on x when at least one of the two components depends on x. In this case, the following ratio
$\frac{\pi_i^{PF}}{\pi_i} = \frac{\pi_i^{PF}}{\pi_i^{PF} + (1 - \pi_i^{PF})\,\pi_i^{SF}}$
also depends on x. This ratio is known to the researcher.
(Limited) risk of false positive and false negative.
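A short numerical sketch may help to see why the ratio is informative and where a false negative can arise. It assumes the decomposition reads as follows: a share pi_pf of inventions reaches the reference office as priority filings, and a share pi_sf of the remaining inventions arrives later as second filings. The function names and all numbers are hypothetical.

```python
def observed_fraction(pi_pf, pi_sf):
    """Share of all inventions seen at the reference office (PF plus SF)."""
    return pi_pf + (1.0 - pi_pf) * pi_sf


def pf_share(pi_pf, pi_sf):
    """Share of priority filings among the filings seen at the office."""
    return pi_pf / observed_fraction(pi_pf, pi_sf)


def pf_share_from_counts(n_pf, n_sf):
    """The same ratio computed from observed counts at the office."""
    return n_pf / (n_pf + n_sf)


# Two hypothetical firm types whose filing behaviour differs.
small = (0.2, 0.5)
large = (0.5, 0.8)
print(observed_fraction(*small), pf_share(*small))
print(observed_fraction(*large), pf_share(*large))

# A constructed false negative: the observed fraction differs from the
# "small" case, yet the ratio is the same.
other = (0.1, 2.0 / 9.0)
print(observed_fraction(*other), pf_share(*other))
```

In the first two cases both the observed fraction and the ratio change, so a test on the ratio picks up the selection. The last case is built so that the observed fraction halves while the ratio stays at one third, which is the kind of false negative the slide mentions.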

DATA
Novel data on the whole population of patents by Belgian companies

Three databases are used
Three waves of O&O statistieken by the Ministerie van de Vlaamse Gemeenschap: survey data on R&D (2002-2008)
Bureau van Dijk's Belfirst: administrative data
Patstat database by the OECD-EPO: data on patents (2000-2007)

                  Full sample      Subsample (N = 345)
                  N      Mean      Min    Mean    Max      Std. Dev.   Diff.
EMP (FTE)         861    536       4      608     5,685    933         *
R&D (mio)         762    14        0      25      1,153    129         ***
AGE               871    30        1      31      151      28
COMP (c)          902    2.12      1      2.12    3        0.57        -
COMP_LOC (d)      946    0.07      0      0.04    1        -           -
COMP_REG (d)      946    0.28      0      0.26    1        -           -
COMP_WORLD (d)    946    0.65      0      0.70    1        -           -

The identification of patents proceeds in three steps
1. Among all the priority filings, identify those by Belgian inventors (de Rassenfosse et al.)
2. Identify the companies
3. Match these companies with R&D data
[Figure: population of priority patent applications filed worldwide (i.e. regardless of the PO), narrowed step by step to lists of companies (Cy 1, Cy 2, Cy 3, Cy 4, etc.).]

Even though 85% of all the patents are observed (through PF and SF), there is partial or no information for half the companies in the sample
[Figure: three breakdowns of companies. 27% None, 45% Some, 28% All. 42% No PF at EPO, 26% Some, 31% All. 30% None, 34% Some, 36% All.]
Correct information for 53% of companies
Partial information for 34% of companies
No information for 13% of companies

EMPIRICAL ANALYSIS
The empirical analysis proceeds in two steps

Step 1: Innovation production function (IPF)
IPF are estimated as Poisson (Hausman et al., 1984):
$E[P_{it} \mid x_{it}, \eta_i] = \exp(x_{it}\beta + \eta_i) = \mu_{it}\,\nu_i$ for $i = 1, \dots, N$ and $t = 1, \dots, T$
where the fixed effect is approximated with the pre-sample mean of the patent series (Blundell et al., JE, 2002).
Three dependent variables:
True count
Count of PF at EPO
Count of PF and SF at EPO
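The pre-sample mean approach can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that the fixed effect is proxied by the log of a firm's average patent count in the years before the estimation sample, with a separate dummy for firms that had no pre-sample patents. Reading the variables PRE_PAT and NO_PRE_PAT in the results tables this way is an assumption, and the coefficient names theta and delta are hypothetical.

```python
import math


def pre_sample_controls(pre_sample_counts):
    """Return (pre_pat, no_pre_pat) for one firm.

    pre_pat is the log of the mean pre-sample patent count, set to 0
    when the firm has no pre-sample patents; no_pre_pat flags that case.
    """
    mean = sum(pre_sample_counts) / len(pre_sample_counts)
    if mean > 0:
        return math.log(mean), 0
    return 0.0, 1


def poisson_mean(x, beta, pre_pat, no_pre_pat, theta, delta):
    """Conditional mean exp(x'beta + theta * pre_pat + delta * no_pre_pat)."""
    index = sum(b * v for b, v in zip(beta, x))
    return math.exp(index + theta * pre_pat + delta * no_pre_pat)


# Hypothetical firms: one with pre-sample patents, one without.
print(pre_sample_controls([2, 4, 3]))
print(pre_sample_controls([0, 0, 0]))
```

The two controls then enter the Poisson conditional mean alongside the other regressors, which avoids estimating one fixed effect per firm.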

Step 2: Selection equation
The test for a selection bias is estimated as a Bernoulli following Papke and Wooldridge (JAE, 1996):
$E[r_{it} \mid x_{it}] = h(x_{it}\gamma)$
where $r_{it}$ denotes the ratio tested for dependence on x and h(.) is a link function such as the logistic function.
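A fractional-response model of this kind can be fitted by maximising the Bernoulli quasi-log-likelihood, in which the dependent variable is a share between 0 and 1 instead of a binary outcome. The sketch below is a minimal illustration with a constant and a single regressor, fitted by Newton steps in plain Python. The function names and the data are hypothetical, and the robust standard errors needed for the actual significance test are not computed here.

```python
import math


def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))


def fit_fractional_logit(x, y, iterations=25):
    """Fit E[y | x] = logistic(a + b * x) for shares y in [0, 1].

    Newton steps on the Bernoulli quasi-log-likelihood
    sum(y * ln(h) + (1 - y) * ln(1 - h)).
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = logistic(a + b * xi)
            w = p * (1.0 - p)
            g0 += yi - p            # score with respect to a
            g1 += (yi - p) * xi     # score with respect to b
            h00 += w                # negative Hessian entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Hypothetical data: a regressor on a grid from -2 to 2.
x = [i / 10.0 for i in range(-20, 21)]
share_dependent = [logistic(0.2 + 0.7 * xi) for xi in x]  # ratio varies with x
share_constant = [0.4 for _ in x]                         # ratio unrelated to x

a_dep, b_dep = fit_fractional_logit(x, share_dependent)
a_con, b_con = fit_fractional_logit(x, share_constant)
print(round(b_dep, 3), round(b_con, 3))
```

A slope that is clearly different from zero, as in the first fit, is what would signal that the selection depends on the regressor; a slope near zero, as in the second, is consistent with random selection.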

Columns (1) to (4): Step 1 (IPF). Column (5): Step 2 (selection equation).

                      (1)           (2)           (3)           (4)           (5)
Dep. variable         True count    True count    PF at EPO     PF+SF at EPO  Ratio
ln(emp)               0.470 ***     0.453 ***     0.386 ***     0.432 ***     -0.042
                      (0.092)       (0.099)       (0.107)       (0.109)       (0.303)
ln(rd/emp)            0.276 ***     0.267 ***     0.434 ***     0.228 ***     0.806 *
                      (0.073)       (0.074)       (0.127)       (0.083)       (0.471)
ln(age)               -0.010        0.006         -0.496 ***    -0.001        -0.736 **
                      (0.120)       (0.142)       (0.143)       (0.165)       (0.337)
COMP                                0.062         1.004 ***     0.120         1.618 **
                                    (0.219)       (0.204)       (0.241)       (0.682)
PRE_PAT               0.347 ***     0.35 *        -0.220        0.463 **
                      (0.171)       (0.181)       (0.305)       (0.211)
NO_PRE_PAT            0.284         0.323         -0.787 ***    0.410
                      (0.312)       (0.331)       (0.402)       (0.344)
NO_PATENT                                                                     -34.246 ***
                                                                              (0.415)
Constant              -4.723 ***    -4.874 ***    -5.354 ***    -4.989 ***    -1.791
                      (0.659)       (0.624)       (0.702)       (0.695)       (3.037)
Industry dummies      Y ***         Y ***         Y ***         Y ***         Y ***
Year dummies          Y ***         Y ***         Y **          Y ***         Y
Observations          388           345           345           345           345
Log pseudolikelihood  -47           -525          -477          -284          -440
R2                    0.55          0.55          0.57          0.51          0.80

Columns (1) to (3): Step 1 (IPF). Column (4): Step 2 (selection equation).

                      (1)           (2)           (3)           (4)
Dep. variable         True count    PF at EPO     PF+SF at EPO  Ratio
ln(emp)               0.459 ***     0.414 ***     0.441 ***     0.080
                      (0.092)       (0.103)       (0.099)       (0.090)
ln(rd/emp)            0.264 ***     0.576 ***     0.215 **      1.098 **
                      (0.079)       (0.131)       (0.086)       (0.073)
ln(age)               0.016         -0.366 **     0.023         -0.513 *
                      (0.130)       (0.159)       (0.149)       (0.123)
COMP_LOC              -0.660        -1.439        -1.617 ***    18.317 ***
                      (1.058)       (1.327)       (0.599)
COMP_REG              -0.028        0.461         -0.221        0.734
                      (0.217)       (0.351)       (0.243)
PRE_PAT               0.341 **      -0.243        0.441 **
                      (0.170)       (0.295)       (0.182)
NO_PRE_PAT            0.334         -0.507        0.437
                      (0.328)       (0.390)       (0.344)
NO_PATENT                                                       -49.734 ***
                                                                (0.467)
Constant              -4.799 ***    -4.755 ***    -4.794 ***    -1.549
                      (0.660)       (0.712)       (0.719)       (3.420)
Industry dummies      Y ***         Y ***         Y ***         Y ***
Year dummies          Y ***         Y **          Y ***         Y
Observations          345           345           345           345
Log pseudolikelihood  -298          -438          -48           -477
Pseudo R2             0.55          0.52          0.52          0.80

CONCLUSION
Two contributions

Wrap up
We look at the widespread practice of using a single office of reference for counting patents.
1. The single office count biases estimates of the IPF.
2. We propose a simple way to test for the existence of a selection bias. The methodology makes it possible to detect biases arising from both PF and SF.
The test is silent about the direction of the bias.

Implications
Global count is warranted. If limited to one office, report estimates for:
(1) priority filings
(2) total filings (i.e. priority filings and second filings)
(3) determinants of the proxy variable
If the coefficients are not significant in (3), one can be reasonably confident that the selection bias does not affect the findings.
Application to the competitive environment of the firm: the patent indicator used affects the findings. The effect of competition on innovation is observed only with international, high-value patents.
Empirical studies have not generated a clear conclusion about the relationship between innovation and competition (Gilbert, 2006): future studies should pay particular attention to the way patents are counted.

Thank you. (gaetand@unimelb.edu.au)

Counting both priority filings and second filings increases the number of observations
Distribution of priority filings: EPO 45%, Belgium 25%, ROW 25%, US 5%
Share of priority filings identified when second filings are taken into account: EPO 85%
85% of Belgian patents end up at the EPO.
Belgium has one of the highest rates of patents transferred to the EPO.
The Belgian case is a very strong test of our claim: if a bias exists with Belgian data, it is likely to be worse for other countries.