NBER WORKING PAPER SERIES

PROFESSIONALS DO NOT PLAY MINIMAX: EVIDENCE FROM MAJOR LEAGUE BASEBALL AND THE NATIONAL FOOTBALL LEAGUE

Kenneth Kovash
Steven D. Levitt

Working Paper 15347
http://www.nber.org/papers/w15347

NATIONAL BUREAU OF ECONOMIC RESEARCH
1050 Massachusetts Avenue
Cambridge, MA 02138
September 2009

We are grateful to John List, Ben Baumer, Andrew Stein, Jeff Ma, Mark Kamal, Lucas Ruprecht, and Paraag Marathe for helpful comments and discussions. Daniel Hirschberg provided outstanding research assistance. Thanks to Baseball Info Solutions and Citizen Sports Network for providing data. Correspondence should be addressed to: Professor Steven Levitt, Department of Economics, University of Chicago, 1126 E. 59th Street, Chicago, IL 60637, or slevitt@uchicago.edu. The views expressed herein are those of the author(s) and do not necessarily reflect the views of the National Bureau of Economic Research.

NBER working papers are circulated for discussion and comment purposes. They have not been peer-reviewed or been subject to the review by the NBER Board of Directors that accompanies official NBER publications.

© 2009 by Kenneth Kovash and Steven D. Levitt. All rights reserved. Short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is given to the source.
Professionals Do Not Play Minimax: Evidence from Major League Baseball and the National Football League
Kenneth Kovash and Steven D. Levitt
NBER Working Paper No. 15347
September 2009
JEL No. D01, D82

ABSTRACT

Game theory makes strong predictions about how individuals should behave in two-player, zero-sum games. When players follow a mixed strategy, equilibrium payoffs should be equalized across actions, and choices should be serially uncorrelated. Laboratory experiments have generated large and systematic deviations from the minimax predictions. Data gleaned from real-world settings have been more consistent with minimax, but these latter studies have often been based on small samples with low power to reject. In this paper, we explore minimax play in two high-stakes, real-world settings that are data rich: choice of pitch type in Major League Baseball and whether to run or pass in the National Football League. We observe more than three million pitches in baseball and 125,000 play choices for football. We find systematic deviations from minimax play in both data sets. Pitchers appear to throw too many fastballs; football teams pass less than they should. In both sports, there is negative serial correlation in play calling. Back-of-the-envelope calculations suggest that correcting these decision-making errors could be worth as many as two additional victories a year to a Major League Baseball franchise, and more than a half win per season for a professional football team.

Kenneth Kovash
Mozilla
5807 S Woodlawn Ave
Chicago, IL 60637
kkovash@gmail.com

Steven D. Levitt
Department of Economics
University of Chicago
1126 East 59th Street
Chicago, IL 60637
and NBER
slevitt@midway.uchicago.edu
A spirited debate has arisen over the extent to which actions in two-player, zero-sum games conform to the predictions of game theory. Von Neumann's minimax theory makes three basic predictions about behavior in such games. First, since the player must be indifferent between actions in order to mix, the expected payoffs across all actions that are part of the mixing equilibrium must be equalized. Second, the expected payoff for all actions that are played with positive probability must be greater than the expected payoff for all actions that are not played with positive probability.² Third, the choice of actions is predicted to be serially independent, since if the pattern of play is predictable, it can be exploited by an optimizing opponent.

Laboratory tests of minimax have, almost without exception, shown substantial deviations from these theoretical predictions (Lieberman 1960, 1962; Brayer 1964; Messick 1967; Fox 1972; Brown and Rosenthal 1990; Rosenthal et al. 2003). One remarkable exception to this pattern is Palacios-Huerta and Volij (2008), in which soccer players who are brought into the laboratory play minimax in both a 2x2 game and O'Neill's (1987) 4x4 game. Levitt, List, and Reiley (2008), however, are unable to replicate these findings, either using professional soccer players or world class poker players, calling the Palacios-Huerta and Volij results into question.

In stark contrast to the lab, the existing literature on minimax play in the field has generally provided support to the predictions of game theory. Whether the task is the direction of a serve in tennis (Walker and Wooders 2001; Hsu, Huang, and Tang 2007),

² In most prior empirical analyses, this prediction has either been irrelevant or untestable. If there are only two actions available (e.g., Walker and Wooders 2001; Hsu, Huang, and Tang 2007), then by definition any mixing strategy must include both actions. Even when the action space is richer, the expected payoff to an action that is not taken is typically not observed, making this prediction untestable. Hirschberg, Levitt, and List (2009), on the other hand, are able to address this prediction in studying the actions of poker players because of the richness of their data.
penalty kicks in soccer (Chiappori, Levitt, and Groseclose 2002; Palacios-Huerta 2003), or the decision to call, check, fold, or raise in limit poker (Hirschberg et al. 2009), equalized payoffs across actions that are included in the mixed strategy cannot be rejected. The evidence on serial independence in the field is more mixed, but has found some support (e.g., Hsu, Huang, and Tang 2007).

There are a number of possible explanations for the sharp differences observed in prior studies done in the laboratory versus the field. Failure of minimax in the lab may be the result of a lack of familiarity with the games that are played, low stakes, or selection of participants into the studies who do not have experience or talent for mixing. A very different explanation for the contrasting conclusions of lab and field studies is that field studies tend to have very low power to reject (Levitt and List 2007). For instance, in Chiappori, Levitt, and Groseclose (2002), the total number of penalty kicks is only 459, spread over more than 100 shooters. Walker and Wooders (2001) observe approximately 3,000 serves spread over forty Grand Slam tennis matches.

In this paper, we add to the existing literature by studying behavior in two new field settings: the pitcher's choice of pitch type (e.g., fastball versus curveball) in professional baseball, and the offense's choice of run versus pass in professional football. In each of these settings, we are able to analyze far more data than has previously been available in field studies of mixed strategy behavior. In the case of baseball, we observe every pitch thrown in the major leagues over the period 2002-2006, a total of more than 3 million pitches. For football, we observe every play in the National Football League for the years 2001-2005, over 125,000 plays. In both settings, the choices being made have very high stakes associated with them.
The results obtained from analyzing the football and baseball data are quite similar. In both cases, we find clear deviations from minimax play, as evidenced by a failure to equalize expected payoffs across different actions played as part of mixed strategies, and with respect to negative serial correlation in actions. In the NFL, we find that offenses on average do systematically better by passing the ball rather than running. In baseball, pitchers appear to throw too many fastballs, i.e., batters systematically have better outcomes when thrown fastballs versus any other type of pitch.

In football, teams are more likely to run if the previous play was a pass, and vice versa. This pattern is especially pronounced when the previous play was unsuccessful. Negative serial correlation in actions is consistent with a large body of prior laboratory evidence (e.g., Brown and Rosenthal 1990). Pitchers also exhibit some negative serial correlation, particularly with fastballs, i.e., they are more likely to throw a non-fastball if the previous pitch was a fastball, and vice versa.

The magnitude of these deviations is not trivial. Back-of-the-envelope calculations suggest that the average NFL team sacrifices one point a game on offense (4.5 percent of current scoring) as a consequence of these mistakes. In baseball, we estimate that the average team gives up an extra 20 runs a season (about a 1.3 percent increase). If these estimates are correct, then the value to improving these decisions is on the order of $4 million a year for the typical baseball team and $5 million a year for an NFL franchise.

The remainder of the paper is structured as follows. Section I reports our results for Major League Baseball. Section II analyzes NFL football. Section III concludes.
Section I: An analysis of pitch choice in Major League Baseball

Our data on pitch choice in Major League Baseball were purchased from Baseball Info Solutions, which employs data trackers at all games and compiles the information in order to sell it to major league teams and other interested parties. The data set includes a wealth of information for each pitch thrown in the major leagues over the period 2002-2006: the identity of the pitcher and batter, the current game situation (inning, count, number of outs, current score, etc.), the type of pitch thrown, and the outcome of the pitch (e.g., home run, foul, sacrifice bunt).

There are multiple dimensions along which pitches vary: the type of pitch (e.g., changeup, slider, fastball), the location of the pitch, the velocity, etc. We limit our attention to just one of these dimensions: pitch type.³ The raw data contain 12 different types of pitches. After consultation with major league teams, we consolidated these into five categories: fastball, curveball, slider, changeup, and other.⁴ In our analysis, we drop all pitches classified as "other."⁵

Our primary outcome measure for an at-bat is the baseball statistic known as "OPS," which is the sum of a batter's on-base percentage and his slugging percentage. In

³ Unlike velocity or location, pitch type is not affected by faulty execution on the part of the pitcher; a pitcher might intend to locate a pitch over the inside corner of the plate, but mistakenly throw it to the outside of the plate.

⁴ These pitch types, with number of cases in parentheses, are as follows: Fastballs (2,083,248) and cut fastballs (39,830) are combined as "fastball." Changeups (362,387) and split fingers (50,818) are combined as "changeup." Forkballs (430), knuckleballs (18,905), pitchouts (4,379), screwballs (766), sinkers (84), and unknowns (188,927) are combined as "other." Sliders (449,378) and curveballs (314,633) compose the rest of the data.
⁵ The accuracy with which pitch type is coded is critical to our study. As a check on this issue, we obtained the coding for an overlapping sample of pitches collected by STATS, Inc., which competes with Baseball Info Solutions in providing information to baseball teams. These two independent assessments of pitch type match on over 90 percent of all pitches. The coding matches especially well on fastballs, with more variation occurring when the two data sets code an off-speed pitch differently. If we limit our comparison to fastball versus non-fastball, approximately 94 percent of all pitch types match across the two data sets. Importantly, the degree to which the two data sets match does not appear to be a function of the outcome of the at-bat. Match rates are nearly identical regardless of whether the pitch is not put in play, is put in play for a hit, or is put in play for an out.
prior empirical research, OPS has been shown to be a strong predictor of the number of runs a team scores (Fox 2006). If a batter makes an out, his OPS for that at-bat is zero. If the batter walks, his OPS is one. A single earns an OPS of two, a double an OPS of three, a triple an OPS of four, and a home run an OPS of five.

Our raw data cover every regular season pitch thrown in the major leagues over the period 2002-2006. After generating season level statistics such as batter OPS, we exclude any at-bat including any pitch categorized as "other," as well as data from extra innings. After these exclusions, we have 3,110,429 total pitches thrown. Table 1 presents summary statistics for these pitches. As shown in column 1, fastballs are the most common type of pitch, accounting for approximately 65 percent of all throws. Sliders are the second most common pitch type, followed by changeups and curveballs.

Columns 2-5 of Table 1 report the distribution of outcomes for each pitch type. We report four mutually exclusive and exhaustive pitch outcomes: a ball, a strike, the ball is put into play and the batter is out, and the ball is put into play and the batter gets a hit.⁶ Pitchers are slightly more likely to record strikes when throwing fastballs relative to all non-fastballs, and slightly less likely to register a ball. Changeups are most likely to lead to both an out (15.29 percent) and a hit (7.08 percent); curveballs are least likely to yield both outs and hits. Column 6 of Table 1 shows the OPS (our preferred outcome metric) by pitch type when the pitch ends the at-bat. Foreshadowing the results from the regression analysis, the OPS on fastballs is higher than for non-fastballs: .753 versus .620. One potential explanation for that gap, however, is that fastballs are more likely to be thrown in hitters' counts, as demonstrated in the final three columns of Table 1.

⁶ A foul ball that is not caught for an out is classified as a strike in this categorization.
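The per-at-bat OPS coding described above (zero for an out through five for a home run) can be sketched in a few lines of Python. This is only an illustration: the outcome labels and the function names `at_bat_ops` and `mean_ops` are hypothetical and do not come from the data set. Note that an average of these per-at-bat values need not coincide exactly with the conventional season OPS statistic, whose two components use different denominators.

```python
# Per-at-bat OPS values as listed in the text: out = 0, walk = 1,
# single = 2, double = 3, triple = 4, home run = 5.
# The string labels are hypothetical; the raw data may code outcomes differently.
AT_BAT_OPS = {
    "out": 0,
    "walk": 1,
    "single": 2,
    "double": 3,
    "triple": 4,
    "home_run": 5,
}


def at_bat_ops(outcome: str) -> int:
    """Return the OPS credited to a single at-bat with the given outcome."""
    return AT_BAT_OPS[outcome]


def mean_ops(outcomes) -> float:
    """Average the per-at-bat OPS values over a collection of at-bat outcomes."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one at-bat")
    return sum(at_bat_ops(o) for o in outcomes) / len(outcomes)
```

For example, `mean_ops(["out", "out", "single", "home_run"])` averages the values 0, 0, 2, and 5.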
To further explore the role of the count, Table 2 reports results for pitches thrown on each possible count, e.g., 1-0, 3-1, etc. As column 3 demonstrates, the likelihood of a fastball varies widely across counts. On a 3-0 count, almost 95 percent of all pitches are fastballs; when the count is 1-2 the share of fastballs is only 52 percent. Columns 3-6 show OPS comparisons for fastballs and non-fastballs by count, for pitches that end the at-bat. The differences in outcomes for fastballs versus non-fastballs tend to be small when there are fewer than two strikes. On two strike counts, however, non-fastballs generate an OPS that is more than 100 points lower than for fastballs, and this gap is highly statistically significant. The last four columns of Table 2 report the final outcome of the at-bat as a function of which pitch was thrown at each count, when that pitch does not actually end the at-bat. If there are no spillovers across pitches, there should be no difference in outcomes across pitch types if the pitch does not end the at-bat. To the extent, however, that fastballs are slightly more likely to generate strikes than non-fastballs, throwing a fastball may provide some benefit to the pitcher when the at-bat does not end with the current pitch.⁷ The results in the last four columns of Table 2 suggest, however, that, if anything, throwing a fastball on the current pitch leads to slightly worse outcomes within this at-bat if the pitch does not terminate the at-bat. For most counts, the eventual at-bat OPS is close for fastballs and non-fastballs, but with two strikes the non-fastballs yield lower OPS.

⁷ There are other channels, as well, via which a fastball might provide deferred benefits. First, it may be that it is harder to hit a pitch if the preceding pitch was a fastball. Second, fastballs might be less likely to generate other negative results, like wild pitches, passed balls, and stolen bases. Third, fastballs might cause less wear and tear on the pitcher's arm. Back of the envelope calculations suggest that none of these channels is likely to be even close to a magnitude to offset the observed OPS differences between fastballs and other pitches.
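The by-count comparison described above can be sketched as a simple grouping exercise. The following Python is a minimal illustration, not the procedure used to build Table 2: the record layout `(count, is_fastball, ops)` and the function name `summarize_by_count` are assumptions, and the fastball share here is computed over whatever sample is passed in, whereas the paper reports the share over all pitches on a count and OPS only for pitches that end the at-bat.

```python
from collections import defaultdict


def summarize_by_count(pitches):
    """Group pitches by count and compare fastballs with non-fastballs.

    `pitches` is an iterable of (count, is_fastball, ops) tuples, a
    hypothetical layout: count is a string such as "1-2", is_fastball is
    a bool, and ops is the per-at-bat OPS of the at-bat the pitch ended.
    """
    acc = defaultdict(lambda: {"fb_n": 0, "fb_ops": 0.0, "non_n": 0, "non_ops": 0.0})
    for count, is_fastball, ops in pitches:
        row = acc[count]
        if is_fastball:
            row["fb_n"] += 1
            row["fb_ops"] += ops
        else:
            row["non_n"] += 1
            row["non_ops"] += ops

    summary = {}
    for count, row in acc.items():
        total = row["fb_n"] + row["non_n"]
        # Means are undefined (None) when a pitch group is empty on this count.
        fb_mean = row["fb_ops"] / row["fb_n"] if row["fb_n"] else None
        non_mean = row["non_ops"] / row["non_n"] if row["non_n"] else None
        gap = fb_mean - non_mean if fb_mean is not None and non_mean is not None else None
        summary[count] = {
            "fastball_share": row["fb_n"] / total,
            "fastball_ops": fb_mean,
            "non_fastball_ops": non_mean,
            "ops_gap": gap,  # positive means batters do better against fastballs
        }
    return summary
```

A positive `ops_gap` on a given count corresponds to the pattern the text reports for two-strike counts, where fastballs yield a higher OPS than other pitches.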
Table 3 analyzes more formally the link between pitch type and OPS using regression specifications of the following form:

(1)  $\mathit{OPS}_{apb} = \beta_k \, \mathit{Pitchtype}_{apb} + X'_{apb}\Gamma + \lambda_p + \theta_b + \varepsilon_{apb}$

where a, p, b, and k index at-bats, pitchers, batters, and pitch types respectively. OPS is our measure of how successful the batter is in the at-bat. Pitchtype denotes whether the pitch that ends the at-bat is a fastball, curveball, slider, or changeup. Also included in the regression is a set of covariates X that includes indicators for the count prior to the final pitch of the at-bat, the inning of the game, the number of outs, and the number of runners on base. In some specifications pitcher and batter fixed-effects are included, pitcher-batter interactions, and in our most fully saturated models, pitcher*batter*count interactions. In these regressions, we limit the sample to pitches that end the at-bat.⁸

Column 1 of Table 3 includes pitch type, but no other controls. Changeup is the omitted pitch category, so all coefficients should be interpreted as relative to the outcome if a changeup is thrown. With no covariates at all, as already noted in the summary statistics, the outcomes when fastballs are thrown are quite bad for pitchers: an OPS gap of .094 (standard error = .004) relative to changeups. Curveballs and sliders have the best pitcher outcomes. As demonstrated in column (2), however, a substantial fraction of the gap across pitch type is eliminated with the inclusion of count-fixed effects. After controlling for count, the gap between fastballs and changeups falls to .041 (se = .004). Sliders do slightly better than changeups, curveballs slightly worse.

Column 3 of Table 3 adds a range of controls corresponding to the game situation: the inning, number of outs, and number of runners on base. Including these covariates

⁸ We have also run these specifications for pitches that do not end the at-bat. The results for specifications matching columns 1, 2, and 3 show changeups under-performing all other pitches by a small, but significant amount while those matching columns 4 and 5 in Table 3 are small and insignificant.
has little impact on the coefficients on pitch type. OPS is lowest in the ninth inning and highest with no outs and with the bases loaded. The likely explanation for lower OPS in the ninth inning is that on average the quality of the pitcher is higher because specialist "closers" are brought in during the final innings of close games. The inclusion of pitcher-batter interactions confirms this intuition in column (4). In this specification, it is the early innings in which OPS is low. Controlling for pitcher-batter interactions increases the OPS gap between fastballs and other pitches, which implies either that better than average pitchers tend to throw more fastballs, or that better than average hitters see fewer fastballs than other hitters. Column (5) adds pitcher*batter*count interactions. Thus, the identification in column (5) comes only from cases where the same pitcher and batter are facing each other, with the same count, and in one instance the pitcher throws a particular pitch, and on another such occasion, a different type of pitch. Adding these three-way interactions has little impact on the coefficients.

The OPS gaps on fastballs in Table 3 are substantial in magnitude. Fox (2006) estimates that each .001 point of OPS over the course of a season translates into 2.16 additional runs. If a pitching staff were able to reduce the share of fastballs thrown by 10 percentage points while maintaining the observed OPS gap on fastballs, this would reduce the number of runs allowed by roughly 15 per season, or two percent of a team's total runs allowed. Because of behavioral responses by batters, this is likely to be an upper bound on the cost of teams throwing too many fastballs.

Table 4 explores the sensitivity of the coefficient on pitch types to a variety of subsets of the data, using the specification reported in column 5 of Table 3 as a baseline. The three columns of the table correspond to the estimate for fastballs, curveballs, and
sliders respectively, in all cases relative to the omitted category, which is changeups. Each row of the table represents estimates from one regression; only the coefficients on the pitch type variables are presented in the table.

The top row of the table shows the baseline estimates for the entire sample. The next three rows of Table 4 divide the sample according to whether it is a hitter's count, a neutral count, or a pitcher's count.⁹ Interestingly, once we control for other factors, fastballs not only do worse on pitcher's counts (as was apparent in the large OPS gap in the raw data for two-strike counts), but on neutral and hitter's counts as well. Across these three classifications, the coefficient on fastball ranges from .064 to .087.

The next three rows divide the sample of pitchers with at least 200 plate appearances against them into three equal-sized categories according to their OPS. "Good" pitchers have the lowest third of OPS against, and "bad" pitchers have the highest OPS against. The OPS gap associated with fastballs is smallest for the good pitchers. Defining good and bad hitters in a parallel fashion, we find that the OPS gap for fastballs is present only for good and medium hitters. Bad hitters do best with changeups and worst with curveballs.

The pitchers who throw the fewest fastballs generally do worse with fastballs than pitchers who throw more fastballs. Fastballs do best when there are runners in scoring position; in that circumstance, fastballs have worse outcomes than changeups, but similar outcomes to curveballs and sliders. There is little systematic difference in the coefficient on fastball as a function of the number of outs.

⁹ Specifically, we define hitter's count as 1-0, 2-0, 3-0, and 3-1 counts, neutral count as 0-0, 1-1, 2-1, and 3-2 counts, and pitcher's count as 0-1, 0-2, 1-2, and 2-2 counts.
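The three-way classification of counts defined in footnote 9 can be written down directly. The sketch below is illustrative only; the function name `classify_count` and the returned labels are hypothetical, while the membership of each group is taken from the footnote.

```python
# Count groups exactly as defined in footnote 9 (balls-strikes).
HITTERS_COUNTS = {"1-0", "2-0", "3-0", "3-1"}
NEUTRAL_COUNTS = {"0-0", "1-1", "2-1", "3-2"}
PITCHERS_COUNTS = {"0-1", "0-2", "1-2", "2-2"}


def classify_count(balls: int, strikes: int) -> str:
    """Map a (balls, strikes) count to 'hitter', 'neutral', or 'pitcher'."""
    key = f"{balls}-{strikes}"
    if key in HITTERS_COUNTS:
        return "hitter"
    if key in NEUTRAL_COUNTS:
        return "neutral"
    if key in PITCHERS_COUNTS:
        return "pitcher"
    # Only 0-3 balls and 0-2 strikes are valid before a pitch is thrown.
    raise ValueError(f"not a valid count: {key}")
```

The three groups partition all twelve possible counts, so every pitch falls into exactly one of the sample splits.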
Serial correlation in pitch choice

Minimax theory predicts that equilibrium actions will be serially uncorrelated. In the context of baseball, testing this prediction is complicated by the fact that the payoff matrix changes both across at-bats, and even within an at-bat. The payoff to a fastball, at least according to the choices pitchers actually make, is higher with a 3-0 count than an 0-2 count. The empirical challenge is to convincingly control for the heterogeneity in payoffs, knowing that these payoffs are potentially a function of many variables that are not in our data set (e.g., how fatigued the pitcher is, which way the wind is blowing, etc.).

Consistent with our estimation strategy above, one means of controlling for unobservables is to include pitcher*batter*count interactions. In such a specification, the identifying variation comes only from instances when the same pitcher and batter reach the same count on multiple occasions, but the pitcher chooses to throw different pitches. Even this strategy, however, is subject to criticism when trying to measure serial correlation: if unobservable factors led the pitcher to choose a fastball on the previous pitch of this at-bat, perhaps those same factors are also relevant when choosing the next pitch to throw. On a day when a pitcher has his curveball working effectively, he will tend to throw more curveballs.

To address this potential criticism, we condition not only on pitcher*batter*count, but also on the number of pitches of each pitch type that have been thrown thus far in the at-bat. Thus, our identification comes only from cases where the same pitcher and batter meet on multiple occasions, reach the same count, and progress through the exact same number of fastballs, curveballs, changeups, and sliders in reaching that count, but the order in which those pitches were thrown differs. Minimax theory would predict that for
the same batter and pitcher, if the count is 2-1, and thus far in the at-bat there have been two fastballs and one slider, it should not matter whether the slider came on the first, second, or third pitch of the at-bat.

Formally, the regression specification we estimate takes the form:

(2)  $\mathit{Pitchtype}_{abpt} = \beta \, \mathit{Pitchtype}_{abp,t-1} + X_{abp}\Gamma + \delta_{bpc\,n_f n_{cu} n_s n_{ch}} + \varepsilon_{abpt}$

where a, b, p, and t index an at-bat, pitcher, batter, and the number of the pitch within an at-bat respectively. Pitchtype, as before, corresponds to whether the pitch is a fastball, curveball, etc. The control variables X include the percent of pitches by pitch type that have been thrown to this batter on this count up to this point during the season (excluding this observation), the same variable defined for the pitcher, and the share of each pitch type thrown by the pitcher thus far in this game. The δ term represents a pitcher*batter*count*number of pitches of each pitch type thrown thus far in the at-bat interaction, with c reflecting the count and $n_f$ capturing the number of fastballs thrown thus far in the at-bat and similarly with the other subscripts.

Table 5 reports our estimates of serial correlation using variations on equation (2). The dependent variable in each regression is listed at the top of the column. In columns 1-4, along with the interactions and controls, we include an indicator variable equal to one if the preceding pitch is the same as the dependent variable. These specifications measure whether, conditional on the controls described above (e.g., the count and the number of pitches by type in this at-bat), knowing the pitcher threw a particular pitch on the last pitch helps predict whether he will throw it as the current pitch. For three of the four pitch types, we observe statistically significant negative serial correlation. The largest coefficient is for fastballs. If the pitcher threw a fastball on the last pitch, all else
equal, it lowers the likelihood this pitch will be a fastball by 4.1 percentage points. In relative terms, the negative serial correlation for sliders is greater, since sliders represent only about 10 percent of all pitches. If the last pitch was a slider, the likelihood that this current pitch is a slider falls by two percentage points, or twenty percent. The negative serial correlation is roughly half as large for curveballs, and not present for changeups.

Columns 5-8 of Table 5 add the indicators for once-lagged values of each pitch type, which allows us to learn not just whether pitchers repeat the same pitch more or less than would be expected, but also whether other transitional sequences from pitch to pitch appear more or less frequently than predicted by theory. In each of these columns, one of the lagged pitch types is omitted, and all results are relative to that omitted category. The results that emerge in columns 5-8 demonstrate that there is greater nuance associated with the ordering of pitches than simply the negative serial correlation observed in the first four columns. For instance, in column 5, not only is it the case that fastballs follow fastballs less than would be expected, but also, fastballs are more likely to occur after changeups than after other non-fastballs. In contrast, curveballs are least likely to follow changeups (and vice versa), and curveballs are most likely to follow fastballs. Changeups are more likely to occur if the last pitch was a changeup than if it was another non-fastball.

Calibrating the value to a team's batters of exploiting these correlation patterns requires making assumptions as to how valuable it is to a major league hitter to know what type of pitch is coming. Executives of Major League Baseball teams with whom we spoke estimated that there would be a .150 gap in OPS between a batter who knew for certain a fastball was coming versus that same batter who mistakenly thought that there
was a 100 percent chance the next pitch would not be a fastball, but in fact was surprised and faced a fastball. If one makes the further assumption that the OPS gap is linear in a hitter's expectations about what type of pitch will be coming, then knowing that a fastball is 4.1 percentage points less likely if the last pitch was a fastball (and conversely more likely if the last pitch was not a fastball) is worth roughly .006 OPS points to a batter (.150 × .041). Thus, the potential benefit from exploiting the patterns of serial correlation is the same magnitude as identified earlier from pitchers throwing too many fastballs, about 10-15 runs per year.

Section II: An analysis of play selection in the National Football League

Our data on play choice in the National Football League were compiled by STATS, Inc. STATS, Inc. maintains a network of reporters tracking every snap in detail to provide exclusive information from their proprietary database to the NFL and other clients. The data set includes extensive information for each play in the NFL over the period 2001-2005: date of game, offensive team, defensive team, general game description (e.g., stadium, weather, etc.), current game situation (quarter, location on field, down, yards to go, etc.), offensive formation, the type of play (run, pass, punt, field goal, etc.), and the outcome of the play (e.g., yards gained).

As was the case with baseball, there are many dimensions on which play types vary: run or pass, direction, distance, movement of players, etc. We limit our study to just one dimension: the choice of whether to call a running play or a passing play.

Our raw data cover every play from regular season NFL games over five full seasons: 2001-2005. We exclude fourth down plays, as well as all plays that occur in the
last two minutes of the half, during overtime, or when a team kneels down to run out the clock. Because of difficulties in our data of identifying whether the team's intention was to run or to pass on plays where the quarterback runs, and on penalties called before a play unfolds (e.g., false start), these plays are also excluded. After these exclusions, we have 127,885 total plays.

Unlike baseball, where there are well-established summary metrics for evaluating the success of an at-bat (e.g., OPS), there is no parallel statistic in football. Consequently, we construct our own measure of success for a play in football as follows. First, we estimate the value to a team of having possession of the ball as a function of distance from the end zone, what down it is, and yards to achieve a first down, using a regression taking the form

(3)  $Y = f(\text{down}, \text{yards to first down}, \text{distance to goal})$

where the outcome variable Y is the change in the game score between the current time and the end of the half. We allow for a flexible functional form with respect to the right-hand-side variables, including fully interacted quintics of each of the variables. The values generated from equation (3) appear sensible. For instance, the three lines in Figure 1 show the estimated value to a team of having the ball first down and ten yards to go, second down and ten yards to go, and third down and ten yards to go respectively as a function of the distance to the end zone.¹⁰ If a team has the ball first and ten at the opponent's ten yard line, that team will expect to gain more than four points relative to the other team by the end of the half. The value of having the ball first and ten declines

¹⁰ One other measure of performance in NFL football is Net Expected Scoring (NES), developed by Citizen Sports Network. NES is not as flexible as our success metric, but is highly correlated (ρ = 0.8583) with our success metric.
nearly linearly with field position; having the ball first and ten on one's own ten yard line is associated with essentially no expected change in the half-time score. Having the ball second and ten costs a team about one-half a point relative to having the ball first and ten from the same field position. Moving from second and ten to third and ten is even more costly for a team.

To compute how successful a particular play is, we calculate the change in expected points scored before and after the play (e.g., looking at Figure 1, if a team gains 20 yards when it is first and ten from its own 20 yard line, expected points scored jump by roughly one) and subtract the average change in expected points for all plays in the data set that began at the same down, distance, and yards to the goal.¹¹ The resulting statistic, which we call our "success metric," is mean zero. The success metric tells us, in units of expected points scored, how much this play exceeds or underperforms the average play run on this down, distance, and yards to goal.

In addition to this constructed measure of success, we also report results for more traditional, but highly imperfect outcome measures: yards gained,¹² whether a first down is made, whether points are scored, and whether a turnover occurs.

Table 6 reports summary statistics for the football data set. Column 1 shows outcomes for all plays; columns 2 and 3 divide the sample into running plays and passing plays respectively. Column 4 reports the t-statistic of the comparison of means between running and passing plays. Overall, runs represent 44 percent of the combined passes and runs. For all plays our success metric, by definition, has a mean value of zero, since it is

¹¹ When the play results in points being scored, those points are included in our calculation of the metric.

¹² We adjusted yards gained per play to capture certain circumstances: penalty yardage for penalties occurring within plays was incorporated, touchdowns within 10 yards of the end zone were credited with a full 10 yards gained, and interceptions were adjusted to -45 yards gained.
defined as the deviation from the expected outcome on a play. Note, however, comparing the top row of columns 2 and 3, that passing plays systematically outperform running plays. The mean gap between the two types of plays is roughly .066, implying that on average, a passing play generates .066 more points than a run. This difference in means is highly statistically significant. Consistent with this result, passes on average yield an extra .55 yards gained, and are nine percentage points more likely to yield a first down. Passes do, however, produce more turnovers, and thus have higher variance. Runs result in scoring plays 2.8 percent of the time; passes lead to scores with a 3.8 percent probability.

To further analyze the difference across runs and passes, we estimate regressions of the form:

(4)  $\mathit{Outcome}_{pij} = \alpha + \beta \, \mathit{Pass}_{pij} + X_{pij}\Gamma + \lambda_{ij} + \varepsilon_{pij}$

where p, i, and j index a particular play, offensive team, and defending team respectively. Outcome is our measure of success for an offensive play. Pass is an indicator variable equal to one if the team calls a passing play, and zero if the play is designed to be a run. X is a vector of controls, such as the score differential at the time of the play, whether the game is played on grass, whether the offensive team is the home team, the year of the game, etc. In some specifications, we also add team-fixed effects for the offense and the defense, or interactions between the offensive and defensive teams.

The results from estimating equation (4) are presented in Table 7. The four columns represent four different specifications, with the number of controls increasing
moving from left to right in the table. The first column, which includes no covariates, confirms the raw difference in means between passing and running; a pass generates an additional .066 points in expectation. Column 2 adds controls, but does not include team-fixed effects. The relative value of a pass increases to .083 points in this specification. This specification also highlights the sizable home field advantage in the NFL: an offensive play run by the home team generates an extra .041 points, or roughly half the difference between a pass and a run. Offenses perform slightly worse in the cold. The playing surface does not have a large impact on the offense's effectiveness.

Column 3 adds fixed-effects for the offensive and defensive teams. These fixed effects will absorb any systematic differences across teams in offensive and defensive prowess. The relative value of a pass decreases slightly to .077 points in this specification. The last column of the table includes interaction effects for the offensive and defensive teams. The relative value of a pass is essentially unchanged at .075 points.

Table 8 examines the sensitivity of the results of running versus passing to a variety of subsets of the data, as well as reporting results for an expanded set of outcome measures (yards gained, achieving a first down on the play, turnovers, and whether a touchdown is scored on the play). The columns of the table correspond to different outcome measures, e.g., our constructed success metric, yards gained, etc. Each row of the table represents a different subset of the data. In all cases, we include team-fixed effects and controls mirroring those in column 4 of Table 7. Only the coefficient on the pass indicator variable is reported in the table.

Focusing first on column 1 of Table 8, the top row of the table reports our baseline specification. Thus, the entry in the top row in column 1 matches the coefficient
we report in Table 7, column 4: a .075 gap between passes and runs on our success metric. Consistent with this result, passes do better on yards gained, first downs made, and scoring, but also lead to more turnovers. Moving down through the table, passes outperform runs in all quarters of the game, but by a greater margin in the first half than the second half of the game. The benefits of passing accrue equally to home teams and visitors. The best offenses exhibit the greatest gap between passes and runs; for the worst offenses the differential is not statistically significant. Teams that pass the most have slightly smaller edges when passing than other teams.

The results in Tables 7 and 8 demonstrate that the expected outcome of a pass systematically exceeds that of a run, a result that is inconsistent with minimax theory. [13] According to the theory, defenses should adjust to better defend the pass. Absent that adjustment, offenses should be passing more often. The magnitude of the observed deviations in payoffs is substantial. The typical offense runs about 60 plays a game, 56 percent of which are passes. If the offense could increase its share of passes to 70 percent without inducing an offsetting response on the part of the defense, it would generate an additional 0.63 points per game in expectation (8.4 additional passes * 0.075 expected points), or an extra ten points over the course of a season, or roughly 3 percent of a team's total scoring. Because defenses are likely to respond, that estimate is likely an upper bound on how much an offense could gain by exploiting the deviations from minimax play that are present in the data.

[13] Alamar (2006) shows passing plays as having an outcome advantage of nearly 1.8 yards per play (adjusted yards) for the 2005 NFL season. However, the source of that data, http://www.pro-football-reference.com/years/2005/, shows a difference of 0.5 yards when considering both sacks and interceptions as passing plays ("Adjusted Net Yards gained per pass attempt"), which is consistent with our findings.
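The back-of-the-envelope gain from passing more often can be reproduced in a few lines. The sketch below uses only figures quoted in the text (60 plays per game, a move from a 56 to a 70 percent pass share, the .075 point premium, and a sixteen-game regular season); the function and variable names are illustrative and not taken from the paper.

```python
# Figures quoted in the text; names are illustrative.
PLAYS_PER_GAME = 60
CURRENT_PASS_SHARE = 0.56
TARGET_PASS_SHARE = 0.70
PASS_PREMIUM_POINTS = 0.075   # pass coefficient, Table 7, column 4
GAMES_PER_SEASON = 16         # NFL regular season length stated in the paper


def extra_points_per_game(plays, current_share, target_share, premium):
    """Return (additional passes per game, additional expected points per game).

    Assumes, as the text does, that the defense does not respond.
    """
    extra_passes = plays * (target_share - current_share)
    return extra_passes, extra_passes * premium


extra_passes, per_game = extra_points_per_game(
    PLAYS_PER_GAME, CURRENT_PASS_SHARE, TARGET_PASS_SHARE, PASS_PREMIUM_POINTS
)
per_season = per_game * GAMES_PER_SEASON

print(f"additional passes per game: {extra_passes:.1f}")
print(f"additional points per game: {per_game:.2f}")
print(f"additional points per season: {per_season:.1f}")
```

Because the calculation holds the defense fixed, the result should be read as an upper bound, as the text notes.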
Serial correlation in NFL play calling

To assess whether there is serial correlation in the choice of runs versus passes on the part of NFL offenses, we run regressions of the form

(5) $Pass_{pij} = \alpha + \beta Pass_{p-1,ij} + X_{pij}\Gamma + \varepsilon_{pij}$

where Pass is an indicator for whether the play called was intended to be a pass. The coefficient β captures the degree of serial correlation in play calling. Included in the vector of controls are the same set of covariates in the earlier football analysis, along with three additional variables: the percentage of time that the offensive team passed over the course of the entire season, the percentage of time that the defensive team was passed against over the course of the entire season, and the share of passes by the offense in this game, up until the time this play is called. Because we control for down and distance in these regressions, as well as an offense's overall tendencies towards passing versus running, our identification comes from a comparison of, for example, whether on 2nd down and 10, a pass is more likely if the previous play was a run that went for no gain, or the previous play was an incomplete pass. Minimax play predicts no serial correlation, implying a zero coefficient on whether or not the last play was a pass.

The basic estimation results corresponding to equation 5 are presented in column 1 of Table 9. [14] Offensive play calling reveals substantial negative serial correlation, with a coefficient of -.100 (se=.003). In other words, conditional on other factors, a team is

[14] The number of observations in Table 9 is smaller than in the earlier analysis for two reasons. First, the first play of each drive is not included in the serial correlation analysis. Second, plays for which the preceding play could not be cataloged as a run or a pass (e.g. because of a penalty or a quarterback run) are also excluded.
almost 10 percentage points less likely to pass on this current play if they passed on the previous play.

To further explore the question of serial correlation, we divide the sample into thirds according to how successful the previous play was, with success defined by our constructed success metric. The results for these three subsets of the data (i.e. previous play was in the bottom-third/middle-third/upper-third success-wise) are shown in columns 2 through 4 of Table 9. Negative serial correlation is most pronounced when the preceding play was unsuccessful. Experiencing a poor result on the last play increases the likelihood the team will switch from a run to a pass or vice-versa by 14.5 percentage points. In contrast, when the last play is in the upper-third of successful outcomes, the tendency to switch away from that play type is greatly mitigated (serial correlation coefficient of only -.025).

These coefficients imply the opportunity for non-trivial gains for teams that successfully exploit serial correlation on the part of opposing offenses. Assume, for instance, that if a defense knew with 100 percent certainty whether a play would be a run or a pass, it could cut the average yardage gained in half by adjusting defensive personnel or positioning. Assume, as well, that if the defense was 100 percent certain a pass was coming, but instead the offense ran the ball, the expected yardage gained would be 50 percent greater than the average, and similarly if the defense expected a run and the offense passed. [15] Finally, let us assume that the value of knowing what play is coming is linear in the probabilities, i.e. going from 50 percent likelihood of a run to 60 percent likelihood yields one-tenth of the benefit of going from zero percent to 100 percent. Take

[15] Based on discussions with NFL teams, these assumptions are likely to be conservative, understating the potential value to defenses of exploiting serially correlated offensive play.
the case where absent serial correlation, a defense expects an equal mix of runs and passes, with each type of play gaining 4.5 yards on average. With serial correlation, however, the true mix of plays after a pass will be roughly 60 percent runs and 40 percent passes. Under the assumptions above, if the defense adjusts to this information, the average running play will yield 3.6 yards and the average passing play 5.4 yards, yielding an overall average gain for the offense of 4.32 yards, .18 yards less than if the defense ignores the serial correlation. There are roughly 60 offensive play calls per game that are preceded by another offensive play. If the average reduction in yards gained per play is .18 yards, then this amounts to an overall reduction of 10.8 yards per game, which translates into roughly 1 point per game. One point per game is worth approximately a half victory per year, which is considerable given that the NFL regular season includes just sixteen games. [16] The potential benefit from exploiting the patterns of serial correlation in football is slightly larger than the benefit from calling fewer running plays analyzed earlier.

[16] We base this calculation on the change in expected winning percentage by scoring 16 additional season points as estimated by the Pythagorean winning percentage [expected winning percentage = (points scored^2.64)/(points scored^2.64 + points allowed^2.64)]. Taking an average NFL team with 350 season points scored and 350 points allowed, increasing their points scored to 366 increases their expected winning percentage from .500 to .529.
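The arithmetic in this passage and in footnote 16 can be checked with a short sketch. It takes the paper's stated post-adjustment yields (3.6 yards per run, 5.4 yards per pass) as given inputs rather than deriving them from the linearity assumption, and it applies the Pythagorean formula exactly as written in the footnote. Function names are illustrative.

```python
def average_gain(run_share, run_yards, pass_yards):
    """Offense's expected yards per play for a given run/pass mix."""
    return run_share * run_yards + (1 - run_share) * pass_yards


def pythagorean_win_pct(points_for, points_against, exponent=2.64):
    """Expected winning percentage, using the exponent quoted in footnote 16."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


# Yardage given up when the defense exploits the 60/40 mix after a pass.
baseline_yards = 4.5                          # per play, no adjustment
adjusted_yards = average_gain(0.60, 3.6, 5.4)
loss_per_play = baseline_yards - adjusted_yards
loss_per_game = 60 * loss_per_play            # roughly 60 eligible plays per game

# Value of 16 extra season points (1 point per game over 16 games).
before = pythagorean_win_pct(350, 350)
after = pythagorean_win_pct(366, 350)
extra_wins = (after - before) * 16

print(f"adjusted yards per play: {adjusted_yards:.2f}")
print(f"yards saved per game: {loss_per_game:.1f}")
print(f"expected win pct: {before:.3f} -> {after:.3f}")
print(f"extra wins per season: {extra_wins:.2f}")
```

The conversion from 10.8 yards to roughly one point per game is the paper's own and is not reproduced here.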
Section IV: Conclusion

In this paper, we utilize two enormous data sets generated by professionals in a high stakes environment to provide the most powerful test to date of minimax behavior in a natural setting. In contrast to most prior studies using field data, we find substantial deviations from minimax behavior, both with respect to equalizing payoffs and serially correlated actions. These deviations are not enormous in magnitude (meaning that they might plausibly not have been detected in the smaller data sets that have been available in most prior field research on the topic) but are large enough that a team that successfully exploited these patterns could add one or two season wins and millions of dollars in associated revenue.

Our findings reinforce the results of Romer (2006), Levitt (2006), and Pope and Schweitzer (2009) in demonstrating that high stakes alone are not sufficient to ensure that optimal decision-making will ensue, even among professionals operating in their natural environments.
Figure 1: Value of Possessing the Ball by Down and Distance

[Figure: expected points (vertical axis, -2 to 6) plotted against yards from goal (horizontal axis, 10 to 90), with separate series for first down, second down, and third down.]
Table 1: Major League Baseball Summary Statistics by Pitch Type

                                Distribution of Outcomes                                                        Percent Thrown In
Pitch Type        Observations  Ball     Strike/Foul  In Play, Out  In Play, Hit  OPS if AB Ends on This Pitch  Hitter's Counts  Neutral Counts  Pitcher's Counts
Fastball          2000619       36.41%   43.37%       13.43%        6.79%         0.753                         75.33%           66.49%          55.52%
All Non-Fastball  1109810       38.07%   42.60%       13.09%        6.24%         0.620                         24.67%           33.51%          44.48%
Change-Up         391318        37.12%   40.51%       15.29%        7.08%         0.658                         11.52%           11.84%          14.16%
Slider            421031        37.81%   44.22%       12.14%        5.83%         0.598                         8.76%            12.58%          17.38%
Curveball         297461        39.68%   43.07%       11.54%        5.70%         0.594                         4.40%            9.10%           12.94%

Notes: Data cover pitches from 2002-2006. Pitch types based on classifications by the data provider, Baseball Info Solutions, with some aggregation of categories by the authors. Columns 2-5 report the outcome on the pitch in question. OPS refers to the statistic on-base percentage plus slugging percentage. Hitter's counts are defined as 1-0, 2-0, 3-0, and 3-1 counts; neutral counts are 0-0, 1-1, 2-1, and 3-2 counts; pitcher's counts are 0-1, 0-2, 1-2, and 2-2 counts. See the text for details on sample exclusions.
Table 2: Outcomes for Fastballs versus Non-Fastballs by Count

                                      OPS if pitch ends at-bat                OPS of at-bat if this pitch does not end the at-bat
Count           Pitches  % Fastballs  All    Fastball  Non-Fastball  P-value  All    Fastball  Non-Fastball  P-value
0-0             834355   68.93%       0.839  0.838     0.841         0.802    0.684  0.681     0.691         0.001
1-0             340215   68.95%       0.873  0.877     0.861         0.193    0.738  0.735     0.744         0.052
2-0             118774   81.59%       0.947  0.955     0.900         0.038    0.820  0.820     0.821         0.845
3-0             38346    94.93%       1.008  1.008     1.005         0.813    0.876  0.876     0.886         0.752
0-1             387207   57.42%       0.774  0.780     0.766         0.120    0.575  0.578     0.571         0.094
1-1             319943   57.40%       0.821  0.820     0.823         0.785    0.619  0.615     0.623         0.073
2-1             170748   68.84%       0.882  0.882     0.883         0.977    0.702  0.703     0.699         0.508
3-1             72665    84.61%       0.997  1.005     0.949         0.000    0.721  0.725     0.703         0.150
0-2             180300   56.07%       0.401  0.441     0.361         0.000    0.509  0.511     0.505         0.358
1-2             273334   52.13%       0.438  0.473     0.406         0.000    0.571  0.569     0.575         0.250
2-2             232790   55.89%       0.491  0.521     0.457         0.000    0.682  0.687     0.677         0.105
3-2             141752   69.76%       0.731  0.769     0.651         0.000    0.762  0.763     0.760         0.794
2 Strike Counts 828176   57.06%       0.523  0.576     0.458         0.000    0.604  0.610     0.597         0.000
Other Counts    2282253  66.95%       0.860  0.871     0.833         0.000    0.676  0.682     0.666         0.000

Notes: Data cover pitches thrown in Major League Baseball between 2002 and 2006. Count refers to the numbers of balls and strikes prior to the pitch that is thrown. The middle four columns of the table report mean OPS (on base plus slugging percentage) for pitches that end the at-bat, by pitch type. The p-value column reports the statistical significance of a t-test of fastballs versus non-fastballs. The last four columns report the OPS of the at-bat if this pitch does not end the at-bat. If, conditional on the at-bat not ending, throwing a fastball on this pitch benefits the pitcher later on in the at-bat, then the OPS on fastball in the third-to-last column should be less than the OPS for non-fastballs in the penultimate column.
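As a reading aid, the "2 Strike Counts" summary row appears to be the pitch-weighted aggregate of the four two-strike counts. The sketch below recomputes the pitch total and the fastball share from the per-count rows of Table 2; that the authors aggregated in exactly this way is an assumption, and the per-count percentages are themselves rounded, so agreement is only expected to about two decimals.

```python
# (number of pitches, percent fastballs) for each two-strike count, from Table 2.
two_strike_counts = {
    "0-2": (180300, 56.07),
    "1-2": (273334, 52.13),
    "2-2": (232790, 55.89),
    "3-2": (141752, 69.76),
}

total_pitches = sum(n for n, _ in two_strike_counts.values())

# Weight each count's fastball percentage by its number of pitches.
fastball_share = (
    sum(n * pct for n, pct in two_strike_counts.values()) / total_pitches
)

print(f"two-strike pitches: {total_pitches}")
print(f"pitch-weighted fastball share: {fastball_share:.2f}%")
```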
Table 3: Regression Analysis of Outcomes of Fastballs vs. Other Pitches

                              (1)        (2)        (3)        (4)        (5)
Fastball                      0.094***   0.041***   0.042***   0.070***   0.073***
                              (0.004)    (0.004)    (0.004)    (0.005)    (0.008)
Slider                        -0.060***  -0.016**   -0.011*    0.008      0.024*
                              (0.005)    (0.005)    (0.005)    (0.006)    (0.011)
Curveball                     -0.064***  0.006      0.005      0.010      0.017
                              (0.006)    (0.006)    (0.006)    (0.007)    (0.011)
1st Inning                                          0.061***   -0.045***  -0.038*
                                                    (0.006)    (0.009)    (0.016)
2nd Inning                                          0.027***   -0.041***  -0.033*
                                                    (0.006)    (0.009)    (0.017)
3rd Inning                                          0.037***   -0.032***  -0.031
                                                    (0.006)    (0.009)    (0.016)
4th Inning                                          0.056***   -0.011     -0.012
                                                    (0.006)    (0.009)    (0.016)
5th Inning                                          0.042***   -0.002     0.000
                                                    (0.006)    (0.009)    (0.017)
6th Inning                                          0.056***   0.018*     0.014
                                                    (0.006)    (0.009)    (0.016)
7th Inning                                          0.026***   0.014      0.013
                                                    (0.006)    (0.009)    (0.016)
8th Inning                                          0.015*     0.004      -0.000
                                                    (0.006)    (0.008)    (0.016)
0 Outs                                              0.028***   0.028***   0.029***
                                                    (0.003)    (0.004)    (0.006)
1 Out                                               0.020***   0.023***   0.025***
                                                    (0.003)    (0.004)    (0.006)
0 Runners                                           0.009      0.003      0.016
                                                    (0.008)    (0.010)    (0.018)
1 Runners                                           -0.004     -0.013     0.002
                                                    (0.008)    (0.010)    (0.018)
2 Runners                                           -0.016     -0.018     -0.013
                                                    (0.009)    (0.011)    (0.019)
R²                            0.003      0.027      0.027      0.284      0.745
Count FEs                     No         Yes        Yes        Yes        Yes
Pitcher FEs                   No         No         No         Yes        Yes
Batter FEs                    No         No         No         Yes        Yes
Pitcher x Batter FEs          No         No         No         Yes        Yes
Pitcher x Batter x Count FEs  No         No         No         No         Yes

Notes: The dependent variable is the OPS of the at-bat. Only pitches that end the at-bat are included in the analysis. Standard errors are shown in parentheses. The omitted pitch type is change-up, so the pitch type coefficients are relative to change-ups. The number of observations is equal to 834,345 in all columns.
* p<0.05, ** p<0.01, *** p<0.001
Table 4: Sensitivity Analysis of the Pitch-type Coefficients

                  Fastball   Slider     Curveball
Baseline          0.073***   0.024*     0.017
                  (0.008)    (0.011)    (0.011)
Hitter's Count    0.087***   0.063      0.000
                  (0.025)    (0.04)     (0.05)
Neutral Count     0.064***   0.016      0.001
                  (0.013)    (0.018)    (0.020)
Pitcher's Count   0.077***   0.024      0.027
                  (0.010)    (0.013)    (0.014)
Good Pitcher      0.055***   0.003      0.014
                  (0.016)    (0.020)    (0.022)
Medium Pitcher    0.085***   0.029      0.023
                  (0.014)    (0.020)    (0.020)
Bad Pitcher       0.085***   0.040      0.025
                  (0.017)    (0.024)    (0.025)
Good Batter       0.080***   0.030*     0.027*
                  (0.009)    (0.012)    (0.013)
Medium Batter     0.051***   0.004      -0.009
                  (0.015)    (0.021)    (0.022)
Bad Batter        0.002      0.004      -0.135
                  (0.063)    (0.075)    (0.077)
Most Fastballs    0.059***   0.064**    0.019
                  (0.016)    (0.022)    (0.023)
Medium Fastballs  0.040**    -0.016     -0.035
                  (0.013)    (0.018)    (0.018)
Fewest Fastballs  0.111***   0.034*     0.061***
                  (0.012)    (0.016)    (0.018)
RISP              0.030**    0.053***   0.023
                  (0.011)    (0.014)    (0.015)
Man on 1st        0.090***   0.042      0.025
                  (0.023)    (0.030)    (0.033)
Bases Empty       0.091***   0.023      0.028
                  (0.013)    (0.017)    (0.018)
2 Outs            0.085***   0.051      0.017
                  (0.021)    (0.027)    (0.029)
1 Out             0.072***   0.015      0.048
                  (0.020)    (0.027)    (0.029)
0 Outs            0.064**    -0.004     -0.019
                  (0.020)    (0.027)    (0.028)

Notes: The dependent variable in all cases is the OPS of an at-bat, for pitches that end the at-bat. Values in the table are the coefficients on the pitch-type indicators from specifications that parallel those shown in Table 3, column 5. In all cases, the omitted pitch type is a change-up, so all coefficients are relative to change-ups. Each row of the table reports the results from a different regression. Standard errors are shown in parentheses. The top row reproduces the results for the baseline sample in Table 3. The remaining rows report results for a range of subsets of the data.
* p<0.05, ** p<0.01, *** p<0.001
Table 5: Serial Correlation in Pitch Type

Columns (dependent variable): (1) Fastball, (2) Curveball, (3) Changeup, (4) Slider, (5) Fastball, (6) Curveball, (7) Changeup, (8) Slider

[The extracted layout does not preserve which column each coefficient belongs to; coefficients are listed in source order, each followed by its standard error.]

Previous Fastball    -0.041*** (0.001)   -0.033*** (0.002)   0.016*** (0.001)   0.021*** (0.001)   0.015*** (0.001)
Previous Curveball   -0.009*** (0.001)   0.014*** (0.002)    0.005*** (0.001)   0.000 (0.002)
Previous Changeup    0.001 (0.001)       0.019*** (0.002)
Previous Slider      -0.023*** (0.001)   0.013*** (0.002)    0.006*** (0.001)   0.011*** (0.002)   -0.010*** (0.002)

                     (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
Observations         2276074   2276074   2276074   2276074   2276074   2276074   2276074   2276074
R²                   0.194     0.241     0.198     0.242     0.194     0.241     0.198     0.242

Notes: Dependent variable is an indicator variable equal to one if the pitch thrown is named at the top of the column, and zero otherwise. In all cases the values reported in the table are the coefficient on an indicator variable corresponding to whether the previous pitch in the at-bat was the pitch type named in the rightmost column. Each column represents a different regression. Columns (1)-(4) include pitch-types one at a time; columns (5)-(8) include all pitch-types simultaneously. All specifications include interactions for pitcher*batter*count*number of pitches of each pitch type thrown thus far in the at-bat, so identification comes only from cases where the same pitcher and batter have reached the same count with the same distribution of pitch types, but in differing orders of pitch types thrown. Standard errors are shown in parentheses.
* p<0.05, ** p<0.01, *** p<0.001
Table 6: Summary Statistics for NFL Football

                            All plays   Runs only   Passes only   P-value of runs versus passes
Success Metric              0.000       -0.0370     0.0292        0.000
                            (1.233)     (0.903)     (1.441)
Yards Gained                4.367       4.052       4.615         0.000
                            (11.51)     (7.636)     (13.82)
First Down Made             0.265       0.210       0.308         0.000
                            (0.441)     (0.407)     (0.462)
Fumble or Interception      0.0336      0.0150      0.0482        0.000
                            (0.180)     (0.122)     (0.214)
Scoring Play                0.0338      0.0284      0.0382        0.000
                            (0.181)     (0.166)     (0.192)
Far from Goal               0.354       0.345       0.361         0.000
                            (0.478)     (0.475)     (0.480)
Medium from Goal            0.379       0.361       0.394         0.000
                            (0.485)     (0.480)     (0.489)
Close to Goal               0.267       0.294       0.245         0.000
                            (0.442)     (0.456)     (0.430)
2001                        0.194       0.191       0.197         0.006
                            (0.396)     (0.393)     (0.398)
2002                        0.205*      0.198       0.211         0.000
                            (0.404)     (0.398)     (0.408)
2003                        0.202       0.207*      0.199         0.000
                            (0.402)     (0.405)     (0.399)
2004                        0.203       0.207       0.200         0.004
                            (0.402)     (0.405)     (0.400)
2005                        0.195       0.197       0.193         0.068
                            (0.396)     (0.398)     (0.395)
Temperature 40 or Below     0.120       0.124       0.116         0.000
                            (0.325)     (0.330)     (0.320)
Home Team                   0.505       0.515       0.498         0.000
                            (0.500)     (0.500)     (0.500)
Grass                       0.638       0.639       0.638         0.609
                            (0.480)     (0.480)     (0.481)
Number of observations      127885      56401       71484

Notes: The unit of observation is an offensive play. Data includes plays from 2001-2005 for the National Football League, excluding fourth-down plays, plays in the last two minutes of a half, overtime, and quarterback runs (which we cannot accurately categorize in terms of intentions into runs versus passes). The variable "success metric" is our estimate of a given play's contribution to the offensive team's score relative to the average play from this down, distance, and yards to goal. The final column of the table reports p-values from a t-test of equality of means for running and passing plays. Standard deviations are shown in parentheses.
* p<0.05, ** p<0.01, *** p<0.001
Table 7: Regression Estimates of the Determinants of an Offensive Play's Success

                                        (1)        (2)        (3)        (4)
Pass                                    0.066***   0.083***   0.078***   0.075***
                                        (0.007)    (0.008)    (0.008)    (0.008)
2002                                               0.022*     0.023*     0.027*
                                                   (0.011)    (0.011)    (0.013)
2003                                               0.005      0.006      0.005
                                                   (0.011)    (0.011)    (0.013)
2004                                               0.034**    0.034**    0.047***
                                                   (0.011)    (0.011)    (0.013)
2005                                               0.014      0.016      0.022
                                                   (0.011)    (0.011)    (0.012)
Temperature 40 or below                            -0.009     -0.013     -0.012
                                                   (0.011)    (0.011)    (0.014)
Home Team                                          0.040***   0.043***   0.048***
                                                   (0.007)    (0.007)    (0.008)
Grass                                              0.003      0.025**    0.021
                                                   (0.007)    (0.009)    (0.011)
R²                                      0.001      0.004      0.007      0.016
Down x Distance                         No         Yes        Yes        Yes
Quarter x Score Differential            No         Yes        Yes        Yes
Offensive Team FEs                      No         No         Yes        Yes
Defensive Team FEs                      No         No         Yes        Yes
Offensive Team x Defensive Team FEs     No         No         No         Yes

Notes: The dependent variable is our "success metric," which is our best estimate of the marginal contribution of this offensive play to the outcome of the game, measured in units of points scored. The success metric is measured relative to the expected outcome for a play at a given down, distance, and yards to the goal. The variable "pass" is an indicator variable equal to one if the play is designed to be a pass and zero otherwise. Standard errors are shown in parentheses. The number of observations is equal to 127,885 in all columns.
* p<0.05, ** p<0.01, *** p<0.001
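Column (1) of Table 7 includes no covariates, so its pass coefficient should equal the raw difference in mean success metric between passes and runs reported in Table 6. The sketch below checks this with the Table 6 means and, on a small made-up sample (the toy values are hypothetical and purely illustrative), confirms the general fact that an OLS slope on a single 0/1 regressor equals the difference in group means.

```python
# Means and sample sizes from Table 6.
RUN_MEAN, PASS_MEAN = -0.0370, 0.0292
N_RUNS, N_PASSES = 56401, 71484

raw_gap = PASS_MEAN - RUN_MEAN
overall_mean = (N_RUNS * RUN_MEAN + N_PASSES * PASS_MEAN) / (N_RUNS + N_PASSES)


def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical toy sample: 1 = pass, 0 = run; outcomes are invented.
toy_pass = [0, 0, 0, 1, 1]
toy_outcome = [-0.5, 0.1, 0.1, 0.4, 0.0]

toy_slope = ols_slope(toy_pass, toy_outcome)
run_vals = [y for p, y in zip(toy_pass, toy_outcome) if p == 0]
pass_vals = [y for p, y in zip(toy_pass, toy_outcome) if p == 1]
toy_gap = sum(pass_vals) / len(pass_vals) - sum(run_vals) / len(run_vals)

print(f"raw pass-run gap from Table 6: {raw_gap:.4f}")
print(f"pooled mean of success metric: {overall_mean:.6f}")
print(f"toy OLS slope {toy_slope:.3f} vs toy difference in means {toy_gap:.3f}")
```

The pooled mean comes out near zero, consistent with the success metric being defined as a deviation from the expected outcome.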
Table 8: Sensitivity Analysis of the Gap between Runs and Passes

Coefficient on pass:
                      Success Metric   Yards Gained     First Down Made  Turnover         Scoring
Baseline              0.075***         0.794***         0.128***         0.031***         0.023***
                      (0.008)          (0.073)          (0.003)          (0.001)          (0.001)
1st Quarter           0.095***         1.119***         0.133***         0.028***         0.022***
                      (0.014)          (0.133)          (0.005)          (0.002)          (0.002)
2nd Quarter           0.094***         0.825***         0.132***         0.033***         0.030***
                      (0.017)          (0.159)          (0.006)          (0.002)          (0.003)
3rd Quarter           0.061***         0.715***         0.126***         0.030***         0.022***
                      (0.015)          (0.141)          (0.005)          (0.002)          (0.002)
4th Quarter           0.049**          0.435**          0.116***         0.034***         0.021***
                      (0.018)          (0.167)          (0.006)          (0.003)          (0.003)
Visitor               0.071***         0.687***         0.128***         0.034***         0.023***
                      (0.011)          (0.105)          (0.004)          (0.002)          (0.002)
Home                  0.076***         0.878***         0.126***         0.028***         0.023***
                      (0.011)          (0.103)          (0.004)          (0.002)          (0.002)
Top 1/3 Offenses      0.121***         1.475***         0.146***         0.026***         0.028***
                      (0.013)          (0.124)          (0.005)          (0.002)          (0.002)
Middle 1/3 Offenses   0.068***         0.692***         0.127***         0.032***         0.023***
                      (0.014)          (0.129)          (0.005)          (0.002)          (0.002)
Bottom 1/3 Offenses   0.028*           0.143            0.110***         0.035***         0.017***
                      (0.014)          (0.130)          (0.004)          (0.002)          (0.002)
Passing Offenses      0.053***         0.672***         0.120***         0.032***         0.021***
                      (0.014)          (0.132)          (0.005)          (0.002)          (0.002)
Balanced Offenses     0.100***         0.938***         0.135***         0.030***         0.026***
                      (0.013)          (0.127)          (0.005)          (0.002)          (0.002)
Running Offenses      0.074***         0.788***         0.128***         0.031***         0.022***
                      (0.013)          (0.124)          (0.004)          (0.002)          (0.002)

Notes: Dependent variable is listed at the head of each column. Each entry in the table is from a different regression parallel to the specification shown in column (4) of Table 7. Controls included in the regression are interactions for team*opponent, down*distance, quarter*score differential, and indicators for grass versus turf, temperature below 40 degrees, and home team. The top row of the table includes the whole sample. The other rows of the table divide the data set into subsamples. Standard errors are shown in parentheses.
* p<0.05, ** p<0.01, *** p<0.001