Gail E. Potter, Timo Smieszek, and Kerstin Sailer. April 24, 2015


Supplementary Material to "Modelling workplace contact networks: the effects of organizational structure, architecture, and reporting errors on epidemic predictions", published in Network Science. Gail E. Potter, Timo Smieszek, and Kerstin Sailer. April 24, 2015.

A Comparison of reporting probability estimates to those in previous work

We compare the reporting probability estimates from our proportional odds model with angular distance to those from Smieszek et al. (2012) in Table 1. The estimates obtained by the two methods are extremely similar. The wide confidence interval for contacts lasting more than an hour arises because all contacts of this duration were reported with 100% consistency, so there is no variability with which to estimate the standard error of the reporting probability.

Table 1: Comparison of our reporting probability estimates to those in Smieszek et al. (Monday only)

| Duration (minutes) | Angular model estimate [confidence interval] | Smieszek et al. |
|---|---|---|
| 0-5 | 0.56 [0.41, 0.69] | 0.53 |
| 6-15 | 0.96 [0.84, 0.99] | 0.96 |
| 16-60 | 0.93 [0.83, 0.99] | 0.93 |
| 61-480 | 1.00 [0.00, 1.00] | 1.00 |

B Results from proportional odds models with four different distance metrics

Table 2 shows results from proportional odds models with four different distance metrics. The model with the angular distance metric fits best according to the AIC.

Table 2: Coefficients (SEs) for proportional odds models for contact duration, using four different distance metrics

| | Metric | Topo | Angular | Axtopo |
|---|---|---|---|---|
| Group 1 | -0.32 (0.19) . | -0.33 (0.19) . | -0.18 (0.20) | -0.20 (0.20) |
| Group 2 | -0.07 (0.18) | -0.06 (0.18) | 0.11 (0.19) | 0.13 (0.20) |
| Group mixing | 3.41 (0.48) *** | 3.49 (0.47) *** | 3.42 (0.45) *** | 3.39 (0.45) *** |
| Distance | -0.01 (0.02) | -0.01 (0.04) | -0.22 (0.08) ** | -0.20 (0.08) * |
| Female | 0.36 (0.21) . | 0.37 (0.21) . | 0.31 (0.21) | 0.31 (0.21) |
| Role mixing | 0.79 (0.30) ** | 0.83 (0.30) ** | 0.60 (0.29) * | 0.63 (0.29) * |
| Gender mixing | -0.21 (0.26) | -0.22 (0.26) | -0.18 (0.26) | -0.18 (0.26) |
| Floor | 1.12 (0.52) * | 1.23 (0.61) * | -0.09 (0.68) | -0.33 (0.77) |
| Shared projects | 1.17 (0.28) *** | 1.20 (0.28) *** | 1.06 (0.28) *** | 1.08 (0.28) *** |
| AIC | 779.1 | 779.4 | 772.5 | 773.1 |

Significance levels: *** = p < 0.001; ** = p < 0.01; * = p < 0.05; . = p < 0.10
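As a small illustration of the AIC comparison in Table 2, the following sketch ranks the four distance metrics by AIC. The values are copied from the table; the variable names are illustrative only and are not part of the original analysis.

```python
# AIC values copied from Table 2 (lower indicates better fit).
aic = {"Metric": 779.1, "Topo": 779.4, "Angular": 772.5, "Axtopo": 773.1}

# Name of the distance metric with the smallest AIC.
best = min(aic, key=aic.get)

# Difference from the best-fitting model, rounded to one decimal place.
delta = {name: round(value - aic[best], 1) for name, value in aic.items()}

print(best)
print(delta)
```

The differences make it easy to see how far each alternative metric lies from the best-fitting one on the AIC scale.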

C Results from testing proportional odds model assumption

Table 3 compares log odds ratio estimates from logistic regression models fitted to contact duration, dichotomized at different cutoffs (0, 5, 15, or 60 minutes). Some estimates are effectively infinite, with infinite standard errors, because either 0% or 100% cell counts were observed. The table suggests that while the proportional odds assumption probably does not hold perfectly, it is not unreasonable. The coefficient estimates for group mixing and distance, the two main effects of primary interest, are remarkably similar across cutoffs. Other coefficients vary somewhat, but the differences are not statistically significant.

Table 3: Log odds ratio estimates and 95% confidence intervals at different dichotomizations of contact duration to test the proportional odds model assumption, metric distance measure.

| Effect | Duration > 0 | Duration > 5 | Duration > 15 | Duration > 60 |
|---|---|---|---|---|
| Group 1 | -0.04 [-0.54, 0.46] | -0.02 [-0.65, 0.60] | -0.49 [-1.17, 0.20] | -0.65 [-1.23, -0.06] |
| Group 2 | -0.27 [-0.75, 0.22] | -0.05 [-0.63, 0.53] | -0.41 [-1.06, 0.24] | -0.58 [-1.09, -0.06] |
| Group mixing | 3.96 [2.92, 5.01] | 4.13 [2.57, 5.69] | 3.59 [2.00, 5.18] | 17.73 [NA, NA] |
| Distance | -0.01 [-0.04, 0.03] | -0.03 [-0.07, 0.01] | -0.02 [-0.06, 0.02] | -0.01 [-0.06, 0.04] |
| Sex | -0.08 [-0.51, 0.35] | -0.04 [-0.53, 0.44] | 0.33 [-0.21, 0.87] | 0.19 [-0.47, 0.86] |
| Role | 0.93 [0.35, 1.51] | 1.52 [0.86, 2.18] | 1.76 [1.01, 2.51] | -0.17 [-1.28, 0.94] |
| Gender mixing | -0.12 [-0.66, 0.41] | -0.29 [-0.91, 0.33] | -0.41 [-1.12, 0.30] | 0.7 [-0.24, 1.64] |
| Same floor | 1.57 [0.40, 2.74] | 16.82 [NA, NA] | 16.62 [NA, NA] | 16.93 [NA, NA] |
| Shared projects | 3.78 [1.40, 6.15] | 2.15 [0.90, 3.39] | 2.39 [1.23, 3.55] | 1.42 [0.32, 2.53] |
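The dichotomization step behind Table 3 can be sketched as follows. This is a minimal illustration, not the authors' code: the durations are hypothetical values in minutes, and `dichotomize` is a helper name introduced here. Under the proportional odds assumption, a logistic regression fitted to each of these binary indicators should yield roughly the same log odds ratio for a given covariate.

```python
# Cutoffs (in minutes) used in Table 3.
CUTOFFS = [0, 5, 15, 60]

def dichotomize(durations, cutoff):
    """Return 1 where the duration exceeds the cutoff, else 0."""
    return [1 if d > cutoff else 0 for d in durations]

# Hypothetical contact durations in minutes (0 = no contact); not study data.
durations = [0, 3, 10, 45, 90, 0, 20]

# One binary outcome per cutoff; each would be the response in its own
# logistic regression.
indicators = {c: dichotomize(durations, c) for c in CUTOFFS}

for cutoff, values in indicators.items():
    print(cutoff, values)
```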

D Additional fits of proportional odds models

Table 4: Coefficients (SEs) for proportional odds models for five days of the week, using angular distance metric

| | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| Intercepts | | | | | |
| 0 | 2.65 (1.04) * | 3.37 (0.88) ** | 3.11 (1.14) * | 4.25 (0.98) ** | 2.98 (1.4) * |
| 1-5 | 3.87 (1.04) *** | 4.23 (0.89) *** | 4.41 (1.13) *** | 5.01 (0.98) *** | 3.33 (1.40) *** |
| 6-15 | 4.58 (1.04) *** | 4.82 (0.89) *** | 5.01 (1.13) *** | 5.51 (0.99) *** | 3.76 (1.40) *** |
| 16-60 | 6.22 (1.08) *** | 6.44 (0.93) *** | 6.78 (1.15) *** | 6.90 (1.00) *** | 5.55 (1.41) *** |
| Group 1 | -0.18 (0.20) | 0.40 (0.18) * | 0.28 (0.18) | 0.07 (0.18) | 0.29 (0.31) |
| Group 2 | 0.11 (0.19) | 0.17 (0.20) | -0.09 (0.18) | 0.08 (0.19) | 0.06 (0.33) |
| Group mixing | 3.42 (0.45) *** | 2.75 (0.39) *** | 4.54 (0.64) *** | 3.94 (0.49) *** | 3.00 (0.48) *** |
| Distance | -0.22 (0.08) * | -0.13 (0.07) . | -0.29 (0.09) ** | -0.16 (0.07) * | -0.15 (0.11) |
| Female | 0.31 (0.21) | -0.09 (0.17) | 0.29 (0.20) | 0.02 (0.19) | -0.02 (0.27) |
| Role mixing | 0.60 (0.29) . | 0.29 (0.25) | 0.79 (0.30) * | 0.98 (0.24) ** | 1.35 (0.37) ** |
| Gender mixing | -0.18 (0.26) | 0.26 (0.22) | 0.27 (0.25) | 0.01 (0.22) | -0.50 (0.31) |
| Floor | -0.09 (0.68) | -0.14 (0.56) | -0.76 (0.69) | 0.63 (0.62) | -0.61 (0.79) |
| Shared projects | 1.06 (0.28) ** | 1.62 (0.23) *** | 1.28 (0.23) *** | 0.82 (0.22) ** | 0.65 (0.26) * |

E Multinomial logit model

E.0.1 Multinomial logit model likelihood

In this model we predict both contact and contact duration as a function of covariates. We use a multinomial logit model to estimate the probability of each of the four duration categories, or a fifth category, non-contact. We now re-define our notation to reflect the inclusion of non-contact as a duration category. Define $\pi_k(x) = P(D_{ij} = d_k \mid X_{ij} = x)$ for $k = 0, \ldots, 4$ (representing categories 0, 1-5, 6-15, 16-60, and 61+ minutes). Let $X_{ij}$ denote individual-level and dyadic covariates in our model. Again we let $D$ denote the matrix of contact durations (after removing inconsistencies in duration reports), with non-contacts having duration zero.
Using non-contact as the baseline duration category, the multinomial model is defined by Agresti (2002):

$$\log \frac{P(D_{ij} = d_k \mid X_{ij} = x)}{P(D_{ij} = d_0 \mid X_{ij} = x)} = \alpha_k + \beta_k^T x, \quad \text{for } k = 1, 2, 3, 4$$

From this we obtain:

$$P(D_{ij} = d_k \mid X_{ij} = x) = \frac{e^{\alpha_k + \beta_k^T x}}{1 + \sum_{h=1}^{4} e^{\alpha_h + \beta_h^T x}}$$

Because the probabilities must sum to one,

$$P(D_{ij} = d_0 \mid X_{ij} = x) = \frac{1}{1 + \sum_{h=1}^{4} e^{\alpha_h + \beta_h^T x}}.$$

By applying our assumptions, rules of conditional probability, and the Law of Total Probability, we find that the joint likelihood of $D$ and $C$ is:

$$P(C_{ij} = 1, C_{ji} = 1, D_{ij} = d_k) = P(D_{ij} = d_k)\, p_k^2$$

$$P(C_{ij} = 1, C_{ji} = 0, D_{ij} = d_k) = P(D_{ij} = d_k)\, p_k (1 - p_k)$$

$$P(C_{ij} = 0, C_{ji} = 0, D_{ij} = 0) = P(D_{ij} = 0) + \sum_{k=1}^{4} P(D_{ij} = d_k)(1 - p_k)^2$$

Then the probability of the observed data is:

$$P(C = c, D = d) = \prod_{i=1}^{n} \prod_{j=i+1}^{n} P(C_{ij} = c_{ij}, C_{ji} = c_{ji}, D_{ij} = d_k)$$

We maximized the log likelihood to estimate $\alpha$, $\beta$, and $p$ using the trust function in R and computed standard errors by inverting the Fisher information matrix (Geyer, 2009).

F Goodness of fit to assess modelling of transitivity

Figure 1 compares goodness of fit diagnostics for two models in order to assess how well our model captured the transitivity present in the network. The first model is our ERGM with angular distance, fit to a nondirectional binary network created by assuming that contact between two individuals occurred if it was reported by at least one of the two. The second model is the same ERGM, but also including a geometrically weighted edgewise shared partners (gwesp) term with alpha = 0.5. The box plots show network statistics for networks simulated from each model, while the solid line shows network statistics for the actual data. The figures show that our model does a good job representing the degree distribution and the minimum geodesic distance of the network, but overestimates the proportion of edges with 2-3 shared partners and underestimates the proportion of edges with 6-8 shared partners. The model with the added gwesp term mostly corrects this problem.

[Figure 1 omitted. Each row contains three panels of goodness of fit diagnostics: proportion of nodes by degree, proportion of edges by edge-wise shared partners, and proportion of dyads by minimum geodesic distance.]

Figure 1: Goodness of fit diagnostics for our model (top) without adjusting for reporting errors, compared to those for an extension of the model which also includes a gwesp(0.5) term to capture transitivity.
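The source does not spell out the gwesp statistic. The sketch below assumes the usual geometrically weighted edgewise shared partner definition from the ERGM literature, $e^{\alpha} \sum_i \{1 - (1 - e^{-\alpha})^i\}\, ESP_i$, where $ESP_i$ is the number of edges whose endpoints have exactly $i$ shared partners. The function name and the example counts are hypothetical and are not taken from the study network.

```python
import math

def gwesp(esp_counts, alpha):
    """Geometrically weighted edgewise shared partner statistic.

    esp_counts maps i (number of shared partners) to the number of edges
    whose two endpoints have exactly i shared partners.
    """
    decay = 1.0 - math.exp(-alpha)
    # Edges with i = 0 get weight 0; additional shared partners add
    # geometrically shrinking increments.
    return math.exp(alpha) * sum(
        (1.0 - decay ** i) * count for i, count in esp_counts.items()
    )

# Hypothetical edgewise shared partner distribution (not the study data).
example_counts = {0: 5, 1: 10, 2: 8, 3: 4}
value = gwesp(example_counts, alpha=0.5)
print(value)
```

With this weighting, an edge with exactly one shared partner contributes 1 to the statistic, and each further shared partner contributes less than the one before, which is why the term can absorb clustering without rewarding very high shared partner counts linearly.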

F.1 Multinomial logit model likelihood results

Table 5 shows coefficient estimates from the multinomial logit model with four distance metrics. Coefficients are interpreted as follows: the odds of a 1-5 minute contact relative to no contact increase by a factor of $e^{3.24} \approx 26$ if two people are in the same research group, controlling for other variables in the model. The odds of a 16-60 minute contact relative to no contact are multiplied by a factor of $e^{-0.05} \approx 0.95$ for each unit increase in metric distance between their workstations, controlling for other variables in the model.

Some coefficients do not have finite standard errors because of zero or 100% cell counts. For example, all reported 16-60 and 61+ minute contacts were on the same floor. The floor coefficient for these categories should be infinite, but is estimated as a very large number (after exponentiation). All reported 61+ minute contacts were among members of the same research group, resulting in an infinite coefficient for group mixing.

The set of predictor variables in the multinomial model that we fit differs from our full model in the text in that the shared projects variable is excluded. Including this variable would only amplify the estimation problems caused by estimating a large number of parameters with several cases of small cell counts. We include in this section estimates from the proportional odds model so the reader may compare them to the multinomial model.
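The arithmetic behind these interpretations can be checked with a short sketch. The coefficients 3.24 and -0.05 are the ones quoted above; `odds_factor` and the ten-unit example are illustrative additions, not part of the original analysis.

```python
import math

def odds_factor(coefficient, delta=1.0):
    """Multiplicative change in the odds for a change of `delta` in a covariate."""
    return math.exp(coefficient * delta)

# Same research group, 1-5 minute contact vs. no contact.
same_group = odds_factor(3.24)

# One additional unit of metric distance, 16-60 minute contact vs. no contact.
per_unit_distance = odds_factor(-0.05)

# Illustrative extension: ten additional units of distance.
ten_units_distance = odds_factor(-0.05, delta=10)

print(round(same_group, 1), round(per_unit_distance, 3), round(ten_units_distance, 3))
```

Because the effect is multiplicative on the odds scale, a small per-unit factor compounds over larger distances.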

Table 5: Multinomial model estimates (SEs)

| | Metric | Angular | Topo | Axtopo |
|---|---|---|---|---|
| 1-5 minutes | | | | |
| Int. | -4.48 (0.94) *** | -2.28 (1.05) * | -4.85 (1.02) *** | -2.36 (1.18) * |
| Group 1 | -0.01 (0.2) | 0.15 (0.21) | -0.02 (0.2) | 0.11 (0.21) |
| Group 2 | 0.29 (0.2) | 0.5 (0.22) * | 0.28 (0.2) | 0.49 (0.23) * |
| Group Mixing | 3.24 (0.44) *** | 3.22 (0.41) *** | 3.28 (0.43) *** | 3.19 (0.41) *** |
| Distance | 0 (0.02) | -0.21 (0.1) * | 0.02 (0.04) | -0.16 (0.09) . |
| Female | -0.13 (0.21) | -0.15 (0.21) | -0.13 (0.21) | -0.15 (0.21) |
| Role Mixing | 0.42 (0.31) | 0.24 (0.32) | 0.45 (0.31) | 0.29 (0.31) |
| Gender Mixing | -0.3 (0.26) | -0.28 (0.26) | -0.3 (0.26) | -0.29 (0.26) |
| Floor | 0.22 (0.48) | -1.15 (0.72) | 0.42 (0.57) | -1.2 (0.84) |
| 6-15 minutes | | | | |
| Int. | -6.6 (1.47) *** | -2.34 (1.42) | -6.6 (1.58) *** | -1.91 (1.53) |
| Group 1 | -0.08 (0.26) | 0.31 (0.28) | -0.08 (0.26) | 0.25 (0.27) |
| Group 2 | 0.37 (0.25) | 0.86 (0.28) ** | 0.4 (0.26) | 0.89 (0.28) ** |
| Group Mixing | 3.63 (0.79) *** | 3.78 (0.78) *** | 3.68 (0.79) *** | 3.7 (0.78) *** |
| Distance | -0.02 (0.02) | -0.55 (0.11) *** | -0.05 (0.05) | -0.48 (0.1) *** |
| Female | 0.21 (0.26) | 0.09 (0.27) | 0.21 (0.26) | 0.09 (0.26) |
| Role Mixing | 0.9 (0.37) * | 0.48 (0.37) | 0.91 (0.37) * | 0.53 (0.37) |
| Gender Mixing | -0.23 (0.33) | -0.17 (0.34) | -0.24 (0.33) | -0.17 (0.34) |
| Floor | 1.21 (0.88) | -1.84 (1.06) . | 1.05 (0.98) | -2.4 (1.17) * |
| 16-60 minutes | | | | |
| Int. | -19.76 (NA) | -18.94 (NA) | -19.94 (NA) | -18.23 (NA) |
| Group 1 | -0.12 (0.23) | 0.09 (0.24) | -0.13 (0.23) | 0.05 (0.23) |
| Group 2 | -0.09 (0.23) | 0.26 (0.24) | -0.04 (0.23) | 0.28 (0.24) |
| Group Mixing | 3.72 (0.78) *** | 4.05 (0.75) *** | 3.75 (0.78) *** | 4 (0.75) *** |
| Distance | -0.05 (0.02) ** | -0.39 (0.1) *** | -0.1 (0.04) * | -0.35 (0.1) *** |
| Female | 1.1 (0.4) ** | 1.05 (0.41) * | 1.12 (0.4) ** | 1.05 (0.41) * |
| Role Mixing | 1.58 (0.38) *** | 1.53 (0.37) *** | 1.6 (0.38) *** | 1.56 (0.37) *** |
| Gender Mixing | -1.31 (0.45) ** | -1.38 (0.45) ** | -1.33 (0.45) ** | -1.37 (0.45) ** |
| Floor | 14.49 (NA) | 13.8 (NA) | 14.5 (NA) | 12.99 (NA) |
| 61+ minutes | | | | |
| Int. | -26.51 (NA) | -52.12 (6.45) *** | -29.14 (NA) | -24.45 (NA) |
| Group 1 | -0.74 (0.3) * | -0.5 (0.31) | -0.72 (0.3) * | -0.53 (0.31) . |
| Group 2 | -0.61 (0.26) * | -0.27 (0.28) | -0.56 (0.26) * | -0.21 (0.29) |
| Group Mixing | 13.72 (126.44) | 42.44 (10.25) *** | 14.7 (116.34) | 14.09 (NA) |
| Distance | -0.04 (0.03) | -0.35 (0.15) * | -0.03 (0.06) | -0.34 (0.14) * |
| Female | 0.38 (0.34) | 0.3 (0.34) | 0.35 (0.34) | 0.31 (0.34) |
| Role Mixing | 0.31 (0.54) | 0.28 (0.51) | 0.53 (0.54) | 0.27 (0.52) |
| Gender Mixing | 0.53 (0.48) | 0.47 (0.48) | 0.47 (0.48) | 0.48 (0.48) |
| Floor | 11.83 (NA) | 9.15 (NA) | 12.86 (NA) | 9.78 (NA) |
| AIC | 1478 | 1453 | 1480 | 1456 |

Significance levels: *** = p < 0.001; ** = p < 0.01; * = p < 0.05; . = p < 0.10
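To make the link between such coefficient estimates and the likelihood of Section E concrete, here is a minimal Python sketch of the category probabilities and the contribution of a single dyad. It is not the authors' implementation (they used the trust function in R). The function names, the two-covariate example, and all parameter values are hypothetical. The sketch also assumes that p_k is the reporting probability for duration category k, with illustrative values borrowed from Table 1, and that the case where only j reports a contact is handled symmetrically to the case where only i does.

```python
import math

def duration_probs(alphas, betas, x):
    """Return [P(d_0), P(d_1), ..., P(d_4)] for covariate vector x,
    with non-contact (d_0) as the baseline category."""
    etas = [a + sum(b * xj for b, xj in zip(beta, x))
            for a, beta in zip(alphas, betas)]
    denom = 1.0 + sum(math.exp(eta) for eta in etas)
    return [1.0 / denom] + [math.exp(eta) / denom for eta in etas]

def dyad_likelihood(c_ij, c_ji, k, probs, p):
    """Likelihood contribution of one dyad with reports c_ij, c_ji and
    observed duration category k (k is ignored when neither reports)."""
    if c_ij == 1 and c_ji == 1:
        return probs[k] * p[k] ** 2
    if c_ij + c_ji == 1:  # assumed symmetric in i and j
        return probs[k] * p[k] * (1.0 - p[k])
    # Neither reported: a true non-contact, or a contact both failed to report.
    return probs[0] + sum(probs[h] * (1.0 - p[h]) ** 2 for h in range(1, 5))

# Hypothetical parameters: intercepts and (group mixing, distance) slopes.
alphas = [-4.5, -6.6, -5.0, -7.0]
betas = [[3.2, 0.0], [3.6, -0.02], [3.7, -0.05], [4.0, -0.04]]
x = [1.0, 10.0]  # same group, 10 distance units apart (hypothetical)
p = [0.0, 0.56, 0.96, 0.93, 1.00]  # p[0] is an unused placeholder

probs = duration_probs(alphas, betas, x)
log_lik_one_dyad = math.log(dyad_likelihood(1, 0, 1, probs, p))
print(probs, log_lik_one_dyad)
```

The full log likelihood would sum such log contributions over all dyads with i < j. A useful sanity check is that the contributions over every possible report pattern and duration category add up to one.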

Table 6: Coefficients (SEs) for proportional odds models for contact duration, using four different distance metrics

| | Metric | Angular | Topo | Axtopo |
|---|---|---|---|---|
| Group 1 | -0.32 (0.19) . | -0.18 (0.20) | -0.33 (0.19) . | -0.20 (0.20) |
| Group 2 | -0.07 (0.18) | 0.11 (0.19) | -0.06 (0.18) | 0.13 (0.20) |
| Group mixing | 3.41 (0.48) *** | 3.42 (0.45) *** | 3.49 (0.47) *** | 3.39 (0.45) *** |
| Distance | -0.01 (0.02) | -0.22 (0.08) ** | -0.01 (0.04) | -0.20 (0.08) * |
| Female | 0.36 (0.21) . | 0.31 (0.21) | 0.37 (0.21) . | 0.31 (0.21) |
| Role mixing | 0.79 (0.30) ** | 0.60 (0.29) * | 0.83 (0.30) ** | 0.63 (0.29) * |
| Gender mixing | -0.21 (0.26) | -0.18 (0.26) | -0.22 (0.26) | -0.18 (0.26) |
| Floor | 1.12 (0.52) * | -0.09 (0.68) | 1.23 (0.61) * | -0.33 (0.77) |
| Shared projects | 1.17 (0.28) *** | 1.06 (0.28) *** | 1.20 (0.28) *** | 1.08 (0.28) *** |
| AIC | 779.1 | 772.5 | 779.4 | 773.1 |

Significance levels: *** = p < 0.001; ** = p < 0.01; * = p < 0.05; . = p < 0.10

Table 7: Coefficients [95% Confidence Intervals] for multinomial model with no floor effect and two largest duration categories collapsed

METRIC MODEL

| Effect | 1-5 mins: Est. [95% CI] | 6-15 mins: Est. [95% CI] | 16+ mins: Est. [95% CI] |
|---|---|---|---|
| Intercept | -4.16 [-5.66, -2.66] | -5.22 [-7.47, -2.97] | -4.07 [-6.06, -2.07] |
| Group 1 | -0.03 [-0.42, 0.37] | -0.13 [-0.66, 0.41] | -0.44 [-0.84, -0.05] |
| Group 2 | 0.31 [-0.06, 0.69] | 0.43 [-0.09, 0.95] | -0.30 [-0.68, 0.08] |
| Group Membership | 3.23 [2.36, 4.11] | 3.68 [2.09, 5.28] | 4.14 [2.61, 5.67] |
| Distance | 0 [-0.03, 0.02] | -0.04 [-0.08, -0.01] | -0.07 [-0.10, -0.04] |
| Sex | -0.12 [-0.53, 0.29] | 0.24 [-0.27, 0.75] | 0.68 [0.18, 1.19] |
| Role mixing | 0.43 [-0.19, 1.06] | 0.86 [0.13, 1.59] | 1.13 [0.49, 1.76] |
| Sex mixing | -0.29 [-0.81, 0.23] | -0.16 [-0.82, 0.49] | -0.49 [-1.10, 0.12] |

TOPO MODEL

| Effect | 1-5 mins: Est. [95% CI] | 6-15 mins: Est. [95% CI] | 16+ mins: Est. [95% CI] |
|---|---|---|---|
| Int. | -4.27 [-5.7, -2.84] | -5.27 [-7.47, -3.07] | -4.26 [-6.24, -2.28] |
| Group 1 | -0.03 [-0.43, 0.37] | -0.10 [-0.63, 0.43] | -0.42 [-0.81, -0.03] |
| Group 2 | 0.32 [-0.06, 0.70] | 0.44 [-0.08, 0.96] | -0.26 [-0.64, 0.12] |
| Group Mixing | 3.28 [2.43, 4.13] | 3.68 [2.10, 5.26] | 4.18 [2.65, 5.71] |
| Distance | 0 [-0.05, 0.05] | -0.09 [-0.16, -0.02] | -0.14 [-0.20, -0.08] |
| Female | -0.12 [-0.53, 0.29] | 0.24 [-0.27, 0.75] | 0.69 [0.19, 1.19] |
| Role Mixing | 0.43 [-0.19, 1.05] | 0.85 [0.12, 1.58] | 1.13 [0.49, 1.77] |
| Gender Mixing | -0.29 [-0.81, 0.23] | -0.19 [-0.85, 0.47] | -0.53 [-1.14, 0.08] |

ANGULAR MODEL

| Effect | 1-5 mins: Est. [95% CI] | 6-15 mins: Est. [95% CI] | 16+ mins: Est. [95% CI] |
|---|---|---|---|
| Int. | -3.63 [-4.90, -2.36] | -4.11 [-6.13, -2.09] | -4.15 [-6.01, -2.29] |
| Group 1 | 0.09 [-0.33, 0.51] | 0.26 [-0.28, 0.8] | -0.13 [-0.53, 0.27] |
| Group 2 | 0.31 [-0.06, 0.68] | 0.70 [0.18, 1.22] | 0.01 [-0.36, 0.38] |
| Group Mixing | 3.06 [2.28, 3.84] | 3.53 [2.03, 5.03] | 4.37 [2.89, 5.85] |
| Distance | -0.08 [-0.18, 0.02] | -0.45 [-0.63, -0.27] | -0.40 [-0.54, -0.26] |
| Female | -0.16 [-0.57, 0.25] | 0.09 [-0.44, 0.62] | 0.57 [0.07, 1.07] |
| Role Mixing | 0.30 [-0.31, 0.91] | 0.53 [-0.20, 1.26] | 1.09 [0.47, 1.71] |
| Gender Mixing | -0.28 [-0.80, 0.24] | -0.18 [-0.85, 0.49] | -0.59 [-1.20, 0.02] |

AXTOPO MODEL

| Effect | 1-5 mins: Est. [95% CI] | 6-15 mins: Est. [95% CI] | 16+ mins: Est. [95% CI] |
|---|---|---|---|
| Int. | -3.79 [-5.05, -2.53] | -4.32 [-6.32, -2.32] | -4.19 [-6.04, -2.34] |
| Group 1 | 0.06 [-0.35, 0.47] | 0.18 [-0.35, 0.71] | -0.17 [-0.57, 0.23] |
| Group 2 | 0.31 [-0.06, 0.68] | 0.69 [0.18, 1.20] | 0.03 [-0.35, 0.41] |
| Group Mixing | 3.09 [2.30, 3.88] | 3.46 [1.96, 4.96] | 4.24 [2.76, 5.72] |
| Distance | -0.05 [-0.13, 0.03] | -0.36 [-0.52, -0.2] | -0.35 [-0.48, -0.22] |
| Female | -0.15 [-0.56, 0.26] | 0.10 [-0.42, 0.62] | 0.57 [0.07, 1.07] |
| Role Mixing | 0.34 [-0.27, 0.95] | 0.60 [-0.13, 1.33] | 1.11 [0.49, 1.73] |
| Gender Mixing | -0.29 [-0.81, 0.23] | -0.18 [-0.85, 0.49] | -0.59 [-1.2, 0.02] |

References

Agresti, A. (2002). Categorical Data Analysis. 2nd edn. Wiley Series in Probability and Statistics. Wiley-Interscience.

Geyer, Charles J. (2009). trust: Trust region optimization. R package version 0.1-2.

Smieszek, Timo, Burri, Elena U., Scherzinger, Robert, & Scholz, Roland W. (2012). Collecting close-contact social mixing data with contact diaries: reporting errors and biases. Epidemiology and Infection, 140(4), 744-752.