Comparison of Multivariate Data Representations: Three Eyes are Better than One

Natsuhiko Kumasaka (Keio University)
Antony Unwin (Augsburg University)

Content
Visualisation of multivariate data
Parallel coordinate plots
Textile plots
Mosaic plots
Visual data analysis
Some examples

Parallel Coordinate Plots
Each variable has its own vertical axis.
Each case is represented by a set of line segments joining its points on the axes.
The form, scaling and order of the axes influence the display a great deal.
Interaction: querying, selecting and linking, rescaling, reordering
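The construction described on this slide can be sketched in a few lines. This is a minimal illustration, not the implementation behind the plots in the talk: the function name `pcp_polylines`, the min-max scaling of each axis, and the toy data are all assumptions made for the example.

```python
def pcp_polylines(rows, order=None):
    """Return one polyline per case as a list of (axis_position, scaled_value)."""
    n_vars = len(rows[0])
    order = list(range(n_vars)) if order is None else list(order)
    # Each variable gets its own axis, here scaled from its minimum to its
    # maximum (an assumed default; other scalings change the display).
    lows = [min(r[j] for r in rows) for j in range(n_vars)]
    highs = [max(r[j] for r in rows) for j in range(n_vars)]

    def scale(value, j):
        span = highs[j] - lows[j]
        # A constant variable is placed at mid-axis to avoid dividing by zero.
        return 0.5 if span == 0 else (value - lows[j]) / span

    # A case becomes the sequence of its points on the axes, in display order.
    return [[(pos, scale(r[j], j)) for pos, j in enumerate(order)] for r in rows]


# Hypothetical toy data: three cases measured on three variables.
cases = [(10.0, 200.0, 1.0), (20.0, 100.0, 3.0), (15.0, 150.0, 2.0)]
lines = pcp_polylines(cases)
reordered = pcp_polylines(cases, order=[2, 0, 1])
```

Passing a different `order` mimics the reordering interaction: the same cases produce different line patterns, which is why axis order matters so much for what the display reveals.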

Decathlon dataset
Best performances each year, 1985 to 2006, by individual decathletes, 7968 cases
Only complete, not hand-timed
10 events, results, points, competition dates
Nationality, birthday
Source: www.decathlon2000.ee

Decathlon analysis goals
Are the points distributions the same for each event?
Which events are most influential?
Have the performances changed over the years?

[Figure: decathlon points by event. Axis labels: Pdt, Psp, Phj, Plj, P110h, Pjt, P1500, Ppv, P400m, P100m; scale marks 400 and 1152.]

Wine dataset (1)
Californian/French Tasting, July 1976
10 Cabernet Sauvignons (6 US, 4 French)
11 judges (9 French, 1 US, 1 English) scored the wines from 0 to 20
The data may also be analysed as ranks
Source: www.liquidasset.com
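Analysing the scores as ranks means converting each judge's scores into a ranking of the wines. The sketch below shows one common convention, in which the highest score gets rank 1 and tied wines share the average of the ranks they cover. The function name and the scores are hypothetical, and the slides do not say which tie rule was used.

```python
def average_ranks(scores):
    """Rank scores so the highest gets rank 1; tied scores share the mean rank."""
    # Indices of the wines, sorted from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        # Extend the run over all wines tied with the one at position start.
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        # Mean of the 1-based positions start+1 .. end+1.
        shared = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


# Hypothetical scores (0 to 20) given by one judge to five wines.
judge_scores = [14, 16, 12, 16, 10]
judge_ranks = average_ranks(judge_scores)
```

Ranking removes differences in how generously each judge uses the 0 to 20 scale, at the cost of discarding the size of the gaps between wines.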

Wine dataset (2)
Cabernet challenge 1999
47 Cabernet Sauvignons (only 46 rated): 34 US, 9 French, 2 Italian, 2 others
Vintages from 1994 to 1996
33 judges (Californian) ranked the wines
Source: www.liquidasset.com

Wine analysis goals
Which wines were rated best?
Is the ranking of wines clear-cut?
Do the judges have similar opinions?
Are there clusters of judges?

Wine boxplots by mean
[Figure: 46 boxplots on a scale from 1 to 46. Labels in extracted order: Var33, Var12, Var22, Var37, Var6, Var26, Var7, Var36, Var40, Var31, Var1, Var39, Var8, Var4, Var34, Var9, Var32, Var21, Var38, Var18, Var23, Var47, Var25, Var27, Var30, Var19, Var20, Var43, Var11, Var28, Var44, Var17, Var10, Var29, Var35, Var13, Var16, Var45, Var42, Var3, Var5, Var46, Var15, Var2, Var14, Var41.]

Judge correlations
A heatmap of correlations between judges, after Ward clustering of the original data to order the judges.
(The display was drawn with Alex Gribov's SEURAT.)
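The numbers behind such a heatmap are the pairwise correlations between the judges' rankings. The following sketch computes only that matrix, using Pearson correlation on hypothetical rankings. The Ward clustering used to order the judges is not reproduced here, and the slides do not say which correlation coefficient was used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(judges):
    """Square matrix of correlations between every pair of judges."""
    return [[pearson(a, b) for b in judges] for a in judges]


# Hypothetical rankings of five wines by three judges (not the tasting data).
judges = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 5, 4],
    [5, 4, 3, 2, 1],
]
corr = correlation_matrix(judges)
```

Applied to ranks, Pearson correlation is the same as Spearman's rank correlation. In this toy example the first two judges largely agree and the third reverses the first judge's order, which is the kind of pattern a clustered heatmap makes visible.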

PCPs and Textile plots
PCPs stick to the raw data
Textile plots transform scales
Textile plots offer informative defaults
PCPs are flexible through interaction

Mosaicplots (Classical)
A rectangle is drawn for every combination of categories.
Area is proportional to count.
Divide the horizontal axis according to the category counts of the first variable.
Divide each vertical column according to the relevant counts of the second variable.
Continue dividing horizontally and vertically according to the relevant counts of the next variable.
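The recursive subdivision described above can be sketched as follows. This is an illustrative version under stated assumptions: the function name `mosaic`, the unit-square layout, the absence of gaps between tiles, and the 2 x 2 counts are all invented for the example and are not taken from Mondrian or from the talk.

```python
def mosaic(counts, x=0.0, y=0.0, w=1.0, h=1.0, depth=0):
    """Return {category tuple: (x, y, width, height)} for a classical mosaicplot.

    counts maps tuples of category labels (one label per variable) to counts.
    """
    n_vars = len(next(iter(counts)))
    if depth == n_vars:
        # Every variable has been used, so exactly one cell remains.
        (cell,) = counts
        return {cell: (x, y, w, h)}
    total = sum(counts.values())
    # Categories of the current variable, in first-seen order.
    levels = list(dict.fromkeys(key[depth] for key in counts))
    tiles = {}
    offset = 0.0
    for level in levels:
        sub = {k: v for k, v in counts.items() if k[depth] == level}
        share = sum(sub.values()) / total if total else 0.0
        if depth % 2 == 0:
            # 1st, 3rd, ... variable: divide along the horizontal axis.
            tiles.update(mosaic(sub, x + offset * w, y, share * w, h, depth + 1))
        else:
            # 2nd, 4th, ... variable: divide each column vertically.
            tiles.update(mosaic(sub, x, y + offset * h, w, share * h, depth + 1))
        offset += share
    return tiles


# Hypothetical 2 x 2 table of counts: (group, outcome) -> count.
table = {("A", "yes"): 30, ("A", "no"): 10, ("B", "yes"): 20, ("B", "no"): 40}
tiles = mosaic(table)
areas = {cell: rect[2] * rect[3] for cell, rect in tiles.items()}
```

Because each split is proportional to the counts within the current rectangle, the area of every tile ends up equal to its share of the total, and an empty cell collapses to a zero-area tile.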

A zoo dataset
101 animal types
17 attributes (mostly binary)
Created by Forsyth
Source: mlearn.ics.uci.edu/databases/
Analysis goals:
What features best classify animals by type?
How are the features related?

Mosaicplot variants
Classical for efficient use of space
Fluctuation diagrams for cell sizes
Same binsize to identify zeros and compare rates
Multiple barcharts for comparisons
Doubledecker plots for rates

Mosaicplot interactions
Querying
Reordering variables
Reordering categories
Rotating variables
Rotating plots
Size and aspect ratio
Censored (and quantum) zooming

Titanic dataset
2201 passengers and crew
Class (First, Second, Third, Crew)
Age (Young, Old)
Gender (Male, Female)
Survived (Yes, No)
Source: Journal of Statistics Education (1995)

Titanic analysis goals
Which kinds of passenger survived?
Did "Women and children first" apply?
What was the effect of class?
What was the combined effect of gender and class?
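The questions about class and gender come down to comparing survival rates across groups, which is what a doubledecker plot displays. The sketch below computes those rates directly. The counts are not given on the slides: they are the class by gender totals of the widely distributed 2201-case version of the data (for example R's built-in `Titanic` table, summed over age) and should be checked against the source. The function name is hypothetical.

```python
from collections import defaultdict

# (class, gender) -> (survivors, total). Assumed counts, summed over age;
# verify against the source data before relying on them.
counts = {
    ("First", "Male"): (62, 180),
    ("First", "Female"): (141, 145),
    ("Second", "Male"): (25, 179),
    ("Second", "Female"): (93, 106),
    ("Third", "Male"): (88, 510),
    ("Third", "Female"): (90, 196),
    ("Crew", "Male"): (192, 862),
    ("Crew", "Female"): (20, 23),
}


def survival_rates(counts, key_index=None):
    """Survival rate per group: by class (0), by gender (1) or by both (None)."""
    agg = defaultdict(lambda: [0, 0])
    for key, (survived, total) in counts.items():
        group = key if key_index is None else key[key_index]
        agg[group][0] += survived
        agg[group][1] += total
    return {group: s / t for group, (s, t) in agg.items()}


by_class = survival_rates(counts, 0)
by_gender = survival_rates(counts, 1)
by_both = survival_rates(counts)
```

Comparing `by_class` and `by_gender` with `by_both` shows why the combined view matters: the marginal rates hide how strongly the gender effect differs between classes.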

Mosaicplots and Textiles
Mosaicplots show variable combinations
Mosaics are limited in number of variables
Mosaics have many, many display options to reveal information
Textile plots emphasise absolute numbers
Textile plots can handle many variables

Software
Mondrian (Martin Theus): interactive graphics, cross-platform, links to R via Rserve
iplots: available as an R package
www.stats.math.uni-augsburg.de

Conclusions
Multivariate displays have many options and making choices is difficult
Textile plots can provide excellent defaults
Interactive tools empower graphics when they are fast, flexible and efficient
No one display can show all information
Three eyes are better than one