WINE RECOGNITION ANALYSIS BY USING DATA MINING

9th International Research/Expert Conference "Trends in the Development of Machinery and Associated Technology", TMT 2005, Antalya, Turkey, 26-30 September, 2005

Kivanc Kilicer
T.C. Bahcesehir University
Bahcesehir, Istanbul
Turkey

Adem Karahoca
T.C. Bahcesehir University
Bahcesehir, Istanbul
Turkey

ABSTRACT

The aim of this study is to evaluate and understand the indicators of wine quality by using data mining methods. We used Weka, a Java-based program, to compare the effects of 13 constituents found in each of three types of wine. Our Wine recognition dataset contains the results of a chemical analysis of 178 wines grown in the same region in Italy but derived from three different cultivars; the analysis yielded 13 measurements per wine. This dataset is often used to test and compare the performance of various classification algorithms.

Keywords: Data Mining, Wine, Classification, Clustering, Discretization, Bayes, K-means

1. INTRODUCTION

The Wine recognition dataset contains the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis of 178 wines yielded 13 measurements for each wine. This dataset is often used to test and compare the performance of various classification algorithms. The analysis determined the quantities of 13 constituents found in each of the three types of wine [1, 5]. These are:

1) Alcohol
2) Malic acid
3) Ash
4) Alcalinity of ash
5) Magnesium
6) Total phenols
7) Flavanoids
8) Nonflavanoid phenols
9) Proanthocyanins
10) Color intensity
11) Hue
12) OD280/OD315 of diluted wines
13) Proline

By using data mining methods, our main goal is to evaluate these figures and to understand the indicators of wine classification.

2. MATERIAL AND METHOD

To evaluate the dataset, we use the pre-processing, classification and clustering methods of data mining. Association rule mining is not applied because all attribute values are numeric.

In pre-processing, our main aim is to summarize the data as clearly as possible. With the help of discretization and range removal, we visualize the constituent levels found in the three types of wine.

In classification, our main aim is to build decision trees from the data and to make predictions that are as close to reality as possible. We use the J48 classifier to build the trees. To examine entropy and gain values, we use a classifier based on Bayes' theorem. To reach the best results we also use the SMO and Multilayer Perceptron classifiers [2]. During classification, we also predict the class of an unlabelled instance with the help of the Weka software.

In clustering, our main aim is to observe the effects of the variables in one chart. Cobweb and K-means are helpful methods for obtaining the cluster means and for drawing conclusions about the constituents.

3. RESULTS

We used 178 wines, of which 59 belong to Class 1, 71 belong to Class 2 and 48 belong to Class 3. In the charts, Class 1 is shown in blue, Class 2 in red and Class 3 in cyan. The distributions of the 13 constituents across the three classes are shown in Figure 1.

Figure 1. Thirteen constituents in three different classes of wine quality

3.1. Preprocessing by Discretization

To get a clearer picture of the dataset, we discretized each constituent into 12 bins using equal-frequency binning:

Figure 2. Discretizing the constituents into twelve bins
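The equal-frequency discretization described in Section 3.1 can be illustrated with a short Python sketch. This is a simplified rank-based variant written for illustration, not a reproduction of the Weka filter used in the study (which may treat ties and cut points differently). The function name `equal_frequency_bins` and the synthetic `sample` values are assumptions made for this example; they are not measurements from the dataset.

```python
from collections import Counter


def equal_frequency_bins(values, n_bins):
    """Return a bin index (0..n_bins-1) for each value so that the bins
    hold near-equal numbers of values (rank-based, ties split arbitrarily)."""
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        # Map the rank proportionally onto the bin range.
        bins[i] = rank * n_bins // len(values)
    return bins


# Synthetic stand-in for one constituent measured on 178 wines
# (hypothetical values, NOT taken from the Wine recognition dataset).
sample = [11.0 + (i * 37 % 178) * 0.02 for i in range(178)]

bins = equal_frequency_bins(sample, 12)
counts = Counter(bins)
for b in sorted(counts):
    print(f"bin {b:2d}: {counts[b]} wines")
```

With 178 values and 12 bins, every bin receives 14 or 15 values, which is what distinguishes equal-frequency binning from equal-width binning, where bin counts follow the shape of the distribution.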

3.2. Classification by Bayes' Theorem

Using the NaiveBayes classification method, we obtained prior probabilities P(h) of 33% for Class 1, 40% for Class 2 and 27% for Class 3. The Bayes classification method gave a high percentage of correct predictions. Posterior probabilities are also shown below. The entropy value is close to zero, which means that our predictions carry almost no surprise [3].

Figure 3. The prediction of Bayes' theorem can be accepted with 98.3 per cent

We also used other classification methods, namely Multilayer Perceptron [4] and SMO, and obtained good results with them as well. As an example, we made a prediction with SMO by adding an instance whose class is marked "?" to the dataset and choosing the output prediction option in the options menu. The last line of the dataset is:

13.42,4.65,2.55,20,93,3,.9,.47,1.32,4.3,.94,2.35,580,?

The output below shows that Weka predicts class 3 with 95% certainty. This means that a wine with these constituent values is expected to belong to class 3.
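The prior probabilities reported in Section 3.2 follow directly from the class counts given in Section 3 (59, 71 and 48 wines out of 178). The Python sketch below recomputes them and also illustrates why a confident prediction corresponds to a low entropy. The `confident_posterior` vector is a hypothetical example chosen for illustration; it is not output from Weka or from the study.

```python
import math

# Class counts as reported in Section 3 of the paper.
class_counts = {"Class 1": 59, "Class 2": 71, "Class 3": 48}
total = sum(class_counts.values())

# Prior probability of each class: its share of the 178 wines.
priors = {name: n / total for name, n in class_counts.items()}


def entropy(probabilities):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


prior_entropy = entropy(priors.values())

# Hypothetical posterior for a single confident prediction (illustration only).
confident_posterior = [0.02, 0.03, 0.95]
posterior_entropy = entropy(confident_posterior)

for name, p in priors.items():
    print(f"P({name}) = {p:.1%}")
print(f"entropy of priors:    {prior_entropy:.3f} bits")
print(f"entropy of posterior: {posterior_entropy:.3f} bits")
```

The priors round to 33%, 40% and 27%, matching the values in the text. The entropy of the priors is close to its maximum for three classes, while the entropy of a concentrated posterior is much lower, which is the sense in which a near-zero entropy means few surprises.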

Figure 4. Output prediction of SMO: the wine is assigned to class 3 when the constituent levels are 13.42, 4.65, 2.55, 20, 93, 3, 0.9, 0.47, 1.32, 4.3, 0.94, 2.35 and 580, respectively

3.3. Decision Trees and Clustering

The decision tree built with the Weka J48 classifier showed that there is an important relationship between flavanoids and color intensity. Color intensity is a good indicator of wine type. In addition, the amount of proline can affect the class when the flavanoid level is above 1.57. Using the Cobweb and K-means clustering methods [6], we saw the same relationship between flavanoids and color intensity. Overall, flavanoids, proline and color intensity are the variables that most affect the class of a wine.
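How a decision tree arrives at a threshold such as "flavanoids above 1.57" can be sketched by computing the information gain of a candidate split. The Python example below is a minimal illustration under stated assumptions: the eight `flavanoids` values and their `labels` are a toy dataset invented for this example, and the sketch uses plain information gain, whereas J48 (Weka's C4.5 implementation) ranks splits by gain ratio and adds pruning.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting on `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing gains nothing
    weighted = (len(left) * entropy(left)
                + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted


# Toy data (hypothetical, NOT from the Wine recognition dataset).
flavanoids = [0.6, 0.9, 1.2, 1.4, 2.1, 2.5, 2.9, 3.2]
labels = [3, 3, 3, 3, 1, 2, 1, 2]

gain = information_gain(flavanoids, labels, threshold=1.57)
print(f"information gain at 1.57: {gain:.2f} bits")
```

In this toy set the split isolates all class 3 wines on the low side, so a second attribute is still needed to separate classes 1 and 2 on the high side. That mirrors the structure described in the text, where proline matters once the flavanoid level exceeds the threshold.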

Figure 5. Clustering

4. CONCLUSIONS AND DISCUSSION

Our classification, clustering and decision tree methods show that the most important factors in defining the wine quality are the color intensity, flavanoids, alcohol and proline attributes. In addition, the three classes are clustered in different areas, which means that the classification of the wine types is well made. The content of the wines is more than 95% separable into the three classes. With this reliable analysis, it is now easier to identify the type of an unknown wine, because our output prediction worked very well in the experiment described above. A future analysis may help us divide the regions of Italy and other countries into different clusters according to the types of cultivation in their grape fields.

5. REFERENCES

[1] Kment, Petr and Mihalijevic, Martin, Differentiation of Czech Wines Using Multielement Composition - A Comparison With Vineyard Soil, Faculty of Science, Institute of Geochemistry, Mineralogy and Mineral Resources, Charles University, Czech Republic
[2] S. Aeberhard, D. Coomans and O. de Vel, Comparison of Classifiers in High Dimensional Settings, Tech. Rep. no. 92-02, (1992), Dept. of Computer Science and Dept. of Mathematics and Statistics, James Cook University of North Queensland
[3] S. Aeberhard, D. Coomans and O. de Vel, The Classification Performance of RDA, Tech. Rep. no. 92-01, (1992), Dept. of Computer Science and Dept. of Mathematics and Statistics, James Cook University of North Queensland
[4] Forina, M. et al, PARVUS - An Extendible Package for Data Exploration, Classification and Correlation, Institute of Pharmaceutical and Food Analysis and Technologies, Via Brigata Salerno, 16147 Genoa, Italy
[5] Ying, Guang-Guo and Williams, Bryan, Dissipation of Herbicides in Soil and Grapes in a South Australian Vineyard (1999), Department of Environmental Science and Management, University of Adelaide, Australia
[6] J. Leonard and P. Andrieux, Infiltration Characteristics of Soils in Mediterranean Vineyards in Southern France (1998), INRA, UFR Science du Sol, 2 Place Viala, 34060, Montpellier cedex 2, France