

DOI 10.16353/j.cnki.1000-7490.2015.03.027

R, Rapid Miner, Mahout

255049 *

Abstract: Drawing on the features of the big data era, this paper analyzes the main challenges that massive data pose to the analysis tools of data science and introduces the big data analysis tools that have emerged in response. It then compares three popular big data analysis tools in data science, the R language, Rapid Miner, and Mahout, and finds that R and Rapid Miner offer more complete functionality, while Mahout has stronger big data analysis capability. Finally, the paper points out the development trends of data science analysis tools.

Keywords: data science; R language; big data; Rapid Miner; Apache Mahout

Nature Science 2008 Nature Big Data 1 1 2011 Science Dealing with Data 2 2012 3 3 1 6 Horizon 2020 4 2014 2 5 7 1 6 * ZR2011GL025 21 134 38 2015 3

ITA 1 V. Dhar 8 J. Leek 9 5 Cyberspace 10 14 3 2 J. Gray Google 11 NoSQL Google BigTable 15 3 VMware Redis 16 Microsoft Azure Tables 17 2.1 2.2 N log N N N N 12 PB 13 2.3 Google MapReduce 18 YouTube Yahoo Hadoop 19 HDFS MapReduce Hadoop Hadoop HPCC R Storm Apache Drill Rapid Miner Mahout

R Hadoop 3.1 2 Rapid Miner 21 Yale Mahout Java R Rapid Miner Mahout Rapid Miner 6 1 R 20 GNU R CRAN R R R Hadoop Hadoop API GUI GUI 3 Apache Mahout 22 2008 Mahout Hadoop 1 R Rapid Miner Mahout Linux Windows Mac OSX UNIX Linux FreeBSD MacOS Windows Linux Mahout R Naive Bayes K-Means EM Neural Network MapReduce SVM Apriori KNN Mahout Excel Arff Mahout SPSS Dbase CSV SequenceFile Txtfile PDF ASCII XML HTML SequenceFile NoSQL Rapid Miner 6 5 Hadoop MapReduce R Rapid Miner Mahout 1D 2D 3D pdf jpg png R Hadoop Hadoop 23 Hadoop MapReduce PB Radoop 24 Mahout Hadoop Rapid Miner Mahout Hadoop TB GB Radoop RapidMiner Apache Hadoop R Hadoop Hadoop MapReduce Hadoop R R Mahout HDFS Hive Mahout R MapReduce Java MapReduce

Mahout Map-Reduce Naive Bayes Naive Bayes Complementary Naive Bayes Naive Bayes 3 3.2 Hadoop 3 Mahout 1 1 3 5 1 R R R Rapid Miner Rapid Miner 6 Hadoop R Mahout 3D Mahout Hadoop Hadoop Mahout Mahout MapReduce Mahout

[1] Big Data-Nature [EB/OL]. [2014-04-10]. http://www.nature.com/.
[2] Dealing with Data-Science [EB/OL]. [2014-04-10]. http://www.sciencemag.
[3] [DB/OL]. [2014-04-10]. http://www.most.gov.cn/.
[4] Horizon 2020 [EB/OL]. [2014-04-10]. http://eu.mofcom.gov.cn/.
[5] 2014 [DB/OL]. [2014-04-10]. http://dc2014.codata.cn/.
[6] Data Science at NYU [EB/OL]. [2014-04-10]. http://datascience.nyu.edu/.
[7] Wikipedia. Data science [EB/OL]. [2014-04-10]. http://en.wikipedia.org/wiki/Data_science.
[8] DHAR V. Data science and prediction [EB/OL]. [2014-04-10]. http://cacm.acm.org/magazines/2013/12/169933-data-science-and-prediction/fulltext.
[9] LEEK J. The key word in Data Science is not data, it is science [EB/OL]. [2014-04-20]. http://simplystatistics.org/2013/12/12/the-key-word-in-data-science-is-not-data-it-is-science/.

[10] [EB/OL]. [2014-04-20]. http://www.dataology.fudan.edu.cn.
[11] GRAY J. Jim Gray on eScience: a transformed scientific method [R]. The Fourth Paradigm: Data-Intensive Scientific Discovery, 2009.
[12] HEY T. [M]. 2012.
[13] [J]. 2013, 50(1): 146-169.
[14] WONG P C, SHEN H-W, JOHNSON C R, et al. The top 10 challenges in extreme-scale visual analytics [J]. Computer Graphics and Applications, 2012, 32(4): 63-67.
[15] CHANG F, DEAN J, GHEMAWAT S, et al. Bigtable: a distributed storage system for structured data [J]. ACM Transactions on Computer Systems (TOCS), 2008, 26(2): 4.
[16] Redis [EB/OL]. [2014-05-10]. http://redis.io/.
[17] Azure Tables [EB/OL]. [2014-05-10]. http://azure.microsoft.com/.
[18] DEAN J, GHEMAWAT S. MapReduce: simplified data processing on large clusters [J]. Communications of the ACM, 2008, 51(1): 107-113.
[19] Hadoop [EB/OL]. [2014-05-20]. http://hadoop.apache.
[20] R [EB/OL]. [2014-05-20]. http://www.r-project.
[21] Rapid-I [EB/OL]. [2014-04-20]. http://rapid-i.com/content/view/181/196/.
[22] Mahout [EB/OL]. [2014-04-21]. https://mahout.apache.
[23] R [J]. 2013, 23: 19-20.
[24] PREKOPCSÁK Z, MAKRAI G, HENK T, et al. Radoop: analyzing big data with RapidMiner and Hadoop [C]//Proceedings of the 2nd RapidMiner Community Meeting and Conference (RCOMM 2011), 2011: 1-12.

1990 1961 1979 1988 2014-09-15