Using the Forest to see the Trees: A computational model relating features, objects and scenes

Using the Forest to see the Trees: A computational model relating features, objects and scenes. Antonio Torralba, CSAIL-MIT. Joint work with Aude Oliva, Kevin Murphy, William Freeman, Monica Castelhano, John Henderson.

From objects to scenes. [Diagram, bottom to top: image I; local features L; object localization nodes O_1 and O_2; scene type S ∈ {street, office, ...}.] Riesenhuber & Poggio (99); Vidal-Naquet & Ullman (03); Serre & Poggio (05); Agarwal & Roth (02); Moghaddam & Pentland (97); Turk & Pentland (91); Vidal-Naquet & Ullman (03); Heisele et al. (01); Agarwal & Roth (02); Kremp, Geman & Amit (02); Dorko & Schmid (03); Fergus, Perona & Zisserman (03); Fei-Fei, Fergus & Perona (03); Schneiderman & Kanade (00); Lowe (99).

From scenes to objects. [Diagram, bottom to top: image I; local features L and global gist features G; object localization nodes O_1 and O_2; scene type S ∈ {street, office, ...}.]

The context challenge. What do you think are the hidden objects (1 and 2)? Biederman et al., 82; Bar & Ullman, 93; Palmer, 75.

The context challenge. What do you think are the hidden objects? Chance ~ 1/30000. Answering this question does not require knowing what the objects look like. It is all about context.

From scenes to objects. [Diagram, bottom to top: image I; local features L and global gist features G; scene type S ∈ {street, office, ...}.]

Scene categorization: Office, Corridor, Street. Oliva & Torralba, IJCV 01; Torralba, Murphy, Freeman, Mark, CVPR 03.

Place identification: Office 610, Office 615, Draper street, 59 other places. Scenes are categories, places are instances.

Supervised learning. Training pairs {V_g, Office}, {V_g, Office}, {V_g, Corridor}, {V_g, Street} -> Classifier. Which feature vector for a whole image?

Global features (gist). First, we propose a set of features that do not encode specific object information. V = {energy at each orientation and scale} = 6 x 4 dimensions -> PCA -> 80 features (v_t, the gist G). Oliva & Torralba, IJCV 01; Torralba, Murphy, Freeman, Mark, CVPR 03.
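The feature pipeline on this slide (oriented energy at 6 orientations and 4 scales, followed by PCA) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the Gaussian band-pass filters, their centre frequencies and bandwidths, and the function names `oriented_energy` and `pca_project` are all assumptions. The slide does not say how 6 x 4 energies yield 80 PCA features, so the sketch keeps only as many components as the 24-dimensional vector allows.

```python
import numpy as np

def oriented_energy(image, n_orientations=6, n_scales=4):
    """Mean energy of the image in each (scale, orientation) band.

    Uses Gaussian band-pass filters in the Fourier domain; the centre
    frequencies and bandwidths are illustrative assumptions.
    """
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]      # vertical frequency (cycles/pixel)
    fx = np.fft.fftfreq(w)[None, :]      # horizontal frequency (cycles/pixel)
    radius = np.hypot(fx, fy)
    angle = np.arctan2(fy, fx)
    spectrum = np.fft.fft2(image - image.mean())

    sigma_theta = np.pi / (2 * n_orientations)
    features = []
    for s in range(n_scales):
        f0 = 0.25 / 2 ** s               # centre frequency halves per scale
        radial = np.exp(-((radius - f0) ** 2) / (2 * (0.5 * f0) ** 2))
        for o in range(n_orientations):
            theta = np.pi * o / n_orientations
            # Orientation is periodic with period pi: wrap to [-pi/2, pi/2).
            d = (angle - theta + np.pi / 2) % np.pi - np.pi / 2
            angular = np.exp(-(d ** 2) / (2 * sigma_theta ** 2))
            response = np.fft.ifft2(spectrum * radial * angular)
            features.append(np.mean(np.abs(response) ** 2))
    return np.array(features)

def pca_project(V, n_components):
    """Project the rows of V onto their first principal components."""
    mean = V.mean(axis=0)
    _, _, vt = np.linalg.svd(V - mean, full_matrices=False)
    basis = vt[:n_components]
    return (V - mean) @ basis.T, mean, basis
```

Calling `oriented_energy` on each training image and stacking the results gives the matrix `V` that `pca_project` reduces to a low-dimensional gist vector per image.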

Example visual gists. Images I and I': global features(I) ~ global features(I'). Cf. Pyramid Based Texture Analysis/Synthesis, Heeger and Bergen, Siggraph, 1995.

Learning to recognize places. We use annotated sequences for training (Office 610, Corridor 6b, Corridor 6c, Office 617). Hidden states = location (63 values). Observations = v_t^G (80 dimensions). Transition matrix encodes topology of environment. Observation model is a mixture of Gaussians centered on prototypes (100 views per place).
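The hidden Markov model described on this slide (places as hidden states, gist vectors as observations, a mixture of Gaussians around stored prototype views) can be filtered online with the standard forward recursion. The sketch below is an illustration under assumptions: equal mixture weights, a shared spherical variance `sigma`, and the function names are not taken from the source.

```python
import numpy as np

def log_obs_likelihood(v, prototypes, sigma):
    """log p(v | place), up to a constant shared by all places.

    prototypes: (n_places, n_views, dim) stored gist vectors per place.
    Each place is an equal-weight mixture of spherical Gaussians
    centred on its prototypes (an assumption of this sketch).
    """
    sq_dist = np.sum((prototypes - v) ** 2, axis=-1)   # (n_places, n_views)
    logp = -sq_dist / (2 * sigma ** 2)
    m = logp.max(axis=1, keepdims=True)                # log-sum-exp trick
    return m[:, 0] + np.log(np.mean(np.exp(logp - m), axis=1))

def forward_filter(observations, prototypes, A, prior, sigma=1.0):
    """Return p(place_t | v_1..v_t) for every time step t.

    A[i, j] = p(place j at time t | place i at time t-1).
    """
    belief = prior.copy()
    beliefs = []
    for t, v in enumerate(observations):
        if t > 0:
            belief = A.T @ belief                      # predict step
        ll = log_obs_likelihood(v, prototypes, sigma)
        belief = belief * np.exp(ll - ll.max())        # update step
        belief /= belief.sum()
        beliefs.append(belief)
    return np.array(beliefs)
```

Zero entries in `A` would encode pairs of places that are not adjacent in the environment, which is how the transition matrix can capture topology.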

Wearable test-bed v1

Wearable test-bed v2

Place/scene recognition demo

From scenes to objects. [Diagram, bottom to top: image I; local features L and global gist features G; object localization nodes O_1 and O_2; scene type S ∈ {street, office, ...}.]

Global scene features predict object location. New image -> v_g -> image regions likely to contain the target.

Global scene features predict object location. Training set (cars): {V_g^1, X^1}, {V_g^2, X^2}, {V_g^3, X^3}, {V_g^4, X^4}. The goal of training is to learn the association between the location of the target and the global scene features.
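Learning an association between global features and target location from pairs like these can be illustrated with a simple regressor. The slide does not name the learning method, so the sketch below swaps in plain ridge regression purely as an example; `fit_location_regressor`, `predict_location`, and the `ridge` parameter are hypothetical names.

```python
import numpy as np

def fit_location_regressor(V, X, ridge=1e-3):
    """Fit a linear map from global features to target location.

    V: (n, d) global feature vectors, one per training image.
    X: (n, 2) annotated target locations (x, y).
    Returns weights of shape (d + 1, 2); the last row is the bias.
    """
    Vb = np.hstack([V, np.ones((len(V), 1))])          # append bias column
    gram = Vb.T @ Vb + ridge * np.eye(Vb.shape[1])     # regularised normal equations
    return np.linalg.solve(gram, Vb.T @ X)

def predict_location(W, v):
    """Predict the (x, y) location of the target for one feature vector."""
    return np.append(v, 1.0) @ W
```

A point estimate like this only gives the expected location; predicting a whole region likely to contain the target would also need an estimate of the spread around it.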

Global scene features predict object location (v_g -> X). Results for predicting the vertical and horizontal location of people. [Plots: estimated Y vs. true Y; estimated X vs. true X.]

The layered structure of scenes: p(x), p(x_2 | x_1). In a display with multiple targets present, the location of one target constrains the y coordinate of the remaining targets, but not the x coordinate.

Global scene features predict object location (v_g -> X). Stronger contextual constraints can be obtained using other objects.

Attentional guidance. [Diagram: local features -> saliency; object model; global features -> scene prior; TASK.] Saliency models: Koch & Ullman, 85; Wolfe, 94; Itti, Koch, Niebur, 98; Rosenholtz, 99. Torralba, 2003; Oliva, Torralba, Castelhano, Henderson, ICIP 2003.
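One way to picture how a saliency map and a scene prior might be combined into a region of interest is sketched below. The multiplicative combination, the exponent `gamma`, the horizontal-band form of the prior, and all function names are assumptions made for illustration; the slides do not give the formula.

```python
import numpy as np

def contextual_prior(shape, expected_row, spread):
    """Scene prior as a horizontal band: Gaussian in row, uniform in column."""
    rows = np.arange(shape[0])[:, None]
    band = np.exp(-((rows - expected_row) ** 2) / (2 * spread ** 2))
    return np.tile(band, (1, shape[1]))

def combine(saliency, prior, gamma=1.0):
    """Multiply saliency (raised to gamma) by the prior and normalise."""
    priority = (saliency ** gamma) * prior
    return priority / priority.sum()

def top_region(priority, fraction):
    """Boolean mask of the `fraction` of pixels with the highest priority."""
    k = max(1, int(round(fraction * priority.size)))
    threshold = np.partition(priority.ravel(), -k)[-k]   # k-th largest value
    return priority >= threshold
```

With a flat saliency map the selected region reduces to the band predicted by the scene prior; with a flat prior it reduces to the most salient pixels. The fraction of the image kept is a free parameter.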

Comparison regions of interest. [Figure: regions labeled 30%, 20% and 10%, selected by saliency predictions alone and by saliency combined with global scene priors. Dots correspond to fixations 1-4.] Torralba, 2003; Oliva, Torralba, Castelhano, Henderson, ICIP 2003.

Results. [Plots: % of fixations inside the region (axis 50 to 100) vs. fixation number (1-4), one panel for scenes without people and one for scenes with people; curves for the saliency region and the contextual region. Chance level: 33%.]
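The measure plotted here, the percentage of fixations falling inside a predicted region for each fixation number, could be computed along the following lines. The array layout and the function names are assumptions, and `chance_level` assumes that chance equals the fraction of image area the region covers, which the slide does not state.

```python
import numpy as np

def percent_inside(masks, fixations):
    """Percentage of fixations inside the region, per fixation number.

    masks: (n_scenes, H, W) boolean region for each scene.
    fixations: (n_scenes, n_fixations, 2) integer (row, col) positions.
    Returns an array of length n_fixations.
    """
    masks = np.asarray(masks, dtype=bool)
    fixations = np.asarray(fixations)
    scene = np.arange(len(masks))[:, None]
    inside = masks[scene, fixations[..., 0], fixations[..., 1]]
    return 100.0 * inside.mean(axis=0)

def chance_level(masks):
    """Percentage of uniformly random fixations expected inside the regions."""
    return 100.0 * np.asarray(masks, dtype=bool).mean()
```

Comparing `percent_inside` for saliency-only regions and for context-based regions of the same size against `chance_level` gives the kind of comparison the plots summarise.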

Task modulation. [Diagram: local features -> saliency; global features -> scene prior; TASK.] Torralba, 2003; Oliva, Torralba, Castelhano, Henderson, ICIP 2003.

Task modulation. [Figure: saliency predictions vs. saliency and global scene priors, for mug search and painting search.]

Discussion. From the computational perspective, scene context can be derived from global image properties and used to predict where objects are most likely to be. Scene context considerably improves predictions of fixation locations. A complete model of attention guidance in natural scenes requires both saliency and contextual pathways.