Cooking the Books. The Semantic Structure of Recipes. David Tresner-Kirsch. Masters Thesis Computational Linguistics

Cooking the Books The Semantic Structure of Recipes David Tresner-Kirsch dwkirsch@gmail.com Masters Thesis Computational Linguistics Brandeis University, Spring 2010

Abstract

A data-driven semantic analysis of the structure and meaning of cooking recipes, and an implementation of software that automatically derives this structure from raw recipe text.

I. Task and Domain

The goal of this project is a software package capable of generating a structured representation of the actions described by the text of recipes. It is developed and tested using a recipe database culled from the website allrecipes.com for previous work [Shwayder and Tresner-Kirsch].

The Toll-House Chocolate Chip Cookie Recipe is used as the source of examples in this paper. The recipe is reproduced in full below.

Ingredients:
* 2 1/4 cups all-purpose flour
* 1 teaspoon baking soda
* 1 teaspoon salt
* 1 cup (2 sticks) butter, softened
* 3/4 cup granulated sugar
* 3/4 cup packed brown sugar
* 1 teaspoon vanilla extract
* 2 large eggs
* 2 cups (12-oz. pkg.) NESTLÉ TOLL HOUSE Semi-Sweet Chocolate Morsels
* 1 cup chopped nuts

Instructions:
Combine flour, baking soda and salt in small bowl. Beat butter, granulated sugar, brown sugar and vanilla extract in large mixer bowl until creamy. Add eggs, one at a time, beating well after each addition. Gradually beat in flour mixture. Stir in morsels and nuts. Drop by rounded tablespoon onto ungreased baking sheets. Bake for 9 to 11 minutes or until golden brown. Cool on baking sheets for 2 minutes; remove to wire racks to cool completely.

II. Linguistic Observations

The solution to this task was informed by observation of the text of 100 recipes, including detailed hand-annotation of 40 of those recipes. Observations of relevant patterns and phenomena at the document, clause, and word levels are summarized here.

Recipes are strictly composed of two distinct sections: an ingredient list and a set of instructions. Language is used quite differently in the two sections, and it is necessary to approach their analysis with accordingly differentiated techniques.

1. Ingredient List

Fragments, generally consisting of a quantity, a unit of measure, a collection of nouns and adjectives naming an ingredient, and a collection of past participles describing preparatory actions on the ingredient (e.g. 1 cup chopped nuts). The quantity and unit are sometimes omitted, and the verbs are often omitted (e.g. salt and pepper).
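As a rough illustration of the ingredient-line pattern just described (quantity, unit, ingredient words, past participles), the following Ruby sketch splits a line into those four parts. It is not the tagger from Appendix A: the method name parse_ingredient_line and the short UNITS and PARTICIPLES word lists are assumptions made for this example, and it is written for a current Ruby rather than 1.8.

```ruby
# Hypothetical word lists, for illustration only; they are far smaller
# than the vocabularies the actual system uses.
UNITS = %w[cup cups teaspoon teaspoons tablespoon tablespoons pound pinch dash].freeze
PARTICIPLES = %w[chopped softened melted diced beaten].freeze

# Splits one ingredient-list line into quantity, unit, ingredient name,
# and preparatory past participles. Quantity and unit may be absent.
def parse_ingredient_line(line)
  # Drop a leading "*" bullet and any commas, then split on whitespace.
  words = line.sub(/^\s*\*\s*/, '').delete(',').split
  quantity = []
  # Leading tokens made only of digits and slashes form the quantity ("2 1/4").
  quantity << words.shift while words.first && words.first =~ %r{\A[\d/]+\z}
  unit = UNITS.include?(words.first.to_s.downcase) ? words.shift : nil
  # Known participles are preparation steps; everything else names the ingredient.
  preparation, name = words.partition { |w| PARTICIPLES.include?(w.downcase) }
  {
    :quantity    => quantity.empty? ? nil : quantity.join(' '),
    :unit        => unit,
    :name        => name.join(' '),
    :preparation => preparation
  }
end

p parse_ingredient_line("* 1 cup chopped nuts")
p parse_ingredient_line("* 2 1/4 cups all-purpose flour")
p parse_ingredient_line("salt and pepper")
```

A fixed participle list is the simplest choice here; it works only because, as noted below, recipe vocabulary for food activities is small.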

2. Instructions

Prose, in reasonably complete sentences, mostly in the imperative mood (e.g. Gradually beat in flour mixture). Food activity verbs (beat) are applied to ingredients or ingredient constructs.

Many of the contentful words used in recipes are drawn from very small vocabularies.

* Units of measurement: cups, Tbsp., pound, pinch, dash, etc.
* Food activities: mix, melt, whisk, etc. Recipe authors fairly strictly employ a smaller subset of possible words than might normally be acceptable in describing the intended actions. Some words that are close synonyms of frequently used food activity verbs appear extremely rarely in recipes (e.g. merge and unite could convey the same instruction as mix, but do not appear in the analyzed data).
* Ingredients: butter, all-purpose flour, vanilla extract, etc. These are drawn from a larger vocabulary than food activities or measurements. However, within the context of a single recipe, the instruction text is constrained to referring to only those ingredients specified in the ingredient list, where the simpler grammatical structure makes them easier to identify.

Ingredient coreference is generally lexical: it is fairly safe to assume that references to an ingredient in the instructions will match on at least one of the words describing that ingredient in the list, almost always without morphological change. There are some interesting exceptions to this; the most common is the case where a descriptive term such as dry ingredients is used to reference a collection of the listed ingredients (e.g. flour, baking soda, and salt).

Clause structures can be treated as being fairly simple for the purpose of this task. Because the relevant clauses are consistently imperative, there are no passive or unusual tense constructions to render difficult the identification of food activity verbs and their direct objects (ingredients).

In many clauses of recipe instructions, the food activity verbs take an implied indirect object, which is the most salient ingredient or combination of ingredients in the frame of reference. In the following example, the second clause instructs that the eggs should be whisked into the already combined flour, salt, and baking powder:

Mix together flour, salt, and baking powder in a bowl. Whisk in eggs.

The majority (70%) of clauses in the analyzed recipes have this property.

III. Implementation

Appendix A includes the text of a Ruby program that implements recipe information extraction. It runs in Ruby 1.8.x with no additional packages. It makes use of a fledgling annotation framework (also submitted), which will be made available as a gem package (once it is slightly more mature) at rubyforge.org/projects/annotated/.

Architecturally, this solution is composed of a tagger, a chunker, and a representation builder. The tagger and chunker are familiar abstractions for programmatic components that provide token-level and phrase-level annotations, respectively [Jurafsky and Martin]. The representation builder constructs semantic representations of each phrase, and combines these representations into a structured document-level semantic representation. A main program loads in vocabulary data and the recipe to be analyzed, and manages the flow of information between those three processing objects. These relationships are shown in the figure below (the grey arrows should be interpreted as the actions of the main program). The main program is also responsible for amending the tag information to identify those activity verbs which should take an implied salient object.
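The implied salient object can be detected with very shallow cues. The sketch below mirrors the heuristics the main program in Appendix A applies when it amends verb tags: the verb add, or an activity verb followed directly by in or into and then a word that does not start with "a". The method name needs_salient_object? is an assumption introduced for this example, and the clause is assumed to begin at its activity verb.

```ruby
# Returns true when a clause (starting at its activity verb) appears to act
# on the most salient food entity in addition to any ingredients it names.
def needs_salient_object?(clause_from_verb)
  verb = clause_from_verb[/\A[A-Za-z]+/]
  return false if verb.nil?
  return true if verb =~ /\Aadd\z/i
  # "<verb> in X" or "<verb> into X". The [^a] guard is copied from
  # Appendix A; presumably it skips phrases such as "in a bowl".
  !!(clause_from_verb =~ /\A[A-Za-z]+ (in|into) [^a]/)
end

[
  "Combine flour, baking soda and salt in small bowl.",
  "Add eggs, one at a time,",
  "beat in flour mixture.",
  "Stir in morsels and nuts.",
  "Bake for 9 to 11 minutes or until golden brown."
].each do |clause|
  puts "#{needs_salient_object?(clause) ? 'v-sal' : 'v'}: #{clause}"
end
```

The v and v-sal labels follow the tag names used in the Appendix B output.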

Each of these objects performs a set of tasks:

Tagger (tagger.rb): Annotates recipe text at word level, identifying and categorizing contentful words.
* Within each line of the ingredient list, the tagger:
  * Finds activity words, units, quantities, and functional words (e.g. and).
  * Tags all remaining words as ingredient words, flagged with a unique identifier for the line.
* In the instructions, the tagger:
  * Tags activity words.
  * Tags words that match ingredient words from the list with the matches' identifiers.

Chunker (chunker.rb): Annotates instructions at clause level, marking boundaries within which the representation builder should work.
* Places boundaries after periods and semicolons.
* Places boundaries before food activity verbs.

Representation Builder (representer.rb): Works through the clauses of the instructions sequentially, building a structured representation of the relationships between actions and ingredients.
* Initializes an array of available food entities by including all ingredients.
* In each clause, identifies the food entities (either basic ingredients or artifacts) to which the activity is applied.
* Shifts those entities out of the food entities array, merges them into a new artifact entity marked with the activity, and adds this new artifact back into the array.
* By the end of the recipe, there should be only one remaining available food entity: the completed dish.

The builder can display the object structure as either text or HTML.

The solution has been implemented to make use of an existing database of recipes and its corresponding interface. It also makes use of the Brandeis Semantic Ontology (BSO) database via a Ruby API. In the interest of submitting a solution that is runnable without external data sources, the code that accompanies this paper is a modified version in which links to other frameworks and data sources have been stripped out; instead, it hard-codes the information it would have extracted from the BSO and includes an example recipe to be processed.

IV. Example Execution

To run a demo of the solution on the Toll-House Chocolate Chip Cookie Recipe, use the console command 'ruby main.rb' from a directory with the solution files. The console output will show the state of the recipe content as it progresses through lexical tagging, clause chunking, salience tagging, and then each step as the representation is built up clause by clause. That output is also included in Appendix B.

V. Future Work

Although this solution performs successfully on some recipes, and has fairly high accuracy on a clause-by-clause basis, there are some situations it does not yet handle correctly and some phenomena which are not yet modeled at all.

As mentioned in Section II, there are some cases in which ingredients, artifacts, or ingredient sets are referred to without a direct lexical match. In some cases, a phrase like dry ingredients is used to stand in for a number of ingredients from the recipe's list; those ingredients never get mentioned directly in the instructions. In other cases, a food artifact may suddenly be referred to by a new name (e.g. dough). Most of these situations are not handled correctly.

The structured representation still lacks an API that could accept and answer queries like:
* What activities get applied to eggs in this recipe?
* Do the peppers get fried in this recipe?
* Does the flour get heated by any action in this recipe?

Though the features used for determining whether clauses have an implied salient object cover the majority of cases, there are some additional features that could be included to increase accuracy.

These issues will all be addressed by future work.
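To make the query API proposed above more concrete, here is a small self-contained sketch. It rebuilds a simplified version of the nested artifact structure described in Section III and answers "what activities get applied to X?" by walking that structure. The Food and Step structs and the merge and activities_applied_to methods are hypothetical names chosen for this illustration; they are not part of the submitted code, and the sketch is written for a current Ruby rather than 1.8.

```ruby
# Hypothetical, simplified stand-ins for the ingredient and artifact entities.
Food = Struct.new(:name)
Step = Struct.new(:action, :items)

# Removes the given entities from the pool, wraps them in a new artifact
# labelled with the activity, and appends that artifact to the pool.
def merge(pool, action, items)
  step = Step.new(action, items)
  pool.reject { |entity| items.any? { |i| i.equal?(entity) } } << step
end

# Lists the activities applied to the named ingredient, in the order they
# were applied (innermost artifact first).
def activities_applied_to(entity, name, path = [])
  case entity
  when Food
    entity.name == name ? path : []
  when Step
    entity.items.flat_map { |i| activities_applied_to(i, name, [entity.action] + path) }
  end
end

flour, salt, eggs = Food.new("flour"), Food.new("salt"), Food.new("eggs")
pool = [flour, salt, eggs]
pool = merge(pool, "mix",   [flour, salt])
pool = merge(pool, "whisk", [eggs, pool.last]) # eggs plus the implied salient object
pool = merge(pool, "bake",  [pool.last])

dish = pool.first
puts activities_applied_to(dish, "eggs").inspect
puts activities_applied_to(dish, "flour").include?("bake")
```

Answering "does the flour get heated?" would additionally need to know which activities involve heat, which this sketch does not model.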

Appendix A: Code

class Tagger
  attr_accessor :annotated_ingredients, :annotated_instructions

  def initialize(ingredient_text, instructions_text)
    load 'food_activity_definitions.rb'
    load 'annotated_string.rb'
    load 'stop_words.rb'
    load 'unit_words.rb'
    @annotated_ingredients = Annotated.new(ingredient_text)
    @annotated_instructions = Annotated.new(instructions_text)
    tag_activities_in(annotated_ingredients, FoodActivities.past_tense)
    tag_ingredient_words_in_ingredients
    tag_activities_in(annotated_instructions, FoodActivities.present_tense)
    tag_ingredient_words_in_instructions
    merge_multiword_ingredient_tags
    # make sure each word is only tagged as one thing
    annotated_instructions.purge_overlaps_greedily
  end

  def tag_activities_in(annotated_string, activities)
    activities.each do |act|
      # See mapscan in Annotated. This grants access to multiple matches with their indices.
      # If the matched text is also needed, get md[0] when getting the offset
      annotated_string.mapscan(/(^|[^A-Za-z])(#{act})($|[^A-Za-z])/i){ |md| md }.each do |md|
        st, fin = md.offset(2)
        annotated_string.tag(:start => st, :finish => fin, :tag => "v", :layer => :tags)
      end
    end
  end

  def tag_ingredient_words_in_ingredients
    pointer = 0
    annotated_ingredients.split("\n").each_with_index do |ing, index|
      ing.mapscan(/([A-Za-z\-]+)/i){ |md| md }.each do |md|
        st, fin = md.offset(1)
        word = md[1]
        unless UnitWords.all.include?(word) or StopWords.all.include?(word) or FoodActivities.past_tense.include?(word)
          annotated_ingredients.tag(:start => st+pointer, :finish => fin+pointer, :tag => "i#{index}")
        end
      end
      # advance past this line and the newline that followed it
      pointer += ing.length + 1
    end
  end

  def tag_ingredient_words_in_instructions
    annotated_ingredients.split("\n").each_with_index do |ing, index|
      ing.split(' ').each do |x|
        x.sub!(/[,.):;]/, '')
        next unless x =~ /[A-Za-z]/
        next if UnitWords.all.include?(x) or StopWords.all.include?(x)
        annotated_instructions.mapscan(/[^A-Za-z](#{x})[^A-Za-z]/i){ |md| md }.each do |md|
          st, fin = md.offset(1)
          annotated_instructions.tag(:start => st, :finish => fin, :tag => "i#{index}")
        end
        # "dry ingredients" is treated as a reference to the flour line
        if x.match(/flour/i)
          annotated_instructions.mapscan(/[^A-Za-z](dry ingredients)[^A-Za-z]/i){ |md|
            st, fin = md.offset(1)
            annotated_instructions.tag(:start => st, :finish => fin, :tag => "i#{index}")
          }
        end
      end
    end
  end

  def merge_multiword_ingredient_tags
    tmp_tags = annotated_instructions.layers[:tags].sort_by{ |t| t.start }
    tmp_tags.each do |tag1|
      tmp_tags -= [tag1]
      tmp_tags.each do |tag2|
        annotated_instructions.merge!(tag1, tag2) if annotated_instructions.consecutive?(tag1, tag2) && tag1.tag == tag2.tag && tag1.tag =~ /i\d/
      end
    end
  end

  def to_s
    tmp = []
    tmp << annotated_ingredients.with_layer(:tags)
    annotated_instructions.split("\n").collect{ |line| line.split "." }.flatten.collect{ |sent| sent.split ";" }.flatten.each do |x|
      tmp << x.with_layer(:tags) #if !x.tags.empty?
    end
    tmp
  end
end

class Chunker
  load 'annotated_string.rb'
  attr_accessor :all

  def initialize(text)
    @all = text
    @all.tag(:start => 0, :finish => @all.length, :tag => "EOF", :layer => :chunks)
  end

  def chunk_at(boundary)
    @all.mapscan(boundary){ |md| md }.each do |md|
      st, fin = md.offset(0)
      past_chunk = @all.layers[:chunks].select{ |c| c.start <= st && c.finish >= fin }.first
      unless (past_chunk.finish == fin)
        @all.tag(:start => past_chunk.start, :finish => fin, :tag => boundary.to_s, :layer => :chunks)
        @all.tag(:start => fin+1, :finish => past_chunk.finish, :tag => past_chunk.tag, :layer => :chunks)
        @all.layers[:chunks].delete(past_chunk)
      end
    end
  end

  def chunk_at_tag_matches(target)
    @all.layers[:chunks].each do |c|
      matches = @all.layers[:tags].select{ |t| t.tag.match(target) && t.start >= c.start && t.start <= c.finish }.sort_by{ |t| t.start }
      # the first verb in a chunk does not start a new chunk
      matches.shift
      matches.each do |t|
        chunk_at_tag(t)
      end
    end
  end

  def chunk_at_tag(tag)
    st, fin = tag.start, tag.finish
    past_chunk = @all.layers[:chunks].select{ |c| c.start <= st && c.finish >= fin }.first
    @all.tag(:start => past_chunk.start, :finish => (st - 1), :tag => tag.tag, :layer => :chunks)
    @all.tag(:start => st, :finish => past_chunk.finish, :tag => past_chunk.tag, :layer => :chunks)
    @all.layers[:chunks].delete(past_chunk)
  end

  def to_a
    @all.layer_with_content(:chunks)
  end

  def to_s
    tmp = []
    @all.layer_with_content(:chunks).each_with_index do |chunk, index|
      tmp << index.to_s + ": " + chunk.with_layer(:tags)
    end
    tmp.join("\n")
  end
end

class Representer
  attr_accessor :annotated_ingredients, :annotated_instructions, :ingredients, :food_stuffs

  def initialize(tagger, chunker)
    @annotated_ingredients = tagger.annotated_ingredients
    @annotated_instructions = chunker.to_a
    @ingredients = []
    @food_stuffs = []
    process_ingredients
    process_instructions(true)
  end

  def process_ingredients
    annotated_ingredients.split("\n").each do |line|
      next if line.match(/^\s*$/)
      ingredient_word_tags = line.match_tags_with_content(:tags, /i/)
      raise "unexpected non-matching ingredient identifiers in single line of ingredients: #{line}" if ingredient_word_tags.collect{ |t| t.tag }.uniq.length != 1
      @ingredients << Ingredient.new(:ident => ingredient_word_tags.first.tag,
                                     :text => ingredient_word_tags.sort_by{ |t| t.start }.collect{ |t| line.content_at(t) }.join(' '))
    end
    @food_stuffs = ingredients
  end

  def process_instructions(verbose = false)
    annotated_instructions.each do |c|
      puts c if verbose
      next if c.layers[:tags].empty?
      verb_tag = c.match_tags_with_content(:tags, /^v/).first
      next if verb_tag.nil?
      verb = c.content_at(verb_tag)
      food_tags = c.match_tags_with_content(:tags, /^i/)
      food_idents = food_tags.collect{ |t| t.tag }
      food_objects = []
      food_idents.each do |id|
        # look for a free ingredient first, then for an artifact that contains it
        foo = food_stuffs.select{ |f| f.is_a?(Ingredient) && id == f.ident }.first
        foo = food_stuffs.select{ |f| f.is_a?(Artifact) && f.flatten.collect{ |i| i.ident }.include?(id) }.first if foo.nil?
        food_objects << foo
      end
      # the most recently built artifact is the implied salient object
      food_objects << @food_stuffs.last if verb_tag.tag.match(/v-sal/)
      food_objects << @food_stuffs.last if food_objects.empty?
      incorporate(:items => food_objects.uniq, :action => verb)
      @food_stuffs.collect{ |f| puts f.to_s } if verbose
      puts "---------" if verbose
    end
  end

  def incorporate(options)
    @food_stuffs << Artifact.new(options)
    @food_stuffs -= options[:items]
  end

  def to_s
    @food_stuffs.collect{ |f| f.to_s }.join("\n")
  end

  def to_html
    @food_stuffs.collect{ |f| f.to_html }.join("\n<br/>\n")
  end
end

class Ingredient
  attr_accessor :ident, :text

  def initialize(options)
    @ident = options[:ident]
    @text = options[:text]
  end

  def flatten
    self
  end

  def to_s(indent = 0)
    #@text
    " "*indent + @text
  end

  def to_html
    to_s + "<br/>"
  end
end

class Artifact
  attr_accessor :items, :action

  def initialize(options)
    @items = options[:items]
    @action = options[:action]
  end

  def flatten
    @items.collect{ |i| i.flatten }.flatten
  end

  def to_s(indent = 0)
    #@action + " (" + @items.collect{ |i| i.to_s }.join(", ") + " ) "
    " "*indent + "( \n" + @items.collect{ |i| i.to_s(indent+1) }.join(",\n") + "\n" + " "*indent + ") " + @action
  end

  def to_html
    "<div style=\"max-width:85%;border:thin black solid;border-left:none;\"> <div style=\"float:right;\"> #{@action} </div> <br/> #{@items.collect{ |i| i.to_html }.join} </div>"
  end
end

module StopWords
  def self.all
    ["and", "large", "medium", "or", "of", "small"]
  end
end

module UnitWords
  def self.all
    ["cup", "tsp", "tablespoon", "tbsp", "pound", "teaspoon", "teaspoons", "tablespoons", "cups", "ounce", "ounces", "pint", "pints"] +
      ["dash", "pinch"] +
      ["slice", "slices"]
  end
end

module FoodActivities
  # food activities from BSO
  def self.all
    food_activities = ["a la carte", "arrosto", "bake", "bake-able", "bakeable", "baked", "baked hawaii", "baking",
      "barbecue", "barbecue", "barbecued", "barbecueing", "barbecuing", "barbeque", "barbeque", "barbequeing",
      "baste", "bbq", "blanch", "boil", "boil", "boiled", "boiling", "braise", "braised", "braising", "bread",
      "bread making", "broil", "broiling", "brown", "butcher", "butter", "cake mixing", "candy-making",
      "caramelize", "caramelized", "cater", "catering", "char", "charbroil", "charbroiling", "chop", "chopped",
      "cook", "cookie-baking", "cooking", "cooking", "country-fried", "cream", "creaming", "crisp", "crisp",
      "crisping", "crispy", "crumb", "cure", "curried", "deep-fried", "dish up", "filet", "flake", "flavor",
      "flavored", "flavour", "flavoured", "flour", "fricassee", "fried", "fritter", "fry", "frying", "grill",
      "grilled", "grilling", "hardboil", "hardboiling", "herb-roasted", "home-cooked", "husk", "juicing",
      "knead", "ladle", "leaven", "marinate", "mash", "microwave", "microwaving", "milling", "mince",
      "overcook", "overcooking", "parboil", "parboiling", "pepper", "peppery", "pickle", "poach", "poached",
      "poaching", "pop", "precook", "preheat", "raw", "roast", "roasted", "roasting", "salt", "saute", "scald",
      "scallop", "scramble", "sear", "season", "seasoned", "serve", "serve up", "serving", "shell", "simmer",
      "simmering", "slaughter", "smoke", "smoked", "softboil", "softboiling", "spark", "spice", "spice up",
      "spicey", "starch-thickened", "steam", "steamed", "steaming", "stew", "stewed", "stewing", "stir-fry",
      "stir-fry", "stuff", "stuffed", "sugar", "sweeten", "sweetening", "tenderize", "toast", "toasting",
      "truss", "whisk", "wintermint", "yeast-raised"]
    # get rid of some of these that aren't used as recipe instructions
    food_activities -= ["a la carte", "arrosto", "bake-able", "bakeable", "baked hawaii", "bread making",
      "butcher", "cake mixing", "candy-making", "cater", "catering", "cookie-baking", "home-cooked",
      "overcook", "overcooking", "peppery", "raw", "scallop", "slaughter", "spice up", "spicey",
      "starch-thickened", "wintermint", "yeast-raised"]
    # get rid of some troublesome ones (for now)
    food_activities -= ["baking", "brown", "pepper", "pickle", "salt", "spice", "butter", "flour", "sugar", "preheat"]
    # more food activities
    food_activities += ["add", "beat", "beaten", "beating", "blend", "chill", "combine", "cool", "cut", "dice",
      "diced", "fold", "freeze", "frozen", "melt", "melted", "mix", "process", "puree", "pureed",
      "refrigerate", "rise", "roll", "sift", "soften", "softened", "sprinkle", "stir", "stirring"]
    # How to deal with "cream" being both an activity and an ingredient???
    # Also, "salt" and "pepper", where the nouns get verbed?
  end

  def self.past_tense
    # some of these are past-tense
    past_tense_activities = ["baked", "barbecued", "beaten", "boiled", "chopped", "curried", "diced", "grilled",
      "melted", "pureed", "softened", "stewed", "stuffed"]
  end

  def self.present_tense
    self.all - self.past_tense
  end
end

module ExampleRecipeText
  def self.ingredient_text
    ingredient_text = "* 2 1/4 cups all-purpose flour
* 1 teaspoon baking soda
* 1 teaspoon salt
* 1 cup (2 sticks) butter, softened
* 3/4 cup granulated sugar
* 3/4 cup packed brown sugar
* 1 teaspoon vanilla extract
* 2 large eggs
* 2 cups (12-oz. pkg.) NESTLE TOLL HOUSE Semi-Sweet Chocolate Morsels
* 1 cup chopped nuts".gsub(/\([^)]*\)/, '')
  end

  def self.instructions_text
    instructions_text = "Combine flour, baking soda and salt in small bowl. Beat butter, granulated sugar, brown sugar and vanilla extract in large mixer bowl until creamy. Add eggs, one at a time, beating well after each addition. Gradually beat in flour mixture. Stir in morsels and nuts. Drop by rounded tablespoon onto ungreased baking sheets. Bake for 9 to 11 minutes or until golden brown. Cool on baking sheets for 2 minutes; remove to wire racks to cool completely. ".gsub(/\([^)]*\)/, '')
  end
end

load 'tagger.rb'
load 'chunker.rb'
load 'representer.rb'
load 'apple_butter_spice_cake.rb'

puts "RAW:"
puts ExampleRecipeText.ingredient_text, ExampleRecipeText.instructions_text
puts "###########################################"

tagger = Tagger.new(ExampleRecipeText.ingredient_text, ExampleRecipeText.instructions_text)
puts "TAGGED:"
puts tagger.to_s
puts "###########################################"
puts

chunker = Chunker.new(tagger.annotated_instructions)
chunker.chunk_at(".")
chunker.chunk_at(";")
chunker.chunk_at_tag_matches("v")
puts "CHUNKED: (" + chunker.to_a.length.to_s + " chunks)"
chunker.to_a.each_with_index do |chunk, index|
  puts "#{index}: " + chunk.with_layer(:tags)
end
puts "###########################################"
puts

puts "FLAG SALIENCE REQUIREMENTS:"
chunker.all.match_tags_with_content(:tags, /^v/).sort_by{ |t| t.start }.each do |t|
  t.tag = "v-sal" if chunker.all.content_at(t).match(/^add$/i)
  t.tag = "v-sal" if chunker.all.slice(t.start..chunker.all.length).match(/^[A-Za-z]+ in [^a]/)
  t.tag = "v-sal" if chunker.all.slice(t.start..chunker.all.length).match(/^[A-Za-z]+ into [^a]/)
  puts t.start.to_s + ": " + chunker.all.content_at(t) + "[" + t.tag + "]"
end
puts "###########################################"

puts "REPRESENTER:"
representer = Representer.new(tagger, chunker)
puts representer.to_s
#puts representer.to_html
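The listing relies on mapscan from the annotation framework, which is not reproduced in this paper. Going only by the comment in tag_activities_in (it grants access to multiple matches together with their indices), a minimal stand-in might look like the following. This is an assumption about the method's behaviour, written as a free-standing helper with the hypothetical name mapscan_sketch; it is not the framework's actual implementation.

```ruby
# Hypothetical stand-in for Annotated#mapscan: yields a MatchData object for
# every non-overlapping match of pattern in text and collects the results.
def mapscan_sketch(text, pattern)
  results = []
  position = 0
  while (md = pattern.match(text, position))
    results << (block_given? ? yield(md) : md)
    # Always move forward, even after an empty match, to avoid looping forever.
    position = md.end(0) > md.begin(0) ? md.end(0) : md.end(0) + 1
    break if position > text.length
  end
  results
end

text = "Beat butter, then beat in flour mixture."
mapscan_sketch(text, /(^|[^A-Za-z])(beat)($|[^A-Za-z])/i) { |md| md }.each do |md|
  # Group 2 is the verb itself; groups 1 and 3 are the surrounding delimiters.
  st, fin = md.offset(2)
  puts "#{md[2]} at #{st}...#{fin}"
end
```

Returning MatchData objects rather than strings, as String#scan would, is what lets the tagger recover start and finish offsets for each tag.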

Appendix B: Output of the demo program on the Toll-House Chocolate Chip Cookie Recipe.

$ ruby main.rb
RAW:
* 2 1/4 cups all-purpose flour
* 1 teaspoon baking soda
* 1 teaspoon salt
* 1 cup butter, softened
* 3/4 cup granulated sugar
* 3/4 cup packed brown sugar
* 1 teaspoon vanilla extract
* 2 large eggs
* 2 cups NESTLE TOLL HOUSE Semi-Sweet Chocolate Morsels
* 1 cup chopped nuts
Combine flour, baking soda and salt in small bowl. Beat butter, granulated sugar, brown sugar and vanilla extract in large mixer bowl until creamy. Add eggs, one at a time, beating well after each addition. Gradually beat in flour mixture. Stir in morsels and nuts. Drop by rounded tablespoon onto ungreased baking sheets. Bake for 9 to 11 minutes or until golden brown. Cool on baking sheets for 2 minutes; remove to wire racks to cool completely.
###########################################
TAGGED:
* 2 1/4 cups [all-purpose]{i0} [flour]{i0}
* 1 teaspoon [baking]{i1} [soda]{i1}
* 1 teaspoon [salt]{i2}
* 1 cup [butter]{i3}, [softened]{v}
* 3/4 cup [granulated]{i4} [sugar]{i4}
* 3/4 cup [packed]{i5} [brown]{i5} [sugar]{i5}
* 1 teaspoon [vanilla]{i6} [extract]{i6}
* 2 large [eggs]{i7}
* 2 cups [NESTLE]{i8} [TOLL]{i8} [HOUSE]{i8} [Semi-Sweet]{i8} [Chocolate]{i8} [Morsels]{i8}
* 1 cup [chopped]{v} [nuts]{i9}
[Combine]{v} [flour]{i0}, [baking soda]{i1} and [salt]{i2} in small bowl
[Beat]{v} [butter]{i3}, [granulated sugar]{i4}, [brown[ suga]{i4}r]{i5} and [vanilla extract]{i6} in large mixer bowl until creamy
[Add]{v} [eggs]{i7}, one at a time, [beating]{v} well after each addition
Gradually [beat]{v} in [flour]{i0} mixture
[Stir]{v} in [morsels]{i8} and [nuts]{i9}
Drop by rounded tablespoon onto ungreased [baking]{i1} sheets
[Bake]{v} for 9 to 11 minutes or until golden [brown]{i5}
[Cool]{v} on [baking]{i1} sheets for 2 minutes
remove to wire racks to [cool]{v} completely
###########################################

CHUNKED: (11 chunks)
0: [Combine]{v} [flour]{i0}, [baking soda]{i1} and [salt]{i2} in small bowl.
1: [Beat]{v} [butter]{i3}, [granulated sugar]{i4}, [brown[ suga]{i4}r]{i5} and [vanilla extract]{i6} in large mixer bowl until creamy.
2: [Add]{v} [eggs]{i7}, one at a time,
3: [beating]{v} well after each addition.
4: Gradually [beat]{v} in [flour]{i0} mixture.
5: [Stir]{v} in [morsels]{i8} and [nuts]{i9}.
6: Drop by rounded tablespoon onto ungreased [baking]{i1} sheets.
7: [Bake]{v} for 9 to 11 minutes or until golden [brown]{i5}.
8: [Cool]{v} on [baking]{i1} sheets for 2 minutes;
9: remove to wire racks to [cool]{v} completely.
10:
###########################################

FLAG SALIENCE REQUIREMENTS:
0: Combine[v]
51: Beat[v]
148: Add[v-sal]
173: beating[v]
217: beat[v-sal]
240: Stir[v-sal]
324: Bake[v]
372: Cool[v]
433: cool[v]
###########################################
REPRESENTER:
Combine flour, baking soda and salt in small bowl.
butter
granulated sugar
packed brown sugar
vanilla extract
eggs
NESTLE TOLL HOUSE Semi-Sweet Chocolate Morsels
nuts
(
 all-purpose flour,
 baking soda,
 salt
) Combine
---------
Beat butter, granulated sugar, brown sugar and vanilla extract in large mixer bowl until creamy.
eggs
NESTLE TOLL HOUSE Semi-Sweet Chocolate Morsels
nuts
(
 all-purpose flour,
 baking soda,
 salt
) Combine
(
 butter,
 granulated sugar,
 packed brown sugar,
 vanilla extract
) Beat
---------
Add eggs, one at a time,
NESTLE TOLL HOUSE Semi-Sweet Chocolate Morsels
nuts
(
 all-purpose flour,
 baking soda,
 salt
) Combine
(
 eggs,
 (
  butter,
  granulated sugar,
  packed brown sugar,
  vanilla extract
 ) Beat
) Add
---------
beating well after each addition.
NESTLE TOLL HOUSE Semi-Sweet Chocolate Morsels
nuts
(
 all-purpose flour,
 baking soda,
 salt
) Combine
(
 (
  eggs,
  (
   butter,
   granulated sugar,
   packed brown sugar,
   vanilla extract
  ) Beat
 ) Add
) beating
---------
Gradually beat in flour mixture.
NESTLE TOLL HOUSE Semi-Sweet Chocolate Morsels
nuts
(
 (
  all-purpose flour,
  baking soda,
  salt
 ) Combine,
 (
  (
   eggs,
   (
    butter,
    granulated sugar,
    packed brown sugar,
    vanilla extract
   ) Beat
  ) Add
 ) beating
) beat
---------
Stir in morsels and nuts.
(
 NESTLE TOLL HOUSE Semi-Sweet Chocolate Morsels,
 nuts,
 (
  (
   all-purpose flour,
   baking soda,
   salt
  ) Combine,
  (
   (
    eggs,
    (
     butter,
     granulated sugar,
     packed brown sugar,
     vanilla extract
    ) Beat
   ) Add
  ) beating
 ) beat
) Stir
---------
Drop by rounded tablespoon onto ungreased baking sheets.
Bake for 9 to 11 minutes or until golden brown.
(
 (
  NESTLE TOLL HOUSE Semi-Sweet Chocolate Morsels,
  nuts,
  (
   (
    all-purpose flour,
    baking soda,
    salt
   ) Combine,
   (
    (
     eggs,
     (
      butter,
      granulated sugar,
      packed brown sugar,
      vanilla extract
     ) Beat
    ) Add
   ) beating
  ) beat
 ) Stir
) Bake
---------
Cool on baking sheets for 2 minutes;
(
 (
  (
   NESTLE TOLL HOUSE Semi-Sweet Chocolate Morsels,
   nuts,
   (
    (
     all-purpose flour,
     baking soda,
     salt
    ) Combine,
    (
     (
      eggs,
      (
       butter,
       granulated sugar,
       packed brown sugar,
       vanilla extract
      ) Beat
     ) Add
    ) beating
   ) beat
  ) Stir
 ) Bake
) Cool
---------
remove to wire racks to cool completely.
(
 (
  (
   (
    NESTLE TOLL HOUSE Semi-Sweet Chocolate Morsels,
    nuts,
    (
     (
      all-purpose flour,
      baking soda,
      salt
     ) Combine,
     (
      (
       eggs,
       (
        butter,
        granulated sugar,
        packed brown sugar,
        vanilla extract
       ) Beat
      ) Add
     ) beating
    ) beat
   ) Stir
  ) Bake
 ) Cool
) cool
---------

(
 (
  (
   (
    NESTLE TOLL HOUSE Semi-Sweet Chocolate Morsels,
    nuts,
    (
     (
      all-purpose flour,
      baking soda,
      salt
     ) Combine,
     (
      (
       eggs,
       (
        butter,
        granulated sugar,
        packed brown sugar,
        vanilla extract
       ) Beat
      ) Add
     ) beating
    ) beat
   ) Stir
  ) Bake
 ) Cool
) cool