Friday, March 12, 2010

Interpreting Recipe Input

As some of you know, I've been working on recipe software off and on for a few years. It keeps finding its way to the back burner, mostly because other things have always taken priority, but also because doing it The Right Way (TM) is pretty intimidating.

A few weeks ago I started The Latest Attempt. I started simple, and had a few eyeballs look at it. A couple of days ago, I went back to the beginning of my blog and started transcribing recipes into the simple interface that I had. I ended up finding several issues, most of which I don't think my testers ever encountered. I thought I'd lay them out here, and let them percolate in my brain.

I'm going to use an example that wasn't in my early archives, but that I've been playing with lately. The original is here. I blogged about a version of mine here. Hopefully Food Network won't mind if I reprint the original here, because I'm going to tweak it a little for the sake of demonstration.

6 squares unsweetened chocolate
3/4 cup unsalted butter
2 cups sugar
3 eggs
1 teaspoon pure vanilla extract
1 cup unbleached all-purpose flour
1 cup chopped nuts (optional)

Words like "pure" and "unbleached" sound pretty specific. But it starts with "6 squares unsweetened chocolate". What size is a "square"? The experienced baker will tell you that unsweetened chocolate is often measured in one-ounce squares. But it presents the first problem that I've encountered, and that at least one tester came across too:

Non-Standard Measurements

"two sticks butter". "one can olives". "one package spinach". These are all arbitrary sizes that don't necessarily mean what you think they mean. Okay, so in America, butter comes in 4 oz sticks. That's something that we can rely upon. But one can of olives? Are we talking about the little 4 oz cans of sliced or chopped olives? Or are we talking about a 15 oz can of whole olives? How about the spinach? I've seen fresh spinach come in packages ranging from a few ounces to a couple of pounds, and who's to say we're not talking about frozen spinach? This actually leads into the next issue:

Nonspecific Ingredients

What kind of butter? Salted or unsalted? A professional chef would never cook with salted butter. Joe Q. America, who knows? And I've already brought up the issues with olives and spinach. But the issue that I found here was actually in trying to categorize the food items, with minimal effort on the user's behalf. I've been using the USDA SR22 this time around, along with some auto-suggest AJAX code, to try and link up what the user has been looking for with something that the user can use for things like nutritional charts. The SQL looks something like this:

select NDB_No, Shrt_Desc from ABBREV where Shrt_Desc like '%butter%' limit 10;

Offhand, it seems reasonable that this would give us results like "unsalted butter", "salted butter", etc. But the SR22 isn't in an order that is conducive to that kind of thing. Instead, we get a result like this:

+--------+------------------------------------------------------------+
| NDB_No | Shrt_Desc                                                  |
+--------+------------------------------------------------------------+
| 42291  | PEANUT BUTTER,RED NA                                       |
| 42307  | MARGARINE-LIKE,BUTTER-MARGARINE BLEND,80% FAT,STK,WO/ SALT |
| 42309  | MARGARINE-LIKE,VEG OIL-BUTTER SPRD,RED CAL,TUB,W/ SALT     |
| 43214  | BUTTER REPLCMNT,WO/FAT,PDR                                 |
| 11866  | SQUASH,WNTR,BUTTERNUT,CKD,BKD,W/SALT                       |
| 11867  | SQUASH,WNTR,BUTTERNUT,FRZ,CKD,BLD,W/SALT                   |
| 11372  | POTATOES,SCALLPD,HOME-PREPARED W/BUTTER                    |
| 11373  | POTATOES,AU GRATIN,HOME-PREPARED FROM RECIPE USING BUTTER  |
| 11381  | POTATOES,MSHD,DEHYD,PREP FR GRNLS WO/MILK,WHL MILK&BUTTER  |
| 11385  | POTATOES,AU GRATIN,DRY MIX,PREP W/H2O,WHL MILK&BUTTER      |
+--------+------------------------------------------------------------+

Only one of those even remotely resembles what I'm looking for, and honestly, I don't want it. This problem is easily, if tediously, solved: all I need to do is create a new database of commonly used ingredients, query it first, and then supplement its results with matches from the SR22. Oog. Let's move on to the really tricky stuff.
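
(Before that, for what it's worth, here is a rough Python sketch of the curated-table-first lookup, with an in-memory SQLite database standing in for MySQL. The `common_ingredients` table, its two rows, and the `suggest` function are assumptions made up for illustration; only `ABBREV` and `Shrt_Desc` come from the query above.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ABBREV (NDB_No TEXT, Shrt_Desc TEXT);
    -- Hypothetical curated table; a real one would also link to an SR22 row.
    CREATE TABLE common_ingredients (name TEXT);
    INSERT INTO ABBREV VALUES ('42291', 'PEANUT BUTTER,RED NA');
    INSERT INTO ABBREV VALUES ('11866', 'SQUASH,WNTR,BUTTERNUT,CKD,BKD,W/SALT');
    INSERT INTO common_ingredients VALUES ('unsalted butter');
    INSERT INTO common_ingredients VALUES ('salted butter');
""")

def suggest(term, limit=10):
    """Return curated matches first, then fill the remaining slots from ABBREV."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT name FROM common_ingredients WHERE name LIKE ? ORDER BY name LIMIT ?",
        (pattern, limit),
    ).fetchall()
    names = [row[0] for row in rows]
    remaining = limit - len(names)
    if remaining > 0:
        rows = conn.execute(
            "SELECT Shrt_Desc FROM ABBREV WHERE Shrt_Desc LIKE ? ORDER BY Shrt_Desc LIMIT ?",
            (pattern, remaining),
        ).fetchall()
        names += [row[0] for row in rows]
    return names

print(suggest("butter"))
```

The curated names always come out ahead of the raw SR22 descriptions, which is the whole point of querying the small table first.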

Storing Measurements

Let's go back to our brownie recipe. One of the ingredients is 3/4 cup butter. Databases don't store fractions in any mathematically usable format. Do we store it as a VARCHAR to maintain integrity with what the user entered, and then convert it later? Or do we store it as a decimal, and then store a flag saying whether it was entered as a fraction or a decimal? Or do we store it as a decimal, and assume that it will always be displayed to the user as a fraction (which is usually what the user wants)? For the sake of argument, let's go with decimals as the storage mechanism, and not worry about anything else for now. In MySQL, we might have something that looks like this:

...SNIP...
amount DECIMAL(10,2),
unit VARCHAR(10),
...SNIP...
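
One way to picture the decimal option: parse whatever the user typed into an exact fraction, round it to the two decimal places the column holds, and turn it back into a fraction for display. This is only a sketch in Python; `parse_amount`, `to_decimal`, and `to_display` are hypothetical helper names, and the rounding mode assumes the database rounds halves away from zero.

```python
from decimal import Decimal, ROUND_HALF_UP
from fractions import Fraction

def parse_amount(text):
    """Parse '3/4', '1 1/2', or '0.75' into an exact Fraction."""
    return sum(Fraction(part) for part in text.split())

def to_decimal(amount):
    """Round a Fraction to two places, as a DECIMAL(10,2) column would hold it."""
    value = Decimal(amount.numerator) / Decimal(amount.denominator)
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

def to_display(value, max_denominator=8):
    """Snap a stored decimal back to a kitchen-friendly fraction string."""
    frac = Fraction(value).limit_denominator(max_denominator)
    whole, rest = divmod(frac, 1)
    if rest == 0:
        return str(whole)
    if whole == 0:
        return str(rest)
    return f"{whole} {rest}"

stored = to_decimal(parse_amount("3/4"))
print(stored, "->", to_display(stored))
```

Two decimal places are enough for quarters, but 1/8 gets stored as 0.13, so the display step has to snap back to a small denominator rather than trust the stored digits.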

Measurement Ranges

Brownie recipe again. One of those ingredients, the nuts, is optional. It's technically a garnish, if an internal one. It's already subjective (walnuts? peanuts? pecans?), which means we can fudge a little on the amount too. Depending on how much you like your nut of choice, let's say you might opt for anywhere from 3/4 cup to 1 1/2 cups. This presents another problem with database management, at the very least. Maybe we can solve it by breaking the amount into two different fields?

...SNIP...
amount_min DECIMAL(10,2),
amount_max DECIMAL(10,2),
unit VARCHAR(10),
...SNIP...
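
Filling those two fields from what the user typed might look something like this. It's a minimal Python sketch under a couple of assumptions: the range is written with "to" or a spaced hyphen, and a single amount is stored with the minimum equal to the maximum. The function names are hypothetical.

```python
import re
from fractions import Fraction

def parse_amount(text):
    """Parse '3/4' or '1 1/2' into an exact Fraction."""
    return sum(Fraction(part) for part in text.split())

def parse_range(text):
    """Split '3/4 to 1 1/2' into (amount_min, amount_max)."""
    parts = re.split(r"\s+to\s+|\s+-\s+", text.strip())
    low = parse_amount(parts[0])
    high = parse_amount(parts[-1])
    if low > high:
        raise ValueError(f"range is backwards: {text!r}")
    return low, high

print(parse_range("3/4 to 1 1/2"))
```

Storing a plain "1 cup" as a range of 1 to 1 keeps every row the same shape, at the cost of a redundant column for most ingredients.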

Intermixed Measurement Units

Let's play with the sugar a little. A lot of people (like me) like to swap out some of the white sugar with brown sugar. Let's say that after careful testing and tweaking, I come up with the following measurements:

1 cup + 2 Tbsp white sugar
3/4 cup + 2 Tbsp brown sugar

Oh man. How do you store that? We could convert everything down to the lowest common denominator, Tbsp in this case, and then upscale when we display it back to the user. In the database, we would have something like:

18 Tbsp white sugar
14 Tbsp brown sugar

Of course, now we have to write code to scale this back to a reasonable measurement, since 18 Tbsp of anything is just weird, even just for behind-the-scenes storage. I think I would rather convert everything to a common unit, store it, and then convert it back. And I don't think any (non-metric) unit of measurement is more versatile than the ounce. Now we can store the sugar as:

9 oz white sugar
7 oz brown sugar
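
The round trip through ounces can be sketched in a few lines of Python. The unit table uses US customary volumes (8 fluid ounces to the cup, 2 tablespoons to the fluid ounce, 3 teaspoons to the tablespoon); the function names and the choice to display cups in quarter-cup steps are assumptions for illustration, not a settled design.

```python
from fractions import Fraction

# US customary volume units, expressed in fluid ounces.
FL_OZ_PER_UNIT = {"cup": Fraction(8), "tbsp": Fraction(1, 2), "tsp": Fraction(1, 6)}

# Largest unit first; each unit is only shown in multiples of its step.
DISPLAY_STEPS = [("cup", Fraction(1, 4)), ("tbsp", Fraction(1)), ("tsp", Fraction(1, 4))]

def to_fluid_ounces(parts):
    """Add up (amount, unit) pairs such as [(1, 'cup'), (2, 'tbsp')]."""
    return sum(Fraction(amount) * FL_OZ_PER_UNIT[unit] for amount, unit in parts)

def from_fluid_ounces(total):
    """Greedily break a stored ounce total back into readable units."""
    remaining = Fraction(total)
    parts = []
    for unit, step in DISPLAY_STEPS:
        step_oz = step * FL_OZ_PER_UNIT[unit]
        count = remaining // step_oz
        if count:
            parts.append((count * step, unit))
            remaining -= count * step_oz
    return parts

white = to_fluid_ounces([(1, "cup"), (2, "tbsp")])
brown = to_fluid_ounces([("3/4", "cup"), (2, "tbsp")])
print(white, from_fluid_ounces(white))
print(brown, from_fluid_ounces(brown))
```

A greedy breakdown like this will never hand back 1/3 cup, so the display steps would need more thought before they could cover every recipe.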

Of course, this leads us to the next problem:

Weight vs Volume

In the metric world, this isn't an issue, because everything is stored in either some version of grams or some version of litres. But in the US customary system (loosely called Imperial) favored in America, an ounce could mean weight or it could mean volume. With some ingredients, this isn't a big deal. "A pint's a pound, the whole world round", right? Makes sense, since a pint is 16 ounces by volume and a pound is 16 ounces by weight. Well, not exactly (a US pint of water weighs about 1.043 pounds), but close enough for the home cook.

In the above example, we know that we're referring to volume, if only because we know that we started with cups. But if I were looking at the recipe without any context, I personally would start off by thinking that it was a weight measurement. Some would assume it was volume. Even for the home cook, this is kind of significant, since a cup of sugar weighs a little over 7 ounces, not 8 ounces. In the aforementioned example, we could just store a flag that states whether this unit is by weight, volume, count, etc. If a user specified ounce, without any context, we'd have to store it as "unspecified", until it became important to the user (say, for nutritional data).
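
That flag could be as simple as the following Python sketch. The `Kind` enum, the unit table, and the rule that a bare "oz" borrows its kind from the units it was entered alongside are all assumptions for illustration.

```python
from enum import Enum

class Kind(Enum):
    WEIGHT = "weight"
    VOLUME = "volume"
    COUNT = "count"
    UNSPECIFIED = "unspecified"

# Units whose kind is unambiguous; illustrative, not exhaustive.
UNIT_KINDS = {
    "cup": Kind.VOLUME, "tbsp": Kind.VOLUME, "tsp": Kind.VOLUME,
    "fl oz": Kind.VOLUME, "lb": Kind.WEIGHT, "g": Kind.WEIGHT,
    "": Kind.COUNT,  # "3 eggs" has no unit at all
}

def classify(unit, entered_with=()):
    """Guess what a unit measures; a bare 'oz' stays unspecified without context."""
    if unit in UNIT_KINDS:
        return UNIT_KINDS[unit]
    if unit == "oz":
        kinds = {UNIT_KINDS[u] for u in entered_with if u in UNIT_KINDS}
        if len(kinds) == 1:
            return kinds.pop()
    return Kind.UNSPECIFIED

print(classify("oz"))
print(classify("oz", entered_with=["cup", "tbsp"]))
```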

Of course, this is all backend stuff. Let's talk about another major problem:

Dealing With Users

If you can force the user to use your own forms, custom-tailored to suit your database, you can force them to do everything properly. But as any UI designer worth his or her salt can tell you, when you start forcing users to do what you want, you start losing users to your competitor. Your competitor's software may or may not do an inferior job, but if it makes your users feel better, that's what your users will use.

I've heard a lot of people complain about recipe software. From my limited experience, when a user gets tired of their recipe software (as most inevitably will), they will switch back to using a word processor, or possibly a spreadsheet. So it seems to me that the best way to get somebody to use your recipe software is to make it feel as much as possible like using a word processor. That means using a lot of fuzzy logic to convert what your users type into something that your software can use.
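
As a very small taste of what that conversion involves, here is a Python sketch that takes one typed line and pulls out an amount, a unit, and a name. The regular expression, the `UNITS` table, and `parse_line` are hypothetical, and anything the pattern doesn't recognize is kept as plain text rather than rejected.

```python
import re
from fractions import Fraction

# Spellings the sketch recognizes, mapped to one canonical unit each.
UNITS = {
    "cup": "cup", "cups": "cup",
    "teaspoon": "tsp", "teaspoons": "tsp", "tsp": "tsp",
    "tablespoon": "tbsp", "tablespoons": "tbsp", "tbsp": "tbsp",
    "ounce": "oz", "ounces": "oz", "oz": "oz",
}

# A mixed number ("1 1/2"), a fraction ("3/4"), or a whole number ("3").
LINE = re.compile(r"^\s*(?P<amount>\d+\s+\d+/\d+|\d+/\d+|\d+)\s+(?P<rest>.+)$")

def parse_line(line):
    """Split a typed ingredient line into amount, unit, and name."""
    match = LINE.match(line)
    if not match:
        # Nothing recognizable: keep the user's text untouched.
        return {"amount": None, "unit": None, "name": line.strip()}
    amount = sum(Fraction(part) for part in match.group("amount").split())
    rest = match.group("rest").strip()
    first, _, remainder = rest.partition(" ")
    unit = UNITS.get(first.lower().rstrip("."))
    name = remainder.strip() if unit else rest
    return {"amount": amount, "unit": unit, "name": name}

for text in ["3/4 cup unsalted butter", "3 eggs", "6 squares unsweetened chocolate"]:
    print(parse_line(text))
```

Notice that "squares" falls straight through into the ingredient name, which is exactly the non-standard measurement problem from earlier.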

As you can see, writing recipe software The Right Way (TM) presents a lot of issues. I haven't figured out how to handle most of them, and one of the biggest problems seems to be that some issues are dependent upon other issues. Well, I'll get it figured out.

Comments:

  1. Oh, the fate of doing something The Right Way. With this much consideration and thought (exactly what I'd do if I actually tried to make a recipe program - which I have, fleetingly), I'd definitely use such a thing.
    The ideas in the above comments are great - elg extended it to being very useful and makes sense. Underline "1 package" and let the user specify further (did you mean...16oz? 4oz?), but don't force them to. If they don't, you just have to store it as text, and make sure they know they can't scale/get nutritional info/plan grocery trips/all the cool stuff until they fix it. Depending on how you end up doing ingredients, maybe a green/blue/less obtrusive line if you're uncertain about categorizing a certain ingredient.
    As per your manual DB, that might be something to have some pre-alpha testers work on - I for one would be quite willing to go through my cookbooks and categorize some ingredients from them here and there, in my spare time.

    (Also, Chrome squigglied "elg" as I typed it, appropriately enough)