blog.josephhall.com: 2010

Thursday, April 15, 2010

Using the find command

Today I was working on a problem with a coworker, and I saw him start to type the following:


find . | grep

I said, "no, I'm not going to let you do that. You need to do it the right way." (Yes, I do that sort of thing. Don't act like you didn't expect it.)

Don't get me wrong, there's nothing wrong with grep. It's a fine tool, and I encourage people to constantly try to become better with it. But it has its place, and this was not it. Using the find command properly in this case would result in less typing, and would spawn one less process. It may not be important for a one-line command, but in a larger script it might be more significant. Better to get in the right habit now, so that when you do find yourself working on that big script, you do it right the first time.

The find command is extremely powerful. Unlike locate, which uses a pre-built database of files and paths on your system, find searches your filesystem in real time, paying more attention to the individual files, and their properties. It may be slower than locate, but it's more accurate and far more flexible.

I see people use find mostly to search by filename, but it has plenty of other options. Let's start with filename and build from there. There are two relevant options:


-name
-iname

They are identical, except that -iname is case insensitive. Since files on a *nix system are traditionally all lowercase, you might want to save yourself a few processor cycles and just go with -name. If there's a chance that case may be an issue, use -iname instead.


find -name 'myfile.txt'

Using quotes is not strictly necessary with most filenames, but it's a good habit to get into. Keep in mind that by default, find searches by exact filenames. If you're not sure what the extension is, or you want to look anywhere in the filename, you can use globs:


find -iname '*myfile*'

The above commands will process files recursively, inside the current working directory. If you want to search a different directory, you need to specify it before any other options:


find /etc -name passwd
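
For readers who think better in code, here is a rough Python sketch of what -name and -iname are doing. It is not how find is implemented; the function name find_by_name and its parameters are made up for this illustration, and unlike find it does not test the starting directory itself.

```python
import fnmatch
import os


def find_by_name(root, pattern, ignore_case=False):
    """Yield paths under root whose base name matches a shell glob.

    Roughly mirrors `find root -name pattern` (or -iname when
    ignore_case is True). Hypothetical helper, for illustration only.
    """
    if ignore_case:
        pattern = pattern.lower()
    for dirpath, dirnames, filenames in os.walk(root):
        # Like find, test directory names as well as file names.
        for name in dirnames + filenames:
            candidate = name.lower() if ignore_case else name
            # fnmatchcase compares case-sensitively on every platform.
            if fnmatch.fnmatchcase(candidate, pattern):
                yield os.path.join(dirpath, name)
```

Calling list(find_by_name('/etc', 'passwd')) walks the same tree as the command above. As with find, the results come back in whatever order the filesystem hands them out, so wrap the call in sorted() if order matters.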

There are a few subtleties of find that you will encounter. They're not usually a big deal, but they can be annoying sometimes. For instance, find does not sort its results. If I expect a lot of results, I generally pipe it through the sort command. It also isn't very good at searching its own results, which is where grep can come in handy:


find / -name 'Net' | sort | grep -i perl

Now that you have the basics of find, let's explore some of the other options. Two that I use extensively are:


-ok
-exec

Again, these options are identical in purpose, but there is an important difference in how they behave. Both of them will execute a command on each file found, but -ok will ask for permission first (for each file) while -exec will just do it.


find /etc -name '*conf' -exec mv {} {}.orig \;

First off, everything between -exec (or -ok) and \; is the command that you want to run. Make sure you escape that semi-colon at the end with a backslash, or you'll be sorry. The {} is a placeholder for the filename that was found by find. In this case, we're actually going to be performing a series of commands that looks like this:


mv /etc/httpd/conf/httpd.conf /etc/httpd/conf/httpd.conf.orig
mv /etc/httpd/conf.d/ssl.conf /etc/httpd/conf.d/ssl.conf.orig
mv /etc/httpd/conf.d/perl.conf /etc/httpd/conf.d/perl.conf.orig
...SNIP...
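
If the {} substitution still feels abstract, this small Python sketch performs the same expansion as a dry run, without moving anything. The helper name expand_exec is invented for this example. It replaces {} anywhere inside an argument, which is what GNU find does; other implementations are only required to replace an argument that is exactly {}.

```python
def expand_exec(template, paths):
    """Return one argument list per path, substituting the path for {}.

    template is the command between -exec and \\; split into arguments,
    for example ["mv", "{}", "{}.orig"]. Hypothetical helper for a dry run.
    """
    return [[arg.replace("{}", path) for arg in template] for path in paths]


if __name__ == "__main__":
    found = ["/etc/httpd/conf/httpd.conf", "/etc/httpd/conf.d/ssl.conf"]
    for command in expand_exec(["mv", "{}", "{}.orig"], found):
        print(" ".join(command))
```

Printing the commands first is also a cheap stand-in for -ok when you just want to see what would happen.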

You're not limited to just searching filenames by glob. The find command does actually have support for regular expressions, using the following:


-regex
-iregex

I don't think I need to tell you that -iregex is the case-insensitive version of -regex. If you already know how to use grep, this isn't much of a stretch:


find -regex ".*deskto."
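
One thing that trips up grep users: -regex has to match the whole path, not just a piece of it, which is why the pattern above starts with .* in the first place. Here is a minimal Python sketch of that behavior. The function name is made up, and Python's regex dialect is not identical to find's default, so treat it as an approximation that holds for simple patterns like this one.

```python
import re


def regex_matches(path, pattern, ignore_case=False):
    """Approximate find's -regex (or -iregex) test for a single path.

    The pattern must match the entire path, so fullmatch is used
    rather than search. Hypothetical helper, for illustration only.
    """
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(pattern, path, flags) is not None
```

With the pattern above, "./apps/x.desktop" matches, while "./desktop/notes.txt" does not, because the path has to end with "deskto" plus exactly one more character.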

The find command also supports boolean logic:


-and (or -a)
-or  (or -o)
-not

Let's combine these with a couple more options from the man page:


touch /tmp/jayceweb.tar
find / -user jayce -and -group apache -exec tar --remove-files -rf /tmp/jayceweb.tar {} \;
bzip2 /tmp/jayceweb.tar

This is the sort of command a person might run if they found a user on their system that they didn't trust, and wanted to quarantine all of their web files. First we make an empty tar file, then we add the suspicious files to it, removing them once they've been archived. The archive is only compressed at the very end, because tar cannot append (-r) to an archive that is already compressed. It assumes that the user that owns the files is jayce, and the group that owns the files is apache. You could also make use of:


-uid
-gid
-nouser
-nogroup

That's probably enough of a primer to get you started. Now would be a good time to check the man page for some of the other myriad options that you can use to check by date stamp(s), file size, file type and even permissions. A little practice with this powerful command will save you time and energy, and increase your productivity like you won't believe.

Saturday, March 13, 2010

Getting Started with Cassandra

12 days ago, I read an article by Matt Asay which briefly mentioned Cassandra, Facebook's NoSQL offering. I had heard of it before, but hadn't really looked into it. For some reason, Matt's article caused me to look into it again. Within a couple of hours, I was evangelising it to a few friends. Matt's article pointed out that Facebook, Digg and Twitter had all started using it, and as I researched it, it seemed that Digg's and Twitter's migrations to it had taken only a few days. Last night, one of the people that I had been hyping it to sent me a thing from Reddit, posted only 11 days after Matt's article, talking about how they had just finished a 10-day migration to Cassandra.

What is Cassandra?

In order to understand Cassandra, it would first make sense to talk about exactly what this NoSQL movement is.

What is NoSQL?

NoSQL is a nickname that arose sometime last year to describe a series of increasingly popular database management systems (DBMS) that do not use SQL as an interface, as has been common in database servers for at least a couple of decades. Indeed, SQL is hardly the issue here at all. If you were to migrate from MySQL to PostgreSQL, chances are you would still have to update several of your queries in order to be compatible with the new DBMS. It's almost like changing languages anyway, except that it's more like the differences between Canadian French and Haitian Creole: both French, but not pure French, and there are enough differences to matter.

Switching from SQL to NoSQL is a little more like switching from English to Japanese. Both are languages which accomplish the same goal (interperson communication), but both take very different approaches. They have different keywords, different grammars, and some would argue that Japanese is a much more precise and efficient language. One might even bring scalability into the discussion, as both languages have had the opportunity to grow. One might argue that English has done so sloppily, borrowing from odd places, whereas Japanese has done so with a little less mess, for instance, adding an entire syllabic infrastructure called katakana in order to handle foreign words, among other things.

Both SQL and NoSQL are DBMSs. They both hold data admirably, but whereas most SQL servers were originally built before the idea of database clusters was common, NoSQL servers were introduced around the time that database clusters were becoming a necessity in many infrastructures. This provided NoSQL servers with the ability to consider this concern during the design stages, rather than having to patch it in later. Some of the names that you will see in the NoSQL world are BigTable, Dynamo, HBase, Hadoop, CouchDB, and Cassandra.

So, What is Cassandra?

Cassandra is a NoSQL DBMS written by Facebook. It was open sourced in 2008, and added to the Apache Project in 2009. It is fault-tolerant, decentralized, and "eventually consistent", meaning that when data is added to the database, there is a propagation period before that data is available to all of the nodes in the cluster. A more famous database model that is also eventually consistent is DNS: zone records are updated higher up in the DNS tree, and then trickle down to relevant servers in an organized fashion. This used to take 72 hours or more in DNS, but these days takes closer to an hour. With Cassandra, it is more likely to take a few seconds. This means that your applications must be written with this consideration in mind.

Rather than using SQL, Cassandra uses a system of key/value pairs. This is not a new concept to most programmers, whether they refer to them as dictionaries, associative arrays or hashes. The concept should be immediately familiar to any Perl programmer, and possibly even more comfortable to anyone who has ever worked with JSON. One major difference is that each name/value pair is also timestamped. So a column, as it were, in Cassandra consists of a name/value/timestamp set. For example:


{
    name: "email",
    value: "test@test.com",
    timestamp: 1259991135887
}

Cassandra also has what's called a SuperColumn, which is a grouping of columns, much like a hash of hashes in Perl. For example:


{
    name: "person",
    value: {
        realname: { name: "realname", value: "Billy Bob Test", timestamp: 1259991135887 },
        email: { name: "email", value: "test@test.com", timestamp: 1259991135887 },
        ircnick: { name: "ircnick", value: "billybobtest", timestamp: 1259991135887 }
    }
}
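
To make the shape of that data a little more concrete, here is a loose Python model of a column and a SuperColumn. This is not Cassandra's client API; the class and function names are invented for this sketch, and the newest() rule (the copy with the later timestamp wins) is my working assumption about what the timestamp is for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """A name/value/timestamp set, as described above."""
    name: str
    value: str
    timestamp: int


def newest(a, b):
    """Pick the copy with the later timestamp (assumed conflict rule)."""
    return a if a.timestamp >= b.timestamp else b


def make_supercolumn(name, columns):
    """Group columns under one name, like a hash of hashes in Perl."""
    return {"name": name, "value": {c.name: c for c in columns}}


if __name__ == "__main__":
    person = make_supercolumn("person", [
        Column("realname", "Billy Bob Test", 1259991135887),
        Column("email", "test@test.com", 1259991135887),
    ])
    print(person["value"]["email"].value)
```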

That's all the technical detail that I'm going to go into at the moment, largely because there are already so many great articles out there to get you started, but also because I'm new, and still know just enough to be dangerous (mostly to myself). But I am going to link to a few of those articles for you, if you're interested enough now to check them out.

Bearing in mind that I'm a Perl guy, here are the links that I've already sent out to a couple of friends in email, which may or may not cover your language of choice.

For information about Cassandra and some theories behind it, you'll want to take a look at these links:

http://incubator.apache.org/cassandra/
http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
http://bryanpendleton.blogspot.com/2010/03/following-links-to-cassandra.html
http://blog.evanweaver.com/articles/2009/07/06/up-and-running-with-cassandra/ (Ruby examples included)

When you're ready to install it and start playing with it, you'll want to read these, in roughly this order.

http://dustyreagan.com/installing-cassandra-on-ubuntu-linux/
http://wiki.apache.org/cassandra/CassandraCli
http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model
http://search.cpan.org/~lbrocard/Net-Cassandra-0.35/lib/Net/Cassandra.pm (For the Perl guys)

As I continue to explore and learn Cassandra, you may see an article here and there about it on my blog. I'm pretty excited about it, and while I see no reason to completely abandon SQL databases (they all have their uses, many of which Cassandra is likely not well-suited for), I think that I'm likely to use Cassandra as a major component on an upcoming project.

Friday, March 12, 2010

Interpreting Recipe Input

As some of you know, I've been working on recipe software off and on for a few years. It keeps finding its way to the backburner, mostly because other things have always taken priority, but also because doing it The Right Way (TM) is pretty intimidating.

A few weeks ago I started The Latest Attempt. I started simple, and had a few eyeballs look at it. A couple of days ago, I went back to the beginning of my blog and started transcribing recipes into the simple interface that I had. I ended up finding several issues, most of which I don't think most of my testers ever encountered. I thought I'd lay them out here, and let them percolate in my brain.

I'm going to use an example that wasn't in my early archives, but that I've been playing with lately. The original is here. I blogged about a version of mine here. Hopefully Food Network won't mind if I reprint the original here, because I'm going to tweak it a little for the sake of demonstration.


6 squares unsweetened chocolate
3/4 cup unsalted butter
2 cups sugar
3 eggs
1 teaspoon pure vanilla extract
1 cup unbleached all-purpose flour
1 cup chopped nuts (optional)

Words like "pure" and "unbleached" sound pretty specific. But it starts with "6 squares unsweetened chocolate". What size is a "square"? The experienced baker will tell you that unsweetened chocolate is often measured in one-ounce squares. But it presents the first problem that I've encountered, and that at least one tester came across too:

Non-Standard Measurements

"two sticks butter". "one can olives". "one package spinach". These are all arbitrary sizes, that don't necessarily mean what you think they mean. Okay, so in America, butter comes in 4 oz sticks. That's something that we can rely upon. But one can of olives? Are we talking about the little 4 oz cans of sliced or chopped olives? Or are we talking about a 15 oz can of whole olives? How about the spinach? I've seen fresh spinach come in packages ranging from a few ounces to a couple of pounds, and who's to say we're not talking about frozen spinach? This actually leads into the next issue:

Nonspecific Ingredients

What kind of butter? Salted or unsalted? A professional chef would never cook with salted butter. Joe Q. America, who knows? And I've already brought up the issues with olives and spinach. But the issue that I found here was actually in trying to categorize the food items, with minimal effort on the user's behalf. I've been using the USDA SR22 this time around, along with some auto-suggest AJAX code, to try and link up what the user has been looking for with something that the user can use for things like nutritional charts. The SQL looks something like this:


select NDB_No, Shrt_Desc from ABBREV where Shrt_Desc like '%butter%' limit 10;

Offhand, it seems reasonable that this would give us results like "unsalted butter", "salted butter", etc. But the SR22 isn't in an order that is conducive to that kind of thing. Instead, we get a result like this:


+--------+------------------------------------------------------------+
| NDB_No | Shrt_Desc                                                  |
+--------+------------------------------------------------------------+
| 42291  | PEANUT BUTTER,RED NA                                       | 
| 42307  | MARGARINE-LIKE,BUTTER-MARGARINE BLEND,80% FAT,STK,WO/ SALT | 
| 42309  | MARGARINE-LIKE,VEG OIL-BUTTER SPRD,RED CAL,TUB,W/ SALT     | 
| 43214  | BUTTER REPLCMNT,WO/FAT,PDR                                 | 
| 11866  | SQUASH,WNTR,BUTTERNUT,CKD,BKD,W/SALT                       | 
| 11867  | SQUASH,WNTR,BUTTERNUT,FRZ,CKD,BLD,W/SALT                   | 
| 11372  | POTATOES,SCALLPD,HOME-PREPARED W/BUTTER                    | 
| 11373  | POTATOES,AU GRATIN,HOME-PREPARED FROM RECIPE USING BUTTER  | 
| 11381  | POTATOES,MSHD,DEHYD,PREP FR GRNLS WO/MILK,WHL MILK&BUTTER  | 
| 11385  | POTATOES,AU GRATIN,DRY MIX,PREP W/H2O,WHL MILK&BUTTER      | 
+--------+------------------------------------------------------------+

Only one of those even remotely resembles what I'm looking for, and honestly, I don't want it. This problem is easily, if tediously, solved: all I need to do is create a new database of commonly used ingredients, query it first, and then supplement it with the second database. Oog. Let's move on to the really tricky stuff.
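
Before moving on, here is a rough Python sketch of that two-tier lookup, using an in-memory SQLite database so it can run anywhere. The common_ingredients table, the suggest() function and the sample rows are all assumptions made up for this illustration; only the ABBREV column names come from the query above.

```python
import sqlite3

QUERY = ("SELECT NDB_No, Shrt_Desc FROM {} "
         "WHERE Shrt_Desc LIKE ? ORDER BY Shrt_Desc LIMIT ?")


def demo_connection():
    """Build a tiny in-memory database with illustrative sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ABBREV (NDB_No TEXT, Shrt_Desc TEXT)")
    conn.execute("CREATE TABLE common_ingredients (NDB_No TEXT, Shrt_Desc TEXT)")
    conn.executemany("INSERT INTO ABBREV VALUES (?, ?)", [
        ("01001", "BUTTER,WITH SALT"),
        ("42291", "PEANUT BUTTER,RED NA"),
        ("11866", "SQUASH,WNTR,BUTTERNUT,CKD,BKD,W/SALT"),
    ])
    # Hypothetical hand-picked list of ingredients people actually type.
    conn.executemany("INSERT INTO common_ingredients VALUES (?, ?)", [
        ("01001", "BUTTER,WITH SALT"),
        ("01145", "BUTTER,WITHOUT SALT"),
    ])
    return conn


def suggest(conn, term, limit=10):
    """Return common ingredients first, then fill up from ABBREV."""
    like = f"%{term}%"
    rows = conn.execute(QUERY.format("common_ingredients"), (like, limit)).fetchall()
    if len(rows) < limit:
        seen = {ndb_no for ndb_no, _ in rows}
        # Ask for a few extra rows so duplicates can be dropped safely.
        extra = conn.execute(QUERY.format("ABBREV"),
                             (like, limit + len(rows))).fetchall()
        fresh = [row for row in extra if row[0] not in seen]
        rows.extend(fresh[: limit - len(rows)])
    return rows
```

The user typing "butter" sees the hand-picked entries at the top of the auto-suggest list, and the full SR22 only fills in whatever room is left.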

Storing Measurements

Let's go back to our brownie recipe. One of the ingredients is 3/4 cup butter. Databases don't store fractions in any mathematically-usable format. Do we store it as a VARCHAR to maintain integrity with what the user entered, and then convert it later? Or do we store it as a decimal, and then store a flag saying whether it was entered as a fraction or a decimal? Or do we store it as a decimal, and assume that it will always be displayed to the user as a fraction (which is usually what the user wants)? For the sake of argument, let's go with decimals as the storage mechanism, and not worry about anything else for now. In MySQL, we might have something that looks like this:


...SNIP...
amount DECIMAL(10,2),
unit VARCHAR(10),
...SNIP...
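
To see what the decimal round trip might look like in application code, here is a minimal Python sketch. The function names are made up, and the choice of sixteenths as the finest fraction to display is purely an assumption. It also shows the catch: 1/3 cup gets stored as 0.33, so the display code has to guess its way back to the nearest sensible fraction.

```python
from decimal import Decimal
from fractions import Fraction


def parse_amount(text):
    """Turn '3/4', '1 1/2' or '0.75' into an exact Fraction."""
    return sum(Fraction(part) for part in text.split())


def to_decimal(amount):
    """Round a Fraction to two places, matching DECIMAL(10,2)."""
    value = Decimal(amount.numerator) / Decimal(amount.denominator)
    return value.quantize(Decimal("0.01"))


def to_display(stored, max_denominator=16):
    """Turn a stored decimal back into a kitchen-friendly fraction."""
    frac = Fraction(stored).limit_denominator(max_denominator)
    whole, rest = divmod(frac, 1)
    parts = []
    if whole:
        parts.append(str(whole))
    if rest:
        parts.append(str(rest))
    return " ".join(parts) or "0"
```

For example, to_decimal(parse_amount("3/4")) gives Decimal("0.75"), and to_display(Decimal("1.50")) gives "1 1/2".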

Measurement Ranges

Brownie recipe again. One of those ingredients, the nuts, is optional. It's technically a garnish, if an internal one. It's already subjective (walnuts? peanuts? pecans?), which means we can fudge a little on the amount too. Depending on how much you like your nut of choice, let's say you might opt for anywhere from 3/4 cup to 1 1/2 cups. This presents another problem with database management, at the very least. Maybe we can solve it by breaking the amount into two different fields?


...SNIP...
amount_min DECIMAL(10,2),
amount_max DECIMAL(10,2),
unit VARCHAR(10),
...SNIP...

Intermixed Measurement Units

Let's play with the sugar a little. A lot of people (like me) like to swap out some of the white sugar with brown sugar. Let's say that after careful testing and tweaking, I come up with the following measurements:


1 cup + 2 Tbsp white sugar
3/4 cup + 2 Tbsp brown sugar

Oh man. How do you store that? We could convert everything down to the lowest common denominator, Tbsp in this case, and then upscale when we display it back to the user. In the database, we would have something like:


18 Tbsp white sugar
14 Tbsp brown sugar

Of course, now we have to write code to scale this back to a reasonable measurement, since 18 Tbsp of anything is just weird, even just for behind-the-scenes storage. I think I would rather convert everything to a common unit, store it, and then convert it back. And I don't think any (non-metric) unit of measurement is more versatile than the ounce. Now we can store the sugar as:


9 oz white sugar
7 oz brown sugar
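
The arithmetic behind those numbers can be sketched in a few lines of Python. The names here are invented for this example; the conversion factors are the usual US ones (16 Tbsp or 8 fluid ounces to the cup), and splitting the result back into quarter cups plus leftover tablespoons is just one assumption about what a reasonable display looks like.

```python
from fractions import Fraction

# How many tablespoons are in one of each unit (US measures).
TBSP_PER_UNIT = {
    "cup": Fraction(16),
    "fl oz": Fraction(2),
    "Tbsp": Fraction(1),
    "tsp": Fraction(1, 3),
}


def to_tbsp(parts):
    """Add up (quantity, unit) pairs such as [("3/4", "cup"), ("2", "Tbsp")]."""
    return sum(Fraction(qty) * TBSP_PER_UNIT[unit] for qty, unit in parts)


def to_fl_oz(tbsp):
    """Convert tablespoons to fluid ounces for storage."""
    return tbsp / TBSP_PER_UNIT["fl oz"]


def to_kitchen(tbsp):
    """Split tablespoons into (cups, leftover Tbsp), in quarter-cup steps."""
    quarters, leftover = divmod(tbsp, 4)  # 4 Tbsp per quarter cup
    return Fraction(quarters, 4), leftover
```

Running the brown sugar through it: to_tbsp([("3/4", "cup"), ("2", "Tbsp")]) is 14, to_fl_oz(14) is 7, and to_kitchen(14) hands back 3/4 cup and 2 Tbsp.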

Of course, this leads us to the next problem:

Weight vs Volume

In the metric world, this isn't an issue, because everything is stored in either some version of grams or some version of litres. But in the Imperial system favored in America, an ounce could mean weight or it could mean volume. With some ingredients, this isn't a big deal. "A pint's a pound, the whole world round", right? Makes sense, since a pint is 16 ounces by volume and a pound is 16 ounces by weight. Well, not exactly (1 pint == 1.043 pounds), but close enough for the home cook.

In the above example, we know that we're referring to volume, if only because we know that we started with cups. But if I were looking at the recipe without any context, I personally would start off by thinking that it was a weight measurement. Some would assume it was volume. Even for the home cook, this is kind of significant, since a cup of sugar weighs a little over 7 ounces, not 8 ounces. In the aforementioned example, we could just store a flag that states whether this unit is by weight, volume, count, etc. If a user specified ounce, without any context, we'd have to store it as "unspecified", until it became important to the user (say, for nutritional data).
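
Here is one way that flag might look, as a small Python sketch. The Kind enum, the unit table and the rule that anything unrecognized is treated as a count (eggs, squares, cans) are all assumptions for illustration, not a finished design.

```python
from enum import Enum


class Kind(Enum):
    WEIGHT = "weight"
    VOLUME = "volume"
    COUNT = "count"
    UNSPECIFIED = "unspecified"


# Hypothetical lookup table; a real one would be much longer.
UNIT_KINDS = {
    "cup": Kind.VOLUME,
    "tbsp": Kind.VOLUME,
    "tsp": Kind.VOLUME,
    "pint": Kind.VOLUME,
    "fl oz": Kind.VOLUME,
    "lb": Kind.WEIGHT,
    "g": Kind.WEIGHT,
    # A bare ounce could be either, so don't guess.
    "oz": Kind.UNSPECIFIED,
}


def classify(unit):
    """Return the kind of measurement for a unit the user typed."""
    return UNIT_KINDS.get(unit.strip().lower(), Kind.COUNT)
```

A bare "oz" stays UNSPECIFIED until the user (or the nutritional chart) forces the question.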

Of course, this is all backend stuff. Let's talk about another major problem:

Dealing With Users

If you can force the user to use your own forms, custom-tailored to suit your database, you can force them to do everything properly. But as any UI designer worth his or her salt can tell you, when you start forcing users to do what you want, you start losing users to your competitor. Your competitor's software may or may not do an inferior job, but if it makes your users feel better, that's what your users will use.

I've heard a lot of people complain about recipe software. From my limited experience, when a user gets tired of their recipe software (as most inevitably will), they will switch back to using a word processor, or possibly a spreadsheet. So it seems to me that the best way to get somebody to use your recipe software is to make it feel as much as possible like using a word processor. That means using a lot of fuzzy logic to convert what your users type into something that your software can use.

As you can see, writing recipe software The Right Way (TM) presents a lot of issues. I haven't figured out how to handle most of them, and one of the biggest problems seems to be that some issues are dependent upon other issues. Well, I'll get it figured out.

Homemade Server Rack

I actually built this about a year ago, and never bothered to post it. I mentioned it to a couple of people this week, and thought I would post it for them:

I had a couple of 2U servers lying around that I was planning on installing, but that's kind of a weird form factor for setting up around the home. Fortunately, we had some laminate boards lying around, left over from a water-damaged build-it-yourself stand-alone closet that was in the basement when we moved in. I tossed the water-damaged parts, but held onto the rest of the boards, just for something like this.

I used 4 shelves, and built a box about the right form factor. I already had some casters (not shown in the photo, because they're underneath) so I attached them to the bottom to easily move the box around. The only parts that I ended up buying were the mounting racks for the servers. It's just not a part that I had lying around the house. It's also not a part you can find at Lowes, but it's easy to find online. I bought mine from Star Case. I figured that a 10U set should last me for a while.

Thursday, February 18, 2010

Primary Keys in MySQL

I was reading an AJAX tutorial the other day, and something that the author said caught my eye:

"All we really need is the title, but I always provide a primary key for any table that I create."

Why add it if you know for a fact that you don't need it? In this case, as with every case I've seen so far, the primary key in question was an auto_increment integer. I have long maintained that while this is necessary in most tables, it does not necessarily belong in every table. What you really need is a unique identifier. Without this, all you have is a jumble of data that really doesn't make sense to be stored in a database.

But that unique identifier doesn't always need to be a counter. Let's take a look at a couple of examples. Consider the following hypothetical user table:


CREATE TABLE users (
    username  VARCHAR(50) NOT NULL,
    realname  VARCHAR(50) NOT NULL,
    password  VARCHAR(50) NOT NULL,
    PRIMARY KEY (username)
);

It makes sense for a username to be unique, because the username/password (or other token) combination will be required for access to this application. If you need to refer to this table from another table, you can just refer to the username, because it is unique and identifiable. I might note that MySQL itself does not use an auto_increment field for its own user table.

There are potential problems with this, however. When you change a username in this table, you need to change it in any tables that reference it as well. Additionally, you are taking up a little extra (if negligible) space in referring tables. An auto_increment uses an integer, which takes up less space, and will not change under normal circumstances. It should be noted that Unix-style operating systems use a UID to identify not only users, but file and process ownership. The username itself is rarely used by the system for anything other than making things more human-readable.

Let me show you a table structure that I've been working on this morning. I have a need to store recipes in a database, but because a recipe contains variable numbers of ingredients and directions, it doesn't make sense to try and store each in its own field in a single recipe table. Instead, I have broken out my structure into three separate tables:


CREATE TABLE recipes (
    recipe_id    INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
    name         VARCHAR(255) NOT NULL,
    source       VARCHAR(255) NOT NULL,
    preheat_qty  INT(4),
    preheat_unit ENUM('F','C'),
    yield_qty    INT(4),
    yield_unit   INT(11) NOT NULL,
    PRIMARY KEY (recipe_id)
);

CREATE TABLE recipe_ingredients (
    recipe_id      INT(11) UNSIGNED NOT NULL,
    rank           INT(4),
    name           VARCHAR(255) NOT NULL,
    qty            INT(4) NOT NULL,
    unit           INT(11) NOT NULL,
    PRIMARY KEY (recipe_id, rank)
);

CREATE TABLE recipe_directions (
    recipe_id      INT(11) UNSIGNED NOT NULL,
    rank           INT(4),
    direction      TEXT,
    PRIMARY KEY (recipe_id, rank)
);

The recipe itself is referenced by a unique counter, which is usual. INT(11) contains far more unique identifiers than I expect to need for that table. But there will be multiple ingredients per recipe, and multiple directions per recipe. Even if I only store recipes with no more than 4 ingredients and 4 steps (which is unlikely), I need 4 times as many unique identifiers per table.

I already have to store the recipe_id, to keep from orphaning the data. It's important to store what order each step is in, because if they got out of order, the recipe would quickly become confusing. It's nice to store the order of ingredients in the order in which they'll be used too, and many people write recipes with this in mind. I've called this field "rank". Since I already have those two fields, and they already uniquely identify the rows, why not officially make them primary keys together? MySQL allows it, and I'm going to make use of it.
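
If you want to watch a composite key do its job without setting up MySQL, here is a small Python sketch using the sqlite3 module that ships with Python. It uses SQLite rather than MySQL, so the column types are simplified, and the helper names are made up for this example. The rank column is quoted because newer MySQL releases treat RANK as a reserved word, which is worth knowing if you borrow the schema.

```python
import sqlite3


def make_db():
    """Create an in-memory table keyed on (recipe_id, rank)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE recipe_directions ('
        ' recipe_id INTEGER NOT NULL,'
        ' "rank" INTEGER NOT NULL,'
        ' direction TEXT,'
        ' PRIMARY KEY (recipe_id, "rank"))'
    )
    return conn


def add_direction(conn, recipe_id, rank, text):
    """Insert one step; return False if that step number is already taken."""
    try:
        conn.execute(
            "INSERT INTO recipe_directions VALUES (?, ?, ?)",
            (recipe_id, rank, text),
        )
        return True
    except sqlite3.IntegrityError:
        # The pair (recipe_id, rank) already exists for this recipe.
        return False
```

Step 1 of recipe 1 and step 1 of recipe 2 can coexist, but a second step 1 for recipe 1 is rejected, with no counter column in sight.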

I propose that whenever you create a table, you take a moment to consider whether or not you actually need a counter. Most of the time you will, but not always. Get out of the habit of doing things because that's the way you've always done them, and get into the habit of doing things because you've thought about them and have made an informed decision specific to the situation at hand.

Saturday, January 30, 2010

My Drive Array

A couple of years ago I got ahold of an old ATX computer that I intended to use as a file server. Unfortunately, there were a few problems with it. The biggest problem was that the drive cage for the smaller drives was missing. Smaller problems like an underpowered power supply and limited onboard IDE adapters were fixed with things like a new 600W power supply, and extra IDE expansion cards. As it turns out, Linux had no problem with two onboard adapters and two more cards (two adapters per card). With IDE's master/slave setup, that brought me up to two DVD burners (/dev/scd0 to /dev/scd1) and six hard drives (/dev/hda to /dev/hdf). But the drive cage, that was a problem.

Fortunately, like many American bakers and pastry chefs, I spend a lot of time at the hardware store. And believe it or not, the roofing section at Lowes carries a simple solution: roofing ties. Not TILES, but TIES (no "L"). Behold, the drive array, now connected to my Thinkpad:

Yes, dear readers, the file server is dead. It seems to have developed memory issues in its old age, leading to its untimely demise. Not Alzheimer's, but some form of dementia. A close-up on the array itself:

Sadly, my Thinkpad does not have an external IDE adapter of any kind. But USB to IDE adapters are relatively cheap and easy to find. I can't access every drive at once, but that's not a big deal at the moment.

Lowes has several different sizes of roofing ties, and several different styles. I used two different sizes, both completely flat, but with rows of holes exactly the same width as a standard 3.5" internal drive. One is five holes high and one is three holes high.

If you're going to do this, you'll be buying them in sets of two each. And while you may be tempted to stack three drives in one 3-high roofing tie set, fight the urge! You need airflow between the drives, or they will overheat. I speak from experience. Limit yourself to three drives for the 5-high ties, and only use the 3-high ties for connecting sets of 5-high ties. Look back at photo #2 to see what I mean.

Speaking of heat issues, it's not a bad idea to point a fan at these if you're going to have them all on at once. It's not a big deal with one or two USB-connected drives, but with all six drives that I have in my array, I always had a fan going.

Thursday, January 28, 2010

Importing the USDA SR22 into MySQL

Some of you who were interested in my post on importing SR21 may have been waiting for this one. I know it's been a few months since SR22 came out, but I didn't have a need to import it until just now. There are changes to this new version, but they are minimal, and you may have already patched the code yourself.

Specifically, two new fields were added to the ABBREV table: Vit_D_mcg and Vit_D_IU. This brings the total column count in that table from 51 to 53. That number is the only change in the Perl file, and those two columns were the only additions to the SQL file. With those in place, I was able to import the new database without a problem. For the lazy and/or efficient, here are the new versions of the files:

import_sr22.pl
sr22.sql

Based on the response to my last post, I expect more troubleshooting questions on this post. For those who know what you're doing, you can stop reading now. Everyone else, check here before asking.

You need to have at least Perl 5.8.6 installed.

You need to have the Perl DBI installed, and DBD::mysql.

In RHEL/CentOS/Fedora, these packages should be called perl-DBI and perl-DBD-MySQL.

In Ubuntu/Debian, these packages should be called libdbi-perl and libdbd-mysql-perl.

When you download the files, make sure you save the Perl script as import_sr22.pl, not import_sr22.txt.

This script assumes you've downloaded the abbreviated file. It is a separate download from the full version, so make sure you don't miss it.

I think that covers all the questions I was asked previously. For those who are interested, The Eloquent Geek posted a non-Perl way of doing this on the last post. I haven't tried it myself, but for your reference:

For those who do not want to use the perl here is how you import the data from the command line client:
load data infile '/file_path/TABLE_NAME.txt' into table TABLE_NAME fields terminated by '^' optionally enclosed by '~' ;

Just substitute the table name per table.
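
If you're curious what that "terminated by '^' optionally enclosed by '~'" format looks like from the programming side, here is a tiny Python sketch that splits one line the same way. This is neither my Perl script nor the commenter's method; the function name and the sample line in the usage note are made up for illustration.

```python
import csv


def parse_sr_line(line):
    """Split one caret-delimited, tilde-quoted line into a list of fields."""
    reader = csv.reader([line.rstrip("\r\n")], delimiter="^", quotechar="~")
    return next(reader)
```

For example, parse_sr_line("~01001~^~BUTTER,WITH SALT~^15.87") returns the three fields with the tildes stripped, and the comma inside the description is left alone because the comma is not the delimiter here.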

Tuesday, January 26, 2010

Ice Cream Nomenclature

Rather than responding in the comments area to Hollie's comment on Ice Cream Machine, I thought I would just make a new post. Her question was, "isn't that a sorbet?" No, Hollie, that wasn't a sorbet or even a sherbet. There are a few terms tossed around for frozen, churned desserts, and there are differences. Some of it has to do with dairy content, but not all.

Sorbet: This dessert generally contains fruit puree or juice, but can contain other things, such as coffee or chocolate. The important thing here is that there is no dairy content. So Hollie, with all the half and half in my recipe, it's definitely not sorbet.

Sherbet/Sherbert: You'd think my recipe would fall under this category, because sherbet can have dairy in it. But as it turns out, sherbet only contains a small amount of dairy, < 3%. The terms sherbet and sorbet are often used interchangeably, and I guess I'm not going to stop that with my little post. But as far as I'm concerned, they are different.

Ice Cream: Once you get above 3% dairy, your dessert becomes "ice cream". This is a pretty generic term that gets tossed around, and is applied to everything from sorbet to gelato. You can have large amounts of fruit puree like I had, or you can just keep it as simple as frozen, churned, sweetened milk or cream. I have a friend that is a big fan of unflavored ice cream: not even vanilla gets in the way of the taste of milk. One day I will have to try it.

Frozen Custard/Premium Ice Cream: Once you add egg to the mix, ice cream technically becomes a frozen custard. But most people still just call it ice cream. I have yet to see a premium ice cream that is not actually a frozen custard, but I'm sure one exists somewhere. From a technical standpoint, there is a definite advantage to using egg, which will help the ice cream set up a little more easily in the churn. It also adds a nice creaminess.

Frozen Yogurt: There's not really a whole lot of difference between ice cream and frozen yogurt, other than the former using cream (or at least half and half) and the latter using yogurt. Because of the yogurt, it's often a little more tart, and generally lower in fat.

Granita: This Italian dessert is kind of like sorbet, but with much larger ice crystals. This is due to the preparation, which is more of a shaved ice technique than a churning technique. The method that I see most often involves pouring flavorful liquid (usually coffee, but sometimes fruit juice) into a cookie sheet and putting it in the freezer, scraping with a fork every couple of hours or so.

Gelato: This is an Italian variety of ice cream whose name I have often seen misused and abused in America, so let's set the record straight. One big difference is the low dairy content: almost as low as sherbet. I have seen some gelatos with no dairy at all. Gelato also has a higher sugar content than ice cream, and usually involves egg. I have also heard Italians mention some sort of mysterious stabilizer, which I have experimented with before. It's still unclear to me what it is, and if it's actually a requirement. Gelato is churned like ice cream, but has much less air incorporated into it, and is meant to be served fresh, the same day that it is made. While it's not a requirement, gelato generally has a pretty high fruit content.

I have often heard the terms "gelato" and "spumoni" used interchangeably. Let me be clear on this: spumoni is a type of gelato. Not all gelato is spumoni. Spumoni is a layered ice cream, kind of like Neapolitan ice cream in America, but containing things like fruits and nuts.

Kulfi: My favorite dessert from India, this concoction differs from ice cream largely in that it is not churned. It generally contains flavors indigenous to India, like cardamom or pistachios. Americans be warned: it would seem that India loves their desserts sweet, as is evidenced in pretty much every kulfi I have ever eaten. It's not too sweet for me, but it's close.

I hope that clears things up a little bit for some of you. Obviously I haven't hit every type of frozen dessert, but these are a few important ones.

Tuesday, January 19, 2010

I Want Proof

There is a fad that's been circulating around the Internets for years now, and I'm sure is even older than that: the idea of so-called "negative calorie foods". The basic premise is that some foods require more calories to digest than they actually provide. For instance, a food that provides 5 calories, but requires 10 calories worth of energy for your body to process it, is considered to have negative calories.

It's an interesting concept, and it would be awesome if it were true. Unfortunately, I have been unable to find any concrete proof either way. We know how to determine a food's caloric content, but I wonder if we know how to determine how much energy it takes to process it? In trying to find answers online, I found several people who claimed to be knowledgeable, but who were obvious idiots, and/or didn't bother checking their facts. For instance, on this page, I found this comment:

"Celery has almost zero calories, it's so minuscule we round down to 0."

I've never heard of anyone making this claim. If we consult the USDA Standard Reference, we discover that an 8-inch stalk of celery contains about 6 calories. This is not a minuscule amount (unless compared to a Big Mac), and I certainly wouldn't round it down to 0.

I found several other references to celery containing anywhere from 5 to 20 calories per serving (though the serving size was never stated), and guesses that eating a single serving would burn anywhere from 5 to 20 calories. Even Snopes, which I lose more and more faith in every time I read anything there, claims the "negative calorie" concept is true, but offers absolutely no evidence or proof.

Can anyone tell me definitively how many calories are burned by eating a single serving (say, 40g, the approximate weight of an 8-inch stalk) of celery? I want a number, and I want to know how that number was obtained.

My next complaint involves cooking alcohol out of food. There are plenty of people that will tell you, "don't worry, the alcohol burns out". In my experience, these are people that either think you're silly for caring, or are reassuring themselves because they want it to be true. Other people will tell you that you can never burn it all out. The most outspoken of these that I've heard is Alton Brown, followed by his good buddy Ted Allen.

Both Alton and Ted have discussed this on their shows, Good Eats and Food Detectives, respectively. Food Detectives is kind of a culinary MythBusters, but is far more scripted. They frequently perform experiments to prove or disprove myths, but in the case of the alcohol, they did a food demo that proved nothing, and then stated their "fact" as gospel.

Alton Brown has stated repeatedly that alcohol never cooks out completely, but has never offered proof. Some years ago I did some research and found a report on the USDA's website that seemed to imply that after 2 1/2 hours of oven roasting, the level of alcohol left in foods is 0% (which I'm guessing is actually < 0.5%). Unfortunately, in more recent visits, this report seems to have been removed. I have been unable to find it for years.

So, it raises the question: does alcohol really cook out, or not? Does anyone have any proof? Or can anyone at least point me to a report or study somewhere that even suggests something either way?

I'm not convinced on the negative calorie thing, or the alcohol thing. And I'm sick of people making claims with nothing to back them up. I want proof.

Saturday, January 16, 2010

No More Fast Food

You may have heard that McDonalds is now providing free wireless Internet access at their restaurants. For those of you who eat at McDonalds, this is great news! For myself, I haven't eaten McDonalds in several years. And recently, I decided to abandon fast food in general.

Not eating at McDonalds was an easy choice, back when I made it. I still remember my first Big Mac. I was 12 years old, and my mom gave me $5 and told me I could eat lunch wherever I wanted within walking distance of where we were. Finally, my big chance! I could finally try that Big Mac that I'd seen so many commercials about. Sadly, it did not live up to the hype. It was "okay", but nothing special. Since then, I have never eaten at McDonalds and thought afterwards, "wow, I'm glad I ate there!" So now I don't eat there anymore.

Since then, I've run into disappointment at every fast food establishment I've ever been to. Carl's Jr. has one (just one) item on their menu that I like (the Western Bacon), and I always feel like crap after eating it. Wendy's also has only one menu item I can stand (spicy chicken sandwich), and it's not worth the sheer incompetence that they tend to hire to sell it to me. Even Subway is on my blacklist, with their selection of styrofoam-inspired breads.

The problems with individual restaurants are just the tip of the iceberg. Fast food is famous for its unhealthiness. Granted, most restaurants now offer healthy options, but I have a hard time paying even a dollar (sometimes two or three) for a bag of apple slices, when I could just plan ahead and bring a $0.33 apple from the grocery store with me instead.

Fast food menus are not designed for healthy eating. What's interesting to me is how many aspects of this were pioneered by McDonalds. When Ray Kroc first found McDonalds, he was surprised and impressed at the manufacturing-line techniques that were used to churn out fast, cheap burgers. Decades later, McDonalds not only began selling value meals, but actually assigned numbers to them, for convenience. And let's not forget the famed "Supersize" option which, again, everyone copied. While they don't ask anymore if you'd like to supersize, I'm told you can still ask for it. The most classic example is a Big Mac with large fries and a large Coke. Tasty, no? Not for me. And just between you and me, I've never been a fan of McDonald's fries either.

I think my biggest problem with fast food has been using it as a crutch. When I don't bring lunch with me to work, fast food is there to keep me from being hungry. When I forget breakfast in the morning, there are plenty of places I can stop by on the way to work. And when I'm feeling just a little too tired to make my family a delicious and healthy meal, I can always pick up a bag-o-burgers on the way home.

Why do I find myself in these situations? I think that it's been a misguided set of priorities, coupled with poor planning. I could get up early and spend a few extra minutes making pancakes to show a little love to my family, or I could stop by McLazy's on the way into work and leave my family to fend for themselves. I could make a little extra for dinner one night so that I can have leftovers to bring into work the next day, or I could buy a bucket of chicken so as not to lose precious TV time.

I'm not saying that all restaurants are bad, of course. I'm still okay with casual dining restaurants. I will still go to diners. Fine dining, when it can be afforded, is a fine thing indeed. Even delis are okay with me, in moderation. If I could afford it, I would love to take my family out to eat once every week or two. Eating out is a treat, and a way to experience new foods and keep your palate from getting bored. But when any treat becomes habit, it starts to lose meaning and spoil us. I'm not okay with that.

Note: I've already been asked this once or twice, so I'd better say it here. As far as I'm concerned, carry-out or delivery pizza is also fast food. I love Pizza Hut, but I don't love the expense, or the idea of using it as a crutch. I can make my own pizza at home, which may take more time and planning, but which will taste 10 times better, and cost less than half as much.

Sunday, January 3, 2010

Ice Cream Machines

I got an ice cream freezer for Christmas. It represents the latest in a growing line of equipment that I've had the opportunity to make ice cream with, each progressively better.

You see, I started with a frozen core model. This includes a container which has a special solution built into it, which must be frozen for 24 hours before use, and an electric motor. This model was barely serviceable, because it never got cold enough to effectively freeze the ice cream, and it was only good for a 30-minute session. It held somewhere around a quart.

My next model was, aside from the fact that it also had an electric motor, a little more old fashioned. You actually had to add alternating layers of ice and salt, which was surprisingly more effective than the frozen core model. I could go for as long as 40 minutes before the ice got melty enough to rise in temperature again. It could also make somewhere around 3 quarts pretty effectively. The biggest drawback was the freezer full of ice that you needed to keep around. The second biggest was disposing of the salt water. The third biggest was the water that would condense on the side, and then melt into a puddle around the churn.

The first model was a joke. I would never recommend a frozen core to anyone. The second model was about the same price, and while it did have its drawbacks, it at least worked. I figured it would be the model I would use until I got rich and could afford a model with a built-in freezer.

Well, that model is what I got for Christmas. It technically holds two quarts, but I have yet to get more than a quart and a half out of it. But that's not the fault of the machine itself. The motor will run until it can't run no more, and then it will stop on its own. Since the built-in freezer won't shut off with the motor, you could probably just add ice cream mixture, turn it on and go shopping, and come back home to ice cream fully ready for consumption.

This brings up an important point. The first two models make soft-serve ice cream, which must be quickly moved into containers, and into the freezer, before it is ready to be served. While you can do that with this model, my first batch was actually hard-frozen. I wasn't used to the machine yet, and I ended up letting the motor run until the ice cream was hard.

This brings up something else important. I have discovered that, with all of my practice batches so far, I need a minimum of 45 minutes to get a good churn (and sometimes longer), something that my old freezers fell short of. I have also discovered that I no longer need to use egg yolk to get a decent freeze. Before, I always used egg-based recipes, because frozen custard is easier to churn. With one exception, I have yet to use anything egg-based in this freezer.

The one exception is eggnog. It's a little late now to do this, but it's something to keep in mind for next year. Commercial eggnog is little more than spiced, unfrozen ice cream. My favorite brand for the past few years has been Southern Comfort's Vanilla Spice Eggnog. They also have a "regular" Southern Comfort Eggnog. Both are alcohol-free (you're supposed to add the Southern Comfort yourself), and I've frozen several batches of the Vanilla Spice version, both for ourselves and for family and friends. For those of you that don't like eggnog, well, I'm guessing you don't like drinking melted ice cream either. And that's fine. It's okay to be wrong sometimes.

I will give you a recipe that I've been playing with. It's not perfect yet, but it's still pretty good. And it's totally egg-free:

Strawberry Ice Cream (beta version)

1 pint half and half
1 pound frozen strawberries
1/2 cup sugar
1 pinch salt

Combine all ingredients in a saucepan and bring to a low simmer, just long enough to dissolve the sugar and thaw the strawberries. Use an immersion blender to puree the strawberries and homogenize the mixture. Cool and refrigerate overnight before freezing, as per your ice cream freezer's instructions. Makes a little over a quart.