Sunday, November 02, 2008
Importing the USDA SR21 into MySQL
If you're planning on using the USDA Standard Reference in your own software, it helps to have it in a format that your program can actually read. If you're in Windows, this isn't an issue, since the database is available for download in MS Access format. For other platforms, an ASCII version of the database is also available.
The format of the files is pretty easy to deal with; each file is caret-delimited, and text fields are surrounded by tildes. The documentation explains pretty clearly how the tables are laid out. It's simple to put together an SQL file that defines the tables, but it's tedious. And then you have to write a script to parse the data.
I just did those things this morning, and I thought I'd share, to save you the trouble. The SQL is in MySQL format, and the parser is written in Perl. I know that not everybody likes those two things, so if anybody wants to convert my SQL to Oracle (shudder) or PostgreSQL format, I'll be happy to post it. The same goes for converting the Perl script to Ruby (/me waves to Hans) or Python (/me waves to Matt) or to some other language.
import_sr21.pl - Perl script
sr21.sql - MySQL Format





13 Comments:
Just what I was looking for. I knew somebody would have done it.
You're the man!!
12/06/2008 10:42 PM
Googled, found this, used it. Flawless and streamlined.
Thanks!
2/11/2009 9:29 AM
The perl script is giving my errors on the regex. I tried removing it and it won't connect to the db. What version of PERL did you write this for? I definitely have DBI installed, it just can't connect to MySQL. Am I missing something? Could Perl 5.8.5 be causing something?
I really appreciate the time and effort you put into this. Anyone else had this issue?
6/09/2009 9:09 PM
Psyclone: I don't know which of the regular expressions might be causing database issues. What errors are you getting?
FWIW, I'm using Perl 5.8.8.
6/09/2009 9:19 PM
[john@localhost ~]$ ./import_sr21.pl
syntax error at ./import_sr21.pl line 32, near "qw/?/ x "
syntax error at ./import_sr21.pl line 53, near "}"
Execution of ./import_sr21.pl aborted due to compilation errors.
I've got the MySQL socket issue nailed down. This is the one I can't figure out, and I'm not a perl monkey, by any means. Is there an alternative to this 'safe' version, since as you stated, injection really isn't a concern here?
6/09/2009 9:37 PM
Okay it was definitely an environment thing. I ended up setting up a mysql server on ubuntu, running your perl script on that, dumping the tables to a file, then uploading it onto my server and injecting it into my server tables.
Seriously roundabout way of getting it done, but it works. Thanks so much for taking the time to create this. You're seriously the only person on the Internet who's bothered to make the USDA db file even slightly useful.
I can send you the mysql dump if you think it would be helpful to post it.
6/10/2009 10:37 PM
Psyclone: I'm glad upgrading fixed the issue. I thought it might. If you'd like to post the USDA dump file somewhere else and then toss a link up here, I'm all for it. But even though that data is in the public domain, I'd really rather not host it here.
6/11/2009 7:26 AM
you can use sqlyog's migrate tools to move access db to mysql.It's a simple way to imple this
6/21/2009 9:12 PM
For those who do not want to use the perl here is how you import the data from the command line client:
load data infile '/file_path/TABLE_NAME.txt' into table TABLE_NAME fields terminated by '^' optionally enclosed by '~' ;
Just substitute the table name per table.
7/16/2009 5:03 PM
Would be nice to have the ruby version of that perl code. It's a nightmare for someone with no perl experience :-(
8/20/2009 11:54 AM
Tony: You're welcome to write a Ruby version. It's a pretty simple script.
8/20/2009 12:23 PM
Tony: It's only a nightmare for someone who doesn't want to do any work, or who has bought into the anti-perl bs that's running around and isn't willing to think for himself ... it's a *really* simple script.
You should be able to convert it with ease.
8/25/2009 2:59 PM
I was looking just for that ;)
Many thanks for posting this, really helpfull!
9/08/2009 4:36 PM
Post a Comment
<< Home