[TriLUG] MySQL - Perl

Shawn Hartsock hartsock at acm.org
Tue Jul 10 09:09:03 EDT 2007


On 7/9/07, Mark Freeze <mfreeze at gmail.com> wrote:
> * Delete duplicate records within the database that have the same
> batch_id, product_code, and account_value, and retain only one record.

Shouldn't batch_id, product_code, and account_value be a primary key
since that's how you identify a unique entity? Have you normalized
your database?

I notice the other suggestions in this thread are very insightful. Try
and follow those too. Specifically:

1) Help your database to do its job better: create unique/primary keys
and switch to "replace into" instead of just "insert" keys help your
database function faster and identify unique records more easily...
including duplicates.

2) Make full use of the tools you have: use prepared statements and
make sure you use bind parameters in your statements
http://cpan.uwinnipeg.ca/htdocs/DBI/DBI.html#Statement_Handle_Methods


3) Use the right tool for the job: leverage the database engine as
much as possible to do your sorting and organizing of data since
that's what the database is good at.

Your database is built to use the most efficient algorithms possible
if you use it's tools to delete duplicates and summarize accounts you
don't need to work as hard to optimize your own algorithms.

-- 
-- Shawn Hartsock -- http://hartsock.blogspot.com/



More information about the TriLUG mailing list