The Database Design Resource Center

An ETL nightmare

by Jessica

I recently spent more than a year supporting a custom ETL written with Oracle pl/sql comprised of ~100 packages per customer schema, each with varying degrees of customizations (and one set of code in source control which matched none of the live production schemas)... The code used cursor for loops to transform and load the deltas for each table, one record at a time but lacked record-level error handling, so one bad row would fail the ETL run...what's worse, it moved the data in this same fashion TWICE, once from source to staging, then staging to warehouse. From staging to the warehouse each source record was assigned a surrogate primary key, thus making source validation nearly impossible. And those are only the highlights! I'll spare you from the rest. Hands down the worst design I've ever seen!

Comments for An ETL nightmare

Average Rating starstarstarstarstar

Click here to add your own comments

Oct 20, 2017
mobdro free install NEW
by: mobdro download

download mobdro free for laptop/windows

Mar 12, 2016
starstarstarstarstar NEW
by: rokki

Very nice and helpful information has been given in this article.

Sep 15, 2015
by: Someone

I'am using MangoDB and happy as I can be!

Jun 22, 2015
Good post NEW
by: Ed Goldner

Hello! Your site is great!
How much time did you spend to write so awesome articles?
I want to suggest you to advertise my site Fruit-max on your blog, is it possible?
I really love your site and your audience, I will pay any money to publish a guest post, for example, on your site.
Please, answer me! Thank you!

Jun 08, 2015
Great article NEW
by: Aliceok

The clarification of diverse theme is basic in both men and ladies. A few individuals have learning of clarification while a few people groups don't have any thought regarding clarification. All journalists have review same theme however our method for investigating is diverse.

Dec 31, 2012
ETL -- the horror returns NEW
by: keithgreysr

Your experience sounds very similar to what I am going through right now, excepting Perl being (ab)used instead of PL/SQL.

The whole project was off-shored with no input from our internal IT staff. No ROI analysis was made nor proper resource requirement projection attempted.

No attempt was made to design any scalability for tables that reached as many as 82 million rows -- some rows with over 100 columns and averaging 1k EACH!

As in your case, data was thrashed and hashed, using brute force and massive ignorance approaches. Need I point out how munch TEMP and UNDO space was required when no intermediate COMMITs were made?

Fortunately, my direct manager supported my pushing back. I insisted on:

1) Initial loading via SQLLoader, unrecoverable;
2) Convert VARCHAR ID code made of numbers with leading zeros to NUMBER;
3) Use the ID code for PK;
4) Set up Range-based Partitioning on ID code;
5) Converting CHAR date data into Oracle DATE values;
6) Breaking transformation processing to loop through partitions instead of trying to process the whole table in one pass;
7) Changing input file delimiters from TAB to PIPE when I found over 250K lines being rejected;
8) etc, etc.

Finally got the Oracle side of the ETL running under 20 hours, and can hold in reserve using parallel processing on the transformations for later.

Now, on the MySQL Webserver (not my pain, fortunately) side however...
It took them three months to realize there were serious scalability issues; three more months to realize they could not create enough indexes to compensate. Finally, they are implementing the InfoBright engine and are dribbling in additional changes to data structures.

Jun 09, 2012
by: Anonymous

Some of us don'tknow what ETL means. Extract, Transform, Load???

Click here to add your own comments

Join in and write your own page! It's easy to do. How? Simply click here to return to Worst Database Design Experience.

Exclusive interviews with:
Steven Feuerstein, PLSQL expert
Donald Burleson, Top IT consultant

Free eBook

Subscribe to my newsletter and get my ebook on Entity Relationship Modeling Principles as a free gift:

What visitors say...

"I just stumbled accross your site looking for some normalization theory and I have to say it is fantastic.

I have been in the database field for 10+ years and I have never before come across such a useful site. Thank you for taking the time to put this site together."

Mike, USA

Read more Testimonials

Database Normalization eBook:

Database Normalization eBook

Copyright © / Alf A. Pedersen
All rights reserved.
All information contained on this website is for informational purposes only.
Disclaimer: does not warrant any company, product, service or any content contained herein.

Return to top

Copyright acknowledgement note:

The name Oracle is a trademark of Oracle Corporation.
The names MS Access/MS SQL Server are trademarks of Microsoft Corporation.
Any other names used on this website may be trademarks of their respective owners, which I fully respect.