Problem: How to load data concurrently
into a destination within the data flow task to fast load?
Solution: Balanced
Data Distributor Transformation
The Balanced Data Distributor
(BDD) transformation takes advantage of concurrent processing capability of
modern CPUs. It distributes buffers of incoming rows uniformly across outputs
on separate threads. By using separate threads for each output path, the BDD
component improves the
performance of an SSIS package on multi-core or
multi-processor machines.
Balanced data distributor
will be in useful in the following cases:
1.
Large incoming record set.
2. Reads source data faster.
3. Not so good for smaller data loads where single buffer would be enough.
2. Reads source data faster.
3. Not so good for smaller data loads where single buffer would be enough.
Example:
In this article, I am going to show we can use the balanced
data distributor transformation.
To prove better results, I am going to select a table that
has about 1million rows and load it into a brand-new table.
Step 1: Create a data flow task to insert the data into a
new table by pulling ole DB Source with a table of larger rows.
Step 2: execute to capture the numbers and see performance
measures. Clock the time to execute total task.
Step 3: Introduce Balanced data distributor into the data
flow transformation, add three destinations and map to the same table in the
destination table area.
Step 4: execute to see time taken to finish.
Results: As we can see there is a slight difference but the margin is 30% between both the time intervals and on a broader side this would give much better performance, like a hour time would cut down to 30mins or so depending on the buffers you keep and parallel executions.
0 comments:
Post a Comment