Slowly Changing Dimension in SSIS
By: Kasim Wirama, MCDBA, MVP SQL Server
Another SSIS component that is
interesting to examine is Slowly Changing Dimension. Term of SCD (slowly
changing dimension) is used to describe changing row attribute in dimension or
lookup table, such as change of product price. Changing of these row attributes
has some implications of table design, depending whether business user want to
keep its history or just override the destination the cell value.
In this case, there are 3 solutions
depending on how scenario business users would like to choose:
- do not keep historical data
it’s simple to be implemented, but users couldn’t see
historical value, because old value is overridden by new value
- keep historical data
users can see its historical value, but table will grow
bigger, and transaction table that refers to lookup table should be updated its
foreign key reference to lookup table (SCD table)
- keep historical data in one column
users can see last history value, but you will implement by
your own code because the component do not provide for this kind of solution.
Let’s take a look below sample. I
create simple table called dbo.EmployeeSalary that
keeps salaries of each employee. You can take scripts below :
create table dbo.EmployeeSalary
(
EmployeeID int not null identity(1,1) primary key,
EmployeeNumber
char(3) not null ,
EmployeeName varchar(10) not null,
SalaryPerMonth int not null,
[Address] varchar(15) not null,
SalaryStatus varchar(10) not null default ('current')
);
Now, there are some changes,
first change is to raise salary of John from 3500 to 4000, second one is to
change smiley address from garden road to Sudirman
road. Users want to keep history of salary changes but override address
changes. So for this case, you will examine how to use this component. But here
is the text file that contains data changes, named it list.txt
empnumber empname salary permonth address
001 john 4000 orchad road
002 smiley 3000 sudirman road
Create SSIS package, add Flat
file source and SCD component to Data Flow designer, connect two of them, point the Flat File source component to the text file
(list.txt), change column type of salary permonth
from string to integer. On SCD wizard,
you create destination OLEDB point to table dbo.EmployeeSalary,
and setting up address (Input column) to Address (Dimension column) as Not a
key column, similar to empname, salary per month, and
business key for empnumber because it uniquely
identifies each instance of employee from business user perspective.
Next, you specify SalaryPerMonth as HistoricalAttribute
and Address as ChangingAttribute.
On HistoricalAttributeOptions,
select SalaryStatus on Columns to indicate current
record, Current at Value When Current, and Expired at Expiration Value.
When you complete the wizard, run
the package. If the package runs successfully, you will get 3 rows at dbo.EmployeeSalary. With expired row on salary changes belonged to John and Smiley’s updated
address.
If
you see final result of the package, SCD component will expand to several
component, so it will make implementation of SCD much more easier in SSIS
world.