воскресенье, 16 января 2011 г.

Using Check Constraints to Validate Data in SQL Server

One of the key goals when building applications is to make sure the data you enter into the database meets all your business rules. Data validation is a critical part of your application to ensure your data meets the requirements developed by your business analysts. There are a number of different processes that can be used to make sure your data validates against your business rules. Data validate rules can be enforced within your application code in the application layer; or data can be validated by the database engine. Depending on your validation, rule requirement will determine where and how you will need to build your application to ensure your data is properly validated. In this article, I’m going to discuss how you can use database “check constraints” to validate your data within the SQL Server database engine.

What is a check constraint?

A check constraint is a rule that identifies acceptable column values for data in a row within a SQL Server table. Check constraints help enforce domain integrity. Domain integrity defines the valid values for columns within a database table. A check constraint can validate the domain integrity of a single column or a number of columns. You can have multiple check constraint for a single column. If data being inserted or updated violates a check constraint the database engine will not allow the INSERT or UPDATE operation to occur.
A check constraint consists of a logical expression to identify what is a valid expression. The logical expression may be a single expression like “Salary < 200000.00”, or multiple expressions, like “RentalDate > GETDATE() and RentalDate < DATEADD(YY,1,GETDATE())”. A check constraint will restrict data from being inserted or updated in a table if the logical expression of a check constraint returns a FALSE value. All rows where the logical expression equates to something other than FALSE will pass the check constraint and allow a row to be updated or inserted. All data associated with a given INSERT or UPDATE statement must not fail (return a FALSE value) for any of the check constraints in order for the row to be inserted or updated. Check constraints can be created at the column level, or at the table level.

Creating Check Constraints on a CREATE TABLE Statement

One method of creating a check constraint is to do it when a table is created. Here is a simple CREATE TABLE script that creates a single check constraint:
CREATE TABLE dbo.Payroll
     (
      ID int PRIMARY KEY, 
      PositionID INT, 
      SalaryType nvarchar(10),
      Salary decimal(9,2) 
      CHECK (Salary < 150000.00)
     );
Here I have a CHECK clause that is associated with the Salary column. This is a column level constraint. If you create a column level constraint, you can only refer to the column name in the logical expression of your check constraint. This check constraint only allows the Salary column to be less than $150,000.00. When my table is created, this CHECK constraint will be created and given a system generated constraint name (see my article “http://www.databasejournal.com/features/mssql/article.php/1570801/Beware-of-the-System-Generated-Constraint-Name.htm” to find out issues associated with system generated constraint names). If you want to name your check constraint during a CREATE TABLE operation then you could run the following code:
CREATE TABLE dbo.Payroll
     (
      ID int PRIMARY KEY, 
      PositionID INT, 
      SalaryType nvarchar(10),
      Salary decimal(9,2) 
      CONSTRAINT CK_Payroll_Salary CHECK (Salary < 150000.00)
     );
Here I have named my check constraint CK_Payroll_Salary.
Each of the above examples creates a single condition column check constraint. A check constraint expression can have multiple conditions. Here is an example that shows a check constraint that has multiple conditions:
CREATE TABLE dbo.Payroll 
     (
      ID int PRIMARY KEY, 
      PositionID INT, 
      SalaryType nvarchar(10),
      Salary decimal(9,2) 
      CONSTRAINT CK_Payroll_Salary 
          CHECK (Salary > 10.00 and Salary < 150000.00)
     );
Remember in order for SQL Server to reject a record, the final outcome of the logical expression for the check constraint needs to evaluate to FALSE. Therefore, in this example the check constraint verifies that a Salary is greater than $10.00 and less than $150,000.00. When either one of these conditions in the check constraint evaluates to FALSE a row will not be inserted, or updated in the Payroll table, and an error message will be displayed.
If you want create a table level check constraint you can run this code:
CREATE TABLE dbo.Payroll 
     (
      ID int PRIMARY KEY, 
      PositionID INT, 
      Salary decimal(9,2),
      SalaryType nvarchar(10),  
      CHECK  (Salary > 10.00 and Salary < 150000.00) 
     );
Here I have created a single table constraint that checks Salary column, but instead of associating it with the column, I associated it with the table. I could have used any of the columns within my table in the check constraint if I wanted to since it is a table check constraint, but in my case I only used the Salary column. Note this CHECK clause will cause SQL Server to generate a check constraint name because I didn’t give this constraint a name.

Creating a Check Constraint on an Existing Table

Sometimes after you have designed and created a table, you want to place a check constraint on a table. This can be done by using the ALTER TABLE statement. Here is an example that does this:
ALTER TABLE dbo.Payroll 
     WITH NOCHECK ADD CONSTRAINT CK_Payroll_SalaryType
     CHECK (SalaryType in ('Hourly','Monthly','Annual'));
Here I have created a check constraint that will verify that all records in my Payroll table have only “Hourly”, “Monthly”, or “Annual” values in the SalaryType column. I have also named my check constraint with a name, in this case “CK_Payroll_SalaryType”.
You can use a single ALTER TABLE statement to add multiple check constraints to your table at once. Here is an example that does that:
ALTER TABLE dbo.Payroll  
     WITH NOCHECK ADD CONSTRAINT CK_Payroll_SalaryType 
     CHECK (SalaryType in ('Hourly','Monthly','Annual')),
       CONSTRAINT CK_Payroll_Salary
     CHECK (Salary > 10.00 and Salary < 150000.00); 
Here I have added both the SalaryType and Salary constraints with a single ADD CONSTRAINT clause.

Creating Multiple Column Constraints

You don’t have to create constraints that only check the values of a single column. You can create constraints that check values in multiple columns at the same time. For instance, if I wanted to create a single constraint that checked both the Salary, and SalaryType constraints I created above I could use the following code:
ALTER TABLE dbo.Payroll WITH NOCHECK  
   ADD  CONSTRAINT CK_Payroll_Salary_N_SalaryType 
   CHECK  (SalaryType in ('Hourly','Monthly','Annual')
       and Salary > 10.00 and Salary < 150000.00);
This single constraint does the same thing as the above two constraints. Keep in mind when you do this it will be more difficult to understand whether it was the SalaryType, Salary, or both columns that violated your check constraint.
Another twist on the prior example is to use the value in more than one column to determine whether a specific column value is valid. For instance, say I want to ensure that when I enter an “Hourly” SalaryType, I want the Salary to be less than $100.00 or when “Monthly” SalaryType is entered the Salary is not over $10,000, and when an “Annual” SalaryType is enter then any Salary amount is fine. To implement this constraint I could use the following ADD CONSTRAINT clause:
ALTER TABLE dbo.Payroll WITH NOCHECK  
   ADD  CONSTRAINT CK_Payroll_SalaryType_Based_On_Salary
   CHECK  ((SalaryType = 'Hourly' and Salary < 100.00) or
           (SalaryType = 'Monthly' and Salary < 10000.00) or
           (SalaryType = 'Annual'));
Here I have used multiple column conditions together and separated them with an “or” condition so my check constraint could validate the Salary amount for each of the different SalaryType’s.

Understanding What Happens with Null values

Remember back in the “What is a Check Constraint” section of this article I talked about how records only fail a check constraint when the condition of the check constraint evaluates to FALSE. Because of this fact, NULL values on columns may allow you to enter data into your database that might not meet your requirements.
As an example say I only have my CK_Paryroll_SalaryType check constraint on my payroll Table. Just to refresh your memory here is that check constraint:
ALTER TABLE dbo.Payroll 
     WITH NOCHECK ADD CONSTRAINT CK_Payroll_SalaryType
     CHECK (SalaryType in ('Hourly','Monthly','Annual'));
Now say you run the following INSERT statements:
INSERT INTO dbo.Payroll values (1, 1, 'Hourly',25.00);
 INSERT INTO dbo.Payroll values (2, 2, NULL, 25.00);
 INSERT INTO dbo.Payroll values (3, 3, 'Horly',25.00);
What do you think will happen? Will only the first INSERT statement work? What about the second and third INSERT statement? Will they both violate the CK_Payroll_SalaryType? Turns out only the third INSERT statement will fail. It fails because the SalaryType has been typed wrong, and is not “Hourly”, “Monthly”, or “Annual”. Why did the second INSERT not equate to false? Clearly, “NULL” is not in the valid list of SalaryTypes. The reason the second INSERT statement worked is that the CK_Payroll_SalaryType constraint did not equate to FALSE when the second INSERT statement was run. Because of this, the database engine inserted the record. So why did this happen? This happened because when NULL is used in a comparison operation it equates to UNKNOWN. Since UNKNOWN is not FALSE there is no violation of the check constraint. Therefore, you need to be careful when you write your check constraints where you want to reject values that contain NULLS. Another way to code the above constraint so a NULL value is rejected for SalaryType is to write your check constraint like this:
ALTER TABLE dbo.Payroll 
     WITH NOCHECK ADD CONSTRAINT CK_Payroll_SalaryType
     CHECK ((SalaryType in ('Hourly','Monthly','Annual'))
             and SalaryType is not NULL);
Another alternative is to make the SalaryType a NOT NULL field. When you do this you will not get a check constraint violation, but instead you will get an error that indicates you cannot insert a NULL value into your table.

Data Validation through Check Constraints

By using check constraints, you can make sure your database only contains data that passes your constraints. This allows you to let the database engine control your data validation. Doing this will make it so your application will not need to have data validation rules spread throughout your code in every location where you try to insert a record into or update a record in a table. Having check constraints is a clean way of performing data validation.

Комментариев нет: