Inequality Join Predicate- Nested Loop

I was reading about Joins in Kalen Delaney’s Microsoft SQL Server Internals 2012 and came across a point “a nested loops join is the only join algorithm that SQL Server can use without atleast one equijoin predicate”. This means that regardless of the dataset size or indexes, if no equality predicate exists in join condition, the optimizer will choose only ‘Nested Loop’ physical operator. Below test case verifies the same:

Create Table #Test1(sno int identity(1,1) primary key,sname varchar(6) default 'table1')
Create Table #Test2(sno int identity(5000,1) primary key,sname varchar(6) default 'table1')
GO
Insert into #Test1 Values (default)
Go 1000
Insert into #Test2 Values (default)
Go 1000
/*If no equality predicate is present in the join condition, then nested loop join is the only physical operator optimizer can use*/
Select A.sname,B.sname from #Test1 A Inner Join #Test2 B on A.sno<>B.sno and A.sname<>b.sname

Drop table #Test1
Drop table #Test2 

Below is the execution plan:
ExecutionPlan_NestedLoop

Advertisement

Simple Column Level Encryption

Column Level Encryption can be achieved in many ways and one of the simplest ways to accomplish is by encrypting the column with a pass phrase. This is very easy way to implement as encryption and decryption can be done with the same key(symmetric).

Create Table ColumnEncrypt(SEncrypt varbinary(max))
--Encrypt the column
Insert into ColumnEncrypt(SEncrypt)
select ENCRYPTBYPASSPHRASE('My$tr@^gP@$$W0R%','This is confidental')

select * from ColumnEncrypt
--Decrypt the data
Select convert(varchar,DECRYPTBYPASSPHRASE('My$tr@^gP@$$W0R%',SEncrypt)) as [DecryptedData] from ColumnEncrypt

The Encrypted and Decrypted SQL statement is not seen in the profiler or extended event, so, other people cannot track pass phrase.

Is the Index hurting or helping?

The concept of index is to help retrieve data faster and hence to improve query performance. But, it is not always the case, as poorly designed index could have negative performance impact. On a high level, Non Clustered Index is like a sub table formatted in B Tree with columns defined in the index definition. Every insert,delete,update on the base table will also have to do the same operation on the non clustered index for those columns referred in the index definition. So, all this means additional work for SQL server as well as more storage. Hence, it is important to make sure that indexes are being more valuable compared to what they are costing.

Below SQL Code gives a general idea on how useful the indexes are. I am not checking for clustered index, since it exists on the base table and probably almost all tables need a clustered index.

Select DB_Name() as [DatabaseName], Schema_Name(Schema_id)+'.'+Object_Name(A.Object_ID) as [TableName], A.Name as [IndexName],A.type_desc as [IndexType],IndexSize_MB,User_updates as [IndexCostCount], user_seeks+user_scans+user_lookups as [IndexUsageCount],Last_User_Update as Last_Cost_Operation,
(Select Max(v) from (Values (last_user_lookup),(last_user_scan),(last_user_seek)) As Value(v)) as [Last_Useful_Operation]
From sys.indexes A LEFT OUTER JOIN
sys.dm_db_index_usage_stats B on A.index_id=B.index_id and A.object_id=B.object_id and B.database_ID=db_ID()
INNER JOIN sys.tables C on C.object_id=a.object_id and A.type_desc not in ('Heap','CLUSTERED')
INNER JOIN (Select (SUM(used_page_count) * 8/1024) AS IndexSize_MB,index_id,Object_id FROM sys.dm_db_partition_stats GROUP BY object_id,index_id) D ON
D.index_id=A.index_id AND D.object_id=A.object_id
order by IndexSize_MB

I am summing the user seeks,scans and lookups as to get one total value for index for usage.

Higher the IndexUsageCount compared to IndexCostCount,it generally implies that the index is useful. Of course, whether the usage and operation(seek\scan\lookup) of a particular index in a query is optimal or not cannot be decided without investigating the actual execution plan.

This information is not persisted after restart of SQL Server or database is offline.

Parameter Sniffing

Parameter Sniffing is a typical problem when using stored procedures which could result in bad query plans. So, when the procedure is executed for the first time, a query plan is created and stored in procedure cache for reuse. The initial plan generated will be in perspective of parameter passed in the first execution and this plan is saved in the procedure cache and reused for subsequent procedure calls until the plan is invalidated by the update statistics or when procedure cache is cleared or code recompile happens.

So,once the plan is created, it could be reused even for different parameters which is where, it can cause problems because there is no guarantee that the plan generated for first parameter is also good plan for the subsequent parameters. This is called as Parameter Sniffing.

--Below is the scenario for Parameter Sniffing.
Create Procedure usp_PersonNamePhone(@Id int)
as
Begin
Select A.FirstName,A.Lastname, B.PhoneNumber
from [Person].[Person] A
INNER JOIN [Person].[PersonPhone] B on A.BusinessEntityID=B.BusinessEntityID
where A.BusinessEntityID<@Id
End

--You can see same plan being used in all three procedure calls.
usp_PersonNamePhone @Id= 2
usp_PersonNamePhone @Id= 200
usp_PersonNamePhone @Id= 20000

There are several ways parameter sniffing problem can be addressed and below some of the commonly used methods.Keep in mind that each method has it pros and cons.

1.Recompile every time the procedure is called and hence, generates query plan as per the input parameter.
Create Procedure usp_PersonNamePhone_V1(@Id int) WITH RECOMPILE
as
Begin
Select A.FirstName,A.Lastname, B.PhoneNumber
from [Person].[Person] A
INNER JOIN [Person].[PersonPhone] B on A.BusinessEntityID=B.BusinessEntityID
where A.BusinessEntityID<@Id
End

--Different Query plans is generated based on Input.
usp_PersonNamePhone_V1 @Id= 2
usp_PersonNamePhone_V1 @Id= 200
usp_PersonNamePhone_V1 @Id= 20000

2.Use Statement level recompile. This is better option, if the procedure has several statements and only some statements are having issues.
Create Procedure usp_PersonNamePhone_V2(@Id int)
as
Begin
Select A.FirstName,A.Lastname, B.PhoneNumber
from [Person].[Person] A
INNER JOIN [Person].[PersonPhone] B on A.BusinessEntityID=B.BusinessEntityID
where A.BusinessEntityID<@Id Option (Recompile)
End

--Different Query plans is generated based on Input.
usp_PersonNamePhone_V2 @Id= 2
usp_PersonNamePhone_V2 @Id= 200
usp_PersonNamePhone_V2 @Id= 20000

3.Optimize for unknown value by assigning a parameter value to a variable.
Create Procedure usp_PersonNamePhone_V3(@Id int)
as
Begin
declare @EntityID int
Set @EntityID=@Id
Select A.FirstName,A.Lastname, B.PhoneNumber
from [Person].[Person] A
INNER JOIN [Person].[PersonPhone] B on A.BusinessEntityID=B.BusinessEntityID
where A.BusinessEntityID<@EntityID
End

--Generalized Query plan.
usp_PersonNamePhone_V3 @Id= 2
usp_PersonNamePhone_V3 @Id= 200
usp_PersonNamePhone_V3 @Id= 20000

4.Creates plan for the specified 'optimize for' value.
Create Procedure usp_PersonNamePhone_V4(@Id int)
as
Begin
Select A.FirstName,A.Lastname, B.PhoneNumber
from [Person].[Person] A
INNER JOIN [Person].[PersonPhone] B on A.BusinessEntityID=B.BusinessEntityID
where A.BusinessEntityID<@Id Option (Optimize for (@Id=200))
End

--Query plan is created for @id=200.
usp_PersonNamePhone_V4 @Id= 2
usp_PersonNamePhone_V4 @Id= 200
usp_PersonNamePhone_V4 @Id= 20000

5.Creates plan for a Unknown value. The unknown value is calculated based on statistical data.More on this here.
Create Procedure usp_PersonNamePhone_V5(@Id int)
as
Begin
Select A.FirstName,A.Lastname, B.PhoneNumber
from [Person].[Person] A
INNER JOIN [Person].[PersonPhone] B on A.BusinessEntityID=B.BusinessEntityID
where A.BusinessEntityID<@Id Option (Optimize for unknown)
End

--Query plan is created for a unknown value.
usp_PersonNamePhone_V5 @Id= 2
usp_PersonNamePhone_V5 @Id= 200
usp_PersonNamePhone_V5 @Id= 20000

Unicode and Non-Unicode Data

SQL Server supports unicode and non-unicode characters and hence supports multiple languages. Varchar,Char,Text datatypes support non-unicode data and Nvarchar,Nchar,Ntext datatypes support unicode data. Let’s see a simple unicode and non-unicode data demo:

北京 – stands for Beijing in Chinese. The database collation is ‘SQL_Latin1_General_CP1_CI_AS’, so, we know Chinese characters are not directly supported.We can set collation on the table but I choose not to for the demo so as to see how special characters error out when stored in non-unicode column.
create table foobarDemo1(scity varchar(20))
create table foobarDemo2(scity nvarchar(20))
Insert into foobarDemo1(scity)
select '北京'
Insert into foobarDemo2(scity)
select N'北京'

--Data not stored Correctly.
select * from foobarDemo1
--Data stored correctly because of unicode column datatype
select * from foobarDemo2

Unicode data takes twice the storage as non-unicode data, considering this, it is important to define right datatype for the application. This disadvantage on the storage side can take performance hit as more data pages have to be read to fetch the data. Below demo shows how unicode and non-unicode data is stored on the data page.
Create Database TestDb
Go
use TestDb
go
Create Table foobarDemo1(sname varchar(8000))
Create Table foobarDemo2(sname nvarchar(4000))
GO
Insert into foobarDemo1(sname)
Select Replicate('Clippers',1000)
Insert into foobarDemo2(sname)
Select Replicate('Clippers',500)

--Enable trace flag to read the data
DBCC Traceon(3604)
--query to find which data page has the data row(works in sql2012 and above)
Select object_name(object_id) as TableName,allocated_page_page_id
from sys.dm_db_database_page_allocations
(DB_ID('TestDB'),Object_ID('foobarDemo1'),NULL,NULL,'Limited')
where is_iam_page=0

--shows how the non-unicode data is stored in the page
Dbcc page('testdb',1,282,2)
VarcharDataPage--query to find which data page has the unicode data(works in sql2012 and above)
Select object_name(object_id) as TableName,allocated_page_page_id
from sys.dm_db_database_page_allocations
(DB_ID('TestDB'),Object_ID('foobarDemo2'),NULL,NULL,'Limited')
where is_iam_page=0

--shows how the unicode data is stored in the page;you can see placeholder for special character
Dbcc page('testdb',1,284,2)
NVarcharDataPage

Starting from SQL 2008R2, SQL Server has Unicode Compression which basically compresses non-unicode data stored in unicode column as non-unicode. More on this in this MSDN Link.

TSQL code to view Job Activity

SQL Server Agent is a very key component of SQL Server and typically used to schedule jobs,alerts etc. Whenever we need to restart SQL Server agent, it is good practice to make sure there are no running jobs. We can check this using job activity monitor or using the TSQL code below.

I prefer the TSQL code as it is easier and also can run it on central management server(CMS) to get the list of running jobs across multiple servers.

--currently running jobs in job activity monitor(SQL 2005 and above)
Select @@SERVERNAME as [DBServer],C.Name as [JobName], run_requested_date,start_execution_date,stop_execution_date
from (Select max(Session_id) as Session_Id from msdb.dbo.syssessions) A
INNER JOIN msdb.dbo.sysjobactivity B on A.Session_id=B.Session_ID
INNER JOIN msdb.dbo.sysjobs C on B.job_id=C.Job_ID
where B.stop_execution_date is null AND B.run_requested_date is not null

Deleting Old Backup Information from MSDB

MSDB is a system database which stores information on backup history,sql jobs,alerts, database email, job history …etc.With regards to backups,It stores all the backup information unless you have some delete routine to delete the information from the backup tables. Generally, there will be a retention policy which defines how long old backup files are to be retained. This can be anywhere between couple of weeks to years depending on the requirement. Once you determine your retention requirement,you might not need backup information in MSDB beyond your retention policy, as you do not have the files, this information does not really help you. Also, this will keep MSDB database size in check and will also result in smaller MSDB backup file.

So, to delete this information, there are bunch of backup tables where the information is stored. The following pic shows the corresponding tables that store backup information.
MSDB Backup Diagram

Now, Create the below procedure that deletes the data from the above tables.

If Exists(Select 1 from sys.procedures where name='usp_DeleteBackupInformation')
Begin
Drop Procedure usp_DeleteBackupInformation
End
GO
Create Procedure usp_DeleteBackupInformation(@Purgedate datetime)
as
Begin
Set nocount on
Begin Try
Begin Tran

--deletes backupfile info
Delete A from msdb.dbo.backupfile A INNER JOIN msdb.dbo.backupset B
on A.backup_set_id=B.backup_set_id Where backup_finish_date < @PurgeDate

--deletes backupfilegroup info
Delete A from msdb.dbo.backupfilegroup A INNER JOIN msdb.dbo.backupset B
on A.backup_set_id=B.backup_set_id Where backup_finish_date < @PurgeDate

--deletes restorefile info
Delete A from msdb.dbo.restorefile A INNER JOIN msdb.dbo.restorehistory B
on A.restore_history_id=b.restore_history_id INNER JOIN msdb.dbo.backupset C
on C.backup_set_id=B.backup_set_id Where backup_finish_date < @PurgeDate

--deletes restorefilegroup info
Delete A from msdb.dbo.restorefilegroup A INNER JOIN msdb.dbo.restorehistory B
on A.restore_history_id=b.restore_history_id INNER JOIN msdb.dbo.backupset C
on C.backup_set_id=B.backup_set_id Where backup_finish_date < @PurgeDate

--deletes restorehistory info
Delete A from msdb.dbo.restorehistory A INNER JOIN msdb.dbo.backupset B
on A.backup_set_id=B.backup_set_id Where backup_finish_date < @PurgeDate

--delete backupset info
Delete from msdb.dbo.backupset where backup_finish_date < @PurgeDate

--deletes backupmedia info
Delete A from msdb.dbo.backupmediafamily A where media_set_id not in
(Select distinct media_set_id from msdb.dbo.backupset)

--deletes backupmediaset info
Delete A from msdb.dbo.backupmediaset A where media_set_id not in
(Select distinct media_set_id from msdb.dbo.backupset)

Commit
End Try
Begin Catch
Select ERROR_NUMBER as ErrorNumber, ERROR_MESSAGE() as ErrorMessage,
ERROR_SEVERITY() as ErrorSeverity
Rollback
End Catch
End

Example : usp_DeleteBackupInformation '20070101'
In the above example, it will delete all the backup information prior to Jan 1st 2007.

Database Restore Report

It is very typical to frequently have a copy of production database restored on to DEV\QA\TEST Instances, so as to work with more real time data. The below query helps in figuring out details such as when the database was restored,source server of restored database, user who restored it, which backup file was used to restore… etc.

Also, this query is useful to generate a report of all backup files that were restored along with other details while testing your backups by physically restoring them.

NOTE: At this point, the script works correctly if your backup types are Full, Log, Differential.Also, it is a good practice to delete unwanted backup history from MSDB database. So, if the backup\restore history data got purged, we might not see results from some databases.

Select A.destination_database_name as [Restored_Database_Name],
A.user_name as [RestoredBy_User_Name],
A.Restore_Date,A.replace as [Replace],
Case When A.restore_type='D' then 'Full'
When A.restore_type='I' then 'Differential'
When A.restore_type='L' then 'Log' End as [Restore_Type],
B.database_name as [Backup_DatabaseName],
B.database_creation_date as [Backup_DB_Creation_Date],
B.server_name as [Backup_ServerName],
B.user_name as [Backuped_by_User],
B.backup_start_date as [Backup_DB_Start_Date],
B.backup_finish_date as [Backup_DB_Finish_Date],
Case When B.[Type]='D' then 'Full'
When B.[Type]='I' then 'Differential'
When B.[Type]='L' then 'Log' End as [Restore_Type],
D.physical_device_name as [BackupFile],Position as [Backup_File_Position],
D.family_sequence_number as [Split_Backup_Sequence]
from msdb.dbo.restorehistory A
INNER JOIN msdb.dbo.backupset B on A.backup_set_id=B.backup_set_id
INNER JOIN msdb.dbo.backupmediafamily D on D.media_set_id=B.media_set_id
INNER JOIN (Select Max(restore_history_id) as restore_history_id ,
Destination_Database_Name
from msdb.dbo.RestoreHistory
where Restore_Type='D'
Group by destination_database_name) C on C.destination_database_name=A.destination_database_name
and A.restore_history_id >=C.restore_history_id

Restoring MSDB Database

One of the primary reasons to take backups is to protect from accidentally data deletion and data corruption.

But, what happens when a SQL Server job is accidentally deleted? MSDB database contains all the information about backups,SQL Server Agent Jobs,Alerts,Operators..etc. So, the deleted job can be recovered by restoring MSDB database.

To restore the MSDB database, the server does not need to be in Single User Mode.To restore Master database, the server needs to be in single user mode and I explained it in this blog.

So, to restore MSDB, make sure you can get exclusive access on the database.Remember, MSDB is a system DB and is inherently used for many purposes such as maintaining job history,job step logs,notifications,database mail, alerts..etc. So, it may be better to stop SQL Server Agent while restoring MSDB database.To restore MSDB database:

Use Master
GO
Restore Database MSDB from disk='E:\SQL_Backup\Msdb.bak' with replace

One other alternative is to set the MSDB database in read only mode(so as, no new data is written but applications can continue to read data) and restore MSDB database as user database and then stop the SQL Server Service account and copy data and log files to where MSDB files are located and rename them as MSDB files and restart the SQL Service. This might not be possible in all cases as it involves restarting the SQL service.

Snapshot Isolation

To run transactions in snapshot isolation mode, first, enable snapshot isolation level on the database and then set the isolation level on the transaction. To enable option snapshot isolation on the database:
Alter Database <<DatabaseName>>
Set allow_snapshot_isolation on

IF Exists(Select 1 from sys.objects where name='foobar' and type='U')
Begin
Drop Table foobar
End
Create Table foobar (sno int,sname varchar(20))
Insert into foobar Values (1,'Spurs'),(2,'Rockets'),(3,'Pelicans')

--Session 1
Begin Tran
Update foobar set sname='Lakers' where sno=2
waitfor delay '00:00:05'
commit

/*Retrieves only committed data from the table.Writers do not block readers.So, as long as the txn in session 1 is committed before the select statement in session 2 runs,the updated value is returned. If the session 1 txn is not committed by the time select in session 2 runs,it will return the initial values. But data read will always be consistent inside the transaction,even if it is changed another transaction.*/
--Session 2
Set Transaction Isolation Level Snapshot
Begin Tran
Select * from foobar
waitfor delay '00:00:05'
Select * from foobar
Commit

/*CON:Update conflicts can occur when the data that the current transaction is trying to update was modified by another transaction*/
--Session 1
Set Transaction Isolation Level Snapshot
Begin Tran
Select * from foobar
waitfor delay '00:00:05'

--Data read in the same transaction will always be consistent.
select * from foobar
--update conflict occurs,since the data was changed by another transaction.
Update foobar set sname='Heat' where sno=2
select * from foobar
Commit

--Session 2
Begin Tran
Update foobar set sname='Hawks' where sno=2
commit