Index Selectivity – Left Based

There are several types of indexes available but I will be referring to only clustered and non clustered indexes.

Clustered Index is a type of index where the logical and physical order of the table records are in the same order.

Non Clustered Index is a type of index where the logical and physical order of the table records are NOT in the same order. This index typically consists of specific columns defined in the index creation unlike clustered index which applies to the entire base table.

You can refer to BOL link here

There can only be one clustered index and up to 249 non clustered indexes on the table. All non clustered indexes will also include either cluster index key or ROWID depending on whether the table is clustered or heap that act as pointer back to base table. This pointer helps in creating a “look up” if additional columns are need to be fetched from the base table by the query.

The basic purpose of Indexes is to improve the performance of SQL Query and hence the design of indexes is very important as selectivity of index depends on how well the index and query logic is defined.

Query optimizer uses indexes and performs either scan or seek operation on them to fetch the data.Generally, seek is better for fetching single rows and small subset of rows,scan is better for pulling large set of rows.

While it is possible to tell the query optimizer to either use seek or scan using query hints, it is generally best practice to leave it up to the query optimizer and better investigate further to see why the optimizer is using not-so optimal plan(parameter sniffing,statistics update etc). It is also important to look the index definitions and the way queries are begin written.

I am using Adventure Works 2012 for this example,and you can download it from here.In AdventureWorks2012 database,there is a table called Person.Person and it has non clustered non unique index on LastName,FirstName,MiddleName.

Consider the below example:

Select * from Person.Person where FirstName='Gail' and MiddleName='A'
This performs a non clustered index scan but since the query returns only very small subset of data, an index seek would be an ideal operation here. The reason to do index scan is because the index is defined over LastName,FirstName and MiddleName but the query predicate is filtering only on FirstName and MiddleName. The Execution plan looks like this:

NC_IndexScan

Select * from Person.Person where LastName='Erickson' and MiddleName='A' This leads to a non clustered index seek which is an optimal plan.

NC_IndexSeek

So,the take away from this blog is if the left most index columns are used in the query predicate, the query optimizer comes up with an optimal plan. If the left the most column is not used in the query predicate, it will always lead to index scan whether it is optimal or not. That’s why it is important to have right indexes defined so the query optimizer can make best use of the index.

Advertisement

Optimize for Ad Hoc WorkLoads

“Optimize for ad hoc workloads” is a server level setting introduced in SQL Server 2008. The main reason to have this option is to prevent “plan cache bloating” problem due to one time used ad hoc queries.

For any query that runs on SQL Server, query optimizer creates a plan. It first checks if there is an existing plan in the cache,if so, it will use that plan else will create a new plan. The generation of plan consumes CPU as it has to compile the code and come up with best cost based query plan.The generated query plan will then be stored in the plan cache which takes some memory. Now, when the server gets lot of these ad hoc queries,it will take significant part of the server memory. This particularly becomes disadvantage when these queries are not necessary used more than once (referred as ‘plan cache bloating’).

So, when ‘optimize for ad hoc workloads’ is set to 1,it will store plan stub instead of query plan. This plan stub is significantly smaller in size compared to query plan. This is referred as  ‘Complied plan Stub’ cacheobjtype in sys.dm_exec_cached plan. But when the same query is run again(second run), it will store the actual query plan.  This means that the code has to be compiled again on the second run – which could be a disadvantage but generally is acceptable in many situations.

Ad hoc queries are the queries that are run without using parameters. Depending on the nature of the ad hoc query,even in simple parameterization,SQL Server can either parameterise the query that is deemed as “Safe” or will pass the literal values to the query optimizer,if the query is not deemed as safe by the optimizer.

Configuring “Optimize for ad hoc workloads” option

sp_configure 'show advanced options',1
GO
Reconfigure with Override

sp_configure 'optimize for ad hoc workloads',1
GO
Reconfigure With Override

If exists (Select 1 from sys.tables where name='student')
Begin
Drop Table Student
End
Create Table Student(sno int identity(1,1) Primary Key,Slname varchar(20))
Go
Insert into Student(slname)
Values('Herendaz'),('Schulz'),('McKnight'),('Lavenger'),('Smith')

–even though this is ad hoc query,query optimizer considers this as “safe” and parameterize the query
Select * from [dbo].[Student] where sno=5
If you check the execution plan, you would see something like this :
SimpleQuery_Parameterized

For simple queries like this, ‘optimize for ad hoc work loads’ setting does not have any impact and it will store the actual query plan as the query optimizer considers them as parameterized query. By default,SQL Server will try to parameterize the queries that it considers to be “safe”, even in simple parameterization.

The below query gives you an idea on how the query plan is stored in the plan cache. In this example, it is considered as ‘Prepared’ and not ‘ad hoc’.

Select usecounts,cacheobjtype,objtype,size_in_bytes,text
from sys.dm_exec_cached_plans a Cross Apply sys.dm_exec_sql_text(a.plan_handle)
Left Outer Join sys.dm_exec_query_stats b on a.plan_handle=b.plan_handle

SimpleQuery_Parameterized#2

For queries that SQL Server implicitly will not parameterize,it will store ‘complied plan stub’ for the first run and for the second run, it will store the actual query plan.
–query optimizer does not consider this as “safe” and hence does not parameterize this query
Select * from [dbo].[Student] where sno=4 and isnull(slname,'Unknown')='Lavenger'

The execution plan looks like this:
SimpleQuery_NotParameterized#2
Also, using the below query, you can find how the query plan is saved as.
Select usecounts,cacheobjtype,objtype,size_in_bytes,text
from sys.dm_exec_cached_plans a Cross Apply sys.dm_exec_sql_text(a.plan_handle)
Left Outer Join sys.dm_exec_query_stats b on a.plan_handle=b.plan_handle

For First Run:
SimpleQuery_Parameterized#2
For Second Run:
ComplexQuery_NotParameterized#3
You can see the difference in cacheobjtype and size for both first and second runs. The use count for second run shows as 1,because the query plan is recreated.

The database level setting on “parameterization” will impact the decision on whether it would parameterise only the ad hoc queries that it considers are “safe”(simple parameterization) or will it parameterize all the ad hoc queries (Forced Parameterization).

If forced parameterization is set for the database, then the main ad hoc query is parameterised but all the shell executions will have ‘complied plan stub’ for first execution.

I do not think there is an option in SQL server that tells on what run to store the query plan. When ‘optimize for ad hoc workload’ is enabled, SQL server, by default, stores the compiled plan for the second run. I wonder what would be the reason in not making this as user specified option.

Rollback does not reset/reuse the identity property / sequence value.

/*USING Identity property*/
If Exists(Select 1 from sys.tables where name='SIdent')
Begin
Drop Table SIdent
End
Create Table SIdent(sno int identity(1,10) Primary Key,Sname varchar(20))
GO

Begin Transaction
Insert into SIdent(sname) Values ('Wizards')
Rollback
GO
Begin Transaction
Insert into SIdent(sname) Values ('Wizards')
Commit
Go
-- first value is 11 and not 1 because rollback does not reset the Identity value.
Select min(sno) as [IdentityValue] from SIdent
Go

IdentityValue
/*USING SEQUENCE – SEQUENCE is a new TSQL functionality available from SQL Server 2012*/

If Exists(Select 1 from sys.tables where name='STest')
Begin
Drop Table STest
End
If Exists(Select 1 from sys.sequences where name='SeqVal')
Begin
Drop Sequence SeqVal
End
Create table STest(sno int primary key,sname varchar(20))
Create Sequence SeqVal as int
Start with 1
Increment by 10
Minvalue 1
Maxvalue 100
No Cycle

Begin Transaction
Insert into STest(sno,sname)
Select next value for SeqVal,'Cowboys'
RollBack
Begin Transaction
Insert into STest(sno,sname)
Select next value for SeqVal,'Steelers'
Commit
-- first value is 11 and not 1 because rollback does not reset the Sequence value.
Select min(sno) as [SequenceValue] from Stest

Sequence

FileGroup Backup – Simple Recovery

–Create a Test Database; Add a Secondary FG and add file to the FG.
Create Database FooBarDB
GO
Alter Database FooBarDB Set Recovery Simple
Alter Database FooBarDB Add FileGroup Secondary
GO
ALTER DATABASE [FooBarDB] ADD FILE
( NAME = N'FooBarDB2', FILENAME = N'E:\DATA\FooBarDB2.ndf' ,
SIZE = 3072KB , FILEGROWTH = 1024KB ) TO FILEGROUP [Secondary]
GO

–Create a table on the Secondary FG;Insert some data and Switch it Read only mode.
use FooBarDB
Go
Create Table Stest(sno int,Sname varchar(20)) on [Secondary]
Insert into STest(sno,sname) Values(1,'Filegroup')
GO
Use Master
ALTER DATABASE [FooBarDB] MODIFY FILEGROUP [Secondary] READONLY

–Backup Secondary FG.Since it is Read only FileGroup, it can be backed up and restored by itself unlike Read_Write FileGroups.
Backup Database FooBarDB FILEGROUP = N'Secondary' to Disk ='E:\DATA\FooBarDB_Secondary_FG.bak'with init
–Create table in Primary FG and In simple Recovery, all the Read_Write_FileGroups have to be backed up together.
use FooBarDB
Go
Create Table StestPrimary(sno int,Sname varchar(20)) on [Primary]
Insert into StestPrimary(sno,sname) Values(1,'FGPrime')

–In Simple Recovery, While taking FileGroup backup, all Read Write FG should be backed up.Individual Read_write Filegroups cannot be backed up.
Use Master
GO
Backup Database FooBarDB Read_Write_FileGroups to Disk ='E:\DATA\FooBarDB_RWFG_FG.bak' with init

—Restore the Read Write FileGroups Backup. Read Write Filegroups needs to be restored together in Simple Recovery.

Restore Database FooBarDB_RS From Disk='E:\DATA\FooBarDB_RWFG_FG.bak' WITH FILE = 1,
MOVE N'FooBarDB' TO N'E:\DATA\FooBarDB_RS.mdf',
MOVE N'FooBarDB_log' TO N'E:\DATA\FooBarDB_RS_0.LDF',
MOVE N'FooBarDB2' TO N'E:\DATA\FooBarDB_RS_1.ndf',recovery

–can access data in tables present on Primary FG.
select * from FooBarDB_RS.dbo.StestPrimary
–cannot access data in tables present on Secondary FG as it has not been yet restored.
select * from FooBarDB_RS.dbo.Stest
–Restore the Secondary FileGroup
Restore Database FooBarDB_RS From Disk='E:\DATA\FooBarDB_Secondary_FG.bak'

–can access data in tables present on Secondary FG now.
select * from FooBarDB_RS.dbo.Stest

FileGroup Backup – Full Recovery

–Create a Test Database; Add a Secondary FG and add file to the FG.
Create Database FooBarDB
GO
Alter Database FooBarDB Set Recovery Full
Alter Database FooBarDB Add FileGroup Secondary
GO
ALTER DATABASE [FooBarDB] ADD FILE
( NAME = N'FooBarDB2', FILENAME = N'E:\DATA\FooBarDB2.ndf' ,
SIZE = 3072KB , FILEGROWTH = 1024KB ) TO FILEGROUP [Secondary]
GO

–Create a table on the Secondary FG;Insert some data and Switch it Read only mode.
use FooBarDB
Go
Create Table Stest(sno int,Sname varchar(20)) on [Secondary]
Insert into STest(sno,sname) Values(1,’Filegroup’)
GO
Use Master
ALTER DATABASE [FooBarDB] MODIFY FILEGROUP [Secondary] READONLY

–Backup Secondary FG
Backup Database FooBarDB FILEGROUP = N’Secondary’ to Disk =’E:\DATA\FooBarDB_Secondary_FG.bak’
–Create table in Primary FG and Backup Primary FG and Log. Log Backup needed for FG level restoration in Full recovery.
use FooBarDB
Go
Create Table StestPrimary(sno int,Sname varchar(20)) on [Primary]
Insert into StestPrimary(sno,sname) Values(1,’FGPrime’)

Use Master
GO
Backup Database FooBarDB FILEGROUP = N’PRIMARY’ to Disk =’E:\DATA\FooBarDB_Primary_FG2.bak’
Backup Log FooBarDB to Disk =’E:\DATA\FooBarDB_log.bak’

–Restore the Primary FG. Keyword “PARTIAL” makes the data in the Primary FG accessible.

Restore Database FooBarDB_RS From Disk=’E:\DATA\FooBarDB_Primary_FG2.bak’ WITH FILE = 1,
MOVE N’FooBarDB’ TO N’E:\DATA\FooBarDB_RS.mdf’,
MOVE N’FooBarDB_log’ TO N’E:\DATA\FooBarDB_RS_0.LDF’,
MOVE N’FooBarDB2′ TO N’E:\DATA\FooBarDB_RS_1.ndf’,
Norecovery,Partial

Restore Log FooBarDB_RS From Disk =’E:\DATA\FooBarDB_log.bak’
–can access data in tables present on Primary FG.
select * from FooBarDB_RS.dbo.StestPrimary
–cannot access data in tables present on Secondary FG as it has not been yet restored.
select * from FooBarDB_RS.dbo.Stest
–Restore the Secondary FileGroup
Restore Database FooBarDB_RS From Disk=’E:\DATA\FooBarDB_Secondary_FG.bak’

–can access data in tables present on Secondary FG now.
select * from FooBarDB_RS.dbo.Stest