InMemory Tables – Hash Index

Couple of quick notes about InMemory Objects:

  • Even with “Schema_and_Data” durability option on the Inmemory, the index data is not persisted to the disk. Indexes are rebuilt during the instance startup. This is something to be aware of as it increases the database recovery time.
  • All Inmemory Tables need to have at least one index and primary key is required only for “Schema_and_Data” durability, primary key is not necessary for “schema_only” durability option.
  • There are only two types of indexes that can be created on InMemory tables – 1. Hash and 2. Range.
  • With hash index, hash value of the column is calculated and the memory address for the row is stored in the hash bucket. The same data will have same hash value, however, sometimes two different values can also have the same value.

Hash Match Example:
Create Table InMemoryHash(sno int not null index IX_Sno nonclustered hash with (bucket_count=100))with (Memory_Optimized=ON,Durability=Schema_Only)

Insert into InMemoryHash
Values (1),(2)
--check the maximum chain length
select * from sys.dm_db_xtp_hash_index_stats where object_id=object_id('InMemoryHash')

Insert into InMemoryHash
Values (4),(49)
--check the maximum chain length
select * from sys.dm_db_xtp_hash_index_stats where object_id=object_id('InMemoryHash')

The example shows that 4 and 49 have the same hash value and hence will be stored in the same bucket.  The newer row will be pointed by the hash bucket and the newer row will point to older row for values having same hash match. So, if there are several hash matches, there will be large chain of rows and this can lead to performance issue as optimizer should traverse through all the chains to find the data.

  • We need to specify the bucket_count – for hash indexes. The bucket counts are created to the base of two. So, if you specify bucket_count=100 in the table DDL, the actual bucket_count created is 128(2^7). If you specify, bucket_count=1000000, the actual bucket_count created is 1048576(2^20).
  • Range Index are implemented similar to regular row index but with some difference. I will cover range indexes  in another post.

Monitoring InMemory Objects and Txns – Memory Utilization

For Inmemory objects, the data exists strictly in the memory. The data may or may not be persisted on to the disk depending on the table durability but that does not change the fact the data always stays in the memory. Monitoring Memory Utilization is always key for database systems but it is more important now than ever because Inmemory transactions cannot continue if memory is not available(no spilling\paging to disk ever happens with InMemory transactions). Also, Inmemory transactions use optimistic concurrency, which is achieved by using multiple row versions and all these stay in the memory as well. With all these complexities around, it is very important to monitor the memory utilization.

Using resource governor, we can create a InMemory resource pool, set up max and min memory limits and bind the Inmemory database to the resource pool, can help limiting the Inmemory objects and transactions memory usage.  By setting up appropriate values, this can help not bringing the entire system down(it still can bring the InMemory transactions down).

The bottom line is with InMemory you always need to monitor the memory utilization.

There are two easy ways to do this:

1.Performance Counters DMV:

select * from sys.dm_os_performance_counters
where counter_name ='XTP Memory Used (KB)' and instance_name = '_Total'

2.Using PerfMon Counters

MSSQL$<<InstanceName>>::Databases --> XTP Memory Used(KB) --> _Total

This gives us the total memory utilization by all InMemory objects and transactions across the instance. You can also track the Inmemory memory usage per individual databases.