Wednesday, October 30, 2019

Introduction to Memory-Optimized Tables


Memory-optimized tables are fully durable by default, and transactions on memory-optimized tables are fully atomic, consistent, isolated, and durable (ACID).
·      Memory-optimized tables reside entirely in memory; rows in the table are read from and written to memory.
·      A second copy of the table data is maintained on disk, but only for durability purposes.

In-Memory OLTP is integrated with SQL Server to provide a seamless experience in all areas such as development, deployment, manageability, and supportability.

Rows in memory-optimized tables are versioned so each row in the table potentially has multiple versions. Row versioning is used to allow concurrent reads and writes on the same row.

As an illustration of multi-versioning, consider a table with three rows: R1, R2, and R3. R1 has three versions, R2 has two versions, and R3 has four versions.


Note: Different versions of the same row do not necessarily occupy consecutive memory locations. Instead, they can be dispersed throughout the table's data structure.

Durability
·       Memory-optimized tables are fully durable by default, and transactions on them are fully atomic, consistent, isolated, and durable (ACID), just like transactions on disk-based tables.
·       In-Memory OLTP also supports durable tables with delayed transaction durability. Delayed durable transactions are saved to disk soon after the transaction has committed (see the sketch after this list).
·       Besides the default durable memory-optimized tables, SQL Server also supports non-durable memory-optimized tables, which are not logged, and their data is not persisted on disk. This means that transactions on these tables do not require any disk IO, but the data will not be recovered if there is a server crash or failover.
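
Delayed durability is opt-in at both the database and transaction level. A minimal sketch, assuming the TestDB database used in the exercise below:

ALTER DATABASE TestDB SET DELAYED_DURABILITY = ALLOWED
GO
-- A transaction can then opt in at commit time; the log flush happens shortly after commit
BEGIN TRANSACTION
-- ... DML against memory-optimized (or disk-based) tables here ...
COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON)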

Durability options:

·       SCHEMA_AND_DATA (default): This option ensures that data is recovered to the Memory-Optimized table when SQL Server is restarted or is recovering from a crash.
·       SCHEMA_ONLY: Like tempdb data, a SCHEMA_ONLY memory-optimized table is truncated if/when SQL Server restarts or recovers from a crash; unlike tables in tempdb, however, it is re-created as an empty table at the end of the restart/recovery operation.
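
For example, a minimal sketch of a SCHEMA_ONLY table (the table name and columns are hypothetical; the database also needs the MEMORY_OPTIMIZED_DATA filegroup created in Step 1 below):

CREATE TABLE dbo.SessionCache -- hypothetical staging/session-state table
(
SessionId INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (bucket_count = 100000)
,Payload NVARCHAR(4000) NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY)
GO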

Accessing Data in Memory-Optimized Tables
Data in memory-optimized tables is accessed in two ways:
·       Through interpreted Transact-SQL (outside of a natively-compiled stored procedure). These Transact-SQL statements may be either inside interpreted stored procedures or they may be ad-hoc Transact-SQL statements.
·       Through natively compiled stored procedures.
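
For illustration, a minimal sketch of a natively compiled stored procedure (the procedure name is hypothetical; it assumes the dbo.MemoryOptimizedTable created in Step 2 below; SQL Server 2014 requires SCHEMABINDING, an EXECUTE AS clause, and an ATOMIC block):

CREATE PROCEDURE dbo.usp_GetRowById @tableId INT -- hypothetical name
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
SELECT tableId, naturalId, value
FROM dbo.MemoryOptimizedTable
WHERE tableId = @tableId
END
GO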

Performance and Scalability
The following factors will affect the performance gains that can be achieved with In-Memory OLTP:
·       Communication - an application with many calls to short stored procedures may see a smaller performance gain compared to an application with fewer calls and more functionality implemented in each stored procedure.
·       Transact-SQL Execution - In-Memory OLTP achieves the best performance when using natively compiled stored procedures rather than interpreted stored procedures or query execution. Stored procedures that execute other stored procedures cannot be natively compiled, but there can be a benefit to accessing memory-optimized tables from such stored procedures.
·       Range Scan vs. Point Lookup - Memory-optimized nonclustered indexes support range scans and ordered scans. For point lookups, memory-optimized hash indexes have better performance than memory-optimized nonclustered indexes, and memory-optimized nonclustered indexes have better performance than disk-based indexes. Index operations are not logged and the indexes exist only in memory.
·       Concurrency - Applications whose performance is affected by engine-level concurrency, such as latch contention or blocking, improve significantly when moved to In-Memory OLTP.

Creating Memory-Optimized Tables
The exercise below assumes you have SQL Server 2014 (or later) installed. The examples use the TestDB database.

·       Step 1: Create Memory Optimized File Group

USE [master]
GO
-- The memory-optimized filegroup holds the checkpoint files used to persist durable tables
ALTER DATABASE [TestDB] ADD FILEGROUP [fgMOD] CONTAINS MEMORY_OPTIMIZED_DATA
GO
-- Note: FILENAME points to a folder (a FILESTREAM container), not to a single file
ALTER DATABASE [TestDB] ADD FILE ( NAME = N'SQL2014_MOD', FILENAME = N'D:\MSSQL\SQL2014_MOD' ) TO FILEGROUP [fgMOD]
GO


You can change the file name or path as per your requirements.

·       Step 2: Create Memory Optimized Table
Creating an In-Memory (Memory-Optimized) table in SQL Server is straightforward.

CREATE TABLE dbo.MemoryOptimizedTable
(
-- Hash indexes require a bucket_count; a common guideline is 1-2x the expected number of distinct key values
tableId INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (bucket_count = 2000000)
,naturalId INT NOT NULL INDEX NC_MemoryOptimizedTable_NaturalId NONCLUSTERED HASH WITH (bucket_count = 1000000)
,value NVARCHAR(100) NULL
)
-- SCHEMA_AND_DATA (the default) makes the table fully durable
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO

Indexing Memory-Optimized Tables
Memory-optimized tables support two types of indexes:
·       Non-Clustered Hash Index: a memory-optimized index suited to point lookups; it does not support inequality predicates or sort-order-matching (ordered scan) operations.
·       Non-Clustered Index: a memory-optimized range index that supports range scans, ordered scans, and the other normal index operations.

There are a few stipulations for creating indexes on memory-optimized tables:
·       A maximum of 8 indexes are allowed on a memory-optimized table (in SQL Server 2014/2016; SQL Server 2017 removed this limit).
·       Indexes cannot be added to an existing memory-optimized table in SQL Server 2014; the table has to be dropped and re-created with the new index, so it pays to size hash bucket counts correctly up front (see the check below).
·       A primary key is required for durable (SCHEMA_AND_DATA) memory-optimized tables.
·       All indexes are covering, which means they include all columns in the table.
·       Indexes reference the (hashed) row directly, rather than referencing the primary key.
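
Because hash bucket counts cannot be changed without recreating the table in SQL Server 2014, it is worth validating them after loading data. A sketch using the sys.dm_db_xtp_hash_index_stats DMV:

-- A high avg_chain_length with few empty buckets suggests bucket_count is too small
SELECT OBJECT_NAME(s.object_id) AS table_name
,i.name AS index_name
,s.total_bucket_count
,s.empty_bucket_count
,s.avg_chain_length
,s.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats AS s
JOIN sys.indexes AS i ON s.object_id = i.object_id AND s.index_id = i.index_id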

Querying Memory-Optimized Tables
Memory-optimized tables support access through both interpreted T-SQL and natively compiled stored procedures. Generally, you need SNAPSHOT isolation (or higher) to access a memory-optimized table; interpreted T-SQL running under READ COMMITTED needs a table hint or the MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT database option.
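
For example, a quick interpreted T-SQL query against the Step 2 table (a sketch; the WITH (SNAPSHOT) table hint supplies the required isolation level when the session runs under READ COMMITTED):

SELECT tableId, naturalId, value
FROM dbo.MemoryOptimizedTable WITH (SNAPSHOT)
WHERE naturalId = 42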

References:
  • docs.microsoft.com
  • thinknook.com

Wednesday, March 15, 2017

Slowly Changing Dimensions

I had been planning to write about Slowly Changing Dimensions (SCDs) for a long time, so this article is all about them.

Slowly Changing Dimensions (SCDs) are dimensions whose data may change slowly over time, rather than on a regular, time-based schedule. For example, a dimension in your Sales database tracks the sales information of your company's salespeople. Creating sales reports is easy until a salesperson is transferred from one department (office) to another. How do you handle such a change in your sales dimension?
 
Here is a scenario:
An Employee "XXX" belongs to Department D1 from DateOfJoining till 01/01/2008, so whatever sales made by this employee added to department D1. After couple of years, that employee moved in to department D2 on 01/01/2010. Now all his sales contribution should be added to D2  from 01/01/2010 onwards and the old sales should still belong to D1.


Types of SCDs:
There are several management methodologies, referred to as Type 0 through Type 6.

SCD Type 0:
The Type 0 method is a passive approach to managing dimension value changes, in which no action is taken. Values remain as they were at the time the dimension record was first entered. Type 0 provides the least control (or no control) over managing a slowly changing dimension.

The most common slowly changing dimensions are Types 1, 2, and 3.

SCD Type 1:
This methodology overwrites old data with new data and therefore does not track historical data at all.

Here is an example of a database table that keeps customers' email information:
CustomerID | CustomerName | Email
-----------|--------------|------------
101        | Cust101      | abc@xyz.com
102        | Cust102      | pqr@xyz.com

If Cust101's email changes from abc@xyz.com to def@xyz.com, we need a simple UPDATE statement to overwrite the old value with the new one (a sketch, assuming the table above is named dbo.Customer):
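
-- Type 1: overwrite in place (dbo.Customer is an assumed name)
UPDATE dbo.Customer
SET Email = 'def@xyz.com'
WHERE CustomerID = 101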
Here is how the table will look after the update:
CustomerID | CustomerName | Email
-----------|--------------|------------
101        | Cust101      | def@xyz.com
102        | Cust102      | pqr@xyz.com

The main disadvantage of Type 1 SCDs is that no historical record is kept in the data warehouse. The advantage is that they are very easy to maintain.

SCD Type 2:
The Type 2 SCD tracks historical data by creating multiple records in the dimensional tables with separate surrogate keys. With Type 2, we have unlimited history preservation as a new record is inserted each time a change is made.

Let's consider the above example again, but now we need to maintain the history of email addresses as well. In that case, we need two more columns (FromDate and ToDate).

CustomerID | CustomerName | Email       | FromDate   | ToDate
-----------|--------------|-------------|------------|-----------
101        | Cust101      | abc@xyz.com | 2011-01-01 | 9999-12-31
102        | Cust102      | pqr@xyz.com | 2014-10-25 | 9999-12-31

If Cust101's email changes from abc@xyz.com to def@xyz.com on 2017-02-15, we need to insert a new record and close out the ToDate of the previous record (a sketch, again assuming dbo.Customer):
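
-- Type 2: close out the current version of the row (dbo.Customer is an assumed name)
UPDATE dbo.Customer
SET ToDate = '2017-02-14'
WHERE CustomerID = 101 AND ToDate = '9999-12-31'

-- Insert the new version, effective from the change date
INSERT INTO dbo.Customer (CustomerID, CustomerName, Email, FromDate, ToDate)
VALUES (101, 'Cust101', 'def@xyz.com', '2017-02-15', '9999-12-31')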
Here is how the table will look after the change:

CustomerID | CustomerName | Email       | FromDate   | ToDate
-----------|--------------|-------------|------------|-----------
101        | Cust101      | abc@xyz.com | 2011-01-01 | 2017-02-14
101        | Cust101      | def@xyz.com | 2017-02-15 | 9999-12-31
102        | Cust102      | pqr@xyz.com | 2014-10-25 | 9999-12-31

SCD Type 3:
The Type 3 method tracks changes using separate columns. Whereas Type 2 has unlimited history preservation, Type 3 has limited history preservation, as it is limited to the number of columns designated for storing historical data. Where the original table structure in Type 1 and Type 2 was very similar, Type 3 adds additional columns to the table:

CustomerID
CustomerName
OriginalEmail
FromDate
CurrentEmail
101
Cust101
abc@xyz.com
2017-02-15
abc@xyz.com
102
Cust102
pqr@xyz.com
NULL
NULL
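
The same Cust101 change applied as a Type 3 update would be (a sketch, again assuming a dbo.Customer table with the columns above):

-- Type 3: keep the original value in its own column, overwrite the current one
UPDATE dbo.Customer
SET CurrentEmail = 'def@xyz.com', FromDate = '2017-02-15'
WHERE CustomerID = 101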

Type 3 is less common because it involves changing the physical tables and is not scalable.

Reference: MSDN

Thursday, March 31, 2016

Analysis Services connectivity issue from Remote Server

Problem: I was recently working with one of my colleagues to set up a new server in Azure. He installed all the SQL Server components. When we tried to connect to OLAP cubes from another local or remote server, we ran into a connectivity issue with the following error message:
"A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond xxx.xxx.xxx.xxx:xxxx (System)."

Solution:
First, we checked whether the SQL Server Browser service was running (Log On As) under "Local Service". We made the changes and tried to connect to SSAS through SQL Server Management Studio (SSMS), but no luck.

Then we suspected that Windows Firewall might be denying access. To solve the issue, we added a new inbound rule in Windows Firewall for TCP port 2383. Please note that a default SSAS instance listens on port 2383 (instead of SQL Server's port 1433). After adding the rule, we were able to connect to SSAS successfully from the remote server.
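
For reference, the equivalent inbound rule can also be created from an elevated command prompt (the rule name is arbitrary):

netsh advfirewall firewall add rule name="SSAS TCP 2383" dir=in action=allow protocol=TCP localport=2383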


I hope this helps if you encounter a similar problem.