Solutions with Practical SQL DBA : May 2020

Sunday, 31 May 2020

Stairway to Microsoft Azure SQL Database Part 4 : Azure SQL server , an overview

In the last few posts, we were discussing about database deployment model and purchase model. Out of the three deployment models, managed instance works almost like on-premises SQL server and hence we are not going to discuss much about the managed instance. On the other side, a single database and elastic pool works a completely different ways.

while creating a single database or elastic poor, you need to attach those into the Azure SQL server. In other words, you need to create a SQL server before creating the SQL database. While searching for services in the Azure portal, you can search for 'SQL database' and 'SQL server'. In this post, we are going to discuss about the SQL Server, which is a logical container for your single Azure databases and Elastic Pool.

As per the Azure documentation, "a SQL server is a logical construct that acts as a central administrative point for a collection of databases". At the server level, you can administrate the following components:

Logins
Connectivity
Firewall rules
Auditing rules
Threat detection Policy
Auto Failover group
Transparent Data Encryption
Auditing
Backup Management

When you create an empty SQL Server in Azure, it will not cost you anything until you create a single database or elastic pool under this SQL server. To create an empty SQL server, search for "SQL Server" in the Azure portal, and follow the instructions. While creating the SQL server, it will ask for a user name and password. This is the only SQL login that you can create from the Azure portal. This login has the highest privilege in the server and can access all databases in that server without providing explicit permission in each database. Azure SQL server does not support window authentication but supports Azure active directory authentication. For enabling the Azure active directory authentication, you need to connect your server with Azure AD. You can achieve this by setting the Active directory Admin from the Azure portal. You can find this option under the Settings section of the server. This can be an individual account or a group. This account also will have the highest privilege and can access all the databases under that server without providing explicit permission. These are the only two logins that you can create from the Azure portal. There is no option to change the SQL login once you have created it but you can change the Active directory admin at any time. These are the only two logins will have access to all the databases created under this server. There is no option to create server wide login which will have access to all the databases under this server. Still, you can create server wide login and can explicitly create the corresponding user in each database like we used to do in the on-premise installation.

Azure SQL database provides two types of user :

Login and user: This is the same as we used to do in on-premise installations. Create a login and create corresponding user in each database. For example, connect to master database

create login njohn with password ='Password@123'

--Connect to DB1

create user njohn from login njohn

--Connect to DB2

create user njohn from login njohn

In this case, the login information is stored at the server level and the password remains same across all databases. While copying the database DB1 and DB2 to another server, you also need to copy the login and need to map the SID properly.

Contained user: These are the users created under the database without having the logins. In this case, the user informations are stored in the database level. After the restoration of the databases, DB1 and DB1 to another server, we don't have to copy the login information separately as the user password and other information are stored in the user database. In this scenario, the same user can have different passwords to connect with each database. For example

--Connect to DB1

create user njohn with password ='Password1'

--Connect to DB2

create user njohn with password ='Password2'

When you create the SQL server, by default a virtual master database will be created. This is the only system database that you can see in the Azure SQL server. If you look into this database, you can see two additional database role added with Azure SQL server :

dbmanager: Members of this database role can create and delete the databases. When a member of the dbmanager role creates the database, becomes the owner of the database, and will have full access to that database as dbo. The members of this role will not have access to the database which is not created by them (owner) unless you explicitly created users under the database for those logins like we do in the on-premise installation. This role exists only in the master database and only the users of the master database can be added into this role. The ideal method is, create a login and create a user for that login and add that user to the dbmanager role. Contained database users in the master database also can be added to the dbmanger role, but they will not have permission to create new databases.

loginmanager: can create and delete logins in the master database. Note that, members of this role does not have the permission to create users in the master database.

As mentioned earlier, the Azure SQL server is a logical construct and does not represent a specific VM or hardware like on-premise installation. The databases associated with an Azure SQL server, maybe running in different VMs which is not visible for us. In our on-premise installation, the SQL server is installed on VM and all databases created in that database server will be running on the same VM. Whereas Azure SQL server is a logical server and can deploy the databases under that server in multiple VMs. The below diagram helps you to understand the concept in a better way.

We have learned in our earlier post, each Azure SQL database will have one or more standby replica for high availability depending on the service tier. The database name marked in red are standby replica. Azure SQL inherently multi-tenant and it needs to share its resource with multiple clients. You can see one extra database, DB5 which is not there in our on-premises diagram. When you connect to Azure SQL server, "mysqlserver.database.windows.net", based on the database need to be connected, it will redirect the connection to data node hold the primary replica. This is the reason why we need to specify the database name while connecting through SSMS and not allowing us to change the database context using the USE statement. To change the context of the database, it needs to reroute the traffic to different data nodes. If we are not specifying the database name, it will connect to the virtual master database. If the 'datanode1' goes down for some reason, DB1, and DB2 comes online as the primary in Datanode2 and standby replica will get created in another data node. During this time, the connection to these two databases will be broken and the user will get an error message. To overcome these transient errors and to give a smooth experience to the application users, Microsoft suggests to implement retry mechanism for application connecting to Azure SQL. This is also beneficial to

Handle the errors happens while azure perform the maintenance or patch update. Azure SQL Database is a managed service and Azure performs the regular patch and version update. User does not have control over the date and time of these maintenance windows.
When we scale up the SQL database, there is a possibility to failover the database to another node due to a lack of resources in the current node. In that scenario, the existing connection will be terminated.

In our next post, we will discuss about connection architecture and different settings available under the "firewall and virtual network"

Monday, 25 May 2020

Ghost is taking full backup and breaking my backup chain . Be careful with Azure VM backup on VM having SQL server installation

I believe that I have strong knowledge in SQL Server backup and recovery process, but recently there was an incident which shook my confidence. I was working on a migration for a noncritical production server from SQL server 2008 to 2017. As we need to upgrade the OS also, we decided to go ahead with the side by side migration. The backup plan on the existing server was full backup on a weekly basis. So I decided to restore the previous week's full backup well in advance and during the migration window to take a differential backup and apply that on the new server and I communicated this process will take approximately 15 minutes. As I was pretty confident in this process, I didn't put any effort to do a dry run prior to the migration day and everything set for migration.

During the migration window, as a first step, I fired the command to take differential backup on the existing server. To my surprise, it throws the following error:

Msg 3035, Level 16, State 1, Line 1
Cannot perform a differential backup for database “Mydatabase”, because a current database backup does not exist. Perform a full database backup by reissuing BACKUP DATABASE

I was not able to believe this. Immediately I queried the backup set table and I saw multiple full backup records after the previous week full backup. I ran a set of queries and it revealed that some backup operation is happening without DBA's knowledge to a virtual device at a regular interval. As per the Microsoft documentation, Device_type 7 is a virtual device.

I don't have time to dig further, I continued the migration by issuing full backup and restoring the same. This delayed the entire migration process by 3-4 hours and it was an embarrassing situation.

After the migration, I decided to figure out the issue and went ahead and searched the SQL error log and I found the following messages along with the backup message.

I/O is frozen on database Mydatabase. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup
I/O was resumed on database Mydatabase. No user action is required.
Database backed up. Database: Mydatabase, creation date(time): 2012/02/02(09:42:18), pages dumped: 578, first LSN: 528:626:37, last LSN: 528:643:1, number of dump devices: 1, device information: (FILE=1, TYPE=VIRTUAL_DEVICE: {'{4618D52D-9C83-400A-A698-AB391E30369A}26'}).This is an informational message only. No user action is required.

The I/O frozen error was just before the backup and that triggered me to think about disk snapshot backups running behind the scene and contacted the infra team to understand this. On contacting them, they told me that, they have enabled the Azur VM backup on all VM on a daily basis and the Azure VM backup schedule on the server in question was matching with the above error message. With that, we were sure that the azure VM is the culprit behind this ghost SQL server backup. On reading further on azure VM backup, we learned that Azure VM backup uses the VSS (Volume Shadow copy) service to take the backup and SQL server is a VSS aware application and it takes a full backup of SQL Server databases using SQL writer service. Before taking backup, it will freeze the disk operation to get a consistent backup.

What is VSS backup?

Volume Shadow Copy is a technology included in Microsoft Windows that can create backup copies or snapshots of computer files or volumes, even when they are in use. It is implemented as a Windows service called the Volume Shadow Copy service. It allows backup applications to safely back up locked and open files.

Components of VSS:

VSS service: Part of windows operating system and installed as by default. This Orichistate the communication between the backup tool and VSS aware applications.

VSS requester: The application who initiate the backup request to the VSS service. This can be Microsoft or non-microsof backup applications such as Windows server backup utility, any third party VM backup tool, etc.

VSS writer This is the component that guarantees a consistent data set to back up. For example, Microsoft provides the VSS writers for SQL server, Exchange server, windows registry, etc. Many non-Microsoft applications also provide VSS components that need to guarantee data consistency during back up.

Finally, we understood what was happening behind the scene and decided to stop the backup operation initiated by the Azure VM backup as DBA's are responsible for SQL server backup and recovery. We can not stop the Azure VM backup as the infra team need this for recovering the instance in case of disaster. So we found two solutions.

Stop and disable the SQL writer service. With this VSS will not able to take the SQL server database backup but still takes the VM backup.
Add below registry key [ HKEY_LOCAL_MACHINE\SOFTWARE\MICROSOFT\BCDRAGENT]
"USEVSSCOPYBACKUP"="TRUE" This will force the VSS to take COPY_ONLY backup which will not break the backup chain

I hope this piece of information will help you one day. If you are more curious about VSS service, refer the below links:

Sunday, 17 May 2020

Stairway to Microsoft Azure SQL Database Part 3 : Purchase Models DTU vs vCore continue

In the last post, we have discussed the DTU purchase model. Azure also provides another purchase mode which is called vCore. vCore purchase model available with all three deployment models (Managed instance, Single database, and elastic pool).

vCore purchase model provides :

Higher compute, memory, IO, and storage limit.
Better control over opting for the hardware configuration.
Option to bring your own license using the Azure Hybrid Benefit(AHB) and to reserve the instance to reduce the cost.
Flexibility to choose the compute and storage independently.
Storage and compute charged separately. In the DTU model ,limited amount od storage bundled with the computing power. Storage allocated above that is charged separately.

Service Tier: Like in the DTU model, the vCore model also provides multiple service tier to match with your performance requirements and budget. vCore purchase model provides the following service tiers.

General Purpose
Hyper-Scale
Business Critical

General Purpose :

Offer Budget-oriented, balanced and scalable compute and storage option.
Suited for general workload.
Available with all three deployment model (Single, elastic and managed)
Option to select 2 to 80 vCore. The options are 2 ,4 ,6 ,8 ,10,12,14,16,18, 20,24,32,40 and 80 vCore
Storage upto 4 TB (0.12 USD per GB).
Remote storage
500 to 20000 IOPS
2 to 10 ms latency
For high availability, it provides one replica.
No read replica.
7 to 35 Backup retention

Hyperscale :

Available only for single database deployment model
Provide highly scalable storage up to 100TB
For rapid read scale-out deployment, provide up to 4 read replica
Read replicas are charged.
Option to select compute power from 2 to 80 vCore (options are the same as in general-purpose)
Use local SSD storage.
Snapshot based backup in Azure remote storage. Restore use these snapshots for fast recovery.
Backup/restore does not depend on the size of the database and does not affect the IO or compute as it is performed using the snapshot. Hence backup and restorations are instantaneous.
For example. 50 TB database can be restored in less than 15 minutes. You can watch the video here
Depending on the workloads, IOPS between 500 to 204800 with 1 to 10 ms latency.
Most suited for databases that will not fit into the 4TB storage limit of general-purpose and business-critical service tier.
Hyperscale database engines use distributed function architecture and use the following components (we will discuss the architecture of hyper-scale in a different blog post). This architecture is different from the existing database engine architecture

Compute
Page Server
Log Service
Aure storage

Business Critical :

Available with all three deployment model (Single, elastic and managed)
Option to select 2 to 80 vCore. The options are 2 ,4 ,6 ,8 ,10,12,14,16,18, 20,24,32,40 and 80 vCore
Offers business applications the highest resilience to failures by using several isolated replicas.
It keeps 4 replicas for better high availability
To offload the read workload from the writable replica, provide read replica without additional charge.
Use local SSD storage for better IO performance.
Provides up to 4 TB storage with 1 -2 ms latency (0.25 USD per GB)
5000 to 204800 IOPS
7 to 35 days backup retention.

Compute Tier: Under the service tier, azure SQL server provides the compute tier. For managed instance and elastic pool deployment models, azure provides only provisioned compute tier and this option will not be available to choose. For the single database deployment, there is one more additional compute tier called Serverless. Let us see the difference between these two compute tier:

Provisioned :

compute tier is provisioned through out the lifecycle of elastic poo/ single database/managed instance and billed hourly basis.
There is no change in the billing based on the workload on the database.
Suitable for regular usage patterns with higher average compute utilization over time.
Compute scaling should be done manually (increasing the vCore)

Serverless :

Suitable for Intermittent, unpredictable usage with lower average compute utilization over time.
Automatically scale up the compute based on the workload and billed per second.
Automatically pause the database if there is no activity for a preconfigured time (Auto pause delay) and start automatically when the wake-up events happen. During this period compute is not charged.
While provisioning, we need to define the minimum and maximum vCore.
Based on the workload, the compute tier is scaled up to the max vCore defined. Compute is billed per second.
Auto pause delay is a configurable parameter that defines the required inactive period before pausing the database. This can vary from 1 hour to 7 days. For example, if we define the auto-pause delay as 2 hrs, then if there is no activity in the database for 2 hours, then the database will go to pause state and will not be charged for computing until the next wake up event. Till that point, only the storage cost will be charged.
The minimum and maximum vCore are configurable parameters that define the range of computing power available for the database. The maximum vCore can vary between 1 to 16 vCore. (1,2,4,6,8.1,10,12,14 and 16 are possible options) . The minimum vCore can vary from 0.5 vCPU to maximum vCore configured. Each vCpu provides 3GB of memory. If the workload requires memory more than that will be charged.
Cost

Compute cost and storage costs are charged separately. Storage cost remains the same throughout the life cycle unless you scale up the storage.
When the compute usage is between the min vCore and max vCore configured, compute cost is based on vCore or memory.
When the compute usage is below the min vCore setting, it will be still charged for min vCpu/ Memory.
For the duration of the paused state, Computing is not charged. Only storage will be charged.

Auto pause and auto-scaling

When the database is in a paused state, the first wake up event (example connection request) fails with the error message "database is unavailable" with error code 40613. It will take approximately one minute to resume the operation. Applications that are working with a serverless database should have mechanisms to handle this error message and retry the connection after a minute.
When the load on the database increases, azure automatically scales up the compute / memory up to the max vCore configured. while scaling up the compute, at the end of the process all connections to the database are dropped. The applications which use the serverless database should have a mechanism to handle these error and try to reconnect after a specific time to provide smooth user experience to the end-users.

Billing

Use per second billing.
Each vCore in action provides 3GB memory. If the workload uses more memory than the memory bundled with the vCore, billing will be based on the memory. For example, for a specific time interval, the load on the database is using 2 vCore and 10 GB memory. In this case, billing will be based on the memory as the memory utilization is above 6 GB (2vCore*3 GB). To derive the number vCore to be billed from the memory utilization, memory utilized will be divided by 3 (3GB for each Vcore). In our case, it will be billed for 10GB/3 = 3.33 vCore. The below table gives you a better understanding of the billing. Let us consider one database, that is provisioned with 3 vCore as minimum vCore and 16 as maximum vCore.The auto pause delay is configured as 2 hrs. Each vCore seconds is charged at 0.000145

The total vCore seconds are 389952 and it will cost 56.54 USD (0.000145 USD * 389952 vCore seconds) for the workload mentioned in the above table which spread across 24hrs.

Best suited for :

Single database with intermittent and unpredictable usage patterns.
The database workload is only during specific intervals. So after the auto-pause delay, the database enters into stop state and not charged for the compute.
Workload is not sensitive to the response time and the application is capable fo handling the errors while resuming the service and auto-scaling

Apart from connecting to the database, there are many other events that resume the database from a paused state. You can find those events in the Azure documentation page

Hardware Configuration Azure provides multiple hardware options to choose from under the service tier. These options differ from region to region. Commonly available hardware options are

Gen4

Provide 7 GB memory for each vCore.
Approaching the end of life by Jan 2023

Gen5

Has new generation hardware
Has network acceleration and provide better IO performance on remote storage (General purpose service tier)
It uses faster local SSD disk than Gen4

M-Series (preview)

New Memory-optimized hardware
Provide 29GB per vCore.
It provides up to 128 vCore.
In full capacity memory can be nearly 4 TB (128 vCore X 29 GB per vCore).

Fsv2-series (Preview)

Compute-optimized hardware.
Low CPU latency and high clock speed for the most CPU demanding workloads.
Can deliver more CPU performance per vCore than Gen5.
The 72 vCore size can also provide more CPU performance for less cost than 80 vCores on Gen5.
provides less memory and tempdb size per vCore than other hardware. so workloads sensitive to those limits may want to consider Gen5 or M-series.

The below diagram provides you a visual representation of vCore purchase model options for Single database and Elastic pool.

vCore Purchase Model

Managed Instance deployment model also provides General Purpose and Business Critical service tier. Under the service tier, it provides Gen4 and Gen5 hardware.

Each combination of these options has many other characteristics like tempdb space allocated, log space allocated, IOPS, no of concurrent session, log write throughput, etc. All those detailed information can be found in the following Azure documentation.

I hope, this post helped you understand the vCore purchase mode in detail.

Sunday, 3 May 2020

Stairway to Microsoft Azure SQL Database Part 2 : Purchase Models DTU vs vCore

In the last post, we have discussed about different deployment options available with Azure SQL server. I hope you got a good understanding of various deployment models. In this post let us discuss the purchase models available with each deployment model.

Azure SQL database provides multiple purchase models based on the deployment model to accommodate your performance requirements and budgets. At a high level, these purchase models are divided into two:

DTU or eDTU purchase model
vCore purchase model

Each of these purchase model provides multiple service tier for more customization of your performance requirements and budgets.

DTU or eDTU

Basic
Standard
Premium

vCore

General Purpose
Hyperscale
Business Critical

DTU and eDTU purchase model

The database transaction unit (DTU) is a specific composition of compute power, memory, storage, and IO. This purchase model provides us all the resources as a bundle which includes a specific amount of compute power, storage, and IO. It provides the simplicity and full control of your budget. This purchase model available with single database and elastic pool deployment model. As mentioned earlier, Azure provides multiple service tier to match our performance requirements and budget.

Basic tier:

Suitable for very low CPU intensive workload.
Provides 5 DTU with 2GB of storage.
Each DTU provides 1-5 IOPS.
35 days backup retention.
Each DTU costs 1 USD per month.
Suitable for initial design and developments.

Standard tier:

Suitable for low and medium CPU intensive workload.
It provides a DTU range between 10 and 3000 DTU with 250 GB storage.
Additional storage up to 1 TB with additional cost. To go beyond 250 GB, we have to opt for 100 or more DTU.
35 days backup retention.
Each DTU provides 1-5 IOPS.
Each DTU costs 1.5 USD per month.
The range of DTU in this tier is divided into 9 compute size: S0 10 DTU), S1 20 DTU, S250 DTU,S3 100 DTU, S4 200 DTU, S6 400 DTU, S7 800 DTU, S9 1600 DTU, and S12 3000 DTU. There is no option for us to randomly select the DTU in the range of DTU.

Premium Tier:

Suitable for medium and high CPU and IO intensive workload.
Provides DTU range between 125 and 4000 with a storage of 500 maximum to 4 TB depends on DTU.
Each DTU unit provides 25 IOPS. That makes it suitable for IO intensive workload.
It also provides better IO latency (2ms for read and write) compared to basic and standard (5 ms for read and 10 ms for write).
35 days backup retention.
Support column store index and in-memory OLTP
DTU range divided into multiple compute sizes. p1 125 DTU,p2 250 DTU ,p4 500 DTU, p6 1000 DTU p11 1750 DTU, p15 4000 DTU
Premium tier also supports read scale-out to load balance the read workload with no extra cost. As part of high availability architecture, each database in the premium service tier is automatically provisioned with multiple secondary replicas. These secondary replicas are provisioned with the same compute size of the primary replica. Enabling this feature, allow you to load balance your read-only workload using one of the read replica instead of read-write replica. This is very useful for read-intensive application which can segregate their read and write operations by passing the readintent=true property in the connection string.
Premium tier also supports Zone redundancy. As part of high availability architecture, each database in the premium service tier is automatically provisioned with multiple secondary replicas in the same data center. This architecture helps to protect us from the server failure. By enabling the Zone redundant feature, the replicas are placed in different availability zones, it makes the databases resilient to the data center outage

For more accurate details of each service tier, refer the following Azure documentation :

While using the single database deployment model, the DTU purchase model comes into the picture. In the same way, for an elastic pool deployment model, the eDTU purchase model is used. eDTU purchase model also provides different service tiers like in the DTU model but with different limits. For the details of each service tier, refer to the Azure documentation. As you know, in the elastic pool deployment model, the resource is allocated at the pool level and each database shares the resource. To avoid one database consuming all the resources, there is a couple of parameter settings in the elastic pool level.

Max eDTUs per database: This is the maximum number of eDTU that any database in the pool may use if available based on utilization by other databases in the pool. Max eDTU per database is not a resource guarantee for each database. if the workload on these databases are spread across time, we can over commit the max eDTU per database. For example, we have an elastic pool with 1000 eDTU and 5 databases attached to that. Out of that 2 of them are reporting data and most load comes to those two at night. The other two databases are part of the OLTP system will have varying workload but the peak will be between 10 AM and 5 PM. The fifth database is an internal database most of the load on this database will be between 9 AM and 12 PM. In this scenario, we can safely set max eDTU per database to 300 or 400 even if the total goes beyond the 1000 eDTU.
Min eDTUs per database: This is the guaranteed minimum number of eDTU for each database in the pool. This can set to 0 and that is the default value. This value multiplied by the number of databases in the pool should not exceed the eDTU allocated to the pool. To set the value of this parameter to 100 of an elastic pool which has 20 databases, we need to allocate a minimum of 2000 eDTU to the elastic pool.
Max storage per database: This is the maximum size of any database attached to this elastic pool. None of the databases can grow beyond this limit. Even if we set this value to a higher number which exceeds, (no.of database X Max storage per database), the total available storage space of the pool storage limit, it will never allow all database storage to grow more than available storage space fo the pool.

Each service tiers in the elastic pool has a limit on the individual database resource utilization. The below table explains those limitations:

In case if you are migrating your workload from your on-premise installation of SQL Server to Azure SQL and you are not sure how much DTU need to be allocated and which service tier to be provisioned, DTU calculator comes handy. You need to provide values of the following performance counters :

Processor - % Processor Time
Logical Disk - Disk Reads/sec
Logical Disk - Disk Writes/sec
Database - Log Bytes Flushed/sec

I hope with this you got a good understanding of the DTU and eDTU purchase model. In the next post, we will cover the vCore purchase model.

Solutions with Practical SQL DBA

Pages