Pages

Saturday, 20 June 2020

Stairway to Microsoft Azure SQL Database Part 5 : Azure SQL serve Server Firewall and virtual Network Setting

In our last blog, we were discussing about Azure SQL server and we learned that Azure SQL server is a logical construct that acts as a central administrative point for the databases hosted under that server. One of the important aspects of the server is managing the connections. All these connection settings are found in the firewall and virtual network under the Security tab. Let us discuss various connectivity options in this blog.

Public Network Access: By default, Azure SQL server provides a public endpoint to access SQL Server through whitelisted IP addresses. You can access your database from anywhere in the world through the internet as long as your IP address is whitelisted. You can whitelist a single IP address or range of IP addresses using the option available under the page "Firewall and Virtual Network settings". As per the security best practices, you should never allow public endpoint to your database. To disable the public endpoint associated with your server, you can enable the "Deny Public network" option. Azure portal will not allow you to disable the public network option unless you have configured a private endpoint connection. This makes sense as without a private endpoint you will not able to connect the server once you disable the public network access. We will discuss private endpoints and its configuration in our next blog. You can whitelist the IP address in two-level, on the server level, and another one in the database level. When you whitelist the IP address from the portal, it will whitelist at the server level. You can find the whitelisted IP address information using the DMV sys.firewall_rule under the master database. You can also whitelist the IP address using the procedure sp_set_firewall_rule. To white list the IP in the database level, you can use the procedure sp_set_database_firewall_rule. Database firewall rule can be configured for the master database as well as for the user databases. Database level firewall setting makes more sense while using the contained user. The below diagram provide a better understanding of this setup. While establishing the connection, Azure SQL checks database level whitelisting first and if it is not available will check server-level whitelisting.
Azure Firewall Setup
Azure SQL Firewall 

Minimum TLS version: TLS is a cryptographic protocol that provides end to end communication security over the network. The minimal TLS version setting allows you to control the version of TLS used by the server. If you set the Minimum TLS version as '>1.2' and try to connect from the client with a TLS version less than 1.2, your connection will fail with an error "Login failed with invalid TLS version". If your applications are using the unencrypted connection, do not set any minimum TLS version. You can use the 'encrypt_option' column available in DMV sys.dm_exec_connections to check the connection from the applications is encrypted or not. Azure suggest to have   "Encrypt=True;TrustServerCertificate=False;" in the connection string. With this, the end to end communication between the server and client will be encrypted. By setting the TrustServerCertificate to false, it forces the client machine to verify the certificate provided by the server. That helps to avoid any kind of spoofing or man in the middle attack.

Connection Policy: Azure provides three connection policy to set. 
  1. Proxy
  2. Redirect 
  3. Default
These connection policies define the communication channel between the server and the client machine. When you try to connect Azure SQL server, the request first reach to Azure SQL gateway. Azure has a set of gateway IP addresses for each region which keeps listening on the port 1433. The number of gateway in each region depends on the workload in that region. For example, East US2 has 5 gateway IP addresses whereas Australia central region has only one gateway IP address. You can do telnet to these IP addresses through port number 1433. Once it reached the gateway, based on the connection policy setting, request forwarded to the actual SQL server node hosting your database or return the redirection information back to the client. You can do nslookup to your Azure SQL server and it provides you the detail of associated gateway. I did nslookup to one of my Azure SQL servers hosted in the US East2 region. It returned the one of the gateway IP address and its DNS name. Now I can do telnet to this IP address using the port number 1433. 



To establish the connection from your local machine or from the Virtual machine hosted in Azure, it should allow the outbound traffic to all the Azure gateway IP  address of that region through port number 1433. The gateway IP address of each region is published on their website. Let us look into each connection policy in detail.

        Proxy:  With this policy, when the client machine initiates the connection to Azure SQL server, first it will connect to one of the gateways in that region and forward the connection to the actual compute node in which your SQL database/server is hosted. In the below case (diagram), I am trying to connect to my server "mydbserver.database.windows.net". When I did the nslookup to my server name, it resolved to  52.167.104.0 which is one of the gateway IP addresses in the US East 2 region. When the connection is initiated from the client to the server, it will get connected to the gateway and gateway forward the connection to the node in which my database is hosted. In this case, the gateway act as a proxy between the client machine and the server. The client machine does not have any idea to which the connection is forwarded.

Proxy Connection



            Redirect:  With this connection policy, when the client initiates the connection, it will connect to the gateway and gateway return the redirection information to connect the node in which the database is hosted to the client. Using that information, the client directly connects to the node in which the database is hosted bypassing the gateway. Azure claims that this will reduce the latency. To make this work, the client machine :
  • Should allow outbound communication to the SQL database gateway IP addresses on port 1433 (This is for the initial connection to get the redirection information)
  • Should allow outbound communication to all Azure SQL IP addresses (nodes) in the region on a port range of 11000 to 11999 (All the SQL node in azure listen to one of port between 11000 and 11999). This can be easily done using the service tag. It is not practical to add numerous IP addresses of the node manually. 

In the below example, the client machine is initiating the connection to connect the server "Mydbserver.database.windows.net". The initial request will go to the gateway. On receiving the request, the gateway returns the redirection information (basically the name of the node in which the database is hosted), and the client uses that information to establish the direct communication with the node in which the database is hosted. There onwards all request in that connection goes directly to the node bypassing the gateway.

Redirection Connection 


    Default: This is the default setting when you create the Azure SQL server unless you explicitly change to Proxy or redirect. With this setting, all the connections initiated from the Azure use the redirect connection policy and use the proxy connection policy for all connections initiated from outside Azure.

Allow Azure Services and resources to access this server: When this setting is enabled, all the azure service and resources in that region can access this server. For example, if we have VM hosted in azure platform, that VM can access the SQL server without whitelisting the IP address of the VM. Keep in mind that, this is not restricting to the resource hosted in your account. Enabling this feature widely opens the communication from all the resources hosted in the Azure platform including the other customer's account also.

Connect  VNET/subnet: This is one of the options where you can attach the subnet to this sever. This allows all resources hosted in that subnet to connect to the Azure SQL server without whitelisting the IP address. Moreover, this enables communication through the Azure backbone networks. Microsoft suggests to use this feature to enable a secure and fast connection. Before attaching the subnet, note that we need to enable the service endpoint for the SQL server at the subnet level.
 
These are the various option available under the Firewall and Virtual network setting. In our next post, we will discuss enabling private endpoints.


Sunday, 31 May 2020

Stairway to Microsoft Azure SQL Database Part 4 : Azure SQL server , an overview

In the last few posts, we were discussing about database deployment model and purchase model. Out of the three deployment models, managed instance works almost like on-premises SQL server and hence we are not going to discuss much about the managed instance. On the other side, a single database and elastic pool works a completely different ways.

while creating a single database or elastic poor, you need to attach those into the Azure SQL server. In other words, you need to create a SQL server before creating the SQL database. While searching for services in the Azure portal, you can search for 'SQL database' and 'SQL server'. In this post, we are going to discuss about the SQL Server, which is a logical container for your single Azure databases and Elastic Pool.

As per the Azure documentation, "a SQL server is a logical construct that acts as a central administrative point for a collection of databases". At the server level,  you can administrate the following components:
  • Logins
  • Connectivity 
  • Firewall rules 
  • Auditing rules 
  • Threat detection Policy 
  • Auto Failover group
  • Transparent Data Encryption 
  • Auditing 
  • Backup Management 
When you create an empty SQL Server in Azure, it will not cost you anything until you create a single database or elastic pool under this  SQL server. To create an empty SQL server, search for "SQL Server"  in the Azure portal, and follow the instructions. While creating the SQL server, it will ask for a user name and password. This is the only SQL login that you can create from the Azure portal. This login has the highest privilege in the server and can access all databases in that server without providing explicit permission in each database. Azure SQL server does not support window authentication but supports Azure active directory authentication. For enabling the Azure active directory authentication, you need to connect your server with Azure AD. You can achieve this by setting the Active directory Admin from the Azure portal. You can find this option under the Settings section of the server. This can be an individual account or a group. This account also will have the highest privilege and can access all the databases under that server without providing explicit permission.  These are the only two logins that you can create from the Azure portal. There is no option to change the SQL login once you have created it but you can change the Active directory admin at any time. These are the only two logins will have access to all the databases created under this server. There is no option to create server wide login which will have access to all the databases under this server. Still, you can create server wide login and can explicitly create the corresponding user in each database like we used to do in the on-premise installation. 

Azure SQL database provides two types of user :

Login and user: This is the same as we used to do in on-premise installations. Create a login and create corresponding user in each database. For example, connect to master database 

create login njohn with password ='Password@123'
--Connect to DB1
create user njohn from login njohn 
--Connect to DB2 
create user njohn from login njohn 

In this case, the login information is stored at the server level and the password remains same across all databases. While copying the database DB1 and DB2 to another server, you also need to copy the login and need to map the SID properly.

Contained user: These are the users created under the database without having the logins. In this case, the user informations are stored in the database level. After the restoration of the databases, DB1 and DB1 to another server, we don't have to copy the login information separately as the user password and other information are stored in the user database. In this scenario, the same user can have different passwords to connect with each database. For example 


--Connect to DB1
create user njohn with password ='Password1' 
--Connect to DB2 
create user njohn with password ='Password2'


When you create the SQL server, by default a virtual master database will be created. This is the only system database that you can see in the Azure SQL server. If you look into this database, you can see two additional database role added with Azure SQL server :

dbmanager: Members of this database role can create and delete the databases. When a member of the dbmanager role creates the database, becomes the owner of the database, and will have full access to that database as dbo. The members of this role will not have access to the database which is not created by them (owner) unless you explicitly created users under the database for those logins like we do in the on-premise installation. This role exists only in the master database and only the users of the master database can be added into this role. The ideal method is, create a login and create a user for that login and add that user to the dbmanager role. Contained database users in the master database also can be added to the dbmanger role, but they will not have permission to create new databases.
 
loginmanager:  can create and delete logins in the master database. Note that, members of this role does not have the permission to create users in the master database. 

As mentioned earlier, the Azure SQL server is a logical construct and does not represent a specific VM or hardware like on-premise installation. The databases associated with an Azure SQL server, maybe running in different VMs which is not visible for us. In our on-premise installation, the SQL server is installed on  VM and all databases created in that database server will be running on the same VM. Whereas Azure SQL server is a logical server and can deploy the databases under that server in multiple VMs. The below diagram helps you to understand the concept in a better way.

 
We have learned in our earlier post, each Azure SQL database will have one or more standby replica for high availability depending on the service tier. The database name marked in red are standby replica. Azure SQL inherently multi-tenant and it needs to share its resource with multiple clients. You can see one extra database, DB5 which is not there in our on-premises diagram. When you connect to Azure SQL server, "mysqlserver.database.windows.net", based on the database need to be connected, it will redirect the connection to data node hold the primary replica. This is the reason why we need to specify the database name while connecting through SSMS and not allowing us to change the database context using the USE statement. To change the context of the database, it needs to reroute the traffic to different data nodes. If we are not specifying the database name, it will connect to the virtual master database. If the 'datanode1' goes down for some reason, DB1, and DB2 comes online as the primary in Datanode2 and standby replica will get created in another data node. During this time, the connection to these two databases will be broken and the user will get an error message. To overcome these transient errors and to give a smooth experience to the application users, Microsoft suggests to implement retry mechanism for application connecting to Azure SQL. This is also beneficial to 
  • Handle the errors happens while azure perform the maintenance or patch update.  Azure SQL Database is a managed service and Azure performs the regular patch and version update. User does not have control over the date and time of these maintenance windows.
  • When we scale up the SQL database, there is a possibility to failover the database to another node due to a lack of resources in the current node. In that scenario, the existing connection will be terminated.

In our next post, we will discuss about connection architecture and different settings available under the "firewall and virtual network"





Monday, 25 May 2020

Ghost is taking full backup and breaking my backup chain . Be careful with Azure VM backup on VM having SQL server installation

I believe that I have strong knowledge in SQL Server backup and recovery process, but recently there was an incident which shook my confidence. I was working on a migration for a noncritical production server from SQL server 2008 to 2017. As we need to upgrade the OS also, we decided to go ahead with the side by side migration. The backup plan on the existing server was full backup on a weekly basis. So I decided to restore the previous week's full backup well in advance and during the migration window to take a differential backup and apply that on the new server and I communicated this process will take approximately 15 minutes. As I was pretty confident in this process, I didn't put any effort to do a dry run prior to the migration day and everything set for migration.

During the migration window, as a first step, I fired the command to take differential backup on the existing server. To my surprise, it throws the following error:

Msg 3035, Level 16, State 1, Line 1
Cannot perform a differential backup for database “Mydatabase”, because a current database backup does not exist. Perform a full database backup by reissuing BACKUP DATABASE

I was not able to believe this. Immediately I queried the backup set table and I saw multiple full backup records after the previous week full backup. I ran a set of queries and it revealed that some backup operation is happening without DBA's knowledge to a virtual device at a regular interval. As per the Microsoft documentation, Device_type 7 is a virtual device. 













I don't have time to dig further, I continued the migration by issuing full backup and restoring the same. This delayed the entire migration process by 3-4 hours and it was an embarrassing situation. 

After the migration, I decided to figure out the issue and went ahead and searched the SQL error log and I found the following messages along with the backup message.

  • I/O is frozen on database Mydatabase. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup
  • I/O was resumed on database Mydatabase. No user action is required.
  • Database backed up. Database: Mydatabase, creation date(time): 2012/02/02(09:42:18), pages dumped: 578, first LSN: 528:626:37, last LSN: 528:643:1, number of dump devices: 1, device information: (FILE=1, TYPE=VIRTUAL_DEVICE: {'{4618D52D-9C83-400A-A698-AB391E30369A}26'}).This is an informational message only. No user action is required.
The I/O frozen error was just before the backup and that triggered me to think about disk snapshot backups running behind the scene and contacted the infra team to understand this. On contacting them, they told me that, they have enabled the Azur VM backup on all VM on a daily basis and the Azure VM backup schedule on the server in question was matching with the above error message. With that, we were sure that the azure VM is the culprit behind this ghost SQL server backup. On reading further on azure VM backup, we learned that Azure VM backup uses the VSS (Volume Shadow copy) service to take the backup and SQL server is a VSS aware application and it takes a full backup of  SQL Server databases using SQL writer service. Before taking backup, it will freeze the disk operation to get a consistent backup. 

What is VSS backup? 
Volume Shadow Copy is a technology included in Microsoft Windows that can create backup copies or snapshots of computer files or volumes, even when they are in use. It is implemented as a Windows service called the Volume Shadow Copy service. It allows backup applications to safely back up locked and open files. 

Components of VSS:

  • VSS service: Part of windows operating system and installed as by default. This Orichistate the communication between the backup tool and VSS aware applications.
  • VSS requester: The application who initiate the backup request to the VSS service. This can be Microsoft or non-microsof backup applications such as Windows server backup utility, any third party VM backup tool, etc.
  • VSS writer   This is the component that guarantees a consistent data set to back up. For example, Microsoft provides the VSS writers for SQL server, Exchange server, windows registry, etc.  Many non-Microsoft applications also provide VSS components that need to guarantee data consistency during back up. 
Finally, we understood what was happening behind the scene and decided to stop the backup operation initiated by the Azure VM backup as DBA's are responsible for SQL server backup and recovery. We can not stop the Azure VM backup as the infra team need this for recovering the instance in case of disaster. So we found two solutions.
  • Stop and disable the SQL writer service. With this VSS will not able to take the SQL server database backup but still takes the VM backup.
  • Add below registry key    [                HKEY_LOCAL_MACHINE\SOFTWARE\MICROSOFT\BCDRAGENT] 
     "USEVSSCOPYBACKUP"="TRUE"  This will force the VSS to take COPY_ONLY backup which will not break the backup chain 
 
I hope this piece of information will help you one day. If you are more curious about VSS service, refer the below links:

 






Sunday, 17 May 2020

Stairway to Microsoft Azure SQL Database Part 3 : Purchase Models DTU vs vCore continue

In the last post, we have discussed the DTU purchase model. Azure also provides another purchase mode which is called vCore. vCore purchase model available with all three deployment models (Managed instance, Single database, and elastic pool). 

vCore purchase model provides :
  • Higher compute, memory, IO, and storage limit.  
  • Better control over opting for the hardware configuration.
  • Option to bring your own license using the Azure Hybrid Benefit(AHB) and to reserve the instance to reduce the cost.
  • Flexibility to choose the compute and storage independently.
  • Storage and compute charged separately. In the DTU model ,limited amount od storage bundled with the computing power.  Storage allocated above that is charged separately.

Service Tier: Like in the DTU model, the vCore model also provides multiple service tier to match with your performance requirements and budget. vCore purchase model provides the following service tiers.
  • General Purpose 
  • Hyper-Scale
  • Business Critical
General Purpose :
  • Offer Budget-oriented, balanced and scalable compute and storage option.
  • Suited for general workload.
  • Available with all three deployment model (Single, elastic and managed)
  • Option to select 2 to 80 vCore. The options are 2 ,4 ,6 ,8 ,10,12,14,16,18, 20,24,32,40 and 80 vCore
  • Storage upto 4 TB (0.12 USD per GB). 
  • Remote storage 
  • 500 to 20000 IOPS
  • 2 to 10 ms latency 
  • For high availability, it provides one replica.
  • No read replica.
  • 7 to 35 Backup retention 
Hyperscale :
  • Available only for single database deployment model
  • Provide highly scalable storage up to 100TB 
  • For rapid read scale-out deployment, provide up to 4 read replica 
  • Read replicas are charged.
  • Option to select compute power from 2 to 80 vCore (options are the same as in general-purpose) 
  • Use local SSD storage.
  • Snapshot based backup in Azure remote storage. Restore use these snapshots for fast recovery. 
  • Backup/restore does not depend on the size of the database and does not affect the IO or compute as it is performed using the snapshot. Hence backup and restorations are instantaneous.
  • For example. 50 TB database can be restored in less than 15 minutes. You can watch the video here 
  • Depending on the workloads, IOPS between 500 to 204800 with 1 to 10 ms latency.
  • Most suited for databases that will not fit into the 4TB storage limit of general-purpose and business-critical service tier.
  •  Hyperscale database engines use distributed function architecture and use the following components (we will discuss the architecture of hyper-scale in a different blog post). This architecture is different from the existing database engine architecture 
    • Compute 
    • Page Server
    • Log Service 
    • Aure storage 
Business Critical  :
  • Available with all three deployment model (Single, elastic and managed)
  • Option to select 2 to 80 vCore. The options are 2 ,4 ,6 ,8 ,10,12,14,16,18, 20,24,32,40 and 80 vCore
  • Offers business applications the highest resilience to failures by using several isolated replicas.
  • It keeps 4 replicas for better high availability 
  • To offload the read workload from the writable replica, provide read replica without additional charge.
  • Use local SSD storage for better IO performance.
  • Provides up to 4 TB storage with 1 -2 ms latency (0.25 USD per GB)
  • 5000 to  204800 IOPS
  • 7 to 35 days backup retention.

Compute Tier: Under the service tier,  azure SQL server provides the compute tier. For managed instance and elastic pool deployment models, azure provides only provisioned compute tier and this option will not be available to choose. For the single database deployment, there is one more additional compute tier called Serverless. Let us see the difference between these two compute tier:

Provisioned : 
  • compute tier is provisioned through out the lifecycle of elastic poo/ single database/managed instance and billed hourly basis.
  • There is no change in the billing based on the workload on the database.
  • Suitable for regular usage patterns with higher average compute utilization over time.
  • Compute scaling should be done manually (increasing the vCore)
Serverless :
  • Suitable for Intermittent, unpredictable usage with lower average compute utilization over time.
  • Automatically scale up the compute based on the workload and billed per second.
  • Automatically pause the database if there is no activity for a preconfigured time (Auto pause delay)  and start automatically when the wake-up events happen. During this period compute is not charged.
  • While provisioning, we need to define the minimum and maximum vCore. 
  • Based on the workload, the compute tier is scaled up to the max vCore defined. Compute is billed per second.
  • Auto pause delay is a configurable parameter that defines the required inactive period before pausing the database. This can vary from 1 hour to 7 days. For example, if we define the auto-pause delay as 2 hrs, then if there is no activity in the database for 2 hours, then the database will go to pause state and will not be charged for computing until the next wake up event. Till that point, only the storage cost will be charged.
  • The minimum and maximum vCore are configurable parameters that define the range of computing power available for the database. The maximum vCore can vary between 1 to 16 vCore. (1,2,4,6,8.1,10,12,14 and 16 are possible options) . The minimum vCore can vary from 0.5 vCPU to maximum vCore configured. Each vCpu provides 3GB of memory. If the workload requires memory more than that will be charged.
  • Cost
    • Compute cost and storage costs are charged separately. Storage cost remains the same throughout the life cycle unless you scale up the storage.
    • When the compute usage is between the min vCore and max vCore configured, compute cost is based on vCore or memory.
    • When the compute usage is below the min vCore setting, it will be still charged for min vCpu/ Memory.
    • For the duration of the paused state, Computing is not charged. Only storage will be charged.
  • Auto pause and auto-scaling
    • When the database is in a paused state, the first wake up event (example connection request)  fails with the error message "database is unavailable" with error code 40613. It will take approximately one minute to resume the operation. Applications that are working with a serverless database should have mechanisms to handle this error message and retry the connection after a minute.
    • When the load on the database increases, azure automatically scales up the compute / memory up to the max vCore configured. while scaling up the compute, at the end of the process all connections to the database are dropped. The applications which use the serverless database should have a mechanism to handle these error and try to reconnect after a specific time to provide smooth user experience to the end-users. 
  • Billing 
    • Use per second billing.
    • Each vCore in action provides 3GB memory. If the workload uses more memory than the memory bundled with the vCore, billing will be based on the memory. For example, for a specific time interval, the load on the database is using  2 vCore and 10 GB  memory. In this case, billing will be based on the memory as the memory utilization is above 6 GB (2vCore*3 GB). To derive the number vCore to be billed from the memory utilization, memory utilized will be divided by 3 (3GB for each Vcore). In our case, it will be billed for 10GB/3 = 3.33 vCore. The below table gives you a better understanding of the billing. Let us consider one database, that is provisioned with 3 vCore as minimum vCore and 16 as maximum vCore.The auto pause delay is configured as 2 hrs. Each vCore seconds is charged  at 0.000145 
Serverless Billing calculation


The total vCore seconds are 389952 and it will cost 56.54 USD (0.000145 USD * 389952 vCore seconds) for the workload mentioned in the above table which spread across 24hrs.

  • Best suited for :
    • Single database with intermittent and unpredictable usage patterns.
    • The database workload is only during specific intervals. So after the auto-pause delay, the database enters into stop state and not charged for the compute.
    • Workload is not sensitive to the response time and the application is capable fo handling the errors while resuming the service and auto-scaling
Apart from connecting to the database, there are many other events that resume the database from a paused state. You can find those events in the Azure documentation page 


Hardware Configuration Azure provides multiple hardware options to choose from under the service tier. These options differ from region to region. Commonly available hardware options are 

  • Gen4 
    • Provide 7 GB memory for each vCore.
    • Approaching the end of life by Jan 2023
  • Gen5 
    • Has new generation hardware
    • Has network acceleration and provide better IO performance on remote storage  (General purpose service tier)
    • It uses faster local SSD disk than Gen4
  • M-Series (preview)
    • New Memory-optimized hardware 
    • Provide 29GB per vCore.
    • It provides up to 128 vCore.
    • In full capacity memory can be nearly 4 TB (128 vCore X 29 GB per vCore).
  • Fsv2-series (Preview)
    • Compute-optimized hardware.
    • Low CPU latency and high clock speed for the most CPU demanding workloads.
    • Can deliver more CPU performance per vCore than Gen5.
    • The 72 vCore size can also provide more CPU performance for less cost than 80 vCores on Gen5.
    • provides less memory and tempdb size per vCore than other hardware. so workloads sensitive to those limits may want to consider Gen5 or M-series.

The below diagram provides you a visual representation of vCore purchase model options for Single database and Elastic pool.

vCore purchase model

vCore Purchase Model


Managed Instance deployment model also provides General Purpose and Business Critical service tier. Under the service tier, it provides Gen4 and Gen5 hardware. 

Each combination of these options has many other characteristics like tempdb space allocated, log space allocated, IOPS, no of concurrent session, log write throughput, etc.   All those detailed information can be found in the following Azure documentation.

I hope, this post helped you understand the vCore purchase mode in detail.  





Sunday, 3 May 2020

Stairway to Microsoft Azure SQL Database Part 2 : Purchase Models DTU vs vCore

In the last post, we have discussed about different deployment options available with Azure SQL server. I hope you got a good understanding of various deployment models. In this post let us discuss the purchase models available with each deployment model.

Azure SQL database provides multiple purchase models based on the deployment model to accommodate your performance requirements and budgets. At a high level, these purchase models are divided into two:
  • DTU or eDTU purchase model
  • vCore purchase model
Each of these purchase model provides multiple service tier for more customization of your performance requirements and budgets.
  • DTU or eDTU 
    • Basic
    • Standard
    • Premium  
  • vCore 
    • General Purpose
    • Hyperscale
    • Business Critical
DTU and eDTU purchase model 

The database transaction unit (DTU) is a specific composition of compute power, memory, storage, and IO. This purchase model provides us all the resources as a bundle which includes a specific amount of compute power, storage, and IO. It provides the simplicity and full control of your budget. This purchase model available with single database and elastic pool deployment model. As mentioned earlier, Azure provides multiple service tier to match our performance requirements and budget.
  • Basic tier: 
    • Suitable for very low CPU intensive workload. 
    • Provides 5 DTU with 2GB of storage. 
    • Each DTU provides 1-5 IOPS. 
    • 35 days backup retention. 
    • Each DTU costs 1 USD per month. 
    • Suitable for initial design and developments. 
  • Standard tier: 
    • Suitable for low and medium CPU intensive workload. 
    • It provides a DTU range between 10 and 3000 DTU with 250 GB storage. 
    • Additional storage up to 1 TB with additional cost. To go beyond 250 GB, we have to opt for 100 or more DTU. 
    • 35 days backup retention. 
    • Each DTU provides 1-5 IOPS. 
    • Each DTU costs 1.5 USD per month.  
    • The range of DTU in this tier is divided into 9 compute size: S0 10 DTU), S1 20 DTU, S250 DTU,S3 100 DTU, S4 200 DTU, S6 400 DTU, S7 800 DTU, S9 1600 DTU, and S12 3000 DTU. There is no option for us to randomly select the DTU in the range of DTU.
  • Premium Tier: 
    • Suitable for medium and high CPU and IO intensive workload. 
    • Provides DTU range between 125 and 4000 with a storage of 500 maximum to 4 TB depends on DTU.
    • Each DTU unit provides 25 IOPS. That makes it suitable for IO intensive workload. 
    • It also provides better IO latency (2ms for read and write) compared to basic and standard (5 ms for read and 10 ms for write).
    • 35 days backup retention. 
    • Support column store index and in-memory OLTP
    • DTU range divided into multiple compute sizes. p1 125 DTU,p2 250 DTU ,p4 500 DTU, p6 1000 DTU p11 1750 DTU, p15 4000 DTU 
    • Premium tier also supports read scale-out to load balance the read workload with no extra cost. As part of high availability architecture, each database in the premium service tier is automatically provisioned with multiple secondary replicas. These secondary replicas are provisioned with the same compute size of the primary replica. Enabling this feature, allow you to load balance your read-only workload using one of the read replica instead of read-write replica. This is very useful for read-intensive application which can segregate their read and write operations by passing the readintent=true property in the connection string. 
    • Premium tier also supports Zone redundancy. As part of high availability architecture, each database in the premium service tier is automatically provisioned with multiple secondary replicas in the same data center. This architecture helps to protect us from the server failure. By enabling the Zone redundant feature, the replicas are placed in different availability zones, it makes the databases resilient to the data center outage 
For more accurate details of each service tier, refer the following Azure documentation :
While using the single database deployment model, the DTU purchase model comes into the picture. In the same way, for an elastic pool deployment model, the eDTU purchase model is used. eDTU purchase model also provides different service tiers like in the DTU model but with different limits. For the details of each service tier, refer to the Azure documentation. As you know, in the elastic pool deployment model, the resource is allocated at the pool level and each database shares the resource. To avoid one database consuming all the resources, there is a couple of parameter settings in the elastic pool level.
  • Max eDTUs per database: This is the maximum number of eDTU that any database in the pool may use if available based on utilization by other databases in the pool. Max eDTU per database is not a resource guarantee for each database. if the workload on these databases are spread across time, we can over commit the max eDTU per database. For example, we have an elastic pool with 1000 eDTU and 5 databases attached to that. Out of that 2 of them are reporting data and most load comes to those two at night. The other two databases are part of the OLTP system will have varying workload but the peak will be between 10 AM and 5 PM. The fifth database is an internal database most of the load on this database will be between 9 AM and 12 PM. In this scenario, we can safely set max eDTU per database to 300 or 400 even if the total goes beyond the 1000 eDTU. 
  • Min eDTUs per database: This is the guaranteed minimum number of eDTU for each database in the pool.  This can set to 0 and that is the default value. This value multiplied by the number of databases in the pool should not exceed the eDTU allocated to the pool. To set the value of this parameter to 100 of an elastic pool which has 20 databases, we need to allocate a minimum of  2000 eDTU to the elastic pool.
  • Max storage per database: This is the maximum size of any database attached to this elastic pool. None of the databases can grow beyond this limit. Even if we set this value to a higher number which exceeds, (no.of database X  Max storage per database), the total available storage space of the pool storage limit, it will never allow all database storage to grow more than available storage space fo the pool.

Each service tiers in the elastic pool has a limit on the individual database resource utilization. The below table explains those limitations:

In case if you are migrating your workload from your on-premise installation of SQL Server to Azure SQL and you are not sure how much DTU need to be allocated and which service tier to be provisioned, DTU calculator comes handy. You need to provide values of the following performance counters :
  • Processor - % Processor Time
  • Logical Disk - Disk Reads/sec
  • Logical Disk - Disk Writes/sec
  • Database - Log Bytes Flushed/sec
I hope with this you got a good understanding of the DTU and eDTU purchase model.  In the next post, we will cover the vCore purchase model.









Sunday, 26 April 2020

Stairway to Microsoft Azure SQL Database Part 1 : Deployment Options

Recently I started working with Azure SQL Database. It took some time for me to understand the concepts of the Azure SQL database. Earlier I have experience in AWS RDS for SQL server, but in Azure it works in a little complex way. Once you understand the different offerings, security, deployment model, etc, it is is easy to provision the resources as per the requirements.

Through a set of articles, I will try to explain the Azure SQL database in a very concise way. All this information is available in Azure documentation. For the granular level of information, always refer to Azure documentation.   

Microsoft Azure database is a fully managed platform as a service (PaaS). Azure provides three different deployment options to cater to the different needs of the customer.

  • Managed Instance 
  • Single Database 
  • Elastic Pool
Let us look into the details of each deployment model :

Managed Instance: Managed instance is one of the most suitable deployment option for those who are moving to cloud from an on-premise server. This deployment model supports most of our on-premise database features.
  • Managed instance does not provide access to the Operating system.
  • It does not allow you to specify the version of the SQL Server. Managed instance always runs the latest stable SQL engine version, which is equal to or higher than the latest available RTM version of SQL Server. 
  • Supports 
    • SQL Agent service 
    • DB mail with external SMTP server
    • Service broker
    • SQL Agent service 
    • Transactional replication
    • Change Data Capture (CDC)  
    • CLR creation from binary (Not using the assembly file)
    • linked server 
    • Managed instance supports SQL Server Integration Services (SSIS) and can host SSIS catalog (SSISDB) that stores SSIS packages, but they are executed on a managed Azure-SSIS Integration Runtime (IR) in Azure Data Factory (ADF)
    • To choose the time zone
  • Not yet supporting the file stream or file table.
On top of that, managed instance support all PaaS features like :
  • No separate purchase of infrastructure and managing the underlying hardware of the database server.
  • Automated patching and version upgrade. There is no option to control the maintenance window of patching or upgrade. To handle the connection error during the maintenance window, you need to implement the retry mechanism in your code.
  • No headache of scheduling and monitoring of backups. Automated backups are available 
Migration to the managed instance is much easier as this supports restoring from the native backup created from the on-premises server. To restore the backup in the managed instance, the backup should be available in Azure storage account and should use RESTORE DATABASE FROM URL. Managed instances also allow the customer to take COPY_ONLY backups which do not break the azure automated backup chain. 

Managed instance is placed inside the Azure virtual network and in a dedicated subnet. This provides :
  • Secure private IP address.
  • The ability to connect an on-premises network to a managed instance.
  • The ability to connect a managed instance or another on-premises database server through a linked server.
Managed instances also provide public endpoints. Public endpoint provides the ability to connect to the Managed Instance from the Internet without using a VPN. Access is disabled by default unless explicitly allowed. We need to explicitly whitelist the IP address to access through the public IP address.

managed instance provides two flavors of compute and storage :
  • General Purpose: This supports most of the production workload. Support up to 80 vCore and 8TB fast storage
  • Business Critical: For IO intensive and compute-intensive workload.Support up to 80 vCore and 4 TB super fast storage 
Managed instance also support to bring your own license with software assurance (SA). Considering all these, managed instances are the best option for lift and shift of your existing workload to the cloud. Note that, if you need direct access to OS/filesystem or dependent on a specific version of SQL server or required specific features that are not supported in the Azure SQL server then Azure Paas database service is not suitable unless you resolve those dependencies.

Single Database:  A single database is a fully-managed database as a service (DbaaS) under the Azure PaaS offering. This deployment option is most suited for modern application development like microservices. The single database deployment option creates a database in Azure with its own set of resources and is managed via a SQL database server. With a single database, each database resources are isolated from each other. A SQL database server is completely different from the SQL server that we used to have in the on-premise installation. In this context of azure DbaaS, an SQL database server is a logical construct that acts as a central administrative point for one or more single databases.
  • Single database does not provide access to the Operating system.
  • It does not allow you to specify the version of the SQL Server. Single database always runs the latest stable SQL engine version, which is equal to or higher than the latest available RTM version of SQL Server. 
  • Does not Supports 
    • SQL Agent service 
    • DB mail service
    • Service broker
    • Replication (can be push subscriber )
    • Change Data Capture (CDC)  
    • CLR creation 
    • linked server 
    • To choose the time zone
    • File Stream and File table
In short, Single database support only database-level features and does not support server level features.  To migrate your existing workload to a single database,  there are three primary methods.
  • Export to a BACPAC file and import the BACPAC file into azure single database. (BACPAC file is a ZIP file with an extension of BACPAC containing the metadata and data from a SQL Server database. A BACPAC file can be stored in Azure Blob storage or in local storage in an on-premises location and later imported back into Azure SQL Database or into a SQL Server on-premises installation). This is a time-consuming process and requires downtime. The time required to complete the migration depends on the size of the database.
  • Using transactional replication: Azure single database can be configured as a subscriber of your on-premise database publication. There is no UI available.  Need to configure through T-SQL. Once the complete data synched with the source server, on migration day cut down the traffic to your on-premise database and after synchronizing the data, point your application to the new Azure database 
  • Use the Azure DMS (Data Migration service ).
Azure database server does not allow you to host inside the Vnet. By default, the access is through the public endpoint.  All the traffic through this endpoint is blocked and you need to explicitly whitelist the required IP address to connect to the database server through this endpoint. There is no option to disable the private endpoint but as mentioned earlier, no one can connect through public endpoint unless you provide the access by whitelisting the IP addresses. You can enable the private endpoint by integrating virtual network private endpoints. Private endpoint helps to connect from the same VNet or peered VNet in the same/cross-region or from on-premises using a VPN.  Note that virtual network private endpoint will be billed separately 

Azure Database provides two types of purchase :
  • DTU base
  • vCore base
Users can select the option based on their workload. In the vCore model, azure allows you to bring your own license. We will discuss these options in another post.

Single database support PaaS features like :

  • No separate purchase of infrastructure and managing the underlying hardware of the database server.
  • Automated patching and version upgrade. There is no option to control the maintenance window of patching or upgrade. To handle the connection error during the maintenance window, you need to implement the retry mechanism in your code.
  • No headache of scheduling and monitoring of backups. Automated backups are available 

Elastic Pool: Elastic Pool Is a set of single databases with a shared resource pool. In a single database, each database will have its own dedicated resource. In the elastic pool, resources are configured in the pool level and each single database connected with that pool shares the resource of elastic pool. SQL Database elastic pools are a simple solution for managing and scaling multiple databases that have varying and unpredictable usage demands. For example, let us assume you are providing some SaaS solutions to multiple clients in different time zone. Each customer data are stored in independent databases. In this scenario, the resource can be utilized in a much better way by adding those databases to the elastic pool instead of allocating dedicated resources to each individual customer database. Apart from this, the rest of the characteristics are the same as the Single database mentioned in the above section.

Hope the below pictorial representation helps you to understand different deployment option available in Azure DbaaS.