Pages

Tuesday, 2 July 2013

SQL Server: Part 4 : All About SQL Server Statistics : Auto Update Statistics Threshold - Importance of Manual statistics maintenance

In the last post, we have discussed about the auto create and auto update property of the statistics. Do we really need to do manual statistics maintenance to keep the performance optimal? The answer is yes, depends on your work load. SQL server will do auto update statistics only when it reaches the threshold limits.When high volume of Insert/Update/Delete operation are happening, the inbuilt auto update stats is not good enough to get consistent  performance.

After a set of insert,delete and update, it may not be reflected in the statistics. If the SQL Server query optimizer requires statistics for a particular column in a table that has undergone substantial update activity since the last time the statistics were created or updated, SQL Server automatically updates the statistics by sampling the column values (by using auto update statistics). The statistics auto update is triggered by query optimization or by execution of a compiled plan, and it involves only a subset of the columns referred to in the query. Statistics are updated before query compilation if AUTO_UPDATE_STATISTCS_ASYNC is OFF, and asynchronously if it is ON. When statistics are updated asynchronously, the query that triggered the update proceeds using the old statistics. This provides more predictable query response time for some work loads, particularly those with short running queries and very large tables.

When a query is first compiled, if the optimizer needs a particular statistics object, and that statistics object exists, the statistics object is updated if it is out of date. When a query is executed and its plan is in the cache, the statistics the plan depends on are checked to see if they are out of date. If so, the plan is removed from the cache, and during recompilation of the query, the statistics are updated. The plan also is removed from the cache if any of the statistics it depends on have changed.

SQL Server 2008 determines whether to update statistics based on changes to column modification counters (colmodctrs).

A statistics object is considered out of date in the following cases:

 If the statistics is defined on a regular table, it is out of date if:


  1. The table size has gone from 0 to >0 rows (test 1).
  2. The number of rows in the table when the statistics were gathered was 500 or less, and the colmodctr of the leading column of the statistics object has changed by more than 500 since then (test 2).
  3. The table had more than 500 rows when the statistics were gathered, and the colmodctr of the leading column of the statistics object has changed by more than 500 + 20% of the number of rows in the table when the statistics were gathered (test 3).
The above part is from MSDN technical article,Statistics Used by the Query Optimizer in Microsoft SQL Server 2008


The first two condition are fairly good but when it comes third which handle the bigger table, some time the threshold will be too high to invalidate the statistics. For example consider a table with 1000000 records.Only after 200500 records modified (update/insert), the statistics will be invalidated to perform an auto update. 

Let us see do a hands on.

Use Mydb
SELECT * INTO SalesOrderDetail FROM AdventureWorks2008.sales.SalesOrderDetail
CREATE INDEX ix_ProductID ON SalesOrderDetail(ProductID)
SELECT * FROM SalesOrderDetail WHERE ProductID=725

I have created a copy of SalesOrderDetail table and created index on ProductId. Let us see the execution plan of the select statement.















Optimizer has selected  index seek along with bookmark lookup operation as optimized plan and it was able to complete this with 377 logical reads.

The salesOrderDetail table has 121317 records. As per the third condition mention above to invalidate the statistics, 20% of 121317 =24263 + 500 = 24763 records to be modified. Let us update just 5000 records of this table with productid 725 and run the same select statement.

SET ROWCOUNT 5000
UPDATE SalesOrderDetail SET ProductID=725 WHERE ProductID<>725
SET ROWCOUNT 0
SELECT * FROM SalesOrderDetail WHERE ProductID=725






















The estimated number of rows in the execution plan is 374 . This is based on the statistics gathered prior to update operation. The optimizer selected  index seek and bookmark lookup as optimal plan based on the statistical data.The select operation performed 5392 logical reads to complete the operation.

As a next step,let us update 19762 records  with productid 725. In effect we are updating 24762 (including 5000 record updated in the previous step) which is one record less than the number of records to be updated (24763) to invalidate the statistics.

SET ROWCOUNT 19762
UPDATE SalesOrderDetail SET ProductID=725 WHERE ProductID<>725
SET ROWCOUNT 0
SELECT FROM SalesOrderDetail WHERE ProductID=725

















The estimated number of rows is still 374 based on the statistics gathered prior to both updated statement. The optimizer selected index seek with bookmark lookup as optimized plan for the select statement based on the statistics. 25212 logical reads performed to complete the operation.

Now let us update one more record to invalidate the statistics.


SET ROWCOUNT 19762
UPDATE SalesOrderDetail SET ProductID=725 WHERE ProductID<>725SET ROWCOUNT 0SELECT FROM SalesOrderDetail WHERE ProductID=725





















As we expected the select statement triggered the auto update stats and in the plan the estimated number rows and actual number of rows are very close. That helped the optimizer to choose a better execution plan. Optimizer selected table scan rather than going for index seek and bookmark lookup operation. The select operation took only 1496 logical reads to select 25137 which much lower that the 25212 logical read to select 2516 records in the previous step.In the first step , when we updated the 5000 records itself , the optimizer might have selected table scan as optimized plan instead of index seek and bookmark operation if the statistics was got updated that moment. So that it can complete the operation with 1495 logical read instead of 5392 logical read, which is much better.

From this exercise, it is clear that the threshold for auto update statistics is not good enough to get optimal performance.This will be worst in bigger table. A manual statistics update is needed to guarantee optimal performance of the query but the frequency of the update depends on the workload.

Even though statistics become outdated after lot of DML operation,it will not get updated automatically till a query plan try to access that statistics .To make it more clear,SQL Server will automatically update a statistic when:
  • A query compiles for the first time, and a statistic used in the plan is out of date
  • A query has an existing query plan, but a statistic in the plan is out of date

If you liked this post, do like my page on FaceBook

11 comments:

  1. dongtam
    game mu
    http://nhatroso.net/
    http://nhatroso.com/
    nhac san cuc manh
    tư vấn luật
    dịch vụ thành lập công ty trọn gói
    văn phòng luật
    tổng đài tư vấn pháp luật
    thành lập công ty
    http://we-cooking.com/
    chém gió
    trung tâm ngoại ngữ

    - Vạn Biến Ma Công tu luyện cực kỳ khó khăn, người bình thường tu luyện, trong một trăm người thành công một mà thôi, nhưng chủ nhân muốn tu luyện dễ dàng hơn.

    Vạn Biến Thiên Ma nói, trong lòng không biết chủ nhân muốn tu luyện làm cái gì.

    - Thế thì tốt quá, ngươi trước dạy ta, chờ ta học xong cũng có chút bảo đảm.

    Nhạc Thành cao hứng nói, không thể nghi ngờ nếu Xích Ngạo Tuyết đem chuyện chính mình đánh chết Cực Sắc Đạo Nhân nói ra, thì hắn cũng chỉ có thể trở thành đối tượng truy nã của Tiên Đạo liên minh cùng Ma Đạo liên minh, đến lúc đó hắn có thủ đoạn Vạn Biến Thiên Ma, cũng an toàn h Cố Thành Vệ, đem ma công dạy cho ta cùng Diễm Ma, Thú Ma, chúng ta hiện tại dễ dàng khiến cho Hỗn Thế Ma Vương chú ý, về sau chúng ta biến hóa một cái bộ dáng mới xuất hiện.

    Lang Lễ nhìn Vạn Biến Thiên Ma Cố Thành Vệ nói.

    - Điều này không có vấn đề, chính là thay đổi một chút bên ngoài mà thôi, bất quá đây chính là

    ReplyDelete




  2. I loved the way you discuss the topic great work thanks for the share, Let me share this, vmware training in pune

    ReplyDelete
  3. شركة تنظيف خزانات بجدة
    شركة عزل خزانات بجدة
    شركة اصلاح خزانات بجدة
    شركة عزل خزانات في جدة
    قد اسسنا شركة مكافحة الصراصير بجدة توفر إليك خدمة رش مبيدات بجدة اللازمة في التخلص من الصراصير القاتلة والتي تسبب إليك الأمراض والإصابة بالكثير من الفيروسات وتنقل إليك العدوى والجراثيم إلى الأطعمة الخاصة بك أو من خلال الادوات التي تستخدمها بشكل شخصي وبالتالي إذا كنت ترغب في حل نهائي لهذه المشكلة فأليك الأن شركة مكافحة البق بجدة تعمل على تخلصك من بق الفراش بشكل نهائي وفعال وبأقل الأسعار ولديك ايضا شركة مكافحة النمل الابيض بجدة التي تقضي على الأرضة وحشرات الخشب التي تدمر المنازل

    ReplyDelete
  4. Thanks for sharing this post, is helpful for developers. Today i talk about Geek Squad who offers you various computer related services and accessories for customers. Geek Squad provides services in store, on door step, and over the Internet via remote access, and also provides a 24/7 and 365 days telephone and emergency on-site support and services.

    Geek Squad Support

    ReplyDelete
  5. من اجود وارخص الشركات التي توجد في منطقة مكة المكرمة والتي تعمل في مجال نقل العفش مع الفك والتركيب تلك الشركة الجيدة التي تقدم خدمات جيدة وتسمى اقوى شركة نقل اثاث بجدة التي تحافظ على ممتلكاتك ايضا وهي التي تختص بنقل العفش من بيت الى بيت آخر في مدينة جدة وما جاورها من مناطق تابعة لها وقد نضطر قبل نقل العفش الى المكان الجديد الى تنظيف المنزل الجديد قبل النقل من افضل شركات منطقة مكة المكرمة التي تعمل في مجال تنظيف وتعقيم المنازل تلك الشركة الجيدة التي تقدم خدمات جيدة في تنظيف المنازل من الداخل ومن الخارج وتسمى افضل شركة تنظيف شقق بجدة التي تختص بأعمال التنظيف للمنازل الجديدة والمفروشة ومن الأفضل ان تقوم بعمل مكافحة للحشرات بالتواصل مع افضل واقوى الشركات التي تقدم خدمات مكافحة حشرات المنزل البق والصراصير والنمل والعته تلك التي تستخدم مبيدات آمنة وفعالة ومضمونة مثل ارخص شركه مكافحه الصراصير بجده من مثيلاتها التي تتعامل في مكافحة الحشرات وتستخدم مبيدات آمنة ومضمونة ونحتاج ايضا الى تنظيف الخزان وذلك بالتعرف على اكبر وافضل واقوى شركات تنظيف خزانات المياه بجدة تلك التي تقوم بأعمال تنظيف وتعقيم خزانات المياه في جدة بمنتهى الدقة والاحترافية وتقدم شركة تنظيف خزانات بجدة اقوى وافضل الخدمات الجيدة في تنظيف وتعقيم الخزانات لكي تحافظ على الماء نظيفا ومعقما اطول فترة زمنية ممكنة لكي تكون مياهك نظيفة فان شركه مكافحه حشرات مستعدة لاستقبال مكالمتكم في أي وقت وطلب الخدمة

    ReplyDelete
  6. I am interested in this area if you are a great site but know some community forums. I think it should be a good opportunity to get feedback from others who are experienced. 국산야동

    Please visit once. I leave my blog address below
    야설
    국산야동

    ReplyDelete
  7. After looking into a few of the blog posts on your site, I seriously appreciate your way of blogging 중국야동넷

    Please visit once. I leave my blog address below
    야설
    중국야동넷

    ReplyDelete
  8. Great post! I completely agree that understanding the concepts of SQL Server is crucial for any DBA. Thanks for sharing your expertise! best seo services in gwalior

    ReplyDelete