Server is sometimes slow and you want to know why? Here is a lightweight diagnostic SQL kit which will give you the answer. You need to put it on your server to start logging of performance info. This solution is already in use by many productions with mission-critical workload.
There are several parts to install:
- Perfmon counters – they log CPU, RAM, disk, number of TCP connections. As well as some SQL instance-level counters like lock wait time, memory, plan cache, batch requests, etc. SQL counters are added for every auto-detected sql instance on the machine.
- LongRunningStmt.sql – Extended Events session – logs slow queries on that SQL instance. Requires 1GB of disk space (rolling 10 files x 100MB). Any query above 5 seconds will be logged. If you have a ton of such queries, adjust the threshold a bit higher (eg 20 sec) to avoid catching too much events. Because we want to concentrate on the slowest ones first.
- BlockedProcessReport.sql – Extended Events session – in case a process is blocked for longer time (>10s), logs which statement was blocked by which statement, and other details about it.
Examine the scripts before running, and adjust paths on disk. Make sure disk has enough space (>2GB).
Change path and execute that SQL. It creates Extended Events session that logs slow queries.
Impact on production
This is super-lightweight logging. Overhead is zero or close to zero. It is safe to leave it running even after performance tuning is finished. Because app code changes, and when new issue arrives, you are ready with logged event. The only thing to keep eye on is disk space. Because perfmon log creates around 40MB per day and will not auto-delete old logs. Extended event logs are limited to 1GB (10 files x 100MB) and will auto-delete and not grow. One might want to adjust the thresholds for extended events to a higher value (parameters at beginning of each script), if extended events fills 1GB in less than 24h. Not because impact, but to have at least 24h of history recorded.
Why is impact so low? Because perfmon is already part of Windows. We just set which counters to track every 30 seconds (eg CPU%). Extended events are also existing part of SQL Server, we do not install any new tool. Extended events are lightweight by nature. We configure it to capture slow queries, blocked queries and errors. Those events are rare by nature, making this logging even more lightweight, with almost zero overhead. Runs on many production environments non-stop, because it is so light that stopping it would make no difference, and the data it logs is valuable. Even environments with super-heavy workload handle it flawlessly without a single “overhead” issue found so far.
2018-09-08 Version 1.3: Added BlockedProcessReport.sql, improved LongRunningStmt
2018-05-25 Version 1.2