Recently, we have observed a number of cases where DBAs or application developers are complaining about out-of-Memory errors or even machine not responding (hangs) despite the fact that there is plenty of available memory on the system. Frequently this is on systems where SQL Server is running and the issue occurs after increasing the 'max server memory' sp_configure option, again with plenty of memory available. For example on a system with 64-GB RAM, SQL Server's 'max server memory' is set to 54 GB and you start getting OS errors 1453 or 1450 for example or any other memory-related issue. In some cases the OS stops responding. The key symptom was that when the DBA reduced the SQL Server max server memory below 50 GB (below 80% of RAM on the machine), the problem stopped happening. So logically, you would think there is a problem with SQL Server.
In this one case, the machine where SQL Server was running was experiencing "hangs" - lack of response. One of our colleagues from the Windows team (kudos Syed Yusuf) analyzed a manually-generated kernel memory dump.
3: kd> !mex.mem Page File: ??D:pagefile.sys Current: 68157440 Kb Free Space: 68000152 Kb ( 65.00 GB) Minimum: 68157440 Kb Maximum: 68157440 Kb ( 65.00 GB) Physical Memory: 16596487 ( 66385948 Kb) ( 63.31 GB) Available Pages: 2254327 ( 9017308 Kb) ( 8.60 GB) ----- Available memory looks good. ResAvail Pages: 1118 ( 4472 Kb) ( 4.37 MB) ----- ResAvail Pages are the Minimum Working Set that process reserves/set while starting. The value of ResAvail is 4.73 MB which is too low. Even though overall available memory is 8.6 GB, there’s some process which has set Min Working Set to a huge value. This makes available pages reserved to this process and cannot be allocated to anything else. ********** Running out of physical memory ********** Locked IO Pages: 0 ( 0 Kb) ( 0) Free System PTEs: 4294986002 (17179944008 Kb) ( 16.00 TB) ******* 3 system PTE allocations have failed ****** Modified Pages: 18188 ( 72752 Kb) ( 71.05 MB) Modified PF Pages: 18181 ( 72724 Kb) ( 71.02 MB) Modified No Write Pages: 0 ( 0 Kb) ( 0) NonPagedPool 0 Used: 41 ( 164 Kb) ( 164.00 KB) NonPagedPoolNx 0 Used: 18074 ( 72296 Kb) ( 70.60 MB) NonPagedPool 1 Used: 66 ( 264 Kb) ( 264.00 KB) NonPagedPoolNx 1 Used: 17231 ( 68924 Kb) ( 67.31 MB) NonPagedPool Usage: 409 ( 1636 Kb) ( 1.60 MB) NonPagedPoolNx Usage: 68905 ( 275620 Kb) ( 269.16 MB) NonPagedPool Max: 4294967296 (17179869184 Kb) ( 16.00 TB) PagedPool 0 Usage: 58073 ( 232292 Kb) ( 226.85 MB) PagedPool 1 Usage: 21277 ( 85108 Kb) ( 83.11 MB) PagedPool 2 Usage: 21274 ( 85096 Kb) ( 83.10 MB) PagedPool Usage: 100624 ( 402496 Kb) ( 393.06 MB) PagedPool Maximum: 4160749568 (16642998272 Kb) ( 15.50 TB) ********** 5625 pool allocations have failed ********** Session Commit: 9360 ( 37440 Kb) ( 36.56 MB) Shared Commit: 55767 ( 223068 Kb) ( 217.84 MB) Special Pool: 0 ( 0 Kb) ( 0) Shared Process: 24208 ( 96832 Kb) ( 94.56 MB) Pages For MDLs: 656 ( 2624 Kb) ( 2.56 MB) Pages For AWE: 12704397 ( 50817588 Kb) ( 48.46 GB) NonPagedPool Commit: 70902 ( 283608 Kb) ( 276.96 MB) PagedPool Commit: 100624 ( 402496 Kb) ( 393.06 MB) Driver Commit: 11318 ( 45272 Kb) ( 44.21 MB) Boot Commit: 196761 ( 787044 Kb) ( 768.60 MB) System PageTables: 1593 ( 6372 Kb) ( 6.22 MB) VAD/PageTable Bitmaps: 11287 ( 45148 Kb) ( 44.09 MB) ProcessLockedFilePages: 0 ( 0 Kb) ( 0) Pagefile Hash Pages: 86 ( 344 Kb) ( 344.00 KB) Sum System Commit: 482562 ( 1930248 Kb) ( 1.84 GB) Total Private: 14023786 ( 56095144 Kb) ( 53.50 GB) Misc/Transient Commit: 24642 ( 98568 Kb) ( 96.26 MB) Committed pages: 14530990 ( 58123960 Kb) ( 55.43 GB) Commit limit: 33635847 ( 134543388 Kb) ( 128.31 GB) System Region Base Address NumberOfBytes NonPagedPool : ffffd18000000000 100000000000 PagedPool : ffffa40000000000 f8000000000 SystemCache : ffffe18000000000 100000000000 SystemPtes : ffff900000000000 100000000000 UltraZero : ffffba0000000000 100000000000 CFG : ffffca0000000000 60000000000 Hyperspace : 0 8000000000 SpecialPool : ffffa00000000000 10000000000 SessionSpace : fffffb8000000000 8000000000 PagedPoolWorkingSet : ffffa20000000000 8000000000 SystemCacheWorkingSet : ffff890000000000 8000000000 System Images : fffff50000000000 8000000000 Virtual Memory Physical Memory File Cache Cache Writes 3: kd> !minws Process name msmdsrv.exe Min WS: 12.66 GB ---------------> MSMDSRV has set Min WS to be 12.66 GB Total working set size for all processes: 4.13 GB 3: kd> !mex.p ffffd188d8c1b080 Name Address Ses PID User Name Create Time Up Time Mods Handle Act Thrd Z Thrd Parent =========== ================ === ============ ===================== ========================== ============ ==== ====== ======== ====== ========================= msmdsrv.exe ffffd188d8c1b080 0 e90 (0n3728) DOMAINACCOUNT$ 07/03/2017 08:48:16.035 PM 9h:49:55.964 82 1824 100 0 services.exe 3f8 (0n1016) Command Line: "C:Program FilesMicrosoft SQL ServerMSAS13.POWERPIVOTOLAPbinmsmdsrv.exe" -s "C:Program FilesMicrosoft SQL ServerMSAS13.POWERPIVOTOLAPConfig" Memory Details: VM Peak Work Set Commit Size PP Quota NPP Quota ========= ======= ======== =========== ========= ========= 852.57 MB 1.84 GB 57.73 MB 117.86 MB 571.65 KB 64.37 KB CPU Details: User Kernel Total ===== ====== ===== 328ms 298ms 626ms
The same problem , but no SQL Server or Analysis Services, was described in this blog.
The big gotchas came from the fact that Performance Monitor counters showed plenty of available physical memory and working set sizes for all processes were quite reasonable. So don't expect to find this in Perfmon, it just isn't there.
Solution
The Analysis Services experts joined the t-shooting (thank you, Yinn Wong) and as it turns out that by default the SSAS tabular instances, sets minimum working set to 20% of the physical memory on the system. The system was using POWERPIVOT SSAS? Adjusting the VertiPaqMemoryLimit to a fairly low value or a fixed value reduced the min working set reserved by SSAS and thus the problem went away. Of course, you need to do some benchmark testing to ensure you set the right value to match performance requirements.
There is whitepaper (https://msdn.microsoft.com/en-us/library/jj874401.aspx ) and doc about setting SSAS memory for tuning https://docs.microsoft.com/en-us/sql/analysis-services/server-properties/memory-properties
Also, in several other instances it turned out SSAS was not even needed and used. Therefore, in those cases that service was stopped and set to manual or uninstalled.
Namaste!
Joseph
This post first appeared on MSDN Blogs | Get The Latest Information, Insights, Announcements, And News From Microsoft Experts And Developers In The MSDN Blogs., please read the originial post: here