Capacity Calculator for Storage Spaces
There is no official information (that I am aware of) on how to calculate the maximum capacity you can get from physical disks of different sizes in Storage Spaces. Based on partial documentation and logic, I arrived at a formula that all my tests so far have confirmed. But of course, use it at your own risk, since it is not official. The algorithm works for different disk sizes, resiliency types, and any column count. It is coded into a TSQL procedure you can use. Now you have a tool to predict the total capacity of your storage pool and make easier decisions when buying disks in the future!
Capacity depends on:
- Type of Virtual Disk (simple, 2-way mirror, parity, …)
- Number of Columns (disks written in parallel – stripe)
- Disk sizes (of course 🙂 )
The official FAQ is worth reading, particularly its table showing how the column count relates to the number of disks accessed “at once”, “in parallel”, or “in a stripe”. The more columns, the faster the Virtual Disk, but also the greater the chance of unusable space on mixed-size disks. The number of disks in a stripe equals the number of columns for every type except mirroring. To get the number of disks in a stripe (accessed in parallel) for mirroring, multiply the columns by 2 (or by 3 for a 3-way mirror).
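The column rule above fits in a few lines of code. This is a minimal sketch in Python rather than TSQL, so it runs without a SQL Server. The resiliency names used as dictionary keys are illustrative labels of my own, not Storage Spaces identifiers.

```python
# Number of data copies per resiliency type, following the rule in the text:
# only mirroring multiplies the column count.
# The keys are illustrative labels, not official Storage Spaces names.
COPIES = {"simple": 1, "parity": 1, "mirror2": 2, "mirror3": 3}


def disks_in_stripe(resiliency: str, columns: int) -> int:
    """Return how many physical disks are written in parallel."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    return columns * COPIES[resiliency]


if __name__ == "__main__":
    for kind, cols in [("simple", 2), ("mirror2", 1), ("mirror3", 2), ("parity", 3)]:
        print(kind, cols, "->", disks_in_stripe(kind, cols))
```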
If we have 3 disks: 1TB, 1TB, 2TB, what is the total capacity of a Simple (no resiliency) Virtual Disk with 2 columns? 2 columns = 2 disks in a stripe. We can allocate 1TB from each of the 2 biggest disks (blue “stripe”), then 1TB from each of the 2 disks that still have free space (green “stripe”) = 4TB total capacity. The entire space is utilized, no “leftovers”, although we have different sizes of disks! How cool is that? 🙂
The same example can be applied to mirroring with 1 column. 1 column = 2 disks in a stripe (because it is mirroring). That means we write to two disks at once. The picture is the same, but the capacity is half: 2TB, as the blue stripe gives 1TB and the green stripe gives 1TB too. That is the natural “cost” of keeping the same data in 2 places for resiliency.
Example with 5 disks: 1TB, 1TB, 2TB, 3TB, 4TB, for Virtual Disk with PARITY and 3 columns (minimum columns for parity):
3 columns = 3 disks in a stripe (3 disks “written at once” or “accessed in parallel”). Going from the disk with the most free space (the 4TB disk) downwards, we count as many disks as there are in a stripe, which is 3. The third disk in that order (the 2TB disk) has 2TB free, so we allocate 2TB on each of those 3 disks by painting them blue: the blue “stripe” is born! Again we count 3 disks starting from the one with the most free space. The third one has 1TB free, so 1TB is our next “max allocation” size. We allocate 1TB on each of the 3 disks with the most free space, resulting in the “green stripe”. Note that actual stripes are not as high as 1TB: by default they are 256KB high (the default “Interleave”, or allocation unit per physical disk). We use max sizes here in order to calculate the max capacity in as few steps as possible.
From the blue stripe we got 4TB of capacity (6TB of raw disk space minus one disk's 2TB share for parity). From the green stripe we got 2TB of capacity (3TB raw minus 1TB for parity). Total capacity is 4+2=6TB! The unusable “leftover” is 2TB (white). We can experiment with different column counts or disk sizes and try to find a combination that fills the entire space. Or, we can create another VD with columns=1 to use that remaining space, if we want to utilize every bit of space.
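The three worked examples above can be reproduced with a small sketch of the same greedy idea, written in Python so it runs anywhere. It is not the procedure itself. The usable fraction (1/copies for mirrors, (columns-1)/columns for single parity) is my reading of the examples above, and the function and label names are hypothetical.

```python
from fractions import Fraction

# Illustrative labels (not official identifiers) mapped to data copies.
COPIES = {"simple": 1, "parity": 1, "mirror2": 2, "mirror3": 3}


def max_capacity(disk_sizes, resiliency, columns):
    """Return (usable capacity, unusable leftover) in the units of disk_sizes.

    Use whole numbers (e.g. GB) for sizes so the arithmetic stays exact.
    """
    copies = COPIES[resiliency]
    if resiliency == "parity":
        if columns < 3:
            raise ValueError("parity needs at least 3 columns")
        usable = Fraction(columns - 1, columns)  # assumption: one column's worth is parity
    else:
        usable = Fraction(1, copies)  # every copy stores the same data
    disks_written = columns * copies

    free = list(disk_sizes)
    capacity = Fraction(0)
    while True:
        # Disks that still have room, the one with the most free space first.
        order = sorted((i for i, f in enumerate(free) if f > 0),
                       key=lambda i: free[i], reverse=True)
        top = order[:disks_written]
        if len(top) < disks_written:
            break  # not enough disks left to hold a full stripe
        chunk = min(free[i] for i in top)  # the smallest of them limits the chunk
        capacity += chunk * disks_written * usable
        for i in top:
            free[i] -= chunk
    return capacity, sum(free)


if __name__ == "__main__":
    print(max_capacity([1, 1, 2], "simple", 2))
    print(max_capacity([1, 1, 2], "mirror2", 1))
    print(max_capacity([1, 1, 2, 3, 4], "parity", 3))
```

Unlike the TSQL version, this sketch does not round each chunk to whole GB, so results can differ slightly when the sizes do not divide evenly.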
Capacity Calculator in TSQL
To make your life easier, I coded that algorithm into a TSQL stored procedure. The main part is here:
WHILE 1=1 BEGIN
    DELETE @TopN; -- otherwise rows would pile up from the previous iteration

    -- Pick the @disksWritten disks with the most free space
    INSERT INTO @TopN
    SELECT TOP(@disksWritten) d.DiskId, d.FreeGB
    FROM #disks d
    WHERE d.FreeGB > 0
    ORDER BY d.FreeGB DESC, d.DiskId DESC;

    IF @@ROWCOUNT < @disksWritten BREAK; -- not enough disks left to hold a full stripe

    -- The smallest free space among the selected disks limits the chunk
    SELECT @chunkSizeGB = MIN(t.FreeGB) FROM @TopN t;
    SET @capacityGB += ROUND(@chunkSizeGB * @disksWritten * @capacityPct, 0); -- stripe size * capacity%

    -- Decrease free space on the disks just used
    UPDATE d SET d.FreeGB -= @chunkSizeGB
    FROM #disks d
    JOIN @TopN t ON t.DiskId = d.DiskId;
END
Download entire procedure code GetStorageSpaceCapacity here!
It returns two result sets: a summary, plus a per-disk breakdown that is suitable for graphical presentation.
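To give an idea of what a per-disk breakdown could look like, here is a rough Python sketch that draws each disk as a row of letters, one letter per stripe and one character per TB, with dots for unusable leftover. The output format and the function names are my own invention for illustration, not what the procedure returns, and sizes must be whole numbers.

```python
def stripe_layout(disk_sizes, disks_written):
    """Return (stripes, free): stripes is a list of (chunk, disk indices)."""
    free = list(disk_sizes)
    stripes = []
    while True:
        # Disks with room left, the one with the most free space first.
        order = sorted((i for i, f in enumerate(free) if f > 0),
                       key=lambda i: free[i], reverse=True)
        top = order[:disks_written]
        if len(top) < disks_written:
            break  # a full stripe no longer fits
        chunk = min(free[i] for i in top)
        for i in top:
            free[i] -= chunk
        stripes.append((chunk, sorted(top)))
    return stripes, free


def render(disk_sizes, disks_written):
    """Return one text line per disk: letters are stripes, dots are leftover."""
    stripes, free = stripe_layout(disk_sizes, disks_written)
    lines = []
    for i, size in enumerate(disk_sizes):
        cells = "".join(chr(ord("A") + s) * chunk
                        for s, (chunk, members) in enumerate(stripes)
                        if i in members)
        lines.append(f"disk {i} ({size}TB): {cells}{'.' * free[i]}")
    return lines


if __name__ == "__main__":
    # The parity example: 5 disks, 3 disks per stripe.
    print("\n".join(render([1, 1, 2, 3, 4], 3)))
```

When several disks have the same free space, which of them ends up holding the leftover depends on the tie-break order. This sketch keeps the input order, while the TSQL excerpt orders by DiskId descending; the total capacity is the same either way.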
NOTE: If you are a developer and can implement a graphical presentation of the result, you are more than welcome to do it! I will put the link to your graphical calculator here. As an example, look at this RAID calculator.
- After adding a drive, we must “even out” the data across all disks by using the optimize command. Otherwise, the calculated max capacity won’t be reachable!
  - Run “Optimize-StoragePool -FriendlyName MyPool” after adding drive(s)!
- Make sure the OS is patched and the storage pool is upgraded to the latest version
  - Run “Update-StoragePool -FriendlyName MyPool”
- Every VD “expand” (and the initial “create”) operation takes space for metadata, around 1-2GB per disk, per operation.
  - E.g. on 10 disks, the initial create + 2 VD expands might take about 30-60GB for metadata in total.
- Since metadata is written to the disks, you should be able to plug them into a different Windows machine, and the Virtual Disks should be recognized correctly, in theory. I haven’t tried it yet. Let me know if you did, and I will update the post.
- Expand less frequently: use generous increase steps so you do not have to expand very often. But not too big either, as VD shrink is NOT possible.
- TB is not TiB! The actual space shown is about 9% (roughly 10%) less than declared
  - Not an error, just different units!
  - The 10% rule of thumb works only if the size is expressed in TB. The gap is smaller for GB (about 7%) and MB (about 5%).
  - E.g. disks declared as 4TB, 6TB, 8TB, 10TB will show in the OS as about 3.64, 5.46, 7.28, 9.09 TiB (roughly 10% less!)
  - Not related to Storage Spaces; that is how plain disk sizes are presented in different units.
- Not only can you add mixed-size disks to the pool, but you can even get away with zero leftover space!
- Use the capacity calculator TSQL procedure from blog.sqlxdetails.com to plan future disk purchases.
- Please comment if you find this calculator useful, or if you find errors. Your feedback is much appreciated. I am creating and sharing all of this for you. Thanks!
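For planning, the TB versus TiB conversion and the metadata estimate from the notes above are simple arithmetic. Here is a small Python sketch. The function names are hypothetical, and the 1-2GB per disk per operation figure is only the rough observation quoted above, not a documented value.

```python
TB = 10**12   # decimal terabyte, as printed on the disk label
TIB = 2**40   # binary tebibyte, the unit the OS actually counts in


def declared_tb_to_tib(tb):
    """Convert a declared size in TB to TiB."""
    return tb * TB / TIB


def metadata_overhead_gb(disks, operations, gb_per_disk=(1, 2)):
    """Rough (low, high) metadata estimate in GB.

    operations counts the initial create plus every later expand.
    gb_per_disk is an assumption taken from the observation in the text.
    """
    low, high = gb_per_disk
    return disks * operations * low, disks * operations * high


if __name__ == "__main__":
    for size in (4, 6, 8, 10):
        print(f"{size}TB declared = {declared_tb_to_tib(size):.2f} TiB")
    # 10 disks, initial create + 2 expands = 3 operations
    print(metadata_overhead_gb(10, 3))
```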