8 Myths about Storage Spaces

Introduction

EVERYBODY should read this!

Whether you have a big database that needs speed, or other valuable data like documents or personal pictures you do not want to lose when a disk fails (disks DO fail without notice – it is just a matter of time), you should read this post.

There is a hidden gem in the Windows OS you can leverage for free. Not many people know about it or how much value it can bring. Read on and watch the demo in the video.

Benefits

What benefits can Storage Spaces give me?

Well, if you do not have a Windows OS, or you have only 1 disk, then none. For those who have, or plan to have, at least 2 disks (plus the OS drive), it can:

  • Increase the speed of your local storage, because multiple disks work in parallel for better overall speed.
  • Give you resiliency to the failure of one disk – or of two failed disks if you choose three-way mirroring. That can save you from losing your data if you do not have a backup (you should have a backup), or from ugly downtime caused by a disk failure.
  • Consolidate many DIFFERENT disks into only a few big, fast, resilient virtual drives, taking up fewer drive letters.
  • Expand storage capacity and speed by adding new physical disks, without changing ANY application or database settings – transparent to any software you use, no config changes needed; the software just sees that the “old” drive got bigger and faster.
  • On the SAME physical disks you can have fast “simple” (striping) virtual disks for temporary stuff, e.g. the tempdb database, scratch disks, the temp folder, and at the same time, over the same disks, you can create a resilient “mirror” or “parity” virtual drive which protects your data from at least one disk failure. You can, for example, place more important data on that virtual disk, which is a tiny bit slower but resilient to disk failure.
  • Cloud servers have disks which are mirrored anyway, so just create a “simple” (striped, RAID 0) virtual disk across all of the drives and enjoy a phenomenal increase in speed without losing capacity.

If you need speed, today’s normal, affordable M.2 drives easily reach 3200 MB/s and 500,000 IOPS from just one individual drive, e.g. the WD BLACK SN750. One can use a PCIe card like this one, which measures 10,000 MB/s (yes, 10 GB per second – watch the test here!) write and read speeds! Insane. And you can have 4 such cards if the motherboard permits, for a total of 16 M.2 disks:

Filled with 2 TB M.2 drives, that would give you 16×2 = 32 TB of space with a totally insane speed of 40 GB/s!!! In theory, at least.

UPDATE 2019-12-05: In practice, this card won’t work if your motherboard is not on its “supported” list, or your motherboard does not have an “x16 = 4 devices x4” option, or your CPU/motherboard does not have enough free PCIe lanes. That means it won’t work in most cases unless you combine all the components very carefully.

Terminology

Storage Pool = a set of physical disks. The disks can be very different in size and type. We create virtual disks on this set of physical disks.

Storage Space = Virtual Disk. It is created on a set of physical disks (a Storage Pool). You define the size and resiliency of the Virtual Disk.

Volume = a partition on a virtual disk. It can have a drive letter assigned.

Storage Spaces = Microsoft’s technology for turning local disks into arrays of disks and building virtual disks on them. A kind of software RAID. And it works very, very well.

Storage Spaces Direct = completely DIFFERENT from Storage Spaces. A different technology and purpose, but with a similar name to confuse people – like Java and JavaScript. While Storage Spaces is a much simpler technology that turns the local disks of one machine into arrays of disks, Storage Spaces Direct is like a virtual SAN: multiple machines in a cluster present their local disks as one big, resilient SAN storage, and it requires a very fast network between the cluster nodes to be efficient. Multiple machines acting as one storage.
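
To make the terminology concrete, here is a minimal PowerShell sketch of the whole chain – pool, virtual disk, volume. Names like “MyPool” and “MyVD” are just placeholders, and the subsystem name may differ on your machine, so treat it as a starting point, not a recipe:

# List disks that are eligible for pooling (unused, no partitions)
Get-PhysicalDisk -CanPool $true

# Storage Pool = set of physical disks
$disks = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "MyPool" -StorageSubSystemFriendlyName "Windows Storage*" -PhysicalDisks $disks

# Storage Space = Virtual Disk on that pool (here a thin-provisioned 2-way mirror)
New-VirtualDisk -StoragePoolFriendlyName "MyPool" -FriendlyName "MyVD" -ResiliencySettingName Mirror -ProvisioningType Thin -Size 500GB

# Volume = partition on the virtual disk, formatted and given a drive letter
Get-VirtualDisk -FriendlyName "MyVD" | Get-Disk |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -AssignDriveLetter -UseMaximumSize |
    Format-Volume -FileSystem NTFS -NewFileSystemLabel "MyVolume"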

8 Myths about Storage Spaces

Whether in the form of a myth (“We cannot…!”) or a question (“Can we…?”), here are 8 interesting things about Storage Spaces:

  1. Storage Spaces works only on a server OS; it cannot work on Windows 10.

NOT true! It works on Windows 10 quite nicely.

  2. With only 2 disks, can you have both mirrored and striped virtual disks at the same time?

Mirroring = resiliency, striping = speed and capacity. Yes, we can.
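
As a rough PowerShell sketch (the pool and VD names are placeholders, assuming both disks are already in the pool “MyPool”), the two layouts can coexist on the same 2 physical disks:

# Resilient 2-way mirror for the important data
New-VirtualDisk -StoragePoolFriendlyName "MyPool" -FriendlyName "DataMirror" -ResiliencySettingName Mirror -ProvisioningType Thin -Size 200GB

# Fast striped (simple) disk for temporary stuff, on the very same physical disks
New-VirtualDisk -StoragePoolFriendlyName "MyPool" -FriendlyName "TempStripe" -ResiliencySettingName Simple -ProvisioningType Thin -Size 100GB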

  3. Adding disks – if we create a pool with 2 disks initially, we cannot add just 1 disk; we must add the “column” number of disks we initially had (in our case 2).

NOT true! We can add 1 disk, or as many as we like.
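
For example, attaching a single new disk and adding it to an existing pool is just this (the pool name is a placeholder):

# Grab the freshly attached, still unused disk and add it to the pool
$new = Get-PhysicalDisk -CanPool $true
Add-PhysicalDisk -StoragePoolFriendlyName "MyPool" -PhysicalDisks $new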

  4. Adding a disk does NOT increase speed, only space.

It adds speed too! But on a server OS you will need to run the PowerShell command “Optimize-StoragePool -FriendlyName <YourPoolName>” to achieve that.

UPDATE: Sadly, this myth seems to be true (thanks Dinko!). “Optimize-StoragePool” only spreads data evenly over the disks, so the new disk gets data too. I got faster speed because I added a faster drive to the pool; if you add a same-speed drive, speed will not improve. Speed is determined by the number of “columns” – the number of disks used in parallel (this does not count the disks holding redundant mirror copies, but for simple and parity layouts it counts all disks, even the parity disk). When creating a Virtual Disk you can specify the number of columns only through PowerShell, and it is NOT changeable after you create the VD! Adding a drive to the pool does not increase the VD’s column count. To increase columns, you would need to create a new Virtual Disk with more columns and copy the data to it. Which is a real BUMMER! Microsoft, are you listening? We need the ability to easily increase columns – to gain speed by adding new drives, not just boring capacity!
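
To see where you stand, check the column count of your virtual disks; and if you do want more columns, you have to specify them explicitly when creating a new VD. A sketch (“MyPool” and “FastVD” are placeholders, and 4 columns assumes at least 4 disks in the pool):

# Show the column count and interleave of existing virtual disks
Get-VirtualDisk | Select-Object FriendlyName, ResiliencySettingName, NumberOfColumns, Interleave

# Columns can only be set at creation time, e.g. a simple VD striped over 4 disks
New-VirtualDisk -StoragePoolFriendlyName "MyPool" -FriendlyName "FastVD" -ResiliencySettingName Simple -NumberOfColumns 4 -ProvisioningType Thin -Size 1TB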

  5. Expanding a virtual disk – is it difficult, does it require a reboot?

Super easy, just a few clicks, online.
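
If you prefer PowerShell over the GUI, the expansion is roughly two steps – grow the virtual disk, then grow the partition inside it (the VD name, drive letter and size are placeholders):

# Grow the virtual disk itself
Resize-VirtualDisk -FriendlyName "MyVD" -Size 800GB

# Then grow the partition/volume to use the new space (here drive E:)
$max = (Get-PartitionSupportedSize -DriveLetter E).SizeMax
Resize-Partition -DriveLetter E -Size $max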

  6. Shrinking a virtual disk – is it possible?

Unfortunately, it is not possible. Expand when you need to, but do not over-expand, and do not expand too often – each expansion seems to consume some space (around 250 MB).

  7. Removing a disk – can you remove a disk that is used in virtual disks and holds data?

Sure we can, but we need to “prepare it for removal” first, to drain its data to the other disks.
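
In PowerShell the “prepare for removal” dance looks roughly like this (the disk and pool names are placeholders; the repair step is what actually drains the data):

# Mark the disk as retired so it gets no new allocations
Set-PhysicalDisk -FriendlyName "PhysicalDisk3" -Usage Retired

# Move the data of affected virtual disks onto the remaining disks
Get-VirtualDisk | Repair-VirtualDisk
Get-StorageJob    # wait for the repair jobs to finish

# Finally remove the retired disk from the pool
Remove-PhysicalDisk -StoragePoolFriendlyName "MyPool" -PhysicalDisks (Get-PhysicalDisk -FriendlyName "PhysicalDisk3")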

  8. Failing disks – can mirroring/parity in Storage Spaces really protect your data when a disk crashes? That calls for a test of disk failure!

Yes, it protects your data against the failure of 1 drive.
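
After a disk dies (or you pull one out to test, like in the video), a quick way to see what Storage Spaces thinks happened is to check the health of the pool, the virtual disks and the physical disks (the pool name is a placeholder):

Get-StoragePool -FriendlyName "MyPool" | Select-Object FriendlyName, HealthStatus, OperationalStatus
Get-VirtualDisk | Select-Object FriendlyName, HealthStatus, OperationalStatus
Get-PhysicalDisk | Select-Object FriendlyName, HealthStatus, OperationalStatus, Usage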

The Demo

You can see all of that “in action” in this video:

(Music from https://filmmusic.io
“Verano Sensual” by Kevin MacLeod https://incompetech.com
License: CC BY http://creativecommons.org/licenses/by/4.0/)

UPDATE 2019-08-01 about Myth 4 – Adding a disk on a server OS

After adding a new physical disk, we should “Optimize” the pool, i.e. spread the existing data evenly over all disks (some call it “rebalance”). In the video at 9:09 you can see that Windows 10 has an “Optimize” checkbox for this. But server Windows has a quite different GUI and that checkbox is missing. So how do we invoke “Optimize” after adding new disks to a pool on a server OS?

The PowerShell command “Optimize-StoragePool” to the rescue! As described here and here, you might also have to run “Update-StoragePool” if the pool was created before Windows Server 2016.

“Storage Spaces Direct automatically optimizes drive usage after you add drives or servers to the pool (this is a manual process for Storage Spaces systems…). Optimization starts 15 minutes after you add a new drive to the pool. Pool optimization runs as a low-priority background operation, so it can take hours or days to complete, especially if you’re using large hard drives.”
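
Putting it together, after plugging a new disk into a server the whole sequence would be roughly this (the pool name is a placeholder; Update-StoragePool is only needed for pools created on an older Windows version):

# Add the new disk, upgrade pool metadata if needed, then rebalance the data
Add-PhysicalDisk -StoragePoolFriendlyName "MyPool" -PhysicalDisks (Get-PhysicalDisk -CanPool $true)
Update-StoragePool -FriendlyName "MyPool"
Optimize-StoragePool -FriendlyName "MyPool"
Get-StorageJob    # rebalancing runs in the background; watch its progress here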

Summary

If you have, or plan to have, more than 2 drives, leverage Storage Spaces to get simpler disk management, faster disk speeds, and resilience to disk failure. It is free and very, very efficient. I personally used Storage Spaces to save some very big database servers in the cloud (Windows Server 2016 OS) that suffered from serious disk issues. After applying Storage Spaces, the disk problems went away and they got the speed they needed.

Useful Links

  • “Storage Spaces FAQ” from microsoft.com – very good, answers many questions.
  • “Understanding Storage Space Internal Storage” from itprotoday.com – tries to explain columns and interleave
  • “Dell Storage with Microsoft Storage Spaces Best Practices Guide” from dell.com – very detailed, also click on their other pages that explain Interleave, Virtual Disks – excellent stuff, a must read!
  • “Storage Spaces Demo” from IT Free Training – near the end explains tiering.

28 Comments on “8 Myths about Storage Spaces”

  1. Hi Vedran!

    Nice article! It is difficult to find such a well-summarized article about Storage Spaces on the Internet. Please explain a little bit more about Myth #4. You said that by adding physical disks you also increase speed. In my experience you have to recreate the virtual disk to utilize an additional number of columns, because more columns = more speed. It is not possible to increase the number of columns without destroying the virtual disk, so just adding a physical disk will not automatically increase speed. Please clarify what you had in mind…

    • Thanks Dinko for the comment!
      At 9:09 in the video, while adding a new physical disk, you can see a checkbox “Optimize drive usage to spread existing data across all drives”. In a server OS (e.g. Windows 2016) the GUI for Storage Spaces is quite a bit different, and as I remember it does not include that checkbox. Maybe there is a way to do that in a server OS (hidden in the GUI, or PowerShell)? Without that “spread the data” step, it is logical not to see an increase in speed when adding new drives.

      • There is an Optimize-StoragePool PowerShell cmdlet in Windows Server 2016 that, according to Microsoft documentation, does the following:

        “The Optimize-StoragePool cmdlet rebalances the Spaces allocations in a pool to disks with available capacity.

        If you are adding new disks or fault domains, this operation helps move existing Spaces allocations to them, and optimization improves their performance. However, rebalancing is an I/O intensive operation.”

        I just did a performance test on my Windows Server 2016 (14393.3085) with 1 virtual disk (mirror, 1 column, 2 physical disks in the pool) before and after adding 2 additional physical disks. The storage pool was optimized after adding the physical disks, which rebalanced the data across them. However, I measured the same performance before and after. Also, it did not change the column count of the virtual disk. But I still think the statement about a performance increase is correct, because if you have multiple virtual disks in the same pool, an unbalanced data spread could stress some physical disks more than others. If you have only one virtual disk in the pool, there is no difference.

        • Thanks Dinko for your valuable insight and measuring. It is a sad thing to say that we do not get speed by adding disks (except maybe a little, due to the rearranging of data), because adding a new Virtual Disk with more columns, copying all the data over, and removing the old VD is a real bummer and a downtime.
          I hope Microsoft will do something about it and give us the ability to increase the column count without wasting double the disks and a huge downtime. I corrected the post according to your findings – thanks again!

  2. Great summary!

    I am a photographer using storage spaces on an external drive that is networked through our main computer. Love how easy it is.

    I just discovered that I could not expand a storage space past a ceiling created by the cluster size (the file system has a maximum size)… I created my storage space at 10 TB because at the time I didn’t need more space and I didn’t have more discs in the enclosure – and the storage space dialog just says to add more drives when you need to. The cluster size ended up at 4 KB, which gives me a maximum of 15.99 TB in the volume; if I try to make it larger than that, it doesn’t let me.

    I found the following info from a user on Reddit …
    Cluster size – max volume
    4kb – 16TB
    8kb – 32TB
    16kb – 64TB
    32kb – 128TB
    64kb – 256TB

    This means that if you know you are going to eventually want your virtual drive to be 30TB, you will want to initially create your storage space at more than 16TB even if you only have 2TB of physical drives to start. The initial volume size sets the cluster size.

    One thing you didn’t mention that I saw in some Storage Spaces Direct documentation is the best practice of leaving unallocated space in the storage pool; that way, if a drive dies, Windows will start using the unallocated space to rebuild onto the replacement drive. So for my setup I have 6 6TB drives in the pool and I am using two-way mirror, so I am looking at a storage pool of 36TB with a 15TB volume (mirrored = 30TB of data), which leaves 6TB unallocated – equal to the size of a drive. I haven’t seen this best practice mentioned for Storage Spaces and don’t know if it will rebuild, but thought I would ask 🙂 … Windows 10 documentation for Storage Spaces is slim!

    Thanks!

    • Hi Roger,
      Virtual disk is “thin” provisioned, right? Because fixed-provisioned cannot grow. I think for a 2-way mirror you need to add at least 2 disks to be able to use their space. Run “Optimize drive usage” to help redistribute drive usage across existing disks and get some space from that. You mentioned it is a good practice to leave space of at least one disk – agreed.
      You also need a backup if data is important. I usually use main storage which is array of local disks in simple (striping) mode which is fast but prone to data loss, plus a backup on external storage eg. Synology. Backup configured in a way to be resilent to crypto viruses, like this: https://blog.sqlxdetails.com/crypto-virus-resistent-backup/

  3. Hi,

    I was wondering if you have found or know of a way to monitor the storage tier usage within a virtual disk.

    I.e. if my VD is a total of 300GB,
    the performance tier is 100GB and the capacity tier is 200GB,
    then I would like to know how much is used of the 100GB.

    • I haven’t experimented with tiered storage yet, but I guess if you explore the PowerShell objects around tiers, you might find properties that show you tier usage information. Maybe somebody else who reads this can answer more precisely.

  4. Hi there,

    I found out that an SS created in Win10 v1809 has worse disk write performance compared to an SS created in Win10 v1607.

    for comparison:
    SS created in Win10v1607 benchmark: 330MB/s
    SS created in Win10v1809 benchmark: 240MB/s

    *both benchmarks on the same platform & hard disks
    *The benchmarks were run on Win10v1607 and Win10v1809

    Jemiruddin

    • Hi,
      Speed highly depends on the number of “columns”, which means the number of disks accessed in parallel (simultaneously). The column count is set at the creation of each Virtual Disk and cannot be modified afterwards. It also cannot be more than the number of physical disks. Your situation looks like you had 3 columns before and now you have 2. Maybe the default changed in the newer version if you did not specify it explicitly, and that caused the slower speed (I usually specify it when creating the VD using PowerShell). Check the column count with this PowerShell command:
      Get-VirtualDisk | select FriendlyName, ResiliencySettingName, Interleave, NumberOfColumns

      • Hi,

        The configuration for both SS created in 2016 & 2019 is exactly the same. No difference at all. Just one was created in Win10v1607 and the other in Win10v1809.

          • If you checked the number of columns with the given PowerShell command, and it is the same as before, then I do not know the reason. If you created the SS using the GUI (not PowerShell), then in newer Windows versions the default number of columns for a “simple” VD changed and is 1 less than the number of disks, i.e. 1 less than the previous default. That would give you a slower speed, exactly as you described. And it can be corrected by creating the SS with an explicitly specified number of columns, not relying on the default.

            • The SS was created in PowerShell, which I already saved as ’15colunm2parity.ps1′. Basically, there is no difference in the configuration at all.

  5. Thanks Jemi for the report. I don’t know what is causing this if the number of columns (you have not provided that, though) is the same as before and everything else except the version is the same. The same script can (and will) result in different configuration options on different versions (e.g. a different default column count). If the resulting configuration is exactly the same, then it really would be a version difference and you should file a bug with Microsoft. It is a considerable slowdown, a really big difference. In a “simple” layout, with column count = disk count, you should get a resulting speed about the same as the sum of all the single-disk speeds. E.g. if each disk alone gives 100 MB/s, with 3 disks you should get around 300 MB/s.

  6. Hi Vedran.
    I am planning to purchase an OWC Express 4M2 4-Slot M.2 NVMe SSD Enclosure to use as fast external (Thunderbolt 3) storage for video editing. It is basically the same as your ASRock quad card, but external (and not as fast, of course).
    My question is, will I be able to simply connect this device, with a Storage Spaces RAID pool on it, to another Windows computer and work with it? In other words, is the pool tied to the machine that created it?

    • Hi Žarko! Thanks for being active and asking. The question is – if your ultimate goal is to speed up video editing, is storage really your biggest bottleneck? If you use Premiere, a much faster workflow might be to use Adobe Media Encoder to convert all source videos to near-lossless QuickTime “ProRes 422 (HQ)”. It is fast to edit. Press “Render” whenever you take a break. For export, use “match source” and tick “Use previews” to export to “ProRes 422 (HQ)” with super-high speed. Then use Handbrake for a quick conversion into H264/265 (MP4). See here: https://www.youtube.com/watch?v=AT8sU0MyncA
      Answer to your original question: to be honest, I haven’t tried. I returned my ASRock card because my motherboard had a limited number of PCIe lanes (16 is the “norm”) and almost all were already taken; therefore only 1 NVMe was recognized and the others were ignored because of the lack of PCIe lanes (internal “bus” connections) on the motherboard. If your computer sees the individual NVMe drives in that enclosure, and you set up Storage Spaces with a certain order of disks, you would have to set up the same Storage Spaces on the other PC with the same order of disks. I think it is very risky, likely to corrupt data, and probably unsupported.
      But here is the good news: good NVMe drives are fast enough to use as individual drives inside this box, no Storage Spaces required. Then you should be able to move the box around.
      IMHO, laptops are generally slower than desktops and are not the best tool for editing video, unless you really need portability. The fastest would be a desktop filled with SSDs and NVMes in Storage Spaces, and a very fast graphics card and CPU.
      But hardware aside, the described video editing technique gives you a really fast way to edit and export videos, even on laptops.

      • Unfortunately, the box in question does not have x4 bandwidth for each M.2 slot, hence the need for RAID. Otherwise, the throughput is capped at around 700 MB/s per populated slot.

        My current workflow is exactly as you suggested, and it worked just great until recently when I got a new contract that will not let me have the luxury of time for proxy rendering. I mean terabytes of video footage weekly.

        For this reason I am getting a powerful new workstation that will be able to crunch the source files (in multiple layers). This thunderbolt box would be a nice bonus, eg. for sharing the project with my colorist – that is why I asked if it could be transferred to another machine. I would not use it as a primary working unit since internal drives in the workstation will go up to 5GB/s.

        • Try Handbrake. In my case it encodes 60x faster than Adobe Media Encoder, and I’m OK with its limitations on the resulting format (MP4) and resolution. Smaller, lower-res files are faster to manage, and they can be swapped into your Premiere projects by right-clicking the original file -> Replace Footage. Then you can select any file, but at least the fps should match. Editing, and even the colorist, might work with these smaller files. On render you switch back to the original files. It is like the proxy method, but more flexible.

          • I know all about Handbrake, but I am trying to avoid any proxy rendering, hence the new workstation. I am not new here, been working professionally for the past 20 years. Mostly, smooth editing with source footage is not about hard drive throughput (only for red 8k and similar) but about CPU crunching power. And while I (the editor) can work with proxy files, the colorist cannot. He has to work with the original footage. BTW, when encoding into the same kind of file at same bitrate, Handbrake is about 15-20% faster than AME, and it also produces a gamma shift with certain type of footage which is unacceptable.

          • Thanks for the info! I’m not nearly as proficient in video, only as much as I need to record SQL Server videos. Can you please share your findings from disk test here? Then many people can benefit. Thank you!

    • Now I have read about the device you mention, and it seems it has its own software for combining disks, “OWC SoftRAID”. If you install that same software on two computers, I believe it should work – you should be able to switch computers. They mention Storage Spaces only for the situation where one enclosure is not enough: you would buy e.g. 2 such enclosures, fit 8 M.2 SSDs in them, and combine the two using Storage Spaces. That probably would not be “movable” to another computer, but I think you need only one box with 4 SSDs, so you should be good. But I haven’t used that device, so it is safest to ask their support or someone who has it.

      • SoftRAID unfortunately currently works only on macOS, and this product is targeted at Macintosh users. They were mentioning Windows beta tests, but I haven’t seen anything in the news lately.

        • That means on Windows you would just see 4 individual drives until this software comes out. One risky way, which can result in data loss and maybe does not work, is this: when you join a drive to a Storage Pool, its data is erased. Therefore, while the drives are empty, join all of them to a storage pool on one computer, then unplug, plug into the other computer, and join them to a storage pool there too. Create a Virtual Disk and put some files on it. Unplug, and plug back into the first computer. I don’t think it will work (i.e. that the Virtual Disk will be visible), but you can try. This post gives hope that it might work, because the config data is written onto the pool itself: https://serverfault.com/questions/550136/can-storage-spaces-drives-be-moved-to-a-replacement-server-when-there-is-a-failu

          • That is an interesting idea. I will try it beforehand using my dual hard drive dock. Thanks.

        • Žarko, one more thing you can do IF the colorist is on the same local network (same building?): a NAS storage. Bump up the RAM inside, and buy a 10 Gbit card if higher speed is needed. But these video editors claim 1 Gbps is fast enough to edit even 4K materials directly from the NAS, and it enables simultaneous work by multiple editors and colorists on the same files: https://www.youtube.com/watch?v=4jgEHyx3Kp0
          And this: https://www.youtube.com/watch?v=U_XBGD12veI
          A DS1819+ gives around 400-500 MB/s over 10 Gbps.
          Even if you are alone in the building, I highly recommend getting a NAS, even over a 1 Gbps network. It is great for backups and many more things.

          • Unfortunately my colorist is off-site, and while I have a sufficient internet bandwidth to send it to him online (500/250Mbps), he is still on ADSL. Funny you should mention Synology, as I do own one DS1815+ in RAID6 environment, which was a godsend. It is the single most useful device I have ever bought. I see that you are a tinkerer, maybe this will tickle your fancy: just google for xpenology project.

  7. I need some advice regarding Windows Storage Spaces. If I have several small drives in the pool and then add a big drive, will I be able to use all the space on the big drive? I want to set up the pool with parity.

    • For a given set of disks, a VD’s max capacity depends on the column count you choose and the type of resiliency (simple, parity, 2-way mirror, 3-way mirror). I’m not aware of any formula or official way to find the physical disk footprint of a thin-provisioned VD when its max capacity is reached. If your column count is 3, which is the smallest for parity (2 + parity), and you have disks of 100GB, 200GB and 300GB and then add a 1000GB disk to add space, I think your max capacity is 200GB of capacity taking 300GB of space on the first 3 drives, then another 200GB taking 300GB of space on the 2nd-4th drives. Total max capacity is 400GB taking 600GB of space, leaving 100GB free on the 3rd drive and 900GB free on the 4th drive – not nearly close to utilizing all the available disk space. This is just my assumption on how it works, and it needs to be tested. You can check the current footprint on the physical disks via PowerShell, and there is also an “optimize” command which redistributes the data more evenly across the disks, e.g. after you add a fresh drive. But that won’t move your max capacity limit. The only ways I can think of are to add more big drives (ideally 3 big ones) so they can be fully utilized, OR to create another VD, a 2-way mirror with 1 column, that would be able to use the remaining space. Even a 2-way mirror cannot use the entire space if your 2 biggest disks are of unequal size. Only a 1-column simple VD would always utilize the entire space, but it has no resiliency and is not very fast (because of only 1 column).
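
      If you want to check the actual footprint yourself, here is an (untested) sketch – each VD’s FootprintOnPool versus the pool’s total and allocated size (the pool name is a placeholder):

      Get-VirtualDisk | Select-Object FriendlyName, Size, FootprintOnPool
      Get-StoragePool -FriendlyName "MyPool" | Select-Object FriendlyName, Size, AllocatedSize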
