SDFS is fast. On a single server-class CPU with standard spinning disks, it runs at speeds of over 2 GB/s for highly redundant data. On unique data, its performance is bound by your storage or, if you're sending data to the cloud, by your cloud bandwidth. It doesn't need any special hardware; it just uses what you give it.
SDFS also scales quite well. It uses a custom hash table implementation optimized for deduplication. The hash table grows in segments and keeps its data inline, so there is little fragmentation. Older, less frequently accessed segments of the hash table are kept cold on disk, while newer or more frequently accessed segments are kept in memory. This keeps deduplication fast even at large scale.
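
To make the hot/cold segment idea concrete, here is a minimal sketch in Python. It is not SDFS's actual implementation: the class name `SegmentedIndex`, its parameters, and the use of pickle files as the on-disk format are all assumptions made for illustration, and it does not model the inline data layout. It only shows a table that grows one fixed-size segment at a time, keeps a bounded number of segments in memory, and spills the least recently used ones to disk.

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class SegmentedIndex:
    """Toy chunk-hash index that grows in fixed-size segments (hypothetical)."""

    def __init__(self, segment_capacity=4, max_hot_segments=2, spill_dir=None):
        self.segment_capacity = segment_capacity
        self.max_hot_segments = max(1, max_hot_segments)
        self.spill_dir = spill_dir or tempfile.mkdtemp(prefix="segidx-")
        self.hot = OrderedDict()  # segment id -> dict, least recently used first
        self.cold = set()         # ids of segments currently spilled to disk
        self.next_id = 0
        self.active = None        # id of the segment that receives new entries
        self._new_segment()

    def _path(self, seg_id):
        return os.path.join(self.spill_dir, f"segment-{seg_id}.pkl")

    def _new_segment(self):
        # Growth adds a whole new segment instead of rehashing the table.
        self.active = self.next_id
        self.next_id += 1
        self.hot[self.active] = {}
        self._evict()

    def _evict(self):
        # Spill least recently used segments, but never the active one.
        while len(self.hot) > self.max_hot_segments:
            victim = next(s for s in self.hot if s != self.active)
            with open(self._path(victim), "wb") as f:
                pickle.dump(self.hot.pop(victim), f)
            self.cold.add(victim)

    def get(self, key):
        # Check in-memory segments first, most recently used first.
        for seg_id in reversed(list(self.hot)):
            if key in self.hot[seg_id]:
                self.hot.move_to_end(seg_id)
                return self.hot[seg_id][key]
        # Fall back to cold segments; promote one only when it has a hit.
        for seg_id in sorted(self.cold, reverse=True):
            with open(self._path(seg_id), "rb") as f:
                segment = pickle.load(f)
            if key in segment:
                self.cold.discard(seg_id)
                self.hot[seg_id] = segment
                self._evict()
                return segment[key]
        return None

    def put(self, key, value):
        """Return (stored_value, inserted); duplicate keys are not re-stored."""
        existing = self.get(key)
        if existing is not None:
            return existing, False
        if len(self.hot[self.active]) >= self.segment_capacity:
            self._new_segment()
        self.hot[self.active][key] = value
        return value, True
```

In this sketch a lookup that hits a cold segment pays one disk read and then promotes that segment, so repeated hits on the same data stay in memory. A real index would also avoid scanning every cold segment on a miss, for example with a per-segment filter; that is left out here to keep the example short.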