IN THIS ARTICLE
Provides an overview of Qumulo's FIFO-based SSD caching in Qumulo Core versions 3.0.1 and below
- Cluster running Qumulo Core version 3.0.1 or below
In Qumulo Core, the SSDs are used to store data layout information, provide a fast persistence layer for data modifications coming into the cluster, and cache file system data and metadata for fast retrieval. In contrast, the HDDs store data that is accessed less frequently, ensuring that the data is still available without taking up valuable SSD space.
Prior to Qumulo Core 2.8.5, a data block was stored on either the SSD or the HDD, but never both. Qumulo Core aimed to keep at least 20% of the SSD available for incoming writes, but if the write load arrived too quickly, incoming writes were blocked until file data blocks were moved to the HDDs. The period during which the SSDs can accept incoming writes without being blocked by writes to the HDD is called the “burst write window”. The policy for choosing which data to evict from the SSD and move to the HDD was random: recent access to a data block had no influence on whether it remained on the SSD or was moved to the HDD.
In Qumulo Core releases 2.8.5 through 3.0.1, the FIFO-based (First In, First Out) SSD caching feature allows data to be written to the HDD while a copy still resides on the SSD. An approximate FIFO eviction policy evicts data from the SSDs in roughly the order it was written. This increases the burst write window, since a larger portion of the SSD can absorb writes at burst speed without waiting on any writes to the HDD. Additionally, because eviction approximately follows write order, recently written data is more likely to still be on the SSD and available for faster retrieval.
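To illustrate the idea, here is a minimal Python sketch of FIFO eviction where every block also lives on HDD, so evicting from the SSD only drops the cached copy. The class and method names are hypothetical and this is not Qumulo's actual implementation:

```python
from collections import deque

class FifoSsdCache:
    """Illustrative FIFO SSD cache (hypothetical, not Qumulo's code).
    Blocks are assumed to be persisted to HDD on write, so evicting a
    block from the SSD only discards the cached copy."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.queue = deque()   # block ids in write order, oldest first
        self.on_ssd = set()    # block ids currently cached on SSD

    def write(self, block_id):
        # The block lands on the SSD; the HDD copy is not modeled here.
        if block_id not in self.on_ssd:
            self.queue.append(block_id)
            self.on_ssd.add(block_id)
        # Evict the oldest writes first when over capacity.
        while len(self.on_ssd) > self.capacity:
            oldest = self.queue.popleft()
            self.on_ssd.discard(oldest)  # HDD copy remains available

    def read_hits_ssd(self, block_id):
        return block_id in self.on_ssd
```

With a two-block cache, writing blocks a, b, then c evicts a (the oldest write), while b and c remain readable from the SSD.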
Read Promotion (2.8.9-3.0.1)
As of version 2.8.9, a new algorithm promotes data from the HDDs back up to the SSDs to enhance read performance. All data read from the HDDs goes through a two-stage check that requires it to be read at least twice before being promoted. The first time a data block is read from the HDD, we apply a filter that decides whether it may be promoted. If the filter is passed, the data is added to an LRU cache that tracks recently read data. On a subsequent read, if the data is still in the LRU cache, it is selected for promotion to the SSD. This two-stage check provides a property called scan-resistance, meaning that reading all files once (for example, as part of a backup job) doesn’t cause irrelevant data to be promoted to the SSD.
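The two-stage check can be sketched as follows. This is a hypothetical Python model, not Qumulo's code: the `may_promote` filter and the LRU size are stand-ins, and promotion is simply recorded in a set:

```python
from collections import OrderedDict

class TwoStagePromoter:
    """Scan-resistant promotion sketch (hypothetical, not Qumulo's code).
    A block read from HDD must pass a filter, then be read a second time
    while still tracked in a bounded LRU, before being promoted to SSD."""

    def __init__(self, lru_size, may_promote=lambda block_id: True):
        self.lru = OrderedDict()        # candidate blocks, oldest first
        self.lru_size = lru_size
        self.may_promote = may_promote  # first-stage filter
        self.promoted = set()           # blocks promoted to SSD

    def read_from_hdd(self, block_id):
        """Returns True if this read triggers promotion to the SSD."""
        if block_id in self.lru:
            # Second read while still tracked: promote to SSD.
            del self.lru[block_id]
            self.promoted.add(block_id)
            return True
        if self.may_promote(block_id):
            self.lru[block_id] = True
            if len(self.lru) > self.lru_size:
                self.lru.popitem(last=False)  # forget the oldest candidate
        return False
```

Reading every block exactly once (a scan) promotes nothing, because no block is read a second time while still in the LRU; only a repeated read triggers promotion.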
With the release of 2.9.0, we implemented a more flexible promotion algorithm. Because SSDs provide a tremendous IOPS boost over HDDs, we favor promoting the smaller read sizes that map to the more IOPS-heavy portion of a workload. The smaller the read, the higher its probability of promotion; larger reads, which represent the portion of a workload that hard drives already service well, are promoted less often.
You should now have an overall understanding of Qumulo's FIFO-based SSD caching available in Qumulo Core versions 3.0.1 and below.