IN THIS ARTICLE
This article outlines how to create and customize a CloudWatch dashboard to monitor metrics and alarms for your Qumulo cloud cluster in AWS.
- AWS Cloud Cluster with Qumulo Core 3.1.1 or above
- AWS Console access
- IAM permissions for full access to EC2
Sending cluster metrics to AWS CloudWatch with Qumulo Sidecar requires the following permissions
To configure CloudWatch alarms, the cloudwatch:PutMetricAlarm IAM permission is required.
IMPORTANT! Qumulo Sidecar requires separate IAM permissions for deployment with Qumulo cloud clusters in AWS. Check out the Qumulo in AWS: Qumulo Sidecar article for more info.
A CloudWatch dashboard is a customizable homepage that allows you to monitor the status of your cluster in a single view. You can configure the dashboard to display the alarms or metrics that are most important to you, making it easy to determine the health of your cluster at a quick glance.
Metrics provide information about the performance of your cloud cluster. While these details can come from a variety of services, this article focuses on two groups of Qumulo-based metrics that are under a Qumulo/Metrics custom namespace.
The ClusterName group specifies the cluster-wide metrics outlined below:
- FileSystemTotalCapacity: the total capacity of the file system in bytes.
- FileSystemUsedDataCapacity: the amount of file system capacity used by data in bytes.
- FileSystemUsedMetadataCapacity: the amount of file system capacity used by file metadata, directories, and other non-data sources in bytes.
- FileSystemUsedSnapshotsCapacity: the amount of file system capacity used by snapshots in bytes.
- FileSystemFreeCapacity: the amount of file system capacity remaining in bytes.
- ProtocolWriteOps: the count of write operations per second.
- ProtocolReadOps: the count of read operations per second.
- ProtocolWriteThroughput: the combined write throughput of all protocols in bytes per second.
- ProtocolReadThroughput: the combined read throughput of all protocols in bytes per second.
- ProtocolWriteLatency: the average latency of write operations of all protocols in microseconds.
- ProtocolReadLatency: the average latency of read operations of all protocols in microseconds.
- ProtocolMetadataLatency: the average latency of metadata operations of all protocols in microseconds.
- TotalNodeCount: the number of nodes in the cluster.
- HealthyNodeCount: the number of healthy nodes in the cluster.
- RemainingNodeFailures: the number of nodes that can fail before the cluster is offline.
- RemainingDriveFailures: the lowest number of drives that can fail before a data loss scenario may occur.
- FailedDriveCount: the number of drives currently failed in the cluster.
The NodeId group specifies the per-node metrics included below:
- ConnectionsNFSCount: the number of NFS connections that a given node is servicing.
- ConnectionsSMBCount: the number of SMB connections that a given node is servicing.
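The metrics listed above can also be read programmatically. The following is a minimal sketch, not an official Qumulo tool. It assumes that the CloudWatch dimension names match the ClusterName and NodeId groups described in this article, and the function names (build_metric_query, percent_free) are hypothetical helpers introduced for illustration. The sketch only builds the query structure that the CloudWatch GetMetricData API accepts. The commented lines show how it might be submitted with boto3.

```python
NAMESPACE = "Qumulo/Metrics"  # custom namespace named in this article


def build_metric_query(cluster_name, metric_name, node_id=None,
                       period=300, stat="Average"):
    """Return one MetricDataQueries entry for CloudWatch GetMetricData."""
    # Assumption: dimension names match the ClusterName and NodeId groups.
    dimensions = [{"Name": "ClusterName", "Value": cluster_name}]
    if node_id is not None:
        # Per-node metrics (for example ConnectionsNFSCount) need both dimensions.
        dimensions.append({"Name": "NodeId", "Value": str(node_id)})
    return {
        # Query IDs must start with a lowercase letter.
        "Id": "m_" + metric_name.lower(),
        "MetricStat": {
            "Metric": {
                "Namespace": NAMESPACE,
                "MetricName": metric_name,
                "Dimensions": dimensions,
            },
            "Period": period,  # seconds
            "Stat": stat,
        },
        "ReturnData": True,
    }


def percent_free(free_bytes, total_bytes):
    """Derive a free-space percentage from two capacity metrics (both in bytes)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * free_bytes / total_bytes


# Possible usage with boto3 (requires AWS credentials, so it is left commented out):
# import boto3
# from datetime import datetime, timedelta, timezone
# cloudwatch = boto3.client("cloudwatch")
# now = datetime.now(timezone.utc)
# response = cloudwatch.get_metric_data(
#     MetricDataQueries=[build_metric_query("my-cluster", "FileSystemFreeCapacity")],
#     StartTime=now - timedelta(hours=1),
#     EndTime=now,
# )
```

The cluster name "my-cluster" is a placeholder. Replace it with the name shown for your cluster under the Qumulo/Metrics namespace.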
An alarm watches a single metric or metric expression and triggers when its value exceeds or drops below a threshold over a number of time periods. When triggered, the alarm performs the actions configured for it.
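As an illustration of how a threshold, a period, and a number of evaluation periods fit together, the sketch below builds the parameters for the CloudWatch PutMetricAlarm API. It is an example under stated assumptions rather than a recommended configuration: the helper name build_alarm_params, the alarm naming scheme, the ClusterName dimension name, and the threshold in the usage comment are all hypothetical.

```python
VALID_COMPARISONS = {
    "GreaterThanOrEqualToThreshold",
    "GreaterThanThreshold",
    "LessThanThreshold",
    "LessThanOrEqualToThreshold",
}


def build_alarm_params(cluster_name, metric_name, threshold,
                       comparison="LessThanThreshold", statistic="Minimum",
                       period=300, evaluation_periods=3, alarm_actions=None):
    """Return keyword arguments for the CloudWatch PutMetricAlarm API."""
    if comparison not in VALID_COMPARISONS:
        raise ValueError("unsupported comparison operator: " + comparison)
    return {
        "AlarmName": cluster_name + "-" + metric_name,  # hypothetical naming scheme
        "Namespace": "Qumulo/Metrics",
        "MetricName": metric_name,
        # Assumption: cluster-wide metrics carry a single ClusterName dimension.
        "Dimensions": [{"Name": "ClusterName", "Value": cluster_name}],
        "Statistic": statistic,
        "Period": period,                         # length of one period in seconds
        "EvaluationPeriods": evaluation_periods,  # periods compared to the threshold
        "Threshold": float(threshold),
        "ComparisonOperator": comparison,
        # Optional ARNs (for example an SNS topic) to notify when the alarm fires.
        "AlarmActions": list(alarm_actions or []),
    }


# Possible usage with boto3 (needs the cloudwatch:PutMetricAlarm permission):
# import boto3
# params = build_alarm_params("my-cluster", "HealthyNodeCount", threshold=4)
# boto3.client("cloudwatch").put_metric_alarm(**params)
```

With these example values, the alarm would trigger when HealthyNodeCount stays below 4 for three consecutive five-minute periods. Choose the threshold to match the node count of your own cluster.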
Create a New CloudWatch Dashboard
To create a dashboard, follow the instructions in Creating a CloudWatch Dashboard. Refer to the sections below to add metrics and alarms based on the needs of your cluster.
Add a Metric to a CloudWatch Dashboard
NOTE: Qumulo metrics are only captured once the Qumulo Sidecar service is running. For more details, see Qumulo in AWS: Qumulo Sidecar.
- Click Dashboards in the left-hand navigation pane on your AWS CloudWatch Dashboard.
- Select the dashboard you created.
- Click Add Widget.
- Select Number and click Configure.
- Select Qumulo/Metrics under Custom Namespaces.
- Select the following based on the type of metric you wish to add:
- Click ClusterName, NodeId for metrics that are node-specific (e.g., NFS or SMB connection counts).
- Click ClusterName for metrics that are cluster-wide (e.g., total space or number of nodes).
NOTE: The example below creates a widget for a cluster-wide metric, and thus assumes ClusterName is selected.
- Enter the name of the cluster in the search box and click Graph search.
- All metrics for your cluster are displayed and labeled in the Metric Name field. Check the box alongside the desired metric and click Create widget.
The new widget will now display on your dashboard with current data.
Repeat the steps above, starting from Add Widget, for any additional widgets needed. Once you’ve successfully added all of your widgets, click Save dashboard to save the changes.
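The console steps above can also be approximated in code by generating a dashboard body and submitting it with the CloudWatch PutDashboard API. The sketch below is one possible approach under stated assumptions: the helper names (number_widget, dashboard_body), the widget sizes, the region, and the dashboard name in the usage comment are hypothetical, and the ClusterName dimension name is assumed to match the group shown in the console.

```python
import json


def number_widget(cluster_name, metric_name, region, x=0, y=0):
    """Describe one Number (singleValue) widget for a cluster-wide metric."""
    return {
        "type": "metric",
        "x": x, "y": y, "width": 6, "height": 3,
        "properties": {
            "view": "singleValue",  # the Number widget type in the console
            # Format: [namespace, metric name, dimension name, dimension value]
            "metrics": [["Qumulo/Metrics", metric_name, "ClusterName", cluster_name]],
            "region": region,
            "stat": "Average",
            "period": 300,
            "title": metric_name,
        },
    }


def dashboard_body(cluster_name, metric_names, region):
    """Lay widgets out four per row on the 24-column grid and return JSON."""
    widgets = [
        number_widget(cluster_name, name, region, x=(i % 4) * 6, y=(i // 4) * 3)
        for i, name in enumerate(metric_names)
    ]
    return json.dumps({"widgets": widgets})


# Possible usage with boto3 (left commented out because it needs AWS credentials):
# import boto3
# body = dashboard_body("my-cluster",
#                       ["FileSystemFreeCapacity", "HealthyNodeCount"],
#                       region="us-west-2")
# boto3.client("cloudwatch").put_dashboard(
#     DashboardName="qumulo-cluster", DashboardBody=body)
```

Note that PutDashboard replaces the entire contents of an existing dashboard with the same name, so include every widget you want to keep in the body.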
TIP! See Using Amazon CloudWatch Dashboards for additional details about the types of widgets and graphs that can be added.
Add an Alarm to a Dashboard
- Click Alarms in the left-hand navigation pane on your AWS CloudWatch Dashboard.
- Click the Alarm Name link to view details on an alarm that corresponds to one of your Qumulo nodes.
- Click Add to dashboard.
- Verify that your dashboard is selected in the drop-down menu and click Add to dashboard.
- Click View dashboard in the green banner at the top of the screen to view the changes.
- Rearrange the widgets (if desired) and click Save dashboard to save the new configuration.
Repeat the steps above to add additional alarms to the CloudWatch Dashboard.
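For reference, an alarm can also be added to a dashboard body in code. The sketch below assumes the dashboard body format in which a metric widget references an alarm through an alarm annotation. The helper names (alarm_widget, add_widget) and the ARN in the usage comment are hypothetical, so verify the resulting widget against your own dashboard's source before relying on it.

```python
import json


def alarm_widget(alarm_arn, region, title, x=0, y=0):
    """Describe a graph widget that shows an alarm's metric and threshold."""
    return {
        "type": "metric",
        "x": x, "y": y, "width": 6, "height": 6,
        "properties": {
            "title": title,
            # An alarm annotation takes exactly one alarm ARN.
            "annotations": {"alarms": [alarm_arn]},
            "view": "timeSeries",
            "stacked": False,
            "region": region,
        },
    }


def add_widget(body_json, widget):
    """Append a widget below the existing ones and return the new JSON body."""
    body = json.loads(body_json)
    widgets = body.setdefault("widgets", [])
    # Place the new widget under the lowest existing widget.
    next_row = max((w.get("y", 0) + w.get("height", 0) for w in widgets), default=0)
    placed = dict(widget, y=next_row)
    widgets.append(placed)
    return json.dumps(body)


# Possible usage with boto3 (the ARN below is a made-up placeholder):
# import boto3
# cloudwatch = boto3.client("cloudwatch")
# current = cloudwatch.get_dashboard(DashboardName="qumulo-cluster")["DashboardBody"]
# arn = "arn:aws:cloudwatch:us-west-2:123456789012:alarm:my-cluster-HealthyNodeCount"
# updated = add_widget(current, alarm_widget(arn, "us-west-2", "Healthy nodes"))
# cloudwatch.put_dashboard(DashboardName="qumulo-cluster", DashboardBody=updated)
```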
You should now be able to successfully create and customize a CloudWatch dashboard to monitor metrics and alarms for your Qumulo cloud cluster in AWS.