Introduction
Looking for a OneFS version 8 command? It may be here. I’ll add more over time.
Clearing Events
Command required on an Isilon cluster running OneFS v8:
isi event groups bulk --resolved=true --ignore=true
Search failing IPs across the cluster reported by Zabbix
isi_for_array ifconfig | grep <ip_address>
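example (with a hypothetical address 10.20.30.40 reported by Zabbix):
isi_for_array ifconfig | grep 10.20.30.40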
View/Remove Interfaces from a Dynamic Pool
isi network pools view --id=<groupnet_id>.<subnet_name>.<pool_name>
isi network pools modify <groupnet_id>.<subnet_name>.<pool_name> --remove-ifaces=<ifaces>
example:
isi network pools modify <groupnet_id>.<subnet_name>.<pool_name> --remove-ifaces=6:10gige-agg-1
The following can be used to add an interface back:
isi network pools modify <groupnet_id>.<subnet_name>.<pool_name> --add-ifaces=<ifaces>
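example (mirroring the remove example above):
isi network pools modify <groupnet_id>.<subnet_name>.<pool_name> --add-ifaces=6:10gige-agg-1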
Rebalance IP addresses
To manually rebalance IP addresses in a pool, first list the pools to find the pool ID, then run the rebalance:
isi network pools list
isi network pools rebalance-ips <groupnet_id>.<subnet_name>.<pool_name>
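example (assuming the OneFS default names groupnet0.subnet0.pool0 as shown in the pools list):
isi network pools rebalance-ips groupnet0.subnet0.pool0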
Restart Services
Restart all NFS services on the cluster:
isi_for_array -s /usr/likewise/bin/lwsm restart onefs_nfs
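To check the service state before or after the restart (lwsm also has a status subcommand; the service name here is assumed to match the restart above):
isi_for_array -s /usr/likewise/bin/lwsm status onefs_nfs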
Disk Status
There are various reasons for needing to check the status of drives on an Isilon cluster. For example, if any drive were ever to reach 100% full, it would cause very serious knock-on effects for various other processes and jobs. A common reason for some drives ending up more heavily used than others is a running FlexProtect job. This job type is started when the cluster decides it is time to SmartFail a drive out of the cluster; it causes the disk to bail all of its data out onto drives in the same stripe, which is why you see disk usage increasing rapidly over a subset of drives in the same node.
Commands to get statistics on drive usage:
isi statistics drive list --verbose --degraded --format=table --limit=10 --nodes=all --sort=used
isi statistics drive -nall --verbose --format=table --limit=25 --sort=used
There may also be times when you want to see which drives are busiest. High busyness is a sign that the cluster is working very hard; if you also see high disk queues then you should probably be concerned.
Get statistics on drive busyness:
isi statistics drive list --nodes=all --type=sas,sata --sort=busy | head -n 30
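Since high disk queues are the real warning sign, the same listing can be sorted on the queue column instead (assuming queued is accepted as a sort key on your OneFS version):
isi statistics drive list --nodes=all --type=sas,sata --sort=queued | head -n 30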
Degraded Job Mode
This one is not generally something you should be touching yourself; contact support first. For your information, it can be used when the cluster is very full (>80%) and a running FlexProtect job has a detrimental knock-on effect because SnapshotDelete cannot run to remove data.
https://community.emc.com/docs/DOC-63909
The following command will return the status of the job engine. The value in question is “Run Jobs When Degraded”:
isi job status -v
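To pull out just that value:
isi job status -v | grep -i "run jobs when degraded"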
The next command will either put the cluster into degraded mode or take it out, depending on the boolean value:
isi_gconfig -t job-config core.run_degraded=true
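example (taking the cluster back out of degraded mode):
isi_gconfig -t job-config core.run_degraded=false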
You will then need to resume the SnapshotDelete job, which is likely system-paused in the background, with:
isi job jobs resume SnapshotDelete
Checking SyncIQ Job Status
SyncIQ often runs without interference, but when you do have a long-running job you can use the following to check whether the process is hung:
isi_classic sync job repo -v
The output will provide various metrics on LIN updates, bytes transferred, etc. Run it more than once and compare: if you find there are no changes then it is likely the job is hung and needs manual intervention. At this stage you’ll need to raise a call with support to reach a resolution.
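To see which SyncIQ jobs are currently running and their state, the standard OneFS 8 syntax should also work:
isi sync jobs list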
Logs / ESRS
If you ever have problems with one cluster and its connectivity to ESRS (I know, quite specific), you may use another cluster to proxy the upload of the logs. First use rsync to push the logs to the other cluster, then use the following command to upload the specific logset:
isi_gather_info --esrs -f /ifs/data/Isilon_Support/pkg/<cluster logset to upload>
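The rsync step itself might look something like this (a sketch; the target cluster name is a placeholder and the paths assume the standard support package location):
rsync -av /ifs/data/Isilon_Support/pkg/<cluster logset to upload> root@<other_cluster>:/ifs/data/Isilon_Support/pkg/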