Adventures in ZFS: Crucial M4 Firmware SMART Issue
Let me preface this article by stating that this is NOT a ZFS issue or a Solaris issue. Previously I documented an issue in which the Crucial M4 SSD would “hesitate”, causing ZFS to see it as faulted. That was promptly fixed in v0002 of their firmware, and the issue is documented here. This one is a bit more aggravating and has not been fully resolved.
The issue is that once a drive has been operating for 5184 hours (216 days of power-on time) it will cease to respond to the system. After a reboot the drive will work again, but the problem will repeat itself after another hour. This will keep happening until the firmware is fixed.
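To see how close a drive is to the 5184-hour mark, you can read its power-on hours from SMART. Below is a minimal Python sketch that assumes you have the attribute table from smartmontools' `smartctl -A` as text, and that the drive reports its hours as SMART attribute 9 with a plain number in the raw value column. The sample text, the function names, and the 5100-hour figure are made up for illustration.

```python
import re

# Threshold quoted in this article for the Crucial M4 bug.
BUG_THRESHOLD_HOURS = 5184

def power_on_hours(smartctl_output):
    """Return the raw value of SMART attribute 9, or None if it is absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns; the first is the ID
        # and the tenth is the raw value.
        if len(fields) >= 10 and fields[0] == "9":
            match = re.match(r"\d+", fields[9])
            if match:
                return int(match.group())
    return None

def hours_until_bug(hours):
    """Hours of operation left before the threshold (0 if already past it)."""
    return max(BUG_THRESHOLD_HOURS - hours, 0)

# Illustrative sample only, not output captured from a real drive.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   100   100   001    Old_age   Always       -       5100
"""

if __name__ == "__main__":
    hours = power_on_hours(SAMPLE)
    print("power-on hours:", hours)
    print("hours until threshold:", hours_until_bug(hours))
```

Whether smartctl can reach a drive at all depends on your controller and OS, so treat this as a starting point rather than a monitoring tool.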
Firmware v0309 has been released, and it resolves the issue IF you are not using a SAS expander. If you are using a SAS expander (like the one found in the Dell PowerEdge R515, for example) then there is nothing you can do but wait for a new firmware version that addresses this issue AND works with SAS expanders. If you are not using a SAS expander then you should install this firmware as soon as possible, whether you are seeing the issue or not.
Now with regard to ZFS: if you are using these devices unpatched as a ZIL then you need to patch them immediately. If they sit behind a SAS expander then you have no choice but to either replace these SSDs with a different model or deconfigure them as a ZIL, because with a ZIL this error can cause real data loss. If you are using these drives as a cache device (L2ARC) then the risk is much lower. All you stand to lose is your cache, which is lost and rebuilt on every reboot anyway, so there is no data loss, and you could choose to operate with faulted cache devices until a fix is implemented.
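Deconfiguring a log device comes down to `zpool remove <pool> <device>`, and the same command removes cache devices. The Python sketch below only builds the command lines so you can review them before running anything. The pool name `tank` and the device names are hypothetical, so substitute the names shown by `zpool status`, and check that your pool version supports log device removal.

```python
def zpool_remove_commands(pool, devices):
    """Build one 'zpool remove' argument list per device, without running it."""
    return [["zpool", "remove", pool, device] for device in devices]

if __name__ == "__main__":
    # Hypothetical pool and device names, for illustration only.
    for argv in zpool_remove_commands("tank", ["c1t2d0", "c1t3d0"]):
        print(" ".join(argv))
```

Printing the commands instead of executing them is deliberate: removing the wrong device from a pool is not something to automate blindly.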
Good luck and I hope you all dodged the bullet, or perhaps that the bullet hasn’t been fired at you yet.
If you are using SAS expanders with this SSD, you can install the SSD in a machine without a SAS expander, update it to firmware 0309, and then move the updated drive back behind the SAS expander. Apparently the problem was between the firmware update process and the SAS expander, not between the firmware itself and the SAS expander. But don’t take my word for it; contact support and talk to them about your situation.