Bad RAM on iQstor.

February 10, 2010 at 2:55 pm (Technology, TechTips)

I have an iQstor 2880 SAN device.  Recently I was logged into one of the controllers and noticed some strange error messages showing up on the console:

10:49:11, Wednesday, 02/10/2010
: EXCEPTION: Dram Error detected:      count=19 cause=4000 esr_c_0004=20000 esr_c_000C=0 esr_c_Lcause=1 esr_c_Lerr=7cc2a2b6
Dram Error being handled: count=19 cause=4000 esr_c_0004=20000 esr_c_000C=0 esr_c_Lcause=1 esr_c_Lerr=7cc2a2b6
Dram Error recovered:      count=19 cause=4000 esr_c_0004=20000 esr_c_000C=0 esr_c_Lcause=1 esr_c_Lerr=7cc2a2b6

If you see errors like these, the controller you are logged into has bad RAM.  The important thing to note is that these memory errors only show up on the console of the controller that has the bad RAM.  They are not written to syslog, and they are not copied to the console of the other controller.  The best way to identify which controller has the bad RAM is to open a telnet session to both controllers and leave the sessions open for a while.  Once the error shows up on one of the consoles, you have confirmation of which controller needs its memory swapped out.
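
If your telnet client can log each session to a file, you do not have to watch both consoles by hand. Below is a minimal Python sketch of that idea, not anything iQstor provides. It assumes you have captured the console text of each controller yourself; the controller names, the function names, and the shape of the captures dictionary are all made up for illustration. The only thing taken from the real console is the "Dram Error ..." message format shown above.

```python
import re

# Matches the three message variants seen on the console:
# "Dram Error detected:", "Dram Error being handled:", "Dram Error recovered:"
DRAM_RE = re.compile(r"Dram Error (detected|being handled|recovered):\s*(.*)")


def parse_dram_line(line):
    """Return (stage, fields) for a DRAM error line, or None for any other line."""
    match = DRAM_RE.search(line)
    if match is None:
        return None
    stage, rest = match.groups()
    # The rest of the line is a run of key=value pairs separated by spaces.
    fields = dict(pair.split("=", 1) for pair in rest.split() if "=" in pair)
    return stage, fields


def controllers_with_dram_errors(captures):
    """Map controller name -> number of 'detected' events in its console capture.

    `captures` is a hypothetical dict of {controller name: captured console text}.
    Controllers with no DRAM errors are left out of the result.
    """
    counts = {}
    for name, text in captures.items():
        detected = 0
        for line in text.splitlines():
            parsed = parse_dram_line(line)
            if parsed is not None and parsed[0] == "detected":
                detected += 1
        if detected:
            counts[name] = detected
    return counts
```

Feed it the saved text from both sessions, for example `controllers_with_dram_errors({"controller_a": text_a, "controller_b": text_b})`, and whichever name comes back is the controller to pull the memory from. Only the "detected" lines are counted so that a single event is not counted three times.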

Now, if only iQstor would get these errors to trigger an alert and copy them to both consoles (with detail of which controller has the bad RAM) and to syslog.
