   November 26, 2013  

[00:02:20] *** esproul has quit IRC
[00:25:01] *** szaydel has quit IRC
[00:28:41] *** Clory has quit IRC
[00:29:05] *** Clory has joined #omnios
[00:47:24] *** wez is now known as wez|away
[00:59:28] *** postwait has quit IRC
[00:59:43] *** khushildep has quit IRC
[01:16:15] *** joltman has quit IRC
[01:23:13] *** szaydel has joined #omnios
[01:25:27] *** wez|away is now known as wez
[01:47:43] *** szaydel has quit IRC
[01:49:42] *** szaydel has joined #omnios
[02:29:38] *** gmason has quit IRC
[02:48:44] *** wuff has quit IRC
[03:04:19] *** desai has joined #omnios
[03:07:21] *** gmason has joined #omnios
[03:09:32] *** jdboyd has joined #omnios
[03:12:39] *** postwait has joined #omnios
[03:12:42] *** ChanServ sets mode: +o postwait
[03:17:46] *** wez is now known as wez|away
[03:19:37] *** wuff has joined #omnios
[03:28:05] *** wez|away is now known as wez
[03:38:13] *** szaydel has quit IRC
[03:38:43] *** berend`` is now known as berend
[03:42:25] *** szaydel has joined #omnios
[03:48:23] *** gmason has quit IRC
[03:57:16] *** Watcher7 has quit IRC
[03:58:42] *** Watcher7 has joined #omnios
[04:01:36] *** ghost75 has joined #omnios
[04:03:45] *** ghost75_ has quit IRC
[04:08:57] *** szaydel has quit IRC
[04:15:58] *** desai has quit IRC
[04:16:23] *** sebasp_ is now known as sebasp
[04:24:24] *** sebasp is now known as sebasp_
[04:27:05] *** vrou has quit IRC
[04:27:39] *** vrou has joined #omnios
[04:41:50] *** desai has joined #omnios
[04:47:57] *** wez has quit IRC
[04:50:17] *** jpeach has quit IRC
[04:50:50] *** jpeach has joined #omnios
[04:55:23] *** jpeach has quit IRC
[04:58:38] *** desai has quit IRC
[05:04:14] *** szaydel has joined #omnios
[05:09:03] *** jdboyd has quit IRC
[05:11:46] *** jpeach has joined #omnios
[05:16:33] *** jpeach has quit IRC
[05:17:58] *** wez has joined #omnios
[05:17:59] *** postwait has quit IRC
[05:21:34] *** jdboyd has joined #omnios
[05:41:03] *** wez has quit IRC
[05:44:04] *** wez has joined #omnios
[06:40:59] *** wez is now known as wez|away
[06:43:40] *** wez|away is now known as wez
[06:52:27] *** vrou has quit IRC
[06:53:06] *** vrou has joined #omnios
[06:54:50] *** szaydel has quit IRC
[06:56:15] *** jpeach has joined #omnios
[06:58:25] *** szaydel has joined #omnios
[06:59:11] *** szaydel has quit IRC
[07:00:30] *** jpeach has quit IRC
[07:20:32] *** nefilim has quit IRC
[07:36:10] *** slx86 has joined #omnios
[07:56:13] *** wez is now known as wez|away
[07:59:44] *** xeyed4good has joined #omnios
[08:04:10] *** xeyed4good has quit IRC
[08:07:33] *** sebasp_ is now known as sebasp
[08:26:42] <mihai_omniosuser> hi all
[08:26:52] <mihai_omniosuser> sorry about yesterday, I had to leave, it was kinda late
[08:27:11] <mihai_omniosuser> I am bringing this up again
[08:27:32] <mihai_omniosuser> in one week two pools on the same server went degraded, with almost all disks degraded
[08:27:37] <mihai_omniosuser> might be a bug?
[08:27:42] <mihai_omniosuser> LSI SAS 9201-16e SI000276 is used
[08:27:48] <mihai_omniosuser> with SATA drives behind it
[08:30:06] <db48x> checksum errors on all drives?
[08:31:42] *** wez|away is now known as wez
[08:32:50] <mihai_omniosuser> yesterday evening after the resilvering was done we cleared the pool and it was fine
[08:33:06] <mihai_omniosuser> in the morning more than one third are degraded again
[08:33:33] <db48x> did zpool status say that there are checksum errors on the drives?
[08:33:56] <mihai_omniosuser> zpool status yes, but the resilvering found no problems
[08:34:22] <mihai_omniosuser> then after clearing, the disks showed as ok
[08:34:35] <db48x> zfs records a checksum error any time it reads a block and that block's checksum does not match the checksum stored in the block's parent
[08:34:41] <mihai_omniosuser> also iostat -En does not show any errors
[08:35:23] <db48x> that can happen when the disk returns bad data (hopefully rare, unless the disk has bad sectors), when there are loose cables, marginal power supplies, bad ram, high radiation environment (or non-ecc ram), etc
[08:35:35] <db48x> if it's just one disk that has a problem, then the problem is probably the disk
[08:35:48] <db48x> if it's all or most of them, then it's probably something else
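
A minimal sketch of the reasoning db48x describes above, not ZFS's actual implementation. It assumes SHA-256 (via `hashlib`) as a stand-in for whatever checksum the pool uses, and the names `read_and_verify`, `triage`, and the `disk0`-style device labels are hypothetical, introduced only for illustration. A block's expected checksum is held by its parent; a mismatch on read is counted against the device, and the spread of errors across devices hints at where to look.

```python
import hashlib
from collections import Counter


def checksum(data: bytes) -> str:
    # Stand-in checksum (assumption): SHA-256 of the block contents.
    return hashlib.sha256(data).hexdigest()


def read_and_verify(device: str, data: bytes, expected: str, errors: Counter) -> bool:
    """Compare the data read from `device` with the checksum stored in the
    block's parent. On mismatch, record one checksum error for the device."""
    if checksum(data) == expected:
        return True
    errors[device] += 1
    return False


def triage(errors: Counter, all_devices: list[str]) -> str:
    """Rough heuristic from the discussion: one bad disk points at the disk,
    errors on most disks point at something shared."""
    bad = [d for d in all_devices if errors[d] > 0]
    if not bad:
        return "no checksum errors"
    if len(bad) == 1:
        return f"suspect the disk: {bad[0]}"
    if len(bad) * 2 > len(all_devices):
        return "suspect shared hardware (cabling, power, RAM, controller)"
    return "inconclusive"


if __name__ == "__main__":
    errors = Counter()
    parent_checksum = checksum(b"good block")
    read_and_verify("disk0", b"good block", parent_checksum, errors)
    read_and_verify("disk1", b"bad block!", parent_checksum, errors)
    print(triage(errors, ["disk0", "disk1", "disk2"]))
```

With errors spread over all 20 disks, as in the case reported here, the same heuristic would point away from any single disk and toward a shared component.
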
[08:36:03] <mihai_omniosuser> db48x: we have 20 disks, all with the same errors
[08:36:16] <db48x> of course a bug in the driver for the controller is a possibility, but pretty unlikely given the popularity of that card
[08:36:21] <mihai_omniosuser> all hardware is new
[08:36:57] <db48x> seriously, check for loose cabling, and measure the power output of your power supply when it's under load
[08:37:21] <mihai_omniosuser> and we get an error about no memory: Unable to allocate dma memory for extra SGL.
[08:37:42] <db48x> hmm. that's a very different kind of error than a checksum error
[08:37:42] <mihai_omniosuser> [ID 107833 kern.warning] WARNING: /pci@75,0/pci8086,3c06@2,2/pci1000,30d0@0 (mpt_sas2):
[08:37:59] <mihai_omniosuser> would the update to the latest release help?
[08:38:10] <mihai_omniosuser> I saw some driver rollback because of stability issues
[08:38:15] <db48x> not sure
[08:38:50] <db48x> it'd be worth trying though
[08:39:20] <mihai_omniosuser> so I'll do it right now
[08:40:28] <db48x> you could also check the bug reports that led to the rollback, see how the problem manifested
[08:44:31] <mihai_omniosuser> we are running the server in production, and we do not have much time
[08:44:46] <mihai_omniosuser> we have an hour until the guys come in
[08:45:29] <db48x> ouch
[08:55:35] *** mihai_omniosuser has quit IRC
[09:03:09] *** slx86 has quit IRC
[09:03:43] *** mihai_omniosuser has joined #omnios
[09:04:03] <mihai_omniosuser> db48x: rebooted and all working
[09:04:20] <mihai_omniosuser> until the storm comes, I need to see
[09:05:31] <db48x> do you have a way of load-testing?
[09:05:36] <mihai_omniosuser> no
[09:05:51] <mihai_omniosuser> can you recommend something?
[09:06:21] <db48x> well, there are various disk benchmark programs, but nothing beats being able to run your actual workload
[09:06:27] *** TBCOOL has quit IRC
[09:07:01] <mihai_omniosuser> but there was something you could use to simulate some load, with iostat or zpool iostat
[09:07:08] <mihai_omniosuser> but I cannot remember
[09:19:34] *** khushildep has joined #omnios
[09:33:32] *** berend has quit IRC
[09:36:23] <mihai_omniosuser> what does "one or more I/O devices have been retired" mean?
[09:55:18] *** berend has joined #omnios
[10:04:56] *** khushildep has quit IRC
[10:13:40] *** wez has quit IRC
[10:17:34] *** jdboyd has quit IRC
[10:45:58] *** TBCOOL has joined #omnios
[10:47:36] *** kschiess has joined #omnios
[11:15:21] *** bens1 has joined #omnios
[12:16:48] *** khushildep has joined #omnios
[12:59:59] *** mihai_omniosuser has quit IRC
[13:05:33] *** TBCOOL has quit IRC
[13:08:11] *** desai has joined #omnios
[13:31:15] *** szaydel has joined #omnios
[13:33:38] *** desai has quit IRC
[13:38:08] *** kschiess has quit IRC
[13:38:45] *** kschiess has joined #omnios
[13:54:41] *** kschiess has quit IRC
[13:54:59] *** kschiess has joined #omnios
[14:05:08] *** bens1 has quit IRC
[14:15:43] *** bens1 has joined #omnios
[14:41:57] *** TBCOOL has joined #omnios
[14:42:20] *** sebasp is now known as sebasp_
[15:15:43] *** xeyed4good has joined #omnios
[15:29:21] *** sebasp_ is now known as sebasp
[15:32:11] *** gmason has joined #omnios
[15:39:13] *** [1]wuff has joined #omnios
[15:42:17] *** wuff has quit IRC
[15:44:30] *** xeyed4good has left #omnios
[15:45:57] *** [1]wuff has quit IRC
[16:01:49] *** gmason has quit IRC
[16:02:06] *** neophenix has joined #omnios
[16:02:46] *** wuff has joined #omnios
[16:03:15] *** gmason has joined #omnios
[16:04:25] *** nefilim has joined #omnios
[16:11:23] *** jdboyd has joined #omnios
[16:19:36] *** ilovezfs_ has quit IRC
[16:30:12] *** ira has joined #omnios
[16:37:40] *** joltman has joined #omnios
[16:55:39] *** jpeach has joined #omnios
[17:04:30] *** desai has joined #omnios
[17:04:46] *** desai has quit IRC
[17:06:03] *** desai has joined #omnios
[17:07:12] *** ira has quit IRC
[17:22:42] <wuff> so i've been having some unexplained kernel panics (http://lists.omniti.com/pipermail/omnios-discuss/2013-November/001773.html), so this weekend the box just crashed without a panic and i get a "Command failed to complete.. Device is gone" for one of my rpool's mirrored SSDs
[17:23:18] <wuff> i rebooted and everything came up, zpool status shows the rpool as healthy, so i did a scrub and boom, one drive has errors and is degraded
[17:24:46] <wuff> things have been "stable" since (2 days).. my question is, could a failing rpool drive cause all these weird kernel panics?
[17:34:49] *** kschiess has quit IRC
[17:35:45] *** khushildep has quit IRC
[17:50:49] *** ira has joined #omnios
[18:00:19] *** gmason has quit IRC
[18:40:30] *** gmason has joined #omnios
[18:47:34] *** gmason has quit IRC
[19:05:27] *** desai has quit IRC
[19:05:51] *** wez has joined #omnios
[19:06:15] *** desai has joined #omnios
[19:38:39] *** gmason has joined #omnios
[19:42:50] *** gmason has quit IRC
[20:02:02] *** sebasp is now known as sebasp_
[20:05:33] *** jtimberman has joined #omnios
[20:10:12] *** postwait has joined #omnios
[20:10:12] *** ChanServ sets mode: +o postwait
[20:23:17] *** gmason has joined #omnios
[20:36:47] *** gmason has quit IRC
[20:41:28] *** bens1 has quit IRC
[20:56:48] *** neophenix has quit IRC
[21:05:53] *** postwait has quit IRC
[21:23:07] *** gmason has joined #omnios
[21:24:05] *** wez is now known as wez|away
[21:24:50] *** jdboyd1 has joined #omnios
[21:27:05] *** jdboyd has quit IRC
[21:27:19] *** gmason has quit IRC
[21:52:52] *** wez|away is now known as wez
[22:04:56] *** wez is now known as wez|away
[22:09:47] *** ira has quit IRC
[22:29:53] *** wez|away is now known as wez
[23:10:36] *** wez is now known as wez|away
[23:10:44] *** neophenix has joined #omnios
[23:39:03] *** wez|away is now known as wez
[23:40:44] <patdk-lap> wez, please spot it
[23:41:10] <patdk-lap> stop
[23:41:23] <wez> patdk-lap: ?
[23:41:26] <apeiron> nick changing
[23:42:44] <wez> what's your problem?
[23:43:41] <wez> just configure your client to ignore it
[23:44:15] <apeiron> it's superfluous to change your nick if you're /away really
[23:45:24] <wez> I don't care. This is useful to me and the people I work with. You can ignore me in your irc client if it really bothers you.
[23:47:15] *** szaydel has quit IRC
[23:58:39] *** szaydel has joined #omnios