NOTICE: This channel is no longer actively logged.
[00:07:30] *** Bryanstein has quit IRC
[00:07:53] *** tsukasa has quit IRC
[00:11:36] *** Bryanstein has joined ##nexenta
[00:36:15] *** myers has quit IRC
[00:55:01] *** agagag has quit IRC
[00:55:21] *** agagag has joined ##nexenta
[01:03:52] *** mota_ has joined ##nexenta
[01:06:44] *** mota has quit IRC
[01:32:09] *** agagag has quit IRC
[01:32:34] *** agagag has joined ##nexenta
[01:50:41] *** trbs has quit IRC
[02:26:35] *** asqui has quit IRC
[02:49:38] *** trbs has joined ##nexenta
[03:08:23] *** master_of_master has quit IRC
[03:10:34] *** master_of_master has joined ##nexenta
[03:17:54] *** Torpeo is now known as Torpeo_
[03:34:50] *** trbs has quit IRC
[03:42:07] *** Andys^ has joined ##nexenta
[03:55:02] *** Torpeo_ is now known as Torpeo
[05:56:38] *** kart_ has joined ##nexenta
[06:00:30] *** myers has joined ##nexenta
[06:06:09] *** shmoo has quit IRC
[06:35:43] *** mota_ is now known as mota
[06:35:51] *** mota has joined ##nexenta
[06:37:17] * neoice waves at moa
[06:37:21] * neoice waves at mota*
[06:37:36] <mota> howdy neoice
[07:21:08] *** JagWaugh has joined ##nexenta
[07:24:01] *** myers has quit IRC
[07:42:55] *** neoice has quit IRC
[07:48:53] *** neoice has joined ##nexenta
[07:51:19] *** tsukasa has joined ##nexenta
[07:56:37] *** shmoo has joined ##nexenta
[08:52:02] *** Maliuta has quit IRC
[09:08:11] *** Maliuta has joined ##nexenta
[09:50:30] *** alhazred has joined ##nexenta
[09:55:06] *** anilg has joined ##nexenta
[10:09:55] *** alhazred has quit IRC
[10:18:38] *** Darky has joined ##nexenta
[10:57:52] *** Darky has quit IRC
[11:16:38] *** JagWaugh has quit IRC
[11:24:32] *** kart_ has quit IRC
[11:26:09] *** kart_ has joined ##nexenta
[12:07:20] *** tsukasa has quit IRC
[12:20:53] *** tsukasa has joined ##nexenta
[12:30:47] *** Torpeo is now known as Torpeo_
[12:38:09] *** alhazred has joined ##nexenta
[12:51:47] *** andygraybeal has joined ##nexenta
[13:08:30] *** alhazred has quit IRC
[13:32:21] *** BugBlue has quit IRC
[13:32:21] *** BugBlue has joined ##nexenta
[13:37:49] *** Torpeo_ is now known as Torpeo
[13:48:52] *** Torpeo is now known as Torpeo_
[14:06:09] *** kart_ has quit IRC
[14:13:54] *** Torpeo_ is now known as Torpeo
[14:16:19] *** anilg has quit IRC
[15:40:56] *** laserbled has joined ##nexenta
[16:24:00] *** think has joined ##nexenta
[16:24:37] *** laserbled has quit IRC
[16:30:52] *** viridari has quit IRC
[16:33:18] *** viridari has joined ##nexenta
[16:34:26] *** Torpeo is now known as Torpeo_
[16:59:52] *** laserbled has joined ##nexenta
[17:38:09] *** laserbled has quit IRC
[17:40:56] *** kart_ has joined ##nexenta
[18:09:04] *** kart_ has quit IRC
[18:09:28] *** kart_ has joined ##nexenta
[18:13:07] *** miip has quit IRC
[18:24:19] *** Torpeo_ is now known as Torpeo
[18:51:08] *** ikarius has joined ##nexenta
[19:27:38] *** trbs has joined ##nexenta
[20:34:18] *** kart_ has quit IRC
[21:01:54] *** miip has joined ##nexenta
[21:02:20] *** miip has joined ##nexenta
[22:00:49] *** trbs has quit IRC
[22:14:20] *** trbs has joined ##nexenta
[22:25:53] *** ipmb has joined ##nexenta
[22:27:16] <ipmb> hi everyone. I just replaced the mobo/processor in my Nexentastor box and am getting a "ZFS device failed" on reboot. It seems to be stuck there and I'm not getting a login prompt
[22:27:46] <ipmb> there doesn't seem to be any disk activity any longer, what should I do?
[22:28:23] <SynQ> what other information do you get?
[22:29:00] <ipmb> just the standard Nexenta error message. it repeats every few minutes
[22:29:22] <SynQ> no information on which drive that might be?
[22:29:29] <SynQ> drive/device
[22:29:33] <ipmb> says to run `zpool status -x`, but I can't do anything past that
[22:29:43] <ipmb> no device information
[22:30:35] <ipmb> fwiw, I got errors about the network devices just before the ZFS errors, but that was expected (they are different on the new mobo)
[22:32:13] <SynQ> can you see all devices on a bios-level?
[22:32:43] <SynQ> and, what sort of setup do you use? disks, zil, l2arc?
[22:32:58] <ipmb> they seemed to be present, but I didn't double check everything
[22:33:26] <ipmb> ddrdrive for zil, SAS disks on an adaptec controller
[22:33:36] <SynQ> you say you replaced mobo and processor, did you do a clean shutdown before swapping?
[22:33:49] <ipmb> (we're trying to move off the adaptec controller today)
[22:33:53] <ipmb> yes, clean shutdown
[22:34:16] <ipmb> no special l2arc device
[22:34:25] <SynQ> I'd say focus on the ddrdrive first
[22:34:45] <ipmb> at this point, I'm scared to reboot
[22:35:04] <ipmb> am I going to cause more harm?
[22:35:10] <SynQ> possibly yes
[22:35:15] <ipmb> :)
[22:35:28] <SynQ> easy answer :)
[22:35:38] <SynQ> let's dive in a little deeper
[22:36:09] <SynQ> you do not get a login on the console
[22:36:17] <SynQ> but can you reach the system over the network?
[22:36:38] <ipmb> not at its old address
[22:36:50] <ipmb> suppose I can scan to see if it picked up DHCP on the new controller
[22:38:49] <SynQ> perhaps
[22:39:14] <SynQ> I don't know if there are 'other ways' to know what the system is doing right now
[22:39:44] <SynQ> are the sysvol drives also attached to the adaptec?
[22:40:05] <ipmb> yes
[22:40:15] <ipmb> hardware raided on the adaptec
[22:40:18] <ipmb> raid-1
[22:40:37] <SynQ> which version of nexentastor are you using?
[22:40:42] <ipmb> 3.14
[22:41:00] <SynQ> why hardware raided? that makes no sense
[22:41:10] <ipmb> this is all legacy stuff
[22:41:30] <ipmb> I'm here on a saturday to swap in an LSI controller and "do it right" :)
[22:42:25] <SynQ> 3.14?
[22:42:51] <ipmb> not seeing it on the network to answer your prev. question
[22:43:04] <ipmb> pretty sure 3.14... whatever the current version is
[22:43:37] <ipmb> hmm, maybe 3.0.4
[22:43:43] <SynQ> ah
[22:43:59] <SynQ> that sounds more like a version that exists :)
[22:44:52] <SynQ> you say you are not getting any disk activity any longer
[22:44:59] <SynQ> for how long is that?
[22:45:12] <SynQ> and did you actually see disk activity before?
[22:45:24] <ipmb> at boot, yes all disks were active
[22:45:41] <ipmb> it's been ~30 minutes of almost no activity
[22:45:52] <SynQ> do you have deduplication turned on anywhere?
[22:46:06] <ipmb> every once in a while all lights flash at once
[22:46:06] <ipmb> no dedup
[22:46:25] <ipmb> fwiw, I don't care at all about the OS, it's getting wiped anyway
[22:47:00] <SynQ> huh
[22:47:10] <SynQ> I don't understand
[22:47:14] * ipmb http://img194.imageshack.us/img194/7118/1250826555121s.jpg
[22:47:30] <ipmb> moving from "community" edition to "enterprise"
[22:47:40] <SynQ> why did you turn it on in the first place when you were halfway through upgrading the stuff?
[22:48:11] <ipmb> plan was to drop in new mobo/proc, confirm shit still worked, then begin migrating
[22:48:30] <SynQ> bad plan :P
[22:48:35] <ipmb> got hung up on step 2 there...
[22:48:49] <SynQ> ok
[22:48:59] <SynQ> before you shut it down
[22:49:14] <SynQ> was it being written to heavily?
[22:49:23] <ipmb> no
[22:49:48] <ipmb> everything that was using it was shut down first
[22:49:57] <SynQ> can you for example safely say that there were no writes in the 5 minutes before you shut it down?
[22:51:20] <ipmb> 90% sure of that
[22:51:25] <SynQ> good
[22:51:27] <ipmb> 100% in the 3 minutes before
[22:51:42] <SynQ> here is what I would do:
[22:51:54] <ipmb> and even used the stupid nexentastor shutdown command, `setup appliance poweroff`
[22:51:57] <SynQ> (first read and discuss, don't start doing that right away!)
[22:52:14] <ipmb> I'm not in arm's reach of the server right now, so you're good
[22:52:28] <SynQ> 1. kill the system by yanking power from it
[22:52:41] <SynQ> 2. remove all disks that make up your pool(s)
[22:53:14] <SynQ> 3. remove the ddrdrive
[22:53:26] <SynQ> 4. install the LSI card
[22:53:42] <SynQ> 5. start the system
[22:54:03] <SynQ> 6. make sure your nexenta sees all system components
[22:54:17] <SynQ> (like ram, network, memory, etc)
[22:54:31] <ipmb> fyi, http://serverfault.com/questions/257044/migrate-zpool-to-new-sas-controller
[22:54:31] <SynQ> 7. shut it down again
[22:54:44] <ipmb> my migration isn't just swapping out the cards
[22:54:50] <SynQ> 8. install the ddrdrive
[22:55:00] <SynQ> 9. make sure that works
[22:55:24] <SynQ> 10. if it all works, insert your pool drives and boot it up again
[22:56:26] <ipmb> seems reasonable
[22:56:41] <SynQ> uh
[22:57:10] <SynQ> do you have a copy of the data?
[22:57:35] <ipmb> no :(
[22:58:25] <SynQ> what happened to the 'zfs send all data to the temporary pool' step?
[22:58:54] <ipmb> that was going to happen after this
[22:59:00] <SynQ> ok
[22:59:07] <ipmb> in hindsight, would have been better to do on the old system
[22:59:10] <SynQ> I change my recommendation
[22:59:19] <SynQ> kill it by yanking power
[22:59:26] <SynQ> build it back to its old spec
[22:59:29] <ipmb> I wanted to take advantage of the faster hardware for the transfer
[22:59:33] <SynQ> see if it still works
[22:59:41] <ipmb> yep
[22:59:45] <SynQ> then make backups
[22:59:51] <SynQ> then proceed
[23:00:07] <ipmb> alright, I'll report back if you're still around...
[23:00:15] <SynQ> it's 23:00 here
[23:00:24] <SynQ> I'll be heading for bed soon
[23:00:27] <ipmb> have a goodnight :)
[23:00:29] <ipmb> thx for the advice
[23:00:40] <SynQ> would you like some more advice?
[23:00:45] <ipmb> sure
[23:00:57] <SynQ> I would build it back to its old state
[23:01:03] <SynQ> and replan for next week
[23:01:20] <SynQ> so you can have some more thought about how to do it
[23:01:22] <ipmb> why?
[23:01:31] <ipmb> I've run it by a few people
[23:01:34] <SynQ> ah
[23:01:39] <ipmb> a couple of them "professionals"
[23:01:55] <SynQ> and none of those advised to make backups?
[23:02:08] <ipmb> probably goes without saying...
[23:02:17] <ipmb> time and amount of data was an issue
[23:02:25] <SynQ> what I would also do
[23:02:33] <SynQ> is check how heavily your zil is being used
[23:02:53] <SynQ> if you can do without the ddrdrive your data will be a lot safer
[23:03:04] <ipmb> ok
[23:03:27] <ipmb> we ran without it for a long time, but recently, we've needed it
[23:03:52] <SynQ> if you can afford it
[23:04:02] <SynQ> get an stec ZeusRAM instead
[23:04:04] <ipmb> I can already tell you, we can't
[23:04:11] <SynQ> ah
[23:04:22] <SynQ> understood
[23:04:25] <ipmb> I'm going to give this a shot, thx again for your help
[23:04:30] <SynQ> no problem
[23:04:34] <SynQ> good luck
[23:04:39] *** ipmb is now known as ipmb|away
[23:29:11] *** ipmb|away is now known as ipmb
[23:29:33] <ipmb> SynQ: got the old hardware back in
[23:29:38] <SynQ> and?
[23:29:49] <ipmb> no errors, but no login prompt
[23:29:51] <ipmb> (yet)
[23:30:00] <SynQ> no errors is good
[23:30:06] <ipmb> no network yet either
[23:30:11] <ipmb> seeing disk activity
[23:30:19] <SynQ> let it run for a while
[23:30:39] <ipmb> what is "a while"
[23:31:04] <SynQ> half an hour
[23:31:14] <ipmb> k
[23:31:29] <ipmb> what would be happening during that half hour?
[23:31:38] <SynQ> dunno :P
[23:31:46] <SynQ> it could be resilvering the pool
[23:31:58] <SynQ> is there any way you can tell the ddrdrive is working?
[23:32:01] <ipmb> that could take hours on our data
[23:32:29] <SynQ> what type of raid config do you use?
[23:32:32] <ipmb> it goes through an initialization on startup
[23:32:33] <ipmb> which looked successful
[23:33:05] <SynQ> what do you see on the console then right now?
[23:33:06] <ipmb> multiple raidz2s
[23:33:41] <ipmb> the last thing it showed was a network card failure (which is normal, nothing is plugged into it)
[23:33:59] <SynQ> is that 'normal'?
[23:34:03] <ipmb> yes
[23:34:07] <SynQ> ok
[23:34:15] <ipmb> not the slow login prompt though
[23:34:40] <ipmb> still lots of disk activity and the console is responsive
[23:34:58] <ipmb> that is, if I push [enter] the cursor moves
[23:34:58] <SynQ> hmm
[23:35:26] <ipmb> it's responding to pings
[23:35:32] <SynQ> ah
[23:35:37] <SynQ> can you ssh to it?
[23:35:42] <ipmb> not yet
[23:37:25] <SynQ> my guess is that it is resilvering something
[23:38:10] <SynQ> other question..
[23:38:20] <SynQ> is this system normally not backed up either?
[23:38:26] <ipmb> ssh might be working...
[23:38:43] <ipmb> no, it's a long and sordid story...
[23:38:50] <ipmb> typically offsite nightlies
[23:39:11] <SynQ> I do not envy you
[23:40:40] <SynQ> still lots of disk activity?
[23:40:45] <ipmb> I'm in on SSH
[23:40:49] <SynQ> good
[23:40:55] <ipmb> both pools resilvering
[23:41:00] <SynQ> good too
[23:41:15] <SynQ> let that finish before you do anything else to it
[23:41:32] <SynQ> if you love your data
[23:41:37] *** tsukasa has quit IRC
[23:42:13] <ipmb> SynQ... this looks bad
[23:42:15] <ipmb> http://pastebin.com/PSfenjsp
[23:44:09] <SynQ> tank FAULTED 0 0 0 bad intent log
[23:44:27] <SynQ> that looks like your ddrdrive is not in working order
[23:44:29] *** drac_h has joined ##nexenta
[23:44:34] <ipmb> yes
[23:44:37] *** drac_h has quit IRC
[23:44:51] <ipmb> c2t0d0 UNAVAIL 0 0 0 cannot open
[23:44:59] <ipmb> that's the ddrdrive
[23:45:02] <SynQ> I'm afraid that I do not know enough to help you further
[23:45:21] <SynQ> you need someone from the nexenta corp support team to help you out
[23:45:34] <ipmb> thx SynQ
[23:45:46] <SynQ> I have (fortunately) not been in this kind of situation ever before
[23:46:40] <SynQ> https://www.nexenta.com/corp/contact-us is your best starting point from here
[23:46:49] <SynQ> don't turn the box off
[23:46:58] <SynQ> make sure you get in contact with support
[23:48:58] <SynQ> what I do know is that you cannot 'normally' start your pool if there used to be a zil device which is not available now
[23:49:04] <SynQ> no zil no pool
[23:49:30] <SynQ> but I also know that this is going to be fixed in 3.1 and that support has a way around this for current versions
[23:49:37] <ipmb> I thought that was fixed in recent versions
[23:49:58] <SynQ> oh
[23:50:03] <SynQ> that could also be true
[23:50:23] <SynQ> but in that case I don't know what you have to do to start without that zil
[23:50:42] <SynQ> I'm actually quite curious how that works
[23:51:00] <SynQ> I think I'm going to test that in a vm
[23:51:11] <SynQ> hope I find the time for that soon
[23:51:19] <SynQ> but now it's time for bed really
[23:51:27] <SynQ> midnight
[23:51:44] <SynQ> ipmb: are you coming to the nexenta user conference in the fall?
[23:52:06] <ipmb> no
[23:52:15] <SynQ> pity
[23:52:36] * ipmb not sure how to get anybody at nexenta on the weekend
[23:52:55] <SynQ> call?
[23:53:19] <ipmb> voicemail
[23:53:23] <SynQ> surely there will be some sort of 24/7 support option
[23:53:59] <SynQ> leave your number and distress signal in the voicemail
[23:55:41] <SynQ> I really don't know
[23:55:54] <ipmb> thx for your help, I'll track them down...
[23:55:58] <SynQ> sorry I cannot help you further
[23:56:02] <SynQ> good luck
[23:56:10] <SynQ> and good night :)
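
Editor's note: the two status rows quoted at 23:44 ("tank FAULTED ... bad intent log" and "c2t0d0 UNAVAIL ... cannot open") are what identified the ddrdrive as the problem. The sketch below shows one way such rows might be picked out of saved `zpool status` text. It is only an illustration: the helper name `unhealthy_rows`, the column pattern, and the set of "healthy" states are assumptions, the sample holds just the two rows quoted in the channel (the full pastebin output is not reproduced), and it does not run `zpool` itself.

```python
import re

# Only the two rows quoted in the channel; the rest of the paste is unknown.
SAMPLE = """\
tank FAULTED 0 0 0 bad intent log
c2t0d0 UNAVAIL 0 0 0 cannot open
"""

# Assumed row layout: NAME STATE READ WRITE CKSUM [message].
# Real output can abbreviate large counters (e.g. "1.2K"); this sketch
# only handles plain integers.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)\s*(.*)$")

# Assumption: states treated as "nothing to report" for this sketch.
HEALTHY = {"ONLINE", "AVAIL", "INUSE"}


def unhealthy_rows(status_text):
    """Return (name, state, message) for every row not in a healthy state."""
    rows = []
    for line in status_text.splitlines():
        m = ROW.match(line)
        if m and m.group(2) not in HEALTHY:
            rows.append((m.group(1), m.group(2), m.group(6).strip()))
    return rows


if __name__ == "__main__":
    for name, state, note in unhealthy_rows(SAMPLE):
        print(f"{name}: {state} ({note})")
```

The header line of real output ("NAME STATE READ WRITE CKSUM") is skipped automatically because its counter columns are not digits.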