[00:00:23] <mikl> how do I debug why a newly created smartos zone is failing/crashing?
[00:00:57] <mikl> when I create it and start it, I can login with zlogin, but after a short time, I'm disconnected
[00:01:30] <mikl> and when I try to modify the zone afterwards with vmadm, it complains that the zone state is "failed"
[00:06:45] <Licenser> hmm for some reason the erlang probes don't work for me on SmartOS, has anyone experienced that before?
[00:07:07] *** jbergstroem has quit IRC
[00:14:42] *** jbergstroem has joined #smartos
[00:14:56] *** jesser has joined #smartos
[00:15:34] *** rancor has quit IRC
[00:20:21] *** Tekni has joined #smartos
[00:23:31] *** xinkeT has quit IRC
[00:32:25] <jeffpc> is GZ's devfsadmd supposed to exit at any point?
[00:32:29] *** jim80net has quit IRC
[00:58:58] *** Ymx1 is now known as blu
[00:59:17] *** tonyarkles has quit IRC
[01:09:32] *** tru_tru has quit IRC
[01:18:18] <richlowe> jeffpc: ideally, no
[01:19:01] <jeffpc> richlowe: I did notice it disappearing on me (and getting started up again)
[01:24:28] *** socketwiz has joined #smartos
[01:48:56] <rmustacc> jeffpc: Look at the service log then. And look for core dumps in /zones/global/cores
[01:50:08] <jeffpc> rmustacc: I have a core
[01:50:41] <jeffpc> want it?
[01:52:13] <jeffpc> btw, my super-slow 'vmadm create -f foo' seems like it may be related to zcons (richlowe suggested I mention that since there were some zcons related changes)
[02:04:23] <jeffpc> specifically, devfsadmd does an ioctl to load a driver, and during a device walk we eventually end up calling ndi_devi_enter
[02:04:46] <jeffpc> but we end up in cv_wait on line 1926
[02:04:52] <jeffpc> for ~50 seconds
[02:05:25] <jeffpc> time for me to go home
[02:05:33] <jeffpc> more debugging tomorrow evening
[02:11:37] <rmustacc> In general I want cores.
[02:11:45] <rmustacc> My time to look at it is less free at the moment.
[02:13:02] *** enmand_ has joined #smartos
[02:14:19] *** dho has joined #smartos
[02:18:18] *** enmand has quit IRC
[02:18:19] *** dho_ has quit IRC
[02:18:20] *** jelmd has quit IRC
[02:18:20] *** Xenith has quit IRC
[02:20:23] *** AlainODea_ has joined #smartos
[02:23:40] *** Xenith has joined #smartos
[02:26:52] <AlainODea_> Does SmartDataCenter have an equivalent feature to vSphere HA? I see high availability in the feature set, but not in the documentation.
[02:29:16] <konobi> what does it do?
[02:31:57] *** jelmd has joined #smartos
[02:32:14] <konobi> the vsphere dodad that is
[02:32:35] <AlainODea_> konobi: It lets a VM essentially exist on two physical hosts running the hypervisor. One VM active, one VM standby. If the host with the active mirror VM gets disconnected the other makes the VM active there.
[02:33:27] <AlainODea_> konobi: memory and some other state (IO etc.) is mirrored live
[02:34:54] <konobi> nope, no support for that
[02:36:28] <AlainODea_> konobi: Thank you :) vSphere HA doesn't work nearly as smoothly as advertised, but the feature is becoming a possible sticking point for switching to SmartOS and SDC
[02:37:41] <AlainODea_> konobi: personally I think it is the wrong place for HA in any event, very much like SANs are also flawed
[02:38:15] <konobi> a man after our own hearts
[02:38:37] <AlainODea_> konobi: more fuel for my plan to do HA in the services and app themselves.
[02:40:08] *** tonyarkles has joined #smartos
[02:40:39] *** ahaydock has quit IRC
[02:41:28] <AlainODea_> konobi: I have a fairly epic proposal for using SmartOS in dev/test. SmartOS facilitates SaaS testing in ways that make it hard not to write a book, but I have to fit it into an email for now.
[02:42:05] <konobi> heh
[02:42:35] <AlainODea_> konobi: thank you for your help. Have a good evening :)
[02:43:17] <arekinath> if something is really worth HA, it's worth proper HA at the next layer up that has no cutover delay and will survive a SAN being set on fire
[02:43:27] <arekinath> imho. :P
[02:43:56] <konobi> AlainODea_: np
[02:44:04] <arekinath> I usually find if the business can tolerate a service going down, it can tolerate it going down for 30 seconds (vmware HA failover delay) or 10 minutes just as well
[02:45:03] <arekinath> because it's not the sort of service where every second is costing you money
[02:45:10] <arekinath> if it is that, you need real HA
[02:48:25] <AlainODea_> arekinath: I totally agree. The timeout nonsense proves that active/standby is practically useless for a live service. Worse than useless really since it gives people a false sense of security.
[02:49:59] <arekinath> well, I think that kind of "fake" HA is a very poor trade for the performance difference between vmware with SANs versus smartos with zfs on local SAS disks (with or without SSDs too)
[02:50:48] <arekinath> haha.. I'm sure you could have something about the 10 minutes versus 30 seconds being worth the fact that users can get their jobs done more quickly using a more performant service or something
[02:50:49] <arekinath> ;)
[02:52:28] *** tonyarkles has quit IRC
[02:57:15] *** wolstena has quit IRC
[02:58:57] *** AlainODea_ has quit IRC
[03:01:46] *** AlainODea_ has joined #smartos
[03:12:18] *** AlainODea_ has quit IRC
[03:19:04] *** Vod has quit IRC
[03:20:03] *** tonyarkles has joined #smartos
[03:21:24] *** potatosalad has quit IRC
[03:30:37] *** potatosalad has joined #smartos
[03:40:24] *** wolstena has joined #smartos
[03:42:51] *** potatosalad has quit IRC
[03:45:50] *** newlix has joined #smartos
[03:47:52] *** jxh has joined #smartos
[03:51:31] *** jxh has quit IRC
[03:54:31] *** jxh has joined #smartos
[03:57:32] *** jxh has quit IRC
[03:57:49] *** jxh has joined #smartos
[04:00:16] *** ira has quit IRC
[04:05:20] *** d[^_^]b has quit IRC
[04:05:28] *** d[^_^]b has joined #smartos
[04:15:17] *** jxh has quit IRC
[04:16:50] *** socketwiz has quit IRC
[04:19:51] <richlowe> jeffpc: actually, what rmustacc said, if you can get me a crash dump, I'll look too
[04:19:56] <richlowe> where rmustacc has no free time, I'm bored and always willing to do things that stop me noticing my email.
[04:20:39] *** jxh has joined #smartos
[04:31:58] *** jxh has quit IRC
[04:39:03] *** jxh has joined #smartos
[04:48:03] <konobi> ugh... mysql cmake, why do you mess with my head
[04:50:09] <richlowe> if you encounter a build system that is not GNU make, you know someone is out of their damn mind
[04:50:13] <richlowe> and you can adjust your expectations accordingly.
[04:50:37] <richlowe> and if you encounter people who seem to switch with the seasons, you should raise that to the power of the number of times they've switched
[04:50:45] <richlowe> basically, node are maximally crazy, on this scale.
[04:51:11] *** potatosalad has joined #smartos
[04:52:20] *** wolstena has quit IRC
[04:53:09] *** tonyarkles has quit IRC
[05:00:28] *** jxh has quit IRC
[05:06:59] *** potatosalad has left #smartos
[05:07:02] *** potatosalad has joined #smartos
[05:07:27] *** darjeeling has quit IRC
[05:13:37] *** tonyarkles has joined #smartos
[05:16:01] *** jasonpincin has quit IRC
[05:17:18] *** jasonpincin has joined #smartos
[05:19:10] <richlowe> I think I have the vbox equivalent somewhere
[05:19:26] <richlowe> I keep meaning to try to determine if there's any hope of fixing F1-a
[05:20:08] <richlowe> like, it not working _all the time_ is totally logical
[05:20:17] <richlowe> but I haven't seen it work since, say, 2005
[05:20:22] <richlowe> which seems less logical
[05:33:26] *** tonyarkles has quit IRC
[05:47:37] *** potatosalad has quit IRC
[05:57:41] <konobi> Aram hasn't been back recently, right?
[05:58:55] <richlowe> not that I've seen.
[06:00:27] *** e^ipi has quit IRC
[06:13:25] *** sachinsharma has joined #smartos
[06:21:24] <konobi> mkay... i'll have to give it a try on some hardware at home to see if i can match up
[06:28:57] *** jesser_ has joined #smartos
[06:32:18] *** jesser has quit IRC
[06:32:26] *** scubasteve has joined #smartos
[06:32:34] <scubasteve> hello and greetings everyone
[06:32:59] <scubasteve> I seem to be having some trouble with smartos. namely I cannot ping anything with a hostname
[06:33:05] <scubasteve> only ip addresses
[06:33:37] <scubasteve> I have read that dns is off by default in smartos in the global zone, just wondering how I can enable it. I cannot route to datasest.joyent.com at the moment
[06:33:49] <scubasteve> datasets.joyent.com
[06:33:51] <scubasteve> typo
[06:33:53] *** jesser_ has quit IRC
[06:35:05] <scubasteve> any suggestions on how I can fix this? :(
[06:37:16] <konobi> scubasteve: did you set up DNS resolvers when you installed?
[06:37:57] <scubasteve> it was a few days ago. I just finally got my sas cards recognized after cross-flashing and modifying the device_aliases list
[06:38:05] <scubasteve> I forgot what I did during the install :)
[06:39:06] <scubasteve> is there a way to manually add them now?
[06:39:13] <scubasteve> I still have a pretty clean install at the moment
[06:41:06] <konobi> `cat /usbkey/config`
[06:41:14] <konobi> do you see anything about resolvers?
[06:42:25] <scubasteve> dns_resolvers=8.8.8.8,8.8.4.4
[06:42:49] <konobi> mkay, it's probably your default gateway at issue then
[06:42:56] <scubasteve> hmmmm
[06:43:09] <scubasteve> admin_gateway=dhcp
[06:43:16] <scubasteve> admin_ip=dhcp
[06:43:30] <konobi> yeah, best to have static for those
[06:43:43] <scubasteve> I have lots of other machines here that can resolve by host that use pure DHCP, mac and windows and linux
[06:43:51] <scubasteve> ahhhh
[06:44:03] <scubasteve> ok, let me try adding static gateway address and ip
[06:44:04] <konobi> there's also a default_gateway option that you want to match up
[06:45:47] <scubasteve> ok
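konobi's check can be scripted. The following is a minimal sketch, not an official SmartOS tool: it assumes the config file uses plain `key=value` lines as in the output scubasteve pasted, and the function name `dhcp_keys` is made up for illustration. It only reports which of the two keys quoted in the log are still set to `dhcp`; what static values to put there depends on the local network.

```shell
# List which admin network keys in a key=value config file are still "dhcp".
# On SmartOS the file konobi refers to is /usbkey/config.
# dhcp_keys is a hypothetical helper name, not part of SmartOS.
dhcp_keys() {
    # $1: path to the config file
    # Match only the two keys quoted in the log, keep the key name.
    grep -E '^(admin_ip|admin_gateway)=dhcp$' "$1" | cut -d= -f1
}
```

Calling `dhcp_keys /usbkey/config` on a box configured like scubasteve's would print the keys that konobi recommends making static.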
[06:57:05] *** potatosalad has joined #smartos
[07:02:58] *** ryancnelson has joined #smartos
[07:09:42] *** potatosalad has quit IRC
[07:16:30] *** e^ipi_laptop has quit IRC
[07:19:08] *** e^ipi_laptop has joined #smartos
[07:19:47] *** tonyarkles has joined #smartos
[07:25:18] <konobi> AlainODea: another canuck... ullo
[07:25:48] *** scubasteve has quit IRC
[07:26:52] *** scubasteve has joined #smartos
[07:31:38] *** tonyarkles has quit IRC
[07:32:49] *** jamesd has quit IRC
[07:34:22] <MerlinDMC> morning
[07:42:30] *** kevinykchan has quit IRC
[07:54:15] <scubasteve> morning
[07:54:57] *** potatosalad has joined #smartos
[07:56:25] <MerlinDMC> scubasteve, DNS in the smartos global zone is active
[07:56:51] <MerlinDMC> it is not on SDC compute nodes (or was not enabled at least - don't know how it is now)
[08:00:50] *** tonyarkles has joined #smartos
[08:01:36] *** newlix has quit IRC
[08:02:03] *** newlix has joined #smartos
[08:06:17] *** tonyarkles has quit IRC
[08:08:01] *** potatosalad has quit IRC
[08:13:42] *** mamash has joined #smartos
[08:14:27] *** potatosalad has joined #smartos
[08:19:44] *** alucardX has joined #smartos
[08:27:55] *** potatosalad has quit IRC
[08:34:43] *** scubasteve has quit IRC
[08:41:28] *** dubban has quit IRC
[08:49:20] *** kschiess has joined #smartos
[08:51:03] *** darjeeling has joined #smartos
[08:53:08] *** darjeeling has quit IRC
[09:01:30] *** darjeeling has joined #smartos
[09:05:49] *** darjeeling has quit IRC
[09:23:00] *** texarcana has quit IRC
[09:24:15] *** texarcana has joined #smartos
[09:46:38] *** alcir has joined #smartos
[09:56:25] <alcir> set ip-type=shared
[09:56:35] <alcir> is it possible to configure a zone that way?
[09:57:45] *** sachinsharma_ has joined #smartos
[10:01:39] *** sachinsharma has quit IRC
[10:01:39] *** sachinsharma_ is now known as sachinsharma
[10:03:09] *** dubban has joined #smartos
[10:09:53] *** leecallen has quit IRC
[10:09:55] *** KermitTheFragger has joined #smartos
[10:10:03] *** leecallen has joined #smartos
[10:16:59] *** darjeeling has joined #smartos
[10:25:44] *** rawtaz has joined #smartos
[10:26:26] *** sachinsharma_ has joined #smartos
[10:28:58] <jperkin> rawtaz: not at the moment, I will hopefully do a build for 2012Q4 though
[10:29:23] <rawtaz> i see :)
[10:30:04] *** sachinsharma has quit IRC
[10:30:04] *** sachinsharma_ is now known as sachinsharma
[10:30:09] <jperkin> I was hoping it'd be much simpler with fusion 5, but it doesn't work with the recovery partition, so will have to do the hacky chroot build instead
[10:30:16] <rawtaz> i guess that these builds are currently not going to happen more than once per quarter, so one will not be able to get updates that happen to pkgsrc after each release?
[10:30:45] <jperkin> not unless I get a system to do them on other than my laptop, no :)
[10:30:54] <rawtaz> hmm, odd. never heard of that problem with the recovery partition :)
[10:31:00] <rawtaz> ah, i see. makes sense
[10:31:27] <jperkin> well, during install it checks whether you purchased osx in the store, and fails if not
[10:31:30] <rawtaz> how long does it take to build these 7000+ packages?
[10:31:35] <rawtaz> on your lappy, for example
[10:31:47] <jperkin> it took about a week
[10:31:48] <rawtaz> ah ok, for 10.7+ i take it
[10:31:51] <rawtaz> oh wow
[10:32:25] <jperkin> whereas I can build a full 9,000+ set on my SmartOS cluster in just over a day
[10:32:55] <rawtaz> would be cool to do it there :)
[10:33:08] <rawtaz> or have an esxi box sitting somewhere with os x on it to build
[10:34:12] <MerlinDMC> jperkin, you need osx 10.6 + xcode to build those packages?
[10:34:25] <jperkin> MerlinDMC: I used the minimal xcode install
[10:34:42] <jperkin> which actually results in fewer packages, as a bunch require the stuff in /Developer
[10:34:50] <jperkin> I'll hopefully fix that for the next build
[10:35:58] <MerlinDMC> it's a mess that apple disallows virtualization on those osx versions -.-
[10:38:32] <rawtaz> it truly is.
[10:40:54] *** haydock has joined #smartos
[10:43:22] *** dubban has quit IRC
[10:44:33] *** dubban has joined #smartos
[10:48:56] *** sachinsharma has quit IRC
[10:57:32] *** sachinsharma has joined #smartos
[11:02:13] *** darjeeling has quit IRC
[11:30:58] *** dubban has quit IRC
[11:31:28] *** hafthorr has quit IRC
[11:37:47] *** dubban has joined #smartos
[11:43:13] *** jamesd has joined #smartos
[12:02:19] *** ira has joined #smartos
[12:12:54] *** newlix has quit IRC
[12:13:22] *** newlix has joined #smartos
[12:32:50] *** Mareo has quit IRC
[12:35:33] *** sachinsharma has quit IRC
[12:40:03] *** darjeeling has joined #smartos
[13:00:20] *** e^ipi_laptop has quit IRC
[13:00:31] *** Mareo has joined #smartos
[13:06:33] *** arx has joined #smartos
[13:11:40] *** rc10 has joined #smartos
[13:11:59] <rc10> hi, how do I find the cpu affinity of a pid ?
[13:11:59] <rc10> i.e. find which processor a pid is running on
[13:12:00] <rc10> any command or dtrace script ?
[13:16:59] *** jesser has joined #smartos
[13:24:33] *** jesser has quit IRC
[13:25:26] *** haydock has quit IRC
[13:29:58] *** socketwiz has joined #smartos
[13:56:00] *** enmand_ has quit IRC
[13:59:10] *** darjeeling has quit IRC
[14:23:03] *** eule has joined #smartos
[14:26:39] *** kschiess has quit IRC
[14:34:57] *** enmand has joined #smartos
[14:53:03] *** ahaydock has joined #smartos
[14:53:45] <nahamu> rc10: pbind -Q
[14:54:00] <nahamu> man pbind
[14:54:14] <nahamu> but I'm guessing it's a bad idea...
[14:54:17] <rc10> thnx, is there a way to bind a command to cpu ?
[14:54:26] <nahamu> yes, with the pbind command.
[14:54:36] <nahamu> pbind -b <cpuid> <pid>
[14:54:40] <nahamu> look in the man page.
[14:54:50] <nahamu> but why do you want to bind a process to a cpu?
[14:54:57] <rc10> no, i want to trigger a command on a cpu - say "date" should run on cpu 2
[14:55:03] <rc10> pbind is for process only
[14:55:33] <rc10> similar to taskset in linux
[14:55:54] <nahamu> why do you want to do that? what makes you smarter than the scheduler in this instance?
[14:56:38] <rc10> there is no guarantee that commands triggered will be equally distributed across cpus
[14:57:19] <nahamu> are you talking about short-lived commands or long-lived ones?
[14:57:36] <rc10> long lived
[14:57:46] <rc10> say "du " on 2Tb disk
[14:57:49] <nahamu> if long lived, the penalty of firing it up and having it be wherever it ends up until you pbind it seems like not a big deal to me.
[14:58:16] <jperkin> rc10: man psrset
[14:58:42] <nahamu> ah, but it looks like there is indeed a command for that. :)
[14:58:49] <nahamu> thanks jperkin!
[15:00:33] <nahamu> not sure why my eyes skipped past that in pbind's SEE ALSO section...
[15:00:42] *** Sachiru has joined #smartos
[15:02:49] <rc10> # psrset -e 2 date
[15:02:49] <rc10> psrset: cannot exec in processor set 2: Invalid argument
[15:03:05] <nahamu> rc10: you might have to define the processor set first...
[15:08:42] <rc10> thnx
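The commands discussed above fit together roughly as follows. This is a sketch under assumptions: the wrapper function names are made up for illustration, and only the flags are taken from the pbind and psrset man pages (`pbind -q`, `pbind -b`, `psrset -c`, `psrset -e`). The error rc10 hit is consistent with nahamu's guess: `psrset -e` takes a processor set id, not a CPU id, so a set has to be created first.

```shell
# Hypothetical wrapper names; the underlying commands are pbind(1M) and psrset(1M).

# Query the current CPU binding of a process.
show_binding() {
    # $1: pid
    pbind -q "$1"
}

# Bind an already-running process to one CPU, as nahamu suggests.
bind_pid() {
    # $1: CPU id, $2: pid
    pbind -b "$1" "$2"
}

# Run a new command inside an existing processor set.
# The set must already exist: create it first with `psrset -c <cpuid>`,
# which reports the id of the new set. That set id (not the CPU id)
# is what -e expects.
run_in_pset() {
    # $1: processor set id, remaining args: command to run
    pset_id=$1
    shift
    psrset -e "$pset_id" "$@"
}
```

For example, after `psrset -c 2` reports a new set, `run_in_pset <set id> date` runs `date` on CPU 2. Note that a CPU assigned to a processor set only runs work bound to that set, so this is a heavier tool than `pbind`.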
[15:12:41] *** jim80net has joined #smartos
[15:23:00] *** rc10 has quit IRC
[15:31:20] *** jasonpincin has quit IRC
[15:31:53] *** pinja has joined #smartos
[15:32:08] *** eule has quit IRC
[15:33:34] *** pinja has joined #smartos
[15:35:31] *** rc10 has joined #smartos
[15:40:52] *** kschiess has joined #smartos
[15:41:11] *** mamash has left #smartos
[15:41:19] *** mamash has joined #smartos
[15:49:53] *** potatosalad has joined #smartos
[15:54:35] *** kamilr has joined #smartos
[15:54:48] <kamilr> Hi, can i somehow change zpool for imgadm ?
[15:55:07] <kamilr> so imgadm will import datasets not to zones but for example data
[15:55:08] <kamilr> ?
[15:55:46] <jperkin> not easily
[15:56:58] <kamilr> so maybe it is easier to rename the zpool zones ?
[15:57:05] <kamilr> to other name, hm?
[15:57:11] <jperkin> no, that would require a lot of changes too
[15:57:20] <kamilr> oh, i see
[15:57:58] <jperkin> 'zones' is hardcoded in lots of places, you would need to create patched versions of each of those files and lofs mount them on top
[15:58:22] <kamilr> as i thought
[16:00:29] <kamilr> is there any way to create VM's on another zpool than zones ?
[16:01:00] <jperkin> again, not without significant changes
[16:01:38] *** enmand has quit IRC
[16:02:25] <kamilr> So what if i have only 20GB on zpool zones ?
[16:02:52] <kamilr> if i can't install VM on another zpool
[16:03:20] <jperkin> the smartos model is to spread your zones across all available disks
[16:03:40] <kamilr> ok, i have 2 disk each 20GB
[16:03:44] <kamilr> i create mirror
[16:03:51] <kamilr> so i have only 20gb
[16:03:52] <jperkin> not to have multiple zones across random sets of disks
[16:04:19] <jperkin> 20GB? seriously? :)
[16:04:21] <MerlinDMC> zones is hardcoded once in imgadm as a "constant" afaik
[16:04:31] <kamilr> i would like to mount FC storage to server
[16:04:41] <kamilr> create zpool and name it data
[16:04:42] <jperkin> I don't recall how long it's been since I saw a 20GB disk :)
[16:05:25] <kamilr> no, i don't have 20gb. assume a hypothetical version
[16:05:34] *** enmand has joined #smartos
[16:05:54] <kamilr> i would like to install all vm on zpool data. I change in manifest zpool variable
[16:06:10] <kamilr> but i think i also need to have image on the same dataset
[16:06:14] <kamilr> for cloning
[16:06:18] <kamilr> am i right ?
[16:06:25] *** tonyarkles has joined #smartos
[16:06:43] *** rc10 has quit IRC
[16:07:16] <MerlinDMC> kamilr, yep
[16:07:30] <kamilr> damn
[16:07:37] <MerlinDMC> you could manually import them to data ... or send / recv them from zones to data
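MerlinDMC's send / recv suggestion could look something like the sketch below. It assumes the image lives in a dataset `zones/<name>` and that the target pool is called `data` (kamilr's hypothetical name); the function name and the snapshot name `migrate` are arbitrary choices for illustration. The log does not confirm that imgadm or vmadm will recognize an image copied this way, and jperkin's point about `zones` being hardcoded still applies.

```shell
# Copy one dataset from the "zones" pool to a pool named "data".
# copy_image_to_data is a hypothetical helper, not a SmartOS command.
copy_image_to_data() {
    # $1: dataset name under zones/ (for an image, typically its UUID)
    img=$1
    snap="zones/$img@migrate"     # "migrate" is an arbitrary snapshot name

    # zfs send needs a snapshot as its source, so take one first,
    # then stream it into the other pool.
    zfs snapshot "$snap" &&
    zfs send "$snap" | zfs recv "data/$img"
}
```

The copy matters because a ZFS clone must live in the same pool as its origin snapshot, which is what kamilr and MerlinDMC agree on above.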
[16:14:11] <aszeszo> hi all, i am still scratching my head how to answer our customer to the question about how much memory his zones are using
[16:14:51] <aszeszo> is there anyone here who knows how to determine how many gigs of the host's 48GB memory pool a zone is actually consuming?
[16:15:03] <MerlinDMC> aszeszo, "sm-meminfo rss"?
[16:16:33] *** darjeeling has joined #smartos
[16:18:22] *** kevinykchan has joined #smartos
[16:19:35] <aszeszo> :(
[16:20:09] <MerlinDMC> aszeszo, it works fine here ... but i use the default uuid zone names everywhere
[16:21:38] <MerlinDMC> also the memory part seems to be there ... but it seems to be cool to have a cap of 512G and usage of 54M ;)
[16:21:40] <jeffpc> (joyent_20121115T191935Z)
[16:23:27] *** wdent has joined #smartos
[16:23:31] <aszeszo> MerlinDMC: the cap is set to 1TB, zones are not capped on this box but I had to use some value, otherwise the kstats the zonememstat tool is using were not getting populated
[16:26:34] <MerlinDMC> aszeszo, sm-meminfo is using "/usr/bin/prstat -Z -s rss" to get the used memory
[16:28:03] *** wdent has quit IRC
[16:28:31] *** wdent has joined #smartos
[16:29:16] <aszeszo> MerlinDMC: yep it is returning 55 gigs on 48GB host
[16:29:34] *** chriss- has quit IRC
[16:29:41] *** rancor has joined #smartos
[16:33:09] *** wdent has quit IRC
[16:40:52] *** wdent has joined #smartos
[16:58:05] *** kschiess has quit IRC
[16:58:59] *** sachinsharma has joined #smartos
[16:59:01] *** kfr- has joined #smartos
[16:59:24] <aszeszo> what does the swap -s output mean inside the zone?
[16:59:26] <kfr-> from time to time, I have a kvm host that stops responding
[16:59:40] *** alucardX has quit IRC
[17:00:14] <kfr-> right now the kvm zone will not respond to ping and its vnc console is frozen
[17:00:37] <kfr-> in the past I have just rebooted the zone and that has fixed the problem
[17:00:59] <kfr-> this time i'd like to collect more information but I'm not sure what to collect
[17:01:06] <aszeszo> kfr-: anything interesting in /zones/<uuid>/root/vm.log ?
[17:01:23] <aszeszo> sorry, it is /zones/<uuid>/root/tmp/vm.log
[17:03:57] *** tonyarkles has quit IRC
[17:04:41] <kfr-> not since dec 1st
[17:06:52] *** enmand has quit IRC
[17:07:20] <kfr-> root 10658 1 0 Dec 01 ? 0:12 zoneadmd -z 54b8d878-1fa3-11e2-97ee-87497812969d
[17:08:16] <kfr-> root 10737 10678 4 Dec 01 ? 16871:47 /smartdc/bin/qemu-system-x86_64 -m 32768 -name 54b8d878-1fa3-11e2-97ee-87497812
[17:10:36] *** enmand has joined #smartos
[17:14:03] *** alcir has quit IRC
[17:14:48] <kfr-> how can I find out more on "OS-717 apix code induces hangs"
[17:19:10] *** montyz0 has joined #smartos
[17:19:16] *** goodbytes has quit IRC
[17:23:41] <rmustacc> kfr-: What do you want to know about that change?
[17:26:35] *** arx has quit IRC
[17:27:33] <kfr-> rmustacc: Its title is "apix code induces hangs". What kind of hang?
[17:27:35] <rmustacc> kfr-: The interesting things to collect / look at are a pstack, kvmstat output, looking at where the kernel threads are.
[17:27:46] *** Sachiru has quit IRC
[17:27:51] <rmustacc> kfr-: What platform are you on?
[17:28:00] <kfr-> xeon
[17:28:07] <rmustacc> kfr-: Sorry, I mean uname -v
[17:28:20] <kfr-> joyent_20121018T224723Z
[17:28:20] *** d[^_^]b has quit IRC
[17:28:28] *** d[^_^]b has joined #smartos
[17:28:29] *** d[^_^]b has quit IRC
[17:29:00] <rmustacc> kfr-: You won't be hitting that, it was always worked around until the fix went in. The hang manifests as a full system hang, you can't dump, can't do anything else with the system, it's entirely locked up.
[17:29:40] <rmustacc> Generally related to the kvm kernel module loading or unloading.
[17:29:57] <kfr-> rmustacc: good
[17:31:14] *** kamilr has quit IRC
[17:31:44] <kfr-> do I simply pstack "pid of frozen qemu guest"
[17:32:03] <rmustacc> First look at the kvmstat output for that pid.
[17:32:16] *** rc10 has joined #smartos
[17:32:48] <rmustacc> If you're seeing a lot of activity then your guest might have just entered into a bad state.
[17:32:55] <rmustacc> eg. guest bug.
[17:33:11] <rmustacc> Of course, with my luck that's probably not what happened and it's hard to prove that it is.
[17:33:28] *** d[^_^]b has joined #smartos
[17:33:42] <kfr-> I see activity every 3 seconds
[17:34:32] *** mamash has left #smartos
[17:35:09] <kfr-> it seems every 3 seconds
[17:37:20] <rmustacc> That guest definitely seems to be doing odd things.
[17:39:50] *** dap has joined #smartos
[17:40:45] <kfr-> rebooting it will fix the problem
[17:41:20] <kfr-> but I have 7 of these vm's and it would be nice to find the root cause
[17:41:41] <kfr-> 7 will be growing to 20+ shortly
[17:41:50] <kfr-> not on the same box
[17:43:57] *** kevinykchan has quit IRC
[17:47:17] *** dap has quit IRC
[17:51:33] *** gkyildirim has joined #smartos
[17:54:54] *** gkyildirim has quit IRC
[17:59:08] <richlowe> jeffpc: mid-way through dlopening the kvm linkmod, and doing bugger all else?
[17:59:55] <jeffpc> richlowe: the system is idle, with the exception of in the evening where I create and delete zones (via vmadm) to track down the latency issue
[18:03:16] *** rc10 has quit IRC
[18:08:54] <richlowe> well, did I miss context? The core you linked is from while vmadm is screwing up, right?
[18:17:05] <jeffpc> no
[18:17:07] <jeffpc> unrelated
[18:17:29] <jeffpc> yesterday, before I got to do more latency digging, I noticed that devfsadmd was gone
[18:17:36] <jeffpc> and I had a core left
[18:18:45] <richlowe> that's... fun?
[18:19:07] <richlowe> were you dtraceing it?
[18:19:35] <richlowe> jeffpc: can you give me more context as to about what may have been going on when this died?
[18:19:41] <richlowe> 'cos what _happened_ is pretty bad, if it happened on purpose
[18:19:53] <richlowe> Wait, no, didn't happen on purpose
[18:20:06] <richlowe> basically, you have a breakpoint in rtld_db_dlactivity
[18:20:17] <richlowe> and nobody around, so you just took SIGTRAP and fell over.
[18:20:27] <richlowe> ld.so.1`rtld_db_dlactivity: int $0x3
[18:20:30] <jeffpc> so, the corefile has a timestamp of 23:24 UTC... which means...
[18:20:51] *** enmand has quit IRC
[18:21:27] *** enmand has joined #smartos
[18:21:45] <jeffpc> I don't know :(
[18:22:10] <jeffpc> I noticed it gone at about :30
[18:22:30] <richlowe> well, it's pretty safe to say that what seems to have happened shouldn't have
[18:22:32] <jeffpc> I think I wasn't dtracing
[18:22:43] <richlowe> but I don't know if there's any way to find out who did it to you.
[18:22:48] <jeffpc> which would mean... idle system
[18:22:58] <richlowe> well, not that idle, or we wouldn't be loading the linkmod
[18:23:02] <jeffpc> maybe one zone; or one zone getting deleted
[18:23:46] <richlowe> but it must be a targeted probe/bp, or they'd be everywhere and we'd have died sooner
[18:24:31] <jeffpc> I was dtracing the process ~24 hours prior
[18:25:39] *** sachinsharma has quit IRC
[18:27:46] <richlowe> that'd stick a bp there via dt_proc_rdwatch, I believe
[18:27:54] <richlowe> but so would basically anything else that smells of debugger
[18:28:19] <jeffpc> I did run truss ~ two days ago
[18:28:28] <jeffpc> but dtrace & truss are it
[18:28:54] <richlowe> yeah, that's a shame
[18:29:17] <richlowe> 'cos I don't know if it's possible to determine which, if either, left this around
[18:29:55] <jeffpc> :/
[18:30:47] <jeffpc> FWIW, I did ^C dtrace a couple of times
[18:31:01] <jeffpc> (start dtrace command; oops wrong one; ^C)
[18:31:05] *** cncfanatics has joined #smartos
[18:31:14] <jeffpc> but that was all before it coredumped
[18:31:20] <richlowe> jeffpc did you use truss -u?
[18:31:53] <jeffpc> no
[18:32:14] <richlowe> well, probably dtrace then.
[18:32:16] <jeffpc> that's all I remember
[18:32:20] <richlowe> I think we only screw with rtld events iff -u
[18:32:49] <jeffpc> few days ago, I did have some broad dtrace going
[18:33:25] <jeffpc> definitely fbt::entry, but not sure what I did with the pid provider exactly
[18:34:29] <richlowe> Not sure actually using pid would be necessary, the rtld stuff is necessary to find out what exists _for_ pid
[18:35:59] <richlowe> and, happily enough, this is a bug that fishworks hit
[18:36:03] <richlowe> and it appears ignored.
[18:37:48] *** wdent has quit IRC
[18:43:06] <richlowe> how very annoying.
[18:43:47] <richlowe> rmustacc: you guys hit a bug whereby, if you used dtrace -p and a script no longer available to mortals, and interrupted dtrace, it'd sometimes do exactly what it's done to jeffpc
[18:43:51] <richlowe> rmustacc: presumably for the same reasons.
[18:44:13] <richlowe> "This happened three times out of several more"
[18:44:26] <richlowe> I fail to see how it's not shameful.
[18:45:34] <rmustacc> It wasn't a bug that fishworks fixed was it?
[18:45:41] <richlowe> I would have no idea
[18:45:53] <richlowe> I was hoping you would, or could ask Bryan if he remembered it
[18:46:06] <rmustacc> I'll try and ask Bryan about it.
[18:46:45] <richlowe> also, devfsadmd is the poster child for why pgrep's ability to match so few characters is annoying.
[18:46:51] <richlowe> pgrep -n devfsamd # haha, the d screws you
[18:46:56] *** KermitTheFragger has quit IRC
[18:47:58] *** cncfanatics has quit IRC
[18:49:01] <richlowe> jeffpc: don't suppose you have the DTrace script handy?
[18:49:09] <richlowe> jeffpc: 'cos depending on the value of "several", I can't yet succeed in breaking it
[18:50:05] <jeffpc> richlowe: I have some scripts, but they all trace only fbt
[18:50:16] <jeffpc> I did run a bunch via -n in the shell
[18:50:21] <jeffpc> those are gone
[18:52:02] <richlowe> clearly smartos should put HISTFILE on the zpool :)
[18:52:10] <jeffpc> :)
[18:53:13] <richlowe> yeah, I can't seem to get it by hand, at least.
[19:00:54] *** ryancnelson1 has joined #smartos
[19:03:27] *** CarlosC has joined #smartos
[19:05:36] *** CarlosC has quit IRC
[19:07:05] <MerlinDMC> heyho!
[19:13:51] *** tonyarkles has joined #smartos
[19:15:13] *** jim80net has quit IRC
[19:16:35] *** wdent has joined #smartos
[19:17:06] *** kevinykchan has joined #smartos
[19:17:12] *** wdent has quit IRC
[19:17:31] *** wdent has joined #smartos
[19:20:19] *** sjorge has quit IRC
[19:20:35] *** sjorge has joined #smartos
[19:32:15] *** wolstena has joined #smartos
[19:33:13] *** jim80net has joined #smartos
[19:36:19] *** marsell has joined #smartos
[19:38:34] *** dap has joined #smartos
[19:38:37] <ira> Ok, that's interesting. I have a zone based off 1.8.1 64 bit. I have it redirecting syslog to another server. On boot of the zone, it doesn't work. If I restart the service it does.
[19:47:14] <MerlinDMC> ira, does the rsyslog service get started while the network is down?
[19:47:35] <ira> It shouldn't if it is all brought up by smf.
[19:47:44] <richlowe> and also if the dependencies are correct.
[19:48:02] <MerlinDMC> ira, only if the deps in the smf manifest require network :)
[19:48:04] *** chriss- has joined #smartos
[19:49:15] *** d[^_^]b has quit IRC
[19:49:22] *** d[^_^]b has joined #smartos
[20:02:03] <ira> It looks right
[20:03:06] *** leecallen has quit IRC
[20:04:29] <ira> Unless dns/client lies about when it is ready…
[20:30:24] *** dubban has quit IRC
[20:36:08] <ira> Figured it out… PEBCAK.
[20:36:23] <rmustacc> What is it?
[20:36:38] <ira> I was/am using NIS, and didn't have it listed to resolve name service.
[20:36:51] <ira> And worse, it has to be in front of dns… :/
[20:37:13] <ira> So… I need to figure out which one of about 7 ways I want to fix that.
[20:38:26] <richlowe> when you say "has to be in front of dns" you mean for local reasons, right?
[20:39:05] <ira> I'm not sure… if I didn't list them in nis dns order it didn't work, once I disabled nis it worked.
[20:39:16] <richlowe> that's a bit worrying, too :\
[20:39:16] <ira> (The latter is a better solution in many ways… but.)
[20:39:31] <richlowe> I think you may just need the NOTFOUND magic?
[20:39:40] <ira> ?
[20:40:34] <richlowe> NOTFOUND=continue, and all that, see nsswitch.conf(4) (also, this is totally just random memory on my part, it could be wildly inaccurate)
[20:41:00] <ira> Somewhat. In the end, I don't want to resolve names with NIS period.
[20:41:26] <ira> But because it is marked as a valid name service… once it is up… name-service is good to go.
[20:44:22] <ira> notfound=continue should do it…
[20:57:02] *** dap has quit IRC
[20:57:34] *** dap has joined #smartos
[20:57:47] *** rawtaz has left #smartos
[20:58:33] *** tru_tru has joined #smartos
[21:04:05] *** e^ipi has joined #smartos
[21:07:42] *** notmatt has quit IRC
[21:26:19] *** tonyarkles_ has joined #smartos
[21:26:46] *** wdent_ has joined #smartos
[21:27:20] *** wolstena1 has joined #smartos
[21:27:24] *** fsteinelX has joined #smartos
[21:29:59] *** kevinykchan has quit IRC
[21:30:24] *** chris--- has joined #smartos
[21:32:05] *** kevinykchan has joined #smartos
[21:33:35] *** richlowe` has joined #smartos
[21:33:46] *** rodgort` has joined #smartos
[21:33:48] *** kfr- has quit IRC
[21:33:58] *** matthewp- has joined #smartos
[21:34:19] *** fly_ has joined #smartos
[21:34:44] *** tru_tru has quit IRC
[21:34:46] *** chriss- has quit IRC
[21:34:47] *** wolstena has quit IRC
[21:34:47] *** wdent has quit IRC
[21:34:47] *** tonyarkles has quit IRC
[21:34:52] *** papertigers has quit IRC
[21:34:53] *** matthewpucc has quit IRC
[21:34:54] *** richlowe has quit IRC
[21:34:56] *** rodgort has quit IRC
[21:34:58] *** fly__ has quit IRC
[21:35:00] *** tonyarkles_ is now known as tonyarkles
[21:38:53] *** tru_tru has joined #smartos
[21:41:48] *** tonyarkles has quit IRC
[21:43:45] *** bdha_ has joined #smartos
[21:43:57] *** dubban has joined #smartos
[21:43:58] *** enmand_ has joined #smartos
[21:45:16] *** thalin_ has joined #smartos
[21:45:17] *** thalin_ has quit IRC
[21:45:17] *** thalin_ has joined #smartos
[21:47:34] *** enmand has quit IRC
[21:47:40] *** bdha has quit IRC
[21:47:44] *** dumfries has quit IRC
[21:47:44] *** thalin has quit IRC
[21:47:46] *** samu has quit IRC
[21:51:30] *** dumfries has joined #smartos
[21:52:58] *** samu has joined #smartos
[21:56:51] *** kfr- has joined #smartos
[21:57:47] *** richlowe` is now known as richlowe
[22:00:02] <richlowe> jeffpc: no reason I can't attach the core you sent to an illumos bug, right?
[22:00:36] <jeffpc> richlowe: go for it
[22:01:40] <richlowe> thanks
[22:04:00] *** enmand_ has quit IRC
[22:04:12] *** wramthun2 has quit IRC
[22:09:10] *** wramthun has joined #smartos
[22:28:14] *** marsell_ has joined #smartos
[22:29:21] *** marsell__ has joined #smartos
[22:30:18] *** marsell has quit IRC
[22:30:18] *** marsell__ is now known as marsell
[22:32:35] *** marsell_ has quit IRC
[22:32:57] *** tonyarkles has joined #smartos
[22:33:05] *** tonyarkles has quit IRC
[22:51:09] *** papertigers has joined #smartos
[22:52:58] *** thalin_ is now known as thalin
[23:06:23] *** fsteinelX has quit IRC
[23:06:59] *** dap has quit IRC
[23:07:22] *** dap has joined #smartos
[23:22:17] *** wdent_ has quit IRC
[23:40:42] *** enmand has joined #smartos