Update: 08-25-2005
------------------
This problem has been solved with the help of the ipslinux mailing list. There is now a patch for the ips driver that resets the device during init if there are any pending interrupts. The patch is being pushed upstream.

Summary:
-------
- The system crashed while running the FSRacer test.
- The dump capture kernel failed to boot, so the kdump could not be collected.
- Opened a new bug at bugme.osdl.org: http://bugme.osdl.org/show_bug.cgi?id=4573

Details and Dump analysis:
--------------------------
The following is the console/boot log of the dump capture kernel, showing the crash stack trace:

Linux version 2.6.12-rc2-mm3-II (root@x235b) (gcc version 3.3.35
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000100 - 000000000009c000 (usable)
BIOS-e820: 000000000009c000 - 00000000000a0000 (reserved)
BIOS-e820: 0000000000100000 - 000000005ffd8740 (usable)
BIOS-e820: 000000005ffd8740 - 000000005ffe0000 (ACPI data)
BIOS-e820: 000000005ffe0000 - 0000000060000000 (reserved)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
user-defined physical RAM map:
user: 0000000000000000 - 00000000000a0000 (usable)
user: 0000000001000000 - 00000000014fd000 (usable)
user: 000000000159d400 - 0000000004000000 (usable)
0MB HIGHMEM available.
64MB LOWMEM available.
DMI 2.3 present.
Allocating PCI resources starting at 04000000 (gap: 04000000:fc000000)
Built 1 zonelists
Initializing CPU#0
Kernel command line: root=/dev/sda1 init 1 vga=0x31a selinux=0 splash=silent resume=/dev/sda2 elevator=cfq showopts console=tty0 console=ttyS0,38400n1 memmap=exactmap memmap=640K@0K memmap=5108K@163K
PID hash table entries: 512 (order: 9, 8192 bytes)
Detected 2794.079 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x25
Unknown interrupt or fault at EIP 00000246 00000060 c1495530
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
Memory: 43340k/65536k available (3563k kernel code, 5720k reserved, 1061k data, 176k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Mount-cache hash table entries: 512
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
CPU: Intel(R) Xeon(TM) CPU 2.80GHz stepping 07
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
ACPI: setting ELCR to 0020 (from 0e28)
softlockup thread 0 started up.
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfd7bc, last bus=10
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20050309
ACPI: Interpreter enabled
ACPI: Using PIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
ACPI: Assume root bridge [\_SB_.PCI0] segment is 0
ACPI: Assume root bridge [\_SB_.PCI1] segment is 0
ACPI: Assume root bridge [\_SB_.PCI2] segment is 0
ACPI: Assume root bridge [\_SB_.PCI3] segment is 0
ACPI: Assume root bridge [\_SB_.PCI4] segment is 0
PCI: Ignoring BAR0-3 of IDE controller 0000:00:0f.1
ACPI: Can't get handler for 0000:00:00.0
ACPI: Can't get handler for 0000:00:00.1
ACPI: Can't get handler for 0000:00:00.2
ACPI: Can't get handler for 0000:00:09.0
ACPI: Can't get handler for 0000:00:0f.1
ACPI: Can't get handler for 0000:00:0f.3
ACPI: PCI Root Bridge [PCI1] (0000:02)
PCI: Probing PCI hardware (bus 02)
ACPI: Assume root bridge [\_SB_.PCI0] segment is 0
ACPI: Assume root bridge [\_SB_.PCI1] segment is 0
ACPI: Assume root bridge [\_SB_.PCI2] segment is 0
ACPI: Assume root bridge [\_SB_.PCI3] segment is 0
ACPI: Assume root bridge [\_SB_.PCI4] segment is 0
ACPI: Can't get handler for 0000:02:08.0
ACPI: PCI Root Bridge [PCI2] (0000:05)
PCI: Probing PCI hardware (bus 05)
ACPI: Assume root bridge [\_SB_.PCI0] segment is 0
ACPI: Assume root bridge [\_SB_.PCI1] segment is 0
ACPI: Assume root bridge [\_SB_.PCI2] segment is 0
ACPI: Assume root bridge [\_SB_.PCI3] segment is 0
ACPI: Assume root bridge [\_SB_.PCI4] segment is 0
ACPI: Can't get handler for 0000:05:03.0
ACPI: PCI Root Bridge [PCI3] (0000:07)
PCI: Probing PCI hardware (bus 07)
ACPI: Assume root bridge [\_SB_.PCI0] segment is 0
ACPI: Assume root bridge [\_SB_.PCI1] segment is 0
ACPI: Assume root bridge [\_SB_.PCI2] segment is 0
ACPI: Assume root bridge [\_SB_.PCI3] segment is 0
ACPI: Assume root bridge [\_SB_.PCI4] segment is 0
ACPI: PCI Root Bridge [PCI4] (0000:09)
PCI: Probing PCI hardware (bus 09)
ACPI: Assume root bridge [\_SB_.PCI0] segment is 0
ACPI: Assume root bridge [\_SB_.PCI1] segment idisabled.
ACPI: PCI Interrupt Link [LP1F] (IRQs) *0, disabled.
ACPI: PCI Interrupt Link [LPUS] (IRQs *11)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
ACPI: No ACPI bus support for 00:00
ACPI: No ACPI bus support for 00:01
ACPI: No ACPI bus support for 00:02
ACPI: No ACPI bus support for 00:03
ACPI: No ACPI bus support for 00:04
ACPI: No ACPI bus support for 00:05
ACPI: No ACPI bus support for 00:06
ACPI: No ACPI bus support for 00:07
ACPI: No ACPI bus support for 00:08
ACPI: No ACPI bus support for 00:09
ACPI: No ACPI bus support for 00:0a
ACPI: No ACPI bus support for 00:0b
ACPI: No ACPI bus support for 00:0c
ACPI: No ACPI bus support for 00:0d
pnp: PnP ACPI: found 14 devices
SCSI subsystem initialized
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
fscache: general fs caching registered
pnp: 00:00: ioport range 0x900-0x93f has been reserved
pnp: 00:00: ioport range 0x510-0x517 could not be reserved
pnp: 00:00: ioport range 0x504-0x507 could not be reserved
pnp: 00:00: ioport range 0x500-0x503 could not be reserved
pnp: 00:00: ioport range 0x520-0x53f has been reserved
pnp: 00:00: ioport range 0x420-0x427 has been reserved
pnp: 00:00: ioport range 0x460-0x461 has been reserved
pnp: 00:0b: ioport range 0x1ec-0x1ef has been reserved
pnp: 00:0b: ioport range 0x400-0x4fe could not be reserved
pnp: 00:0b: ioport range 0x600-0x600 has been reserved
pnp: 00:0b: ioport range 0x800-0x80f has been reserved
pnp: 00:0b: ioport range 0xc00-0xcfe could not be reserved
pnp: 00:0b: ioport range 0xf50-0xf58 has been reserved
Machine check exception polling timer started.
audit: initializing netlink socket (disabled)
audit(1114797309.615:0): initialized
inotify device minor=63
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
JFS: nTxBlock = 339, nTxLock = 2714
SGI XFS with large block numbers, no debug enabled
ACPI: Power Button (FF) [PWRF]
ACPI: CPU0 (power states: C1[C1])
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
lp: driver loaded but no devices found
Linux agpgart interface v0.101 (c) Dave Jones
[drm] Initialized drm 1.0.0 20040925
PNP: PS/2 controller has invalid data port 0x64; using default 0x60
PNP: PS/2 controller has invalid command port 0x60; using default 0x64
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
ACPI: No ACPI bus support for i8042
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
ACPI: No ACPI bus support for serial8250
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
parport: PnPBIOS parport detected.
parport0: PC-style at 0x378, irq 7 [PCSPP(,...)]
lp0: using parport0 (interrupt-driven).
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
ACPI: Floppy Controller [FDC0] at I/O 0x3f0-0x3f5 irq 6 dma channel 2
ACPI: [FDC0] doesn't declare FD_DCR; also claiming 0x3f7
Floppy drive(s): fd0 is 1.44M
ACPI: No ACPI bus support for serio0
ACPI: No ACPI bus support for serio1
FDC 0 is a National Semiconductor PC87306
ACPI: No ACPI bus support for floppy.0
tg3.c:v3.25 (March 24, 2005)
ACPI: PCI Interrupt Link [LP0D] enabled at IRQ 3
PCI: setting IRQ 3 as level-triggered
ACPI: PCI Interrupt 0000:02:08.0[A] -> Link [LP0D] -> GSI 3 (level, low) -> IRQ 3
eth0: Tigon3 [partno(BCM95703A30) rev 1002 PHY(5703)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:09:6b:a5:10:03
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
SvrWks CSB5: IDE controller at PCI slot 0000:00:0f.1
SvrWks CSB5: chipset revision 147
SvrWks CSB5: not 100% native mode: will probe irqs later
SvrWks CSB5: simplex device: DMA forced
ide0: BM-DMA at 0x0700-0x0707, BIOS settings: hda:DMA, hdb:DMA
SvrWks CSB5: simplex device: DMA forced
ide1: BM-DMA at 0x0708-0x070f, BIOS settings: hdc:DMA, hdd:DMA
hda: LTN486S, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ACPI: No ACPI bus support for 0.0
ide2: I/O resource 0x1E8-0x1EF not free.
ide2: ports already in use, skipping probe
hda: ATAPI 48X CD-ROM drive, 120kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
Loading Adaptec I2O RAID: Version 2.4 Build 5go
Detecting Adaptec I2O RAID controllers...
ACPI: PCI Interrupt Link [LP06] enabled at IRQ 9
PCI: setting IRQ 9 as level-triggered
ACPI: PCI Interrupt 0000:05:03.0[A] -> Link [LP06] -> GSI 9 (level, low) -> IRQ 9
Unable to handle kernel paging request at virtual address 00002d90
printing eip:
c12882f1
*pde = 00000000
Oops: 0002 [#1]
PREEMPT DEBUG_PAGEALLOC
Modules linked in:
CPU: 0
EIP: 0060:[] Not tainted VLI
EFLAGS: 00010206 (2.6.12-rc2-mm3-II)
EIP is at ips_chkstatus+0x31/0x1a0
eax: 00000010 ebx: 00002d00 ecx: c3c21720 edx: c1659d74
esi: c3c23200 edi: c3c23230 ebp: c1659d60 esp: c1659d34
ds: 007b es: 007b ss: 0068
Process swapper (pid: 1, threadinfo=c1658000 task=c1651a10)
Stack: c3c33000 c1616660 00000001 00000000 c1659d10 c1012f22 c1616660 00000163
       c3c23200 c3c23230 c1659e08 c1659d80 c1284c26 c3c23200 c1659d74 00000002
       00103c00 c3c23200 00000000 c1659d98 c1284aef c3c23200 00000002 c3c21720
Call Trace:
[] show_stack+0xab/0xc0
[] show_registers+0x156/0x1d0
[] die+0xf6/0x190
[] do_page_fault+0x494/0x6ab
[] error_code+0x4f/0x54
[] ips_intr_morpheus+0x76/0xe0
[] do_ipsintr+0x9f/0xc0
[] handle_IRQ_event+0x3e/0x90
[] __do_IRQ+0xd5/0x170
[] do_IRQ+0x26/0x40
[] common_interrupt+0x1a/0x20
[] request_irq+0x84/0xb0
[] ips_init_phase2+0x4a/0x1b0
[] ips_insert_device+0x9b/0xa0
[] pci_device_probe_static+0x4d/0x70
[] __pci_device_probe+0x3c/0x50
[] pci_device_probe+0x2f/0x60
[] driver_probe_device+0x36/0xc0
[] __driver_attach+0x43/0x50
[] bus_for_each_dev+0x54/0x80
[] driver_attach+0x27/0x30
[] bus_add_driver+0x80/0xc0
[] pci_register_driver+0x58/0x80
[] ips_module_init+0x12/0x60
[] do_initcalls+0x26/0xc0
[] init+0x2d/0x120
[] kernel_thread_helper+0x5/0x10
Code: 89 5d f4 8b 55 0c 89 75 f8 8b 75 08 89 7d fc 0f b6 42 01 8b 7e 40 0f b6 c0 8d 1c 40 c1 e3 06 01 fb 0f b6 42 02 8d 7e 30 88 45 e4 <88> 83 90 00 00 00
<0>Kernel panic - not syncing: Fatal exception in interrupt
<3>BUG: soft lockup detected on CPU#0!
Pid: 1, comm: swapper
EIP: 0060:[] CPU: 0
EIP is at delay_tsc+0x14/0x20
EFLAGS: 00000297 Not tainted (2.6.12-rc2-mm3-II)
EAX: 4fc01f3c EBX: 002a0015 ECX: 4fa608ac EDX: 0000180c
ESI: ffffffff EDI: c138c473 EBP: c1659bb8 DS: 007b ES: 007b
CR0: 8005003b CR2: 00002d90 CR3: 014b4000 CR4: 000006c0
[] show_regs+0x144/0x170
[] softlockup_tick+0x55/0x80
[] timer_interrupt+0x2a/0x90
[] handle_IRQ_event+0x3e/0x90
[] __do_IRQ+0xd5/0x170
[] do_IRQ+0x26/0x40
[] common_interrupt+0x1a/0x20
[] __delay+0x14/0x20
[] panic+0xd8/0x110
[] die+0x15d/0x190
[] do_page_fault+0x494/0x6ab
[] error_code+0x4f/0x54
[] ips_intr_morpheus+0x76/0xe0
[] do_ipsintr+0x9f/0xc0
[] handle_IRQ_event+0x3e/0x90
[] __do_IRQ+0xd5/0x170
[] do_IRQ+0x26/0x40
[] common_interrupt+0x1a/0x20
[] request_irq+0x84/0xb0
[] ips_init_phase2+0x4a/0x1b0
[] ips_insert_device+0x9b/0xa0
[] pci_device_probe_static+0x4d/0x70
[] __pci_device_probe+0x3c/0x50
[] pci_device_probe+0x2f/0x60
[] driver_probe_device+0x36/0xc0
[] __driver_attach+0x43/0x50
[] bus_for_each_dev+0x54/0x80
[] driver_attach+0x27/0x30
[] bus_add_driver+0x80/0xc0
[] pci_register_driver+0x58/0x80
[] ips_module_init+0x12/0x60
[] do_initcalls+0x26/0xc0
[] init+0x2d/0x120
[] kernel_thread_helper+0x5/0x10
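
As a cross-check on the Oops in ips_chkstatus, the faulting address can be reproduced from the register dump and the Code: bytes. The sketch below is in Python, is not part of the original report, and only replays the arithmetic. The instruction mnemonics in its comments are a hand disassembly of the Code: line, and the function and variable names are illustrative. Reading the dword 00103c00 (the stack word at the address held in edx, assuming the stack dump starts at esp) as an adapter status word is an assumption. The result suggests that the pointer loaded from offset 0x40 of the structure in esi was still zero when the interrupt arrived, which would fit an interrupt being delivered before driver initialization had finished; the log alone does not name that field.

```python
# Values copied from the Oops output above.
EBX = 0x00002D00          # ebx in the register dump
FAULT_ADDR = 0x00002D90   # "virtual address 00002d90" (also CR2 in the second trace)
STATUS_WORD = 0x00103C00  # stack dword at the address in edx; "status word" is an assumed reading
OOPS_ERROR_CODE = 0x0002  # "Oops: 0002"


def decode_page_fault(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 0x1),  # 0 means the page was not present
        "write": bool(code & 0x2),         # 1 means the access was a write
        "user_mode": bool(code & 0x4),     # 0 means the fault happened in kernel mode
    }


def replay_store(status_word, ebx):
    """Replay the address arithmetic leading up to the faulting store."""
    index = (status_word >> 8) & 0xFF    # movzbl 0x1(%edx),%eax
    value = (status_word >> 16) & 0xFF   # movzbl 0x2(%edx),%eax
    scaled = (index * 3) << 6            # lea (%eax,%eax,2),%ebx ; shl $0x6,%ebx
    base = ebx - scaled                  # add %edi,%ebx, so edi held ebx - scaled
    target = ebx + 0x90                  # mov %al,0x90(%ebx) is the faulting store
    return {"index": index, "value": value, "scaled": scaled,
            "base": base, "target": target}


if __name__ == "__main__":
    print(decode_page_fault(OOPS_ERROR_CODE))
    for name, val in replay_store(STATUS_WORD, EBX).items():
        print(f"{name}: {val:#x}")
```

Run as a script, this reports index 0x3c, value 0x10 (which matches eax in the register dump), base 0x0 and target 0x2d90 (which matches the faulting address), and it decodes error code 0002 as a kernel-mode write to a not-present page.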