Commit Graph

139 Commits

Author SHA1 Message Date
Blaisorblade
71ae18ec69 [PATCH] sys_get_thread_area does not clear the returned argument
sys_get_thread_area does not memset to 0 its struct user_desc info before
copying it to user space...  since sizeof(struct user_desc) is 16 while the
actual datas which are filled are only 12 bytes + 9 bits (across the
bitfields), there is a (small) information leak.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:08 -07:00
Brian Gerst
7657e20e46 [PATCH] Fix warning in powernow-k8.c
powernow-k8.c: In function `query_current_values_with_pending_wait':
powernow-k8.c:110: warning: `hi' may be used uninitialized in this function

Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:25:54 -07:00
Eric W. Biederman
910de55c66 [PATCH] APM: Remove redundant call to set_cpus_allowed
machine_power_off now always switches to the boot cpu so there
is no reason for APM to also do that.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 14:35:45 -07:00
Eric W. Biederman
4fa2564a6f [PATCH] i386 machine_power_off cleanup
Call machine_shutdown() to move to the boot cpu
and disable apics.  Both acpi_power_off and
apm_power_off want to move to the boot cpu.
and we are already disabling the local apics
so calling machine_shutdown simply reuses
code.

ia64 doesn't have a special path in power_off
for efi so there is no reason i386 should.  If
we really need to call the efi power off path
the efi driver can set pm_power_off like everyone
else.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 14:35:44 -07:00
Eric W. Biederman
d8e392e7c8 [PATCH] machine_shutdown: Typo fix to actually allow specifying which cpu to reboot on
This appears to be a typo I introduced when cleaning
this code up earlier. Ooops.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 14:35:44 -07:00
Eric W. Biederman
4a1421f81b [PATCH] i386: Implement machine_emergency_reboot
set_cpus_allowed is not safe in interrupt context
and disabling apics is complicated code so don't
call machine_shutdown on i386 from emergency_restart().

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 14:35:42 -07:00
Eric W. Biederman
59586e5a26 [PATCH] Don't export machine_restart, machine_halt, or machine_power_off.
machine_restart, machine_halt and machine_power_off are machine
specific hooks deep into the reboot logic, that modules
have no business messing with.  Usually code should be calling
kernel_restart, kernel_halt, kernel_power_off, or
emergency_restart. So don't export machine_restart,
machine_halt, and machine_power_off so we can catch buggy users.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 14:35:42 -07:00
Linus Torvalds
72538d8565 Remove "noreplacement" kernel command line option.
It is no longer valid to not replace instructions, since we depend on
different behaviour depending on CPU capabilities.

If you need to limit the capabilities of the replacements (because the
boot CPU has features that non-boot CPU's do not have, for example), you
need to explicitly disable those capabilities that are not shared across
all CPU's.

For example, if your boot CPU has FXSR, but other CPU's in your system
do not, you need to use the "nofxsr" kernel command line, not disable
instruction replacement per se.
2005-07-22 18:29:40 -04:00
Linus Torvalds
8ed1383fb7 x86: make restore_fpu() use alternative assembler instructions
It's really just a single instruction, conditional on whether the CPU
supports FXSR or not, so implement it as such instead of making it a
function that queries FXSR dynamically.

This means that the instruction just gets automatically rewritten to the
correct one at boot-time.
2005-07-22 16:06:16 -04:00
Linus Torvalds
b339a18b81 Fix up incorrect "unlikely()" on %gs reload in x86 __switch_to
These days %gs is normally the TLS segment, so it's no longer zero.  As
a result, we shouldn't just assume that %fs/%gs tend to be zero
together, but test them independently instead.

Also, fix setting of debug registers to use the "next" pointer instead
of "current".  It so happens that the scheduler will have set the new
current pointer before calling __switch_to(), but that's just an
implementation detail.
2005-07-22 15:23:47 -04:00
Robert Love
0eeca28300 [PATCH] inotify
inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:

        * dnotify requires the opening of one fd per each directory
          that you intend to watch. This quickly results in too many
          open files and pins removable media, preventing unmount.
        * dnotify is directory-based. You only learn about changes to
          directories. Sure, a change to a file in a directory affects
          the directory, but you are then forced to keep a cache of
          stat structures.
        * dnotify's interface to user-space is awful.  Signals?

inotify provides a more usable, simple, powerful solution to file change
notification:

        * inotify's interface is a system call that returns a fd, not SIGIO.
	  You get a single fd, which is select()-able.
        * inotify has an event that says "the filesystem that the item
          you were watching is on was unmounted."
        * inotify can watch directories or files.

Inotify is currently used by Beagle (a desktop search infrastructure),
Gamin (a FAM replacement), and other projects.

See Documentation/filesystems/inotify.txt.

Signed-off-by: Robert Love <rml@novell.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 20:38:38 -07:00
Len Brown
5028770a42 [ACPI] merge acpi-2.6.12 branch into latest Linux 2.6.13-rc...
Signed-off-by: Len Brown <len.brown@intel.com>
2005-07-12 17:21:56 -04:00
Venkatesh Pallipadi
02df8b9385 [ACPI] enable C2 and C3 idle power states on SMP
http://bugzilla.kernel.org/show_bug.cgi?id=4401

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2005-07-12 00:14:36 -04:00
Nickolai Zeldovich
9d9437759e [ACPI] S3 resume -- use lgdtl, not lgdt
From: Nickolai Zeldovich <kolya@MIT.EDU>
Signed-off-by: Len Brown <len.brown@intel.com>
2005-07-12 00:04:31 -04:00
Christoph Lameter
6c036527a6 [PATCH] mostly_read data section
Add a new section called ".data.read_mostly" for data items that are read
frequently and rarely written to like cpumaps etc.

If these maps are placed in the .data section then these frequenly read
items may end up in cachelines with data is is frequently updated.  In that
case all processors in an SMP system must needlessly reload the cachelines
again and again containing elements of those frequently used variables.

The ability to share these cachelines will allow each cpu in an SMP system
to keep local copies of those shared cachelines thereby optimizing
performance.

Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Shobhit Dayal <shobhit@calsoftinc.com>
Signed-off-by: Christoph Lameter <christoph@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:46 -07:00
Shaohua Li
3b520b238e [PATCH] MTRR suspend/resume cleanup
There has been some discuss about solving the SMP MTRR suspend/resume
breakage, but I didn't find a patch for it.  This is an intent for it.  The
basic idea is moving mtrr initializing into cpu_identify for all APs (so it
works for cpu hotplug).  For BP, restore_processor_state is responsible for
restoring MTRR.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:42 -07:00
Rusty Lynch
6772926bef [PATCH] kprobes: fix namespace problem and sparc64 build
The following renames arch_init, a kprobes function for performing any
architecture specific initialization, to arch_init_kprobes in order to
cleanup the namespace.

Also, this patch adds arch_init_kprobes to sparc64 to fix the sparc64 kprobes
build from the last return probe patch.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-05 19:19:00 -07:00
Greg Kroah-Hartman
7586585897 [PATCH] PCI: clean up dynamic pci id logic
The dynamic pci id logic has been bothering me for a while, and now that
I started to look into how to move some of this to the driver core, I
thought it was time to clean it all up.

It ends up making the code smaller, and easier to follow, and fixes a
few bugs at the same time (dynamic ids were not being matched
everywhere, and so could be missed on some call paths for new devices,
semaphore not needed to be grabbed when adding a new id and calling the
driver core, etc.)

I also renamed the function pci_match_device() to pci_match_id() as
that's what it really does.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-07-01 13:35:50 -07:00
Ingo Molnar
306e440daf [PATCH] x86: i8253/i8259A lock cleanup
Introduce proper declarations for i8253_lock and i8259A_lock.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:10 -07:00
Greg KH
8644d2a42b Merge rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2005-06-27 22:07:56 -07:00
Greg Kroah-Hartman
545493917d [PATCH] PCI: add proper MCFG table parsing to ACPI core.
This patch is the first step in properly handling the MCFG PCI table.
It defines the structures properly, and saves off the table so that the
pci mmconfig code can access it.  It moves the parsing of the table a
little later in the boot process, but still before the information is
needed.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-27 21:52:47 -07:00
Kenji Kaneshige
b1bb248a5d [PATCH] ACPI based I/O APIC hot-plug: add interfaces
This patch adds the following new interfaces for I/O xAPIC
hotplug. The implementation of these interfaces depends on each
architecture.

    o int acpi_register_ioapic(acpi_handle handle, u64 phys_addr,
			       u32 gsi_base);

        This new interface is to add a new I/O xAPIC specified by
        phys_addr and gsi_base pair. phys_addr is the physical address
        to which the I/O xAPIC is mapped and gsi_base is global system
        interrupt base of the I/O xAPIC. acpi_register_ioapic returns
        0 on success, or negative value on error.

    o int acpi_unregister_ioapic(acpi_handle handle, u32 gsi_base);

        This new interface is to remove a I/O xAPIC specified by
        gsi_base. acpi_unregister_ioapic returns 0 on success, or
        negative value on error.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-27 21:52:44 -07:00
Rusty Lynch
4bdbd37f6d [PATCH] Return probe redesign: i386 specific changes
The following patch contains the i386 specific changes for the new
return probe design.  Changes include:

 * Removing the architecture specific functions for querying a return probe
   instance off a stack address
 * Complete rework onf arch_prepare_kretprobe() and trampoline_probe_handler()
 * Removing trampoline_post_handler()
 * Adding arch_init() so that now we handle registering the return probe
   trampoline instead of kernel/kprobes.c doing it

Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:23:53 -07:00
Andrea Arcangeli
ffaa8bd6c9 [PATCH] seccomp: tsc disable
I believe at least for seccomp it's worth to turn off the tsc, not just for
HT but for the L2 cache too.  So it's up to you, either you turn it off
completely (which isn't very nice IMHO) or I recommend to apply this below
patch.

This has been tested successfully on x86-64 against current cogito
repository (i686 compiles so I didn't bother testing ;).  People selling
the cpu through cpushare may appreciate this bit for a peace of mind.

There's no way to get any timing info anymore with this applied
(gettimeofday is forbidden of course).  The seccomp environment is
completely deterministic so it can't be allowed to get timing info, it has
to be deterministic so in the future I can enable a computing mode that
does a parallel computing for each task with server side transparent
checkpointing and verification that the output is the same from all the 2/3
seller computers for each task, without the buyer even noticing (for now
the verification is left to the buyer client side and there's no
checkpointing, since that would require more kernel changes to track the
dirty bits but it'll be easy to extend once the basic mode is finished).

Eliminating a cold-cache read of the cr4 global variable will save one
cacheline during the tlb flush while making the code per-cpu-safe at the
same time.  Thanks to Mikael Pettersson for noticing the tlb flush wasn't
per-cpu-safe.

The global tlb flush can run from irq (IPI calling do_flush_tlb_all) but
it'll be transparent to the switch_to code since the IPI won't make any
change to the cr4 contents from the point of view of the interrupted code
and since it's now all per-cpu stuff, it will not race.  So no need to
disable irqs in switch_to slow path.

Signed-off-by: Andrea Arcangeli <andrea@cpushare.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:11:44 -07:00
Jens Axboe
22e2c507c3 [PATCH] Update cfq io scheduler to time sliced design
This updates the CFQ io scheduler to the new time sliced design (cfq
v3).  It provides full process fairness, while giving excellent
aggregate system throughput even for many competing processes.  It
supports io priorities, either inherited from the cpu nice value or set
directly with the ioprio_get/set syscalls.  The latter closely mimic
set/getpriority.

This import is based on my latest from -mm.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 14:33:29 -07:00
Linus Torvalds
6f0dcb72d6 Fix up try_to_freeze() usage in arch/i386/kernel/signal.c
The parentheses were missing. Noted by Pavel Machek.
2005-06-25 20:09:12 -07:00
Linus Torvalds
2031d0f586 Merge Christoph's freeze cleanup patch 2005-06-25 17:16:53 -07:00
Christoph Lameter
3e1d1d28d9 [PATCH] Cleanup patch for process freezing
1. Establish a simple API for process freezing defined in linux/include/sched.h:

   frozen(process)		Check for frozen process
   freezing(process)		Check if a process is being frozen
   freeze(process)		Tell a process to freeze (go to refrigerator)
   thaw_process(process)	Restart process
   frozen_process(process)	Process is frozen now

2. Remove all references to PF_FREEZE and PF_FROZEN from all
   kernel sources except sched.h

3. Fix numerous locations where try_to_freeze is manually done by a driver

4. Remove the argument that is no longer necessary from two function calls.

5. Some whitespace cleanup

6. Clear potential race in refrigerator (provides an open window of PF_FREEZE
   cleared before setting PF_FROZEN, recalc_sigpending does not check
   PF_FROZEN).

This patch does not address the problem of freeze_processes() violating the rule
that a task may only modify its own flags by setting PF_FREEZE. This is not clean
in an SMP environment. freeze(process) is therefore not SMP safe!

Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 17:10:13 -07:00
Adrian Bunk
9e84d1c36a [PATCH] i386: cleanup boot_cpu_logical_apicid variables
There are currently two different boot_cpu_logical_apicid variables:
- a global one in mpparse.c
- a static one in smpboot.c

Of these two, only the one in smpboot.c might be used (through
boot_cpu_apicid).

This patch therefore removes the one in mpparse.c .

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:25:05 -07:00
Jesper Juhl
4ae6673e02 [PATCH] get rid of redundant NULL checks before kfree() in arch/i386/
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:25:00 -07:00
Domen Puncer
77617bd806 [PATCH] arch/i386/kernel/apm.c: fix sparse warnings
Signed-off-by: Alexey Dobriyan <adobriyan@mail.ru>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:58 -07:00
Domen Puncer
3f3ae3471f [PATCH] arch/i386/kernel/traps.c: fix sparse warnings
Signed-off-by: Alexey Dobriyan <adobriyan@mail.ru>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:58 -07:00
Maneesh Soni
72414d3f1d [PATCH] kexec code cleanup
o Following patch provides purely cosmetic changes and corrects CodingStyle
  guide lines related certain issues like below in kexec related files

  o braces for one line "if" statements, "for" loops,
  o more than 80 column wide lines,
  o No space after "while", "for" and "switch" key words

o Changes:
  o take-2: Removed the extra tab before "case" key words.
  o take-3: Put operator at the end of line and space before "*/"

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:55 -07:00
Alexander Nyberg
4f339ecb30 [PATCH] kdump: Save trap information for later analysis
If we are faulting in kernel it is quite possible this will lead to a
panic.  Save trap number, cr2 (in case of page fault) and error_code in the
current thread (these fields already exist for signal delivery but are not
used here).

This helps later kdump crash analyzing from user-space (a script has been
submitted to dig this info out in gdb).

Signed-off-by: Alexander Nyberg <alexn@telia.com>
Cc: <fastboot@lists.osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:55 -07:00
Alexander Nyberg
6e274d1443 [PATCH] kdump: Use real pt_regs from exception
Makes kexec_crashdump() take a pt_regs * as an argument.  This allows to
get exact register state at the point of the crash.  If we come from direct
panic assertion NULL will be passed and the current registers saved before
crashdump.

This hooks into two places:
die(): check the conditions under which we will panic when calling
do_exit and go there directly with the pt_regs that caused the fatal
fault.

die_nmi(): If we receive an NMI lockup while in the kernel use the
pt_regs and go directly to crash_kexec(). We're probably nested up badly
at this point so this might be the only chance to escape with proper
information.

Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:54 -07:00
Vivek Goyal
2030eae52b [PATCH] Retrieve elfcorehdr address from command line
This patch adds support for retrieving the address of elf core header if one
is passed in command line.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:53 -07:00
Vivek Goyal
92aa63a5a1 [PATCH] kdump: Retrieve saved max pfn
This patch retrieves the max_pfn being used by previous kernel and stores it
in a safe location (saved_max_pfn) before it is overwritten due to user
defined memory map.  This pfn is used to make sure that user does not try to
read the physical memory beyond saved_max_pfn.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:52 -07:00
Vivek Goyal
a3ea8ac846 [PATCH] Kexec: Kexec on panic fix with nmi watchdog enabled
o Problem: Kexec on panic hangs if first kernel is booted with nmi_watchdog
  command line parameter. This problem occurs because kexec crash shutdown
  code replaces the NMI callback handler. This handler saves the cpu register
  states and halts the cpu. If system is booted with nmi_watchdog parameter,
  then crashing cpu also runs this nmi handler and halts itself.

o This patch fixes the problem by keeping a track of crashing cpu and not
  executing the new nmi handler on crashing cpu.

o There is a dependence on smp_processor_id() function which might return
  insane value for cpu, if cpu field of thread_info is corrupted.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:52 -07:00
Vivek Goyal
4d55476c3f [PATCH] kdump: NMI handler segment selector, stack pointer fix
CPU does not save ss and esp on stack if execution was already in kernel mode
at the time of NMI occurrence.  This leads to saving of erractic values for ss
and esp.  This patch fixes the issue.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:52 -07:00
Vivek Goyal
625f1c8219 [PATCH] Kdump: Export crash notes section address through sysfs
o Following patch exports kexec global variable "crash_notes" to user space
  through sysfs as kernel attribute in /sys/kernel.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:51 -07:00
Eric W. Biederman
1bc3b91aee [PATCH] crashdump: x86 crashkernel option
This is the x86 implementation of the crashkernel option.  It reserves a
window of memory very early in the bootup process, so we never use it for
anything but the kernel to switch to when the running kernel panics.

In addition to reserving this memory a resource structure is registered so
looking at /proc/iomem it is clear what happened to that memory.

ISSUES:
Is it possible to implement this in a architecture generic way?
What should be done with architectures that always use an iommu and
thus don't report their RAM memory resources in /proc/iomem?

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:50 -07:00
Eric W. Biederman
63d30298ef [PATCH] kexec: x86 shutdown APICs during crash_shutdown
In the case of a crash/panic an architecture specific function
machine_crash_shutdown is called.  This patch adds to the x86 machine_crash
function the standard kernel code for shutting down apics.

Every line of code added to that function increases the risk that we will call
code after a kernel panic that is not safe.

This patch should not make it to the stable kernel without a being reviewed a
lot more.  It is unclear how much a hardned kernel can take when it comes to
misconfigured apics.  So since a normal kernel has problems this patch does a
clean shutdown.

It is my expectation this patch will be dropped from future generations of the
kexec work.  But for the moment it is a crutch to keep from breaking
everything.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:50 -07:00
Eric W. Biederman
2c818b45a2 [PATCH] kexec: x86: snapshot registers during crash shutdown
After the kernel panics if we wish to generate an entire machine core file it
is very nice to know the register state at the time the machine crashed.

After long discussion it was realized that if you are going to be saving the
information anyway it is reasonable to store the information in a format that
it will be used and recognized in so the register state is stored in the
standard ELF note format.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:49 -07:00
Eric W. Biederman
c4ac4263a0 [PATCH] crashdump: x86: add NMI handler to capture other CPUs
One of the dangers when switching from one kernel to another is what happens
to all of the other cpus that were running in the crashed kernel.  In an
attempt to avoid that problem this patch adds a nmi handler and attempts to
shoot down the other cpus by sending them non maskable interrupts.

The code then waits for 1 second or until all known cpus have stopped running
and then jumps from the running kernel that has crashed to the kernel in
reserved memory.

The kernel spin loop is used for the delay as that should behave continue to
be safe even in after a crash.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:49 -07:00
Eric W. Biederman
5033cba087 [PATCH] kexec: x86 kexec core
This is the i386 implementation of kexec.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:49 -07:00
Eric W. Biederman
dd2a13054f [PATCH] kexec: x86: factor out apic shutdown code
Factor out the apic and smp shutdown code from machine_restart so it can be
called by in the kexec reboot path as well.

By switching to the bootstrap cpu by default on reboot I can delete/simplify
some motherboard fixups well.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:49 -07:00
Vivek Goyal
8a9190853c [PATCH] kexec: reserve Bootmem fix for booting nondefault location kernel
This patch fixes a problem with reserving memory during boot up of a kernel
built for non-default location.  Currently boot memory allocator reserves
the memory required by kernel image, boot allocaotor bitmap etc.  It
assumes that kernel is loaded at 1MB (HIGH_MEMORY hard coded to 1024*1024).
 But kernel can be built for non-default locatoin, hence existing
hardcoding will lead to reserving unnecessary memory.  This patch fixes it.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:48 -07:00
Eric W. Biederman
3d345e3fc9 [PATCH] kexec: x86: add CONFIG_PYSICAL_START
For one kernel to report a crash another kernel has created we need
to have 2 kernels loaded simultaneously in memory.  To accomplish this
the two kernels need to built to run at different physical addresses.

This patch adds the CONFIG_PHYSICAL_START option to the x86 kernel
so we can do just that.  You need to know what you are doing and
the ramifications are before changing this value, and most users
won't care so I have made it depend on CONFIG_EMBEDDED

bzImage kernels will work and run at a different address when compiled
with this option but they will still load at 1MB.  If you need a kernel
loaded at a different address as well you need to boot a vmlinux.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:48 -07:00
Eric W. Biederman
ad0d75ebac [PATCH] kexec: x86: vmlinux: fix physical addresses
The vmlinux on i386 does not report the correct physical address of
the kernel.  Instead in the physical address field it currently
reports the virtual address of the kernel.

This is patch is a bug fix that corrects vmlinux to report the
proper physical addresses.

This is potentially a help for crash dump analysis tools.

This definitiely allows bootloaders that load vmlinux as a standard
ELF executable.  Bootloaders directly loading vmlinux become of
practical importance when we consider the kexec on panic case.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:47 -07:00
Eric W. Biederman
650927ef8a [PATCH] kexec: x86: resture apic virtual wire mode on shutdown
When coming out of apic mode attempt to set the appropriate
apic back into virtual wire mode.  This improves on previous versions
of this patch by by never setting bot the local apic and the ioapic
into veritual wire mode.

This code looks at data from the mptable to see if an ioapic has
an ExtInt input to make this decision.  A future improvement
is to figure out which apic or ioapic was in virtual wire mode
at boot time and to remember it.  That is potentially a more accurate
method, of selecting which apic to place in virutal wire mode.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:47 -07:00