fc908ed33e
19518 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Paul E. McKenney
|
fc908ed33e |
rcu: Make RCU_CPU_STALL_INFO include number of fqs attempts
One way that an RCU CPU stall warning can happen is if the grace-period kthread is not allowed to execute. One proxy for this kthread's forward progress is the number of force-quiescent-state (fqs) scans. This commit therefore adds the number of fqs scans to the RCU CPU stall warning printouts when CONFIG_RCU_CPU_STALL_INFO=y. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> |
||
Paul E. McKenney
|
734d168013 |
rcu: Make rcu_nmi_enter() handle nesting
The x86 architecture has multiple types of NMI-like interrupts: real NMIs, machine checks, and, for some values of NMI-like, debugging and breakpoint interrupts. These interrupts can nest inside each other. Andy Lutomirski is adding RCU support to these interrupts, so rcu_nmi_enter() and rcu_nmi_exit() must now correctly handle nesting. This commit therefore introduces nesting, using a clever NMI-coordination algorithm suggested by Andy. The trick is to atomically increment ->dynticks (if needed) before manipulating ->dynticks_nmi_nesting on entry (and, accordingly, after on exit). In addition, ->dynticks_nmi_nesting is incremented by one if ->dynticks was incremented and by two otherwise. This means that when rcu_nmi_exit() sees ->dynticks_nmi_nesting equal to one, it knows that ->dynticks must be atomically incremented. This NMI-coordination algorithms has been validated by the following Promela model: ------------------------------------------------------------------------ /* * Promela model for Andy Lutomirski's suggested change to rcu_nmi_enter() * that allows nesting. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, you can access it online at * http://www.gnu.org/licenses/gpl-2.0.html. * * Copyright IBM Corporation, 2014 * * Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com> */ byte dynticks_nmi_nesting = 0; byte dynticks = 0; /* * Promela verision of rcu_nmi_enter(). */ inline rcu_nmi_enter() { byte incby; byte tmp; incby = BUSY_INCBY; assert(dynticks_nmi_nesting >= 0); if :: (dynticks & 1) == 0 -> atomic { dynticks = dynticks + 1; } assert((dynticks & 1) == 1); incby = 1; :: else -> skip; fi; tmp = dynticks_nmi_nesting; tmp = tmp + incby; dynticks_nmi_nesting = tmp; assert(dynticks_nmi_nesting >= 1); } /* * Promela verision of rcu_nmi_exit(). */ inline rcu_nmi_exit() { byte tmp; assert(dynticks_nmi_nesting > 0); assert((dynticks & 1) != 0); if :: dynticks_nmi_nesting != 1 -> tmp = dynticks_nmi_nesting; tmp = tmp - BUSY_INCBY; dynticks_nmi_nesting = tmp; :: else -> dynticks_nmi_nesting = 0; atomic { dynticks = dynticks + 1; } assert((dynticks & 1) == 0); fi; } /* * Base-level NMI runs non-atomically. Crudely emulates process-level * dynticks-idle entry/exit. */ proctype base_NMI() { byte busy; busy = 0; do :: /* Emulate base-level dynticks and not. */ if :: 1 -> atomic { dynticks = dynticks + 1; } busy = 1; :: 1 -> skip; fi; /* Verify that we only sometimes have base-level dynticks. */ if :: busy == 0 -> skip; :: busy == 1 -> skip; fi; /* Model RCU's NMI entry and exit actions. */ rcu_nmi_enter(); assert((dynticks & 1) == 1); rcu_nmi_exit(); /* Emulated re-entering base-level dynticks and not. */ if :: !busy -> skip; :: busy -> atomic { dynticks = dynticks + 1; } busy = 0; fi; /* We had better now be in dyntick-idle mode. */ assert((dynticks & 1) == 0); od; } /* * Nested NMI runs atomically to emulate interrupting base_level(). */ proctype nested_NMI() { do :: /* * Use an atomic section to model a nested NMI. This is * guaranteed to interleave into base_NMI() between a pair * of base_NMI() statements, just as a nested NMI would. */ atomic { /* Verify that we only sometimes are in dynticks. */ if :: (dynticks & 1) == 0 -> skip; :: (dynticks & 1) == 1 -> skip; fi; /* Model RCU's NMI entry and exit actions. */ rcu_nmi_enter(); assert((dynticks & 1) == 1); rcu_nmi_exit(); } od; } init { run base_NMI(); run nested_NMI(); } ------------------------------------------------------------------------ The following script can be used to run this model if placed in rcu_nmi.spin: ------------------------------------------------------------------------ if ! spin -a rcu_nmi.spin then echo Spin errors!!! exit 1 fi if ! cc -DSAFETY -o pan pan.c then echo Compilation errors!!! exit 1 fi ./pan -m100000 Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com> |
||
Linus Torvalds
|
5d6a546886 |
CONFIG_PM_RUNTIME elimination for 3.19-rc1
This removes the last few uses of CONFIG_PM_RUNTIME introduced recently and makes that config option finally go away. CONFIG_PM will be available directly from the menu now and also it will be selected automatically if CONFIG_SUSPEND or CONFIG_HIBERNATION is set. / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJUlYaPAAoJEILEb/54YlRx/SoP/2wYioGzBhOCYfHw6fZF8zrP rotQ86sakhvSHre8K9QyFjvsA9wJ0CaTJF46YKZuHFhqU+IJZ7aXvNdEM1hK214J Mf3L2AcbcdnXioAN+HpeZhQklp2qHe84YkVXBqsFD6kb/qUNV2LSjy6nKEUdY3jW 6KL2f3RgF/LDjTdedujJgcCYwMBwfX4B7U42BG4NQQ8z3wCV+imJgzNDrR5nNlqK xu8ab8hO1Gi3msOJxS0y4MN6VTUpYOvQKhSyM9ErcB2ibclAdmcivKuFAz6gy5U7 PyDfYo/P3mXjMRBFb9fLqGtRcfstsnxPPSeKwp236tIQFX19Bj76UVUMJoUlXJP5 /f55/P7mCascg74ZZC4GiD/BSCRdqwInCsFMzqAfSq2NciKzeS6W7Mhd9VTLKDpl 5kqE39imUjZyps7/QqkfWskzB7Puhmqk3ZgTq2yAd4uQTpV7xlJYcnvr4oHCmAia SsLdYOqMQzWr3qyz2f5cOqPAvOo3/Xk/HHfTOCHW/4L+Ov+C921/f3d5GnxX9Ha+ ucRaMp9j5FPYVwFaFkczAMNF2Eanq+Fupa3e6XUNNbYdchFqT9obnHZbVKyvswjR vdGAYAjP/cLzIH9ETDCCXCRvBRw5pzeelDgvDPjPdmPjndHXG8WViyTIEyLL4+1i BENtc/SUw3pZ7iNlGO78 =QnSO -----END PGP SIGNATURE----- Merge tag 'pm-config-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull CONFIG_PM_RUNTIME elimination from Rafael Wysocki: "This removes the last few uses of CONFIG_PM_RUNTIME introduced recently and makes that config option finally go away. CONFIG_PM will be available directly from the menu now and also it will be selected automatically if CONFIG_SUSPEND or CONFIG_HIBERNATION is set" * tag 'pm-config-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: Eliminate CONFIG_PM_RUNTIME tty: 8250_omap: Replace CONFIG_PM_RUNTIME with CONFIG_PM sound: sst-haswell-pcm: Replace CONFIG_PM_RUNTIME with CONFIG_PM spi: Replace CONFIG_PM_RUNTIME with CONFIG_PM |
||
Rafael J. Wysocki
|
464ed18ebd |
PM: Eliminate CONFIG_PM_RUNTIME
Having switched over all of the users of CONFIG_PM_RUNTIME to use CONFIG_PM directly, turn the latter into a user-selectable option and drop the former entirely from the tree. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Kevin Hilman <khilman@linaro.org> |
||
Linus Torvalds
|
4bb9374e0b |
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull NOHZ update from Thomas Gleixner: "Remove the call into the nohz idle code from the fake 'idle' thread in the powerclamp driver along with the export of those functions which was smuggeled in via the thermal tree. People have tried to hack around it in the nohz core code, but it just violates all rightful assumptions of that code about the only valid calling context (i.e. the proper idle task). The powerclamp trainwreck will still work, it just wont get the benefit of long idle sleeps" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick/powerclamp: Remove tick_nohz_idle abuse |
||
Linus Torvalds
|
ac88ee3b6c |
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq core fix from Thomas Gleixner: "A single fix plugging a long standing race between proc/stat and proc/interrupts access and freeing of interrupt descriptors" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Prevent proc race against freeing of irq descriptors |
||
Linus Torvalds
|
88a57667f2 |
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes and cleanups from Ingo Molnar: "A kernel fix plus mostly tooling fixes, but also some tooling restructuring and cleanups" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits) perf: Fix building warning on ARM 32 perf symbols: Fix use after free in filename__read_build_id perf evlist: Use roundup_pow_of_two tools: Adopt roundup_pow_of_two perf tools: Make the mmap length autotuning more robust tools: Adopt rounddown_pow_of_two and deps tools: Adopt fls_long and deps tools: Move bitops.h from tools/perf/util to tools/ tools: Introduce asm-generic/bitops.h tools lib: Move asm-generic/bitops/find.h code to tools/include and tools/lib tools: Whitespace prep patches for moving bitops.h tools: Move code originally from asm-generic/atomic.h into tools/include/asm-generic/ tools: Move code originally from linux/log2.h to tools/include/linux/ tools: Move __ffs implementation to tools/include/asm-generic/bitops/__ffs.h perf evlist: Do not use hard coded value for a mmap_pages default perf trace: Let the perf_evlist__mmap autosize the number of pages to use perf evlist: Improve the strerror_mmap method perf evlist: Clarify sterror_mmap variable names perf evlist: Fixup brown paper bag on "hint" for --mmap-pages cmdline arg perf trace: Provide a better explanation when mmap fails ... |
||
Thomas Gleixner
|
a5fd9733a3 |
tick/powerclamp: Remove tick_nohz_idle abuse
commit |
||
Linus Torvalds
|
d790be3863 |
The exciting thing here is the getting rid of stop_machine on module
removal. This is possible by using a simple atomic_t for the counter, rather than our fancy per-cpu counter: it turns out that no one is doing a module increment per net packet, so the slowdown should be in the noise. Also, script fixed for new git version. Cheers, Rusty. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUk3cQAAoJENkgDmzRrbjxr44P/25ZBYmKZZ3XM3flt2o0LCti 1Px+MRbWuXhueWQOYZSXOO3c2ENNuV3siaU4jQZqnxslpdvT4rVsVFkYuwva2vHT hqpoq1Hz++yjFJArjERFOdoZ1gxkBbZQQGYm8esToAqU3b2Z74SrU48dPwp65q/1 r6hbXdWSiKALEBZeW2coi+QVCL/oxE8hmNqDO1mpe82aEKu0xIVpTdU5vAfBIj8/ Z95U2bx+CjiP7khhSjBGtltLqxL6QXw1m2eg1gO9nf1gJNI0/dAY6IJmFbGz+7Bt CAyc9BRsB40Em8G7d7wr4FsURcLfmYNdjtx79j+Rot5PkVIi+Ztv7C1QYlMQESPa ESddUMySOmKlzTm50w3ZLvV1ZTRU8TjmttSkzQYZ3csCLkKUgfeL9SAxU9KGoA2l jFxrvDcWEHtuU1D/FeYyOofNaD/BflPfdhj4WAm9XnPPi+THEu7fulWJaIP4glHh 8TpYNbinXuZqXO4nJ41Ad5utbSbBQa4fFBUuViWRTU0TtWJT2HVqn/XoYJ5mnPEz IbYh31rQDKFJKzePfscWrJ6XzoF59yGiAVcWcI3HS7aT8bFZGapAQu9mNCVu+cLF uRxWrukHG7d8YeYrAtbVXWfxArR155V9QJN55hQ1nKLq2M03gNvYTtAPw2yEsfuw u3Fk/KkV1RfaiFurjoG/ =rDum -----END PGP SIGNATURE----- Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module updates from Rusty Russell: "The exciting thing here is the getting rid of stop_machine on module removal. This is possible by using a simple atomic_t for the counter, rather than our fancy per-cpu counter: it turns out that no one is doing a module increment per net packet, so the slowdown should be in the noise" * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: param: do not set store func without write perm params: cleanup sysfs allocation kernel:module Fix coding style errors and warnings. module: Remove stop_machine from module unloading module: Replace module_ref with atomic_t refcnt lib/bug: Use RCU list ops for module_bug_list module: Unlink module with RCU synchronizing instead of stop_machine module: Wait for RCU synchronizing before releasing a module |
||
Linus Torvalds
|
c0f486fde3 |
More ACPI and power management updates for 3.19-rc1
- Fix a regression in leds-gpio introduced by a recent commit that inadvertently changed the name of one of the properties used by the driver (Fabio Estevam). - Fix a regression in the ACPI backlight driver introduced by a recent fix that missed one special case that had to be taken into account (Aaron Lu). - Drop the level of some new kernel messages from the ACPI core introduced by a recent commit to KERN_DEBUG which they should have used from the start and drop some other unuseful KERN_ERR messages printed by ACPI (Rafael J Wysocki). - Revert an incorrect commit modifying the cpupower tool (Prarit Bhargava). - Fix two regressions introduced by recent commits in the OPP library and clean up some existing minor issues in that code (Viresh Kumar). - Continue to replace CONFIG_PM_RUNTIME with CONFIG_PM throughout the tree (or drop it where that can be done) in order to make it possible to eliminate CONFIG_PM_RUNTIME (Rafael J Wysocki, Ulf Hansson, Ludovic Desroches). There will be one more "CONFIG_PM_RUNTIME removal" batch after this one, because some new uses of it have been introduced during the current merge window, but that should be sufficient to finally get rid of it. - Make the ACPI EC driver more robust against race conditions related to GPE handler installation failures (Lv Zheng). - Prevent the ACPI device PM core code from attempting to disable GPEs that it has not enabled which confuses ACPICA and makes it report errors unnecessarily (Rafael J Wysocki). - Add a "force" command line switch to the intel_pstate driver to make it possible to override the blacklisting of some systems in that driver if needed (Ethan Zhao). - Improve intel_pstate code documentation and add a MAINTAINERS entry for it (Kristen Carlson Accardi). - Make the ACPI fan driver create cooling device interfaces witn names that reflect the IDs of the ACPI device objects they are associated with, except for "generic" ACPI fans (PNP ID "PNP0C0B"). That's necessary for user space thermal management tools to be able to connect the fans with the parts of the system they are supposed to be cooling properly. From Srinivas Pandruvada. / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJUk0IDAAoJEILEb/54YlRx7fgP/3+yF/0TnEW93j2ALDAQFiLF tSv2A2vQC8vtMJjjWx0z/HqPh86gfaReEFZmUJD/Q/e2LXEnxNZJ+QMjcekPVkDM mTvcIMc2MR8vOA/oMkgxeaKregrrx7RkCfojd+NWZhVukkjl+mvBHgAnYjXRL+NZ unDWGlbHG97vq/3kGjPYhDS00nxHblw8NHFBu5HL5RxwABdWoeZJITwqxXWyuPLw nlqNWlOxmwvtSbw2VMKz0uof1nFHyQLykYsMG0ZsyayCRdWUZYkEqmE7GGpCLkLu D6yfmlpen6ccIOsEAae0eXBt50IFY9Tihk5lovx1mZmci2SNRg29BqMI105wIn0u 8b8Ej7MNHp7yMxRpB5WfU90p/y7ioJns9guFZxY0CKaRnrI2+BLt3RscMi3MPI06 Cu2/WkSSa09fhDPA+pk+VDYsmWgyVawigesNmMP5/cvYO/yYywVRjOuO1k77qQGp 4dSpFYEHfpxinejZnVZOk2V9MkvSLoSMux6wPV0xM0IE1iD0ulVpHjTJrwp80ph4 +bfUFVr/vrD1y7EKbf1PD363ZKvJhWhvQWDgETsM1vgLf21PfWO7C2kflIAsWsdQ 1ukD5nCBRlP4K73hG7bdM6kRztXhUdR0SHg85/t0KB/ExiVqtcXIzB60D0G1lENd QlKbq3O4lim1WGuhazQY =5fo2 -----END PGP SIGNATURE----- Merge tag 'pm+acpi-3.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more ACPI and power management updates from Rafael Wysocki: "These are regression fixes (leds-gpio, ACPI backlight driver, operating performance points library, ACPI device enumeration messages, cpupower tool), other bug fixes (ACPI EC driver, ACPI device PM), some cleanups in the operating performance points (OPP) framework, continuation of CONFIG_PM_RUNTIME elimination, a couple of minor intel_pstate driver changes, a new MAINTAINERS entry for it and an ACPI fan driver change needed for better support of thermal management in user space. Specifics: - Fix a regression in leds-gpio introduced by a recent commit that inadvertently changed the name of one of the properties used by the driver (Fabio Estevam). - Fix a regression in the ACPI backlight driver introduced by a recent fix that missed one special case that had to be taken into account (Aaron Lu). - Drop the level of some new kernel messages from the ACPI core introduced by a recent commit to KERN_DEBUG which they should have used from the start and drop some other unuseful KERN_ERR messages printed by ACPI (Rafael J Wysocki). - Revert an incorrect commit modifying the cpupower tool (Prarit Bhargava). - Fix two regressions introduced by recent commits in the OPP library and clean up some existing minor issues in that code (Viresh Kumar). - Continue to replace CONFIG_PM_RUNTIME with CONFIG_PM throughout the tree (or drop it where that can be done) in order to make it possible to eliminate CONFIG_PM_RUNTIME (Rafael J Wysocki, Ulf Hansson, Ludovic Desroches). There will be one more "CONFIG_PM_RUNTIME removal" batch after this one, because some new uses of it have been introduced during the current merge window, but that should be sufficient to finally get rid of it. - Make the ACPI EC driver more robust against race conditions related to GPE handler installation failures (Lv Zheng). - Prevent the ACPI device PM core code from attempting to disable GPEs that it has not enabled which confuses ACPICA and makes it report errors unnecessarily (Rafael J Wysocki). - Add a "force" command line switch to the intel_pstate driver to make it possible to override the blacklisting of some systems in that driver if needed (Ethan Zhao). - Improve intel_pstate code documentation and add a MAINTAINERS entry for it (Kristen Carlson Accardi). - Make the ACPI fan driver create cooling device interfaces witn names that reflect the IDs of the ACPI device objects they are associated with, except for "generic" ACPI fans (PNP ID "PNP0C0B"). That's necessary for user space thermal management tools to be able to connect the fans with the parts of the system they are supposed to be cooling properly. From Srinivas Pandruvada" * tag 'pm+acpi-3.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits) MAINTAINERS: add entry for intel_pstate ACPI / video: update the skip case for acpi_video_device_in_dod() power / PM: Eliminate CONFIG_PM_RUNTIME NFC / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM SCSI / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM ACPI / EC: Fix unexpected ec_remove_handlers() invocations Revert "tools: cpupower: fix return checks for sysfs_get_idlestate_count()" tracing / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM x86 / PM: Replace CONFIG_PM_RUNTIME in io_apic.c PM: Remove the SET_PM_RUNTIME_PM_OPS() macro mmc: atmel-mci: use SET_RUNTIME_PM_OPS() macro PM / Kconfig: Replace PM_RUNTIME with PM in dependencies ARM / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM sound / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM phy / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM video / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM tty / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM spi: Replace CONFIG_PM_RUNTIME with CONFIG_PM ACPI / PM: Do not disable wakeup GPEs that have not been enabled ACPI / utils: Drop error messages from acpi_evaluate_reference() ... |
||
Kees Cook
|
b0a65b0ccc |
param: do not set store func without write perm
When a module_param is defined without DAC write permissions, it can still be changed at runtime and updated. Drivers using a 0444 permission may be surprised that these values can still be changed. For drivers that want to allow updates, any S_IW* flag will set the "store" function as before. Drivers without S_IW* flags will have the "store" function unset, unforcing a read-only value. Drivers that wish neither "store" nor "get" can continue to use "0" for perms to stay out of sysfs entirely. Old behavior: # cd /sys/module/snd/parameters # ls -l total 0 -r--r--r-- 1 root root 4096 Dec 11 13:55 cards_limit -r--r--r-- 1 root root 4096 Dec 11 13:55 major -r--r--r-- 1 root root 4096 Dec 11 13:55 slots # cat major 116 # echo -1 > major -bash: major: Permission denied # chmod u+w major # echo -1 > major # cat major -1 New behavior: ... # chmod u+w major # echo -1 > major -bash: echo: write error: Input/output error Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> |
||
Linus Torvalds
|
87c31b39ab |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace related fixes from Eric Biederman: "As these are bug fixes almost all of thes changes are marked for backporting to stable. The first change (implicitly adding MNT_NODEV on remount) addresses a regression that was created when security issues with unprivileged remount were closed. I go on to update the remount test to make it easy to detect if this issue reoccurs. Then there are a handful of mount and umount related fixes. Then half of the changes deal with the a recently discovered design bug in the permission checks of gid_map. Unix since the beginning has allowed setting group permissions on files to less than the user and other permissions (aka ---rwx---rwx). As the unix permission checks stop as soon as a group matches, and setgroups allows setting groups that can not later be dropped, results in a situtation where it is possible to legitimately use a group to assign fewer privileges to a process. Which means dropping a group can increase a processes privileges. The fix I have adopted is that gid_map is now no longer writable without privilege unless the new file /proc/self/setgroups has been set to permanently disable setgroups. The bulk of user namespace using applications even the applications using applications using user namespaces without privilege remain unaffected by this change. Unfortunately this ix breaks a couple user space applications, that were relying on the problematic behavior (one of which was tools/selftests/mount/unprivileged-remount-test.c). To hopefully prevent needing a regression fix on top of my security fix I rounded folks who work with the container implementations mostly like to be affected and encouraged them to test the changes. > So far nothing broke on my libvirt-lxc test bed. :-) > Tested with openSUSE 13.2 and libvirt 1.2.9. > Tested-by: Richard Weinberger <richard@nod.at> > Tested on Fedora20 with libvirt 1.2.11, works fine. > Tested-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com> > Ok, thanks - yes, unprivileged lxc is working fine with your kernels. > Just to be sure I was testing the right thing I also tested using > my unprivileged nsexec testcases, and they failed on setgroup/setgid > as now expected, and succeeded there without your patches. > Tested-by: Serge Hallyn <serge.hallyn@ubuntu.com> > I tested this with Sandstorm. It breaks as is and it works if I add > the setgroups thing. > Tested-by: Andy Lutomirski <luto@amacapital.net> # breaks things as designed :(" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: userns: Unbreak the unprivileged remount tests userns; Correct the comment in map_write userns: Allow setting gid_maps without privilege when setgroups is disabled userns: Add a knob to disable setgroups on a per user namespace basis userns: Rename id_map_mutex to userns_state_mutex userns: Only allow the creator of the userns unprivileged mappings userns: Check euid no fsuid when establishing an unprivileged uid mapping userns: Don't allow unprivileged creation of gid mappings userns: Don't allow setgroups until a gid mapping has been setablished userns: Document what the invariant required for safe unprivileged mappings. groups: Consolidate the setgroups permission checks mnt: Clear mnt_expire during pivot_root mnt: Carefully set CL_UNPRIVILEGED in clone_mnt mnt: Move the clear of MNT_LOCKED from copy_tree to it's callers. umount: Do not allow unmounting rootfs. umount: Disallow unprivileged mount force mnt: Update unprivileged remount test mnt: Implicitly add MNT_NODEV on remount when it was implicitly added by mount |
||
Linus Torvalds
|
603ba7e41b |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile #2 from Al Viro: "Next pile (and there'll be one or two more). The large piece in this one is getting rid of /proc/*/ns/* weirdness; among other things, it allows to (finally) make nameidata completely opaque outside of fs/namei.c, making for easier further cleanups in there" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: coda_venus_readdir(): use file_inode() fs/namei.c: fold link_path_walk() call into path_init() path_init(): don't bother with LOOKUP_PARENT in argument fs/namei.c: new helper (path_cleanup()) path_init(): store the "base" pointer to file in nameidata itself make default ->i_fop have ->open() fail with ENXIO make nameidata completely opaque outside of fs/namei.c kill proc_ns completely take the targets of /proc/*/ns/* symlinks to separate fs bury struct proc_ns in fs/proc copy address of proc_ns_ops into ns_common new helpers: ns_alloc_inum/ns_free_inum make proc_ns_operations work with struct ns_common * instead of void * switch the rest of proc_ns_operations to working with &...->ns netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns common object embedded into various struct ....ns |
||
Linus Torvalds
|
a7c180aa7e |
As the merge window is still open, and this code was not as complex
as I thought it might be. I'm pushing this in now. This will allow Thomas to debug his irq work for 3.20. This adds two new features: 1) Allow traceopoints to be enabled right after mm_init(). By passing in the trace_event= kernel command line parameter, tracepoints can be enabled at boot up. For debugging things like the initialization of interrupts, it is needed to have tracepoints enabled very early. People have asked about this before and this has been on my todo list. As it can be helpful for Thomas to debug his upcoming 3.20 IRQ work, I'm pushing this now. This way he can add tracepoints into the IRQ set up and have users enable them when things go wrong. 2) Have the tracepoints printed via printk() (the console) when they are triggered. If the irq code locks up or reboots the box, having the tracepoint output go into the kernel ring buffer is useless for debugging. But being able to add the tp_printk kernel command line option along with the trace_event= option will have these tracepoints printed as they occur, and that can be really useful for debugging early lock up or reboot problems. This code is not that intrusive and it passed all my tests. Thomas tried them out too and it works for his needs. Link: http://lkml.kernel.org/r/20141214201609.126831471@goodmis.org -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJUjv3kAAoJEEjnJuOKh9ldLNsIANAe5EmDCBw0WjR72n+G3qOH NC8calXfkjqHU0bv8Q3dRv20KH4MHOy6l4+EiV9/ovt71LOF3NEyUJ3HuShf9a8b sWcUhYbX3D1hViQe5sOzv9AWhBCFlKQGoNmQnydX9xa8ivRsBaTGJIGktWlHcwBE jF1i3fj3l3vRQSS8qZFXp3bzreunlGyPoSHcT6eWQeos+utj4sKwQWTLXTLQeM+6 oQtFKRx7E5yX04qO1qFczS8qIEC6JH2C2jIRYEKUGepaELlnGkb8O7jQV/RaLF4/ 6P8VhZFG9YLS7fn7vWu0SnAN+Zwz5LzgjXAZt0FhGtIhLc18Oj8ouHH1UORsdQM= =Z4Un -----END PGP SIGNATURE----- Merge tag 'trace-3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "As the merge window is still open, and this code was not as complex as I thought it might be. I'm pushing this in now. This will allow Thomas to debug his irq work for 3.20. This adds two new features: 1) Allow traceopoints to be enabled right after mm_init(). By passing in the trace_event= kernel command line parameter, tracepoints can be enabled at boot up. For debugging things like the initialization of interrupts, it is needed to have tracepoints enabled very early. People have asked about this before and this has been on my todo list. As it can be helpful for Thomas to debug his upcoming 3.20 IRQ work, I'm pushing this now. This way he can add tracepoints into the IRQ set up and have users enable them when things go wrong. 2) Have the tracepoints printed via printk() (the console) when they are triggered. If the irq code locks up or reboots the box, having the tracepoint output go into the kernel ring buffer is useless for debugging. But being able to add the tp_printk kernel command line option along with the trace_event= option will have these tracepoints printed as they occur, and that can be really useful for debugging early lock up or reboot problems. This code is not that intrusive and it passed all my tests. Thomas tried them out too and it works for his needs. Link: http://lkml.kernel.org/r/20141214201609.126831471@goodmis.org" * tag 'trace-3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Add tp_printk cmdline to have tracepoints go to printk() tracing: Move enabling tracepoints to just after rcu_init() |
||
Linus Torvalds
|
988adfdffd |
Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie: "Highlights: - AMD KFD driver merge This is the AMD HSA interface for exposing a lowlevel interface for GPGPU use. They have an open source userspace built on top of this interface, and the code looks as good as it was going to get out of tree. - Initial atomic modesetting work The need for an atomic modesetting interface to allow userspace to try and send a complete set of modesetting state to the driver has arisen, and been suffering from neglect this past year. No more, the start of the common code and changes for msm driver to use it are in this tree. Ongoing work to get the userspace ioctl finished and the code clean will probably wait until next kernel. - DisplayID 1.3 and tiled monitor exposed to userspace. Tiled monitor property is now exposed for userspace to make use of. - Rockchip drm driver merged. - imx gpu driver moved out of staging Other stuff: - core: panel - MIPI DSI + new panels. expose suggested x/y properties for virtual GPUs - i915: Initial Skylake (SKL) support gen3/4 reset work start of dri1/ums removal infoframe tracking fixes for lots of things. - nouveau: tegra k1 voltage support GM204 modesetting support GT21x memory reclocking work - radeon: CI dpm fixes GPUVM improvements Initial DPM fan control - rcar-du: HDMI support added removed some support for old boards slave encoder driver for Analog Devices adv7511 - exynos: Exynos4415 SoC support - msm: a4xx gpu support atomic helper conversion - tegra: iommu support universal plane support ganged-mode DSI support - sti: HDMI i2c improvements - vmwgfx: some late fixes. - qxl: use suggested x/y properties" * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (969 commits) drm: sti: fix module compilation issue drm/i915: save/restore GMBUS freq across suspend/resume on gen4 drm: sti: correctly cleanup CRTC and planes drm: sti: add HQVDP plane drm: sti: add cursor plane drm: sti: enable auxiliary CRTC drm: sti: fix delay in VTG programming drm: sti: prepare sti_tvout to support auxiliary crtc drm: sti: use drm_crtc_vblank_{on/off} instead of drm_vblank_{on/off} drm: sti: fix hdmi avi infoframe drm: sti: remove event lock while disabling vblank drm: sti: simplify gdp code drm: sti: clear all mixer control drm: sti: remove gpio for HDMI hot plug detection drm: sti: allow to change hdmi ddc i2c adapter drm/doc: Document drm_add_modes_noedid() usage drm/i915: Remove '& 0xffff' from the mask given to WA_REG() drm/i915: Invert the mask and val arguments in wa_add() and WA_REG() drm: Zero out DRM object memory upon cleanup drm/i915/bdw: Fix the write setting up the WIZ hashing mode ... |
||
Steven Rostedt (Red Hat)
|
0daa230296 |
tracing: Add tp_printk cmdline to have tracepoints go to printk()
Add the kernel command line tp_printk option that will have tracepoints that are active sent to printk() as well as to the trace buffer. Passing "tp_printk" will activate this. To turn it off, the sysctl /proc/sys/kernel/tracepoint_printk can have '0' echoed into it. Note, this only works if the cmdline option is used. Echoing 1 into the sysctl file without the cmdline option will have no affect. Note, this is a dangerous option. Having high frequency tracepoints send their data to printk() can possibly cause a live lock. This is another reason why this is only active if the command line option is used. Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412121539300.16494@nanos Suggested-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
||
Steven Rostedt (Red Hat)
|
5f893b2639 |
tracing: Move enabling tracepoints to just after rcu_init()
Enabling tracepoints at boot up can be very useful. The tracepoint can be initialized right after RCU has been. There's no need to wait for the early_initcall() to be called. That's too late for some things that can use tracepoints for debugging. Move the logic to enable tracepoints out of the initcalls and into init/main.c to right after rcu_init(). This also allows trace_printk() to be used early too. Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412121539300.16494@nanos Link: http://lkml.kernel.org/r/20141214164104.307127356@goodmis.org Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
||
Linus Torvalds
|
37da7bbbe8 |
TTY/Serial driver patches for 3.19-rc1
Here's the big tty/serial driver update for 3.19-rc1. There are a number of TTY core changes/fixes in here from Peter Hurley that have all been teted in linux-next for a long time now. There are also the normal serial driver updates as well, full details in the changelog below. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlSOD/MACgkQMUfUDdst+ymW+wCfbSzoYMRObIImMPWfoQtxkvvN rpkAnAtyEP/zZIfkQIuKTSH6FJxocF8V =WZt3 -----END PGP SIGNATURE----- Merge tag 'tty-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver updates from Greg KH: "Here's the big tty/serial driver update for 3.19-rc1. There are a number of TTY core changes/fixes in here from Peter Hurley that have all been teted in linux-next for a long time now. There are also the normal serial driver updates as well, full details in the changelog below" * tag 'tty-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (219 commits) serial: pxa: hold port.lock when reporting modem line changes tty-hvsi_lib: Deletion of an unnecessary check before the function call "tty_kref_put" tty: Deletion of unnecessary checks before two function calls n_tty: Fix read_buf race condition, increment read_head after pushing data serial: of-serial: add PM suspend/resume support Revert "serial: of-serial: add PM suspend/resume support" Revert "serial: of-serial: fix up PM ops on no_console_suspend and port type" serial: 8250: don't attempt a trylock if in sysrq serial: core: Add big-endian iotype serial: samsung: use port->fifosize instead of hardcoded values serial: samsung: prefer to use fifosize from driver data serial: samsung: fix style problems serial: samsung: wait for transfer completion before clock disable serial: icom: fix error return code serial: tegra: clean up tty-flag assignments serial: Fix io address assign flow with Fintek PCI-to-UART Product serial: mxs-auart: fix tx_empty against shift register serial: mxs-auart: fix gpio change detection on interrupt serial: mxs-auart: Fix mxs_auart_set_ldisc() serial: 8250_dw: Use 64-bit access for OCTEON. ... |
||
Linus Torvalds
|
caf292ae5b |
Merge branch 'for-3.19/core' of git://git.kernel.dk/linux-block
Pull block driver core update from Jens Axboe: "This is the pull request for the core block IO changes for 3.19. Not a huge round this time, mostly lots of little good fixes: - Fix a bug in sysfs blktrace interface causing a NULL pointer dereference, when enabled/disabled through that API. From Arianna Avanzini. - Various updates/fixes/improvements for blk-mq: - A set of updates from Bart, mostly fixing buts in the tag handling. - Cleanup/code consolidation from Christoph. - Extend queue_rq API to be able to handle batching issues of IO requests. NVMe will utilize this shortly. From me. - A few tag and request handling updates from me. - Cleanup of the preempt handling for running queues from Paolo. - Prevent running of unmapped hardware queues from Ming Lei. - Move the kdump memory limiting check to be in the correct location, from Shaohua. - Initialize all software queues at init time from Takashi. This prevents a kobject warning when CPUs are brought online that weren't online when a queue was registered. - Single writeback fix for I_DIRTY clearing from Tejun. Queued with the core IO changes, since it's just a single fix. - Version X of the __bio_add_page() segment addition retry from Maurizio. Hope the Xth time is the charm. - Documentation fixup for IO scheduler merging from Jan. - Introduce (and use) generic IO stat accounting helpers for non-rq drivers, from Gu Zheng. - Kill off artificial limiting of max sectors in a request from Christoph" * 'for-3.19/core' of git://git.kernel.dk/linux-block: (26 commits) bio: modify __bio_add_page() to accept pages that don't start a new segment blk-mq: Fix uninitialized kobject at CPU hotplugging blktrace: don't let the sysfs interface remove trace from running list blk-mq: Use all available hardware queues blk-mq: Micro-optimize bt_get() blk-mq: Fix a race between bt_clear_tag() and bt_get() blk-mq: Avoid that __bt_get_word() wraps multiple times blk-mq: Fix a use-after-free blk-mq: prevent unmapped hw queue from being scheduled blk-mq: re-check for available tags after running the hardware queue blk-mq: fix hang in bt_get() blk-mq: move the kdump check to blk_mq_alloc_tag_set blk-mq: cleanup tag free handling blk-mq: use 'nr_cpu_ids' as highest CPU ID count for hwq <-> cpu map blk: introduce generic io stat accounting help function blk-mq: handle the single queue case in blk_mq_hctx_next_cpu genhd: check for int overflow in disk_expand_part_tbl() blk-mq: add blk_mq_free_hctx_request() blk-mq: export blk_mq_free_request() blk-mq: use get_cpu/put_cpu instead of preempt_disable/preempt_enable ... |
||
Linus Torvalds
|
8f4385d590 |
This code is a fork from the trace-3.19 pull as it needed the trace_seq
clean ups from that branch. This code solves the issue of performing stack dumps from NMI context. The issue is that printk() is not safe from NMI context as if the NMI were to trigger when a printk() was being performed, the NMI could deadlock from the printk() internal locks. This has been seen in practice. With lots of review from Petr Mladek, this code went through several iterations, and we feel that it is now at a point of quality to be accepted into mainline. Here's what is contained in this patch set: o Creates a "seq_buf" generic buffer utility that allows a descriptor to be passed around where functions can write their own "printk()" formatted strings into it. The generic version was pulled out of the trace_seq() code that was made specifically for tracing. o The seq_buf code was change to model the seq_file code. I have a patch (not included for 3.19) that converts the seq_file.c code over to use seq_buf.c like the trace_seq.c code does. This was done to make sure that seq_buf.c is compatible with seq_file.c. I may try to get that patch in for 3.20. o The seq_buf.c file was moved to lib/ to remove it from being dependent on CONFIG_TRACING. o The printk() was updated to allow for a per_cpu "override" of the internal calls. That is, instead of writing to the console, a call to printk() may do something else. This made it easier to allow the NMI to change what printk() does in order to call dump_stack() without needing to update that code as well. o Finally, the dump_stack from all CPUs via NMI code was converted to use the seq_buf code. The caller to trigger the NMI code would wait till all the NMIs finished, and then it would print the seq_buf data to the console safely from a non NMI context. [ Updated to remove unnecessary preempt_disable in printk() ] -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJUi8A8AAoJEEjnJuOKh9ldv0sH/A+l9Ewrc3Kd0XuUKX9UO9Mj yrDz5dSWTxD6Pi7ni5Zo2f/MebXhrgS8gF1MBN1HMS5s9/9XdTTQijosOfs75iFd xufiDur7ssl2EOLB/ouqWVn16tu1PrPyw+U76JUZvsYlIMSWQu2FH8DSdo59N6Iz 7RxS8rtxJ2IwehmO7tu2Lq5rB7zGL4SET5oIfQ1+KnjzqB5Z1bfm9nGwAc8nozx8 3MqwsClEnXBTkY4eYZzu9wD7Nl/eknzTrk8KDbQ49oTYmoBuuh/s1FMuxe75cY55 wEtDA6HvvTXYnw6YOAMUB41cGnRg3KVRmmhcH5T9jrBxg2iZjXYa8iZxvcAM6Es= =zDMJ -----END PGP SIGNATURE----- Merge tag 'trace-seq-buf-3.19-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixlet from Steven Rostedt: "Remove unnecessary preempt_disable in printk()" * tag 'trace-seq-buf-3.19-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: printk: Do not disable preemption for accessing printk_func |
||
Linus Torvalds
|
a99abce2d9 |
Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit
Pull audit updates from Paul Moore: "Two small patches from the audit next branch; only one of which has any real significant code changes, the other is simply a MAINTAINERS update for audit. The single code patch is pretty small and rather straightforward, it changes the audit "version" number reported to userspace from an integer to a bitmap which is used to indicate the functionality of the running kernel. This really doesn't have much impact on the kernel, but it will make life easier for the audit userspace folks. Thankfully we were still on a version number which allowed us to do this without breaking userspace" * 'upstream' of git://git.infradead.org/users/pcmoore/audit: audit: convert status version to a feature bitmap audit: add Paul Moore to the MAINTAINERS entry |
||
Jan Kara
|
0809ab69a2 |
fsnotify: unify inode and mount marks handling
There's a lot of common code in inode and mount marks handling. Factor it out to a common helper function. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Eric Paris <eparis@redhat.com> Cc: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Riku Voipio
|
957e3facd1 |
gcov: enable GCOV_PROFILE_ALL from ARCH Kconfigs
Following the suggestions from Andrew Morton and Stephen Rothwell, Dont expand the ARCH list in kernel/gcov/Kconfig. Instead, define a ARCH_HAS_GCOV_PROFILE_ALL bool which architectures can enable. set ARCH_HAS_GCOV_PROFILE_ALL on Architectures where it was previously allowed + ARM64 which I tested. Signed-off-by: Riku Voipio <riku.voipio@linaro.org> Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Masanari Iida
|
d5393955c3 |
kexec: remove unnecessary KERN_ERR from kexec.c
Remove unnecessary KERN_ERR from pr_err() within kexec.c. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
David Drysdale
|
51f39a1f0c |
syscalls: implement execveat() system call
This patchset adds execveat(2) for x86, and is derived from Meredydd Luff's patch from Sept 2012 (https://lkml.org/lkml/2012/9/11/528). The primary aim of adding an execveat syscall is to allow an implementation of fexecve(3) that does not rely on the /proc filesystem, at least for executables (rather than scripts). The current glibc version of fexecve(3) is implemented via /proc, which causes problems in sandboxed or otherwise restricted environments. Given the desire for a /proc-free fexecve() implementation, HPA suggested (https://lkml.org/lkml/2006/7/11/556) that an execveat(2) syscall would be an appropriate generalization. Also, having a new syscall means that it can take a flags argument without back-compatibility concerns. The current implementation just defines the AT_EMPTY_PATH and AT_SYMLINK_NOFOLLOW flags, but other flags could be added in future -- for example, flags for new namespaces (as suggested at https://lkml.org/lkml/2006/7/11/474). Related history: - https://lkml.org/lkml/2006/12/27/123 is an example of someone realizing that fexecve() is likely to fail in a chroot environment. - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=514043 covered documenting the /proc requirement of fexecve(3) in its manpage, to "prevent other people from wasting their time". - https://bugzilla.redhat.com/show_bug.cgi?id=241609 described a problem where a process that did setuid() could not fexecve() because it no longer had access to /proc/self/fd; this has since been fixed. This patch (of 4): Add a new execveat(2) system call. execveat() is to execve() as openat() is to open(): it takes a file descriptor that refers to a directory, and resolves the filename relative to that. In addition, if the filename is empty and AT_EMPTY_PATH is specified, execveat() executes the file to which the file descriptor refers. This replicates the functionality of fexecve(), which is a system call in other UNIXen, but in Linux glibc it depends on opening "/proc/self/fd/<fd>" (and so relies on /proc being mounted). The filename fed to the executed program as argv[0] (or the name of the script fed to a script interpreter) will be of the form "/dev/fd/<fd>" (for an empty filename) or "/dev/fd/<fd>/<filename>", effectively reflecting how the executable was found. This does however mean that execution of a script in a /proc-less environment won't work; also, script execution via an O_CLOEXEC file descriptor fails (as the file will not be accessible after exec). Based on patches by Meredydd Luff. Signed-off-by: David Drysdale <drysdale@google.com> Cc: Meredydd Luff <meredydd@senatehouse.org> Cc: Shuah Khan <shuah.kh@samsung.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rich Felker <dalias@aerifal.cx> Cc: Christoph Hellwig <hch@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Joonsoo Kim
|
9a92a6ce6f |
stacktrace: introduce snprint_stack_trace for buffer output
Current stacktrace only have the function for console output. page_owner that will be introduced in following patch needs to print the output of stacktrace into the buffer for our own output format so so new function, snprint_stack_trace(), is needed. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Dave Hansen <dave@sr71.net> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Jungsoo Son <jungsoo.son@lge.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Davidlohr Bueso
|
4a23717a23 |
uprobes: share the i_mmap_rwsem
Both register and unregister call build_map_info() in order to create the list of mappings before installing or removing breakpoints for every mm which maps file backed memory. As such, there is no reason to hold the i_mmap_rwsem exclusively, so share it and allow concurrent readers to build the mapping data. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Oleg Nesterov <oleg@redhat.com> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Davidlohr Bueso
|
c8c06efa8b |
mm: convert i_mmap_mutex to rwsem
The i_mmap_mutex is a close cousin of the anon vma lock, both protecting similar data, one for file backed pages and the other for anon memory. To this end, this lock can also be a rwsem. In addition, there are some important opportunities to share the lock when there are no tree modifications. This conversion is straightforward. For now, all users take the write lock. [sfr@canb.auug.org.au: update fremap.c] Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: "Kirill A. Shutemov" <kirill@shutemov.name> Acked-by: Hugh Dickins <hughd@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Davidlohr Bueso
|
83cde9e8ba |
mm: use new helper functions around the i_mmap_mutex
Convert all open coded mutex_lock/unlock calls to the i_mmap_[lock/unlock]_write() helpers. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: "Kirill A. Shutemov" <kirill@shutemov.name> Acked-by: Hugh Dickins <hughd@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Thomas Gleixner
|
c291ee6221 |
genirq: Prevent proc race against freeing of irq descriptors
Since the rework of the sparse interrupt code to actually free the
unused interrupt descriptors there exists a race between the /proc
interfaces to the irq subsystem and the code which frees the interrupt
descriptor.
CPU0 CPU1
show_interrupts()
desc = irq_to_desc(X);
free_desc(desc)
remove_from_radix_tree();
kfree(desc);
raw_spinlock_irq(&desc->lock);
/proc/interrupts is the only interface which can actively corrupt
kernel memory via the lock access. /proc/stat can only read from freed
memory. Extremly hard to trigger, but possible.
The interfaces in /proc/irq/N/ are not affected by this because the
removal of the proc file is serialized in procfs against concurrent
readers/writers. The removal happens before the descriptor is freed.
For architectures which have CONFIG_SPARSE_IRQ=n this is a non issue
as the descriptor is never freed. It's merely cleared out with the irq
descriptor lock held. So any concurrent proc access will either see
the old correct value or the cleared out ones.
Protect the lookup and access to the irq descriptor in
show_interrupts() with the sparse_irq_lock.
Provide kstat_irqs_usr() which is protecting the lookup and access
with sparse_irq_lock and switch /proc/stat to use it.
Document the existing kstat_irqs interfaces so it's clear that the
caller needs to take care about protection. The users of these
interfaces are either not affected due to SPARSE_IRQ=n or already
protected against removal.
Fixes:
|
||
Rafael J. Wysocki
|
798bc6d8d5 |
tracing / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
After commit
|
||
Linus Torvalds
|
a7cb7bb664 |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree update from Jiri Kosina: "Usual stuff: documentation updates, printk() fixes, etc" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (24 commits) intel_ips: fix a type in error message cpufreq: cpufreq-dt: Move newline to end of error message ps3rom: fix error return code treewide: fix typo in printk and Kconfig ARM: dts: bcm63138: change "interupts" to "interrupts" Replace mentions of "list_struct" to "list_head" kernel: trace: fix printk message scsi: mpt2sas: fix ioctl in comment zbud, zswap: change module author email clocksource: Fix 'clcoksource' typo in comment arm: fix wording of "Crotex" in CONFIG_ARCH_EXYNOS3 help gpio: msm-v1: make boolean argument more obvious usb: Fix typo in usb-serial-simple.c PCI: Fix comment typo 'COMFIG_PM_OPS' powerpc: Fix comment typo 'CONIFG_8xx' powerpc: Fix comment typos 'CONFiG_ALTIVEC' clk: st: Spelling s/stucture/structure/ isci: Spelling s/stucture/structure/ usb: gadget: zero: Spelling s/infrastucture/infrastructure/ treewide: Fix company name in module descriptions ... |
||
Ingo Molnar
|
3459f0d78f |
Merge branch 'linus' into perf/urgent, to pick up the upstream merged bits
Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Linus Torvalds
|
2756d373a3 |
Merge branch 'for-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup update from Tejun Heo: "cpuset got simplified a bit. cgroup core got a fix on unified hierarchy and grew some effective css related interfaces which will be used for blkio support for writeback IO traffic which is currently being worked on" * 'for-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: implement cgroup_get_e_css() cgroup: add cgroup_subsys->css_e_css_changed() cgroup: add cgroup_subsys->css_released() cgroup: fix the async css offline wait logic in cgroup_subtree_control_write() cgroup: restructure child_subsys_mask handling in cgroup_subtree_control_write() cgroup: separate out cgroup_calc_child_subsys_mask() from cgroup_refresh_child_subsys_mask() cpuset: lock vs unlock typo cpuset: simplify cpuset_node_allowed API cpuset: convert callback_mutex to a spinlock |
||
Linus Torvalds
|
0a27044c83 |
Merge branch 'for-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue update from Tejun Heo: "Work items which may be involved in memory reclaim path may be executed by the rescuer under memory pressure. When a rescuer gets activated, it processes whatever are on the pending list and then goes back to sleep until the manager kicks it again which involves 100ms delay. This is problematic for self-requeueing work items or the ones running on ordered workqueues as there always is only one work item on the pending list when the rescuer kicks in. The execution of that work item produces more to execute but the rescuer won't see them until after the said 100ms has passed, so such workqueues would only execute one work item every 100ms under prolonged memory pressure, which BTW may be being prolonged due to the slow execution. Neil wrote up a patch which fixes this issue by keeping the rescuer working as long as the target workqueue is busy but doesn't have enough workers" * 'for-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: allow rescuer thread to do more work. workqueue: invert the order between pool->lock and wq_mayday_lock workqueue: cosmetic update in rescuer_thread() |
||
Linus Torvalds
|
eedb3d3304 |
Merge branch 'for-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu updates from Tejun Heo: "Nothing interesting. A patch to convert the remaining __get_cpu_var() users, another to fix non-critical off-by-one in an assertion and a cosmetic conversion to lockless_dereference() in percpu-ref. The back-merge from mainline is to receive lockless_dereference()" * 'for-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: Replace smp_read_barrier_depends() with lockless_dereference() percpu: Convert remaining __get_cpu_var uses in 3.18-rcX percpu: off by one in BUG_ON() |
||
Dave Airlie
|
b59f78228c |
Merge tag 'drm-intel-next-fixes-2014-12-11' of git://anongit.freedesktop.org/drm-intel into drm-next
Here's a batch of i915 fixes for 3.19. * tag 'drm-intel-next-fixes-2014-12-11' of git://anongit.freedesktop.org/drm-intel: drm/i915: save/restore GMBUS freq across suspend/resume on gen4 drm/i915: Remove '& 0xffff' from the mask given to WA_REG() drm/i915: Invert the mask and val arguments in wa_add() and WA_REG() drm/i915/bdw: Fix the write setting up the WIZ hashing mode drm/i915: Don't complain about stolen conflicts on gen3 drm/i915: resume MST after reading back hw state drm/i915: Handle inaccurate time conversion issues drm/i915: compute wait_ioctl timeout correctly drm/i915: don't always do full mode sets when infoframes are enabled |
||
Linus Torvalds
|
27afc5dbda |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky: "The most notable change for this pull request is the ftrace rework from Heiko. It brings a small performance improvement and the ground work to support a new gcc option to replace the mcount blocks with a single nop. Two new s390 specific system calls are added to emulate user space mmio for PCI, an artifact of the how PCI memory is accessed. Two patches for the memory management with changes to common code. For KVM mm_forbids_zeropage is added which disables the empty zero page for an mm that is used by a KVM process. And an optimization, pmdp_get_and_clear_full is added analog to ptep_get_and_clear_full. Some micro optimization for the cmpxchg and the spinlock code. And as usual bug fixes and cleanups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (46 commits) s390/cputime: fix 31-bit compile s390/scm_block: make the number of reqs per HW req configurable s390/scm_block: handle multiple requests in one HW request s390/scm_block: allocate aidaw pages only when necessary s390/scm_block: use mempool to manage aidaw requests s390/eadm: change timeout value s390/mm: fix memory leak of ptlock in pmd_free_tlb s390: use local symbol names in entry[64].S s390/ptrace: always include vector registers in core files s390/simd: clear vector register pointer on fork/clone s390: translate cputime magic constants to macros s390/idle: convert open coded idle time seqcount s390/idle: add missing irq off lockdep annotation s390/debug: avoid function call for debug_sprintf_* s390/kprobes: fix instruction copy for out of line execution s390: remove diag 44 calls from cpu_relax() s390/dasd: retry partition detection s390/dasd: fix list corruption for sleep_on requests s390/dasd: fix infinite term I/O loop s390/dasd: remove unused code ... |
||
Eric W. Biederman
|
36476beac4 |
userns; Correct the comment in map_write
It is important that all maps are less than PAGE_SIZE or else setting the last byte of the buffer to '0' could write off the end of the allocated storage. Correct the misleading comment. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> |
||
Eric W. Biederman
|
66d2f338ee |
userns: Allow setting gid_maps without privilege when setgroups is disabled
Now that setgroups can be disabled and not reenabled, setting gid_map without privielge can now be enabled when setgroups is disabled. This restores most of the functionality that was lost when unprivileged setting of gid_map was removed. Applications that use this functionality will need to check to see if they use setgroups or init_groups, and if they don't they can be fixed by simply disabling setgroups before writing to gid_map. Cc: stable@vger.kernel.org Reviewed-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> |
||
Eric W. Biederman
|
9cc46516dd |
userns: Add a knob to disable setgroups on a per user namespace basis
- Expose the knob to user space through a proc file /proc/<pid>/setgroups A value of "deny" means the setgroups system call is disabled in the current processes user namespace and can not be enabled in the future in this user namespace. A value of "allow" means the segtoups system call is enabled. - Descendant user namespaces inherit the value of setgroups from their parents. - A proc file is used (instead of a sysctl) as sysctls currently do not allow checking the permissions at open time. - Writing to the proc file is restricted to before the gid_map for the user namespace is set. This ensures that disabling setgroups at a user namespace level will never remove the ability to call setgroups from a process that already has that ability. A process may opt in to the setgroups disable for itself by creating, entering and configuring a user namespace or by calling setns on an existing user namespace with setgroups disabled. Processes without privileges already can not call setgroups so this is a noop. Prodcess with privilege become processes without privilege when entering a user namespace and as with any other path to dropping privilege they would not have the ability to call setgroups. So this remains within the bounds of what is possible without a knob to disable setgroups permanently in a user namespace. Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> |
||
Linus Torvalds
|
70e71ca0af |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller: 1) New offloading infrastructure and example 'rocker' driver for offloading of switching and routing to hardware. This work was done by a large group of dedicated individuals, not limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend, Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu 2) Start making the networking operate on IOV iterators instead of modifying iov objects in-situ during transfers. Thanks to Al Viro and Herbert Xu. 3) A set of new netlink interfaces for the TIPC stack, from Richard Alpe. 4) Remove unnecessary looping during ipv6 routing lookups, from Martin KaFai Lau. 5) Add PAUSE frame generation support to gianfar driver, from Matei Pavaluca. 6) Allow for larger reordering levels in TCP, which are easily achievable in the real world right now, from Eric Dumazet. 7) Add a variable of napi_schedule that doesn't need to disable cpu interrupts, from Eric Dumazet. 8) Use a doubly linked list to optimize neigh_parms_release(), from Nicolas Dichtel. 9) Various enhancements to the kernel BPF verifier, and allow eBPF programs to actually be attached to sockets. From Alexei Starovoitov. 10) Support TSO/LSO in sunvnet driver, from David L Stevens. 11) Allow controlling ECN usage via routing metrics, from Florian Westphal. 12) Remote checksum offload, from Tom Herbert. 13) Add split-header receive, BQL, and xmit_more support to amd-xgbe driver, from Thomas Lendacky. 14) Add MPLS support to openvswitch, from Simon Horman. 15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen Klassert. 16) Do gro flushes on a per-device basis using a timer, from Eric Dumazet. This tries to resolve the conflicting goals between the desired handling of bulk vs. RPC-like traffic. 17) Allow userspace to ask for the CPU upon what a packet was received/steered, via SO_INCOMING_CPU. From Eric Dumazet. 18) Limit GSO packets to half the current congestion window, from Eric Dumazet. 19) Add a generic helper so that all drivers set their RSS keys in a consistent way, from Eric Dumazet. 20) Add xmit_more support to enic driver, from Govindarajulu Varadarajan. 21) Add VLAN packet scheduler action, from Jiri Pirko. 22) Support configurable RSS hash functions via ethtool, from Eyal Perry. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits) Fix race condition between vxlan_sock_add and vxlan_sock_release net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header net/mlx4: Add support for A0 steering net/mlx4: Refactor QUERY_PORT net/mlx4_core: Add explicit error message when rule doesn't meet configuration net/mlx4: Add A0 hybrid steering net/mlx4: Add mlx4_bitmap zone allocator net/mlx4: Add a check if there are too many reserved QPs net/mlx4: Change QP allocation scheme net/mlx4_core: Use tasklet for user-space CQ completion events net/mlx4_core: Mask out host side virtualization features for guests net/mlx4_en: Set csum level for encapsulated packets be2net: Export tunnel offloads only when a VxLAN tunnel is created gianfar: Fix dma check map error when DMA_API_DEBUG is enabled cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call net: fec: only enable mdio interrupt before phy device link up net: fec: clear all interrupt events to support i.MX6SX net: fec: reset fep link status in suspend function net: sock: fix access via invalid file descriptor net: introduce helper macro for_each_cmsghdr ... |
||
Steven Rostedt (Red Hat)
|
1fb8915b98 |
printk: Do not disable preemption for accessing printk_func
As printk_func will either be the default function, or a per_cpu function for the current CPU, there's no reason to disable preemption to access it from printk. That's because if the printk_func is not the default then the caller had better disabled preemption as they were the one to change it. Link: http://lkml.kernel.org/r/CA+55aFz5-_LKW4JHEBoWinN9_ouNcGRWAF2FUA35u46FRN-Kxw@mail.gmail.com Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
||
Jiri Olsa
|
9fc81d8742 |
perf: Fix events installation during moving group
We allow PMU driver to change the cpu on which the event should be installed to. This happened in patch: |
||
Linus Torvalds
|
92a578b064 |
ACPI and power management updates for 3.19-rc1
This time we have some more new material than we used to have during the last couple of development cycles. The most important part of it to me is the introduction of a unified interface for accessing device properties provided by platform firmware. It works with Device Trees and ACPI in a uniform way and drivers using it need not worry about where the properties come from as long as the platform firmware (either DT or ACPI) makes them available. It covers both devices and "bare" device node objects without struct device representation as that turns out to be necessary in some cases. This has been in the works for quite a few months (and development cycles) and has been approved by all of the relevant maintainers. On top of that, some drivers are switched over to the new interface (at25, leds-gpio, gpio_keys_polled) and some additional changes are made to the core GPIO subsystem to allow device drivers to manipulate GPIOs in the "canonical" way on platforms that provide GPIO information in their ACPI tables, but don't assign names to GPIO lines (in which case the driver needs to do that on the basis of what it knows about the device in question). That also has been approved by the GPIO core maintainers and the rfkill driver is now going to use it. Second is support for hardware P-states in the intel_pstate driver. It uses CPUID to detect whether or not the feature is supported by the processor in which case it will be enabled by default. However, it can be disabled entirely from the kernel command line if necessary. Next is support for a platform firmware interface based on ACPI operation regions used by the PMIC (Power Management Integrated Circuit) chips on the Intel Baytrail-T and Baytrail-T-CR platforms. That interface is used for manipulating power resources and for thermal management: sensor temperature reporting, trip point setting and so on. Also the ACPI core is now going to support the _DEP configuration information in a limited way. Basically, _DEP it supposed to reflect off-the-hierarchy dependencies between devices which may be very indirect, like when AML for one device accesses locations in an operation region handled by another device's driver (usually, the device depended on this way is a serial bus or GPIO controller). The support added this time is sufficient to make the ACPI battery driver work on Asus T100A, but it is general enough to be able to cover some other use cases in the future. Finally, we have a new cpufreq driver for the Loongson1B processor. In addition to the above, there are fixes and cleanups all over the place as usual and a traditional ACPICA update to a recent upstream release. As far as the fixes go, the ACPI LPSS (Low-power Subsystem) driver for Intel platforms should be able to handle power management of the DMA engine correctly, the cpufreq-dt driver should interact with the thermal subsystem in a better way and the ACPI backlight driver should handle some more corner cases, among other things. On top of the ACPICA update there are fixes for race conditions in the ACPICA's interrupt handling code which might lead to some random and strange looking failures on some systems. In the cleanups department the most visible part is the series of commits targeted at getting rid of the CONFIG_PM_RUNTIME configuration option. That was triggered by a discussion regarding the generic power domains code during which we realized that trying to support certain combinations of PM config options was painful and not really worth it, because nobody would use them in production anyway. For this reason, we decided to make CONFIG_PM_SLEEP select CONFIG_PM_RUNTIME and that lead to the conclusion that the latter became redundant and CONFIG_PM could be used instead of it. The material here makes that replacement in a major part of the tree, but there will be at least one more batch of that in the second part of the merge window. Specifics: - Support for retrieving device properties information from ACPI _DSD device configuration objects and a unified device properties interface for device drivers (and subsystems) on top of that. As stated above, this works with Device Trees and ACPI and allows device drivers to be written in a platform firmware (DT or ACPI) agnostic way. The at25, leds-gpio and gpio_keys_polled drivers are now going to use this new interface and the GPIO subsystem is additionally modified to allow device drivers to assign names to GPIO resources returned by ACPI _CRS objects (in case _DSD is not present or does not provide the expected data). The changes in this set are mostly from Mika Westerberg, Rafael J Wysocki, Aaron Lu, and Darren Hart with some fixes from others (Fabio Estevam, Geert Uytterhoeven). - Support for Hardware Managed Performance States (HWP) as described in Volume 3, section 14.4, of the Intel SDM in the intel_pstate driver. CPUID is used to detect whether or not the feature is supported by the processor. If supported, it will be enabled automatically unless the intel_pstate=no_hwp switch is present in the kernel command line. From Dirk Brandewie. - New Intel Broadwell-H ID for intel_pstate (Dirk Brandewie). - Support for firmware interface based on ACPI operation regions used by the PMIC chips on the Intel Baytrail-T and Baytrail-T-CR platforms for power resource control and thermal management (Aaron Lu). - Limited support for retrieving off-the-hierarchy dependencies between devices from ACPI _DEP device configuration objects and deferred probing support for the ACPI battery driver based on the _DEP information to make that driver work on Asus T100A (Lan Tianyu). - New cpufreq driver for the Loongson1B processor (Kelvin Cheung). - ACPICA update to upstream revision 20141107 which only affects tools (Bob Moore). - Fixes for race conditions in the ACPICA's interrupt handling code and in the ACPI code related to system suspend and resume (Lv Zheng and Rafael J Wysocki). - ACPI core fix for an RCU-related issue in the ioremap() regions management code that slowed down significantly after CPUs had been allowed to enter idle states even if they'd had RCU callbakcs queued and triggered some problems in certain proprietary graphics driver (and elsewhere). The fix replaces synchronize_rcu() in that code with synchronize_rcu_expedited() which makes the issue go away. From Konstantin Khlebnikov. - ACPI LPSS (Low-Power Subsystem) driver fix to handle power management of the DMA engine included into the LPSS correctly. The problem is that the DMA engine doesn't have ACPI PM support of its own and it simply is turned off when the last LPSS device having ACPI PM support goes into D3cold. To work around that, the PM domain used by the ACPI LPSS driver is redesigned so at least one device with ACPI PM support will be on as long as the DMA engine is in use. From Andy Shevchenko. - ACPI backlight driver fix to avoid using it on "Win8-compatible" systems where it doesn't work and where it was used by default by mistake (Aaron Lu). - Assorted minor ACPI core fixes and cleanups from Tomasz Nowicki, Sudeep Holla, Huang Rui, Hanjun Guo, Fabian Frederick, and Ashwin Chaugule (mostly related to the upcoming ARM64 support). - Intel RAPL (Running Average Power Limit) power capping driver fixes and improvements including new processor IDs (Jacob Pan). - Generic power domains modification to power up domains after attaching devices to them to meet the expectations of device drivers and bus types assuming devices to be accessible at probe time (Ulf Hansson). - Preliminary support for controlling device clocks from the generic power domains core code and modifications of the ARM/shmobile platform to use that feature (Ulf Hansson). - Assorted minor fixes and cleanups of the generic power domains core code (Ulf Hansson, Geert Uytterhoeven). - Assorted minor fixes and cleanups of the device clocks control code in the PM core (Geert Uytterhoeven, Grygorii Strashko). - Consolidation of device power management Kconfig options by making CONFIG_PM_SLEEP select CONFIG_PM_RUNTIME and removing the latter which is now redundant (Rafael J Wysocki and Kevin Hilman). That is the first batch of the changes needed for this purpose. - Core device runtime power management support code cleanup related to the execution of callbacks (Andrzej Hajda). - cpuidle ARM support improvements (Lorenzo Pieralisi). - cpuidle cleanup related to the CPUIDLE_FLAG_TIME_VALID flag and a new MAINTAINERS entry for ARM Exynos cpuidle (Daniel Lezcano and Bartlomiej Zolnierkiewicz). - New cpufreq driver callback (->ready) to be executed when the cpufreq core is ready to use a given policy object and cpufreq-dt driver modification to use that callback for cooling device registration (Viresh Kumar). - cpufreq core fixes and cleanups (Viresh Kumar, Vince Hsu, James Geboski, Tomeu Vizoso). - Assorted fixes and cleanups in the cpufreq-pcc, intel_pstate, cpufreq-dt, pxa2xx cpufreq drivers (Lenny Szubowicz, Ethan Zhao, Stefan Wahren, Petr Cvek). - OPP (Operating Performance Points) framework modification to allow OPPs to be removed too and update of a few cpufreq drivers (cpufreq-dt, exynos5440, imx6q, cpufreq) to remove OPPs (added during initialization) on driver removal (Viresh Kumar). - Hibernation core fixes and cleanups (Tina Ruchandani and Markus Elfring). - PM Kconfig fix related to CPU power management (Pankaj Dubey). - cpupower tool fix (Prarit Bhargava). / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJUhj6JAAoJEILEb/54YlRxTM4P/j5g5SfqvY0QKsn7sR7MGZ6v nsgCBhJAqTw3ocNC7EAs8z9h2GWy1KbKpakKYWAh9Fs1yZoey7tFSlcv/Rgjlp70 uU5sDQHtpE9mHKiymdsowiQuWgpl962L4k+k8hUslhlvgk1PvVbpajR6OqG8G+pD asuIW9eh1APNkLyXmRJ3ZPomzs0VmRdZJ0NEs0lKX9mJskqEvxPIwdaxq3iaJq9B Fo0J345zUDcJnxWblDRdHlOigCimglElfN5qJwaC4KpwUKuBvLRKbp4f69+wfT0c kYFiR29X5KjJ2kLfP/wKsLyuDCYYXRq3tCia5M1tAqOjZ+UA89H/GDftx/5lntmv qUlBa35VfdS1SX4HyApZitOHiLgo+It/hl8Z9bJnhyVw66NxmMQ8JYN2imb8Lhqh XCLR7BxLTah82AapLJuQ0ZDHPzZqMPG2veC2vAzRMYzVijict/p4Y2+qBqONltER 4rs9uRVn+hamX33lCLg8BEN8zqlnT3rJFIgGaKjq/wXHAU/zpE9CjOrKMQcAg9+s t51XMNPwypHMAYyGVhEL89ImjXnXxBkLRuquhlmEpvQchIhR+mR3dLsarGn7da44 WPIQJXzcsojXczcwwfqsJCR4I1FTFyQIW+UNh02GkDRgRovQqo+Jk762U7vQwqH+ LBdhvVaS1VW4v+FWXEoZ =5dox -----END PGP SIGNATURE----- Merge tag 'pm+acpi-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: "This time we have some more new material than we used to have during the last couple of development cycles. The most important part of it to me is the introduction of a unified interface for accessing device properties provided by platform firmware. It works with Device Trees and ACPI in a uniform way and drivers using it need not worry about where the properties come from as long as the platform firmware (either DT or ACPI) makes them available. It covers both devices and "bare" device node objects without struct device representation as that turns out to be necessary in some cases. This has been in the works for quite a few months (and development cycles) and has been approved by all of the relevant maintainers. On top of that, some drivers are switched over to the new interface (at25, leds-gpio, gpio_keys_polled) and some additional changes are made to the core GPIO subsystem to allow device drivers to manipulate GPIOs in the "canonical" way on platforms that provide GPIO information in their ACPI tables, but don't assign names to GPIO lines (in which case the driver needs to do that on the basis of what it knows about the device in question). That also has been approved by the GPIO core maintainers and the rfkill driver is now going to use it. Second is support for hardware P-states in the intel_pstate driver. It uses CPUID to detect whether or not the feature is supported by the processor in which case it will be enabled by default. However, it can be disabled entirely from the kernel command line if necessary. Next is support for a platform firmware interface based on ACPI operation regions used by the PMIC (Power Management Integrated Circuit) chips on the Intel Baytrail-T and Baytrail-T-CR platforms. That interface is used for manipulating power resources and for thermal management: sensor temperature reporting, trip point setting and so on. Also the ACPI core is now going to support the _DEP configuration information in a limited way. Basically, _DEP it supposed to reflect off-the-hierarchy dependencies between devices which may be very indirect, like when AML for one device accesses locations in an operation region handled by another device's driver (usually, the device depended on this way is a serial bus or GPIO controller). The support added this time is sufficient to make the ACPI battery driver work on Asus T100A, but it is general enough to be able to cover some other use cases in the future. Finally, we have a new cpufreq driver for the Loongson1B processor. In addition to the above, there are fixes and cleanups all over the place as usual and a traditional ACPICA update to a recent upstream release. As far as the fixes go, the ACPI LPSS (Low-power Subsystem) driver for Intel platforms should be able to handle power management of the DMA engine correctly, the cpufreq-dt driver should interact with the thermal subsystem in a better way and the ACPI backlight driver should handle some more corner cases, among other things. On top of the ACPICA update there are fixes for race conditions in the ACPICA's interrupt handling code which might lead to some random and strange looking failures on some systems. In the cleanups department the most visible part is the series of commits targeted at getting rid of the CONFIG_PM_RUNTIME configuration option. That was triggered by a discussion regarding the generic power domains code during which we realized that trying to support certain combinations of PM config options was painful and not really worth it, because nobody would use them in production anyway. For this reason, we decided to make CONFIG_PM_SLEEP select CONFIG_PM_RUNTIME and that lead to the conclusion that the latter became redundant and CONFIG_PM could be used instead of it. The material here makes that replacement in a major part of the tree, but there will be at least one more batch of that in the second part of the merge window. Specifics: - Support for retrieving device properties information from ACPI _DSD device configuration objects and a unified device properties interface for device drivers (and subsystems) on top of that. As stated above, this works with Device Trees and ACPI and allows device drivers to be written in a platform firmware (DT or ACPI) agnostic way. The at25, leds-gpio and gpio_keys_polled drivers are now going to use this new interface and the GPIO subsystem is additionally modified to allow device drivers to assign names to GPIO resources returned by ACPI _CRS objects (in case _DSD is not present or does not provide the expected data). The changes in this set are mostly from Mika Westerberg, Rafael J Wysocki, Aaron Lu, and Darren Hart with some fixes from others (Fabio Estevam, Geert Uytterhoeven). - Support for Hardware Managed Performance States (HWP) as described in Volume 3, section 14.4, of the Intel SDM in the intel_pstate driver. CPUID is used to detect whether or not the feature is supported by the processor. If supported, it will be enabled automatically unless the intel_pstate=no_hwp switch is present in the kernel command line. From Dirk Brandewie. - New Intel Broadwell-H ID for intel_pstate (Dirk Brandewie). - Support for firmware interface based on ACPI operation regions used by the PMIC chips on the Intel Baytrail-T and Baytrail-T-CR platforms for power resource control and thermal management (Aaron Lu). - Limited support for retrieving off-the-hierarchy dependencies between devices from ACPI _DEP device configuration objects and deferred probing support for the ACPI battery driver based on the _DEP information to make that driver work on Asus T100A (Lan Tianyu). - New cpufreq driver for the Loongson1B processor (Kelvin Cheung). - ACPICA update to upstream revision 20141107 which only affects tools (Bob Moore). - Fixes for race conditions in the ACPICA's interrupt handling code and in the ACPI code related to system suspend and resume (Lv Zheng and Rafael J Wysocki). - ACPI core fix for an RCU-related issue in the ioremap() regions management code that slowed down significantly after CPUs had been allowed to enter idle states even if they'd had RCU callbakcs queued and triggered some problems in certain proprietary graphics driver (and elsewhere). The fix replaces synchronize_rcu() in that code with synchronize_rcu_expedited() which makes the issue go away. From Konstantin Khlebnikov. - ACPI LPSS (Low-Power Subsystem) driver fix to handle power management of the DMA engine included into the LPSS correctly. The problem is that the DMA engine doesn't have ACPI PM support of its own and it simply is turned off when the last LPSS device having ACPI PM support goes into D3cold. To work around that, the PM domain used by the ACPI LPSS driver is redesigned so at least one device with ACPI PM support will be on as long as the DMA engine is in use. From Andy Shevchenko. - ACPI backlight driver fix to avoid using it on "Win8-compatible" systems where it doesn't work and where it was used by default by mistake (Aaron Lu). - Assorted minor ACPI core fixes and cleanups from Tomasz Nowicki, Sudeep Holla, Huang Rui, Hanjun Guo, Fabian Frederick, and Ashwin Chaugule (mostly related to the upcoming ARM64 support). - Intel RAPL (Running Average Power Limit) power capping driver fixes and improvements including new processor IDs (Jacob Pan). - Generic power domains modification to power up domains after attaching devices to them to meet the expectations of device drivers and bus types assuming devices to be accessible at probe time (Ulf Hansson). - Preliminary support for controlling device clocks from the generic power domains core code and modifications of the ARM/shmobile platform to use that feature (Ulf Hansson). - Assorted minor fixes and cleanups of the generic power domains core code (Ulf Hansson, Geert Uytterhoeven). - Assorted minor fixes and cleanups of the device clocks control code in the PM core (Geert Uytterhoeven, Grygorii Strashko). - Consolidation of device power management Kconfig options by making CONFIG_PM_SLEEP select CONFIG_PM_RUNTIME and removing the latter which is now redundant (Rafael J Wysocki and Kevin Hilman). That is the first batch of the changes needed for this purpose. - Core device runtime power management support code cleanup related to the execution of callbacks (Andrzej Hajda). - cpuidle ARM support improvements (Lorenzo Pieralisi). - cpuidle cleanup related to the CPUIDLE_FLAG_TIME_VALID flag and a new MAINTAINERS entry for ARM Exynos cpuidle (Daniel Lezcano and Bartlomiej Zolnierkiewicz). - New cpufreq driver callback (->ready) to be executed when the cpufreq core is ready to use a given policy object and cpufreq-dt driver modification to use that callback for cooling device registration (Viresh Kumar). - cpufreq core fixes and cleanups (Viresh Kumar, Vince Hsu, James Geboski, Tomeu Vizoso). - Assorted fixes and cleanups in the cpufreq-pcc, intel_pstate, cpufreq-dt, pxa2xx cpufreq drivers (Lenny Szubowicz, Ethan Zhao, Stefan Wahren, Petr Cvek). - OPP (Operating Performance Points) framework modification to allow OPPs to be removed too and update of a few cpufreq drivers (cpufreq-dt, exynos5440, imx6q, cpufreq) to remove OPPs (added during initialization) on driver removal (Viresh Kumar). - Hibernation core fixes and cleanups (Tina Ruchandani and Markus Elfring). - PM Kconfig fix related to CPU power management (Pankaj Dubey). - cpupower tool fix (Prarit Bhargava)" * tag 'pm+acpi-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (120 commits) i2c-omap / PM: Drop CONFIG_PM_RUNTIME from i2c-omap.c dmaengine / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM tools: cpupower: fix return checks for sysfs_get_idlestate_count() drivers: sh / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM e1000e / igb / PM: Eliminate CONFIG_PM_RUNTIME MMC / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM MFD / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM misc / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM media / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM input / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM leds: leds-gpio: Fix multiple instances registration without 'label' property iio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM hsi / OMAP / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM i2c-hid / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM drm / exynos / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM gpio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM hwrandom / exynos / PM: Use CONFIG_PM in #ifdef block / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM USB / PM: Drop CONFIG_PM_RUNTIME from the USB core PM: Merge the SET*_RUNTIME_PM_OPS() macros ... |
||
Linus Torvalds
|
350e4f4985 |
This code is a fork from the trace-3.19 pull as it needed the trace_seq
clean ups from that branch. This code solves the issue of performing stack dumps from NMI context. The issue is that printk() is not safe from NMI context as if the NMI were to trigger when a printk() was being performed, the NMI could deadlock from the printk() internal locks. This has been seen in practice. With lots of review from Petr Mladek, this code went through several iterations, and we feel that it is now at a point of quality to be accepted into mainline. Here's what is contained in this patch set: o Creates a "seq_buf" generic buffer utility that allows a descriptor to be passed around where functions can write their own "printk()" formatted strings into it. The generic version was pulled out of the trace_seq() code that was made specifically for tracing. o The seq_buf code was change to model the seq_file code. I have a patch (not included for 3.19) that converts the seq_file.c code over to use seq_buf.c like the trace_seq.c code does. This was done to make sure that seq_buf.c is compatible with seq_file.c. I may try to get that patch in for 3.20. o The seq_buf.c file was moved to lib/ to remove it from being dependent on CONFIG_TRACING. o The printk() was updated to allow for a per_cpu "override" of the internal calls. That is, instead of writing to the console, a call to printk() may do something else. This made it easier to allow the NMI to change what printk() does in order to call dump_stack() without needing to update that code as well. o Finally, the dump_stack from all CPUs via NMI code was converted to use the seq_buf code. The caller to trigger the NMI code would wait till all the NMIs finished, and then it would print the seq_buf data to the console safely from a non NMI context. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJUhbrnAAoJEEjnJuOKh9ldsCoIAJ3sKIJ5B3jxJJTCHPAx/lZD GVbV1J1mu4kTAZuhJZOAxW8D6PZGZMyEjg0y6ScDEnBGcjAZ9gTiWCdakPktf9EX GfaPPqwiL9dZ18J9Qc6uR+7M1Ffpzzwbcc6lJrpoTcjRgkoH9wCiLS9ozFQyYzWb /7m5UbUM/PIk9WAjLYXPW6UUVtPTPT0RdEQKofMGTeah+vgqj4TXCOROdlxsXXWF 77vqBvPd5TUPWFH9ftzJGDtZS8SroXVKCu3fZIqHgzAU0yqwVtH/JzDTy9u2UYhX GzDEPeAIdp6m6Uyc406VuIf1QW0gfBgmA0ir80vFoP27uFMM6j5HlF7azgQfx34= =YBgA -----END PGP SIGNATURE----- Merge tag 'trace-seq-buf-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull nmi-safe seq_buf printk update from Steven Rostedt: "This code is a fork from the trace-3.19 pull as it needed the trace_seq clean ups from that branch. This code solves the issue of performing stack dumps from NMI context. The issue is that printk() is not safe from NMI context as if the NMI were to trigger when a printk() was being performed, the NMI could deadlock from the printk() internal locks. This has been seen in practice. With lots of review from Petr Mladek, this code went through several iterations, and we feel that it is now at a point of quality to be accepted into mainline. Here's what is contained in this patch set: - Creates a "seq_buf" generic buffer utility that allows a descriptor to be passed around where functions can write their own "printk()" formatted strings into it. The generic version was pulled out of the trace_seq() code that was made specifically for tracing. - The seq_buf code was change to model the seq_file code. I have a patch (not included for 3.19) that converts the seq_file.c code over to use seq_buf.c like the trace_seq.c code does. This was done to make sure that seq_buf.c is compatible with seq_file.c. I may try to get that patch in for 3.20. - The seq_buf.c file was moved to lib/ to remove it from being dependent on CONFIG_TRACING. - The printk() was updated to allow for a per_cpu "override" of the internal calls. That is, instead of writing to the console, a call to printk() may do something else. This made it easier to allow the NMI to change what printk() does in order to call dump_stack() without needing to update that code as well. - Finally, the dump_stack from all CPUs via NMI code was converted to use the seq_buf code. The caller to trigger the NMI code would wait till all the NMIs finished, and then it would print the seq_buf data to the console safely from a non NMI context One added bonus is that this code also makes the NMI dump stack work on PREEMPT_RT kernels. As printk() includes sleeping locks on PREEMPT_RT, printk() only writes to console if the console does not use any rt_mutex converted spin locks. Which a lot do" * tag 'trace-seq-buf-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: x86/nmi: Fix use of unallocated cpumask_var_t printk/percpu: Define printk_func when printk is not defined x86/nmi: Perform a safe NMI stack trace on all CPUs printk: Add per_cpu printk func to allow printk to be diverted seq_buf: Move the seq_buf code to lib/ seq-buf: Make seq_buf_bprintf() conditional on CONFIG_BINARY_PRINTF tracing: Add seq_buf_get_buf() and seq_buf_commit() helper functions tracing: Have seq_buf use full buffer seq_buf: Add seq_buf_can_fit() helper function tracing: Add paranoid size check in trace_printk_seq() tracing: Use trace_seq_used() and seq_buf_used() instead of len tracing: Clean up tracing_fill_pipe_page() seq_buf: Create seq_buf_used() to find out how much was written tracing: Add a seq_buf_clear() helper and clear len and readpos in init tracing: Convert seq_buf fields to be like seq_file fields tracing: Convert seq_buf_path() to be like seq_path() tracing: Create seq_buf layer in trace_seq |
||
Linus Torvalds
|
1dd7dcb6ea |
There was a lot of clean ups and minor fixes. One of those clean ups was
to the trace_seq code. It also removed the return values to the trace_seq_*() functions and use trace_seq_has_overflowed() to see if the buffer filled up or not. This is similar to work being done to the seq_file code as well in another tree. Some of the other goodies include: o Added some "!" (NOT) logic to the tracing filter. o Fixed the frame pointer logic to the x86_64 mcount trampolines o Added the logic for dynamic trampolines on !CONFIG_PREEMPT systems. That is, the ftrace trampoline can be dynamically allocated and be called directly by functions that only have a single hook to them. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJUhbLGAAoJEEjnJuOKh9ldRV4H/3NcLbgGB2iu96la1zdYE6pG Q7cDJMxXK80YIIL70h9G0IItcD4t62LMb72lfBnMGRj3msgFb3AgISW57EuI0Pxk xk24wuIPoTG2S7v9sc3SboNFwO8qbtIjxD2OBmqIUrGo2sZIiGjyj3gX7mCY3uzL WB2bUOSFz/22OgaANinR5EELHA3pZZCf54Vz1K9ndmtK0xp0j1a7xJShD6TrMdYv mZ3zH5ViIhW4A3mdcMceh6fy2JLQAiEKF0uPTvcMMz7NlVul0mxyL/+10P7AE/3R Ehw4fzmm4NDshPDtBOkKH0LsppgXzuItFuQUTpact3JlqTg++bV6onSsrkt1hlY= =Z7Cm -----END PGP SIGNATURE----- Merge tag 'trace-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "There was a lot of clean ups and minor fixes. One of those clean ups was to the trace_seq code. It also removed the return values to the trace_seq_*() functions and use trace_seq_has_overflowed() to see if the buffer filled up or not. This is similar to work being done to the seq_file code as well in another tree. Some of the other goodies include: - Added some "!" (NOT) logic to the tracing filter. - Fixed the frame pointer logic to the x86_64 mcount trampolines - Added the logic for dynamic trampolines on !CONFIG_PREEMPT systems. That is, the ftrace trampoline can be dynamically allocated and be called directly by functions that only have a single hook to them" * tag 'trace-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (55 commits) tracing: Truncated output is better than nothing tracing: Add additional marks to signal very large time deltas Documentation: describe trace_buf_size parameter more accurately tracing: Allow NOT to filter AND and OR clauses tracing: Add NOT to filtering logic ftrace/fgraph/x86: Have prepare_ftrace_return() take ip as first parameter ftrace/x86: Get rid of ftrace_caller_setup ftrace/x86: Have save_mcount_regs macro also save stack frames if needed ftrace/x86: Add macro MCOUNT_REG_SIZE for amount of stack used to save mcount regs ftrace/x86: Simplify save_mcount_regs on getting RIP ftrace/x86: Have save_mcount_regs store RIP in %rdi for first parameter ftrace/x86: Rename MCOUNT_SAVE_FRAME and add more detailed comments ftrace/x86: Move MCOUNT_SAVE_FRAME out of header file ftrace/x86: Have static tracing also use ftrace_caller_setup ftrace/x86: Have static function tracing always test for function graph kprobes: Add IPMODIFY flag to kprobe_ftrace_ops ftrace, kprobes: Support IPMODIFY flag to find IP modify conflict kprobes/ftrace: Recover original IP if pre_handler doesn't change it tracing/trivial: Fix typos and make an int into a bool tracing: Deletion of an unnecessary check before iput() ... |
||
Linus Torvalds
|
b6da0076ba |
Merge branch 'akpm' (patchbomb from Andrew)
Merge first patchbomb from Andrew Morton: - a few minor cifs fixes - dma-debug upadtes - ocfs2 - slab - about half of MM - procfs - kernel/exit.c - panic.c tweaks - printk upates - lib/ updates - checkpatch updates - fs/binfmt updates - the drivers/rtc tree - nilfs - kmod fixes - more kernel/exit.c - various other misc tweaks and fixes * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits) exit: pidns: fix/update the comments in zap_pid_ns_processes() exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting exit: exit_notify: re-use "dead" list to autoreap current exit: reparent: call forget_original_parent() under tasklist_lock exit: reparent: avoid find_new_reaper() if no children exit: reparent: introduce find_alive_thread() exit: reparent: introduce find_child_reaper() exit: reparent: document the ->has_child_subreaper checks exit: reparent: s/while_each_thread/for_each_thread/ in find_new_reaper() exit: reparent: fix the cross-namespace PR_SET_CHILD_SUBREAPER reparenting exit: reparent: fix the dead-parent PR_SET_CHILD_SUBREAPER reparenting exit: proc: don't try to flush /proc/tgid/task/tgid exit: release_task: fix the comment about group leader accounting exit: wait: drop tasklist_lock before psig->c* accounting exit: wait: don't use zombie->real_parent exit: wait: cleanup the ptrace_reparented() checks usermodehelper: kill the kmod_thread_locker logic usermodehelper: don't use CLONE_VFORK for ____call_usermodehelper() fs/hfs/catalog.c: fix comparison bug in hfs_cat_keycmp nilfs2: fix the nilfs_iget() vs. nilfs_new_inode() races ... |
||
Al Viro
|
707c5960f1 | Merge branch 'nsfs' into for-next | ||
Oleg Nesterov
|
a53b831549 |
exit: pidns: fix/update the comments in zap_pid_ns_processes()
The comments in zap_pid_ns_processes() are not clear, we need to explain
how this code actually works.
1. "Ignore SIGCHLD" looks like optimization but it is not, we also
need this for correctness.
2. The comment above sys_wait4() could tell more.
EXIT_ZOMBIE child is only possible if it has exited before we
ignored SIGCHLD. Or if it is traced from the parent namespace,
but in this case it will be reaped by debugger after detach,
sys_wait4() acts as a synchronization point.
3. The comment about TASK_DEAD (EXIT_DEAD in fact) children is
outdated. Contrary to what it says we do not need to make sure
they all go away after
|