6fd92b63d0
The versions with inline assembly are in fact slower on the machines I
tested them on (in userspace) (Athlon XP 2800+, p4-like Xeon 2.8GHz, AMD
Opteron 270). The i386-version needed a fix similar to 06024f21 to avoid
crashing the benchmark.

Benchmark using: gcc -fomit-frame-pointer -Os. For each bitmap size
1...512, for each possible bitmap with one bit set, for each possible
offset: find the position of the first bit starting at offset. If you
follow ;). Times include setup of the bitmap and checking of the
results.

                Athlon      Xeon        Opteron 32/64bit
x86-specific:   0m3.692s    0m2.820s    0m3.196s / 0m2.480s
generic:        0m2.622s    0m1.662s    0m2.100s / 0m1.572s

If the bitmap size is not a multiple of BITS_PER_LONG, and no set
(cleared) bit is found, find_next_bit (find_next_zero_bit) returns a
value outside of the range [0, size]. The generic version always returns
exactly size. The generic version also uses unsigned long everywhere,
while the x86 versions use a mishmash of int, unsigned (int), long and
unsigned long.

Using the generic version does give a slightly bigger kernel, though.

defconfig:        text    data     bss     dec     hex filename
x86-specific:  4738555  481232  626688 5846475  5935cb vmlinux (32 bit)
generic:       4738621  481232  626688 5846541  59360d vmlinux (32 bit)
x86-specific:  5392395  846568  724424 6963387  6a40bb vmlinux (64 bit)
generic:       5392458  846568  724424 6963450  6a40fa vmlinux (64 bit)

Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
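
For readers who want to try the return-value contract and the exhaustive check described in the commit message, here is a minimal userspace sketch in C. It is not the kernel's generic implementation: the names sketch_find_next_bit, lowest_set_bit and check_single_bit_bitmaps, and the MAX_BITS limit, are hypothetical and chosen only for illustration. The sketch assumes the conventional layout where bit n lives in word n / BITS_PER_LONG, and it clamps the result so that "no bit found" always yields exactly size, even when the size is not a multiple of BITS_PER_LONG.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MAX_BITS 512 /* hypothetical limit, matching the 1...512 range above */
#define MAX_WORDS ((MAX_BITS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Index of the lowest set bit; the caller guarantees word != 0. */
static unsigned long lowest_set_bit(unsigned long word)
{
    unsigned long n = 0;

    while (!(word & 1UL)) {
        word >>= 1;
        n++;
    }
    return n;
}

/*
 * Illustrative sketch only, not the kernel code: return the position of
 * the first set bit at or after offset, or exactly size if there is none.
 */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
                                          unsigned long size,
                                          unsigned long offset)
{
    while (offset < size) {
        unsigned long index = offset / BITS_PER_LONG;
        /* Ignore bits below offset inside the current word. */
        unsigned long word = addr[index] & (~0UL << (offset % BITS_PER_LONG));

        if (word) {
            unsigned long pos = index * BITS_PER_LONG + lowest_set_bit(word);

            /* A hit in the padding beyond size counts as "not found". */
            return pos < size ? pos : size;
        }
        /* Nothing in this word: continue at the start of the next one. */
        offset = (index + 1) * BITS_PER_LONG;
    }
    return size;
}

/*
 * Correctness check in the spirit of the benchmark loop: every size up to
 * max_size, every single-bit bitmap, every offset. Returns the number of
 * mismatches, so 0 means all cases agreed.
 */
static unsigned long check_single_bit_bitmaps(unsigned long max_size)
{
    unsigned long bitmap[MAX_WORDS];
    unsigned long size, bit, offset, errors = 0;

    assert(max_size <= MAX_BITS);
    for (size = 1; size <= max_size; size++) {
        for (bit = 0; bit < size; bit++) {
            memset(bitmap, 0, sizeof(bitmap));
            bitmap[bit / BITS_PER_LONG] |= 1UL << (bit % BITS_PER_LONG);
            for (offset = 0; offset < size; offset++) {
                unsigned long expected = offset <= bit ? bit : size;

                if (sketch_find_next_bit(bitmap, size, offset) != expected)
                    errors++;
            }
        }
    }
    return errors;
}
```

Calling check_single_bit_bitmaps(MAX_BITS) walks the same search space as the benchmark, but it only verifies results and makes no attempt to reproduce the timings quoted above.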
28 lines
749 B
Makefile
#
# Makefile for x86 specific library files.
#

obj-$(CONFIG_SMP) := msr-on-cpu.o

lib-y := delay_$(BITS).o
lib-y += usercopy_$(BITS).o getuser_$(BITS).o putuser_$(BITS).o
lib-y += memcpy_$(BITS).o

ifeq ($(CONFIG_X86_32),y)
        lib-y += checksum_32.o
        lib-y += strstr_32.o
        lib-y += semaphore_32.o string_32.o

        lib-$(CONFIG_X86_USE_3DNOW) += mmx_32.o
else
        obj-y += io_64.o iomap_copy_64.o

        CFLAGS_csum-partial_64.o := -funroll-loops

        lib-y += csum-partial_64.o csum-copy_64.o csum-wrappers_64.o
        lib-y += thunk_64.o clear_page_64.o copy_page_64.o
        lib-y += bitops_64.o
        lib-y += memmove_64.o memset_64.o
        lib-y += copy_user_64.o rwlock_64.o copy_user_nocache_64.o
endif