426514 Commits

Author SHA1 Message Date
Balázs Kéri
f93dee1033 [clang][ASTImporter] Fix import of function with auto return type.
Fix a case of importing a function with auto return type
that is resolved with a type template argument that is declared
inside the function.
Fixes #55500

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D127396
2022-06-10 10:18:53 +02:00
Kirill Okhotnikov
76b57ef88c [libc][math] Differential "diff" test for hypot/hypotf functions.
Added a test handler in preparation for the fmod/fmodf commit.

Differential Revision: https://reviews.llvm.org/D127091
2022-06-10 10:08:47 +02:00
Nikita Popov
3c514d31d7 [EarlyCSE] Update tests to use opaque pointers (NFC)
Update the EarlyCSE tests to use opaque pointers.

Worth noting that this leaves some bitcast ptr to ptr instructions
in the input IR behind which are no longer necessary. This is
because these use numbered instructions, so it's hard to drop them
in an automated fashion (as it would require renumbering all other
instructions as well). I'm leaving that as a problem for another day.

The test updates have been performed using
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34.

Differential Revision: https://reviews.llvm.org/D127278
2022-06-10 09:53:35 +02:00
Nikita Popov
6bc8163c79 [cmake] Export driver template to fix standalone build
Export the driver-template.cpp.in file so that tools using
GENERATE_DRIVER work in standalone builds (currently only relevant
for clang). I've given the file an llvm- prefix, as we're now
searching for the file in CMAKE_MODULE_PATH.

Differential Revision: https://reviews.llvm.org/D127384
2022-06-10 09:50:23 +02:00
Nikita Popov
c10921fa1a [CGP] Also freeze ctlz/cttz operand when despeculating
D125887 changed the ctlz/cttz despeculation transform to insert
a freeze for the introduced branch on zero. While this does fix
the "branch on poison" issue, we may still get in trouble if we
pick a different value for the branch and for the ctz argument
(i.e. non-zero for the branch, but zero for the ctz). To avoid
this, we should use the same frozen value in both positions.

This does cause a regression in RISCV codegen by introducing an
additional sext. The DAG looks like this:

    t0: ch = EntryToken
        t2: i64,ch = CopyFromReg t0, Register:i64 %3
      t4: i64 = AssertSext t2, ValueType:ch:i32
    t23: i64 = freeze t4
          t9: ch = CopyToReg t0, Register:i64 %0, t23
          t16: ch = CopyToReg t0, Register:i64 %4, Constant:i64<32>
        t18: ch = TokenFactor t9, t16
            t25: i64 = sign_extend_inreg t23, ValueType:ch:i32
          t24: i64 = setcc t25, Constant:i64<0>, seteq:ch
        t28: i64 = and t24, Constant:i64<1>
      t19: ch = brcond t18, t28, BasicBlock:ch<cond.end 0x8311f68>
    t21: ch = br t19, BasicBlock:ch<cond.false 0x8311e80>

I don't see a really obvious way to improve this, as we can't push
the freeze past the AssertSext (which may produce poison).

Differential Revision: https://reviews.llvm.org/D126638
2022-06-10 09:46:10 +02:00
Jay Foad
6c372daa84 [AMDGPU] New GFX11 intrinsic llvm.amdgcn.s.sendmsg.rtn
Add new intrinsic and codegen support for the s_sendmsg_rtn_b32 and
s_sendmsg_rtn_b64 instructions.

Differential Revision: https://reviews.llvm.org/D127315
2022-06-10 08:15:23 +01:00
Jay Foad
b0a3849439 [AMDGPU] Update dlc usage for GFX11
In GFX10 dlc controlled L1 cache bypass. In GFX11 it has been repurposed
to control MALL NOALLOC, and glc controls L1 as well as L0 cache bypass.

Update the documentation and SIMemoryLegalizer accordingly. Set dlc for
nontemporal and volatile accesses.

Differential Revision: https://reviews.llvm.org/D127405
2022-06-10 08:10:34 +01:00
Tony
802e3f4f57 [AMDGPU] Add GFX11 documentation to AMDGPUUsage
Update most of the document to include GFX11. Memory model changes will
come later.

Differential Revision: https://reviews.llvm.org/D127402
2022-06-10 08:10:34 +01:00
Valentin Clement
5b66cc1000 [flang][NFC] Move Todo.h from Lower to Optimizer
Remove a backwards dependence from Optimizer -> Lower by moving Todo.h
to the optimizer and out of lowering.

This patch is part of the upstreaming effort from fir-dev branch.

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D127292
2022-06-10 08:51:05 +02:00
Sunho Kim
6d67f7a329 [JITLink][EHFrameSupport] Remove CodeAlignmentFactor and DataAlignmentFactor validation.
Removes CodeAlignmentFactor and DataAlignmentFactor validation in EHFrameEdgeFixer. I observed that some aarch64 ELF files generated by clang contain CIE records with code_alignment_factor = 4 or data_alignment_factor = -8. code_alignment_factor and data_alignment_factor are used by call frame instructions, which should be handled correctly by libunwind.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D127062
2022-06-10 15:29:20 +09:00
Adrian Kuegel
61132005a9 Fix bazel BUILD. 2022-06-10 08:26:00 +02:00
Eric Schweitz
68cfb6a8e5 Fixes assertion that arose from bad FIR being constructed.
* Fix assertion strings.

* Fixes assertion that arose from bad FIR being constructed.

With the default member-wise component assignment, the LHS and RHS may
be compatible but distinct types. This change to lowering manages both
the LHS and RHS independently rather than assume the two types are
identical. This avoids creating bogus FIR and asserting/crashing in
codegen.

Update the tests with the member-wise copy code.

This patch is part of the upstreaming effort from fir-dev branch.

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D127297
2022-06-10 08:11:43 +02:00
Nathan James
b831786292 [clang-tidy][NFC] Tweak identifier-naming options reading/writing 2022-06-10 07:07:21 +01:00
Yeting Kuo
f68cad9087 [RISCV] Lower VLEFF/VLSEGFF SDNodes to MachineInstrs with VL outputs.
This patch replaces D125199. Giving PseudoReadVL a vtype raised a concern: the
same vtype of a VLEFF/VLSEGFF would have to be computed in two different places,
DAGToDAG and InsertVSETVLI. A VLEFF/VLSEGFF MI with a VL output can still provide
the vtype of the VLEFF/VLSEGFF to the users of its VL.

The patch names the new pseudos after the original VLEFF/VLSEGFF names with a
"_VL" suffix and expands them in the RISCVInsertVSETVLI pass.

This patch also reverts commit 4537aae0d5,
"[RISCV] Make PseudoReadVL have the vtypes of the corresponding VLEFF/VLSEGFF.".

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126794
2022-06-10 13:57:10 +08:00
Peter S. Housel
2be5abb7e9 [ORC][ORC_RT] Handle ELF .init_array with non-default priority
ELF-based platforms currently support defining multiple static
initializer table sections with differing priorities, for example
.init_array.0 or .init_array.100; the default .init_array corresponds
to a priority of 65535. When building a shared library or executable,
the system linker normally sorts these sections and combines them into
a single .init_array section. This change adds the capability to
recognize ELF static initializers with priorities other than the
default, and to properly sort them by priority, to Orc and the Orc
runtime.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D127056
2022-06-09 22:47:58 -07:00
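The priority ordering described in the commit above can be illustrated with a small sketch. This is not the Orc implementation; it only assumes the section-name convention given in the message (`.init_array.N`, with a plain `.init_array` treated as priority 65535) and assumes that lower numbers run first, as system linkers order them. The function names are hypothetical.

```python
import re

# Priority of a plain ".init_array" section, as stated in the commit message.
DEFAULT_PRIORITY = 65535


def init_array_priority(section_name):
    """Return the priority encoded in an initializer section name, or None."""
    if section_name == ".init_array":
        return DEFAULT_PRIORITY
    match = re.fullmatch(r"\.init_array\.(\d+)", section_name)
    return int(match.group(1)) if match else None


def sort_init_sections(section_names):
    """Keep only initializer sections, ordered by ascending priority."""
    inits = [n for n in section_names if init_array_priority(n) is not None]
    # sorted() is stable, so sections with equal priority keep their input order.
    return sorted(inits, key=init_array_priority)


if __name__ == "__main__":
    print(sort_init_sections([".init_array", ".text", ".init_array.100", ".init_array.0"]))
```

Unrelated sections such as `.text` are dropped by the filter, so only initializer tables take part in the ordering.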
Peter S. Housel
1aa71f8679 [ORC][ORC_RT] Integrate ORC platforms with LLJIT and lli
This change enables integrating orc::LLJIT with the ORCv2
platforms (MachOPlatform and ELFNixPlatform) and the compiler-rt orc
runtime. Changes include:

- Adding SPS wrapper functions for the orc runtime's dlfcn emulation
  functions, allowing initialization and deinitialization to be invoked
  by LLJIT.

- Changing the LLJIT code generation default to add UseInitArray so
  that .init_array constructors are generated for ELF platforms.

- Integrating the ORCv2 Platforms into lli, and adding a
  PlatformSupport implementation to the LLJIT instance used by lli which
  implements initialization and deinitialization by calling the new
  wrapper functions in the runtime.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D126492
2022-06-09 22:47:58 -07:00
Sunho Kim
87c4268329 [JITLink][ELF][AArch64] Implement Procedure Linkage Table.
Implements the Procedure Linkage Table (PLT) for ELF/AARCH64. The aarch64 Linux calling convention also uses r16 as the intra-procedure-call scratch register, the same as MachO/ARM64, so we can use the same stub sequence.

Also, the BR (branch to register) instruction doesn't touch the X30 register. An external function call by a BL instruction (touched by a CALL26 relocation) will set X30 to the original PC + 4, which is the intended behavior. An external function call by a B instruction (touched by a JUMP26 relocation) doesn't require setting X30, so the patch will be correct in this case too.

Reference: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#611general-purpose-registers

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D127061
2022-06-10 14:44:33 +09:00
owenca
e9f2d47bfe [clang-format][NFC] Remove unused FormatStyle members
Differential Revision: https://reviews.llvm.org/D127390
2022-06-09 22:34:31 -07:00
Jun Zhang
0ecbedc098 Also move WeakRefReferences in CodeGenModule::moveLazyEmssionStates
I forgot this field in b8f9459715
Signed-off-by: Jun Zhang <jun@junz.org>
2022-06-10 13:11:09 +08:00
Sunho Kim
e093e42107 [ORC][AArch64] Add initial support for aarch64 in ELFNixPlatform.
Adds aarch64 support in ELFNixPlatform. These are a few simple changes, but they allow us to use the orc runtime in the ELF/AARCH64 backend. It successfully runs the static initializers of libstdc++ iostream, so that the "cout << Hello world" testcase starts to work.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D127060
2022-06-10 13:37:36 +09:00
chenglin.bi
cde377db85 [InstCombine] Add negative vector tests for lshr+shl+and/shl+lshr+and transforms; NFC 2022-06-10 11:36:39 +08:00
chenglin.bi
87b5840b34 [InstCombine] Add baseline tests for lshr+shl+and transforms; NFC 2022-06-10 11:00:41 +08:00
lewuathe
999f767f9f [mlir] fix typo in AttributesAndTypes doc 2022-06-10 11:42:57 +09:00
Sunho Kim
175f22d6c3 [JITLink][ELF][AArch64] Implement R_AARCH64_JUMP26
Implements R_AARCH64_JUMP26. We can use the same generic aarch64 Branch26 edge since the B and BL instructions have an immediate field of the same size and offset, and the relocation address calculation is the same.

Reference: ELF for the ARM ® 64-bit Architecture Table 4-10, ARM Architecture Reference Manual ® ARMv8, for ARMv8-A architecture profile C6.2.24, C6.2.31

Reviewed By: sgraenitz

Differential Revision: https://reviews.llvm.org/D127059
2022-06-10 11:35:42 +09:00
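As a rough illustration of why B and BL can share one fixup kind, the sketch below patches the 26-bit immediate of either instruction the same way. It is not the JITLink code; `apply_branch26` is a hypothetical helper, and the opcode constants are the base encodings of B and BL with a zero immediate.

```python
B_OPCODE = 0x14000000    # AArch64 B with a zero immediate
BL_OPCODE = 0x94000000   # AArch64 BL with a zero immediate
IMM26_MASK = 0x03FFFFFF  # both instructions keep the offset in bits [25:0]


def apply_branch26(instr, fixup_address, target_address):
    """Return instr with its imm26 field set to the branch displacement."""
    delta = target_address - fixup_address
    if delta % 4 != 0:
        raise ValueError("branch displacement is not a multiple of 4")
    # imm26 is a signed word offset, so the byte range is [-2^27, 2^27).
    if not -(1 << 27) <= delta < (1 << 27):
        raise ValueError("branch target out of range for a 26-bit immediate")
    imm26 = (delta >> 2) & IMM26_MASK
    return (instr & ~IMM26_MASK & 0xFFFFFFFF) | imm26


if __name__ == "__main__":
    print(hex(apply_branch26(B_OPCODE, 0x1000, 0x1008)))
    print(hex(apply_branch26(BL_OPCODE, 0x1000, 0x0FFC)))
```

Only the top six opcode bits differ between the two instructions, so the same masking and range check serves both relocation types.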
chenglin.bi
de7a6ae1ff [InstCombine] Optimize shl+lshr+and conversion pattern
if `C1` and `C3` are pow2 and `Log2(C3)+C2 < BitWidth`:
    ((C1 << X) >> C2) & C3 -> X == (Log2(C3)+C2-Log2(C1)) ? C3 : 0;

https://alive2.llvm.org/ce/z/Pus5bd

Fix issue https://github.com/llvm/llvm-project/issues/55739

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D126617
2022-06-10 09:36:58 +08:00
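The fold stated in the commit above can be sanity-checked by brute force on a small bit width. The sketch below is only an illustration and not a substitute for the linked Alive2 proof; it models `shl` as truncating to the bit width and restricts `X` and `C2` to values below the bit width (larger shift amounts are poison in LLVM IR). The helper names are hypothetical.

```python
def original(c1, x, c2, c3, bits):
    """((C1 << X) >> C2) & C3 on a fixed-width unsigned integer."""
    mask = (1 << bits) - 1
    return (((c1 << x) & mask) >> c2) & c3


def folded(c1, x, c2, c3):
    """X == (Log2(C3) + C2 - Log2(C1)) ? C3 : 0"""
    log2_c1 = c1.bit_length() - 1
    log2_c3 = c3.bit_length() - 1
    return c3 if x == log2_c3 + c2 - log2_c1 else 0


def check(bits=8):
    """Compare both forms for every power-of-two C1, C3 and every valid C2, X."""
    powers = [1 << i for i in range(bits)]
    for c1 in powers:
        for c3 in powers:
            for c2 in range(bits):
                # Precondition from the commit: Log2(C3) + C2 < BitWidth.
                if (c3.bit_length() - 1) + c2 >= bits:
                    continue
                for x in range(bits):
                    if original(c1, x, c2, c3, bits) != folded(c1, x, c2, c3):
                        return False
    return True


if __name__ == "__main__":
    print(check(8))
```

The precondition matters: without `Log2(C3) + C2 < BitWidth`, the shifted bit could be truncated away by the `shl` before it reaches the position selected by `C3`.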
Sunho Kim
51a41f23b6 [JITLink][AArch64] Fix overflow range of Page21 fixup edge.
Allowed range for Page21 relocation is -2^32 <= X < 2^32 in both ELF and MachO.

09c2b7c35a/llvm/lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldMachOAArch64.h (L210) (MachO)

ELF for the ARM ® 64-bit Architecture (AArch64) Table 4-9 (ELF)

Reviewed By: sgraenitz

Differential Revision: https://reviews.llvm.org/D126387
2022-06-10 10:30:19 +09:00
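The range quoted in the commit above can be expressed as a short check. This is a sketch under the assumption that X is the difference between the 4 KiB page of the target and the page of the fixup, which is how ADRP-style page relocations are defined; the function names are hypothetical and this is not the JITLink source.

```python
PAGE_MASK = ~0xFFF  # clears the low 12 bits, i.e. rounds down to a 4 KiB page


def page21_delta(fixup_address, target_address):
    """Page(target) - Page(fixup), in bytes."""
    return (target_address & PAGE_MASK) - (fixup_address & PAGE_MASK)


def fits_page21(fixup_address, target_address):
    """True if the page delta satisfies -2^32 <= X < 2^32."""
    delta = page21_delta(fixup_address, target_address)
    return -(1 << 32) <= delta < (1 << 32)


if __name__ == "__main__":
    print(hex(page21_delta(0x1234, 0x5678)))
    print(fits_page21(0x0, 1 << 32))
```

The bound follows from the encoding: a signed 21-bit page count shifted left by 12 bits spans a signed 33-bit byte range.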
Kirill Okhotnikov
081aba27b1 [libc][math] Separated builtin function in special FPUtils header.
A small refactoring of builtin functions in preparation for adding the fmod/fmodf functions.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D127088
2022-06-10 03:18:35 +02:00
Sam Clegg
457f38a7b0 [lld][WebAssembly] Revert moving of data relocations to start function
Back in https://reviews.llvm.org/D117412 we moved the application of
data relocations to the wasm start function.

However, because the dynamic linker doesn't know the final addresses
at module instantiation time, this proved to be too early and the
relocations could be applied with the wrong values.

Fixes: https://github.com/emscripten-core/emscripten/issues/17150

Differential Revision: https://reviews.llvm.org/D127333
2022-06-09 17:49:35 -07:00
Damian Rouson
2a40267a0d [flang] semantics test for ucobound
Add a test with a range of ucobound() intrinsic function
invocations, including a comprehensive set of standard-conforming
keyword and non-keyword arguments with and without optional
arguments present and with argument positions covering all
possible orderings.  Also test that several non-conforming
ucobound() invocations generate the correct error messages.

Differential Revision: https://reviews.llvm.org/D126508
2022-06-09 17:39:25 -07:00
Philip Reames
28be4b7454 [RISCV] Simplify InstrInfo access in doPeepholeMaskedRVV [nfc] 2022-06-09 17:02:40 -07:00
Mogball
2af69c6751 [mlir][NFC] Rename Bazel target aliases and consolidate targets
This patch completes outstanding TODOs of removing Bazel target name aliases.
This patch also renames and consolidates some Bazel targets to be more in line
with their CMake counterparts, e.g. combining `:LinalgOps` and `:LinalgInterfaces`
into `:LinalgDialect`.

Differential Revision: https://reviews.llvm.org/D127459
2022-06-09 23:58:07 +00:00
Okwan Kwon
5ccb9df3ba [mlir] Support passing ostream as argument for the create function.
The constructor already supports passing an ostream as argument,
so let's make the create function support it too.

Differential Revision: https://reviews.llvm.org/D127449
2022-06-09 16:34:22 -07:00
Sunho Kim
f2f8ce9699 [NFC] test commit
This is an empty commit to check commit access
2022-06-10 08:32:58 +09:00
Dave Lee
47c4c6a746 [lldb] Use assertState in more tests (NFC)
Follow-up to D127355, converting more `assertEquals` to `assertState`.

Differential Revision: https://reviews.llvm.org/D127378
2022-06-09 16:18:07 -07:00
Mogball
a31ff0af9b [mlir][spirv] Replace StructAttrs with AttrDefs
Depends on D127370

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D127373
2022-06-09 23:16:44 +00:00
Philip Reames
b59c2315af [BasicTTI] Return Invalid cost for more scalable vector scalarization cases
Instead of crashing on a cast<FixedVectorType>, we should return Invalid for these cases. This avoids crashes in assert builds, and potential miscompiles in release builds.
2022-06-09 16:10:51 -07:00
Craig Topper
8bbcb98848 [RISCV] Teach RISCVMergeBaseOffset about cases where we use SHXADD to add some immediates.
For an addition with simm14 and simm15 immediates with 2 or 3 trailing zero bits,
we can use a shXadd instruction and an addi to do the addition.

This patch teaches RISCVMergeBaseOffset to see through this pattern.
I don't think the sh1add case occurs because we use two addis for that,
but I implemented it for completeness.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D127376
2022-06-09 16:07:35 -07:00
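The immediate pattern mentioned in the commit above can be sketched as a decomposition: an immediate too large for a single addi, but whose low 2 or 3 bits are zero, is split into a 12-bit value and a shift that sh2add or sh3add applies. This is an illustration only, not the RISC-V backend's selection logic; the helper names are hypothetical and the instruction sequence in the comment is an assumed shape.

```python
def fits_simm12(value):
    """True if value fits the 12-bit signed immediate of addi."""
    return -2048 <= value <= 2047


def split_for_shxadd(imm):
    """Return (shift, small_imm) with imm == small_imm << shift, or None.

    A non-None result corresponds to a sequence of the assumed form
        addi   tmp, zero, small_imm
        sh<shift>add rd, tmp, base      # rd = base + (tmp << shift)
    """
    if fits_simm12(imm):
        return None  # a single addi already covers this immediate
    for shift in (2, 3):
        if imm % (1 << shift) == 0 and fits_simm12(imm >> shift):
            return shift, imm >> shift
    return None


if __name__ == "__main__":
    print(split_for_shxadd(4096))
    print(split_for_shxadd(8192))
```

The shift-by-1 case is left out here because, as the commit message notes, two addis are normally used for those immediates.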
Mogball
f1182bd6d5 [mlir][tosa] Replace StructAttrs with AttrDefs
Depends on D127352

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127370
2022-06-09 23:01:51 +00:00
Mogball
d7ef488bb6 [mlir][gpu] Move GPU headers into IR/ and Transforms/
Depends on D127350

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127352
2022-06-09 22:49:03 +00:00
Philip Reames
206f10d3f6 Plumb InstructionCost through unroll costing
Teach the unroller(s) how to handle an invalid cost. This avoids crashes when the backend can't provide a cost due to either a fundamental limitation or an unimplemented cost model case.

Differential Revision: https://reviews.llvm.org/D127305
2022-06-09 15:42:53 -07:00
Denis Revunov
0b7e8baf83 [BOLT][AArch64] Handle data at the beginning of a function when disassembling and building CFG.
This patch adds getFirstInstructionOffset method for BinaryFunction
which is used to properly handle cases where data is at zero offset in
a function. The main change is that we add basic block at first
instruction offset when disassembling, which prevents assertion
failures in buildCFG.

Reviewed By: yota9, rafauler

Differential Revision: https://reviews.llvm.org/D127111
2022-06-09 15:26:32 -07:00
Mogball
7bdd3722f2 [mlir][gpu] Change ParalellLoopMappingAttr to AttrDef
It was a StructAttr. Also adds a FieldParser for AffineMap.

Depends on D127348

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127350
2022-06-09 22:23:21 +00:00
Philip Reames
f85c5079b8 Pipe potentially invalid InstructionCost through CodeMetrics
Per the documentation in Support/InstructionCost.h, the purpose of an invalid cost is so that clients can change behavior on impossible to cost inputs. CodeMetrics was instead asserting that invalid costs never occurred.

On a target with an incomplete cost model - e.g. RISCV - this means that transformations would crash on (falsely) invalid constructs - e.g. scalable vectors. While we certainly should improve the cost model - and I plan to do so in the near future - we also shouldn't be crashing. This violates the explicitly stated purpose of an invalid InstructionCost.

I updated all of the "easy" consumers where bailouts were locally obvious. I plan to follow up with loop unroll in a following change.

Differential Revision: https://reviews.llvm.org/D127131
2022-06-09 15:17:24 -07:00
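The idea behind the commit above, that an invalid cost should flow through to clients so they can bail out instead of asserting, can be modelled in a few lines. This is a toy model and not the API of Support/InstructionCost.h; the `Cost` class and `should_transform` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    """A cost value that may be invalid (impossible to cost)."""
    value: int = 0
    valid: bool = True

    def __add__(self, other):
        # Invalidity is sticky: any invalid operand makes the sum invalid.
        return Cost(self.value + other.value, self.valid and other.valid)


INVALID = Cost(0, False)


def total_cost(costs):
    """Sum a sequence of costs, propagating invalidity."""
    total = Cost()
    for cost in costs:
        total = total + cost
    return total


def should_transform(costs, budget):
    """Client-side policy: skip the transform when the cost is unknown."""
    total = total_cost(costs)
    if not total.valid:
        return False  # bail out rather than assert or crash
    return total.value <= budget


if __name__ == "__main__":
    print(should_transform([Cost(1), Cost(2)], budget=5))
    print(should_transform([Cost(1), INVALID], budget=5))
```

Keeping the validity flag inside the value means intermediate code can keep summing costs without checks, and only the final consumer has to decide what to do with an invalid total.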
Mogball
ba79bb4973 [mlir][nvvm] Change MMAShapeAttr to AttrDef
MMAShapeAttr was a StructAttr

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127348
2022-06-09 22:14:45 +00:00
Nico Weber
f8144700eb [gn build] (manually) port 25c8a061c5 2022-06-09 18:07:14 -04:00
Michael Jones
e1c54d4ddc [libc] move printf_main into object library
Previously printf_main was a header library, but header library
dependencies don't work properly so it's been moved to an object
library. Additionally, the writers have been marked inline.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D126830
2022-06-09 14:35:18 -07:00
Haojian Wu
70d35fe125 [pseudo] Fix the broken build of ClangPseudoBenchmark, after c70aeaa. 2022-06-09 23:03:54 +02:00
Sanjay Patel
afa192cfb6 [InstCombine] add narrowing transform for low-masked binop with zext operand
https://alive2.llvm.org/ce/z/hRy3rE

As shown in D123408, we can produce this pattern when moving
casts around, and we already have a related fold for a binop
with a constant operand.
2022-06-09 16:59:26 -04:00
Sanjay Patel
48a606d0c7 [InstCombine] add tests for masked binop narrowing; NFC 2022-06-09 16:55:24 -04:00
David Green
f8f50a4975 [AggressiveInstcombine] Add target tests for fptosi.sat fold. NFC 2022-06-09 21:47:05 +01:00