Today the JSON uses `Value` and `RawValue` when printing `Flags`, when really
the `Value` field is always the name of an Enum variant, and `RawValue` is its
underlying numeric value. Similarly, we rename the `RawFlags` key to `Value`,
to match the new scheme. This also allows JSON parsing to use consistent logic
for `Flag` types.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D137091
The existing JSON incorrectly outputs line breaks and other invalid JSON.
Example Before this patch:
```
...
"Relocations":[Section (9) .rela.dyn {
0xA3B0 R_X86_64_RELATIVE - 0x43D0
0xA3B8 R_X86_64_RELATIVE - 0x4A30
...
```
Example After this patch:
```
...
"Relocations":[Section (9) .rela.dyn {
{"Relocation":{"Offset":41904,"Type":{"Value":"R_X86_64_RELATIVE","RawValue":8},
"Symbol":{"Value":"","RawValue":0},"Addend":17360}},
{"Relocation":{"Offset":41912,"Type":{"Value":"R_X86_64_RELATIVE","RawValue":8},
"Symbol":{"Value":"","RawValue":0},"Addend":18992}},
{"Relocation":{"Offset":41920,"Type":{"Value":"R_X86_64_RELATIVE","RawValue":8},
"Symbol":{"Value":"","RawValue":0},"Addend":17440}},
...
```
Note there are still issues with the Section, but each Relocation is
now a valid JSON object that can be parsed. Future patches will address
the issues regarding JSON output for the Section.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D137089
Today, the LLVM output uses special handling when the Other field is 0.
This output makes sense for a command line utility that a human will
read, but JSON is a machine readable format, so being consistent is more
important. Prior to this change, any consumer of the JSON output would
need to handle the Other field specially, since the structure of the
JSON would no longer be consistent.
Changes to JSON output when Other flag == 0:
```
"Other": 0, -> "Other": {
"RawFlags": 0,
"Flags": []
},
```
There are no changes to when Other flag != 0:
```
"Other": { -> "Other": {
"RawFlags": 1, "RawFlags": 1,
"Flags": [ "Flags": [
... ...
] ]
}, },
```
This patch adds a overload for the JSONELFDumper's printSymbol() method,
that uses consistent output formatting, regardless of the value of the
Other field.
Depends on D137092
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D137088
Since all ELFDumper implementations will require the same logic when
dealing with Other Flags, we move the logic into a helper so that it can
be easily reused across implementations.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D137092
There were two major problems with the tests.
First, with the pointer size being 32 bit and the original IVs also being 32 bit, almost all of the positive tests were actually unsound. An upcoming change will add the appropriate safety check, but the test diffs are really hard to understand without switching the tests to 64 bit pointers first.
Second, checking debug messages for failures is a major bad practice. This should not have been accepted in review at all. The reason is that it makes the *order* of legality checks visibile and modifying any of them becomes annoying and tedious.
We're not checking the attributes themselves, so hardcoded attribute numbers
only make the tests more fragile, without improving the testing.
Differential Revision: https://reviews.llvm.org/D146334
We pretty consistently don't define those cause they are not needed,
and it removes the potential pitfall to think that these tests are
being run. This doesn't touch .compile.fail.cpp tests since those
should be replaced by .verify.cpp tests anyway, and there would be
a lot to fix up.
As a fly-by, I also fixed a bit of formatting, removed a few unused
includes and made some very minor, clearly NFC refactorings such as
in allocator.traits/allocator.traits.members/allocate.verify.cpp where
the old test basically made no sense the way it was written.
Differential Revision: https://reviews.llvm.org/D146236
On Android, mallinfo2 is an alias of mallinfo, which results
in errors if we try to define both.
Differential Revision: https://reviews.llvm.org/D146324
SerializeToCubin depends on CUDA at *runtime* which is undesirable for MLIR's
general use case, as compilation should be doable on any host, regardless of
whether it has a GPU.
SerializeToCubin is needed to run some GPU tests, so when we build mlir-opt,
SerializeToCubin pass is linked in directly into mlir-opt.
Differential Revision: https://reviews.llvm.org/D146330
If an inlined kernel is called in a loop, the launch point alloca would
lead to increasing stack usage every time the kernel is invoked. This
could make the application run out of stack space and crash. This problem
is fixed by using the alloca insertion point while creating the alloca instruction.
Fixes https://github.com/llvm/llvm-project/issues/60602
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D145820
Since P0857, part of C++20, a *lambda-expression* can contain a
*requires-clause* after its *template-parameter-list*.
While support for this was added as part of
eccc734a69, one specific case isn't
handled properly, where the *requires-clause* consists of an
instantiation of a boolean variable template. This is due to a
diagnostic check which was written with the assumption that a
*requires-clause* can never be followed by a left parenthesis. This
assumption no longer holds for lambdas.
This diagnostic check would then attempt to perform a "recovery", but it
does so in a valid parse state, resulting in an invalid parse state
instead!
This patch adds a special case when parsing requires clauses of lambda
templates, to skip this diagnostic check.
Fixes https://github.com/llvm/llvm-project/issues/61278
Fixes https://github.com/llvm/llvm-project/issues/61387
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D146140
The original MemDeallocPolicy had two options:
* Standard: allocated memory lives until deallocated or abandoned.
* Finalize: allocated memory lives until all finalize actions have been run,
then is destroyed.
This patch introduces a new 'NoAlloc' option. NoAlloc indicates that the
section should be ignored by the JITLinkMemoryManager -- the memory manager
should allocate neither working memory nor executor address space to blocks in
NoAlloc sections. The NoAlloc option is intended to support metadata sections
(e.g. debug info) that we want to keep in the graph and have fixed up if
necessary, but don't want allocated or transmitted to the executor (or we want
that allocation and transmission to be managed manually by plugins).
Since NoAlloc blocks are ignored by the JITLinkMemoryManager they will not have
working memory allocated to them by default post-allocation. Clients wishing to
modify the content of a block in a NoAlloc section should call
`Block::getMutableMemory(LinkGraph&)` to get writable memory allocated on the
LinkGraph's allocator (this memory will exist for the lifetime of the graph).
If no client requests mutable memory prior to the fixup phase then the generic
link algorithm will do so when it encounters the first edge in any given block.
Addresses of blocks in NoAlloc sections are initialized by the LinkGraph
creator (a LinkGraphBuilder, if the graph is generated from an object file),
and should not be modified by the JITLinkMemoryManager. Plugins are responsible
for updating addresses if they add/remove content from these sections. The
meaning of addresses in NoAlloc-sections is backend/plugin defined, but for
fixup purposes they will be treated the same as addresses in Standard/Finalize
sections. References from Standard/Finalize sections to NoAlloc sections are
expected to be common (these represent metadata tracking executor addresses).
References from NoAlloc sections to Standard/Finalize sections are expected to
be rare/non-existent (they would represent JIT'd code / data tracking metadata
in the controller, which would be surprising). LinkGraphBuilders and specific
backends may impose additional constraints on edges between Standard/Finalize
and NoAlloc sections where required for correctness.
Differential Revision: https://reviews.llvm.org/D146183
This can prevent unnecessarily hoisting out of loops.
Test case cribbed from AArch64.
I also intend to make them rematerializable.
Differential Revision: https://reviews.llvm.org/D146314
Currently compiler does not support mixing of shuffled nodes
+ gather/buildvector of the remaining scalar values. It may reduce total
number of instructions and improve performance of the
gather/buildvector sequences.
Part of D110978
Differential Revision: https://reviews.llvm.org/D146167
Summary:
Fixes the lack of a dependency after changing the order of some
includes. Also we weren't running any tests as the GPU was always
disabling them. Fix the logic.
Only override InOperandList when the Real instruction needs a different
type for $offset than the Pseudo.
Differential Revision: https://reviews.llvm.org/D146313
This patch enables integration tests running on the GPU. This uses the
RPC interface implemented in D145913 to compile the necessary
dependencies for the integration test object. We can then use this to
compile the objects for the GPU directly and execute them using the AMD
HSA loader combined with its RPC server. For example, the compiler is
performing the following actions to execute the integration tests.
```
$ clang++ --target=amdgcn-amd-amdhsa -mcpu=gfx1030 -nostdlib -flto -ffreestanding \
crt1.o io.o quick_exit.o test.o rpc_client.o args_test.o -o image
$ ./amdhsa_loader image 1 2 5
args_test.cpp:24: Expected 'my_streq(argv[3], "3")' to be true, but is false
```
This currently only works with a single threaded client implementation
running on AMDGPU. Further work will implement multiple clients for AMD
and the ability to run on NVPTX as well.
Depends on D145913
Reviewed By: sivachandra, JonChesterfield
Differential Revision: https://reviews.llvm.org/D146256
This patch adds initial support for an RPC client / server architecture.
The GPU is unable to perform several system utilities on its own, so in
order to implement features like printing or memory allocation we need
to be able to communicate with the executing process. This is done via a
buffer of "sharable" memory. That is, a buffer with a unified pointer
that both the client and server can use to communicate.
The implementation here is based off of Jon Chesterfields minimal RPC
example in his work. We use an `inbox` and `outbox` to communicate
between if there is an RPC request and to signify when work is done.
We use a fixed-size buffer for the communication channel. This is fixed
size so that we can ensure that there is enough space for all
compute-units on the GPU to issue work to any of the ports. Right now
the implementation is single threaded so there is only a single buffer
that is not shared.
This implementation still has several features missing to be complete.
Such as multi-threaded support and asynchrnonous calls.
Depends on D145912
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D145913
Currently the conversion of elementwise ops only checks for scalar index
types when checking for bitwidth emulation.
Differential Revision: https://reviews.llvm.org/D146307
This is the funcref counterpart to 890146b. We introduce a new attribute
that marks a function pointer as a funcref. It also implements builtin
__builtin_wasm_ref_null_func(), that returns a null funcref value.
Differential Revision: https://reviews.llvm.org/D128440
In Shell tests, replace use of the `p` alias with the `expression` command.
To avoid conflating tests of the alias with tests of the expression command,
this patch canonicalizes to the use `expression`.
See also D141539 which made the same change to API tests.
Differential Revision: https://reviews.llvm.org/D146230
LLDB uses `_up`, `_sp` and `_wp` suffixes for unique, shared and weak
pointers respectively. This can become confusing in combination with
watchpoints which are commonly abbreviated to `wp`. Update
CommandObjectWatchpoint to use `watch_sp` for all `WatchpointSP`
variables.
When interfacing with C code, assumed type should be passed as
basic pointer.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D146300
When setting a variable watchpoint, the watchpoint stores the variable
name in the watchpoint spec. For expression variables we should store
the expression in the watchpoint spec. This patch adds that
functionality.
rdar://106096860
Differential revision: https://reviews.llvm.org/D146262
When offloading is mandatory we can emit a more helpful message if we
did not find any compatible images with the user's system.
Fixes#60221
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D142369
Update isVectorizedMemAccessUse to also check if the pointer is stored.
This prevents LV to incorrectly consider a pointer as uniform if it is
used as both pointer and stored by the same StoreInst.
Fixes#61396.
This reverts commit 2c78a9e65c.
Fails to compile on mlir-s390x-linux buildbot using GCC 9.4 with:
llvm/lib/Analysis/AliasSetTracker.cpp: In member function 'void llvm::AliasSet::mergeSetIn(llvm::AliasSet&, llvm::AliasSetTracker&, llvm::BatchAAResults&)':
llvm/lib/Analysis/AliasSetTracker.cpp:50:19: error: invalid operands of types 'unsigned char:2' and 'unsigned char:2' to binary 'operator|'
The 'mma.sp.sync.aligned' family of instructions expects
the sparsity selector as a direct literal (0x0 or 0x1).
The current MLIR inline asm passed this as a value in
register, which broke the downstream assemblers
This is a small step towards supporting 2:4 sparsity on
NVidia GPUs in the sparse compiler of MLIR.
Reviewed By: ThomasRaoux, guraypp
Differential Revision: https://reviews.llvm.org/D146110