Summary:
Some older compilers, which we still support, have problems handling the
copy elision that allows us to directly move an `Error` to an
`Expected`. This patch adds explicit moves to remove the error.
D142084 moved an enumeration inside a header from the llvm namespace
into an anon namespace. Some of the bots started failing as a result.
Differential Revision: https://reviews.llvm.org/D146419
This review implements the following PowerPC math operations that we care about:
- fnabs
- fre
- fres
- frsqrte
- frsqrtes
None of these intrinsics require additional error checks in semantics. The interfaces handle checking types and kinds
Reviewed By: kkwli0
Differential Revision: https://reviews.llvm.org/D146139
Previous:
When we do not make decisions about commutative operands, we can end up in a situation where two values have two potential canonical numbers between two regions. This ensures that an ordering is decided after the initial structure between two regions is determined.
Current:
Previously the outliner only checked that assignment to a value matched what was already known, this patch makes sure that it matches what has already been found, and creates a mapping between the two values where it is a one-to-one mapping.
Reviewer: paquette
Differential Revision: https://reviews.llvm.org/D139336
IRUnit is defined as:
using IRUnit = PointerUnion<Operation *, Region *, Block *, Value>;
The tracing::Action is extended to take an ArrayRef<IRUnit> as context to
describe an Action. It is demonstrated in the "ActionLogging" observer.
Reviewed By: rriddle, Mogball
Differential Revision: https://reviews.llvm.org/D144814
This is a convenient flag for context where we intend to summarize a top-level
operation without the full-blown regions it may hold.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D145889
Integrate the `tracing::ExecutionContext()` into mlir-opt with a new
--log-action-to=<file> option to demonstrate the feature.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D144813
c59465e120 introduced mapping to warps and
linear GPU ids.
In the implementation, the delinearization basis is reversed from [x, y, z]
to [z, y x] order to properly compute the strides and allow delinearization.
Prior to this commit, we forgot to reverse it back to [x, y, z] order
before materializing the indices.
Fix this oversight.
By adding signal codes to UnixSignals and adding a new function
where you can get a string with optional address and bounds.
Added signal codes to the Linux, FreeBSD and NetBSD signal sets.
I've checked the numbers against the relevant sources.
Each signal code has a code number, description and printing options.
By default you just get the descripton, you can opt into adding either
a fault address or bounds information.
Bounds signals we'll use the description, unless we have the bounds
values in which case we say whether it is an upper or lower bound
issue.
GetCrashReasonString remains in CrashReason because we need it to
be compiled only for platforms with siginfo_t. Ideally it would
move into NativeProcessProtocol, but that is also used
by NativeRegisterContextWindows, where there would be no siginfo_t.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D146044
applyLoopGuards doesn't always preserve information when there are multiple assumes.
This patch tries to deal with multiple assumes regarding a SCEV's divisibility and min/max values, and rewrite it into a SCEV that still preserves all of the information.
For example, let the trip count of the loop be TC. Consider the 3 following assumes:
1. __builtin_assume(TC % 8 == 0);
2. __builtin_assume(TC > 0);
3. __builtin_assume(TC < 100);
Before this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 99), 1)
Looking at this SCEV, it doesn't preserve the divisibility by 8 information.
After this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 96), 8)
By aligning up 1 to 8, and aligning down 99 to 96, the new SCEV still preserves all of the original assumes.
Differential Revision: https://reviews.llvm.org/D144947
The token annotator doesn't annotate the template opener and closer
as such if they enclose an overloaded operator. This causes the
space between the operator and the closer to be removed, resulting
in invalid C++ code.
Fixes#58602.
Differential Revision: https://reviews.llvm.org/D143755
Add a clang part of OpenHarmony target
Related LLVM part: D138202
~~~
Huawei RRI, OS Lab
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D145227
Update lowering of allocate statement to use the new
functions defined in D146290.
Depends on D146290
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D146291
`AllocatableInitIntrinsic`, `AllocatableInitCharacter` and
`AllocatableInitDerived` are meant to be used to initialize a
descriptor when it is instantiated and not to be used multiple
times in a scope.
Add `AllocatableInitDerivedForAllocate`, `AllocatableInitCharacterForAllocate`
and `AllocatableInitDerivedForAllocate` to be used for the allocation
in allocate statement.
These new functions are meant to be used on an initialized descriptor
and will return directly if the descriptor is allocated so the
error handling is done by the call to `AllocatableAllocate`.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D146290
This revisions refactors the implementation of mapping to threads to additionally allow warps and linear ids to be specified.
`warp_dims` is currently specified along with `block_dims` as a transform attribute.
Linear ids on th other hand use the flattened block_dims to predicate on the first (linearized) k threads.
An additional GPULinearIdMappingAttr is added to the GPU dialect to allow specifying loops mapped to this new scheme.
Various implementation and transform op semantics cleanups are also applied.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D146130
1. Align ManualMapSet with X86MemoryFoldTableEntry instead of using UnfoldStrategy
2. ManualMapSet able to update the existing record in auto-generated MemFold table
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D142084
There is a piece of logic in GuardWidening which is very limited, and it happens
to ignore implicit control flow, therefore it "works fine" with guards expressed as
intrinsic calls. However, when the guards are represented as branches, its limitations
create a lot of trouble.
The intent here is to make sure that we do not widen code across complex CFG,
so that it can end up being in hotter code than it used to be. The old logic was limited
to unique immediate successor and that's it.
This patch changes the logic there to work the following way: when we need to check
if we can widen from `DominatedBlock` into `DominatingBlock`, we first try to find the
lowest (by CFG) transitive successor of `DominatingBlock` which is guaranteed to not
be significantly colder than the `DominatingBlock`. It means that every time we move
to either:
- Unique successor of the current block, if it only has one successor;
- The only taken successor, if the current block ends with `br(const)`;
- If one of successors ends with deopt, another one is taken;
If the lowest block we can find this way is the `DominatedBlock`, then it is safe to assume
that this widening won't move the code into a hotter location.
I did not touch the existing check with PDT. It looks fishy to me (post-dominance doesn't
really guarantee anything about hotness), but let's keep it as is and maybe fix later.
With this patch, Guard Widening can widen explicitly expressed branches across more than one
dominating guard if it's profitable.
Differential Revision: https://reviews.llvm.org/D146276
Reviewed By: skatkov