We cannot directly use the original result type; instead we need
to deduce it from the converted operand type. This addresses
invalid ops generated from converting single element vectors.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D127574
This avoids pulling in function conversion patterns, which are not
part of what we want to test in ArithmeticToSPIRV. It also allows
using ConvertArithmeticToSPIRVPass as a standalone step.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D127573
Add TODO for KIND=2 so the user is notified correctly.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D127619
Co-authored-by: Peter Steinfeld <psteinfeld@nvidia.com>
Without SSE41, ANY_EXTEND_VECTOR_INREG nodes are likely to be prematurely combined to a target shuffle preventing generic sign extension folds.
Fixes a number of sign-extend regressions in D127115.
This uses a rotating remainder of division by 3 to select another
temp vgpr each time in a sequence of several agpr copies.
Therefore, temp vgpr selection depends on the generated agpr
number. This number could change with any unrelated change to
the register definitions.
Stabilize the selection by using a real agpr number.
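A minimal sketch of the rotation idea, not the actual AMDGPU code. The
function name `pick_temp_vgpr` and the pool of three temporaries are
hypothetical; the point is only that keying the remainder on the real
agpr number keeps the choice independent of how register definitions
happen to be enumerated.

```python
# Hypothetical illustration; the names and the pool are assumptions.
TEMP_VGPRS = ("tmp0", "tmp1", "tmp2")


def pick_temp_vgpr(agpr_number: int) -> str:
    """Rotate through the temporaries using the remainder of division by 3."""
    return TEMP_VGPRS[agpr_number % 3]


# Consecutive agpr copies land on different temporaries.
choices = [pick_temp_vgpr(n) for n in range(4, 8)]
```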
Differential Revision: https://reviews.llvm.org/D127524
Add patterns to propagate vector distribution and remove dead
arguments. This handles propagation for several vector operations.
Differential Revision: https://reviews.llvm.org/D127167
Loop variables of a worksharing loop and sequential loops in a parallel
region are privatised by default. These variables are marked with
OmpPreDetermined. Skip explicit privatisation of these variables.
Note: This is part of upstreaming from the fir-dev branch of
https://github.com/flang-compiler/f18-llvm-project.
Reviewed By: Leporacanthicus
Differential Revision: https://reviews.llvm.org/D127249
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Mats Petersson <mats.petersson@arm.com>
All supported compilers have concepts support so use that in the C++20
functions in <bit>.
s/_LIBCPP_INLINE_VISIBILITY/_LIBCPP_HIDE_FROM_ABI/ as drive-by fix.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D127594
If we defer the mutation of the instruction, we can add the assert discussed in D126921. Once we do that, the API becomes subject to revision - but let's do that in a separate change.
This simplifies the isel code by removing the manual load creation.
It also improves our ability to use 0 strided loads for vector splats.
There is an assumption here that Mask and ShiftedMask constants are
cheap enough that they don't become constant pool loads so that our
isel optimizations involving And still work. I believe those constants
are 3 instructions in the worst case.
The rv64zbp-intrinsic.ll changes are a regression caused by intrinsics
being expanded to RISCVISD also occurring during lowering. So the optimizations
were only happening during the last DAGCombine, which can't see through the
load. I believe we can fix this test by implementing
TargetLowering::getTargetConstantFromLoad for RISC-V or by adding the intrinsic
to computeKnownBitsForTargetNode to enable earlier DAG combine. Since Zbp is not
a ratified extension, I don't view these as blocking this patch.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D127520
This removes all "TODO: remove these headers" comments from our headers.
Note there seem to be more headers that can be removed, that will be
done in separate commits.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D127592
The compilers clang-11, clang-12, and apple-clang-12 are no longer
supported, so remove their annotations in the tests.
Reviewed By: #libc, philnik
Differential Revision: https://reviews.llvm.org/D127588
Instead of trying to be clever and design our own locking primitive,
simply rely on the OS-provided implementation to do the right thing.
Indeed, manually yielding to the OS does not provide the necessary
information for it to make good prioritization decisions. For example,
if a thread with higher priority yields while waiting for a lock held
by a thread with lower priority but the system is contended, it is
possible for the thread with lower priority to not run until the higher
priority thread has yielded 16 times and goes for __libcpp_mutex_lock().
Once that happens, the OS can bump the priority of the thread that
currently holds the lock to unblock everyone. So instead, we might as
well give the system all the information from the start so it can make
appropriate decisions.
As a fly-by change, also increase the number of locks in the table.
The size increase is modest, but has the potential to halve the amount
of contention on those locks.
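As a rough illustration of the approach described above (not the libc++
implementation), the sketch below maps an address to one lock in a table
and blocks on it directly instead of spinning and yielding first. The
table size, the hashing scheme, and the names `lock_for` and
`atomic_update` are all assumptions made for this example.

```python
import threading

# Assumption: the table size and the hash are illustrative only.
LOCK_COUNT = 32
_locks = [threading.Lock() for _ in range(LOCK_COUNT)]


def lock_for(address: int) -> threading.Lock:
    """Map an object address to one of the locks in the table."""
    return _locks[(address >> 4) % LOCK_COUNT]


def atomic_update(address: int, storage: dict, fn) -> None:
    """Apply fn to the value stored at address while holding its lock."""
    # Block on the OS-backed lock right away; no manual yield loop.
    with lock_for(address):
        storage[address] = fn(storage.get(address, 0))
```

With more entries in the table, fewer unrelated addresses share a lock,
which is where the reduced contention would come from.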
rdar://93598606
Differential Revision: https://reviews.llvm.org/D126882
Add TODO for half-precision for reduction.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D127622
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
The pass was raising TODOs when a function had both a fir.boxproc<> argument
and a fir.type<> argument (even if the fir.type<> did not contain a
fir.boxproc itself).
Prevent the TODO from firing when a fir.type<> does not actually contain
a fir.boxproc. Add the location for the remaining TODO (it will be
needed when procedure pointer components are supported in lowering).
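The containment check can be pictured with the sketch below. It is a
toy model, not the FIR pass: `BoxProc`, `Record`, and `contains_boxproc`
are hypothetical stand-ins for fir.boxproc, fir.type<>, and the check
that decides whether the TODO should fire.

```python
from dataclasses import dataclass, field


@dataclass
class BoxProc:
    """Stand-in for a fir.boxproc<> type (assumption for illustration)."""


@dataclass
class Record:
    """Stand-in for a fir.type<> with named identity and components."""
    name: str
    components: list = field(default_factory=list)


def contains_boxproc(ty, seen=None) -> bool:
    """Return True only if ty is, or transitively contains, a BoxProc."""
    seen = set() if seen is None else seen
    if isinstance(ty, BoxProc):
        return True
    if isinstance(ty, Record):
        if ty.name in seen:  # guard against recursive record types
            return False
        seen.add(ty.name)
        return any(contains_boxproc(c, seen) for c in ty.components)
    return False
```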
FYI, I actually tried to just implement the TODO, but there is a funny
issue. When creating the new fir::RecordType, since the name and context
are the same as the type being translated, fir::RecordType::get just
returns the existing type, and there is no way to change it (finalize()
does nothing since it is already finalized). So this will require adding
the ability to mutate the existing type, and I am not sure what the
MLIR constraints are here, so I escaped and left the TODO for that case.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D127633
Co-authored-by: Jean Perier <jperier@nvidia.com>
The ALLOCATE statement allows reversed bounds (see Fortran 2018 9.7.1.2
point 1), in which case the extents are zero.
The same applies to the character length provided in the type spec, which
can be negative, in which case the new length is zero.
Use genMaxWithZero to deal with these cases.
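The clamping can be illustrated with a small sketch. The Python names
`max_with_zero` and `extent` are hypothetical and only mirror the idea
behind genMaxWithZero; they are not the lowering code.

```python
def max_with_zero(value: int) -> int:
    """Clamp a possibly negative value to zero."""
    return max(value, 0)


def extent(lower: int, upper: int) -> int:
    """Extent of one dimension; reversed bounds give a zero extent."""
    return max_with_zero(upper - lower + 1)
```

The same clamp would apply to a negative character length.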
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D127617
Co-authored-by: Jean Perier <jperier@nvidia.com>
We use the flags `--offload-host-only` and `--offload-device-only` to
change the driver's code generation for offloading programs. These are
currently parsed out independently in many places. This patch simply
refactors this to work as a mode for the Driver. This stops us from
emitting warnings when the flags are unused, because they are now always
used, but I don't think this is a great loss.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D127515
system_clock intrinsic calls with dynamically optional arguments
Modify intrinsic system_clock calls to allow for an argument that is optional
or a disassociated pointer or an unallocated allocatable. A call with such an
argument is the same as a call that does not specify that argument.
Rename (genIsNotNull -> genIsNotNullAddr) and (genIsNull -> genIsNullAddr)
and add a use of genIsNotNullAddr.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D127616
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
This reverts commit 340654e0f2, essentially reapplying 1d3ba05e4a.
The test VFS/real-path-found-first.m that was failing on Windows is now passing with a workaround.
The last use of getEvictor was removed on Jun 5, 2022 in commit
5c06f7168f, which was itself a patch to remove unused code.
Once we remove getEvictor, EvictionTrack becomes a write-only data
structure. The data in it won't affect compilation, so the entire
class is essentially dead.
A test added in https://reviews.llvm.org/D127207 is missing
target/triple. This has caused the PowerPC buildbot to start failing:
* https://lab.llvm.org/buildbot/#/builders/21/builds/42860
(on PowerPC `; CHECK: ret` should be replaced with `; CHECK: blr`).
Sending this without a review as the fix is rather straightforward. Note
that I've decided to add triple/target instead of e.g. removing:
`; CHECK: ret`. That's for consistency with other tests that generate
assembly. We could change that if that's what folks prefer.
This shows narrowing improvements on the logic tests
(transforms recently added with e247b0e5c9).
This is not a complete fix. That would require adding
folds to visitOr/visitXor. But it enables the expected
transforms for the basic patterns in the affected tests.
This patch adds an optional argument to DexExpectWatchBase, float_range,
which defines a +- acceptance range for expected floating point values.
If passed, this assumes every expected value to be a floating point
value, and an exception will be thrown if this is not the case.
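A minimal sketch of the acceptance rule, assuming values arrive as
strings. The helper `matches_expected` is hypothetical and is not
Dexter's code; treating a non-numeric actual value as a mismatch is
also an assumption of this sketch.

```python
def matches_expected(actual, expected_values, float_range=None):
    """Check an observed value against the expected ones."""
    if float_range is None:
        # Without a range, fall back to exact comparison.
        return actual in expected_values
    # Raises ValueError if any expected value is not a float.
    expected_floats = [float(v) for v in expected_values]
    try:
        actual_float = float(actual)
    except (TypeError, ValueError):
        return False  # assumption: a non-numeric observation never matches
    return any(abs(actual_float - e) <= float_range for e in expected_floats)
```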
Differential Revision: https://reviews.llvm.org/D124511
This patch adds code so that, using bbc, we can see an end-to-end lowering of the simd construct in action.
Reviewed By: kiranchandramohan, peixin, shraiysh
Differential Revision: https://reviews.llvm.org/D125282