There is a discrepancy between how clangd processes a CDB loaded from
a JSON file on disk and one pushed via LSP, so the same CDB may not
work as expected when pushed via the LSP protocol. Some difference
between these two paths is expected, but we still need to insert the
driver mode and target inferred from the binary name, and to expand
response files.
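As a rough sketch, the normalization both paths should share looks
like this (helper name and details hypothetical, simplified from what
clangd actually does):

  #include <string>
  #include <vector>

  // Hypothetical sketch of the adjustments the LSP path was missing.
  std::vector<std::string> normalize(std::vector<std::string> Cmd) {
    if (Cmd.empty())
      return Cmd;
    // 1. Expand @file response-file arguments (clangd uses LLVM's
    //    response-file expansion for this; elided here).
    // 2. Infer the driver mode from the binary name: a "++" suffix
    //    such as "clang++" implies the C++ driver.
    const std::string &Driver = Cmd.front();
    if (Driver.size() >= 2 &&
        Driver.compare(Driver.size() - 2, 2, "++") == 0)
      Cmd.insert(Cmd.begin() + 1, "--driver-mode=g++");
    // 3. Infer the target from a "<triple>-clang" binary name, e.g.
    //    "arm-linux-gnueabi-clang" implies --target=arm-linux-gnueabi.
    return Cmd;
  }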
Test Plan: check-clang-tools
Differential Revision: https://reviews.llvm.org/D143436
We have a workaround from D128621 that makes $0 no longer a
placeholder, to conform to a VSCode feature. However, we have to
refine the logic, as it may suppress the last parameter placeholder
for a base-class constructor, because not all completion patterns are
compound statements.
This fixes clangd/clangd#1479
Reviewed By: nridge
Differential Revision: https://reviews.llvm.org/D145319
This fused operation should run a lot faster than first transposing the
lhs array and then multiplying the matrices separately.
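For intuition, here is a minimal sketch of the fused loop nest
(simplified; the real runtime works on descriptors and arbitrary
element types). Computing C = MATMUL(TRANSPOSE(A), B) as
C(i,j) = sum over l of A(l,i)*B(l,j) walks columns of both A and B,
which are contiguous in Fortran's column-major layout, so no
transposed temporary is needed:

  // Sketch: A is k x n, B is k x m, C is n x m, all column-major.
  void matmulTranspose(float *C, const float *A, const float *B,
                       int n, int m, int k) {
    for (int j = 0; j < m; ++j)
      for (int i = 0; i < n; ++i) {
        float Acc = 0.0f;
        // Column i of A and column j of B are both contiguous.
        for (int l = 0; l < k; ++l)
          Acc += A[i * k + l] * B[j * k + l];
        C[j * n + i] = Acc;
      }
  }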
Based on flang/runtime/matmul.cpp
Depends on D145959
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D145960
hlfir.matmul_transpose will be lowered to a new runtime call.
A canonicalizer was chosen over the alternatives:
- A new pass for rewriting chained intrinsics: this would add a lot
of unnecessary boilerplate.
- Including this in the HLFIR Intrinsic Lowering pass: I wanted to
keep these two concerns separate, rather than complicate the
intrinsic lowering pass with a second purpose.
With this change, the MLIR built-in canonicalization pass should be run
before the HLFIR Intrinsic Lowering pass.
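The pattern has roughly this shape (a sketch only; the class and
accessor names are approximations, not the exact upstream code):

  struct MatmulTransposeConversion
      : public mlir::OpRewritePattern<hlfir::MatmulOp> {
    using OpRewritePattern::OpRewritePattern;

    mlir::LogicalResult
    matchAndRewrite(hlfir::MatmulOp matmul,
                    mlir::PatternRewriter &rewriter) const override {
      // Only fire when the LHS is produced by hlfir.transpose.
      auto transpose =
          matmul.getLhs().getDefiningOp<hlfir::TransposeOp>();
      if (!transpose)
        return mlir::failure();
      // Replace the chained pair with the fused operation.
      rewriter.replaceOpWithNewOp<hlfir::MatmulTransposeOp>(
          matmul, matmul.getType(), transpose.getArray(),
          matmul.getRhs());
      return mlir::success();
    }
  };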
Depends on D145504, D145957
Reviewed By: clementval, vzakhari
Differential Revision: https://reviews.llvm.org/D145959
This operation will be used to transform MATMUL(TRANSPOSE(a), b). The
transformation will go in the following stages:
1. Lowering to hlfir.transpose and hlfir.matmul
2. Canonicalise to hlfir.matmul_transpose
3. hlfir.matmul_transpose will be lowered to FIR as a new runtime
library call
Step 2 (and this operation) are included for consistency with the other
hlfir intrinsic operations and to avoid mixing concerns in the intrinsic
lowering pass.
In step 3, a new runtime library call is used because this operation is
most easily implemented in one go (the transposed access pattern
actually makes the indexing simpler than in a normal matrix
multiplication). In
the long run, it is intended that HLFIR will allow the same buffer
to be shared between different runtime calls without temporary
allocations, but in this specific case we can do even better than that
with a dedicated implementation.
This should speed up galgel from SPEC2000 (but this hasn't been
tested yet). The optimization was implemented in Classic Flang.
Reviewed By: vzakhari
Differential Revision: https://reviews.llvm.org/D145957
This test checks that all of the parts of intrinsic lowering work
together,
and makes sure that we can pass the hlfir.expr result of one intrinsic
as an argument to another.
Depends on D145503
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D145504
This move is useful for a few reasons:
- It is easier to see what the intrinsic lowering is doing when the
operations it creates are not immediately lowered
- When lowering a HLFIR intrinsic generates an operation which needs
to be lowered by another pattern matcher in the same pass, MLIR will
run that other substitution before validating and finalizing the
original changes. This means that the erasure of operations is not
yet visible to subsequent matchers, which hugely complicates
transformations (in this case, hlfir.exprs cannot be rewritten
because they are still used by the now-erased HLFIR intrinsic op).
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D145502
Added a new target ompt mode that depends on libomptarget OMPT support.
Added tests that verify callbacks for target regions, kernel launch,
and data transfer operations. All of them should pass on amdgpu using
make check-libomptarget.
Reviewed By: jplehr
Differential Revision: https://reviews.llvm.org/D127372
Now that SCEV has a dedicated vscale node type, we should also map
vscale intrinsics to it. To make sure this does not regress ranges
(which were KnownBits based previously), add support for vscale to
getRangeRef() as well.
Differential Revision: https://reviews.llvm.org/D146226
Add support for vscale in computeConstantRange(), based on
vscale_range attributes. This allows simplifying based on the
precise range, rather than a KnownBits approximation (which will
be off by a factor of two for the usual case of a power of two
upper bound).
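For example (illustrative numbers): with vscale_range(1,8) the
precise range for llvm.vscale is [1,8], while KnownBits can only
record that the bits above bit 3 are zero, which corresponds to the
range [0,15].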
Differential Revision: https://reviews.llvm.org/D146217
The default and pre-link pipeline builders currently require you to
call a separate method for optimization level O0, even though they
have perfectly well-defined O0 optimization pipelines.
Accept O0 optimization level and call buildO0DefaultPipeline()
internally, so all consumers don't need to repeat this.
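A sketch of the consumer-side simplification this enables (the
surrounding consumer code is hypothetical; the PassBuilder entry
points are real):

  // Before: every consumer had to special-case O0 itself.
  if (Level == OptimizationLevel::O0)
    MPM = PB.buildO0DefaultPipeline(Level);
  else
    MPM = PB.buildPerModuleDefaultPipeline(Level);

  // After: the builder handles O0 internally.
  MPM = PB.buildPerModuleDefaultPipeline(Level);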
Differential Revision: https://reviews.llvm.org/D146200
When debugging and using debug-pass-manager (e.g. in regression tests)
we prefer a consistent order in which analysis passes are executed.
But when for example doing
return MyClass(AM.getResult<LoopAnalysis>(F),
AM.getResult<DominatorTreeAnalysis>(F));
then the order in which LoopAnalysis and DominatorTreeAnalysis run
isn't guaranteed, and might for example depend on which compiler is
used to build LLVM.
I've not scanned the full source tree, but this fixes some
occurrences of the above pattern found in lib/Analysis.
This problem was discussed briefly in review for D146206.
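The fix pattern is simply to sequence the getResult calls in
separate statements (MyClass is the placeholder from the example
above):

  // Separate statements guarantee evaluation order; function call
  // arguments are evaluated in an unspecified order in C++.
  auto &LI = AM.getResult<LoopAnalysis>(F);
  auto &DT = AM.getResult<DominatorTreeAnalysis>(F);
  return MyClass(LI, DT);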
The default invalidate method for analysis results only looks at the
preserved state of the pass itself. It does not consider whether the
analysis has internal state that depends on other analyses. Thus, we
need to implement LoopAccessInfoManager::invalidate in order to catch
when LoopAccessAnalysis needs to be invalidated because a transitive
analysis such as AAManager is being invalidated. Otherwise we might
end up holding references to a stale AAManager.
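A sketch of the shape of such an override (simplified, not the exact
upstream diff):

  bool LoopAccessInfoManager::invalidate(
      Function &F, const PreservedAnalyses &PA,
      FunctionAnalysisManager::Invalidator &Inv) {
    // Invalidate if this analysis itself is not preserved...
    auto PAC = PA.getChecker<LoopAccessAnalysis>();
    if (!(PAC.preserved() ||
          PAC.preservedSet<AllAnalysesOn<Function>>()))
      return true;
    // ...or if a transitive dependency we hold references into was
    // invalidated.
    return Inv.invalidate<AAManager>(F, PA) ||
           Inv.invalidate<ScalarEvolutionAnalysis>(F, PA);
  }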
Fixes https://github.com/llvm/llvm-project/issues/61324
Differential Revision: https://reviews.llvm.org/D146206
The MemoryDependenceAnalysis/invalidation.ll test case was using
; CHECK-NOT-AA-INVALIDATE:
but I think the intention was to use
; CHECK-AA-INVALIDATE-NOT:
Simply changing the checks like that would make the test fail.
The old statement that AA being stateless would result in nothing to
invalidate when doing invalidate<aa> is not true as far as I can
tell. It would be different when doing, for example,
invalidate<basic-aa>: then the AAManager isn't invalidated (and then
neither is memdep). But when the AAManager itself is invalidated, we
should expect to find both "Invalidating analysis: AAManager" and
"Invalidating analysis: MemoryDependenceAnalysis" in the output.
Differential Revision: https://reviews.llvm.org/D146205
GNU assembler mandates armv8.5-a for memtag instructions. Maybe
we should remove this restriction in GNU assembler, but let's work
around it for current GNU Binutils releases.
Differential Revision: https://reviews.llvm.org/D146109
As pointed out in D133835 these globals will never be written to
(they're only used for trivially copyable types), so they can always be
constant.
Differential revision: https://reviews.llvm.org/D146211
won't get generated again
As the test shows, we don't want to see the specialized function bodies
if they are already contained in the imported modules, so we can
save a lot of compile time.
This is done for consistency with other Predicate/SubtargetFeature
pairs, where the second parameter of the SubtargetFeature corresponds
to the NAME of the def of the Predicate associated with the
SubtargetFeature.
Differential Revision: https://reviews.llvm.org/D146129
As part of this patch, I added the ability to add leading zeros.
This is necessary so that the generated files are sorted in ascending order.
Reviewed By: yrouban
Differential Revision: https://reviews.llvm.org/D145484
MOVDIR64B is special in its memory operand: the 0x67 prefix modifies
more than just the address size, so the memory operand's base and
index registers must be the same width as the destination register.
For example, movdir64b (%rdx), %rcx is valid, but
movdir64b (%edx), %rcx is not.
Currently llvm-mc can encode the asm 'movdir64b (%edx), %rcx', but
the result is the same as for 'movdir64b (%edx), %ecx', which goes
against the user's intent, while GCC rejects this and gives a
warning.
I added 3 new memory operand descriptions to let llvm-mc report the
same error.
Reviewed By: skan, craig.topper
Differential Revision: https://reviews.llvm.org/D145893
Plugin that counts the number of times each tree node occurs in a given program. Used for test coverage.
> One thing we need...is a way to determine what features a code uses. Preferably we would also be able to determine if they are implemented or not. Just the former could be done with a visitor for the parse tree. For the latter we would continue compilation and somehow ignore todo errors but collect them - @jdoerfert
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D143704
When compiling compiler-rt with -fsanitize=undefined and running the
test cases, you end up with the following warning:
UBSan: floatdisf.c:27:15: signed integer overflow: 9223372036854775807 - -1 cannot be represented in type 'di_int' (aka 'long long')
This can be avoided by doing the subtraction in a matching unsigned
variant of the type, given that the overflow is the expected result
of the subtraction. The same kind of pattern exists in floatdidf.c.
This was found on an out-of-tree target.
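The pattern of the fix looks like this (a simplified sketch with a
hypothetical helper name, not the exact compiler-rt diff):

  typedef long long di_int;
  typedef unsigned long long du_int;

  // Unsigned wraparound is well defined, and the wrapped bit pattern
  // is exactly the intended two's-complement result.
  di_int wrappingSub(di_int A, di_int B) {
    return (di_int)((du_int)A - (du_int)B);
  }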
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D146135
The function getArithmeticReductionCost receives a pointer and
dereferences it without checking it. It is called many times in
getTypeBasedIntrinsicInstrCost, and the pointer passed to it is
initialized at line 1709.
From the code, we cannot ensure that the pointer VecOpTy is
initialized when Tys is empty or Tys[VecTyIndex] is not a VectorType,
so getArithmeticReductionCost would invoke undefined behavior.
I added an assert to it and found that the pointers passed to it in
the LLVM tests are never null, but I think the check is still
meaningful for us.
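The guard is essentially of this shape (a sketch with stand-in
types, not the exact diff):

  #include <cassert>

  struct VectorType; // stand-in for llvm::VectorType in this sketch

  unsigned getArithmeticReductionCost(unsigned Opcode,
                                      VectorType *VecOpTy) {
    // Fail loudly on a null vector type instead of dereferencing it.
    assert(VecOpTy && "reduction cost queries require a vector type");
    // ... cost computation elided ...
    return 0;
  }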
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D146118
The feature macro '__cpp_coroutines' is for the Coroutines TS, and
the Coroutines TS is deprecated, so we should remove the feature
macro too.
BTW, the corresponding feature macro for standard c++ coroutines is
'__cpp_impl_coroutine'.
Remove use of constexpr if that failed on the build bots.
Original commit message:
It's possible for `getCalleeDecl()` to return a null pointer.
This was encountered by a user of our downstream compiler.
The case involved a DependentScopeDeclRefExpr.
Since this seems to only be for a warning diagnostic, I skipped
the diagnostic check if it returned null. But maybe there's a
different way to fix this.
In the function getVectorInstrCost, the cases Opcode ==
Instruction::ExtractElement and Opcode == Instruction::InsertElement
are both handled by the first 2 if-statements, so the code at line
4401 is unreachable.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D145908
Add a new compute-known-bits like function to compute all
the interesting floating point properties at once.
Eventually this should absorb all the various floating point
queries we already have.
It's possible for `getCalleeDecl()` to return a null pointer.
This was encountered by a user of our downstream compiler.
The case involved a DependentScopeDeclRefExpr.
Since this seems to only be for a warning diagnostic, I skipped
the diagnostic check if it returned null. But maybe there's a
different way to fix this.
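The fix is essentially a null check before the diagnostic (a sketch;
CE and the diagnostic helper are placeholders):

  // getCalleeDecl() can legitimately return null, e.g. when the
  // callee is a DependentScopeDeclRefExpr, so skip the warning then.
  if (const Decl *Callee = CE->getCalleeDecl())
    diagnoseCallee(Callee); // hypothetical diagnostic helper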
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D146089
This mostly keeps the same warning flags. The most important exceptions are `-Wpedantic` and `-Wconversion`, which are now removed from libc++abi and libunwind.
Reviewed By: ldionne, #libunwind, #libc, #libc_abi
Spies: mikhail.ramalho, phosek, libcxx-commits
Differential Revision: https://reviews.llvm.org/D144252
Change the internal storage scheme from storing a MutableArrayRef to
storing an explicit offset+length pair. Storing an ArrayRef is dangerous
because it contains the pointer to the first element in the range, but
the entire storage vector may be reallocated, making the pointer
dangling. We don't know when the reallocation happens, so we can't
update the ArrayRefs. Store the explicit offset instead and construct
ArrayRefs on the fly.
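A sketch of the new scheme (names hypothetical):

  #include "llvm/ADT/ArrayRef.h"
  #include <cstddef>
  #include <vector>

  // Store indices into the backing vector instead of a
  // MutableArrayRef that can dangle after reallocation.
  struct StoredRange {
    size_t Offset;
    size_t Length;
  };

  struct RangeStorage {
    std::vector<int> Buffer;

    // Build the ArrayRef on demand from the vector's *current* data
    // pointer, so a reallocation between calls is harmless.
    llvm::MutableArrayRef<int> get(StoredRange R) {
      return llvm::MutableArrayRef<int>(Buffer.data() + R.Offset,
                                        R.Length);
    }
  };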
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D146239
Clang infers framework autolink hints when parsing a modulemap. In order to do
so, it checks if the module is a framework and if there is a framework binary
or TBD file in the SDK. Only when Clang finds the file is the
autolink hint added to the module metadata.
During a project build many clang processes perform this check, which causes
many stat calls - even for modules/frameworks that are not even used.
The linker is already resilient to non-existing framework links that come from
the autolink metadata, so there is no need for Clang to do this check.
Instead the autolink hints are now added unconditionally and the linker only
needs to do the check once. This reduces the overall number of stat calls.
This fixes rdar://106578342.
Differential Revision: https://reviews.llvm.org/D146255
MemoryCache::Read is not resilient to partial reads when reading memory
chunks less than or equal in size to L2 cache lines. There have been
attempts in the past to fix this but nothing really solved the root of
the issue.
I first created a test exercising MemoryCache's implementation and
documenting how I believe MemoryCache::Read should behave. I then
rewrote the implementation of MemoryCache::Read as needed to make sure
that the different scenarios behaved correctly.
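The behavior the rewrite needs to guarantee can be summed up in a
small sketch (a hypothetical helper, not LLDB's actual code): keep
reading until the buffer is full or the target returns no more bytes,
and return what was read rather than failing outright.

  #include <cstddef>
  #include <cstdint>
  #include <functional>

  using ReadFn =
      std::function<size_t(uint64_t Addr, void *Buf, size_t Len)>;

  size_t readResilient(const ReadFn &Read, uint64_t Addr,
                       uint8_t *Buf, size_t Len) {
    size_t Total = 0;
    while (Total < Len) {
      size_t N = Read(Addr + Total, Buf + Total, Len - Total);
      if (N == 0)
        break; // partial read: stop at unreadable memory
      Total += N;
    }
    return Total;
  }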
rdar://105407095
Differential Revision: https://reviews.llvm.org/D145624