Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] main from llvm:main #400

Merged
merged 2,708 commits into from
Jan 15, 2022
Merged

[pull] main from llvm:main #400

merged 2,708 commits into from
Jan 15, 2022

Conversation

pull[bot]
Copy link

@pull pull bot commented Dec 30, 2021

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

@pull pull bot added the ⤵️ pull label Dec 30, 2021
aralisza and others added 29 commits January 13, 2022 10:31
…ed polymorphic allocatable component

A bogus error message is appearing for structure constructors containing
values that correspond to unlimited polymorphic allocatable components.
A value of any type can actually be used.

Differential Revision: https://reviews.llvm.org/D117154
Add threading support for exhaustive testing and MPFRUtils.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D117028
Stop allowing use of `SmallVectorBase::set_size()` outside of the
SmallVector implementation, which sets the size without calling
constructors or destructors.

Most callers should probably just use `resize()`. Or, if the new size is
guaranteed to be `<= size()`, then the new-ish `truncate()` works too
(and optimizes better).

Some callers want to avoid initializing memory before overwriting, but
need a pointer to the memory and so cannot use `push_back()`,
`emplace_back()`, or `append()`. Before this commit, this depended on
`reserve()` and `set_size()`:

```
V.reserve(V.size() + NumNew);      // Reserve expected size.
NumNew = initialize(V.end(), ...); // Get number added.
V.set_size(V.size() + NumNew);     // Set size to match.
```

Such code should be updated to use `resize_for_overwrite()` and
`truncate()`:

```
auto Size = V.size();                        // Save initial size.
V.resize_for_overwrite(Size + NumNew);       // Resize to expected size.
NumNew = initialize(V.begin() + Size, ...)); // Get number added.
V.truncate(Size + NumNew);                   // Truncate to match.
```

The new pattern is safe even for non-trivial types, since
`resize_for_overwrite()` calls constructors and `truncate()` calls
destructors. For trivial types, it should optimize the same way as the
old pattern.

Downstream code adapt to the disappearance of `set_size()` using this
new pattern should carefully audit uses of `V` between the resize and
the truncate:

- Change `V.size()` => `Size`.
- Change `V.capacity()` => `V.size()` (mostly).
- Change `V.end()` => `V.begin() + Size`.
- If `V` is an out-parameter, early returns need a `V.truncate()` or
  `V.clear()`. A scope exit is recommended.

Differential Revision: https://reviews.llvm.org/D115380
The names of the generated attribute getters for ops changed some time ago. The method created from the attribute name returns the return type and an additional method of the same name with Attr as suffix is generated which returns the actual attribute as its storage type.

The code generating effects however was using the methods without the Attr suffix, which is a problem in the case of FlatSymbolRefAttr as it has a return type of llvm::StringRef. This would lead to compilation errors as the constructor of SideEffects::EffectInstance expects a SymbolRefAttr in this case.

This patch simply fixes the generated effects code to use the Attr suffixed getter to get the actual storage type of the attribute.

Differential Revision: https://reviews.llvm.org/D117194
This makes all the tests consistent and improves code coverage. This also
uncovers a bug with negative indices in advance() (which also impacts
prev()) -- I'll fix that in a subsequent patch.

I chose to only count operations in the tests for ranges::advance because
doing so in prev() and next() too was reaching diminishing returns, and
didn't meaningfully improve our test coverage.
Convert the `crashlog` command to be implemented as a class. The `Symbolicate`
function is switched to a class, to implement `get_long_help`. The text for the
long help comes from the help output generated by `OptionParser`. That is, the
output of `help crashlog` is the same as `crashlog --help`.

Differential Revision: https://reviews.llvm.org/D117165
All credit to Martin Storsjö (mstorsjo) who describes the issue here:
#53167

Differential Revision: https://reviews.llvm.org/D117179
Combine the sm-version tests into a single file.

Reviewed By: bkramer, tra

Differential Revision: https://reviews.llvm.org/D117198
See `gcc -dumpspecs` that -r essentially implies -nostdlib and suppresses
default -l* and crt*.o. The behavior makes sense because otherwise there will be
assuredly conflicting definitions when the relocatable output is linked into the
final executable/shared object.

Reviewed By: thesamesam, phosek

Differential Revision: https://reviews.llvm.org/D116843
…unreachable

I strongly believe we need some variant of this.

The main problem is e.g. that the glibc's assert has 4 parameters,
but the profitability check is only okay with one extra phi node,
so D116692 doesn't even trigger on most of the expected cases.

While that restriction probably makes sense in normal code, if we
are about to run off of a cliff (into an `unreachable`), this
successor block is unlikely so the cost to setup these PHI nodes
should not be on the hotpath, and shouldn't matter performance-wise.

Likewise, we don't sink if there are unconditional predecessors
UNLESS we'd sink at least one non-speculatable instruction,
which is a performance workaround, but if we are about to run into
`unreachable`, it shouldn't matter.

Note that we only allow the case where there are at
most unconditiona branches on the way to the unreachable block.

Differential Revision: https://reviews.llvm.org/D117045
…oint scev expr

Let's consider sequential min/max expression family
to be more complex than their non-sequential counterparts,
preserving internal ordering within them.
This reuses the type=>decl mapping from go-to-definition on auto.
(Which could stand some improvement, but that can happen later).

Fixes clangd/clangd#367

Differential Revision: https://reviews.llvm.org/D116443
Implements part of the legacy "DEC structures" feature from
VMS Fortran.  STRUCTUREs are processed as if they were derived
types with SEQUENCE.  DATA-like object entity initialization
is supported as well (e.g., INTEGER FOO/666/) since it was used
for default component initialization in structures.  Anonymous
components (named %FILL) are also supported.

These features, and UNION/MAP, were already being parsed.
An omission in the collection of structure field names in the
case of nested structures with entity declarations was fixed
in the parser.

Structures are supported in modules, but this is mostly for
testing purposes.  The names of fields in structures accessed
via USE association cannot appear with dot notation in client
code (at least not yet).  DEC structures antedate Fortran 90,
so their actual use in applications should not involve modules.

This patch does not implement UNION/MAP, since that feature
would impose difficulties later in lowering them to MLIR types.
In the meantime, if they appear, semantics will issue a
"not yet implemented" error message.

Differential Revision: https://reviews.llvm.org/D117151
…ders

Not sure it's OK to suppress this in clang itself - if we're building a PCH
or module, maybe it matters?

Differential Revision: https://reviews.llvm.org/D116925
During pop() we convert nodes into spans of expanded syntax::Tokens.
If we precompute a range of plausible (expanded) tokens, then we can do an
extremely cheap approximate hit-test against it, because syntax::Tokens are
ordered by pointer.

This would seem not to buy anything (we don't enter nodes unless they overlap
the selection), but in fact the spans we have are for *newly* claimed ranges
(i.e. those unclaimed by any child node).

So if you have:
   { { [[2+2]]; } }
then all of the CompoundStmts pass the hit test and are pushed, but we skip
full hit-testing of the brackets during pop() as they lie outside the range.

This is ~10x average speedup for selectiontree on a bad case I've seen
(large gtest file).

Differential Revision: https://reviews.llvm.org/D117107
… instructions

Adds NVPTX intrinsics and builtins for CUDA PTX cvt instructions for sm80
architectures and above. Requires ptx 7.0.

PTX ISA description of cvt instructions :
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-cvt

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>

Differential Revision: https://reviews.llvm.org/D116673
…IPRA enabled/disabled

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D117243
This patch enables loop interchange with multiple outer loop
induction variables, and hence removes the limitation that only
a single outer loop induction variable is supported. In fact, it
turns out that the current pass already trivially supports multiple
outer indvars, which is the result of a previous patch
`https://reviews.llvm.org/D102743`. Therefore, this patch removed that
limitation and provides test cases for multiple outer indvars.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D114916
jyknight and others added 29 commits January 14, 2022 18:01
EHTerminateScope is used to implement C++ noexcept semantics. Per C++
[except.terminate], it is implemented-defined whether no, some, or all
cleanups are run prior to terminatation.

Therefore, the code to run cleanups on the way towards termination is
unnecessary, and may be omitted.

After this change, we will still run some cleanups: any cleanups in a
function called from the noexcept function will continue to run, while
those in the noexcept function itself will not.

Differential Revision: https://reviews.llvm.org/D113620
Internal writes to character arrays should not blank-fill
records (elements) past the last one that was written to.

Differential Revision: https://reviews.llvm.org/D117342
This commit adds small tests for the combination of:
{exact, no_exact} x { EQ, NE, UGT, UGE, ULT, ULE, SGT, SGE, SLT, SLE}

This is related to the changes in D117338.
ENTRY statement names in module subprograms were not acceptable for
use as a "module procedure" in a generic interface, but should be.
ENTRY statements need to have symbols with place-holding
SubprogramNameDetails created for them in order to be visible in
generic interfaces.  Those symbols are created from the "program
tree" data structure.  This patch adds ENTRY statement names to the
program tree data structure and uses them to generate SubprogramNameDetails
symbols.

Differential Revision: https://reviews.llvm.org/D117345
Since 2959e08, we conservatively
assume all inputs are enabled by default. This isn't the best
interface for controlling these anyway, since it's not granular and
only allows trimming the last fields.
This is a fix for a crash in the HexagonOptAddrMode pass that was looking
for the third operand (offset) in the following instruction that does not,
in fact, have a third operand:

  $r1 = L2_loadw_locked $r1

Additionally, this patch also adds an addrMode value to vgather pseudos
in the Hexagon backend.

Differential Revision: https://reviews.llvm.org/D117133
Summary:
Address @smeenai feedback https://reviews.llvm.org/D117061#inline-1122106:
>CMake has if(IN_LIST) now, which you can use instead of the string(FIND)

IN_LIST is available since CMake 3.3 released in 2015.

Reviewed By: smeenai

FBD33590959
Summary:
Reduce code size by removing redundant dependent template type
from RewriteInstance methods.

Code size savings (via bloaty on llvm-bolt Debug build):
```
symbol,vmsize,filesize -> vmsize,filesize (delta vmsize,filesize)
updateELFSymbolTable         57096,59600 -> 56656,59048 (440,552)
updateELFSymbolTable::lambda 35957,55277 -> 35949,54485   (8,792)
getOutputSections            20592,21440 -> 20372,21156 (220,284)
getOutputSections::lambda      1792,5300 ->   1792,5372   (0,-72)

total delta (668,1556)
```

Reviewed By: maksfb

FBD33589393
…teScope."

Breaks tests on some platforms. Reverting while investigating.

This reverts commit a4e255f.
Currently, when connecting to a remote iOS device from the command line
on Apple Silicon, we end up using the host platform (PlatfromMacOSX)
instead of remote-ios (PlatformRemoteiOS). This happens because
PlatfromMacOSX includes arm64-apple-ios and arm64e-apple-ios as
compatible architectures, presumably to support debugging iOS Apps on
Apple Silicon [1].

This is a problem for debugging remote ios devices, because the host
platform doesn't look for an expanded shared cache on disk and as a
result we end up reading everything from memory, incurring a significant
performance hit.

The crux of this patch is to make PlatfromMacOSX *not* compatible with
arm64(e)-apple-ios. This also means that we now use remote-ios
(PlatformRemoteiOS) as the platform for debugging iOS apps on Apple
Silicon. This has the (unintended) side effect that unlike we do for the
host platform, we no longer check our local shared cache, and incur a
performance hit on debugging these apps.

To avoid that, PlatformRemoteiOS now also check the local cache to
support this use case, which is cheap enough to do unconditionally for
PlatformRemoteiOS.

[1] https://support.apple.com/guide/app-store/iphone-ipad-apps-mac-apple-silicon-fird2c7092da/mac

Differential revision: https://reviews.llvm.org/D117340
Character substrings weren't being folded correctly;
add tests and rework the implementation so that substrings
of literals and named constant character scalars & arrays
are properly folded for use in constant expressions.

Differential Revision: https://reviews.llvm.org/D117343
Avoid other warnings from failing the test, such as
-Wunused-command-line-argument in the downstream Swift fork.
Rather than hardcoding all constants, we now use the input tensor to drive the
code setup. Of course, we still need to hardcode dim-2 of A and the final
verification in CHECK is input dependent, but overall this sets a slightly
better example of tensor setup in general.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D117349
Fixes #52694

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Similar to the ld64 command-line options. These use the same underlying
mechanisms as -l and -hidden-l, but allow specifying an absolute path to
the archive. This is often more convenient for a one-off, or when adding
a new search path could change how existing -l options are resolved.

Differential Revision: https://reviews.llvm.org/D117360
It's NFC because shadow of pointer is clean so origins will not be
propagated anyway.

Depends on D117275

Reviewed By: kda, eugenis

Differential Revision: https://reviews.llvm.org/D117276
Depends on D117276

Reviewed By: kda, eugenis

Differential Revision: https://reviews.llvm.org/D117277
This is the original patch in my GNUInstallDirs series, now last to merge as the final piece!

It arose as a new draft of D28234. I initially did the unorthodox thing of pushing to that when I wasn't the original author, but since I ended up

 - Using `GNUInstallDirs`, rather than mimicking it, as the original author was hesitant to do but others requested.

 - Converting all the packages, not just LLVM, effecting many more projects than LLVM itself.

I figured it was time to make a new revision.

I have used this patch series (and many back-ports) as the basis of NixOS/nixpkgs#111487 for my distro (NixOS), which was merged last spring (2021). It looked like people were generally on board in D28234, but I make note of this here in case extra motivation is useful.

---

As pointed out in the original issue, a central tension is that LLVM already has some partial support for these sorts of things. Variables like `COMPILER_RT_INSTALL_PATH` have already been dealt with. Variables like `LLVM_LIBDIR_SUFFIX` however, will require further work, so that we may use `CMAKE_INSTALL_LIBDIR`.

These remaining items will be addressed in further patches. What is here is now rote and so we should get it out of the way before dealing more intricately with the remainder.

Reviewed By: #libunwind, #libc, #libc_abi, compnerd

Differential Revision: https://reviews.llvm.org/D99484
Enable noundef analysis (-enable-noundef-analysis) via the -fsanitize-memory-param-retval clang flag.
This completes the work found in:
  - https://reviews.llvm.org/D116855
  - https://reviews.llvm.org/D116633

Depends on D116633

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D117293
`zfh` and `zfhmin` have been ratified, with version 1.0.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117098
If function has no sanitize_memory we still reset shadow for nested calls.
The first return from getShadow() correctly returned shadow for argument,
but it didn't reset shadow of byval pointee.

Depends on D117277

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D117278
This is a follow up to Fix size mismatch error with jemalloc.
4243b65
Although that fix works it increased memory footprint.
With this patch we go back to original memory footprint.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D117341
Add patterns for vector widening integer add/subtract instructions

Differential Revision: https://reviews.llvm.org/D117188
This fixes a crash I observed in issue #48708 where the LSR
pass tries to insert an instruction in a basic block with only a
catchswitch statement in there. This happens because the Phi node
being evaluated assumes the same value for different basic blocks.

If the basic block associated with the incoming value of the operand
being evaluated has an EHPad terminator LSR skips optimizing it.
But if that incoming value can come from multiple different blocks
there can be some incoming basic blocks which are terminated in
an EHPad. If these are then rewritten in RewriteForPhi the ones
containing an EHPad terminator will hit the "Insertion point must
be a normal instruction" assert in AdjustInsertPositionForExpand.

This fix makes CollectLoopInvariantFixupsAndFormulae also ignore
cases where the same value has another incoming basic block with an
EHPad, same as it already does in case the primary value has one.

Patch by Lorenz Brun <lorenz@brun.one>

Differential Revision: https://reviews.llvm.org/D98378
Follow-up on 74bb4ad. It should not change behaviors but a good thing to
do.
@devkadirselcuk devkadirselcuk merged commit 69684bb into turkdevops:main Jan 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment