Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] main from llvm:main #397

Merged
merged 189 commits into from
Dec 29, 2021
Merged

[pull] main from llvm:main #397

merged 189 commits into from
Dec 29, 2021

Conversation

pull[bot]
Copy link

@pull pull bot commented Dec 25, 2021

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

MaskRay and others added 6 commits December 24, 2021 15:41
This reverts 8cb7876 and follow-ups.

GNU ld/gold/ld.lld -O has nothing to do with any code related linker optimizations.
It has very small benefit (save 144Ki (.hash, .gnu_hash) with GNU ld, save 0.7%
.debug_str with gold/ld.lld) while it makes gold/ld.lld significantly slower
when linking RelWithDebInfo clang (gold: 16.437 vs 19.488; ld.lld: 1.882 vs 4.881).
LLVM Dialect in MLIR doesn't have a memmove op. This adds one.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D116274
…::object::ELFFile::sections()

This mainly avoid `relsOrRelas` cost in `InputSectionBase::relocate`.
`llvm::object::ELFFile::sections()` has redundant and expensive checks.
This avoid repeated load of the unique_ptr in hot paths.
This decreases the 0.2% time (no debug info) to nearly no.
@pull pull bot added the ⤵️ pull label Dec 25, 2021
kazutakahirata and others added 23 commits December 24, 2021 20:57
Note: mixed TLSDESC and GD currently does not work.
This temporarily increases sizeof(SymbolUnion), but allows us to mov GOT/PLT/etc
index members outside Symbol in the future.

Then, we can make TLSDESC and TLSGD use different indexes and support mixed
TLSDESC and TLSGD (tested by x86-64-tlsdesc-gd-mixed.s).

Note: needsTlsGd and needsTlsGdToIe may optionally be combined.

Test updates are due to reordered GOT entries.
Identified with readability-redundant-control-flow.
An identical declaration is present just a couple of lines above the
line being removed in this patch.

Identified with readability-redundant-declaration.
New method `FCmpInst::compare` is added, which evaluates the given
compare predicate for constant operands. Interface is made similar to
`ICmpInst::compare`.

Differential Revision: https://reviews.llvm.org/D116168
…vailable to Presburger/

This patch moves some static functions from AffineStructures.cpp to
Presburger/Utils.cpp and some to be private members of FlatAffineConstraints
(which will later be moved to IntegerPolyhedron) to allow for a smoother
transition for moving FlatAffineConstraints math functionality to
Presburger/IntegerPolyhedron.

This patch is part of a series of patches for moving math functionality to
Presburger directory.

Reviewed By: arjunp, bondhugula

Differential Revision: https://reviews.llvm.org/D115869
This patch adds missing formatting for UTF-8 unicode.

Cross-referencing https://reviews.llvm.org/D66447

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D112564
The Support directory was removed from the unittests cmake when the directory
was removed in 204c3b5. Subsequent commits
added the directory back but seem to have missed adding it back to the cmake.

This patch also removes MLIRSupportIndentedStream from the list of linked
libraries to avoid an ODR violation (it's already part of MLIRSupport which
is also being linked here). Otherwise ASAN complains:

```
=================================================================
==102592==ERROR: AddressSanitizer: odr-violation (0x7fbdf214eee0):
  [1] size=120 'vtable for mlir::raw_indented_ostream' /home/arjun/llvm-project/mlir/lib/Support/IndentedOstream.cpp
  [2] size=120 'vtable for mlir::raw_indented_ostream' /home/arjun/llvm-project/mlir/lib/Support/IndentedOstream.cpp
These globals were registered at these points:
  [1]:
    #0 0x28a71d in __asan_register_globals (/home/arjun/llvm-project/build/tools/mlir/unittests/Support/MLIRSupportTests+0x28a71d)
    #1 0x7fbdf214a61b in asan.module_ctor (/home/arjun/llvm-project/build/lib/libMLIRSupportIndentedOstream.so.14git+0x661b)

  [2]:
    #0 0x28a71d in __asan_register_globals (/home/arjun/llvm-project/build/tools/mlir/unittests/Support/MLIRSupportTests+0x28a71d)
    #1 0x7fbdf2061c4b in asan.module_ctor (/home/arjun/llvm-project/build/lib/libMLIRSupport.so.14git+0x11bc4b)

==102592==HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_odr_violation=0
SUMMARY AddressSanitizer: odr-violation: global 'vtable for mlir::raw_indented_ostream' at /home/arjun/llvm-project/mlir/lib/Support/IndentedOstream.cpp
==102592==ABORTING
```

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D116027
This was once used as a workaround for detecting missing PPC64 TLSGD/TLSLD
relocations produced by ancient IBM XL C/C++.
Instead of hashing DIE offsets, hash DIE references the same as they
would be when used outside of a loclist - that is, deep hash the type on
first use, and hash the numbering on subsequent uses.

This does produce different hashes for different type references, where
it did not before (because we were hashing zero all the time - so it
didn't matter what type was referenced, the hash would be identical).

This also allows us to enforce that the DIE offset (& size) is not
queried before it is used (which came up while investigating another bug
recently).
The const loop iterator was inhibiting the std::move().
This function may take ~1% time. SmallVector<SymbolTableEntry, 0> is smaller (16 bytes
instead of 24) and more efficient.
This does resolve the redundancy in includeInDynsym().
This reverts commit 0c553cc.

This caused a buildbot failure (https://lab.llvm.org/buildbot#builders/197/builds/888).

```
******************** TEST 'ScudoStandalone-Unit :: ./ScudoUnitTest-aarch64-Test/ScudoCommonTest.ResidentMemorySize' FAILED ********************
Script:
--
/home/tcwg-buildbot/worker/clang-aarch64-sve-vla/stage1/projects/compiler-rt/lib/scudo/standalone/tests/./ScudoUnitTest-aarch64-Test --gtest_filter=ScudoCommonTest.ResidentMemorySize
--
Note: Google Test filter = ScudoCommonTest.ResidentMemorySize
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from ScudoCommonTest
[ RUN      ] ScudoCommonTest.ResidentMemorySize
/home/tcwg-buildbot/worker/clang-aarch64-sve-vla/llvm/compiler-rt/lib/scudo/standalone/tests/common_test.cpp:49: Failure
Expected: (getResidentMemorySize()) > (OnStart + Size - Threshold), actual: 707358720 vs 943153152
[  FAILED  ] ScudoCommonTest.ResidentMemorySize (21709 ms)
[----------] 1 test from ScudoCommonTest (21709 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (21709 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] ScudoCommonTest.ResidentMemorySize
 1 FAILED TEST
********************
```
DebugUtils.h contains an identical declaration with a correct comment,
namely:

  /// Render a LookupKind.
  raw_ostream &operator<<(raw_ostream &OS, const LookupKind &K);

Identified with readability-redundant-declaration.
kazutakahirata and others added 29 commits December 29, 2021 00:16
Identified with llvm-header-guard.
The root cause for the crash is the incorrect use of `cast`.
The actual type and cast-to type is different. This patch fixes the
crash by converting the `cast` to `dyn_cast`.
Multithreaded applications using fork(2) need to be extra careful about
what they do in the fork child. Without any special precautions (which
only really work if you can fully control all threads) they can only
safely call async-signal-safe functions. This is because the forked
child will contain snapshot of the parents memory at a random moment in
the execution of all of the non-forking threads (this is where the
similarity with signals comes in).

For example, the other threads could have been holding locks that can
now never be released in the child process and any attempt to obtain
them would block. This is what sometimes happen when using tcmalloc --
our fork child ends up hanging in the memory allocation routine. It is
also what happened with our logging code, which is why we added a
pthread_atfork hackaround.

This patch implements a proper fix to the problem, by which is to make
the child code async-signal-safe. The ProcessLaunchInfo structure is
transformed into a simpler ForkLaunchInfo representation, one which can
be read without allocating memory and invoking complex library
functions.

Strictly speaking this implementation is not async-signal-safe, as it
still invokes library functions outside of the posix-blessed set of
entry points. Strictly adhering to the spec would mean reimplementing a
lot of the functionality in pure C, so instead I rely on the fact that
any reasonable implementation of some functions (e.g.,
basic_string::c_str()) will not start allocating memory or doing other
unsafe things.

The new child code does not call into our logging infrastructure, which
enables us to remove the pthread_atfork call from there.

Differential Revision: https://reviews.llvm.org/D116165
Now that we are caching the dwarf index as well, we will always have
more than one cache file (when not using accelerator tables). I have
adjusted the test to check for the presence of one _symtab_ index.
Remove the Mangled::operator! and Mangled::operator void* where the
comments in header and implementation files disagree and replace them
with operator bool.

This fix PR52702 as https://reviews.llvm.org/D106837 used the buggy
Mangled::operator! in Symbol::SynthesizeNameIfNeeded. For example,
consider the symbol "puts" in a hello world C program:

// Inside Symbol::SynthesizeNameIfNeeded
(lldb) p m_mangled
(lldb_private::Mangled) $0 = (m_mangled = None, m_demangled = "puts")
(lldb) p !m_mangled
(bool) $1 = true          # should be false!!
This leads to Symbol::SynthesizeNameIfNeeded overwriting m_demangled
part of Mangled (in this case "puts").

In conclusion, this patch turns
callq  0x401030                  ; symbol stub for: ___lldb_unnamed_symbol36
back into
callq  0x401030                  ; symbol stub for: puts .

Differential Revision: https://reviews.llvm.org/D116217
The MonitorCallback function was assuming that the "exited" argument is
set whenever a thread exits, but the caller was only setting that flag
for the main thread.

This patch deletes the argument altogether, and lets MonitorCallback
compute what it needs itself.

This is almost NFC, since previously we would end up in the
"GetSignalInfo failed for unknown reasons" branch, which was doing the
same thing -- forgetting about the thread.
This reverts commit 6d09aae.

The test uses ulimit and ran into problems on some bots. Run on linux only.
There's nothing platform-specific about the code we're testing, so this
should be enough to ensure correctness.
Adds diagnosing on attempt to use zero length arrays, pointers, refs, arrays
of them and structs/classes containing all of it.
In case a struct/class with zero length array is used this emits a set
of notes pointing out how zero length array got into used struct, like
this:
```
struct ContainsArr {
  int A[0]; // note: field of illegal type declared here
};
struct Wrapper {
  ContainsArr F; // note: within field of type ContainsArr declared here
  // ...
}

// Device code
Wrapper W;
W.use(); // error: zero-length arrays are not permitted

```
Total deep check of each used declaration may result in double
diagnosing at the same location.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D114080
Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com>
This enables more simplifications and gets us closer to removing undef.
ping @alinas
Identified with modernize-use-nullptr.
Identified with readability-const-return-type.
They are the same as for the other HVX vectors, but types need to be
listed explicitly. Also, add a detailed codegen testcase.

Co-authored-by: Abhikrant Sharma <quic_abhikran@quicinc.com>
The code path can only be reached when folding the tail, so turn the
check into an assertion.
Expose the powi intrinsic to the LLVM dialect within MLIR

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D116364
For vectors with repeating values, old codegen would rotate and insert
every duplicate element. This patch replaces that behavior with a splat
of the most common element, vinsert/vror only occur when needed.
There can only be one permute operations per packet, so this actually
pessimizes the code (due to the extra "or").
…es semantics of `OffsetSizeAndStrideOpInterface`.

The semantics of the ops that implement the
`OffsetSizeAndStrideOpInterface` is that if the number of offsets,
sizes or strides are less than the rank of the source, then some
default values are filled along the trailing dimensions (0 for offset,
source dimension of sizes, and 1 for strides). This is confusing,
especially with rank-reducing semantics. Immediate issue here is that
the methods of `OffsetSizeAndStridesOpInterface` assumes that the
number of values is same as the source rank. This cause out-of-bounds
errors.

So simplifying the specification of `OffsetSizeAndStridesOpInterface`
to make it invalid to specify number of offsets/sizes/strides not
equal to the source rank.

Differential Revision: https://reviews.llvm.org/D115677
 ((Op1 + C) & C) u<  Op1 --> Op1 != 0
 ((Op1 + C) & C) u>= Op1 --> Op1 == 0
 Op0 u>  ((Op0 + C) & C) --> Op0 != 0
 Op0 u<= ((Op0 + C) & C) --> Op0 == 0

https://alive2.llvm.org/ce/z/iUfXJN
https://alive2.llvm.org/ce/z/caAtjj

  define i1 @src(i8 %x, i8 %y) {
    ; the add/mask must be with a low-bit mask (0x01ff...)
    %y1 = add i8 %y, 1
    %pop = call i8 @llvm.ctpop.i8(i8 %y1)
    %ismask = icmp eq i8 %pop, 1
    call void @llvm.assume(i1 %ismask)

    %a = add i8 %x, %y
    %m = and i8 %a, %y
    %r = icmp ult i8 %m, %x
    ret i1 %r
  }

  define i1 @tgt(i8 %x, i8 %y) {
    %r = icmp ne i8 %x, 0
    ret i1 %r
  }

I suspect this can be generalized in some way, but this
is the pattern I'm seeing in a motivating test based on
issue #52851.
…performance

Avoid trying to resolve nested types that may not be needed because the name is
already provided by the outer DIE.
@devkadirselcuk devkadirselcuk merged commit 55a9809 into turkdevops:main Dec 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment