[pull] main from llvm:main #397

pull · 2021-12-25T04:51:44Z

See Commits and Changes for more details.

This reverts 8cb7876 and follow-ups. The GNU ld/gold/ld.lld -O option has nothing to do with code-related linker optimizations. It has a very small benefit (saves 144Ki of .hash and .gnu_hash with GNU ld, saves 0.7% of .debug_str with gold/ld.lld) while making gold/ld.lld significantly slower when linking RelWithDebInfo clang (gold: 16.437 vs 19.488; ld.lld: 1.882 vs 4.881).

LLVM Dialect in MLIR doesn't have a memmove op. This adds one. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D116274

…::object::ELFFile::sections() This mainly avoids the `relsOrRelas` cost in `InputSectionBase::relocate`. `llvm::object::ELFFile::sections()` has redundant and expensive checks.

This avoids repeated loads of the unique_ptr in hot paths.

This reduces the 0.2% time (no debug info) to nearly zero.

Note: mixed TLSDESC and GD currently does not work.

This temporarily increases sizeof(SymbolUnion), but allows us to move GOT/PLT/etc index members outside Symbol in the future. Then, we can make TLSDESC and TLSGD use different indexes and support mixed TLSDESC and TLSGD (tested by x86-64-tlsdesc-gd-mixed.s). Note: needsTlsGd and needsTlsGdToIe may optionally be combined. Test updates are due to reordered GOT entries.

Identified with readability-redundant-control-flow.

An identical declaration is present just a couple of lines above the line being removed in this patch. Identified with readability-redundant-declaration.

New method `FCmpInst::compare` is added, which evaluates the given compare predicate for constant operands. Interface is made similar to `ICmpInst::compare`. Differential Revision: https://reviews.llvm.org/D116168

…vailable to Presburger/ This patch moves some static functions from AffineStructures.cpp to Presburger/Utils.cpp and some to be private members of FlatAffineConstraints (which will later be moved to IntegerPolyhedron) to allow for a smoother transition for moving FlatAffineConstraints math functionality to Presburger/IntegerPolyhedron. This patch is part of a series of patches for moving math functionality to Presburger directory. Reviewed By: arjunp, bondhugula Differential Revision: https://reviews.llvm.org/D115869

This patch adds missing formatting for UTF-8 unicode. Cross-referencing https://reviews.llvm.org/D66447 Reviewed By: labath Differential Revision: https://reviews.llvm.org/D112564

The Support directory was removed from the unittests cmake when the directory was removed in 204c3b5. Subsequent commits added the directory back but seem to have missed adding it back to the cmake. This patch also removes MLIRSupportIndentedStream from the list of linked libraries to avoid an ODR violation (it's already part of MLIRSupport which is also being linked here). Otherwise ASAN complains:

```
=================================================================
==102592==ERROR: AddressSanitizer: odr-violation (0x7fbdf214eee0):
  [1] size=120 'vtable for mlir::raw_indented_ostream' /home/arjun/llvm-project/mlir/lib/Support/IndentedOstream.cpp
  [2] size=120 'vtable for mlir::raw_indented_ostream' /home/arjun/llvm-project/mlir/lib/Support/IndentedOstream.cpp
These globals were registered at these points:
  [1]:
    #0 0x28a71d in __asan_register_globals (/home/arjun/llvm-project/build/tools/mlir/unittests/Support/MLIRSupportTests+0x28a71d)
    #1 0x7fbdf214a61b in asan.module_ctor (/home/arjun/llvm-project/build/lib/libMLIRSupportIndentedOstream.so.14git+0x661b)

  [2]:
    #0 0x28a71d in __asan_register_globals (/home/arjun/llvm-project/build/tools/mlir/unittests/Support/MLIRSupportTests+0x28a71d)
    #1 0x7fbdf2061c4b in asan.module_ctor (/home/arjun/llvm-project/build/lib/libMLIRSupport.so.14git+0x11bc4b)

==102592==HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_odr_violation=0
SUMMARY AddressSanitizer: odr-violation: global 'vtable for mlir::raw_indented_ostream' at /home/arjun/llvm-project/mlir/lib/Support/IndentedOstream.cpp
==102592==ABORTING
```

Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D116027

This was once used as a workaround for detecting missing PPC64 TLSGD/TLSLD relocations produced by ancient IBM XL C/C++.

Instead of hashing DIE offsets, hash DIE references the same as they would be when used outside of a loclist - that is, deep hash the type on first use, and hash the numbering on subsequent uses. This does produce different hashes for different type references, where it did not before (because we were hashing zero all the time - so it didn't matter what type was referenced, the hash would be identical). This also allows us to enforce that the DIE offset (& size) is not queried before it is used (which came up while investigating another bug recently).

The const loop iterator was inhibiting the std::move().

This function may take ~1% time. SmallVector<SymbolTableEntry, 0> is smaller (16 bytes instead of 24) and more efficient.

… symbols like non-global symbols

This does resolve the redundancy in includeInDynsym().

This reverts commit 0c553cc. This caused a buildbot failure (https://lab.llvm.org/buildbot#builders/197/builds/888).

```
******************** TEST 'ScudoStandalone-Unit :: ./ScudoUnitTest-aarch64-Test/ScudoCommonTest.ResidentMemorySize' FAILED ********************
Script:
--
/home/tcwg-buildbot/worker/clang-aarch64-sve-vla/stage1/projects/compiler-rt/lib/scudo/standalone/tests/./ScudoUnitTest-aarch64-Test --gtest_filter=ScudoCommonTest.ResidentMemorySize
--
Note: Google Test filter = ScudoCommonTest.ResidentMemorySize
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from ScudoCommonTest
[ RUN      ] ScudoCommonTest.ResidentMemorySize
/home/tcwg-buildbot/worker/clang-aarch64-sve-vla/llvm/compiler-rt/lib/scudo/standalone/tests/common_test.cpp:49: Failure
Expected: (getResidentMemorySize()) > (OnStart + Size - Threshold), actual: 707358720 vs 943153152
[  FAILED  ] ScudoCommonTest.ResidentMemorySize (21709 ms)
[----------] 1 test from ScudoCommonTest (21709 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (21709 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] ScudoCommonTest.ResidentMemorySize

 1 FAILED TEST
********************
```

DebugUtils.h contains an identical declaration with a correct comment, namely: /// Render a LookupKind. raw_ostream &operator<<(raw_ostream &OS, const LookupKind &K); Identified with readability-redundant-declaration.

Identified with llvm-header-guard.

The root cause of the crash is an incorrect use of `cast`: the actual type and the cast-to type are different. This patch fixes the crash by converting the `cast` to `dyn_cast`.

Multithreaded applications using fork(2) need to be extra careful about what they do in the fork child. Without any special precautions (which only really work if you can fully control all threads) they can only safely call async-signal-safe functions. This is because the forked child will contain a snapshot of the parent's memory at a random moment in the execution of all of the non-forking threads (this is where the similarity with signals comes in). For example, the other threads could have been holding locks that can now never be released in the child process, and any attempt to obtain them would block. This is what sometimes happens when using tcmalloc -- our fork child ends up hanging in the memory allocation routine. It is also what happened with our logging code, which is why we added a pthread_atfork hackaround.

This patch implements a proper fix to the problem, which is to make the child code async-signal-safe. The ProcessLaunchInfo structure is transformed into a simpler ForkLaunchInfo representation, one which can be read without allocating memory and invoking complex library functions.

Strictly speaking this implementation is not async-signal-safe, as it still invokes library functions outside of the posix-blessed set of entry points. Strictly adhering to the spec would mean reimplementing a lot of the functionality in pure C, so instead I rely on the fact that any reasonable implementation of some functions (e.g., basic_string::c_str()) will not start allocating memory or doing other unsafe things.

The new child code does not call into our logging infrastructure, which enables us to remove the pthread_atfork call from there. Differential Revision: https://reviews.llvm.org/D116165
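The pattern described above (do all allocation in the parent, then restrict the child to async-signal-safe calls) can be sketched roughly as follows. `launchAndWait` is a hypothetical helper written for this illustration; it is not LLDB's ForkLaunchInfo code, and it assumes a POSIX system.

```cpp
#include <cassert>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: runs `path` with `args` and returns its exit code,
// or -1 if fork/waitpid fails or the child did not exit normally.
int launchAndWait(const std::string &path,
                  const std::vector<std::string> &args) {
  // Parent side: build the flat argv array *before* fork(), so the child
  // never needs to allocate memory.
  std::vector<char *> argv;
  argv.reserve(args.size() + 1);
  for (const std::string &a : args)
    argv.push_back(const_cast<char *>(a.c_str()));
  argv.push_back(nullptr);
  const char *prog = path.c_str();
  char *const *argvPtr = argv.data();

  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0) {
    // Child side: only execv() and _exit(), both async-signal-safe.
    // No logging, no allocation, no locks.
    execv(prog, argvPtr);
    _exit(127); // reached only if execv failed
  }

  // Parent side again: reap the child.
  int status = 0;
  if (waitpid(pid, &status, 0) < 0)
    return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, `launchAndWait("/bin/sh", {"sh", "-c", "exit 7"})` would be expected to return the shell's exit code on a system where `/bin/sh` exists.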

Now that we are caching the dwarf index as well, we will always have more than one cache file (when not using accelerator tables). I have adjusted the test to check for the presence of one _symtab_ index.

Remove the Mangled::operator! and Mangled::operator void*, where the comments in the header and implementation files disagree, and replace them with operator bool. This fixes PR52702, as https://reviews.llvm.org/D106837 used the buggy Mangled::operator! in Symbol::SynthesizeNameIfNeeded. For example, consider the symbol "puts" in a hello world C program:

    // Inside Symbol::SynthesizeNameIfNeeded
    (lldb) p m_mangled
    (lldb_private::Mangled) $0 = (m_mangled = None, m_demangled = "puts")
    (lldb) p !m_mangled
    (bool) $1 = true # should be false!!

This leads to Symbol::SynthesizeNameIfNeeded overwriting the m_demangled part of Mangled (in this case "puts"). In conclusion, this patch turns

    callq 0x401030 ; symbol stub for: ___lldb_unnamed_symbol36

back into

    callq 0x401030 ; symbol stub for: puts

Differential Revision: https://reviews.llvm.org/D116217
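To illustrate the kind of inconsistency described above, here is a minimal sketch with a hypothetical `NamePair` type (not LLDB's `Mangled` class). It assumes the intended meaning of "has a name" is "either part is non-empty", and contrasts that with a negation that looks only at the mangled part.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a mangled/demangled name pair.
struct NamePair {
  std::string mangled;
  std::string demangled;

  // Single source of truth: a name exists if either part is set.
  // `explicit` avoids accidental conversions to integers or pointers.
  explicit operator bool() const {
    return !mangled.empty() || !demangled.empty();
  }
};

// Illustrates the buggy behavior: "no name" decided from the mangled
// part alone, so a demangled-only symbol is reported as unnamed.
bool buggyIsUnnamed(const NamePair &n) { return n.mangled.empty(); }
```

With a single `explicit operator bool`, both `if (name)` and `!name` go through the same logic, so the two checks cannot drift apart.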

The MonitorCallback function was assuming that the "exited" argument is set whenever a thread exits, but the caller was only setting that flag for the main thread. This patch deletes the argument altogether, and lets MonitorCallback compute what it needs itself. This is almost NFC, since previously we would end up in the "GetSignalInfo failed for unknown reasons" branch, which was doing the same thing -- forgetting about the thread.

This reverts commit 6d09aae. The test uses ulimit and ran into problems on some bots. Run on linux only. There's nothing platform-specific about the code we're testing, so this should be enough to ensure correctness.

Adds diagnostics for attempts to use zero-length arrays, pointers and references to them, arrays of them, and structs/classes containing any of these. When a struct/class with a zero-length array is used, this emits a set of notes pointing out how the zero-length array got into the used struct, like this:

```
struct ContainsArr {
  int A[0]; // note: field of illegal type declared here
};
struct Wrapper {
  ContainsArr F; // note: within field of type ContainsArr declared here
  // ...
};

// Device code
Wrapper W;
W.use(); // error: zero-length arrays are not permitted
```

A full deep check of each used declaration may result in duplicate diagnostics at the same location. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D114080

ping @alinas

Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com>

This enables more simplifications and gets us closer to removing undef. ping @alinas

Identified with modernize-use-nullptr.

Identified with readability-const-return-type.

They are the same as for the other HVX vectors, but types need to be listed explicitly. Also, add a detailed codegen testcase. Co-authored-by: Abhikrant Sharma <quic_abhikran@quicinc.com>

The code path can only be reached when folding the tail, so turn the check into an assertion.

Expose the powi intrinsic to the LLVM dialect within MLIR Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D116364

For vectors with repeating values, the old codegen would rotate and insert every duplicate element. This patch replaces that behavior with a splat of the most common element; vinsert/vror occur only when needed.

There can only be one permute operation per packet, so this actually pessimizes the code (due to the extra "or").

…es semantics of `OffsetSizeAndStrideOpInterface`. The semantics of the ops that implement the `OffsetSizeAndStrideOpInterface` are that if the number of offsets, sizes, or strides is less than the rank of the source, then default values are filled in along the trailing dimensions (0 for offsets, the source dimension for sizes, and 1 for strides). This is confusing, especially with rank-reducing semantics. The immediate issue is that the methods of `OffsetSizeAndStrideOpInterface` assume that the number of values is the same as the source rank, which causes out-of-bounds errors. This patch simplifies the specification of `OffsetSizeAndStrideOpInterface` to make it invalid to specify a number of offsets/sizes/strides not equal to the source rank. Differential Revision: https://reviews.llvm.org/D115677

    ((Op1 + C) & C) u< Op1 --> Op1 != 0
    ((Op1 + C) & C) u>= Op1 --> Op1 == 0
    Op0 u> ((Op0 + C) & C) --> Op0 != 0
    Op0 u<= ((Op0 + C) & C) --> Op0 == 0

https://alive2.llvm.org/ce/z/iUfXJN
https://alive2.llvm.org/ce/z/caAtjj

    define i1 @src(i8 %x, i8 %y) {
      ; the add/mask must be with a low-bit mask (0x01ff...)
      %y1 = add i8 %y, 1
      %pop = call i8 @llvm.ctpop.i8(i8 %y1)
      %ismask = icmp eq i8 %pop, 1
      call void @llvm.assume(i1 %ismask)

      %a = add i8 %x, %y
      %m = and i8 %a, %y
      %r = icmp ult i8 %m, %x
      ret i1 %r
    }

    define i1 @tgt(i8 %x, i8 %y) {
      %r = icmp ne i8 %x, 0
      ret i1 %r
    }

I suspect this can be generalized in some way, but this is the pattern I'm seeing in a motivating test based on issue #52851.
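As an informal cross-check of the first fold (separate from the Alive2 proofs linked above), the following sketch brute-forces every 8-bit value of the operand against every low-bit mask. All function names here are made up for the illustration, and the mask condition mirrors the `ctpop(y + 1) == 1` assumption in the IR.

```cpp
#include <cassert>
#include <cstdint>

// True when c + 1 is a power of two in 8 bits (so c looks like 0b0..01..1).
// c == 255 is excluded because c + 1 wraps to 0, which has no set bit.
bool isLowBitMask(uint8_t c) {
  uint8_t next = static_cast<uint8_t>(c + 1);
  return next != 0 && (next & (next - 1)) == 0;
}

// Source pattern: ((x + c) & c) u< x, with 8-bit wraparound on the add.
bool srcPattern(uint8_t x, uint8_t c) {
  uint8_t masked = static_cast<uint8_t>((x + c) & c);
  return masked < x;
}

// Target pattern: x != 0.
bool tgtPattern(uint8_t x) { return x != 0; }

// Counts (x, c) pairs where the two patterns disagree.
int countMismatches() {
  int mismatches = 0;
  for (int c = 0; c < 256; ++c) {
    if (!isLowBitMask(static_cast<uint8_t>(c)))
      continue;
    for (int x = 0; x < 256; ++x) {
      if (srcPattern(static_cast<uint8_t>(x), static_cast<uint8_t>(c)) !=
          tgtPattern(static_cast<uint8_t>(x)))
        ++mismatches;
    }
  }
  return mismatches;
}
```

The intuition: with c = 2^k - 1, the masked sum is the low k bits of x minus one (mod 2^k), which is below x unless x is zero.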

…performance Avoid trying to resolve nested types that may not be needed because the name is already provided by the outer DIE.

Differential Revision: https://reviews.llvm.org/D116382

MaskRay and others added 6 commits December 24, 2021 15:41

[MLIR][LLVM] Add MemmoveOp to LLVM Dialect

2709fd1

LLVM Dialect in MLIR doesn't have a memmove op. This adds one. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D116274

[ELF] Add ELFFileBase::{elfShdrs,numELFShdrs} to avoid duplicate llvm…

b5a0f0f

…::object::ELFFile::sections() This mainly avoids the `relsOrRelas` cost in `InputSectionBase::relocate`. `llvm::object::ELFFile::sections()` has redundant and expensive checks.

[ELF] Cache global variable target in relocate*

745420d

This avoids repeated loads of the unique_ptr in hot paths.

[ELF] Optimize replaceCommonSymbols

40fae4d

This reduces the 0.2% time (no debug info) to nearly zero.

[CodeGen] Fix a memory leak

a8cbddc

pull bot added the ⤵️ pull label Dec 25, 2021

kazutakahirata and others added 23 commits December 24, 2021 20:57

Use Optional::getValueOr (NFC)

9c0a422

Use isa instead of dyn_cast (NFC)

62e48ed

Use {DenseSet,SetVector,SmallPtrSet}::contains (NFC)

76f0f1c

Use StringRef::contains (NFC)

3cfe375

[ELF][test] Add tests for mixed GD-to-IE and IE, mixed TLSDESC and GD

cde37a7

Note: mixed TLSDESC and GD currently does not work.

Remove redundant return and continue statements (NFC)

2d303e6

Identified with readability-redundant-control-flow.

[StaticAnalyzer] Remove redundant declaration isStdSmartPtr (NFC)

34558b0

An identical declaration is present just a couple of lines above the line being removed in this patch. Identified with readability-redundant-declaration.

[NFC] Method for evaluation of FCmpInst for constant operands

d86e2cc

New method `FCmpInst::compare` is added, which evaluates the given compare predicate for constant operands. Interface is made similar to `ICmpInst::compare`. Differential Revision: https://reviews.llvm.org/D116168
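As a rough illustration of what evaluating a floating-point compare predicate on constant operands involves, the sketch below handles a subset of the fcmp predicates on `double` values. `FPred` and `compareFP` are hypothetical names for this example; they are not the signature of `FCmpInst::compare`, which works on LLVM's own predicate and constant types.

```cpp
#include <cassert>
#include <cmath>

// Subset of fcmp predicates: "O" = ordered (false if either operand is NaN),
// "U" = unordered (true if either operand is NaN).
enum class FPred { OEQ, OLT, OGT, UEQ, ULT, UGT, ORD, UNO };

// Evaluates `pred` for two constant operands.
bool compareFP(double lhs, double rhs, FPred pred) {
  const bool unordered = std::isnan(lhs) || std::isnan(rhs);
  switch (pred) {
  case FPred::OEQ: return !unordered && lhs == rhs;
  case FPred::OLT: return !unordered && lhs < rhs;
  case FPred::OGT: return !unordered && lhs > rhs;
  case FPred::UEQ: return unordered || lhs == rhs;
  case FPred::ULT: return unordered || lhs < rhs;
  case FPred::UGT: return unordered || lhs > rhs;
  case FPred::ORD: return !unordered;
  case FPred::UNO: return unordered;
  }
  return false; // not reached for valid predicates
}
```

The ordered/unordered split is the main difference from the integer case: every predicate has to decide what happens when an operand is NaN.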

[lldb] Add support for UTF-8 unicode formatting

46cdcf0

This patch adds missing formatting for UTF-8 unicode. Cross-referencing https://reviews.llvm.org/D66447 Reviewed By: labath Differential Revision: https://reviews.llvm.org/D112564

[ELF] De-template handleTlsRelocation. NFC

dd4f5d4

[ELF] scanReloc: remove unused start parameter. NFC

a00f480

This was once used as a workaround for detecting missing PPC64 TLSGD/TLSLD relocations produced by ancient IBM XL C/C++.

Fix clang-tidy performance-move-const-arg in DLTI Dialect (NFC)

dabfefa

The const loop iterator was inhibiting the std::move().
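A small sketch of why a const loop variable defeats `std::move`: moving from a `const` object yields a `const T &&`, which cannot bind to a move constructor and falls back to the copy constructor. The `Tracked` type, counters, and function names are invented for this illustration and do not come from the DLTI patch.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Global counters for the illustration.
int g_copies = 0;
int g_moves = 0;

// Hypothetical type that records how it was constructed.
struct Tracked {
  Tracked() = default;
  Tracked(const Tracked &) { ++g_copies; }
  Tracked(Tracked &&) noexcept { ++g_moves; }
};

// const loop variable: std::move(t) is `const Tracked &&`, so push_back
// selects the const-reference overload and copies.
std::vector<Tracked> collectConst(std::vector<Tracked> &src) {
  std::vector<Tracked> out;
  out.reserve(src.size()); // no reallocation, so counts stay exact
  for (const Tracked &t : src)
    out.push_back(std::move(t));
  return out;
}

// Non-const loop variable: std::move(t) is `Tracked &&`, so elements move.
std::vector<Tracked> collectMutable(std::vector<Tracked> &src) {
  std::vector<Tracked> out;
  out.reserve(src.size());
  for (Tracked &t : src)
    out.push_back(std::move(t));
  return out;
}
```

Both functions compile without error, which is why this kind of pessimization tends to be caught by a check such as performance-move-const-arg rather than by the compiler.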

[ELF][test] Make some TLS tests less sensitive to addresses

d5e310b

[ELF] sortSymTabSymbols: change vector to SmallVector

2c8ebab

This function may take ~1% time. SmallVector<SymbolTableEntry, 0> is smaller (16 bytes instead of 24) and more efficient.

[ELF] reportRangeError: mention symbol name for non-STT_SECTION local…

20b4704

… symbols like non-global symbols

[ELF] Remove one redundant computeBinding

aabe901

This does resolve the redundancy in includeInDynsym().

[Orc] Remove a redundant declaration (NFC)

fc15fc5

DebugUtils.h contains an identical declaration with a correct comment, namely: /// Render a LookupKind. raw_ostream &operator<<(raw_ostream &OS, const LookupKind &K); Identified with readability-redundant-declaration.

Ensure newlines at the end of files (NFC)

7006d34

kazutakahirata and others added 29 commits December 29, 2021 00:16

[clang] Fix header guards (NFC)

b468281

Identified with llvm-header-guard.

[clang] Fix crash in bug52905

8de2d06

The root cause of the crash is an incorrect use of `cast`: the actual type and the cast-to type are different. This patch fixes the crash by converting the `cast` to `dyn_cast`.

[lldb] Adjust TestModuleCacheSimple for D115951

daed479

Now that we are caching the dwarf index as well, we will always have more than one cache file (when not using accelerator tables). I have adjusted the test to check for the presence of one _symtab_ index.

[AArch64] Remove outdated FIXME in test arm64-csel.ll. NFC.

4fedd4b

Fix lit feature name in 9dc4af3

3ad32df

[NewGVN] Prefer poison to undef when ranking operands

6d702a1

ping @alinas

[Hexagon] Handle floating point vector loads/stores

33fc675

[Hexagon] Handle floating point splats

2ce586b

Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com>

[NewGVN] Use poison instead of undef to represent unreachable values

680d409

This enables more simplifications and gets us closer to removing undef. ping @alinas

[clang] Remove unused "using" (NFC)

1b329fe

[clang] Use nullptr instead of 0 or NULL (NFC)

298367e

Identified with modernize-use-nullptr.

[Basic] Drop unnecessary const from return types (NFC)

ee3f557

Identified with readability-const-return-type.

[Hexagon] Calling conventions for floating point vectors

4df2aba

They are the same as for the other HVX vectors, but types need to be listed explicitly. Also, add a detailed codegen testcase. Co-authored-by: Abhikrant Sharma <quic_abhikran@quicinc.com>

[RISCV] Add a few more instructions to hasAllNBitUsers.

015ff72

[LV] Replace redundant tail-fold check with assert (NFC).

ba9016a

The code path can only be reached when folding the tail, so turn the check into an assertion.

[MLIR][LLVM] Expose powi intrinsic to MLIR

180455a

Expose the powi intrinsic to the LLVM dialect within MLIR Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D116364

[Hexagon] Improve BUILD_VECTOR codegen

505d574

For vectors with repeating values, the old codegen would rotate and insert every duplicate element. This patch replaces that behavior with a splat of the most common element; vinsert/vror occur only when needed.
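The idea can be sketched at a high level, independent of Hexagon: pick the most frequent element as the splat value, then patch only the lanes that differ. `SplatPlan` and `planBuildVector` are hypothetical names, and the tie-breaking rule (earliest value wins) is an assumption of this sketch, not something stated in the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical plan: splat one value, then insert at the listed lanes.
struct SplatPlan {
  int splatValue;
  std::vector<std::size_t> insertAt;
};

SplatPlan planBuildVector(const std::vector<int> &elems) {
  SplatPlan plan{0, {}};
  if (elems.empty())
    return plan;

  // Count occurrences of each element value.
  std::map<int, std::size_t> counts;
  for (int v : elems)
    ++counts[v];

  // Scan in lane order so that ties go to the earliest value (assumption).
  std::size_t best = 0;
  for (int v : elems) {
    if (counts[v] > best) {
      best = counts[v];
      plan.splatValue = v;
    }
  }

  // Only lanes that differ from the splat value need a separate insert.
  for (std::size_t i = 0; i < elems.size(); ++i)
    if (elems[i] != plan.splatValue)
      plan.insertAt.push_back(i);
  return plan;
}
```

For a vector like `{5, 5, 7, 5}` this yields one splat plus a single insert, instead of one insert per element.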

[Hexagon] Don't build two halves of HVX vector in parallel

ba07f30

There can only be one permute operation per packet, so this actually pessimizes the code (due to the extra "or").

[InstCombine] add tests for lshr(add(shl())); NFC

77df609

[InstCombine] add tests for unsigned overflow of bitmask offset; NFC

baa22e9

DWARFVerifier: Delay loading nested types in type dumping to improve …

f24dff3

…performance Avoid trying to resolve nested types that may not be needed because the name is already provided by the outer DIE.

[libc++] [NFC] Remove an unused parameter from __sift_down.

928852f

Differential Revision: https://reviews.llvm.org/D116382

devkadirselcuk merged commit 55a9809 into turkdevops:main Dec 29, 2021
