[cir] [Lowering] Handle VaArg #1088

This currently crashes on Linux, but only internally and not on GitHub's CI. Disable it temporarily while we investigate.

This is a straightforward adaptation of existing CodeGen logic. While I'm here, move block comments inside their blocks, so that they look nicer.

…VM (llvm#810) As title. Add setAlignmentAttr for GlobalOps created from AST. LLVM Lowering should have LLVM GlobalOp's alignment attribute inherited from CIR::GlobalOp. This PR is definitely needed to fix issue llvm#801 (comment), but the issue doesn't have alignment and comdat attribute for CIR Ops to begin with, so I'll keep investigating and fix CIR problem in another PR.

…y cast op (llvm#812) There are two occurrences of `cir.cast(array_to_ptrdecay, ...)` that unexpectedly drop the address space of the result pointer type. This PR fixes them by using the source address space.

```mlir
// Before
%1 = cir.cast(array_to_ptrdecay, %0 : !cir.ptr<!cir.array<!s32i x 32>, addrspace(offload_local)>), !cir.ptr<!s32i>
// After
%1 = cir.cast(array_to_ptrdecay, %0 : !cir.ptr<!cir.array<!s32i x 32>, addrspace(offload_local)>), !cir.ptr<!s32i, addrspace(offload_local)>
```

As mentioned at llvm#809 (comment), this PR adds more simplify transformations for select op:

- `cir.select if %0 then x else x` -> `x`
- `cir.select if %0 then #true else #false` -> `%0`
- `cir.select if %0 then #false else #true` -> `cir.unary not %0`
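
The three rewrites can be sketched outside of MLIR as a small pattern match. This is only an illustrative model: `Operand`, `value`, `constTrue`, `constFalse`, and `simplifySelect` are hypothetical names that are not part of ClangIR, and the result strings simply mirror the notation used in this message.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a select operand: either an SSA value or a
// boolean constant. Not a ClangIR type.
struct Operand {
  enum Kind { Value, ConstTrue, ConstFalse };
  Kind kind;
  std::string name;
};

Operand value(const std::string &name) { return Operand{Operand::Value, name}; }
Operand constTrue() { return Operand{Operand::ConstTrue, "#true"}; }
Operand constFalse() { return Operand{Operand::ConstFalse, "#false"}; }

bool sameOperand(const Operand &a, const Operand &b) {
  return a.kind == b.kind && a.name == b.name;
}

// Returns the simplified form as text, or an empty string when none of the
// three rules applies.
std::string simplifySelect(const std::string &cond, const Operand &ifTrue,
                           const Operand &ifFalse) {
  // select c, x, x -> x
  if (sameOperand(ifTrue, ifFalse))
    return ifTrue.name;
  // select c, #true, #false -> c
  if (ifTrue.kind == Operand::ConstTrue && ifFalse.kind == Operand::ConstFalse)
    return cond;
  // select c, #false, #true -> not c
  if (ifTrue.kind == Operand::ConstFalse && ifFalse.kind == Operand::ConstTrue)
    return "cir.unary not " + cond;
  return "";
}
```

For instance, `simplifySelect("%0", constFalse(), constTrue())` yields the negated condition, while two distinct values leave the select untouched.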

…p ops (llvm#818) This PR makes simple lowering generate the result type lowering logic, which makes it suitable for unary fp2int operations and binary fp2fp operations.

…ization (llvm#820) This PR adds initial support for temporary materialization of local temporaries with trivial cleanups. Materialization of global temporaries and local temporaries with non-trivial cleanups is far trickier than I initially thought, so I decided to submit this easy part first.

This is a straightforward adaptation from CodeGen. I checked the uses of the Delegating arg that's passed in various places, and it only appears to be used by virtual inheritance, which should be handled by llvm#624.

…on (llvm#813) Currently some bitcasts silently change the address space of the source pointer type, which hides some miscompilations of pointer types in CIR. llvm#812 is an example: the address space in the result pointer type is dropped, but a later bitcast keeps the types consistent no matter what the result type is. Such bitcasts are commonly emitted in CodeGen. CIR bitcasts are lowered to LLVM bitcasts, which also don't allow a mismatch between address spaces. This PR adds this verification.

We were previously printing the type alias for `struct S` as `!ty_22S22` instead of just `!ty_S`. This was because our alias computation for a StructType did the following:

    os << "ty_" << structType.getName()

`structType.getName()` is a `StringAttr`, and writing a `StringAttr` to an output stream adds double quotes around the actual string [1][2]. These double quotes then get hex-encoded as 22 [3]. We can fix this by writing the actual string value instead.

Aliases that would end in a number will now receive a trailing underscore because of MLIR's alias sanitization not allowing a trailing digit [4] (which ironically didn't kick in even though our aliases were always previously ending with a number, which might be a bug in the sanitization logic). Aliases containing other special characters (e.g. `::`) will still be escaped as before. In other words:

```
struct S {};
// before: !ty_22S22 = ...
// after:  !ty_S = ...

struct S1 {};
// before: !ty_22S122 = ...
// after:  !ty_S1_ = ...

struct std::string {};
// before: !ty_22std3A3Astring22
// after:  !ty_std3A3Astring
```

I'm not a big fan of the trailing underscore special-case, but I also don't want to touch core MLIR logic, and I think the end result is still nicer than before.

The tests were mechanically updated with the following command run inside `clang/test/CIR`, and the same commands can be run to update the tests for any in-flight patches. (These are for GNU sed; for macOS change the `-i` to `-i ''`.)

    find . -type f | xargs sed -i -E -e 's/ty_22([A-Za-z0-9_$]+[0-9])22/ty_\1_/g' -e 's/ty_22([A-Za-z0-9_$]+)22/ty_\1/g'

clang/test/CIR/CodeGen/stmtexpr-init.c needed an additional minor fix to swap the expected order of two type aliases in the CIR output.

clang/test/CIR/CodeGen/coro-task.cpp needed some surgery because it was searching for `22` to find the end of a type alias; I changed it to search for the actual alias instead.

If you run into merge conflicts after this change, you can auto-resolve them via smeenai@715f061

[1] https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2295
[2] https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2763
[3] https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1014
[4] https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1154
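
To make the before/after aliases easier to follow, here is a simplified model of the escaping described above. It is a sketch, not MLIR's code: `sanitizeAlias` is a hypothetical function, the set of characters kept verbatim (alphanumerics, `_`, `$`) is an assumption taken from the sed pattern, and checking the trailing digit on the raw name is also an assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <string>

// Simplified model of alias sanitization (hypothetical; not MLIR's code).
// Alphanumerics, '_' and '$' are kept; every other byte is written as two
// uppercase hex digits. A raw name ending in a digit gets a trailing '_'.
std::string sanitizeAlias(const std::string &name) {
  std::string out;
  for (unsigned char ch : name) {
    if (std::isalnum(ch) || ch == '_' || ch == '$') {
      out.push_back(static_cast<char>(ch));
    } else {
      char hex[3];
      std::snprintf(hex, sizeof(hex), "%02X", static_cast<unsigned>(ch));
      out += hex;
    }
  }
  if (!name.empty() && std::isdigit(static_cast<unsigned char>(name.back())))
    out.push_back('_');
  return out;
}
```

In this model, a quoted name such as `ty_"S1"` ends in a quote character rather than a digit, so no underscore is appended and the result is `ty_22S122`; the unquoted `ty_S1` becomes `ty_S1_`. Whether that is the actual reason the trailing-digit rule did not trigger in MLIR is not confirmed here.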

@test

This PR fixes the lowering for BrCond. Consider the following code snippet:

```
#include <stdbool.h>

bool test() {
  bool x = false;
  if (x)
    return x;
  return x;
}
```

Emitting the CIR to `tmp.cir` using `-fclangir-mem2reg` produces the following CIR (truncated):

```
!s32i = !cir.int<s, 32>
#fn_attr = #cir<extra({inline = #cir.inline<no>, nothrow = #cir.nothrow, optnone = #cir.optnone})>
module {
  cir.func no_proto @test() -> !cir.bool extra(#fn_attr) {
    %0 = cir.const #cir.int<0> : !s32i
    %1 = cir.cast(int_to_bool, %0 : !s32i), !cir.bool
    cir.br ^bb1
  ^bb1:  // pred: ^bb0
    cir.brcond %1 ^bb2, ^bb3
  ^bb2:  // pred: ^bb1
    cir.return %1 : !cir.bool
  ^bb3:  // pred: ^bb1
    cir.br ^bb4
  ^bb4:  // pred: ^bb3
    cir.return %1 : !cir.bool
  }
}
```

Lowering the CIR to LLVM using `cir-opt tmp.cir -cir-to-llvm` fails with:

```
tmp.cir:5:10: error: failed to legalize operation 'llvm.zext' marked as erased
```

The CIR cast `%1 = cir.cast(int_to_bool, %0 : !s32i)` is lowered to a CIR comparison with zero, which is then lowered to an `LLVM::ICmpOp` and `LLVM::ZExtOp`.

In the BrCond lowering, the zext is deleted when `zext->use_empty()`, but during this phase the lowering for the CIR above is not complete yet, because the zext will still have uses later.

The current check for when the zext is deleted is error-prone and can be improved.

To fix this, in addition to checking that the zext has no uses, we add a check that the defining operation for the BrCond condition in the CIR (the cast operation in this case) is used exactly once.

We haven't been able to find the root cause of llvm#829 just yet, and the problem also does not show up under an ASANified build. Add some extra information before we crash; hopefully that will shed some light.

…m#822) Allow the clang driver to use the lowering from CIR to the MLIR standard dialects. Update the test to match the real output when `-fno-clangir-direct-lowering` is used, or when both `-fclangir-direct-lowering` and `-fno-clangir-direct-lowering` are combined. --------- Co-authored-by: Bruno Cardoso Lopes <[email protected]> Co-authored-by: Shoaib Meenai <[email protected]>

We can now get the cleanup right for other potential throwing ctors, still missing LLVM lowering support.

This PR adds a new transformation that transforms suitable ternary operations into select operations. Currently, the "suitable" ternary operations are those where each branch satisfies one of the following criteria:

- The branch only contains a single `cir.yield` operation;
- The branch contains a `cir.const` followed by a `cir.yield` that yields the constant value produced by the `cir.const`.
- ~~The branch contains a `cir.load` followed by a `cir.yield` that yields the value loaded by the `cir.load`. The load operation cannot be volatile and must load from an alloca.~~

These criteria are hardcoded for now so that simple C/C++ ternary expressions can eventually be lowered to a `cir.select` operation instead.

This is permitted by the language, and IRGen emits traps for destructors other than the base object destructor. Make CIRGen follow suit.

) The first patch to fix llvm#803. This PR adds the calling convention attribute to CallOp directly, which is similar to LLVM, rather than adding the information to the function type, which would mimic the Clang AST function type. In CIR assembly, it is placed between the function type and the extra attributes, as follows:

```mlir
%1 = cir.call %fnptr(%a) : (!fnptr, !s32i) -> !s32i cc(spir_kernel) extra(#fn_attr)
```

The verification of direct calls is not included. It will be included in the next patch extending CIRGen & Lowering.

---

For every builder method of Call Op, an optional parameter `callingConv` is inserted right before the extra attribute parameter. However, apart from the parser / printer, this PR does not introduce any functional changes.

…anup

… go into entry block

FlattenCFG will soon get the necessary support for lowering to LLVM, this is CIRGen only for now.

…`constructAttributeList` (llvm#831) Similar to llvm#830, this PR completes the `setCIRFunctionAttributes` part with the call to the `constructAttributeList` method, so that func op and call op share the logic for handling these kinds of attributes, which is the design of OG CodeGen. It also includes other refactors. The function `constructAttributeList` now uses `mlir::NamedAttrList &` rather than the immutable attribute `mlir::DictionaryAttr &` as the inout result parameter, which makes additive merging of attributes easier.

…ith OG (llvm#830) Previously the body of `setExtraAttributesForFunc` corresponded to `SetLLVMFunctionAttributesForDefinition`, but its call site was not in the right position. This PR renames it and adjusts the calls to it to follow OG CodeGen. To be specific, `setExtraAttributesForFunc` was called right after the initialization of `FuncOp`. But in OG CodeGen, the list of attributes is constructed by several more functions: `SetLLVMFunctionAttributes` and `SetLLVMFunctionAttributesForDefinition`. This results in differences in the attributes of function declarations, which are reflected by the changes to the test files. Apart from that, there is no functional change. In other words, the two code paths calling `setCIRFunctionAttributesForDefinition` are tested by existing tests:

* Caller `buildGlobalFunctionDefinition`: tested by `CIR/CodeGen/function-attrs.cpp`, ...
* Caller `codegenCXXStructor`: tested by `CIR/CodeGen/delegating-ctor.cpp`, `defined-pure-virtual-func.cpp`, ...

The parser was looking for extra(...) before the return type while the pretty-printer put it after the return type. This was breaking the LSP-server for example. Change the parser behavior accordingly.

…#836) This PR implements the CIRGen and Lowering part of the calling convention attribute of `cir.call`-like operations. Here we have **4 kinds of operations**: (direct or indirect) x (`call` or `try_call`). According to our needs and the feasibility of constructing a test case, this PR includes:

* For CIRGen, only direct `call`. Until now, the only extra calling conventions are the SPIR ones, which cannot be set manually from source code using attributes. Meanwhile, OpenCL C *does not allow* function pointers or exceptions, therefore the only remaining case is a direct call.
* For Lowering, direct and indirect `call`, but not any `try_call`. Although it's possible to write all 4 kinds of calls with a calling convention in ClangIR assembly, exceptions are quite hard to write and read. I prefer a source-code-level test for them when one becomes available in the future, for example C++ `thiscall` with exceptions.
* Extra: the verification of calling convention consistency for direct `call` and direct `try_call`.

All unsupported cases are guarded by assertions or MLIR diags.

Consider the following code snippet `test.c`:

```
int test(int x) {
  static int arr[10] = {0, 1, 0, 0};
  return arr[x];
}
```

When lowering from CIR to LLVM using `bin/clang test.c -Xclang -fclangir -Xclang -emit-llvm -S -o -`, it produces:

```
clangir/mlir/lib/IR/BuiltinAttributes.cpp:1015: static mlir::DenseElementsAttr mlir::DenseElementsAttr::get(mlir::ShapedType, llvm::ArrayRef<llvm::APInt>): Assertion `hasSameElementsOrSplat(type, values)' failed.
```

I traced the bug back to `Lowering/LoweringHelpers.cpp` where we fill trailing zeros, and I believe this PR does it the right way. I have also added a very simple test for verification.
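
As a rough illustration of what "filling trailing zeros" has to achieve, the sketch below expands a list of explicit elements plus a trailing-zero count into a full element list whose length matches the array size. It assumes the array constant is stored in that split form; `expandWithTrailingZeros` is a hypothetical helper and not the code in `LoweringHelpers.cpp`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: append `trailingZeros` zero elements to the explicit
// values so that the result has one entry per array element.
std::vector<int> expandWithTrailingZeros(const std::vector<int> &explicitValues,
                                         std::size_t trailingZeros) {
  std::vector<int> result(explicitValues);
  result.insert(result.end(), trailingZeros, 0);
  return result;
}
```

For `arr[10]` above, explicit values `{0, 1}` with 8 trailing zeros (an assumed split) expand to exactly 10 elements, which is the kind of count a dense attribute over a 10-element shape would be expected to receive.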

@b

The main purpose of this PR is to add support for the C/C++ `annotate` attribute. The PR involves both CIR generation and LoweringPrepare. In the rest of this description, we first introduce the concept of the `annotate` attribute, then describe what LLVM expects regarding annotations, and after that describe how ClangIR handles them in this PR. Finally, we list trivial differences between the LLVM code generated by clang codegen and by ClangIR codegen.

**The concept of attribute annotate and expected LLVM IR**

The following is a C example of annotations, say in example.c:

    int *b __attribute__((annotate("withargs", "21", 12)));
    int *a __attribute__((annotate("oneargs", "21")));
    int *c __attribute__((annotate("noargs")));

Here "withargs" is the annotation string, and "21" and 12 are the arguments of the annotation named "withargs". An LLVM-based compiler is expected to keep this information and build a global variable capturing all annotations used in the translation unit when emitting LLVM IR. This global variable itself is **not** constant, but it is initialized with constants related to the annotation representation, e.g. "withargs" should be a string literal variable in the IR. The global variable has the fixed name "llvm.global.annotations" and is of array-of-struct type. It should be initialized with a const array of const structs, where each const struct represents one annotation site and has 5 fields:

[ptr to annotated global var/func, ptr to annotation name string, ptr to translation unit string const, line_no, ptr to arguments const]

The annotation name strings and args constants, as well as this global variable, should be in section "llvm.metadata". For the above example, the generated LLVM IR should look like the following:

```
@b = global ptr null, align 8
@.str = private unnamed_addr constant [9 x i8] c"withargs\00", section "llvm.metadata"
@.str.1 = private unnamed_addr constant [10 x i8] c"example.c\00", section "llvm.metadata"
@.str.2 = private unnamed_addr constant [3 x i8] c"21\00", align 1
@.args = private unnamed_addr constant { ptr, i32 } { ptr @.str.2, i32 12 }, section "llvm.metadata"
@a = global ptr null, align 8
@.str.3 = private unnamed_addr constant [8 x i8] c"oneargs\00", section "llvm.metadata"
@.args.4 = private unnamed_addr constant { ptr } { ptr @.str.2 }, section "llvm.metadata"
@c = global ptr null, align 8
@.str.5 = private unnamed_addr constant [7 x i8] c"noargs\00", section "llvm.metadata"
@llvm.global.annotations = appending global [3 x { ptr, ptr, ptr, i32, ptr }] [{ ptr, ptr, ptr, i32, ptr } { ptr @b, ptr @.str, ptr @.str.1, i32 1, ptr @.args }, { ptr, ptr, ptr, i32, ptr } { ptr @a, ptr @.str.3, ptr @.str.1, i32 2, ptr @.args.4 }, { ptr, ptr, ptr, i32, ptr } { ptr @c, ptr @.str.5, ptr @.str.1, i32 3, ptr null }], section "llvm.metadata"
```

Notice that since variable c's annotation has no args, the last field of its corresponding annotation entry is a null pointer.

**ClangIR's handling of annotations**

In CIR, we introduce AnnotationAttr on GlobalOp and FuncOp to record their annotations. That way, a future CIR pass that is interested in annotations can query them quickly. We leave the work of generating the const variables as well as the global annotations variable to LLVM lowering. However, at LoweringPrepare we collect all annotations and create a module attribute "cir.global_annotations" to facilitate LLVM lowering.

**Some implementation details and trivial differences between clangir generated LLVM code and vanilla LLVM code**

1. I suffix the names of constants generated for annotation purposes with ".annotation" to avoid redefinition, but clang codegen doesn't do so.
2. clang codegen seems to visit FuncDecls in a slightly different order than CIR, so the order of elements in the initial value const array of the llvm.global.annotations variable sometimes differs from the clang-generated LLVM IR. This should be trivial, as I don't expect consumers of this variable to assume a fixed order of collected annotations.

Otherwise, clang codegen and clangir generate pretty much the same LLVM IR for annotations!

Now that the basics are working, start adding cleanups to be attached to cir.call's instead. This is necessary in order to tie the pieces (landing pads and cleanups) together more properly, allowing multiple calls inside a cir.try op to be connected with the right cleanup. This is the first piece of a series, tests coming next.

… with CIR

…ry to be close to call generation

…locks inside calls getEHDispatchBlock result isn't really used to track anything just yet, so this change isn't supposed to affect anything. This is a building block for having a cleanup per call.

There is a typo in `call.cir` that uses a wrong function argument type, leading to failure in the final LLVM IR translation. CIR verification does not reject it, because it skips indirect calls at the beginning. It's `verifySymbolUses` after all. https://github.com/llvm/clangir/blob/bde154cf1243cc4f938339c4dc15b1576d3025ab/clang/lib/CIR/Dialect/IR/CIRDialect.cpp#L2672-L2679 The typo was copied to another IR test. Here we fix them all.

…ages (llvm#840) Fix llvm#805. This PR includes an end-to-end implementation. The `convergent` attribute is set depending on the language, which is wrapped as `langOpts.assumeFunctionsAreConvergent()`. Therefore, in ClangIR, every `cir.func` under `#cir.lang<opencl_c>` is set to be convergent. After lowering to LLVM IR, the `PostOrderFunctionAttrs` pass will then remove unnecessary `convergent` attributes. In other words, we will still see `convergent` on every function with `-O0`, but not with the default optimization level. The test taken from `CodeGenOpenCL/convergent.cl` is a bit complicated. However, the core of it is that `convergent` is set properly for `convfun()`, `non_convfun()`, `f()`, and `g()`. The merge of the two `if`s is more or less a result of generating the same LLVM IR as OG.

…#843) Per the operation walking documentation [1]:

> A callback on a block or operation is allowed to erase that block or
> operation if either:
> * the walk is in post-order, or
> * the walk is in pre-order and the walk is skipped after the erasure.

We were doing neither when erasing terminator operations and replacing them with a branch, leading to a use after free and ASAN errors.

This fixes the following tests with ASAN:

```
Clang :: CIR/CodeGen/switch-gnurange.cpp
Clang :: CIR/Lowering/atomic-runtime.cpp
Clang :: CIR/Lowering/loop.cir
Clang :: CIR/Lowering/loops-with-break.cir
Clang :: CIR/Lowering/loops-with-continue.cir
Clang :: CIR/Lowering/switch.cir
Clang :: CIR/Transforms/Target/x86_64/x86_64-call-conv-lowering-pass.cpp
Clang :: CIR/Transforms/loop.cir
Clang :: CIR/Transforms/switch.cir
```

These two tests still fail with ASAN after this, which I'm looking into:

```
Clang :: CIR/CodeGen/pointer-arith-ext.c
Clang :: CIR/Transforms/Target/x86_64/x86_64-call-conv-lowering-pass.cpp
```

`CIR/CodeGen/global-new.cpp` is failing even on a non-ASAN Release build for me on the parent commit, so it's unrelated.

[1] https://github.com/llvm/llvm-project/blob/0c55ad11ab3857056bb3917fdf087c4aa811b790/mlir/include/mlir/IR/Operation.h#L767-L770
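
The hazard is the same one as erasing from a container while a traversal still holds a reference to the erased element. The sketch below shows one generic way to avoid it: record what needs rewriting during the traversal and mutate afterwards. It is a toy model on `std::list`, with hypothetical names (`ToyOp`, `replaceYieldsWithBranches`), and it is not necessarily the approach taken in this patch.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Toy stand-in for an operation; not an MLIR type.
struct ToyOp {
  std::string name;
};

// Replace every "yield" with a "br" without erasing during the traversal.
void replaceYieldsWithBranches(std::list<ToyOp> &block) {
  // First pass: only collect positions, do not mutate.
  std::vector<std::list<ToyOp>::iterator> pending;
  for (auto it = block.begin(); it != block.end(); ++it)
    if (it->name == "yield")
      pending.push_back(it);

  // Second pass: the traversal is over, so inserting and erasing is safe.
  // std::list keeps iterators to other elements valid across both calls.
  for (auto it : pending) {
    block.insert(it, ToyOp{"br"});
    block.erase(it);
  }
}
```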

CIR Codegen fails to generate functions with local types that have the same name. For instance, the following code:

```
void foo(int a, float b) {
  struct A { int x; };
  struct A loc = {a};
  {
    struct A { float y; };
    struct A loc = {b};
  }
}
```

fails on the following assertion: `Unable to find record layout information for type`. The problem is that we don't create a record layout for structures with equal names, and `CIRGenTypes::convertRecordDeclType` returns the wrong type for the second struct type in the example above. This PR fixes this problem. In the original codegen, the call to `Ty->setName(name)` resolves name collisions and assigns a proper name to the type. In our case it looks like we need to use the same approach as we did for anonymous structures, i.e. track the used names in the builder.

Also, I fixed the struct type creation. Previously, the type was created several times: first in `CIRGenTypes::convertRecordDeclType` and then in `CIRGenTypes::computeRecordLayout`. This is why the indexes used for naming anonymous structures had relatively big values, and this is where most of the test changes come from.
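
Tracking used names can be sketched as a small counter map. This is only an illustration of the idea: `RecordNameUniquer` is a hypothetical class, and the `.N` suffix scheme is an assumption rather than the naming the builder actually uses.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical name tracker: the first use of a name is returned unchanged,
// later uses get a numeric suffix so that each record type name is distinct.
class RecordNameUniquer {
  std::unordered_map<std::string, unsigned> useCount;

public:
  std::string getUniqueName(const std::string &base) {
    unsigned previousUses = useCount[base]++;
    if (previousUses == 0)
      return base;
    return base + "." + std::to_string(previousUses);
  }
};
```

With this scheme the two local `struct A` declarations in `foo` would receive the names `A` and `A.1`, so each can get its own record layout.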

…llvm#845) It looks like certain names should not be used: I could not even build CIR on Ubuntu with a relatively old glibc version. In this case `minor` and `major` are macros and cannot be used in this context. You can take a look at the comments in the [mlir/test/lib/Dialect/Test/TestDialect.h](https://github.com/llvm/clangir/blob/main/mlir/test/lib/Dialect/Test/TestDialect.h#L70) reference as well.

)

Generalize approach and be able to tie together cleanups with their matching throwing calls. Before this the dtors were not really emitted in the proper order. LLVM support for this still hits a NYI, so nothing special here on the LLVM lowering side.

Consider the following code snippet `test.c`:

```
typedef struct {
  char b;
  int c;
} D;

typedef struct {
  D e;
  int f;
} E;

void f1() { E a = {}; }
```

When emitting the CIR using `bin/clang test.c -Xclang -fclangir -Xclang -emit-cir -S -o -`, the current implementation gets:

```
NYI UNREACHABLE executed at ~/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:338!
```

This is only one of the many tests where this happens. Comparing the implementations of `CIR/CodeGen/CIRRecordLayoutBuilder.cpp` and clang's codegen `lib/CodeGen/CGRecordLayoutBuilder.cpp`, there is some padding missing for packed structures, and some alignments that need to be corrected.

This PR also updates 2 existing tests. In the first test, `structural-binding.cpp`, I updated some `cir.get_member` indexes. In the second test, `packed-structs.c`, I updated the `cir` layout for the structure, and added more tests.

I have compared the changes I made in the tests to the original clang codegen and everything seems fine.

This PR adds CIRGen support for the following 3 builtins related to compile-time assumptions:

- `__builtin_assume`
- `__builtin_assume_aligned`
- `__builtin_assume_separate_storage`

3 new operations are introduced to represent the three builtins.

_LLVMIR lowering for these builtins cannot be implemented at this moment_ due to the lack of operand bundle support in the LLVMIR dialect.

When the output file name is not specified via `-o`, the upstream clang uses `.ll` as the extension of the default output file name. This PR makes ClangIR follow this behavior.

Align parsing of cir.call with its pretty-printing.

…intrinsic (llvm#833) As title. This PR also introduces IntrinsicCallOp, which will be used to lower intrinsics to LLVM intrinsics. This PR handles clang::AArch64::BI__builtin_arm_ldrex. For this particular one, we only have a .cir test, because of an MLIR issue mentioned in llvm#833 (comment)

This is in prep for creating more accurate landing pads with respect to their function calls and associated cleanups; so far we only support one.

…ple landing pads

More refactoring, now the infra to generate one landing pad per call is up, but we still have an assert for more than one call, next commit will actually introduce new functionality.

…emission

…upport

…te personality functions While here, clean up getOrCreateLLVMFuncOp usage a bit.

Directly erasing the op causes a use after free later on, presumably because the lowering framework isn't aware of the op being deleted. This fixes `clang/test/CIR/CodeGen/pointer-arith-ext.c` with ASAN.

The loop was erasing the user of a value while iterating on the value's users, which results in a use after free. We're already assuming (and asserting) that there's only one user, so we can just access it directly instead. CIR/Transforms/Target/x86_64/x86_64-call-conv-lowering-pass.cpp was failing with ASAN before this change. We're now ASAN-clean except for llvm#829 (which is also in progress).

Reland llvm#638 This was reverted due to llvm#655. I tried to address the problem in the newest commit. The changes in this PR since the last landed version are:

- Move the definition of `cir::CIRGenConsumer` to `clang/include/clang/CIRFrontendAction/CIRGenConsumer.h`, and leave its `HandleTranslationUnit` interface empty, so that `cir::CIRGenConsumer` no longer needs to depend on CodeGen.
- Rename the old definition of `cir::CIRGenConsumer` in `clang/lib/CIR/FrontendAction/CIRGenAction.cpp` to `CIRLoweringConsumer`, which inherits from `cir::CIRGenConsumer` and implements the original `HandleTranslationUnit` interface.

I feel this may improve readability even without my original patch.

This PR fixes the lowering for multi-dimensional arrays. Consider the following code snippet `test.c`:

```
void foo() { char arr[4][1] = {"a", "b", "c", "d"}; }
```

When run with `bin/clang test.c -Xclang -fclangir -Xclang -emit-llvm -S -o -`, it produces the following error:

```
~/clangir/llvm/include/llvm/Support/Casting.h:566: decltype(auto) llvm::cast(const From&) [with To = mlir::ArrayAttr; From = mlir::Attribute]: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
```

The bug can be traced back to `LoweringHelpers.cpp`. It considers the values in the array as integer types, and this causes an error in this case. This PR updates `convertToDenseElementsAttrImpl` when the array contains string attributes. I have also added one more similar test. Note that in the tests I used a **literal match** to avoid matching as regex, so `!dbg` is useful.

Support expressions at the top level such as

    const unsigned int n = 1234;
    const int &r = (const int&)n;

Reviewers: bcardosolopes

Pull Request: llvm#857

This is to match clang CodeGen

@smeenai

Fix llvm#829 Thanks @smeenai for pointing out the root cause and UBSan failure!

As title. Also introduced a buildAArch64NeonCall skeleton, which is partially the counterpart of OG's EmitNeonCall. It could be used for many other neon intrinsics. --------- Co-authored-by: Guojin He <[email protected]>

… it (llvm#859)

These were uninitialized, which led to intermittent test failures from the use of uninitialized variables. Initialize them to `nullptr` as is done with other member variables that are pointers to fix this. I did a quick spot-check and didn't find other uninitialized variables in the main CGF class itself. Lots of subclasses have uninitialized member variables, but those are presumably expected to be initialized at all points of construction, so we can leave them alone until they cause any issues. `ninja check-clang-cir` now passes with ASan+UBSan and MSan. Fixes llvm#829
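
As a minimal sketch of the pattern (with hypothetical names, not the actual CGF members), a default member initializer guarantees that a pointer member has a defined value no matter which constructor runs:

```cpp
#include <cassert>

// Hypothetical stand-ins; the real class and member names differ.
struct ToyScope {};

class ToyFunctionState {
public:
  // Default member initializer: reads are well-defined even before the
  // pointer is first assigned.
  ToyScope *currentScope = nullptr;

  bool hasScope() const { return currentScope != nullptr; }
};
```

Without the initializer, a default-initialized `ToyFunctionState` would leave `currentScope` indeterminate, and reading it would be the kind of uninitialized use that MSan reports.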

See the test for example.

This PR adds aarch64 big endian support. Basically the support for aarch64_be itself is expressed only in two extra cases for the switch statement and changes in the `CIRDataLayout` are needed to prove that we really support big endian. Hence the idea for the test - I think the best way for proof is something connected with bit-fields, so we compare the results of the original codegen and ours.

This PR splits the old `cir-simplify` pass into two new passes, namely `cir-canonicalize` and `cir-simplify` (the new `cir-simplify`). The `cir-canonicalize` pass runs transformations that do not affect CIR-to-source fidelity much, such as operation folding and redundant operation elimination. On the other hand, the new `cir-simplify` pass runs transformations that may significantly change the code and break high-level code analysis passes, such as more aggressive code optimizations. This PR also updates the CIR-to-CIR pipeline to fit these two new passes. The `cir-canonicalize` pass is moved to the very front of the pipeline, while the new `cir-simplify` pass is moved to the back of the pipeline (but still before lowering prepare of course). Additionally, the new `cir-simplify` now only runs when the user specifies a non-zero optimization level on the frontend. Also fixed some typos and resolved some `clang-tidy` complaints along the way. Resolves llvm#827.

Currently the C style cast is not implemented/supported for unions. This PR adds support for union casts as done in `CGExprAgg.cpp`. I have also added an extra test in `union-init.c`.

Mistakenly closed llvm#850 llvm#850 (review) This PR fixes array initialization for expression arguments. Consider the following code snippet `test.c`:

```
typedef struct {
  int a;
  int b[2];
} A;

int bar() { return 42; }

void foo() { A a = {bar(), {}}; }
```

When run with `bin/clang test.c -Xclang -fclangir -Xclang -emit-cir -S -o -`, it produces the following error:

```
~/clangir/clang/lib/CIR/CodeGen/CIRGenExprAgg.cpp:483: void {anonymous}::AggExprEmitter::buildArrayInit(cir::Address, mlir::cir::ArrayType, clang::QualType, clang::Expr*, llvm::ArrayRef<clang::Expr*>, clang::Expr*): Assertion `NumInitElements != 0' failed.
```

The error can be traced back to `CIRGenExprAgg.cpp`, and the fix is simple. It is possible to have an empty array initialization as an expression argument!

As title, if the element type of a vector type is sized, then the vector type should be deemed sized. This enables us to generate code for neon without triggering an assertion.

…eon_vrndaq_v (llvm#871) as title. This also added NeonType support for Float32 Co-authored-by: Guojin He <[email protected]>

…::saved_type::save

It will hit another assert when calling initFullExprCleanup.

This PR fixes the case where a temporary variable is used and the `alloca` operation is inserted at the start of the block, before the `label` operation. Implementation: when we search for the place to put the `alloca` in a block, we take label operations into account as well. Fix llvm#870 --------- Co-authored-by: Bruno Cardoso Lopes <[email protected]>

`__attribute__((annotate()))` was only accepting integer literals, preventing some meta-programming usage for example. This should be extended to some other kinds of types. --------- Co-authored-by: Bruno Cardoso Lopes <[email protected]>

Just as the title says, but only covers non-exception path, that's coming next.

Nothing unblocked yet, just hit next assert in the same path.

… exceptions Code path still hits an assert sooner, incremental NFC step.

…lvm#878) Close llvm#876 We've already considered the case where there are random statements after a switch case:

```
for (auto *c : compoundStmt->body()) {
  if (auto *switchCase = dyn_cast<SwitchCase>(c)) {
    res = buildSwitchCase(*switchCase, condType, caseAttrs);
  } else if (lastCaseBlock) {
    // This means it's a random stmt following up a case, just
    // emit it as part of previous known case.
    mlir::OpBuilder::InsertionGuard guardCase(builder);
    builder.setInsertionPointToEnd(lastCaseBlock);
    res = buildStmt(c, /*useCurrentScope=*/!isa<CompoundStmt>(c));
  } else {
    llvm_unreachable("statement doesn't belong to any case region, NYI");
  }

  lastCaseBlock = builder.getBlock();

  if (res.failed())
    break;
}
```

However, maybe as an oversight, in the `if (lastCaseBlock)` branch the insertion point is updated automatically when the RAII object `guardCase` is destroyed, and only then do we assign a value to `lastCaseBlock`. This is why we see the weird code pattern described in the issue.

BTW, I found that the code in CIRGenStmt.cpp is far less similar to OG than the code in other codegen places. Is this intentional? What are the motivation and guidelines here?
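
The mechanism at play can be shown with a toy RAII guard. This is a minimal sketch with hypothetical types (`ToyBuilder`, `ToyInsertionGuard`, `Observed`), modelling only the fact that an insertion guard restores the saved insertion point when it goes out of scope; it does not reproduce the CIRGen code.

```cpp
#include <cassert>

// Toy builder: the "insertion point" is just a block number.
struct ToyBuilder {
  int currentBlock = 0;
};

// Toy guard: saves the insertion point on construction and restores it on
// destruction, similar in spirit to an insertion guard.
class ToyInsertionGuard {
  ToyBuilder &builder;
  int saved;

public:
  explicit ToyInsertionGuard(ToyBuilder &b)
      : builder(b), saved(b.currentBlock) {}
  ~ToyInsertionGuard() { builder.currentBlock = saved; }
  ToyInsertionGuard(const ToyInsertionGuard &) = delete;
  ToyInsertionGuard &operator=(const ToyInsertionGuard &) = delete;
};

struct Observed {
  int insideGuard;
  int afterGuard;
};

// Moves the insertion point under a guard and records what the builder
// reports inside the guarded scope and after it has ended.
Observed emitIntoBlock(ToyBuilder &builder, int targetBlock) {
  Observed seen{};
  {
    ToyInsertionGuard guard(builder);
    builder.currentBlock = targetBlock;
    seen.insideGuard = builder.currentBlock;
  }
  // The guard has been destroyed here, so the old position is back.
  seen.afterGuard = builder.currentBlock;
  return seen;
}
```

Any query of the current block made after the guarded scope therefore sees the restored position, not the block that was written to inside the scope.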

as title.

…lvm#882) As title. Notice that for those intrinsics, just like OG, we do not lower to llvm intrinsics, instead, do vector insert. The test case is partially from OG [aarch64-neon-vget.c](https://github.com/llvm/clangir/blob/85bc6407f559221afebe08a60ed2b50bf1edf7fa/clang/test/CodeGen/aarch64-neon-vget.c) But, I did not do all signed and unsigned int tests because unsigned and signed of the same width essentially just use the same intrinsic ID thus exactly same code path as far as this PR concerns. --------- Co-authored-by: Guojin He <[email protected]>

Reviewers: bcardosolopes Reviewed By: bcardosolopes Pull Request: llvm#881

…#877) We need a target-independent way to distinguish OpenCL kernels in ClangIR. This PR adds a unit attribute `OpenCLKernelAttr` similar to the one in Clang AST. This attribute is attached to the extra attribute dictionary of `cir.func` operations only. (Not for `cir.call`.)

…'case' (llvm#879) Motivating example:

```
extern "C" void action1();
extern "C" void action2();
extern "C" void case_follow_label(int v) {
  switch (v) {
  case 1:
  label:
  case 2:
    action1();
    break;
  default:
    action2();
    goto label;
  }
}
```

When we compile it, we hit:

```
case Stmt::CaseStmtClass:
case Stmt::DefaultStmtClass:
  assert(0 && "Should not get here, currently handled directly from SwitchStmt");
  break;
```

in `buildStmt`. The cause is clear: we call `buildStmt` when we build the label stmt. To solve this, I think we should be able to build a case stmt in `buildStmt`. But the new problem is that we need to pass along information such as caseAttr and condType. So I tried to add that information to CIRGenFunction as data members.

llvm#884) as title. This PR has a similar test case organization to [PR882](llvm#882). Note that, compared to OG, this PR combines the cases for some pairs of intrinsics, such as BI__builtin_neon_vget_lane_f32 and BI__builtin_neon_vdups_lane_f32. They have the same code generated in OG and CIRGen. OG separates them into different cases because it passes mnemonics, which are different. CIRGen doesn't pass those, so why not combine them. Co-authored-by: Guojin He <[email protected]>

As title. This completes the fix for the issue [LLVM lowering missing comdat and constant attributes](llvm#801).

as title. Also add the function `buildCommonNeonBuiltinExpr`, just like OG's `emitCommonNeonBuiltinExpr`. This might help consolidate the neon cases and share common code.

Notice:
- I pretty much kept the skeleton of OG's `emitCommonNeonBuiltinExpr`, at the cost of leaving a few of the variables it calculates unused. They might help in the future.
- The purpose of having CommonNeonBuiltinExpr is to reduce code duplication in the implementation. So far, we only have one type implemented, and it's hard for CIR to be more generic. But we should see whether, in the future, different types of intrinsics can share a more generic code path.

--------- Co-authored-by: Guojin He <[email protected]>

…no override (llvm#893) As title. The test case uses abort(), but it comes from real code. Notice: since the CIR implementation of NoReturn calls is still pending, the generated LLVM code looks like `define dso_local void @test() #1 { call void @abort(), !dbg !8 ret void }`, which is not right. The correct code should look like `define dso_local void @test() #1 { call void @abort(), !dbg !8 unreachable }`. I am still sending this PR, as the NoReturn implementation is a separate issue.

as title. The test cases are from [clang codegen test case](https://github.com/llvm/clangir/blob/52323c17c6a3708b3eb72651465f7d4b82f057e7/clang/test/CodeGen/builtins.c#L37)

Before this patch, the CC lowering pass was applied only when explicitly requested by the user. This update changes the default behavior to always apply the CC lowering pass, with an option to disable it using the `-fno-clangir-call-conv-lowering` flag if necessary.

The primary objective is to make this pass a mandatory step in the compilation pipeline. This ensures that future contributions correctly implement the CC lowering for both existing and new targets, resulting in more consistent and accurate code generation.

From an implementation perspective, several `llvm_unreachable` statements have been substituted with a new `assert_or_abort` macro. This macro can be configured to either trigger a non-blocking assertion or a blocking unreachable statement. This facilitates test-by-test incremental development, as it does not require you to know which code path a test will trigger and just causes a crash if it does.

A few notable changes:
- Support multi-block functions in CC lowering
- Ignore pointer-related CC lowering
- Ignore CC lowering of no-proto functions
- Handle missing type evaluation kinds
- Fix CC lowering for function declarations
- Unblock indirect function calls
- Disable CC lowering pass on several tests

…ntrinsicString (llvm#899) as title. In addition, this PR has 2 extra changes. 1. change return type of GetNeonType into mlir::cir::VectorType so we don't have to do cast all the time, this is consistent with [OG](https://github.com/llvm/clangir/blob/db6b7c07c076cb738d0acae248d7c3c199b2b952/clang/lib/CodeGen/CGBuiltin.cpp#L6234) as well. 2. add getAArch64SIMDIntrinsicString helper function so we have better debug info when hitting NYI in buildCommonNeonBuiltinExpr --------- Co-authored-by: Guojin He <[email protected]>

Fix llvm#895. The pass is also missing some more thorough behavior, and it needs to be enabled by default when emitting object files. This reverts commit db6b7c0.

Then we can observe the time consumed in different parts of CIR. This patch is not complete, but I think that is fine given that we can always add the missing pieces easily.

> To keep information about whether an OpenCL kernel has uniform work
> group size or not, clang generates 'uniform-work-group-size' function
> attribute for every kernel:
>
> "uniform-work-group-size"="true" for OpenCL 1.2 and lower,
> "uniform-work-group-size"="true" for OpenCL 2.0 and higher if '-cl-uniform-work-group-size' option was specified,
> "uniform-work-group-size"="false" for OpenCL 2.0 and higher if no '-cl-uniform-work-group-size' options was specified.
> If the function is not an OpenCL kernel, 'uniform-work-group-size'
> attribute isn't generated.
>
> *From [Differential 43570](https://reviews.llvm.org/D43570)*

This PR introduces the `OpenCLKernelUniformWorkGroupSizeAttr` attribute to the ClangIR pipeline, towards the completeness in attributes for OpenCL. While this attribute is represented as a unit attribute in MLIR, its absence signifies either non-kernel functions or a `false` value for kernel functions. To match the original LLVM IR behavior, we also consider whether a function is an OpenCL kernel during lowering:

* If the function is not a kernel, the attribute is ignored. No LLVM function attribute is set.
* If the function is a kernel:
  * and the `OpenCLKernelUniformWorkGroupSizeAttr` is present, we generate the LLVM function attribute `"uniform-work-group-size"="true"`.
  * If absent, we generate `"uniform-work-group-size"="false"`.
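
The lowering decision described above can be summarized by a small standalone sketch. The function name and its boolean parameters are hypothetical and exist only for illustration; the real lowering works on MLIR attributes rather than plain flags.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: returns the value of the LLVM function attribute
// "uniform-work-group-size", or std::nullopt when no attribute is emitted.
std::optional<std::string> uniformWorkGroupSizeValue(bool isKernel,
                                                     bool hasUniformAttr) {
  // Non-kernel functions never get the attribute, whatever the unit
  // attribute says.
  if (!isKernel)
    return std::nullopt;
  // For kernels, presence of the unit attribute maps to "true" and
  // absence maps to "false".
  return std::string(hasUniformAttr ? "true" : "false");
}
```

The point of returning an optional is that "no attribute" and `"false"` are different outcomes, which is exactly why the kernel check is needed during lowering.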

…#897) `CIRGenModule::buildGlobal` --[rename]--> `CIRGenModule::getOrCreateCIRGlobal`

We already have `CIRGenModule::buildGlobal`, which corresponds to `CodeGenModule::EmitGlobal`. But there is an overload of `buildGlobal` used by `getAddrOfGlobalVar`. Since this name is confusing, this PR renames it to `getOrCreateCIRGlobal`.

Note that `getOrCreateCIRGlobal` already exists. Making the renamed function an overload of it is intentional: the renamed function is basically a wrapper around the original `getOrCreateCIRGlobal` with more specific parameters:

`getOrCreateCIRGlobal(decl, type, isDef)` --[call]--> `getOrCreateCIRGlobal(getMangledName(decl), type, decl->getType()->getAS(), decl, isDef)`

…m#901) just as title. --------- Co-authored-by: Guojin He <[email protected]>

…aller pieces (llvm#902) The missing feature flag for OpenCL has very few occurrences now. This PR rearranges them into proper pieces to better track them.

) Heterogeneous languages do not support exceptions, which corresponds to `nothrow` in ClangIR and `nounwind` in LLVM IR. This PR adds nothrow attributes for all functions for OpenCL languages in CIRGen. The Lowering for it is already supported previously.

Fix llvm#801 (the remaining `constant` part). Actually the missing stage is CIRGen. There are two places where `GV.setConstant` is called:

* `buildGlobalVarDefinition`
* `getOrCreateCIRGlobal`

Therefore, the primary test `global-constant.c` contains a global definition and a global declaration with use, which should be enough to cover the two paths. A test for OpenCL `constant` qualified global is also added. Some existing testcases need tweaking to avoid failure of missing constant.

as title. --------- Co-authored-by: Guojin He <[email protected]>

Consider the following code snippet `tmp.c`:

```
#define N 3200

struct S {
  double a[N];
  double b[N];
} s;

double *b = s.b;

void foo() {
  double x = 0;
  for (int i = 0; i < N; i++)
    x += b[i];
}

int main() {
  foo();
  return 0;
}
```

Running `bin/clang tmp.c -fclangir -o tmp && ./tmp` causes a segmentation fault. I compared the LLVM IR with and without CIR and noticed a difference which causes this:

`@b = global ptr getelementptr inbounds (%struct.S, ptr @s, i32 0, i32 1)` // no CIR
`@b = global ptr getelementptr inbounds (%struct.S, ptr @s, i32 1)` // with CIR

It seems there is a missing index when creating global pointers from structs. I have updated `Lowering/DirectToLLVM/LowerToLLVM.cpp`, and added a few tests.
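
To see why the missing index matters, here is a minimal standalone sketch of the byte offsets the two GEP forms compute. The helper names are hypothetical, and the arithmetic assumes the usual GEP rule that the first index steps over whole `%struct.S` objects while the second selects a field.

```cpp
#include <cstddef>

constexpr int N = 3200;

struct S {
  double a[N];
  double b[N];
};

// Byte offset for a GEP with (i32 idx0, i32 field): idx0 steps over whole
// structs, then the field index selects a member inside the struct.
std::ptrdiff_t offsetWithFieldIndex(std::ptrdiff_t idx0, int field) {
  std::ptrdiff_t fieldOffset =
      field == 0 ? static_cast<std::ptrdiff_t>(offsetof(S, a))
                 : static_cast<std::ptrdiff_t>(offsetof(S, b));
  return idx0 * static_cast<std::ptrdiff_t>(sizeof(S)) + fieldOffset;
}

// Byte offset for a GEP with a single index: it can only step over whole
// structs, so index 1 lands one past the end of `s`.
std::ptrdiff_t offsetWithoutFieldIndex(std::ptrdiff_t idx0) {
  return idx0 * static_cast<std::ptrdiff_t>(sizeof(S));
}
```

Under these assumptions, `(0, 1)` yields the offset of `s.b`, while a lone `1` yields `sizeof(S)`, so `b` points past the object and the loop in `foo` reads out of bounds.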

as title. Notice this is not target specific nor neon intrinsics.

Entails several minor changes:
- Duplicate resume blocks around.
- Disable LP caching, we repeat them as often as necessary.
- Update maps accordingly for tracking places to patch up.
- Make changes to clean up block handling.
- Fix an issue in flatten cfg.

as title. The current implementation in this PR uses `cir::CastOp` integral casting to implement vector type truncation, so the LLVM lowering code has been changed to accommodate it. In addition, code was added to [CIRGenBuiltinAArch64.cpp](https://github.com/llvm/clangir/pull/909/files#diff-6f7700013aa60ed524eb6ddcbab90c4dd288c384f9434547b038357868334932) to make it more similar to OG. ``` mlir::Type ty = vTy; if (!ty) ``` A test case was added to neon.c, as the file already contains similar vector move test cases such as vmovl. --------- Co-authored-by: Guojin He <[email protected]>

…m#935) as title. Also changed [neon-ldst.c](https://github.com/llvm/clangir/compare/main...ghehg:clangir-llvm-ghehg:macM3?expand=1#diff-ea4814b6503bff2b7bc4afc6400565e6e89e5785bfcda587dc8401d8de5d3a22) to make it have the same RUN options as OG [clang/test/CodeGen/aarch64-neon-intrinsics.c](https://github.com/llvm/clangir/blob/main/clang/test/CodeGen/aarch64-neon-intrinsics.c) Those options help us to avoid checking load/store pairs thus make the test less verbose and easier to compare against OG. Co-authored-by: Guojin He <[email protected]>

Implement derived-to-base address conversions for non-virtual base classes. The code gen for this situation was only implemented when the offset was zero, and it simply created a `cir.base_class_addr` op for which no lowering or other transformation existed. Conversion to a virtual base class is not yet implemented.

Two new fields are added to the `cir.base_class_addr` operation: the byte offset of the necessary adjustment, and a boolean flag indicating whether the source operand may be null. The offset is easy to compute in the front end while the entire path of intermediate classes is still available. It would be difficult for the back end to recompute the offset, so it is best to store it in the operation. The null-pointer check is best done late in the lowering process. But whether or not the null-pointer check is needed is only known by the front end; the back end can't figure that out. So that flag needs to be stored in the operation.

`CIRGenFunction::getAddressOfBaseClass` was largely rewritten. The code path no longer matches the equivalent function in the LLVM IR code gen, because the generated ClangIR is quite different from the generated LLVM IR.

`cir.base_class_addr` is lowered to LLVM IR as a `getelementptr` operation. If a null-pointer check is needed, then that is wrapped in a `select` operation.

When generating code for a constructor or destructor, an incorrect `cir.ptr_stride` op was used to convert the pointer to a base class. The code was assuming that the operand of `cir.ptr_stride` was measured in bytes; the operand is the number of elements, not the number of bytes. So the base class constructor was being called on the wrong chunk of memory. Fix this by using a `cir.base_class_addr` op instead of `cir.ptr_stride` in this scenario.

The use of `cir.ptr_stride` in `ApplyNonVirtualAndVirtualOffset` had the same problem. Continue using `cir.ptr_stride` here, but temporarily convert the pointer to type `char*` so the pointer is adjusted correctly.

Adjust the expected results of three existing tests in response to these changes. Add two new tests, one code gen and one lowering, to cover the case where a base class is at a non-zero offset.
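
As a rough illustration of the lowering and of the stride mix-up described above, here is a standalone sketch. Both helpers are hypothetical models written for this note, not code from the patch: the first mimics a byte-offset `getelementptr` wrapped in a null-checking `select`, the second shows how far a stride moves when the operand counts elements rather than bytes.

```cpp
#include <cstddef>

// Hypothetical model of a lowered base-class adjustment: a byte-offset
// address computation, guarded by a null check only when the front end
// says the source pointer may be null.
char *baseClassAddr(char *derived, std::ptrdiff_t byteOffset, bool mayBeNull) {
  if (mayBeNull && derived == nullptr)
    return nullptr;            // the `select` picks null for a null input
  return derived + byteOffset; // the byte-wise adjustment
}

// Hypothetical model of an element-counted stride: the pointer moves by
// count * elementSize bytes, not by count bytes.
std::ptrdiff_t ptrStrideBytes(std::ptrdiff_t count, std::ptrdiff_t elementSize) {
  return count * elementSize;
}
```

For a base at byte offset 8, striding a `char*` by 8 moves 8 bytes, while striding a pointer to a 4-byte element by 8 moves 32 bytes. That mismatch is why the pointer is temporarily converted to `char*` where `cir.ptr_stride` is kept.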

Fix llvm#934 While here move scope op codegen outside the builder, so it's easier to dump blocks and operations while debugging.

After 5da4310, the LLVM dialect requires the variadic callee type to be present for variadic calls. The op builders take care of this automatically if you pass the function type, so change our lowering logic to do so. Add tests for this as well as a missing test for indirect function call lowering. Fixes llvm#913 Fixes llvm#933

See the test for the reproducer. It would crash due to the NYI. See https://github.com/llvm/llvm-project/blob/327124ece7d59de56ca0f9faa2cd82af68c011b9/clang/lib/CodeGen/CGExpr.cpp#L1295-L1373; I found that we've implemented all the cases in CGExpr.cpp. IIUC, we can remove the NYI.

Close llvm#883. See the above issue for details

The title describes the purpose of the PR. The logic was taken from the original CodeGen, and I added a test to check that `-fno-PIE` is indeed enabled.

This PR adds support for the `__fp16` type. CIRGen and LLVM lowering is included. Resolve llvm#900 .

…pare_and_swap (llvm#955) as title. This just follows the approach of `makeBinaryAtomicValue` in the same file, which does the right thing by creating SInt or UInt based on the first argument's signedness.

…for buildVarAnnotations

LLVM lowering support coming next.

…#842)" (llvm#944) This reverts commit 8f699fd and fixes some issues, namely:
- CC lowering pass will no longer fail if the function has no AST information that won't be used.
- Fixed CC lowering not disabling when running certain `cc1` compilation commands.
- CC lowering can now be disabled when calling `cir-opt` and `cir-translate`.
- Compilation commands that generate Object files should now invoke CC lowering by default.

I tried to run llvm-test-suite and it turned out that many tests fail with a segfault due to old-style C (let's remember Kernighan and Ritchie). This PR fixes it with the usual copy-pasta from the original codegen :) So let's take a look at the code:

```
void foo(x) short x; {}

int main() {
  foo(4);
  return 0;
}
```

and the CIR for the `foo` function is:

```
cir.func @foo(%arg0: !s32i) {
  %0 = cir.alloca !s16i, !cir.ptr<!s16i>, ["x", init]
  %1 = cir.cast(bitcast, %0 : !cir.ptr<!s16i>), !cir.ptr<!s32i>
  cir.store %arg0, %1 : !s32i, !cir.ptr<!s32i>
  cir.return
}
```

We bitcast the **address** (!!!) and store a value of a bigger size there. With this PR, everything looks fine:

```
cir.func no_proto @foo(%arg0: !s32i) {
  %0 = cir.alloca !s16i, !cir.ptr<!s16i>, ["x", init]
  %1 = cir.cast(integral, %arg0 : !s32i), !s16i
  cir.store %1, %0 : !s16i, !cir.ptr<!s16i>
  cir.return
}
```

We truncate the argument and store it.

P.S. The `bitcast` that was there before looks a little bit suspicious and dangerous. Are we sure we can do this unconditional cast while we create `StoreOp`?
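
The difference between the two stores can be modeled with a small standalone sketch. The 4-byte `Frame` is a hypothetical picture of stack memory where bytes 0-1 are the slot for `short x` and bytes 2-3 belong to something else; the function names are made up, and `memcpy` is used so the sketch itself stays well-defined.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical 4-byte frame: bytes 0-1 hold `short x`, bytes 2-3 are a
// neighbouring object that must not be touched.
using Frame = std::array<unsigned char, 4>;

// Models the old behaviour: the full 32-bit argument is written at the
// address of the 16-bit slot, so 4 bytes are overwritten.
Frame storeThroughBitcast(Frame f, std::int32_t arg) {
  std::memcpy(f.data(), &arg, sizeof(arg));
  return f;
}

// Models the fixed behaviour: the argument is converted to 16 bits first,
// so only the 2-byte slot is written.
Frame storeTruncated(Frame f, std::int32_t arg) {
  std::int16_t x = static_cast<std::int16_t>(arg);
  std::memcpy(f.data(), &x, sizeof(x));
  return f;
}
```

Starting from a frame filled with a marker byte, the first variant clobbers the neighbouring two bytes and the second leaves them intact, which is the memory corruption the truncating cast avoids.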

This PR tries to give a simple initial implementation for eliminating redundant loads of constant objects, an idea originally posted by OfekShilon. Specifically, this PR adds a new unit attribute `const` to the `cir.alloca` operation. Presence of this attribute indicates that the alloca-ed object is declared `const` in the input source program. CIRGen is updated accordingly to start emitting this new attribute.

…nOp (llvm#959) They should use PoisonOp (which becomes PoisonValue in LLVM IR), as that is OG's choice. Proof: we generate VecCreateOp [here](https://github.com/llvm/clangir/blob/2ca12fe5ec3a1e7279256f069010be2d68200585/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp#L1975), and its OG counterpart is [here](https://github.com/llvm/clangir/blob/2ca12fe5ec3a1e7279256f069010be2d68200585/clang/lib/CodeGen/CGExprScalar.cpp#L2096). OG uses PoisonValue. As for VecSplatOp, OG unconditionally [chooses PoisonValue](https://github.com/llvm/clangir/blob/2ca12fe5ec3a1e7279256f069010be2d68200585/llvm/lib/IR/IRBuilder.cpp#L1204). An even more solid proof for this case is that when we use OG to generate code for the test cases I changed in this PR, it always uses poison instead of undef as far as VecSplat and VecCreate are concerned. The [OG generated code for vectype-ext.cpp](https://godbolt.org/z/eqx1rns86) is here. The [OG generated code for vectype.cpp](https://godbolt.org/z/frMjbKGeT) is here. For reference, the generated CIR for the test case vectype-ext.cpp is [here](https://godbolt.org/z/frMjbKGeT). This is to unblock llvm#936 and help set it on the right path. Note: there might be other CIR vec ops that need to choose Poison to be consistent with OG, but I'd limit the scope of this PR and wait to see whether issues pop up in the future.

as title. Based on my experience with [this type of test](https://github.com/llvm/clangir/blob/a7ac2b4e2055e169d9f556abf5821a1ccab666cd/clang/test/CIR/CodeGen/attribute-annotate-multiple.cpp#L51), the number of characters in this line varies, as it contains the full file path, which changes between environments.

Pull Request: llvm#872

1. Add a new `cir.vtt.address_point` op for visiting the element of the VTT to initialize the virtual pointer.
2. Implement the `getVirtualBaseClassOffset` method, which provides a virtual offset to adjust to the actual virtual pointers in a virtual base.
3. Follow the original clang CodeGen scheme for the implementation of most other parts.

@bcardosolopes's note: this is cherry-picked from an older PR from Jing Zhang and slightly modified for updates: applied review, test, doc and operation syntax. It does not yet have LLVM lowering support; I'm going to make incremental changes on top of this. Any necessary CIR modifications to this design should follow up shortly too. Also, to make this work I added more logic to `addImplicitStructorParams` and `buildThisParam`.

as title. The generated code is the same as Clang codegen except for a small discrepancy in the GEP. OG generates code like this: `%6 = getelementptr inbounds <4 x i16>, ptr %retval.i, i32 1` CIR generates it a bit differently: `%6 = getelementptr <4 x i16>, ptr %retval.i, i64 1` The pointer offset difference might be trivial, because choosing i64 over i32 as the index type seems to be the LLVM dialect's choice. The lack of the `inbounds` keyword might be an issue: `mlir::cir::PtrStrideOp` currently does not lower to `LLVM::GEPOp` with the `inbounds` attribute, because `mlir::cir::PtrStrideOp` itself has no `inbounds`. That is probably because there was no need for it, though we do have an implementation of [`CIRGenFunction::buildCheckedInBoundsGEP`](https://github.com/llvm/clangir/blob/10d6f4b94da7e0181a070f0265d079419d96cf78/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp#L2762). Anyway, the issue is not in the scope of this PR and should be addressed in a separate PR. If we think this is an issue, I can create another PR and probably add an optional attribute to `mlir::cir::PtrStrideOp` to achieve it.

In addition to the lowering work, there are a couple of other changes:
1. Did a little refactoring to rename variables to the desired camelBack case.
2. Changed the neon-misc RUN options to be consistent with other neon test files and to make the test cases more concise.

…#951) as title. There are two highlights in this PR:
1. The PR introduces a new test file to cover neon intrinsics that move data, which is a big category. This would be the 5th neon test file, and we're committed to keeping the total number of neon test files within 6. This file uses another opt option, instcombine, which makes the LLVM code in the test more concise; with it, our -fclangir generated LLVM code is identical to OG's. It looks like OG did some instcombine optimization.
2. The `getIntFromMLIRValue` helper function could be substituted by [`mlir::cir::IntAttr getConstOpIntAttr` in CIRGenAtomic.cpp](https://github.com/llvm/clangir/blob/24b24557c98d1c031572a567b658cfb6254f8a89/clang/lib/CIR/CodeGen/CIRGenAtomic.cpp#L337). The function `mlir::cir::IntAttr getConstOpIntAttr` does more than `getIntFromMLIRValue`, and there is a FIXME in its comment, so I'm not sure whether we should just use `mlir::cir::IntAttr getConstOpIntAttr`; either is fine with me.

…ly (llvm#961) Close llvm#957

The previous algorithm for converting a multi-dimensional array to a tensor filled in the values one by one and filled in zero values conditionally. It had some problems handling multi-dimensional arrays, as the issue above shows: the generated values did not have the same shape as the original array.

The new algorithm fills in all the values ahead of time with the correct element size, then writes the values into their slots, so we only need to maintain the index to write to. I feel the new version has better performance (it avoids allocation) and slightly better readability.
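
A minimal sketch of the pre-fill idea, under stated assumptions: the function name is hypothetical, plain `int` vectors stand in for the attribute values, and the pre-filled default is assumed to be zero. It is not the code from the patch, only an illustration of "allocate the full shape once, then write by index".

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: flattens a possibly ragged 2-D initializer into a
// row-major buffer of exactly numRows * numCols elements.
std::vector<int> flattenWithZeroFill(const std::vector<std::vector<int>> &rows,
                                     std::size_t numRows,
                                     std::size_t numCols) {
  // Pre-fill the whole buffer so the result always has the full shape.
  std::vector<int> flat(numRows * numCols, 0);
  for (std::size_t r = 0; r < rows.size() && r < numRows; ++r)
    for (std::size_t c = 0; c < rows[r].size() && c < numCols; ++c)
      flat[r * numCols + c] = rows[r][c]; // only the write index is tracked
  return flat;
}
```

Because every slot exists before any value is written, missing trailing elements in a row cannot shift later rows, which is the shape problem the description mentions.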

…o llvm intrinsic (llvm#960) This PR refactored Neon Built in code in clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp a bit to make it cleaner. Also changed RUNOption of test file clang/test/CIR/CodeGen/AArch64/neon-arith.c to make test more concise, and easy to compare against OG (to compare, just remove -fclangir from llvm gen part of RUN, and the test should still pass)

This is a follow-up fix for the previous fix llvm#961. See the attached new test for the reproducer. Sorry for the initial oversight.

- The flag is the default even for cc1, so make it disable two level deep. - While here, remove the unnecessary flag disable for pure `-emit-cir`.

…nv-lowering While here, add more unreachables to cover some of the current errors, so that our users can see a clear message instead of a random cast assert of sorts. This covers at least all crashes seen when removing -fno-clangir-call-conv-lowering from all tests; there are probably other things we'll find as we exercise this path.

These are not meant to be used by any other component, make sure it's very specific.

This is the usual copy-paste-modify from CodeGen, though I changed all the variable names to conform to our new style. All these functions should be pulled out as common helpers when we're upstream.

llvm/llvm-project@1d0bd8e moves a conditional from CodeGen to AST, and this follows suit for consistency. (Our support for the Microsoft ABI is NYI anyway; this is just to make things simpler to follow when matching up logic between CodeGen and CIRGen.)

There is no change to testing functionality. This refactor lets those files share the same RUN options, which are easier to maintain and extend.

…#976) Close llvm#975 See the attached test case for example

This PR adds initial support for the `__int128` type. The `!cir.int` type is extended to support 128-bit integer types. This PR comes with a simple test that verifies the CIRGen and LLVM lowering of `!s128i` and `!u128i` work. Resolve llvm#953 .

…ll (llvm#982)

…ulhq_lane and vqrdmulh_lane (llvm#985)

…d (in CIR + Direct to LLVM) (llvm#966) Fixes llvm#931 Added type definition in CIRTypes.td, created appropriate functions for the same in CIRTypes.cpp like getPreferredAlignment, getPreferredAlignment, etc. Optionally added lowering in LowerToLLVM.cpp

… pipeline This is causing lots of churn. `-fclangir-call-conv-lowering` is not mature enough, assumptions are leading to crashes we cannot track with special messages, leading to not great user experience. Turn this off until we have someone dedicated to roll this out.

While here add some bits for ptr auth and match OG.

…lvm#990) LLVM's verifier enforces this, which was previously causing us to fail verification. This is a bit of a band-aid; the overall linkage and visibility setting flow needs some work to match the original.

…lvm#965) As title, but the important step in this PR is to allow CIR ShiftOp to take a vector of int type as its input type. As a result, I added a verifier to ShiftOp with 2 constraints:
1. The input types are either all vector types or all int types. This is consistent with LLVM::ShlOp, where a vector shift amount is expected.
2. In the spirit of C99 6.5.7.3, the shift amount type must be the same as the result type if a vector type is used. (This is enforced in LLVM lowering for scalar int types.)
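
The two constraints can be sketched as a standalone check. `ToyType` and `verifyShift` are hypothetical names invented for this illustration; the real verifier operates on MLIR types and emits op errors rather than returning strings.

```cpp
#include <string>

// Hypothetical, simplified type description: scalar ints have
// numElements == 1 and isVector == false.
struct ToyType {
  bool isVector;
  unsigned elementBits;
  unsigned numElements;
};

bool sameToyType(const ToyType &a, const ToyType &b) {
  return a.isVector == b.isVector && a.elementBits == b.elementBits &&
         a.numElements == b.numElements;
}

// Returns an empty string when the shift is accepted, otherwise a
// diagnostic describing the violated constraint.
std::string verifyShift(const ToyType &value, const ToyType &amount,
                        const ToyType &result) {
  // Constraint 1: operands are either all vectors or all scalar ints.
  if (value.isVector != amount.isVector)
    return "shift operands must be all vector or all int types";
  // Constraint 2: for vectors, the amount type must equal the result type.
  if (value.isVector && !sameToyType(amount, result))
    return "vector shift amount must have the same type as the result";
  return "";
}
```

Note that a scalar shift with a narrower amount type is accepted by this sketch, matching the remark that the scalar case is handled during LLVM lowering instead.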

…iltinExpr (llvm#967) This PR helps us triage unimplemented builtins (that are target independent). There are unhandled builtins in CIR codegen's [`CIRGenFunction::buildBuiltinExpr`](https://github.com/llvm/clangir/blob/4c446b3287895879da598e23164d338d04bced3e/clang/lib/CIR/CodeGen/CIRGenBuiltin.cpp#L305), and those builtins have an implementation in [OG](https://github.com/llvm/clangir/blob/4c446b3287895879da598e23164d338d04bced3e/clang/lib/CodeGen/CGBuiltin.cpp#L2573). Currently, those builtins are just treated as a LibraryCall or handled in some other way that eventually fails, and the failure messages are confusing. This PR addresses the problem by refactoring `CIRGenFunction::buildBuiltinExpr` to keep the same skeleton as its OG counterpart `CodeGenFunction::EmitBuiltinExpr`, and by adding the builtin name to the NYI message.

…n_vqsub (llvm#988)

Add more NFC skeleton while here.

Forgot to git add in cb0cb34

…lvm#971) The LLVM intrinsic `llvm.is.fpclass` is used to support multiple floating-point builtins: https://clang.llvm.org/docs/LanguageExtensions.html#builtin-isfpclass

> The `__builtin_isfpclass()` builtin is a generalization of functions
> isnan, isinf, isfinite and some others defined by the C standard. It tests
> if the floating-point value, specified by the first argument, falls into
> any of data classes, specified by the second argument.

I meant to support this by creating an IntrinsicCallOp directly, but I couldn't because of llvm#480: the return type of the intrinsic would mismatch. So I had to create a new op for it. I feel this might not be too bad, though; at least it is more explicit and more expressive.
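
To illustrate the "generalization of isnan, isinf, isfinite" idea, here is a small standalone sketch of a mask-based class test. The `ToyFPClass` bits and `toyIsFPClass` are hypothetical and deliberately coarser than the real intrinsic: they do not distinguish signs or signaling from quiet NaNs, and the bit values are not the ones LLVM uses.

```cpp
#include <cmath>

// Hypothetical, simplified class bits (not LLVM's encoding).
enum ToyFPClass : unsigned {
  ToyNaN = 1u << 0,
  ToyInf = 1u << 1,
  ToyNormal = 1u << 2,
  ToySubnormal = 1u << 3,
  ToyZero = 1u << 4
};

// Returns true if `v` falls into any of the classes selected by `mask`.
bool toyIsFPClass(double v, unsigned mask) {
  unsigned cls = 0;
  switch (std::fpclassify(v)) {
  case FP_NAN:       cls = ToyNaN; break;
  case FP_INFINITE:  cls = ToyInf; break;
  case FP_NORMAL:    cls = ToyNormal; break;
  case FP_SUBNORMAL: cls = ToySubnormal; break;
  case FP_ZERO:      cls = ToyZero; break;
  default:           return false; // implementation-defined category
  }
  return (cls & mask) != 0;
}
```

With this shape, an `isnan`-like test is a mask of `ToyNaN` and an `isfinite`-like test is a mask of `ToyNormal | ToySubnormal | ToyZero`, so one operation with a mask operand covers the whole family.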

…#996)

…lvm#999)

…#1000)

Re llvm#958

> Consider the following code snippet `tmp.c`:
> ```
> typedef struct {
>   int a, b;
> } S;
>
> void foo(S s) {}
> ```
> Running `bin/clang tmp.c -fclangir -Xclang -emit-llvm -Xclang -fclangir-call-conv-lowering -S -o -`, we get:
> ```
> loc(fused["tmp.c":5:1, "tmp.c":5:16]): error: 'llvm.bitcast' op result #0 must be LLVM-compatible non-aggregate type, but got '!llvm.struct<"struct.S", (i32, i32)>'
> ```
> We can also produce a similar error from this:
> ```
> typedef struct {
>   int a, b;
> } S;
>
> S init() { S s; return s; }
> ```
> gives:
> ```
> loc(fused["tmp.c":5:17, "tmp.c":5:24]): error: 'llvm.bitcast' op operand #0 must be LLVM-compatible non-aggregate type, but got '!llvm.struct<"struct.S", (i32, i32)>'
> ```
> I've traced the errors back to `lib/CIR/Dialect/Transforms/TargetLowering/LowerFunction.cpp` in `LowerFunction::buildAggregateStore`, `castReturnValue`, and `buildAggregateBitcast`.
>
> `withElementType(SrcTy)` is currently commented out/ignored in `LowerFunction.cpp`, but it is important.
>
> This PR adds/fixes this and updates one test.

I thought [about it](llvm#958 (comment)) and I understand adding `cir.bitcast` to circumvent the CIR checks, but I am not sure how we can ignore/drop the bitcast while lowering. I think we can just make the CIR casts correct. I have added a number of lowering tests to verify that the CIR is lowered properly. cc: @sitio-couto @bcardosolopes.

…s for source code global and local vars (llvm#1001) Now CIR supports annotations for both globals and locals. They all should just use the same set of annotation related globals including file name string, annotation name string, and arguments. This PR makes sure this is the case. FYI: for the test case we have, OG generates [ code ](https://godbolt.org/z/Ev5Ycoqj1), pretty much the same code except annotation variable names. This would fix the crash like > error: redefinition of symbol named '.str.annotation' > fatal error: error in backend: The pass manager failed to lower CIR to LLVMIR dialect!

… more (llvm#1002)

Previously we didn't generate the array index correctly. The existing test, globals-neg-index-array.c, was already incorrect:

```
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-cir %s -o %t.cir
// RUN: FileCheck --input-file=%t.cir %s
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fclangir -emit-llvm %s -o %t.ll
// RUN: FileCheck --check-prefix=LLVM --input-file=%t.ll %s
// RUN: %clang_cc1 -x c++ -triple x86_64-unknown-linux-gnu -emit-cir %s -o %t.cir
// RUN: FileCheck --input-file=%t.cir %s
// RUN: %clang_cc1 -x c++ -triple x86_64-unknown-linux-gnu -fclangir -emit-llvm %s -o %t.ll
// RUN: FileCheck --check-prefix=LLVM --input-file=%t.ll %s

struct __attribute__((packed)) PackedStruct {
  char a1;
  char a2;
  char a3;
};
struct PackedStruct packed[10];
char *packed_element = &(packed[-2].a3);
// CHECK: cir.global external @PACKED = #cir.zero : !cir.array<!ty_PackedStruct x 10> {alignment = 16 : i64} loc(#loc5)
// CHECK: cir.global external @packed_element = #cir.global_view<@PACKED, [-2 : i32, 2 : i32]>
// LLVM: @PACKED = global [10 x %struct.PackedStruct] zeroinitializer
+ // LLVM: @packed_element = global ptr getelementptr inbounds ([10 x %struct.PackedStruct], ptr @PACKED, i32 0, i32 -2, i32 2)
- // LLVM: @packed_element = global ptr getelementptr inbounds ([10 x %struct.PackedStruct], ptr @PACKED, i32 -2, i32 2)
```

Compiling it with `-fclangir -S`, we got:

```
packed:
  .zero 30
packed_element:
  .quad packed-54
```

but the traditional pipeline shows (https://godbolt.org/z/eTj96EP1E):

```
packed:
  .zero 30
packed_element:
  .quad packed-4
```

This may be a simple mismatch.
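
The two assembly offsets can be reproduced with a short standalone calculation. The helper names are hypothetical, and the arithmetic assumes the usual GEP rule: with a source element type of `[10 x %struct.PackedStruct]`, the first index steps over whole arrays (30 bytes), the second over elements (3 bytes), and the third selects a `char` field, whose index equals its byte offset here.

```cpp
#include <cstddef>

// Plain struct of three chars; it already has size 3, so the packed
// attribute is omitted in this sketch.
struct PackedStruct {
  char a1;
  char a2;
  char a3;
};
static_assert(sizeof(PackedStruct) == 3, "sketch assumes a 3-byte element");

constexpr std::ptrdiff_t kElemSize = sizeof(PackedStruct); // 3 bytes
constexpr std::ptrdiff_t kArraySize = 10 * kElemSize;      // 30 bytes

// Offset for indices (i0, i1, field): array step, element step, char field.
constexpr std::ptrdiff_t offsetThreeIndices(std::ptrdiff_t i0,
                                            std::ptrdiff_t i1,
                                            std::ptrdiff_t field) {
  return i0 * kArraySize + i1 * kElemSize + field;
}

// Offset for indices (i0, i1) only: array step, then element step.
constexpr std::ptrdiff_t offsetTwoIndices(std::ptrdiff_t i0,
                                          std::ptrdiff_t i1) {
  return i0 * kArraySize + i1 * kElemSize;
}
```

Under these assumptions, `(0, -2, 2)` gives -4, matching `packed-4` from the traditional pipeline, while `(-2, 2)` gives -54, matching the `packed-54` produced before the fix.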

…lvm#1009) The MLIR docs at https://mlir.llvm.org/docs/DefiningDialects/Operations/#literals specify that "An empty literal `` may be used to remove a space that is inserted implicitly after certain literal elements", so I inserted one before the `right` literal to remove the extra space that was being printed. Oddly, the bug is also fixed by inserting an empty literal _after_ the `left` literal, which leads me to believe that tablegen is inserting an implicit space after the `left` literal.

Close llvm#522

This solves two issues: we couldn't handle `case` in nested scopes, and we couldn't handle a switch body that is not a compound statement.

The core idea of the patch is to introduce the `cir.case` operation to the language. Then we can get the cases easily by traversing the body of the `cir.switch` operation, instead of counting the regions and the attributes. Every `cir.case` operation has a region, and `cir.switch` now has only one region too.

To make analysis and optimization easier, I add a new concept here, the `simple form`. A `cir.switch` operation is in simple form when all the `cir.case` operations owned by the `cir.switch` live in the top-level blocks of the `cir.switch` region and there are no other operations except the ending `cir.yield`.

This settles the previous `simplified for common-case` vs `general solution` discussion in llvm#522. After implementing this, I feel the correct answer is that we want a general solution for constructing and lowering the operations, but we like the simple and common case for analysis and optimizations. We had just mixed up the different phases.

For other semantics, see `CIROps.td`. Lowering can be done generally by lowering the cases one by one and finally lowering the switch itself.

Although this patch has 1000+ lines of changes, I feel it is relatively neat, especially since it removes some odd behaviors from before. Tested with Spec2017's C benchmarks except 500.perlbench_r.
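
A rough sketch of what a `simple form` check could look like, under heavy simplification: the function name is hypothetical, and the switch region is modeled as a flat list of top-level operation names. In this toy model a `cir.case` nested inside another operation (for example a scope) shows up as a non-case top-level op, which makes the switch non-simple.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical check: the switch is in simple form when every top-level
// op is a `cir.case`, except for the terminating `cir.yield`.
bool isSimpleForm(const std::vector<std::string> &topLevelOps) {
  if (topLevelOps.empty() || topLevelOps.back() != "cir.yield")
    return false;
  for (std::size_t i = 0; i + 1 < topLevelOps.size(); ++i)
    if (topLevelOps[i] != "cir.case")
      return false;
  return true;
}
```

Analyses and optimizations could then bail out early on switches that fail such a check, while construction and lowering still handle the general shape.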

…pes (llvm#1004) This PR adds support for return values of a struct type. There are two cases that are not covered by this PR and will be added later.

…1005) This PR adds several copy-pasted lines and a small test and now var args seems to work in the calling convention pass

Reviewers: smeenai Reviewed By: smeenai Pull Request: llvm#1022

We diverge from CodeGen here by delaying the function emission that happens for a global variable. However, due to situations where a global can be emitted while building out a function the old CGF might not be invalid. So we need to store it here just in case. Reviewers: bcardosolopes, smeenai Reviewed By: smeenai Pull Request: llvm#1023

This was declared but never implemented. Upon first usage in a later commit this fails to link. Reviewers: bcardosolopes, smeenai Reviewed By: smeenai Pull Request: llvm#1024

@bar

…tion lowering pass (llvm#1003) This PR adds initial function pointer support for the calling convention lowering pass. This is a suggestion, so any other ideas are welcome.

Several ideas were described in llvm#995. Basically, what I'm trying to do is generate clean CIR code without additional `bitcast` operations for function pointers and without a mix of lowered and initial function types.

#### Problem

It looks like we cannot just lower the function type and cast the value, since too many operations are involved. For instance, for the following simple code:

```
typedef struct {
  int a;
} S;

typedef int (*myfptr)(S);

int foo(S s) { return 42 + s.a; }

void bar() {
  myfptr a = foo;
}
```

we get the following CIR for the function `bar` before the calling convention lowering pass:

```
cir.func no_proto @bar() extra(#fn_attr) {
  %0 = cir.alloca !cir.ptr<!cir.func<!s32i (!ty_S)>>, !cir.ptr<!cir.ptr<!cir.func<!s32i (!ty_S)>>>, ["a", init]
  %1 = cir.get_global @foo : !cir.ptr<!cir.func<!s32i (!ty_S)>>
  cir.store %1, %0 : !cir.ptr<!cir.func<!s32i (!ty_S)>>, !cir.ptr<!cir.ptr<!cir.func<!s32i (!ty_S)>>>
  cir.return
}
```

As one can see, the first three operations depend on the function type. Once `foo` is lowered, we need to fix `GetGlobalOp`; otherwise the code fails verification, since the actual (lowered) type of `foo` differs from the one currently expected by the `GetGlobalOp`.

The first idea would be to rewrite only the `GetGlobalOp` and insert a bitcast after it, so both `AllocaOp` and `StoreOp` would work with proper types. Once the code gets more complex, we will need to take care of other possible use cases; e.g. if we use arrays, we will need to track array accesses as well in order to insert this bitcast every time an array element is needed.

One workaround I can think of: we fix the `GetGlobalOp` type and cast from the lowered type to the initial one, then cast back before the actual call happens. But that doesn't sound like a good and clean approach (from my point of view, of course).

So I suggest using a type converter and rewriting any operation that may deal with function pointers, making sure it has a proper type and that no unlowered function type remains in the program after the calling convention lowering pass.

#### Implementation

I added lowering for `AllocaOp` and `GetGlobalOp`, and split the lowering for `FuncOp` (the former `CallConvLoweringPattern`), lowering `CallOp` separately. Frankly speaking, I tried to implement a pattern for each operation, but for some reason the tests do not pass on Windows and macOS in that case: something weird happens inside the `applyPatternsAndFold` function. I suspect it's due to two different rewriters being used, one in the `LoweringModule` and one in the mentioned function. So I decided to follow the same approach as the `LoweringPrepare` pass and not involve this complex rewriting framework.

Next I will add a type converter for the struct type and patterns for `ConstantOp` (for const arrays and `GlobalViewAttr`). At the end of the day we'll have (at least I hope so) clean CIR code without any bitcasts for function pointers.

cc @sitio-couto @bcardosolopes

We had some incorrect logic when creating functions and getting their address which resulted in spurious "definition with the same mangled name" errors. Fix that logic to match original CodeGen, which also fixes these errors. It's expected that decls can appear in the deferred decl list multiple times, and CodeGen has to guard against that. In the case that triggered the error, both `CIRGenerator::HandleInlineFunctionDefinition` and CIRGenModule were deferring the declaration. Something else I discovered here is that we emit these functions in the opposite order as regular CodeGen: https://godbolt.org/z/4PrKG7h9b. That might be a meaningful difference worth investigating further. Fixes llvm#991

…intrinsics (llvm#1020) This PR also changes `buildNeonShiftVector` to allow it to generate negative shift values. When the shift value is negative, the shift amount vector is not used in any ShiftOp in the IR (as they don't need a sign to know the shift direction); instead, it is just an input argument to the shift intrinsic function call.

…vm#1028) This PR fixes the notorious double whitespace introduced by the visibility attribute, for `cir.func` only. It uses "leading whitespace" for every print, and the printing of the visibility attr is now properly guarded by a check of `!isDefault()`. Double whitespace in test files is removed.

This patch introduces support for the abs family of built-in functions (abs, labs, llabs).
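A minimal sketch of source code that exercises the three functions named above. The wrapper names `abs_i`, `abs_l`, and `abs_ll` are hypothetical and only make the calls easy to check.

```cpp
#include <cassert>
#include <cstdlib>

// Thin wrappers around the abs family; compilers typically treat these
// library functions as built-ins.
int abs_i(int v) { return std::abs(v); }
long abs_l(long v) { return std::labs(v); }
long long abs_ll(long long v) { return std::llabs(v); }
```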

Add lowering prepare logic to lower stores to cir.copy. This brings LLVM lowering closer to OG, and it turns out the rest of the compiler understands memcpys better and generates better assembly code for at least arm64 and x86_64. Note that the current lowering to memcpy only uses the i32 intrinsic version; this PR does not touch that code, which will be addressed in another PR.

POSION -> POISON

Clang recognizes the function anyway, but this is an obvious error.

…#1013) In addition, this PR enables ZeroAttr of vector type so that CIR can generate a vector initialized with all zero values.

Due to the issue described in llvm#1018, the MLIR lowering for `memmove` has been excluded from this patch.

…bits (llvm#1027) This PR adds support for return values of struct types > 128 bits in size. As usual, there are lots of copy-pasted lines from the original codegen, except for the `AllocaOp` replacement for the return value.
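For context, a struct larger than 128 bits returned by value might look like the sketch below. The names `Big` and `make_big` are hypothetical, and the 192-bit size assumes a 64-bit `long long`.

```cpp
#include <cassert>

// Three 64-bit fields: 192 bits in total, i.e. larger than 128 bits.
struct Big {
  long long a;
  long long b;
  long long c;
};

// Returns the struct by value; the source-level result must be the same
// however the ABI chooses to pass the return value back.
Big make_big(long long base) {
  Big r = {base, base + 1, base + 2};
  return r;
}
```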

@sitio-couto

… convention lowering pass (llvm#1034) This PR adds support for calls through function pointers. @sitio-couto I think it would be great if you could also take a look.
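As a sketch of what a call through a function pointer looks like in source, the following extends the `S`/`foo`/`myfptr` example from llvm#1003 with an indirect call. The helper `call_through_pointer` is hypothetical and is not part of the PR's tests.

```cpp
#include <cassert>

typedef struct {
  int a;
} S;

typedef int (*myfptr)(S);

int foo(S s) { return 42 + s.a; }

// Takes the address of `foo`, then calls it indirectly with a struct
// argument passed by value.
int call_through_pointer(int v) {
  myfptr f = foo;
  S s = {v};
  return f(s);
}
```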

…in_operator_delete (llvm#1035) The added test cases are from [OG's counterpart](https://github.com/llvm/clangir/blob/f9c5477ee10c9bc005ffbfe698691cc02193ea81/clang/test/CodeGenCXX/builtin-operator-new-delete.cpp#L7). I changed the run option to `-std=c++17` to support [std::align_val_t](https://en.cppreference.com/w/cpp/memory/new/align_val_t).

) This PR adds calling convention lowering support for the int128 type on x86_64. This is a follow-up to llvm#953.
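A small sketch of 128-bit integers crossing a function boundary, assuming a GCC- or Clang-compatible compiler that provides the `__int128` extension. The helpers `add128`, `high64`, and `low64` are hypothetical.

```cpp
#include <cassert>

// Takes and returns 128-bit integers by value.
__int128 add128(__int128 a, __int128 b) { return a + b; }

// Extracts the upper 64 bits of a 128-bit value.
unsigned long long high64(__int128 v) {
  return static_cast<unsigned long long>(static_cast<unsigned __int128>(v) >> 64);
}

// Extracts the lower 64 bits of a 128-bit value.
unsigned long long low64(__int128 v) {
  return static_cast<unsigned long long>(v);
}
```

Adding 1 to the largest 64-bit value carries into the upper half, which makes it easy to see that both halves survive the call.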

…lvm#1037)

Upstream accepted this being a ClangOption but it got lost in the rebase. Bring it back here.

CodeGen sometimes emits undef constants directly, e.g. when initializing an empty struct (https://godbolt.org/z/68od33aa8). We want to match this behavior, so we need a cir.undef attr to represent the constant. This change implements the lowering for the new op, which matches how cir.zero is lowered. A follow-up will change CIRGen to use it. It also replaces UndefOf with a ConstantOp of UndefAttr to avoid redundancy. Pull Request resolved: llvm#993

If an empty struct has a non-trivial constexpr constructor, CodeGen emits an undef constant. CIRGen was previously emitting an empty attribute, which got interpreted as constant evaluation failing, resulting in a global variable initializer being emitted. Change to undef to match CodeGen. https://godbolt.org/z/7M9EnEddx has the comparison between current CIRGen vs. original CodeGen; it should match after this lands. Pull Request resolved: llvm#994
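The kind of source this message describes can be sketched as follows. This is an assumed shape, not necessarily the exact code behind the godbolt link; `Empty`, `global_empty`, and `empty_with_nontrivial_ctor` are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

// An empty struct whose constexpr default constructor is user-provided,
// which makes the constructor non-trivial.
struct Empty {
  constexpr Empty() {}
};

// A global of that type; its initializer can be constant-evaluated.
Empty global_empty;

// True when Empty is an empty class without a trivial default constructor.
bool empty_with_nontrivial_ctor() {
  return std::is_empty<Empty>::value &&
         !std::is_trivially_default_constructible<Empty>::value;
}
```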

…vm#1038)

…lvm#1045) Note that in the test file, `test_vshrq_n_s32_32` and `test_vshr_n_u16_16` are additions to what traditional clang codegen already has. They test the case where the shift amount is the same as the element size (the compiler errors out if the shift amount is greater than the element size). OG didn't test that case here, but [has somewhat tested it elsewhere](https://github.com/llvm/clangir/blob/3d16a0f8499c43497a18a46d838313ab4deeadea/clang/test/CodeGen/aarch64-neon-shifts.c#L23).

Reproducer:

```
struct nested {
  union {
    const char *single;
    const char *const *multi;
  } output;
};

static const char * const test[] = {
  "test",
};

const struct nested data[] = {
  {
    {
      .multi = test,
    },
  },
  {
    {
      .single = "hello",
    },
  },
};
```

ClangIR currently fails to recognize `data` as an array because it fails to recognize the initializer for the union. This comes from a fundamental difference between CIR and LLVM IR. In LLVM IR, the union is simply a struct with the largest member, so it is fine to have only one init element. But in CIR, the union carries the information for all members, so if we only pass a single init element, we may be in trouble. We solve the problem by appending a placeholder attribute for the uninitialized fields.

This is just the usual adaption from CodeGen.

This follows the same implementation as CodeGen. llvm#1051 tracks potentially switching to a different strategy in the future.

We were missing an override for this previously and thus not emitting vtables when key functions were defined.

Upstream review of a PR requested that we be more explicit with differentiating things from MLIR to similarly named things from clang AST/LLVM/etc. So add an MLIRContext getter that we should start using. Reviewers: bcardosolopes Reviewed By: bcardosolopes Pull Request: llvm#1047

These are just missing getters/setters that should be there already. They are in use in a patch coming up. I'm splitting them out here for reviewability. Reviewers: bcardosolopes Pull Request: llvm#1021

llvm#1041) This PR adds support for some basic cases of struct types passed by value. The hardest part is probably the `createCoercedStore` function, which I rewrote significantly in order to make it closer to the original codegen.

This PR adds support for returning a struct by value in one missed case for the AArch64 big-endian arch.

`IntrinsicCallOp` is now named `LLVMIntrinsicCallOp` to better reflect its purpose. Also, in CIR we no longer have the "llvm" prefix; it is added later during LLVMLowering.

This should fix NYI errors like `BI__builtin_char_memchr NYI UNREACHABLE executed at clang/lib/CIR/CodeGen/CIRGenBuiltin.cpp:1402`. The test is from [OG](https://github.com/llvm/clangir/blob/3ef67c19917ad26ed8b19d4d13a43458a952fddb/clang/test/CodeGenCXX/builtins.cpp#L64). See the builtin's prototype: [char *__builtin_char_memchr(const char *haystack, int needle, size_t size); ](https://clang.llvm.org/docs/LanguageExtensions.html#string-builtins)

fix llvm#1057 --------- Co-authored-by: Bruno Cardoso Lopes <[email protected]> Co-authored-by: Sirui Mu <[email protected]>

…memory (llvm#1059) This PR covers one more case for return values of struct type, where `memcpy` is emitted.

…_v (llvm#1063)

…#1064)

This PR adds LLVMIR lowering support for `cir.assume`, `cir.assume.aligned`, and `cir.assume.separate_storage`.

In OG CodeGen, string literals have `private` linkage by default (marked by `cir_private` in CIR assembly). But CIR uses `internal`, which is probably an ancient typo. This PR aligns CIR with OG and modifies the existing test files accordingly.

…bits (llvm#1068) This PR adds partial support for so-called indirect function arguments for struct types with size > 128 bits on aarch64.

#### A few words about the implementation

The hard part is that it's not a one-to-one copy of the original codegen, though the code is of course inspired by it. In the original codegen, not much is done for indirect arguments inside the loop in `EmitFunctionProlog`, and an additional alloca is added at the end, in the call to the `EmitParamDecl` function. In our case, for an indirect argument (which is a pointer) we replace the original alloca with a new one and store the pointer there. We then replace all uses of the old alloca with a load from the new one, i.e. in both cases users work with a pointer to a structure. Also, I added several missing features to `constructAttributeList` for indirect arguments, but didn't preserve the original code structure, so let me know if I need to do that.
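To make the source-level contract concrete, the sketch below passes a struct larger than 128 bits by value. Whether a given ABI passes it indirectly is up to the lowering; what must hold in either case is that the callee works on its own copy. The names `Big`, `bump_and_sum`, and `caller_unchanged` are hypothetical.

```cpp
#include <cassert>

// Three 64-bit fields: 192 bits, assuming a 64-bit long long.
struct Big {
  long long a;
  long long b;
  long long c;
};

// The callee mutates its parameter; this must not be visible to the caller,
// even if the argument is passed as a pointer to a temporary under the hood.
long long bump_and_sum(Big s) {
  s.a += 100;
  return s.a + s.b + s.c;
}

// Calls bump_and_sum and reports the caller's own `a` field afterwards.
long long caller_unchanged() {
  Big s = {1, 2, 3};
  bump_and_sum(s);
  return s.a;
}
```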

Note that we lack two pieces of support for aliases in LLVM IR dialect globals: the `alias` keyword and function types `void (ptr)`. This needs to be done before we can nail this for good, but it's outside the scope of this commit. The behavior is slightly different under -O1, which will be addressed next.

…m#1073) This PR changes the naming format of string literals from `.str1` to `.str.1`, making it easier to reuse the existing testcases of OG CodeGen.

It's currently polluting the `cir` namespace with very generic symbols like `Integer` and `Memory`, which is pretty confusing. `X86_64ABIInfo` already has `Class` alias for `X86ArgClass`, so we can use that alias to qualify all uses.

Note that there are still missing pieces, which will be incrementally addressed.

… test

Just verified this is actually done by some LLVM optimization, not by the frontend emitting directly, so this is a non-goal now, since CIR can also use LLVM opts to do the same once we have real global alias.

Also verified this does not apply anymore; we match -O0. The only remaining part is to lower to proper LLVM globals once the LLVM IR dialect gets global alias support.

Commits on Nov 3, 2024

Commits on Nov 4, 2024

Commits on Nov 5, 2024

Commits on Nov 6, 2024

Commits on Nov 7, 2024

Commits on Nov 8, 2024
