LLVM 19.0.0git
WebAssemblyLowerEmscriptenEHSjLj.cpp
//=== WebAssemblyLowerEmscriptenEHSjLj.cpp - Lower exceptions for Emscripten =//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
///
/// \file
/// This file lowers exception-related instructions and setjmp/longjmp function
/// calls to use Emscripten's library functions. The pass uses JavaScript's try
/// and catch mechanism in case of Emscripten EH/SjLj and Wasm EH intrinsics in
/// case of Wasm SjLj.
///
/// * Emscripten exception handling
/// This pass lowers invokes and landingpads into library functions in JS glue
/// code. Invokes are lowered into function wrappers called invoke wrappers that
/// exist on the JS side and wrap the original function call in a JS try-catch.
/// If an exception occurs, the __cxa_throw() function on the JS side sets some
/// variables (see below) so we can check whether an exception occurred from
/// wasm code and handle it appropriately.
///
/// * Emscripten setjmp-longjmp handling
/// This pass lowers setjmp to a reasonably performant approach for Emscripten.
/// The idea is that each block with a setjmp is broken up into two parts: the
/// part containing setjmp and the part right after the setjmp. The latter part
/// is either reached from the setjmp, or later from a longjmp. To handle the
/// longjmp, all calls that might longjmp are also called using invoke wrappers
/// and thus JS try-catch. The JS longjmp() function also sets some variables so
/// we can check whether a longjmp occurred from wasm code. Each block with a
/// function call that might longjmp is also split up after that call.
/// After the call, we check whether a longjmp occurred, and if it did,
/// which setjmp it corresponds to, and jump to the right post-setjmp block.
/// We assume setjmp-longjmp handling always runs after EH handling, which means
/// we don't expect any exception-related instructions when SjLj runs.
/// FIXME Currently this scheme does not support indirect calls of setjmp,
/// because of a limitation of the scheme itself. fastcomp does not support it
/// either.
///
/// In detail, this pass does the following things:
///
/// 1) Assumes the existence of global variables: __THREW__, __threwValue
///    __THREW__ and __threwValue are defined in compiler-rt in Emscripten.
///    These variables are used for both exceptions and setjmp/longjmps.
///    __THREW__ indicates whether an exception or a longjmp occurred or not. 0
///    means nothing occurred, 1 means an exception occurred, and other numbers
///    mean a longjmp occurred. In the case of longjmp, the __THREW__ variable
///    indicates the setjmp buffer the longjmp corresponds to.
///    __threwValue is 0 for exceptions, and the argument to longjmp in case of
///    longjmp.
///
/// * Emscripten exception handling
///
/// 2) We assume the existence of setThrew and setTempRet0/getTempRet0 functions
///    at link time. setThrew exists in Emscripten's compiler-rt:
///
///    void setThrew(uintptr_t threw, int value) {
///      if (__THREW__ == 0) {
///        __THREW__ = threw;
///        __threwValue = value;
///      }
///    }
///
///    setTempRet0 is called from __cxa_find_matching_catch() in JS glue code.
///    In exception handling, getTempRet0 indicates the type of an exception
///    caught, and in setjmp/longjmp, it means the second argument to longjmp
///    function.
///
/// 3) Lower
///      invoke @func(arg1, arg2) to label %invoke.cont unwind label %lpad
///    into
///      __THREW__ = 0;
///      call @__invoke_SIG(func, arg1, arg2)
///      %__THREW__.val = __THREW__;
///      __THREW__ = 0;
///      if (%__THREW__.val == 1)
///        goto %lpad
///      else
///        goto %invoke.cont
///    SIG is a mangled string generated based on the LLVM IR-level function
///    signature. After LLVM IR types are lowered to the target wasm types,
///    the names for these wrappers will change based on wasm types as well,
///    as in invoke_vi (function takes an int and returns void). The bodies of
///    these wrappers will be generated in JS glue code, and inside those
///    wrappers we use JS try-catch to generate actual exception effects. Each
///    wrapper also calls the original callee function. An example wrapper in
///    JS code would look like this:
///      function invoke_vi(index,a1) {
///        try {
///          Module["dynCall_vi"](index,a1); // This calls original callee
///        } catch(e) {
///          if (typeof e !== 'number' && e !== 'longjmp') throw e;
///          _setThrew(1, 0); // setThrew is called here
///        }
///      }
///    If an exception is thrown, __THREW__ will be set to 1 in a wrapper,
///    so we can jump to the right BB based on this value.
///
/// 4) Lower
///      %val = landingpad catch c1 catch c2 catch c3 ...
///      ... use %val ...
///    into
///      %fmc = call @__cxa_find_matching_catch_N(c1, c2, c3, ...)
///      %val = {%fmc, getTempRet0()}
///      ... use %val ...
///    Here N is a number calculated based on the number of clauses.
///    setTempRet0 is called from __cxa_find_matching_catch() in JS glue code.
///
/// 5) Lower
///      resume {%a, %b}
///    into
///      call @__resumeException(%a)
///    where __resumeException() is a function in JS glue code.
///
/// 6) Lower
///      call @llvm.eh.typeid.for(type) (intrinsic)
///    into
///      call @llvm_eh_typeid_for(type)
///    llvm_eh_typeid_for function will be generated in JS glue code.
///
/// * Emscripten setjmp / longjmp handling
///
/// If there are calls to longjmp()
///
/// 1) Lower
///      longjmp(env, val)
///    into
///      emscripten_longjmp(env, val)
///
/// If there are calls to setjmp()
///
/// 2) In the function entry that calls setjmp, initialize
///    functionInvocationId as follows:
///
///    functionInvocationId = alloca(4)
///
///    Note: the alloca size is not important as this pointer is
///    merely used for pointer comparisons.
///
/// 3) Lower
///      setjmp(env)
///    into
///      __wasm_setjmp(env, label, functionInvocationId)
///
///    __wasm_setjmp records the necessary info (the label and
///    functionInvocationId) to the "env".
///    A BB with setjmp is split into two after setjmp call in order to
///    make the post-setjmp BB the possible destination of longjmp BB.
///
/// 4) Lower every call that might longjmp into
///      __THREW__ = 0;
///      call @__invoke_SIG(func, arg1, arg2)
///      %__THREW__.val = __THREW__;
///      __THREW__ = 0;
///      %__threwValue.val = __threwValue;
///      if (%__THREW__.val != 0 & %__threwValue.val != 0) {
///        %label = __wasm_setjmp_test(%__THREW__.val, functionInvocationId);
///        if (%label == 0)
///          emscripten_longjmp(%__THREW__.val, %__threwValue.val);
///        setTempRet0(%__threwValue.val);
///      } else {
///        %label = -1;
///      }
///      longjmp_result = getTempRet0();
///      switch %label {
///        label 1: goto post-setjmp BB 1
///        label 2: goto post-setjmp BB 2
///        ...
///        default: goto split next BB
///      }
///
///    __wasm_setjmp_test examines the jmp buf to see if it was for a matching
///    setjmp call. After calling an invoke wrapper, if a longjmp occurred,
///    __THREW__ will be the address of the matching jmp_buf buffer and
///    __threwValue will be the second argument to longjmp.
///    __wasm_setjmp_test returns a setjmp label, an ID unique to each setjmp
///    callsite. Label 0 means this longjmp buffer does not correspond to one
///    of the setjmp callsites in this function, so in this case we just chain
///    the longjmp to the caller. Label -1 means no longjmp occurred.
///    Otherwise we jump to the right post-setjmp BB based on the label.
///
/// * Wasm setjmp / longjmp handling
/// This mode still uses some Emscripten library functions but not JavaScript's
/// try-catch mechanism. It instead uses Wasm exception handling intrinsics,
/// which will be lowered to exception handling instructions.
///
/// If there are calls to longjmp()
///
/// 1) Lower
///      longjmp(env, val)
///    into
///      __wasm_longjmp(env, val)
///
/// If there are calls to setjmp()
///
/// 2) and 3): The same as 2) and 3) in Emscripten SjLj.
/// (functionInvocationId initialization + setjmp callsite transformation)
///
/// 4) Create a catchpad with a wasm.catch() intrinsic, which returns the value
/// thrown by __wasm_longjmp function. In the runtime library, we have an
/// equivalent of the following struct:
///
/// struct __WasmLongjmpArgs {
///   void *env;
///   int val;
/// };
///
/// The thrown value here is a pointer to the struct. We use this struct to
/// transfer two values by throwing a single value. Wasm throw and catch
/// instructions are capable of throwing and catching multiple values, but
/// that also requires multivalue support, which is currently not very reliable.
/// TODO Switch to throwing and catching two values without using the struct
///
/// All longjmpable function calls will be converted to an invoke that will
/// unwind to this catchpad in case a longjmp occurs. Within the catchpad, we
/// test the thrown values using __wasm_setjmp_test function as we do for
/// Emscripten SjLj. The main difference is, in Emscripten SjLj, we need to
/// transform every longjmpable callsite into a sequence of code including
/// __wasm_setjmp_test() call; in Wasm SjLj we do the testing in only one
/// place, in this catchpad.
///
/// After calling __wasm_setjmp_test(), if the longjmp does not
/// correspond to one of the setjmps within the current function, it rethrows
/// the longjmp by calling __wasm_longjmp(). If it corresponds to one of
/// setjmps in the function, we jump to the beginning of the function, which
/// contains a switch to each post-setjmp BB. Again, in Emscripten SjLj, this
/// switch is added for every longjmpable callsite; in Wasm SjLj we do this
/// only once at the top of the function. (after functionInvocationId
/// initialization)
///
/// Below is pseudocode for what we have described:
///
/// entry:
///   Initialize functionInvocationId
///
/// setjmp.dispatch:
///    switch %label {
///      label 1: goto post-setjmp BB 1
///      label 2: goto post-setjmp BB 2
///      ...
///      default: goto split next BB
///    }
/// ...
///
/// bb:
///   invoke void @foo() ;; foo is a longjmpable function
///     to label %next unwind label %catch.dispatch.longjmp
/// ...
///
/// catch.dispatch.longjmp:
///   %0 = catchswitch within none [label %catch.longjmp] unwind to caller
///
/// catch.longjmp:
///   %longjmp.args = wasm.catch() ;; struct __WasmLongjmpArgs
///   %env = load 'env' field from __WasmLongjmpArgs
///   %val = load 'val' field from __WasmLongjmpArgs
///   %label = __wasm_setjmp_test(%env, functionInvocationId);
///   if (%label == 0)
///     __wasm_longjmp(%env, %val)
///   catchret to %setjmp.dispatch
///
///===----------------------------------------------------------------------===//
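// Illustrative sketch, not part of the pass: a standalone model of the
// __THREW__ / __threwValue protocol described in the file comment above. The
// names THREW, ThrewValue, Outcome and classifyAndReset are hypothetical
// stand-ins; only setThrew mirrors the compiler-rt snippet quoted in the
// comment. It shows how a caller could tell "nothing happened", "exception"
// and "longjmp" apart after an invoke wrapper returns.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the Emscripten globals __THREW__ and
// __threwValue.
static std::uintptr_t THREW = 0;
static int ThrewValue = 0;

// Same logic as the setThrew() quoted in the file comment: only the first
// report since the last reset is recorded.
static void setThrew(std::uintptr_t Threw, int Value) {
  if (THREW == 0) {
    THREW = Threw;
    ThrewValue = Value;
  }
}

enum class Outcome { Normal, Exception, Longjmp };

// Hypothetical helper modelling the post-invoke sequence
//   %__THREW__.val = __THREW__; __THREW__ = 0;
// followed by the interpretation of the loaded value:
// 0 = nothing occurred, 1 = exception, anything else = longjmp.
static Outcome classifyAndReset() {
  std::uintptr_t Val = THREW;
  THREW = 0;
  if (Val == 0)
    return Outcome::Normal;
  if (Val == 1)
    return Outcome::Exception;
  return Outcome::Longjmp;
}
```

// In this model a second setThrew() call before the reset is ignored, which
// matches the guard in the quoted compiler-rt function.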

#include "WebAssembly.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/IntrinsicsWebAssembly.h"
#include <set>

using namespace llvm;

#define DEBUG_TYPE "wasm-lower-em-ehsjlj"

    EHAllowlist("emscripten-cxx-exceptions-allowed",
                cl::desc("The list of function names in which Emscripten-style "
                         "exception handling is enabled (see emscripten "
                         "EMSCRIPTEN_CATCHING_ALLOWED options)"),

namespace {
class WebAssemblyLowerEmscriptenEHSjLj final : public ModulePass {
  bool EnableEmEH;     // Enable Emscripten exception handling
  bool EnableEmSjLj;   // Enable Emscripten setjmp/longjmp handling
  bool EnableWasmSjLj; // Enable Wasm setjmp/longjmp handling
  bool DoSjLj;         // Whether we actually perform setjmp/longjmp handling

  GlobalVariable *ThrewGV = nullptr;      // __THREW__ (Emscripten)
  GlobalVariable *ThrewValueGV = nullptr; // __threwValue (Emscripten)
  Function *GetTempRet0F = nullptr;       // getTempRet0() (Emscripten)
  Function *SetTempRet0F = nullptr;       // setTempRet0() (Emscripten)
  Function *ResumeF = nullptr;            // __resumeException() (Emscripten)
  Function *EHTypeIDF = nullptr;          // llvm.eh.typeid.for() (intrinsic)
  Function *EmLongjmpF = nullptr;         // emscripten_longjmp() (Emscripten)
  Function *WasmSetjmpF = nullptr;        // __wasm_setjmp() (Emscripten)
  Function *WasmSetjmpTestF = nullptr;    // __wasm_setjmp_test() (Emscripten)
  Function *WasmLongjmpF = nullptr;       // __wasm_longjmp() (Emscripten)
  Function *CatchF = nullptr;             // wasm.catch() (intrinsic)

  // Type of 'struct __WasmLongjmpArgs' defined in Emscripten
  Type *LongjmpArgsTy = nullptr;

  // __cxa_find_matching_catch_N functions.
  // Indexed by the number of clauses in an original landingpad instruction.
  DenseMap<int, Function *> FindMatchingCatches;
  // Map of <function signature string, invoke_ wrappers>
  StringMap<Function *> InvokeWrappers;
  // Set of allowed function names for exception handling
  std::set<std::string> EHAllowlistSet;
  // Functions that contain calls to setjmp
  SmallPtrSet<Function *, 8> SetjmpUsers;

  StringRef getPassName() const override {
    return "WebAssembly Lower Emscripten Exceptions";
  }

  using InstVector = SmallVectorImpl<Instruction *>;
  bool runEHOnFunction(Function &F);
  bool runSjLjOnFunction(Function &F);
  void handleLongjmpableCallsForEmscriptenSjLj(
      Function &F, Instruction *FunctionInvocationId,
      SmallVectorImpl<PHINode *> &SetjmpRetPHIs);
  void
  handleLongjmpableCallsForWasmSjLj(Function &F,
                                    Instruction *FunctionInvocationId,
                                    SmallVectorImpl<PHINode *> &SetjmpRetPHIs);
  Function *getFindMatchingCatch(Module &M, unsigned NumClauses);

  Value *wrapInvoke(CallBase *CI);
  void wrapTestSetjmp(BasicBlock *BB, DebugLoc DL, Value *Threw,
                      Value *FunctionInvocationId, Value *&Label,
                      Value *&LongjmpResult, BasicBlock *&CallEmLongjmpBB,
                      PHINode *&CallEmLongjmpBBThrewPHI,
                      PHINode *&CallEmLongjmpBBThrewValuePHI,
                      BasicBlock *&EndBB);
  Function *getInvokeWrapper(CallBase *CI);

  bool areAllExceptionsAllowed() const { return EHAllowlistSet.empty(); }
  bool supportsException(const Function *F) const {
    return EnableEmEH && (areAllExceptionsAllowed() ||
                          EHAllowlistSet.count(std::string(F->getName())));
  }
  void replaceLongjmpWith(Function *LongjmpF, Function *NewF);

  void rebuildSSA(Function &F);

public:
  static char ID;

  WebAssemblyLowerEmscriptenEHSjLj()
      : ModulePass(ID), EnableEmEH(WebAssembly::WasmEnableEmEH),
        EnableEmSjLj(WebAssembly::WasmEnableEmSjLj),
        EnableWasmSjLj(WebAssembly::WasmEnableSjLj) {
    assert(!(EnableEmSjLj && EnableWasmSjLj) &&
           "Two SjLj modes cannot be turned on at the same time");
    assert(!(EnableEmEH && EnableWasmSjLj) &&
           "Wasm SjLj should be only used with Wasm EH");
    EHAllowlistSet.insert(EHAllowlist.begin(), EHAllowlist.end());
  }
  bool runOnModule(Module &M) override;

  void getAnalysisUsage(AnalysisUsage &AU) const override {
  }
};
} // End anonymous namespace

char WebAssemblyLowerEmscriptenEHSjLj::ID = 0;
INITIALIZE_PASS(WebAssemblyLowerEmscriptenEHSjLj, DEBUG_TYPE,
                "WebAssembly Lower Emscripten Exceptions / Setjmp / Longjmp",
                false, false)

  return new WebAssemblyLowerEmscriptenEHSjLj();
}

static bool canThrow(const Value *V) {
  if (const auto *F = dyn_cast<const Function>(V)) {
    // Intrinsics cannot throw
    if (F->isIntrinsic())
      return false;
    StringRef Name = F->getName();
    // Leave setjmp and longjmp (mostly) alone; we process them properly later
    if (Name == "setjmp" || Name == "longjmp" || Name == "emscripten_longjmp")
      return false;
    return !F->doesNotThrow();
  }
  // Not a function, so an indirect call - can throw, we can't tell
  return true;
}

// Get a thread-local global variable with the given name. If it doesn't exist,
// declare it, which will generate an import and assume that it will exist at
// link time.
                                         const char *Name) {
  auto *GV = dyn_cast<GlobalVariable>(M.getOrInsertGlobal(Name, Ty));
  if (!GV)
    report_fatal_error(Twine("unable to create global: ") + Name);

  // Variables created by this function are thread local. If the target does not
  // support TLS, we depend on CoalesceFeaturesAndStripAtomics to downgrade it
  // to non-thread-local ones, in which case we don't allow this object to be
  // linked with other objects using shared memory.
  GV->setThreadLocalMode(GlobalValue::GeneralDynamicTLSModel);
  return GV;
}

// Simple function name mangler.
// This function simply takes LLVM's string representation of parameter types
// and concatenates them with '_'. There are non-alphanumeric characters but llc
// is ok with it, and we need to postprocess these names after the lowering
// phase anyway.
static std::string getSignature(FunctionType *FTy) {
  std::string Sig;
  OS << *FTy->getReturnType();
  for (Type *ParamTy : FTy->params())
    OS << "_" << *ParamTy;
  if (FTy->isVarArg())
    OS << "_...";
  Sig = OS.str();
  erase_if(Sig, isSpace);
  // When s2wasm parses .s file, a comma means the end of an argument. So a
  // mangled function name can contain any character but a comma.
  std::replace(Sig.begin(), Sig.end(), ',', '.');
  return Sig;
}
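// Illustrative sketch, not part of the pass: the mangling scheme described
// above, modelled on plain strings so it can run without LLVM. The function
// name mangleSignature and its parameters are hypothetical; the real
// getSignature() prints llvm::Type objects, while this sketch assumes the
// caller already has their textual form.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for getSignature(): joins the return type and the
// parameter types with '_', appends "_..." for varargs, strips whitespace,
// and replaces ',' with '.'.
static std::string mangleSignature(const std::string &Ret,
                                   const std::vector<std::string> &Params,
                                   bool IsVarArg) {
  std::string Sig = Ret;
  for (const std::string &P : Params)
    Sig += "_" + P;
  if (IsVarArg)
    Sig += "_...";
  // Remove every whitespace character (struct types print with spaces).
  Sig.erase(std::remove_if(
                Sig.begin(), Sig.end(),
                [](unsigned char C) { return std::isspace(C) != 0; }),
            Sig.end());
  // A comma would be read as an argument separator, so it must not survive.
  std::replace(Sig.begin(), Sig.end(), ',', '.');
  return Sig;
}
```

// Under these assumptions, a callee printed as "void (i32)" would map to the
// wrapper name "__invoke_" + mangleSignature("void", {"i32"}, false).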

                                       Module *M) {
  // Tell the linker that this function is expected to be imported from the
  // 'env' module.
  if (!F->hasFnAttribute("wasm-import-module")) {
    llvm::AttrBuilder B(M->getContext());
    B.addAttribute("wasm-import-module", "env");
    F->addFnAttrs(B);
  }
  if (!F->hasFnAttribute("wasm-import-name")) {
    llvm::AttrBuilder B(M->getContext());
    B.addAttribute("wasm-import-name", F->getName());
    F->addFnAttrs(B);
  }
  return F;
}

// Returns an integer type for the target architecture's address space.
// i32 for wasm32 and i64 for wasm64.
  IRBuilder<> IRB(M->getContext());
  return IRB.getIntNTy(M->getDataLayout().getPointerSizeInBits());
}

// Returns an integer pointer type for the target architecture's address space.
// i32* for wasm32 and i64* for wasm64. With opaque pointers this is just a ptr
// in address space zero.
  return PointerType::getUnqual(M->getContext());
}

// Returns an integer whose type is the integer type for the target's address
// space. Returns (i32 C) for wasm32 and (i64 C) for wasm64, when C is the
// integer.
  IRBuilder<> IRB(M->getContext());
  return IRB.getIntN(M->getDataLayout().getPointerSizeInBits(), C);
}

// Returns __cxa_find_matching_catch_N function, where N = NumClauses + 2.
// This is because a landingpad instruction contains two more arguments, a
// personality function and a cleanup bit, and __cxa_find_matching_catch_N
// functions are named after the number of arguments in the original landingpad
// instruction.
Function *
WebAssemblyLowerEmscriptenEHSjLj::getFindMatchingCatch(Module &M,
                                                       unsigned NumClauses) {
  if (FindMatchingCatches.count(NumClauses))
    return FindMatchingCatches[NumClauses];
  PointerType *Int8PtrTy = PointerType::getUnqual(M.getContext());
  SmallVector<Type *, 16> Args(NumClauses, Int8PtrTy);
  FunctionType *FTy = FunctionType::get(Int8PtrTy, Args, false);
      FTy, "__cxa_find_matching_catch_" + Twine(NumClauses + 2), &M);
  FindMatchingCatches[NumClauses] = F;
  return F;
}
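// Illustrative sketch, not part of the pass: the N = NumClauses + 2 naming
// rule and the per-clause-count caching, modelled with standard containers.
// NameCache and findMatchingCatchName are hypothetical names; the real code
// caches llvm::Function pointers in FindMatchingCatches rather than strings.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical cache keyed by the number of landingpad clauses.
static std::map<unsigned, std::string> NameCache;

// Returns the library function name for a landingpad with NumClauses
// clauses, computing it once per distinct clause count.
static const std::string &findMatchingCatchName(unsigned NumClauses) {
  auto It = NameCache.find(NumClauses);
  if (It != NameCache.end())
    return It->second;
  // Two extra "arguments" (personality function and cleanup bit) are counted
  // in the suffix, as the comment above explains.
  return NameCache[NumClauses] =
             "__cxa_find_matching_catch_" + std::to_string(NumClauses + 2);
}
```

// So, in this model, a landingpad with three catch clauses would resolve to
// the name ending in _5, and a cleanup-only landingpad to the one ending in _2.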

// Generate invoke wrapper sequence with preamble and postamble
// Preamble:
// __THREW__ = 0;
// Postamble:
// %__THREW__.val = __THREW__; __THREW__ = 0;
// Returns %__THREW__.val, which indicates whether an exception is thrown (or
// whether longjmp occurred), for future use.
Value *WebAssemblyLowerEmscriptenEHSjLj::wrapInvoke(CallBase *CI) {
  Module *M = CI->getModule();
  LLVMContext &C = M->getContext();

  IRBuilder<> IRB(C);
  IRB.SetInsertPoint(CI);

  // Pre-invoke
  // __THREW__ = 0;
  IRB.CreateStore(getAddrSizeInt(M, 0), ThrewGV);

  // Invoke function wrapper in JavaScript
  // Put the pointer to the callee as first argument, so it can be called
  // within the invoke wrapper later
  Args.push_back(CI->getCalledOperand());
  Args.append(CI->arg_begin(), CI->arg_end());
  CallInst *NewCall = IRB.CreateCall(getInvokeWrapper(CI), Args);
  NewCall->takeName(CI);
  NewCall->setDebugLoc(CI->getDebugLoc());

  // Because we added the pointer to the callee as first argument, all
  // argument attribute indices have to be incremented by one.
  SmallVector<AttributeSet, 8> ArgAttributes;
  const AttributeList &InvokeAL = CI->getAttributes();

  // No attributes for the callee pointer.
  ArgAttributes.push_back(AttributeSet());
  // Copy the argument attributes from the original
  for (unsigned I = 0, E = CI->arg_size(); I < E; ++I)
    ArgAttributes.push_back(InvokeAL.getParamAttrs(I));

  AttrBuilder FnAttrs(CI->getContext(), InvokeAL.getFnAttrs());
  if (auto Args = FnAttrs.getAllocSizeArgs()) {
    // The allocsize attribute (if any) refers to parameters by index and needs
    // to be adjusted.
    auto [SizeArg, NEltArg] = *Args;
    SizeArg += 1;
    if (NEltArg)
      NEltArg = *NEltArg + 1;
    FnAttrs.addAllocSizeAttr(SizeArg, NEltArg);
  }
  // In case the callee has 'noreturn' attribute, we need to remove it, because
  // we expect invoke wrappers to return.
  FnAttrs.removeAttribute(Attribute::NoReturn);

  // Reconstruct the AttributeList based on the vector we constructed.
      C, AttributeSet::get(C, FnAttrs), InvokeAL.getRetAttrs(), ArgAttributes);
  NewCall->setAttributes(NewCallAL);

  CI->replaceAllUsesWith(NewCall);

  // Post-invoke
  // %__THREW__.val = __THREW__; __THREW__ = 0;
  Value *Threw =
      IRB.CreateLoad(getAddrIntType(M), ThrewGV, ThrewGV->getName() + ".val");
  IRB.CreateStore(getAddrSizeInt(M, 0), ThrewGV);
  return Threw;
}
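// Illustrative sketch, not part of the pass: why the allocsize indices are
// bumped in wrapInvoke(). The invoke wrapper receives the callee pointer as a
// new first argument, so every original parameter index moves up by one. The
// alias AllocSizeArgs and the function shiftAllocSizeArgs are hypothetical;
// they only assume that allocsize is a required size index plus an optional
// element-count index, as the structured binding above suggests.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical model of allocsize arguments: (size index, optional element
// count index).
using AllocSizeArgs = std::pair<unsigned, std::optional<unsigned>>;

// Shifts both indices by one to account for the callee pointer that is
// prepended to the wrapper's argument list.
static AllocSizeArgs shiftAllocSizeArgs(AllocSizeArgs Args) {
  auto [SizeArg, NEltArg] = Args;
  SizeArg += 1;
  if (NEltArg)
    NEltArg = *NEltArg + 1;
  return {SizeArg, NEltArg};
}
```

// For a callee declared with allocsize(0), the wrapper call would carry
// allocsize(1) in this model; an absent element-count index stays absent.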

// Get matching invoke wrapper based on callee signature
Function *WebAssemblyLowerEmscriptenEHSjLj::getInvokeWrapper(CallBase *CI) {
  Module *M = CI->getModule();
  FunctionType *CalleeFTy = CI->getFunctionType();

  std::string Sig = getSignature(CalleeFTy);
  if (InvokeWrappers.contains(Sig))
    return InvokeWrappers[Sig];

  // Put the pointer to the callee as first argument
  ArgTys.push_back(PointerType::getUnqual(CalleeFTy));
  // Add argument types
  ArgTys.append(CalleeFTy->param_begin(), CalleeFTy->param_end());

  FunctionType *FTy = FunctionType::get(CalleeFTy->getReturnType(), ArgTys,
                                        CalleeFTy->isVarArg());
  Function *F = getEmscriptenFunction(FTy, "__invoke_" + Sig, M);
  InvokeWrappers[Sig] = F;
  return F;
}

static bool canLongjmp(const Value *Callee) {
  if (auto *CalleeF = dyn_cast<Function>(Callee))
    if (CalleeF->isIntrinsic())
      return false;

  // Attempting to transform inline assembly will result in something like:
  //     call void @__invoke_void(void ()* asm ...)
  // which is invalid because inline assembly blocks do not have addresses
  // and can't be passed by pointer. The result is a crash with illegal IR.
  if (isa<InlineAsm>(Callee))
    return false;
  StringRef CalleeName = Callee->getName();

  // TODO Include more functions or consider checking with mangled prefixes

  // The reason we include malloc/free here is to exclude the malloc/free
  // calls generated in setjmp prep / cleanup routines.
  if (CalleeName == "setjmp" || CalleeName == "malloc" || CalleeName == "free")
    return false;

  // These are functions in Emscripten's JS glue code or compiler-rt
  if (CalleeName == "__resumeException" || CalleeName == "llvm_eh_typeid_for" ||
      CalleeName == "__wasm_setjmp" || CalleeName == "__wasm_setjmp_test" ||
      CalleeName == "getTempRet0" || CalleeName == "setTempRet0")
    return false;

  // __cxa_find_matching_catch_N functions cannot longjmp
  if (Callee->getName().starts_with("__cxa_find_matching_catch_"))
    return false;

  // Exception-catching related functions
  //
  // We intentionally treat __cxa_end_catch longjmpable in Wasm SjLj even though
  // it surely cannot longjmp, in order to maintain the unwind relationship from
  // all existing catchpads (and calls within them) to catch.dispatch.longjmp.
  //
  // In Wasm EH + Wasm SjLj, we
  // 1. Make all catchswitch and cleanuppad that unwind to caller unwind to
  //    catch.dispatch.longjmp instead
  // 2. Convert all longjmpable calls to invokes that unwind to
  //    catch.dispatch.longjmp
  // But catchswitch BBs are removed in isel, so if an EH catchswitch (generated
  // from an exception)'s catchpad does not contain any calls that are converted
  // into invokes unwinding to catch.dispatch.longjmp, this unwind relationship
  // (EH catchswitch BB -> catch.dispatch.longjmp BB) is lost and
  // catch.dispatch.longjmp BB can be placed before the EH catchswitch BB in
  // CFGSort.
  // int ret = setjmp(buf);
  // try {
  //   foo(); // longjmps
  // } catch (...) {
  // }
  // Then in this code, if 'foo' longjmps, it first unwinds to 'catch (...)'
  // catchswitch, and is not caught by that catchswitch because it is a longjmp,
  // then it should next unwind to catch.dispatch.longjmp BB. But if this 'catch
  // (...)' catchswitch -> catch.dispatch.longjmp unwind relationship is lost,
  // it will not unwind to catch.dispatch.longjmp, producing an incorrect
  // result.
  //
  // Every catchpad generated by Wasm C++ contains __cxa_end_catch, so we
  // intentionally treat it as longjmpable to work around this problem. This is
  // a hacky fix but an easy one.
  //
  // The comment block in findWasmUnwindDestinations() in
  // SelectionDAGBuilder.cpp is addressing a similar problem.
  if (CalleeName == "__cxa_end_catch")
  if (CalleeName == "__cxa_begin_catch" ||
      CalleeName == "__cxa_allocate_exception" || CalleeName == "__cxa_throw" ||
      CalleeName == "__clang_call_terminate")
    return false;

  // std::terminate, which is generated when another exception occurs while
  // handling an exception, cannot longjmp.
  if (CalleeName == "_ZSt9terminatev")
    return false;

  // Otherwise we don't know
  return true;
}

static bool isEmAsmCall(const Value *Callee) {
  StringRef CalleeName = Callee->getName();
  // This is an exhaustive list from Emscripten's <emscripten/em_asm.h>.
  return CalleeName == "emscripten_asm_const_int" ||
         CalleeName == "emscripten_asm_const_double" ||
         CalleeName == "emscripten_asm_const_int_sync_on_main_thread" ||
         CalleeName == "emscripten_asm_const_double_sync_on_main_thread" ||
         CalleeName == "emscripten_asm_const_async_on_main_thread";
}

// Generate __wasm_setjmp_test function call sequence with preamble and
// postamble. The code this generates is equivalent to the following
// JavaScript code:
// %__threwValue.val = __threwValue;
// if (%__THREW__.val != 0 & %__threwValue.val != 0) {
//   %label = __wasm_setjmp_test(%__THREW__.val, functionInvocationId);
//   if (%label == 0)
//     emscripten_longjmp(%__THREW__.val, %__threwValue.val);
//   setTempRet0(%__threwValue.val);
// } else {
//   %label = -1;
// }
// %longjmp_result = getTempRet0();
//
// As output parameters, returns %label, %longjmp_result, and the BB the last
// instruction (%longjmp_result = ...) is in.
void WebAssemblyLowerEmscriptenEHSjLj::wrapTestSetjmp(
    BasicBlock *BB, DebugLoc DL, Value *Threw, Value *FunctionInvocationId,
    Value *&Label, Value *&LongjmpResult, BasicBlock *&CallEmLongjmpBB,
    PHINode *&CallEmLongjmpBBThrewPHI, PHINode *&CallEmLongjmpBBThrewValuePHI,
    BasicBlock *&EndBB) {
  Function *F = BB->getParent();
  Module *M = F->getParent();
  LLVMContext &C = M->getContext();
  IRBuilder<> IRB(C);
  IRB.SetCurrentDebugLocation(DL);

  // if (%__THREW__.val != 0 & %__threwValue.val != 0)
  IRB.SetInsertPoint(BB);
  BasicBlock *ThenBB1 = BasicBlock::Create(C, "if.then1", F);
  BasicBlock *ElseBB1 = BasicBlock::Create(C, "if.else1", F);
  BasicBlock *EndBB1 = BasicBlock::Create(C, "if.end", F);
  Value *ThrewCmp = IRB.CreateICmpNE(Threw, getAddrSizeInt(M, 0));
  Value *ThrewValue = IRB.CreateLoad(IRB.getInt32Ty(), ThrewValueGV,
                                     ThrewValueGV->getName() + ".val");
  Value *ThrewValueCmp = IRB.CreateICmpNE(ThrewValue, IRB.getInt32(0));
  Value *Cmp1 = IRB.CreateAnd(ThrewCmp, ThrewValueCmp, "cmp1");
  IRB.CreateCondBr(Cmp1, ThenBB1, ElseBB1);

  // Generate call.em.longjmp BB once and share it within the function
  if (!CallEmLongjmpBB) {
    // emscripten_longjmp(%__THREW__.val, %__threwValue.val);
    CallEmLongjmpBB = BasicBlock::Create(C, "call.em.longjmp", F);
    IRB.SetInsertPoint(CallEmLongjmpBB);
    CallEmLongjmpBBThrewPHI = IRB.CreatePHI(getAddrIntType(M), 4, "threw.phi");
    CallEmLongjmpBBThrewValuePHI =
        IRB.CreatePHI(IRB.getInt32Ty(), 4, "threwvalue.phi");
    CallEmLongjmpBBThrewPHI->addIncoming(Threw, ThenBB1);
    CallEmLongjmpBBThrewValuePHI->addIncoming(ThrewValue, ThenBB1);
    IRB.CreateCall(EmLongjmpF,
                   {CallEmLongjmpBBThrewPHI, CallEmLongjmpBBThrewValuePHI});
    IRB.CreateUnreachable();
  } else {
    CallEmLongjmpBBThrewPHI->addIncoming(Threw, ThenBB1);
    CallEmLongjmpBBThrewValuePHI->addIncoming(ThrewValue, ThenBB1);
  }

  // %label = __wasm_setjmp_test(%__THREW__.val, functionInvocationId);
  // if (%label == 0)
  IRB.SetInsertPoint(ThenBB1);
  BasicBlock *EndBB2 = BasicBlock::Create(C, "if.end2", F);
  Value *ThrewPtr =
      IRB.CreateIntToPtr(Threw, getAddrPtrType(M), Threw->getName() + ".p");
  Value *ThenLabel = IRB.CreateCall(WasmSetjmpTestF,
                                    {ThrewPtr, FunctionInvocationId}, "label");
  Value *Cmp2 = IRB.CreateICmpEQ(ThenLabel, IRB.getInt32(0));
  IRB.CreateCondBr(Cmp2, CallEmLongjmpBB, EndBB2);

  // setTempRet0(%__threwValue.val);
  IRB.SetInsertPoint(EndBB2);
  IRB.CreateCall(SetTempRet0F, ThrewValue);
  IRB.CreateBr(EndBB1);

  IRB.SetInsertPoint(ElseBB1);
  IRB.CreateBr(EndBB1);

  // longjmp_result = getTempRet0();
  IRB.SetInsertPoint(EndBB1);
  PHINode *LabelPHI = IRB.CreatePHI(IRB.getInt32Ty(), 2, "label");
  LabelPHI->addIncoming(ThenLabel, EndBB2);

  LabelPHI->addIncoming(IRB.getInt32(-1), ElseBB1);

  // Output parameter assignment
  Label = LabelPHI;
  EndBB = EndBB1;
  LongjmpResult = IRB.CreateCall(GetTempRet0F, std::nullopt, "longjmp_result");
}
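// Illustrative sketch, not part of the pass: the control flow that
// wrapTestSetjmp() emits as IR, written as ordinary C++ so the three label
// outcomes are easy to see. TestResult, testSetjmp and the SetjmpTest callback
// are hypothetical; the callback stands in for __wasm_setjmp_test with the
// functionInvocationId already bound.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

struct TestResult {
  int Label;    // -1: no longjmp; 0: not ours; >0: setjmp callsite ID
  bool Rethrow; // true when the longjmp must be chained to the caller
};

// Models the generated check: a longjmp is only considered when both
// __THREW__ and __threwValue are nonzero. A label of 0 from the test means
// no setjmp in this function matches, so the longjmp is passed on.
static TestResult
testSetjmp(std::uintptr_t Threw, int ThrewValue,
           const std::function<int(std::uintptr_t)> &SetjmpTest) {
  if (Threw != 0 && ThrewValue != 0) {
    int Label = SetjmpTest(Threw);
    if (Label == 0)
      return {0, true}; // corresponds to the call.em.longjmp block
    return {Label, false};
  }
  return {-1, false}; // corresponds to the if.else1 block
}
```

// In the real pass the positive label then feeds a switch that jumps to the
// matching post-setjmp block, and -1 falls through to the default
// destination.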

void WebAssemblyLowerEmscriptenEHSjLj::rebuildSSA(Function &F) {
  DominatorTree &DT = getAnalysis<DominatorTreeWrapperPass>(F).getDomTree();
  DT.recalculate(F); // CFG has been changed

  for (BasicBlock &BB : F) {
    for (Instruction &I : BB) {
      unsigned VarID = SSA.AddVariable(I.getName(), I.getType());
      // If a value is defined by an invoke instruction, it is only available in
      // its normal destination and not in its unwind destination.
      if (auto *II = dyn_cast<InvokeInst>(&I))
        SSA.AddAvailableValue(VarID, II->getNormalDest(), II);
      else
        SSA.AddAvailableValue(VarID, &BB, &I);
      for (auto &U : I.uses()) {
        auto *User = cast<Instruction>(U.getUser());
        if (auto *UserPN = dyn_cast<PHINode>(User))
          if (UserPN->getIncomingBlock(U) == &BB)
            continue;
        if (DT.dominates(&I, User))
          continue;
        SSA.AddUse(VarID, &U);
      }
    }
  }
  SSA.RewriteAllUses(&DT);
}

800// Replace uses of longjmp with a new longjmp function in Emscripten library.
801// In Emscripten SjLj, the new function is
802// void emscripten_longjmp(uintptr_t, i32)
803// In Wasm SjLj, the new function is
804// void __wasm_longjmp(i8*, i32)
805// Because the original libc longjmp function takes (jmp_buf*, i32), we need a
806// ptrtoint/bitcast instruction here to make the type match. jmp_buf* will
807// eventually be lowered to i32/i64 in the wasm backend.
808void WebAssemblyLowerEmscriptenEHSjLj::replaceLongjmpWith(Function *LongjmpF,
809 Function *NewF) {
810 assert(NewF == EmLongjmpF || NewF == WasmLongjmpF);
811 Module *M = LongjmpF->getParent();
812 SmallVector<CallInst *, 8> ToErase;
813 LLVMContext &C = LongjmpF->getParent()->getContext();
814 IRBuilder<> IRB(C);
815
816 // For calls to longjmp, replace it with emscripten_longjmp/__wasm_longjmp and
817 // cast its first argument (jmp_buf*) appropriately
818 for (User *U : LongjmpF->users()) {
819 auto *CI = dyn_cast<CallInst>(U);
820 if (CI && CI->getCalledFunction() == LongjmpF) {
821 IRB.SetInsertPoint(CI);
822 Value *Env = nullptr;
823 if (NewF == EmLongjmpF)
824 Env =
825 IRB.CreatePtrToInt(CI->getArgOperand(0), getAddrIntType(M), "env");
826 else // WasmLongjmpF
827 Env = IRB.CreateBitCast(CI->getArgOperand(0), IRB.getPtrTy(), "env");
828 IRB.CreateCall(NewF, {Env, CI->getArgOperand(1)});
829 ToErase.push_back(CI);
830 }
831 }
832 for (auto *I : ToErase)
833 I->eraseFromParent();
834
835 // If we have any remaining uses of longjmp's function pointer, replace it
836 // with (void(*)(jmp_buf*, int))emscripten_longjmp / __wasm_longjmp.
837 if (!LongjmpF->uses().empty()) {
838 Value *NewLongjmp =
839 IRB.CreateBitCast(NewF, LongjmpF->getType(), "longjmp.cast");
840 LongjmpF->replaceAllUsesWith(NewLongjmp);
841 }
842}
843
844static bool containsLongjmpableCalls(const Function *F) {
845 for (const auto &BB : *F)
846 for (const auto &I : BB)
847 if (const auto *CB = dyn_cast<CallBase>(&I))
848 if (canLongjmp(CB->getCalledOperand()))
849 return true;
850 return false;
851}
852
853// When a function contains a setjmp call but not other calls that can longjmp,
854// we don't do setjmp transformation for that setjmp. But we need to convert the
855// setjmp calls into "i32 0" so they don't cause link time errors. setjmp always
856// returns 0 when called directly.
857static void nullifySetjmp(Function *F) {
858 Module &M = *F->getParent();
859 IRBuilder<> IRB(M.getContext());
860 Function *SetjmpF = M.getFunction("setjmp");
861 SmallVector<CallInst *, 8> ToErase;
862
863 for (User *U : make_early_inc_range(SetjmpF->users())) {
864 auto *CB = cast<CallBase>(U);
865 BasicBlock *BB = CB->getParent();
866 if (BB->getParent() != F) // in other function
867 continue;
868 CallInst *CI = nullptr;
869 // setjmp cannot throw. So if it is an invoke, lower it to a call
870 if (auto *II = dyn_cast<InvokeInst>(CB))
871 CI = llvm::changeToCall(II);
872 else
873 CI = cast<CallInst>(CB);
874 ToErase.push_back(CI);
875 CI->replaceAllUsesWith(IRB.getInt32(0));
876 }
877 for (auto *I : ToErase)
878 I->eraseFromParent();
879}
880
881bool WebAssemblyLowerEmscriptenEHSjLj::runOnModule(Module &M) {
882 LLVM_DEBUG(dbgs() << "********** Lower Emscripten EH & SjLj **********\n");
883
884 LLVMContext &C = M.getContext();
885 IRBuilder<> IRB(C);
886
887 Function *SetjmpF = M.getFunction("setjmp");
888 Function *LongjmpF = M.getFunction("longjmp");
889
890 // On some platforms _setjmp and _longjmp are used instead. Change these to
891 // use setjmp/longjmp instead, because we later detect these functions by
892 // their names.
893 Function *SetjmpF2 = M.getFunction("_setjmp");
894 Function *LongjmpF2 = M.getFunction("_longjmp");
895 if (SetjmpF2) {
896 if (SetjmpF) {
897 if (SetjmpF->getFunctionType() != SetjmpF2->getFunctionType())
898 report_fatal_error("setjmp and _setjmp have different function types");
899 } else {
900 SetjmpF = Function::Create(SetjmpF2->getFunctionType(),
901 GlobalValue::ExternalLinkage, "setjmp", M);
902 }
903 SetjmpF2->replaceAllUsesWith(SetjmpF);
904 }
905 if (LongjmpF2) {
906 if (LongjmpF) {
907 if (LongjmpF->getFunctionType() != LongjmpF2->getFunctionType())
908 report_fatal_error(
909 "longjmp and _longjmp have different function types");
910 } else {
911 LongjmpF = Function::Create(LongjmpF2->getFunctionType(),
912 GlobalValue::ExternalLinkage, "longjmp", M);
913 }
914 LongjmpF2->replaceAllUsesWith(LongjmpF);
915 }
916
917 auto *TPC = getAnalysisIfAvailable<TargetPassConfig>();
918 assert(TPC && "Expected a TargetPassConfig");
919 auto &TM = TPC->getTM<WebAssemblyTargetMachine>();
920
921 // Declare (or get) global variables __THREW__, __threwValue, and
922 // getTempRet0/setTempRet0 function which are used in common for both
923 // exception handling and setjmp/longjmp handling
924 ThrewGV = getGlobalVariable(M, getAddrIntType(&M), TM, "__THREW__");
925 ThrewValueGV = getGlobalVariable(M, IRB.getInt32Ty(), TM, "__threwValue");
926 GetTempRet0F = getEmscriptenFunction(
927 FunctionType::get(IRB.getInt32Ty(), false), "getTempRet0", &M);
928 SetTempRet0F = getEmscriptenFunction(
929 FunctionType::get(IRB.getVoidTy(), IRB.getInt32Ty(), false),
930 "setTempRet0", &M);
931 GetTempRet0F->setDoesNotThrow();
932 SetTempRet0F->setDoesNotThrow();
933
934 bool Changed = false;
935
936 // Function registration for exception handling
937 if (EnableEmEH) {
938 // Register __resumeException function
939 FunctionType *ResumeFTy =
940 FunctionType::get(IRB.getVoidTy(), IRB.getPtrTy(), false);
941 ResumeF = getEmscriptenFunction(ResumeFTy, "__resumeException", &M);
942 ResumeF->addFnAttr(Attribute::NoReturn);
943
944 // Register llvm_eh_typeid_for function
945 FunctionType *EHTypeIDTy =
946 FunctionType::get(IRB.getInt32Ty(), IRB.getPtrTy(), false);
947 EHTypeIDF = getEmscriptenFunction(EHTypeIDTy, "llvm_eh_typeid_for", &M);
948 }
949
950 // Functions that contain calls to setjmp but don't have other longjmpable
951 // calls within them.
952 SmallPtrSet<Function *, 4> SetjmpUsersToNullify;
953
954 if ((EnableEmSjLj || EnableWasmSjLj) && SetjmpF) {
955 // Precompute setjmp users
956 for (User *U : SetjmpF->users()) {
957 if (auto *CB = dyn_cast<CallBase>(U)) {
958 auto *UserF = CB->getFunction();
959 // If a function that calls setjmp does not contain any other calls that
960 // can longjmp, we don't need to do any transformation on that function,
961 // so can ignore it
962 if (containsLongjmpableCalls(UserF))
963 SetjmpUsers.insert(UserF);
964 else
965 SetjmpUsersToNullify.insert(UserF);
966 } else {
967 std::string S;
968 raw_string_ostream SS(S);
969 SS << *U;
970 report_fatal_error(Twine("Indirect use of setjmp is not supported: ") +
971 SS.str());
972 }
973 }
974 }
975
976 bool SetjmpUsed = SetjmpF && !SetjmpUsers.empty();
977 bool LongjmpUsed = LongjmpF && !LongjmpF->use_empty();
978 DoSjLj = (EnableEmSjLj | EnableWasmSjLj) && (SetjmpUsed || LongjmpUsed);
979
980 // Function registration and data pre-gathering for setjmp/longjmp handling
981 if (DoSjLj) {
982 assert(EnableEmSjLj || EnableWasmSjLj);
983 if (EnableEmSjLj) {
984 // Register emscripten_longjmp function
985 FunctionType *FTy = FunctionType::get(
986 IRB.getVoidTy(), {getAddrIntType(&M), IRB.getInt32Ty()}, false);
987 EmLongjmpF = getEmscriptenFunction(FTy, "emscripten_longjmp", &M);
988 EmLongjmpF->addFnAttr(Attribute::NoReturn);
989 } else { // EnableWasmSjLj
990 Type *Int8PtrTy = IRB.getPtrTy();
991 // Register __wasm_longjmp function, which calls __builtin_wasm_longjmp.
992 FunctionType *FTy = FunctionType::get(
993 IRB.getVoidTy(), {Int8PtrTy, IRB.getInt32Ty()}, false);
994 WasmLongjmpF = getEmscriptenFunction(FTy, "__wasm_longjmp", &M);
995 WasmLongjmpF->addFnAttr(Attribute::NoReturn);
996 }
997
998 if (SetjmpF) {
999 Type *Int8PtrTy = IRB.getPtrTy();
1000 Type *Int32PtrTy = IRB.getPtrTy();
1001 Type *Int32Ty = IRB.getInt32Ty();
1002
1003 // Register __wasm_setjmp function
1004 FunctionType *SetjmpFTy = SetjmpF->getFunctionType();
1005 FunctionType *FTy = FunctionType::get(
1006 IRB.getVoidTy(), {SetjmpFTy->getParamType(0), Int32Ty, Int32PtrTy},
1007 false);
1008 WasmSetjmpF = getEmscriptenFunction(FTy, "__wasm_setjmp", &M);
1009
1010 // Register __wasm_setjmp_test function
1011 FTy = FunctionType::get(Int32Ty, {Int32PtrTy, Int32PtrTy}, false);
1012 WasmSetjmpTestF = getEmscriptenFunction(FTy, "__wasm_setjmp_test", &M);
1013
1014 // wasm.catch() will be lowered down to wasm 'catch' instruction in
1015 // instruction selection.
1016 CatchF = Intrinsic::getDeclaration(&M, Intrinsic::wasm_catch);
1017 // Type for struct __WasmLongjmpArgs
1018 LongjmpArgsTy = StructType::get(Int8PtrTy, // env
1019 Int32Ty // val
1020 );
1021 }
1022 }
1023
1024 // Exception handling transformation
1025 if (EnableEmEH) {
1026 for (Function &F : M) {
1027 if (F.isDeclaration())
1028 continue;
1029 Changed |= runEHOnFunction(F);
1030 }
1031 }
1032
1033 // Setjmp/longjmp handling transformation
1034 if (DoSjLj) {
1035 Changed = true; // We have setjmp or longjmp somewhere
1036 if (LongjmpF)
1037 replaceLongjmpWith(LongjmpF, EnableEmSjLj ? EmLongjmpF : WasmLongjmpF);
1038 // Only traverse functions that use setjmp in order not to insert
1039 // unnecessary prep / cleanup code in every function
1040 if (SetjmpF)
1041 for (Function *F : SetjmpUsers)
1042 runSjLjOnFunction(*F);
1043 }
1044
1045 // Replace unnecessary setjmp calls with 0
1046 if ((EnableEmSjLj || EnableWasmSjLj) && !SetjmpUsersToNullify.empty()) {
1047 Changed = true;
1048 assert(SetjmpF);
1049 for (Function *F : SetjmpUsersToNullify)
1050 nullifySetjmp(F);
1051 }
1052
1053 // Delete unused global variables and functions
1054 for (auto *V : {ThrewGV, ThrewValueGV})
1055 if (V && V->use_empty())
1056 V->eraseFromParent();
1057 for (auto *V : {GetTempRet0F, SetTempRet0F, ResumeF, EHTypeIDF, EmLongjmpF,
1058 WasmSetjmpF, WasmSetjmpTestF, WasmLongjmpF, CatchF})
1059 if (V && V->use_empty())
1060 V->eraseFromParent();
1061
1062 return Changed;
1063}
1064
1065bool WebAssemblyLowerEmscriptenEHSjLj::runEHOnFunction(Function &F) {
1066 Module &M = *F.getParent();
1067 LLVMContext &C = F.getContext();
1068 IRBuilder<> IRB(C);
1069 bool Changed = false;
1070 SmallVector<Instruction *, 64> ToErase;
1071 SmallPtrSet<LandingPadInst *, 32> LandingPads;
1072
1073 // rethrow.longjmp BB that will be shared within the function.
1074 BasicBlock *RethrowLongjmpBB = nullptr;
1075 // PHI node for the loaded value of __THREW__ global variable in
1076 // rethrow.longjmp BB
1077 PHINode *RethrowLongjmpBBThrewPHI = nullptr;
1078
1079 for (BasicBlock &BB : F) {
1080 auto *II = dyn_cast<InvokeInst>(BB.getTerminator());
1081 if (!II)
1082 continue;
1083 Changed = true;
1084 LandingPads.insert(II->getLandingPadInst());
1085 IRB.SetInsertPoint(II);
1086
1087 const Value *Callee = II->getCalledOperand();
1088 bool NeedInvoke = supportsException(&F) && canThrow(Callee);
1089 if (NeedInvoke) {
1090 // Wrap invoke with invoke wrapper and generate preamble/postamble
1091 Value *Threw = wrapInvoke(II);
1092 ToErase.push_back(II);
1093
1094 // If setjmp/longjmp handling is enabled, the thrown value can be not an
1095 // exception but a longjmp. If the current function contains calls to
1096 // setjmp, it will be appropriately handled in runSjLjOnFunction. But even
1097 // if the function does not contain setjmp calls, we shouldn't silently
1098 // ignore longjmps; we should rethrow them so they can be correctly
1099 // handled somewhere up the call chain where setjmp is. __THREW__'s
1100 // value is 0 when nothing happened, 1 when an exception is thrown, and
1101 // other values when longjmp is thrown.
1102 //
1103 // if (%__THREW__.val == 0 || %__THREW__.val == 1)
1104 // goto %tail
1105 // else
1106 // goto %rethrow.longjmp
1107 //
1108 // rethrow.longjmp: ;; This is longjmp. Rethrow it
1109 // %__threwValue.val = __threwValue
1110 // emscripten_longjmp(%__THREW__.val, %__threwValue.val);
1111 //
1112 // tail: ;; Nothing happened or an exception is thrown
1113 // ... Continue exception handling ...
1114 if (DoSjLj && EnableEmSjLj && !SetjmpUsers.count(&F) &&
1115 canLongjmp(Callee)) {
1116 // Create rethrow.longjmp BB once and share it within the function
1117 if (!RethrowLongjmpBB) {
1118 RethrowLongjmpBB = BasicBlock::Create(C, "rethrow.longjmp", &F);
1119 IRB.SetInsertPoint(RethrowLongjmpBB);
1120 RethrowLongjmpBBThrewPHI =
1121 IRB.CreatePHI(getAddrIntType(&M), 4, "threw.phi");
1122 RethrowLongjmpBBThrewPHI->addIncoming(Threw, &BB);
1123 Value *ThrewValue = IRB.CreateLoad(IRB.getInt32Ty(), ThrewValueGV,
1124 ThrewValueGV->getName() + ".val");
1125 IRB.CreateCall(EmLongjmpF, {RethrowLongjmpBBThrewPHI, ThrewValue});
1126 IRB.CreateUnreachable();
1127 } else {
1128 RethrowLongjmpBBThrewPHI->addIncoming(Threw, &BB);
1129 }
1130
1131 IRB.SetInsertPoint(II); // Restore the insert point back
1132 BasicBlock *Tail = BasicBlock::Create(C, "tail", &F);
1133 Value *CmpEqOne =
1134 IRB.CreateICmpEQ(Threw, getAddrSizeInt(&M, 1), "cmp.eq.one");
1135 Value *CmpEqZero =
1136 IRB.CreateICmpEQ(Threw, getAddrSizeInt(&M, 0), "cmp.eq.zero");
1137 Value *Or = IRB.CreateOr(CmpEqZero, CmpEqOne, "or");
1138 IRB.CreateCondBr(Or, Tail, RethrowLongjmpBB);
1139 IRB.SetInsertPoint(Tail);
1140 BB.replaceSuccessorsPhiUsesWith(&BB, Tail);
1141 }
1142
1143 // Insert a branch based on __THREW__ variable
1144 Value *Cmp = IRB.CreateICmpEQ(Threw, getAddrSizeInt(&M, 1), "cmp");
1145 IRB.CreateCondBr(Cmp, II->getUnwindDest(), II->getNormalDest());
1146
1147 } else {
1148 // This can't throw, and we don't need this invoke, just replace it with a
1149 // call+branch
1150 changeToCall(II);
1151 }
1152 }
1153
1154 // Process resume instructions
1155 for (BasicBlock &BB : F) {
1156 // Scan the body of the basic block for resumes
1157 for (Instruction &I : BB) {
1158 auto *RI = dyn_cast<ResumeInst>(&I);
1159 if (!RI)
1160 continue;
1161 Changed = true;
1162
1163 // Split the input into legal values
1164 Value *Input = RI->getValue();
1165 IRB.SetInsertPoint(RI);
1166 Value *Low = IRB.CreateExtractValue(Input, 0, "low");
1167 // Create a call to __resumeException function
1168 IRB.CreateCall(ResumeF, {Low});
1169 // Add a terminator to the block
1170 IRB.CreateUnreachable();
1171 ToErase.push_back(RI);
1172 }
1173 }
1174
1175 // Process llvm.eh.typeid.for intrinsics
1176 for (BasicBlock &BB : F) {
1177 for (Instruction &I : BB) {
1178 auto *CI = dyn_cast<CallInst>(&I);
1179 if (!CI)
1180 continue;
1181 const Function *Callee = CI->getCalledFunction();
1182 if (!Callee)
1183 continue;
1184 if (Callee->getIntrinsicID() != Intrinsic::eh_typeid_for)
1185 continue;
1186 Changed = true;
1187
1188 IRB.SetInsertPoint(CI);
1189 CallInst *NewCI =
1190 IRB.CreateCall(EHTypeIDF, CI->getArgOperand(0), "typeid");
1191 CI->replaceAllUsesWith(NewCI);
1192 ToErase.push_back(CI);
1193 }
1194 }
1195
1196 // Look for orphan landingpads, which can occur in blocks with no predecessors
1197 for (BasicBlock &BB : F) {
1198 Instruction *I = BB.getFirstNonPHI();
1199 if (auto *LPI = dyn_cast<LandingPadInst>(I))
1200 LandingPads.insert(LPI);
1201 }
1202 Changed |= !LandingPads.empty();
1203
1204 // Handle all the landingpads for this function together, as multiple invokes
1205 // may share a single lp
1206 for (LandingPadInst *LPI : LandingPads) {
1207 IRB.SetInsertPoint(LPI);
1208 SmallVector<Value *, 16> FMCArgs;
1209 for (unsigned I = 0, E = LPI->getNumClauses(); I < E; ++I) {
1210 Constant *Clause = LPI->getClause(I);
1211 // TODO Handle filters (= exception specifications).
1212 // https://github.com/llvm/llvm-project/issues/49740
1213 if (LPI->isCatch(I))
1214 FMCArgs.push_back(Clause);
1215 }
1216
1217 // Create a call to __cxa_find_matching_catch_N function
1218 Function *FMCF = getFindMatchingCatch(M, FMCArgs.size());
1219 CallInst *FMCI = IRB.CreateCall(FMCF, FMCArgs, "fmc");
1220 Value *Poison = PoisonValue::get(LPI->getType());
1221 Value *Pair0 = IRB.CreateInsertValue(Poison, FMCI, 0, "pair0");
1222 Value *TempRet0 = IRB.CreateCall(GetTempRet0F, std::nullopt, "tempret0");
1223 Value *Pair1 = IRB.CreateInsertValue(Pair0, TempRet0, 1, "pair1");
1224
1225 LPI->replaceAllUsesWith(Pair1);
1226 ToErase.push_back(LPI);
1227 }
1228
1229 // Erase everything we no longer need in this function
1230 for (Instruction *I : ToErase)
1231 I->eraseFromParent();
1232
1233 return Changed;
1234}
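
// Illustration only, not part of the pass: a minimal sketch of how the value
// loaded from __THREW__ is interpreted by the checks generated above. The
// names ThrewKind, classifyThrew and goesToTail are hypothetical and exist
// only for this example; the real pass emits IR compares and branches instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of a value loaded from __THREW__.
enum class ThrewKind { Nothing, Exception, Longjmp };

// 0 means nothing happened, 1 means an exception was thrown, and any other
// value means a longjmp is in flight.
inline ThrewKind classifyThrew(std::uintptr_t Threw) {
  if (Threw == 0)
    return ThrewKind::Nothing;
  if (Threw == 1)
    return ThrewKind::Exception;
  return ThrewKind::Longjmp;
}

// Mirrors the 'cmp.eq.zero or cmp.eq.one' test: true corresponds to branching
// to 'tail', false to branching to 'rethrow.longjmp'.
inline bool goesToTail(std::uintptr_t Threw) {
  return classifyThrew(Threw) != ThrewKind::Longjmp;
}
```

// Under these assumptions, goesToTail(0) and goesToTail(1) hold, while any
// other value is routed to the longjmp rethrow path.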
1235
1236// This tries to get debug info from the instruction before which a new
1237// instruction will be inserted, and if there's no debug info in that
1238// instruction, tries to get the info instead from the previous instruction (if
1239// any). If none of these has debug info and a DISubprogram is provided, it
1240 // creates a dummy debug info with the first line of the function, because the
1241 // IR verifier requires that all inlinable callsites have debug info when both
1242 // the caller and callee have a DISubprogram. If none of these conditions are met,
1243// returns empty info.
1244static DebugLoc getOrCreateDebugLoc(const Instruction *InsertBefore,
1245 DISubprogram *SP) {
1246 assert(InsertBefore);
1247 if (InsertBefore->getDebugLoc())
1248 return InsertBefore->getDebugLoc();
1249 const Instruction *Prev = InsertBefore->getPrevNode();
1250 if (Prev && Prev->getDebugLoc())
1251 return Prev->getDebugLoc();
1252 if (SP)
1253 return DILocation::get(SP->getContext(), SP->getLine(), 1, SP);
1254 return DebugLoc();
1255}
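
// Illustration only, not part of the pass: a standalone sketch of the fallback
// order described above, written without LLVM types. SimpleLoc and
// pickDebugLoc are hypothetical names; a null pointer stands in for "no debug
// info", and a subprogram line of 0 stands in for "no DISubprogram".

```cpp
#include <cassert>

// Hypothetical stand-in for a debug location.
struct SimpleLoc {
  bool Valid;
  unsigned Line;
  unsigned Col;
};

// Prefer the location of the instruction we insert before, then the previous
// instruction, then a dummy location at (subprogram line, column 1), and
// finally an empty location.
inline SimpleLoc pickDebugLoc(const SimpleLoc *InsertBefore,
                              const SimpleLoc *Prev,
                              unsigned SubprogramLine) {
  if (InsertBefore && InsertBefore->Valid)
    return *InsertBefore;
  if (Prev && Prev->Valid)
    return *Prev;
  if (SubprogramLine != 0)
    return SimpleLoc{true, SubprogramLine, 1};
  return SimpleLoc{false, 0, 0};
}
```

// With no usable neighbors and a subprogram starting at line 10, the sketch
// yields the dummy location (10, 1); with nothing available it yields an
// invalid location, corresponding to the empty DebugLoc() case.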
1256
1257bool WebAssemblyLowerEmscriptenEHSjLj::runSjLjOnFunction(Function &F) {
1258 assert(EnableEmSjLj || EnableWasmSjLj);
1259 Module &M = *F.getParent();
1260 LLVMContext &C = F.getContext();
1261 IRBuilder<> IRB(C);
1262 SmallVector<Instruction *, 64> ToErase;
1263
1264 // Setjmp preparation
1265
1266 BasicBlock *Entry = &F.getEntryBlock();
1267 DebugLoc FirstDL = getOrCreateDebugLoc(&*Entry->begin(), F.getSubprogram());
1268 SplitBlock(Entry, &*Entry->getFirstInsertionPt());
1269
1270 IRB.SetInsertPoint(Entry->getTerminator()->getIterator());
1271 // This alloca'ed pointer is used by the runtime to identify function
1272 // invocations. It's just for pointer comparisons. It will never be
1273 // dereferenced.
1274 Instruction *FunctionInvocationId =
1275 IRB.CreateAlloca(IRB.getInt32Ty(), nullptr, "functionInvocationId");
1276 FunctionInvocationId->setDebugLoc(FirstDL);
1277
1278 // Setjmp transformation
1279 SmallVector<PHINode *, 4> SetjmpRetPHIs;
1280 Function *SetjmpF = M.getFunction("setjmp");
1281 for (auto *U : make_early_inc_range(SetjmpF->users())) {
1282 auto *CB = cast<CallBase>(U);
1283 BasicBlock *BB = CB->getParent();
1284 if (BB->getParent() != &F) // in other function
1285 continue;
1286 if (CB->getOperandBundle(LLVMContext::OB_funclet)) {
1287 std::string S;
1288 raw_string_ostream SS(S);
1289 SS << "In function " + F.getName() +
1290 ": setjmp within a catch clause is not supported in Wasm EH:\n";
1291 SS << *CB;
1292 report_fatal_error(StringRef(SS.str()));
1293 }
1294
1295 CallInst *CI = nullptr;
1296 // setjmp cannot throw. So if it is an invoke, lower it to a call
1297 if (auto *II = dyn_cast<InvokeInst>(CB))
1298 CI = llvm::changeToCall(II);
1299 else
1300 CI = cast<CallInst>(CB);
1301
1302 // The tail is everything right after the call, and will be reached once
1303 // when setjmp is called, and later when longjmp returns to the setjmp
1304 BasicBlock *Tail = SplitBlock(BB, CI->getNextNode());
1305 // Add a phi to the tail, which will be the output of setjmp, which
1306 // indicates if this is the first call or a longjmp back. The phi directly
1307 // uses the right value based on where we arrive from
1308 IRB.SetInsertPoint(Tail, Tail->getFirstNonPHIIt());
1309 PHINode *SetjmpRet = IRB.CreatePHI(IRB.getInt32Ty(), 2, "setjmp.ret");
1310
1311 // setjmp initial call returns 0
1312 SetjmpRet->addIncoming(IRB.getInt32(0), BB);
1313 // The proper output is now this, not the setjmp call itself
1314 CI->replaceAllUsesWith(SetjmpRet);
1315 // longjmp returns to the setjmp will add themselves to this phi
1316 SetjmpRetPHIs.push_back(SetjmpRet);
1317
1318 // Fix call target
1319 // Our index in the function is our place in the array + 1 to avoid index
1320 // 0, because index 0 means the longjmp is not ours to handle.
1321 IRB.SetInsertPoint(CI);
1322 Value *Args[] = {CI->getArgOperand(0), IRB.getInt32(SetjmpRetPHIs.size()),
1323 FunctionInvocationId};
1324 IRB.CreateCall(WasmSetjmpF, Args);
1325 ToErase.push_back(CI);
1326 }
1327
1328 // Handle longjmpable calls.
1329 if (EnableEmSjLj)
1330 handleLongjmpableCallsForEmscriptenSjLj(F, FunctionInvocationId,
1331 SetjmpRetPHIs);
1332 else // EnableWasmSjLj
1333 handleLongjmpableCallsForWasmSjLj(F, FunctionInvocationId, SetjmpRetPHIs);
1334
1335 // Erase everything we no longer need in this function
1336 for (Instruction *I : ToErase)
1337 I->eraseFromParent();
1338
1339 // Finally, our modifications to the cfg can break dominance of SSA variables.
1340 // For example, in this code,
1341 // if (x()) { .. setjmp() .. }
1342 // if (y()) { .. longjmp() .. }
1343 // We must split the longjmp block, and it can jump into the block split
1344 // from the setjmp one. But that means that when we split the setjmp block, its
1345 // first part no longer dominates its second part - there is a theoretically
1346 // possible control flow path where x() is false, then y() is true and we
1347 // reach the second part of the setjmp block, without ever reaching the first
1348 // part. So, we rebuild SSA form here.
1349 rebuildSSA(F);
1350 return true;
1351}
1352
1353// Update each call that can longjmp so it can return to the corresponding
1354// setjmp. Refer to 4) of "Emscripten setjmp/longjmp handling" section in the
1355// comments at top of the file for details.
1356void WebAssemblyLowerEmscriptenEHSjLj::handleLongjmpableCallsForEmscriptenSjLj(
1357 Function &F, Instruction *FunctionInvocationId,
1358 SmallVectorImpl<PHINode *> &SetjmpRetPHIs) {
1359 Module &M = *F.getParent();
1360 LLVMContext &C = F.getContext();
1361 IRBuilder<> IRB(C);
1362 SmallVector<Instruction *, 64> ToErase;
1363
1364 // call.em.longjmp BB that will be shared within the function.
1365 BasicBlock *CallEmLongjmpBB = nullptr;
1366 // PHI node for the loaded value of __THREW__ global variable in
1367 // call.em.longjmp BB
1368 PHINode *CallEmLongjmpBBThrewPHI = nullptr;
1369 // PHI node for the loaded value of __threwValue global variable in
1370 // call.em.longjmp BB
1371 PHINode *CallEmLongjmpBBThrewValuePHI = nullptr;
1372 // rethrow.exn BB that will be shared within the function.
1373 BasicBlock *RethrowExnBB = nullptr;
1374
1375 // Because we are creating new BBs while processing and don't want to make
1376 // all these newly created BBs candidates again for longjmp processing, we
1377 // first make the vector of candidate BBs.
1378 std::vector<BasicBlock *> BBs;
1379 for (BasicBlock &BB : F)
1380 BBs.push_back(&BB);
1381
1382 // BBs.size() will change within the loop, so we query it every time
1383 for (unsigned I = 0; I < BBs.size(); I++) {
1384 BasicBlock *BB = BBs[I];
1385 for (Instruction &I : *BB) {
1386 if (isa<InvokeInst>(&I)) {
1387 std::string S;
1388 raw_string_ostream SS(S);
1389 SS << "In function " << F.getName()
1390 << ": When using Wasm EH with Emscripten SjLj, there is a "
1391 "restriction that `setjmp` function call and exception cannot be "
1392 "used within the same function:\n";
1393 SS << I;
1394 report_fatal_error(StringRef(SS.str()));
1395 }
1396 auto *CI = dyn_cast<CallInst>(&I);
1397 if (!CI)
1398 continue;
1399
1400 const Value *Callee = CI->getCalledOperand();
1401 if (!canLongjmp(Callee))
1402 continue;
1403 if (isEmAsmCall(Callee))
1404 report_fatal_error("Cannot use EM_ASM* alongside setjmp/longjmp in " +
1405 F.getName() +
1406 ". Please consider using EM_JS, or move the "
1407 "EM_ASM into another function.",
1408 false);
1409
1410 Value *Threw = nullptr;
1411 BasicBlock *Tail;
1412 if (Callee->getName().starts_with("__invoke_")) {
1413 // If invoke wrapper has already been generated for this call in
1414 // previous EH phase, search for the load instruction
1415 // %__THREW__.val = __THREW__;
1416 // in postamble after the invoke wrapper call
1417 LoadInst *ThrewLI = nullptr;
1418 StoreInst *ThrewResetSI = nullptr;
1419 for (auto I = std::next(BasicBlock::iterator(CI)), IE = BB->end();
1420 I != IE; ++I) {
1421 if (auto *LI = dyn_cast<LoadInst>(I))
1422 if (auto *GV = dyn_cast<GlobalVariable>(LI->getPointerOperand()))
1423 if (GV == ThrewGV) {
1424 Threw = ThrewLI = LI;
1425 break;
1426 }
1427 }
1428 // Search for the store instruction after the load above
1429 // __THREW__ = 0;
1430 for (auto I = std::next(BasicBlock::iterator(ThrewLI)), IE = BB->end();
1431 I != IE; ++I) {
1432 if (auto *SI = dyn_cast<StoreInst>(I)) {
1433 if (auto *GV = dyn_cast<GlobalVariable>(SI->getPointerOperand())) {
1434 if (GV == ThrewGV &&
1435 SI->getValueOperand() == getAddrSizeInt(&M, 0)) {
1436 ThrewResetSI = SI;
1437 break;
1438 }
1439 }
1440 }
1441 }
1442 assert(Threw && ThrewLI && "Cannot find __THREW__ load after invoke");
1443 assert(ThrewResetSI && "Cannot find __THREW__ store after invoke");
1444 Tail = SplitBlock(BB, ThrewResetSI->getNextNode());
1445
1446 } else {
1447 // Wrap call with invoke wrapper and generate preamble/postamble
1448 Threw = wrapInvoke(CI);
1449 ToErase.push_back(CI);
1450 Tail = SplitBlock(BB, CI->getNextNode());
1451
1452 // If exception handling is enabled, the thrown value can be not a
1453 // longjmp but an exception, in which case we shouldn't silently ignore
1454 // exceptions; we should rethrow them.
1455 // __THREW__'s value is 0 when nothing happened, 1 when an exception is
1456 // thrown, other values when longjmp is thrown.
1457 //
1458 // if (%__THREW__.val == 1)
1459 // goto %rethrow.exn
1460 // else
1461 // goto %normal
1462 //
1463 // rethrow.exn: ;; Rethrow exception
1464 // %exn = call @__cxa_find_matching_catch_2() ;; Retrieve thrown ptr
1465 // __resumeException(%exn)
1466 //
1467 // normal:
1468 // <-- Insertion point. Will insert sjlj handling code from here
1469 // goto %tail
1470 //
1471 // tail:
1472 // ...
1473 if (supportsException(&F) && canThrow(Callee)) {
1474 // We will add a new conditional branch. So remove the branch created
1475 // when we split the BB
1476 ToErase.push_back(BB->getTerminator());
1477
1478 // Generate rethrow.exn BB once and share it within the function
1479 if (!RethrowExnBB) {
1480 RethrowExnBB = BasicBlock::Create(C, "rethrow.exn", &F);
1481 IRB.SetInsertPoint(RethrowExnBB);
1482 CallInst *Exn =
1483 IRB.CreateCall(getFindMatchingCatch(M, 0), {}, "exn");
1484 IRB.CreateCall(ResumeF, {Exn});
1485 IRB.CreateUnreachable();
1486 }
1487
1488 IRB.SetInsertPoint(CI);
1489 BasicBlock *NormalBB = BasicBlock::Create(C, "normal", &F);
1490 Value *CmpEqOne =
1491 IRB.CreateICmpEQ(Threw, getAddrSizeInt(&M, 1), "cmp.eq.one");
1492 IRB.CreateCondBr(CmpEqOne, RethrowExnBB, NormalBB);
1493
1494 IRB.SetInsertPoint(NormalBB);
1495 IRB.CreateBr(Tail);
1496 BB = NormalBB; // New insertion point to insert __wasm_setjmp_test()
1497 }
1498 }
1499
1500 // We need to replace the terminator in BB - SplitBlock makes BB go
1501 // straight to Tail, but we need to check if a longjmp occurred, and go to
1502 // the right setjmp-tail if so
1503 ToErase.push_back(BB->getTerminator());
1504
1505 // Generate a function call to __wasm_setjmp_test function and
1506 // preamble/postamble code to figure out (1) whether longjmp
1507 // occurred (2) if longjmp occurred, which setjmp it corresponds to
1508 Value *Label = nullptr;
1509 Value *LongjmpResult = nullptr;
1510 BasicBlock *EndBB = nullptr;
1511 wrapTestSetjmp(BB, CI->getDebugLoc(), Threw, FunctionInvocationId, Label,
1512 LongjmpResult, CallEmLongjmpBB, CallEmLongjmpBBThrewPHI,
1513 CallEmLongjmpBBThrewValuePHI, EndBB);
1514 assert(Label && LongjmpResult && EndBB);
1515
1516 // Create switch instruction
1517 IRB.SetInsertPoint(EndBB);
1518 IRB.SetCurrentDebugLocation(EndBB->back().getDebugLoc());
1519 SwitchInst *SI = IRB.CreateSwitch(Label, Tail, SetjmpRetPHIs.size());
1520 // -1 means no longjmp happened, continue normally (will hit the default
1521 // switch case). 0 means a longjmp that is not ours to handle, needs a
1522 // rethrow. Otherwise the label is the setjmp's index in SetjmpRetPHIs plus 1
1523 // (to avoid 0).
1524 for (unsigned I = 0; I < SetjmpRetPHIs.size(); I++) {
1525 SI->addCase(IRB.getInt32(I + 1), SetjmpRetPHIs[I]->getParent());
1526 SetjmpRetPHIs[I]->addIncoming(LongjmpResult, EndBB);
1527 }
1528
1529 // We are splitting the block here, and must continue to find other calls
1530 // in the block, which is now split. So continue to traverse in the Tail
1531 BBs.push_back(Tail);
1532 }
1533 }
1534
1535 for (Instruction *I : ToErase)
1536 I->eraseFromParent();
1537}
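
// Illustration only, not part of the pass: a minimal sketch of how the label
// returned by __wasm_setjmp_test is interpreted. In the generated IR, label 0
// is branched away (to call.em.longjmp) before the switch is reached; this
// sketch folds both steps into one function for readability. Target, Dispatch
// and dispatchLabel are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of where control goes for a given label.
enum class Target { Continue, Rethrow, PostSetjmp };

struct Dispatch {
  Target Kind;
  std::size_t SetjmpIndex; // Meaningful only when Kind == PostSetjmp.
};

inline Dispatch dispatchLabel(int Label, std::size_t NumSetjmps) {
  // 0: a longjmp happened, but it targets a setjmp in another invocation.
  if (Label == 0)
    return {Target::Rethrow, 0};
  // 1..N: the label is the setjmp's position in the PHI list plus 1.
  if (Label >= 1 && static_cast<std::size_t>(Label) <= NumSetjmps)
    return {Target::PostSetjmp, static_cast<std::size_t>(Label) - 1};
  // -1 (no longjmp happened) and any other value take the switch default.
  return {Target::Continue, 0};
}
```

// For a function with three setjmp calls, label 2 selects the second
// post-setjmp block (index 1), label 0 requests a rethrow, and label -1 falls
// through to the normal continuation.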
1538
1539static BasicBlock *getCleanupRetUnwindDest(const CleanupPadInst *CPI) {
1540 for (const User *U : CPI->users())
1541 if (const auto *CRI = dyn_cast<CleanupReturnInst>(U))
1542 return CRI->getUnwindDest();
1543 return nullptr;
1544}
1545
1546// Create a catchpad in which we catch a longjmp's env and val arguments, test
1547// if the longjmp corresponds to one of setjmps in the current function, and if
1548// so, jump to the setjmp dispatch BB from which we go to one of post-setjmp
1549// BBs. Refer to 4) of "Wasm setjmp/longjmp handling" section in the comments at
1550// top of the file for details.
1551void WebAssemblyLowerEmscriptenEHSjLj::handleLongjmpableCallsForWasmSjLj(
1552 Function &F, Instruction *FunctionInvocationId,
1553 SmallVectorImpl<PHINode *> &SetjmpRetPHIs) {
1554 Module &M = *F.getParent();
1555 LLVMContext &C = F.getContext();
1556 IRBuilder<> IRB(C);
1557
1558 // A function with catchswitch/catchpad instruction should have a personality
1559 // function attached to it. Search for the wasm personality function, and if
1560 // it exists, use it, and if it doesn't, create a dummy personality function.
1561 // (SjLj is not going to call it anyway.)
1562 if (!F.hasPersonalityFn()) {
1563 StringRef PersName = getEHPersonalityName(EHPersonality::Wasm_CXX);
1564 FunctionType *PersType =
1565 FunctionType::get(IRB.getInt32Ty(), /* isVarArg */ true);
1566 Value *PersF = M.getOrInsertFunction(PersName, PersType).getCallee();
1567 F.setPersonalityFn(
1568 cast<Constant>(IRB.CreateBitCast(PersF, IRB.getPtrTy())));
1569 }
1570
1571 // Use the entry BB's debugloc as a fallback
1572 BasicBlock *Entry = &F.getEntryBlock();
1573 DebugLoc FirstDL = getOrCreateDebugLoc(&*Entry->begin(), F.getSubprogram());
1574 IRB.SetCurrentDebugLocation(FirstDL);
1575
1576 // Add setjmp.dispatch BB right after the entry block. Because we have
1577 // initialized functionInvocationId in the entry block and split the
1578 // rest into another BB, here 'OrigEntry' is the function's original entry
1579 // block before the transformation.
1580 //
1581 // entry:
1582 // functionInvocationId initialization
1583 // setjmp.dispatch:
1584 // switch will be inserted here later
1585 // entry.split: (OrigEntry)
1586 // the original function starts here
1587 BasicBlock *OrigEntry = Entry->getNextNode();
1588 BasicBlock *SetjmpDispatchBB =
1589 BasicBlock::Create(C, "setjmp.dispatch", &F, OrigEntry);
1590 cast<BranchInst>(Entry->getTerminator())->setSuccessor(0, SetjmpDispatchBB);
1591
1592 // Create catch.dispatch.longjmp BB and a catchswitch instruction
1593 BasicBlock *CatchDispatchLongjmpBB =
1594 BasicBlock::Create(C, "catch.dispatch.longjmp", &F);
1595 IRB.SetInsertPoint(CatchDispatchLongjmpBB);
1596 CatchSwitchInst *CatchSwitchLongjmp =
1597 IRB.CreateCatchSwitch(ConstantTokenNone::get(C), nullptr, 1);
1598
1599 // Create catch.longjmp BB and a catchpad instruction
1600 BasicBlock *CatchLongjmpBB = BasicBlock::Create(C, "catch.longjmp", &F);
1601 CatchSwitchLongjmp->addHandler(CatchLongjmpBB);
1602 IRB.SetInsertPoint(CatchLongjmpBB);
1603 CatchPadInst *CatchPad = IRB.CreateCatchPad(CatchSwitchLongjmp, {});
1604
1605 // Wasm throw and catch instructions can throw and catch multiple values, but
1606 // that requires multivalue support in the toolchain, which is currently not
1607 // very reliable. We instead throw and catch a pointer to a struct value of
1608 // type 'struct __WasmLongjmpArgs', which is defined in Emscripten.
  Instruction *LongjmpArgs =
      IRB.CreateCall(CatchF, {IRB.getInt32(WebAssembly::C_LONGJMP)}, "thrown");
  Value *EnvField =
      IRB.CreateConstGEP2_32(LongjmpArgsTy, LongjmpArgs, 0, 0, "env_gep");
  Value *ValField =
      IRB.CreateConstGEP2_32(LongjmpArgsTy, LongjmpArgs, 0, 1, "val_gep");
  // void *env = __wasm_longjmp_args.env;
  Instruction *Env = IRB.CreateLoad(IRB.getPtrTy(), EnvField, "env");
  // int val = __wasm_longjmp_args.val;
  Instruction *Val = IRB.CreateLoad(IRB.getInt32Ty(), ValField, "val");

  // %label = __wasm_setjmp_test(%env, functionInvocationId);
  // if (%label == 0)
  //   __wasm_longjmp(%env, %val)
  // catchret to %setjmp.dispatch
  BasicBlock *ThenBB = BasicBlock::Create(C, "if.then", &F);
  BasicBlock *EndBB = BasicBlock::Create(C, "if.end", &F);
  Value *EnvP = IRB.CreateBitCast(Env, getAddrPtrType(&M), "env.p");
  Value *Label = IRB.CreateCall(WasmSetjmpTestF, {EnvP, FunctionInvocationId},
                                OperandBundleDef("funclet", CatchPad), "label");
  Value *Cmp = IRB.CreateICmpEQ(Label, IRB.getInt32(0));
  IRB.CreateCondBr(Cmp, ThenBB, EndBB);

  IRB.SetInsertPoint(ThenBB);
  CallInst *WasmLongjmpCI = IRB.CreateCall(
      WasmLongjmpF, {Env, Val}, OperandBundleDef("funclet", CatchPad));
  IRB.CreateUnreachable();

  IRB.SetInsertPoint(EndBB);
  // Jump to setjmp.dispatch block
  IRB.CreateCatchRet(CatchPad, SetjmpDispatchBB);

  // Go back to setjmp.dispatch BB
  // setjmp.dispatch:
  //   switch %label {
  //     label 1: goto post-setjmp BB 1
  //     label 2: goto post-setjmp BB 2
  //     ...
  //     default: goto split next BB
  //   }
  IRB.SetInsertPoint(SetjmpDispatchBB);
  PHINode *LabelPHI = IRB.CreatePHI(IRB.getInt32Ty(), 2, "label.phi");
  LabelPHI->addIncoming(Label, EndBB);
  LabelPHI->addIncoming(IRB.getInt32(-1), Entry);
  SwitchInst *SI = IRB.CreateSwitch(LabelPHI, OrigEntry, SetjmpRetPHIs.size());
  // -1 means no longjmp happened, continue normally (will hit the default
  // switch case). 0 means a longjmp that is not ours to handle, needs a
  // rethrow. Otherwise the index is the same as the index in P+1 (to avoid
  // 0).
  for (unsigned I = 0; I < SetjmpRetPHIs.size(); I++) {
    SI->addCase(IRB.getInt32(I + 1), SetjmpRetPHIs[I]->getParent());
    SetjmpRetPHIs[I]->addIncoming(Val, SetjmpDispatchBB);
  }

  // Convert all longjmpable call instructions to invokes that unwind to the
  // newly created catch.dispatch.longjmp BB.
  SmallVector<CallInst *, 64> LongjmpableCalls;
  for (auto *BB = &*F.begin(); BB; BB = BB->getNextNode()) {
    for (auto &I : *BB) {
      auto *CI = dyn_cast<CallInst>(&I);
      if (!CI)
        continue;
      const Value *Callee = CI->getCalledOperand();
      if (!canLongjmp(Callee))
        continue;
      if (isEmAsmCall(Callee))
        report_fatal_error("Cannot use EM_ASM* alongside setjmp/longjmp in " +
                               F.getName() +
                               ". Please consider using EM_JS, or move the "
                               "EM_ASM into another function.",
                           false);
      // This is the __wasm_longjmp() call we inserted in this function, which
      // rethrows the longjmp when the longjmp does not correspond to one of
      // the setjmps in this function. We should not convert this call to an
      // invoke.
      if (CI == WasmLongjmpCI)
        continue;
      LongjmpableCalls.push_back(CI);
    }
  }

  for (auto *CI : LongjmpableCalls) {
    // Even if the callee function has attribute 'nounwind', which is true for
    // all C functions, it can longjmp, which means it can throw a Wasm
    // exception now.
    CI->removeFnAttr(Attribute::NoUnwind);
    if (Function *CalleeF = CI->getCalledFunction())
      CalleeF->removeFnAttr(Attribute::NoUnwind);

    // Change it to an invoke and make it unwind to the catch.dispatch.longjmp
    // BB. If the call is enclosed in another catchpad/cleanuppad scope, unwind
    // to its parent pad's unwind destination instead to preserve the scope
    // structure. It will eventually unwind to the catch.dispatch.longjmp.
    BasicBlock *UnwindDest = nullptr;
    if (auto Bundle = CI->getOperandBundle(LLVMContext::OB_funclet)) {
      Instruction *FromPad = cast<Instruction>(Bundle->Inputs[0]);
      while (!UnwindDest) {
        if (auto *CPI = dyn_cast<CatchPadInst>(FromPad)) {
          UnwindDest = CPI->getCatchSwitch()->getUnwindDest();
          break;
        }
        if (auto *CPI = dyn_cast<CleanupPadInst>(FromPad)) {
          // getCleanupRetUnwindDest() can return nullptr when
          // 1. This cleanuppad's matching cleanupret unwinds to caller
          // 2. There is no matching cleanupret because it ends with
          //    unreachable.
          // In case of 2, we need to traverse the parent pad chain.
          UnwindDest = getCleanupRetUnwindDest(CPI);
          Value *ParentPad = CPI->getParentPad();
          if (isa<ConstantTokenNone>(ParentPad))
            break;
          FromPad = cast<Instruction>(ParentPad);
        }
      }
    }
    if (!UnwindDest)
      UnwindDest = CatchDispatchLongjmpBB;
    changeToInvokeAndSplitBasicBlock(CI, UnwindDest);
  }

  for (auto &BB : F) {
    if (auto *CSI = dyn_cast<CatchSwitchInst>(BB.getFirstNonPHI())) {
      if (CSI != CatchSwitchLongjmp && CSI->unwindsToCaller()) {
        IRB.SetInsertPoint(CSI);
        ToErase.push_back(CSI);
        auto *NewCSI = IRB.CreateCatchSwitch(CSI->getParentPad(),
                                             CatchDispatchLongjmpBB, 1);
        NewCSI->addHandler(*CSI->handler_begin());
        NewCSI->takeName(CSI);
        CSI->replaceAllUsesWith(NewCSI);
      }
    }

    if (auto *CRI = dyn_cast<CleanupReturnInst>(BB.getTerminator())) {
      if (CRI->unwindsToCaller()) {
        IRB.SetInsertPoint(CRI);
        ToErase.push_back(CRI);
        IRB.CreateCleanupRet(CRI->getCleanupPad(), CatchDispatchLongjmpBB);
      }
    }
  }

  for (Instruction *I : ToErase)
    I->eraseFromParent();
}
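
The label convention used by the setjmp.dispatch switch in the function above can be modeled outside LLVM. The following is a minimal standalone sketch and not part of this pass: `DispatchAction`, `DispatchResult`, and `classifyLabel` are hypothetical names introduced only for illustration. It assumes, as the comments in the function state, that -1 means no longjmp happened, 0 means the longjmp is not ours to handle and is rethrown, and I + 1 selects the I-th post-setjmp block.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the label values described in the comments above.
enum class DispatchAction {
  ContinueNormally,  // label == -1: default switch case, run OrigEntry
  Rethrow,           // label == 0: not our setjmp, call __wasm_longjmp again
  ResumeAfterSetjmp  // label == I + 1: jump to the I-th post-setjmp block
};

struct DispatchResult {
  DispatchAction Action;
  std::size_t SetjmpIndex; // Only meaningful for ResumeAfterSetjmp.
};

// Classifies a label the way catch.longjmp and setjmp.dispatch do together.
// NumSetjmps plays the role of SetjmpRetPHIs.size() (an assumption of this
// sketch, not an API of the pass).
inline DispatchResult classifyLabel(int Label, std::size_t NumSetjmps) {
  if (Label == 0)
    return {DispatchAction::Rethrow, 0};
  if (Label >= 1 && static_cast<std::size_t>(Label) <= NumSetjmps)
    return {DispatchAction::ResumeAfterSetjmp,
            static_cast<std::size_t>(Label) - 1};
  // -1 (and any value without a switch case) takes the default edge.
  return {DispatchAction::ContinueNormally, 0};
}
```

With two setjmp call sites, for instance, label 2 maps to index 1, which mirrors the `I + 1` case values added to the switch. In the pass itself the 0 case never reaches the switch, because the if.then block rethrows before the catchret.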