Formalizing the Concurrency Semantics of an LLVM Fragment

Soham Chakraborty, Viktor Vafeiadis

Max Planck Institute for Software Systems (MPI-SWS)

EuroLLVM 2017

# LLVM Compilation




# LLVM Concurrency Compilation



# Correctness of the transformations is unclear

Limitations of LLVM's informal concurrency semantics

A valid optimization was removed by an over-restrictive bug fix



- Formalized fragment of LLVM concurrency
- Verified correctness of transformations
- Validated LLVM opt-phase transformations

Informal text in Language Reference Manual

Frequent references to C11 concurrency

- "This model is inspired by the C++0x memory model."
- "These semantics are borrowed from Java and C++0x, but are somewhat more colloquial."
- "This is intended to match shared variables in C/C++ ..."



Subtle differences

- A program has a write-read race on non-atomics
  - C11: the behavior of the program is undefined
  - LLVM: *defined* behavior; the racy read returns **undef(u)**

$$X = 1; \quad \parallel \quad \begin{array}{l} \text{if}(X) \\ \quad t = 4; \\ \text{else} \\ \quad t = 4; \end{array}$$

Is $t \neq 4$ possible? C11 $\checkmark$ LLVM $\times$

- The sets of allowed optimizations are different
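The LLVM verdict for the racy program can be illustrated with a toy model. This is only a sketch under a loud assumption: `undef(u)` is modelled as "some arbitrary value", sampled here from a small hypothetical domain `UNDEF_SAMPLES`. The names `thread2` and `possible_t_values` are illustrative and not part of the formalization.

```python
# Assumption: a racy non-atomic read yields undef(u), modelled as an
# arbitrary value drawn from this small, hypothetical sample domain.
UNDEF_SAMPLES = [0, 1, -1, 2**31 - 1]


def thread2(x_read):
    """Second thread of the example: branch on the value read from X."""
    if x_read:
        t = 4
    else:
        t = 4
    return t


def possible_t_values():
    """All values t can take when the read of X returns an arbitrary value."""
    return {thread2(u) for u in UNDEF_SAMPLES}


print(possible_t_values())
```

Whatever value the racy read returns, both branches assign 4, so $t \neq 4$ is not observable in this model. Under C11 the race makes the whole program undefined, so no such guarantee exists.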

#### C11 vs LLVM

Context: $X = 1; \parallel [\,\cdot\,]$

$$\text{if}(flag)\{\ a = X;\ \} \;\rightsquigarrow\; t = X;\ \text{if}(flag)\{\ a = t;\ \}$$

C11 $\times$ LLVM $\checkmark$

C11 $\checkmark$ LLVM $\times$

# Formalization of LLVM concurrency

# Verified correctness of transformations

Validated LLVM opt-phase transformations


# Example

$$\text{int } X = 0, Y = 0;$$

$$\begin{array}{l||l} a = X; & b = Y; \\ Y = 1; & X = 1; \end{array}$$

Is $a == b == 1$ possible? $\checkmark$

Reordering the two independent accesses of the first thread exposes this outcome:

$$\begin{array}{l||l} a = X; & b = Y; \\ Y = 1; & X = 1; \end{array} \;\rightsquigarrow\; \begin{array}{l||l} Y = 1; & b = Y; \\ a = X; & X = 1; \end{array}$$

[Figure: event structure of the program, starting from the initialization writes $WX0$ and $WY0$, with program-order edges]
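A small enumeration makes the example concrete. The sketch below is not the event-structure semantics of the talk; it assumes plain interleaving (sequentially consistent) execution and simply lists the reachable values of `(a, b)`, first for the program as written and then with the first thread's two accesses swapped. All names (`interleavings`, `run`, `outcomes`) are illustrative.

```python
from itertools import combinations


def interleavings(t1, t2):
    """Yield every merge of t1 and t2 that preserves each thread's own order."""
    n, m = len(t1), len(t2)
    for slots in combinations(range(n + m), n):
        chosen = set(slots)
        i1, i2 = iter(t1), iter(t2)
        yield [next(i1) if k in chosen else next(i2) for k in range(n + m)]


def run(schedule):
    """Execute one interleaving and return the final values of (a, b)."""
    mem = {"X": 0, "Y": 0}
    regs = {"a": 0, "b": 0}
    for op, dst, src in schedule:
        if op == "load":      # dst is a register, src a location
            regs[dst] = mem[src]
        else:                 # "store": dst is a location, src a constant
            mem[dst] = src
    return regs["a"], regs["b"]


def outcomes(t1, t2):
    """Set of (a, b) results over all interleavings of the two threads."""
    return {run(s) for s in interleavings(t1, t2)}


T1 = [("load", "a", "X"), ("store", "Y", 1)]
T2 = [("load", "b", "Y"), ("store", "X", 1)]
T1_REORDERED = [T1[1], T1[0]]  # Y = 1; a = X;

source = outcomes(T1, T2)
reordered = outcomes(T1_REORDERED, T2)
print(sorted(source))
print(sorted(reordered))
```

Under interleaving alone the source never produces `(1, 1)`; once the first thread is reordered, the schedule `Y = 1; b = Y; X = 1; a = X` does. A model that permits the reordering therefore has to admit $a == b == 1$.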
























#### Execution from Event Structure




- Memory operations:
  - load
  - store
  - compare\_and\_swap (CAS)
- Memory orders:
  - non-atomic (na)
  - acquire (acq)
  - release (rel)
  - acquire\_release (acq\_rel)
  - sequentially consistent (sc)

Formalized fragment of LLVM concurrency

# Verified correctness of transformations

- Elimination
- Reordering
- Mappings (C11  $\rightsquigarrow$  LLVM  $\rightsquigarrow$  X86/Power)

Validated LLVM opt-phase transformations

# Transformation Correctness

![](_page_35_Figure_1.jpeg)

# Behavior $(P_{tgt}) \subseteq$ Behavior $(P_{src})$ Behavior: final values observed in each location

 $Behavior(G_{tgt}) \subseteq Behavior(G_{src})$ 

♠
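The inclusion criterion can be sketched as a plain subset test. The code below assumes an interleaving semantics over store-only threads as a stand-in for the event-structure model, and takes "behavior" to be the set of final memories. The helper names (`final_states`, `behavior`, `is_correct`) and the sample programs are hypothetical.

```python
def final_states(threads, mem):
    """Final memories reachable by interleaving the threads' stores.

    threads: tuple of tuples of (location, value) stores.
    mem: tuple of (location, value) pairs, sorted by location.
    """
    if all(len(t) == 0 for t in threads):
        return {mem}
    results = set()
    for i, t in enumerate(threads):
        if t:
            loc, val = t[0]
            new_mem = tuple(sorted({**dict(mem), loc: val}.items()))
            rest = threads[:i] + (t[1:],) + threads[i + 1:]
            results |= final_states(rest, new_mem)
    return results


def behavior(threads, init):
    """Behavior of a program: final values observed in each location."""
    frozen = tuple(tuple(t) for t in threads)
    return final_states(frozen, tuple(sorted(init.items())))


def is_correct(src_threads, tgt_threads, init):
    """Check Behavior(P_tgt) is a subset of Behavior(P_src)."""
    return behavior(tgt_threads, init) <= behavior(src_threads, init)


INIT = {"X": 0}
SRC = [[("X", 1), ("X", 2)], [("X", 5)]]
TGT = [[("X", 2)], [("X", 5)]]  # overwritten write X = 1 eliminated
BAD = [[("X", 1)], [("X", 5)]]  # wrong: the later write was dropped instead

print(is_correct(SRC, TGT, INIT))
print(is_correct(SRC, BAD, INIT))
```

Eliminating the overwritten write keeps the final values of `X` within `{2, 5}`, so the check succeeds; dropping the later write lets `X` end at `1`, which the source never produces.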

#### **Elimination Optimizations**

Adjacent read after read/write elimination

- $a = X_o;\ b = X_{na}; \rightsquigarrow a = X_o;\ b = a;$
- $X_o = v;\ b = X_{na}; \rightsquigarrow X_o = v;\ b = v;$

Adjacent overwritten write elimination

- $X_{na} = v';\ X_{na} = v; \rightsquigarrow X_{na} = v;$

Non-adjacent overwritten write elimination

- $X_{na} = v';\ C;\ X_{na} = v; \rightsquigarrow C;\ X_{na} = v;$ where rel-acq-pair $\notin C$ and $access(X) \notin C$

LLVM performs these eliminations
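The three adjacent rules can be sketched as a single peephole pass over a straight-line instruction list. This is an illustration only, not LLVM's implementation: the `Load`, `Store`, and `Move` tuples and the function `eliminate_adjacent` are hypothetical, and the non-adjacent rule is left out.

```python
from collections import namedtuple

Load = namedtuple("Load", "reg loc order")    # reg = loc_order
Store = namedtuple("Store", "loc val order")  # loc_order = val
Move = namedtuple("Move", "reg val")          # reg = val, no memory access


def eliminate_adjacent(prog):
    """One left-to-right pass applying the adjacent elimination rules."""
    out = []
    for ins in prog:
        prev = out[-1] if out else None
        if isinstance(ins, Load) and ins.order == "na":
            # a = X_o; b = X_na  ~>  a = X_o; b = a
            if isinstance(prev, Load) and prev.loc == ins.loc:
                out.append(Move(ins.reg, prev.reg))
                continue
            # X_o = v; b = X_na  ~>  X_o = v; b = v
            if isinstance(prev, Store) and prev.loc == ins.loc:
                out.append(Move(ins.reg, prev.val))
                continue
        if (isinstance(ins, Store) and ins.order == "na"
                and isinstance(prev, Store) and prev.order == "na"
                and prev.loc == ins.loc):
            # X_na = v'; X_na = v  ~>  X_na = v
            out[-1] = ins
            continue
        out.append(ins)
    return out


prog = [
    Load("a", "X", "acq"),
    Load("b", "X", "na"),
    Store("Y", 1, "na"),
    Store("Y", 2, "na"),
    Load("c", "Y", "na"),
]
optimized = eliminate_adjacent(prog)
print(optimized)
```

The pass only looks at the immediately preceding instruction, so a load whose predecessor has already been rewritten into a `Move` is left alone. That keeps the sketch within the "adjacent" rules.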

#### Also Proved...

Adjacent read after read/write elimination

- $a = X_{acq};\ b = X_{acq}; \rightsquigarrow a = X_{acq};\ b = a;$
- $a = X_{sc};\ b = X_{(acq|sc)}; \rightsquigarrow a = X_{sc};\ b = a;$
- $X_{rel} = v;\ b = X_{acq}; \rightsquigarrow X_{rel} = v;\ b = v;$
- $X_{sc} = v;\ b = X_{(acq|sc)}; \rightsquigarrow X_{sc} = v;\ b = v;$

Adjacent overwritten write elimination

- $X_{rel} = v';\ X_{rel} = v; \rightsquigarrow X_{rel} = v;$
- $X_{(rel|sc)} = v';\ X_{sc} = v; \rightsquigarrow X_{sc} = v;$

Non-adjacent read after write elimination

- $X_{na} = v;\ C;\ a = X_{na}; \rightsquigarrow X_{na} = v;\ C;\ a = v;$ where rel-acq-pair $\notin C$ and $access(X) \notin C$

LLVM does NOT perform these eliminations

Formalized fragment of LLVM concurrency

# Verified correctness of transformations

- Elimination
- Reordering  $(a; b \rightsquigarrow b; a)$
- Mappings (C11  $\rightsquigarrow$  LLVM  $\rightsquigarrow$  X86/Power)

Validated LLVM opt-phase transformations

# LLVM Reorderings

$a;\ b \rightsquigarrow b;\ a$

| $\downarrow a \setminus b \rightarrow$ | $(St \mid Ld)_{na}$ | $St_{rel}$ | $Ld_{acq}$   | $Ld_{sc}$    | $U_{(acq\_rel \mid sc)}$ |
|----------------------------------------|---------------------|------------|--------------|--------------|--------------------------|
| $(St \mid Ld)_{na}$                    | $\checkmark$        | ×          | $\checkmark$ | $\checkmark$ | ×                        |
| $St_{rel}$                             | $\checkmark$        | ×          | -            | -            | ×                        |
| $St_{sc}$                              | $\checkmark$        | ×          | -            | ×            | ×                        |
| $Ld_{acq}$                             | ×                   | ×          | ×            | ×            | ×                        |
| $U_{(acq\_rel \mid sc)}$               | ×                   | ×          | ×            | ×            | ×                        |

LLVM performs ($\checkmark$) these reorderings, for example:

$$X_{rel} = v;\ Y_{na} = v'; \rightsquigarrow Y_{na} = v';\ X_{rel} = v; \quad \checkmark$$

LLVM restricts ($\times$) these reorderings, for example:

$$Y_{na} = v';\ X_{rel} = v; \rightsquigarrow X_{rel} = v;\ Y_{na} = v'; \quad \times$$

$a;\ b \rightsquigarrow b;\ a$

| $\downarrow a \setminus b \rightarrow$ | $(St \mid Ld)_{na}$ | $St_{rel}$ | $Ld_{acq}$   | $Ld_{sc}$    | $U_{(acq\_rel \mid sc)}$ |
|----------------------------------------|---------------------|------------|--------------|--------------|--------------------------|
| $(St \mid Ld)_{na}$                    | $\checkmark$        | ×          | $\checkmark$ | $\checkmark$ | ×                        |
| $St_{rel}$                             | $\checkmark$        | ×          | $\checkmark$ | $\checkmark$ | ×                        |
| $St_{sc}$                              | $\checkmark$        | ×          | $\checkmark$ | ×            | ×                        |
| $Ld_{acq}$                             | ×                   | ×          | ×            | ×            | ×                        |
| $U_{(acq\_rel \mid sc)}$               | ×                   | ×          | ×            | ×            | ×                        |

$$X_{rel} = v;\ t = Y_{acq}; \rightsquigarrow t = Y_{acq};\ X_{rel} = v; \quad \checkmark$$

The $\checkmark$ entries in the $St_{rel}$ and $St_{sc}$ rows under $Ld_{acq}$ and $Ld_{sc}$ are also proved safe, but LLVM does NOT perform these reorderings.

![](_page_45_Figure_1.jpeg)
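The reordering tables can be encoded as a lookup. The sketch below transcribes the table entries; the short class names (`"na"`, `"St_rel"`, `"U"`, ...) and the function `reordering_status` are illustrative choices, and the lookup says nothing about side conditions such as the two accesses touching different locations.

```python
# Row classes (the earlier access a) and column classes (the later access b),
# as they appear in the tables. "na" stands for (St|Ld)_na, "U" for U_(acq_rel|sc).
ROWS = ["na", "St_rel", "St_sc", "Ld_acq", "U"]
COLS = ["na", "St_rel", "Ld_acq", "Ld_sc", "U"]

# a; b ~> b; a reorderings that LLVM performs.
PERFORMED = {
    ("na", "na"), ("na", "Ld_acq"), ("na", "Ld_sc"),
    ("St_rel", "na"), ("St_sc", "na"),
}

# Reorderings proved safe that LLVM does not perform.
PROVED_NOT_PERFORMED = {
    ("St_rel", "Ld_acq"), ("St_rel", "Ld_sc"), ("St_sc", "Ld_acq"),
}


def reordering_status(a, b):
    """Classify the reordering a; b ~> b; a according to the tables."""
    if a not in ROWS or b not in COLS:
        raise ValueError(f"pair ({a}, {b}) is not covered by the table")
    if (a, b) in PERFORMED:
        return "performed"
    if (a, b) in PROVED_NOT_PERFORMED:
        return "safe, not performed"
    return "restricted"


print(reordering_status("St_rel", "na"))      # X_rel = v; Y_na = v'
print(reordering_status("na", "St_rel"))      # Y_na = v'; X_rel = v
print(reordering_status("St_rel", "Ld_acq"))  # X_rel = v; t = Y_acq
```

Pairs outside the table, such as a leading $Ld_{sc}$, raise an error rather than being guessed at.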

Formalized fragment of LLVM concurrency

# Verified correctness of transformations

- Elimination
- Reordering
- Mappings (C11  $\rightsquigarrow$  LLVM  $\rightsquigarrow$  X86/Power)

Validated LLVM opt-phase transformations

![](_page_47_Figure_1.jpeg)

- LLVM has operations (Ld/St/CAS) and memory orders (na/rel/acq/acq\_rel/SC) similar to C11.
- LLVM model is stronger than C11.

#### C11 to LLVM Mapping Correctness

![](_page_48_Figure_1.jpeg)

- LLVM model is stronger than C11.

# LLVM to Architecture Mapping Correctness

![](_page_49_Figure_1.jpeg)

(LLVM $\rightsquigarrow$ x86/Power) = (C11 $\rightsquigarrow$ x86/Power)

Proved correctness of these mappings

- LLVM to SC
- LLVM to SPower

These ensure correctness of LLVM $\rightsquigarrow$ x86/Power (using results from Lahav & Vafeiadis, FM'16)

#### LLVM to Architecture Mapping Correctness

![](_page_50_Figure_1.jpeg)

# Formalized fragment of LLVM concurrency

Proved correctness of transformations

# Validated LLVM opt-phase transformations

$$P_{src} \xrightarrow{\text{LLVM}} P_{tgt}\ ?\ \text{Correct} : \text{Potential Error}$$

$$\uparrow$$

$$P_{src} \xrightarrow{(R \cup E)^*} P_{tgt}\ ?\ \text{Correct} : \text{Potential Error}$$

- R: Safe reorderings
- E: Safe eliminations

Source program $P_{src}$, with a metadata label on each access ($\checkmark$ = matched in the target, $\times$ = unmatched):

| Match        | Access          | Label |
|--------------|-----------------|-------|
| $\checkmark$ | $s_1 = X$       | !A    |
| $\times$     | $s_2 = X$       | !B    |
| $\checkmark$ | $V = 1$         | !C    |
| $\checkmark$ | $s_4 = Z_{acq}$ | !D    |
| $\times$     | $Y = 1$         | !E    |
| $\checkmark$ | $Y = 2$         | !F    |

Target program $P_{tgt}$:

| Access          | Label |
|-----------------|-------|
| $t_1 = X$       | !A    |
| $t_2 = Z_{acq}$ | !D    |
| $Y = 2$         | !F    |
| $V = 1$         | !C    |

![](_page_61_Figure_1.jpeg)

![](_page_62_Figure_1.jpeg)

- Check that unmatched accesses are deletable
- Check that reorderings are allowed
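The two checks can be sketched on the labelled example. This is a simplified illustration, not the validator from the talk: it handles only straight-line code, treats an unmatched access as deletable only under the adjacent elimination rules, and approves a reordering only if every flipped pair is one that LLVM performs. `Access`, `klass`, `deletable`, and `validate` are hypothetical names.

```python
from collections import namedtuple

# op: "Ld" or "St"; order: "na", "acq", "rel", "sc"; label: metadata tag.
Access = namedtuple("Access", "label op loc order")

# Reorderings a; b ~> b; a that LLVM performs, keyed by access class.
PERFORMED = {("na", "na"), ("na", "Ld_acq"), ("na", "Ld_sc"),
             ("St_rel", "na"), ("St_sc", "na")}


def klass(acc):
    """Access class used as a key into PERFORMED."""
    return "na" if acc.order == "na" else f"{acc.op}_{acc.order}"


def deletable(src, i):
    """Adjacent rules only: read after read/write, overwritten write."""
    acc = src[i]
    if acc.order != "na":
        return False
    if acc.op == "Ld":
        return i > 0 and src[i - 1].loc == acc.loc
    nxt = src[i + 1] if i + 1 < len(src) else None
    return (nxt is not None and nxt.op == "St"
            and nxt.order == "na" and nxt.loc == acc.loc)


def validate(src, tgt):
    pos = {a.label: i for i, a in enumerate(tgt)}
    if not set(pos) <= {a.label for a in src}:
        return False  # target contains an access with no source counterpart
    if any(a.label not in pos and not deletable(src, i)
           for i, a in enumerate(src)):
        return False  # an unmatched access is not deletable
    matched = [a for a in src if a.label in pos]
    for i, a in enumerate(matched):
        for b in matched[i + 1:]:
            flipped = pos[a.label] > pos[b.label]
            if flipped and (a.loc == b.loc
                            or (klass(a), klass(b)) not in PERFORMED):
                return False  # a reordering that is not allowed
    return True


SRC = [Access("A", "Ld", "X", "na"), Access("B", "Ld", "X", "na"),
       Access("C", "St", "V", "na"), Access("D", "Ld", "Z", "acq"),
       Access("E", "St", "Y", "na"), Access("F", "St", "Y", "na")]
by_label = {a.label: a for a in SRC}
TGT = [by_label[k] for k in "ADFC"]

print(validate(SRC, TGT))
```

For the example, `B` and `E` are unmatched and deletable (a repeated read of `X` and an overwritten write to `Y`), and the only flipped pairs move the non-atomic store `V = 1` past the acquire load and past `Y = 2`, so the check succeeds.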

![](_page_63_Figure_1.jpeg)


Formalized fragment of LLVM concurrency

Proved correctness of transformations

Validated LLVM opt-phase transformations

- Generate a test case  $(P_{src})$ .
- Apply LLVM transformations  $(P_{tgt})$ .
- $P_{src} \xrightarrow{\text{LLVM}} P_{tgt}$  ? Correct : Potential Error

# LLVM Formalization [CGO'17]

- Event structure construction rules
- Consistency constraints
- Data race freedom (DRF) theorems
- Proofs: http://plv.mpi-sws.org/llvmcs/

# Translation validation [CGO'16]

- Programs with control flow
- Experimental evaluations
- Artifact: http://plv.mpi-sws.org/validc/

![](_page_66_Figure_1.jpeg)

- Extend the LLVM concurrency model
  - With relaxed accesses and fences
- Verify more optimizations
- Mechanize the formalization
- Improve the validator
  - Integrate with sequential transformations
  - Handle loops, pointers, etc.

# Thank You!