CONSTRAINT BASED ANALYSIS FOR PERSISTENT MEMORY PROGRAMS

by
Zunchen Huang

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)

December 2023

Copyright 2023 Zunchen Huang

Dedication

To my family.

Acknowledgements

My profound thanks go to my advisor, Dr. Chao Wang. I was fortunate to join his lab and to receive his financial support. His sharp vision and insight into research problems inspired me throughout the journey. Under his guidance, I had the opportunity to explore research directions extensively. I appreciate his patience and the countless valuable lessons from our research discussions.

I thank my qualifying exam committee and thesis proposal committee members, Dr. William G.J. Halfond, Dr. Jyotirmoy Vinay Deshmukh, Dr. Mukund Raghothaman, Dr. Srivatsan Ravi, Dr. Xuehai Qian, and Dr. Pierluigi Nuzzo, whose valuable criticism and constructive advice helped me through the most challenging part of the transition towards my dissertation topic. My dissertation committee members, Dr. Srivatsan Ravi and Dr. Pierluigi Nuzzo, proofread drafts of my dissertation and provided feedback. A special thanks to Dr. Srivatsan Ravi for many research discussions and to Dr. Pierluigi Nuzzo for many helpful suggestions.

I thank my collaborators, lab mates, friends, and those who helped me. I am deeply grateful for my parents' support; my dear wife always encourages me. They have continuously put faith in me and supported me in pursuing my goals.

Table of Contents

Dedication
Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
  1.1 Motivation
    1.1.1 Repairing PM bugs with syntax-guided analysis is not enough
    1.1.2 Writing PM program requirements is labor intensive
    1.1.3 Handling concurrency in PM programs is difficult
  1.2 Insights and Hypothesis
    1.2.1 SMT solver based symbolic analysis can repair PM bugs efficiently
    1.2.2 Static and dynamic program analysis can automatically infer PM requirements
    1.2.3 Method used in sequential PM programs property inference can be useful for concurrent PM programs
    1.2.4 The Hypothesis
  1.3 Contributions
    1.3.1 Constraint based bug detection for persistent memory programs
    1.3.2 Constraint based program repair for persistent memory bugs
    1.3.3 Automatic property inferences for persistent memory programs
    1.3.4 Automatic property inferences for concurrent persistent memory programs
  1.4 Outline
Chapter 2: Background
  2.1 Persistent Memory (PM) Semantics
    2.1.1 The Persistency Table
    2.1.2 PM-related Instructions
  2.2 Persistent Memory (PM) Bugs
    2.2.1 Durability Bugs
    2.2.2 Crash Consistency Bugs
  2.3 Related Work
    2.3.1 Persistent Memory (PM) Bug Detection
    2.3.2 Persistent Memory (PM) Bug Repair
    2.3.3 Persistent Memory (PM) Property Inference
    2.3.4 Constraint Solving for Bug Diagnoses and Repair
Chapter 3: Symbolic Analysis of Persistent Memory Bugs
  3.1 PM Bug Detection and Repair
    3.1.1 Detecting PM Bugs
    3.1.2 Repairing PM Bugs
  3.2 Overview of Our Method
  3.3 Symbolic Analysis of the PM Bug
    3.3.1 The Satisfiability Problem
    3.3.2 Using Φ_program to Encode Execution Order
      3.3.2.1 Subformula Φ_pc
      3.3.2.2 Subformula Φ_so
      3.3.2.3 Subformula Φ_fs
      3.3.2.4 Subformula Φ_fo
      3.3.2.5 Subformula Φ_mo
    3.3.3 Using Φ_persistency to Encode Persistency Order
      3.3.3.1 Subformula Φ_pti
      3.3.3.2 Subformula Φ_pts
      3.3.3.3 Subformula Φ_fi
    3.3.4 Using Φ_assertion to Encode the Assertion
    3.3.5 An Example for Our Encoding Method
Chapter 4: Automated Repair of Persistent Memory Bugs
  4.1 Computing the Repair
    4.1.1 Subroutine ComputeRepair(T, A)
    4.1.2 Subroutine RepairIsValid(T, R)
    4.1.3 An Example for Our Repair Method
  4.2 Correctness and Optimizations
    4.2.1 Relating to SyGuS
    4.2.2 Adding New Instructions to T
    4.2.3 Relaxing the Subformula Φ_so
  4.3 Experiments
    4.3.1 Benchmarks
    4.3.2 Experimental Set-up
    4.3.3 Results for Answering RQ 1
    4.3.4 Results for Answering RQ 2
    4.3.5 Results for Answering RQ 3
    4.3.6 Discussion of the Limitations
  4.4 Conclusions
Chapter 5: Inferring Persistent Memory Related Properties
  5.1 Three Types of PM Properties
    5.1.1 Durability
    5.1.2 Must-Persist-Before (MPB)
    5.1.3 Must-Persist-Atomically (MPA)
  5.2 Representations of PM Properties
    5.2.1 Durability
    5.2.2 Must-Persist-Before (MPB)
    5.2.3 Must-Persist-Atomically (MPA)
    5.2.4 Relating DURA to MPB
    5.2.5 Relating MPB to MPA
      5.2.5.1 The Graph
      5.2.5.2 From SCCs to MPAs
  5.3 Inferring Properties: The Theory
    5.3.1 The Life Cycle of a PM-based Program
    5.3.2 Inferring Properties from LOADs
    5.3.3 Properties to Be Enforced by STOREs
    5.3.4 Condition for Inferring PM Property
  5.4 Implementation for Inferring Properties
    5.4.1 Implementation Strategies
    5.4.2 Dynamic Analysis Augmented with Static Analysis
  5.5 The Overall Algorithm
    5.5.1 Subroutine for Generating Traces
      5.5.1.1 Definitions
      5.5.1.2 Static program analysis on control-dependencies
      5.5.1.3 Dependency Instrumentation
      5.5.1.4 Test case example
    5.5.2 Subroutine for Inferring Properties
      5.5.2.1 Subroutine for Inferring DURA Properties
      5.5.2.2 Subroutine for Inferring MPB Properties
      5.5.2.3 Subroutine for Inferring MPA Properties
    5.5.3 Subroutine for Checking Properties
      5.5.3.1 Computing the Persistent Interval
      5.5.3.2 Detecting DURA Violations
      5.5.3.3 Detecting MPB Violations
      5.5.3.4 Detecting MPA Violations
  5.6 Experiments
    5.6.1 Benchmarks
      5.6.1.1 Trace statistics
    5.6.2 Experimental Set-up
    5.6.3 Results for Answering RQ 1
    5.6.4 Results for Answering RQ 2
    5.6.5 Discussion of the Limitations
Chapter 6: Inferring Properties for Concurrent Persistent Memory Programs
  6.1 The Three Dimensions
  6.2 Adding Concurrency to the Symbolic Analysis
  6.3 Inter-thread Control Dependency
      6.3.0.1 Computing Inter-Thread Control Dependency
  6.4 Experiments
    6.4.1 Benchmarks
      6.4.1.1 Trace statistics
    6.4.2 Experimental Set-up
    6.4.3 Results for Answering RQ 1
    6.4.4 Results for Answering RQ 2
    6.4.5 Case Studies
      6.4.5.1 A case study in lock based concurrent linked list in PM
      6.4.5.2 A case study in lock free concurrent linked list in PM
      6.4.5.3 Observations
Chapter 7: Conclusion and Future Work
  7.1 Conclusion
  7.2 Future work
Bibliography

List of Tables

2.1 Persistency order of Px86 for instructions (I_i [...]

[...] counter may never show up in PM) and a crash consistency bug in THEN-branch (write to records[i].name may not persist before write to records[i].valid).
3.3 Ordering constraints for THEN-branch: modifying the program by swapping the two clflushopt instructions will not fix the crash consistency bug.
3.4 Two repairs for the bug in THEN-branch of Fig. 3.2: The first repair is incomplete since it adds a new durability bug for I_2; the second repair is complete because it removes the new durability and original crash consistency bugs.
3.5 Our symbolic encoding of the subformulas in Φ_program := Φ_pc ∧ Φ_so ∧ Φ_fs ∧ Φ_fo ∧ Φ_mo, Φ_persistency := Φ_pti ∧ Φ_pts ∧ Φ_fi and Φ_assertion := Φ_du ∧ Φ_cc.
3.6 Encoding for the THEN-branch of Fig. 3.4 with both durability and crash consistency assertions.
4.1 Formulas act as "filters" of the trace permutations, including buggy (red) and non-buggy (black) permutations.
4.2 Illustrating the repair computation and validation.
5.1 The life cycle of a PM-based program before and after crash
5.2 A combination of static and dynamic analysis for PM property inference
5.3 A test case example for singly linked list
5.4 A test case example for doubly linked list
5.5 A test case example for ring buffer
5.6 MPBs inferred for cc_list_add_last function in doubly linked list
5.7 MPBs inferred for unlinkn function in doubly linked list
5.8 MPBs inferred cc_deque_remove_last function in deque
5.9 MPBs inferred cc_rbuf_enqueue function in ring_buffer
5.10 MPBs inferred cc_array_subarray function in array
5.11 MPAs inferred for cc_list_add_last function in doubly linked list
5.12 MPAs inferred for swap_adjacent function in doubly linked list
5.13 MPAs inferred for link_behind function in doubly linked list
5.14 MPAs inferred cc_deque_add_first function in deque
6.1 The three dimensions.
6.2 Constraint-based symbolic analysis.
6.3 An example that illustrates the inter-thread control dependency.
6.4 new_node and list_new in list-lock.c
6.5 new_node and list_new in list-lockfree.c
Abstract

Emerging persistent memory (PM) technologies are beginning to bridge the gap between volatile memory and non-volatile storage in computer systems. PM has three main advantages. First, it allows high-speed memory access and performs better than solid-state drives and hard disks at a relatively low cost. Second, it is byte-addressable and can access data in place. Third, it achieves data persistency and can hold data across power failures. Recently, PM software and hardware support has become available in the industry. However, PM programming remains a challenging and error-prone task because it relies on ordinary developers having a deep understanding of PM at both the system and software levels in order to write correct and efficient PM code. Current studies in PM have attempted to address these challenges by proposing methods to detect PM bugs automatically using static and dynamic program analysis and to repair them using heuristics.

In this dissertation, I propose a framework to detect and repair PM bugs automatically using a set of new symbolic analysis techniques. Unlike existing techniques that rely on patterns and heuristics to detect and repair a small subset of PM bugs, the proposed techniques can handle a wide range of PM bugs. This is achieved by first encoding the program semantics, correctness properties, and PM requirements as a set of logical constraints and then solving these constraints using off-the-shelf SMT solvers. By reasoning about these logical constraints symbolically, the proposed techniques can detect, diagnose, and repair PM bugs efficiently. Furthermore, I propose a new method to automatically infer PM requirements using a combination of static and dynamic analysis techniques. Finally, I demonstrate the feasibility of applying the proposed techniques to programs that rely on both PM and multi-threading by simultaneously reasoning about persistency and concurrency.
For evaluation, we conduct extensive experiments on real-world benchmarks from industry-level PM software, and the experimental results show that our methods achieve state-of-the-art performance and scalability in detecting and repairing PM bugs and in inferring quality PM requirements.

Chapter 1: Introduction

Persistent memory (PM) plays a major role in distributed systems and cloud computing because it retains data after a power cutoff, which inherently strengthens the guarantee of data safety and correctness. The recent commercial Intel Optane PM [56] paves the way toward wider use in diverse systems. Meanwhile, the emerging Compute Express Link (CXL) [23] provides support for memory devices, and attaching persistent memory is a natural choice for programmers who intend to leverage both PM and CXL simultaneously. PM is beginning to bridge the gap between expensive, volatile DRAM and slow, non-volatile storage such as solid-state disks because of its relatively low cost and high speed for data access. Users of PM can benefit economically and from reduced latency without sacrificing the safety of the system.

However, although PM is claimed to retain data after electrical power is lost, it is non-trivial to verify the correctness of the data at the software level under varied situations. Intel's PMDK [57] library provides many APIs that can be called directly in a software program to ensure data persistency. Even so, it is not uncommon to see durability bugs [66, 67, 65, 63, 64] (failures to correctly make variables available in PM) and crash consistency bugs [60, 58, 59, 62, 61] (failures to correctly enforce the order in which variables become available in PM) in PM programs. Even seasoned system programmers face the challenge of writing correct and efficient PM programs. One major reason for the occurrence of those bugs is the difficulty of inferring the user's intent regarding how data should be stored in the PM.
Prior works on detecting PM related bugs fall into two categories: the first leverages heuristics and generates bug patterns to match potential violations of the PM specifications; the second uses symbolic methods such as symbolic execution and model checking to expose PM bugs by searching the unexplored program execution paths. Unlike existing methods, in this thesis, I aim to propose a new framework that analyzes PM programs with constraint based symbolic methods, using an off-the-shelf SMT solver to detect and repair PM bugs automatically.

1.1 Motivation

In this section, I discuss three main motivations for proposing a new constraint based symbolic method to conduct program analysis on PM.

1.1.1 Repairing PM bugs with syntax-guided analysis is not enough

Many previous studies on PM bug detection rely on syntax-level program analysis, and they have proven effective at detecting durability and crash consistency bugs. When it comes to program repair for PM bugs, syntax-guided analysis can also be effective for fixing durability bugs by adding missing flush and fence instructions using heuristics. However, relying only on syntax-guided analysis is insufficient for repairing crash consistency bugs. Crash consistency bugs occur when a pair of PM related variables appears in PM hardware, e.g., Intel Optane memory [56], in an order that violates the user's intent. Thus, when a crash or power failure happens, the variables read from the PM device can be stale, which means users can read inconsistent values. The consequences are disastrous when sensitive information stored in variables is dependent on other variables. For example, consider two variables flag and data stored in PM and used in a conditional statement, i.e., if (flag) { print(data); }. Suppose a power failure occurs while the values of flag and data are being updated.
If the two variables are not properly flushed and fenced, there can be a situation in which flag is updated to true in PM while data still holds its old value. The print statement in the then-branch is then executed, and users read the wrong value of data.

To repair crash consistency bugs, further semantics-level program analysis is needed because repair strategies involve reordering instructions in the program rather than just adding missing instructions. Finding a correct repair by reordering instructions means searching a large space. If not handled symbolically, explicitly enumerating possible solutions and checking their feasibility can lead to factorial time complexity. For example, even with ten instructions in the program, the total number of possible orderings in the solution space is enormous (10! = 3,628,800), which means it is impractical for developers to enumerate the possible repairs manually. This motivates us to design a new SMT solver based symbolic analysis to analyze PM programs and to detect and repair PM bugs. Modern SMT solvers such as Z3 [102], CVC5 [4], MathSAT [21] and Yices2 [29] are broadly used in program analysis [49, 75], software and hardware verification [93, 7, 34], and software testing [13, 40] thanks to their scalability and efficiency in handling a large number of constraints with well-designed algorithms and heuristics [103] when checking the satisfiability of formulas.

1.1.2 Writing PM program requirements is labor intensive

PM bug detection requires user-provided PM requirements, e.g., which variables need to be made durable, which pairs of variables should be made available in the PM device in a specific order, and which groups of variables need to be put in a transaction to preserve atomicity. It is extremely challenging even for an experienced system programmer to write correct software free of crash consistency bugs.
For example, in libpmemobj [69], one of the libraries in PMDK [57], there is an issue [62] in array usage that causes crash consistency bugs because the crash consistency requirements are hard to infer manually. This issue is so severe that the author abandoned this example usage in the library. Previous studies rely on users to write those PM requirements, which is extremely labor intensive because it requires the user to have a deep understanding of both PM and the intent of the target program. An existing work [38] infers likely correctness conditions for PM requirements. However, it relies on hand-designed heuristic rules for inference, so its inferred properties may be inaccurate and contain false positive PM requirements, i.e., a violation of an inferred likely correctness condition can be bogus. This motivates us to design a new method that uses static and dynamic analysis to infer PM properties automatically.

1.1.3 Handling concurrency in PM programs is difficult

Many persistent programs utilize concurrency to boost their performance [57, 108, 95]. Persistency and concurrency share some similarities, such as nondeterministic program behavior. In persistency, data can be nondeterministically flushed into PM. In concurrency, the user cannot control when multiple threads interleave with each other. Nevertheless, persistency and concurrency are different and should be regarded as two separate dimensions in a modern CPU. PM is about consistency between two execution instances of a program (before and after a power failure), while multi-threading is about consistency inside a single execution instance (among concurrently running threads). Analyzing persistency and concurrency together is challenging. This motivates us to test our method on concurrent PM programs and show the feasibility of handling concurrent algorithms.

1.2 Insights and Hypothesis

In this section, I introduce three insights followed by a hypothesis.
1.2.1 SMT solver based symbolic analysis can repair PM bugs efficiently

Symbolic methods that leverage an SMT solver have the advantage of efficiently finding a solution in a large solution space. Suppose we design a new SMT encoding that models PM program behaviors on program execution traces, such as PM semantics (how store, fence, flush, and similar instructions behave), persistency time intervals (when a variable is available in the PM device), and PM properties (durability and crash consistency requirements for certain variables), in a unified SMT formula. The SMT solver can then automatically and efficiently search for assignments that violate the properties in the formula by taking advantage of its internal optimization procedures. Any solution the solver finds is a violation of a PM property, which naturally means a PM bug has been detected. In addition to PM bug detection, the encoding has another advantage: it can find a correct repair for a given PM bug by adding missing instructions, and it can even find repair strategies that reorder instructions while respecting correct PM semantics. As discussed in Section 1.1.1, explicitly enumerating those repairs is hopeless, even for a small program. After finding violations in a formula, we can block the corresponding feasible executions of the target program by iteratively adding predicates to the formula until all feasible violations are blocked, at which point a suggested repair is provided to the user.

1.2.2 Static and dynamic program analysis can automatically infer PM requirements

Inferring durability requirements for a PM program is easy, as long as we know which variables reside in PM and where they are modified. The durability requirement then reduces to checking whether a variable is properly flushed and fenced after its content is modified in PM.
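The durability check described above can be sketched as a scan over an execution trace. The following is a minimal illustration under simplifying assumptions, not the dissertation's implementation: the `Event` record, the `EV_*` event kinds, and the function `is_durable` are hypothetical names introduced here, and the model ignores cache-line granularity and the differences between individual flush instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical trace event kinds; names are illustrative only. */
typedef enum { EV_STORE, EV_FLUSH, EV_FENCE } EventKind;

typedef struct {
    EventKind kind;
    int addr; /* PM location touched by a STORE or FLUSH; ignored for FENCE */
} Event;

/* Persistence state of one PM location while scanning the trace. */
typedef enum { CLEAN, DIRTY, FLUSHED } PersistState;

/*
 * Returns true if, at the end of the trace, the last store to `addr`
 * has been followed by a flush of `addr` and then by a fence.
 * Assumed simplification: one flush plus one later fence persists the store.
 */
bool is_durable(const Event *trace, size_t n, int addr) {
    assert(trace != NULL || n == 0);
    PersistState state = CLEAN;
    for (size_t i = 0; i < n; i++) {
        const Event *e = &trace[i];
        if (e->kind == EV_STORE && e->addr == addr) {
            state = DIRTY;            /* new content not yet flushed */
        } else if (e->kind == EV_FLUSH && e->addr == addr && state == DIRTY) {
            state = FLUSHED;          /* flushed, still waiting for a fence */
        } else if (e->kind == EV_FENCE && state == FLUSHED) {
            state = CLEAN;            /* flush ordered by the fence */
        }
    }
    return state == CLEAN;
}
```

Under these assumptions, a trace of store, flush, fence on the same location passes the check, while a trace that omits the fence or flushes a different location is reported as a durability violation.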
However, for crash consistency requirements, it is hard for the user to provide an accurate set of requirements specifying which variables should be available in PM before others. We observe from PM programs that control dependencies between loads are the most obvious and solid hints for inferring crash consistency properties in PM. If two variables from PM are read by two loads and one load is control-dependent on the other, then before those two loads, their contents must have been made available in PM in a certain order. The controlled variable should be made available in PM before the guard variable, i.e., the contents of the two corresponding stores must appear in PM in order. By analyzing dependency information statically and on dynamically generated traces, we can infer the corresponding PM properties for crash consistency requirements.

    int v1, v2; // global variables v1 and v2

Figure 1.1: Global Variables

    void writer1(int temp1, int temp2) {
        v1 = temp1;
        v2 = temp2;
    }

Figure 1.2: Writer function 1

We show small examples to demonstrate our insight on inferring PM related properties. In Figure 1.1, we define two global integer variables v1 and v2. These global variables are stored in the PM. In Figure 1.2, writer1 takes in two temporary variables and stores their values to v1 and v2. In Figures 1.3 and 1.4, reader1 prints the value of v2 if v1 is greater than 0, and reader2 prints the value of v1 if v2 is less than 0. reader1 shows that loading the data from v2 is control dependent on loading the data from v1. These two load instructions on v1 and v2 indicate that the previous stores to v1 and v2 in PM should have an order constraint. Suppose v1 is made available in the PM first, a crash occurs before v2 is persisted, and reader1 is then called. In line 3 of reader1 (the printf call), the value of v2 that is printed can be stale. Thus, it is natural to have a must-persist-before relation between v1 and v2, which indicates that v2 must be made available before v1 according to reader1.
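The inference rule used in this example can be sketched as follows. This is a minimal illustration under stated assumptions rather than the dissertation's algorithm: the `CtrlDep` record, the `MpbRelation` matrix, and the functions `infer_mpb` and `needs_atomic_persist` are hypothetical names, the control dependencies are assumed to have been extracted already, and the pairwise mutual check is a simplification of a general cycle analysis.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VARS 8

/* One observed control dependency between loads of two PM variables:
 * the load of `controlled` executes only under a branch on `guard`. */
typedef struct {
    int guard;
    int controlled;
} CtrlDep;

/* mpb[a][b] == true means "a must persist before b". */
typedef struct {
    bool mpb[MAX_VARS][MAX_VARS];
} MpbRelation;

/* Rule from the text: the controlled variable must persist before the guard. */
void infer_mpb(const CtrlDep *deps, size_t n, MpbRelation *out) {
    assert(out != NULL);
    for (int a = 0; a < MAX_VARS; a++)
        for (int b = 0; b < MAX_VARS; b++)
            out->mpb[a][b] = false;
    for (size_t i = 0; i < n; i++) {
        int g = deps[i].guard, c = deps[i].controlled;
        assert(g >= 0 && g < MAX_VARS && c >= 0 && c < MAX_VARS);
        if (g == c) continue; /* a variable guarding itself adds no ordering */
        out->mpb[c][g] = true;
    }
}

/* Two variables related in both directions cannot be ordered either way,
 * so they are flagged as candidates for being persisted atomically. */
bool needs_atomic_persist(const MpbRelation *r, int a, int b) {
    return r->mpb[a][b] && r->mpb[b][a];
}
```

With v1 and v2 numbered 0 and 1, feeding only the dependency from reader1 (guard v1, controlled v2) yields the single relation "v2 must persist before v1"; adding the dependency from reader2 makes the pair mutually ordered.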
Note that the must-persist-before relation is a relation between store instructions. From reader2 in Figure 1.4, we can infer that v1 must persist before v2. We notice that there is a cycle between the must-persist-before relations of v1 and v2. This inferred cycle results in an atomic persist relation between v1 and v2 because of their mutual dependency when reading data.

    void reader1() {
        if (v1 > 0) {
            printf("%d", v2);
        }
    }

Figure 1.3: Reader function 1

    void reader2() {
        if (v2 < 0) {
            printf("%d", v1);
        }
    }

Figure 1.4: Reader function 2

1.2.3 Method used in sequential PM programs property inference can be useful for concurrent PM programs

Although concurrency is challenging to handle in PM programs, we notice that in multi-threaded PM programs, we can still leverage the insight in Section 1.2.2. The reason is that, on the dynamically generated traces, in addition to the original log information such as store and load instructions, we can utilize concurrency information such as the thread ID number. The process of PM property inference remains similar: we only need to handle the case where those instructions come from different threads and generate the must-persist relations accordingly.

1.2.4 The Hypothesis

Based on the aforementioned insights, I propose the hypothesis for this thesis as follows:

A unified framework that leverages SMT solver based symbolic analysis techniques can help automate the process of detecting and repairing PM bugs

    void writer(void) {
        v1 = temp1; // temp1 and temp2 are not declared in this listing
        v2 = temp2;
    }

Figure 1.5: Writer function 2

To validate the hypothesis, I first designed a new SMT solver based symbolic analysis with an encoding for PM bug detection. Then, I proposed a new method that iteratively adds or reorders instructions in the SMT formula to block feasible violations of PM properties. To aid my PM bug detection and repair, I designed a new method that combines static and dynamic analysis to infer PM properties.
Finally, I demonstrate that my inference procedure can be extended to concurrent PM programs. My experimental results showed equivalent PM bug detection ability compared to state-of-the-art tools and better results for PM bug repair, especially for crash consistency bugs. My inference for PM properties showed better results than state-of-the-art tools and generated quality properties for both sequential and concurrent PM programs. These results confirm the hypothesis of this dissertation.

1.3 Contributions

1.3.1 Constraint based bug detection for persistent memory programs

I proposed a new SMT solver based encoding for PM bug detection. My new encoding can model PM semantics accurately for target PM programs. I convert the PM requirements of a target program into assertions and, together with the program behavior constraints, generate an SMT formula for PM bug detection. I tested the new symbolic analysis for PM bug detection, and the results showed a detection ability equivalent to that of state-of-the-art tools.

1.3.2 Constraint based program repair for persistent memory bugs

I propose the first constraint based method for repairing a broader range of PM bugs. Compared to existing methods, my method can repair PM bugs that do not match any known bug pattern. I formalize PM bug repair as a special case of the syntax-guided synthesis (SyGuS) [3] problem, and the soundness and decidability of my method are discussed. I implement a new tool and evaluate the method on many benchmark programs from real-world industry-level software to demonstrate its advantages over the state-of-the-art.

1.3.3 Automatic property inferences for persistent memory programs

I designed a new method combining static and dynamic analysis techniques to automatically infer PM properties such as must-persist-before and must-persist-atomically relations. Those inferred PM properties can be used for PM bug detection and repair. I tested my method on data structure programs and large PM programs from real-world benchmarks.
The results showed that the inferred PM properties are better than those of the state-of-the-art, with fewer false positives.

1.3.4 Automatic property inferences for concurrent persistent memory programs

By extending my method from Section 1.3.3 and adapting it to multi-threaded PM programs, my inference method is effective in generating useful and quality PM properties on concurrent PM programs. I tested my method on data structure programs and demonstrated its ability. The results showed better performance compared to the state-of-the-art. Our inferred PM properties can also give programmers useful hints for writing better PM programs.

1.4 Outline

The remainder of this dissertation is organized as follows.

Chapter 2 introduces the necessary background for PM semantics and categories of related PM bugs. It also discusses related work for PM bug detection, repair, invariant inference, and constraint solving for bug diagnoses and repair.

Chapter 3 presents our studies in designing a new SMT solver based symbolic analysis for PM bug detection and showcases our new encoding through examples.

Chapter 4 presents our studies in repairing PM bugs with our SMT solver based symbolic analysis and evaluations on repairing PM bugs with benchmark programs.

Chapter 5 presents our studies in automatically inferring PM requirements and constraints with a combination of static and dynamic program analysis.

Chapter 6 presents our studies in automatically inferring PM requirements and constraints with a combination of static and dynamic program analysis for concurrent programs.

Chapter 7 concludes this thesis with a summary and future work.

Chapter 2

Background

In this chapter, we review relevant background knowledge related to this dissertation and discuss important related works.

2.1 Persistent Memory (PM) Semantics

There is a line of research works on defining the PM model and semantics precisely [20, 111, 112].
One of the industry pioneers on PM, SNIA, proposed a programming model to define recommended behavior for software with PM support [119, 116, 120]. In this thesis, we focus on Intel's persistent x86 (Px86) model as published by Raad et al. [111].

Fig. 2.1 illustrates, conceptually, the difference between the x86 architecture for volatile memory and the Px86 architecture for persistent memory. In the standard x86 architecture, STORE instructions executed by the CPU are sequentialized in a store buffer before taking effect in memory, while LOAD instructions are allowed to take effect immediately. This allows a fast LOAD to take effect before a slow STORE, provided that they have no control/data dependency, while preserving the semantic equivalence of the program.

In the Px86 architecture, a persistent buffer is added after the store buffer to further sequentialize the STORE instructions, before the written values show up in persistent media. While the CPU still preserves the sequential program behavior during normal (crash-free) execution, when a program crashes due to power failure, the order in which the written values show up in persistent media may be significantly different. This may lead to PM bugs.

[Figure 2.1 diagram labels: CPU, Store Buffer, Volatile Memory, Load Bypassing (left); CPU, Store Buffer, Persistent Buffer, Persistent Memory, Load Bypassing (right)]

Figure 2.1: Difference between the x86 model for volatile memory (left) and the Px86 model for persistent memory (right) at the conceptual level.

2.1.1 The Persistency Table

Table 2.1, which is taken from Raad et al. [111], characterizes an important aspect of Px86 that is relevant to our work: the order in which instructions take effect in persistent memory. Given a pair of instructions, $(I_i, I_j)$, where $I_i$ is executed before $I_j$ by the CPU, the corresponding table entry shows whether Px86 guarantees that $I_i$ persists before $I_j$ using the symbols ✔ (yes) and ✘ (no).
The third symbol, CL, means that $I_i$ persists before $I_j$ only when the two instructions access memory address blocks mapped to the same cache line.

For example, (STORE x, LOAD y) may persist in reverse order according to the ✘ symbol in Table 2.1 when the CPU chooses to execute the fast LOAD y before the slow STORE x for performance reasons. However, according to the table, (LOAD y, STORE x) must persist in the same order as they appear in the program, due to a possible control/data dependency. That is, since these two instructions may come from either the code snippet if(y>0) {x=1;} (with control dependency) or the code snippet {reg=y; x=1;} (without dependency), to be safe, the CPU would have to disallow the reordering based optimization.

Table 2.1: Persistency order of Px86 for instructions ($I_i$ [...]

[...] header->counter may never show up in persistent media. This is because the program does not force the CPU to flush the corresponding cache line and, as a result, the written value (temporarily stored in the volatile part of the CPU) may be lost permanently if a power failure occurs while writer() is executed. After crash recovery, reader() may not have access to the values written by writer(), for example, due to the incorrect value of header->counter.

To make STORE instructions durable, _mm_clflushopt() and _mm_sfence() must be used to force the CPU to flush the cache line; these API calls correspond to CLFLUSHOPT and SFENCE. This is how the values written to the name and addr fields of records[i] are made durable in Fig. 2.2 (Lines 12-14 and 20 for the THEN-branch, and Lines 18 and 20 for the ELSE-branch). Note that neither instruction in the CLFLUSHOPT+SFENCE combination may be omitted; otherwise, durability is not guaranteed.

2.2.2 Crash Consistency Bugs

When a program crashes due to power failure, it is possible that some (but not all) of the written values have been stored in persistent media.
To prevent the persistent media from entering an inconsistent state, the program must use CLFLUSHOPT+SFENCE correctly, to force the STORE instructions to take effect in a certain order. The persistency order, in general, is determined by the reader() executed during crash recovery.

The reader() in Fig. 2.2 uses header->counter to decide whether to read records[i], and then uses the value of records[i].valid to decide whether to read records[i].name and records[i].addr. Thus, the correct persistency order, which must be enforced by writer(), is that both records[i].name and records[i].addr persist before records[i].valid, and records[i].valid persists before header->counter. However, between records[i].name and records[i].addr, there is no persists-before relation, meaning that the CPU has the freedom to persist them in arbitrary order. In existing bug detection tools, as well as in our work, the must-persist-before relation is specified by the user and then checked by the tools.

Unfortunately, none of the aforementioned must-persist-before relations is enforced in Fig. 2.2. For header->counter, the written value is not made durable using a pair of CLFLUSHOPT+SFENCE. As for records[i], the reader() may read value 1 for records[i].valid from persistent media, and then expect records[i].name and records[i].addr to be available in persistent media, but end up with uninitialized or partially initialized values.

2.3 Related Work

2.3.1 Persistent Memory (PM) Bug Detection

Our work is complementary to existing PM bug detectors [45, 114, 46, 18, 38, 39, 86, 41]. This includes, for example, PMemCheck [55] and Persistence Inspector [54], which are trace based PM bug detection tools from Intel, Pmreorder [68], which is an extension of the Intel tools for explicitly generating trace permutations, and Yat [80], which is a hypervisor based framework for testing persistency bugs on a POSIX-compliant file system (PMFS [113]).
PMRace [18] and Yashme [46] are tools designed to detect concurrency related bugs in PM systems, such as data races. PMTest [88] is a tool that leverages user specified checking rules to compute the persistency time interval of STORE instructions, to decide if there are persistency violations. XFDetector [87] is a tool that automatically injects failures into the program and then replays the execution traces before and after failure, to detect cross-failure bugs. PMDebugger [28] is also a tool that leverages user-specified constraints to detect PM bugs. In addition, there are techniques for verifying the absence of PM bugs [111, 74, 44, 43].

2.3.2 Persistent Memory (PM) Bug Repair

Hippocrates [104] is the only existing PM bug repair tool, but it is limited to repairing durability bugs. Our method, in contrast, can also repair crash consistency bugs. It is worth mentioning that Khoshnood et al. [73] proposed a constraint based method to diagnose concurrency bugs and repair them automatically, and their method has the potential to be adapted to repair bugs in concurrent PM programs.

2.3.3 Persistent Memory (PM) Property Inference

PMTest [88] does not directly help developers generate must-persist-before relations; however, it does provide a flexible interface that takes in a wide range of properties and can check them accordingly. Witcher [37, 38] is the only tool that can directly generate likely-correctness conditions, which are similar to our MPB relations. But their tool relies on heuristic rules, which raises the rate of false positives for PM property generation.

Other studies on likely-correctness conditions, such as Daikon [31, 30] and its related work [32, 110, 6, 5], apply dynamic analysis to traces to help generate invariants for various software systems, including concurrent programs. However, their major disadvantage is that there is no guarantee that an inferred invariant actually holds. To remedy this, Nimmer et al.
[107] proposed a static verification technique to verify the correctness of those inferred likely-correctness conditions. Another line of research, on inferring likely-correctness conditions for multi-threaded programs [77, 11], also relates to this thesis. Their methods focus solely on inferring likely invariants on concurrency related properties without consideration of persistency.

2.3.4 Constraint Solving for Bug Diagnoses and Repair

Our method is also related to techniques for repairing other software bugs. This includes BugAssist [71, 72], which repairs assertion failures in a sequential program, and ConcBugAssist [73], which repairs failures in a multi-threaded program. Other similar repair techniques include SemiFix [106], DirectFix [94], and the method proposed by Malik et al. [92] for repairing data structures. There are also techniques for synthesizing fences and synchronization primitives for concurrent programs [1, 96, 16]. However, none of these existing techniques can repair PM bugs.

Our SMT solver based symbolic analysis is related to techniques used by existing tools for trace-based analysis to detect concurrency bugs such as data races and atomicity violations [124, 125, 50], as well as symbolic analysis techniques for detecting information leaks through side channels [52, 47]. However, these techniques were designed exclusively for programs that use volatile memory, and thus cannot be used to detect or repair PM bugs. Recent works on detecting side-channel leakages in PM systems [85, 127] raise security concerns about information leakage through PM.

     1 // both 'header' and 'records' are mapped to the persistent memory
     2 struct record_t { char name[64],addr[64]; char valid; } records[32];
     3 struct header_t { uint32_t counter; uint8_t reserved[63]; } header;
     4 ...
     5 // writer() -- code executed before crash
     6 for (int i=0; i<NUM_RECORDS; i++) {
     7   header->counter++;
     8   if (rand()%2==0) { //store a valid record
     9     snprintf( records[i].name, 64, ... );
    10     snprintf( records[i].addr, 64, ... );
    11     records[i].valid = 1;
    12     _mm_clflushopt( &records[i].valid );
    13     _mm_clflushopt( records[i].name );
    14     _mm_clflushopt( records[i].addr );
    15   }
    16   else {
    17     records[i].valid = 0;
    18     _mm_clflushopt( &records[i].valid );
    19   }
    20   _mm_sfence();
    21 }
    22 ...
    23 // reader() -- code executed after crash
    24 for (int i=0; i<header->counter; i++) {
    25   if (records[i].valid==1) {
    26     cout << "name =" << records[i].name << "\n";
    27     cout << "addr =" << records[i].addr << "\n";
    28   }
    29 }

Figure 2.2: An example program with several PM bugs.

     1 for (int i=0; i<NUM_RECORDS; i++) {
     2   if (rand()%2==0) { //store a valid record
     3     snprintf( records[i].name, 64, ... );
     4     snprintf( records[i].addr, 64, ... );
     5     _mm_clflushopt( records[i].name );
     6     _mm_clflushopt( records[i].addr );
     7     _mm_sfence();
     8     records[i].valid = 1;
     9     _mm_clflushopt( &records[i].valid );
    10   }
    11   else {
    12     records[i].valid = 0;
    13     _mm_clflushopt( &records[i].valid );
    14   }
    15   _mm_sfence();
    16   header->counter++;
    17   _mm_clflushopt( &header->counter );
    18   _mm_sfence();
    19 }

Figure 2.3: The repaired writer() in the example program.

Chapter 3

Symbolic Analysis of Persistent Memory Bugs

Persistent memory (PM) is a type of non-volatile random-access memory with the capability of retaining data after the loss of electrical power. It has become commercially viable in the past few years. In modern computer architecture, PM may serve as the intermediate layer between volatile DRAM and non-volatile storage such as solid-state disks, or replace part of the DRAM-based main memory. This will lead to a drastic reduction in latency and power consumption of the computing systems and an increase in robustness against frequent and unpredictable power interruptions. This is why PM is used in more and more applications as commercial PM devices [56] come close to DRAM in terms of speed but with a significantly larger capacity.
However, software developers are required to write PM related software code in order to unleash the full power of these PM devices [117].

Unfortunately, it is a challenging task to use PM instructions and APIs correctly and efficiently. The reason is that, due to performance concerns, PM instructions are often designed to have weaker persistency/consistency models than volatile memory instructions. Thus, what is considered correct behavior for volatile memory may no longer be correct for persistent memory. Since the persistency/consistency models are far from intuitive, unless developers have a deep understanding of both software and the PM semantics associated with hardware, it will be difficult to use these PM instructions and APIs correctly and efficiently.

Although a large number of program analysis techniques have been developed to help detect PM bugs [88, 28, 86, 39, 38, 18, 87, 46, 114, 80, 105] or prove their absence [45, 111, 74], little has been done on automated diagnosis and repair of PM bugs. In fact, the only existing repair technique that we are aware of is the Hippocrates tool developed by Neal et al. [104]. Unfortunately, Hippocrates only repairs one type of relatively simple PM bugs called durability bugs; these bugs are simple in that fixing them requires only the addition of missing PM instructions. There are more complex PM bugs, often called crash consistency bugs in the literature, that Hippocrates cannot repair; fixing them requires some of the existing instructions to be reordered. Furthermore, Hippocrates uses syntactic-level pattern-matching, which means that if a bug matches a known pattern, the tool will be able to repair it by applying a pre-defined code transformation. However, if the bug does not match any known pattern hard-coded into the repair tool, the bug cannot be repaired.

To fill the gap, we propose a constraint based method for automatically computing repairs for a broader class of PM bugs.
Unlike the syntactic-level pattern-matching based approach of Neal et al. [104], our method relies on a semantic analysis of PM instructions to compute repairs. By symbolically encoding the PM-related program behavior and the correctness property as a set of logical constraints, and then leveraging an off-the-shelf SMT solver to solve these constraints, our method is able to search for novel repair strategies in a large solution space. As a result, our method is able to repair durability and crash consistency bugs of arbitrary form, even if these bugs do not match any of the known syntactic-level bug patterns hard-coded into Hippocrates.

Fig. 3.1 shows an overview of our method. The input consists of the program and the violated PM property, and the output is the suggested repair. Internally, our method first leverages a Valgrind based software tool to instrument the program and generate the execution trace. The traces generated at the end of this step may be fed to any existing PM bug detection tool [55, 54, 68] to confirm the property violation. To compute a repair, our method uses an SMT solver to symbolically encode the solution space. As shown in Fig. 3.1, it symbolically checks possible repairs in the solution space to find a valid repair. In this context, a repair can be thought of as a modification of the program through a combination of inserting new PM instructions and reordering PM instructions. Our search for a repair is an iterative process, involving multiple calls to the SMT solver for both finding the repair candidate and validating it. Only valid repairs are returned to the user.

[Figure 3.1 diagram labels: Target Program, Property Spec., Trace Generation, SMT Based Symbolic Analysis, Reordering Instructions, Adding Instructions, Validating Repair (yes/no), Suggested Repair]

Figure 3.1: PMBugAssist: our PM bug repair method.

At the center of our method is the SMT solver based symbolic analysis for two reasons.
First, symbolic analysis allows us to explore a large number of possible solutions quickly. Second, symbolic analysis is able to model various types of PM instructions and properties not only accurately but also uniformly, meaning that during symbolic encoding, everything boils down to a set of logical constraints. Since these constraints are expressed in a fragment of the SMT-LIB format, i.e., linear integer arithmetic (LIA), they can be solved efficiently using any off-the-shelf SMT solver.

We have implemented the method in a tool named PMBugAssist. During experimental evaluation, we focused on comparing our method with Hippocrates [104]. This is because our focus is on automated repair, for which Hippocrates represents the state of the art. In contrast, prior work on detecting PM bugs [88, 28, 86, 111, 74] and verifying their absence [45, 111, 74] is less relevant for comparison; it is complementary to our method.

Our benchmarks include programs from the well-known Intel PMDK library [57] as well as real applications such as Memcached [10] and Recipe [83]. According to prior works on PM bug detection, these benchmarks have 35 known bugs in total, including 23 durability bugs and 12 crash consistency bugs. Our experimental results show that the new method can repair all of these 35 bugs, whereas Hippocrates cannot repair any of the crash consistency bugs. We also evaluated the runtime performance of the new method, and found that, for all benchmark programs, it can finish the repair computation quickly.

To summarize, we make the following contributions:

• We propose the first constraint based method for repairing a broader class of PM bugs. Compared with the state-of-the-art approach, our method can repair PM bugs that do not match any known bug pattern.
• We formalize PM bug repair as a special case of the syntax-guided synthesis (SyGuS) [3] problem, through which we discuss the soundness and decidability of our method.
• We implement and evaluate the method on a large number of benchmark programs to demonstrate its advantages over the state-of-the-art (Hippocrates).

The remainder of this paper is structured as follows. In Section 3.1, we review the technical background. In Section 3.2, we present the top-level procedure in our method. This is followed by our SMT-based symbolic analysis algorithm in Section 3.3, our repair computation algorithm in Section 4.1, and our theoretical analysis in Section 4.2. We present the experimental results in Section 4.3. Finally, we give our conclusions in Section 4.4.

    // trace for executing the THEN-branch
    Inst I1: STORE 0x4a3c000        //STORE records[i].name
    Inst I2: STORE 0x4a3c080        //STORE records[i].valid
    Inst I3: clflushopt 0x4a3c080   //clflushopt records[i].valid
    Inst I4: clflushopt 0x4a3c000   //clflushopt records[i].name
    Inst I5: sfence
    assert( PTime(I1) < PTime(I2) ) //crash-consistency bug

    // trace for executing the ELSE-branch
    Inst I1: STORE 0x4a3c0C0        //STORE header->counter
    Inst I2: STORE 0x4a3c080        //STORE records[i].valid
    Inst I3: clflushopt 0x4a3c080   //clflushopt records[i].valid
    Inst I4: sfence
    assert( PTime(I1) < TMAX )      //durability bug
    assert( PTime(I2) < PTime(I1) ) //crash-consistency bug

Figure 3.2: Execution traces of the program in Fig. 2.2, with a durability bug in the ELSE-branch (the write to header->counter may never show up in PM) and a crash consistency bug in the THEN-branch (the write to records[i].name may not persist before the write to records[i].valid).

3.1 PM Bug Detection and Repair

3.1.1 Detecting PM Bugs

Existing tools for detecting PM bugs (e.g., PMemCheck [55] and PMTest [88]) are based on analyzing the execution traces. Fig. 3.2 shows two example traces for branches of the loop body in Fig. 2.2. For simplicity, we only show the STORE, CLFLUSHOPT, and SFENCE instructions relevant to the violated assertions.

The first assertion violated by the ELSE-branch represents durability.
For the STORE $I_1$, its persistency time is denoted $PTime(I_1)$. Assuming that TMAX is the upper bound of the persistency time (bounded by the number of executed instructions in this program), we can express durability as $PTime(I_1) < TMAX$. The assertion is violated because clflushopt 0x4a3c0C0 is not used to force the CPU to flush the written value from cache to persistent media.

The assertion violated by the THEN-branch captures a crash consistency property. Here, the expectation is that the value written by $I_1$ always persists before the value written by $I_2$, as shown in $PTime(I_1) < PTime(I_2)$. The assertion is violated because the CPU allows two CLFLUSHOPT instructions to take effect in reverse order, as shown by the ✘ symbol in Table 2.1.

Note that, even if we swap the execution order of the two instructions ($I_3$ and $I_4$) in the program, the assertion will still be violated. Fig. 3.3 illustrates the reason. Here, the solid edges represent the execution order, while the dashed edges represent the persistency order imposed by Px86. Since the dashed edges remain the same (before and after swapping the execution order of $I_3$ and $I_4$), the requirement that $I_1$ always persists before $I_2$ is still not satisfied.

3.1.2 Repairing PM Bugs

Hippocrates [104] is the only existing method for repairing PM bugs, with two limitations. First, it only repairs the relatively simple durability bugs, such as the one shown in the ELSE-branch of Fig. 3.2, but not the more complex crash consistency bugs. Second, it only repairs bugs that syntactically match the patterns hard-coded into the repair tool. For bugs that do not have a syntactic match, Hippocrates would not know how to repair them. For example, if repairing a bug requires reordering some instructions, then Hippocrates cannot do it.

In contrast, our method can repair both durability and crash consistency bugs, and can repair bugs that do not syntactically match any of the known patterns hard-coded into Hippocrates.
This is because our method has the ability to analyze the semantics of the PM instructions, and thus repair PM bugs through a combination of inserting new PM instructions and reordering instructions. We illustrate the technical challenges using examples in Fig. 3.4.

[Figure 3.3 diagram: two graphs over the nodes STORE(0x4a3c000), STORE(0x4a3c080), clflushopt(0x4a3c080), clflushopt(0x4a3c000), and sfence, before and after swapping the two clflushopt instructions; solid edges show the execution order and dashed edges show the persistency order]

Figure 3.3: Ordering constraints for the THEN-branch: modifying the program by swapping the two clflushopt instructions will not fix the crash consistency bug.

Fig. 3.4 shows two possible repairs of the bug in the THEN-branch of Fig. 3.2. The first attempt, based solely on reordering the existing instructions of the execution trace, is not a complete repair. By moving $I_4$ and $I_5$ before $I_2$ and $I_3$, the new version of the program does guarantee that records[i].name persists before records[i].valid. However, the reordering also introduces a new durability bug for $I_2$: without a subsequent SFENCE instruction, the value written by $I_2$ is no longer guaranteed to show up in persistent media, e.g., if the program crashes in the middle of the execution due to power failure.

    // trace for executing the THEN-branch
    Inst I1: STORE 0x4a3c000        //STORE records[i].name
    Inst I4: clflushopt 0x4a3c000   //clflushopt records[i].name
    Inst I5: sfence                 //sfence
    Inst I2: STORE 0x4a3c080        //STORE records[i].valid
    Inst I3: clflushopt 0x4a3c080   //clflushopt records[i].valid
    //This is an incomplete repair -- it introduces a new durability bug
    Inst I6: sfence                 //fence
    //This is a complete repair -- must also add this 'sfence'

Figure 3.4: Two repairs for the bug in the THEN-branch of Fig. 3.2: the first repair is incomplete since it adds a new durability bug for $I_2$; the second repair is complete because it removes both the new durability bug and the original crash consistency bug.

Fig.
3.4 highlights the fact that, sometimes, it is impossible to repair a crash consistency bug solely by reordering instructions; we also need to add new PM instructions. We shall explain in the remainder of this paper how our method finds out that, by adding SFENCE in $I_6$, we can completely repair the crash consistency bug.

To summarize, for the buggy writer() in Fig. 2.2, the repaired version is shown in Fig. 2.3. Through a combination of inserting new PM instructions and reordering instructions, the repaired version in Fig. 2.3 guarantees both the durability requirement of header->counter and the crash consistency requirement that records[i].valid always persists before header->counter. Note that, to satisfy the second requirement, we not only have to add CLFLUSHOPT+SFENCE for header->counter, but also have to move header->counter++ (Line 7 in Fig. 2.2) after the IF-ELSE statement (Line 16 in Fig. 2.3).

3.2 Overview of Our Method

Our method takes an execution trace $T$ and an assertion $A$ as input, and returns a suggested repair $R$ as output. The trace $T = \{I_1, \ldots, I_N\}$ is a sequence of instructions, each of which has an instruction type specified in Table 2.1.

The assertion $A$ is a conjunction of must-persist-before constraints of the form $PTime(I_i) < TMAX$ (durability) or $PTime(I_i) < PTime(I_j)$ (crash consistency), as shown in Fig. 3.2. Here, TMAX is the upper bound of the persistency time. Thus, if there exists a way of satisfying $PTime(I_i) \geq TMAX$, there exists a durability violation where $I_i$ has not yet taken effect in persistent media at the end of the execution.

Algorithm 1: Our method R ← PMBugAssist(T, A)

    T ← the bug trace
    A ← the property
    R ← empty repair
    while BugIsFound(T, A) do
        R ← ComputeRepair(T, A)
        if RepairIsValid(T, R) then
            return R as repair
        end if
        T ← AddInstructions(T, A, R)
    end while

Algorithm 1 shows the top-level procedure. Since we only invoke the procedure on a buggy execution trace, the first call to the subroutine BugIsFound(T, A) always returns true.
Next, we use ComputeRepair(T, A) to compute a potential repair. It guarantees that, after applying the repair $R$ to the given trace $T$, the assertion violation no longer exists. However, this is not yet enough to guarantee that $R$ is a valid repair.

There are two possibilities. One possibility is that $R$ indeed is a valid repair: by permuting the instructions in $T$, $R$ removes all the bad executions and retains only the good executions. The other possibility is that $R$ is a vacuous repair in that, by creating a contradiction between $R$ and $T$, it artificially removes all valid executions of the instructions in $T$. Since there is no longer any valid execution, by definition, the SMT solver cannot detect any violation (which must be a valid, and yet buggy, execution).

To find out whether the repair $R$ is valid or vacuous, we use the subroutine RepairIsValid(T, R) to check, after applying $R$ to $T$, whether any valid execution exists. If the answer is yes, then $R$ is a valid repair, and thus is returned to the user. Otherwise, we use AddInstructions(T, A, R) to add more SFENCE and CLFLUSHOPT instructions to $T$, and try again.

Our symbolic method explicitly considers the efficiency of the computed repair by adding SFENCE and CLFLUSHOPT instructions iteratively on a "need-to" basis. As soon as enough instructions are added, the while-loop in Algorithm 1 will terminate. In this sense, it minimizes the number of added instructions, but without using an "optimizing solver" such as MAXSMT in DirectFix [94].

3.3 Symbolic Analysis of the PM Bug

In this section, we present our SMT based method for analyzing the PM bug symbolically. It is the foundation of not only the subroutine BugIsFound(T, A) but also the subroutines ComputeRepair(T, A) and RepairIsValid(T, R) in Algorithm 1.

3.3.1 The Satisfiability Problem

Given the trace $T$ and the assertion $A$, whether there exists a valid execution of the instructions in $T$ that violates $A$ can be formulated as a satisfiability (SAT) problem.
Toward this end, we construct a logical formula $\Phi := \Phi_{program} \land \Phi_{persistency} \land \lnot\Phi_{assertion}$, where $\Phi_{program}$ encodes the program order, $\Phi_{persistency}$ encodes the persistency order, and $\Phi_{assertion}$ encodes the assertion. Thus, $\Phi$ is satisfiable if and only if there exists a valid execution of the instructions in $T$ that violates $A$.

We express $\Phi$ in a fragment of the SMT-LIB format that allows only integer variables (such as $x$ and $y$) and Boolean compositions of linear integer arithmetic (LIA) constraints of the form $(x < y)$. Thus, the satisfiability of $\Phi$ can be efficiently decided using any off-the-shelf SMT solver.

Before presenting our method for constructing $\Phi$, we define the two sets of variables used to encode $\Phi$ as follows:

The $PC\_I_i$ Variables. For each instruction $I_i \in T$, where $i = 1, \ldots, N$, we define a variable $PC\_I_i$ whose value may be any integer in the interval $[0, N)$; it stands for the execution time, i.e., when the instruction $I_i$ is executed by the CPU. Inside $\Phi$, we will constrain the $PC\_I_i$ variables to allow only valid permutations of $T$.

The $PT\_I_i$ Variables. For each instruction $I_i \in T$ of the STORE type, we define a variable $PT\_I_i$ whose value may be any integer in the interval $[-1, N+1]$; it stands for the persistency time of $I_i$, i.e., when the value written by $I_i$ is actually stored in persistent media.

3.3.2 Using $\Phi_{program}$ to Encode Execution Order

Let $\Phi_{program} := \Phi_{pc} \land \Phi_{so} \land \Phi_{fo} \land \Phi_{fs} \land \Phi_{mo}$ be a set of constraints on the $PC\_I_i$ variables such that, for every satisfying assignment to $\Phi_{program}$, the values of the $PC\_I_i$ variables correspond to a valid permutation of $T$.

3.3.2.1 Subformula $\Phi_{pc}$

This program-counter (pc) constraint restricts each $PC\_I_i$ to $[0, N)$ to model the time when $I_i$ is executed. The execution time starts from 0 and is bounded by $N$, the total number of instructions in $T$. We also require each $PC\_I_i$ variable to have a unique value. The definition of $\Phi_{pc}$ is presented in Fig. 3.5.
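As a rough illustration of the program-counter constraint just described, the following C sketch checks whether a concrete assignment of execution times satisfies the two requirements (range and uniqueness). It is a concrete evaluator rather than a symbolic encoding, and the function name `phi_pc_holds` and the array representation are assumptions made here for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Check a concrete assignment pc[0..n-1] of execution times against the
 * program-counter constraint: every value lies in [0, n) and all values
 * are pairwise distinct. Together, these make pc a permutation of 0..n-1. */
bool phi_pc_holds(const int *pc, int n) {
    assert(pc != NULL && n >= 0);
    for (int i = 0; i < n; i++)
        if (pc[i] < 0 || pc[i] >= n)
            return false;              /* out of the interval [0, n) */
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (pc[i] == pc[j])
                return false;          /* two instructions share a time slot */
    return true;
}
```

For a five-instruction trace, the assignment {0, 3, 4, 1, 2} is accepted, whereas {0, 1, 1, 3, 4} (a duplicate) and {0, 1, 2, 3, 5} (out of range) are rejected. An SMT solver would search over such assignments instead of checking a given one.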
3.3.2.2 Subformula Φ_so

This store-order (so) constraint requires the STORE instructions in T to execute in the same order as they appear in the trace. This is because Px86 has a single store-buffer for all STORE instructions; thus, reordering of two STORE instructions (I_i, I_j) is not allowed, as shown by ✔ in Table 2.1. The definition of Φ_so is also presented in Fig. 3.5.

$$
\begin{aligned}
\Phi_{pc} &:= \bigwedge_{1 \le i \le N} (0 \le \mathit{PC\_I}_i < N) \;\wedge \bigwedge_{1 \le i < j \le N} (\mathit{PC\_I}_i \ne \mathit{PC\_I}_j) \\
\Phi_{so} &:= \bigwedge_{I_i \in \mathit{Stores} \,\wedge\, I_j \in \mathit{Stores} \,\wedge\, i<j} (\mathit{PC\_I}_i < \mathit{PC\_I}_j) \\
\Phi_{fs} &:= \bigwedge_{I_j \in \mathit{Flushes}} \;\; \bigvee_{I_i \in \mathit{Stores} \,\wedge\, \mathit{SameCacheL}(I_i,I_j)} (\mathit{PC\_I}_i < \mathit{PC\_I}_j) \\
\Phi_{fo} &:= \bigwedge_{I_i \in \mathit{Fences} \,\wedge\, I_j \in \mathit{Fences} \,\wedge\, i<j} (\mathit{PC\_I}_i < \mathit{PC\_I}_j) \\
\Phi_{mo} &:= \bigwedge_{I_i,I_j \in \mathit{Stores} \,\wedge\, \mathit{SameCacheL}(I_i,I_j) \,\wedge\, i<j} \;\; \bigvee_{I_k \in \mathit{Flushes} \,\wedge\, \mathit{SameCacheL}(I_k,I_i)} (\mathit{PC\_I}_i < \mathit{PC\_I}_k < \mathit{PC\_I}_j) \\
\Phi_{pti} &:= \bigwedge_{I_i \in \mathit{Stores}} (-1 \le \mathit{PT\_I}_i \le N+1) \\
\Phi_{pts} &:= \bigwedge_{I_i \in \mathit{Stores}} (\mathit{PC\_I}_i \le \mathit{PT\_I}_i) \\
\Phi_{fi} &:= \bigwedge_{I_i \in \mathit{Stores} \,\wedge\, I_j \in \mathit{Flushes} \,\wedge\, \mathit{SameCacheL}(I_i,I_j) \,\wedge\, I_k \in \mathit{Fences}} (\mathit{PC\_I}_i < \mathit{PC\_I}_j < \mathit{PC\_I}_k) \implies (\mathit{PT\_I}_i \le \mathit{PC\_I}_k) \\
\Phi_{du} &:= \bigwedge_{I_i \in \mathit{Stores}} (\mathit{PT\_I}_i < N) \\
\Phi_{cc} &:= \bigwedge_{I_i,I_j \in \mathit{Stores} \,\wedge\, \mathit{assert}(\mathit{PTime}(I_i) < \mathit{PTime}(I_j))} (\mathit{PT\_I}_i < \mathit{PT\_I}_j)
\end{aligned}
$$

Figure 3.5: Our symbolic encoding of the subformulas in Φ_program := Φ_pc ∧ Φ_so ∧ Φ_fs ∧ Φ_fo ∧ Φ_mo, Φ_persistency := Φ_pti ∧ Φ_pts ∧ Φ_fi and Φ_assertion := Φ_du ∧ Φ_cc.

During our computation of a repair, however, we have two options. If the PM bug can be repaired while enforcing Φ_so, we enforce it while searching for the repair. In this case, the repair relies solely on reordering the PM instructions (CLFLUSHOPT and SFENCE). But if the PM bug cannot be repaired in this way, we relax Φ_so to allow some of the STORE instructions to reorder. This is because some PM bugs cannot be repaired unless some STORE instructions are allowed to reorder in the program.
We discussed an example at the end of Section 3.1, and we will discuss details of this relaxation in Section 4.2.

3.3.2.3 Subformula Φ_fs

This flush-store (fs) constraint requires that each CLFLUSHOPT (I_j) executes after at least one of the STORE instructions (I_i) that it can flush. This requires that I_i and I_j are mapped to the same cache line, i.e., SameCacheL(I_i, I_j) holds.

3.3.2.4 Subformula Φ_fo

This fence-order (fo) constraint requires multiple SFENCE instructions to be executed in the same order as they appear in the trace.

3.3.2.5 Subformula Φ_mo

This memory overwrite (mo) constraint says that two STORE instructions (I_i, I_j) cannot write the same address without a CLFLUSHOPT (I_k) inserted in between, to avoid memory overwrite. Memory overwrites must be avoided because, by definition, they violate the durability property.

3.3.3 Using Φ_persistency to Encode Persistency Order

Let Φ_persistency := Φ_pti ∧ Φ_pts ∧ Φ_fi be a set of constraints on the PT_I_i variables such that, for every satisfying assignment to Φ_persistency, the values of the PT_I_i variables correspond to a valid persistency order of the instructions in T. These subformulas are defined in Fig. 3.5.

3.3.3.1 Subformula Φ_pti

This persistency time initialization (pti) constraint requires that, for each STORE instruction I_i ∈ Stores, the value of PT_I_i is in the interval [−1,N+1]. Besides the values in [0,N), −1 means I_i has not been executed, N means I_i has been flushed but not yet fenced, and N+1 means I_i has not even been flushed at the end of the execution.

3.3.3.2 Subformula Φ_pts

This persistency time store (pts) constraint requires the persistency time of each I_i ∈ Stores to be no earlier than the execution time of I_i, i.e., the value of PC_I_i.

3.3.3.3 Subformula Φ_fi

This fence interval (fi) constraint requires that, for each I_i ∈ Stores, matching I_j ∈ Flushes, and I_k ∈ Fences, whenever I_i, I_j and I_k execute in this order, the persistency time of I_i is no later than the execution time of I_k.
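To make the subformulas above concrete, the following Python sketch evaluates Φ_so, Φ_fo, Φ_fs and Φ_mo on a hypothetical six-instruction trace (two stores to different cache lines, one flush per store, and two fences, mirroring the shape of the example in Section 3.3.5). The trace encoding, the helper names, and the use of explicit enumeration in place of an SMT solver are all assumptions of this sketch. The helper pt_upper_bound computes the largest persistency time that Φ_pti and Φ_fi allow for a store under a given execution order.

```python
from itertools import permutations

# Hypothetical trace, 0-indexed: (kind, cache line). Fences have no cache line.
TRACE = [
    ("STORE", "A"),   # I1
    ("STORE", "B"),   # I2
    ("FLUSH", "B"),   # I3 (CLFLUSHOPT)
    ("FLUSH", "A"),   # I4 (CLFLUSHOPT)
    ("FENCE", None),  # I5 (SFENCE)
    ("FENCE", None),  # I6 (SFENCE)
]


def indices(kind):
    return [i for i, (k, _) in enumerate(TRACE) if k == kind]


STORES, FLUSHES, FENCES = indices("STORE"), indices("FLUSH"), indices("FENCE")


def same_line(i, j):
    return TRACE[i][1] is not None and TRACE[i][1] == TRACE[j][1]


def phi_program(pc):
    """pc[i] is the execution time of instruction i; pc is a permutation,
    so Phi_pc holds by construction."""
    so = all(pc[i] < pc[j] for i in STORES for j in STORES if i < j)
    fo = all(pc[i] < pc[j] for i in FENCES for j in FENCES if i < j)
    fs = all(any(pc[i] < pc[j] for i in STORES if same_line(i, j))
             for j in FLUSHES)
    mo = all(any(pc[i] < pc[k] < pc[j] for k in FLUSHES if same_line(k, i))
             for i in STORES for j in STORES if i < j and same_line(i, j))
    return so and fo and fs and mo


def pt_upper_bound(pc, store):
    """Largest PT value allowed by Phi_pti and Phi_fi for the given store."""
    bounds = [pc[k]
              for j in FLUSHES if same_line(store, j)
              for k in FENCES if pc[store] < pc[j] < pc[k]]
    return min(bounds, default=len(pc) + 1)


valid = [pc for pc in permutations(range(len(TRACE))) if phi_program(pc)]
print(len(valid))                                # number of valid permutations
print(pt_upper_bound((0, 1, 2, 3, 4, 5), 0))     # store, flush, then fence
print(pt_upper_bound((2, 3, 4, 5, 0, 1), 0))     # both fences execute first
```

In the last call, both fences execute before the store, so no antecedent of Φ_fi fires and the persistency time is bounded only by N+1, i.e., the store may never be persisted.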
3.3.4 Using Φ_assertion to Encode the Assertion

Let Φ_assertion := Φ_du ∧ Φ_cc, where Φ_du represents the set of durability conditions and Φ_cc represents the set of crash consistency conditions. Both of them are defined in Fig. 3.5.

Recall that for each I_i ∈ Stores, the value written by I_i is expected to be stored in persistent media at the end of the execution (TMAX = N). Thus, if (PT_I_i ≥ N) is satisfiable, there exists a durability bug. Similarly, given two instructions I_i, I_j ∈ Stores, if I_i is expected to always persist before I_j, then the satisfiability of (PT_I_i ≥ PT_I_j) means there exists a crash consistency bug.

3.3.5 An Example for Our Encoding Method

Fig. 3.6 shows the constraints in Φ constructed by our method for the THEN-branch of Fig. 3.4, after the new SFENCE instruction I_6 has been added to the end of the trace.

     1  // Program order constraints:
     2  (0 ≤ PC_I_1 ≤ 5) ∧ (0 ≤ PC_I_2 ≤ 5) ∧ (0 ≤ PC_I_3 ≤ 5) ∧ (0 ≤ PC_I_4 ≤ 5) ∧
     3  (0 ≤ PC_I_5 ≤ 5) ∧ (0 ≤ PC_I_6 ≤ 5) ∧
     4  (PC_I_1 ≠ PC_I_2) ∧ (PC_I_1 ≠ PC_I_3) ∧ (PC_I_1 ≠ PC_I_4) ∧ (PC_I_1 ≠ PC_I_5) ∧
     5  (PC_I_1 ≠ PC_I_6) ∧ (PC_I_2 ≠ PC_I_3) ∧ (PC_I_2 ≠ PC_I_4) ∧ (PC_I_2 ≠ PC_I_5) ∧
     6  (PC_I_2 ≠ PC_I_6) ∧ (PC_I_3 ≠ PC_I_4) ∧ (PC_I_3 ≠ PC_I_5) ∧ (PC_I_3 ≠ PC_I_6) ∧
     7  (PC_I_4 ≠ PC_I_5) ∧ (PC_I_4 ≠ PC_I_6) ∧ (PC_I_5 ≠ PC_I_6) ∧
     8  (PC_I_1 < PC_I_2) ∧ (PC_I_1 < PC_I_4) ∧ (PC_I_2 < PC_I_3) ∧ (PC_I_5 < PC_I_6)
     9  // Persistency time constraints:
    10  (−1 ≤ PT_I_1 ≤ 7) ∧ (−1 ≤ PT_I_2 ≤ 7) ∧ (PC_I_1 ≤ PT_I_1) ∧ (PC_I_2 ≤ PT_I_2) ∧
    11  (PC_I_1 < PC_I_4 < PC_I_5 ⟹ PT_I_1 ≤ PC_I_5) ∧
    12  (PC_I_1 < PC_I_4 < PC_I_6 ⟹ PT_I_1 ≤ PC_I_6) ∧
    13  (PC_I_2 < PC_I_3 < PC_I_5 ⟹ PT_I_2 ≤ PC_I_5) ∧
    14  (PC_I_2 < PC_I_3 < PC_I_6 ⟹ PT_I_2 ≤ PC_I_6)
    15  // Assertion violation constraints:
    16  ¬((PT_I_1 < 6) ∧ (PT_I_2 < 6) ∧ (PT_I_1 < PT_I_2))

Figure 3.6: Encoding for the THEN-branch of Fig. 3.4 with both durability and crash consistency assertions.

Specifically, Lines 2-7 encode Φ_pc, which requires each PC_I_i to have a unique value in [0,5]. Here, N = 6 is the total number of instructions in the extended execution trace T. Line 8 encodes the program order. In particular, PC_I_1 < PC_I_2 encodes Φ_so, which requires the two STORE instructions to execute in order. PC_I_1 < PC_I_4 and PC_I_2 < PC_I_3 encode Φ_fs, which requires each CLFLUSHOPT to execute after a corresponding STORE. PC_I_5 < PC_I_6 encodes Φ_fo, which requires the two SFENCE instructions to execute in the same order as in the trace.

Line 10 encodes Φ_pti and Φ_pts, where Φ_pti requires each PT_I_i to have a value in [−1,7], and Φ_pts requires each PT_I_i to be no earlier than the corresponding PC_I_i. Lines 11-14 encode Φ_fi. In particular, PC_I_1 < PC_I_4 < PC_I_5 ⟹ PT_I_1 ≤ PC_I_5 means that, whenever the STORE and CLFLUSHOPT instructions for 0x4a3c000 execute before the SFENCE instruction I_5, the persistency time for 0x4a3c000 is guaranteed to be no later than the execution time of I_5. Finally, Line 16 encodes the conditions under which the assertion may be violated.

Since the set of constraints (Φ) in Fig. 3.6 is satisfiable, an SMT solver may return a solution corresponding to the permutation T′ = I_1, I_4, I_2, I_3, I_5, I_6.
This is a valid permutation of T because, according to the CL symbol in Table 2.1, CLFLUSHOPT (I_4) is allowed to reorder before I_2 and I_3. However, it violates the crash consistency property because I_2 may persist before I_1. In the next section, we present our method for repairing this violation.

Chapter 4
Automated Repair of Persistent Memory Bugs

4.1 Computing the Repair

Algorithm 2 shows our method for computing a repair R when the formula Φ is satisfiable. Our method first uses the subroutine ComputeRepair(T,A) to compute a candidate R, and then uses the subroutine RepairIsValid(T,R) to check if R is a valid repair.

4.1.1 Subroutine ComputeRepair(T,A)

The repair R is represented by a conjunction of blocking constraints, each of which, denoted ¬ψ_sat, removes a subset of permutations of T allowed by Φ. Recall that Φ allows only valid and yet buggy permutations. Thus, we want to compute a set of blocking constraints that remove all valid and yet buggy permutations.

In Algorithm 2, R is initialized to true, which represents an empty repair. Then, as long as Φ ∧ R remains satisfiable (Line 3), we compute a constraint ψ_sat from the satisfying assignment (sol) to the formula Φ ∧ R. Here, ψ_sat is a conjunction of happens-before constraints, (PC_I_i < PC_I_j), extracted from the antecedents of the subformula Φ_fi such that all these happens-before constraints are satisfied by the assignment (sol).

Algorithm 2: R ← ComputeRepair(T,A)
    1  Φ ← Φ_program ∧ Φ_persistency ∧ ¬Φ_assertion
    2  R ← true
    3  while Satisfiable(Φ ∧ R) do
    4      ψ_sat ← ExtractSatConstraint(Φ ∧ R)
    5      R ← R ∧ ¬ψ_sat
    6  end while
    7  return R

Since ψ_sat captures a set of valid-and-yet-buggy permutations of T, by adding ¬ψ_sat to R, we remove them (Line 5). Inside the while-loop of Algorithm 2, we keep adding ¬ψ_sat until Φ ∧ R is no longer satisfiable.

Within each call to ExtractSatConstraint(Φ ∧ R), we compute a minimal set of constraints to be included in ψ_sat based on the satisfying assignment (sol) returned by the SMT solver.
This is accomplished using the greedy algorithm as follows: First, we extract the concrete values of the PC_I_i variables from the assignment (sol), and use these concrete values to decide, for each (PC_I_i < PC_I_j) constraint in the antecedents of Φ_fi, whether the constraint is satisfied. All the satisfied (PC_I_i < PC_I_j) constraints are added to ψ_sat. Thus, the negation of ψ_sat will eliminate permutations associated with the assignment (sol).

Before adding ¬ψ_sat to R, we remove the obviously-redundant constraints from ψ_sat. These are constraints that are implied by other constraints in ψ_sat. For example, if ψ_sat contains both (PC_I_1 < PC_I_2) and (PC_I_2 < PC_I_3), then we remove (PC_I_1 < PC_I_3) from ψ_sat since it is redundant.

4.1.2 Subroutine RepairIsValid(T,R)

Algorithm 3 shows our method for validating the repair candidate R in two steps. First, we define a new formula Ψ := Φ_program ∧ Φ_persistency to capture the set of valid permutations of T. Note that Ψ is a subformula of Φ because Φ := Ψ ∧ ¬Φ_assertion. Next, we check if the combined formula (Ψ ∧ R) is satisfiable; we say that R is a valid repair only if (Ψ ∧ R) is satisfiable.

Algorithm 3: RepairIsValid(T,R)
    1  Ψ ← Φ_program ∧ Φ_persistency
    2  if Satisfiable(Ψ ∧ R) then
    3      return true
    4  else
    5      return false
    6  end if

Fig. 4.1 illustrates why we check the validity of the repair in this way. Here, formulas ¬Φ_assertion and Ψ can be thought of as filters of permutations of the trace T: red ones are buggy and black ones are non-buggy. In this sense, Ψ retains only the valid permutations of T, and the repair candidate R filters out the valid-and-yet-buggy permutations. If R retains at least some non-buggy permutation (black arrow), we say that R is a valid repair. But if R does not retain any non-buggy permutation at all, it is a vacuous repair. The existence of some (valid and non-buggy) permutations means that the constraints imposed by R are realizable.

[Figure 4.1, panel (a) before repair: Ψ ∧ ¬Φ_assertion filters all permutations down to the valid permutations and then to the valid-and-buggy permutations. Panel (b) after repair: Ψ ∧ R ∧ ¬Φ_assertion; R separates the valid-and-non-buggy permutations from the valid-and-buggy permutations.]

Figure 4.1: Formulas act as “filters” of the trace permutations, including buggy (red) and non-buggy (black) permutations.

4.1.3 An Example for Our Repair Method

We use the example in Fig. 3.6 to illustrate the repair computation and validation process. Fig. 4.2 shows the corresponding steps.

First, recall that the constraints (Φ) shown in Fig. 3.6 are satisfiable. From the first solution to Φ returned by the SMT solver, our method identifies the happens-before constraints in the antecedents of the fence interval (fi) subformula Φ_fi. This corresponds to Line 4 of Algorithm 2. While there are four antecedents in Φ_fi as shown in Fig. 3.6, only two of them end up in ψ_sat, as shown in Fig. 4.2.

By adding ¬ψ_sat to R, our method removes the buggy permutation where instructions I_1 and I_2, together with their CLFLUSHOPT instructions, execute before the first SFENCE instruction I_5.

Next, we check if Φ ∧ R is satisfiable (Line 3 of Algorithm 2). Since the answer is yes, from the second solution to Φ returned by the SMT solver, our method computes another ψ_sat and then uses ¬ψ_sat to remove the buggy permutation where instructions I_1 and I_2, together with their CLFLUSHOPT instructions, are moved in between instructions I_5 and I_6.

At this moment, the only remaining permutation is as follows: I_1 and its CLFLUSHOPT are before I_5, while I_2 and its CLFLUSHOPT are between I_5 and I_6. Since this permutation does not violate the assertion, our method exits the while-loop in Algorithm 2 and returns R as a potential repair.
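The loop of Algorithm 2 and the check of Algorithm 3 can be sketched on this example as follows. This is a deliberately simplified illustration and not our implementation: an explicit enumeration stands in for the SMT solver, and each blocking constraint pins one complete execution order instead of the smaller conjunction of happens-before constraints that ExtractSatConstraint derives from the antecedents of Φ_fi. All function and variable names are assumptions of the sketch.

```python
from itertools import permutations, product

N = 6                    # instructions I1..I6 map to indices 0..5
STORES = (0, 1)          # I1 (0x4a3c000) and I2 (0x4a3c080)
# (store, flush, fence) triples that form the antecedents of Phi_fi
FI = [(0, 3, 4), (0, 3, 5), (1, 2, 4), (1, 2, 5)]


def phi_program(pc):
    # Line 8 of Fig. 3.6; Phi_pc holds because pc is a permutation of 0..5.
    return pc[0] < pc[1] and pc[0] < pc[3] and pc[1] < pc[2] and pc[4] < pc[5]


def phi_persistency(pc, pt):
    # Lines 10-14 of Fig. 3.6 (the PT range is enforced by the search domain).
    pts = all(pc[s] <= pt[s] for s in STORES)
    fi = all(pt[s] <= pc[k] for s, f, k in FI if pc[s] < pc[f] < pc[k])
    return pts and fi


def phi_assertion(pt):
    return pt[0] < N and pt[1] < N and pt[0] < pt[1]


def solve(blocked, negate_assertion):
    """Brute-force stand-in for the solver: return the PC part of a model."""
    for pc in permutations(range(N)):
        if pc in blocked or not phi_program(pc):
            continue
        for pt in product(range(-1, N + 2), repeat=len(STORES)):
            if phi_persistency(pc, pt) and not (negate_assertion and phi_assertion(pt)):
                return pc
    return None


def compute_repair():
    blocked = set()                      # R, kept as a set of blocked orders
    while True:
        pc = solve(blocked, negate_assertion=True)    # Satisfiable(Phi ∧ R)?
        if pc is None:
            return blocked
        blocked.add(pc)                  # R <- R ∧ ¬psi_sat


def repair_is_valid(blocked):
    return solve(blocked, negate_assertion=False)     # Satisfiable(Psi ∧ R)?


repair = compute_repair()
witness = repair_is_valid(repair)
order = ["I%d" % (i + 1) for i in sorted(range(N), key=lambda i: witness[i])]
print(len(repair), order)
```

Under these assumptions, the enumeration blocks every valid-and-buggy order and leaves a single witness, which corresponds to the permutation I_1, I_4, I_5, I_2, I_3, I_6. Blocking one order per iteration needs more iterations than the constraint-based blocking described above, which is one reason the method extracts happens-before constraints instead.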
    // First iteration -- Satisfiable
    From solution T′ = I_1, I_2, I_3, I_4, I_5, I_6, we extract ψ_sat as follows:
        (PC_I_1 < PC_I_4 < PC_I_5) ∧ (PC_I_2 < PC_I_3 < PC_I_5) ∧
        (PC_I_1 < PC_I_4 < PC_I_6) ∧ (PC_I_2 < PC_I_3 < PC_I_6)
    // Second iteration -- Satisfiable
    From solution T′ = I_5, I_1, I_2, I_3, I_4, I_6, we extract ψ_sat as follows:
        (PC_I_5 < PC_I_1 < PC_I_4) ∧ (PC_I_5 < PC_I_2 < PC_I_3) ∧
        (PC_I_1 < PC_I_4 < PC_I_6) ∧ (PC_I_2 < PC_I_3 < PC_I_6)
    // Third iteration -- Unsatisfiable
    Return R as a potential repair
    // Validation -- Satisfiable
    Found a valid permutation of T by solving (Ψ ∧ R)
        Inst I_1: STORE 0x4a3c000
        Inst I_4: CLFLUSHOPT 0x4a3c000
        Inst I_5: SFENCE
        Inst I_2: STORE 0x4a3c080
        Inst I_3: CLFLUSHOPT 0x4a3c080
        Inst I_6: SFENCE

Figure 4.2: Illustrating the repair computation and validation.

Finally, our method uses Algorithm 3 to validate the repair by checking the satisfiability of Ψ ∧ R. Since Ψ ∧ R is satisfiable, the SMT solver returns a solution that corresponds to the permutation T′ = I_1, I_4, I_5, I_2, I_3, I_6. This permutation of T shows exactly how to reorder instructions in the extended execution trace to avoid the assertion violation. Thus, by mapping the reordered instructions from T back to the original program, we obtain the repaired software code shown in the THEN-branch of Fig. 2.3.

4.2 Correctness and Optimizations

In this section, we first discuss the correctness of our repair method by treating it as a special case of the well-known syntax-guided synthesis (SyGuS) problem [3]. Then, we discuss two optimizations.

4.2.1 Relating to SyGuS

Our repair problem can be viewed as deciding the existence of a relation R such that Ψ(x,y) ∧ R(x) ⟹ Φ_assertion(y) must be valid (for all x and y) and, at the same time, Ψ(x,y) ∧ R(x) must be satisfiable (for some x and y). Here, x denotes the set of PC_I_i variables and y denotes the set of PT_I_i variables.

$$
\exists R.\; \big(\forall x,y.\; \Psi(x,y) \wedge R(x) \implies \Phi_{assertion}(y)\big) \;\wedge\; \big(\exists x,y.\; \Psi(x,y) \wedge R(x)\big)
$$

This is the well-known SyGuS problem [3].
In our method, since the validity of A ∧ B ⟹ C is equivalent to the unsatisfiability of the negated formula A ∧ B ∧ ¬C, we rewrite the problem as follows:

$$
\exists R.\; \neg\big(\exists x,y.\; \Psi(x,y) \wedge R(x) \wedge \neg\Phi_{assertion}(y)\big) \;\wedge\; \big(\exists x,y.\; \Psi(x,y) \wedge R(x)\big)
$$

This allows us to use off-the-shelf SMT solvers to decide the two satisfiability subproblems. The first one says that Ψ ∧ R ∧ ¬Φ_assertion must be unsatisfiable, and the second one says that Ψ ∧ R must be satisfiable. They are the foundations of our method for computing and validating the repair in Algorithms 2 and 3.

The link to SyGuS allows us to understand the complexity of the repair problem. Since quantification is applied to the relation R, the problem is expressed as a formula in second-order logic, which is known to be undecidable in general. That is why practical solutions to the SyGuS problem tend to be sound (and yet incomplete) solutions. In our repair method, we adopt the same approach.

Our Method Is Guaranteed to Be Sound with Respect to the Given Trace. That is, the repair R computed by our method is guaranteed to be correct. This is because, by definition, R is able to make Ψ ∧ R ∧ ¬Φ_assertion unsatisfiable, as shown in Algorithm 2. At the same time, it is able to make Ψ ∧ R satisfiable, as shown in Algorithm 3. Thus, R can always eliminate the failed assertion.

Our method is not necessarily complete, meaning that even if there exists a valid repair, in theory, our method may not find it. We do not attempt to make the method complete for efficiency reasons, even if this may be achieved by restricting the search to a decidable solution subspace. Instead, we will demonstrate through experimental evaluation (Section 4.3) that, in practice, our repair method can always find a valid repair.

4.2.2 Adding New Instructions to T

So far, our analysis assumes that the set of instructions in the execution trace T is fixed. Sometimes, however, the PM bug cannot be fixed merely by permuting T; in addition, new CLFLUSHOPT and SFENCE instructions must be added.
This is the reason why there is a while-loop in Algorithm 1: whenever the PM bug cannot be repaired using the instructions in the given execution trace T, we use AddInstructions (Line 6 in Algorithm 1) to add instructions to T, and try again.

Which instructions to add first depends on the violated assertion. If the violated assertion is PT_I_i < PT_I_j, our strategy is to add a CLFLUSHOPT instruction whose address is the same as the address of I_i or I_j. If the violated assertion is PT_I_i < N (a durability bug), our strategy is to add a CLFLUSHOPT instruction first and then check if a valid repair exists; if the violation still exists, we add an SFENCE instruction and check again.

Fig. 3.4 shows an example. Prior to adding the instruction I_6, the last violated assertion represents the durability of the value written by I_2. Thus, we add an SFENCE instruction. There is no need to add a CLFLUSHOPT instruction for I_2 because such an instruction already exists in the given execution trace.

4.2.3 Relaxing the Subformula Φ_so

So far, our analysis assumes that STORE instructions in the given trace T are executed in the same order as they appear in the program. This is codified in the subformula Φ_so.

However, enforcing Φ_so may prevent some bugs from being repaired. An example has been shown in the ELSE-branch of Fig. 3.2. In addition to the durability property (PT_I_1 < N), the user also wants to satisfy the crash consistency property (PT_I_2 < PT_I_1). However, since the must-persist-before constraint (PT_I_2 < PT_I_1) contradicts the happens-before constraint (PC_I_1 < PC_I_2) in Φ_so, it is impossible to repair the bug. If we assume that Φ_assertion correctly expresses the intended behavior, then we must relax the happens-before constraints in Φ_so.

In our repair method, the solution is to enforce the subformula Φ_so first. However, if this does not lead to a valid repair, we relax it.
Toward this end, we first check if Φ_so contains a constraint (PC_I_i < PC_I_j) that contradicts the transitive closure of the must-persist-before constraints imposed by the crash consistency requirement Φ_cc. If the answer is yes, then we remove the conflicting constraint from Φ_so, and try again.

To summarize, whenever the must-persist-before constraints in Φ_assertion contradict the happens-before constraints in Φ_so, we assume that Φ_assertion is the intended behavior, and relax Φ_so.

4.3 Experiments

We implemented our method by using Z3 [102] to conduct the symbolic analysis described in Algorithms 1, 2 and 3. Our method takes an execution trace and a failed assertion as input and returns the repair as output. The known-to-be-buggy execution traces are generated using PMemCheck [55], although many other existing PM bug detection tools [55, 54, 68] can also be used to generate traces.

Table 4.1: Statistics of the benchmark programs.

    Name                  LoC      Description                                     PM Bug Type
    obj_constructor       186      Object constructor test [104]                   durability
    obj_first_next        314      POBJ_FIRST macro test [104]                     durability
    obj_mem               68       pmemobj copy, move and set tests [104]          durability
    obj_memops            654      basic memory operations tests [104]             durability
    obj_toid              83       TOID macros test [104]                          durability
    pmem_memcpy           174      memcpy test [104]                               durability
    pmem_memmove          223      memmove test [104]                              durability
    pmem_memset-1         103      memset from libpmemset [104]                    durability
    pmem_memset-2         103      memset from libpmemset [104]                    durability
    pmemspoil             1,324    pmempool spoil test [104]                       durability
    rpmemd_db             653      pool set database [104]                         durability
    Recipe (2 bugs)       39,581   convert DRAM index to PM index [83]             durability
    Memcached (10 bugs)   23,032   key/value cache store in distributed sys [10]   durability
    pmreorder_1           141      pmreorder script test [57]                      crash consistency
    pmreorder_2           141      pmreorder script test [57]                      crash consistency
    pmreorder_3           141      pmreorder script test [57]                      crash consistency
    pmreorder_4           141      pmreorder script test [57]                      crash consistency
    pmreorder_5           141      pmreorder script test [57]                      crash consistency
    pmreorder_6           141      pmreorder script test [57]                      crash consistency
    pmreorder_7           141      pmreorder script test [57]                      crash consistency
    pmreorder_8           141      pmreorder script test [57]                      crash consistency
    pmreorder_stack_1     123      functional test of pmreorder stack [57]         crash consistency
    pmreorder_stack_2     123      functional test of pmreorder stack [57]         crash consistency
    pmreorder_flushes_1   155      store reordering with flushes test [57]         crash consistency
    pmreorder_flushes_2   155      store reordering with flushes test [57]         crash consistency

4.3.1 Benchmarks

Table 4.1 shows the benchmark statistics, including the name, the number of lines of C code (LoC), a short description, and the known PM bug type. These benchmark programs fall into two sets.

The first set consists of programs with durability bugs. The first eleven programs come from the Intel PMDK library. The last 2 programs are real applications: Memcached [10] is a high-performance object caching system, and Recipe [83] is a set of durable concurrent data structures for fast indexing. The durability bugs in these programs have been confirmed by prior work [104].

The second set consists of programs with crash consistency bugs. They are unit-testing programs for durable data structures implemented in the Intel PMDK library. The unit tests are created by Intel developers to illustrate various scenarios under which crash consistency bugs occur. All of these crash consistency bugs have been confirmed by the developers.

4.3.2 Experimental Set-up

Since the only prior work on repairing PM bugs is Hippocrates [104], we focus on comparing our tool, PMBugAssist, with Hippocrates on all benchmark programs. Our experiments were designed to answer the following research questions.

• RQ 1: Is PMBugAssist more effective than Hippocrates in repairing the PM bugs?
• RQ 2: Is PMBugAssist efficient enough for computing repairs for the benchmark programs?
• RQ 3: Does PMBugAssist correctly compute repairs for the benchmark programs?

The experiments were conducted on a computer with an AMD Ryzen 5 5600X CPU and 32GB memory, running Ubuntu 20.04. We emulate and configure Persistent Memory for use in the DAX mode without a physical PM device, thanks to the emulation support from the Linux kernel. The emulation is based on DRAM, and we expect the results to be the same if our tool is run on a physical PM, e.g., an Intel Optane PM device.

4.3.3 Results for Answering RQ 1

First, we present the experimental results that answer RQ 1. They are shown in the last two columns of Table 4.2. Here, the first two columns show the benchmark name and the length of the original execution trace T. The last two columns show the effectiveness of the two repair methods: PMBugAssist and Hippocrates. Here, the symbol ✔ means that the method can repair the bug, whereas the symbol ✘ means that the method cannot repair the bug. For each suggested repair generated, we manually inspect and compare it with the developers’ fix and verify their correctness.

Table 4.2: Results of the experimental evaluation. The “# inst. added” columns report CLFLUSHOPT+SFENCE instructions.

                                       PMBugAssist (our method)                      Hippocrates
    Name                  Trace        time(s)    # inst.   inst.     repaired       repaired   # inst.   time(s)
                          Length                  added     reorder                             added
    obj_constructor       445,254      0.1        1+1       0         ✔              ✔          1+1       1.4
    obj_first_next        559,572      0.3        2+0       0         ✔              ✔          2+0       6.7
    obj_mem               566,129      18.4       11+0      0         ✔              ✔          210+0     47.1
    obj_memops            565,899      0.3        2+0       0         ✔              ✔          2+0       20.9
    obj_toid              419,186      0.1        3+0       0         ✔              ✔          3+0       1.6
    pmem_memcpy           17,008       0.3        4+0       0         ✔              ✔          4+0       5.8
    pmem_memmove          624          0.1        2+0       0         ✔              ✔          2+0       1.0
    pmem_memset-1         194          183.52     1+1       0         ✔              ✔          1+2       2.6
    pmem_memset-2         4,440        0.1        1+0       0         ✔              ✔          1+0       1.1
    pmemspoil             36           0.1        1+1       0         ✔              ✘          0+1       0.9
    rpmemd_db             8,993        0.1        1+1       0         ✔              ✔          1+1       0.8
    Recipe (2 bugs)       500,415      4.5        3+1       0         ✔              ✔          3+1       0.3
    Memcached (10 bugs)   200,939      1,790.2    9+1       0         ✔              ✔          10+6      0.3
    pmreorder_1           8            0.1        1+1       1         ✔              ✘          N/A       N/A
    pmreorder_2           8            0.1        1+1       2         ✔              ✘          N/A       N/A
    pmreorder_3           10           7.9        2+2       4         ✔              ✘          N/A       N/A
    pmreorder_4           10           7.9        2+2       4         ✔              ✘          N/A       N/A
    pmreorder_5           8            4.8        2+2       1         ✔              ✘          N/A       N/A
    pmreorder_6           8            22.9       2+2       4         ✔              ✘          N/A       N/A
    pmreorder_7           10           22.1       2+2       4         ✔              ✘          N/A       N/A
    pmreorder_8           12           3559.12    3+3       5         ✔              ✘          N/A       N/A
    pmreorder_stack_1     26           0.3        0+0       1         ✔              ✘          N/A       N/A
    pmreorder_stack_2     26           0.6        0+0       2         ✔              ✘          N/A       N/A
    pmreorder_flushes_1   35           3857.8     0+0       5         ✔              ✘          N/A       N/A
    pmreorder_flushes_2   35           539.0      0+0       6         ✔              ✘          N/A       N/A

The first thirteen rows of Table 4.2 are benchmark programs with 23 confirmed durability bugs. The last twelve rows are benchmark programs with 12 confirmed crash consistency bugs. The results in Table 4.2 show that PMBugAssist was able to repair all of the 35 bugs, while Hippocrates was able to repair 22 of the 23 durability bugs and none of the 12 crash consistency bugs.

We also show, in Table 4.2, the CLFLUSHOPT+SFENCE instructions added and the time taken by the two methods. Overall, our method added either the same number of instructions or fewer instructions. For obj_mem, our method used significantly fewer CLFLUSHOPT instructions than Hippocrates (11+0 versus 210+0) because multiple STORE operations share the same cache line. For Memcached, our method used fewer instructions (9+1 versus 10+6) because SFENCE may be shared by multiple STORE operations.
For pmemspoil, our manual inspection shows that Hippocrates’s repair is actually incorrect: at least one CLFLUSHOPT must be added.

While our method takes more time since it conducts the additional semantic analysis of the modified program, this is needed to discover new repair strategies; in contrast, Hippocrates only applies the predefined repair strategy for durability bugs but cannot repair crash consistency bugs. For pmem_memset-1, our method had a longer running time because the erroneous STORE residing in a loop showed up in the trace many times and thus slowed down our symbolic analysis. Overall, the time taken by our method is reasonable when compared to the alternative of relying on programmers to manually repair the bugs.

Recall that Hippocrates does not have the ability to assess the impact of reordering instructions on the program behavior. Since fixing durability bugs does not require reordering instructions, but fixing crash consistency bugs requires reordering instructions, Hippocrates can only repair the relatively simple durability bugs. In contrast, PMBugAssist has the ability to assess the impact of reordering instructions, and thus can repair the crash consistency bugs as well.

4.3.4 Results for Answering RQ 2

Now, we present the experimental results that answer RQ 2. There are two parts. The first part is shown in Column 2 of Table 4.1, which reports the program size. It shows that PMBugAssist is able to handle programs with reasonably large code sizes. For example, both Memcached and Recipe have more than 20K lines of C code. The second part is shown in Column 2 of Table 4.2, which reports the length of the execution trace. It shows that PMBugAssist is able to handle reasonably long execution traces.

Note that neither code size nor trace length is a reliable indicator of how hard the repair problem is.
For example, although the majority of durability bugs have traces with more than 100K instructions, the repair problems are often simple, because each (PT_I_i < N) constraint involves only one STORE instruction I_i, and many instructions in the trace are unrelated and thus may be ignored during the analysis. In contrast, while the crash consistency bugs have shorter traces, they have more complex interactions between the PC_I_i and PT_I_i variables and, as a result, have significantly larger search spaces.

For example, even with 10 to 30 instructions in the trace T, the total number of possible repairs in the solution space can be astronomically large (10! to 30!). This means that it is impossible for developers to enumerate the possible repairs manually. This is also the reason why SMT based symbolic analysis is needed.

Column 3 of Table 4.2 shows that our SMT based symbolic analysis is efficient in computing repairs. Except for obj_memops, all durability bugs were repaired in a few seconds. This is the case even for applications such as Memcached and Recipe, for which our repair method finished within 10 seconds. For obj_memops, it took 15 minutes because the program has a very large number of PM accesses and thus requires many SMT solver calls.

For crash consistency bugs, our method finished within seconds except for pmreorder_8, pmreorder_flushes_1 and pmreorder_flushes_2. For pmreorder_8, our method took longer because it went through more iterations in the while-loop, while adding 6 new PM instructions to the original execution trace (shown in Column 4) and reordering 4 instructions in the extended execution trace. For the last two benchmarks, pmreorder_flushes_1 and pmreorder_flushes_2, the reason is that there are more relevant instructions in the traces and more of these instructions need to be reordered to repair the bugs.

Our method also minimizes the number of SFENCE/CLFLUSHOPT instructions added (Section 3.2).
For durability bugs, the results are as efficient as the repairs generated by Hippocrates. For crash consistency bugs (which cannot be handled by Hippocrates), the efficiency of our repairs is shown in Column 4 of Table 4.2.

Table 4.3: The type of code block that our repair belongs to.

    Bug Type            Number of Bugs   Sequential   Branch In Scope   Branch Out of Scope
    Durability          23               20 (86%)     3 (14%)           0 (0%)
    Crash Consistency   12               12 (100%)    0 (0%)            0 (0%)

4.3.5 Results for Answering RQ 3

To answer RQ 3, we inspected the repairs computed by our method to see if they are also correct for other traces. Recall that, since each repair is computed from a single trace, in theory, there is no guarantee that the repair is correct also for other traces. However, our results show that for all the benchmarks in Table 4.2, our repairs are correct also for other traces.

The reason is that our repair almost always resides in a local code block, such that the code block (basic block) is either executed in its entirety by a trace, or not executed at all by the trace. An example would be the THEN-branch (or the ELSE-branch) of an If-statement. It is extremely rare for the STORE instructions and the corresponding CLFLUSHOPT and SFENCE instructions to be separated into different code blocks. As a result, a trace executes either all or none of the instructions involved in our repair.

Table 4.3 shows how often this easy-to-check sufficient condition is satisfied in practice. Here, a repair is called Sequential when all instructions fall into a straight-line code block, called Branch In Scope when all instructions fall into a branch of an If-statement, and called Branch Out of Scope when some are in a branch but others are outside of the branch. For Sequential and Branch In Scope, correctness of the repair is guaranteed for all traces.

Table 4.3 shows that, for durability bugs, 20 of the 23 repairs (86%) are Sequential and only 3 (14%) are Branch In Scope.
For crash consistency bugs, all 12 repairs (100%) are Sequential. Note that whether a repair is Sequential or Branch In Scope can be checked automatically using static program analysis.

4.3.6 Discussion of the Limitations

Since our method relies on the execution trace, as opposed to all instructions in a program, it has the limitations of trace-based analysis techniques. For example, while it guarantees to eliminate the property violation from the given trace, in theory, it provides no guarantee for other traces. Note that such limitations are shared by the trace-based PM bug detection tools, which are used to generate input for our repair method.

In practice, however, the repair computed from a trace is often correct for other traces as well, as shown by our results in Section 4.3.5. Even when that is not the case, limitations of trace-based techniques can be mitigated, to some extent, by leveraging automated testing tools [86] to generate a diverse set of execution traces, and then using these execution traces to detect and repair PM bugs. These tools can also be used to validate that PM bugs no longer exist in any path of the program after repair. However, such extensions are beyond the scope of our current work, which focuses on the core technique for diagnosing and repairing PM bugs automatically.

Similar to existing trace-based PM bug detection tools, our repair method does not handle concurrency bugs under weak memory models. This is a challenging problem that we leave for future work.

4.4 Conclusions

We have presented a method for automatically repairing both durability and crash consistency bugs in application software that leverages byte-addressable persistent memory. Our method relies on a novel SMT based symbolic analysis to first identify the valid and yet buggy executions allowed by the program, and then remove these executions through iterative addition of blocking constraints.
Due to the efficiency of the symbolic analysis over explicit enumeration, our method is able to explore possible repairs in a large solution space quickly. Our experiments on a diverse set of benchmark programs show that the proposed method is significantly more effective in repairing PM bugs than the state-of-the-art approach.

Chapter 5
Inferring Persistent Memory Related Properties

We propose a set of methods for automatically inferring PM-related properties. These methods rely on a combination of static and dynamic analysis techniques. Here, static analysis is used to compute the control and data dependencies of program statements, and then instrument these dependencies into the executable of the program. During the execution of the instrumented program, traces will be generated to log not only the PM-related machine instructions (e.g., LOAD and STORE) but also the dependencies between these instructions. Next, dynamic analysis is used to analyze the control dependencies between LOAD operations of the execution trace, to infer three types of PM-related properties: durability (DURA) properties, must-persist-before (MPB) properties, and must-persist-atomically (MPA) properties.

To evaluate the quality of these inferred properties, we have applied them to benchmark programs to detect PM bugs. Our experimental results show that, by leveraging our property inference method, significantly more bugs can be detected. Furthermore, compared to an existing method for PM property inference (Witcher [38]), our method generates PM properties with significantly higher quality. Specifically, using Witcher-generated properties often produces many bogus bugs, but using properties generated by our method does not produce bogus bugs.

5.1 Three Types of PM Properties

We are concerned with the following types of PM properties for our inference algorithm: durability properties, MPB properties, and MPA properties.
5.1.1 Durability

Durability properties require the data written to PM to eventually show up in PM media. Here, the word eventually indicates that in CPU hardware, voluntary flushing of data from cache to PM is non-deterministic in nature and thus cannot be relied upon to ensure durability. Instead, the programmers are responsible for using explicit flushing and proper fencing to ensure that data written to PM is persisted.

5.1.2 Must-Persist-Before (MPB)

MPB properties require a pair of data items written to PM to persist in a certain order. This is often needed to ensure that the program state stored in PM is always in a consistent state. For example, if a program uses a flag to indicate whether a data field is valid, then the data field must persist before the flag field. Otherwise, the PM program state may be inconsistent: the data field is not available, but the flag is set. This may lead to disaster because, while recovering from the power failure, a program may read the data only if the flag field is set. Since the PM is in an inconsistent state, the program will read a garbage value from the data field.

5.1.3 Must-Persist-Atomically (MPA)

MPA properties require a set of data items written to PM to persist atomically. That is, either all of the written values show up in PM media or none of them show up in PM media. This is often needed to ensure that a complex data structure stored in PM is always in a consistent state. For example, in a doubly-linked list, the node->next and node->prev fields must persist atomically. Otherwise, the integrity of the list (as stored in PM) will be lost. To enforce such MPA properties, the notion of a persistent transaction is needed. In many PM-related libraries, including PMDK [69], there are special APIs for programmers to implement persistent transactions.

5.2 Representations of PM Properties

We represent the DURA, MPB, and MPA properties as unary, binary, and n-ary relations over program statements that store data to PM.
Given a program P, let STMTs = {st_1, ..., st_n} be the set of all program statements in P. Furthermore, let STOREs ⊆ STMTs be the subset of program statements that write to PM, and LOADs ⊆ STMTs be the subset of program statements that read from PM.

Let CDEP(st', st) hold if and only if the program statement st is control-dependent on the program statement st'. An example for CDEP(st', st) is the code snippet if (x > 0) { y = 1; } else { y = 0; }, where the STORE instructions that write to y are control-dependent on the LOAD instruction that reads from x.

Similarly, let DDEP(st', st) hold if and only if the program statement st is data-dependent on the program statement st'. An example for DDEP(st', st) is the code snippet y = x+1;, where the data written to y comes from the data read from x.

5.2.1 Durability

A durability property is defined as a unary relation DURA(st) over a program statement st ∈ STOREs. It means that the data written to PM by st must persist in PM media eventually. As mentioned earlier, the voluntary cache flushing mechanism implemented in CPU hardware cannot guarantee the durability property. Instead, programmers must add CLFLUSHOPT and SFENCE instructions to enforce the durability property.

5.2.2 Must-Persist-Before (MPB)

An MPB property is defined as a binary relation MPB(st', st) over two program statements st', st ∈ STOREs. It means that the data written to PM by the preceding statement st' must persist before the data written to PM by the subsequent statement st.

Even if the CPU executes st' before executing st, the voluntary cache flushing mechanism implemented in CPU hardware cannot guarantee MPB(st', st). In fact, without proper use of CLFLUSHOPT and SFENCE instructions, the data written by st may appear in PM media before the data written by st', thus putting the PM into a potentially inconsistent state.
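To make the reordering hazard concrete, the following Python sketch enumerates which sets of stores may be present in PM media if a crash occurs at any point of a short trace. It is a deliberately simplified model and not the analysis used in this work: it assumes each store targets a distinct location, that a store is guaranteed to be persisted only after a CLFLUSHOPT on it followed by an SFENCE, and that any not-yet-fenced store may or may not have been evicted from the cache. The function names and the tuple encoding of events are hypothetical.

```python
from itertools import combinations


def reachable_pm_states(trace):
    """Return every set of stores that may be in PM media at a crash point.

    trace is a list of ("STORE", name), ("CLFLUSHOPT", name) or ("SFENCE",)
    tuples; this encoding is an assumption made for the sketch.
    """
    guaranteed = set()  # flushed and fenced: certainly in PM media
    pending = set()     # written, but possibly still only in the cache
    flushed = set()     # flushed, waiting for the next fence
    states = set()

    def record_crash_states():
        # Any subset of the pending stores may have been evicted voluntarily.
        candidates = sorted(pending)
        for k in range(len(candidates) + 1):
            for evicted in combinations(candidates, k):
                states.add(frozenset(guaranteed | set(evicted)))

    record_crash_states()  # crash before the first event
    for event in trace:
        kind = event[0]
        if kind == "STORE":
            pending.add(event[1])
        elif kind == "CLFLUSHOPT" and event[1] in pending:
            flushed.add(event[1])
        elif kind == "SFENCE":
            guaranteed |= flushed
            pending -= flushed
            flushed.clear()
        record_crash_states()
    return states


def can_violate_mpb(trace, first, second):
    """True if some crash state contains `second` without `first`."""
    return any(second in s and first not in s
               for s in reachable_pm_states(trace))


unordered = [("STORE", "data"), ("STORE", "flag")]
ordered = [("STORE", "data"), ("CLFLUSHOPT", "data"), ("SFENCE",),
           ("STORE", "flag"), ("CLFLUSHOPT", "flag"), ("SFENCE",)]
print(can_violate_mpb(unordered, "data", "flag"))
print(can_violate_mpb(ordered, "data", "flag"))
```

Under these assumptions, the trace without flushes and fences admits a crash state in which the flag is persisted but the data is not, whereas flushing and fencing the data before writing the flag rules that state out.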
5.2.3 Must-Persist-Atomically (MPA)

An MPA property is defined as a k-ary relation MPA(st_1, ..., st_k) over program statements st_1, ..., st_k ∈ STOREs. It means that the data written to PM by these statements must persist atomically. To help programmers enforce the MPA property, libraries such as PMDK provide APIs for implementing persistent transactions. As long as all the PM memory addresses touched by the program statements in MPA(st_1, ..., st_k) are added to the transactions, e.g., by using TX_BEGIN(), TX_END(), and TX_ADD(), data written to PM will persist atomically (all or none).

5.2.4 Relating DURA to MPB

DURA and MPB properties are related in the sense that DURA can be viewed as a special case of MPB as follows:

DURA(st) := MPB(st, st_last),

where st_last is an imaginary program statement that executes after all the other program statements; it can be used to represent the end of program execution. Therefore, MPB(st, st_last) means that the data written to PM by st always persists before the end of the program execution.

5.2.5 Relating MPB to MPA

MPB and MPA properties are also related in the sense that MPB properties can be viewed as the edges of a directed graph, whose strongly connected components (SCCs) represent the MPA properties.

5.2.5.1 The Graph

Let the graph G = (V, E), where V is the set of program statements in the MPB properties, and E is the set of edges, each of which represents an MPB property. Specifically, MPB(st', st) is represented by an edge from node st' to node st.

5.2.5.2 From SCCs to MPAs

A strongly connected component (SCC) in G represents a set of contradicting "must-persist-before (MPB)" requirements. For example, given a cycle st' → st → ... → st' in G, since each edge represents an MPB relation, the entire cycle means that the program statement st' must persist before itself, which is impossible. The only way to reconcile the set of contradicting MPB requirements is to group all the program statements in the SCC as a persistent transaction.
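The construction from MPB edges to MPA groups can be sketched in Python with a textbook SCC computation. Tarjan's algorithm is used here as one standard choice and is not necessarily the SCC procedure used in this work; the function name and the encoding of statements as strings are hypothetical. As an additional assumption of the sketch, components consisting of a single statement are dropped, since they contain no cycle of two or more statements and therefore express no contradiction.

```python
def infer_mpas(mpbs):
    """Group statements into MPA sets from MPB edges (before, after)."""
    graph = {}
    for before, after in mpbs:
        graph.setdefault(before, set()).add(after)
        graph.setdefault(after, set())

    index = {}        # discovery order of each visited node
    low = {}          # smallest discovery index reachable from the node
    stack = []        # nodes of the components still being explored
    on_stack = set()
    mpas = []

    def visit(node):
        index[node] = low[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for succ in graph[node]:
            if succ not in index:
                visit(succ)
                low[node] = min(low[node], low[succ])
            elif succ in on_stack:
                low[node] = min(low[node], index[succ])
        if low[node] == index[node]:
            # node is the root of an SCC: pop the whole component.
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            if len(component) > 1:  # assumption: skip trivial components
                mpas.append(frozenset(component))

    for node in sorted(graph):
        if node not in index:
            visit(node)
    return mpas


mpbs = [("node->next", "node->prev"),
        ("node->prev", "node->next"),
        ("data", "flag")]
print(infer_mpas(mpbs))
```

In this toy input, the stores to node->next and node->prev lie on a cycle of MPB edges, so they end up in one MPA group and would be wrapped in a single persistent transaction, while the acyclic pair (data, flag) remains an ordinary MPB property.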
In other words, these statements always persist together (either all or none).

5.3 Inferring Properties: The Theory

We now explain the theory behind our method for inferring PM properties. The implementation details will be presented in the next section.

5.3.1 The Life Cycle of a PM-based Program

To understand our method for inferring DURA, MPB, and MPA properties automatically, it is helpful to study the life cycle of a PM-based program. Due to power failure-induced crashes, a PM-based program often runs multiple times. Each time the program runs, it is called an execution instance. Depending on whether an execution instance writes to PM or reads from PM, it can be classified as either E_bc (which stands for the execution "before crash") or E_ac (which stands for the execution "after crash"). Note that the distinction between E_ac and E_bc is of value only at the conceptual level, because, in practice, an execution instance often has both read and write operations with respect to PM.

The life cycle is shown in Figure 5.1 with the following steps:
• Step 1. executing the program (E_bc)
• Step 2. storing data to PM
• Step 3. crash due to power failure
• Step 4. executing the program again (E_ac)
• Step 5. loading data from PM
• Step 6. go back to Step 1

Our method relies on the hypothesis that PM-specific properties are contracts between two execution instances: the execution before crash, denoted E_bc, and the execution after crash, denoted E_ac. Furthermore, properties inferred from E_ac should be enforced by E_bc. The rationale is that E_bc must save the program state to PM before the crash; unless it satisfies the PM properties demanded by E_ac, the integrity of the PM program state will be violated when E_ac tries to recover it.
[Figure 5.1: The life cycle of a PM-based program before and after crash. The diagram connects E_bc and E_ac through the persistent memory media, with edges numbered by Steps 1 to 6.]

5.3.2 Inferring Properties from LOADs

While the DURA, MPB, and MPA properties are requirements that must be satisfied by the STORE operations in Step 2, they will be inferred from the LOAD operations in Step 5. The reason is that, if the data stored in PM is corrupted or in an inconsistent program state, the impact will be reflected in the LOAD operations in Step 5. Thus, by carefully studying how data is read from the PM, and is then used in the program, we can infer the properties that must be enforced by the preceding STORE operations in Step 2.

5.3.3 Properties to Be Enforced by STOREs

For the example in Fig. 2.2 (a), reader() is E_ac, from which we can infer PTime(I_1) < PTime(I_2), where I_1 is the STORE of records[i].name and I_2 is the STORE of records[i].valid. The reason is that the read of records[i].name is control-dependent on the read of records[i].valid: the PM value of records[i].name is read only if the PM value of records[i].valid is set. In contrast, writer() is E_bc, which must enforce the property that I_1 always persists before I_2; in other words, the write to records[i].name must take effect in PM before the write to records[i].valid.

While the idea of inferring properties of E_bc from E_ac is straightforward in theory, there are significant challenges in practice. For example, in real applications, there is often no clear separation between writer() and reader(); instead, reads from PM and writes to PM are mixed together in the same function. Furthermore, PM values written by one function may be read by multiple functions, all of which must be taken into consideration.

5.3.4 Condition for Inferring PM Property

It is worth noting that, while we rely on control dependencies of LOAD instructions to infer properties that must be enforced by STORE instructions, we do not rely on data dependencies.
Recall that a classic example for data dependency is the code snippet y = x+1;, where the data written to y (STORE) comes from the data read from x (LOAD). Since there must be a preceding STORE that writes the data to x, it is tempting to infer an MPB property between the preceding STORE to x and the current STORE to y. However, what if the data stored to y will never be read from PM (and thus will never affect the behavior of the program)? In such a case, there is no need to enforce the MPB property. That is the reason why we avoid adding the MPB property too eagerly. Instead, we only add the MPB property when we have to.

Suppose that, subsequently, the data stored to y is actually read from PM. Then we can look at this subsequent LOAD of y, and see if it is dependent on the current LOAD of x. We add the MPB property (over STOREs of x and y) only if the two LOADs (of x and y) are dependent.

5.4 Implementation for Inferring Properties

In this section, we show how to implement the method described above. Toward this end, there are two obvious options.

5.4.1 Implementation Strategies

One option is using static program analysis. Due to computational overhead and scalability concerns, static analysis is inevitably approximate. For example, a typical compiler framework such as LLVM provides many existing APIs for computing program dependency graphs (PDGs) [89, 76, 33] and control-/data-dependencies between program statements. However, these analyses are over-approximated, which means that if we build our method using them, the inferred DURA, MPB, and MPA properties would also be over-approximated. While this approach has the advantage of not missing any real properties (and hence not missing any real bugs), it may lead to bogus properties (and hence bogus bugs).

The other option is using dynamic program analysis, which is often carried out on a program execution trace (or a set of traces).
As a result, it will be inherently under-approximated, which means that if we detect a bug, it is guaranteed to be a real bug. However, we may no longer be able to detect all real bugs since the set of inferred properties will be a subset of the complete set of PM properties.

5.4.2 Dynamic Analysis Augmented with Static Analysis

We focus on dynamic analysis based techniques for inferring PM properties. Nevertheless, we augment our dynamic analysis techniques with statically computed control-dependency information. Before presenting the detailed algorithm, we show the entire process in Figure 5.2.
• Step 1. We conduct static program analysis (LLVM opt) to compute the control dependencies between program statements
• Step 2. We instrument the dependencies into the executable of the program, and then leverage existing test cases to generate traces
• Step 3. We dynamically analyze the program execution (traces) to infer DURA and MPB properties
• Step 4. We leverage the inferred MPB properties to infer MPA properties
• Step 5 (optional). We demonstrate the effectiveness of our method by leveraging the inferred properties to detect PM bugs

Figure 5.2: A combination of static and dynamic analysis for PM property inference

5.5 The Overall Algorithm

Algorithm 4 shows the top-level procedure of our method for inferring and then checking PM properties. The input consists of the program P and a set of test cases (TestCases) for running the program. The output is a set of inferred properties (stored in DURAs, MPBs, and MPAs), together with a set of bugs detected by leveraging these properties (stored in Bugs).

Internally, our procedure goes through three steps. First, it instruments the program P and then runs the instrumented program together with the test cases, to generate a set of execution traces. Second, it traverses each execution trace T to infer the set of durability properties (stored in DURAs) and the set of MPB properties (stored in MPBs).
The inferred MPB properties are then used to infer the MPA properties (stored in MPAs). Third, it leverages the inferred properties to detect violations of the DURA, MPB, and MPA properties.

Algorithm 4: Inferring and Checking PM Properties.
    Traces ← InstrumentedExecution(P, TestCases)
    // Inferring requirements
    DURAs ← {}
    MPBs ← {}
    foreach T ∈ Traces do
        DURAs ← DURAs ∪ Infer_DURA_Requirements(T)
        MPBs ← MPBs ∪ Infer_MPB_Requirements(T)
    end foreach
    MPAs ← Infer_MPA_Requirements(MPBs)
    // Checking requirements
    Bugs ← {}
    foreach T ∈ Traces do
        Bugs ← Bugs ∪ Check_MPB_Requirements(T, MPBs)
        Bugs ← Bugs ∪ Check_MPA_Requirements(T, MPAs)
    end foreach
    return Bugs

5.5.1 Subroutine for Generating Traces

In this section, we show the process of generating traces for target PM programs.

5.5.1.1 Definitions

Let T = ev_1, ..., ev_n be an execution trace of the program P, where each event ev_i (where 1 ≤ i ≤ n) is an instance of executing an instruction. We assume that ev_i has the following fields:
• ev_i.type is one of the types: LOAD, STORE, CLFLUSHOPT, or SFENCE.
• ev_i.addr is the starting PM memory address of a STORE, LOAD, or CLFLUSHOPT instruction.
• ev_i.size is the size of the PM memory region accessed by a STORE, LOAD, or CLFLUSHOPT instruction.

Each event is an instance of executing a program statement. In this work, we are concerned with the following types of program statements:
• LOAD(addr, size)
• STORE(addr, size)
• CLFLUSHOPT(addr, size)
• SFENCE

Other PM-related instructions, such as CLWB and FLUSH, can be modeled using either CLFLUSHOPT or a combination of CLFLUSH and SFENCE.

5.5.1.2 Static Program Analysis on Control-Dependencies

To generate the execution traces needed by our property inference method, we first conduct a static analysis of the program, to compute the control and data dependencies of program statements and then instrument the dependencies into the program.
This would allow the instrumented program to generate the desired traces when it is executed together with the given test cases.

Our static program analysis is implemented using the LLVM compiler platform, by leveraging the existing APIs provided by LLVM. LLVM first constructs a program dependency graph (PDG) for the program, and then uses the PDG to compute the control- and data-dependencies. However, the dependencies computed by LLVM are restricted to each individual function (since by default, LLVM only does intra-procedural analysis). Therefore, we have to extend LLVM to conduct inter-procedural analysis.

One way to conduct inter-procedural analysis is to inline some of the functions, since the resulting code will contain not only the body of a function, but also the function bodies of the callees. However, since this approach increases the size of the program, it has to be used judiciously.

5.5.1.3 Dependency Instrumentation

After computing the control and data dependencies, which are binary relations defined over program statements, we need to instrument them into the program, to allow dependencies between program statements to be correlated to events in the execution traces.

This is accomplished in two steps. First, we store the CDEP information in a look-up table. The look-up table can tell whether any two program statements (st' and st) are control-dependent, i.e., they are control-dependent if and only if CDEP(st', st) holds. Next, we associate each event ev in the execution trace with a field ev.st, which represents the program statement that generates the event ev. Thus, if we want to know whether two LOAD events (ev' and ev) are control-dependent, we simply check if CDEP(ev'.st, ev.st) holds.

5.5.1.4 Test Case Example

In Listing 5.3, we show a test case example for the singly linked list. This test case initializes two lists list1 and list2 in lines 2 and 3.
Then it declares four variables a, b, c, and d in lines 5 to 8 and adds them to list1 in lines 12 to 15. In line 10, it declares another variable append with value 90. In line 17, it checks if the current size of list1 is four. In line 19, it declares an integer pointer variable last and lets it point to the last element in list1 in line 20. Then it checks whether the dereferenced value of last is equal to d. Then it adds append to list1 and checks if the size of list1 is five in lines 23 and 24. Next, it lets last point to the last element in list1 again in line 26 and checks if the dereferenced value of last is equal to append. Finally, both lists are destroyed in lines 29 and 30.

Similarly, we show two more test case examples for doubly linked list in Listing 5.4 and ring buffer in Listing 5.5. For the data structure benchmarks, our traces are generated from the test cases in the target library's unit tests.

5.5.2 Subroutine for Inferring Properties

Our method infers three types of properties by traversing the execution traces. In the following subsections, we present the details of inferring each type of property.

5.5.2.1 Subroutine for Inferring DURA Properties

Durability properties are easy to infer. While traversing the events of the execution trace, every time we encounter a STORE event ev, we add a DURA property DURA(ev.st) if the property is not already included in the set DURAs. Here, ev.st represents the program statement that generates the event ev. Note that multiple events may be generated from the same program statement.

5.5.2.2 Subroutine for Inferring MPB Properties

MPB properties must be inferred by identifying event pairs, such as (ev', ev), where ev' is the preceding LOAD event, ev is the subsequent LOAD event, and the program statement ev.st is control-dependent on the program statement ev'.st. The rationale behind this approach has been explained in Section 5.3.
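The two inference rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Event class is a hypothetical stand-in for the trace events (it keeps just the type and st fields), the CDEP look-up table is modeled as a set of statement pairs, and the statement identifiers are made up. The pairs are emitted as (ev'.st, ev.st), i.e., over the LOAD statements themselves; relating them to the STOREs that wrote the same PM locations is not shown.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Event:
    type: str  # "LOAD", "STORE", "CLFLUSHOPT", or "SFENCE"
    st: str    # identifier of the statement that generated the event


def infer_dura(trace):
    """One DURA property per statement that produced a STORE event."""
    return {ev.st for ev in trace if ev.type == "STORE"}


def infer_mpb(trace, cdep):
    """Pairs (ev'.st, ev.st) of LOAD statements where the later LOAD is
    control-dependent on the earlier one, according to the table cdep."""
    loads = [ev for ev in trace if ev.type == "LOAD"]
    mpbs = set()
    # combinations() keeps trace order, so `earlier` precedes `later`.
    for earlier, later in combinations(loads, 2):
        if (earlier.st, later.st) in cdep:
            mpbs.add((earlier.st, later.st))
    return mpbs


# Hypothetical trace: a writer stores name and valid, then a reader
# loads valid and, depending on it, loads name.
trace = [Event("STORE", "w_name"), Event("STORE", "w_valid"),
         Event("LOAD", "r_valid"), Event("LOAD", "r_name")]
cdep = {("r_valid", "r_name")}  # r_name is control-dependent on r_valid

print(infer_dura(trace))
print(infer_mpb(trace, cdep))
```

Because both functions return sets over statement identifiers rather than events, results from several traces can simply be unioned.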
Algorithm 5 shows the pseudo-code of our subroutine, which takes the execution trace T as input and returns a set of MPBs as output. Internally, it iterates through the pairs of LOAD events and checks the dependencies of the corresponding program statements. It is worth noting that the inferred MPBs are relations defined over program statements (instead of events in the trace).

Finally, as shown in Algorithm 4, the MPB properties inferred from different execution traces are unioned together. There is no technical difficulty in doing so because, after all, the MPB properties are binary relations defined over program statements.

In Listing 5.6, we show the code snippet for cc_list_add_last in doubly linked list; six MPBs are inferred in lines 34 to 35.

Algorithm 5: Subroutine Infer_MPB_Requirements(T).
    MPBs ← {}
    foreach pair (ev', ev) of LOAD events in the trace T do
        if ev is control-dependent on preceding ev' then
            let ev'.st be the program statement generating ev'
            let ev.st be the program statement generating ev
            MPB ← (ev'.st, ev.st)
            MPBs ← MPBs ∪ {MPB}
        end if
    end foreach
    return MPBs

In Listing 5.7, we show another example for the unlinkn function in doubly linked list; two MPBs are inferred in lines 30 to 31. In Listing 5.8, we show an example of an MPB inferred for deque; one MPB is inferred in line 20. In Listing 5.9, we show an example of an MPB inferred for ring buffer in cc_rbuf_enqueue; two MPBs are inferred in lines 19 and 20. In Listing 5.10, we show an example of an MPB inferred for array in cc_array_subarray; one MPB is inferred in line 34.

5.5.2.3 Subroutine for Inferring MPA Properties

While both DURA and MPB properties are inferred from the execution traces, MPA properties are not inferred from the traces. Instead, they are inferred from the existing MPB properties. At a high level, an MPA property represents a set of contradicting MPB properties, which together require a program statement st' to persist before itself.
Since a program statement cannot persist before itself, the set of MPB properties cannot be satisfied at the same time unless a transaction is used to include all of them. As mentioned in Section 5.2.5, the set of contradicting MPB properties can be computed by first defining a graph and then computing the strongly connected components in that graph.

Algorithm 6 shows the pseudo-code of our subroutine, which takes a set of MPBs as input and returns a set of MPAs as output. Internally, it first constructs the directed graph G, where nodes are program statements, and edges are MPB relations over these program statements. Then, it computes the SCCs. Finally, each SCC gives rise to an MPA.

Algorithm 6: Infer_MPA_Requirements(MPBs).
    MPAs ← {}
    let G be a directed graph where each MPB(st', st) is an edge from node st' to node st
    let SCCs be the set of strongly-connected components in G
    foreach SCC ∈ SCCs do
        MPA ← { st | st is a node in SCC }
        MPAs ← MPAs ∪ {MPA}
    end foreach
    return MPAs

In Listing 5.11, we show the code snippet for cc_list_add_last in doubly linked list; after calculating SCCs for MPBs, it shows an MPA in line 34. In Listing 5.12, we show an example for the swap_adjacent function in doubly linked list. In Listing 5.13, we show an example of an MPA inferred from the link_behind function in doubly linked list. In Listing 5.14, we show an example of an MPA inferred from the cc_deque_add_first function in deque.

5.5.3 Subroutine for Checking Properties

To detect violations of the inferred properties, we first traverse the events of an execution trace to compute the "persistent time interval" for each STORE event and then compare these persistent intervals against the DURA, MPB, and MPA properties.

5.5.3.1 Computing the Persistent Interval

Algorithm 7 shows our method for computing the persistent intervals, which takes an execution trace T as input and returns the map PI as output. For each STORE event ev, the persistent interval is stored in PI[ev].
Internally, our method scans through all events in T. In line 1, it initializes the persistent interval of each event ev with an empty upper bound and lower bound. In line 2, transaction indicates whether ev is inside a transaction. In lines 3 and 4, transaction_stores and transaction_add record the STORE events and the addresses that have been added to a transaction. transaction_begin_time holds the time at which a transaction_begin instruction is encountered. not_flushed_stores holds all STORE events that have not been flushed yet.

From lines 8 to 24, when ev is a STORE, the method checks if it is inside a transaction; if so, it is added to transaction_stores. Otherwise, it is added to not_flushed_stores. Its persistent time upper bound is set to its timestamp (program execution order). From lines 18 to 24, when ev is a CLFLUSHOPT, the method checks if the flushed address has a corresponding store in not_flushed_stores and sets its persistent interval lower bound to st_last.time+1, which means that it has been flushed but not properly fenced. From lines 25 to 30, when ev is an SFENCE, the method checks the stores in transaction_stores; if a store st is found with persistent interval lower bound equal to st_last.time+1, its persistent interval lower bound is updated to the current SFENCE's timestamp, and it is removed from not_flushed_stores.

In lines 33 to 35, the method sets transaction to True and transaction_begin_time to ev.time. From lines 37 to 49, for all stores added through TX_ADD, it sets the upper bound of the persistent interval to transaction_begin_time and the lower bound to the TX_END time. Then it clears all contents of transaction_stores and transaction_add. In lines 50 to 52, it adds all addresses seen in a transaction.

5.5.3.2 Detecting DURA Violations

In Algorithm 8, we detect DURA violations by traversing the STORE events (ev) in the execution trace T.
Whenever the upper bound of the persistent interval, PI[ev].UB, is not smaller than the lower bound of the persistent interval, PI[ev_last].LB, we report a DURA violation for the corresponding program statement ev.st.

5.5.3.3 Detecting MPB Violations

In Algorithm 9, for each pair of stores ev', ev that are related by an MPB property, we check their persistent time intervals; if the property is violated, the MPB is added to Bugs.

5.5.3.4 Detecting MPA Violations

In Algorithm 10, for each store ev that belongs to an MPA property, we check all st' in this MPA, which should all have the same persistent time interval; if the property is violated, the MPA is added to Bugs.

5.6 Experiments

We implemented our method using an LLVM [81] opt pass to obtain static control dependency information. Our method takes a target program and instruments dependency information on LLVM bitcode. When the instrumented test cases of the target are executed, they return a trace with PM-related instructions. Then we use Python scripts to analyze the traces and compute MPB and MPA relations. Finally, we check the generated traces with Python scripts to find PM property violations.

5.6.1 Benchmarks

Table 5.1 shows the benchmark statistics, including the name, the number of lines of C code (LoC), a short description, and the known PM bug type. These benchmark programs fall into two sets. The first set consists of data structure programs from a reputable repository of C data structure programs [109]. The first eight programs range from list and queue to ring buffer. We used their unit test cases to generate traces. The second set consists of real-world benchmarks, namely PM-enabled redis [53] and memcached [84]. We generated traces from the interaction between the server and the client.

5.6.1.1 Trace Statistics

In Table 5.2, we show the statistics of the traces we generated for the target programs. Columns 2 to 6 show the number of store, load, flush, and fence instructions, and the number of transaction blocks, for a target program.
For each data structure program, we have two versions: one is a manually constructed good version, e.g., cc_list-ok, in which PM instructions are properly added and used; the other is a bad version, e.g., cc_list-bad, in which some PM instructions are missing or misused. Note that there is only one version for Redis and Memcached, since we only use the original version of those two benchmarks. No transactions are used in Redis and Memcached.

5.6.2 Experimental Set-up

We focus on comparing our tool with PMTest [88] as a baseline for Must Persist Relation inference and bug detection on all benchmark programs, and on comparing the quality of our Must Persist Relations with that of the Likely-Correctness Conditions inferred by Witcher [38]. Note that we do not directly run PMTest and Witcher from their code. Specifically, we mimic how PMTest detects bugs without user-provided PM properties and how Witcher generates Likely-Correctness Conditions with its heuristic rules, and integrate both with our own tool.

Our experiments were designed to answer the following research questions.
• RQ 1: Is our method more effective than PMTest in inferring PM properties and detecting PM bugs when there are no user-provided PM requirements?
• RQ 2: Can our method infer better PM properties compared to Witcher?

The experiments were conducted on a computer with an AMD Ryzen 5 5600X CPU and 32GB memory, running Ubuntu 20.04.

5.6.3 Results for Answering RQ1

First, we present the experimental results that answer RQ 1. They are shown in Table 5.3. The first column shows benchmark names and the final entry is a total count. Columns 2 to 5 show the time our method takes to infer Dura, MPB, and MPA relations and the number of relations it inferred. Columns 6 to 9 show the time our method takes to detect Dura, MPB, and MPA bugs and how many bugs it detected. The final four columns show PM bug detection from PMTest without user-provided PM properties.
For data structure benchmarks, our inference procedure infers a relatively small number of Duras, MPBs, and MPAs. Note that only cc_list-bad/ok and cc_deque-bad/ok have a different number of MPBs inferred for the two versions; this is due to trace generation, since some traces capture more dependency information than others. All other data structure benchmarks have the same number of Dura and MPA relations inferred for the bad and ok versions. Note that, among the data structure benchmarks, we only inferred MPA relations for cc_list, cc_array, cc_deque, and cc_queue, and no more than three MPA relations for any of them. This is expected because MPA relations are calculated from MPBs, and after the SCC calculation their number should be much smaller than the number of MPBs. For Redis and Memcached, the number of Dura, MPB, and MPA relations generated is at most 66, which is small relative to their LoC. The inference procedure is fast: its total runtime is under 10 seconds. In total, 386 Dura, 280 MPB, and 18 MPA relations are inferred.

Columns 6 to 9 show the results for PM bug detection from our method. For the ok versions of the data structure benchmarks, we do not detect any Dura, MPB, or MPA violations, i.e., the traces are safe with respect to the Dura, MPB, and MPA relations we inferred. For the corresponding bad versions, our method detects Dura, MPB, and MPA violations. For Redis and Memcached, our method can also detect Dura, MPB, and MPA violations. The detection procedure of our method only takes 8 seconds, and in total 79 Dura, 67 MPB, and 9 MPA relation violations are detected.

From columns 10 to 13, we notice that PMTest, since it is tested without user-provided PM properties, can only detect Dura bugs (columns 7 and 11 are the same); it cannot detect any MPB or MPA bugs. The PMTest procedure takes 7.4 seconds and our detection method takes 7.8 seconds, which means our method does not impose much overhead for MPB and MPA bug detection.
These results confirm that our method is more effective than PMTest in inferring PM properties and detecting PM bugs when there are no user-provided PM requirements.

5.6.4 Results for Answering RQ 2

Here we present the experimental results that answer RQ 2, shown in Table 5.4. The first column shows the benchmark names. Columns 2 to 7 show the properties inferred by our method (columns 2 to 4) and Witcher (columns 5 to 7). Columns 8 to 13 show the real bugs detected by our method (columns 8 to 10) and Witcher (columns 11 to 13). Columns 14 to 19 show the bogus bugs reported by our method (columns 14 to 16) and Witcher (columns 17 to 19).

Columns 2 to 4 are identical to columns 3 to 5 in Table 5.3. Witcher inferred the same number of Dura relations as our method: 386 in both cases. However, Witcher inferred 1641 MPB relations and 35 MPA relations, nearly six times and about twice as many as our method, respectively. For some benchmarks, e.g., cc_stack and Memcached, Witcher generates more than ten times as many MPB relations as our method. This is due to Witcher's heuristic rules for generating MPBs, which consider not only control dependencies between PM variables but also data dependencies. Witcher does not calculate MPAs from MPBs; rather, it uses a separate heuristic rule. Witcher generates more MPB and MPA relations in total than our method; our MPB relations are a subset of Witcher's MPB relations, but our MPA relations are not a subset of Witcher's.

We manually confirmed that the Dura, MPB, and MPA violations detected by our tool, shown in columns 8 to 10, are real; these columns are identical to columns 7 to 9 in Table 5.3. For the data structure programs, Witcher detected the same real Dura and MPB bugs as our method. However, it missed 8 MPA bugs for cc_array-bad, cc_deque-bad, cc_queue-bad, Redis, and Memcached. Columns 14 to 16 show that our method does not report any bogus Dura, MPB, or MPA bugs.
However, Witcher reported 27 bogus MPB bugs and 23 bogus MPA bugs.

5.6.5 Discussion of the Limitations

One threat to validity for our method is the choice of test cases used for trace generation. If the test cases are not diverse, the resulting traces may be of low quality, and thus fewer MPB and MPA relations are inferred. However, our method does not focus on soundness, i.e., generating all MPBs and MPAs. We guarantee only that all the MPB and MPA relations generated are real ones. A more diverse set of traces is complementary to our method.

int main(int argc, char **argv) {
    cc_slist_new(&list1);
    cc_slist_new(&list2);

    int a = 8;
    int b = 3;
    int c = 20;
    int d = 1;

    int append = 90;

    cc_slist_add(list1, &a);
    cc_slist_add(list1, &b);
    cc_slist_add(list1, &c);
    cc_slist_add(list1, &d);

    CHECK_EQUAL_C_INT(4, cc_slist_size(list1));

    int *last;
    cc_slist_get_last(list1, (void*) &last);
    CHECK_EQUAL_C_INT(d, *last);

    cc_slist_add_last(list1, &append);
    CHECK_EQUAL_C_INT(5, cc_slist_size(list1));

    cc_slist_get_last(list1, (void*) &last);
    CHECK_EQUAL_C_INT(append, *last);

    cc_slist_destroy(list1);
    cc_slist_destroy(list2);
}
Figure 5.3: A test case example for singly linked list

int main(int argc, char **argv) {
    cc_list_new(&list1);
    cc_list_new(&list2);

    srand(time(NULL));

    /* populate the list with random data */
    int size = 50;
    int i;
    for (i = 0; i < size; i++) {
        int *e = malloc(sizeof(int));
        *e = rand() % 10001;
        cc_list_add(list1, e);
    }

    cc_list_sort_in_place(list1, cmp);

    CC_ListIter iter;
    cc_list_iter_init(&iter, list1);

    void *prev;
    void *e;
    cc_list_iter_next(&iter, &prev);
    while (cc_list_iter_next(&iter, &e) != CC_ITER_END) {
        CHECK_C(*((int*)prev) <= *((int*)e));
        prev = e;
    }
}
Figure 5.4: A test case example for doubly linked list

int main(int argc, char **argv) {
    stat = cc_rbuf_new(&rbuf);

    uint64_t items[10];
    memset(items, 0, sizeof(uint64_t)*10);
    srand((unsigned int) time(NULL));
    for (int i = 0; i < 10; i++) {
        uint64_t item = rand() % range + 1;
        items[i] = item;
        cc_rbuf_enqueue(rbuf, item);
    }

    CHECK_EQUAL_C_INT(items[0], cc_rbuf_peek(rbuf, 0));
    CHECK_EQUAL_C_INT(items[1], cc_rbuf_peek(rbuf, 1));

    uint64_t a = rand() % range + 1, b = rand() % range + 1;
    cc_rbuf_enqueue(rbuf, a);
    cc_rbuf_enqueue(rbuf, b);

    CHECK_EQUAL_C_INT(cc_rbuf_peek(rbuf, 0), a);
    CHECK_EQUAL_C_INT(cc_rbuf_peek(rbuf, 1), b);
    uint64_t out;
    cc_rbuf_dequeue(rbuf, &out);
    CHECK_EQUAL_C_INT(items[2], out);

    cc_rbuf_destroy(rbuf);

}
Figure 5.5: A test case example for ring buffer

enum cc_stat cc_list_add_last(CC_List *list, void *element)
{
    Node *node = list->mem_calloc(1, sizeof(Node));

    if (node == NULL)
        return CC_ERR_ALLOC;

    node->data = element;
    uni_FlushOpt(&node->data, sizeof(node->data));
    uni_Sfence();

    if (list->size == 0) {
        list->head = node;
        list->tail = node;
        uni_FlushOpt(&list->head, sizeof(list->head));
        uni_FlushOpt(&list->tail, sizeof(list->tail));
        uni_Sfence();
    } else {
        uni_TX_BEGIN();
        uni_TX_ADD(&node->prev);
        node->prev = list->tail;
        uni_TX_ADD(&list->tail->next);
        list->tail->next = node;
        uni_TX_ADD(&list->tail);
        list->tail = node;
        uni_TX_END();
    }
    list->size++;
    uni_FlushOpt(&list->size, sizeof(list->size));
    uni_Sfence();
    return CC_OK;
}
// MPB Inferred:
// MPB(node->data = element, list->size++)
// MPB(list->head, list->size++)
// MPB(list->tail, list->size++)
// MPB(node->prev = list->tail, list->size++)
// MPB(list->tail->next = node, list->size++)
// MPB(list->tail = node, list->size++)
Figure 5.6: MPBs inferred for cc_list_add_last function in doubly linked list

static void *unlinkn(CC_List *list, Node *node)
{
    void *data = node->data;
    uni_TX_BEGIN();
    if (node->prev != NULL) {
        uni_TX_ADD(&node->prev->next);
        node->prev->next = node->next;
    }
    if (node->prev == NULL) {
        uni_TX_ADD(&list->head);
        list->head = node->next;
    }
    if (node->next == NULL) {
        uni_TX_ADD(&list->tail);
        list->tail = node->prev;
    }
    if (node->next != NULL) {
        uni_TX_ADD(&node->next->prev);
        node->next->prev = node->prev;
    }
    uni_TX_END();

    list->mem_free(node);
    list->size--;
    uni_FlushOpt(&list->size, sizeof(list->size));
    uni_Sfence();
    return data;
}
// MPB Inferred:
// MPB(list->head = node->next, list->size--)
// MPB(list->tail = node->prev, list->size--)
Figure 5.7: MPBs inferred for unlinkn function in doubly linked list

enum cc_stat cc_deque_remove_last(CC_Deque *deque, void **out)
{
    if (deque->size == 0)
        return CC_ERR_OUT_OF_RANGE;

    size_t last = (deque->last - 1) & (deque->capacity - 1);
    void *element = deque->buffer[last];
    deque->last = last;
    uni_FlushOpt(&deque->last, sizeof(deque->last));
    uni_Sfence();
    deque->size--;
    uni_FlushOpt(&deque->size, sizeof(deque->size));
    uni_Sfence();
    if (out)
        *out = element;

    return CC_OK;
}
// MPB Inferred:
// MPB(deque->last = last, deque->size--)
Figure 5.8: MPBs inferred for cc_deque_remove_last function in deque

void cc_rbuf_enqueue(CC_Rbuf *rbuf, uint64_t item)
{
    if (rbuf->head == rbuf->tail)
        rbuf->tail = (rbuf->tail + 1) % rbuf->capacity;
    uni_FlushOpt(&rbuf->tail, sizeof(rbuf->tail));
    uni_Sfence();

    rbuf->buf[rbuf->head] = item;

    rbuf->head = (rbuf->head + 1) % rbuf->capacity;
    uni_FlushOpt(&rbuf->head, sizeof(rbuf->head));
    uni_Sfence();
    if (rbuf->size < rbuf->capacity)
        ++rbuf->size;
    uni_FlushOpt(&rbuf->size, sizeof(rbuf->size));
    uni_Sfence();
}
// MPB Inferred:
// MPB(rbuf->tail = (rbuf->tail + 1) % rbuf->capacity,
//     rbuf->head = (rbuf->head + 1) % rbuf->capacity)
Figure 5.9: MPBs inferred for cc_rbuf_enqueue function in ring_buffer

enum cc_stat cc_array_subarray(CC_Array *ar, size_t b, size_t e, CC_Array **out)
{
    ...
    /* Try to allocate the buffer */
    if (!(sub_ar->buffer = ar->mem_alloc(ar->capacity * sizeof(void*)))) {
        ar->mem_free(sub_ar);
        uni_FlushOpt(&sub_ar->buffer, sizeof(sub_ar->buffer));
        uni_Sfence();
        return CC_ERR_ALLOC;
    }
    uni_FlushOpt(&sub_ar->buffer, sizeof(sub_ar->buffer));
    uni_Sfence();
    sub_ar->mem_alloc = ar->mem_alloc;
    sub_ar->mem_calloc = ar->mem_calloc;
    sub_ar->mem_free = ar->mem_free;
    sub_ar->size = e - b + 1;
    sub_ar->capacity = sub_ar->size;
    uni_FlushOpt(&sub_ar->mem_alloc, sizeof(sub_ar->mem_alloc));
    uni_FlushOpt(&sub_ar->mem_calloc, sizeof(sub_ar->mem_calloc));
    uni_FlushOpt(&sub_ar->mem_free, sizeof(sub_ar->mem_free));
    uni_FlushOpt(&sub_ar->size, sizeof(sub_ar->size));
    uni_Sfence();
    ...
}
// MPB Inferred:
// MPB(sub_ar->buffer = ar->mem_alloc(ar->capacity * sizeof(void*)), sub_ar->size)
Figure 5.10: MPBs inferred for cc_array_subarray function in array

enum cc_stat cc_list_add_last(CC_List *list, void *element)
{
    Node *node = list->mem_calloc(1, sizeof(Node));

    if (node == NULL)
        return CC_ERR_ALLOC;

    node->data = element;
    uni_FlushOpt(&node->data, sizeof(node->data));
    uni_Sfence();

    if (list->size == 0) {
        list->head = node;
        list->tail = node;
        uni_FlushOpt(&list->head, sizeof(list->head));
        uni_FlushOpt(&list->tail, sizeof(list->tail));
        uni_Sfence();
    } else {
        uni_TX_BEGIN();
        uni_TX_ADD(&node->prev);
        node->prev = list->tail;
        uni_TX_ADD(&list->tail->next);
        list->tail->next = node;
        uni_TX_ADD(&list->tail);
        list->tail = node;
        uni_TX_END();
    }
    list->size++;
    uni_FlushOpt(&list->size, sizeof(list->size));
    uni_Sfence();
    return CC_OK;
}
// MPA Inferred:
// MPA(node->prev = list->tail; list->tail->next = node)
Figure 5.11: MPAs inferred for cc_list_add_last function in doubly linked list

static void swap_adjacent(Node *n1, Node *n2)
{
    if (n1->next == n2) {
        uni_TX_BEGIN();
        if (n2->next) {
            uni_TX_ADD(&n2->next->prev);
            n2->next->prev = n1;

        }
        uni_TX_ADD(&n1->next);
        n1->next = n2->next;

        if (n1->prev) {
            uni_TX_ADD(&n1->prev->next);
            n1->prev->next = n2;
        }
        uni_TX_ADD(&n2->prev);
        n2->prev = n1->prev;

        uni_TX_ADD(&n1->prev);
        n1->prev = n2;
        uni_TX_ADD(&n2->next);
        n2->next = n1;
        uni_TX_END();

        return;
    }
    ......
    // MPA inferred:
    // MPA(n1->next = n2->next; n2->prev = n1->prev; n2->next = n1)
}
Figure 5.12: MPAs inferred for swap_adjacent function in doubly linked list

void link_behind(Node *const base, Node *ins)
{
    uni_TX_BEGIN();
    /* link the gap */
    if (ins->next != NULL) {
        uni_TX_ADD(&ins->next->prev);
        ins->next->prev = ins->prev;
    }
    if (ins->prev != NULL) {
        uni_TX_ADD(&ins->prev->next);
        ins->prev->next = ins->next;
    }
    /* link behind */
    if (base->prev == NULL) {
        uni_TX_ADD(&ins->prev);
        ins->prev = NULL;
        uni_TX_ADD(&ins->next);
        ins->next = base;
        uni_TX_ADD(&base->prev);
        base->prev = ins;
    } else {
        uni_TX_ADD(&ins->prev);
        ins->prev = base->prev;
        uni_TX_ADD(&ins->prev->next);
        ins->prev->next = ins;
        uni_TX_ADD(&ins->next);
        ins->next = base;
        uni_TX_ADD(&base->prev);
        base->prev = ins;
    }
    uni_TX_END();
}
// MPA Inferred:
// MPA(ins->next->prev = ins->prev;
//     ins->prev->next = ins->next;
//     ins->prev = base->prev;
//     ins->prev->next = ins;
//     ins->next = base;
//     base->prev = ins;)
Figure 5.13: MPAs inferred for link_behind function in doubly linked list

enum cc_stat cc_deque_add_first(CC_Deque *deque, void *element)
{
    if (deque->size >= deque->capacity && expand_capacity(deque) != CC_OK)
        return CC_ERR_ALLOC;
    uni_TX_BEGIN();
    uni_TX_ADD(&deque->first);
    uni_TX_ADD(&deque->size);
    deque->first = (deque->first - 1) & (deque->capacity - 1);
    deque->buffer[deque->first] = element;
    deque->size++;
    uni_TX_END();
    return CC_OK;
}
// MPA Inferred:
// MPA(deque->first = (deque->first - 1) & (deque->capacity - 1);
//     deque->size++;)
Figure 5.14: MPAs inferred for cc_deque_add_first function in deque

Algorithm 7: Compute_Persistent_Interval(T).
PI ← {(ev, [ϕ, ϕ])}
transaction ← False
transaction_stores ← set()
transaction_add ← set()
transaction_begin_time ← 0
not_flushed_stores = []
foreach event ev ∈ T do
    if ev.type = STORE then
        if transaction then
            PI[ev].UB ← ev.time
            transaction_stores.add(ev)
        endif
        if ¬transaction then
            PI[ev].UB ← ev.time
            not_flushed_stores.append(ev)
        endif
    endif
    if ev.type = CLFLUSHOPT then
        foreach st ∈ not_flushed_stores do
            if st.addr = ev.addr then
                PI[st].LB ← st_last.time + 1
            endif
        endforeach
    endif
    if ev.type = SFENCE then
        foreach st ∈ not_flushed_stores do
            if PI[st].LB = st_last.time + 1 then
                PI[st].LB ← ev.time
                not_flushed_stores.remove(st)
            endif
        endforeach
    endif
    if ev.type = TX-BEGIN then
        transaction ← True
        transaction_begin_time ← ev.time
    endif
    if ev.type = TX-END then
        transaction ← False
        foreach addr ∈ transaction_add do
            foreach st ∈ transaction_stores do
                if st.addr = addr then
                    PI[st].UB ← transaction_begin_time
                    PI[st].LB ← ev.time
                endif
            endforeach
        endforeach
        transaction_stores.clear()
        transaction_add.clear()
    endif
    if ev.type = TX-ADD then
        transaction_add.add(ev.addr)
    endif
endforeach
return PI

Algorithm 8: Check_DURA_Requirements(T, DURAs).
Bugs ← {}
foreach STORE event ev ∈ T do
    if MPB(ev.st, st_last) ∈ MPBs then
        Bugs ← Bugs ∪ {MPB(ev′.st, ev.st)}
    endif
endforeach
return Bugs

Algorithm 9: Check_MPB_Requirements(T, MPBs).
Bugs ← {}
foreach STORE event ev′ ∈ T do
    foreach STORE event ev ∈ T do
        if MPB(ev′.st, ev.st) ∈ MPBs then
            Bugs ← Bugs ∪ {MPB(ev′.st, ev.st)}
        endif
    endforeach
endforeach
return Bugs

Algorithm 10: Check_MPA_Requirements(T, MPAs).
Bugs ← {}
foreach STORE event ev ∈ T do
    let st be the program statement generating ev
    if st ∈ MPA then
        foreach st′ ∈ MPA, st′ ≠ st do
            if PI[st′] ≠ PI[st] then
                Bugs ← Bugs ∪ {MPA(st, ..., st′)}
            endif
        endforeach
    endif
endforeach
return Bugs

Table 5.1: Statistics of the benchmarks.
Name             LoC     Description
cc_slist         2698    singly linked list [109]
cc_list          3208    doubly linked list [109]
cc_array         2171    array [109]
cc_stack         612     stack [109]
cc_queue         576     queue [109]
cc_deque         2575    double-ended queue [109]
cc_pqeque        618     priority queue [109]
cc_ring_buffer   316     ring buffer [109]
pmem-redis       83864   PM enabled redis server [53]
memcached-pmem   27331   PM enabled memcached server [84]

Table 5.2: Trace Statistics
Name                 # Store  # Load  # FLUSH  # FENCE  # TX
cc_slist-bad         166      244     103      64       19
cc_slist-ok          166      258     102      90       26
cc_list-bad          1466     3022    401      264      270
cc_list-ok           1555     3443    401      300      285
cc_array-bad         102      344     130      44       0
cc_array-ok          102      344     84       75       20
cc_stack-bad         27       84      24       8        0
cc_stack-ok          27       84      13       13       4
cc_deque-bad         181      565     201      104      0
cc_deque-ok          181      471     57       26       57
cc_queue-bad         124      303     128      48       0
cc_queue-ok          124      271     68       20       24
cc_pqeque-bad        70       367     0        0        0
cc_pqeque-ok         70       367     117      63       0
cc_ring_buffer-bad   144      443     117      99       0
cc_ring_buffer-ok    144      443     118      118      3
pmem-redis           367      558     8        8        N/A
memcached-pmem       230      1043    3        3        N/A

Table 5.3: Must Persist Relation Inference and Detection of Violation vs PMTest
                  | Our Method (infer)     | Our Method (detect)    | PMTest
Name              | Time(s) Dura MPB  MPA  | Time(s) Dura MPB  MPA  | Time(s) Dura MPB  MPA
                  |                        |         bugs bugs bugs |         bugs bugs bugs
cc_slist-bad      | 0.1     20   9    0    | 0.1     6    3    0    | 0.1     6    0    0
cc_slist-ok       | 0.1     20   9    0    | 0.1     0    0    0    | 0.1     0    0    0
cc_list-bad       | 2.4     44   45   3    | 2.5     5    10   1    | 2.4     5    0    0
cc_list-ok        | 2.8     44   39   3    | 3.0     0    0    0    | 2.9     0    0    0
cc_array-bad      | 0.1     16   15   1    | 0.1     3    9    1    | 0.1     3    0    0
cc_array-ok       | 0.1     16   15   1    | 0.1     0    0    0    | 0.1     0    0    0
cc_stack-bad      | 0.1     12   1    0    | 0.1     1    1    0    | 0.1     1    0    0
cc_stack-ok       | 0.1     12   1    0    | 0.1     0    0    0    | 0.1     0    0    0
cc_deque-bad      | 0.1     21   14   1    | 0.1     10   10   1    | 0.1     10   0    0
cc_deque-ok       | 0.1     21   13   1    | 0.1     0    0    0    | 0.1     0    0    0
cc_queue-bad      | 0.1     15   10   2    | 0.1     2    5    2    | 0.1     2    0    0
cc_queue-ok       | 0.1     15   10   2    | 0.1     0    0    0    | 0.1     0    0    0
cc_pqeque-bad     | 0.1     9    2    0    | 0.1     9    2    0    | 0.1     9    0    0
cc_pqeque-ok      | 0.1     9    2    0    | 0.1     0    0    0    | 0.1     0    0    0
cc_ring_buf-bad   | 0.1     14   8    0    | 0.1     2    3    0    | 0.1     2    0    0
cc_ring_buf-ok    | 0.1     14   8    0    | 0.1     0    0    0    | 0.1     0    0    0
redis             | 0.4     54   66   3    | 0.4     13   18   3    | 0.3     13   0    0
memcached         | 0.4     30   13   1    | 0.5     30   6    1    | 0.4     30   0    0
total             | 7.4     386  280  18   | 7.8     79   67   9    | 7.4     79   0    0

Table 5.4: Must Persist Relation vs Likely-Correctness Condition
                  | Properties Inferred              | Bugs Detected (real)             | Bugs Detected (bogus)
                  | Our Method     | Witcher         | Our Method     | Witcher         | Our Method     | Witcher
Name              | Dura MPB  MPA  | Dura MPB  MPA   | Dura MPB  MPA  | Dura MPB  MPA   | Dura MPB  MPA  | Dura MPB  MPA
cc_slist-bad      | 20   9    0    | 20   81   2     | 6    3    0    | 6    3    0     | 0    0    0    | 0    2    2
cc_slist-ok       | 20   9    0    | 20   81   2     | 0    0    0    | 0    0    0     | 0    0    0    | 0    0    2
cc_list-bad       | 44   45   3    | 44   245  5     | 5    10   1    | 5    10   1     | 0    0    0    | 0    0    2
cc_list-ok        | 44   39   3    | 44   217  5     | 0    0    0    | 0    0    0     | 0    0    0    | 0    0    2
cc_array-bad      | 16   15   1    | 16   50   1     | 3    9    1    | 3    9    0     | 0    0    0    | 0    0    1
cc_array-ok       | 16   15   1    | 16   48   1     | 0    0    0    | 0    0    1     | 0    0    0    | 0    0    1
cc_stack-bad      | 12   1    0    | 12   13   0     | 1    1    0    | 1    1    0     | 0    0    0    | 0    0    0
cc_stack-ok       | 12   1    0    | 12   12   0     | 0    0    0    | 0    0    0     | 0    0    0    | 0    0    0
cc_deque-bad      | 21   14   1    | 21   110  5     | 10   10   1    | 10   10   0     | 0    0    0    | 0    1    5
cc_deque-ok       | 21   13   1    | 21   121  5     | 0    0    0    | 0    0    0     | 0    0    0    | 0    0    3
cc_queue-bad      | 15   10   2    | 15   37   2     | 2    5    2    | 2    5    0     | 0    0    0    | 0    1    2
cc_queue-ok       | 15   10   2    | 15   34   2     | 0    0    0    | 0    0    0     | 0    0    0    | 0    0    1
cc_pqeque-bad     | 9    2    0    | 9    14   0     | 9    2    0    | 9    2    0     | 0    0    0    | 0    0    0
cc_pqeque-ok      | 9    2    0    | 9    12   0     | 0    0    0    | 0    0    0     | 0    0    0    | 0    0    0
cc_ring_buf-bad   | 14   8    0    | 14   23   1     | 2    3    0    | 2    3    0     | 0    0    0    | 0    0    1
cc_ring_buf-ok    | 14   8    0    | 14   23   1     | 0    0    0    | 0    0    0     | 0    0    0    | 0    0    1
redis             | 54   66   3    | 54   357  2     | 13   18   3    | 13   18   2     | 0    0    0    | 0    18   0
memcached         | 30   13   1    | 30   177  1     | 30   6    3    | 30   6    0     | 0    0    0    | 0    6    1
total             | 386  280  18   | 377  1641 35    | 79   67   11   | 67   66   4     | 0    0    0    | 0    27   23

Chapter 6
Inferring Properties for Concurrent Persistent Memory Programs

The techniques that we have presented so far (including both symbolic analysis for detecting and repairing PM bugs and trace-based analysis for inferring PM properties) are designed for sequential programs. However, it is feasible to extend these techniques to concurrent programs.
To understand why our symbolic analysis techniques can be extended to concurrent programs, we need to introduce three basic concepts: execution time, consistency time, and persistency time, which reside in three different dimensions.

6.1 The Three Dimensions

Given a program statement st,
• (1) Execution Time, denoted ETime(st), is the time when st is executed by the processor;
• (2) Consistency Time, denoted CTime(st), is the time when st takes effect in the volatile DRAM; and
• (3) Persistency Time, denoted PTime(st), is the time when st takes effect in PM.

As shown in Fig. 6.1, they represent three dimensions. Specifically, ETime(st) captures the sequential program behavior; CTime(st) captures concurrency-related behavior, e.g., depending on whether the processor has a relaxed memory model, writes by one thread may become visible to another thread in reverse order; and PTime(st), which is unique to PM, captures the behavior of the program during power failure and recovery.

Figure 6.1: The three dimensions.

6.2 Adding Concurrency to the Symbolic Analysis

In previous chapters, we have presented our symbolic analysis techniques for detecting and repairing PM bugs. While these techniques were designed for sequential programs, they can be extended to concurrent programs by adding a subformula $\Phi_{consistency}$ that encodes the concurrency-related constraints.

Recall that, for a sequential program, whether a PM property is violated or not can be formulated as a satisfiability (SAT) problem, where the formula $\Phi := \Phi_{program} \land \Phi_{persistency} \land \neg\Phi_{assertion}$ is satisfiable if and only if there exists a program execution that violates the given property (assertion). After adding concurrency-related constraints, $\Phi := \Phi_{program} \land \Phi_{consistency} \land \Phi_{persistency} \land \neg\Phi_{assertion}$, where $\Phi_{program}$ encodes the sequential program behavior, $\Phi_{consistency}$ encodes the consistency between threads, $\Phi_{persistency}$ encodes the persistency with respect to PM, and $\Phi_{assertion}$ encodes the PM property. Thus, $\neg\Phi_{assertion}$ means the program execution violates the property.

Figure 6.2: Constraint-based symbolic analysis.

As shown in Fig. 6.2, if $\Phi$ is unsatisfiable, there is no bug; if $\Phi$ is satisfiable, the solution to $\Phi$ may be used to generate a counterexample. For the running example in Fig. 2.2 (b), the formula $\Phi$ that encodes both the black solid edges and the red dashed edges is satisfiable, and the corresponding counterexample represents a buggy execution trace.

Thread 1        Thread 2
printf("%d", X);
int tmp1 = X
int tmp2 = tmp1 + 1
if (tmp2 > 0) {
    int tmp3 = 5
    Y = tmp3 + 6
}
Y = 0;
Figure 6.3: An example that illustrates the inter-thread control dependency.

While symbolic analysis is not the only technique that can detect PM bugs, it has clear advantages over conventional techniques based on testing or other non-symbolic techniques. The most obvious limitation of these existing techniques is that they are based on explicit enumeration of either traces or failure scenarios, and thus have low scalability or low coverage. Another limitation is that they are often designed to handle a specific type of PM property. In contrast, our method relies on the more efficient symbolic analysis (to avoid explicit enumeration), which can handle a large variety of PM properties in a uniform fashion.

6.3 Inter-thread Control Dependency

Figure 6.3 shows a code snippet where the LOAD in Thread 2 is control-dependent on the LOAD in Thread 1. However, the existing method only computes CDEP(ev′.st, ev.st) where both ev′ and ev come from the same thread, so it misses this dependency. In a concurrent program, ev′.st may control a STORE y instruction in Thread 1, while ev.st is controlled by a LOAD y instruction in Thread 2. As a result, ev.st in Thread 2 is transitively dependent on ev′.st in Thread 1. In the following two subsections, we explain how they are implemented.
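The transitive dependency described above can be sketched as a reachability computation over direct dependency edges. This is a minimal illustration under stated assumptions, not the thesis's implementation: the edge list, the statement labels (T1:branch, T1:store_y, T2:load_y, T2:ev_st), and both function names are hypothetical, and a reads-from edge between STORE y and LOAD y is assumed to be available from the trace.

```python
from collections import defaultdict, deque


def transitive_dependencies(direct_edges):
    """Map each statement to every statement it transitively depends on.

    direct_edges holds (dependent, source) pairs: control dependencies
    within a thread and reads-from (data) dependencies across threads.
    """
    graph = defaultdict(set)
    for dependent, source in direct_edges:
        graph[dependent].add(source)

    closure = {}
    for start in list(graph):
        seen, queue = set(), deque(graph[start])
        while queue:
            stmt = queue.popleft()
            if stmt in seen:
                continue
            seen.add(stmt)
            queue.extend(graph.get(stmt, ()))
        closure[start] = seen
    return closure


def inter_thread_pairs(closure, thread_of):
    """Keep only dependencies whose two endpoints run in different threads."""
    return sorted(
        (dependent, source)
        for dependent, sources in closure.items()
        for source in sources
        if thread_of[dependent] != thread_of[source]
    )


# Hypothetical labels modelling the scenario in the text.
edges = [
    ("T1:store_y", "T1:branch"),   # ev'.st controls STORE y in Thread 1
    ("T2:load_y", "T1:store_y"),   # LOAD y in Thread 2 reads from STORE y
    ("T2:ev_st", "T2:load_y"),     # LOAD y controls ev.st in Thread 2
]
thread_of = {
    "T1:branch": 1, "T1:store_y": 1,
    "T2:load_y": 2, "T2:ev_st": 2,
}
closure = transitive_dependencies(edges)
pairs = inter_thread_pairs(closure, thread_of)
```

In this toy input, T2:ev_st reaches T1:branch only through the chain of three direct edges, which is the inter-thread dependency that a purely intra-thread CDEP computation would miss.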
Table 6.1: ConcPMInvGen: Statistics of the benchmarks.
Name            LoC   Description and Citation
list-lock       484   lock based concurrent linked list [51]
list-lockfree   397   lock free concurrent linked list [51]

6.3.0.1 Computing Inter-Thread Control Dependency

If a pair of stores comes from two different threads, we utilize inter-thread dependency information calculated through a transitive closure of data and control dependencies for both RAM and PM variables. If such an inter-thread dependency relation is found for the load instructions corresponding to those two stores, the stores are added to the MPB relations.

6.4 Experiments

We implemented our method by using an LLVM [81] opt pass to obtain static control dependency information. Our method takes a target program and instruments dependency information on the LLVM bitcode. When the instrumented test cases are executed, they produce a trace of PM related instructions. Then, we use Python scripts to analyze the traces and compute MPB and MPA relations. Finally, we check the generated traces with Python scripts to find PM property violations.

6.4.1 Benchmarks

Table 6.1 shows the benchmark statistics, including the name, the number of lines of C code (LoC), and a short description. These benchmark programs contain two implementations of a concurrent linked list [51] based on Harris et al. [48]. The first row shows the lock based concurrent linked list and the second row shows the lock free concurrent linked list.

Table 6.2: ConcPMInvGen: Trace Statistics
Name                # Store  # Load  # FLUSH  # FENCE  # TX
list-lock-bad       293      792     96       93       4
list-lock-ok        298      797     91       88       4
list-lockfree-bad   285      543     0        0        0
list-lockfree-ok    285      570     0        0        5

6.4.1.1 Trace statistics

In Table 6.2, we show the statistics of the traces we generated for the target programs. Columns 2 to 6 show the number of store, load, flush, and fence instructions and the number of transaction blocks for each target program.
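Per-trace counts of the kind reported in Table 6.2 can be obtained with a simple tally over the trace. The sketch below assumes a hypothetical trace format, a list of dictionaries with a "type" field; the event type names are borrowed from Algorithm 7, a LOAD type is assumed by analogy, and a transaction block is assumed to be counted once at its TX-BEGIN event. The thesis does not specify the actual trace format.

```python
from collections import Counter


def trace_statistics(trace):
    """Count stores, loads, flushes, fences, and transaction blocks in a trace."""
    counts = Counter(event["type"] for event in trace)
    return {
        "STORE": counts["STORE"],
        "LOAD": counts["LOAD"],
        "FLUSH": counts["CLFLUSHOPT"],
        "FENCE": counts["SFENCE"],
        # Assumption: one transaction block per TX-BEGIN event.
        "TX": counts["TX-BEGIN"],
    }


# A tiny hypothetical trace: one flushed store and one transactional store.
sample_trace = [
    {"type": "LOAD", "addr": 0x10},
    {"type": "STORE", "addr": 0x10},
    {"type": "CLFLUSHOPT", "addr": 0x10},
    {"type": "SFENCE"},
    {"type": "TX-BEGIN"},
    {"type": "TX-ADD", "addr": 0x20},
    {"type": "STORE", "addr": 0x20},
    {"type": "TX-END"},
]
stats = trace_statistics(sample_trace)
```

Because Counter returns zero for missing keys, a trace with no flushes or fences, such as the lock free rows of Table 6.2, needs no special handling.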
For each data structure program, we have two versions: a manually constructed good version, e.g., list-lock-ok, in which PM instructions are properly added and used, and a bad version, e.g., list-lock-bad, in which some PM instructions are missing or misused. Note that for the lock free versions of the linked list we do not observe flushes and fences; only transaction blocks are used in the good version of the lock free concurrent linked list.

6.4.2 Experimental Set-up

We compare our tool with PMTest [88] as a baseline for Must Persist Relation inference and bug detection on all benchmark programs, and we compare the quality of our Must Persist Relations with that of the Likely-Correctness Conditions inferred by Witcher [38]. Note that we do not run PMTest and Witcher directly from their code. Instead, we mimic how PMTest detects bugs without user-provided PM properties and how Witcher generates Likely-Correctness Conditions with its heuristic rules, and we integrate both into our own tool. Our experiments were designed to answer the following research questions.

• RQ 1: Is our method more effective than PMTest in inferring PM properties and detecting PM bugs in concurrent PM programs when there are no user-provided PM requirements?
• RQ 2: Can our method infer better properties for concurrent PM programs than Witcher?

The experiments were conducted on a computer with an AMD Ryzen 5 5600X CPU and 32GB of memory, running Ubuntu 20.04.

Table 6.3: Inferring and Checking Violations of PM Properties
                    | Our Method (infer)     | Our Method (detect)    | PMTest
Name                | Time(s) Dura MPB  MPA  | Time(s) Dura MPB  MPA  | Time(s) Dura MPB  MPA
                    |                        |         bugs bugs bugs |         bugs bugs bugs
list-lock-bad       | 1.2     16   18   3    | 1.2     0    0    3    | 1.2     0    0    0
list-lock-ok        | 1.2     16   31   2    | 1.2     0    0    0    | 1.2     0    0    0
list-lockfree-bad   | 3.9     11   10   1    | 4.0     11   7    1    | 4.0     11   0    0
list-lockfree-ok    | 4.0     11   10   1    | 3.9     0    0    0    | 3.9     0    0    0
total               | 10.3    54   69   7    | 10.3    11   7    9    | 7.4     11   0    0

6.4.3 Results for Answering RQ 1

First, we present the experimental results that answer RQ 1.
They are shown in Table 6.3. The first column shows the benchmark names; the final row is a total count. Columns 2 to 5 show the time our method takes to infer Dura, MPB, and MPA relations and the number of relations inferred. Columns 6 to 9 show the time our method takes to detect Dura, MPB, and MPA bugs and the number of bugs detected. The final four columns show PM bug detection by PMTest without user-provided PM properties.

For the data structure benchmarks, our inference procedure infers a relatively small number of Dura, MPB, and MPA relations. Note that list-lock-bad and list-lock-ok differ in the number of MPB and MPA relations inferred; this is due to trace generation, since some traces capture more dependency information than others. The number of MPAs is expected to be small because MPA relations are calculated from MPBs and, after the SCC calculation, should be far fewer than the MPBs. For both benchmarks, the numbers of Dura, MPB, and MPA relations generated are each at most 31, which is small relative to their LoC. The inference procedure is fast: its total runtime is about 10 seconds. In total, 54 Dura, 69 MPB, and 7 MPA relations are inferred.

Table 6.4: ConcPMInvGen: Must Persist Relation vs Likely-Correctness Condition
                    | Properties Inferred              | Bugs Detected (real)             | Bugs Detected (bogus)
                    | Our Method     | Witcher         | Our Method     | Witcher         | Our Method     | Witcher
Name                | Dura MPB  MPA  | Dura MPB   MPA  | Dura MPB  MPA  | Dura MPB  MPA   | Dura MPB  MPA  | Dura MPB  MPA
list-lock-bad       | 16   18   3    | 16   701   4    | 0    0    3    | 0    0    0     | 0    0    0    | 0    37   4
list-lock-ok        | 16   31   2    | 16   707   4    | 0    0    0    | 0    0    0     | 0    0    0    | 0    41   4
list-lockfree-bad   | 11   10   1    | 11   841   2    | 11   7    1    | 11   7    0     | 0    0    0    | 0    321  2
list-lockfree-ok    | 11   10   1    | 11   853   2    | 0    0    0    | 0    0    0     | 0    0    0    | 0    311  2
total               | 54   69   7    | 54   3,102 12   | 11   7    4    | 11   7    0     | 0    0    0    | 0    710  12

Columns 6 to 9 show the results of PM bug detection by our method. For the ok versions of the data structure benchmarks, our method detects no Dura, MPB, or MPA violations: the traces are safe with respect to the Dura, MPB, and MPA relations we inferred.
Our method can detect Dura, MPB, and MPA violations on the corresponding bad versions of the data structures. The detection procedure takes only about 10 seconds, and a total of 11 Dura, 7 MPB, and 9 MPA relation violations are detected. Columns 10 to 13 show that PMTest, since it is run without user-provided PM properties, can detect only Dura bugs (columns 7 and 11 are the same); it cannot detect any MPB or MPA bugs. PMTest takes 7.4 seconds and our detection takes 10.3 seconds, which means our method does not impose much overhead for MPB and MPA bug detection. These results confirm that our method is more effective than PMTest in inferring PM properties and detecting PM bugs when there are no user-provided PM requirements.

6.4.4 Results for Answering RQ 2

Here we present the experimental results that answer RQ 2, shown in Table 6.4. The first column shows the benchmark names. Columns 2 to 7 show the properties inferred by our method (columns 2 to 4) and Witcher (columns 5 to 7). Columns 8 to 13 show the real bugs detected by our method (columns 8 to 10) and Witcher (columns 11 to 13). Columns 14 to 19 show the bogus bugs reported by our method (columns 14 to 16) and Witcher (columns 17 to 19).

Columns 2 to 4 are identical to columns 3 to 5 in Table 6.3. Witcher inferred the same number of Dura relations as our method: 54 in both cases. However, Witcher inferred 3,102 MPB relations and 12 MPA relations, roughly 45 times and 1.7 times as many as our method, respectively. For every benchmark, Witcher generates more than twenty times as many MPB relations as our method, e.g., 841 versus 10 for list-lockfree-bad.
This is due to Witcher’s heuristic rules for generating MPBs - they consider not only control dependency between PM variables, but also data dependencies and multithreading makes it worse for their heuristic rules. For MPAs, they do not calculate from MPBs; rather, they use another heuristics rule. Although the number of MPB and MPA relations generated from Witcher is strictly more than our method, our method’s MPB are a subset of their MPB relations, but MPAs are not. We manually confirm that our tool’s detection for Dura, MPB and MPA violations are real in columns 8 to 10, which is identical to identical to columns 7 to 9 in Table 5.3. For data structure programs, Witcher detected the same real bugs as ours for Dura and MPB bugs. However, they missed 8 MPA bugs for cc_- array-bad, cc_deque-bad, cc_queue-bad, Redis, and Memcached. In columns 14 to 16, we can see that our method does not report any bogus Dura, MPB, or MPA bugs. However, Witcher detected 710 MPB bogus bugs and 12 MPA bogus bugs, respectively. 6.4.5 CaseStudies In this section, we show two cases studies for MPA inferred in lock based and lock free concurrent linked lists. 6.4.5.1 AcasestudyinlockbasedconcurrentlinkedlistinPM For lock base linked list, in Listing 6.4, we show an MPA example where it needs transaction over stores inter functions. In new_node function, it mallocs new node and lock in lines 4 and 7. Then it initializes lock,data andnext fields in node. Inlist_add function, while it tries to add a new element to the list, if the list is empty from lines 22 to line 25, it requires a lock to ensure the atomicity. Our inter-procedural analysis on the lock based link list shows an example MPA inferred for stores on node->data, node->next, and elem->next. 
Notice that stores on node->data and node->next are 98 1 static node_t *new_node(val_t val, node_t *next) 2 { 3 /* allocate node */ 4 node_t *node = malloc(sizeof(node_t)); 5 6 /* allocate lock */ 7 node->lock = malloc(sizeof(ptlock_t)); 8 9 /* initialize the lock */ 10 INIT_LOCK(node->lock); 11 12 node->data = val; 13 node->next = next; 14 return node; 15 } 16 17 bool list_add(list_t *the_list, val_t val) 18 { 19 /* lock sentinel node */ 20 node_t *elem = the_list->head; 21 LOCK(elem->lock); 22 if (!elem->next) { /* the list is empty */ 23 node_t *new_elem = new_node(val, NULL); 24 elem->next = new_elem; 25 UNLOCK(elem->lock); 26 return true; 27 } 28 29 ... 30 } 31 // MPA inferred: 32 // MPA(node->data = val; 33 // node->next = next; 34 // elem->next = new_elem;) Figure 6.4: new_node and list_new in list-lock.c 99 from callingnew_node function in line 23, while store onelem->next is in line 24. That means stores on line 12, line 13 and line 24 must be protected by a transactional API. 6.4.5.2 AcasestudyinlockfreeconcurrentlinkedlistinPM For lock free linked list, in Listing 6.5, we show an MPA example where it needs transaction over stores inter functions. In new_node function, it mallocs new node in line 3. Then it initializes data and next fields in node. In list_add function, it first declares a new node called left with NULL value. Then, it declares a new nodenew_elem in line 12. In the while-loop starting from line 13, it first searches the right node byval: it takes a searchval and references of left node and right node for that keyval are obtained in line 14. This is to locate the pair of nodes between which the new node is to be inserted and updated thenew_elem in line 18. Note that in line 19, it requires a compare-and-swap (CAS), which exchanges the references inleft->nextright. Details of the non-blocking insertion can be found in Harris et al. 
[48]. Our inter-procedural analysis on the lock-free linked list shows an example MPA inferred for the stores on node->data, node->next, and new_elem->next. Notice that the stores on node->data and node->next come from the call to new_node in line 12, while the store on new_elem->next is in line 18. That means the stores in lines 4, 5, and 18 must be protected by a transactional API.

     1  static node_t *new_node(val_t val, node_t *next)
     2  {
     3      node_t *node = malloc(sizeof(node_t));
     4      node->data = val;
     5      node->next = next;
     6      return node;
     7  }
     8
     9  bool list_add(list_t *the_list, val_t val)
    10  {
    11      node_t *left = NULL;
    12      node_t *new_elem = new_node(val, NULL);
    13      while (1) {
    14          node_t *right = list_search(the_list, val, &left);
    15          if (right != the_list->tail && right->data == val)
    16              return false;
    17
    18          new_elem->next = right;
    19          if (CAS_PTR(&(left->next), right, new_elem) == right) {
    20              FAI_U32(&(the_list->size));
    21              return true;
    22          }
    23      }
    24  }
    25
    26  // MPA Inferred:
    27  // MPA(node->data = val;
    28  //     node->next = next;
    29  //     new_elem->next = right;)

Figure 6.5: new_node and list_add in list-lockfree.c

6.4.5.3 Observations

From the previous case studies on MPA inference for the lock-based and lock-free concurrent linked lists, we make two key observations. First, the MPAs inferred for both versions of the concurrent linked list for the same functionality, i.e., allocating new nodes and inserting nodes into the list, are similar: node->data and node->next should be made available atomically in PM, along with the new element’s next pointer. This ensures atomicity when updating the node’s data field and the two next pointers of the current node and the new node. Second, we observe that regardless of whether a lock is used, our method is capable of inferring quality MPAs for both lock-based and lock-free concurrent algorithms. In sum, we demonstrate a similarity between the MPAs inferred for the two kinds of concurrent linked lists.
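The atomic grouping that both inferred MPAs call for can be illustrated with a minimal sketch. It is not the transactional API used by the benchmarks. It is a small undo log simulated in ordinary volatile memory, with no flushes or fences, and it follows the shape of the lock-based MPA (two stores into the new node plus the store that links it). All names here (node_t without a lock field, tx_t, tx_begin, tx_store, tx_commit, tx_abort, insert_atomically) are hypothetical and introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node type mirroring the listings (lock field omitted). */
typedef struct node {
    int data;
    struct node *next;
} node_t;

#define TX_MAX_ENTRIES 8

/* One undo-log record: target address, width, and the bytes it held before. */
typedef struct {
    void *addr;
    size_t size;
    unsigned char old[sizeof(void *)];
} tx_entry_t;

typedef struct {
    tx_entry_t log[TX_MAX_ENTRIES];
    size_t count;
} tx_t;

static void tx_begin(tx_t *tx) { tx->count = 0; }

/* Save the old bytes in the log, then perform the store. */
static void tx_store(tx_t *tx, void *addr, const void *val, size_t size)
{
    assert(tx->count < TX_MAX_ENTRIES && size <= sizeof(void *));
    tx_entry_t *e = &tx->log[tx->count++];
    e->addr = addr;
    e->size = size;
    memcpy(e->old, addr, size);
    memcpy(addr, val, size);
}

/* Commit: the group is complete, so the undo records are discarded. */
static void tx_commit(tx_t *tx) { tx->count = 0; }

/* Abort (what recovery would do after a crash): undo stores in reverse order. */
static void tx_abort(tx_t *tx)
{
    while (tx->count > 0) {
        tx_entry_t *e = &tx->log[--tx->count];
        memcpy(e->addr, e->old, e->size);
    }
}

/* Insert `node` after `prev`, treating the three MPA stores as one group.
 * `crash` simulates a failure before commit.
 * Returns 1 if the group committed, 0 if it was rolled back. */
int insert_atomically(node_t *prev, node_t *node, int val, int crash)
{
    node_t *succ = prev->next;
    tx_t tx;

    tx_begin(&tx);
    tx_store(&tx, &node->data, &val, sizeof val);    /* node->data = val  */
    tx_store(&tx, &node->next, &succ, sizeof succ);  /* node->next = next */
    tx_store(&tx, &prev->next, &node, sizeof node);  /* elem->next = node */
    if (crash) {
        tx_abort(&tx);
        return 0;
    }
    tx_commit(&tx);
    return 1;
}
```

Calling insert_atomically with crash set to 1 leaves both the predecessor and the new node exactly as they were, while crash set to 0 makes all three stores visible together. That all-or-nothing behavior is what the MPA demands of the persistent state. A real PM implementation would additionally have to persist the log records themselves before each store, which this sketch deliberately leaves out.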
Specifically, our method penetrates through the critical region and analyzes the intra-thread read-read control dependencies and the inter-thread data and control dependencies through transitive closure computation.

Chapter 7 Conclusion and Future Work

7.1 Conclusion

In this thesis, I investigated the following research tasks, all centered on the objective of automatically detecting and repairing PM bugs with an SMT based symbolic method: designing a new SMT based symbolic analysis method for PM bug detection; leveraging the SMT based symbolic method for automatic PM bug repair; inferring PM related properties by static and dynamic analysis techniques; and inferring PM related properties in concurrent PM programs.

I have presented a method for detecting and automatically repairing both durability and crash consistency bugs in programs that leverage byte-addressable persistent memory, and implemented it in a tool called PMBugAssist. The newly designed encoding can model PM program behavior accurately and detect violations of PM properties such as durability and crash consistency. For repairing PM bugs, the new method relies on a novel SMT-based symbolic analysis to first identify the valid and buggy executions allowed by the program, and then remove the buggy executions through iterative addition of blocking constraints and reordering or adding new instructions. Due to the efficiency of symbolic analysis over explicit enumeration, our method is able to explore possible repairs in a large solution space quickly. Experimental results on a diverse set of benchmark programs show that PMBugAssist is as capable as the state-of-the-art approach at detecting PM bugs and significantly more effective at repairing them.

I use a combination of static and dynamic program analysis techniques to automatically infer PM related properties.
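To make the durability and must-persist-before properties named in this conclusion concrete, the following minimal sketch checks them on an explicit event trace. It is not the SMT encoding used by PMBugAssist. It is a direct walk over a trace, under simplifying assumptions that are mine: each location is stored at most once, a store counts as persisted at the first fence that follows a flush of its location, and must-persist-before(a, b) is read conservatively as "a is persisted before b is issued". The names event_t, persist_point, durability_ok, and mpb_ok are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { EV_STORE, EV_FLUSH, EV_FENCE } ev_kind_t;

typedef struct {
    ev_kind_t kind;
    int addr;   /* abstract PM location; ignored for fences */
} event_t;

/* Index of the fence at which the store at trace[s] is persisted, or -1
 * if no flush of its location followed by a fence appears after it. */
long persist_point(const event_t *trace, size_t n, size_t s)
{
    assert(s < n && trace[s].kind == EV_STORE);
    int flushed = 0;
    for (size_t i = s + 1; i < n; i++) {
        if (trace[i].kind == EV_FLUSH && trace[i].addr == trace[s].addr)
            flushed = 1;
        else if (trace[i].kind == EV_FENCE && flushed)
            return (long)i;
    }
    return -1;
}

/* Durability: every store in the trace reaches a persist point. */
int durability_ok(const event_t *trace, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (trace[i].kind == EV_STORE && persist_point(trace, n, i) < 0)
            return 0;
    return 1;
}

/* Must-persist-before: store a is persisted before store b is issued. */
int mpb_ok(const event_t *trace, size_t n, size_t a, size_t b)
{
    long p = persist_point(trace, n, a);
    return p >= 0 && (size_t)p < b;
}
```

On a trace such as store 1, flush 1, fence, store 2, flush 2, fence, both checks pass for the pair (store 1, store 2). Dropping the flush and fence for location 1 makes both checks fail. An explicit walk like this inspects one trace at a time, whereas the symbolic method summarized above reasons about the set of executions allowed by the program.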
I compute the control and data dependencies of program statements by static analysis, and then instrument these dependencies into the executable of the program. When the instrumented program is executed on test cases, it generates traces with log information, including the PM related machine instructions and the dependencies between these instructions. Dynamic analysis is then used to analyze the control dependencies between load operations from PM on the execution trace, in order to infer three types of PM related properties: durability properties, must-persist-before properties, and must-persist-atomically properties. The experimental results show that, by leveraging our property inference method, significantly more bugs can be detected.

I further extend the inference method to concurrent PM programs by adding concurrency information to the trace, and generate must-persist-before and must-persist-atomically properties tailored to multi-threaded programs. The experiments show that our inference method can handle concurrency on top of persistency and generate useful, high-quality PM properties.

In summary, I test and confirm the following hypothesis.

A unified framework that leverages SMT solver based symbolic analysis techniques can help automate the process of detecting and repairing PM bugs.

7.2 Future work

There are future directions worth investigating. Our constraint based symbolic method for PM bug detection and repair relies on dynamic trace generation. This means that, theoretically, we cannot cover all program behaviors. Methods such as symbolic execution [105, 123, 25, 15, 115, 19, 118, 12] and fuzz testing [86, 97, 26, 42, 8, 35, 82, 17, 9] may help generate a more diverse set of traces. Our proposed method does not guarantee soundness, i.e., that it does not miss any PM bugs.
This means a method with a soundness guarantee, such as abstract interpretation [24], can help with bug detection [129, 2, 121, 128, 126, 78, 79] and invariant inference [100, 98, 90, 91, 101, 122, 99], but it may also produce a high rate of false alarms if refinement is not properly designed. The target programs can be, e.g., PM data structures [70, 36, 27, 22, 14].

Bibliography

[1] Jade Alglave, Daniel Kroening, Vincent Nimal, and Daniel Poetzl. “Don’t Sit on the Fence: A Static Analysis Approach to Automatic Fence Insertion”. In: ACM Trans. Program. Lang. Syst. 39.2 (2017), 6:1–6:38. doi: 10.1145/2994593.
[2] Martin Helmut Alt, Christian Ferdinand, Florian Martin, and Reinhard Wilhelm. “Cache Behavior Prediction by Abstract Interpretation”. In: Static Analysis, Third International Symposium, SAS’96, Aachen, Germany, September 24-26, 1996, Proceedings. Ed. by Radhia Cousot and David A. Schmidt. Vol. 1145. Lecture Notes in Computer Science. Springer, 1996, pp. 52–66. doi: 10.1007/3-540-61739-6_33.
[3] Rajeev Alur, Rastislav Bodík, Garvit Juniwal, Milo M. K. Martin, Mukund Raghothaman, Sanjit A. Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, and Abhishek Udupa. “Syntax-guided synthesis”. In: Formal Methods in Computer-Aided Design, FMCAD 2013, Portland, OR, USA, October 20-23, 2013. 2013, pp. 1–8.
[4] Haniel Barbosa, Clark W. Barrett, Martin Brain, Gereon Kremer, Hanna Lachnitt, Makai Mann, Abdalrhman Mohamed, Mudathir Mohamed, Aina Niemetz, Andres Nötzli, Alex Ozdemir, Mathias Preiner, Andrew Reynolds, Ying Sheng, Cesare Tinelli, and Yoni Zohar. “cvc5: A Versatile and Industrial-Strength SMT Solver”. In: Tools and Algorithms for the Construction and Analysis of Systems - 28th International Conference, TACAS 2022, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022, Munich, Germany, April 2-7, 2022, Proceedings, Part I. Ed. by Dana Fisman and Grigore Rosu. Vol. 13243. Lecture Notes in Computer Science.
Springer, 2022, pp. 415–442. doi: 10.1007/978-3-030-99524-9_24.
[5] Ivan Beschastnikh, Yuriy Brun, Michael D. Ernst, and Arvind Krishnamurthy. “Inferring models of concurrent systems from logs of their behavior with CSight”. In: 36th International Conference on Software Engineering, ICSE ’14, Hyderabad, India - May 31 - June 07, 2014. Ed. by Pankaj Jalote, Lionel C. Briand, and André van der Hoek. ACM, 2014, pp. 468–479. doi: 10.1145/2568225.2568246.
[6] Ivan Beschastnikh, Yuriy Brun, Sigurd Schneider, Michael Sloan, and Michael D. Ernst. “Leveraging existing instrumentation to automatically infer invariant-constrained models”. In: SIGSOFT/FSE’11 19th ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE-19) and ESEC’11: 13th European Software Engineering Conference (ESEC-13), Szeged, Hungary, September 5-9, 2011. Ed. by Tibor Gyimóthy and Andreas Zeller. ACM, 2011, pp. 267–277. doi: 10.1145/2025113.2025151.
[7] Dirk Beyer, Matthias Dangl, and Philipp Wendler. “Correction to: A Unifying View on SMT-Based Software Verification”. In: J. Autom. Reason. 65.3 (2021), p. 461. doi: 10.1007/s10817-020-09585-6.
[8] Marcel Bohme, Van-Thuan Pham, and Abhik Roychoudhury. “Coverage-based Greybox Fuzzing as Markov Chain”. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, October 24-28, 2016. Ed. by Edgar R. Weippl, Stefan Katzenbeisser, Christopher Kruegel, Andrew C. Myers, and Shai Halevi. ACM, 2016, pp. 1032–1043. doi: 10.1145/2976749.2978428.
[9] Luca Borzacchiello, Emilio Coppa, and Camil Demetrescu. “Fuzzing Symbolic Expressions”. In: 43rd IEEE/ACM International Conference on Software Engineering, ICSE 2021, Madrid, Spain, 22-30 May 2021. IEEE, 2021, pp. 711–722. doi: 10.1109/ICSE43902.2021.00071.
[10] Brad Fitzpatrick et al. Memcached. 2022. url: https://www.memcached.org.
[11] Jacob Burnim and Koushik Sen. “DETERMIN: inferring likely deterministic specifications of multithreaded programs”.
In: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1, ICSE 2010, Cape Town, South Africa, 1-8 May 2010. Ed. by Jeff Kramer, Judith Bishop, Premkumar T. Devanbu, and Sebastián Uchitel. ACM, 2010, pp. 415–424. doi: 10.1145/1806799.1806860.
[12] Cristian Cadar, Daniel Dunbar, and Dawson R. Engler. “KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs”. In: 8th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2008, December 8-10, 2008, San Diego, California, USA, Proceedings. Ed. by Richard Draves and Robbert van Renesse. USENIX Association, 2008, pp. 209–224. url: http://www.usenix.org/events/osdi08/tech/full_papers/cadar/cadar.pdf.
[13] Cristian Cadar, Patrice Godefroid, Sarfraz Khurshid, Corina S. Pasareanu, Koushik Sen, Nikolai Tillmann, and Willem Visser. “Symbolic execution for software testing in practice: preliminary assessment”. In: Proceedings of the 33rd International Conference on Software Engineering, ICSE 2011, Waikiki, Honolulu, HI, USA, May 21-28, 2011. Ed. by Richard N. Taylor, Harald C. Gall, and Nenad Medvidovic. ACM, 2011, pp. 1066–1071. doi: 10.1145/1985793.1985995.
[14] Wentao Cai, Haosen Wen, Vladimir Maksimovski, Mingzhe Du, Rafaello Sanna, Shreif Abdallah, and Michael L. Scott. “Fast Nonblocking Persistence for Concurrent Data Structures”. In: 35th International Symposium on Distributed Computing, DISC 2021, October 4-8, 2021, Freiburg, Germany (Virtual Conference). Ed. by Seth Gilbert. Vol. 209. LIPIcs. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021, 14:1–14:20. doi: 10.4230/LIPIcs.DISC.2021.14.
[15] João Carlos Menezes Carreira, Rodrigo Rodrigues, George Candea, and Rupak Majumdar. “Scalable testing of file system checkers”. In: European Conference on Computer Systems, Proceedings of the Seventh EuroSys Conference 2012, EuroSys ’12, Bern, Switzerland, April 10-13, 2012. Ed. by Pascal Felber, Frank Bellosa, and Herbert Bos.
ACM, 2012, pp. 239–252. doi: 10.1145/2168836.2168861.
[16] Pavol Černý, Thomas A. Henzinger, Arjun Radhakrishna, Leonid Ryzhyk, and Thorsten Tarrach. “Efficient Synthesis for Concurrency by Semantics-Preserving Transformations”. In: Computer Aided Verification - 25th International Conference, CAV 2013, Saint Petersburg, Russia, July 13-19, 2013. Proceedings. Ed. by Natasha Sharygina and Helmut Veith. Vol. 8044. Lecture Notes in Computer Science. Springer, 2013, pp. 951–967. doi: 10.1007/978-3-642-39799-8_68.
[17] Ju Chen, Jinghan Wang, Chengyu Song, and Heng Yin. “JIGSAW: Efficient and Scalable Path Constraints Fuzzing”. In: 43rd IEEE Symposium on Security and Privacy, SP 2022, San Francisco, CA, USA, May 22-26, 2022. IEEE, 2022, pp. 18–35. doi: 10.1109/SP46214.2022.9833796.
[18] Zhangyu Chen, Yu Hua, Yongle Zhang, and Luochangqi Ding. “Efficiently detecting concurrency bugs in persistent memory programs”. In: ASPLOS ’22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022 - 4 March 2022. Ed. by Babak Falsafi, Michael Ferdman, Shan Lu, and Thomas F. Wenisch. ACM, 2022, pp. 873–887.
[19] Vitaly Chipounov, Volodymyr Kuznetsov, and George Candea. “S2E: a platform for in-vivo multi-path analysis of software systems”. In: Proceedings of the 16th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2011, Newport Beach, CA, USA, March 5-11, 2011. Ed. by Rajiv Gupta and Todd C. Mowry. ACM, 2011, pp. 265–278. doi: 10.1145/1950365.1950396.
[20] Kyeongmin Cho, Sung Hwan Lee, Azalea Raad, and Jeehoon Kang. “Revamping hardware persistency models: view-based and axiomatic persistency models for Intel-x86 and Armv8”. In: PLDI ’21: 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, Virtual Event, Canada, June 20-25, 2021. Ed. by Stephen N. Freund and Eran Yahav. ACM, 2021, pp. 16–31.
[21] Alessandro Cimatti, Alberto Griggio, Bastiaan Joost Schaafsma, and Roberto Sebastiani. “The MathSAT5 SMT Solver”. In: Tools and Algorithms for the Construction and Analysis of Systems - 19th International Conference, TACAS 2013, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2013, Rome, Italy, March 16-24, 2013. Proceedings. Ed. by Nir Piterman and Scott A. Smolka. Vol. 7795. Lecture Notes in Computer Science. Springer, 2013, pp. 93–107. doi: 10.1007/978-3-642-36742-7_7.
[22] Nachshon Cohen, David T. Aksun, Hillel Avni, and James R. Larus. “Fine-Grain Checkpointing with In-Cache-Line Logging”. In: Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2019, Providence, RI, USA, April 13-17, 2019. Ed. by Iris Bahar, Maurice Herlihy, Emmett Witchel, and Alvin R. Lebeck. ACM, 2019, pp. 441–454.
[23] Compute Express Link. 2023. url: https://www.computeexpresslink.org/about-cxl.
[24] Patrick Cousot and Radhia Cousot. “Abstract Interpretation: A Unified Lattice Model for Static Analysis of Programs by Construction or Approximation of Fixpoints”. In: Conference Record of the Fourth ACM Symposium on Principles of Programming Languages, Los Angeles, California, USA, January 1977. Ed. by Robert M. Graham, Michael A. Harrison, and Ravi Sethi. ACM, 1977, pp. 238–252. doi: 10.1145/512950.512973.
[25] Heming Cui, Gang Hu, Jingyue Wu, and Junfeng Yang. “Verifying systems rules using rule-directed symbolic execution”. In: Architectural Support for Programming Languages and Operating Systems, ASPLOS 2013, Houston, TX, USA, March 16-20, 2013. Ed. by Vivek Sarkar and Rastislav Bodik. ACM, 2013, pp. 329–342. doi: 10.1145/2451116.2451152.
[26] David Drysdale. Coverage-guided kernel fuzzing with syzkaller. 2023. url: https://github.com/google/syzkaller.
[27] John Derrick, Simon Doherty, Brijesh Dongol, Gerhard Schellhorn, and Heike Wehrheim.
“Verifying correctness of persistent concurrent data structures: a sound and complete method”. In: Formal Aspects Comput. 33.4 (2021), pp. 547–573.
[28] Bang Di, Jiawen Liu, Hao Chen, and Dong Li. “Fast, flexible, and comprehensive bug detection for persistent memory programs”. In: ASPLOS ’21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Virtual Event, USA, April 19-23, 2021. Ed. by Tim Sherwood, Emery D. Berger, and Christos Kozyrakis. ACM, 2021, pp. 503–516.
[29] Bruno Dutertre. “Yices 2.2”. In: Computer Aided Verification - 26th International Conference, CAV 2014, Held as Part of the Vienna Summer of Logic, VSL 2014, Vienna, Austria, July 18-22, 2014. Proceedings. Ed. by Armin Biere and Roderick Bloem. Vol. 8559. Lecture Notes in Computer Science. Springer, 2014, pp. 737–744. doi: 10.1007/978-3-319-08867-9_49.
[30] Michael D. Ernst, Jake Cockrell, William G. Griswold, and David Notkin. “Dynamically Discovering Likely Program Invariants to Support Program Evolution”. In: IEEE Trans. Software Eng. 27.2 (2001), pp. 99–123. doi: 10.1109/32.908957.
[31] Michael D. Ernst, Adam Czeisler, William G. Griswold, and David Notkin. “Quickly detecting relevant program invariants”. In: Proceedings of the 22nd International Conference on Software Engineering, ICSE 2000, Limerick Ireland, June 4-11, 2000. Ed. by Carlo Ghezzi, Mehdi Jazayeri, and Alexander L. Wolf. ACM, 2000, pp. 449–458. doi: 10.1145/337180.337240.
[32] Michael D. Ernst, Jeff H. Perkins, Philip J. Guo, Stephen McCamant, Carlos Pacheco, Matthew S. Tschantz, and Chen Xiao. “The Daikon system for dynamic detection of likely invariants”. In: Sci. Comput. Program. 69.1-3 (2007), pp. 35–45. doi: 10.1016/j.scico.2007.01.015.
[33] Jeanne Ferrante, Karl J. Ottenstein, and Joe D. Warren. “The Program Dependence Graph and Its Use in Optimization”. In: ACM Trans. Program. Lang. Syst. 9.3 (1987), pp. 319–349. doi: 10.1145/24039.24041.
[34] Jean-Christophe Filliâtre. “Deductive software verification”. In: Int. J. Softw. Tools Technol. Transf. 13.5 (2011), pp. 397–403. doi: 10.1007/s10009-011-0211-0.
[35] Andrea Fioraldi, Dominik Christian Maier, Heiko Eißfeldt, and Marc Heuse. “AFL++: Combining Incremental Steps of Fuzzing Research”. In: 14th USENIX Workshop on Offensive Technologies, WOOT 2020, August 11, 2020. Ed. by Yuval Yarom and Sarah Zennou. USENIX Association, 2020. url: https://www.usenix.org/conference/woot20/presentation/fioraldi.
[36] Michal Friedman, Naama Ben-David, Yuanhao Wei, Guy E. Blelloch, and Erez Petrank. “NVTraverse: in NVRAM data structures, the destination is more important than the journey”. In: Proceedings of the 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation, PLDI 2020, London, UK, June 15-20, 2020. Ed. by Alastair F. Donaldson and Emina Torlak. ACM, 2020, pp. 377–392.
[37] Xinwei Fu. “Detecting Persistence Bugs from Non-volatile Memory Programs by Inferring Likely-correctness Conditions”. PhD thesis. Virginia Tech, 2022.
[38] Xinwei Fu, Wook-Hee Kim, Ajay Paddayuru Shreepathi, Mohannad Ismail, Sunny Wadkar, Dongyoon Lee, and Changwoo Min. “Witcher: Systematic Crash Consistency Testing for Non-Volatile Memory Key-Value Stores”. In: SOSP ’21: ACM SIGOPS 28th Symposium on Operating Systems Principles, Virtual Event / Koblenz, Germany, October 26-29, 2021. Ed. by Robbert van Renesse and Nickolai Zeldovich. ACM, 2021, pp. 100–115.
[39] Xinwei Fu, Dongyoon Lee, and Changwoo Min. “DURINN: Adversarial Memory and Thread Interleaving for Detecting Durable Linearizability Bugs”. In: 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 22). Carlsbad, CA, July 2022, pp. 195–211. isbn: 978-1-939133-28-1.
[40] Marie-Claude Gaudel. “Formal methods for software testing (invited paper)”.
In: 11th International Symposium on Theoretical Aspects of Software Engineering, TASE 2017, Sophia Antipolis, France, September 13-15, 2017. Ed. by Frederic Mallet, Min Zhang, and Eric Madelaine. IEEE Computer Society, 2017, pp. 1–3. doi: 10.1109/TASE.2017.8285622.
[41] João Gonçalves, Miguel Matos, and Rodrigo Rodrigues. “Mumak: Efficient and Black-Box Bug Detection for Persistent Memory”. In: Proceedings of the Eighteenth European Conference on Computer Systems, EuroSys 2023, Rome, Italy, May 8-12, 2023. Ed. by Giuseppe Antonio Di Luna, Leonardo Querzoni, Alexandra Fedorova, and Dushyanth Narayanan. ACM, 2023, pp. 734–750. doi: 10.1145/3552326.3587447.
[42] Google. OSS-Fuzz: Continuous fuzzing for open source software. 2023. url: https://github.com/google/oss-fuzz.
[43] Hamed Gorjiara. Verifying Correctness of Persistent Memory Programs. University of California, Irvine, 2022.
[44] Hamed Gorjiara, Weiyu Luo, Alex Lee, Guoqing Harry Xu, and Brian Demsky. “Checking robustness to weak persistency models”. In: PLDI ’22: 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, San Diego, CA, USA, June 13 - 17, 2022. Ed. by Ranjit Jhala and Isil Dillig. ACM, 2022, pp. 490–505.
[45] Hamed Gorjiara, Guoqing Harry Xu, and Brian Demsky. “Jaaru: efficiently model checking persistent memory programs”. In: ASPLOS ’21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Virtual Event, USA, April 19-23, 2021. Ed. by Tim Sherwood, Emery D. Berger, and Christos Kozyrakis. ACM, 2021, pp. 415–428.
[46] Hamed Gorjiara, Guoqing Harry Xu, and Brian Demsky. “Yashme: detecting persistency races”. In: ASPLOS ’22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022 - 4 March 2022. Ed. by Babak Falsafi, Michael Ferdman, Shan Lu, and Thomas F. Wenisch. ACM, 2022, pp. 830–845.
[47] Shengjian Guo, Meng Wu, and Chao Wang. “Adversarial symbolic execution for detecting concurrency related cache timing leaks”. In: Proceedings of the 2018 ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/SIGSOFT FSE 2018, Lake Buena Vista, FL, USA, November 04-09, 2018. Ed. by Gary T. Leavens, Alessandro Garcia, and Corina S. Pasareanu. ACM, 2018, pp. 377–388.
[48] Timothy L. Harris. “A Pragmatic Implementation of Non-blocking Linked-Lists”. In: Distributed Computing, 15th International Conference, DISC 2001, Lisbon, Portugal, October 3-5, 2001, Proceedings. Ed. by Jennifer L. Welch. Vol. 2180. Lecture Notes in Computer Science. Springer, 2001, pp. 300–314. doi: 10.1007/3-540-45414-4_21.
[49] William R. Harris, Sriram Sankaranarayanan, Franjo Ivancic, and Aarti Gupta. “Program analysis via satisfiability modulo path programs”. In: Proceedings of the 37th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2010, Madrid, Spain, January 17-23, 2010. Ed. by Manuel V. Hermenegildo and Jens Palsberg. ACM, 2010, pp. 71–82. doi: 10.1145/1706299.1706309.
[50] Jeff Huang, Charles Zhang, and Julian Dolby. “CLAP: recording local executions to reproduce concurrency failures”. In: ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’13, Seattle, WA, USA, June 16-19, 2013. Ed. by Hans-Juergen Boehm and Cormac Flanagan. ACM, 2013, pp. 141–152.
[51] Jim Huang. concurrent-ll: concurrent linked list implementation. https://github.com/sysprog21/concurrent-ll. 2021.
[52] Zunchen Huang and Chao Wang. “Symbolic Predictive Cache Analysis for Out-of-Order Execution”. In: Fundamental Approaches to Software Engineering - 25th International Conference, FASE 2022, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022, Munich, Germany, April 2-7, 2022, Proceedings. Ed. by Einar Broch Johnsen and Manuel Wimmer. Vol. 13241.
Lecture Notes in Computer Science. Springer, 2022, pp. 163–183.
[53] Intel. A version of Redis that uses persistent memory. https://github.com/pmem/pmem-redis. 2021.
[54] Intel. Discover Persistent Memory Programming Errors with Pmemcheck. 2022. url: https://www.intel.com/content/www/us/en/developer/articles/technical/discover-persistent-memory-programming-errors-with-pmemcheck.html.
[55] Intel. How to detect persistent memory programming errors using Intel Inspector. 2022. url: https://www.intel.com/content/www/us/en/developer/articles/technical/detect-persistent-memory-programming-errors-with-intel-inspector-persistence-inspector.html.
[56] Intel. Intel Optane Memory. 2022. url: https://www.intel.com/content/www/us/en/products/docs/memory-storage/optane-persistent-memory/overview.html.
[57] Intel. Persistent Memory Development Kit (PMDK). 2023. url: https://pmem.io/pmdk.
[58] Intel. PMDK Issue: Crash consistency bug in 3.2-nvml. 2021. url: https://github.com/pmem/redis/issues/219.
[59] Intel. PMDK Issue: Crash consistency bug within pmemobj_tx_zalloc function. 2020. url: https://github.com/pmem/pmdk/issues/4945.
[60] Intel. PMDK Issue: Crash-consistency bug within libart. 2022. url: https://github.com/pmem/pmdk/issues/5512.
[61] Intel. PMDK Issue: Manage Mongo transactions: consistency between RS and index during crash. 2017. url: https://github.com/pmem/pmse/issues/160.
[62] Intel. PMDK Issue: Multiple inconsistency bugs in libpmemobj array example program. 2021. url: https://github.com/pmem/pmdk/issues/5217.
[63] Intel. PMDK Issue: test: obj_first_next fails. 2018. url: https://github.com/pmem/issues/issues/940.
[64] Intel. PMDK Issue: test: obj_memops fails. 2015. url: https://github.com/pmem/issues/issues/943.
[65] Intel. PMDK Issue: unit tests: obj_constructor fails. 2017. url: https://github.com/pmem/issues/issues/452.
[66] Intel. PMDK Issue: unit tests: pmem_memcpy fails. 2017. url: https://github.com/pmem/issues/issues/459.
[67] Intel.
PMDK Issue: unit tests: pmem_memmove fails. 2017. url: https://github.com/pmem/issues/issues/460.
[68] Intel. pmreorder - performs a persistent consistency check using a store reordering mechanism. 2022. url: https://pmem.io/pmdk/manpages/linux/master/pmreorder/pmreorder.1/.
[69] Intel. The libpmemobj library. 2023. url: https://pmem.io/pmdk/libpmemobj.
[70] Joseph Izraelevitz, Hammurabi Mendes, and Michael L. Scott. “Linearizability of Persistent Memory Objects Under a Full-System-Crash Failure Model”. In: Distributed Computing - 30th International Symposium, DISC 2016, Paris, France, September 27-29, 2016. Proceedings. Ed. by Cyril Gavoille and David Ilcinkas. Vol. 9888. Lecture Notes in Computer Science. Springer, 2016, pp. 313–327.
[71] Manu Jose and Rupak Majumdar. “Bug-Assist: Assisting Fault Localization in ANSI-C Programs”. In: Computer Aided Verification - 23rd International Conference, CAV 2011, Snowbird, UT, USA, July 14-20, 2011. Proceedings. Ed. by Ganesh Gopalakrishnan and Shaz Qadeer. 2011, pp. 504–509.
[72] Manu Jose and Rupak Majumdar. “Cause clue clauses: error localization using maximum satisfiability”. In: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2011, San Jose, CA, USA, June 4-8, 2011. 2011, pp. 437–446.
[73] Sepideh Khoshnood, Markus Kusano, and Chao Wang. “ConcBugAssist: constraint solving for diagnosis and repair of concurrency bugs”. In: Proceedings of the 2015 International Symposium on Software Testing and Analysis, ISSTA 2015, Baltimore, MD, USA, July 12-17, 2015. Ed. by Michal Young and Tao Xie. ACM, 2015, pp. 165–176.
[74] Michalis Kokologiannakis, Ilya Kaysin, Azalea Raad, and Viktor Vafeiadis. “PerSeVerE: persistency semantics for verification under ext4”. In: Proc. ACM Program. Lang. 5.POPL (2021), pp. 1–29.
[75] Robert Könighofer and Roderick Bloem. “Repair with On-The-Fly Program Analysis”.
In: Hardware and Software: Verification and Testing - 8th International Haifa Verification Conference, HVC 2012, Haifa, Israel, November 6-8, 2012. Revised Selected Papers. Ed. by Armin Biere, Amir Nahir, and Tanja E. J. Vos. Vol. 7857. Lecture Notes in Computer Science. Springer, 2012, pp. 56–71. doi: 10.1007/978-3-642-39611-3_11.
[76] David J. Kuck, Robert H. Kuhn, David A. Padua, Bruce Leasure, and Michael Wolfe. “Dependence Graphs and Compiler Optimizations”. In: Conference Record of the Eighth Annual ACM Symposium on Principles of Programming Languages, Williamsburg, Virginia, USA, January 1981. Ed. by John White, Richard J. Lipton, and Patricia C. Goldberg. ACM Press, 1981, pp. 207–218. doi: 10.1145/567532.567555.
[77] Markus Kusano, Arijit Chattopadhyay, and Chao Wang. “Dynamic Generation of Likely Invariants for Multithreaded Programs”. In: 37th IEEE/ACM International Conference on Software Engineering, ICSE 2015, Florence, Italy, May 16-24, 2015, Volume 1. Ed. by Antonia Bertolino, Gerardo Canfora, and Sebastian G. Elbaum. IEEE Computer Society, 2015, pp. 835–846. doi: 10.1109/ICSE.2015.95.
[78] Markus Kusano and Chao Wang. “Flow-Sensitive Composition of Thread-Modular Abstract Interpretation”. In: CoRR abs/1709.10116 (2017). arXiv: 1709.10116. url: http://arxiv.org/abs/1709.10116.
[79] Markus Kusano and Chao Wang. “Thread-modular static analysis for relaxed memory models”. In: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, ESEC/FSE 2017, Paderborn, Germany, September 4-8, 2017. Ed. by Eric Bodden, Wilhelm Schäfer, Arie van Deursen, and Andrea Zisman. ACM, 2017, pp. 337–348. doi: 10.1145/3106237.3106243.
[80] Philip Lantz, Dulloor Subramanya Rao, Sanjay Kumar, Rajesh Sankaran, and Jeff Jackson. “Yat: A Validation Framework for Persistent Memory Software”. In: 2014 USENIX Annual Technical Conference, USENIX ATC ’14, Philadelphia, PA, USA, June 19-20, 2014. Ed. by Garth Gibson and Nickolai Zeldovich. 2014, pp. 433–438.
[81] Chris Lattner and Vikram S. Adve. “LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation”. In: 2nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2004), 20-24 March 2004, San Jose, CA, USA. IEEE Computer Society, 2004, pp. 75–88. doi: 10.1109/CGO.2004.1281665.
[82] Gwangmu Lee, Woochul Shim, and Byoungyoung Lee. “Constraint-guided Directed Greybox Fuzzing”. In: 30th USENIX Security Symposium, USENIX Security 2021, August 11-13, 2021. Ed. by Michael Bailey and Rachel Greenstadt. USENIX Association, 2021, pp. 3559–3576. url: https://www.usenix.org/conference/usenixsecurity21/presentation/lee-gwangmu.
[83] Se Kwon Lee, Jayashree Mohan, Sanidhya Kashyap, Taesoo Kim, and Vijay Chidambaram. “Recipe: converting concurrent DRAM indexes to persistent-memory indexes”. In: Proceedings of the 27th ACM Symposium on Operating Systems Principles, SOSP 2019, Huntsville, ON, Canada, October 27-30, 2019. Ed. by Tim Brecht and Carey Williamson. ACM, 2019, pp. 462–477.
[84] Lenovo. Lenovo modifications to Linux memcached for enhanced persistent memory support. https://github.com/lenovo/memcached-pmem. 2018.
[85] Sihang Liu, Suraaj Kanniwadi, Martin Schwarzl, Andreas Kogler, Daniel Gruss, and Samira Khan. “Side-Channel Attacks on Optane Persistent Memory”. In: 32nd USENIX Security Symposium, USENIX Security 2023, Anaheim, CA, USA, August 9-11, 2023. Ed. by Joseph A. Calandrino and Carmela Troncoso. USENIX Association, 2023. url: https://www.usenix.org/conference/usenixsecurity23/presentation/liu-sihang.
[86] Sihang Liu, Suyash Mahar, Baishakhi Ray, and Samira Manabi Khan. “PMFuzz: test case generation for persistent memory programs”. In: ASPLOS ’21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Virtual Event, USA, April 19-23, 2021. Ed. by Tim Sherwood, Emery D. Berger, and Christos Kozyrakis. ACM, 2021, pp. 487–502. doi: 10.1145/3445814.3446691.
[87] Sihang Liu, Korakit Seemakhupt, Yizhou Wei, Thomas F. Wenisch, Aasheesh Kolli, and Samira Manabi Khan. “Cross-Failure Bug Detection in Persistent Memory Programs”. In: ASPLOS ’20: Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, March 16-20, 2020. Ed. by James R. Larus, Luis Ceze, and Karin Strauss. ACM, 2020, pp. 1187–1202.
[88] Sihang Liu, Yizhou Wei, Jishen Zhao, Aasheesh Kolli, and Samira Manabi Khan. “PMTest: A Fast and Flexible Testing Framework for Persistent Memory Programs”. In: Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2019, Providence, RI, USA, April 13-17, 2019. Ed. by Iris Bahar, Maurice Herlihy, Emmett Witchel, and Alvin R. Lebeck. ACM, 2019, pp. 411–425.
[89] LLVM. Dependence Graphs in LLVM. 2023. url: https://llvm.org/docs/DependenceGraphs/index.html.
[90] Francesco Logozzo. “Automatic Inference of Class Invariants”. In: Verification, Model Checking, and Abstract Interpretation, 5th International Conference, VMCAI 2004, Venice, Italy, January 11-13, 2004, Proceedings. Ed. by Bernhard Steffen and Giorgio Levi. Vol. 2937. Lecture Notes in Computer Science. Springer, 2004, pp. 211–222. doi: 10.1007/978-3-540-24622-0_18.
[91] Francesco Logozzo. “Class invariants as abstract interpretation of trace semantics”. In: Comput. Lang. Syst. Struct. 35.2 (2009), pp. 100–142. doi: 10.1016/j.cl.2005.01.001.
[92] Muhammad Zubair Malik, Khalid Ghori, Bassem Elkarablieh, and Sarfraz Khurshid. “A Case for Automated Debugging Using Data Structure Repair”. In: ASE 2009, 24th IEEE/ACM International Conference on Automated Software Engineering, Auckland, New Zealand, November 16-20, 2009. IEEE Computer Society, 2009, pp. 620–624.
[93] Cristian Mattarei, Makai Mann, Clark W. Barrett, Ross G. Daly, Dillon Huff, and Pat Hanrahan. “CoSA: Integrated Verification for Agile Hardware Design”.
In: 2018 Formal Methods in Computer Aided Design, FMCAD 2018, Austin, TX, USA, October 30 - November 2, 2018. Ed. by Nikolaj S. Bjørner and Arie Gurfinkel. IEEE, 2018, pp. 1–5. doi: 10.23919/FMCAD.2018.8603014. [94] Sergey Mechtaev, Jooyong Yi, and Abhik Roychoudhury. “DirectFix: Looking for Simple Program Repairs”. In: 37th IEEE/ACM International Conference on Software Engineering, ICSE 2015, Florence, Italy, May 16-24, 2015, Volume 1. Ed. by Antonia Bertolino, Gerardo Canfora, and Sebastian G. Elbaum. IEEE Computer Society, 2015, pp. 448–458. [95] mekind. Memkind: an easy-to-use, general-purpose allocator. 2023. url: https://pmem.io/memkind/. [96] Yuri Meshman, Noam Rinetzky, and Eran Yahav. “Pattern-based Synthesis of Synchronization for the C++ Memory Model”. In: Formal Methods in Computer-Aided Design, FMCAD 2015, Austin, Texas, USA, September 27-30, 2015. Ed. by Roope Kaivola and Thomas Wahl. IEEE, 2015, pp. 120–127. [97] Michal Zalewski. American fuzzy lop. 2023. url: https://lcamtuf.coredump.cx/afl/. [98] Antoine Miné. “Relational Thread-Modular Static Value Analysis by Abstract Interpretation”. In: Verification, Model Checking, and Abstract Interpretation - 15th International Conference, VMCAI 2014, San Diego, CA, USA, January 19-21, 2014, Proceedings. Ed. by Kenneth L. McMillan and Xavier Rival. Vol. 8318. Lecture Notes in Computer Science. Springer, 2014, pp. 39–58. doi: 10.1007/978-3-642-54013-4\_3. [99] Antoine Miné. “Static analysis by abstract interpretation of concurrent programs”. PhD thesis. Ecole Normale Supérieure de Paris-ENS Paris, 2013. [100] Antoine Miné. “Tutorial on Static Inference of Numeric Invariants by Abstract Interpretation”. In: Found. Trends Program. Lang. 4.3-4 (2017), pp. 120–372. doi: 10.1561/2500000034. [101] Raphaël Monat and Antoine Miné. “Precise Thread-Modular Abstract Interpretation of Concurrent Programs Using Relational Interference Abstractions”. 
In: Verification, Model Checking, and Abstract Interpretation - 18th International Conference, VMCAI 2017, Paris, France, January 15-17, 2017, Proceedings. Ed. by Ahmed Bouajjani and David Monniaux. Vol. 10145. Lecture Notes in Computer Science. Springer, 2017, pp. 386–404. doi: 10.1007/978-3-319-52234-0\_21. [102] Leonardo de Moura and Nikolaj S. Bjørner. “Z3: An Efficient SMT Solver”. In: Tools and Algorithms for the Construction and Analysis of Systems, 14th International Conference, TACAS 2008, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2008, Budapest, Hungary, March 29-April 6, 2008. Proceedings. Ed. by C. R. Ramakrishnan and Jakob Rehof. Vol. 4963. Lecture Notes in Computer Science. Springer, 2008, pp. 337–340. 115 [103] Leonardo Mendonça de Moura and Grant Olney Passmore. “The Strategy Challenge in SMT Solving”. In: Automated Reasoning and Mathematics - Essays in Memory of William W. McCune. Ed. by Maria Paola Bonacina and Mark E. Stickel. Vol. 7788. Lecture Notes in Computer Science. Springer, 2013, pp. 15–44. doi: 10.1007/978-3-642-36675-8\_2. [104] Ian Neal, Andrew Quinn, and Baris Kasikci. “Hippocrates: healing persistent memory bugs without doing any harm”. In: ASPLOS ’21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Virtual Event, USA, April 19-23, 2021. Ed. by Tim Sherwood, Emery D. Berger, and Christos Kozyrakis. ACM, 2021, pp. 401–414. [105] Ian Neal, Ben Reeves, Ben Stoler, Andrew Quinn, Youngjin Kwon, Simon Peter, and Baris Kasikci. “AGAMOTTO: How Persistent is your Persistent Memory Application?” In: 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020, Virtual Event, November 4-6, 2020. USENIX Association, 2020, pp. 1047–1064. url: https://www.usenix.org/conference/osdi20/presentation/neal. [106] Hoang Duong Thien Nguyen, Dawei Qi, Abhik Roychoudhury, and Satish Chandra. 
“SemFix: program repair via semantic analysis”. In: 35th International Conference on Software Engineering, ICSE ’13, San Francisco, CA, USA, May 18-26, 2013. Ed. by David Notkin, Betty H. C. Cheng, and Klaus Pohl. IEEE Computer Society, 2013, pp. 772–781. [107] Jeremy W. Nimmer and Michael D. Ernst. “Static verification of dynamically detected program invariants: Integrating Daikon and ESC/Java”. In: Workshop on Runtime Verification, RV 2001, in connection with CAV 2001, Paris, France, July 23, 2001. Ed. by Klaus Havelund and Grigore Rosu. Vol. 55. Electronic Notes in Theoretical Computer Science 2. Elsevier, 2001, pp. 255–276. doi: 10.1016/S1571-0661(04)00256-7. [108] “NVML: Implementing Persistent Memory Applications”. In: Santa Clara, CA: USENIX Association, Feb. 2015. [109] Srđan Panić. Collections-C: A library of generic data structures. https://github.com/srdja/Collections-C. 2023. [110] Jeff H. Perkins and Michael D. Ernst. “Efficient incremental algorithms for dynamic detection of likely invariants”. In: Proceedings of the 12th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2004, Newport Beach, CA, USA, October 31 - November 6, 2004. Ed. by Richard N. Taylor and Matthew B. Dwyer. ACM, 2004, pp. 23–32. doi: 10.1145/1029894.1029901. [111] Azalea Raad, Ori Lahav, and Viktor Vafeiadis. “Persistent Owicki-Gries reasoning: a program logic for reasoning about persistent programs on Intel-x86”. In: Proc. ACM Program. Lang. 4.OOPSLA (2020), 151:1–151:28. [112] Azalea Raad, Luc Maranget, and Viktor Vafeiadis. “Extending Intel-x86 consistency and persistency: formalising the semantics of Intel-x86 memory types and non-temporal stores”. In: Proc. ACM Program. Lang. 6.POPL (2022), pp. 1–31. doi: 10.1145/3498683. 116 [113] Dulloor Subramanya Rao, Sanjay Kumar, Anil S. Keshavamurthy, Philip Lantz, Dheeraj Reddy, Rajesh Sankaran, and Jeff Jackson. “System software for persistent memory”. 
In: Ninth Eurosys Conference 2014, EuroSys 2014, Amsterdam, The Netherlands, April 13-16, 2014. Ed. by Dick C. A. Bulterman, Herbert Bos, Antony I. T. Rowstron, and Peter Druschel. ACM, 2014, 15:1–15:15. [114] Benjamin Reidys and Jian Huang. “Understanding and detecting deep memory persistency bugs in NVM programs with DeepMC”. In: PPoPP ’22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2 - 6, 2022. Ed. by Jaejin Lee, Kunal Agrawal, and Michael F. Spear. ACM, 2022, pp. 322–336. [115] Matthew J. Renzelmann, Asim Kadav, and Michael M. Swift. “SymDrive: Testing Drivers without Devices”. In: 10th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2012, Hollywood, CA, USA, October 8-10, 2012. Ed. by Chandu Thekkath and Amin Vahdat. USENIX Association, 2012, pp. 279–292. url: https://www.usenix.org/conference/osdi12/technical-sessions/presentation/renzelmann. [116] Andy Rudoff. The Persistent Memory Programming Model. 2020. url: https://www.snia.org/educational-library/persistent-memory-programming-model-2020. [117] Steve Scargall. Programming Persistent Memory A Comprehensive Guide for Developers. Berkeley, CA: Apress, 2020. [118] Edward J. Schwartz, Thanassis Avgerinos, and David Brumley. “All You Ever Wanted to Know about Dynamic Taint Analysis and Forward Symbolic Execution (but Might Have Been Afraid to Ask)”. In: 31st IEEE Symposium on Security and Privacy, S&P 2010, 16-19 May 2010, Berleley/Oakland, California, USA. IEEE Computer Society, 2010, pp. 317–331. doi: 10.1109/SP.2010.26. [119] SNIA. NVM Programming Model. 2023. url: https://www.snia.org/tech_activities/standards/curr_standards/npm. [120] SNIA. Persistent Memory, NVM Programming Model, and NVDIMMs. 2017. url: https: //www.flashmemorysummit.com/English/Collaterals/Proceedings/2017/20170809_FR21_SNIA.pdf. [121] Thibault Suzanne and Antoine Miné. 
“From Array Domains to Abstract Interpretation Under Store-Buffer-Based Memory Models”. In: Static Analysis - 23rd International Symposium, SAS 2016, Edinburgh, UK, September 8-10, 2016, Proceedings. Ed. by Xavier Rival. Vol. 9837. Lecture Notes in Computer Science. Springer, 2016, pp. 469–488. doi: 10.1007/978-3-662-53413-7\_23. [122] Thibault Suzanne and Antoine Miné. “Relational Thread-Modular Abstract Interpretation Under Relaxed Memory Models”. In: Programming Languages and Systems - 16th Asian Symposium, APLAS 2018, Wellington, New Zealand, December 2-6, 2018, Proceedings. Ed. by Sukyoung Ryu. Vol. 11275. Lecture Notes in Computer Science. Springer, 2018, pp. 109–128. doi: 10.1007/978-3-030-02768-1\_6. 117 [123] David Trabish, Andrea Mattavelli, Noam Rinetzky, and Cristian Cadar. “Chopped symbolic execution”. In: Proceedings of the 40th International Conference on Software Engineering, ICSE 2018, Gothenburg, Sweden, May 27 - June 03, 2018. Ed. by Michel Chaudron, Ivica Crnkovic, Marsha Chechik, and Mark Harman. ACM, 2018, pp. 350–360. doi: 10.1145/3180155.3180251. [124] Chao Wang, Sudipta Kundu, Malay K. Ganai, and Aarti Gupta. “Symbolic Predictive Analysis for Concurrent Programs”. In: FM 2009: Formal Methods, Second World Congress, Eindhoven, The Netherlands, November 2-6, 2009. Proceedings. Ed. by Ana Cavalcanti and Dennis Dams. Vol. 5850. Lecture Notes in Computer Science. Springer, 2009, pp. 256–272. [125] Chao Wang, Rhishikesh Limaye, Malay K. Ganai, and Aarti Gupta. “Trace-Based Symbolic Analysis for Atomicity Violations”. In: Tools and Algorithms for the Construction and Analysis of Systems, 16th International Conference, TACAS 2010, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2010, Paphos, Cyprus, March 20-28, 2010. Proceedings. Ed. by Javier Esparza and Rupak Majumdar. Vol. 6015. Lecture Notes in Computer Science. Springer, 2010, pp. 328–342. 
[126] Shuai Wang, Yuyan Bao, Xiao Liu, Pei Wang, Danfeng Zhang, and Dinghao Wu. “Identifying Cache-Based Side Channels through Secret-Augmented Abstract Interpretation”. In: 28th USENIX Security Symposium, USENIX Security 2019, Santa Clara, CA, USA, August 14-16, 2019. Ed. by Nadia Heninger and Patrick Traynor. USENIX Association, 2019, pp. 657–674. url: https://www.usenix.org/conference/usenixsecurity19/presentation/wang-shuai. [127] Zixuan Wang, Mohammadkazem Taram, Daniel Moghimi, Steven Swanson, Dean M. Tullsen, and Jishen Zhao. “NVLeak: Off-Chip Side-Channel Attacks via Non-Volatile Memory Systems”. In: 32nd USENIX Security Symposium, USENIX Security 2023, Anaheim, CA, USA, August 9-11, 2023. Ed. by Joseph A. Calandrino and Carmela Troncoso. USENIX Association, 2023. url: https://www.usenix.org/conference/usenixsecurity23/presentation/wang-zixuan. [128] Meng Wu and Chao Wang. “Abstract interpretation under speculative execution”. In: Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2019, Phoenix, AZ, USA, June 22-26, 2019. Ed. by Kathryn S. McKinley and Kathleen Fisher. ACM, 2019, pp. 802–815. doi: 10.1145/3314221.3314647. [129] Zhenkai Zhang and Xenofon D. Koutsoukos. “Improving the Precision of Abstract Interpretation Based Cache Persistence Analysis”. In: Proceedings of the 16th ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems, LCTES 2015, CD-ROM, Portland, OR, USA, June 18 - 19, 2015. Ed. by Sam H. Noh, Sebastian Fischmeister, and Jason Xue. ACM, 2015, 10:1–10:10. doi: 10.1145/2670529.2754967. 118 
Abstract
Emerging persistent memory (PM) technologies are beginning to bridge the gap between volatile memory and non-volatile storage in computer systems. PM has three main advantages. First, it allows high-speed memory access and outperforms solid-state drives and hard disks at a relatively low cost. Second, it is byte-addressable and allows data to be accessed in place. Third, it achieves data persistency and can hold data across power failures. Recently, software and hardware support for PM has become available in industry. However, PM programming remains challenging and error-prone because it relies on ordinary developers having a deep understanding of PM at both the system and software levels in order to write correct and efficient PM code. Current studies in PM have attempted to address these challenges by proposing methods that detect PM bugs automatically using static and dynamic program analysis and repair them using heuristics.

In this dissertation, I propose a framework to detect and repair PM bugs automatically using a set of new symbolic analysis techniques. Unlike existing techniques that rely on patterns and heuristics to detect and repair a small subset of PM bugs, the proposed techniques can handle a wide range of PM bugs. This is achieved by first encoding the program semantics, correctness properties, and PM requirements as a set of logical constraints and then solving these constraints using off-the-shelf SMT solvers. By reasoning about these logical constraints symbolically, the proposed techniques can detect, diagnose, and repair PM bugs efficiently. Furthermore, I propose a new method to automatically infer PM requirements using a combination of static and dynamic analysis techniques. Finally, I demonstrate the feasibility of applying the proposed techniques to programs that rely on both PM and multi-threading by simultaneously reasoning about persistency and concurrency. For evaluation, I conduct extensive experiments on real-world benchmarks from industry-level PM software, and the experimental results show that the proposed methods achieve state-of-the-art performance and scalability in detecting and repairing PM bugs and in inferring high-quality PM requirements.
Asset Metadata
Creator Huang, Zunchen (author) 
Core Title Constraint based analysis for persistent memory programs 
Contributor Electronically uploaded by the author (provenance) 
School Andrew and Erna Viterbi School of Engineering 
Degree Doctor of Philosophy 
Degree Program Computer Science 
Degree Conferral Date 2023-12 
Publication Date 03/05/2025 
Defense Date 08/21/2023 
Publisher Los Angeles, California (original), University of Southern California (original), University of Southern California. Libraries (digital) 
Tag constraint based analysis,OAI-PMH Harvest,persistent memory,program analysis,program repair,SMT solving 
Format theses (aat) 
Language English
Advisor Wang, Chao (committee chair), Nuzzo, Pierluigi (committee member), Ravi, Srivatsan (committee member) 
Creator Email zunchenh@usc.edu,zunchenhuang@gmail.com 
Permanent Link (DOI) https://doi.org/10.25549/usctheses-oUC113302956 
Unique identifier UC113302956 
Identifier etd-HuangZunch-12319.pdf (filename) 
Legacy Identifier etd-HuangZunch-12319 
Document Type Dissertation 
Rights Huang, Zunchen 
Internet Media Type application/pdf 
Type texts
Source 20230906-usctheses-batch-1091 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection) 
Access Conditions The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law.  Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright.  It is the author, as rights holder, who must provide use permission if such use is covered by copyright. 
Repository Name University of Southern California Digital Library
Repository Location USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email cisadmin@lib.usc.edu