Show simple item record

dc.contributor.author: Lu, Zhengyang
dc.date.accessioned: 2023-06-19 14:44:00 (GMT)
dc.date.available: 2023-06-19 14:44:00 (GMT)
dc.date.issued: 2023-06-19
dc.date.submitted: 2023-06-14
dc.identifier.uri: http://hdl.handle.net/10012/19562
dc.description.abstract: Satisfiability Modulo Theories (SMT) solvers are programs that decide whether a first-order logic formula is satisfiable. Over the last two decades, these solvers have become central to many methods and tools in fields as diverse as software engineering, verification, security, and Artificial Intelligence. Most modern SMT solvers provide user-controllable strategies, i.e., users can define a strategy that customizes the solving algorithm for their own problems by combining individual tactics as building blocks. A tactic is a well-defined reasoning step implemented by the SMT solver, which simplifies, transforms, or solves the given input SMT formula. The flexibility to customize a strategy for a specialized class of formulas is important, since no existing strategy is optimal for all instances. However, finding a good customized strategy is challenging even for experts. In this thesis, we present a novel class of reinforcement-learning (RL) guided methods, implemented in the Z3 SMT solver, which we refer to as AlphaSMT and which adaptively construct the expected best strategy for any given input SMT formula. Briefly, the AlphaSMT RL framework combines deep Monte-Carlo Tree Search (MCTS) and logical reasoning in a unique way, enabling the RL agent to learn the best combination of tactics for a given class of formulas. In more detail, a deep neural network serves as both the value function and the policy, evaluating state-action pairs and deciding which tactic to choose at each step. The neural network is trained toward the optimal policy by learning from sampled solving runs generated through self-exploration, with MCTS used as a lookahead planning step for each decision made during these runs. We evaluated the performance of AlphaSMT on benchmark sets from three SMT logics, namely quantifier-free non-linear real arithmetic (QF_NRA), quantifier-free non-linear integer arithmetic (QF_NIA), and quantifier-free bit-vectors (QF_BV). In all three logics, AlphaSMT outperforms its base solver, Z3, solving up to 80.5% more instances on a test set. The evaluation also shows that a moderately longer tactic timeout helps solve more instances and that a pre-solver contributes significantly to the speedup.
dc.language.iso: en
dc.publisher: University of Waterloo
dc.subject: automated reasoning
dc.subject: reinforcement learning
dc.subject: SMT solver
dc.subject: artificial intelligence
dc.title: AlphaSMT: A Reinforcement Learning Guided SMT Solver
dc.type: Master Thesis
dc.pending: false
uws-etd.degree.department: Electrical and Computer Engineering
uws-etd.degree.discipline: Electrical and Computer Engineering
uws-etd.degree.grantor: University of Waterloo
uws-etd.degree: Master of Applied Science
uws-etd.embargo.terms: 0
uws.contributor.advisor: Ganesh, Vijay
uws.contributor.affiliation1: Faculty of Engineering
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.typeOfResource: Text
uws.peerReviewStatus: Unreviewed
uws.scholarLevel: Graduate
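
The abstract above describes strategies built by combining Z3 tactics. For readers unfamiliar with that mechanism, below is a minimal, illustrative sketch using the Z3 Python API (the z3-solver package). The tactic names are standard Z3 tactics, but this particular hand-written combination is only an example of the building-block mechanism; it is not a strategy learned by AlphaSMT.

    from z3 import Goal, Reals, Tactic, Then

    # A strategy is a combination of tactics; each tactic simplifies,
    # transforms, or solves the goal it is applied to.
    x, y = Reals('x y')
    goal = Goal()
    goal.add(x * x + y * y < 1, x + y > 1)

    # Hand-picked illustrative strategy (not one learned by AlphaSMT):
    strategy = Then(Tactic('simplify'),      # rewrite/normalize the formula
                    Tactic('solve-eqs'),     # eliminate variables via equations
                    Tactic('qfnra-nlsat'))   # decide QF_NRA with the nlsat engine

    print(strategy(goal))             # sub-goals produced by applying the strategy
    solver = strategy.solver()        # turn the strategy into a solver
    solver.add(x * x + y * y < 1, x + y > 1)
    print(solver.check())             # sat / unsat / unknown

AlphaSMT's contribution, per the abstract, is to choose such a sequence of tactics adaptively for each input formula rather than relying on a fixed, hand-written strategy.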

