SEQUENT CALCULUS: A LOGIC AND A LANGUAGE
FOR COMPUTATION AND DUALITY
by
PAUL DOWNEN
A DISSERTATION
Presented to the Department of Computer and Information Science
and the Graduate School of the University of Oregon
in partial fulfillment of the requirements
for the degree of
Doctor of Philosophy
June 2017
DISSERTATION APPROVAL PAGE
Student: Paul Downen
Title: Sequent Calculus: A Logic and a Language for Computation and Duality
This dissertation has been accepted and approved in partial fulfillment of the
requirements for the Doctor of Philosophy degree in the Department of Computer and
Information Science by:
Zena M. Ariola Chairperson
Michal Young Core Member
Boyana Norris Core Member
Mark Lonergan Institutional Representative
and
Scott L. Pratt Dean of the Graduate School
Original approval signatures are on file with the University of Oregon Graduate School.
Degree awarded June 2017
ii
c© 2017 Paul Downen
iii
DISSERTATION ABSTRACT
Paul Downen
Doctor of Philosophy
Department of Computer and Information Science
June 2017
Title: Sequent Calculus: A Logic and a Language for Computation and Duality
Truth and falsehood, questions and answers, construction and deconstruction;
most things come in dual pairs. Duality is a mirror that reveals the new from the old
via opposition. This idea appears pervasively in logic, where duality inverts “true”
with “false” and “and” with “or.” However, even though programming languages are
closely connected to logics, this kind of strong duality is not so apparent in practice.
Sum types (disjoint tagged unions) and product types (structures) are dual concepts,
but in the realm of programming, natural biases obscure their duality.
To better understand the role of duality in programming, we shift our perspective.
Our approach is based on the Curry-Howard isomorphism which says that programs
following a specification are the same as proofs for mathematical theorems. This
thesis explores Gentzen’s sequent calculus, a logic steeped in duality, as a model for
computational duality. By applying the Curry-Howard isomorphism to the sequent
calculus, we get a language that combines dual programming concepts as equal
opposites: data types found in functional languages are dual to co-data types (interface-
based objects) found in object-oriented languages, control flow is dual to information
flow, induction is dual to co-induction. This gives a duality-based semantics for
iv
reasoning about programs via orthogonality: checking safety and correctness based on
a comprehensive test suite.
We use the language of the sequent calculus to apply ideas from logic to issues
relevant to program compilation. The idea of logical polarity reveals a symmetric basis
of primitive programming constructs that can faithfully represent all user-defined
data and co-data types. We reflect the lessons learned back into a core language
for functional languages, at the cost of symmetry, with the relationship between the
sequent calculus and natural deduction. This relationship lets us derive a pure λ-
calculus with user-defined data and co-data which we further extend by bringing out
the implicit control-flow in functional programs. Explicit control-flow lets us share and
name control the same way we share and name data, enabling a direct representation
of join points, which are essential for tractable optimization and compilation.
This dissertation includes previously published co-authored material.
v
CURRICULUM VITAE
NAME OF AUTHOR: Paul Downen
GRADUATE AND UNDERGRADUATE SCHOOLS ATTENDED:
University of Oregon, Eugene, OR
Lawrence Technological University, Southfield, MI
DEGREES AWARDED:
Doctor of Philosophy in Computer Science, 2017, University of Oregon
Bachelor of Science in Mathematics, 2010, Lawrence Technological University
Bachelor of Science in Computer Science, 2010, Lawrence Technological University
Bachelor of Science in Computer Engineering, 2010, Lawrence Technological
University
AREAS OF SPECIAL INTEREST:
Programming Language Theory
Type Theory
Compilers
PROFESSIONAL EXPERIENCE:
Graduate Teaching Fellow, University of Oregon, Eugene, Oregon, September
2010 – June 2011
Graduate Research Fellow, University of Oregon, Eugene, Oregon, June 2011 –
Present
Research Intern, Université Paris Diderot, INRIA, PPS, Paris, France, June 2011
– August 2011
Visiting Researcher, Université Paris Diderot, INRIA, PPS, Paris, France,
November 2012 – June 2013
Visiting Researcher, Microsoft Research, Cambridge, UK, July 2015 – August
2015
vi
GRANTS, AWARDS AND HONORS:
Oregon Doctoral Research Fellowship, University of Oregon Computer and
Information Science Department, 2017
Oregon Doctoral Research Fellowship Nomination, University of Oregon
Computer and Information Science Department, 2016
Upsilon Pi Epsilon Honors Society, Inducted by University of Oregon Computer
and Information Science Department, 2015
Gurdeep Pall Graduate Student Fellowship, University of Oregon, 2015
Erwin & Gertrude Juilfs Scholarship in Computer and Information Science,
University of Oregon, 2014
Erwin & Gertrude Juilfs Scholarship in Computer and Information Science,
University of Oregon, 2012
Best GTF Award, University of Oregon Computer and Information Science
Department, 2011
PUBLICATIONS:
Maurer, Luke, Downen, Paul, Ariola, Zena M., & Peyton Jones, Simon. (2017).
Compiling without continuations. Pages 482–494 of: Proceedings of the
38th ACM SIGPLAN conference on programming language design and
implementation. PLDI ’17. New York, NY, USA: ACM. Distinguished Paper
Award.
Johnson-Freyd, Philip, Downen, Paul, & Ariola, Zena M. (2017). Call-by-name
extensionality and confluence. Journal of functional programming, 27, e12.
Downen, Paul, Maurer, Luke, Ariola, Zena M., & Peyton Jones, Simon. (2016).
Sequent calculus as a compiler intermediate language. Pages 74–88 of:
Proceedings of the 21st ACM SIGPLAN international conference on functional
programming. ICFP ’16. New York, NY, USA: ACM.
Johnson-Freyd, Philip, Downen, Paul, & Ariola, Zena M. (2016). First class call
stacks: Exploring head reduction. Proceedings of the workshop on continuations,
WoC 2016, London, UK, April 12th 2015. EPTCS, vol. 212.
vii
Downen, Paul, Johnson-Freyd, Philip, & Ariola, Zena M. (2015). Structures for
structural recursion. Pages 127–139 of: Proceedings of the 20th ACM SIGPLAN
international conference on functional programming. ICFP ’15. New York,
NY, USA: ACM.
Downen, Paul, & Ariola, Zena M. (2014a). Compositional semantics for
composable continuations: From abortive to delimited control. Pages 109–
122 of: Proceedings of the 19th ACM SIGPLAN international conference on
functional programming. ICFP ’14. New York, NY, USA: ACM.
Downen, Paul, & Ariola, Zena M. (2014b). Delimited control and computational
effects. Journal of functional programming, 24, 1–55.
Downen, Paul, & Ariola, Zena M. (2014c). The duality of construction. Pages 249–
269 of: Shao, Zhong (ed), Programming languages and systems: 23rd European
symposium on programming, ESOP 2014, held as part of the European joint
conferences on theory and practice of software, ETAPS 2014. Lecture Notes
in Computer Science, vol. 8410. Springer Berlin Heidelberg.
Downen, Paul, Maurer, Luke, Ariola, Zena M., & Varacca, Daniele. (2014).
Continuations, processes, and sharing. Pages 69–80 of: Proceedings of the 16th
international symposium on principles and practice of declarative programming.
PPDP ’14. New York, NY, USA: ACM.
Ariola, Zena M., Downen, Paul, Herbelin, Hugo, Nakata, Keiko, & Saurin, Alexis.
(2012). Classical call-by-need sequent calculi: The unity of semantic artifacts.
Pages 32–46 of: Schrijvers, Tom, & Thiemann, Peter (eds), Functional and
logic programming: 11th international symposium. Lecture Notes in Computer
Science, vol. 7294. Berlin, Heidelberg: Springer Berlin Heidelberg.
Downen, Paul, & Ariola, Zena M. (2012). A systematic approach to
delimited control with multiple prompts. Pages 234–253 of: Seidl, Helmut
(ed), Programming languages and systems: 21st European symposium on
programming, ESOP 2012, held as part of the European joint conferences on
theory and practice of software, ETAPS 2012. Lecture Notes in Computer
Science, vol. 7211. Springer Berlin Heidelberg. Best Paper Award Nominee.
viii
ACKNOWLEDGEMENTS
First of all, I would like to thank those that have supported me financially during
my studies. To the National Science Foundation who supported me through the grants
0917329 “A Foundation for Effects” and 1423617 “SEQUBE: A Sequent Calculus
Foundation for High- Level and Intermediate Programming Languages.” And to the
University of Oregon and donors Gurdeep Pall and John Juilfs who supported me
through fellowships and scholarships awarded by the university and Computer and
Information Science department.
I can’t thank my advisor Zena Ariola enough for the countless hours she has
dedicated to mentoring me. I could not have hoped for a more attentive and
encouraging advisor, and our frequent discussions and collaborations throughout my
time at the University of Oregon has shaped this thesis in so many ways, both big and
small. I would also like to thank other professors at the University of Oregon—Michal
Young, Daniel Lowd, Boyana Norris, Hank Childs, Kathleen Freeman and others—who
have given their time to advise me in matters outside research during my time here.
I would like to thank my office mates and co-members of the Oregon programming
languages group, Luke Maurer and Philip Johnson-Freyd, for our collaborations and
making my years at Oregon all the better. Through our joint work, they have both
had their direct influence on this thesis, pulling it in different directions that I would
not have gone on my own, and helped paint a fuller and brighter picture through their
own work.
I am grateful for the gracious hosts that have welcomed me during my studies.
To Alexis Saurin, Hugo Herbelin, and Pierre-Louis Curien at INRIA, I would like
to thank them for inviting me to Paris and expanding my horizons. Much of what I
learned about logic I owe to them, which has shaped this thesis. To Simon Peyton
ix
Jones, Andrew Kennedy, and others at Microsoft Research, I would like to thank them
for hosting Luke Maurer and I at Cambridge while we worked on applying our ideas
to the Glasgow Haskell Compiler (GHC). Simon Peyton Jones’ invaluable advice and
positive attitude helped us turn the theories into real benefits for Haskell programmers.
And to Iavor Diatchki and others at Galois, I would like to thank them for graciously
lending their time to help kick off the work on GHC.
There have been many visitors to the University of Oregon while I was a student
which enriched my experience here: Olivier Danvy, Keiko Nakata, Alexis Saurin, Pierre-
Louis Curien, Marco Gaboardi, Kenichi Asai, and Jacob Johannsen. To each, I would
like to say thank you for taking the time to share your ideas to me and others here
at Oregon.
And finally, I would like to thank my family. To my parents Dale and Terry
Downen for giving their support while I moved across the country to pursue these
studies. And to my husband Chris Hoffman for his tremendous patience, support, and
encouragement that made this thesis possible.
x
TABLE OF CONTENTS
Chapter Page
I. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
II. Natural Deduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Gentzen’s NJ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
The λ-Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Proofs as Programs . . . . . . . . . . . . . . . . . . . . . . . . . . 38
A Critical Look at the λ-Calculus . . . . . . . . . . . . . . . . . . 40
III. Sequent Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Gentzen’s LK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
The Core Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
The Dual Calculi . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
IV. Polarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Additive and Multiplicative LK . . . . . . . . . . . . . . . . . . . . 102
Pattern Matching and Extensionality . . . . . . . . . . . . . . . . . 108
xi
Chapter Page
Polarizing the Fundamental Dilemma . . . . . . . . . . . . . . . . 115
Focusing and Polarity . . . . . . . . . . . . . . . . . . . . . . . . . 124
Self-Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
V. Data and Co-Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
The Essence of Evaluation: Substitutability . . . . . . . . . . . . . 137
The Essence of Connectives: Data and Co-Data . . . . . . . . . . . 144
Evaluating Data and Co-Data . . . . . . . . . . . . . . . . . . . . . 152
Combining Strategies in Connectives . . . . . . . . . . . . . . . . . 166
Combining Strategies in Evaluation . . . . . . . . . . . . . . . . . 174
Duality of Connectives and Evaluation . . . . . . . . . . . . . . . . 181
A (De-)Construction of the Dual Calculi . . . . . . . . . . . . . . . 189
VI. Induction and Co-Induction . . . . . . . . . . . . . . . . . . . . . 201
Programming with Structures and Duality . . . . . . . . . . . . . . 204
Polymorphism and Higher Kinds . . . . . . . . . . . . . . . . . . . 211
Well-Founded Recursion Principles . . . . . . . . . . . . . . . . . . 219
Indexed Recursion in the Sequent Calculus . . . . . . . . . . . . . 229
Encoding Recursive Programs via Structures . . . . . . . . . . . . 237
VII. Parametric Orthogonality Models . . . . . . . . . . . . . . . . 243
Poles, Spaces, and Orthogonality . . . . . . . . . . . . . . . . . . . 245
xii
Chapter Page
Computation, Worlds, and Types . . . . . . . . . . . . . . . . . . . 256
Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
Adequacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
VIII. The Polar Basis for Types . . . . . . . . . . . . . . . . . . . . . . 306
Polarizing User-Defined (Co-)Data Types . . . . . . . . . . . . . . 309
Type Isomorphisms . . . . . . . . . . . . . . . . . . . . . . . . . . 321
A Syntactic Theory of (Co-)Data Type Isomorphisms . . . . . . . 326
Laws of the Polarized Basis . . . . . . . . . . . . . . . . . . . . . . 345
The Faithfulness of Polarization . . . . . . . . . . . . . . . . . . . 361
IX. Representing Functional Programs . . . . . . . . . . . . . . . . 365
Pure Data and Co-Data in Natural Deduction . . . . . . . . . . . . 367
Natural Deduction versus Sequent Calculus . . . . . . . . . . . . . 386
Multiple Consequences . . . . . . . . . . . . . . . . . . . . . . . . . 405
X. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424
Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
REFERENCES CITED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
xiii
LIST OF FIGURES
Figure Page
2.1. The NJ natural deduction system for second-order propositional
logic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2. NJ (natural deduction) proof of ` ((A ∧B) ∧ C) ⊃ (B ∧ A). . . . . . . . 18
2.3. The simply typed λ-calculus. . . . . . . . . . . . . . . . . . . . . . . . . 33
2.4. Typing derivation of the λ-calculus term λx. (pi2(pi1(x)), pi1(pi1(x))). . . . 34
2.5. The polymorphic λ-calculus (i.e. system F). . . . . . . . . . . . . . . . . 38
3.1. Truth tables for conjunction, disjunction, and implication. . . . . . . . . 47
3.2. The orientation of deductions for conjunction. . . . . . . . . . . . . . . . 47
3.3. The orientation of deductions for disjunction. . . . . . . . . . . . . . . . 47
3.4. The orientation of deductions for implication. . . . . . . . . . . . . . . . 47
3.5. The LK sequent calculus for second-order propositional logic. . . . . . . 49
3.6. Duality in the LK sequent calculus. . . . . . . . . . . . . . . . . . . . . . 62
3.7. µµ˜: The core language of the sequent calculus. . . . . . . . . . . . . . . . 66
3.8. The call-by-value (V) rewriting rules for the core µµ˜V-calculus. . . . . . 71
3.9. The call-by-name (N ) rewriting rules for the core µµ˜N -calculus. . . . . . 71
3.10. Scoping rules for (co-)variables in commands, terms, and co-terms. . . . 73
3.11. Implicit (co-)variable scope in the core µµ˜ typing. . . . . . . . . . . . . . 75
3.12. The syntax and types for the dual calculi. . . . . . . . . . . . . . . . . . 78
3.13. The β laws for the call-by-value (V) half of the dual calculi. . . . . . . . 80
3.14. The β laws for the call-by-name (N ) half of the dual calculi. . . . . . . . 80
3.15. LKQ: The focused sub-syntax and types for the call-by-value
dual calculus. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
xiv
Figure Page
3.16. LKT: The focused sub-syntax and types for the call-by-name
dual calculus. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.17. The focusing ς laws for the call-by-value half of the dual calculi. . . . . . 89
3.18. The focusing ς laws for the call-by-name half of the dual calculi. . . . . . 89
3.19. The Q-focusing translation to the LKQ sub-syntax. . . . . . . . . . . . . 93
3.20. The T -focusing translation to the LKT sub-syntax. . . . . . . . . . . . . 94
3.21. The duality relation between the dual calculi. . . . . . . . . . . . . . . . 99
4.1. An additive and multiplicative LK sequent calculus. . . . . . . . . . . . 104
4.2. The positive/negative and additive/multiplicative classification of
binary connectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
4.3. The syntax and types for system L . . . . . . . . . . . . . . . . . . . . . 109
4.4. The polarized extensional η laws for system L. . . . . . . . . . . . . . . 114
4.5. The polarized core µµ˜P-calculus: its static and dynamic semantics. . . . 119
4.6. The syntax for polarized system L. . . . . . . . . . . . . . . . . . . . . . 120
4.7. Logical typing rules for polarized system L. . . . . . . . . . . . . . . . . 122
4.8. The operational β laws for polarized system L. . . . . . . . . . . . . . . 123
4.9. Focused sub-syntax and core typing rules for polarized system L. . . . . 125
4.10. Focused logical typing rules for polarized system L. . . . . . . . . . . . . 127
4.11. The focusing ς laws for polarized system L. . . . . . . . . . . . . . . . . 129
4.12. Extending polarized system L with subtraction. . . . . . . . . . . . . . . 132
4.13. The self-duality of system L types . . . . . . . . . . . . . . . . . . . . . 133
4.14. The self-duality of system L programs. . . . . . . . . . . . . . . . . . . . 134
5.1. A parametric theory, µµ˜S , for the core µµ˜-calculus. . . . . . . . . . . . . 139
5.2. Call-by-value (V) and call-by-name (N ) strategies for the core
µµ˜-calculus. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
5.3. “Lazy-call-by-value” (LV) strategy for the core µµ˜-calculus. . . . . . . . 141
xv
Figure Page
5.4. “Lazy-call-by-name” (LN ) strategy for the core µµ˜-calculus. . . . . . . . 141
5.5. Nondeterministic (U) strategy for the core µµ˜-calculus. . . . . . . . . . . 141
5.6. Declarations of the basic data and co-data types. . . . . . . . . . . . . . 148
5.7. Adding data and co-data to the core µµ˜ sequent calculus. . . . . . . . . 148
5.8. Types of declared (co-)data in the parametric µµ˜ sequent calculus. . . . 150
5.9. Call-by-value (V) and call-by-name (N ) substitution strategies
extended with arbitrary (co-)data types. . . . . . . . . . . . . . . . . . . 152
5.10. “Lazy-call-by-value” (LV) and “lazy-call-by-name” (LN )
substitution strategies extended with arbitrary (co-)data types. . . . . . 152
5.11. The βη laws for declared data and co-data types. . . . . . . . . . . . . . 157
5.12. The parametric βSςS laws for arbitrary data and co-data. . . . . . . . . 158
5.13. Declarations of the basic single-strategy data and co-data types. . . . . . 169
5.14. Declarations of basic mixed-strategy data and co-data types. . . . . . . . 170
5.15. Kinds of multi-strategy (co-)data declarations and types. . . . . . . . . . 171
5.16. Types of multi-strategy (co-)data in the parametric µµ˜ sequent
calculus. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
5.17. Type-agnostic kind system for the core µµ˜ sequent calculus. . . . . . . . 177
5.18. Type-agnostic kind system for multi-kinded (co-)data. . . . . . . . . . . 178
5.19. Composite #»S strategy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
5.20. Composite core polarized strategy P = V ,N . . . . . . . . . . . . . . . . 178
5.21. Composite core LV and LN strategy. . . . . . . . . . . . . . . . . . . . 179
5.22. The duality of types of the parametric µµ˜-calculus. . . . . . . . . . . . . 183
5.23. The duality of programs of the parametric µµ˜-calculus. . . . . . . . . . . 184
5.24. Translation between the call-by-value half of the simply-typed
dual calculi and µµ˜. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
5.25. Translation between the call-by-name half of the simply-typed
dual calculi and µµ˜. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
xvi
Figure Page
5.26. The η laws for the dual calculi and extended (co-)values (V ′,N ′). . . . . 194
6.1. The syntax of types and programs in the higher-order µµ˜-calculus. . . . 212
6.2. The kind system for the higher-order parametric µµ˜ sequent
calculus. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
6.3. Types of higher-order (co-)data in the parametric µµ˜ sequent
calculus. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
6.4. βη conversion of higher-order types. . . . . . . . . . . . . . . . . . . . . 216
6.5. The βη laws for higher-order data and co-data types. . . . . . . . . . . . 217
6.6. The parametric βSςS laws for arbitrary higher-order data and
co-data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
6.7. Type-agnostic kind system for higher-order multi-kinded (co-)data. . . . 218
6.8. The syntax of recursion in the higher-order µµ˜-calculus. . . . . . . . . . 230
6.9. The kind system for size-indexed higher-order µµ˜ sequent calculus. . . . 231
6.10. Rewriting theory for recursion in the parametric µµ˜-calculus. . . . . . . 234
6.11. Type erasure for the higher-order parametric µµ˜-calculus. . . . . . . . . 236
7.1. Core parallel conversion rules. . . . . . . . . . . . . . . . . . . . . . . . . 282
7.2. Parallel conversion rules for (co-)data types. . . . . . . . . . . . . . . . . 283
8.1. Declarations of the primitive polarized data and co-data types. . . . . . 312
8.2. Declarations of the shifts between strategies as data and co-data
types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
8.3. A polarizing translation from G into P . . . . . . . . . . . . . . . . . . . 315
8.4. A theory for structural laws of data type declaration
isomorphisms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
8.5. A theory for structural laws of co-data type declaration
isomorphisms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
8.6. Isomorphism laws of positively polarized data sub-structures. . . . . . . 341
8.7. Isomorphism laws of negatively polarized co-data sub-structures. . . . . 342
8.8. Algebraic laws of the polarized basis of types. . . . . . . . . . . . . . . . 346
xvii
Figure Page
8.9. De Morgan duality laws of the polarized basis of types. . . . . . . . . . . 353
8.10. Identity laws of the redundant self-shift connectives. . . . . . . . . . . . 358
8.11. Derived laws of polarized functions. . . . . . . . . . . . . . . . . . . . . . 360
9.1. Untyped syntax for a natural deduction language of data and
co-data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
9.2. A natural deduction language for the core calculus. . . . . . . . . . . . . 370
9.3. Natural deduction typing rules for simple (co-)data. . . . . . . . . . . . 371
9.4. Natural deduction typing rules for higher-order (co-)data. . . . . . . . . 372
9.5. A core parametric theory for the natural deduction calculus. . . . . . . . 373
9.6. The untyped parametric βς laws for arbitrary data and co-data
types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
9.7. The typed βη laws for declared data and co-data types. . . . . . . . . . 375
9.8. Call-by-value (V) strategy in natural deduction. . . . . . . . . . . . . . . 375
9.9. Call-by-name (N ) strategy in natural deduction. . . . . . . . . . . . . . 376
9.10. Call-by-need (LV) strategy in natural deduction. . . . . . . . . . . . . . 377
9.11. Type-agnostic kind system for multi-kinded natural deduction
terms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
9.12. The pure, recursive size abstractions in natural deduction. . . . . . . . . 381
9.13. The β and ν laws for recursion. . . . . . . . . . . . . . . . . . . . . . . . 382
9.14. Translations between λlet and single-consequence µµ˜. . . . . . . . . . . 388
9.15. λµ: adding multiple consequences to natural deduction. . . . . . . . . . 405
9.16. The laws of control in λµ. . . . . . . . . . . . . . . . . . . . . . . . . . . 406
9.17. Translations between natural deduction and the sequent calculus
with many consequences. . . . . . . . . . . . . . . . . . . . . . . . . . . 415
xviii
CHAPTER I
Introduction
Truth and falsehood, questions and answers, construction and deconstruction; as
Alcmaeon (510BC) once said, most things come in dual pairs. Duality is a guiding
force, a mirror that reveals the new from the old via opposition. This idea appears
pervasively in logic, where duality is expressed by negation that inverts “true” with
“false” and “and” with “or.” However, even though the theory of programming languages
is closely connected to logic, this kind of strong duality is not so apparent in the
practice of programming. For example, sum types (disjoint tagged unions) and pair
types (structures) are related to dual concepts. But in the realm of programming, the
duality between these two features is not easy to see, much less use for any practical
purpose.
The situation is even worse for more complicated language features, where two
concepts, both important to the theory and practice of programming, are connected by
duality but one is well understood while the other is enigmatic and underdeveloped. In
the case of recursion and looping, inductive data types (like lists and trees of arbitrary,
but finite, size) are known to be dual to co-inductive infinite processes (like streams
of input or servers that are indefinitely available) (Hagino, 1987).1 However, while
proof assistants like Coq (Coq 8.4, 2012) have a sophisticated treatment of induction,
their treatment of co-induction is problematic (Giménez, 1996; Oury, 2008). The bias
towards induction and inadequate treatment of co-induction in type theory and proof
assistants is a road block for program verification and correctness.
Our main philosophy for approaching these questions is known as the Curry-
Howard isomorphism or proofs-as-programs paradigm (Curry et al., 1958; Howard,
1980; de Bruijn, 1968). The Curry-Howard isomorphism reveals a deep and
profound connection between logic and programming wherein mathematical proofs
are algorithmic programs. The canonical example of the isomorphism is the
correspondence between Gentzen’s (1935a) natural deduction, a system that formalizes
common mathematical reasoning by laying down the rules of intuitionistic logic, and
1In general, adding the prefix “co-” to a term or concept means “the dual of that thing,” and we
use the shorthand “(co-)thing” to mean “both thing and co-thing.”
1
Church’s (1932) λ-calculus, one of the first models of computation and the foundation
for functional programming languages. The rules for justifying proofs in intuitionistic
logic correspond exactly to the rules for writing programs in functional languages,
and simplifying proofs corresponds to running programs. This connection has led
technical advances flow both ways: not only can we use mathematics to help write
programs in functional languages, but we can also write programs to help develop
mathematics with proof assistants. However, the λ-calculus is not an ideal setting
for studying duality in computation. Dualities that are simple in other settings, like
the De Morgan laws in logic, are far from obvious in the λ-calculus. The root of the
problem is related to a lack of symmetry: natural deduction is only concerned with
verifying truth and the λ-calculus is only concerned with producing results.
Natural deduction is not the only logic, however. In fact, natural deduction has
a twin sibling called the sequent calculus, born at the same time within the seminal
paper of Gentzen (1935a). Whereas the rules of natural deduction more closely mimic
the reasoning that might occur in the minds of mathematicians, the rules of the
sequent calculus are themselves easier to reason about, for example, if we want to show
that the logic is consistent. Furthermore, unlike natural deduction’s presentation of
intuitionistic logic, Gentzen’s sequent calculus provides a native language for classical
logic which admits additional reasoning principles like proof by contradiction: if a
logical statement cannot be false, then it must be true. As a consequence, the sequent
calculus clarifies and reifies the many dualities of classical logic as pleasant symmetries
baked into the very structure of its rules. In this formal system of logic, equal attention
is given to falsity and truth, to assumptions and conclusions, such that there is perfect
symmetry. Yet, even though these two systems look very different from each other
and have their own distinct advantages and limitations, they are closely connected
and give us different perspectives into the underlying phenomena of logic.
When interpreted as a programming language, the natural symmetries of
the sequent calculus reveal hidden dualities in programming—input and output,
production and consumption, construction and deconstruction, structure and pattern—
and makes them a prominent part of the computational model. Fundamentally,
the sequent calculus expresses computation as an interaction between two opposed
entities: a producer representing a program that creates information, and a consumer
representing an environment or context that observes information. Computation then
occurs as a communication protocol allowing a producer and consumer to speak to one
2
another. This two-party method of computation gives a different view of computation
than the one shown by the λ-calculus. In particular, programs in the sequent calculus
can also be seen as configurations of an abstract machine (Ariola et al., 2009a), in
which the evaluation context is reified as a syntactic object that may be directly
manipulated. And due to the connection between classical logic (Griffin, 1990) and
control operators like Scheme’s (Kelsey et al., 1998) callcc or Felleisen’s (1992) C,
the built-in classicality of the sequent calculus also gives an effectful language for
manipulating control flow.
The computational interpretation of the sequent calculus is not just an intellectual
curiosity. Thanks to the relationship between natural deduction and the sequent
calculus as sibling logics (Gentzen, 1935b), the sequent calculus gives us another angle
for investigating real issues that arise in the λ-calculus and functional programming,
from source languages down to the machine. For example, McBride (Singh et al.,
2011) points out how the poor foundation for the computational interpretation of
co-induction is a road block for program verification and correctness, which is in
contrast to the robust and powerful treatment of induction in functional languages
and proof assistants. However, we show here how the symmetries of the sequent
calculus show us how both induction and co-induction can be represented as equal
and opposite reasoning principles under the unifying umbrella of structural recursion
for both ordinary recursive types and generalized algebraic datatypes (a.k.a. GADTs).
This computational symmetry between induction and co-induction is based on the
duality between data types in functional languages and co-data types as objects, and
gives a more robust way for proof assistants to handle recursion in infinite objects.
Moving down into the intermediate representation of programs that exists within
optimizing compilers, the logic of the sequent calculus shows how compilers can use
continuations in a more direct way with a “strategically defunctionalized” (Reynolds,
1998) continuation-passing style (CPS). This compromise between continuation-
passing and direct style makes it possible to transfer techniques between CPS (Appel,
1992) and static single assignment (SSA) (Cytron et al., 1991) compilers like SML/NJ
with direct style compilers like the Glasgow Haskell Compiler. For example, CPS
can faithfully represent join points in control flow (Kennedy, 2007), whereas direct
style can use arbitrary transformations expressed in terms of the original program
(Peyton Jones et al., 2001). Finally, the sequent calculus can also be interpreted as an
even lower-level, machine-like language for functional programs (Ohori, 1999), which
3
can be used to reason about fine details like manual memory management (Ohori,
2003). Therefore, the computational interpretation of the sequent calculus acts like a
beacon illuminating murky areas in both the design and implementation of functional
languages.
Overview
The structure of this dissertation can be broken down into three major parts.
First, Chapters II to IV review the background on the Curry-Howard isomorphism
for logics and languages based on natural deduction and sequent calculus. Second,
Chapters V and VI give the design and semantics of programming language features
in the setting of the sequent calculus based on an analysis of the background in the
first part. Third, Chapters VII to IX study the theory and application of the language
features in the second part for the purpose of reasoning about and implementing
programs.
Chapters II to VI have a linear dependency order; Chapter III depends on
Chapter II, Chapter IV depends on Chapter III, and so on. After that, Chapters VII
to IX depend on the preceding Chapters II to VI, but not on each other, and can be
read in any order.
Chapter V is an extended and rewritten material from the previous publication
(Downen & Ariola, 2014c) which I co-authored with Zena M. Ariola, Chapter VI is
a revised version of (Downen et al., 2015) which I co-authored with Philip Johnson-
Freyd and Zena M. Ariola, and Chapter VII uses some ideas from the supporting
materials in the appendix of (Downen et al., 2015) that I developed in collaboration
with Philip Johnson-Freyd.
Background
Chapter II reviews the logical system NJ of natural deduction, the core
programming language represented by the simply typed λ-calculus, and the Curry-
Howard correspondence between them. After considering the strength of their
correspondence and its application to functional programming, the chapter concludes
with some criticisms of issues in programming that are not readily addressed by these
two corresponding systems.
Chapter III is about how the idea behind the Curry-Howard isomorphism leads
to a foundational programming language based off the LK sequent calculus, which
4
is an alternative view of logic from natural deduction. A core calculus—called µµ˜—
is introduced, which lies at the heart of all the languages of the sequent calculus
to follow in the dissertation. The µµ˜-calculus brings up the fundamental dilemma
of computation in classical logic as corresponding to the need to fix an evaluation
order (like eager or lazy evaluation) for programming languages. The rest of LK’s
logical features are layered on top of this core which lets us talk about how ideas from
logic—such as de Morgan duality and focusing—translate to important concepts in
programming.
Chapter IV is about the application of polarity from logic to programming.
In logic, polarity tells us that types have one of two fundamental orientations—
positive or negative—which can be observed from the nature of their rules and impact
their meaning both in proof theory and computation. This brings into focus the
connection between pattern matching (from functional programming languages) and
extensionality (i.e. the idea that the only thing that can be observed about objects
is how they react to stimuli), and tells us how to combine both call-by-value and
call-by-name evaluation orders within a single program.
Language design
Chapter V presents a general framework that captures the previous interpretations
of the sequent calculus as a programming language (from Chapters III and IV), and
separates several independent concepts that were previously entangled. The main
ideas of this chapter are:
– All the individual logical connectives considered previously in the dissertation can
be represented by either data or co-data which are dual programming constructs
to one another and represent the mechanisms that both functional and object-
oriented languages use to let programmers declare new custom types.
– The impact of evaluation strategies on the behavior of programs can be described
by a discipline on substitution (i.e. what could a variable in a program possibly
stand for?), which lets us abstract away the differences caused by evaluation
orders out of the syntactic semantics of programming languages. This abstract
view of evaluation strategies encompasses the simple and canonical strategies,
namely call-by-value and call-by-name, as well as more complex and nuanced
strategies like call-by-need or radically non-deterministic evaluation.
5
– Programs can make use of multiple evaluation strategies by combining many
substitution disciplines (from the previous point) which are kept separate by
specifying a particular evaluation strategy for each type, so there are several
distinct kinds of types with each kind corresponding to a specific strategy. This
corresponds to the way that the Glasgow Haskell Compiler uses unboxed types
(Peyton Jones & Launchbury, 1991) to distinguish the different evaluation orders
of (necessarily strict) machine numbers and arrays from the otherwise lazy
Haskell programs.
Chapter VI extends Chapter V with well-founded induction and co-induction,
giving a fair treatment of co-induction by representing both as just specific use-cases
of structural recursion that can’t loop forever. The main ideas of this chapter are:
– Type abstraction (i.e. generics and modules) can be achieved by generalizing
the language of types with type functions, and letting (co-)data types quantify
over private type parameters that are not externally visible in their interface.
– Recursion in types (i.e. recursive types like lists or trees) can be achieved by
recursive data or co-data declarations using both the primitive recursion and
noetherian recursion principles from mathematics, where the recursive argument
is an index that tracks the “size” of the type (like the length of the list or height
of the tree).
– Recursion in programs (i.e. loops which must terminate on all inputs) can be
achieved by abstracting over the size index to recursive types, so that the program
cannot loop forever since the statically-known size must always decrease each
cycle.
Theory and application
Chapter VII develops a semantics for the programming language designed in
Chapters V and VI based on the idea of orthogonality (also known as bi-orthogonality,
>>-closure, or classical realizability). This gives a model connecting compile-time
types to run-time behavior useful for confirming language-wide safety properties in
the style of exhaustive testing: the collection of safe programs of a type are selected
from a pool of potential candidate implementations by checking them against a
test suite of observations; or dually a collection of safe observations of a type are
6
selected from the possible ones that might be considered by checking them against
the blessed specification programs. The chapter begins with a general introduction
to orthogonality and a comparison to negation in intuitionistic logic, and then builds
a specific model for the sequent-based language which is parameterized by both the
declared (co-)data types and the evaluation strategy(ies) used to interpret programs.
The adequacy of the model—that is, the fact that the syntactic typing rules implies
their semantic equivalent—is then applied to confirm several safety properties of the
language, including type safety, strong normalization, and the soundness of (typed)
extensionality laws with respect to the (untyped) operational semantics.
Chapter VIII applies the ideas from Chapter V to the problem of how polarity
(from Chapter IV) informs us of a small, finite collection of data and co-data types
which are capable of faithfully encoding every other (simple) type that a programmer
could possibly come up with. The emphasis here is on the “faithfulness” of encodings
which requires that some care is taken about which evaluation strategy is used at each
point in the program, so that the encodings don’t accidentally introduce the possibility
of rogue behavior that the programmer’s original type disallowed. To that point, this
chapters give a formal verification based on a theory of type isomorphisms of the
common folklore from polarized logic that complex types from both call-by-value and
call-by-name functional programming languages can be represented with the primitive
polarized types by sprinkling the special polarity shift connectives in the appropriate
places. However, the broader view of evaluation strategies and (co-)data types taken
here lets us consider how to encode types from call-by-need languages as well, which
uses four (rather than just the normal two) different shifts to and from the canonical
call-by-value and call-by-name strategies.
Chapter IX goes full circle, and relates back to natural deduction and the
λ-calculus, demonstrating how languages from Chapters V and VI based on the
sequent calculus can impact functional programming. The canonical relationship
between natural deduction and the sequent calculus gives a strong, bi-directional
correspondence to the intuitionistic restriction of the µµ˜-calculus and λ-calculus
family of languages. This correspondence can be applied to functional programming
languages, which are based on the λ-calculus, in one of two ways: (1) in the one
direction, functional programs can be compiled down to a machine-like representation
based on the sequent calculus, and (2) in the other direction, theories and ideas from
the sequent calculus can be translated back to the λ-calculus and the functional
7
paradigm. Afterward, the intuitionistic restriction is lifted, and the correspondence is
generalized to cover the full classical µµ˜-calculus by generalizing the λ-calculus with
first-class control. This generalization gives us a foundational language and a starting
point for talking about join points—a general technique for efficiently representing
shared control flow in programs—in direct style.
8
CHAPTER II
Natural Deduction
The foundations of mathematics and computation have connections that took
root in the early 1900s, when Hilbert posed the decision problem:
Is there an effectively calculable procedure that can decide whether a
logical statement is true or false?
This question, and its negative answer, prompted an investigation into the rigorous
meaning of what is “effectively computable” from Church (1936), Turing (1936), and
Gödel (1934). Later on, a much deeper connection between models of computation
and formalizations of logic was independently discovered and rediscovered many times
(Curry et al., 1958; Howard, 1980; de Bruijn, 1968). The most typical form of this
amazing coincidence, now known as the Curry-Howard isomorphism or the proofs-as-
programs paradigm, gives a structural isomorphism between Church’s (1932) λ-calculus,
a system for computing with functions, and Gentzen’s (1935a) natural deduction, a
system for formalizing mathematical logic. To illustrate the connection between logic
and programming, we will review the two systems and show how both they reveal
similar core concepts in different ways. In particular, two principles important for
characterizing the meaning of various structures, which we call β and η from the
tradition of the λ-calculus, arise independently in both fields of study.
Gentzen’s NJ
In 1935, Gentzen formalized an intuitive model of logical reasoning called natural
deduction, as it aimed to symbolically model the “natural” way that mathematicians
reason about proofs. A proof in natural deduction is a tree-like structure made up of
several inferences: ....
H1
....
H2 . . .
....
Hn
J
where we infer the conclusion J from proofs of the premises H1, H2, . . . , H3. The
conclusion J and premises Hi are all judgments that make a statement about logical
9
propositions (which we denote by the variables A,B,C, . . . ) that may be true or false,
such as “0 is greater than 1.” For example, we can make the basic judgment that a
proposition A is true, which we will write as ` A. Proof trees are built by stacking
together compatible inferences of the above form; we say that a proof tree is closed if all
leaves of the tree end with an axiom—that is, the special case of an inference with zero
premises—otherwise it is open. Open proof trees represent (partial) proofs that rely
on unsubstantiated assumptions, whereas closed proof trees represent self-contained
(complete) proofs.
Syntax and rules
The propositions that we deal with in logics like natural deduction are meant to
represent falsifiable or verifiable claims in a particular domain of study, such as “1 + 1
= 2.” However, in their simplest form, these systems don’t account for domain-specific
knowledge and leave such basic propositions as atoms or uninterpreted variables.
Instead, the primary interest of the logic is to characterize the meaning of connectives
that combine (zero or more) existing propositions, which are the logical glue for
putting together the basic building blocks. These connectives become the central
focus in Gentzen’s NJ, whose syntax and rules are given in Figure 2.1.
For example, the idea of logical conjunction is expressed formally as a connective,
written A∧B in NJ and read “A and B,” along with some associated rules of inference
for building proofs involving conjunction. On the one hand, in order to deduce that
A ∧B is true we may use the introduction rule ∧I:
` A ` B
` A ∧B ∧I
That is to say, if we have a proof that A is true and a proof that B is true, then we
have a proof that A∧B is true. On the other hand, in order to use the fact that A∧B
is true we may use either one of the elimination rules ∧E1 or ∧E2:
` A ∧B
` A ∧E1
` A ∧B
` B ∧E2
That is to say, if we have a proof that A ∧B is true, then it must be the case that A
is true and also that B is true.
10
X, Y, Z ∈ PropVariable ::= . . .
A,B,C ∈ Proposition ::= X | > | ⊥ | A ∧B | A ∨B | A ⊃ B | ∀X.A | ∃X.A
H, J ∈ Judgement ::= ` A
` > >I no >E rule no ⊥I rule
` ⊥
` C ⊥E
` A ` B
` A ∧B ∧I
` A ∧B
` A ∧E1
` A ∧B
` B ∧E2
` A
` A ∨B ∨I1
` B
` A ∨B ∨I2
` A ∨B
` A x....` C
` B y....` C
` C ∨Ex,y
` A x....` B
` A ⊃ B ⊃Ix
` A ⊃ B ` A
` B ⊃E
.... (X /∈ FV (∗))` A
` ∀X.A ∀IX
` ∀X.A
` A {B/X} ∀E
` A {B/X}
` ∃X.A ∃I
` ∃X.A
` A x.... (X /∈ FV (∗))` C (X /∈ FV (C))
` C ∃EX,x
FIGURE 2.1. The NJ natural deduction system for second-order propositional logic:
with truth (>), falsehood (⊥), conjunction (∧), disjunction (∨), implication (⊃), and
both universal (∀) and existential (∃) propositional quantification.
11
NJ also gives an account of logical implication as a connective in natural deduction,
written A ⊃ B and read “A implies B” or “if A then B,” in a similar fashion. In order
to deduce that A ⊃ B is true we may use the introduction rule ⊃I for implication:
` A x....` B
` A ⊃ B ⊃Ix
Notice that the introduction rule for implication has a more complex form of
introduction rule than the one for conjunction. In particular, the single premise of
the ⊃I rule introduces a local assumption that is only visible in the proof tree of that
premise. This premise says that if we can prove that B is true by assuming that A
is true, then we can conclude that A ⊃ B is true without the extra free assumption
about A. As a matter of bookkeeping, the identifier x used to mark the local axiom
whose scope within the overall proof is delimited by a corresponding ⊃Ix introduction
rule for proving the truth of an implication. Note that this local axiom x may be used
as many times as necessary in the sub-proof—be it zero times or several times—so
long as it is not used outside the scope created by the ⊃Ix rule. Once we have a proof
of A ⊃ B, we may make use of it with the elimination rule ⊃E for implication:
` A ⊃ B ` A
` B ⊃E
This is a formulation of the traditional reasoning principle modus ponens: if we believe
that A implies B is true and that A is true as well, then we must believe B is true.
The last binary connective in NJ, written A ∨B and read “A or B,” formalizes
logical disjunction. There are two different ways to prove that A ∨ B is true, which
corresponds to two different introduction rules ∨I for disjunction:
` A
` A ∨B ∨I1
` B
` A ∨B ∨I2
If we have a proof that A is true or a proof that B is true, then we have a proof that
A ∨ B is true. Notice how the elimination rules for ∨ are like upside-down versions
of the introduction rules for ∧. Unfortunately, making use of a proof that A ∨ B is
true is awkward in natural deduction, compared to connectives like conjunction and
implication. The elimination rule ∨E for disjunction is the most complex one of the
12
binary connectives of NJ:
` A ∨B
` A x....` C
` B y....` C
` C ∨Ex,y
This elimination rule assumes three premises: that A ∨B is true, that assuming A is
true lets us prove that C is true, and that assuming B is true lets us prove that C is
true. The conclusion of the rule asserts that C must be true because we know how to
prove it in either possible case where A or B is true. Note that the ∨E elimination rule
relies (twice) on the same mechanism of local assumptions for the two sub-proofs of C
that was also used in the ⊃I introduction rule. Hence, we use the same bookkeeping
identifiers connecting both local axioms x and y with the rule ∨Ex,y that delimits
their scope in the overall proof.
In the degenerate case, connectives that join zero propositions together serve as
logical constants. For example, consider a connective that internalizes the notion of
truth or validity into the system, written > and pronounced “true.” By its intuitive
meaning, we may always deduce that> is true with no additional premise, as described
by the introduction rule >I:
` > >I
However, we can do nothing interesting with a proof that > is true. In other words,
“nothing in, nothing out.” Notice how > can be understood as the nullary version of
the binary connective ∧ for conjunction: > has a single introduction rule with zero
premises similar to ∧’s two-premise introduction rule, and > has no elimination rules
compared with ∧’s two eliminations.
We can also consider a connective for internalizing the notion of falsehood, written
⊥ and pronounced “false.” In contrast to >, we should never be able to prove that ⊥
is true in any sensible context since that would be, well, false. In other words, there is
no valid introduction rule ⊥I. But if we are in some context where ⊥ is true for some
reason, then for all intents and purposes any proposition C might as well be true, as
described by the elimination rule ⊥E:
` ⊥
` C ⊥E
13
Again, notice how ⊥ can be understood as the nullary version of the binary connective
∨ for disjunction: ⊥ has no introduction rules compared to ∨’s two introductions,
and ⊥ has a single elimination rule with ` ⊥ as the only premise compared to ∨’s
elimination rule that assumes two premises in addition to ` A ∨B.
Using connectives described above, we can also define a derived connective for
negation, written ¬A and pronounced “not A,” which can be used to (indirectly) state
that a proposition is not true. For example, we should intuitively expect to be able
to prove ` ¬⊥ (“false is not true”) in NJ but be unable to derive ` ¬> (“true is not
true”). In lieu of treating ¬ as a proper connective,1 it can be defined in terms of
implication (⊃) and falsehood (⊥)
¬A , A ⊃ ⊥
so that the derived rules for negation that come from this encoding are:
` A x....` ⊥
` ¬A ⊃Ix
` ¬A ` A
` ⊥ ⊃E
Finally, the most complex form of propositions in NJ are the quantifiers: logical
connectives which abstract over a proposition variable (denoted by X, Y, Z, and of
which there are countably many) that occurs inside of a proposition.2 The first
such quantifier is the universal quantifier, written ∀X.A and pronounced “for all
X, A,” which codifies when the quantified proposition variable X may stand for any
proposition. For example, NJ has the property that for any proposition A, ` A ⊃ A
is provable. This fact can be represented more formally by proving ` ∀X.X ⊃ X,
where X is the universally quantified proposition variable. The second quantifier is
the existential quantifier, written ∃X.A and pronounced “there exists an X such that
A,” which codifies when the quantified proposition variable X stands for a specific
but unknown proposition. For example, there are propositions in NJ that are provably
1Although Gentzen (1935a) did originally treat negation as a proper connective in NJ, it was
defined in terms of the ⊥ connective so that the associated introduction and elimination rules for
negation are identical to the ones given here.
2For simplicity, we limit the presentation of NJ to second-order propositional logic. That is to
say, the quantifiers ∀ and ∃ abstract over propositions themselves, as opposed to objects of some
particular domain of interest like numbers.
14
true (such as the aforementioned A ⊃ A or simply the trivial truth >), which can be
represented formally by proving ` ∃X.X.
Since both of these quantifiers bind variables in propositions, all the usual
subtleties in programming languages involving static variables applies. In summary,
an occurrence of a proposition variable X in a proposition A is bound if it is within the
context of an ∀X or an ∃X and free otherwise (FV denotes the function that computes
the set of free variables of a proposition), and A {B/X} denotes the usual capture-
avoiding substitution operation where all free occurrences of X in A are replaced with
B such that all free occurrences of variables within B are still free after substitution.
We also do not distinguish propositions based on the choice of bound variable names,
commonly known as α equivalence, as stated by the two equalities for quantifiers:
∀X.A {X/Z} =α ∀Y.A {Y/Z} ∃X.A {X/Z} =α ∃Y.A {Y/Z}
where X and Y must not be free in A. The important property α equivalence and
capture-avoiding substitution is that they commute with one another, so that renaming
bound variables does not affect the result of substitution up to α equivalence. Stated
more formally, for all propositions A,B, and C, ifA =α B then A {C/X} =α B {C/X}.
A more thorough introduction to static variables and substitution is given by
Barendregt (1985) and Pierce (2002). In general, throughout this thesis we will take α-
equivalence for granted whenever static variable binders are present without belaboring
the formalities.
Establishing universal truths is a delicate matter, and requires the proper
discipline when crafting well-formed proofs. This subtlety rears its head in the universal
introduction rule ∀I for proving ∀X.A, which requires a new form of constraint on its
premise: .... (X /∈ FV (∗))` A
` ∀X.A ∀IY
The side condition X /∈ FV (∗) on the proof in the premise of the ∀I rule means that
the variable X cannot appear free in any of the propositions in the open leaves of
the sub-proof tree. Intuitively, this side condition on the variable X ensures that X is
totally generic in the sub-proof, so that we do not accidentally assume anything about
X that could leak into another part of the overall proof. Therefore, the ∀I rule can be
understood as stating that if we prove A is true where X is generic, then ∀X.A must
15
also be true. In contrast, the universal elimination rule ∀E has no such side condition
and can apply to any premise:
` ∀X.A
` A {B/X} ∀E
In other words, from a proof that ∀X.A is true, any instance of A with an arbitrary
B substituted for X is also true.
In contrast to the universal quantifier, establishing existential truths is easy. We
may deduce that ∃X.A is true by using the introduction rule ∃I:
` A {B/X}
` ∃X.A ∃I
which says that if A is true for some choice of B substituted forX, then it must be that
∃X.A is true. Notice that the introduction rule for ∃ is like an upside-down version of
the elimination rule for ∀; neither of the two rules impose any special criteria on their
premise. However, it is harder to use the fact that ∃X.A is true with the corresponding
elimination rule ∃E:
` ∃X.A
` A x.... (X /∈ FV (∗))` C (X /∈ FV (C))
` C ∃EX,x
The same side condition X /∈ FV (∗) that appeared in the premise of ∀I also appears
in the second premise of ∃E, so that X cannot appear free in any open leaves (besides
uses of the axiom x) of the sub-proof, but additionally the existential elimination rule
must also ensure that X is not free in the conclusion C. Intuitively, both of these side
conditions ensure that both the result ` C as well as its sub-proof is generic in the
choice of X. Therefore, the ∃E rule can be understood as stating that if we can prove
that ∃X.A is true and that C can be proved true from assuming A is true with a
generic X, then C must be true in general.
Example 2.1. Consider how we might build a proof that ((A∧B)∧C) ⊃ (B∧A) is true.
To start searching for a proof, we may begin with our goal ` ((A ∧B) ∧ C) ⊃ (B ∧ A)
at the bottom of the proof tree, and then try to simplify the goal by applying the
16
implication introduction rule “bottom up:”
` (A ∧B) ∧ C x....
` (A ∧B) ∧ C ` B ∧ A
` ((A ∧B) ∧ C) ⊃ (B ∧ A) ⊃Ix
This move adds the assumption (A∧B)∧C to our local hypothesis for the duration of
the proof, which we may use to finish off the proof at the top by the Ax rule. We are
still obligated to fill in the missing gap between Ax and ⊃I, but our job is now a bit
easier, since we have gotten rid of the ⊃ connective from the consequence in the goal.
Next, we can try to simplify the goal again by applying the conjunction introduction
rule to get rid of the ∧ in the goal:
` (A ∧B) ∧ C x....` B
` (A ∧B) ∧ C x....` A
` B ∧ A ∧I
` ((A ∧B) ∧ C) ⊃ (B ∧ A) ⊃Ix
We now have two sub-proofs to complete: a deduction concluding B and a deduction
concluding A from our local hypothesis (A∧B)∧C. At this point, the consequences of
our goals are as simple as they can be—they no longer contain any connectives for us
to work with. Therefore, we instead switch to work “top down” from our assumptions.
We are allowed to assume (A∧B)∧C, so let’s eliminate the unnecessary proposition
C using a conjunction elimination rule in both sub-proofs:
` (A ∧B) ∧ C x
` A ∧B ∧E1....` B
` (A ∧B) ∧ C x
` A ∧B ∧E1....` A
` B ∧ A ∧I
` ((A ∧B) ∧ C) ⊃ (B ∧ A) ⊃Ix
We can now finish off the entire proof by using conjunction elimination “top down”
in both sub-proofs, closing the gap between assumptions and conclusions as shown in
Figure 2.2. Since there are no unjustified branches at the top of the tree (every leaf is
17
an axiom provided by the ⊃I introduction rule) and there are no longer any gaps in
the proof, we have completed the deduction of our goal. End example 2.1.
Remark 2.1. The bookkeeping that keeps track of the scope of local axioms introduced
by the ⊃I, ∨E, and ∃E rules is important for ruling out bogus proofs that appear to
be closed but manage to deduce something like ` ⊥ that should be impossible. For
example, we could build a closed proof of ` ⊥ by using the ⊃I rule incorrectly as
follows:
` > >I
` ⊥ ⊃ > ⊃Ix ` ⊥ x
` (⊥ ⊃ >) ∧ ⊥ ∧I
` ⊥ ∧E2
Notice how the local axiom x that is introduced by the ⊃Ix rule in the left sub-proof
has been improperly “leaked” into the right sub-proof. This leak goes against the
constraints of the ⊃Ix rule and so the above proof tree is not well-formed. Likewise,
we can build another proof of ` ⊥ by incorrectly applying the ∨E rule as follows:
` > >I
` > ∨ ⊥ ∨I1 ` ⊥ y ` ⊥ y
` ⊥ ∨Ex,y
Again, the above proof is not well-formed because the constraints of the ∨Ex,y rule
are not met: the local axiom y has been used in the middle premise but its scope is
limited to only the right premise. The use of identifiers for local axiom bookkeeping is
more explicit than many other presentations of natural deduction systems, but every
system of natural deduction must enforce equivalent restrictions on these kinds of
rules with local axioms. End remark 2.1.
` (A ∧B) ∧ C x
` A ∧B ∧E1
` B ∧E2
` (A ∧B) ∧ C x
` A ∧B ∧E1
` A ∧E1
` B ∧ A ∧I
` ((A ∧B) ∧ C) ⊃ (B ∧ A) ⊃Ix
FIGURE 2.2. NJ (natural deduction) proof of ` ((A ∧B) ∧ C) ⊃ (B ∧ A).
18
Remark 2.2. The side conditions on free proposition variables in the ∀I and ∃E rules
are perhaps the most complex ones to understand, but are nonetheless crucial for the
overall logic to make sense. For example, it makes intuitive sense that if A is true for
all choices of X, then there is some choice of X such that A is true. Stated formally,
this intuition can be encoded into the proposition (∀X.A) ⊃ (∃X.A), which can be
proved in NJ as follows:
` ∀X.X y
` Y ∀E
` ∃X.X ∃I
` (∀X.X) ⊃ (∃X.X) ⊃Iy
The converse implication (∃X.A) ⊃ (∀X.A)—that if A is true for some X then it
must be true for all X—does not intuitively make sense, and indeed is not provable in
NJ. However, we can prove such a statement if we are sloppy with the side conditions
in ∀I and ∃E as follows:
` ∃X.X y ` X z
` X ∃EX,z
` ∀X.X ∀IX
` (∃X.X) ⊃ (∀X.X) ⊃Iy
This proof is not well-formed because the conclusion of the ∃EX,z rule is ` X, which
contains a free occurence ofX (as just plainly itself). It is fortunate that the restrictions
on free proposition variables prevent a proof of ` (∃X.X) ⊃ (∀X.X) in NJ since that
leads to clearly wrong conclusions like ` ⊥, similar to Remark 2.1, as follows:
` ∃X.X y ` X z
` X ∃EX,z
` ∀X.X ∀IX
` (∃X.X) ⊃ (∀X.X) ⊃Iy
` > >I
` ∃X.X ∃I
` ∀X.X ⊃E
` ⊥ ∀E End remark 2.2.
Logical harmony
Now that we know about some connectives and their rules of inference in our
system of natural deduction, we would like to have some assurance that what we have
defined is sensible in some way. To this end, we can insist on logical harmony, an idea
that has roots in arguments by Dummett (1991), to justify that the inference rules are
meaningful. Just like Goldilocks, we want rules that are neither too strong (leading to
19
an inconsistent logic) nor too weak (leading to gaps in our knowledge), but are instead
just right. Logical harmony for a particular connective can be broken down into two
properties of that connective’s inference rules: local soundness and local completeness
(Pfenning & Davies, 2001).
For a single logical connective, we need to check that its inference rules are not
too strong, meaning that they are locally sound, so that the results of the elimination
rules are always justified. In other words, we cannot get out more than what we put
in. Local soundness is expressed in terms of proof manipulations: a (potentially open)
proof in which an introduction is immediately followed by an elimination can be
simplified to a more direct proof. On the one hand, in the case of conjunction, if we
follow ∧I with ∧E1, then we can perform the following reduction on the proof tree:
.... D1` A
.... D2` B
` A ∧B ∧I
` A ∧E1 
.... D1` A
where D1 and D2 stand for proofs that deduce ` A and ` B, respectively. If we had
forgotten to include the first premise ` A in the ∧I rule, then this soundness reduction
would have no proof to justify its conclusion. On the other hand, if we follow ∧I with
∧E2, then we have a similar reduction:
.... D1` A
.... D2` B
` A ∧B ∧I
` B ∧E2 
.... D2` B
Additionally, we should ensure that the rules are not too weak, so that all the
information that goes into a proof can still be accessed somehow. In this respect,
we say that the inference rules for a logical connective are locally complete if they
are strong enough to break an arbitrary (potentially open) proof ending with that
connective into pieces and then put them back together again. For conjunction, this
is expressed by the following proof transformation:
.... D` A ∧B ≺
.... D` A ∧B
` A ∧E1
.... D` A ∧B
` B ∧E2
` A ∧B ∧I
20
If we had forgotten the elimination rule ∧E2, then local completeness would fail because
we would not have enough information to satisfy the premise of the ∧I introduction
rule. As a result, the rules will still be sound but we would be unable to prove a basic
tautology like A ∧ B ⊃ B ∧ A, which should hold by our intuitive interpretation of
A ∧B.
We also have local soundness and completeness for the inference rules of logical
implication, although they require a few properties about the system as a whole. For
local soundness, we can reduce ⊃I immediately followed by ⊃E as follows:
` A x.... D` B
` A ⊃ B ⊃Ix
.... E` A
` B ⊃E 
.... E` A.... D {E/x}` B
where D {E/A} is the substitution of the proof E for any uses of the local axiom x in
D. The substitution gives us a modified proof that no longer needs that particular
local axiom x of ` A, since any time the x axiom was used we instead place a full
copy of the E proof of ` A. For local completeness, we can expand an arbitrary proof
D of ` A ⊃ B as follows:
.... D` A ⊃ B ≺
.... D` A ⊃ B ` A x
` B ⊃E
` A ⊃ B ⊃Ix
Notice that on the right hand side the additional axiom x introduced by the use of
the ⊃Ix introduction rule is implicitly unused in the proof D.
The local soundness for the inference rules of logical disjunction follow from the
techniques used to show soundness of both conjunction and implication: disjunction
both uses a choice of two alternatives as well as a substitution for local axioms. By
letting i stand for either 1 or 2, we have the following reduction for either case when
∨I1 is followed by ∨E or ∨I2 is followed by ∨E:
.... D
Ai
` A1 ∨ A2 ∨Ii
A1
x1
.... E1
C
A2
x1
.... E2
C
` C ∨Ex1,x2 
.... D` Ai.... Ei {D/xi}` E
21
This reduction uses the same substitution operation as for the local soundness of
implication, where the correct premise Ei is selected to match the possible choice of
introduction rules. The local completeness, we can expand an arbitrary proof D of
` A ∨B as follows:
.... D` A ∨B ≺
.... D` A ∨B
` A x
` A ∨B ∨I1
` B y
` A ∨B ∨I2
` A ∨B ∨Ex,y
Note that this expansion may appear different from the ones that came before because
the introduction rules ∨I1 and ∨I2 appear above the elimination rule ∨E instead
of below by the typographic structure of the proof tree, but still the introductions
logically occur after the elimination by the meaning of the proof tree.
Demonstrating local soundness and completeness for the inference rules of the
nullary connectives for truth and falsehood may be deceptively basic. Since there is no
>E rule, local soundness of the > inference rules is trivially true: there is no possible
way to have a proof where >I is followed by >E because there is no >E rule, and
so local soundness is vacuous. Likewise, the local soundness of the ⊥ inference rules
is trivially true because there is no ⊥I rule, so soundness is again vacuous. However,
we still have to demonstrate local completeness by transforming arbitrary proofs of
` > and ` ⊥ into ones that apply all possible introduction and elimination rules
for the connectives. In the case of >, because the >I rule is always available, this
transformation just throws away the original, unnecessary proof and replaces it with
just >I: .... D` > ≺ > >I
In the case of ⊥, because ⊥E only requires a proof of ` ⊥ as its premise, this
transformation just adds on a final ⊥E inference:
.... D` ⊥ ≺
.... D` ⊥
` ⊥ ⊥E
Note that both of these transformations are nullary versions of local completeness for
logical conjunction and disjunction illustrated above. Therefore, we can be sure that
the inference rules for > and ⊥ are sensible.
22
Finally, the soundness and completeness of the quantifiers relies on the additional
side conditions on their inference rules restricting the allowable free proposition
variables. For the local soundness of the inference rules for universal quantification,
we can reduce ∀I immediately followed by ∀E as follows:
.... D (X /∈ FV (∗))` A
` ∀X.A ∀IX
A {B/X} ∀E 
.... D {B/X}
` A {B/X}
Note that in order to perform the reduction and get the same conclusion, we must
substitute B for X in the entire proof D. The fact that X is not free in any of the
open leaves in the proof D (which is a required condition of the premise of ∀IX) means
that those leaves are left unchanged by the substitution, so that the overall fringe of
the proof tree follows the same pattern. For the local completeness of the inference
rules for universal quantification, we can expand an arbitrary proof D of ` ∀X.A as
follows:
.... D (X /∈ FV (∗))` ∀X.A ≺
.... D (X /∈ FV (∗))` ∀X.A
` A ∀E
` ∀X.A ∀IX
Note that since there are countably many proposition variables, we can pick some X
which does not appear in the leaves of D without loss of generality since the choice
doesn’t matter (because we can always rename the bound X in ∀X.A by α equivalence
as necessary), which lets us satisfy the side condition imposed by the ∀IX rule.
The local soundness and completeness of the inference rules for existential
quantification combines ideas previously seen in disjunction and universal
quantification. We can reduce ∃I immediately followed by ∃E as follows:
.... D
` A {B/X}
` ∃X.A ∃I
` A x.... E (X /∈ FV (∗))` C (X /∈ FV (C))
` C ∃EX,x 
.... D
` A {B/X}.... E {B/X,D/x}` C
Note how this reduction involves two different kinds of substitution: substituting the
proposition B for the proposition variable X and substituting the proof D for the
local axiom x. The side condition that X is not free in C, nor in the leaves of E , is
23
important to make sure that the conclusion and leaves remain the same after the
substitutions. We can expand an arbitrary proof D of ` ∃X.A as follows:
.... D` ∃X.A ≺
.... D` ∃X.A
` A x
` ∃X.A ∃I
` ∃X.A ∃EX,x
which follows the general pattern of the disjunction completeness expansion, but notice
that the conclusion ` ∃X.A and right sub-proof follows the extra side conditions about
the free proposition variable X.
The λ-Calculus
The λ-calculus, first defined by Church in the 1930s, is a remarkably simple yet
powerful model of computation. The original language of terms (denoted by M,N)
is defined by only three parts: abstracting a program with respect to a parameter
(i.e. a function term: λx.M), reference to a parameter (i.e. a variable term: x), and
applying a program to an argument (i.e. a function application term: M N). Despite
this simple list of features, the untyped λ-calculus is a complete model of computation
equivalent to Turing machines. It is often used as a foundation for understanding
the static and dynamic semantics of programming languages as well as a platform
to experiment with new language features. In particular, functional programming
languages are sometimes thought of as notational convenience that desugars to an
underlying core language based on the λ-calculus.
Dynamic semantics
The dynamic behavior of the λ-calculus is defined by three principles. The most
basic principle is called the α law or α equivalence, and it asserts that the particular
choice of names for bound variables does not matter; the defining characteristic for
a variable is where it was introduced, enforcing a notion of static scope. We already
saw the principle of α equivalence arise for logical quantifiers in Section 2.1, and the
same idea helps understand the meaning of functions as λ-abstractions λx.M which
bind the variable x in M . For instance, the identity function that immediately returns
its argument unchanged may be written as either λx.x or λy.y, both of which are
considered α equivalent which is written λx.x =α λy.y. As with the logical quantifiers,
24
we will never be more discerning of λ-calculus terms than α equivalence: if M =α N
then we will always treat M and N as the “same” term.
The other dynamic principles of the λ-calculus deserve a more explicit treatment
because of how drastically they can alter terms. For this purpose, we will employ rules
that explain how to rewrite one λ-calculus term into another. More specifically, a
rewriting rule R, written
M R N
and pronounced “M rewrites (by R) to N ,” is a binary relation between terms.
Rewriting rules can be combined by offering a choice between them, so thatM RS N ,
pronounced “M rewrites (by R or S) to N ,” whenever M R N or M S N . We also
denote the inverse rewriting rule by flipping the direction of the  relation symbol,
so that N ≺R M exactly when M R N .
The second principle is called the β law orβ reduction, and it provides the primary
computational force of the λ-calculus. Given a λ-abstraction (i.e. a term of the form
λx.M) that is applied to an argument, we may calculate the result by substituting
the argument for every reference to the λ-abstraction’s parameter:
(λx.M) N β M {N/x}
The term M {N/x} is notation for performing capture-avoiding substitution of the
term N for the free occurrences of variable x in M , such that the static bindings of
variables are preserved.3 The third principle is called the η law orη expansion, and
it imbues functions with a form of extensionality. In essence, a λ-abstraction that
does nothing but forward its parameter to another function the same as that original
function:
M ≺η (λx.M x) (x /∈ FV (M))
Note that this rule is restricted so that M may not refer to the variable x introduced
by the abstraction, denoted by the function FV (M) that computes the set of free
variables in v, again to preserve static binding.
3As before, more details about α equivalence and capture-avoiding substitution in the λ-calculus
are given by Barendregt (1985) and Pierce (2002).
25
Even though the λ-calculus with just functions alone is sufficient for modeling all
computable functions, it is often useful to enrich the language with other constructs.
For instance, we may add pairs to the λ-calculus by giving a way to build a pair out
of two other terms, (M,N), as well as projecting out the first and second components
from a pair, pi1(M) and pi2(M). We may define the dynamic behavior of pairs in the
λ-calculus similarly to the way we did for functions. Since pairs do not introduce any
parameters, they are a bit simpler than functions. The main computational principle,
by analogy called β reduction for pairs, extracts a component out of a pair when it is
demanded:
pi1 (M,N) β M pi2 (M,N) β N
The extensionality principle, here called η expansion for pairs, expands a termM with
the pair formed out of the first and second components of M :
M ≺η (pi1(M), pi2(M))
Along with pairs, we can add a unit value to the λ-calculus, which is a nullary form
of pair containing no elements, written (), that expresses a lack of any interesting
information. On the one hand, since the unit value contains no elements, there are
no projections out of it, and therefore it has no meaningful β reduction. On the other
hand, the extensionality principle is quite strong, and the η expansion for the unit
replaces any term M with the canonical unit value:
M ≺η ()
This rule can be read as the nullary version of the η rule for pairs, where M did not
contain any interesting information, and so it is irrelevant.
We can also add explicit choice to the λ-calculus by extending the language
with (tagged, disjoint) unions, which are like boolean values that carry some extra
information. First, we add the two ways to build a value of the union by tagging a
term with our choice, either ι1 (M) or ι2 (M). Second, we add the method of using
a tagged union by performing case analysis, caseM of ι1 (x1) ⇒ N1 | ι1 (x2) ⇒ N2,
that checks the discriminant M to pick which branch ι1 (x1) ⇒ N1 or ι2 (x2) ⇒ N2
to pursue. Since the term for case analysis introduces variables like function terms
26
do, the dynamic behavior of tagged unions also relies on substitution. The main
computational principle of β reduction for tagged unions checks which of the two tags
were used to build the discriminant and then extracts the payload of the union by
binding it to a variable within the term of the corresponding branch:
case ι1 (M)of
ι1 (x1)⇒ N1
ι2 (x2)⇒ N2
β N1 {M/x1}
case ι1 (M)of
ι1 (x1)⇒ N1
ι2 (x2)⇒ N2
β N2 {M/x2}
The extensionality principle of η expansion for tagged unions says that every tagged
union value must be constructed by one of the two possible tagging methods by
expanding a term M with one that is computed by using case analysis on M to
determine which tag was chosen and then returning the same payload and tag:
M ≺η
caseM of
ι1 (x1)⇒ ι1 (x1)
ι2 (x2)⇒ ι2 (x2)
As before, we can add the nullary form of the binary tagged unions which represent
an impossible void value: since tagged unions provide a choice of two ways to build
results, there is no way to build a void result. To go along with impossible results, we
also have an empty case analysis void terms, caseM of , which will explicitly never
produce any answer because a void term M cannot produce an answer. Like with
units, there is no meaningful β reduction for void expressions because there is no void
value for the empty case analysis to inspect. However, the extensionality principle is
again strong, as it asserts that there is no value of the void type by explicitly discards
any potential result a void term M might return through an empty case analysis:
M ≺η caseM of
This rule can be understood as the nullary version of the η rule for tagged unions,
where there are no possible options for the program to proceed. Intuitively, there
should be no way to encounter a void term during evaluation, since there are no ways
to create void results, and so this η rule explicitly acknowledges that a void term M
can only exist in a dead code branch and its results are therefore irrelevant.
27
Remark 2.3. A basic rewriting rule like R does not necessarily confer any general
properties about the relation, so we systematically denote the enrichment of a rewriting
relation with useful closure properties by changing the shape of the relation symbol
. First off, we have general R reduction, denoted by M →R N and pronounced “M
R-reduces to N ,” which is the compatible closure of R allowing for the R rule to
be applied in any context within M . Syntactically, a context (denoted by C) is a
λ-calculus term with a single hole (denoted by ), and we can plug a term M into a
context C (written as the operation C[M ]) by replacing the  in C with M . In terms
of contexts, general R reduction is defined as the smallest relation →R that includes
R and is closed under compatibility (comp) as follows:
M R N
M →R N
M →R N
C[M ]→R C[N ]
comp
Unlike the capture-avoiding substitution operation M {N/x}, plugging a term M into
a context C might capture free variables of the term, so that even if x is free in M , x
might not be free in C[M ]. As a consequence, α equivalence does not commute with
context filling in the same way that it commutes with capture-avoiding substitution.
For example, we might say that λx. =α λy., but (λx.)[x] = λx.x 6=α λy.x =
(λy.)[x].
Next up, we have the R reduction theory (or R rewriting theory), denoted by
M R N , which is the reflexive-transitive closure of →R allowing for zero or more
repetitions ofR reductions. The R reduction theory is defined as the smallest relation
R that includes →R and is closed under reflexivity (refl) and transitivity (trans) as
follows:
M →R N
M R N M R M
refl M R M
′ M ′ R N
M R N trans
Note that above definition of R is the same as taking the compatible-reflexive-
transitive closure of R directly.
For the most generality, we have the R equational theory, denoted by M =R N
and pronounced as “M R-equals N ,” which is the symmetric-transitive closure ofR
that allows for reductions to be applied in both directions as many times as desired.
The R equational theory is defined as the smallest relation =R that includes R and
28
is closed under symmetry (symm) and transitivity (trans) as follows:
M R N
M =R N
N =R M
M =R N
symm M =R M
′ M ′ =R N
M =R N trans
Note that the above definition of =R is the same as taking the compatible-reflexive-
symmetric-transitive closure of R directly.
Finally, we have R operational reduction, denoted by M 7→R N , which gives us
the R operational semantics, denoted by M 7→ R N , as the reflexive-transitive closure
of 7→R. Both of these are restrictions on the above more general reduction relations: R
operational reduction is a limited form of general R reduction and the R operational
semantics is a limited form of the R reduction theory. The purpose of the operational
semantics is to specify how programs are to be executed by specifying a clear order
on when each reduction step of the program occurs; there should be enough possible
reductions to reach a result, but not so many that there are gratuitously many choices
for what to do at every step. This ordering for selecting the next reduction step can be
achieved by restricting compatibility, which allowed reduction to occur in any context,
to only allowing reduction to occur in a specially chosen subset of contexts called
evaluation contexts, usually denoted by the variable E. Given a choice of evaluation
contexts, R operational reduction and the R operational semantics are defined as the
smalled relations 7→R and 7→ R closed under the following rules:
M R N
E[M ] 7→R E[N ] eval
M 7→R N
M 7→ R N M 7→ RM refl
M 7→ RM ′ M ′ 7→ R N
M 7→ R N trans
Since we have to make a choice for which contexts are evaluation contexts, there can
be many possible operational semantics for a given language. As an example, we can
define a call-by-name operational semantics 7→ β for our λ-calculus discussed so far by
choosing the following evaluation contexts:
E ∈ EvalCxt ::=  | pi1(E) | pi2(E) | E N
| (caseE of ) | (caseE of ι1 (x)⇒ N1 | ι2 (y)⇒ N2)
by using the family of operational β laws.
As with a basic rewriting rule R, we denote the inverse of the directed reduction
relations→R,R, 7→R, and 7→ R by flipping the direction of the arrow, so that N ←R
29
M if and only if M →R N and so on. Since the equational theory =R is symmetric,
it is undirected, so it is its own inverse. End remark 2.3.
Static semantics
So far, we have only considered the dynamic meaning of the λ-calculus without
any mention of its static properties. In particular, now that we have both functions
and pairs, we may want to statically check and rule out programs that might “go
wrong” during calculation. For instance, if we apply a pair to an argument, (x, y) z,
then there is nothing we can do to reduce this program any further. Likewise, it is
nonsensical to ask for the second component of a function, pi2(λx.x). We may rule
out such ill-behaved programs by using a type system which guarantees that such
situations never occur by assigning a type to every term and ensuring that programs
are used in accordance to their types. For instance, we may give a function type,
A→ B, to λ-abstractions as follows:
x : A x....
M : B
λx.M : A→ B →Ix
where λx.M : A→ B means that the function λx.M has type A→ B. The premise
to this rule requires that M has type B assuming that all free occurrences of x in
M have type A. Since the variable x bound in the conclusion, it is closed off by the
premise of the rule because the type values that x can stand for in M has nothing to
do with any other x that might occur elsewhere in a larger term. Having given a rule
for introducing a term of function type, we can now restrict application to only occur
for terms of the correct type:
M : A→ B N : A
M N : B →E
This rule ensures that if we apply a term M to an argument, then M must have a
function type.
Likewise, we may give a product type, A×B, to the creation of a pairs
M : A N : B
(M,N) : A×B ×I
30
as well as limiting first and second projection to terms of a product type:
M : A×B
pi1(M) : A
×E1 M : A×Bpi2(M) : B ×E2
The unit type, 1, is a degenerate form of product types with a single canonical value
() : 1 1I
and no other typing rules.
Tagged unions belong to sum types, A+B, which has two different rules for the
creation of the two distinctly tagged values:
M : A
ι1 (M) : A+B
+I1 M : Bι2 (M) : A+B
+I2
The case analysis term for sum types has the most complex rule, requiring three
premises (one for the discriminant and two for the branches), two of which bind
variables which appear free in their respective sub-terms just like in the rule for
λ-abstractions:
M : A+B
x1 : A
x1
....
N1 : C
x2 : B
x2
....
N2 : C
(caseM of ι1 (x1)⇒ N1 | ι1 (x2)⇒ N2) : C +Ex1,x2
This rule says that a case analysis expression on a term M with the sum type A+B
has a result of type C if the terms N1 and N2 in both branches have the type C,
under the assumption that all free occurrences of x1 in N1 has type A and all free
occurrences of x2 in N2 has type B. The void type, 0, is a degenerate form of sum
types with no possible values and one case analysis term following the typing rule
M : 0
caseM of : C 0E
which says that the result of an empty case analysis on a term M of type 0 can be
said to have any type C because there will never be any result.
With all these rules in place, nonsensical programs like pi1(λx.x) are now ruled
out, since they cannot be given a type. The static semantics (i.e. the typing rules) and
31
the dynamic semantics (i.e. the reduction and expansion relationships) of this simply
typed λ-calculus are summarized in Figure 2.3. Note that the η laws, if left unchecked,
have the potential to cause unwanted relationships between terms. The different ways
that η has the potential to cause problems can be very subtle (Klop & de Vrijer, 1989),
but the issue is most clearly seen for units. In particular, η1 expansion for units says
that any term can be replaced with unit value (). But this apparently far-reaching
law is clearly nonsensical for representing programs: if every possible program is just
() then there’s no point in evaluating anything because there is never an interesting
answer! The other direction is not much better; η1 reduction says that the unit value ()
could just as well become anything else, leading to many different conflicting answers
whenever we encounter a unit value.
This conundrum is somewhat self-imposed, however: clearly the η1 law shouldn’t
apply to every term, but only to terms we expect will result in a unit value anyway.
Therefore, the η laws are all restricted to apply only to terms of an appropriate type,
so for example the η1 law only expands terms of type 1 with (). This creates an
interesting split in the relationships between terms, where we have the β laws that
do not depend on types, so that they still make sense for reasoning about untyped
terms, in contrast with the η laws that do depend on types to make sense, so that
they require typing information to ensure that they are correctly applied.
Remark 2.4. We should note that some care needs to be taken during a type derivation
to make sure that the distinction between variables in different scopes is clear. For
example, consider the following typing derivation of the function λx.λx.x:
x : A x
λx.x : B → A →Ix
λx.λx.x : A→ B → A →Ix
This typing derivation is not valid! In particular, note that the function λx.λx.x is
α equivalent to λx.λy.y by renaming the second bound variable, which represents a
binary function that returns its second argument. The problem is that by rebinding
the same variable x within the same scope, it is easy to have confusion about which
of the two arguments is meant when referring to x. This is why typing rules like →I
for terms which bind variables introduce a new scope in their premise to prevent this
32
X, Y, Z ∈ TypeVariable ::= . . .
A,B,C ∈ Type ::= X | 1 | 0 | A×B | A+B | A→ B
x, y, z ∈ Variable ::= . . .
M,N ∈ Term ::= x | () | caseM of
| (M,N) | pi1(M) | pi2(M)
| ι1 (M) | ι2 (M) | (caseM of ι1 (x)⇒ N1 | ι2 (y)⇒ N2)
| λx.M |M N
H, J ∈ Judgement ::= M : A
() : 1 1I no 1E rule no 0I rule
M : 0
caseM of : C 0E
M : A N : B
(M,N) : A×B ×I
M : A×B
pi1(M) : A
×E1 M : A×Bpi2(M) : B ×E2
M : A
ι1 (M) : A+B
+I1 M : Bι2 (M) : A+B
+I2
M : A+B
x : A x....
N1 : C
y : B
y
....
N2 : C
caseM of ι1 (x)⇒ N1|ι2 (y)⇒ N2 : C +Ex,y
x : A x....
M : B
λx.M : A→ B →Ix
M : A→ B N : A
M N : B →E
(β1) no rule (η1) M : 1 ≺η ()
(β0) no rule (η0) M : 0 ≺η caseM of
(β×) pii(M1,M2) β Mi (η×) M : A×B ≺η (pi1(M), pi2(M))
(β+)
case ιi (M)of
ι1 (x1)⇒ N1
ι2 (x2)⇒ N2
β Ni {M/xi} (η+) M : A+B ≺η
caseM of
ι1 (x1)⇒ ι1 (x1)
ι2 (x2)⇒ ι2 (x2)
(β→) (λx.M) N β M {N/x} (η→) M : A→B ≺η λx.M x (x /∈ FV (M))
FIGURE 2.3. The simply typed λ-calculus: with unit (1), void (0), product (×), sum
(+), and function (→) types.
33
confusion. In particular, the typing derivation for the sub-term λx.x is:
x : B x
λx.x : B → B →Ix
In this derivation, the variable x is already closed off, because it is bound by the
λ-abstraction in the conclusion. Therefore, when we continue the derivation to type
the outer λ-abstraction, the type of the bound reference of x is already fixed, and
cannot be changed as in
x : B x
λx.x : B → B →Ix
λx.λx.x : A→ B → B →Ix
which is the correct typing derivation for this term. End remark 2.4.
Example 2.2. For an example of how to program in the λ-calculus, consider the
following function which takes a nested pair, of type (A×B)×C, and swaps the inner
first and second components, while discarding the outer component:
λx. (pi2(pi1(x)), pi1(pi1(x)))
We can check that this function is indeed well-typed, using the typing rules
given in Figure 2.3, by the constructing the typing derivation in Figure 2.4.
Notice how the derivation bears a close structural resemblance to the proof of
` ((A ∧B) ∧ C) ⊃ (B ∧ A) given in Figure 2.2 of Example 2.1. In addition, we
can check that this function behaves as intended by applying it to a nested pair,
x : (A×B)× C x
pi1(x) : A×B ×E1
pi2(pi1(x)) : B
×E2
x : (A×B)× C x
pi1(x) : A×B ×E1
pi1(pi1(x)) : A
×E1
(pi2(pi1(x)), pi1(pi1(x))) : B × A ×I
λx. (pi2(pi1(x)), pi1(pi1(x))) : ((A×B)× C)→ (B × A) →Ix
FIGURE 2.4. Typing derivation of the λ-calculus term λx. (pi2(pi1(x)), pi1(pi1(x))).
34
((M1,M2),M3), and evaluating it with the reductions given in Figure 2.3:
(λx. (pi2(pi1(x)), pi1(pi1(x)))) ((M1,M2),M3)
→β→ (pi2(pi1((M1,M2),M3)), pi1(pi1((M1,M2),M3)))
β× (pi2(M1,M2), pi1 (M1,M2))
β× (M2,M1)
which confirms that this is the function we wanted. End example 2.2.
Type abstraction
If we only stick to typed terms, then the language we have described so far is
rather rigid and painful to use because every term must have a fixed specific type even
if it doesn’t matter. For example, the identity function λx.x, which just returns its
given input, works uniformly for values of any type. However, it must be given a single
type like Int → Int or String → String, meaning that the integer and string identity
functions must be defined separately even though their definition is the same. Statically
typed programming languages combat this useless redundancy with features called
polymorphism or generics that correspond to universal types in the λ-calculus, which
has been co-discovered in Girard’s (1971) system F and Reynolds’s (1974) polymorphic
λ-calculus. The main idea is to let generic terms abstract over type variables, so that
we have the term ΛX.M similar to the λ-abstractions that represent functions, and
to specialize generic terms to specific types, so that we have the term M A similar to
function application. The computational β reduction for polymorphism also mimics
functions by substituting the specialized type for the abstracted type variable:
(ΛX.M) A β M {A/X}
Likewise, the extensional η expansion for polymorphism says that a generic term that
just immediately specializes another generic term M with its applied type is the same
as M :
M ≺η ΛX.M X
35
These generic terms can be given a universal type of the form ∀X.A. Specialization
of generic terms just involves plugging in the applied type for the variable X in the
result, but the typing rule for abstraction is more tricky:
.... (X /∈ FV (∗))
M : A
ΛX.M : ∀X.A ∀IX
M : ∀X.A
M B : A {B/X} ∀E
The ∀I rule imposes a side condition on its premise, X /∈ FV (∗), which says that the
type variable X cannot appear in the type of any free variable of M . With universal
types, we can finally give a single, polymorphic definition of the identity function once
and for all, ΛX.λx.x : ∀X.X → X, which is typed as follows:
x : X x
λx.x : X → X →Ix
ΛX.λx.x : ∀X.X → X ∀IX
There is another complementary form of type abstraction with a very different
purpose in programming languages. For the sake of supporting more modular programs,
many typed languages allow for modules or other basic program units to hide some
of their representation. That way, the implementor of the module may use details
of its representation, but users of the module can only see the public interface do
not have access to these private details since peaking into the private details of a
module’s implementation would break the abstraction and prevent the user code from
linking with a different implementation. For example, we might have a module for
integer sets with four components in its public interface: the empty set, a function
for creating the singleton set of a given integer, a union function, and a membership
function that decides if an integer is in the set. Now there are many different ways
that a program could represent integer sets—arrays, linked lists, hash tables, balanced
trees, higher-order functions, etc.—but the code which uses integer sets should be
independent of the implementors choice of representation so that it can plug in with
several different implementations of the same public interface. This type of abstraction
can be modeled by existential types that make a choice of type private to a small
fragment of the overall program. For our example of integer sets, their interface is
described by the type
∃X.X × (Int→ X)× (X → X → X)× (Int→ X → Bool)
36
where the ∃ abstracts over a private type denoted by the variable X, and the four
components of the public interface are given by the four components of the product:
the empty set of type X, the singleton function of type Int→ X, the union function
of type X → X → X, and the membership function of type Int→ X → Bool.
How do we write programs with existential types? To be explicit about when we
are abstracting over a private type A used within a term M , we can package them
together as A@M where the term is tagged with its private type. We can then use a
packaged term by employing a new form of case analysis, caseM of X@y ⇒ N , which
locally unpacks M and separates out its private type (bound to the type variable X)
from the contents (bound to the variable y) for the purpose of evaluating the result
of N . The computational β reduction for existential types unpacks a type-packaged
term A@M that is in the eye of case analysis, substituting the concrete type A and
the implementation M for the abstract type variable X and the reference x within
their local scope:
caseA@M of X @ x⇒ N β N {A/X,N/x}
The extensional η principle for existential types says that every value of an existential
type must be a type-packaged value by expanding an existential term M into one that
is computed by unpacking M to extract its private type and value, only to return a
new package with the same type and value:
M ≺η caseM of X @ x⇒ X @ x
This form of existential type abstraction for packages can be enforced with the following
typing rules:
M : A {B/X}
B @M : ∃X.A ∃I
M : ∃X.A
x : A x.... (X /∈ FV (∗))
N : C (X /∈ FV (C))
(caseM of X @ x⇒ N) : C ∃EX,x
To form a new package B@M : ∃X.A, we only need to check that the underlying term
M does indeed implement a program of type A with the chosen type B substituted
for X. Unpacking a type abstraction is more complex, as we need to ensure that
the hidden type information cannot “leak” outside its scope. Therefore, the generic
37
A,B,C ∈ Type ::= . . . | ∀X.A | ∃X.A
M,N ∈ Term ::= . . . | ΛX.M |M A | A@M | caseM of X @ x⇒ N
.... X /∈ FV (∗)
M : A
ΛX.M : ∀X.A ∀IX
M : ∀X.A
M B : A {B/X} ∀E
M : A {B/X}
B @M : ∃X.A ∃I
M : ∃X.A
x : B x.... X /∈ FV (∗)
N : C X /∈ FV (C)
caseM of X @ x⇒ N : C ∃EX,x
(β∀) (ΛX.M) A β M {A/X} (η∀) M : ∀X.A ≺η ΛX.M X (X /∈ FV (M))
(β∃)
caseA@M of
X @ x⇒ N β N {A/X,M/x} (η
∃) M : ∃X.A ≺η
caseM of
X @ x⇒ X @ x
FIGURE 2.5. The polymorphic λ-calculus (i.e. system F): extending the simply typed
λ-calculus with universal (∀) and existential (∃) type abstraction.
type variable X that is brought into scope by the case analysis cannot appear in the
types of any other free variables (besides the corresponding variable x) in its scope.
Additionally, the generic type X bound by the unpacking case analysis cannot appear
in the return type C, which is the other source of potential leak.
The static and dynamic semantics of the universal (∀) and existential (∃) forms
of type abstraction are summarized in Figure 2.5, which extends the simply typed
λ-calculus from Figure 2.3 to be a full-fledged model of statically typed (functional)
programming languages.
Proofs as Programs
Amazingly, despite their different origins and presentations, both the systems
have a close, one-for-one correspondence to each other. Example 2.1 and Example 2.2
correspond to different ways of expressing the same idea. Both natural deduction
and the λ-calculus end up revealing the same underlying ideas in different ways. The
propositions of natural deduction are isomorphic to the types of the λ-calculus, where
38
conjunctions are the same as pair types, disjunctions are the same as sum types,
implications are the same as function types, logical truth and falsehood is the same
as the unit and void types, and the two quantifiers are the same in both systems.
Furthermore, the proofs of natural deduction are isomorphic to the (typed) terms
of the λ-calculus. This structural similarity between the two systems gives us the
slogan, “proofs as programs and propositions as types.” From this point of view,
natural deduction may be seen as the essence of the type system for the λ-calculus
and the λ-calculus may be seen as a more concise term language for expressing proofs
in natural deduction. For this reason, we may say that the λ-calculus is a natural
deduction language.
The correspondence between these two systems is not just between their syntax
and static structures, but also extends to the dynamic properties as well. Local
soundness and completeness in natural deduction are exactly the same as the β
and η laws of terms in the λ-calculus, respectively, for all the discussed types:
functions, products, sums, unit, void, universal, and existential types. Therefore, it
is no coincidence that the β and η rules for functions in the λ-calculus appeared as
they originally did, or that conjunction and disjunction have their given introduction
and elimination rules in NJ. Effectively, both the study of logic and the study of
computability have lead mathematicians to (re)discover different perspectives of the
same essential phenomena (Wadler, 2015).
Surprisingly, there is also a third entity in this correspondence: an algebraic
structure known as Cartesian closed categories (Lambek & Scott, 1986). In general, a
category is made up of:
– some objects A, B and C (“points”),
– some morphisms between those objects (“arrows”), a morphism f from A to B
is written f : A→ B,
– a trivial morphism from every object to itself (“identity”), and
– the ability to chain together any two morphisms passing through the same object
(“composition”). Given f : A → B and g : B → C then g ◦ f is a morphism
from A to C ,
along with some laws about identity and composition. And Cartesian closed categories
in particular are also guaranteed to have some special objects: a terminal object 1, a
39
product object A×B for any objects A and B, and an exponential object BA for any
objects A and B. As it turns out, the terminal (1), product (A×B), and exponential
(BA) objects correspond to unit (1), pair (A × B), and function (A → B) types in
the λ-calculus and to truth (>), conjunction (A ∧ B), and implication (A ⊃ B) in
natural deduction, respectively. Cartesian closed categories may be seen as a variable-
free presentation of the λ-calculus, where λ-abstractions (which bind variables) are
replaced by primitive functions. Furthermore, the categorical concept of the initial
object (0) and sums of objects (A+B) correspond with the empty (0) and sum (A+B)
types and with logical falsehood (⊥) and disjunction (A ∨B), respectively. Since the
same idea has been stumbled upon three different times from three different angles,
the connection between proofs and programs cannot be a simple coincidence.
A Critical Look at the λ-Calculus
The Curry-Howard isomorphism lead to striking discoveries and developments
that likely would not have arisen otherwise. The connection between logic and
programming languages led to the development of mechanized proof assistants, notably
the Coq system (Coquand, 1985), which are used in both the security and verification
communities for validating the correctness of programs. The connection between
category theory and programming languages suggested a new compilation technique
for ML (Cousineau et al., 1987). However, let us now look at the λ-calculus with a
more critical eye. There are some defining principles and computational phenomena
that are important to programming languages, but are not addressed by the λ-calculus.
For example, what about:
– Duality? The concept of duality is important in category theory where it comes
for free as a consequence of the presentation. Since the morphisms in category
theory have a direction, we can just “flip all the arrows” to find its dual without
any effort or creativity on our part. This action gives us a straightforward method
to find the dual of any category or diagram. For example, consider the diagram
that describes products in categorical terms:
C
A A×B B
f g
!(f,g)
pi1 pi2
40
Here, for any two objects,A and B, and morphisms, f , and g, there is the product
A×B object with the projection morphisms pi1 and pi2 out of the product and
a unique morphism into the product. The description of sums pops out for free
by just turning that diagram around:
A A+B B
C
ι1
f
![f,g]
ι2
g
Now, the two projections have become two injections, ι1 and ι2, into the sum
object and we have a unique morphism out of the sum for any f and g.
Duality also appears in logic, for example in the traditional De Morgan laws
like ¬(A ∨B) = (¬A) ∧ (¬B). Predictably, the corresponding concept of a sum
object (the dual of a product) in logic is disjunction (the dual of a conjunction).
If we look at the rules of NJ from Figure 2.1, the introduction rules for A ∨B
bear a resemblance to the elimination rules for A∧B: one is just flipped upside-
down from the other. However, the elimination rule for disjunction is quite
different from the introduction for conjunction. This dissimilarity comes from
the asymmetry in natural deduction. We may have many premises, but only a
single conclusion. It seems like a more symmetrical system of logic would be
easier to methodically determine duality just like in category theory.
Likewise, this form of duality is not readily apparent in the λ-calculus. Since the
λ-calculus is isomorphic to NJ, it shares the same biases and lack of symmetry.
The emphasis of the language is entirely on the production of information: a
λ-abstraction produces a function, a function application produces the result, etc.
For this reason, the relationship between a pair, (M,N), and case analysis on
tagged unions, caseM of ι1 (x) ⇒ N1|ι2 (y) ⇒ N2, is not entirely obvious. For
this reason, we would like to study a language which expresses duality “for free,”
and which corresponds to a more symmetrical system of logic.
– Evaluation strategy? Reynolds (1998) observed that while functional or
applicative languages may be based on the λ-calculus, the true λ-calculus implies
a lazy (call-by-name) evaluation order, whereas many languages are evaluated
by a strict (call-by-value) order that first reduces arguments before performing
a function call.
41
To resolve this mismatch between the λ-calculus and strict programming
languages, Plotkin (1975) defined a call-by-value variant of the λ-calculus
along with a continuation-passing style (CPS) transformation that embeds
the evaluation order into the program itself. Sabry & Felleisen (1993) give a
complete set of equations for reasoning about the call-by-value λ-calculus based
on Fischer’s (1993) call-by-value CPS transformation, and which corresponds
to Moggi’s (1989) computational λ-calculus. The equations were later refined
into a complete theory for call-by-value reduction by Sabry & Wadler (1997).
More recently, there has been work on a theory for reasoning about call-by-
need evaluation of the λ-calculus (Ariola et al., 1995; Ariola & Felleisen, 1997;
Maraist et al., 1998), which is the strategy commonly employed by Haskell
implementations, and the development of Levy’s (2001) call-by-push-value
framework which includes both call-by-value and call-by-name evaluation but
not call-by-need.
Different evaluation strategies used by implementations of functional programming
languages have been studied as different versions of the λ-calculus that embody
the implementation, including various calculi for call-by-value (Plotkin, 1975;
Moggi, 1989; Sabry & Felleisen, 1993; Sabry & Wadler, 1997) and call-by-need
(Ariola et al., 1995; Ariola & Felleisen, 1997; Maraist et al., 1998) evaluation.
What we would ultimately want is not just another calculus, but instead a
framework that gives a clear justification of the evaluation strategies found in
programming languages, and where the relationships between strategies can be
naturally expressed. Can we have a logical foundation for programming languages
that is naturally strict, in the same way that the λ-calculus is naturally lazy?
And which readily accounts for programs that utilize more than one evaluation
strategy in the same language? Can we express the duality between evaluation
strategies Filinski (1989); Curien & Herbelin (2000); Wadler (2003) generically,
between arbitrarily many pairs of strategies?
– Object-oriented programming? The object-oriented paradigm has become a
prominent part of the mainstream programming landscape. Unfortunately,
what is meant by an “object” in the object-oriented sense is fuzzy, since the
exact details of “what is an object” depend on choices made by the particular
programming language. One concept of objects that is universal across every
42
programming language is dynamic dispatch which is used to select the behavior
of a method call based on the value or type of an object. Dynamic dispatch
is emphasized by Kay (1993) in the form of message passing in the design of
Smalltalk. Abadi & Cardelli (1996) give a theoretical formulation for the many
features of object-oriented languages, wherein dynamic dispatch plays a central
role. Can we give an account of the essence of objects, and in particular messages
and dispatch, that is connected to logic and category theory in the same way as
the λ-calculus? Even more, can this foundation for objects refer back to basic
principles discovered independently in the field of logic?
– Control flow? Every programming language has some concept of control flow
which can describe the order that instructions are executed, the flow of data
dependencies between parts of a program, or the call-and-return protocol
of functions. The λ-calculus serves as a wonderful formalization of for pure
functions. However, many languages include additional computational effects,
like exceptions, that let programs manipulate control flow in ways not possible
with pure functions, and so they lie outside of the expressive power of the λ-
calculus (Felleisen, 1991). For example, Scheme (Kelsey et al., 1998) is a language
based on the λ-calculus that nonetheless has operators like callcc that reifies
control flow as a first-class object, which follows a traditional approach for
representing control flow by adding new primitives to the λ-calculus.
Instead, we would rather understand the flow of control in a setting where it
is naturally expressed as a consequence of the language, rather than added on
as an afterthought. Surprisingly, certain programmatic manipulations of control
flow, like Scheme’s callcc, correspond to axioms of classical logic (Griffin, 1990;
Ariola & Herbelin, 2003). Since these classical reasoning principles are a well
established part of logic, can we also have a corresponding language with a
naturally classical representation of control as a first-class citizen?
With the aim of answering each of these questions, we will put the λ-calculus
aside and we look to another logical framework instead of natural deduction. Most
surprisingly, we do not have to look very far, since Gentzen (1935a) introduced the
sequent calculus along side natural deduction as an alternative system of formal logic.
Gentzen developed sequent calculus in order to better understand the properties of
natural deduction. Therefore, to answers these questions about programming, we will
43
look for the computational interpretation of the sequent calculus and its corresponding
programming language.
44
CHAPTER III
Sequent Calculus
Natural deduction is not an only child; it was born with a twin sibling called
the sequent calculus. One of Gentzen’s (1935a) ground-breaking insights with the
sequent calculus is the use of its namesake sequents to organize the information we
have about the various propositions in question. In its most general form, a sequent
is a conditional conglomeration of propositions:
A1, A2, . . . , An ` B1, B2, . . . , Bm
pronounced “A1, A2, . . ., and An entail B1, B2, . . ., or Bm,” which states that assuming
each of A1, A2, . . . , An are true then at least one of B1, B2, . . . , Bm must be true. The
turnstyle (`) in the middle of the sequent separates the hypotheses on the left, which
we collectively write as Γ, from the consequences on the right, which we collectively
write as ∆.
This separation between the left and right sides of the sequent gives the essential
skeletal structure of the sequent calculus as a logic. As special cases, we can form
several basic judgements about logical propositions using our above interpretation of
the meaning of sequents by observing that an empty collection of hypotheses denotes
“true” and an empty collection of consequences denotes “false.” A single consequence
without hypotheses ` A means “A is true,”1 a single hypothesis without consequences
A ` means “A is false,” and the empty sequent ` is a primitive contradiction “true
entails false.” So already, the basic structure of the sequent gives us a language for
speaking about truth, falsehood, and contradiction without knowing anything else
about the logic at hand.
1Note how sequents gracefully extend the single judgement ` A of the NJ system of natural
deduction, which only directly asserts the truth of propositions, so that statements of falsehood
or contradiction must be represented indirectly through logical connectives like ` ⊥ (i.e. “false is
true”) for contradiction and ` ¬A (i.e. “not A is true”) or ` A→ ⊥ (i.e. “A implies false is true”)
for falsehood. A consequence of these indirect encodings is that simplified versions of NJ without a
false connective ⊥ will have trouble speaking about contradictions, and likewise simplifications of
NJ without negation ¬ will have trouble speaking about falsehoods.
45
Let’s now revisit the basic binary connectives—conjunction (A ∧B), disjunction
(A ∨B), and implication (A ⊃ B)—by giving their meaning in terms of truth tables
that describe the relationship between the truth of a compound proposition and the
truth of its parts, as shown in Figure 3.1. Coupled with the interpretation of sequents,
this interpretation of connectives gives a simple method of determining the validity of
inference rules by checking if the conclusion does indeed follow from the premises. For
example, we can validate the inference rules involving conjunction shown in Figure 3.2.
Due to the interaction between entailment in the sequent (separating hypotheses
from consequences) and the line of inference (separating premises from conclusions), we
have two dimensions for orienting inference rules based on the location of their primary
proposition (marked with a box in Figure 3.2). On the horizontal axis, rules where the
primary proposition appears to the right or left of the turnstyle are called right and left
rules, respectively. On the vertical axis, rules where the primary proposition appears
below or above the line of inference are called introduction and elimination rules,
respectively. This gives us four quadrants where the rules of inference for conjunction
might live.
– Right introduction: knowing that A is true and B is true is sufficient to conclude
that A ∧B is true.
– Right elimination: known that A ∧B is true is sufficient to conclude that A is
true and likewise that B is true.
– Left introduction: knowing that A is false is sufficient to conclude that A∧B is
false, and likewise when B is false.
– Left elimination: knowing that A ∧ B is false while both A and B are true is
sufficient to deduce a contradiction, as this represents an impossible situation.
Similar inference rules with similar readings can be given for disjunction and
implication under the same right/left and introduction/elimination orientations as
shown in Figure 3.3 and Figure 3.4.
Notice how the extra judgemental structure provided by sequents allows for
simpler versions of some of the particularly complex inference rules from natural
deduction in Figure 2.1 that introduce localized assumptions to select premises. In
contrast to the NJ inference rule ⊃I for right implication introduction which proves
A→ B by introducing a local assumption that A is true (` A) in the premise which
46
A B A ∧B
False False False
False True False
True False False
True True True
A B A ∨B
False False False
False True True
True False True
True True True
A B A ⊃ B
False False True
False True True
True False False
True True True
FIGURE 3.1. Truth tables for conjunction (∧), disjunction (∨), and implication (⊃).
Left Right
Elimination
A ∧B ` ` A ` B
`
` A ∧B
` A
` A ∧B
` B
Introduction
A `
A ∧B `
B `
A ∧B `
` A ` B
` A ∧B
FIGURE 3.2. The orientation of deductions for conjunction (∧).
Left Right
Elimination
A ∨B `
A `
A ∨B `
B `
` A ∨B A ` B `
`
Introduction
A ` B `
A ∨B `
` A
` A ∨B
` B
` A ∨B
FIGURE 3.3. The orientation of deductions for disjunction (∨).
Left Right
Elimination
A ⊃ B `
` A
A ⊃ B `
B `
` A ⊃ B ` A
` B
Introduction
` A B `
A ⊃ B `
A ` B
` A ⊃ B
FIGURE 3.4. The orientation of deductions for implication (⊃).
47
proves B is true (` B), the sequent-based right introduction rule in Figure 3.4 instead
stores A as a hypothesis in the premise A ` B which asserts that A entails B, thereby
reducing the implication connective to the implication built into the meaning of the
turnstyle. Likewise, In contrast to the NJ inference rule ∨E for right disjunction
elimination which introduces local assumptions for both possibilities A and B into
two different premises, instead the sequent-based right elimination rule in Figure 3.3
stores the possibilities as hypotheses in the premises A ` and B ` which assert that
A and B are false.
With the dimensions of logical orientation illustrated in Figure 3.2, Figure 3.3, and
Figure 3.4, we can identify one of the primary distinctions between natural deduction
and the sequent calculus. Natural deduction is exclusively made up of right rules—
including both right introduction and right elimination—and the sequent calculus
is exclusively made up of introduction rules—including both right introduction and
left introduction.2 Or in other words, natural deduction is concerned with deducing
and using the truth of propositions, whereas the sequent calculus is concerned with
introducing true and false applications of logical connectives. With this fundamental
characterization of the sequent calculus in mind, we will delve into Gentzen’s LK: the
original sequent-based logic.
Gentzen’s LK
Gentzen’s LK, a simple logic based extensively on the use of sequents to trace
local hypotheses and consequences throughout a proof, is given in Figure 3.5. The
sequents are built out of (ordered) lists of propositions Γ and ∆, and the inference
rules let us build proof trees by stacking inferences on top of one another. We include
all the same connectives in LK as we had in NJ: the nullary constants > and ⊥,
the binary operators ∧, ∨, and ⊃, and quantifiers ∀ and ∃. Additionally, notice that
negation is included as a full-fledged unary connective ¬A, whose logical inference
rules are easy to define in terms of sequents, instead of encoding it with implication
and falsehood as in NJ.
The various inference rules of LK can be thought of in three groups that
collectively work toward different objectives. The first group, containing just the axiom
(Ax) and cut (Cut) rules, gives the core of LK. The Ax rule lets us draw consequences
2But no one, it seems, is interested in left eliminations. A rare exception is the stack calculus
(Carraro et al., 2012) which characterizes implication entirely by left rules only.
48
X, Y, Z ∈ PropVariable ::= . . .
A,B,C ∈ Proposition ::= X | > | ⊥ | A ∧B | A ∨B | ¬A | A ⊃ B | ∀X.A | ∃X.A
Γ ∈ Hypothesis ::= A1, . . . , An
∆ ∈ Consequence ::= A1, . . . , An
Judgement ::= Γ ` ∆
Core rules:
A ` A Ax
Γ ` A,∆ Γ′, A ` ∆′
Γ′,Γ ` ∆′,∆ Cut
Logical rules:
Γ ` >,∆ >R no >L rule no ⊥R rule Γ,⊥ ` ∆ ⊥L
Γ ` A,∆ Γ ` B,∆
Γ ` A ∧B,∆ ∧R
Γ, A ` ∆
Γ, A ∧B ` ∆ ∧L1
Γ, B ` ∆
Γ, A ∧B ` ∆ ∧L2
Γ ` A,∆
Γ ` A ∨B,∆ ∨R1
Γ ` B,∆
Γ ` A ∨B,∆ ∨R2
Γ, A ` ∆ Γ, B ` ∆
Γ, A ∨B ` ∆ ∨L
Γ, A ` ∆
Γ ` ¬A,∆ ¬R
Γ ` A,∆
Γ,¬A ` ∆ ¬L
Γ, A ` B,∆
Γ ` A ⊃ B,∆ ⊃R
Γ ` A,∆ Γ′, B ` ∆′
Γ′,Γ, A ⊃ B ` ∆′,∆ ⊃L
Γ ` A,∆ X /∈ FV (Γ ` ∆)
Γ ` ∀X.A,∆ ∀R
Γ, A {B/X} ` ∆
Γ,∀X.A ` ∆ ∀L
Γ ` A {B/X} ,∆
Γ ` ∃X.A,∆ ∃R
Γ, A ` ∆ X /∈ FV (Γ ` ∆)
Γ,∃X.A ` ∆ ∃L
Structural rules:
Γ ` ∆
Γ ` A,∆ WR
Γ ` ∆
Γ, A ` ∆ WL
Γ ` A,A,∆
Γ ` A,∆ CR
Γ, A,A ` ∆
Γ, A ` ∆ CL
Γ ` ∆, A,B,∆′
Γ ` ∆, B,A,∆′ XR
Γ′, B,A,Γ ` ∆
Γ′, A,B,Γ ` ∆ XL
FIGURE 3.5. The LK sequent calculus for second-order propositional logic: with truth
(>), falsehood (⊥), conjunction (∧), disjunction (∨), negation (¬), implication (⊃),
and both universal (∀) and existential (∃) propositional quantification.
49
from hypotheses with the understanding that “A entails A” for any proposition A.
The Cut rule lets us eliminate intermediate propositions from a proof. For example,
the special case of the Cut rule where the hypothesis Γ and Γ′ and consequences ∆
and ∆′ are all empty is:
` A A `
` Cut
In other words, if we know that a proposition A is both true ( ` A) and false (A ` ),
then we can conclude that a contradiction has taken place ( ` ). We can then use
the intuitive reading of sequents to extend this reasoning to the general form of Cut,
meaning that it is valid to allow additional hypotheses and alternate consequences
in both premises when eliminating a proposition in this fashion so long as they are
all gathered together in the resulting conclusion. If Γ entails either A or ∆, and both
Γ′ and A entails ∆′, then both Γ′ and Γ entails either ∆′ or ∆ by cases on which of
A or ∆ is entailed by Γ: if A is a consequence of Γ, then ∆′ is a consequence of the
combination of A and Γ′, otherwise ∆ must be a consequence of Γ.
Both Ax and Cut play an important part in the overall structure of LK proof
trees. The Ax serves as the primitive leaves of the proof, signifying that there is
nothing interesting to justify because we have just what is needed. The Cut lets us use
auxiliary proofs or “lemmas” without them appearing in the final conclusion, where
on the one hand we show how to derive a proposition A as a consequence and on the
other hand we assume A as a hypothesis that may be used in another proof.
The second group of inference rules aims to characterize the logical connectives.
These logical rules are generalizations of the introduction rules for the connectives
from Figure 3.2, Figure 3.3, and Figure 3.4: the left rules are named with an L and
the right rules are named with an R. Compared to the basic inference rules that came
from an intuitive understanding of connectives as truth tables, each logical rule is
generalized with additional hypotheses and alternative conclusions that are “along for
the ride,” similar to Cut. For example, the two left introduction rules for conjunction
in Figure 3.2 are generalized to:
Γ, A ` ∆
Γ, A ∧B ` ∆ ∧L1
Γ, B ` ∆
Γ, A ∧B ` ∆ ∧L2
which say that if ∆ is a consequence of A and Γ, then ∆ is just as well a consequence
of A ∧B and Γ (and similarly for B). Since we also consider logical negation ¬A as
50
a connective, it too is equipped with left and right introduction rules in Figure 3.5.
These rules have the following special cases when Γ and ∆ are empty:
A `
` ¬A ¬R
` A
¬A ` ¬L
In other words, whenever A is false we can infer that ¬A true, and whenever A is true
we know ¬A is false. Similarly, the logical rules of the nullary connectives > and ⊥
are easy verify by the meaning of sequents. Clearly Γ entails either > or ∆ for any Γ
and ∆, since > is always true, and likewise both Γ and ⊥ entail ∆ because ⊥ is never
true.
The most subtle logical connectives in LK are the quantifiers ∀ and ∃. The special
cases of the introduction rules for ∀X.A and ∃X.A when Γ and ∆ are:
` A
` ∀X.A ∀R
A {B/X} `
∀X.A ` ∀L
` A {B/X}
` ∃X.A ∃R
A `
∃X.A ` ∃L
For universal quantification over the variable X in A, if we can prove that A is true
without knowing anything about X then we can infer that ∀X.A is true, and if we
can exhibit a counterexample for a specific B such that A with B for X is false then
we know the general ∀X.A must be false. Existential quantification over the variable
X in A is reversed, so that exhibiting an example for a specific B such that A with B
for X is true means that ∃X.A must be true, whereas showing that A is false without
knowing anything about X lets us infer that ∃X.A is false.
The extra subtlety of the quantifiers lies in ensuring that we “know nothing else
about X.” In natural deduction, this fact was expressed as a property of an entire
proof sub-tree by checking all the leaves. In the sequent calculus, however, this extra
constraint is more easily captured locally as a simple side condition because the “leaves”
are all immediately known within the sequents. This side condition states that the
variable X does not appear free anywhere else in the sequent, written as the premise
X /∈ FV (Γ ` ∆) in both the ∀R and ∃L rules. Just as in NJ, this extra side condition
really is necessary, since without it both quantifiers collapse into one, which is clearly
not what we want. In LK, we should expect that a ∀ entails the corresponding ∃, for
51
example ∀X.X ` ∃X.X which is proved as follows:
Y ` Y Ax
∀X.X ` Y ∀L
∀X.X ` ∃X.X ∃R
But intuitively it shouldn’t be that an ∃ always entails the corresponding ∀. However,
consider the following attempted proof of ∃X.X ` ∀X.X:
X ` X Ax X /∈ FV ( ` X)
∃X.X ` X ∃L X /∈ FV (∃X.X ` )
∃X.X ` ∀X.X ∀R
The only reason that this proof is not valid is because the side conditions on X are
not met: X /∈ FV (∃X.X ` ) is true but X /∈ FV ( ` X) does not hold. Therefore,
the side conditions on the free type variables of sequents in the ∀R and ∃L rules are
essential for keeping the intended distinct meanings of the quantifiers.
The third group of inference rules aim to describe the structural properties of
the sequents themselves that arise from their meaning. The weakening rules say that
we can make any proof weaker by adding additional unused hypotheses (WL) or
considering alternative unfulfilled consequences (WR) since the presence of irrelevant
propositions doesn’t matter. The contraction rules say that duplicate hypotheses (CL)
and duplicate consequences (CR) can just as well be merged into one since redundant
repetitions don’t matter. And finally, the exchange rules say that hypotheses (XL) and
consequences (XR) can be swapped since the order of propositions doesn’t matter.
Remark 3.1. It may seem strange that the meaning of a sequent with multiple
consequences is that only one consequence must be true instead of all consequences
being true. In other words, the consequences of a sequent are disjunctive rather than
conjunctive so that, for example, A ` B,C means “A entails B or C” instead of “A
entails B and C.” One reason for this interpretation is that disjunctive consequences
can be weakened but conjunctive consequences cannot. For example, if we already
know that “A entails B or C” then we can deduce “A entails B or C or D” for any
D because we already know that either B or C is a consequence of A, so the status
of D is irrelevant. However, if we already know that “A entails B and C” then we
don’t know much about “A entails B and C and D” in general, since D might not
actually follow from A at all. A similar argument also explains why the hypotheses of
52
a sequent are conjunctive rather than disjunctive. Therefore, the meaning of sequents,
where all hypotheses must entail one consequence, is essential for enabling weakening
on both sides of entailment. End remark 3.1.
Example 3.1. Through the exclusive use of introduction rules for treating logical
connectives, LK enables a “bottom up” style of building proofs by starting with a
final sequent as a goal that we would like to prove and building the rest of the proof
up from there. When read in reverse, each logical rule identifies a connective in the
goal below the line of inference and breaks it down into simpler sub-goals above the
line. For example, let’s revisit Example 2.1 and consider how to build an LK proof
that the proposition ((A ∧ B) ∧ C) ⊃ (B ∧ A) is true. As in NJ, we begin with the
sequent ` ((A∧B)∧C) ⊃ (B∧A) as the goal and notice that the primary connective
exposed in the only proposition available is implication, so we can apply the right
implication rule: ....
(A ∧B) ∧ C ` B ∧ A
` ((A ∧B) ∧ C) ⊃ (B ∧ A) ⊃R
Next, we may break down the conjunction in the consequence B ∧ A with the right
conjunction rule, splitting the proof into two parts:
....
(A ∧B) ∧ C ` B
....
(A ∧B) ∧ C ` A
(A ∧B) ∧ C ` B ∧ A ∧R
` ((A ∧B) ∧ C) ⊃ (B ∧ A) ⊃R
At this point, the consequences of both our goals are generic, lacking any specific
connectives to work with, which is where the proof differs from the proof in Example 2.1.
Instead of moving to build the proof top-down as in NJ, in LK we shift our attention
to the left and begin breaking down the hypotheses. Since the hypothesis (A∧B)∧C
contains a superfluous C, we use the first left conjunction rule in both branches of
the proof to discard it:
....
A ∧B ` B
(A ∧B) ∧ C ` B ∧L1
....
A ∧B ` A
(A ∧B) ∧ C ` A ∧L1
(A ∧B) ∧ C ` B ∧ A ∧R
` ((A ∧B) ∧ C) ⊃ (B ∧ A) ⊃R
53
Now we may apply another left conjunction rule to select the appropriate hypothesis
needed for both sub-proofs:
....
B ` B
A ∧B ` B ∧L2
(A ∧B) ∧ C ` B ∧L1
....
A ` A
A ∧B ` A ∧L1
(A ∧B) ∧ C ` A ∧L1
(A ∧B) ∧ C ` B ∧ A ∧R
` ((A ∧B) ∧ C) ⊃ (B ∧ A) ⊃R
And finally, we can now close off both sub-proofs with the Ax rule, finishing the proof:
B ` B Ax
A ∧B ` B ∧L2
(A ∧B) ∧ C ` B ∧L1
A ` A Ax
A ∧B ` A ∧L1
(A ∧B) ∧ C ` A ∧L1
(A ∧B) ∧ C ` B ∧ A ∧R
` ((A ∧B) ∧ C) ⊃ (B ∧ A) ⊃R End example 3.1.
Remark 3.2. The traditional LK sequent calculus from Figure 3.5 presents the
structural properties of sequents—exchange, weakening, and contraction—explicitly in
the form of inference rules. However, there are alternate sequent calculi and variations
on LK that forgo these structural rules by baking the properties deeper into the logic
itself. The first change along this line is to treat the hypotheses and consequences of
sequents as unordered collections of propositions, for example building sequents out
of sets or multisets. This way, the exchange rules XL and XR don’t do anything at all,
since the sequents in the premise and conclusion are considered identical. The second
change is to rephrase the core axiom and cut rules in a way that bakes in weakening
and contraction as follows:
Γ, A ` A,∆ Ax
Γ ` A,∆ Γ, A ` ∆
Γ ` ∆ Cut
Contraction can be derived from these new Ax and Cut rules. CL is derived as:
Γ, A,A ` ∆ Γ, A ` A,∆ Ax
Γ, A ` ∆ Cut
54
and the derivation of CR is similar. Weakening, unfortunately, cannot be directly
derived in the same manner as contraction, but instead it is admissible. That is to
say, given any proof of the sequent Γ ` ∆, we can build similar proofs Γ, A ` ∆ and
Γ ` A,∆ by pushing the unused A through the proof until it is finally discarded by
the generalized Ax rule.
In terms of provability—the question of which sequents can conclude a valid proof
tree—the versions of LK with explicit and implicit structural rules are the same. In
the implicit system, exchange is invisible, contraction is a consequence of axiom and
cut, and all weakening is pushed to the leaves. Furthermore, the two different versions
of the axiom and cut rules are interderivable with respect to their different logics. The
explicit Ax rule in Figure 3.5 is a special case of the implicit one above, whereas the
implicit Ax rule can be expanded into many weakenings followed by the explicit rule.
Likewise, the explicit Cut rule can be derived from the implicit rule by weakening
the two premises until they match, whereas the implicit Cut rule can be derived from
the explicit rule by contracting the result of the conclusion to remove the duplication.
Therefore, up to provability, the choice between these two different styles for handling
the structural properties of sequents are a matter of taste. On the same subject, it’s
also sensible to consider an alternate version of left implication introduction that
duplicates rather than splitting hypotheses and consequences among the premises in
the style of our revised Cut above:
Γ ` A,∆ Γ, B ` ∆
Γ, A ⊃ B ` ∆ ⊃L
In the presence of structural properties (either explicit or implicit), these two
⊃L rules are equivalent up to provability. However, if we want a more refined
view of the structural properties, as in sub-structural logics like linear logic
(Girard, 1987), then these differences become more acute and must be considered
carefully. End remark 3.2.
Consistency and cut elimination
One of Gentzen’s motivations for developing the LK sequent calculus was to study
the consistency of natural deduction. A consistent logic does not prove a contradiction,
so that no proposition is proven both true and false. More specifically, we can say that
a sequent calculus is consistent whenever there is no proof of the empty sequent `.
55
For a logic like LK, these two conditions are the same: from a contradiction weakening
gives us ` A and A ` for any A, and from any A, that’s proven both true and
false, Cut gives us `. Consistency of logics like LK is important because without
consistency provability is meaningless: it’s not particularly interesting to exhibit a
proof that some proposition A is true when we already know of a single proof that
shows every proposition is true (and false)!
So in the interest of showing LK’s consistency, how we might possibly begin to
build a proof of the empty sequent from the bottom up? Let’s consider which of
LK’s inference rules (from Figure 3.5) could possibly deduce `. It can’t be any of the
structural rules because they all force at least one hypothesis or consequence in the
conclusion below the line. Likewise, it can’t be any of the logical rules: since they are
introduction rules, they all include at least one proposition built from a connective on
either side of the deduced sequent. It also can’t be the axiom rule, which only deduces
simple non-empty sequents of the form A ` A. Indeed, the only inference rule that
might ever deduce an empty sequent—and therefore lead to inconsistency—is Cut as
shown previously.
This observation that only cuts can lead to contradictions is Gentzen’s (1935b)
great insight to logical consistency. If we want to know that a sequent calculus like
LK is consistent, it’s enough to ask if the Cut rule is important for provability. If
Cut is not essential in any proof, so any provable sequent can be deduced without
the help of Cut, then ` is unprovable since it cannot be deduced without Cut. This
application highlights the importance of Gentzen’s (1935a) cut elimination (originally
called Hauptsatz), and its phrasing in the sequent calculus, which says that every LK
proof can be reduced to a cut-free one.
Theorem 3.1 (Cut elimination). For all LK proofs of Γ ` ∆, there exists an alternate
LK proof of Γ ` ∆ that does not contain any use of the Cut rule.
The proof of cut elimination can be divided into two main parts: the logical steps
and the structural steps. The logical steps of cut elimination consider the cases when
we have a cut between two proof trees ending in the left and right rules for the same
connective occurring in the same proposition, and show how to rewrite the proof into
a new one that does not mention that particular connective. The structural steps
of cut elimination handle all the other cases where we do not have a left and right
introduction for the same proposition facing one another in a cut. These steps involve
rewriting the structure of the proof and propagating the rules until the relevant logical
56
steps can take over. The final ingredient is to ensure that this procedure for eliminating
cuts always gives a definite result, and does not spin off into an infinite regress.
Example 3.2. Notice how different inference rules of LK treat the division of extraneous
hypotheses and consequences among multiple premises differently. On the one hand,
rules like ∧R and ∨L duplicate the side propositions Γ and ∆ from the conclusion
to both premises. On the other hand, rules like Cut and ⊃L merge different side
propositions from the two premises into the common conclusion, creating an ordering
between them during the merge. Why are these particular rules given in such different
styles, and why is the particular merge order chosen? One way to understand the impact
of these details is to look at the interaction between the logical and structural rules
during cut elimination, so let’s examine a few exemplar steps of the cut elimination
procedure.
The first, and the most trivial, case is when we cut an axiom with an existing
proof D of Γ ` A,∆ or E of Γ, A ` ∆. This particular maneuver doesn’t add anything
interesting to the nature of the existing proof, and so correspondingly eliminating the
cut should just give the same proof back unchanged, as we can see in both cases:
D....
Γ ` A,∆ A ` A Ax
Γ ` A,∆ Cut =⇒
D....
Γ ` A,∆
A ` A Ax
E....
Γ, A ` ∆
Γ, A ` ∆ Cut =⇒
E....
Γ, A ` ∆
Notice here that cutting an axiom with both D and E does not change the sequent
in either conclusion, which comes from the precise way that Cut merges the side
propositions in the two premises. For D, the extra consequence A coming from the
axiom A ` A replaces the cut A in exactly the right position, and likewise for E . If
Cut put the propositions of its conclusion in any other order, then we would need to
exchange the result of one or both of the above steps with XL and XR to put them
back into the right order.
Moving on to a logical step, consider what happens when compatible ∧R and
∧L1 introductions, with premises D1, D2, and E respectively, meet in a Cut:
D1....
Γ ` A,∆
D2....
Γ ` B,∆
Γ ` A ∧B,∆ ∧R
E....
Γ′, A ` ∆′
Γ′, A ∧B ` ∆′ ∧L1
Γ′,Γ ` ∆′,∆ Cut =⇒
D1....
Γ ` A,∆
E....
Γ′, A ` ∆′
Γ′,Γ ` ∆′,∆ Cut
57
Reducing this cut involves selecting the appropriate premise D1 of the ∧R introduction
so that it can meet with the single premise of ∧L1. The number of cuts are not reduced
by this step, but instead the primary proposition A∧B of the cut has been reduced to
A, which (non-trivially) justifies why this step is making progress in the cut elimination
procedure.
Not every cut-elimination step winds up so neatly organized, unfortunately, and
sometimes the result is necessarily out of order and must be corrected. For example,
consider the following reduction step of a Cut between compatible ¬R and ¬L
inferences with premises D and E respectively:
D....
Γ, A ` ∆
Γ ` ¬A,∆ ¬R
E....
Γ′ ` A,∆′
Γ′,¬A ` ∆′ ¬L
Γ′,Γ ` ∆′,∆ Cut =⇒
E....
Γ′ ` A,∆′
D....
Γ, A ` ∆
Γ,Γ′ ` ∆,∆′ Cut
Γ′,Γ ` ∆′,∆ XL,XR
Here, the Cut we get from reducing the proposition ¬A to A results in a sequent
that is out of order compared to the conclusion we started with. Thus, we need to
re-order the sequent with some number of XL and XR exchanges to restore the original
conclusion. The fact that reducing a negation introduction cut inverts the order of
propositions comes from the inherent inversion of negation: there’s no obvious way to
prevent this scenario by modifying Cut.
A similar re-ordering occurs with implication, where a Cut between compatible
⊃R and ⊃L inferences, with premises D, E1, and E2, can be reduced as follows:
D....
Γ, A ` B,∆
Γ ` A ⊃ B,∆⊃R
E1....
Γ′ ` A,∆′
E2....
Γ′′, B ` ∆′′
Γ′′,Γ′, A ⊃ B ` ∆′′,∆′⊃L
Γ′′,Γ′,Γ ` ∆′′,∆′,∆ Cut =⇒
E1....
Γ′ ` A,∆′
D....
Γ, A ` B,∆
E2....
Γ′′, B ` ∆′′
Γ′′,Γ, A ` ∆′′,∆ Cut
Γ′′,Γ,Γ′ ` ∆′′,∆,∆′ Cut
Γ′′,Γ′,Γ ` ∆′′,∆′,∆XL,XR
Here, we start with the side-propositions of E1 and E2 merged together with ⊃L, but
after reducing the Cut, D cuts in between the two of them, so the final sequent must
be re-ordered to match the original conclusion. The need to place D in the middle
comes from the fact that its concluding sequent has A on the left and B on the right,
58
so our only available cuts must correspondingly place E1 to the left and E2 to the right,
no matter how they are nested.
Finally, we can see how the free variable side conditions on the ∀R and ∃L rules
play a key role in cut elimination. For example, consider the following reduction step
of a cut between compatible ∀R and ∀L inferences with D and E respectively:
D....
Γ ` A,∆
Γ ` ∀X.A,∆ ∀R
E....
Γ′, A {B/X} ` ∆
Γ′,∀X.A ` ∆′ ∀L
Γ′,Γ ` ∆′,∆ Cut =⇒
D{B/X}....
Γ ` A {B/X} ,∆
E....
Γ′, A {B/X} ` ∆
Γ′,Γ ` ∆′,∆ Cut
Notice that in order to make a direct cut between D and E , we need to substitute B
for X in D to make the two sides match up properly. The fact that X does not occur
free in Γ ` ∆ means that after substitution, both Γ and ∆ remain unchanged in the
conclusion of the proof. If instead X appeared free somewhere in Γ or ∆, then the
logical cut elimination step for ∀ would change the conclusion which ruins the result
of the procedure. End example 3.2.
Remark 3.3. The side conditions on the ∀R and ∃L rules are not just a useful aid to cut
elimination, but are crucial to the entire endeavor. More specifically, if we removed the
side condition from these two inference rules, then LK is inconsistent because we can
directly derive a contradiction; and since cut elimination implies that contradictions
cannot be derived, it therefore becomes impossible. One such contradiction is built in
three parts, and is similar to the faulty NJ proof of false in Remark 2.2. First, we can
prove that ∃X.X is true because there is some provably true proposition in LK, for
example Y ⊃ Y or just >. Second, we can prove that ∀X.X is false because there is
some provably false proposition in LK, for example (¬Y ) ∧ Y or just ⊥. Third, recall
that without the side conditions on free propositional variables, we can derive a proof
of ∃X.X ` ∀X.X, which is the glue that connects the first two parts together via cuts.
In total, we would be able to derive the following contradiction in LK:
` > >R
` ∃X.X ∃R
X ` X Ax X /∈ FV ( ` X)
∃X.X ` X ∃L X /∈ FV (∃X.X ` )
∃X.X ` ∀X.X ∀R
` ∀X.X Cut
⊥ ` ⊥L
∀X.X ` ∀L
` Cut
59
which is only ruled out by the side conditions on ∀R and ∃L that prevent a proof of the
sequent ∃X.X ` ∀X.X. In this particular proof, the side condition X /∈ FV (∃X.X `)
is satisfied becauseX is bound in ∃X.X soX is indeed not free in ∃X.X `, but the side
condition X /∈ FV (` X) is clearly violated. The other possible proof which switches
the order of the ∃L and ∀R rules similarly violates the side condition X /∈ FV (X `)
forced by ∀R. End remark 3.3.
Logical duality
Another application of sequent calculi is to study the dualities of logic through
the deep symmetries of the system (Gentzen, 1935b). The turnstyle of entailment (`)
provides the pivot of duality separating left from right and true from false. Logical
duality in the LK sequent calculus expresses a relationship between the connectives
that follows De Morgan’s laws about the way negation distributes over conjunction
and disjunction:
¬(A ∧B) a` (¬A) ∨ (¬B)
¬(A ∨B) a` (¬A) ∧ (¬B)
Where we interpret the equivalence relation A a` B as the mutual provability of A
and B: that both A ` B and B ` A are provable. Focusing on the opposite roles of
the left and right sides of a sequent, we can immediately observe that the introduction
rules of conjunction and disjunction from Figure 3.5 are mirror images of one another
by flipping the sequents across their turnstyle. Similarly, both the ∀ and ∃ are duals
to one another, and negation is its own dual, with both ¬R and ¬L reflecting the
same inference flipped about entailment.
But what about implication? After examining Figure 3.5, there doesn’t seem to
be any logical connective that serves as implication’s dual counterpart. Fortunately,
the symmetric nature of sequents lets us discover the dual of implication by just
syntactically flipping the ⊃R and ⊃L inferences, giving us the following inferences
rules for a new connective B − A:
Γ, A ` ∆ Γ′ ` B,∆′
Γ,Γ′ ` B − A,∆,∆′ −R
Γ, B ` A,∆
Γ, B − A ` ∆ −L
60
But what does this new connective, the dual of implication, mean? By excluding all
side hypotheses and consequences so that Γ,Γ′,∆,∆′ are all empty in the style of
Figure 3.4, we can read off the basic truth and falsehood facts from the above rules.
On the one hand, the −R rule says that B − A is true whenever B is true and A
is false. On the other hand, the −L rule says that B − A must be false whenever B
entails A. Therefore, the proposition B −A can be thought of as the subtraction of A
from B or equivalently the complement of A with respect to B, so that B −A can be
read as “B but not A.”
Remark 3.4. Another method for discovering the implication’s dual is by reducing
these two rather complex connectives into simpler forms. Notice that, since LK is
a classical logic, implication is equivalent to an encoding based on disjunction and
negation, up to provability:
A ⊃ B a` (¬A) ∨B
since A implies B is true if and only if either B is true or A is false. The proofs
justifying this encoding in LK are:
A ` A Ax
A, (¬A) ` ¬L B ` B Ax
A, (¬A) ∨B ` B ∨L
(¬A) ∨B,A ` B XL
(¬A) ∨B ` A ⊃ B ⊃R
A ` A Ax
` ¬A,A ¬R
` (¬A) ∨B,A ∨R1
` A, (¬A) ∨B XR
B ` B Ax
B ` (¬A) ∨B ∨R2
A ⊃ B ` (¬A) ∨B, (¬A) ∨B ⊃L
A ⊃ B ` (¬A) ∨B CR
We also have an encoding of subtraction in terms of conjunction and negation:
B − A a` B ∧ (¬A)
which is provable similarly to the encoding of implication. We can now use the above
encodings to calculate the negation of implication with De Morgan’s laws, using the
fact that conjunction is provably commutative—A ∧B a` B ∧ A for any A and B:
¬(A ⊃ B) a` ¬((¬A) ∨B)
a` (¬(¬A)) ∧ (¬B)
a` (¬B) ∧ (¬(¬A))
a` (¬B)− (¬A)
61
Duality of sequents:
(Γ ` ∆)⊥ , ∆⊥ ` Γ⊥ (A1, . . . , An)⊥ , A⊥n , . . . , A⊥1
Duality of propositions:
(X)⊥ , X (¬A)⊥ , ¬(A⊥)
>⊥ , ⊥ ⊥⊥ , >
(A ∧B)⊥ , (A⊥) ∨ (B⊥) (A ∨B)⊥ , (A⊥) ∧ (B⊥)
(A ⊃ B)⊥ , (B⊥)− (A⊥) (B − A)⊥ , (A⊥) ⊃ (B⊥)
(∀X.A)⊥ , ∃X.(A⊥) (∃X.A)⊥ , ∀X.(A⊥)
FIGURE 3.6. Duality in the LK sequent calculus.
The dual is then recovered from the fact that A⊥ a` (¬A)∗, where A∗ stands for A
with all propositional variables X replaced with ¬X. Therefore, we can also derive
the dual of implication by encoding it and its dual with conjunction, disjunction, and
negation. End remark 3.4.
With the dual of implication at hand, we can properly express the duality of
sequent calculus proofs—for every LK proof D of a sequent:
D....
An, . . . , A2, A1 ` B1, B2, . . . , Bm
there is a dual proof D⊥ of the dual sequent:
D⊥....
B⊥m, . . . , B
⊥
2 , B
⊥
1 ` A⊥1 , A⊥2 , . . . , A⊥n
The duality relation on judgements and propositions, is given in Figure 3.6. Note that
the duality operation A⊥ may be understood as taking the negation of the proposition,
¬A, and pushing the negation inward all the way using the De Morgan laws, until an
unknown proposition variable X is reached (Gentzen, 1935b).3
3Note that Gentzen did not consider the dual counterpart to implication as a connective, as we
do, but rather eliminated implication from the system by encoding it in terms of disjunction and
negation given above for the purposes of establishing duality.
62
Theorem 3.2 (Logical duality). For any LK proof D of the sequent Γ ` ∆, there
exists a dual proof D⊥ of the dual sequent ∆⊥ ` Γ⊥.
Due to the natural syntactic symmetry of the LK sequent calculus, logical duality
comes from an exchange between left and right: left rules mirror right rules and
hypotheses to the left of entailment mirror consequences to the right. Thus, establishing
logical duality in the sequent calculus follows from a straightforward induction on the
structure of proofs, working from the bottom conclusion up to the axioms.
Example 3.3. To illustrate how the left and right sides of proofs get swapped, consider
the case when the bottom conclusion is inferred from a use of the ∧R rule:
D....
Γ ` A,∆
E....
Γ ` B,∆
Γ ` A ∧B,∆ ∧R
Then by the inductive hypothesis, we get a proof D⊥ of (Γ ` A,∆)⊥ , ∆⊥, A⊥ ` Γ⊥
and a proof E⊥ of (Γ ` B,∆)⊥ , ∆⊥, B⊥ ` Γ⊥, from which we can deduce (Γ `
A ∧B,∆)⊥ , ∆⊥, (A⊥) ∨ (B⊥) ` Γ⊥ by ∨L:
D⊥....
∆⊥, A⊥ ` Γ⊥
E⊥....
∆⊥, B⊥ ` Γ⊥
∆⊥, A⊥ ∨B⊥ ` Γ⊥ ∨L End example 3.3.
Remark 3.5. The duality of proofs in the LK sequent calculus means that if a
proposition A is true, so that we have a proof of ` A, then its dual must be false, so
that we have a proof of A⊥ ` . Analogously, if a proposition A is false, then its dual
must be true. For example, consider the following general proof that the contradictory
proposition A ∧ (¬A) is false:
A ` A Ax
A ∧ (¬A) ` A ∧L1
A ∧ (¬A),¬A ` ¬L
A ∧ (¬A), A ∧ (¬A) ` ∧L2
A ∧ (¬A) ` CL
63
For free, duality gives us a general proof that the law of excluded middle, A ∨ (¬A),
is true:
A ` A Ax
A ` A ∨ (¬A) ∨R1
` ¬A,A ∨ (¬A) ¬R
` A ∨ (¬A), A ∨ (¬A) ∨R2
` A ∨ (¬A) CR
This is not a trivial property—the fact that the LK sequent calculus can prove the
law of excluded middle means that it is a proof system for classical logic. In contrast,
intuitionistic logic is missing duality since it accepts non-contradiction, ¬(A∧(¬A)), in
general but rejects the universal truth of laws like excluded middle or double negation
elimination ((¬(¬A)) ⊃ A), only allowing for specialized proofs depending on the
particular proposition A in question. Intuitionistic logic also only validates three
of the four aforementioned De Morgan laws, rejecting ¬(A ∧ B) ` (¬A) ∨ (¬B) in
particular, showing another break of duality. Gentzen’s (1935a) system NJ of natural
deduction is naturally a proof system for intuitionistic logic, in contrast with the LK
sequent calculus which is classical.
However, notice that the LK proof of excluded middle made critical use of multiple
consequences and contraction on the right of the sequent in order to apply both ∨R2
and ∨R1 to the same original consequence. Without the ability to manipulate sequents
with multiple consequences, the proof that A ∨ (¬A) is true would not be possible.
Indeed, such a restriction would break the symmetry of LK—as multiple hypotheses
cannot be mirrored into multiple consequences—and destroy the duality that let
us convert the law of non-contradiction into law of excluded middle. As it turns
out, Gentzen (1935a) also introduced a sequent calculus called LJ as a restriction
of LK where sequents could only ever contain one consequence, which is instead a
sequent calculus system for intuitionistic logic of equal provability strength as NJ.
Note that with this restriction, LJ effectively removes the right structural rules WR,
CR, and XR since they involve sequents with more than one consequence. From the
other perspective, generalizing natural deduction with multiple consequences turns
it into a proof system for classical logic (Parigot, 1992; Ariola & Herbelin, 2003).
Therefore, we can summarize that the difference between a single-consequence and
multiple-consequence proof systems can mean the difference between intuitionistic
and classical logic. End remark 3.5.
64
The Core Calculus
Today, the Curry-Howard isomorphism (Curry et al., 1958; Howard, 1980;
de Bruijn, 1968) is a far-reaching thesis that each logic corresponds to a foundational
programming language: the propositions of logic can be seen as types of programs
and the proofs of those propositions can be seen as programs themselves. The
shining example of this recurring correspondence is between Gentzen’s (1935a) natural
deduction and Church’s (1932) λ-calculus. However, the logics of natural deduction
and the sequent calculus are rather different from one another. As previously discussed,
one major point of distinction between the two styles of logic is that natural deduction
is right-handed, favoring exclusively right rules for logical connectives, whereas the
sequent calculus is ambidextrous, favoring introduction rules on both the left and right
sides of entailment. That means that the sequent calculus does not correspond to the
λ-calculus the same way that natural deduction does. So what might a programming
language based on a sequent calculus like LK look like?
From natural deduction’s right-handed nature, we get an expression-oriented
language like the λ-calculus: all the phrases of the language work toward producing
some result corresponding to the primary consequence on the right, and so they may
all be (potentially) composed together. But the sequent calculus is ambidextrous,
containing both left- and right-handed rules, and regularly deals with sequents like
A,¬A ` that lack any particular consequence to speak of. Without a consequence,
how can we say what type of result to expect from a program corresponding to the
sequent A,¬A `, or that it even produces a result at all? More generally, notice that
we can classify the rules of LK from Figure 3.5 by the three different kinds of sequents
they can deduce: those with a primary consequence of interest like in the right rules,
those with a primary hypothesis of interest like in the left rules, and those with no
particular proposition of interest (including possibly the empty sequent) like in the
cut rule. If we interpret LK as a programming language, it seems reasonable that each
of these different kinds of sequents correspond to a different basic kind of phrase in
the language, whose composition is guided by the forms of the inference rules.
Before delving into the entirety of LK, let’s first consider a core language shown
in Figure 3.7, Herbelin’s (2005) µµ˜-calculus, that corresponds to the core part of LK
and lies in the heart of every sequent-based language we will explore. Notice that the
language of types in this core lacks any logical connectives, so that the only types are
uninterpreted variables X, Y , Z, etc. The µµ˜-calculus is a bare language for describing
65
A,B,C ∈ Type ::= X X, Y, Z ∈ TypeVariable ::= . . .
c ∈ Command ::= 〈v||e〉
v ∈ Term ::= x | µα.c x, y, z ∈ Variable ::= . . .
e ∈ CoTerm ::= α | µ˜x.c α, β, γ ∈ CoVariable ::= . . .
Γ ∈ InputEnv ::= x1 : A1, . . . , xn : An ∆ ∈ OutputEnv ::= α1 : A1, . . . , αn : An
Judgement ::= c : (Γ ` ∆) | (Γ ` v : A | ∆) | (Γ | e : A ` ∆)
Core rules:
x : A ` x : A | VR | α : A ` α : A VL
c : (Γ ` α : A,∆)
Γ ` µα.c : A | ∆ AR
c : (Γ, x : A ` ∆)
Γ | µ˜x.c : A ` ∆ AL
Γ ` v : A | ∆ Γ′ | e : A ` ∆′
〈v||e〉 : (Γ′,Γ ` ∆′,∆) Cut
FIGURE 3.7. µµ˜: The core language of the sequent calculus.
only input, output, and interactions: the types on the right side of a sequent describe
the outputs of a program and the types on the left side of a sequent describe the
inputs of a program. When the two opposite sides come together—when the opposed
forces of input and output meet—we have an interaction that sparks computation.
Note that the type system brings out an aspect of deduction that was implicit in the
sequent calculus: the role of a distinguished active proposition that is currently under
consideration. For example, in the ∧R rule from Figure 3.5, we are currently trying to
prove the proposition A ∧B, so it is considered the active proposition of the sequent
Γ ` A ∧B,∆.
By putting attention on (at most one) active proposition, we get three
classifications of sequents: active on the right, active on the left, or passive (without
an active proposition on either side). These three forms of sequents likewise classify
three different forms of µµ˜ expressions that might be part of a program:
– An active sequent on the right (Γ ` v : A | ∆) describes a term v that sends
information of type A as its output (that is, v is a producer of type A).
– An active sequent on the left (Γ | e : A ` ∆) describes a co-term e that receives
information of type A as its input (that is, e is a consumer of type A).
66
– A passive sequent (c : (Γ ` ∆)) describes a command c that is an executable
program capable of running on its own without any distinguished input or
output.
In each case, the environments Γ and ∆ describe any additional inputs and outputs
to an expression by specifying the type of free variables (x, . . . ) and free co-variables
(α, . . . ) that expression might reference, respectively.
The expressions of the µµ˜-calculus come from the axiom and cut rules of LK
plus an additional pair of activation rules AR and AL. The Ax rule of LK is divided
into two separate rules in µµ˜: the VR rule creates a term by just referring to a
variable available from its environment, and similarly the VL rule creates a co-term
by referring to a co-variable. The Cut rule connects a term and co-term that are
waiting to send and receive information of the same type, so that the output of the
term is forwarded to the co-term as input (and dually, the input of the co-term is
drawn from the output of the term). Finally, the activation rules AR and AL pick a
particular (co-)variable from the environment of a command to activate by creating
an output or input abstraction, respectively. Intuitively, if the variable x stands for
an unknown input in a command c, then the input abstraction µ˜x.c is a co-term that,
when given a place to draw information, will bind that location to the input channel
x while running c. Dually, if the co-variable α stands for an unknown output in a
command c, then the output abstraction µα.c is a term that, when given a place to
send information, will bind that location to the output channel α while running c.
Having examined the static properties of the µµ˜-calculus—its syntax and types—
we still need to consider the dynamic properties of µµ˜, to explain what it means to
run a program. To say “what is computation in the sequent calculus?” we turn to cut
elimination (previously mentioned in Section 3.1) which outlines a method of reducing
commands as the main unit of computation.4 In other words, computation in µµ˜ is
the behavior that results from cutting together a compatible producer and consumer
in a command, so that they may meaningfully interact with one another. In the bare
µµ˜-calculus with no logical connectives, we can only have three forms of commands:
a cut between (co-)variables 〈x||α〉, a cut with an output abstraction 〈µα.c||e〉, and a
cut with an input abstraction 〈v||µ˜x.c〉. In the first case, a command 〈x||α〉 represents
a basic final state that can reduce no further, and even though its typing derivation
4Note, however, that the steps performed in µµ˜ transform more of the program at once which
differs from the fine-grained steps of the original cut-elimination procedures used for LK.
67
contains a Cut, it is a trivial sort of cut that corresponds more closely to a passive
version of LK’s Ax :
x : A ` x : A | VR | α : A ` α : A VL
〈x||α〉 : (x : A ` α : A) Cut
In the second two cases, the operational meaning of input and output abstractions
are expressed via capture-avoiding substitution—much like the β law for functions in
the λ-calculus—as illustrated by the following µ and µ˜ rewriting rules:
(µ) 〈µα.c||e〉 µ c {e/α} (µ˜) 〈v||µ˜x.c〉 µ˜ c {v/x}
The µ˜ reduction step substitutes the term v for the variable x introduced by an input
abstraction, distributing it into the command c to the points where it is referenced. The
µ reduction step is the mirror image, which substitutes a co-term e for a co-variable
α introduced by an output abstraction. There is an extensional nature to input and
output abstractions—analogous to the η law for functions in the λ-calculus—that
observes the fact that trivial input and output abstractions can be eliminated by the
following ηµ and ηµ˜ rewriting rules:
(ηµ) µα. 〈v||α〉 ηµ v (α /∈ FV (v)) (ηµ˜) µ˜x. 〈x||e〉 ηµ˜ e (x /∈ FV (e))
In other words, the term that sends the output of v to α only to forward that
information along as its own output is the same as v itself. Dually, the co-term
that binds its input to x only to forward that information along to another co-term e
can be written more simply as just e.
As per Remark 2.3, we can derive a reduction theory (µµ˜ηµηµ˜) and equational
theory (=µµ˜ηµηµ˜) for the µµ˜-calculus as the compatible-reflexive-transitive and
compatible-reflexive-symmetric-transitive closures (respectively) of the µ, µ˜, ηµ, and
ηµ˜ rewriting rules. It is also very easy to give the µµ˜-calculus an operational semantics
by just applying the µ and µ rewriting rules directly to commands, so that the only
evaluation context is the empty context in contrast to the λ-calculus which requires
deeply nested evaluation contexts. In other words a single operational reduction is
68
given by
c µµ˜ c′
c 7→µµ˜ c′
and multiple steps of the µµ˜ operational semantics is the reflexive-transitive closure of
the single-step 7→µµ˜ . Note how only the operational semantics only includes the µ and
µ˜ rewriting rules, meaning that they are operational rules (Herbelin & Zimmermann,
2009). In contrast, the ηµ and ηµ˜ rewriting rules are not used to run a program in
the operational semantics, so they are (merely) observational rules meaning that the
(co-)terms before and after ηµ and ηµ˜ reduction are both observably the same in any
program. These observational rules are never needed to run a program and get a
result, because they are simulated by the operational rules whenever they come to
the forefront. For example, we have the following (general) ηµ reduction which gets
us to a final command:
〈µβ. 〈x||β〉||α〉 →ηµ 〈x||α〉
But notice that in this case a µ operational step gets us to exactly the same final
command anyway.
The fundamental dilemma of computation
Unfortunately, the aforementioned dynamic semantics for µµ˜ is overly simplistic
and extremely non-deterministic, to the point where programs may make completely
divergent and unrelated computations. The non-determinism of the µµ˜-calculus
corresponds to the fact that classical cut elimination in the LK sequent calculus
is also non-deterministic. The phenomenon is embodied by the fundamental conflict
between input and output abstractions, as shown by the following critical pair between
two dual µ and µ˜ reductions for performing substitution:
c1 {(µ˜x.c2)/α} ≺µ 〈µα.c1||µ˜x.c2〉 µ˜ c2 {(µα.c1)/x}
Both the term µα.c1 and co-term µ˜x.c2 are fighting for control in the above command,
and either one may win. The non-deterministic outcome of this conflict is exemplified
69
in the case where neither α nor x are referenced in their respective commands:
c1 ≺µ 〈µ .c1||µ˜ .c2〉 µ˜ c2
showing that programs may produce different results each time they are run, since
the same starting point may step to two different and completely arbitrary commands.
This form of divergent reduction paths is called a critical pair and has a serious impact
on the dynamic semantics of the µµ˜-calculus.
For the µµ˜ operational semantics, the result of a program is non-deterministic
because it can end up in different final states depending on which rule is chosen;
for example 〈x||α〉 ← [µµ˜ 〈µγ. 〈x||α〉||µ˜z. 〈y||β〉〉 7→ µµ˜ 〈y||β〉. This fact implies that the
µµ˜ηµηµ˜ reduction theory is not non-confluent, because different reductions can be
applied such that the two diverging paths never converge back to the same result
again. And finally, the µµ˜ηµηµ˜ equational theory is incoherent because all commands
and (co-)terms are equated. From the perspective of programming language semantics,
this type of non-determinism can be undesirable since it makes it impossible to predict
a single definitive result of a program since there may be multiple incompatible results
depending on the choices made during execution. If we want to regain properties like
determinism, confluence, or coherence, which are enjoyed by the λ-calculus, then some
of these freedoms must be curtailed.
In order to recover determinism for the sequent calculus, Curien & Herbelin (2000)
observed that we only need to choose an evaluation strategy that deterministically
picks the next step to take by giving priority to one reduction over the other:
Call-by-value consists in giving priority to the µ redexes, while call-by-
name gives priority to the µ˜ redexes.
Prioritization between the two opposed means that there must be some potential
µ or µ˜ redexes that we could reduce but choose not to, thereby yielding priority to
the other side of the command. From another viewpoint, choosing a priority between
the two sides of a command is the same thing as choosing a restriction on the terms
and co-terms that can be substituted by the µ and µ˜ rules. And reversing directions,
choosing which terms and co-terms are substitutable by µ and µ˜ reductions also
chooses the evaluation strategy.
Reflecting the above observation back to the calculus, we can restore determinacy
to the operational semantics and confluence to the rewriting theory by making the
70
V ∈ ValueV ::= x E ∈ CoValueV ::= e
(µV) 〈µα.c||E〉 µV c {E/α} (ηµ) µα. 〈v||α〉 ηµ v (α /∈ FV (v))
(µ˜V) 〈V ||µ˜x.c〉 µ˜V c {V/x} (ηµ˜) µ˜x. 〈x||e〉 ηµ˜ e (x /∈ FV (e))
FIGURE 3.8. The call-by-value (V) rewriting rules for the core µµ˜V-calculus.
V ∈ ValueN ::= v E ∈ CoValueN ::= α
(µN ) 〈µα.c||E〉 µN c {E/α} (ηµ) µα. 〈v||α〉 ηµ v (α /∈ FV (v))
(µ˜N ) 〈V ||µ˜x.c〉 µN c {V/x} (ηµ˜) µ˜x. 〈x||e〉 ηµ˜ e (x /∈ FV (e))
FIGURE 3.9. The call-by-name (N ) rewriting rules for the core µµ˜N -calculus.
substitution rules strategy-aware: µ˜ only substitutes values for variables and µ only
substitutes co-values for co-variables. In other words, the decision of which values
and co-values are substitutable is enough information to determine an evaluation
strategy in the µµ˜-calculus. To get call-by-value reduction, we can restrict the notion
of value to exclude output abstractions and leave co-values unrestricted, thereby giving
priority to the µ redexes as shown in Figure 3.8. Dually for call-by-name reduction,
we can restrict the notion of co-value to exclude input abstractions and leave values
unrestricted, thereby giving priority to the µ˜ redexes as shown in Figure 3.9. Notice
that in any case, the observational ηµ and ηµ˜ reductions are not affected by the
restrictions on (co-)values, because they do no substitution and are sound under any
choice of evaluation strategy. These restrictions on substitution give us exactly Curien
& Herbelin’s (2000) notions of the call-by-value and call-by-name, which restores
determinacy, confluence, and coherence to the dynamic semantics of µµ˜. Excluding
a (co-)term from the collection of (co-)values effectively prioritizes it by blocking
opposing reductions, whereas including a (co-)term as a (co-)value diminishes its
priority since it can be deleted or duplicated by substitution.
71
Structural rules and static scope
So far we have skirted around the issue of how the structural properties of
the sequent calculus are represented in the µµ˜-calculus. After all, they are an
important part of Gentzen’s LK sequent calculus, but the type system in Figure 3.7
does not express them. For instance, the co-term µ˜z. 〈x||α〉 should have the type
x : X | µ˜z. 〈x||α〉 : Y ` α : X, but there’s no way to derive that conclusion with the
typing rules in Figure 3.7 alone. What’s missing here is a way to infer weakening
on the left, which is a symptom of the general lack of structural properties in the
raw core typing rules. There are multiple options for restoring the classical structural
properties to the core µµ˜ type system, and to be thorough we will compare two of the
most commonly used methods. The common theme behind both methods is to equate
the structural properties of sequents with the scoping properties of static variables
and co-variables in expressions.
The first method of expressing the structural properties of sequents in µµ˜ is
to add explicit structural rules that allow for a single (co-)variable to appear any
number of times in an expression. The full collection of these structural scoping rules
are shown in Figure 3.10, which corresponds one-for-one with the structural rules of
Gentzen’s LK sequent calculus over each form of µµ˜ expression. The weakening rules
say that even if a free (co-)variable is in scope in an expression, it does not have to
be referenced, as in the co-term µ˜z. 〈x||α〉:
x : X ` x : X | VR | α : X ` α : X VL
〈x||α〉 : (x : X ` α : X) Cut
〈x||α〉 : (x : X, z : Y ` α : X) WL
x : X | µ˜z. 〈x||α〉 : Y ` α : X AL
The contraction rules say that a free (co-)variable can be referenced an additional
time by replacing another (co-)variable, as in the command 〈µδ. 〈y||α〉||µ˜z. 〈y||α〉〉:
y : X ` y : X | VR | β : X ` β : X VL
〈y||β〉 : (y : X ` β : X) Cut
〈y||β〉 : (y : X ` δ : Y, β : X)WR
y : X ` µδ. 〈y||β〉 : Y | β : XAR
x : X ` x : X | VR | α : X ` α : XVL
〈x||α〉 : (x : X ` α : X) Cut
〈x||α〉 : (x : X, z : Y ` α : X)WL
x : X | µ˜z. 〈x||α〉 : Y ` α : X AL
〈µδ. 〈y||β〉||µ˜z. 〈x||α〉〉 : (x : X, y : X ` α : X, β : X) Cut
〈µδ. 〈x||β〉||µ˜z. 〈x||α〉〉 : (x : X ` α : X, β : X) CL
72
c : (Γ ` ∆)
c : (Γ ` α : A,∆) WR
c : (Γ ` ∆)
c : (Γ, x : A ` ∆) WL
c : (Γ ` α : A, β : A,∆)
c {α/β} : (Γ ` α : A,∆) CR
c : (Γ, y : A, x : A ` ∆)
c {x/y} : (Γ, x : A ` ∆) CL
c : (Γ ` ∆, α : A, β : B,∆′)
c : (Γ ` ∆, β : B,α : A,∆′) XR
c : (Γ′, y : B, x : A,Γ ` ∆)
c : (Γ′, x : A, y : B,Γ ` ∆) XL
Γ ` v : C | ∆
Γ ` v : C | α : A,∆ WR
Γ ` v : C | ∆
Γ, x : A ` v : C | ∆ WL
Γ ` v : C | α : A, β : A,∆
Γ ` v {α/β} : C | α : A,∆ CR
Γ, y : A, x : A ` v : C | ∆
Γ, x : A ` v {x/y} : C | ∆ CL
Γ ` v : C | ∆, α : A, β : B,∆′
Γ ` v : C | ∆, β : B,α : A,∆′ XR
Γ′, y : B, x : A,Γ ` v : C | ∆
Γ′, x : A, y : B,Γ ` v : C | ∆ XL
Γ | e : C ` ∆
Γ | e : C ` α : A,∆ WR
Γ | e : C ` ∆
Γ, x : A | e : C ` ∆ WL
Γ | e : C ` α : A, β : A,∆
Γ | e {α/β} : C ` α : A,∆ CR
Γ, y : A, x : A | e : C ` ∆
Γ, x : A | e {x/y} : C ` ∆ CL
Γ | e : C ` ∆, α : A, β : B,∆′
Γ | e : C ` ∆, β : B,α : A,∆′ XR
Γ′, y : B, x : A,Γ | e : C ` ∆
Γ′, x : A, y : B,Γ | e : C ` ∆ XL
FIGURE 3.10. Scoping rules for (co-)variables in commands, terms, and co-terms.
Finally, the exchange rules say that the order of the (co-)variables in scope does not
matter. Notice that none of these rules are syntactically visible in their expression.
Unlike the axiom, activation, and cut rules that only apply to expressions starting
with a very specific form, the structural rules could potentially apply to expressions
of any form so they are not directed by syntax.
The scoping rules in Figure 3.10 can seem repetitive or even redundant: the same
weakening, contraction, and exchange rules are repeated three times for commands,
terms, and co-terms. Indeed, with this style of presenting the structural properties of
sequents, it is common to limit the rules to a single form of expression like commands
(Wadler, 2003; Munch-Maccagnoni, 2009). Unfortunately however, the repetition for
73
each kind of expression and sequent is necessary to ensure that the structural rules
match our expectation of static scope in programming languages. For example, in
anticipation of the imminent extension of µµ˜ with function types in Section 3.3, we
might want to call a binary function of type X → X → Y with the same value for
both arguments, as in the co-term x · x · β. To type this co-term, we need to contract
x in the co-term itself, as in:
x′ : X ` x′ : X | VR
x : X ` x : X | VR | β : Y ` β : Y VL
x : X | x · β : X → Y ` β : Y →L
x′ : X, x : X | x′ · x · β : X → X → Y ` β : Y →L
x : X | x · x · β : X → X → Y ` β : Y CL
which is not possible if we only allow contraction in commands. Furthermore, only
including the structural rules for commands can mean that sensible observational
reductions like ηµ and ηµ˜ no longer preserve the type of expressions. For example, the ηµ-
expanded term µα. 〈x||α〉 can be assigned the type y : Y, x : X ` µα. 〈x||α〉 : X | β : Y
using weakening and exchange on commands as follows:
x : X ` x : X | VR | α : X ` α : X VL
〈x||α〉 : (x : X ` α : X) Cut
〈x||α〉 : (x : X ` β : Y, α : X) WR
〈x||α〉 : (x : X ` α : X, β : Y ) XR
〈x||α〉 : (x : X, y : Y ` α : X, β : Y ) WL
〈x||α〉 : (y : Y, x : X ` α : X, β : Y ) XL
y : Y, x : X ` µα. 〈x||α〉 : X | β : Y AR
But there is no way to conclude y : Y, x : X ` x : X | β : Y without the structural
rules for terms, even though it is a reduct of a term of that type: µα. 〈x||α〉 →ηµ x.
The second method of expressing the structural properties of sequents in µµ˜
is by treating the environments Γ and ∆ as unordered sets associating types to
(co-)variables and generalizing the axiom and cut rules to implicitly accomodate
several steps of weakening and contraction, respectively (Curien & Herbelin, 2000;
Wadler, 2005; Munch-Maccagnoni, 2013). This extension of the core µµ˜ type system
is shown in Figure 3.11 and corresponds to the variant of LK with implicit structural
rules discussed in Remark 3.2. In this formulation, there is no explicit use of structural
rules in a typing derivation, but instead the structural properties of sequents follow
74
Γ, x : A ` x : A | ∆ VR Γ | α : A ` α : A,∆ VL
c : (Γ ` α : A,∆)
Γ ` µα.c : A | ∆ AR
c : (Γ, x : A ` ∆)
Γ | µ˜x.c : A ` ∆ AL
Γ ` v : A | ∆ Γ | e : A ` ∆
〈v||e〉 : (Γ ` ∆) Cut
FIGURE 3.11. Implicit (co-)variable scope in the core µµ˜ typing.
from the natural scoping rules for static (co-)variables in the µµ˜-calculus, analogous
to the scoping rules for the λ-calculus. During type checking, an output abstraction
Γ ` µα.c : A | ∆ (and dually an input abstraction Γ | µ˜x.c : A ` ∆) signals that
the active type A may undergo an arbitrary number of structural rules depending on
how α (dually x) is referenced in c. During execution, the behavior of structural rules
are implicitly implemented by the substitution operation used by µ and µ˜ reduction,
corresponding to the structural steps of a cut elimination procedure.
As stated before for the logic of LK in Remark 3.2, the choice between the two
formulations of the scoping properties of µµ˜ (co-)variables is somewhat arbitrary and
a matter of taste. Since we are dealing with a calculus corresponding to classical logic,
both treatments of structural properties are equivalent to each other in a sense—both
formulations will admit type checking the same expressions, even in richer extensions
of the core language. However, the two formulations have their own advantages. The
implicit scoping presented in Figure 3.11 is concise and forgoes the redundancy of
repeated rules, whereas the explicit scoping presented in Figure 3.10 easily allows for
a more refined analysis of the structural properties and exploration of sub-structural
calculi (Munch-Maccagnoni, 2009) corresponding to sub-structural logics that forbid
certain uses of structural rules. The most important thing, though, is that something
is done to express the scope of (co-)variables in the classical language µµ˜. For our
purposes here, we will take the explicit formulation of scoping rules in Figure 3.10 as
the canonical definition for classical µµ˜ in the remainder.
Remark 3.6. As it turns out, output abstractions in the µµ˜-calculus let programs
manipulate their own control flow similar to Scheme’s (Kelsey et al., 1998) callcc
control operator, or Felleisen’s (1992) C operator. Intuitively, a use of callcc or an
abort can be read in terms of an output abstraction that duplicates or deletes its
75
bound co-variable, respectively:
callcc(λα.v) , µα. 〈v||α〉 abort c , µδ.c (δ /∈ FV (c))
This phenomenon is a consequence of Griffin’s (1990) observation that under the
Curry-Howard correspondence, classical logic corresponds to control flow manipulation,
along with the fact that the LK sequent calculus formalizes classical logic (see
Remark 3.5). Under this interpretation, multiple consequences in the sequent calculus
correspond to multiple available co-variables which give the program multiple possible
exit paths. The weakening and contraction rules on the right for these multiple
consequences correspond to deleting or copying an exit path, respectively. Indeed,
multiple consequences with right-handed structural rules may be seen as the logical
essence for this “classical” form of control effects (so called for the connection to
classical logic as well as callcc being the traditional control operator), since extending
natural deduction with multiple consequences, as in Parigot’s (1992) λµ-calculus, gives
rise to a programming language with control effects equivalent to callcc (Ariola &
Herbelin, 2003). End remark 3.6.
The Dual Calculi
With the core µµ˜ language firmly in place, we can now enrich it with additional
programming constructs that correspond to the logical elements—the connectives and
logical rules—of Gentzen’s LK sequent calculus. The syntax and typing rules for these
extra logical constructs are shown in Figure 3.12,5 which extends the core µµ˜-calculus
from Figure 3.7 along with the structural (co-)variable scoping rules from Figure 3.10.
This language combines both Curien & Herbelin’s (2000) λµµ˜-calculus (the portion
associated with implication) and Wadler’s (2003) dual calculus (the portion associated
with conjunction, disjunction, and negation) into a single calculus corresponding to
all of the simply-typed LK sequent calculus. Furthermore, the quantifiers of LK are
interpreted as a sequent calculus version of system F (Reynolds, 1983; Girard et al.,
1989): universal quantification (∀) acts as an abstraction over types analogous to
implication, and existential quantification (∃) is the mirror image of ∀. We refer to
5To help syntactically distinguish terms from co-terms, we use the notational convention
throughout that round parentheses are the grouping brackets for terms, and square brackets are the
grouping brackets for co-terms.
76
this combined language here as the “dual calculi” because, as we will soon see, the
language is the basis for two different but highly related calculi that exhibit dual
computational behavior to one another.
Since the right introduction rules for logical connectives are shared by both
natural deduction and the sequent calculus, the dual calculi terms for creating results
of product, sum, and function types have the same form as in the λ-calculus. Products
are introduced by pairing, (v, v′), sums are introduced by injection, ι1 (v) and ι2 (v),
and functions are introduced by λ-abstractions, λx.v. Additionally, the terms for
creating results of universally quantified types are Λ-abstractions, ΛX.v, as in system
F, and the results of existentially quantified types are “masked” terms, B @ v, that
hide the type B in the underlying term v from being visible from the outside. In
contrast, the left introduction rules of the sequent calculus are distinct from the right
elimination rules of natural deduction, so the difference between the λ-calculus and
the dual calculi really appears when results are used.
Instead of function application, the left implication introduction →L builds a
co-term that represents a call-stack. If v is a term that produces a result of type A, and
e is a co-term that consumes a result of type B, then the call-stack v · e is a co-term
that works with a function value of type A→ B by feeding it v as an argument and
sending the returned result to e. For example, given that x1 : A1, x2 : A2, x3 : A3, and
β : B, then the call-stack x1 · [x2 · [x3 · β]] is expecting to consume a function of type
A1 → (A2 → (A3 → B)):6
x1:A1 ` x1 : A1 |VR
x2:A2 ` x2 : A2 |VR
x3:A3 ` x3 : A3 |VR | β : B ` β:BVL
x3:A3 | x3 · β : A3 → B ` β : B →L
x3:A3, x2:A2 | x2 · x3 · β : A2 → A3 → B ` β:B →L
x3:A3, x2:A2, x1:A1 | x1 · x2 · x3 · β : A1 → A2 → A3 → B ` β:B →L
The left introductions for the other type constructors follow a similar pattern, with
each one building a co-term that expects to consume a value of that type. There
are two left conjunction introductions corresponding to the two projections out of
a product. If e1 is a co-term that consumes a value of type A, then ×L1 builds the
6Like the common notational convention in the simply-typed λ-calculus that the function type
constructor associates to the right, so that A1 → A2 → A3 → B = A1 → (A2 → (A3 → B)), we
adopt a similar notational convention that the call stack constructor associates to the right, so that
x1 · x2 · x3 · β = x1 · [x2 · [x3 · β]].
77
A,B,C ∈ Type ::= X | A×B | A+B | ¬A | A→ B | ∀X.A | ∃X.A
c ∈ Command ::= 〈v||e〉
v ∈ Term ::= x | µα.c | (v, v) | ι1 (v) | ι2 (v) | not(e) | λx.v | ΛX.v | B @ v
e ∈ CoTerm ::= α | µ˜x.c | pi1 [e] | pi2 [e] | [e, e] | not[v] | v · e | B @ e | Λ˜X.e
Γ ∈ InputEnv ::= x1 : A1, . . . , xn : An
∆ ∈ OutputEnv ::= α1 : A2, . . . , αn : An
Judgement ::= c : (Γ ` ∆) | (Γ ` v : A | ∆) | (Γ | e : A ` ∆)
Logical rules:
Γ ` v : A | ∆ Γ ` v′ : B | ∆
Γ ` (v, v′) : A×B | ∆ ×R
Γ | e : A ` ∆
Γ | pi1 [e] : A×B ` ∆ ×L1
Γ | e : B ` ∆
Γ | pi2 [e] : A×B ` ∆ ×L2
Γ ` v : A | ∆
Γ ` ι1 (v) : A+B | ∆ +R1
Γ ` v : B | ∆
Γ ` ι2 (v) : A+B | ∆ +R2
Γ | e : A ` ∆ Γ | e′ : B ` ∆
Γ | [e, e′] : A+B ` ∆ +L
Γ | e : A ` ∆
Γ ` not(e) : ¬A | ∆ ¬R
Γ ` v : A | ∆
Γ | not[v] : ¬A ` ∆ ¬L
Γ, x : A ` v : B | ∆
Γ ` λx.v : A→ B | ∆ →R
Γ ` v : A | ∆ Γ′ | e : B ` ∆′
Γ′,Γ | v · e : A→ B ` ∆′,∆ →L
Γ ` v : A | ∆ X /∈ FV (Γ ` ∆)
Γ ` ΛX.v : ∀X.A | ∆ ∀R
Γ | e : A {B/X} ` ∆
Γ | B @ e : ∀X.A ` ∆ ∀L
Γ ` v : A {B/X} | ∆
Γ ` B @ v : ∃X.A | ∆ ∃R
Γ | e : A ` ∆ X /∈ FV (Γ ` ∆)
Γ | Λ˜X.e : ∃X.A ` ∆ ∃L
FIGURE 3.12. The syntax and types for the dual calculi.
78
co-term pi1 [e1] that works with a value of type A × B by projecting out the first
element of the product and sending it to e1 when needed (and similarly for the second
projection pi2 [e2] built by ×L2). If e1 and e2 are co-terms that consume values of type
A and B, respectively, then +L builds the co-term [e1, e2] that works with a value of
type A+B by checking its constructor: an injection of the form ι1 (v1) has the value
of v1 sent to e1 as needed, and likewise an injection of the form ι2 (v2) has the value of
v2 sent to e2 as needed. The co-term for ∀L is similar to the call stacks of→L, so that
if e is a co-term that consumes a value at the particular type A {B/X}, then B @ e
works with a value of the general type ∀X.A by first specializing the polymorphic
value and then passing it along to e. Perhaps the most unusual co-term comes from
∃L, but this is just the mirror image of the ∀R term. If e is a co-term that consumes
a value of type A, containing a generic type variable X, then ∃L gives the abstracted
co-term Λ˜X.e that works with a value of type ∃X.A by instantiating X with the
value’s hidden type before passing the underlying value to e.
The one type constructor that is not typically found in the λ-calculus, but
commonly in a sequent calculus like LK or the dual calculi, is negation. The negation
type ¬A represents an inversion between producers and consumers—terms and
co-terms—during computation. Intuitively, negation expresses a form of continuations:
a term of type ¬A is actually a consumer of A. The right negation introduction allows
terms to contain consumers, so that if e is a co-term expecting an input a result of
type A then ¬R builds the term not(e). Dually, the left negation introduction allows
co-terms to contain producers, so that if v is a term expecting to output a result of
type A then ¬L builds the co-term not[v]. When a negated term and co-term meet
each other in a command, the inversion is undone so that their underlying components
change places and continue the interaction.
The above intuition on the dynamic meaning of types in the dual calculi can be
codified into rewriting rules. Recall from Section 3.2 that the semantics of the core
µµ˜-calculus was split in two to restore determinacy and confluence: one corresponding
to call-by-value and the other to call-by-name. Likewise, there are two semantics for
the dual calculi, so that the same language bears two different calculi (hence the
name). Since both semantics of the core µµ˜-calculus are already given in Figure 3.8
and Figure 3.9, we only need to suitably expand the notions of value and co-value to
accomodate the new (co-)term introductions and explain the logical steps of cut
elimination (referred to by the common name β) that occur when two opposed
79
V ∈ ValueV ::= x | (V, V ) | ι1 (V ) | ι2 (V ) | not(e) | λx.v | ΛX.v | A@ V
E ∈ CoV alueV ::= e
(β×V ) 〈(V1, V2)||pii [E]〉 β×V 〈Vi||E〉 (β
+
V ) 〈ιi (V )||[E1, E2]〉 β+V 〈V ||Ei〉
(β¬V ) 〈not(e)||not[v]〉 β¬V 〈v||e〉 (β→V ) 〈λx.v||V · E〉 β→V 〈v {V/x}||E〉
(β∀V) 〈ΛX.v||B @ E〉 β∀V 〈v {B/X}||E〉 (β
∃
V)
〈
B @ V
∣∣∣∣∣∣Λ˜X.e〉 β∃V 〈V ||e {B/X}〉
FIGURE 3.13. The β laws for the call-by-value (V) half of the dual calculi.
V ∈ ValueN ::= v
E ∈ CoValueN ::= α | pi1 [E] | pi2 [E] | [E,E] | not(v) | v · E | B @ E | Λ˜X.e
(β×V ) 〈(V1, V2)||pii [E]〉 β×V 〈Vi||E〉 (β
+
V ) 〈ιi (V )||[E1, E2]〉 β+V 〈V ||Ei〉
(β¬V ) 〈not(e)||not[v]〉 β¬V 〈v||e〉 (β→V ) 〈λx.v||V · E〉 β→V 〈v {V/x}||E〉
(β∀V) 〈ΛX.v||B @ E〉 β∀V 〈v {B/X}||E〉 (β
∃
V)
〈
B @ V
∣∣∣∣∣∣Λ˜X.e〉 β∃V 〈V ||e {B/X}〉
FIGURE 3.14. The β laws for the call-by-name (N ) half of the dual calculi.
introduction forms of the same type meet in a command. The call-by-value β rules
are given in Figure 3.13 and the call-by-name β rules are given in Figure 3.14, both
of which extend the core semantics from Figure 3.8 and Figure 3.9, respectively. The
β×, β+ and β¬ rules come from Wadler’s (2003) dual calculus whereas the β→ rules
are inspired by Curien & Munch-Maccagnoni’s (2010) revision of the λµµ˜-calculus.
The β laws extend the previous dynamic semantics of the core µµ˜-calculus to
account for the additional programming constructs. As per Remark 2.3, we have
a reduction theory (µV µ˜VβV ), equational theory (=µV µ˜VβV ), and an operational
semantics (7→ µV µ˜VβV ) for the call-by-value dual calculus from the µV , µ˜V , ηµ, ηµ˜, and
βV laws, as well as a reduction theory (µNµNβN ), equational theory (=µNµNβN ), and
operational semantics (7→ µNµNβN ) for the call-by-name dual calculus from the µN , µ˜N ,
ηµ, ηµ˜, and βN laws. As before, both the call-by-value and call-by-name operational
semantics applies the rewriting rules directly to commands.
80
Notice that, like in the core µµ˜-calculus, the form of the operational β rules are
the same in both semantics, so that the only difference is the definition of value and
co-value referred to in those rules. The rule of thumb is that a β rule only applies
when an introductory value and co-value interact in a command. For example, the
call-by-value β×V rule will only project from a pair value to extract a component that
is also a value. These restrictions are captured in the call-by-value definition of value
that admits only “simple” terms and hereditarily excludes complex terms like µα.c
(representing an arbitrarily complex computation before yielding a result on α) from
the values of product and sum types, which matches the behavior of products and
sums in strict functional languages like ML. However, there is no such restriction
on co-terms in the call-by-value operational semantics, so any co-term counts as a
co-value. Dually, the call-by-name β×N rule will only project out of a pair when it is
needed by a projection co-value to send that component the underlying co-value. These
restrictions are captured in the call-by-name definition of co-value that admits only
“strict” co-terms and hereditarily excludes complex co-terms like µ˜x.c (representing an
arbitrarily complex computation before demanding a result for x) from the co-values of
product and sum types. However, there is no restriction on terms in the call-by-name
operational semantics, so any term counts as a value.
Remark 3.7. It’s worthwhile to mention that although the dual calculi are primarily
seen as typed languages, their semantics do not use any type information to run
commands. We can therefore execute untyped commands as well as typed ones, which
of course creates the possibility of getting stuck at fatal type errors. Untyped commands
also open up the possibility of running general recursive programs, which can be
encoded in a similar manner as in the λ-calculus without any additional features
of the language. For example, Curry’s untyped fixed-point Y combinator in the λ-
calculus:
Y , λf.(λx.f (x x)) (λx.f (x x))
can be analogously defined in the dual calculi using functions as:
Y , λf.µα. 〈λx.µβ. 〈f ||µγ. 〈x||x · γ〉 · β〉||(λx.µβ. 〈f ||µγ. 〈x||x · γ〉 · β〉) · α〉
The two share analogous behavior: in the λ-calculus Y f = f (Y f) and in the
dual calculi 〈Y ||f · α〉 = 〈f ||µβ. 〈Y ||f · β〉 · α〉. Also analogous to the non-terminating
untyped term Ω , (λx.x x) (λx.x x) in the λ-calculus, the dual calculi both have
81
non-terminating untyped commands, which can be written using functions or more
simply with negation:
Ω , 〈not(µ˜x. 〈x||not[x]〉)||not[µα. 〈not(α)||α〉]〉
For example, in the call-by-name operational semantics, we have the following infinite
execution of Ω:
Ω , 〈not(µ˜x. 〈x||not[x]〉)||not[µα. 〈not(α)||α〉]〉
7→β¬N 〈µα. 〈not(α)||α〉||µ˜x. 〈x||not[x]〉〉
7→µ˜N 〈µα. 〈not(α)||α〉||not[µα. 〈not(α)||α〉]〉
7→µN 〈not(not[µα. 〈not(α)||α〉])||not[µα. 〈not(α)||α〉]〉
7→β¬N 〈µα. 〈not(α)||α〉||not[µα. 〈not(α)||α〉]〉
7→µN . . .
Note that encoding general recursion in the untyped sequent calculus requires some
logical connective, like negation or implication. The core µµ˜-calculus gives a more
restrained language of substitution that does not express general recursion even in
the untyped calculus, where general (and non-confluent) µ- and µ˜-reduction is still
strongly normalizing (Polonovski, 2004)—that is, there are no infinite sequences of
µµ˜-reductions. This fact is in contrast with the untyped λ-calculus which can express
general recursion, because β-reduction is not strongly normalizing in the untyped
calculus. End remark 3.7.
Focusing on computation
There is a problem lurking in the β-based operational semantics for the dual
calculi. Consider how we would evaluate the projection pi1((f 1), 2) in a call-by-value
functional language like ML. First we would compute the application f 1 to construct
the pair value, then we would compute the pi1 projection of that pair and extract the
value returned by f 1 as the result of the expression. However, if we represent this
82
program as the following command in the call-by-value dual calculus:7
〈((µβ. 〈f ||1 · β〉), 2)||pi1 [α]〉
we find that no operational rule matches this command, so we are stuck! This isn’t
just a problem with the call-by-value operational semantics. The command:
〈(1, 2)||pi1 [µ˜x. 〈0||α〉]〉
which corresponds to the expression letx = pi1(1, 2) in 0 in a functional language, is
also stuck in the call-by-name operational semantics.
This is clearly an undesirable situation that breaks the connection between the λ-
calculus and dual calculi—we should not get stuck on such commands with unfinished
computation in introduction forms—so something needs to be done to refocus the
attention in a command to the next step of computation. As it stands now in the
dual calculi, we either have too many programs with unexplained behavior, or too few
behaviors for executing programs. Correspondingly, there are two general techniques
to remedy prematurely stuck commands and restore the connection between λ-calculus
and the dual calculi:
(1) The static approach (Curien & Herbelin, 2000) removes the superfluous parts of
the syntax that cause β reduction to get stuck, but are not necessary to express
all the same computations as the original language.
(2) The dynamic approach (Wadler, 2003) adds the necessary extra steps to the
operational semantics that lift buried computations to the top of the command,
so that they are exposed and may take over control of the computation.
Both of these techniques are an application of an idea called focusing (Andreoli, 1992;
Laurent, 2002) from proof search at different points in a programs life—either at
“run time” or at “compile time”—to make sure that the call-by-value and call-by-
name semantics are complete without missing out on any essential capabilities of the
language.
7Here, α stands for the empty, or top-level, context which is implicit in the functional expression.
83
Static focusing
For the static method of focusing, consider which syntactic patterns could lead to
β-stuck commands. In the call-by-value command above, 〈((µβ. 〈f ||1 · β〉), 2)||pi1 [α]〉,
the problem is that a pair with a non-value component (namely the first one)
is interacting with a projection co-value. Because the pair does not have values
for both components, the β×V operational step does not apply. Dually, the call-by-
name command above, 〈(1, 2)||pi1 [µ˜x. 〈0||α〉]〉, puts a pair value in interaction with a
projection that has a non-co-value component. Because the projection does not contain
a co-value, the β×N operational step does not apply. After examining all the βV rules,
we see that the call-by-value βV operational semantics is only equipped to deal with
certain introduction forms containing values (namely the pairing ×R, injection +R,
and masking ∃R terms as well as calling →L co-terms). Similarly, the call-by-name
βN operational semantics is only equipped to deal with certain introduction co-terms
containing co-values (namely the projection ×L, matching +L, and calling →L, and
specializing ∀L co-terms).
We can rule out the problematic commands via static focusing by limiting
ourselves to a sub-syntax of the dual calculi. However, since each operational semantics
(both call-by-value and call-by-name) have difficulty with different parts of the syntax,
static focusing effectively splits the language in two: one sub-syntax for each evaluation
strategy. For call-by-value, we must bake in the notion of values into the syntax and
restrict the ×R, +R, ∃R, and →L inference rules appropriately. Doing so gives us
the LKQ sub-calculus (Curien & Herbelin, 2000) shown in Figure 3.15. Dually for
call-by-name, we must bake in the notion of co-values into the syntax and restrict
the ×L, +L, →L, and ∀L inference rules appropriately, giving the LKT sub-calculus
shown in Figure 3.16.
The associated type systems separate the restricted notions of (co-)values from
general (co-)terms through a new form of focused sequent with a stricter sense of
active formula held in a stoup (Girard, 1991). LKQ introduces values in the focus of
a stoup on the right (Γ ` V : A ; ∆) and LKT introduces co-values in the focus of
a stoup on the left (Γ ; E : A ` ∆). Notice how the focus of the inference rules is
forcibly maintained through type checking: working bottom-up, once a (co-)value is in
focus in the stoup, our active attention cannot move to any other type in the sequent
via activation since the AR and AL rules do not introduce (co-)values in focus. The
new form of sequent calls for additional focusing structural rules FR (in LKQ) and
84
A,B,C ∈ Type ::= X | A×B | A+B | ¬A | A→ B | ∀X.A | ∃X.A
v ∈ Term ::= V | µα.c
V ∈ Value ::= x | (V, V ) | ι1 (V ) | ι2 (V ) | not(e) | λx.v | ΛX.v | A@ V
e ∈ CoTerm ::= α | µ˜x.c | pi1 [e] | pi2 [e] | [e, e] | not[v] | v · e | B @ e | Λ˜X.e
c ∈ Command ::= 〈v||e〉
Judgement ::= c : (Γ ` ∆) | (Γ ` v : A | ∆) | (Γ ` V : A ; ∆) | (Γ | e : A ` ∆)
Axiom:
x : A ` x : A ; Var | α : A ` α : A CoVar
Logical rules:
Γ ` V : A ; ∆ Γ ` V ′ : B ; ∆
Γ ` (V, V ′) : A×B ; ∆ ×R
Γ | e : A ` ∆
Γ | pi1[e] : A×B ` ∆ ×L1
Γ | e : B ` ∆
Γ | pi2[e] : A×B ` ∆ ×L2
Γ ` V : A ; ∆
Γ ` ι1(V ) : A+B ; ∆ +R1
Γ ` V : B ; ∆
Γ ` ι2(V ) : A+B ; ∆ +R2
Γ | e : A ` ∆ Γ | e′ : B ` ∆
Γ | [e, e′] : A+B ` ∆ +L
Γ | e : A ` ∆
Γ ` not(e) : ¬A ; ∆ ¬R
Γ ` v : A | ∆
Γ | not[v] : ¬A ` ∆ ¬L
Γ, x : A ` v : B | ∆
Γ ` λx.v : A→ B ; ∆ →R
Γ ` V : A ; ∆ Γ′ | e : B ` ∆′
Γ,Γ′ | V · e : A→ B ` ∆,∆′ →L
Γ ` v : A | ∆ X /∈ FV (Γ ` ∆)
Γ ` ΛX.v : ∀X.A ; ∆ ∀R
Γ | e : A {B/X} ` ∆
Γ | B @ e : ∀X.A ` ∆ ∀L
Γ ` V : A {B/X} ; ∆
Γ ` B @ V : ∃X.A ; ∆ ∃R
Γ | e : A ` ∆ X /∈ FV (Γ ` ∆)
Γ | Λ˜X.e : ∃X.A ` ∆ ∃L
Focusing (structural) rules:
Γ ` V : A ; ∆
Γ ` V : A | ∆ FR
FIGURE 3.15. LKQ: The focused sub-syntax and types for the call-by-value dual
calculus.
85
A,B,C ∈ Type ::= X | A×B | A+B | ¬A | A→ B | ∀X.A | ∃X.A
v ∈ Term ::= x | µα.c | (v, v) | ι1 (v) | ι2 (v) | not(e) | λx.v | ΛX.v | B @ v
e ∈ CoTerm ::= E | µ˜x.c
E ∈ CoValue ::= α | pi1 [E] | pi2 [E] | [E,E] | not(v) | v · E | B @ E | Λ˜X.e
c ∈ Command ::= 〈v||e〉
Sequent ::= c : (Γ ` ∆) | (Γ ` v : A | ∆) | (Γ | e : A ` ∆) | (Γ ; E : A ` ∆)
Axiom:
x : A ` x : A | Var ; α : A ` α : A CoVar
Logical rules:
Γ ` v : A | ∆ Γ ` v′ : B | ∆
Γ ` (v, v′) : A×B | ∆ ×R
Γ ; E : A ` ∆
Γ ; pi1[E] : A×B ` ∆ ×L1
Γ ; E : B ` ∆
Γ ; pi2[E] : A×B ` ∆ ×L2
Γ ` v : A | ∆
Γ ` ι1(v) : A+B | ∆ +R1
Γ ` v : B | ∆
Γ ` ι2(v) : A+B | ∆ +R2
Γ ; e : A ` ∆ Γ ; e′ : B ` ∆
Γ ; [E,E ′] : A+B ` ∆ +L
Γ | e : A ` ∆
Γ ` not(e) : ¬A | ∆ ¬R
Γ ` v : A | ∆
Γ ; not[v] : ¬A ` ∆ ¬L
Γ, x : A ` v : B | ∆
Γ ` λx.v : A→ B | ∆ →R
Γ ` v : A | ∆ Γ′ ; E : B ` ∆′
Γ,Γ′ ; v · E : A→ B ` ∆,∆′ →L
Γ ` v : A | ∆ X /∈ FV (Γ ` ∆)
Γ ` ΛX.v : ∀X.A | ∆ ∀R
Γ ; E : A {B/X} ` ∆
Γ ; B @ E : ∀X.A ` ∆ ∀L
Γ ` v : A {B/X} ` ∆
Γ ` B @ v : ∃X.A | ∆ ∃R
Γ | e : A ` ∆ X /∈ FV (Γ ` ∆)
Γ ; Λ˜X.e : ∃X.A ` ∆ ∃L
Focusing (structural) rules:
Γ ; E : A ` ∆
Γ | E : A ` ∆ FL
FIGURE 3.16. LKT: The focused sub-syntax and types for the call-by-name dual
calculus.
86
FL (in LKT) which just say that every value is a term and every co-value is a co-term.
However, the reverse of the focusing rules—which would say that every (co-)term is a
(co-)value—are omitted in LKQ and LKT because they would collapse the distinction
that the stoup has created. As it turns out (Curien & Munch-Maccagnoni, 2010),
distinguishing (co-)values in type systems like LKQ and LKT correspond with the
technique of focusing in proof theory developed by Andreoli (1992), Girard (1993, 2001),
and Laurent (2002). In proof search, focusing makes the searching algorithm more
efficient by cutting down on the search space, whereas in calculi, focusing identifies a
well-behaved sub-syntax for the operational semantics.
Dynamic focusing
For the dynamic method of focusing, consider which steps were missing from the
operational semantics. So instead of ruling out troublesome corners of the syntax, we
will instead add additional steps to kick-start stuck commands. Recall that in our
stuck call-by-value command, 〈((µβ. 〈f ||1 · β〉), 2)||pi1 [α]〉, the β×V operational step was
stuck because a pair with a non-value component needs to interact with a projection.
One thing we can do in this situation is lift the non-value component out of the
pair and assign it a name via an input abstraction. Such a step reveals a hidden µV
reduction and lets the computation continue to bring the application of f to the top:
〈((µβ. 〈f ||1 · β〉), 2)||pi1 [α]〉 7→? 〈µβ. 〈f ||1 · β〉||µ˜x. 〈(x, 2)||pi1 [α]〉〉
7→µV 〈f ||1 · µ˜x. 〈(x, 2)||pi1 [α]〉〉
Now, assuming that the call to f returns the result 3, the computation can continue
along to present 3 as the result to α, yielding the desired answer:
〈f ||1 · µ˜x. 〈(x, 2)||pi1 [α]〉〉 7→ 〈3||µ˜x. 〈(x, 2)||pi1 [α]〉〉
7→µ˜V 〈(3, 2)||pi1 [α]〉
7→β×V 〈3||α〉
That one extra lifting step was all that was needed to continue the computation and get
to the final command. Likewise, the stuck call-by-name command 〈(1, 2)||pi1 [µ˜x. 〈0||α〉]〉
has a non-co-value component in the projection, so we can similarly lift the component
87
out of the projection and assign it a name via an output abstraction:
〈(1, 2)||pi1 [µ˜x. 〈0||α〉]〉 7→? 〈µβ. 〈(1, 2)||pi1 [β]〉||µ˜x. 〈0||α〉〉
7→µ˜N 〈0||α〉
Lifting non-(co-)value components out of introduction forms of (co-)terms seems to
be the missing step in β-stuck commands.
The full set of such lifting rules are given in Figure 3.17 for the call-by-value
semantics and Figure 3.18 for the call-by-name semantics.8 These rules give the
minimum required extra steps to reduce hidden computations nested deeply inside
terms and co-terms in a way that matches the call-by-value and call-by-name semantics
for the λ-calculus. However, the ς laws are the first operational rules on (co-)terms,
rather than commands. As such, we must extend the context of our operational
reductions to allow for ς when necessary. For the call-by-value and call-by-name
operational semantics including ς, we have the following evaluation contexts (denoted
by D to avoid confusion with co-values):
D ∈ EvalCxtV ::=  | 〈||e〉 | 〈V ||〉 D ∈ EvalCxtN ::=  | 〈v||〉 | 〈||E〉
Still, unlike the λ-context, evaluation contexts are not arbitrarily nested, but only
ever place attention the entire command or its immediate (co-)term. For example,
in call-by-value we have the following operational ς reductions on either side of a
command like:
〈ι1 (v)||e〉 7→ς+V 〈µα. 〈v||µ˜y. 〈ι1 (y)||α〉〉||e〉 7→µV 〈v||µ˜y. 〈ι1 (y)||e〉〉
〈V ||v · e〉 7→ς→N 〈V ||µ˜x. 〈v||µ˜y. 〈x||y · e〉〉〉 7→µ˜V 〈v||µ˜y. 〈V ||y · e〉〉
and in call-by-name we have only operational ς reductions on the co-term side like:
〈v||pi1 [e]〉 7→ς×N 〈v||µ˜x. 〈µβ. 〈x||pi1 [β]〉||e〉〉 7→µ˜N 〈µβ. 〈v||pi1 [β]〉||e〉
Furthermore, note that extending the semantics of the dual calculi with the ς rules
preserves determinism of the operational semantics and confluence of the reduction
8The proviso that x, y, α, and β are fresh means that they do not appear free anywhere in the
command on the left-hand side of the operational reduction step.
88
(ς×V ) (v, v′) ς×V µα. 〈v||µ˜y. 〈(y, v
′)||α〉〉 (V, v) ς×V µα. 〈v||µ˜y. 〈(V, y)||α〉〉
(ς+V ) ιi (v) ς+V µα. 〈v||µ˜y. 〈ιi (y)||α〉〉
(ς→V ) v · e ς→V µ˜x. 〈v||µ˜y. 〈x||y · e〉〉
(ς∃V) B @ v ς∃V µα. 〈v||µ˜y. 〈B @ y||α〉〉

v /∈ValueV
α, x, y fresh
FIGURE 3.17. The focusing ς laws for the call-by-value (V) half of the dual calculi.
(ς×N ) pii [e] ς×N µ˜x. 〈µβ. 〈x||pii [β]〉||e〉
(ς+N ) [e, e′] ς+N µ˜x. 〈µβ. 〈x||[β, e
′]〉||e〉 [E, e] ς+N µ˜x. 〈µβ. 〈x||[E, β]〉||e〉
(ς→N ) v · e ς→N µ˜x. 〈µβ. 〈x||v · β〉||e〉
(ς∀N ) B @ e ς∀N µ˜x. 〈µβ. 〈x||B @ β〉||e〉

e/∈CoValueN
x, β fresh
FIGURE 3.18. The focusing ς laws for the call-by-name (N ) half of the dual calculi.
theory, since there are no critical pairs between the ς rules and µµ˜ηµηµ˜β rules in either
the call-by-value or call-by-name calculus.
For the µV µ˜VβVςV call-by-value operational semantics, the net effect is that the
final commands are always a value yielded to a co-variable or a simple co-value (that
is, a co-variable or a left introduction co-term) applied to a variable as follows:
FinalCommandV ::= 〈V ||α〉 | 〈x||Es〉
V ∈ ValueV ::= x | (V, V ′) | ι1 (V ) | ι2 (V ) | not(e) | λx.v | ΛX.v | B @ V
Es ∈ SimpleCoValueV ::= α | pi1 [e] | pi2 [e] | [e, e′] | not[v] | V · e | B @ e | Λ˜X.e
Dually for the µN µ˜NβN ςN call-by-name operational semantics, the final commands
are always a simple value (a variable or an introduction term) yielded to a co-variable
or a co-value applied to a variable as follows:
FinalCommandN ::= 〈Vs||α〉 | 〈x||E〉
Vs ∈ SimpleValueN ::= x | (v, v′) | ι1 (v) | ι2 (v) | not(e) | λx.v | ΛX.v | B @ v
E ∈ CoValueN ::= α | pi1 [E] | pi2 [E] | [E,E ′] | not[v] | v · E | B @ E | Λ˜X.e
89
If we only take well-typed commands into consideration, then we get a standard
type safety theorem which says that well-typed commands always reduce to a final
command, and do not get stuck on any interacting (and potentially mismatched)
introduction forms. The small-step version of type safety can be expressed as the
progress and preservation properties (Wright & Felleisen, 1994).
Theorem 3.3 (Progress and preservation). For any command c : (Γ ` ∆):
a) Progress: c is a call-by-value (respectively, call-by-name) final command or there
is a command c′ such that c 7→µV µ˜VβV ςV c′ (respectively, c 7→µN µ˜NβN ςN c′), and
b) Preservation: if c 7→µV µ˜VβV ςV c′ or c 7→µN µ˜NβN ςN c′, then c′ : (Γ ` ∆).
Proof. Progress follows by induction on the typing derivation of c : (Γ ` ∆). The
structural rules (for weakening, contraction, and exchange) follow immediately from
the inductive hypothesis and the Cut rule forms the base cases. For call-by-name,
progress is assured because for every well-typed co-term Γ | e : A ` ∆, either e is a
co-value, an input abstraction, or e ςN e′ for some e′. Therefore, if the cut is neither
final nor reducible, then the co-term reduces. Similarly for call-by-value, every well-
typed term Γ ` v : A | ∆ is either a value, an output abstraction or v ςV v′ for some
v′, and every well-typed co-term Γ | e : A ` ∆ is either a simple co-value, an input
abstraction, or e ςV e′ for some e′. Therefore, if the cut is neither final nor reducible,
then either the term reduces or the term is a value and the co-term reduces.
Preservation follows by cases on all the possible rewriting rules so that
– if c µµ˜ηµηµ˜βς c′ then c : (Γ ` ∆) implies c′ : (Γ ` ∆),
– if v µµ˜ηµηµ˜βς v′ then v : (Γ ` ∆)C implies v′ : (Γ ` ∆)C, and
– if e µµ˜ηµηµ˜βς e′ then e : (Γ ` ∆)C implies e′ : (Γ ` ∆)C.
for both call-by-value and call-by-name, using the fact that for Γ ` V : A | ∆ and
Γ′ | E : A ` ∆′:
– if c : (Γ′, x : A ` ∆′) then c {V/x} : (Γ′,Γ ` ∆′,∆),
– if c : (Γ ` α : A,∆) then c {E/α} : (Γ′,Γ ` ∆′,∆),
– if Γ′, x : A ` v : C | ∆′ then Γ′,Γ ` v {V/x} : C | ∆′,∆,
– if Γ ` v : C | α : A,∆ then Γ′,Γ ` v {E/α} : C | ∆′,∆,
90
– if Γ ` v : C | ∆ and X /∈ FV (Γ ` ∆) then Γ ` v {B/X} : C {B/X} | ∆,
– if Γ′, x : A | e : C ` ∆′ then Γ′,Γ | e {V/x} : C ` ∆′,∆,
– if Γ | e : C ` α : A,∆ then Γ′,Γ | e {E/α} : C ` ∆′,∆, and
– if Γ | e : C ` ∆ and X /∈ FV (Γ ` ∆) then Γ | e {B/X} : C {B/X} ` ∆,
each of which follows by induction on the typing derivation of c, v : C and e : C.
From progress and preservation, we can derive the following big-step statement
of type safety.
Theorem 3.4 (Type safety). For any dual calculi command c : (Γ ` ∆):
– if c 7→ µV µ˜VβV ςV c′ then c′ : (Γ ` ∆) and c′ is irreducible (i.e. c′ 67→µV µ˜VβV ςV ) if and
only if c′ is a call-by-value final command, and
– if c 7→ µN µ˜NβN ςN c′, then c′ : (Γ ` ∆) and c′ is irreducible (i.e. c′ 67→µN µ˜NβN ςN ) if
and only if c′ is a call-by-name final command.
Proof. By induction on the left-to-right reflexive-transitive structure of c 7→ µV µ˜VβV ςV c′
and c 7→ µN µ˜NβN ςN c′, using progress (Theorem 3.3 (a)) for the reflexive case and
preservation (Theorem 3.3 (b)) for the transitive case.
Remark 3.8. The original λµµ˜-calculus used a different β rule for functions, namely:
(β→) 〈λx.v||v′ · e〉 β→ 〈v′||µ˜x. 〈v||e〉〉 x /∈ FV (e)
This β→ works the same for both call-by-name and call-by-value reduction; since
the argument v′ is bound to x with an input abstraction, the rules of the core µµ˜-
calculus take over to determine whether or not the argument is evaluated now (by a
µV reduction, for example) or later (by a µ˜N reduction). Furthermore, this form of β→
reduction applies more often than the strategy-specific β→V and β→N , so we might ask
if it avoids the need of focusing for functions altogether. Unfortunately, the general
β→ rule still suffers a similar, if more subtle, fate as the strategy-specific β rules. For
example, consider the command 〈f ||µβ. 〈1||α〉 · µ˜x. 〈0||α〉〉 which corresponds to the
expression let z = f (abort 1) in 0 in a functional language containing the control
operator abort that halts the current computation and yields its argument as the
result. In call-by-value this expression should evaluate to 1, and in call-by-name it
91
should evaluate to 0, but the β→ rule does not help us since there is a free variable f
instead of a λ-abstraction. In this command, the ς rules are still necessary to get the
final result, and unfortunately combining the general β→ rule with ς→ creates a mild
form of non-determinism in the operational semantics since some β→ redexes are also
ς→ redexes (though the associated reduction theories are still confluent).
As it turns out, though, the combination of lifting and strategy-specific β→
reductions are more powerful than the generalized β→ rule. In call-by-value, the
combination of ς→V , µ˜V , and β→V exactly simulate the λµµ˜-calculus β→ rule as follows:
〈λx.v||v′ · e〉 7→ς→V 〈λx.v||µ˜y. 〈v′||µ˜x. 〈y||x · e〉〉〉 7→µ˜V 〈v′||µ˜x. 〈λx.v||x · e〉〉→β→V 〈v||µ˜x. 〈v||e〉〉
In call-by-name, observe that the combination of λµµ˜’s β→ and µ˜N rules simulate the
call-by-name-specific β→N even when the call stack is not a co-value,
〈λx.v||v′ · e〉 7→β→ 〈v′||µ˜x. 〈v||e〉〉 7→µ˜N 〈v {v′/x}||e〉
but together the µ˜Nηµβ→N ς→N rules perform the same reduction as follows:
〈λx.v||v′ · e〉 7→ς→N 〈λx.v||µ˜y. 〈µα. 〈y||v′ · α〉||e〉〉 7→µ˜N 〈µα. 〈λx.v||v′ · α〉||e〉
→β→N 〈µα. 〈v {v′/x}||α〉||e〉 →ηµ 〈v {v′/x}||e〉
So even though type safety (Theorem 3.4) cannot dispense with the ς→ rules by
adopting the λµµ˜-calculus’ original β→ rules, we can still rely on the combination of
strategy-specific β→ς→ rules from Figures 3.13 and 3.17 and Figures 3.14 and 3.18 to
get all the same results with deterministic operational semantics. End remark 3.8.
Static versus dynamic focusing
Now that we have two different methods for addressing β-stuck commands, one
question still remains: what do the static and dynamic methods have to do with one
another? As it turns out, they are compatible and complementary solutions to the
same problem—two sides of the same coin—that apply the same essential idea at
different times. First, one of the major features of static focusing in proof theories and
type systems is that the apparent restriction on inference rules is no real restriction at
all: every program (i.e. proof) in the original system has a corresponding program with
the same type (i.e. specification) in the focused sub-system. We can make this claim
92
J〈v||e〉KQ , 〈JvKQ∣∣∣∣∣∣JeKQ〉JxKQ , x Jµα.cKQ , µα.JcKQJ(v, v′)KQ , µα. 〈JvKQ∣∣∣∣∣∣µ˜x. 〈J(x, v′)KQ∣∣∣∣∣∣α〉〉 J(V, v)KQ , µα. 〈JvKQ∣∣∣∣∣∣µ˜x. 〈J(V, x)KQ∣∣∣∣∣∣α〉〉J(V, V ′)KQ , (JV KQ, JV ′KQ)Jιi (v)KQ , µα. 〈JvKQ∣∣∣∣∣∣µ˜x. 〈Jιi (x)KQ∣∣∣∣∣∣α〉〉 Jιi (V )KQ , ιi (JV KQ)Jnot(e)KQ , not(JeKQ)Jλx.v′KQ , λx.Jv′KQ JΛX.v′KQ , ΛX.Jv′KQJB @ vKQ , µα. 〈v∣∣∣∣∣∣µ˜x. 〈JB @ xKQ∣∣∣∣∣∣α〉〉 JB @ V KQ , B @ JV KQ
JαKQ , α Jµ˜x.cKQ , µ˜x.JcKQJpii [e]KQ , pii [JeKQ] J[e, e′]KQ , [JeKQ, Je′KQ] Jnot[v′]KQ , not[Jv′KQ]Jv · eKQ , µ˜x. 〈JvKQ∣∣∣∣∣∣µ˜y. 〈x∣∣∣∣∣∣Jy · eKQ〉〉 JV · eKQ , JV KQ · JeKQ
JB @ eKQ , B @ JeKQ qΛ˜X.eyQ , Λ˜X.JeKQ
where v /∈ ValueV
FIGURE 3.19. The Q-focusing translation to the LKQ sub-syntax.
more formally for LKQ and LKT by observing that the syntactic transformations in
Figures 3.19 and 3.20 translate general dual calculi expressions into the LKQ and
LKT sub-syntaxes, respectively, with the same type (by generalizing the proof of
preservation in Theorem 3.3 (b)). These translations are defined in such a way that
an expression that happens to already lie in the LKQ sub-syntax is not altered by
Q-focusing translation, and likewise LKT expressions are not altered by T -focusing
translation.
With the focusing translations and the ς reduction theory in hand, we can now
observe that both the static and dynamic methods of focusing amount to the same
thing. In particular, notice that the LKQ sub-syntax is just the ςV-normal forms
from the original dual calculus and the Q-focusing translation performs call-by-value
ςV-normalization, and similarly the T -focusing translation is just call-by-name ςN -
normalization into the LKT sub-syntax of ςN -normal forms, which can be confirmed
by induction on the syntax of (co-)terms and commands.
93
J〈v||e〉KT , 〈JvKT ∣∣∣∣∣∣JeKT〉
JxKT , x Jµα.cKT , µα.JcKTJ(v, v′)KT , (JvKT , Jv′KT) Jιi (v)KT , pii [JvKT ] Jnot(e)KT , not(JeKT )Jλx.vKT , λx.JvKT JΛX.vKT , ΛX.JvKT JB @ vKT , B @ JvKT
JαKT , α Jµ˜x.cKT , µ˜x.JcKTJpii [e]KT , µ˜x. 〈µα. 〈x∣∣∣∣∣∣Jpii [α]KT〉∣∣∣∣∣∣JeKT〉 Jpii [E]KT , pii [JEKT ]J[e, e′]KT , µ˜x. 〈µα. 〈x∣∣∣∣∣∣J[α, e′]KT〉∣∣∣∣∣∣JeKT〉 J[E, e]KT , µ˜x. 〈µα. 〈x∣∣∣∣∣∣J[E,α]KT〉∣∣∣∣∣∣JeKT〉J[E,E ′]KT , [JEKT , JE ′KT ] Jnot[v]KT , not[JvKT ]Jv · eKT , µ˜x. 〈µα. 〈x∣∣∣∣∣∣Jv · αKT〉∣∣∣∣∣∣JeKT〉 Jv · EKT , JvKT · JEKTJB @ eKT , µ˜x. 〈µα. 〈x∣∣∣∣∣∣JB @ αKT〉∣∣∣∣∣∣JeKT〉 JB @ EKT , B @ JEKTq
Λ˜X.e′
yT , Λ˜X.Je′KT
where e /∈ CoV alueN
FIGURE 3.20. The T -focusing translation to the LKT sub-syntax.
94
Theorem 3.5 (Focusing). – Every LKQ command, term, and co-term is a ςV-
normal form, and cςV JcKQ, v ςV JvKQ, and eςV JeKQ.
– Every LKT command, term, and co-term is a ςN -normal form, and cςN JcKT ,
v ςN JvKT , and eςN JeKT .
Proof. The fact that LKQ expressions are ςV-normal forms and LKT expressions are
ςN -normal forms is apparent from the syntax of LKQ and LKT. Furthermore, the fact
that cςV JcKQ, cςN JcKT , and so on follows by mutual induction on the syntax of
commands and (co-)terms.
Therefore, the difference between the static and dynamic methods of focusing
is not a matter of what but when: do we prefer to leave ς redexes to happen during
execution, or would we rather reduce them all up front as a preprocessing pass?
Remark 3.9. By representing a calling context with an explicit syntactic object e,
we have a direct representation of a tail-recursive interpreter (Ariola et al., 2009a),
which can also be seen as a form of abstract machine. In particular, we may view the
syntax of the dual calculi as a more abstract representation of a CEK-style machine
(Felleisen & Friedman, 1986) or a Krivine-style machine (Krivine, 2007): the control
(C) is represented by a term v, the continuation (K) is represented by a co-term e,
and the environment (E) is implicit and instead implemented by the capture-avoiding
substitution operation. Finally, the configuration state of the machine is represented
by a command c. Interestingly, though, the treatment of focusing in these machines
tends to be asymmetrical depending on the evaluation strategy: call-by-value abstract
machines tend to rely on dynamic focusing during execution, whereas call-by-name
abstract machines tend to maintain static focusing.
For example, consider a variation on a Krivine machine with implicit substitution
for call-by-name evaluation of λ-calculus terms:
〈v v′||E〉 〈v||E[ v′]〉
〈λx.v||E[ v′]〉 〈v {v′/x}||E〉
This machine uses two forms of evaluation context—the application of the computation
in question to an argument, E[ v′], and the empty context, —for finding the next
β-redex to perform. We can relate the states of this call-by-name machine to the call-
by-name dual calculus by translating the evaluation contexts to co-terms. The empty
95
context can be represented by just an arbitrary co-variable α, and the application to
an argument is represented directly as a call stack co-term: E[ v′] , v′ ·E. With this
interpretation, the first rule of the machine states the relationship between function
application in the λ-calculus and call stacks in the dual calculus, and the second rule
is exactly the β→N operational step. Note that if we always start with a co-value in the
machine state then the first rule only ever builds co-values in the LKT sub-syntax.
For example, by evaluating a term v in the “empty context” as 〈v||α〉, the co-term in
the machine will always be a chain of call stacks with some number of arguments like
v1 · v2 · v3 · v4 · α. Therefore, this Krivine-style machine operates within the statically
focused LKT sub-syntax.
Now consider the following variation on a CEK machine with implicit substitution
for call-by-value evaluation of λ-calculus terms:
〈v v′||E〉 〈v||E[ v′]〉
〈V ||E[ v]〉 〈v||E[V ]〉
〈V ||E[(λx.v) ]〉 〈v {V/x}||E〉
Compared to the call-by-name machine above, the machine uses one additional form
of evaluation context—the application of a function value to the computation in
question E[V ]—for finding the next β-redex to perform. We can extend the previous
translation of evaluation contexts to co-terms so that an applied function value is
represented indirectly with an input abstraction: E[V ] , µ˜x. 〈V ||x · E〉. With this
interpretation, the first rule of the machine relates function application and call stacks
as before, the second rule of the machine is a combined ς→V µ˜V step,
〈V ||v · E〉 7→ς→V 〈V ||µ˜x. 〈v||µ˜y. 〈x||y · E〉〉〉 7→µ˜V 〈v||µ˜y. 〈V ||y · E〉〉
and the last rule is a combined µ˜Vβ→V step:
〈V ||µ˜y. 〈λx.v||y · E〉〉 7→µ˜V 〈λx.v||V · E〉 7→β→V 〈v {V/x}||E〉
Notice that this machine does not necessarily operate within the LKQ sub-syntax:
the first rule might push a non-value computation onto a call stack. In this case, the
ς→V rule is needed to refocus the machine during execution. Of course, we could avoid
the need for ς→V reduction at run-time by changing our interpretation of application
96
to pre-ς→V -normalize the call stack, as in E[ v] , µ˜x. 〈v||µ˜y. 〈x||y · E〉〉. However, this
is just a matter of taste since the two timings of focusing amount to the same thing
(Theorem 3.5). End remark 3.9.
Call-by-value is dual to call-by-name
We now turn to the duality for which the dual calculi are named. We saw how the
symmetries of the sequent calculus present a logical duality that captures De Morgan
duals in Section 3.1. This duality is carried over by the Curry-Howard isomorphism
and presents itself as two dualities in programming languages:
(1) a duality between the static semantics (types) of languages, and
(2) a duality between the dynamic semantics (reductions) of languages.
These dualities of programming languages were first observed by Filinski (1989) from
the correspondence with duality in category theory, which was later expanded upon
by Selinger (2001, 2003) in the style of natural deduction. Curien & Herbelin (2000)
and Wadler (2003, 2005) brought this duality to the language of sequent calculus, and
show how it is better reflected in the language as a duality of syntax corresponding
to the inherent symmetries in the logic.
The static aspect of duality between types comes directly from the logical duality
of the sequent calculus. Since duality spins a sequent around its turnstyle, so that
assumptions are exchanged with conclusions, we also have a corresponding swap in the
programming language. The dual of a term v of type A is a co-term of the dual type
and vice versa, so that the term and co-term components of a command are swapped.
Likewise, the duality on types lines up directly with the De Morgan duality on logical
propositions. For example, since the types for pairs (×) and sums (+) correspond to
conjunction (∧) and disjunction (∨), we have the same relationship with the duality
operation C⊥:
(A×B)⊥ , (A⊥) + (B⊥) (A+B)⊥ , (A⊥)× (B⊥)
Also following the De Morgan duality, negation (¬) is self-dual.
However, just like we found in Gentzen’s LK sequent calculus in Section 3.1, the
dual calculi presented in Figure 3.12 are missing the counterpart to functions. By
analogy, we complete the duality of components in the calculus by adding the dual of
97
functions, also referred to as subtraction, that represent a transformation on co-terms,
as the counterpart to a transformation on terms. The typing rules for subtraction are
the same as the logical rules for subtraction in LK, and the syntax is reversed from
functions in the dual calculi:
Γ | e : A ` ∆ Γ′ ` v : B | ∆′
Γ,Γ′ ` e · v : B − A | ∆,∆′ −R
Γ | e : B ` α : A,∆
Γ | λ˜α.e : B − A ` ∆ −L
Similarly, the β− and ς− operational rules for subtraction are the mirror image of the
corresponding rules for functions. In call-by-value we have:
V alueV ::= . . . | e · V
(β−V )
〈
E · V
∣∣∣∣∣∣λ˜α.e〉 β−V 〈V ||e {E/α}〉
(ς−V ) e · v ς−V µα. 〈v||µ˜y. 〈e · y||α〉〉 (v /∈ V alueV , x fresh)
and in call-by-name we have:
CoV alueN ::= . . . | λ˜α.e
(β−N )
〈
E · V
∣∣∣∣∣∣λ˜α.e〉 β−N 〈V ||e {E/α}〉
(ς−N ) e · v ς−N µα. 〈µβ. 〈β · v||α〉||e〉 (e /∈ CoV alueN , α fresh)
With the dual counterpart to functions in place, the full duality relationship of
types and programs of the dual calculi is defined in Figure 3.21, where we assume
an underlying involutive bijection x and α between variables and co-variables.9 First,
notice that the duality operation is involutive on the nose: the dual of the dual is
exactly the same as the original (Wadler, 2003).
Theorem 3.6 (Involutive duality). The duality operation ⊥ on environments,
sequents, types, commands, terms, and co-terms is involutive, so that ⊥⊥ is the identity
transformation.
Proof. By mutual induction on the definition of the duality operation ⊥
9By an involutive bijection, we mean that x gives a (co-)variable and α gives a variable such that
x ≡ y and α ≡ β if and only if x ≡ y and α ≡ β, and also that x ≡ x and α ≡ α.
98
Duality of sequents:
(c : (Γ ` ∆))⊥ , c⊥ : (∆⊥ ` Γ⊥)
(Γ ` v : A | ∆)⊥ , ∆⊥ | v⊥ : A⊥ ` Γ⊥ (Γ | e : A ` ∆)⊥ , ∆⊥ ` e⊥ : A⊥ | Γ⊥
(xn : An, . . . , x1 : A1)⊥ , x⊥1 : A⊥1 , . . . , x⊥n : A⊥n
(α1 : A1, . . . , αn : An)⊥ , α⊥n : A⊥n , . . . , α⊥1 : A⊥1
Duality of types:
(X)⊥ , X
(A×B)⊥ , (A⊥) + (B⊥) (A+B)⊥ , (A⊥)× (B⊥)
(A→ B)⊥ , (B⊥)− (A⊥) (B − A)⊥ , (A⊥)→ (B⊥)
(∀X.A)⊥ , ∃X.(A⊥) (∃X.A)⊥ , ∀X.(A⊥)
(¬A)⊥ , ¬(A⊥)
Duality of programs:
〈v||e〉⊥ ,
〈
e⊥
∣∣∣∣∣∣v⊥〉
(x)⊥ , x [α]⊥ , α
(µα.c)⊥ , µ˜α.c⊥ [µ˜x.c]⊥ , µx.c⊥
(v1, v2)⊥ ,
[
v⊥1 , v
⊥
2
]
[e1, e2]⊥ ,
(
e⊥1 , e
⊥
2
)
ι1 (v)⊥ , pi1
[
v⊥
]
pi1 [e]⊥ , ι1
(
e⊥
)
ι2 (v)⊥ , pi2
[
v⊥
]
pi2 [e]⊥ , ι2
(
e⊥
)
not(e)⊥ , not[e⊥] not[v]⊥ , not(v⊥)
(λx.v)⊥ , λ˜x.[v⊥] [λ˜α.e]⊥ , λα.(v⊥)
(e · v)⊥ , e⊥ · v⊥ [v · e]⊥ , v⊥ · e⊥
(ΛX.v)⊥ , Λ˜X.[v⊥] [Λ˜X.e]⊥ , ΛX.(e⊥)
(B @ v)⊥ , B⊥ @ [v⊥] [B @ e]⊥ , B⊥ @ (e⊥)
FIGURE 3.21. The duality relation between the dual calculi.
99
This relationship is not just a syntactic word game, but it gives us a duality
between the typing derivations of terms and co-terms (Curien & Herbelin, 2000;
Wadler, 2003):
Theorem 3.7 (Static duality).
a) c : (Γ ` ∆) is well-typed if and only if c⊥ : (∆⊥ ` Γ⊥) is.
b) Γ ` v : A | ∆ is well-typed if and only if ∆⊥ | v⊥ : A⊥ ` Γ⊥ is.
c) Γ | e : A ` ∆ is well-typed if and only if ∆⊥ ` e⊥ : A⊥ | Γ⊥ is.
Furthermore, if a command, term, or co-term lies in the LKQ sub-syntax, its dual lies
in the LKT sub-syntax and vice versa.
Proof. By induction on the typing derivation.
The dynamic aspect of duality takes form as a relationship between the two
reduction systems for evaluating programs: call-by-value reduction is dual to call-by-
name reduction. That is, if we have a command c that behaves a certain way according
to the call-by-value calculus, then the dual command c⊥ behaves in a correspondingly
dual way according to the call-by-name calculus, and vice versa. The two dynamic
semantics (operational, reduction, and equational) mirror each other exactly, rule for
rule (Curien & Herbelin, 2000; Wadler, 2003).
Theorem 3.8 (Dynamic duality). a) c µV µ˜VβV c′ if and only if c⊥ µN µ˜NβN c′⊥,
and dually c µN µ˜NβN c′ if and only if c⊥ µV µ˜VβV c′⊥.
b) v ηµςV v′ if and only if v⊥ ηµ˜ςN v′⊥, and dually v ηµςN v′ if and only if
v⊥ ηµ˜ςV v′⊥.
c) e ηµ˜ςV e′ if and only if e⊥ ηµςN e′⊥, and dually e ηµ˜ςN e′ if and only if
e⊥ ηµςV e′⊥.
Proof. By cases on the respective rewriting rules, using the fact that substitution
commutes with duality ((c {V/x})⊥ =α c⊥
{
V ⊥/x
}
, (c {E/α})⊥ =α c⊥
{
E⊥/α
}
,
(c {A/X})⊥ =α c⊥
{
A⊥/X
}
, and similarly for (co-)terms) which is guaranteed by
the fact that the duality operation is compositional and hygienic (Downen & Ariola,
2014a).
100
CHAPTER IV
Polarity
Looking back to Gentzen’s original LK from Figure 3.5, a careful eye might notice
that there is a bit of an inconsistency among the logical rules. In particular, compare
left implication introduction (⊃L) with right conjunction (∧R) and left disjunction
(∨L) introduction and notice how they treat their auxiliary propositions (hypotheses
Γ and consequences ∆) very differently. In both the ∧R and ∨L rules, the auxiliary
propositions are shared among both premises and the deduction: each sequent contains
exactly the same extra hypotheses (Γ) and consequences (∆). However, the ⊃L rule
does not follow this pattern. Instead, the two premises of the ⊃L rule contain different
auxiliary propositions from one another, which are then combined together in the
deduction: each sequent contains potentially different hypotheses and consequences.
Why are the rules for implication appear so different from the rules for conjunction
and disjunction? Is this merely a notational accident, or is there some significance
to the way these side propositions are threaded through the proof tree? As it turns
out, we can classify the logical connectives in a way that emphasizes this distinction,
which through the Curry-Howard lense has a profound impact on our understanding
of the computational nature of the sequent calculus. Before in Chapter III, we found
that the sequent calculus shows us the duality between evaluation strategies—namely
the call-by-value and call-by-name strategies—via two distinct languages with the
same syntax but different semantics. This distinction between the semantics of the
dual call-by-value and call-by-name calculi becomes apparent when we consider the
operational behavior of programs. For example, Wadler (2003) was able to encode
functions in terms of the other connectives, but surprisingly different encodings are
necessary for both call-by-value and call-by-name. Even though they share a syntax,
the two dual calculi truly describe different languages. Instead, we will soon find that
an alternative interpretation of the sequent calculus lets us express the same duality
of evaluation within the same language, so that a single program might employ both
call-by-value and call-by-name during its execution.
101
Additive and Multiplicative LK
Recall back to the basic left introduction inference rules for conjunction in
Figure 3.2. These rules state that if A is false then A ∧ B is false, and likewise
if B is false then A ∧ B is false as well. However, there is another presentation of
conjunction that makes use of the internal structure of sequents. We originally decided
in Chapter III to interpret a sequent as meaning that the truth of all hypotheses entails
the truth of one consequence. So for example, the sequent A,B ` C,D means that
“A and B entails C or D.” In other words, the commas to the left are pronounced as
“and,” and the commas to the right are pronounced “or.”
We might then formalize this interpretation by saying that the logical connective
for conjunction actually corresponds to a comma on the left, so that the sequents
A,B ` and A ∧B ` are equally valid as shown by the two inferences which reverse
one another (from bottom-up to top-down):
A,B `
A ∧B `
A ∧B `
A,B `
Notice how the sequents A,B ` and A ∧ B ` are equivalent statements since both
mean that “A and B entails false,” which justifies that the above inference rules are
valid. Likewise, we could equate the logical connective for disjunction with a comma
on the right, so that the sequents ` A,B and ` A∨B are equally valid as shown by
the inferences:
` A,B
` A ∨B
` A ∨B
` A,B
This gives an alternative to the right introduction rules for disjunction in contrast to
the ones given in Figure 3.3.
Notice how the above alternative rules for conjunction and disjunction are
reversible: both the top-down and bottom-up inferences are valid. More generally,
an inference of the form
H1 H2 . . . Hn
J
102
is reversible when there are derivations D1,D2, . . . ,Dn for each of
J.... D1
H1
J.... D2
H2 . . .
J.... D3
Hn
and irreversible otherwise. So we can say that the above alternative left introduction
of conjunction and right introduction of disjunction are both reversible. These
formulations of conjunction and disjunction contrast with the rules that were given in
Figures 3.2 and 3.3. Clearly A ` (i.e. “A is false”) is a much stronger statement than
A ∧ B ` (i.e. “the conjunction of A and B is false”), so the left introduction rules
given in Figure 3.2 are irreversible. Likewise, ` A (i.e. “A is true”) is a much stronger
statement than ` A ∨B (i.e. “either A or B is true”), so the right introduction rules
for conjunction given in Figure 3.3 are also irreversible.
It seems that we have a substantive choice on how we might phrase conjunction
and disjunction in the setting of the sequent calculus. Instead of just arbitrarily
choosing one of them, we can consider all the possibilities at once in the same logic as
shown in Figure 4.1. In this combined logic, we have two separate logical connectives for
conjunction and two connectives for disjunction. Additionally, there are two separate
constants (i.e. nullary connectives) for truth and falsehood. Our original formulation
of conjunction (∧) and disjunction (∨) in LK from Figure 3.5 are preserved as the &
and ⊕ connectives, respectively, as well as truth (>) and falsehood (⊥) which go by the
same name. The new alternatives for truth, falsehood, conjunction, and disjunction
discussed above are denoted by the 1, 0, ⊗, and ` connectives, respectively. Finally,
the presentation of negation (¬) and implication (⊃) is unchanged. For now we delay
further discussion of the quantifiers until Chapter VI.
Now we can more formally analyze the reversibility of the logical rules for the
different variations of the connectives. The left introduction for ⊗-conjunction and ⊕-
disjunction are reversible because the sequent Γ, A,B ` ∆ follows from Γ, A⊗B ` ∆
and each of Γ, A ` ∆ and Γ, B ` ∆ follow from Γ, A⊕B ` ∆:
A ` A Ax B ` B Ax
A,B ` A⊗B ⊗R Γ, A⊗B ` ∆
Γ, A,B ` ∆ Cut
103
A,B,C ∈ Proposition ::= X | 0 | 1 | A⊕B | A⊗B | > | ⊥ | A&B | A`B | A ⊃ B | ¬A
Γ ∈ Hypothesis ::= A1, . . . , An ∆ ∈ Consequence ::= A1, . . . , An
Judgement ::= Γ ` ∆
Axiom and cut:
A ` A Ax
Γ ` A,∆ Γ′, A ` ∆′
Γ′,Γ′ ` ∆′,∆ Cut
Logical rules:
` 1 1R
Γ ` ∆
Γ, 1 ` ∆ 1L no 0R rule Γ, 0 ` ∆
0L
Γ ` A,∆ Γ′ ` B,∆′
Γ,Γ′ ` A⊗B,∆,∆′ ⊗R
Γ, A,B ` ∆
Γ, A⊗B ` ∆ ⊗L
Γ ` >,∆ >R not >L rule Γ ` ∆Γ ` ⊥,∆ ⊥R ⊥ ` ⊥L
Γ ` A,∆ Γ ` B,∆
Γ ` A&B,∆ &R
Γ, A ` ∆
Γ, A&B ` ∆ &L1
Γ, B ` ∆
Γ, A&B ` ∆ &L2
Γ ` A,∆
Γ ` A⊕B,∆ ⊕R1
Γ ` B,∆
Γ ` A⊕B,∆ ⊕R2
Γ, A ` ∆ Γ, B ` ∆
Γ, A⊕B ` ∆ ⊕L
Γ ` A,B,∆
Γ ` A`B,∆ `R
Γ, A ` ∆ Γ, B ` ∆′
Γ,Γ′, A`B ` ∆,∆′ `L
Γ, A ` ∆
Γ ` ¬A,∆ ¬R
Γ ` A,∆
Γ,¬A ` ∆ ¬L
Γ, A ` B,∆
Γ ` A ⊃ B,∆ ⊃R
Γ ` A,∆ Γ′, B ` ∆′
Γ,Γ′, A ⊃ B ` ∆,∆′ ⊃L
Structural rules:
Γ ` ∆
Γ ` A,∆ WR
Γ ` ∆
Γ, A ` ∆ WL
Γ ` A,A,∆
Γ ` A,∆ CR
Γ, A,A ` ∆
Γ, A ` ∆ CL
Γ ` ∆, A,B,∆′
Γ ` ∆, B,A,∆′ XR
Γ, A,B,Γ′ ` ∆
Γ, B,A,Γ′ ` ∆ XL
FIGURE 4.1. An additive and multiplicative LK sequent calculus: with two truths
(1, >), two falsehoods (0, ⊥), two conjunctions (⊗, &), two disjunctions (⊕, `), one
negation (¬), and one implication (⊃).
104
A ` A Ax
A ` A⊕B ⊕R1 Γ, A⊕B ` ∆
Γ, A ` ∆ Cut
B ` B Ax
B ` A⊕B ⊕R2 Γ, A⊕B ` ∆
Γ, B ` ∆ Cut
However, the right rules of ⊗-conjunction and ⊕-disjunction are irreversible because
the premises are stronger than the conclusion. Clearly neither Γ ` A,∆ nor Γ ` B,∆
follow from the weaker sequent Γ ` A⊕B,∆, but also neither Γ ` A,∆ nor Γ′ ` B,∆′
follow from Γ,Γ′ ` A ⊗ B,∆,∆′ because of the way that the side-propositions Γ,Γ′
and ∆,∆′ from the conclusion are split up between the two premises.
In contrast, the right introduction rules for &-conjunction, `-disjunction, and
→-implication are reversible because the premises are weak enough to be proved from
the conclusions:
Γ ` A&B,∆
A ` A Ax
A&B ` A &L1
Γ ` A,∆ Cut
Γ ` A&B,∆
B ` B Ax
A&B ` B &L2
Γ ` B,∆ Cut
Γ ` A`B,∆ A ` A
Ax
B ` B Ax
A`B ` A,B `L
Γ ` A,B,∆ Cut
Γ ` A ⊃ B,∆
A ` A Ax B ` B Ax
A,A ⊃ B ` B ⊃L
Γ, A ` ∆, B Cut.... XR
Γ, A ` B,∆
However, each of the &-conjunction,`-disjunction, and⊃-implication left introduction
rules are irreversible for similar reasons as the right introduction rules for ⊗-
conjunction and ⊕-disjunction. Clearly neither Γ, A ` ∆ nor Γ, B ` ∆ follow from
the weaker sequent Γ, A&B ` ∆. Furthermore, both the `L and ⊃L share the same
splitting problem that causes the irreversibility of ⊗R.
One consequence of reversibility is that any derivation whose conclusion matches
the conclusion of a reversible rule might as well end with that reversible rule, because
we can always extract out the premises to the rule and then reassemble the same
conclusion. For example, suppose that we have a derivation D of the sequent Γ ` A&
B,∆, where the proposition A&B appears on the right side. Then by the reversibility
of the &R rule noted above, we have derivations from Γ ` A&B,∆ to Γ ` A,∆ and
Γ ` A,∆, which we will denote by the names &R1−1 and &R2−1 respectively. These
two reverse derivations let us expand D to get an extended derivation which ends with
105
&R as follows:
.... D
Γ ` A&B,∆ ≺
.... D
Γ ` A&B,∆.... &R1−1
Γ ` A,∆
.... D
Γ ` A&B,∆.... &R2−1
Γ ` B,∆
Γ ` A&B,∆ &R
Similarly, we can expand arbitrary derivations of sequents with A`B or A→ B on
the right side using the derivations `R−1 and →R−1 which reverse the `R and →R
right introduction rules:
.... D
Γ ` A`B,∆ ≺
.... D
Γ ` A`B,∆.... `R−1
Γ ` A,B,∆
Γ ` A`B,∆ `R
.... D
Γ ` A→ B,∆ ≺
.... D
Γ ` A→ B,∆.... →R−1
Γ, A ` B,∆
Γ ` A→ B,∆ →R
The same expansion also occurs when the proposition A⊕B or A⊗B appears
on the left of the concluding sequent, by using the ⊕L1−1, ⊕L2−1, and ⊗L−1 reverse
derivations of the ⊕L and ⊗L left introduction rules.
.... D
Γ, A⊕B ` ∆ ≺
.... D
Γ, A⊕B ` ∆.... ⊕L1−1
Γ, A ` ∆
.... D
Γ, A⊕B ` ∆.... ⊕L2−1
Γ, B ` ∆
Γ, A⊕B ` ∆ ⊕L
.... D
Γ, A⊗B ` ∆ ≺
.... D
Γ, A⊗B ` ∆.... ⊗L−1
Γ, A,B ` ∆
Γ, A⊗B ` ∆ ⊗L
So in comparison with natural deduction, whereas the steps of cut elimination
(Section 3.1) in the sequent calculus correspond with local soundness (Section 2.1),
the above reversibility expansions correspond with local completeness.
With both variations of the connectives included in a single logic, we can compare
and contrast them by the emergent properties of their logical rules. Notice how the
auxiliary hypotheses Γ and consequences ∆ in the &R and ⊕L rules are shared among
106
both premises as well in the conclusion, so that Γ and ∆ are “copied” when the rules
are read from the bottom-up. Because the side-propositions are copied bottom-up, we
say that the &-conjunction and ⊕-disjunction are additive connectives. In contrast,
in each of the ⊗R, `L, and ⊃L rules the two premises contain different auxiliary
hypotheses and consequences which are “merged” when the rules are read from the
top-down. Because the side-propositions are merged top-down, we say that the ⊗-
conjunction, `-disjunction, and ⊃-implication are multiplicative connectives. In the
degenerate case for the nullary connectives, we can say that > and 0 are additive
because the Γ and ∆ in the conclusion of their only introduction rule (>R and ⊥L)
is “copied” among its zero premises, whereas the 1 and 0 have rules (1R and 0L) that
“merge” the hypothesis and conclusions from their zero premises into the conclusion.
Note that ¬-negation is neither additive nor multiplicative—or perhaps it could be
considered both additive and multiplicative—since both its right and left introduction
rules have exactly one premise.
Besides the additive-multiplicative distinction, there is another axis which is
perhaps more fundamental upon which we can classify the connectives. Recall the
previous discussion of reversibility of the inference rules that lead us to consider ⊗-
conjunction and `-disjunction as alternatives to the &-conjunction and ⊕-conjunction
that were inherited from Gentzen’s LK. Both the ⊗-conjunction and ⊕-disjunction
have reversible left introductions because the premises are weak enough to be proved
from the conclusion. On the flip side, we saw that &-conjunction, `-disjunction,
and ⊃-implication have reversible right introductions for dual reasons. We can thus
divide the logical connectives based on two polarities: connectives with reversible left
introductions and irreversible right introductions are positive, and dually connectives
with reversible right introductions and irreversible left introductions are negative.
Based on our previous analysis, we can say that ⊗-conjunction and ⊕-disjunction
are positive, whereas &-conjunction, `-disjunction, and ⊃-disjunction are negative.
Note again that ¬-negation does not directly participate in this classification and is
neutral with regard to polarity because both the left and right ¬ introductions are
reversible, making it both—or neither, depending on our perspective—positive and
negative at the same time. We can thus categorize all the binary connectives along the
additive-multiplicative and positive-negative axes, as shown in Figure 4.2. These two
classifications are enough to separate all of the connectives into different quadrants
107
Positive Negative
Additive ⊕ &
Multiplicative ⊗ `, ⊃
FIGURE 4.2. The positive/negative and additive/multiplicative classification of
binary connectives.
based on their properties, so that only ` and ⊃ share the same quadrant showing
that these are the two connectives that are most similar to one another.
Pattern Matching and Extensionality
Let us now consider a language for the additive and multiplicative LK sequent
calculus which is well suited for expressing the polarity of connectives within the
form of its expressions. The language shown in Figure 4.3, which extends the core
µµ˜-calculus from Figure 3.7, is based on Munch-Maccagnoni’s (2009) system L family
of calculi.1 System L is visually rather different from the dual calculi we studied
previously in Chapter III, where its most obvious first departure from the dual calculi
is its pervasive use of pattern-matching as a core language construct.
One way to understand the role of pattern-matching in programming and its
connection to polarity is to look at Dummett’s 1976 lectures (Dummett, 1991) on
the justification of logical principles. In essence, Dummett suggested that there are
effectively two ways for framing the meaning of logical laws, which reveals a certain
bias in the logician: the verificationist and the pragmatist.
In the eyes of a verificationist, it is the rules for proving a proposition
(corresponding to the right introduction rules in either natural deduction or sequent
calculus) that give meaning to a logical connective. These are the primitive rules
for a connective that define its character. All the other rules of a connective (the
elimination or left rules) must then be justified with respect to its right introductions.
In other words, the meaning of a proposition can be devised from its canonical proofs
(Prawitz, 1974) composed of right introduction rules, and the other rules are sound
with respect to them. This is an alternative to the global property of cut elimination
from Section 3.1 that is more similar to local soundness for natural deduction described
in Section 2.1.
1We consider here the two-sided variant of system L to make easier comparisons with the other
languages for the sequent calculus.
108
A,B,C ∈ Type ::= X | 0 | 1 | A⊕B | A⊗B | ∼A
| > | ⊥ | A&B | A`B | A→ B | ¬A
v ∈ Term ::= x | µα.c | () | ι1 (v) | ι2 (v) | (v, v) | ∼ (e)
| µ([].c) | µ(pi1 [α].c | pi2 [β].c) | µ([α, β].c) | µ([x · β].c) | µ(¬ [x].c)
e ∈ CoTerm ::= α | µ˜x.c | µ˜[().c] | µ˜[ι1 (x).c | ι2 (y).c] | µ˜[(x, y).c] | µ˜[∼ (α).c]
| pi1 [e] | pi2 [e] | [e, e] | v · e | ¬ [v]
c ∈ Command ::= 〈v||e〉
Logical rules:
` () : 1 | 1R
c : (Γ ` ∆)
Γ | µ˜[().c] : 1 ` ∆ 1L not 0R rule Γ | µ˜[] : 0 ` ∆
Γ ` v : A | ∆
Γ ` ι2 (v) : A⊕B | ∆ ⊕R1
Γ ` v : B | ∆
Γ ` ι1 (v) : A⊕B | ∆ ⊕R2
c : (Γ, x : A ` ∆) c′ : (Γ, y : B ` ∆)
Γ | µ˜[ι1 (x).c | ι2 (y).c′] : A⊕B ` ∆ ⊕L
Γ ` v : A | ∆ Γ ` v′ : B | ∆
Γ,Γ′ ` (v, v′) : A⊗B | ∆,∆′ ⊗R
c : (Γ, x : A, y : B ` ∆)
Γ | µ˜[(x, y).c] : A⊗B ` ∆ ⊗L
Γ | e : A ` ∆
Γ ` ∼ (e) : ∼A | ∆ ∼R
c : (Γ ` α : A,∆)
Γ | µ(∼ (α).c) : ∼A ` ∆ ∼L
Γ ` µ() : > | ∆ >R no >L rule
c : (Γ ` ∆)
Γ ` µ([].c) : ⊥ | ∆ ⊥R | [] : ⊥ ` ⊥L
c : (Γ ` α : A,∆) c′ : (Γ ` β : B,∆)
Γ ` µ(pi1 [α].c | pi2 [β].c′) : A&B | ∆ &R
Γ | e : A ` ∆
Γ | pi1 [e] : A&B ` ∆ &L1
Γ | e : B ` ∆
Γ | pi2 [e] : A&B ` ∆ &L2
c : (Γ ` α : A, β : B,∆)
Γ ` µ([α, β].c) : A`B | ∆ `R
Γ | e : A ` ∆ Γ | e′ : B ` ∆
Γ,Γ′ | [e, e′] : A`B ` ∆,∆′ `L
c : (Γ, x : A ` β : B,∆)
Γ ` µ([x · β].c) : A→ B | ∆ →R
Γ ` v : A | ∆ Γ′ | e : B ` ∆′
Γ,Γ′ | v · e : A→ B ` ∆,∆′ →L
c : (Γ, x : A ` ∆)
Γ ` µ(¬ [x].c) : ¬A | ∆ ¬R
Γ ` v : A | ∆
Γ | ¬ [v] : ¬A ` ∆ ¬L
FIGURE 4.3. The syntax and types semantics for system L: with two unit types (1,
>), two empty types (0, ⊥), (co-)products (&, ⊕), (co-)pairs (⊗, `), two negations
(∼, ¬), and functions (→).
109
In the eyes of a pragmatist, it is the rules for using a proposition (corresponding
to the elimination rules in natural deduction left rules in sequent calculus), that give
meaning to a logical connective. That is to say, the primitive concept is what can
be done with a proposition. This stance is the polar opposite of the verificationist.
For a pragmatist, canonical proofs are composed of elimination or left rules , and the
other rules must be sound with respect to the way assumptions are used rather than
the way facts are verified. The key insight behind this connection is that the positive
connectives follow a verificationist’s point of view, whereas the negative connectives
follow a pragmatist’s point of view. In terms of system L, positive types focus on the
patterns or shapes of terms (which create results) whereas negative types focus on
the patterns or shapes of co-terms (which use results).
Since the positive connectives correspond to a verificationist style of proof, the
proofs (i.e. verifications) of a proposition fall within a fixed set of well-known canonical
forms, whereas the uses (i.e. refutations) of a proposition are arbitrary. Therefore, in
a program corresponding to a verificationist proof, the terms for producing output
also must fall within a fixed set of forms, but the co-terms for consuming input are
allowed to be arbitrary. In order to gain a foot-hold on the unrestricted nature of
positive co-terms, we may describe them by inversion on the possible forms of their
input. That is to say, positive co-terms may be defined by cases on the structure of
all possible input they might receive. In other words, positive types follow the general
pattern that terms are formed by construction, whereas co-terms are formed by case
analysis on term constructors
Compared to the positive connectives, the pragmatist approach to negative
connectives may seem a bit unusual. Rather than thinking about how to conclude
true facts, the pragmatist takes the dual approach and focuses attention on how to
make use of those facts. In this way, the methods of using an assumed proposition
are limited to a fixed set of known canonical forms, whereas the conclusions of a
proposition may be arbitrary. The programs that correspond with pragmatist proofs
are likewise dual to verificationist proofs, so that the relative roles of producers and
consumers are reversed. In a pragmatist program, the terms that produce output are
allowed to have an arbitrary form. Instead, it is the co-terms for consuming input that
must fall within a fixed set of known forms—the legal observations of a type. We may
then define terms by inversion on the possible forms of their consumer, so that they
are given by cases on the observation of their output. In other words, general pattern
110
for negative connectives is that the co-terms are formed by construction, whereas the
terms are formed by the dual form of case analysis on co-term constructors.
For example, in the dual calculi, a value of the product type A×B was created
by the pair term (v1, v2) which are used by a projection co-term of the form pi1 [e]
or pi2 [e]. In system L, however, we have two different methods to conjoin two types.
From the verificationist viewpoint, the positive A⊗B method to conjunction puts the
focus on the construction of pairs representing the canonical proof of a conjunction of
two parts, keeping terms of the form (v1, v2) : A⊗B that clearly contains both v1 : A
and v1 : B sub-terms, as the single right introduction rule defining A⊗B:
Γ ` v : A | ∆ Γ ` v′ : B | ∆
Γ,Γ′ ` (v, v′) : A⊗B | ∆,∆′ ⊗R
To use a value of type A ⊗ B, a co-term only needs to justify its reaction to the
canonical pair values, such as the case abstraction co-term µ˜[(x, y).c] : A ⊗ B that
performs a pattern-matching case analysis to bind x : A to the first component and
y : B to the second component of its given pair in the arbitrary command c:
c : (Γ, x : A, y : B ` ∆)
Γ | µ˜[(x, y).c] : A⊗B ` ∆ ⊗L
From the pragmatist viewpoint, the negative A&B method to conjunction puts the
focus on the destruction of pairs, keeping co-terms of the form pi1 [e] : A & B and
pi2 [e] : A&B that clearly mark the choice between the two canonical left introduction
rules of products defining A&B:
Γ | e : A ` ∆
Γ | pi1 [e] : A&B ` ∆ &L1
Γ | e : B ` ∆
Γ | pi2 [e] : A&B ` ∆ &L2
To create a value of type A&B, a term only needs to justify its reaction the two possible
projection observations, such as the co-case abstraction term µ(pi1 [α].c1 | pi2 [β].c2) :
A&B that performs pattern-matching case analysis which projection is observing it,
binding α : A to e1 : A in c1 in the case of a pi1 [e1] projection and binding β : B to
e2 : B in c2 in the case of a pi2 [e2] projection:
c : (Γ ` α : A,∆) c′ : (Γ ` β : B,∆)
Γ ` µ(pi1 [α].c | pi2 [β].c′) : A&B | ∆ &R
111
As another example, the dual calculi creates values of the sum type A + B by
the injection terms ι1 (v) and ι2 (v) which are used by a co-pair co-term of the form
[v1, v2], and in system L each of these two constructions show up separately in the
two different methods to disjoin two types. On the one hand, the A⊕ B method to
disjunction keeps the injection terms ι1 (v) : A ⊕ B and ι2 (v) : A ⊕ B that clearly
mark the choice between the canonical right introduction rules defining A⊕B:
Γ ` v : A | ∆
Γ ` ι2 (v) : A⊕B | ∆ ⊕R1
Γ ` v : B | ∆
Γ ` ι1 (v) : A⊕B | ∆ ⊕R2
To use a value of type A⊕B, we only need to justify the reaction of a co-term to the
canonical injection terms, such as the case abstraction co-term µ˜[ι1 (x).c1 | ι2 (y).c2] :
A⊕B that checks which injection it receives, binding x : A to v1 : A in c1 in the case
of ι1 (v1) and binding y : B to v2 : B in c2 in the case of ι2 (v2):
c : (Γ, x : A ` ∆) c′ : (Γ, y : B ` ∆)
Γ | µ˜[ι1 (x).c | ι2 (y).c′] : A⊕B ` ∆ ⊕L
On the other hand, the A`B method of disjunction puts the focus on the destruction
of sums, keeping co-terms of the form [e1, e2] that clearly contains both e1 : A and
e2 : B sub-co-terms, as the single canonical left introduction rule defining A`B:
Γ | e : A ` ∆ Γ | e′ : B ` ∆
Γ,Γ′ | [e, e′] : A`B ` ∆,∆′ `L
To create a value of type A`B, we only need to justify the reaction of a term to the
canonical co-pair observations, such as the co-case abstraction term µ([α, β].c) : A`B
that binds α : A to first component and β : B to the second component of its given
co-pair in the arbitrary command c:
c : (Γ ` α : A, β : B,∆)
Γ ` µ([α, β].c) : A`B | ∆ `R
The rest of the connectives follow suit accordingly, where positive connectives
construct terms according to certain patterns and have co-terms which match on those
patterns by case analysis, and negative connectives construct co-terms according to
certain patterns and have terms which match on those patterns by case analysis. The
112
positive constants 1 and 0 are nullary versions of A⊗B and A⊕B so they contain
the nullary versions of pairs and co-products. The negative constants > and ⊥ are
nullary versions of A&B and A`B so they contain the nullary versions of products
and co-pairs. Functions A→ B are another example of a multiplicative negative type
like the negative disjunction A`B, and so it contains similar (co-)terms for the sake
of uniformity. This means that the call stacks v · e : A → B for functions are the
same as in the dual calculi, but λ-abstractions have been replaced with the co-case
abstraction terms µ([x · β].c) : A → B which deconstruct a call stack to bind the
argument to x : A and the return co-term to β : B in the command c. Note that
this change in representation from λ-abstractions to call stack deconstructions does
not change the expressiveness of functions, since the each can represent the other as
macro expansions:
µ([x · β].c) = λx.µβ.c λx.v = µ([x · β].〈v||β〉) (β /∈ FV (v))
Finally, we have to accomodate negation, which could be considered both positive and
negative as we previous saw in Section 4.1. Therefore, instead of breaking the pattern
or choosing arbitrarily, we include two different negation connectives—a positive
negation ∼A and a negative negation ¬A—to express the two possible orientations
of construction and deconstruction by case analysis.
Remark 4.1. It is worthwhile to pause and ask why the pragmatist representation
of logical connectives may appear to be backwards. For example, ` is a logical “or”
whose interpretation appears to be an “and” combination of two things, whereas & is
a logical “and” whose interpretation appears to be an “or” choice of two alternatives.
The reason is that the pragmatist approach requires us to completely reverse the
way we think about proving. Under the verificationist approach, we focus on how to
establish truth: to show that “A and B” is true, we need to show that both A and B
are true; to show that “A or B” is true, it suffices to show that either A is true or B
is true. Instead, the pragmatist approach asks us to focus on the ways to establish
falsehood: to show that “A and B” is false, it suffices to show that either A is false
or B is false; to show that “A or B” is false we need to show that both A and B are
false. Whereas the verificationist is primarily concerned with building a proof, the
pragmatist is instead concerned with building a refutation. Therefore, the pragmatist
interpretation of negative connectives intuitively has a negative baked in: “and” is
113
Positive ηP rules:
(η0P) e : 0 ≺η0P µ˜[]
(η1P) e : 1 ≺η1P µ˜[().〈()||e〉]
(η⊕P ) e : A⊕B ≺η⊕P µ˜[ι1 (x).〈ι1 (x)||e〉 | ι1 (y).〈ι1 (y)||e〉]
(η⊗P ) e : A⊗B ≺η⊗P µ˜[(x, y).〈(x, y)||e〉]
(η∼P ) e : ∼A ≺η∼P µ˜[∼ (α).〈∼ (α)||e〉]

x, y, α /∈ FV (e)
Negative ηP rules:
(η>P ) v : > ≺η>P µ()
(η⊥) v : ⊥ ≺η⊥P µ([].〈v||[]〉)
(η&P ) v : A&B ≺η&P µ(pi1 [α].〈v||pi1 [α]〉 | pi2 [β].〈v||pi2 [β]〉)
(ηP` ) v : A`B ≺ηP` µ([α, β].〈v||[α, β]〉)
(η→P ) v : A→ B ≺η→P µ([x · β].〈v||x · β〉)
(η¬P) v : ¬A ≺η¬P µ(¬ [x].〈v||¬ [x]〉)

α, β, x /∈ FV (v)
FIGURE 4.4. The extensional η laws for system L: with two unit types (1, >), two
empty types (0, ⊥), (co-)products (&, ⊕), (co-)pairs (⊗, `), two negations (∼, ¬),
and functions (→).
represented by a choice and “or” is represented by a pair because they are about
refutations rather than proofs. End remark 4.1.
The advantage of the system L style of syntax can be seen when we look to
the program transformations corresponding to the reversibility expansions previously
seen in Section 4.1, which are listed in Figure 4.4. In particular, these expansions
correspond to the η laws from the λ-calculus, so we refer to them by the same naming
convention. For example, the expansion of the right function introduction corresponds
to the λ-calculus η law for functions (v : A→ B ≺η→ λx.v x) which in system L looks
like:
(η→P ) v : A→ B ≺η→P µ([x · β].〈v||x · β〉)
Here, the pattern-matching formulation of functional terms gives a more pleasant η
law than the λ-based syntax from the dual calculi, which must introduce an extra
114
output abstraction to express the η→P law of the sequent calculus as follows:
v : A→ B ≺ λx.µβ. 〈v||x · β〉
As another example, the expansion of the right product introduction corresponds to
the surjective η law for products (v : A × B ≺η× (pi1(v), pi2(v))) which in system L
looks like:
(η&P ) v : A&B ≺η&P µ(pi1 [α].〈v||pi1 [α]〉 | pi2 [β].〈v||pi2 [β]〉)
Again, the pattern-matching syntax for product terms makes for a cleaner presentation
of the surjectivity of products in the sequent calculus, where the dual calculi
representation of the η&P introduces two output abstractions as follows:
v : A×B ≺ (µα. 〈v||pi1 [α]〉, µβ. 〈v||pi2 [β]〉)
We also have the positive reversibility expansions which worked on the left instead of
the right, meaning that they expand co-terms instead of terms. For example the left
sum introduction expansion η⊕P is:
(η⊕P ) e : A⊕B ≺η⊕P µ˜[ι1 (x).〈ι1 (x)||e〉 | ι1 (y).〈ι1 (y)||e〉]
The system L η law for sums looks very different than the one we saw in the λ-calculus
(v : A+ B ≺η+ case v of ι1 (x)⇒ ι1 (x) | ι2 (y)⇒ ι2 (y)). In particular, the existence
of co-terms as full-fledged syntactic entities, which were missing from the syntax of
the λ-calculus, gives a better presentation of the positive η laws that reveals their
connection with the negative η laws. In the λ-calculus, there doesn’t seem to be much
connection between the η laws for sums and products, but in system L, the syntax
makes it apparent that they are the polar opposite forms of the same law; one acting
on terms and the other on co-terms.
Polarizing the Fundamental Dilemma
System L is a great language for expressing the extensional η laws of types in
a way that reveals their symmetry with one another. However, if we try to naïvely
reconcile the polarized ηP laws with the core µµ˜ operational laws, we quickly run into
115
trouble since their strength is capable of re-introducing the fundamental dilemma of
computation (see Section 3.2). On the one hand, the negative ηP laws are incompatible
with the call-by-value µV µ˜V laws, since ηP can convert any term into a V value. For
example, if we start with the usual problematic command 〈µ .c1||µ˜ .c2〉, an unfortunate
η→P expansion can convert µ .c1, which is not a V value, into µ([x · β].〈µ .c1||x · β〉),
which is a V value. This leads to the divergent reductions:
c1 ←µV 〈µ .c1||µ˜ .c2〉 ←η→P 〈µ([x · β].〈µ .c1||x · β〉)||µ˜ .c2〉 →µ˜V c2
Therefore, for the negative ηP laws to make sense, the (co-)terms of negative types
cannot be interpreted by the call-by-value V strategy. On the other hand, the positive
ηP laws are incompatible with the call-by-name µN µ˜N laws, since ηP can convert any
co-term into a N co-value. For example, staring from the same problematic command,
an unfortunate η⊗P expansion can convert µ˜ .c2, which is not a N co-value, into
µ˜[(x, y).〈(x, y)||µ˜ .c2〉], which is a N co-value. This leads to the divergent reductions:
c2 ←µN 〈µ .c1||µ˜ .c2〉 ←η⊗P 〈µ .c1||µ˜[(x, y).〈(x, y)||µ˜ .c2〉]〉 →µ˜N c1
Therefore, for the positive ηP laws to make sense, the (co-)terms of positive types
cannot be interpreted by the call-by-name V strategy. What this means is that in
the face of the polarized η laws, we cannot resolve the fundamental dilemma by just
imposing a language-wide evaluation strategy once and for all as we did with the
dual calculi in Chapter III, since half the ηP laws are incompatible with call-by-value
evaluation and the other half are incompatible with call-by-name.
Fortunately, the concept of reversibility give us a different answer to the
fundamental non-determinism of the classical sequent calculus that leverages the ηP
laws instead of fighting against them, with an idea that can be traced back to Danos
et al. (1997). The key insight is that in lieu of imposing a language-wide evaluation
strategy, we can use the type of an interacting pair of (co-)terms in a command to figure
out what evaluation strategy to use for the reduction of that particular command.
So when we faced with an ambiguous command like 〈µα.c1||µ˜x.c2〉, we can use the
type of µα.c1 and µ˜x.c2 to tell us what the term and the co-term “really look like”
(Graham-Lengrand, 2015).
116
For example, suppose the troublesome command is between a term and co-term
of type A⊗B as in:
.... D
c1 : (Γ ` α : A⊗B,∆)
Γ ` µα.c1 : A⊗B | ∆ AR
.... E
c2 : (Γ, x : A⊗B ` ∆)
Γ | µ˜x.c2 : A⊗B ` ∆ AL
〈µα.c1||µ˜x.c1〉 : (Γ ` ∆) Cut
Since we know that the left rule for ⊗ is reversible, we can achieve an equivalent
co-term that ends with ⊗L:
.... D
c1 : (Γ ` α : A⊗B,∆)
Γ ` µα.c1 : A⊗B | ∆ AR
.... E ′
c′2 : (Γ, x : A, y : B ` ∆)
Γ | µ˜[(x, y).c′2] : A⊗B ` ∆
⊗L
〈µα.c1||µ˜[(x, y).c′2]〉 : (Γ ` ∆) Cut
Therefore, by employing reversibility of the typing rules, we discovered that there
wasn’t an issue after all, revealing the fact that in a sense the co-term was concealing
its intent (Graham-Lengrand, 2015). On the other hand, if we have the command
〈V ||µ˜x.c〉, where V is a V value then it is safe to substitute V for x since it must be a
pair 〈V1||V2〉 (or a variable standing in for a pair).
This approach of using reversibility of restore confluence also extends to the
negative connectives. However, because negative connectives are reversible in opposite
ways to positive connectives, we get the opposite resolution to the dilemma. Suppose
again that we are faced with the command 〈µα.c1||µ˜x.c2〉 with a similar typing
derivation as before, except that now x and α have the type A → B. We know
that the right rule for → is reversible, so we can explicate the typing derivation as:
.... D′
c′1 : (Γ, y : A ` β : B,∆)
Γ | µ([y · β].c′1) : A→ B ` ∆
→R
.... E
c2 : (Γ, x : A→ B ` ∆)
Γ | µ˜x.c2 : A→ B ` ∆ AL
〈µ([y · β].c′1)||µ˜x.c2〉 : (Γ ` ∆) Cut
giving us the more explicit command 〈µ([α, β].c′1)||µ˜x.c2〉 that now spells out exactly
which side should be prioritized. Therefore, for negative types, polarity in the type of
a cut reveals the opposite intent, restoring determinism to the system by favoring the
117
co-term over the term. Therefore polarity of the type of a cut can tell us who requires
priority, and restores determinism in an analogous manner as in Section 3.3.
Following this regime for solving the fundamental dilemma, the polarization
hypothesis says that types can be used to determine the evaluation order in a program
according to their polarity (Zeilberger, 2009; Munch-Maccagnoni, 2013). For positive
types like A⊗B andA⊕B, reversibility on the left tells us to favor giving priority to the
term in case of ambiguity. Contrarily, the reversibility on the right for negative types
like A&B, A`B, and A→ B tells us to favor giving priority to the co-term. Thus,
positive types suggest a call-by-value evaluation order and negative types suggest a
call-by-name evaluation order. To formally apply the polarized approach to evaluation
strategy, we must bifurcate the syntax of the core µµ˜-calculus and separate the positive
entities from the negative ones, as shown in Figure 4.5. This bifurcated syntax has
all the same types and expressions as from Figure 3.7, except that positive types and
(co-)terms (denoted by A+, . . . , v+, and e+) are syntactically separate from negative
types and (co-)terms (denoted by A−, . . . , v−, and e−). In order for the polarity of a
type or (co-)term to be apparent from its syntax, we need to annotate type variables
and (co-)variables with their intended polarity using either the positive superscript
(X+, x+, α+) or the negative superscript (X−, x−, α−) which is an explicit part of
their syntax (as opposed to the mere distinction between v+ and v−, etc.). When
the polarity of type, term, co-term doesn’t matter, we may just refer to it as A for
either A+ or A−, v for either an v+ or v−, and e for either a e+ or e−. Note commands
are not distinguished by a polarity because unlike (co-)terms they are not part of a
specific type. Instead, the single syntactic set Command contains two different kinds
of commands—one between positive (co-)terms and one between negative ones—so
that commands are only syntactically valid when the polarity of their (co-)terms agree.
Also, note that only the core typing rules are bifurcated into positive and negative
versions; the structural rules from Figure 3.10 which are also part of µµ˜P remain the
same.
Now, to address polarity in the full system L language, we only need to extend
the polarized core µµ˜P -calculus with the specific connectives and constructs, as shown
in Figure 4.6, which extends the polarized core calculus from Figure 4.5. Note that
there is one extra pair of connectives ↓A− and ↑A+ that are introduced in Figure 4.6
which are known as Girard’s (2001) polarity “shifts” that mark a switch between
the positive and negative polarities. These shifts are important for making sure that
118
A,B,C ∈ Type ::= A+ | A−
A+, B+, C+ ∈ Type+ ::= X+ A−, B−, C− ∈ Type− ::= X−
c ∈ Command ::= 〈v+||e+〉 | 〈v−||e−〉
v ∈ Term ::= v+ | v− v+ ∈ Term+ ::= x+ | µα+.c v− ∈ Term− ::= x− | µα−.c
e ∈ CoTerm ::= e+ | e− e+ ∈ CoTerm ::= α+ | µ˜x+.c e− ∈ CoTerm ::= α− | µ˜x−.c
Γ ∈ InputEnv ::= x1 : A1, . . . , xn : An ∆ ∈ OutputEnv ::= α1 : A1, . . . , αn : An
Judgement ::= c : (Γ ` ∆) | (Γ ` v : A | ∆) | (Γ | e : A ` ∆)
Core rules:
x+ : A+ ` x+ : A+ | VR+ | α+ : A+ ` α+ : A+ VL+
x− : A− ` x− : A− | VR− | α− : A− ` α− : A− VL−
c : (Γ ` α+ : A+,∆)
Γ ` µα+.c : A+ | ∆ AR+
c : (Γ, x+ : A+ ` ∆)
Γ | µ˜x+.c : A+ ` ∆ AL+
c : (Γ ` α− : A−,∆)
Γ ` µα−.c : A− | ∆ AR−
c : (Γ, x− : A− ` ∆)
Γ | µ˜x−.c : A− ` ∆ AL−
Γ ` v+ : A+ | ∆ Γ′ | e+ : A+ ` ∆′
〈v+||e+〉 : (Γ′,Γ ` ∆′,∆) Cut+
Γ ` v− : A− | ∆ Γ′ | e− : A− ` ∆′
〈v−||e−〉 : (Γ′,Γ ` ∆′,∆) Cut−
V ∈ ValueP ::= V+ | V− V+ ∈ Value+ ::= x+ V− ∈ Value− ::= v−
E ∈ CoValueP ::= E+ | E− E+ ∈ CoValue+ ::= e+ E− ∈ CoValue− ::= α−
(µP)
〈
µα+.c
∣∣∣∣∣∣E+〉 µP c{E+/α+} (ηµ) µα+. 〈v+∣∣∣∣∣∣α+〉 ηµ v+ (α+ /∈ FV (v+))
(µP)
〈
µα−.c
∣∣∣∣∣∣E−〉 µP c {E−/α−} (ηµ) µα−. 〈v−∣∣∣∣∣∣α−〉 ηµ v− (α− /∈ FV (v−))
(µ˜P)
〈
V+
∣∣∣∣∣∣µ˜x+.c〉 µP c {V+/x+} (ηµ) µ˜x+. 〈x+∣∣∣∣∣∣e+〉 ηµ e+ (x+ /∈ FV (e+))
(µ˜P)
〈
V−
∣∣∣∣∣∣µ˜x−.c〉 µP c {V−/x−} (ηµ) µ˜x−. 〈x−∣∣∣∣∣∣e−〉 ηµ e− (x− /∈ FV (e−))
FIGURE 4.5. The polarized core µµ˜P-calculus: its static and dynamic semantics.
119
A,B,C ∈ Type ::= A+ | A−
A+, B+, C+ ∈ Type+ ::= X+ | 0 | 1 | A+ ⊕B+ | A+ ⊗B+ | ∼A− | ↓A−
A−, B−, C− ∈ Type− ::= X− | > | ⊥ | A− &B− | A− `B− | A+ → B− | ¬A+ | ↑A+
c ∈ Command ::= 〈v+||e+〉 | 〈v−||e−〉 v ∈ Term ::= v+ | v− e ∈ CoTerm ::= e+ | e−
v+ ∈ Term+ ::= x+ | µα+.c | () | ι1 (v+) | ι2 (v+) | (v+, v+) | ∼ (e−) | ↓(v−)
e+ ∈ CoTerm+ ::= α+ | µ˜x+.c | µ˜[] | µ˜[().c] | µ˜
[
ι1
(
x+
)
.c | ι2
(
y+
)
.c
]
| µ˜
[(
x+, y+
)
.c
]
| µ˜
[
∼
(
α−
)
.c
]
| µ˜
[
↓
(
x−
)
.c
]
v− ∈ Term− ::= x− | µα−.c | µ() | µ([].c) | µ
(
pi1
[
α−
]
.c | pi2
[
β−
]
.c
)
| µ
([
α−, β−
]
.c
)
| µ
(
[x+ · α−].c
)
| µ
(
¬
[
x+
]
.c
)
| µ
(
↑
[
α+
]
.c
)
e− ∈ CoTerm− ::= α− | µ˜x−.c | [] | pi1 [e−] | pi2 [e−] | [e−, e−] | v+ · e− | ¬ [v+] | ↑[e+]
FIGURE 4.6. The syntax for polarized system L: with both positive connectives—
disjunction (⊕), conjunction (⊗), negation (∼), and polarity shift (↓)—and negative
connectives—conjunction (&), disjunction (`), negation (¬), functions (→), and
polarity shift (↑).
120
the polar bifurcation of the language does not accidentally eliminate its essential
expressive capabilities. For example, in the λ-calculus, the dual calculi, or a functional
programming language, it is typical to store a function (which is a negative term)
inside the structure of a pair or sum type structure (which is a positive term). However,
this would be prevented by the distinction between the polarities of types. Instead,
we would like to allow for some mingling between positive and negative types and
(co-)terms without confusing the two. The ↓ shift lets us embed negative types inside
positive ones, so that for every negative type A− we have the positive type ↓A−.
Going along with our story that positive values follow predetermined patterns, we
have the structured term ↓(v−) : ↓A− which contains a negative term along with a
case abstraction co-term, µ˜[↓(x−).c] : ↓A−, for unpacking the structure and pulling
out the underlying term. The ↑ shift lets us embed positive types inside negative ones,
so that for every type A+ we have the negative type ↑A+. The (co-)terms of the ↑ shift
symmetric to the ↓ ones, so that we have the co-case abstraction term µ(↑[α+].c) : ↑A+
which is waiting for a shifted co-term of the form ↑[e+] : ˆA+ containing a positive
co-term.
The logical typing rules for polarized system L is shown in Figure 4.7, which
are effectively the same rules from Figure 4.3 made aware of the distinction between
positive and negative polarities. The only new rules are for the new shift connectives.
More interestingly, the βP rules for system L are similar to rules for reducing case
analysis in functional languages as shown in Figure 4.8. For example, for the positive
βP laws we have sum types that select which branch to take based on the constructor
tag: 〈
ι1 (V+)
∣∣∣∣∣∣µ˜[ι1 (x+).c1 | ι2 (y+).c2]〉 β⊕P c1 {V+/x+}
and pair types which decompose a pair into its constituent parts:
〈(
V+, V
′
+
)∣∣∣∣∣∣µ˜[(x+, y+).c]〉 β⊗P c{V+/x+, V ′+/y+}
The negative β[][P ] laws follow the same notion of case analysis as the positive β[][P ]
laws, except in the reverse direction. For example, terms of product types select the
appropriate response based on the constructor tag of their observation:
〈
µ
(
pi1
[
α−
]
.c1 | pi2
[
β−
]
.c2
)∣∣∣∣∣∣pi1 [E−]〉 β&P c1 {E−/α−}
121
Positive logical rules:
no 0R rule Γ | µ˜[] : 0 ` ∆ 0L ` () : 1 | 1R
c : (Γ ` ∆)
Γ | µ˜[().c] : 1 ` ∆ 1L
Γ ` v+ : A+ | ∆
Γ ` ι1 (v+) : A+ ⊕B+ | ∆ ⊕R1
Γ ` v+ : B+ | ∆
Γ ` ι2 (v+) : A+ ⊕B+ | ∆ ⊕R2
c : (Γ, x+ : A+ ` ∆) c′ : (Γ, y+ : B+ ` ∆)
Γ | µ˜[ι1 (x+).c | ι2 (y+).c′] : A+ ⊕B+ ` ∆ ⊕L
Γ ` v+ : A+ | ∆ Γ′ ` v′+ : B+ | ∆′
Γ,Γ′ `
(
v+, v
′
+
)
: A+ ⊗B+ | ∆,∆′
⊗R c : (Γ, x+ : A+, y+ : B+ ` ∆)
Γ | µ˜[(x+, y+).c] : A+ ⊗B+ ` ∆ ⊗L
Γ | e− : A− ` ∆
Γ ` ∼ (e−) : ∼A− | ∆ ∼R
c : (Γ ` α− : A−,∆)
Γ | µ˜[∼ (α−).c] : ∼A− ` ∆ ∼L
Γ ` v− : A− | ∆
Γ ` ↓(v−) : ↓A− | ∆ ´R
c : (Γ, x− : A− ` ∆)
Γ | µ˜[↓(x−).c] : ↓A− ` ∆ ´L
Negative logical rules:
Γ ` µ() : > | ∆ >R no >L rule
c : (Γ ` ∆)
Γ ` µ([].c) : ⊥ | ∆ ⊥R | [] : ⊥ ` ⊥L
c : (Γ ` α− : A−,∆) c′ : (Γ ` β− : B−,∆)
Γ ` µ(pi1 [α−].c | pi2 [β−].c′) : A− &B− | ∆ &R
Γ | e− : A− ` ∆
Γ | pi1 [e−] : A− &B− ` ∆ &L1
Γ | e− : B− ` ∆
Γ | pi2 [e−] : A− &B− ` ∆ &L2
c : (Γ ` α− : A−, β− : B−,∆)
Γ ` µ([α−, β−].c) : A− `B− | ∆ `R
Γ | e− : A− ` ∆ Γ′ | e′− : B− ` ∆′
Γ,Γ′ |
[
e−, e′−
]
: A− `B− ` ∆,∆′ `L
c : (Γ, x+ : A+ ` ∆)
Γ ` µ(¬ [x+].c) : ¬A+ | ∆ ¬R
Γ ` v+ : A+ | ∆
Γ | ¬ [v+] : ¬A+ ` ∆ ¬L
c : (Γ ` α+ : A+,∆)
Γ ` µ(↓(α+).c) : ↑A+ | ∆ ˆR
Γ | e+ : A+ ` ∆
Γ | ↑[e+] : ↑A+ ` ∆ ˆL
FIGURE 4.7. Logical typing rules for polarized system L: with both positive
connectives (0, 1, ⊕, ⊗, ∼, ↓) and negative connectives (>, ⊥, &, `, →, ¬, ↑).
122
V ∈ V alueP ::= V+ | v−
V+ ∈ V alue+ ::= x+ | () | ι1 (V+) | ι2 (V+) | (V+, V+) | ∼ (E−) | ↓(v−)
E ∈ CoV alueP ::= e+ | E−
E− ∈ CoV alue− ::= α− | pi1 [E−] | pi2 [E−] | [E−, E−] | V+ · E− | ¬ [V+] | ↑[e+]
Positive βP rules:
(β0P) no β0P rule
(β1P) 〈()||µ˜[().c]〉 β1P c
(β⊕P )
〈
ιi (V+)
∣∣∣∣∣∣µ˜[ι1 (x+1 ).c1 | ι2 (x+2 ).c2]〉 β⊕P ci {V+/x+i }
(β⊗P )
〈(
V+, V
′
+
)∣∣∣∣∣∣µ˜[(x+, y+).c]〉 β⊗P c{V+/x, V ′+/y}
(β∼P )
〈
∼ (E−)
∣∣∣∣∣∣µ˜[∼ (α−).c]〉 β∼P c{E−/α−}
(β↓P)
〈
↓(v−)
∣∣∣∣∣∣µ˜[↓(x−).c]〉 
β
↓
P
c
{
v−/x−
}
Negative βP rules:
(β>P ) no β>P rule
(β⊥P ) 〈µ([].c)||[]〉 β⊥P c
(β&P )
〈
µ
(
pi1
[
α−2
]
.c1 | pi2
[
α−1
]
.c2
)∣∣∣∣∣∣pii [E−]〉 β&P ci {E−/α−i }
(βP` )
〈
µ
([
α−, β−
]
.c
)∣∣∣∣∣∣[E−, E ′−]〉 βP` c {E−/α−, E ′−/β−}
(β→P )
〈
µ
(
[x+ · β−].c
)∣∣∣∣∣∣V+ · E−〉 β→P c{V+/x+, E−/β−}
(β¬P)
〈
µ
(
¬
[
x+
]
.c
)∣∣∣∣∣∣¬ [V+]〉 β¬P c{V+/x+}
(β↑P)
〈
µ
(
↑
[
α+
]
.c
)∣∣∣∣∣∣↑[e+]〉 β↑P c
{
e+/α
+
}
FIGURE 4.8. The operational β laws for polarized system L: with two unit types (1,
>), two empty types (0, ⊥), (co-)products (&, ⊕), (co-)pairs (⊗, `), two negations
(∼, ¬), functions (→), and two polarity shifts (↓, ↑).
123
and terms of co-pair types decompose their observation into the two independent
messages: 〈
µ
([
α−, β−
]
.c
)∣∣∣∣∣∣[E−, E ′−]〉 βP` c {E−/α−, E ′−/β−}
Focusing and Polarity
The βP-based operational rules for polarized system L explain how to reduce
commands by performing pattern-matching. However, βP reduction alone is not
enough, since it suffers the same essential deficiency as β reduction in the dual calculi
(Section 3.3). For example, in the positive form of pattern-matching of type A+⊕B+,
we could encounter the command
〈
ι1
(
µα+.c
)∣∣∣∣∣∣µ˜[ι1 (x+).c1 | ι2 (y+).c2]〉 6β⊕P
which does not proceed by β⊕P because µα+.c is not a V value. Similarly in the negative
form of pattern-matching of type A+ → B−, we could encounter the command
〈
µ
(
[x+ · β−].c
)∣∣∣∣∣∣(µα+.c1) · [µ˜y−.c2]〉 6β→P
which does not proceed by β→P because µα+.c1 is not a V value and µ˜y−.c2 is not a
N co-value.
Unsurprisingly, the same technique of focusing with the same two options we had
before: we can remove the superfluous parts of the syntax of system L (like the above
two commands) with the static approach to focusing, or we can add the extra steps
necessary to kick-start the computation again with the dynamic approach to focusing.
The major difference between focusing in system L versus focusing in the dual calculi
is that since polarized system L incorporates aspects of both the call-by-value and
call-by-name halves of the dual calculi into a single language, the polarized focusing
shares similarities with both the call-by-value and call-by-name focusing at once. In
particular, the dual calculi had two different sets of focused sub-syntaxes (LKQ and
LKT) and two different sets of focusing ς rules (ςV and ςN ) corresponding to its
two different evaluation strategies. Instead, polarized system L has a single focused
sub-syntax and a single set of focusing ς rules.
First, let’s consider the static approach with the focused sub-syntax of system L
shown in Figure 4.9. On the positive side, the restrictions on the syntax of positive
terms resembles LKQ. Every positive term is either a positive value or output
124
v+ ∈ Term+ ::= V+ | µα+.c
V+ ∈ Value+ ::= x+ | () | ι1 (V+) | ι2 (V+) | (V+, V+) | ∼ (E−) | ↓(v−)
e+ ∈ CoTerm+ ::= α+ | µ˜x+.c | µ˜[] | µ˜[().c] | µ˜
[
ι1
(
x+
)
.c | ι2
(
y+
)
.c
]
| µ˜
[(
x+, y+
)
.c
]
| µ˜
[
∼
(
α−
)
.c
]
| µ˜
[
↓
(
x−
)
.c
]
v− ∈ Term− ::= x− | µα−.c | µ() | µ([].c) | µ
(
pi1
[
α−
]
.c | pi2
[
β−
]
.c
)
| µ
([
α−, β−
]
.c
)
| µ
(
[x+ · α−].c
)
| µ
(
¬
[
x+
]
.c
)
| µ
(
↑
[
α+
]
.c
)
e− ∈ CoTerm− ::= E− | µ˜x−.c
E− ∈ CoValue− ::= α− | pi1 [E−] | pi2 [E−] | [E−, E−] | V+ · E− | ¬ [V+] | ↑[e+]
c ∈ Command ::= 〈v+||e+〉 | 〈v−||e−〉
Judgement ::= c : (Γ ` ∆)
| (Γ ` v : A | ∆) | (Γ ` V+ : A+ ; ∆)
| (Γ | e : A ` ∆) | (Γ ; E− : A− ` ∆)
Axiom:
x+ : A+ ` x+ : A+ ; Var
+
| α+ : A+ ` α+ : A+ CoVar
+
x− : A− ` x− : A− | Var
−
; α− : A− ` α− : A− CoVar
−
Focusing (structural) rules:
Γ ` V+ : A+ ; ∆
Γ ` V+ : A+ | ∆ FR
Γ ; E− : A− ` ∆
Γ | E− : A− ` ∆ FL
FIGURE 4.9. Focused sub-syntax and core typing rules for polarized system L.
125
abstraction, where the positive values are defined hereditarily: a pair of two values is a
value, an injection of a value is a value, and so on. That way, troublesome commands like
〈ι1 (µα+.c)||µ˜[ι1 (x+).c1 | ι2 (y+).c2]〉 become syntactically forbidden. The interesting
types that contain negative types and break this mold are the values ∼ (E−) : ∼A−
which contain a negative co-value and the values ↓(v−) : ↓A− which contains a
negative term. Also like LKQ there is no restrictions placed on positive co-terms,
which is in part because the co-terms of positive types are all abstractions which are
not easily restricted like the positively constructed terms are. On the negative side, the
restrictions on the syntax of negative co-terms resembles LKT. Every negative term
is either a negative co-value or input abstraction, where the negative co-values are
defined hereditarily: a pair of two co-value is a co-value, a projection of a co-value is
a co-value, etc. So troublesome commands like 〈µ([x+ · β−].c)||(µα+.c1) · [µ˜y−.c2]〉 are
also syntactically forbidden. As before, there are some interesting types that refer to
positive types, like the co-values V+ · E− : A+ → B− and ¬ [V+] : ¬A+ which contain
a positive value and ↑[e+] : ↑A+ which contains a positive co-term. Also as like LKT
there is no restriction on the negative terms, which are all abstractions over negative
co-values.
The focalized and polarized type system for system L introduces two new sequents
using the stoup (;) based on the two restrictions on the syntax, Γ ` V+ : A+ ; ∆ for
typing positive values in focus and Γ ; E− : A− ` ∆ for typing negative co-values in
focus. The logical typing rules are given in Figure 4.10. The typing rules are essentially
the same as the unfocused polarized ones from Figure 4.7, except that they now follow
the syntactic restrictions on positive terms and negative co-terms Figure 4.9. This
has the net effect that, in a bottom-up reading of a typing derivation, once focus is
gained via the FR or FL rules it is maintained. The only rules which are capable
of losing focus are the ´R and ˆL rules, which transition from a positive value to a
negative term and from a negative co-value to a positive co-term. This can be seen as
a design philosophy justifying the choice of polarities in the connectives of polarized
system L from Figure 4.6: focus should be maintained by every connective except the
shifts. Therefore, the function type A+ → B− (called the “primordial function type”
by Zeilberger (2009)) must have a positive argument type and negative return type to
maintain focus in the call stack V+ · E−, and the negation types ∼A− and ¬A+ must
invert the polarity of the type to maintain focus in ∼ (E−) and ¬ [V+]. Anything else
126
Positive focused logical rules:
no 0R rule Γ | µ˜[] : 0 ` ∆ 0L ` () : 1 ; 1R
c : (Γ ` ∆)
Γ | µ˜[().c] : 1 ` ∆ 1L
Γ ` V+ : A+ ; ∆
Γ ` ι1 (V+) : A+ ⊕B+ ; ∆ ⊕R1
Γ ` V+ : B+ ; ∆
Γ ` ι2 (V+) : A+ ⊕B+ ; ∆ ⊕R2
c : (Γ, x+ : A+ ` ∆) c′ : (Γ, y+ : B+ ` ∆)
Γ | µ˜[ι1 (x+).c | ι2 (y+).c′] : A+ ⊕B+ ` ∆ ⊕L
Γ ` V+ : A+ ; ∆ Γ′ ` V ′+ : B+ ; ∆′
Γ,Γ′ `
(
V+, V
′
+
)
: A+ ⊗B+ ; ∆,∆′
⊗R c : (Γ, x+ : A+, y+ : B+ ` ∆)
Γ | µ˜[(x+, y+).c] : A+ ⊗B+ ` ∆ ⊗L
Γ ; E− : A− ` ∆
Γ ` ∼ (E−) : ∼A− ; ∆ ∼R
c : (Γ ` α− : A−,∆)
Γ | µ˜[∼ (α−).c] : ∼A− ` ∆ ∼L
Γ ` v− : A− | ∆
Γ ` ↓(v−) : ↓A− ; ∆ ´R
c : (Γ, x− : A− ` ∆)
Γ | µ˜[↓(x−).c] : ↓A− ` ∆ ´L
Negative focused logical rules:
Γ ` µ() : > | ∆ >R no >L rule
c : (Γ ` ∆)
Γ ` µ([].c) : ⊥ | ∆ ⊥R ; [] : ⊥ ` ⊥L
c : (Γ ` α− : A−,∆) c′ : (Γ ` β− : B−,∆)
Γ ` µ(pi1 [α−].c | pi2 [β−].c′) : A− &B− | ∆ &R
Γ ; E− : A− ` ∆
Γ ; pi1 [E−] : A− &B− ` ∆ &L1
Γ ; E− : B− ` ∆
Γ ; pi2 [E−] : A− &B− ` ∆ &L2
c : (Γ ` α− : A−, β− : B−,∆)
Γ ` µ([α−, β−].c) : A− `B− | ∆ `R
Γ ; E− : A− ` ∆ Γ′ ; E ′− : B− ` ∆′
Γ,Γ′ ;
[
E−, E ′−
]
: A− `B− ` ∆,∆′ `L
c : (Γ, x+ : A+ ` ∆)
Γ ` µ(¬ [x+].c) : ¬A+ | ∆ ¬R
Γ ` V+ : A+ ; ∆
Γ ; ¬ [V+] : ¬A+ ` ∆ ¬L
c : (Γ ` α+ : A+,∆)
Γ ` µ(↓(α+).c) : ↑A+ | ∆ ˆR
Γ | e+ : A+ ` ∆
Γ ; ↑[e+] : ↑A+ ` ∆ ˆL
FIGURE 4.10. Focused logical typing rules for polarized system L: with positive
connectives (0, 1, ⊕, ⊗, ¬, ↓) and negative connectives (>, ⊥, &, `, →, ∼, ↑).
127
would place a negative term or a positive co-term inside of a construction, breaking
the convention.
Next, let’s consider the dynamic approach with the extra focusing rewrite
rules shown in Figure 4.11. These extra reductions are just enough to prevent the
troublesome commands from getting stuck. For example, the β⊕P -stuck command
between (co-)terms of type A+ ⊕ B+, 〈ι1 (µα+.c)||µ˜[ι1 (x+).c1 | ι2 (y+).c2]〉, can now
proceed by a ς⊕P reduction on the immediate sub-term:〈
ι1
(
µα+.c
)∣∣∣∣∣∣µ˜[ι1 (x+).c1 | ι2 (y+).c2]〉
→β⊕P
〈
µγ+.
〈
µα+.c
∣∣∣∣∣∣µ˜y+. 〈ι1 (y+)∣∣∣∣∣∣γ+〉〉∣∣∣∣∣∣µ˜[ι1 (x+).c1 | ι2 (y+).c2]〉
Likewise, the β→P -stuck command between (co-)terms of type A+ → B−,
〈µ([x+ · β−].c)||(µα+.c1) · [µ˜y−.c2]〉, can also now proceed by a ς→P reduction on the
immediate sub-co-term:
〈
µ
(
[x+ · β−].c
)∣∣∣∣∣∣(µα+.c1) · [µ˜y−.c2]〉
→ς→P
〈
µ
(
[x+ · β−].c
)∣∣∣∣∣∣µ˜x+. 〈µα+.c1∣∣∣∣∣∣µ˜y+. 〈x+∣∣∣∣∣∣y+ · [µ˜y−.c2]〉〉〉
The combination of βP and ςP reductions gives us enough tools for a well-behaved
extension the core µµ˜ operational semantics. Because ςP operates on (co-)terms instead
of commands, we must extend the set of polarized evaluation contexts D to reduce
(co-)terms when necessary as follows:
D ∈ EvalCxtP ::=  | 〈||e+〉 | 〈v−||〉
This gives us the µP µ˜PβPςP operational semantics ( 7→µP µ˜PβP ςP ), which is strong
enough to compute results of the following form:
FinalCommandP ::=
〈
V+
∣∣∣∣∣∣α+〉 | 〈x+∣∣∣∣∣∣Es+〉 | 〈x−∣∣∣∣∣∣E−〉 | 〈V s−∣∣∣∣∣∣α−〉
V s− ∈ SimpleValue− =
{
v− ∈ Term− | v− 6=α µα−.c
}
Es+ ∈ SimpleCoValue+ =
{
e+ ∈ CoTerm+ | e+ 6=α µ˜x+.c
}
When considering only well-typed commands, we get the standard safety theorem
saying that well-typed commands always reduce to a final command shown above
similar to the dual calculi.
128
Positive ςP rules:
(ς0P) no ς0P rule
(ς1P) no ς1P rule
(ς⊕P ) pii [v+] ς⊕P µα
+.
〈
v+
∣∣∣∣∣∣µ˜y+. 〈pii [y+]∣∣∣∣∣∣α+〉〉
(ς⊗P )
(
v+, v
′
+
)
ς⊗P µα
+.
〈
v+
∣∣∣∣∣∣µ˜y+. 〈(y+, v′+)∣∣∣∣∣∣α+〉〉
(ς⊗P ) (V+, v+) ς⊗P µα
+.
〈
v+
∣∣∣∣∣∣µ˜y+. 〈(V+, y+)∣∣∣∣∣∣α+〉〉
(ς∼P ) ∼ (e−) ς∼P µα+.
〈
µβ−.
〈
∼
(
β−
)∣∣∣∣∣∣α+〉∣∣∣∣∣∣e−〉
(ς ↓P) no ς
↓
P rule

v+ /∈ V alueP ,
e− /∈ CoV alueP ,
α+,β−, y+ fresh
Negative ςP rules:
(ς>P ) no ς>P rule
(ς⊥P ) no ς⊥P rule
(ς&P ) pii [e−] ς&P µ˜x
−.
〈
µβ−.
〈
x−
∣∣∣∣∣∣pii [β−]〉∣∣∣∣∣∣e−〉
(ςP` )
[
e−, e′−
]
ςP` µ˜x
+.
〈
µβ−.
〈
x+
∣∣∣∣∣∣[β−, e′−]〉∣∣∣∣∣∣e−〉
(ςP` ) [E−, e−] ςP` µ˜x
+.
〈
µβ−.
〈
x+
∣∣∣∣∣∣[E−, β−]〉∣∣∣∣∣∣e−〉
(ς→P ) v+ · e′− ς→P µ˜x+.
〈
v+
∣∣∣∣∣∣µ˜y+. 〈x+∣∣∣∣∣∣y+ · e′−〉〉
(ς→P ) V+ · e− ς→P µ˜x+.
〈
µβ−.
〈
x+
∣∣∣∣∣∣V+ · β−〉∣∣∣∣∣∣e−〉
(ς¬P) ¬ [v+] ς¬P µ˜x+.
〈
v+
∣∣∣∣∣∣µ˜y+. 〈x+∣∣∣∣∣∣¬ [y+]〉〉
(ς ↑P) no ς
↑
P rule

e− /∈ CoV alueP ,
v+ /∈ V alueP
x−,y+, β− fresh
FIGURE 4.11. The focusing ς laws for polarized system L: with two unit types (1, >),
two empty types (0, ⊥), (co-)products (&, ⊕), (co-)pairs (⊗, `), two negations (∼,
¬), functions (→), and two polarity shifts (↓, ↑).
129
Theorem 4.1 (Progress and preservation). For any system L command c : (Γ ` ∆):
a) Progress: c is a polarized final command or there is a command c′ such that
c 7→µP µ˜PβP ςP c′, and
b) Preservation: if c 7→µP µ˜PβP ςP c′, then c′ : (Γ ` ∆).
Proof. The proof is analogous to the proof of Theorem 3.3. Progress follows by
induction on the typing derivation of c : (Γ ` ∆), which is assured because
– every v+ is either a value, an output abstraction, or a ςP redex,
– every e+ is either an input abstraction or in SimpleCoValue+,
– every v− is either an output abstraction or in SimpleValue−, and
– every e− is either a co-value, an input abstraction, or a ςP redex.
Therefore, if the cut is neither final nor reducible, then either its positive term or
negative co-term ςP -reduces. Preservation follows by cases on all the possible rewriting
rules using the substitution principle for typing derivations similar to Theorem 3.3,
so that
– if c µµ˜ηµηµ˜βς c′ then c : (Γ ` ∆) implies c′ : (Γ ` ∆),
– if v µµ˜ηµηµ˜βς v′ then v : (Γ ` ∆)C implies v′ : (Γ ` ∆)C, and
– if e µµ˜ηµηµ˜βς e′ then e : (Γ ` ∆)C implies e′ : (Γ ` ∆)C.
Also, much like the dual calculi, the two methods of focusing correspond to one
another, applying the same essential transformations either during execution or as
a pre-processing pass. More specifically, the focused sub-syntax of polarized system
L contains exactly the ςP-normal forms of system L, and therefore every command,
term, and co-term can be ςP reduced into the focused sub-syntax.
Theorem 4.2 (Focusing). Every polarized system L command, term, and co-term is
in the focused sub-syntax if and only if it is a ςP-normal form. Furthermore, for every
polarized system L command c, term v, and co-term e, there is a focused command c′,
term v′, and co-term e′ such that cςP c
′, v ςP v
′, and eςP e
′.
Proof. First, the fact that the command and (co-)term in the focused sub-syntax are
in one-for-one correspondence with ςP -normal forms follows by induction on the syntax
of polarized system L commands and (co-)terms.
130
Second, observe that the ςP reduction theory is strongly normalizing because each
reduction reduces the number of non-(co-)values within term and co-term constructions
which serves as a normalization measure. Therefore, every command and (co-)term
has a unique ςP-normal form, which by the first point must lie within the focused
sub-syntax of polarized system L.
Self-Duality
System L exhibits a logical duality similar to the dual calculi (see Section 3.3).
However, the dual calculi are two separate dual calculi—one call-by-value and one
call-by-name—that share common syntax and types, polarized system L is self-dual.
In other words, we can say that polarized system L internalizes the notion of duality
inside of itself, so that it gives a single, complete, and self-contained language for
discussing and using dual concepts. This is because polarization lets us incorporate
both call-by-name and call-by-value constructs and evaluation. In the dual calculi, call-
by-value programs are dualized into call-by-name ones, and vice versa, which lie in
the two separate interpretations of the same syntax. But polarized system L contains
both call-by-value and call-by-name fragments, so that there is no need for a separate
calculus and a change of interpretation to accomodate the inversion of control flow
caused by duality.
As with the dual calculi, the self-duality of polarized system L resembles the de
Morgan laws, where truth is dual to falsehood and conjunction is dual to conjunction.
However, polarity explicitly reveals another aspect of duality that was implicit in the
dual calculus: duality also reverses the polarity of types and programs. So 0 is dual to
>, 1 is dual to ⊥,⊕ is dual to &, and ⊗ is dual to `. The polarity reversal corresponds
to the fact that the dynamic semantics of call-by-value is dual to that of call-by-name.
This also means that, whereas the single negation connective was self-dual in the dual
calculi, the two polarities of negation (∼ and ¬) are dual to one another. Likewise,
the two polarity shifts (↓ and ↑) are also dual connectives.
The only lack of symmetry is with function types A+ → B− which lack their
dual counterpart, as was the case in both LK (Section 3.1) and the dual calculi
(Section 3.3). As is now the standard procedure, this asymmetry is easily remedied
by adding subtraction types B+ − A− as the dual counterpart to function types as
shown in Figure 4.12. Syntactically, this presentation of subtraction is the same as in
the dual calculi, except that we use a case abstraction co-term µ˜[(α · y).c] to match
131
v+ ∈ Term+ ::= . . . | e− · v+ e+ ∈ CoTerm+ ::= . . . | µ˜
[
(α− · y+).c
]
V+ ∈ Value+ ::= . . . | E− · V+ A+, B+, C+ ∈ Type+ ::= . . . | B+ − A−
Γ′ | e− : A− ` ∆′ Γ ` v+ : B+ | ∆
Γ,Γ′ ` e− · v+ : B+ − A− | ∆,∆′ −R
c : (Γ, y+ : B+ ` α− : A−∆)
Γ | µ˜[(α− · y+).c] : B+ − A− ` ∆ −L
Γ′ ; E− : A− ` ∆′ Γ ` V+ : B+ ; ∆
Γ,Γ′ ` E− · V+ : B+ − A− ; ∆,∆′ −R
(β−P )
〈
E− · V+
∣∣∣∣∣∣µ˜[(α− · y+).c]〉 β−P c{V+/y+, α−E−}
(ς−P ) e− · v+ ς−P µα
+.
〈
µβ+.
〈
β− · v+
∣∣∣∣∣∣α+〉∣∣∣∣∣∣e−〉
(e /∈ CoValue−, α+, β− /∈ FV (e− · v+))
(ς−P ) E− · v+ ς−P µα
+.
〈
v+
∣∣∣∣∣∣µ˜y+. 〈E− · y+∣∣∣∣∣∣α+〉〉
(v /∈ Value+, α+, y+ /∈ FV (E− · v+))
(η−P ) e+ : B+ − A− ≺η−P µ˜
[
(α− · y+).
〈
α− · y+
∣∣∣∣∣∣e〉]
FIGURE 4.12. Extending polarized system L with subtraction (−), the dual of
implication (→).
132
(X+)⊥ , X− (X−)⊥ , X+
0⊥ , > >⊥ , 0
1⊥ , ⊥ ⊥⊥ , 1
(A+ ⊕B+)⊥ , (A⊥+) & (B⊥+) (A− &B−)⊥ , (A⊥−)⊕ (B⊥−)
(A+ ⊗B+)⊥ , (A⊥+)` (B⊥+) (A− `B−)⊥ , (A⊥−)⊗ (B⊥−)
(B+ − A−)⊥ , (A⊥−)→ (B⊥+) (A+ → B−)⊥ , (B⊥−)− (A⊥+)
(∼A−)⊥ , ¬(A⊥−) (¬A+)⊥ , ∼(A⊥+)
(↓A−)⊥ , ↑(A⊥−) (↑A+)⊥ , ↓(A⊥+)
FIGURE 4.13. The self-duality of system L types: with two unit types (1,>), two
empty types (0, ⊥), (co-)products (⊕, &), (co-)pairs (⊗, `), (co-)functions (→, −),
and negations (∼, ¬), and two polarity shifts (↓, ↑).
the system L style of pattern-matching function terms instead of a λ abstracting a
co-variable over a co-term. Additionally, the fact that the function type A+ → B−
mixes the two polarities is reflected in the subtraction type B+−A−. With symmetry
restored, we formally define the duality of polarized system L types in Figure 4.13
and programs in Figure 4.14.
The self-duality of polarized system L exhibits the same pleasant properties as
duality of the dual calculi from Section 3.3: the duality relation is involutive, respects
static semantics (typing), and respects dynamic semantics (reduction). The major
departure from the dual calculi is that all the dynamic semantics and rewriting rules
are contained within the same polarized language, instead of being split between two
interpretations of the same syntax.
Theorem 4.3 (Involutive duality). The duality operation ⊥ on environments,
sequents, types, commands, terms, and co-terms is involutive, so that ⊥⊥ is the identity
transformation.
Proof. By induction on the definition of the duality operation ⊥, similar to the proof
of Theorem 3.6.
Theorem 4.4 (Static duality).
a) c : (Γ ` ∆) is well-typed if and only if c⊥ : (∆⊥ ` Γ⊥) is.
b) Γ ` v : A | ∆ is well-typed if and only if ∆⊥ | v⊥ : A⊥ ` Γ⊥ is.
133
〈v+||e+〉⊥ ,
〈
e⊥+
∣∣∣∣∣∣v⊥+〉 〈v−||e−〉⊥ , 〈e⊥−∣∣∣∣∣∣v⊥−〉
(x+)⊥ , x−
(µα+.c)⊥ , µ˜α−.c⊥
()⊥ , []
ι1 (v+)⊥ , pi1
[
v⊥+
]
ι2 (v+)⊥ , pi2
[
v⊥+
]
(
v+, v
′
+
)⊥
,
[
v⊥+, v
′⊥
+
]
(e− · v+)⊥ , e⊥− · v⊥+
∼ (e−)⊥ , ¬
[
e⊥−
]
↓(v−)⊥ , ↑
[
v⊥−
]
[α+]⊥ , α−
[µ˜x+.c]⊥ , µx−.c⊥
µ˜[]⊥ , µ()
µ˜[().c]⊥ , µ
(
[].c⊥
)
µ˜
[
ι1
(
x+
)
.c | ι2
(
y+
)
.c′
]⊥
, µ
(
pi1
[
x−
]
.c⊥ | pi2
[
y−
]
.c′⊥
)
µ˜
[(
x+, y+
)
.c
]⊥
, µ
([
x−, y−
]
.c⊥
)
µ˜
[
(α− · x+).c
]⊥
, µ
(
[α+ · x−].c⊥
)
µ˜
[
∼
(
α−
)
.c
]⊥
, µ
(
¬
[
α+
]
.c⊥
)
µ˜
[
↓
(
x−
)
.c
]⊥
, µ
(
↑
[
x+
]
.c⊥
)
(x−)⊥ , x+
(µα−.c)⊥ , µ˜α+.c⊥
µ()⊥ , µ˜[]
µ([].c)⊥ , µ˜
[
().c⊥
]
µ
(
pi1
[
α−
]
.c | pi2
[
β−
]
.c′
)⊥
, µ˜
[
ι1
(
α+
)
.c⊥ | ι2
(
β
+)
.c′⊥
]
µ
([
α−, β−
]
.c
)⊥
, µ˜
[(
α+, β
+)
.c⊥
]
µ
(
[x+ · α−].c
)⊥
, µ˜
[
(x− · α+).c⊥
]
µ
(
¬
[
x+
]
.c
)⊥
, µ˜
[
∼
(
x−
)
.c⊥
]
µ
(
↑
[
α+
]
.c
)⊥
, µ˜
[
↓
(
α−
)
.c⊥
]
[α−]⊥ , α+
[µ˜x−.c]⊥ , µx+.c⊥
[]⊥ , ()
pi1 [e−]⊥ , ι1
(
e⊥−
)
pi2 [e−]⊥ , ι2
(
e⊥−
)
[
e−, e′−
]⊥
,
(
e⊥−, e
′⊥
−
)
[v+ · e−]⊥ , v⊥+ · e⊥−
¬ [v+]⊥ , ∼
(
v⊥+
)
↑[e+]⊥ , ↓
(
e⊥+
)
FIGURE 4.14. The self-duality of system L programs: with two unit types (1,>), two
empty types (0, ⊥), (co-)products (⊕, &), (co-)pairs (⊗, `), (co-)functions (→, −),
and negations (∼, ¬), and two polarity shifts (↓, ↑).
134
c) Γ | e : A ` ∆ is well-typed if and only if ∆⊥ ` e⊥ : A⊥ | Γ⊥ is.
Furthermore, if a command, term, or co-term lies in the focused sub-syntax, then so
does its dual.
Proof. By induction on the typing derivation, similar to the proof of Theorem 3.7.
Theorem 4.5 (Dynamic duality). a) c µP µ˜PβP c′ if and only if c⊥ µP µ˜PβP c′⊥.
b) v ηµςP v′ if and only if v⊥ ηµ˜ςP v′⊥.
c) e ηµ˜ςP e′ if and only if e⊥ ηµςP e′⊥.
Proof. By cases on the respective rewriting rules using the fact that substitution
commutes with duality, similar to the proof of Theorem 3.8.
135
CHAPTER V
Data and Co-Data
This chapter is a new text based on the ideas and results from (Downen & Ariola,
2014c) of which I was the primary author and developed the language and theory of
data and co-data in the classical sequent calculus presented in this chapter. I would
to thank my advisor Zena M. Ariola for the assistance and feedback in writing that
publication.
The ramifications of treating the sequent calculus as a programming language
(Curien & Herbelin, 2000; Wadler, 2003; Zeilberger, 2008b; Munch-Maccagnoni, 2009)
have elucidated issues that arise in programs, including the interplay between strict
and lazy evaluation in programs and types. When interpreted as a computational
framework, the sequent calculus reveals a diversity of connectives that is easy to
overlook in the tradition of the λ-calculus. However, this diversity can become
overwhelming. We now have several connectives for representing similar logical ideas:
two connectives each for conjunction, disjunction, negation, and so on.
Additionally, there are still some questions that have not been addressed. For
instance, how do other evaluation strategies, like call-by-need (Ariola et al., 1995;
Ariola & Felleisen, 1997; Maraist et al., 1998),1 fit into the picture? If we follow the
story of polarized logic, that the polarity determines evaluation order, then there is
no room—by definition there are only two polarities so we can only directly account
for two evaluation strategies with this approach.
We now aim to tame the abundance of connectives found in the sequent calculus.
Can we find a single pattern that encompasses every single connective we have
discussed so far in the sequent calculus? That way, rather than cataloguing the many
different connectives on a case-by-case basis, we can direct our attention on the
commonalities underlying them all. As a tool for analysis, we summarize a broad
family of types occurring in the sequent calculus, whose static and dynamic properties
1Call-by-need can be thought of as a memoizing version of call-by-name where the arguments to
function calls are evaluated on demand, like in call-by-name, but where the value of an argument is
remembered so that it is computed only once, like in call-by-value.
136
all derive from a small core. As a tool for synthesis, we use the patterns underpinning
the connectives as a mechanism facilitating the exploration of new connectives.
Furthermore, we look for a more general classification of evaluation strategies,
in an effort to capture the essence of strategies, that goes beyond the duality
between call-by-value and call-by-name evaluation. In order to account for other
evaluation strategies like call-by-need, we need to step outside of the polarization
hypothesis, which assumed that every evaluation strategy corresponds to one of the
(two!) polarities. Instead, we look at a treatment of strategy based on its impact
on substitution. The substitution-based characterization of evaluation strategies is
general enough to describe call-by-need evaluation and also generalizes polarization
as a mechanism for combining different evaluation strategies within a single program.
Our approach to understanding the dynamic behavior of the various connectives
is the same as the traditional approach from the λ-calculus: the dynamic meaning
of all connectives are characterized by β and η laws. We will first investigate these
principles as symmetric equations, rather than non-symmetric reductions, which lets us
understand β and η laws that are valid for any evaluation strategy. Besides maintaining
similarity with the simply typed λ-calculus, the equational theory avoids the conflict
between extensionality and control that arises in rewriting theories for classical logic
(David & Py, 2001). Instead, we drive the (untyped) reduction theory and operational
semantics for all the connectives, which includes the operational β and focusing ς
rewriting rules previously studied in Chapters III and IV, is justified in terms of the
fundamental β and η equations.
The Essence of Evaluation: Substitutability
As we have seen previously in Chapters III and IV, there are many different
languages for the sequent calculus (Curien & Herbelin, 2000; Wadler, 2003; Herbelin,
2005; Munch-Maccagnoni, 2009; Munch-Maccagnoni & Scherer, 2015) that are all
based on the same structural core µµ˜-calculus that was explored in Section 3.2. This
core, as was in Figure 3.7, forms the basis of naming in the sequent calculus via
variables and co-variables as well as input and output abstractions. Further still, the
fundamental dilemma of computation in classical sequent calculus lies wholly within
this core. The root cause of non-determinism, non-confluence, and incoherence is a
conflict between the input and output abstractions, where each one tries to take
control over the future path of evaluation. Therefore, before we tackle evaluation
137
of the language with (co-)data types, we will first focus on how to characterize the
resolutions to the fundamental dilemma in the structural core of the sequent calculus.
Recall that the source of the conflict in the structural core of the sequent calculus
comes from the two opposing rules for implementing substitution:
〈µα.c||e〉 µ c {e/α} 〈v||µ˜x.c〉 µ˜ c {v/x}
As stated, a command like 〈µ .c1||µ˜ .c2〉, where the (co-)variables are never used, is
equal to both c1 and c2, so any two arbitrary commands may be considered equal.
The language-based solution to this dilemma from Chapter III is to restrict one of the
two rules to remove the conflict—the µ rule is restricted to co-values to implement a
form of call-by-name evaluation or the µ˜ rule is restricted to values implement a form
of call-by-value evaluation. However, in lieu of inventing various different languages
with different evaluation strategies for mitigating the conflict, let’s instead admit
restrictions on both directions of substitution to values and co-values:
〈µα.c||E〉 µS c {E/α} 〈V ||µ˜x.c〉 µ˜S c {V/x}
while leaving the specifics of what constitutes a substitutable value V and a
substitutable co-value E open-ended. That is to say, we make the sets of values
(V ∈ ValueS) and co-values (E ∈ CoValueS) a parameter of the theory, in the same
sense as Ronchi Della Rocca & Paolini’s (2004) parametric λ-calculus, that may
be filled in at a later time. A choice of a specific value set ValueS and co-value
set CoValueS makes up a substitution strategy S = (ValueS ,CoValueS). The full
parametric equational theory µµ˜ for the structural core (Downen & Ariola, 2014c) is
given in Figure 5.1, where we denote a particular instance of for a chosen substitution
strategy S as µµ˜S . Since the rules for extensionality of input and output abstractions
did not cause any issue, we leave them alone.
By leaving the choice of dual substitution restrictions open as parameters, the
same parametric theory may describe the semantics different evaluation strategies
by instantiating the parameters in different ways. As per Remark 2.3, we can
derive reduction and equational theories from the µS µ˜Sηµηµ˜ rewriting rules from
Figure 5.1 as their compatible-reflexive-transitive and compatible-reflexive-symmetric-
transitive closures, respectively. So, given a particular substitution strategy S, the
S instance of the parametric reduction and equational theories, denoted µµ˜S , is
138
(µS) 〈µα.c||E〉 µS c {E/α} (E ∈ CoValueS)
(µ˜S) 〈V ||µ˜x.c〉 µ˜S c {V/x} (V ∈ ValueS)
(ηµ) µα. 〈v||α〉 ηµ v (α /∈ FV (v))
(ηµ˜) µ˜x. 〈x||e〉 ηµ˜ e (x /∈ FV (e))
FIGURE 5.1. A parametric theory, µµ˜S , for the core µµ˜-calculus.
obtained by instantiating the set of values and co-values with S. The one constraint
on the substitution strategy is that we always assume that variables are values, and
co-variables are co-values, since our restriction on the µ and µ˜ axioms mean that they
can only ever stand in for unknown value and co-values.
If we want to characterize an operational semantics as well, we also need to specify
the evaluation contexts in which the standard reduction may occur. Therefore, we
say that an evaluation strategy S (or just strategy for short) is a substitution strategy
together with a set of evaluation contexts (D ∈ EvalCxtS) that yield a command when
filled with a command, term, or co-term as appropriate, and which includes at least
the following contexts:
–  ∈ EvalCxtS ,
– 〈||E〉 ∈ EvalCxtS for all E ∈ CoValueS , and
– 〈V ||〉 ∈ EvalCxtS for all V ∈ ValueS .
So a choice of evaluation strategy S gives us the µS µ˜S operational semantics that is
closed under EvalCxtS contexts.
The previous characterizations of call-by-value and call-by-name from Chapter III
come out as particular instances of the parametric theory. For example, we can define
the call-by-value strategy V, shown in Figure 5.2, by restricting the set of values to
exclude output abstractions, leaving variables as the only value, and letting every
co-term be a co-value. In effect, this decision restricts the µ˜ rule in the usual way for
call-by-value while letting the µ rule be unrestricted. In addition, the V evaluation
contexts only permit reduction at the top of a command or one of its immediate sub-
(co-)terms, favoring the term side over the co-term side. The call-by-name strategy
N is defined in the dual way by letting every term be a value and restricting the set
139
V ∈ ValueV ::= x
E ∈ CoValueV ::= e
D ∈ EvalCxtV ::=  | 〈||e〉 | 〈V ||〉
V ∈ ValueN ::= v
E ∈ CoValueN ::= α
D ∈ EvalCxtN ::=  | 〈v||〉 | 〈||E〉
FIGURE 5.2. Call-by-value (V) and call-by-name (N ) strategies for the core
µµ˜-calculus.
of co-values to exclude input abstractions, leaving co-variables as the only co-value.
Again, this choice of values and co-values describes the call-by-name restriction on
the µ rule while leaving the µ˜ rule unrestricted. The N evaluation contexts also only
permit reduction at the top of commands or the immediate sub-(co-)terms, but instead
favor the co-term side over the term size.
We can also explore other choices for the parameters that describe strategies
besides just call-by-value and call-by-name. For instance, we can characterize a notion
of call-by-need in terms of a “lazy call-by-value” strategy LV shown in Figure 5.3,
which characterizes evaluation similar to a previous call-by-need theory for the sequent
calculus (Ariola et al., 2011). The intuition for LV is similar to the call-by-need λ-
calculus (Ariola et al., 1995): a non-value term bound to a variable represents a delayed
computation that will only be evaluated when it is needed. Then, only once the term
has been reduced to a value (in the sense of call-by-value), may it be substituted
for the variable. In this way, LV only performs V substitutions (which we can see
from the fact that the LV (co-)values are a subset of V (co-)values), but in a lazy,
pull-driven fashion that gives initial priority to the consumer as in N . Therefore, in
the command 〈v1||µ˜x. 〈v2||µ˜y.c〉〉, we temporarily ignore v1 and v2 and work inside
c since this command decomposes into the evaluation context 〈v1||µ˜x. 〈v2||µ˜y.〉〉
surrounding c. If it turns out that c evaluates to D[〈x||E〉], we are left in the state
〈v1||µ˜x. 〈v2||µ˜y.D[〈x||E〉]〉〉, where E is a co-value that wants to know something about
x, making µ˜x. 〈v2||µ˜y.D[〈x||E〉]〉 into a co-value as well. Therefore, if v1 is a non-value
output abstraction, it may take over via the µLV rule, and thus begin evaluation of
the value of the demanded variable x.
Due to the symmetry of the sequent calculus, it is straightforward to generate
the dual to the call-by-need strategy, which is the “lazy call-by-name” strategy LN
shown in Figure 5.4. This strategy performs a subset of N substitutions (since LN
(co-)values are a subset of N (co-)values), but still gives initial priority to the producer
140
V ∈ ValueLV ::= x
E ∈ CoValueLV ::= α | µ˜x.D[〈x||E〉]
D ∈ EvalCxtLV ::=  | 〈v||µ˜y.D〉 | 〈v||〉 | 〈||E〉
FIGURE 5.3. “Lazy-call-by-value” (LV) strategy for the core µµ˜-calculus.
V ∈ ValueLN ::= x | µα.D[〈V ||α〉]
E ∈ CoValueLN ::= α
D ∈ EvalCxtLN ::=  | 〈µα.D||e〉 | 〈||e〉 | 〈V ||〉
FIGURE 5.4. “Lazy-call-by-name” (LN ) strategy for the core µµ˜-calculus.
as in V. For example, in the command 〈µα. 〈µβ.c||e2〉||e1〉, we temporarily ignore e1
and e2 and work inside c since this command decomposes into the LN evaluation
context 〈µα. 〈µβ.||e2〉||e1〉 surrounding c. If it turns out that c evaluates to D[〈V ||α〉],
we are left in the state 〈µα. 〈µβ.D[〈V ||α〉]||e2〉||e1〉, where V is a value that wants to
yield a result to α, making µα. 〈µβ.D[〈V ||α〉]||e2〉 a value as well. Therefore, if e1 is
a non-co-value input abstraction, it may take over via the µ˜LN rule, and thus begin
evaluation of the observation for the demanded co-variable α.
Remark 5.1. Note that, while our primary interest in strategies is to achieve a coherent,
confluent theory of deterministic evaluation by avoiding the fundamental dilemma of
classical computation, individual strategies are not required to do so. That is to say,
it can be meaningful to talk about strategies that yield incoherent theories for the
sequent calculus, if we are not interested in properties like confluence. For example,
the simplest such strategy is the “unrestricted” strategy, U , for unconstrained and
non-deterministic evaluation, which considers every term to be a value and every
co-term to be a co-value as shown in Figure 5.5. The µ˜U µ˜U theory effectively ignores
the concept of values and co-values, choosing to restrict neither the µ nor µ˜ rules for
V ∈ V alueU ::= v E ∈ CoV alueU ::= e D ∈ EvalCxtU ::= 
FIGURE 5.5. Nondeterministic (U) strategy for the core µµ˜-calculus.
141
substitution, and thereby giving a theory corresponding to Barbanera & Berardi’s
(1994) symmetric λ-calculus for a classical logic that does not consider a restricted
evaluation strategy. End remark 5.1.
Remark 5.2. Another way to think about substitution strategies, and the parameterized
notions of values and co-values, is to consider the essential parts of an equational
theory. Typically, equational theories are expressed by a set of axioms (primitive
equalities assumed to hold) along with some basic properties or rules for forming
larger equations like compatibility reflexivity, symmetry, and transitivity previously
discussed in Remark 2.3.
In a language with an internal notion of variables, like the λ-calculus or the
core µµ˜-calculus, we also generally expect the equational theory to be closed under
substitution. That is to say, if two things are equal, then they should still be equal
after substituting the same term for the same variable in both of them. However, this
principle often does not always hold in full generality for programming languages. For
example, the ML terms let y = x in 5 and 5 are equal—they will always behave the
same in any context. However, if we substitute the term (print ”hi”; 1) for x in both,
we end up with let y = (print ”hi”; 1) in 5 and 5, which are no longer equal because
one produces an observable side effect (printing the string ”hi”) and the other does
not. Instead, ML supports a restricted substitution principle: if two terms are equal,
then they are still equal when we substitute the same value (an integer, a pair of values,
a function abstraction, . . . ) for the same variable in both of them. This restriction
deftly avoids these kinds of counter-examples.
The exact same issue arises in the classical sequent calculus, since it also includes
effects that allow manipulation of control flow. Therefore, we need to restrict the
substitution principle in the sequent calculus to only allow substituting values for
variables. Additionally, since we have a second form of substitution, we also have a
restriction that only allows substituting co-values for co-variables. This leads us to
substitution principles that say if two commands (or terms or co-terms) are equal,
they must still be equal after substituting (co-)values for (co-)variables:
c = c′ V ∈ ValueS
c {V/x} = c′ {V/x} substS
c = c′ E ∈ CoValueS
c {E/α} = c′ {E/α} substS
and similarly for substitutions in terms and co-terms.
142
In lieu of the presentation in Figure 5.1, we may also define the dynamic semantics
of the core µµ˜-calculus by axioms describing trivial statements about variable binding.
The ηµ and ηµ˜ rules state that giving a name to something, and then using it
immediately (without repetition) in the same place is the same thing as doing nothing.
Additionally, we may say that binding a variable to itself is the same thing as doing
nothing:
(µα) 〈µα.c||α〉 µα c (µx) 〈x||µ˜x.c〉 µ˜x c
These axioms can also be seen as the special cases of µS and µ˜S which are always
sound for every strategy, since we always assume that (co-)variables are (co-)values.
If we take the above substitution principles as primitive inference rules like
reflexivity, etc. in our equational theory, we can derive µS and µ˜S from the µα and µ˜x
axioms. The trick is to realize that a command like 〈V ||µ˜x.c〉 is the image of 〈x||µ˜x.c〉
under substitution of V for x. That is to say that 〈V ||µ˜x.c〉 is syntactically the same as
〈x||µ˜x.c〉 {V/x}. Therefore, we can derive the µ˜S axiom from µ˜x and substS as follows:
〈x||µ˜x.c〉 = c µ˜x V ∈ ValueS
〈V ||µ˜x.c〉 = c {V/x} substS
The derivation of µS from µα and substS is similar:
〈µα.c||α〉 = c µα E ∈ CoValueS
〈µα.c||E〉 = c {E/α} substS
Conversely, the substitution principles are derivable from the more powerful µ˜S
and µS axioms. For example, we can derive the substS principle for co-values from µ˜S
by recognizing that both sides of the equation can be deduced from a command like
〈V ||µ˜x.c〉 with the µ˜V axiom, so that congruence allows us to lift the equality c = c′
under the bindings. The full derivation of the co-value substS principle is:
V ∈ ValueS
〈V ||µ˜x.c〉 = c {V/x} µ˜S
c {V/x} = 〈V ||µ˜x.c〉 symm
c = c′
〈V ||µ˜x.c〉 = 〈V ||µ˜x.c′〉 comp
V ∈ ValueS
〈V ||µ˜x.c′〉 = c′ {V/x} µ˜S
〈V ||µ˜x.c〉 = c′ {V/x} trans
c {V/x} = c′ {V/x} trans
143
and the substS principle for co-values may be derived similarly. Therefore, the µ˜S and
µS rules may also be seen as a realization of two dual substitution principles of an
equational theory in the form of axioms. And furthermore, by controlling substitution
we control evaluation itself. End remark 5.2.
The Essence of Connectives: Data and Co-Data
When considering a variety of different polarized connectives (Zeilberger, 2008b,
2009; Curien & Munch-Maccagnoni, 2010; Munch-Maccagnoni, 2013), we find that
they all fit into one of two dual patterns. Each polarized connective is either positive
or negative: positive connectives (following the verificationist approach) describe how
to construct terms, whereas negative connectives (following the pragmatist approach)
describe how to construct co-terms. In response, both approaches define their other
half by inversion, or cases on the allowed patterns of construction. Thus, we use
verificationist approach to represent (algebraic) data types from functional languages,
whose objects are produced by specific constructions and consumed by inversion on
the possible constructions. Contrastingly, we use pragmatist approach to represent the
dual form of co-data types, whose observations, or messages, are described by specific
constructions and whose objects respond by inversion on those possible observations.
To study types in the sequent calculus, we will mirror the way that modern
programming languages let the user define new types. Functional languages allow for
user-defined data types, which are declared by describing the constructors used to
build objects of that type. Object-oriented languages allow for user-defined co-data
types as interfaces, which are declared by describing the methods (observations) to
which objects of that type respond. As we have seen, the sequent calculus unifies
these two computational uses of types, letting us describe both user-defined data and
co-data types as mirror images of one another. Thus, we aim to encompass all the
previously considered connectives as user-defined (co-)data types.
As a starting point, we base the syntax for declaring new user-defined (co-)data
type declarations in the sequent calculus on data type declarations in functional
languages. However, because the form of (co-)data types in the classical sequent
calculus is more expressive than data types in functional languages, we need a syntax
that is more general than the usual form of data type declaration from ML-based
languages. Therefore, we will look at how the generalized syntax for GADTs in Haskell
(Peyton Jones et al., 2006; Schrijvers et al., 2009) may be used for ordinary data type
144
declarations. For example, the typical sum type Either and pair type Both may be
declared as:
data Either a bwhere
Left : a→ Either a b
Right : b→ Either a b
dataBoth a bwhere
Pair : a→ b→ Both a b
In the declaration for Either, we specify that there are two constructors: a Left
constructor that takes a value of type a and builds a value of type Either a b, and
similarly a Right constructor that take a value of type b and builds a value of type
Either a b. In the declaration for Both, we specify that there is one constructor, Pair,
that takes a value of type a, a value of type b, and builds a value of type Both a b.
When declaring a new type in the sequent calculus, we will take the basic GADT
form, but instead describe the constructors with a sequent judgment rather than a
function type. For connectives following the verificationist approach, we have data type
declarations that introduce new concrete terms and abstract co-terms. For instance,
we can give a declaration of A⊕B as:
dataX ⊕ Y where
ι1 : X ` X ⊕ Y |
ι2 : Y ` X ⊕ Y |
where we replace the function arrow (→) with logical entailment (`), to emphasize
that the function type is not inherently baked into the system. Additionally, we mark
the distinguished output of each constructor as X ⊕ Y |, which denotes the type of
the result produced as the output of the constructed term. This declaration extends
the syntax of the language with two new concrete terms for the constructors, ι1 (v)
and ι2 (v), and with one new abstract co-term for case analysis, µ˜[ι1 (x).c1 | ι2 (y).c2].
Note that these are exactly the system L terms and co-terms for the type A⊕B from
Figure 4.3.
Similarly, we can declare pair types A⊗B as:
dataX ⊗ Y where
( , ) : X, Y ` X ⊗ Y |
145
where the multiple inputs to the constructor are given as a list of inputs on the left of
the sequent, as opposed to the “curried” style used in the declaration of Both. Note
that we make use of mix-fix notation ( , ) used in functional languages like Agda
for describing the constructor syntax, so that this declaration extends the syntax of
the language with one new concrete term for the constructor, (v, v′), and one new
abstract co-term for case analysis, µ˜[(x, y).c]. Again, these are exactly the same terms
and co-terms for the type A⊗B in system L.
However, note that user-defined types in the sequent calculus are more general
than in functional programming languages. For example, we can declare the positive
form of negation as:
data∼X where
∼ : ` ∼X | X
where we have an additional output beside the normal distinguished output of type
∼X, which is not expressible in functional programming languages. This declaration
extends the syntax of the language with one new concrete term for the constructor,
∼ (e), and one new abstract co-term for case analysis, µ˜[∼ (α).c].
Besides data declarations, we also have co-data declarations that introduce
abstract terms and concrete co-terms. We can think of a co-data declaration as an
interface that describes the messages understood by an abstract value. By analogy
to object-oriented programming, an interface (co-data type declaration) describes the
fixed set of methods (co-structures) that an object (case abstraction) has to support
(provide cases for), and the object value (case abstraction) defines the behavior that
results from a method call (command). For example, we can declare product types
A&B as:
codataX & Y where
pi1 : | X & Y ` X
pi2 : | X & Y ` Y
where instead of a distinguished output, we have a distinguished input marked as
| A & B for each co-constructor, which denotes the type of the input expected by
the constructed co-term. This declaration extends the language with a new abstract
term for case analysis, µ(pi1 [α].c1|pi2 [β].c2), and two concrete co-terms, pi1 [e] and pi2 [e].
146
Note that these are exactly the terms and co-terms for the type A&B as described
in system L.
Of note, we find that function types, which are usually baked into functional
programming languages as non-definable types, are just another instance of user-
defined co-data types in the sequent calculus. In particular, we can declare function
types A→ B as:
codataX → Y where
· : X | X → Y ` Y
Following the pattern by rote, this declaration extends the language with a new
abstract term, µ([x · α].c) where we put brackets around the call-stack pattern x · α
for clarity, and a new concrete co-term, v · e. Even though presentation for objects of
the function type differs from the usual λ-based presentation, both presentations are
mutually definable as syntactic sugar based on one another, as we saw in Chapter IV:
λx.v , µ(x · α.〈v||α〉) µ([x · α].c) , λx.µα.c
The rest of the basic connectives, including negation and the corresponding unit types
for ⊕, ⊗, &, and `, are declared as user-defined (co-)data types in Figure 5.6.
Now that we have shown how each of the basic connectives can be described
by a data or co-data declaration, our goal is to generalize the pattern to arbitrary,
user-defined data and co-data types. First, we introduce the general untyped syntax
for arbitrary data and co-data in Figure 5.7.2 In addition to the expressions inherited
from the core µµ˜-calculus, we now have two new forms of terms and co-terms. On the
one hand, we have data structure terms K( #»e , #»v ) that build a concrete construction
with the constructor K, and these may be analysed by a data case abstraction co-term
µ˜
[ #                    »K( #»α , #»x ).c] which defines several alternative responses to its given answer matching
specific patterns. On the other hand, we have co-data structure co-terms O[ #»v , #»e ] that
build a concrete observation with the observer O, and these may be analysed by a
co-data case abstraction term µ
( #                   »O[ #»x , #»α ].c) which defines several alternative responses
to its given question matching specific patterns. Note that for both data and co-data
case abstractions, we impose the additional syntactic side-condition that the listed
2We maintain the same convention from Chapter III and IV for user-defined data and co-data
types, whereby terms and co-terms are syntactically distinguished by the use of round parenthesis
for terms and square brackets for co-terms.
147
dataX ⊕ Y where
ι1 : X ` X ⊕ Y |
ι2 : Y ` X ⊕ Y |
codataX & Y where
pi1 : | X & Y ` X
pi2 : | X & Y ` Y
dataX ⊗ Y where
( , ) : X, Y ` X ⊗ Y |
codataX ` Y where
[ , ] : | X ` Y ` X, Y
data 0where codata>where
data 1where
() : ` 1 |
codata⊥where
[] : | ⊥ `
dataX − Y where
· : X ` X − Y | Y
codataA→ Bwhere
· : X | X → Y ` Y
data∼X where
∼ : ` ∼X | X
codata¬X where
¬ : X | ¬X `
FIGURE 5.6. Declarations of the basic data and co-data types.
x, y, z ∈ Variable ::= . . . α, β, γ ∈ CoVariable ::= . . .
K ∈ Constructor ::= . . . O ∈ Observer ::= . . .
c ∈ Command ::= 〈v||e〉
v ∈ Term ::= x | µα.c | K( #»e , #»v ) | µ
( #                   »O[ #»x , #»α ].c)
e ∈ CoTerm ::= α | µ˜x.c | µ˜
[ #                    »K( #»α , #»x ).c] | O[ #»v , #»e ]
FIGURE 5.7. Adding data and co-data to the core µµ˜ sequent calculus.
148
constructors K, . . . of a data case abstraction are all distinct from one another and
likewise the listed observers O, . . . of a co-data case abstraction are all distinct.
Second, we give the type system accommodating the general form of declarations
for a generic data type constructor F and co-data type constructor G in Figure 5.8. The
type constructors in such declarations may connect a sequence of other types, which are
represented by the sequence of type variables #»X . Furthermore, a data type may have
several constructors, named K1 to Kn, and a co-data type may have several observers
which are co-constructors, named O1 to On. The form of these (co-)constructors (i.e.
their arity and the type of terms and co-terms they are built from) are described
by an arbitrary sequent in the declaration, with the (co-)data being defined in the
distinguished input or output position of the sequent. For each such data and co-data
declaration, we have additional typing rules for the newly declared connectives which
are also shown in Figure 5.8. Because the meaning of a particular type constructor F
or G depends on its declaration, we annotate the sequent with the global environment
G that specifies the declarations for all the type constructors, so that G is used to
determine the shape of their left and right logical rules. While these generalized typing
rules are involved, they are described in such a way that they exactly replicate the
expected typing rules for existing (co-)data types. For instance, by instantiating the
generalized typing rules to the basic (co-)data types from Figure 5.6, we recover exactly
the same (unpolarized) logical rules from system L in Figure 4.3. Thus, the syntax
and typing rules for user-defined (co-)data types subsume each basic connective.
Since we have extended the core µµ˜-calculus syntax with (co-)data structures
and abstractions, we must also update the core strategies from Section 5.1 to account
for the new values and co-values introduced by the declarations. We could define the
(co-)values of each newly declared (co-)data type on a case-by-case basis. However,
instead we can also to define the (co-)values of (co-)data types generically across
all declarations, which besides being more economical prevents ad-hoc decisions. To
do this, we define a strategy S once and for all over an untyped syntax that was
given in Figure 5.7 which already accounts for all possible (co-)data type declarations.
Also note that the notion of evaluation context does not change with the addition
of (co-)data, so we only need to consider how the substitution strategy is impacted.
Thus, a strategy can be given for all possible extensions of newly-declared (co-)data
types by carving out a set of values and co-values from the untyped syntax of terms
and co-terms.
149
A,B,C ∈ Type ::= X | F( #»A) X, Y, Z ∈ TypeVariable ::= . . . F,G ∈ Connective ::= . . .
decl ∈ Declaration ::= data F( #»X )where
#                                        »
K : #»A ` F( #»X ) | #»B
| codataG( #»X )where
#                                        »
O : #»A | G( #»X ) ` #»B
G ∈ GlobalEnv ::= #     »decl Γ ∈ InputEnv ::= #       »x : A ∆ ∈ OutputEnv ::= #        »α : A
J,H ∈ Judgement ::= c :
(
Γ `G ∆
)
| (Γ `G v : A | ∆) | (Γ | e : A `G ∆)
Core rules:
x : A `G x : A | VR | α : A `G α : A VL
c :
(
Γ `G α : A,∆
)
Γ `G µα.c : A | ∆ AR
c :
(
Γ, x : A `G ∆
)
Γ | µ˜x.c : A `G ∆ AL
Γ `G v : A | ∆ Γ′ | e : A `G ∆′
〈v||e〉 :
(
Γ′,Γ `G ∆′,∆
) Cut
Logical rules:
Given data F( #»X )where
#                                                   »
Ki :
#   »
Aij
j ` F( #»X ) | #   »Bijj
i
∈ G, we have the rules:
#                                                         »
Γ′j | e : Bij
#             »{C/X} `G ∆′j
j #                                                          »
Γj | v : Aij
#             »{C/X} `G ∆j
j
#»Γj
j
,
#»
Γ′j
j `G Ki( #»e , #»v ) : F(
#»
C ) | #  »∆j
j
,
#  »
∆′j
j
FRKi
#                                                                                                          »
ci :
(
Γ,
#                              »
xi : Ai
#             »{C/X} `G
#                              »
αi : Bi
#             »{C/X} ,∆
)i
Γ | µ˜
[
#                        »Ki( #»αi , #»xi).ci
i
]
: F( #»C ) `G ∆
FL
Given codataG( #»X )where
#                                                    »
Oi :
#   »
Aij
j | G( #»X ) ` #   »Bijj
i
∈ G, we have the rules:
#                                                                                                          »
ci :
(
Γ,
#                              »
xi : Ai
#             »{C/X} `G
#                              »
αi : Bi
#             »{C/X} ,∆
)i
Γ `G µ
(
#                       »Oi[ #»xi , #»αi ].ci
i
)
: G( #»C ) | ∆
GR
#                                                          »
Γj | v : Aij
#             »{C/X} `G ∆j
j #                                                         »
Γ′j | e : Bij
#             »{C/X} `G ∆′j
j
#»Γj
j
,
#»
Γ′j
j | Oi[ #»v , #»e ] : G( #»C ) `G
#  »∆j
j
,
#  »
∆′j
j
GLOi
FIGURE 5.8. Types of declared (co-)data in the parametric µµ˜ sequent calculus.
150
Our call-by-value strategy V will mimic ML-like languages. Therefore, we can
say that a data structure is a value of V when all of its sub-terms are values. For
example, a pair (v1, v2) is a value when both v1 and v2 are values, and an injection,
ι1 (v) or ι2 (v), is a value when v is a value. Additionally, all co-data case abstractions
(i.e. objects) are considered values. This comes from the fact that a λ-abstraction,
which we represent as a case abstraction, is a value in the call-by-value λ-calculus.
As before, though, we continue to admit every single co-term as a co-value. Thus, we
achieve the V strategy with arbitrary (co-)data types shown in Figure 5.9.
Our call-by-name strategy N will mimic call-by-name λ-calculi with data types,
similar to Haskell. Therefore, we still admit every single term as a value. The co-values
of N represent “strict” contexts from a call-by-name λ-calculus. For example, case
analysis is always strict in these languages, therefore the case abstraction of a data
type is a co-value. Additionally, an observation of a co-data type is a co-value when
all sub-(co-)terms are (co-)values. This follows the definition of co-values from the
call-by-name half of the dual calculi from Section 3.3 as well as the hereditary nature
of strict contexts for functions and products in a call-by-value λ-calculus. For example,
the contexts:
letx =  1 in 5 letx = pi1  in 4
is not strict because x is not required to compute the result 5, even though we are
applying the hole  to an argument or projecting out one of its components. However,
the contexts:
case 1of ι1(x)⇒ 5 | ι2(y)⇒ 10
case pi1 of ι1(x)⇒ 5 | ι2(y)⇒ 10
are both strict because we need to compute the input plugged into  to determine
which branch to take. Thus, we achieve the N strategy with arbitrary (co-)data types
shown in Figure 5.9.
Finally, our call-by-need strategy LV is the most complex, since it accounts for
the memoization used to efficiently implement lazy evaluation for the Haskell language.
Intuitively, the key to understanding call-by-need is to think about sharing, where
the values and co-values of LV represent terms and co-terms that may be freely
copied as many times as necessary. In LV , a structure can be copied if all of its sub-
151
V ∈ ValueV ::= x | K( #»e , #»V ) | µ
( #                   »O[ #»x , #»α ].c)
E ∈ CoValueV ::= e
V ∈ ValueN ::= v
E ∈ CoValueN ::= α | O[ #»v , #»E ] | µ˜
[ #                    »K( #»α , #»x ).c]
FIGURE 5.9. Call-by-value (V) and call-by-name (N ) substitution strategies extended
with arbitrary (co-)data types.
V ∈ ValueLV ::= x | K( #»E, #»V ) | µ
( #                   »O[ #»x , #»α ].c)
E ∈ CoValueLV ::= α | µ˜x.D[〈x||E〉] | O[ #»v , #»E ] | µ˜
[ #                    »K( #»α , #»x ).c]
V ∈ ValueLN ::= x | µα.D[〈V ||α〉] | K( #»e , #»V ) | µ
( #                   »O[ #»x , #»α ].c)
E ∈ CoValueLN ::= α | O[ #»V , #»E ] | µ˜
[ #                    »K( #»α , #»x ).c]
FIGURE 5.10. “Lazy-call-by-value” (LV) and “lazy-call-by-name” (LN ) substitution
strategies extended with arbitrary (co-)data types.
(co-)terms can be copied, following the usual treatment of sharing for data structures
in implementations of Haskell. Additionally, a case abstraction can always be copied,
following the treatment of λ-abstractions in implementations of Haskell. Thus, we
achieve the LV strategy with arbitrary (co-)data types shown in Figure 5.10. The
dual lazy call-by-name strategy LN is also shown in Figure 5.10, which is derived by
exchanging the role of terms and co-terms from LV .
Evaluating Data and Co-Data
Having resolved the fundamental dilemma of computation in the parametric
µµ˜-calculus via a variety of strategies, and having extended the language with new
syntactic forms for user-defined (co-)data types, we now need to explain how the
constructs of (co-)data types behave. To that end, we introduce two different semantics
for (co-)data in the parametric sequent calculus:
– a typed βη theory that is independent of the chosen strategy, and
– an untyped βς theory that depends on the chosen strategy.
152
Both of these theories have their own advantages and disadvantages. On the one
hand, the βη theory gives a canonical definition of the dynamic semantics of (co-)data
independently of any evaluation strategy, but relies on types to do so sensibly. On the
other hand, the βς theory gives a mechanism for running programs without resorting
to types and equational reasoning, but it depends on the chosen evaluation strategy
and relates fewer programs than βη.
The typed βη theory of (co-)data
Since the evaluation strategy is handled by the equational theory of the core
µµ˜-calculus, we should express the behavior of (co-)data type structures in some way
that is valid for any choice of strategy, S. In other words, given a set of data and
co-data type declarations G, we would like to describe the equational theory for the
language extended with those types. As we saw in Chapter II, in the λ-calculus the
dynamic meaning of types are expressed by β and η laws. The β laws characterize
the main computational force of a type, whereas the η laws characterize a form of
extensionality for a type. Therefore, to accomplish our goal in the sequent calculus,
we will use an analogous form of β and η laws for defining the dynamic meaning of
user-defined (co-)data types, and like in the λ-calculus, the η laws must be typed to
be sensible.
For example, we may extend the equational theory with the following β law for
functions:
(β→) 〈µ([x · α].c)||v · e〉 β→ 〈v||µ˜x. 〈µα.c||e〉〉
which matches on the structure of a function call and binds the sub-components to
the appropriate (co-)variables. Notice that this rule applies for any function call, v · e,
whether or not v or e are (co-)values, so β→ does not depend on any substitution
strategy. This works because we avoid performing substitution in the β→ axiom, and
instead v and e are put in interaction with input and output abstractions. Since we
have already informed the core structural theory about our chosen strategy, we know
that the substitutions will be performed in the correct order. Therefore, if we are
evaluating our program according to call-by-value, we would have to evaluate v first
(via the µS rule if necessary) before substituting for x. Likewise, in call-by-name, we
would have to evaluate e first (via the µ˜S rule if necessary) before substituting for α.
153
Next, we have the following η law for functions:
(η→) z : A→ B ≺η→ µ([x · α].〈z||x · α〉)
which says that an unknown function, z, is equivalent to a trivial case abstraction
that matches a function call and forwards it along, unchanged, to z. Here, we use the
variable z to stand in for an unknown value, since we are only allowed to substitute
values for variables.
Note that the more general but strategy-dependent presentation of the η law,
which applies to an arbitrary value rather than just a variable, is derivable from
the more restrictive η→ law above and the equational theory of substitution in the
parametric µµ˜-calculus:
V : A→ B =ηµ µγ. 〈V ||γ〉
=ηµ˜ µγ. 〈V ||µ˜z. 〈z||γ〉〉
=η→ µγ. 〈V ||µ˜z. 〈µ([x · α].〈z||x · α〉)||γ〉〉
=µ˜S µγ. 〈µ([x · α].〈V ||x · α〉)||γ〉
=ηµ µ([x · α].〈V ||x · α〉)
This has the nice side effect that neither the β→ or η→ rules themselves explicitly
mention values or co-values in any way—they are strategy independent.3
Remark 5.3. To make the comparison with previous characterizations of functions in
the sequent calculus from Chapter III, we can be more formal about the relationship
between λ-abstractions and co-case abstractions over call stacks. In particular, taking
the round trip of the mutual syntactic sugar definitions presented in Section 5.2 results
in equal (co-)terms:
λx.v , µ([x · α].〈v||α〉) , λx.µα. 〈v||α〉 =ηµ λx.v
µ([x · α].c) , λx.µα.c , µ([x · α].〈µα.c||α〉) =µS µ([x · α].c)
3It also has the pleasant effect that the side conditions on the free variables of V used to prevent
static variable capture automatically come from capture-avoiding substitution in the equational
theory.
154
where the application of µS is valid for any S, since co-variables are always co-values.
We may also rephrase these β and η axioms for functions into the λ-based syntax:
(βλ) 〈λx.v||v′ · e〉 βλ 〈v′||µ˜x. 〈v||e〉〉 (ηλ) z ≺ηλ λx.µα. 〈z||x · α〉
Note that these are mutually derivable from the β→ and η→ axioms according to the
syntactic sugar definition for λ-abstractions, along with the µS and ηµ axioms. Thus,
the two presentations of functions really are equivalent to one another: we can view
a function as a λ-abstraction mapping an input to an output, or as an object that
deconstructs an observation in the shape of a call-stack. End remark 5.3.
Remark 5.4. Even though we can derive a generalized version of the η→ axiom which
applies to values, it is important to note that the η[→] rule would not work if we
replaced z with a general term v. The exact same problem occurs in the call-by-value
λ-calculus, where we admit non-terminating terms. If we are allowed to η expand any
term, then we have the equality:
5 =β (λx.5) (λy.Ω y) =η (λx.5) Ω ≈ Ω
where Ω stands in for a term that loops forever. So if we allow η expansion of arbitrary
terms in the call-by-value λ-calculus, then a value like 5 is the same thing as a program
that loops forever. The solution in the call-by-value λ-calculus is to limit the η rule to
only apply to values. It should then be no surprise that the same limitation is necessary
for the analogous η→ axiom in the classical sequent calculus, where we can always have
the term µ .c that never returns a result just like an infinite loop. End remark 5.4.
Similarly, we can explain the behavior of the co-data type for products with an
analogous set of β and η axioms. The β& axiom demonstrates how an object of A&B
matches on the structure of projection, binding the consumer for its output to the
appropriate co-variable:
(β&) 〈µ(pi1 [α].c1 | pi2 [β].c2)||pi1 [e]〉 β& 〈µα.c1||e〉
(β&) 〈µ(pi1 [α].c1 | pi2 [β].c2)||pi2 [e]〉 β& 〈µβ.c2||e〉
Again, this rule is safe for any projection pi1[e] or pi2[e] because the underlying co-term
e is put in interaction with an output abstraction, so that the substitution is performed
only in the correct situation. Likewise, the η& axiom states that an unknown product
155
value z is equivalent to a redundant co-case analysis which forwards its output to z:
(η&) z : A&B ≺η& µ(pi1 [α].〈z||pi2 [α]〉 | pi2 [β].〈z||pi2 [β]〉)
In other words, the variable z, which stands in for some object of A & B, must be
equivalent to an object with the same response to the pi1 and pi2 projections. As before,
we have the generalized, strategy-dependent version of the η& as an equality:
V : A&B = µ(pi1 [α].〈V ||pi1 [α]〉 | pi2 [β].〈V ||pi2 [β]〉) α, β /∈ FV (V )
which is derivable from η&, ηµ, ηµ˜, and µ˜S , meaning that the only thing that is
observable about an object of A & B is its response to observations of the form
pi1 [α] and pi2 [β].
The β and η laws for user-defined data types follow a similar, but mirrored,
pattern. For example, the β rules for ⊕ is exactly dual to β&, and performs case
analysis on the tag of the term without requiring that the sub-term be a value:
(β⊕) 〈ι1 (v)||µ˜[ι1 (x).c1 | ι2 (y).c2]〉 β⊕ 〈v||µ˜x.c1〉
(β⊕) 〈ι2 (v)||µ˜[ι1 (x).c1 | ι2 (y).c2]〉 β⊕ 〈v||µ˜y.c2〉
These rules work for any injected terms ι1 (v) or ι2 (v) because they put the sub-
term v in interaction with an input abstraction, allowing the equational theory of the
underlying structural core take care of managing evaluation order. For example, while
this rule is stronger than the one given for the call-by-value half of the dual sequent
calculi from Section 3.3, it is still valid according to its Wadler’s (2003) call-by-value
continuation-passing style (CPS) transformation. The η rule for ⊕ is also dual to η&,
where we expand an unknown co-value γ into a case abstraction:
(η⊕) γ : A⊕B ≺η⊕ µ˜[ι1 (x).〈ι1 (x)||γ〉 | ι2 (y).〈ι2 (y)||γ〉]
Thus, the only thing that matters for an unknown sum co-value γ is the way that it
responds to an input of the form ι1 (x) or ι2 (x).
As a final example, consider the β axiom for pairs, which matches on the structure
of the pair and binds the sub-terms to the appropriate variables:
(β⊗) 〈(v, v′)||µ˜[(x, y).c]〉 β⊗ 〈v||µ˜x. 〈v′||µ˜y.c〉〉
156
(βF) 〈Ki( #»e , #»v )||µ˜[· · · | Ki( #»α , #»x ).ci | · · ·]〉 βF 〈µ #»α . 〈 #»v ||µ˜ #»x .ci〉|| #»e 〉
(βG) 〈µ(· · · | Oi[ #»x , #»α ].ci | · · ·)||Oi[ #»v , #»e ]〉 βG 〈 #»v ||µ˜ #»x . 〈µ #»α .ci|| #»e 〉〉
(ηF) γ : F( #»C ) ≺ηF µ˜
[
#                                                     »Ki( #»α , #»x ).〈Ki( #»α , #»x )||γ〉
i
]
(ηG) z : G( #»C ) ≺ηG µ
(
#                                                   »Oi[ #»x , #»α ].〈z||Oi[ #»x , #»α ]〉
i
)
FIGURE 5.11. The βη laws for declared data and co-data types.
The β⊗ rule follows the intuition that a destructuring binding on the structure of a
known pair is the same thing as binding the sub-terms of the pair one at a time. Next,
the η axiom for pairs states that a co-variable γ expecting a pair A⊗ B as input is
the same as the redundant case abstraction which breaks apart and re-assembles it
input before forwarding it to γ:
(η⊗) γ : A⊗B ≺η⊗ µ˜[(x, y).〈(x, y)||γ〉]
We now look to summarize all the β and η laws considered so far into their general
form for user-defined (co-)data types. That way, we can take an arbitrary declaration
for a user-defined (co-)data type and automatically generate the appropriate axioms to
characterize the run-time behavior of its programs. In particular, given the declarations
for a generic data type constructor F and co-data type constructor G in Figure 5.8,
we show the corresponding β and η axioms in Figure 5.11. Note that these rules use
syntactic sugar for writing a sequence of input and output bindings. That is, given
a sequence of terms #»v = v1, . . . , vn and variables #»x = x1, . . . , xn, or a sequence of
co-terms #»e = e1, . . . , en and co-variables #»α = α1, . . . , αn, the sequence bindings are
defined as:
〈 #»v ||µ˜ #»x .c〉 , 〈v1||µ˜x1. . . . 〈vn||µ˜xn.c〉〉 〈µ #»α .c|| #»e 〉 , 〈µα.1 . . . 〈µαn.c||en〉||e1〉
The type restriction on the η laws are necessary to prevent the associated equational
theory from collapsing, similar to the situation in the λ-calculus as discussed in
Section 2.2. For example, the nullary case of the η law for co-data gives us x =η
µ() =η y which is fine if both x : > and y : >, but is troublesome if x and y stand for
some other kind of object like functions or products.
157
(βS)
〈
K( #»E, #»V )
∣∣∣∣∣∣µ˜[· · · | K( #»α , #»x ).c | · · ·]〉 βS c { #      »E/α, #     »V/x}
(βS)
〈
µ(· · · | O( #»x , #»α ).c | · · ·)
∣∣∣∣∣∣O( #»E, #»V )〉 βS c { #     »V/x, #      »E/α}
(ςS) K(
#»
E, e′, #»e , #»v ) ςS µα.
〈
µβ.
〈
K( #»E, β, #»e , #»v )
∣∣∣∣∣∣α〉∣∣∣∣∣∣e′〉
(ςS) K(
#»
E,
#»
V , v′, #»v ) ςS µα.
〈
v′
∣∣∣∣∣∣µ˜y. 〈K( #»E, #»V , y, #»v )∣∣∣∣∣∣α〉〉
(ςS) O(
#»
V , v′, #»v , #»e ) ςS µ˜x.
〈
v′
∣∣∣∣∣∣µ˜y. 〈x∣∣∣∣∣∣O( #»V , y, #»v , #»e )〉〉
(ςS) O(
#»
V ,
#»
E, e′, #»e ) ςS µ˜x.
〈
µβ.
〈
x
∣∣∣∣∣∣O( #»V , #»E, β, #»e )〉∣∣∣∣∣∣e′〉

v′ /∈ ValueS
e′ /∈ CoValueS
x,y, α, β fresh
FIGURE 5.12. The parametric βSςS laws for arbitrary data and co-data.
The untyped βς theory of (co-)data
Next, we consider an alternative semantics for (co-)data in the sequent calculus
which is based on system L’s strategy-dependent β laws from Figure 4.8 and ς laws
from Figure 4.11 in Chapter IV. These rules can be generalized to arbitrary (co-)data
structures as shown in Figure 5.12. Both the β and ς perform two separate and non-
overlapping duties. The ς laws evaluate unevaluated data and co-data structures by
lifting out an unevaluated (i.e. non-(co-)value) sub-expression and giving it a name,
so that computation can proceed to determine its (co-)value. The β laws perform
pattern-matching on fully-evaluated structures built from (co-)values by substituting
the contained (co-)values for the corresponding (co-)variables in the matching pattern
of a case abstraction. Note that the strategy-dependent β laws in Figure 5.12 are less
general than the strategy-independent ones from Figure 5.11, which can pattern-match
on any structure, so that they do not accidentally perform the same work of giving
names to unevaluated components that would otherwise be done by a ς rule. Also
notice that these rewriting rules do not depend on types at all: they function over
untyped syntax, letting us evaluate programs without resorting to information about
static types.
Besides just being meaningful for executing untyped programs, the βς semantics
for (co-)data has another advantage over the βη semantics: the βς reduction theory
is easily confluent.
158
Definition 5.1 (confluence). A reduction relation →R in the sequent calculus is
(strongly) confluent if and only if all divergent reductions c1← R cR c2 join together
as c1 R c′← R c2 for some c′, and similarly for (co-)terms. Furthermore, a reduction
relation →R in the sequent calculus is locally (or weakly) confluent if and only if all
divergent reductions c1 ←R c→R c2 join together as c1 R c′← R c2 for some c′, and
similarly for (co-)terms.
A well-known consequence of confluence is that, for any (strongly) confluent→R,
the equational theory =R is the same thing as convertibility R ← R. That means
that in order to determine if two expressions are equal by a confluent theory, we only
need to normalize both and compare their normal forms. Unfortunately, even putting
issues involving types aside, the combination of the η law with the µµ˜ laws notoriously
breaks confluence. For example, if we consider just functions, we have the following
critical pair between η→S (which generalizes η→ to values) and µS :
µ .c←η→S µ([x · β].〈µ .c||x · β〉)→µS µ([x · β].c)
So confluence in the presence of η and µµ˜ is not so straightforward. Contrarily,
confluence of the βς theory of (co-)data is straightforwardly confluent when combined
with the core µµ˜ theory.
Theorem 5.1 (Parametric confluence). The →µS µ˜Sηµηµ˜βSςS reduction relation is
confluent for any substitution strategy S such that µS µ˜S is deterministic and the
sets ValueS and CoValueS are both forward closed under →µS µ˜Sηµηµ˜βSςS .
Proof. By the decreasing diagrams (van Oostrom, 1994) method of confluence. As
shorthand, let R = µS µ˜Sηµηµ˜βSςS . Our measure of decreasingness based on increasing
depth of the context in which reduction occurs, which finds the context of compatibility
lifting a basic R rewrite into→R. First, we define the depth of a reduction c1 →R c2,
denoted by depth(c1 →R c2), as the height of the hole in the context C from its root
such that c1 = C[c′1], c2 = C[c′2], and c′1 R c′2, and similarly for reduction on (co-)terms.
This measure is well-founded (i.e. for any set of reductions, there is a minimal one with
no others less than it) because the syntax of commands and (co-)terms are finitely
deep. Second, we define the measures of strict decreasingness on reductions, written
(c1 →R1 c′1) < (c2 →R2 c′1), as depth(c1 →R1 c′1) > depth(c2 →R2 c′2), and similarly
for (co-)term reductions. The goal of the proof is then to show that for every rule R1
159
and R2 of R giving a divergent pair of reductions c1 ←R1 c→R2 c2 (and similarly for
(co-)terms), the two ends join back together as
c1 R′1 →R′′2 R′ c′← R′ ←R′′1 ← R′2 c2
where →R′′i is zero or one R reductions of the same measure as c →Ri ci, each R′i
reduction is less than c→Ri ci, and each R′ reduction is less than either c→R1 c1
or c→R2 c2.
We now demonstrate the (strong) confluence of →R by showing that the local
confluence diagrams of each diverging pair of →R reductions are all decreasing by
the above measure. In the cases where the two diverging reductions are disjoint (i.e.
their depths are unordered, so the reductions occur in separate sub-expressions of the
overall expression), then they trivially join in one step via compatibility. Otherwise,
the two diverging reductions are nested (i.e. their depths are ordered, so that one
reduction occurs inside the other or directly on the same expression). In this case, we
proceed by cases on rewriting rule used for the outer-most reduction, so the possible
nested diverging reductions join back together by decreasing diagrams as follows:
– 〈µα.c||E〉 µS c {E/α} has four different possible nested reductions:
∗ If 〈µα.c||E〉 R c′ then c′ = c {E/α}, because the only possibility for R is
µ˜S , but µS µ˜S is deterministic by assumption.
∗ If 〈µα.c||E〉 →ηµ 〈v||E〉 because c = 〈v||α〉 and α /∈ FV (v) then c {E/α} =
〈v||E〉 as well, so the two divergent reductions trivially join.
∗ If 〈µα.c||E〉 →R 〈µα.c′||E〉 because c→R c′ then
c {E/α} →R c′ {E/α} ≺µS 〈µα.c′||E〉
because →R reduction is closed under substitution.
∗ If 〈µα.c||E〉 →R 〈µα.c||E ′〉 because E →R E ′ then E ′ must be a co-value
since co-values are closed under reduction and
c {E/α}R c {E ′/α} ≺µS 〈µα.c||E ′〉
160
which is decreasing because
depth(c {E/α}R c {E ′/α}) > 0 = depth(〈µα.c||E〉 µS c {E/α})
– 〈V ||µ˜x.c〉 µ˜S c {V/x} is analogous to the previous case by duality.
– µα. 〈v||α〉 ηµ v has two different possible nested reductions:
∗ If µα. 〈v||α〉 →µS µα.c {α/β} because v = µβ.c then v =α µα.c {α/β}, so
the two divergent reductions trivially join.
∗ If µα. 〈v||α〉 →R µα. 〈v′||α〉 because v →R v′ then v′ ≺ηµ µα. 〈v′||α〉.
– µ˜x. 〈x||e〉 ηµ˜ e is analogous to the previous case by duality.
–
〈
K( #»E, #»V )
∣∣∣∣∣∣µ˜[· · · | K( #»α , #»x ).c | · · ·]〉 βS c{ #      »E/α, #     »V/x} has several possible
nested reductions inside the (co-)values #»E, #»V of the data structure or inside
the commands . . . c . . . inside the case abstraction, all of which follow similarly
to the latter two cases for µS and µ˜S . Otherwise, there are no other nested
reductions.
–
〈
µ(· · · | O( #»x , #»α ).c | · · ·)
∣∣∣∣∣∣O( #»E, #»V )〉 βS c{ #     »V/x, #      »E/α} is analogous to the
previous case by duality.
– K( #»E, e′, #»e , #»v ) ςS µα.
〈
µβ.
〈
K( #»E, β, #»e , #»v )
∣∣∣∣∣∣α〉∣∣∣∣∣∣e′〉 has the following possible
nested reductions:
∗ Any reduction inside #»E , #»e , or #»v trivially joins in one step because
(co-)values are closed under reduction. Likewise, any reduction inside e′
which does not convert e′ into a co-value also joins in one step.
∗ If K( #»E, e′, #»e , #»v )→R K( #»E,E ′, #»e , #»v ) because e′ →R E ′ then
µα.
〈
µβ.
〈
K( #»E, β, #»e , #»v )
∣∣∣∣∣∣α〉∣∣∣∣∣∣e′〉→R µα. 〈µβ. 〈K( #»E, β, #»e , #»v )∣∣∣∣∣∣α〉∣∣∣∣∣∣E ′〉
→µS µα.
〈
K( #»E,E ′, #»e , #»v )
∣∣∣∣∣∣α〉
ηµ K(
#»
E,E ′, #»e , #»v )
which is decreasing because the first two reductions occur in non-empty
contexts (i.e. their depth is greater than 0) and the final reduction occurs
in the empty context, so its measure is the same as the ςS reduction.
161
– All three other ςS are similar to the previous case.
As special cases, each of the particular substitution strategies we considered in
Section 5.1 (except for U) is confluent.
Corollary 5.1. The →µS µ˜Sηµηµ˜βSςS reduction relation is confluent for S = V, S = N ,
and S = LV.
Proof. Follows from Theorem 5.1, since each of V ,N , and LV make µµ˜ deterministic
and their (co-)values are closed under reduction.
Extensionality and lifting
Now that we have two competing theories for the dynamic semantics of (co-)data,
how do they compare? Do they agree, and give similar results for the same programs?
As it turns out, when the restriction of the βς equational theory to typed commands
and (co-)terms is derivable from the βη equational theory with help from the µµ˜ core.
For example, we have the specific ς rules specialized for the ⊕ connective declared in
Figure 5.6:
(ς⊕) ι1(v) = µα. 〈v||µ˜x. 〈ι1(x)||α〉〉 (ς⊕) ι2(v) = µα. 〈v||µ˜x. 〈ι2(x)||α〉〉
These rules can be derived by η expansion followed by β reduction:
ι1(v) : A⊕B =ηµ µα. 〈ι1(v)||α〉
=η⊕ µα. 〈ι1(v)||µ˜[ι1(x).〈ι1(x)||α〉 | . . .]〉
=β⊕ µα. 〈v||µ˜x. 〈ι1(x)||α〉〉
Notice here that the steps of this derivation are captured exactly by our formulation
of β and η axioms: (1) the ability to η expand a co-variable, and (2) the ability to
perform β reduction immediately to break apart a structure once the constructor is
seen.
We also have similar specialized ς rules for functions:
(ς→) v · e = µ˜x. 〈v||µ˜y. 〈x||y · e〉〉 (ς→) V · e = µ˜x. 〈µα. 〈x||V · α〉||e〉
162
which are again derivable by a similar procedure of η expansion and β reduction:
V · e : A→ B =ηµ˜ µ˜x. 〈x||V · e〉
=η→ µ˜x. 〈µ(y · α.〈x||y · α〉)||V · e〉
=β→ µ˜x. 〈V ||µ˜y. 〈µα. 〈x||y · α〉||e〉〉
=µ˜V µ˜x. 〈µα. 〈x||V · α〉||e〉
v · e : A→ B =ηµ˜ µ˜x. 〈x||v · e〉
=η→ µ˜x. 〈µ(y · α.〈x||y · α〉)||v · e〉
=β→ µ˜x. 〈v||µ˜y. 〈µα. 〈x||y · α〉||e〉〉
=µ˜V µ˜x. 〈v||µ˜y. 〈x||µ˜x. 〈µα. 〈x||y · α〉||e〉〉〉
=ς→ µ˜x. 〈v||µ˜y. 〈x||y · e〉〉
These particular ς axioms for functions are interesting because they were left out of
Wadler’s (2003) sequent calculus, however, we now know they were implicitly present
in the equational theory (Wadler, 2005) as a consequence of the β and η axioms. This
same procedure words for all the definable (co-)data types, so that the βη axioms
for the F and G (co-)data type constructors as declared in Figure 5.8 generate the
derived ς axioms shown in Figure 5.12. These rules search for the left-most non-value
or non-co-value found in a data or co-data structure, and give it a name with an input
or output abstraction, which comes from the ordering of bindings implied by the β
laws in Figure 5.11. For example, the instance of the derived lift axioms for pair types
A⊗B, following the general pattern, are:
(ς⊗) (v, v′) = µα. 〈v||µ˜x. 〈(x, v′)||α〉〉 (ς⊗) (V, v′) = µα. 〈v′||µy. 〈(V, y)||α〉〉
Remark 5.5. Notice that all of the strategies we have considered so far follow a
particular pattern. More specifically, each of the V, N , and LV strategies fit the
following focalizing criteria.
Definition 5.2 (Focalizing strategy). A strategy S is focalizing if and only if
– (co-)variables are (co-)values (as assumed to hold for all strategies),
– structures built from (co-)values are themselves (co-)values (i.e. K( #»E, #»V ) and
O[ #»V , #»E ] are (co-)values), and
163
– case abstractions are (co-)values (i.e. µ˜[K( #»α , #»x ).c | . . .] and µ(O[ #»x , #»α ].c | . . .)
are (co-)values).
These criteria correspond to the impact of focalization on the typing rules for
system L from Section 4.4, and further justifies the connection between maintaining
focus with the stoup in proof search and values and strictness in languages.
Furthermore, it also happens that the non-(co-)values of each of these three strategies
are closed under ς-reduction as well. In other words, the ς laws cannot create or destroy
(co-)values, but instead only serve to identify and lift out sub-(co-)terms that are out
of focus. Thus, these strategies are all focalizing, in that they follow a focalization
procedure dynamically at run-time.
Besides demonstrating the connection between focalization and evaluation, these
criteria give us a general technique for developing strategies. In particular, we can
take a core strategy, which covers only the structural core of the sequent calculus, and
automatically extend it with data and co-data with a single, generic method. First,
close the sets of values and co-values under the above three focalization criteria, so
that K( #»E, #»V ), O[ #»V , #»E ], µ˜[K( #»α , #»x ).c | . . .], and µ(O[ #»x , #»α ].c | . . .) are all (co-)values.
Second, close the sets of values and co-values under ς expansion, so that if v →ς V
and e→ς E then v and e are themselves (co-)values.
This generic method let’s us generate the previously known strategies for the
parametric sequent calculus. For example, applying this method to the core V, N ,
and LV strategies from Figures 5.2 and 5.3 gives exactly the extended strategies in
Figures 5.9 and 5.10. So the core strategy gives enough information to recover its
corresponding focalizing strategy. Furthermore, we already assumed that strategies
always consider (co-)variables to be (co-)values. Thus, in the world of focalizing
strategies for the parametric sequent calculus, the only crucial decision is what
to do with general input and output abstractions; everything else follows from
focalization. End remark 5.5.
More generally, we can say that the βς equational theory of (co-)data is sound with
respect to the βη equational theory, with help from the core µµ˜ theory of substitution.
Theorem 5.2 (Soundness of βς w.r.t. βη). For any substitution strategy S:
a) If c :
(
Γ `G ∆
)
, c′ :
(
Γ `G ∆
)
, and c =βSςS c
′, then c =µS µ˜Sηµηµ˜βGηG c′.
b) If Γ `G v : A|∆, Γ `G v′ : A|∆, and v =βSςS v′, then v =βGηG v′.
164
c) If Γ|e : A `G ∆, Γ|e′ : A `G ∆, and e =βSςS e′, then e =µS µ˜Sηµηµ˜βGηG e′.
Proof. Note that compatibility, reflexivity, symmetry, and transitivity of =βSςS implies
the same in =µS µ˜Sηµηµ˜βGηG , so we only need to check that the βSςS rewriting rules
can be derived as =µS µ˜Sηµηµ˜βGηG equalities:
– βS restricted to a data type F(
#»
C ) is derived as:
〈
K( #»E, #»V )
∣∣∣∣∣∣µ˜[· · · | K( #»α , #»x ).c | · · ·]〉 =βF 〈µ #»α . 〈 #»V ∣∣∣∣∣∣µ˜ #»x .c〉∣∣∣∣∣∣ #»E〉
=µS µ˜S c
{ #      »
E/α,
#     »
V/x
}
– βS restricted to a co-data type G(
#»
C ) is derived analogously to the previous case.
– ςS restricted to a data type F(
#»
C ) is derived inductively on the structure of
constructions from right-to-left as:
K( #»E, #»V , v′, #»v ) : F( #»C )
=ηµ µα.
〈
K( #»E, #»V , v′, #»v )
∣∣∣∣∣∣α〉
=ηF µα.
〈
K( #»E, #»V , v′, #»v )
∣∣∣∣∣∣µ˜[· · · | K( #»β , #»x , y, #»z ).〈K( #»β , #»x , y, #»z )∣∣∣∣∣∣α〉 | · · ·]〉
=βF µα.
〈
µ
#»
β .
〈
#»
V
∣∣∣∣∣∣µ˜ #»x . 〈v′∣∣∣∣∣∣µ˜y. 〈 #»v ∣∣∣∣∣∣µ˜z. 〈K( #»β , #»x , y, #»z )∣∣∣∣∣∣α〉〉〉〉∣∣∣∣∣∣ #»E〉
=µS µα.
〈
#»
V
∣∣∣∣∣∣µ˜ #»x . 〈v′∣∣∣∣∣∣µ˜y. 〈 #»v ∣∣∣∣∣∣µ˜z. 〈K( #»E, #»x , y, #»z )∣∣∣∣∣∣α〉〉〉〉
=µ˜S µα.
〈
v′
∣∣∣∣∣∣µ˜y. 〈 #»v ∣∣∣∣∣∣µ˜z. 〈K( #»E, #»V , y, #»z )∣∣∣∣∣∣α〉〉〉
=ςF µα.
〈
v′
∣∣∣∣∣∣µ˜y. 〈K( #»E, #»V , y, #»v )∣∣∣∣∣∣α〉〉
K( #»E, e′, #»e , #»v ) : F( #»C )
=ηµ µα.
〈
K( #»E, e′, #»e , #»v )
∣∣∣∣∣∣α〉
=ηF µα.
〈
K( #»E, e′, #»e , #»v )
∣∣∣∣∣∣µ˜[· · · | K( #»β , γ, #»δ , #»x ).〈K( #»β , γ, #»δ , #»x )∣∣∣∣∣∣α〉 | · · ·]〉
=βF µα.
〈
µ
#»
β .
〈
µγ.
〈
µ
#»
δ .
〈
#»v
∣∣∣∣∣∣µ˜ #»x . 〈K( #»β , γ, #»δ , #»x )∣∣∣∣∣∣α〉〉∣∣∣∣∣∣ #»e 〉∣∣∣∣∣∣e′〉∣∣∣∣∣∣ #»E〉
=µS µα.
〈
µγ.
〈
µ
#»
δ .
〈
#»v
∣∣∣∣∣∣µ˜ #»x . 〈K( #»E, γ, #»δ , #»x )∣∣∣∣∣∣α〉〉∣∣∣∣∣∣ #»e 〉∣∣∣∣∣∣e′〉
=ςF µα.
〈
µγ.
〈
K( #»E, γ, #»e , #»v )
∣∣∣∣∣∣α〉∣∣∣∣∣∣e′〉
– ςS restricted to a co-data type G(
#»
C ) is derived analogously to the previous
case.
165
Going the other way, the strategy-independent β law is sound with respect to
the strategy-dependent βς rewriting theory, with the help from the core µµ˜S theory
of substitution, for any focalizing strategy (Definition 5.2).
Theorem 5.3 (Soundness of β w.r.t. βς). For any focalizing strategy S, if c =βG c′,
then c =µS µ˜SβSςS c
′, and similarly for (co-)terms.
Proof. Note that compatibility, reflexivity, symmetry, and transitivity of =βG implies
the same in =µS µ˜SβSςS , so we only need to check that the βG rewriting rules can be
derived as =µS µ˜SβSςS equalities:
– βS for a data structure is derived as:
〈K( #»e , #»v )||µ˜[· · · | K( #»α , #»x ).c | · · ·]〉
=ςSµS 〈µ #»α . 〈K( #»α , #»v )||µ˜[· · · | K( #»α , #»x ).c | · · ·]〉|| #»e 〉
=ςSµSµS 〈µ #»α . 〈 #»v ||µ˜ #»x . 〈K( #»α , #»x )||µ˜[· · · | K( #»α , #»x ).c | · · ·]〉〉|| #»e 〉
=βS 〈µ #»α . 〈 #»v ||µ˜ #»x .c〉|| #»e 〉
The first two steps follow by applying ςS reduction to name non-(co-)values and
applying µS µ˜S to name (co-)values, and then substituting the case abstraction
(which must be a co-value because S is focalizing) for the outer µ-abstraction
generated by ςS . The last step follows because (co-)variables are (co-)values since
S is focalizing.
– βS for a co-data structure is derived analogously to the previous case.
So equationally speaking, in the presence of the core µµ˜ theory of substitution,
typed versions of the βSςS laws can be derived from the typed βGηG laws, and untyped
versions of the βG laws can be derived from the untyped βSςS laws. However, the
typed ηG law cannot be derived by βSςS , so βGηG equates more typed programs.
Combining Strategies in Connectives
The parametric µµ˜-calculus provides a general framework for describing all the
basic connectives discussed in Section 5.2, giving a mechanism for extending the syntax
and semantics of the sequent calculus to account for a wide variety of new structures.
However, what about the connectives of polarized system L from Section 4.3 which
166
involved both polarities? Can we include the shifts, negation, and polarized function
type into our notion of user-defined data and (co-)data types? Also, what about
polarized logic ability to utilize multiple evaluation strategies in a single program? Is
there a way to instantiate the parametric equational theory with two strategies at the
same time? Or even more than two strategies at once?
To answer to all of these questions, let’s look at how the parametric µµ˜-calculus
described thus far compares to a polarized languages like system L. In polarized system
L, all types are classified by one of two polarities: positive or negative. The distinction
between data and co-data determines the polarity of a type, and furthermore the type’s
polarity determines the evaluation order used for programs of that type. In polarized
system L, data types are positive and describe call-by-value programs, whereas co-data
types are negative and describe call-by-name programs. In the parametric µµ˜-calculus,
we have stepped outside this regimine, so that programs of data types and co-data
types can be evaluated with the strategy of our own choosing. However, we can still
allow for this choice of strategy while remaining compatible with polarized logic’s
type-based approach to evaluation strategy. In particular, we can still have multiple
classifications of types, as a generalization of polarized types, and use the type’s
classification to determine which strategy to use for programs of that type. In other
words, even though we have decoupled the link between data vs co-data and evaluation
order, we can still have the evaluation strategy depend on the type.
Separating types into different classifications is not a new idea, and shows up in
several type systems in the form of kinds. Effectively, kinds classify types in the same
way that types classify terms, i.e. kinds are types “one level up the chain.” Therefore,
we will look at extending the parametric µµ˜-calculus with multiple base kinds for
classifying (co-)data types of different strategies. For example, if we are interested
in both call-by-value (V) and call-by-name (N ) evaluation, then we would have two
different base kinds, called V and N , which classify the various types of call-by-value
and call-by-name programs, respectively.4 This extension to the language of kinds
involves understanding more about which kinds involved in the various connectives:
we to know the kinds of types expected as parameters to the connective, as well as
4Here we use the names V andN to mean both a strategy (a set of values, co-values, and evaluation
contexts) and a kind (a “type of types”). Even though the two are different things, the clash in
naming is meant to make obvious the connection between the kind and the strategy. Both kinds and
strategies are used in very different places, so the meaning of V and N can be distinguished from
context.
167
the kind of type the connective builds. Thus, we need to be more explicit in our data
and co-data declarations in order to specify the link with strategy.
For example, let’s suppose we want a wholly call-by-value pair type, corresponding
to the polarized version of the positive ⊗ connective. We can make this intent known
by adding explicit kind annotations to the declaration of ⊗ from Figure 5.6:5
data (X : V)⊗ (Y : V) : V where
( , ) : X : V , Y : V ` X ⊗ Y : V |
Here, we say that the types for both components of the pair belong to kind V, and
the resulting pair type itself also belongs to kind V . Because we interpret the kind V
as containing the types of programs which should be evaluated according to the V
strategy, then this declaration gives us the basic pair type in the call-by-value instance
of the parametric equational theory. The main difference here is that we are being
explicit about the fact that types A, B, and A⊗B must be call-by-value, and cannot
be interpreted by any other evaluation strategy, as opposed to the previous situation
where the programs of a (co-)data type could be interpreted by any evaluation strategy
of our choice. The impact of these explicit kind annotation on typing is minor: the
rules for typing terms and co-terms of type A⊗B are essentially the same as before.
The main change is that we need to make sure that types are well-kinded. In particular,
we have a new judgement X1 : k1, . . . , Xn : kn `G A : k that says that A is a type of
kind k with respect to the assigned kinds of type variables in the typing environment
Θ = X1 : k1, . . . , Xn : kn and the declarations in G. Then A ⊗ B is a type of kind
V under an typing environment Θ and set of declarations G containing the data
declaration of ⊗ when both A and B are as well:
Θ `G A : V Θ `G A : V
Θ `G A⊗B : V
Additionally, we can also describe a wholly call-by-name product type,
corresponding to the polarized version of the negative & connective. Making this
intent known in the more general setting is done by adding N kind annotations to
5Adding explicit kinds to a data type declaration is not new; it is supported by GHC with the
extension “kind signatures.” Rather, the new idea is to have the kind impact the meaning of a term
by denoting its evaluation strategy.
168
data (X : V)⊕ (Y : V) : V where
ι1 : X : V ` X ⊕ Y : V |
ι2 : Y : V ` X ⊕ Y : V |
codata (X : N ) & (Y : N ) : N where
pi1 : | X & Y : N ` X : N
pi2 : | X & Y : N ` Y : N
data (X : V)⊗ (Y : V) : V where
( , ) : X : V , Y : V ` X ⊗ Y : V |
codata (X : N )` (Y : N ) : N where
[ , ] : | X ` Y : N ` X : N , Y : N
data 1 : V where
() : ` 1 : V |
codata⊥ : N where
[] : | ⊥ : N `
data 0 : V where codata> : N where
FIGURE 5.13. Declarations of the basic single-strategy data and co-data types.
the declaration of & from Figure 5.6:
codata (X : N ) & (Y : N ) : N where
pi1 : | X & Y : N ` X : N
pi2 : | X & Y : N ` Y : N
Here, we say that the types for both components of the product belong to the kind N ,
and the resulting product type itself also belongs to kind N . Thus, this declaration
forces us to evaluate programs of this type in a way that matches the corresponding
interpretation in polarized languages like system L. In general, we can annotate all the
basic types of Figure 5.6 to force them into their polarized interpretations, giving the
annotated declarations in Figure 5.13. Essentially, this process involves us annotating
all data types with the kind V and all co-data types with the kind N , following the
assertion that data types describe call-by-value evaluation and co-data types describe
call-by-name evaluation. As before, the typing rules for terms and co-terms of type
A&B do not change with the addition of kind annotations, we only have an additional
rule for the well-kinded uses of the & connective:
Θ `G A : N Θ `G B : N
Θ `G A&B : N
169
data ↓(X : N ) : V where
↓ : X : N ` ´X : V |
codata ↑(X : V) : N where
↑ : | ˆX : N ` X : V
data∼(X : N ) : V where
∼ : ` ∼X : V | X : N
codata¬(X : V) : N where
¬ : X : V | ¬X : N `
data (X : V)− (Y : N ) : V where
· : X : V ` (X − Y ) : V | Y : N
codata (X : V)→ (Y : N ) : N where
· : X : V | (X → Y ) : N ` Y : N
FIGURE 5.14. Declarations of basic mixed-strategy data and co-data types.
While annotating the kinds of types involved in (co-)data declarations is
relatively straightforward for the single-polarity connectives, the exercise becomes
more important when representing polarized connectives that involve both polarities.
For example, the polarized function type made non-trivial use of both polarities in its
definition, which can be captured by the following annotated co-data declaration:
codata (X : V)→ (Y : N ) : N where
· : X : V | X → Y : N ` Y : N
Intuitively, the source A of the function type must be positive so it belongs to the
kind V , and the target B of the function type must be negative so it belongs to kind
N . Furthermore, since polarized languages assume that all co-data types themselves
are negative, the overall type A→ B belongs to the kind N . This declaration gives
us the primordial, Zeilberger’s (2009) polarized function type, with the same impact
on evaluation order. Likewise, we can give annotated (co-)data type declarations for
other mixed-polarity connectives, like the polarity shifts ´A and ˆA and involutive
negations ¬A and ∼A, as shown in Figure 5.14. Thus, kind-annotated (co-)data type
declarations give us a syntactic mechanism for summarizing all the simple polarized
connectives that we have previously seen.
In general, the extension of (co-)data declarations to include multiple base kinds
(R, S, T ), along with the necessary kinding restrictions, is given in Figure 5.15. This
extension means that we need to keep track of what kind each type variable has, since
6This is just shorthand for a (co-)data declaration of F( #        »X : k) : S in G.
170
k ∈ Kind ::= S R,S, T ∈ BaseKind ::= . . .
A,B,C ∈ Type ::= X | F( #»A) X, Y, Z ∈ TypeVariable ::= . . . F,G ∈ TypeCon ::= . . .
decl ∈ Declaration ::= data F( #        »X : k) : Swhere
#                                                                 »
K :
(
#         »
A : T ` F( #»X ) | #         »B : R
)
| codataG( #        »X : k) : Swhere
#                                                                  »
O :
(
#         »
A : T | G( #»X ) ` #         »B : R
)
G ∈ GlobalEnv ::= #     »decl Θ ∈ TypeEnv ::= #        »X : k
Γ ∈ InputEnv ::= #       »x : A ∆ ∈ OutputEnv ::= #        »α : A
J,H ∈ Judgement ::= (G ` decl) | (Θ `G A : k)
Declaration rules:
#                                »
#        »
X : k `G A : T
#                                 »
#        »
X : k `G B : R
G `
data F( #        »X : k) : Swhere
#                                                                 »
K :
(
#         »
A : T ` F( #»X ) | #         »B : R
)
data
#                                »
#        »
X : k `G A : T
#                                 »
#        »
X : k `G B : R
G `
codataG( #        »X : k) : Swhere
#                                                                 »
O :
(
#         »
A : T | F( #»X ) ` #         »B : R
)
codata
Kind rules:
Θ, X : k `G X : k TV
#                      »Θ `G C : k (F( #        »X : k) : S)6 ∈ G
Θ `G F( #»C ) : S
FT
FIGURE 5.15. Kinds of multi-strategy (co-)data declarations and types.
171
there are now multiple options, necessitating the introduction of type environments
X1 : k1, . . . , X2 : k2 denoted by Θ which are analogous to input (Γ) and output (∆)
environments at the level of types instead of programs. These type environments Θ
are used for checking the kind of a type A as in the first new form of judgement
Θ `G A : k which checks that the type A has kind k under the assumption that
type variables have the kind listed in Θ given the set of declarations G. Since all the
specific types are generated by (co-)data declarations, there are only two inference
rules for finding the kind of a type: reference to a type variable (TV ) or an instance
of a particular (co-)data type former F( #        »X : k) : S from the global set of declarations
G. We annotate the types and type variables in (co-)data declarations to make the
intension of the declaration explicit in the syntax. This explicit annotation makes it
straightforward to check that declarations are well-formed. The second new form of
judgement G ` decl checks that the declaration decl is well-formed—meaning that it
includes only well-kinded types—given a previously established set of declarations G.
To accomodate the generalization to multiple base kinds, we must also update
the typing rules for programs of the parametric µµ˜-calculus, as shown in Figure 5.16.
For the most part, the change from the single-kinded type system from Figure 5.8
is that we thread the type environment Θ around the rules, as demonstrated by the
updated judgement forms c :
(
Γ `ΘG ∆
)
, Γ `ΘG v : A | ∆, and Γ | e : A `ΘG ∆. Note
that the only substantial update in the typing rules is in the cut rule: Cut now takes
an additional premise Θ `G A : S checking that the cut type is indeed a type of some
base kind S. This extra premise is needed because, reading the rules bottom-up, the
Cut is the only inference rule that invents a new type out of thin air (see Section 3.1).
It is therefore prudent to check that this new type actually makes sense under the
given type environment Θ and global declarations G. Other than this change to Cut,
the other core inference rules (VR, VL, AR, AL) and the logical rules are essentially
the same as from Figure 5.8, ignoring Θ.
Having outlined the general pattern for mixed-strategy (co-)data types, we can
use the declaration mechanism to come up with special-purpose types that might be
used in a program. For example, we can represent the use of strictness in Haskell to
create lazy data structures with strict fields, like a lazy pair where the first component
is strict. We can signify this intent by declaring a different pair type that uses two
172
Judgement ::= c :
(
Γ `ΘG ∆
)
| (Γ `ΘG v : A | ∆) | (Γ | e : A `ΘG ∆)
Core rules:
x : A `ΘG x : A |
VR | α : A `ΘG α : A
VL
c :
(
Γ `ΘG α : A,∆
)
Γ `ΘG µα.c : A | ∆
AR
c :
(
Γ, x : A `ΘG ∆
)
Γ | µ˜x.c : A `ΘG ∆
AL
Γ `ΘG v : A | ∆ Θ `G A : S Γ′ | e : A `ΘG ∆′
〈v||e〉 :
(
Γ′,Γ `ΘG ∆′,∆
) Cut
Logical rules:
Given data F( #        »X : k) : Swhere
#                                                                                    »
Ki :
(
#              »
Aij : Tijj ` F( #»X ) | #                »Bij : Rijj
)i
∈ G, we have
the rules:
#                                                          »
Γ′j | e : Bij
#             »{C/X} `ΘG ∆′j
j #                                                          »
Γj | v : Aij
#             »{C/X} `ΘG ∆j
j
#»Γj
j
,
#»
Γ′j
j `ΘG Ki( #»e , #»v ) : F(
#»
C ) | #  »∆j
j
,
#  »
∆′j
j
FRKi
#                                                                                                           »
ci :
(
Γ,
#                              »
xi : Ai
#             »{C/X} `ΘG
#                              »
αi : Bi
#             »{C/X} ,∆
)i
Γ | µ˜
[
#                        »Ki( #»αi , #»xi).ci
i
]
: F( #»C ) `ΘG ∆
FL
Given codataG( #        »X : k) : Swhere
#                                                                                     »
Oi :
(
#              »
Aij : Tijj | G( #»X ) ` #                »Bij : Rijj
)i
∈ G, we have
the rules:
#                                                                                                           »
ci :
(
Γ,
#                              »
xi : Ai
#             »{C/X} `ΘG
#                              »
αi : Bi
#             »{C/X} ,∆
)i
Γ `ΘG µ
(
#                       »Oi[ #»xi , #»αi ].ci
i
)
: G( #»C ) | ∆
GR
#                                                          »
Γj | v : Aij
#             »{C/X} `ΘG ∆j
j #                                                          »
Γ′j | e : Bij
#             »{C/X} `ΘG ∆′j
j
#»Γj
j
,
#»
Γ′j
j | Oi[ #»v , #»e ] : G( #»C ) `ΘG
#  »∆j
j
,
#  »
∆′j
j
GLOi
FIGURE 5.16. Types of multi-strategy (co-)data in the parametric µµ˜ sequent
calculus.
173
different kinds, N and V :
dataMixedPair(X : V , Y : N ) : N where
MPair : X : V , Y : N ` MixedPair(X, Y ) : N |
In this declaration, the fact that the type A belongs to kind V denotes that the first
component should be evaluated with the call-by-value strategy V , whereas the second
component and the pair as a whole should be evaluated with the call-by-name strategy
N . We could better reflect such a data type in Haskell with strict fields, by accounting
for memoization through the call-by-need strategy, by just replacing N with LV .
Remark 5.6. Recall from Section 5.2 that although the η axioms for data and co-data
types do not reference the chosen strategy, their expressive power is affected by the
substitution principle, which is in turn affected by the choice of values and co-values.
In light of this observation, if we were forced to pick only one strategy for all data
types and one strategy for all co-data types, it would make sense to pick the strategies
that would give us the strongest equational theories. Therefore, if we want to make
the η axiom for a data type as strong as possible, we should choose the call-by-value
V strategy, since by substitution every co-term of that data type is equivalent to a
case abstraction on the structure of the type. Likewise, if we want to make the η
axiom for a co-data type as strong as possible, we should choose the call-by-name
N strategy, since by substitution every term of that co-data type is equivalent to a
co-case abstraction on the co-structure of the type. In this sense, the decision use of
polarities (i.e. the data/co-data divide) to determine evaluation strategy is the same
as choosing strategies to get the strongest and most universal η principles for every
(co-)data type. End remark 5.6.
Combining Strategies in Evaluation
Now that we are looking at programs with multiple different strategies running
around, we need to be able make sure that only terms and co-terms from the same
strategy interact with one another. Otherwise, the same fundamental dillema that
we were trying to avoid could crop back up again. For example, suppose we have a
program using both the call-by-value and call-by-name strategies, V and N , and face
the usual problematic command c0 = 〈µ .c1||µ˜ .c2〉. If we interpret the term µ .c1 as
call-by-name then it is a value of N , meaning it is a valid instance of µ˜V substitution,
174
and if we interpret the co-term µ˜ .c2 as call-by-value, then it is a co-value of V , meaning
it is a valid instance of µE substitution. This puts us back where we started, where
c1 =µE c0 =µ˜V c2 due to the conflict in a N -V interaction. Thus, our goal is to be
able to instantiate the parametric equational theory with a more complex composite
strategy made up of several primitive strategies, and use the kinds of types to make
sure that the terms and co-terms agree on which strategy to use in a command. This
way, we can understand how to write and run programs that interleave several different
evaluation strategies, and be sure that we will still get out the expected result in the
end.
Recall from Chapter IV that as a way out of the dilemma, Danos et al. (1997)
shows that we can use types to disambiguate the expected evaluation order in unclear
commands. This procedure follows the assumption that η laws are universal (Graham-
Lengrand, 2015): the η law of every (co-)data type applies to arbitrary (co-)terms of
the type without restriction. However, that procedure is not directly applicable in
the more general setting where the η laws are restricted to (co-)values, since we no
longer assume that data types must follow a call-by-value order and co-data types
must follow a call-by-name order. However, we still assume that each type, be it data
or co-data, must belong to a kind specifying some evaluation order. Thus, we can still
use a type-based approach for evaluation, albeit a more general one, by just checking
the kind of the principle type of interaction in a command. In this sense, the typed
µµ˜ and βη laws can already be generalized to multiple strategies S, giving the typed
µ #»S µ˜ #»ηµηµ˜β
GηG equational theory for multi-strategy (co-)data types G. Of note, we
only need to perform the type-based strategy-lookup during µ or µ˜ substitution:
(µ #»S ) 〈µα.c||E〉 µ #»S c {E/α} (E : A,A : S, E ∈ CoValueS)
(µ˜ #»S ) 〈V ||µ˜x.c〉 µ˜ #»S c {V/x} (V : A,A : S, V ∈ ValueS)
and otherwise restrict the rewriting rules as usual so that both sides have the same type.
The type-restricted rules rely on the type associated with (co-)terms in a command
to decide on the appropriate strategy for deciding values and co-values, thus fixing
an priority between the opposing µ and µ˜ substitution rules. In other words, we can
always use typing information to evaluate a multi-strategy program without falling
back into the fundamental dilemma of classical computation.
175
As an example, consider an application of the typed β law for the data connective
MixedPair as defined previously in Section 5.4. Recall that the typed β laws do not
make reference to the chosen strategy in any way, they are only responsible for breaking
apart structures. This means that the β rules are completely unaffected by the use of
composite strategies. For instance, we may simplify a program using MixedPair in the
same way as the call-by-value ⊗:
〈MPair(v, v′)||µ˜[MPair(x : A, y : B).c]〉 →βMixedPair 〈v||µ˜x : A. 〈v′||µ˜y : B.c〉〉
→µ˜N 〈v||µ˜x : A.c {v′/y : B}〉
Notice that as before, the input abstractions take over for determining evaluation
order in even with multiple primitive strategies, only now the type of the command
comes more directly into play. In this case, we are allowed to substitute v′ for y : B
since v′ : B and B : N , which can be found in the implied typing derivation of the
command, and ValueN includes every term. However, we must first evaluate v before
substituting it for x : A. The implied typing derivation tells us that v : A and A : V,
so v can only be substituted by the µ˜V rule if it has the restrictive form of value given
by ValueV . But the input abstraction for x : A likewise has the type A : V, so it is
already a co-value of CoTermV .
However, since we are only interested in determinism, a full typing discipline is
overkill for the untyped βς theory of (co-)data. After all, neither the parametric core
µS µ˜∫ theory nor the βSςS theory needed to use types to maintain determinism when
instantiated with a single strategy S. Therefore, we use a type-agnostic kind system
for making sure that all commands are well-kinded. By “type-agnostic,” we mean that
we are checking the property v :: S, that is v is a term of some unknown type of kind
S. The kind system for the structural core µµ˜-calculus is shown in Figure 5.17, and
unremarkably resembles the ordinary type system except at “one level up.” The whole
point of the system is shown in the Cut rule that only allows commands between term
and co-term of the same kind, whereas (co-)variables have the kind assumed in the
environment, and input and output abstractions are generic over the kind of variable
they abstract. Furthermore, the additional kinding rules for generic declared (co-)data
types is shown in Figure 5.18. The main property that distinguishes this from an
ordinary type system is that we “forget” the types, effectively collapsing them down
into a single universal type for each kind, similar to a generalized version of Zeilberger’s
176
Γ ∈ InputEnv ::= x1 :: S1, . . . , xn :: Sn ∆ ∈ OutputEnv ::= α1 : S1, . . . , αn : Sn
Judgement ::= c ::
(
Γ `G ∆
)
| (Γ `G v :: S | ∆) | (Γ | e :: S `G ∆)
Core rules:
x :: S `G x :: S | VR | α :: A `G α :: S VL
c ::
(
Γ `G α :: S,∆
)
Γ `G µα.c :: S | ∆ AR
c ::
(
Γ, x :: S `G ∆
)
Γ | µ˜x.c :: S `G ∆ AL
Γ ` v :: S | ∆ Γ′ | e :: S ` ∆′
〈v||e〉 :: (Γ′,Γ ` ∆′,∆) Cut
FIGURE 5.17. Type-agnostic kind system for the core µµ˜ sequent calculus.
(2009) “bi-typed” system, where we now allow for as many base kinds as desired. This
kind system is a relaxation of the full typing regime, in that all well-typed commands
and (co-)terms are well-kinded by demoting the environments x1 : A1, . . . , xn : An
and α1 : B1, . . . , αm : Bm to x1 :: T1, . . . , xn :: Tn and α1 :: R1, . . . , αm : Rm, where
A1 : T1, . . . , An : Tn and B1 : R1, . . . , Bm : Rm in the given typing environment.
Now that we have refined the untyped syntax into the well-kinded sub-syntax,
we can build composite strategies that combine multiple primitive ones. Essentially,
a composite substitution strategy is one whose values and co-values are further sub-
divided into different base kinds. This way, each value in a composite strategy belongs
to exactly one kind of term, corresponding to the particular “primitive” strategy that
it comes from. Furthermore, to get a full composite evaluation strategy, we also need
to compose the evaluation contexts that come from each “primitive” strategy, to get a
single set of evaluation contexts that intermingles them all. In general, we can form the
composite strategy #»Sii = S1, . . . ,Sn as shown in Figure 5.19. As discussed previously
in Remark 5.5, each of the substitution strategies we have considered so far follows
a predictable pattern, so we will first just focus on the core of the strategy without
(co-)data.
For example, combining call-by-value and call-by-name into a single composite
strategy is the most straightforward, and is essentially just a disjoint union of the V
and N strategies, as shown in Figure 5.20, where the “disjointness” is enforced by the
kinding restriction on (co-)values. Note that this combination exactly captures the
177
Given data F( #        »X : k) : Swhere
#                                                                                    »
Ki :
(
#              »
Aij : Tijj ` F( #»X ) | #                »Bij : Rijj
)i
∈ G, we have:
#                                         »
Γ′j | e :: Rij `G ∆′j
j #                                       »
Γj | v :: Tij `G ∆j
j
#»Γj
j
,
#»
Γ′j
j `G Ki( #»e , #»v ) :: S |
#  »∆j
j
,
#  »
∆′j
j
FRKi
#                                                                           »
ci ::
(
Γ, #             »xi :: Tij `G
#               »
αi :: Rij ,∆
)i
Γ | µ˜
[
#                        »Ki( #»αi , #»xi).ci
i
]
:: S `G ∆
FL
Given codataG( #        »X : k) : Swhere
#                                                                                     »
Oi :
(
#              »
Aij : Tijj | G( #»X ) ` #                »Bij : Rijj
)i
∈ G, we have:
#                                                                           »
ci ::
(
Γ, #             »xi :: Tij `G
#               »
αi :: Rij ,∆
)i
Γ `G µ
(
#                       »Oi[ #»xi , #»αi ].ci
i
)
:: S | ∆
GR
#                                       »
Γj | v :: Tij `G ∆j
j #                                         »
Γ′j | e :: Rij `G ∆′j
j
#»Γj
j
,
#»
Γ′j
j | Oi[ #»v , #»e ] :: S `G
#  »∆j
j
,
#  »
∆′j
j
GLOi
FIGURE 5.18. Type-agnostic kind system for multi-kinded (co-)data.
V ∈ Value #»Sii ::= VSi :: Si VSi ∈ ValueSi ::= . . .
E ∈ CoValue #»Sii ::= ESi :: Si ESi ∈ CoValueSi ::= . . .
D ∈ EvalCxt #»Sii ::=  | Di[D] Di ∈ EvalCxtSi ::= . . .
FIGURE 5.19. Composite #»S strategy.
V ∈ ValueP ::= VV :: V | VN :: N E ∈ CoValueP ::= EV :: V | EN :: N
VV ∈ ValueV ::= x EV ∈ CoValueV ::= e
VN ∈ ValueN ::= v EN ∈ CoValueN ::= α
D ∈ EvalCxtP ::=  | 〈||e :: V〉 | 〈V :: V||〉 | 〈v :: N||〉 | 〈||E :: N〉
FIGURE 5.20. Composite core polarized strategy P = V ,N .
178
V ∈ ValueLV,LN ::= VLV :: LV | VLN :: LN
VLV ∈ ValueLV ::= x
VLN ∈ ValueLN ::= x | µα.D[〈VLN ||α〉]
E ∈ CoValueLV,LN ::= ELV :: LV | ELN :: LV
ELV ∈ CoValueLV ::= α | µ˜x.D[〈x||ELV〉]
ELN ∈ CoValueLN ::= α
D ∈ EvalCxt ::=  | 〈v :: LV||µ˜x.D〉 | 〈µα.D||e :: LN〉
| 〈v :: LV||〉 | 〈||E :: LV〉 | 〈||e :: LN〉 | 〈V :: LN||〉
FIGURE 5.21. Composite core LV and LN strategy.
polarized evaluation strategy P for system L in Section 4.3. Combining call-by-need
with its dual is a little more involved, since both the LV and LN substitution strategies
form “closures” over evaluation contexts that can include delayed (co-)terms that have
not yet been evaluated, but whose results should be shared. Thus, to combine these
two strategies, we rely on the merged evaluation contexts of the composite strategy,
as shown in Figure 5.21. Additionally, all four primitive strategies can be combined
into a single composite strategy by taking the disjoint union of the previous two
combinations in the expanded composite syntax, so that the V ,N ,LV ,LN strategy
is defined as the following sets of (co-)values:
ValueV,N ,LV,LN , ValueP ∪ ValueLV,LN
CoValueV,N ,LV,LN , CoValueP ∪ CoValueLV,LN
EvalCxtV,N ,LV,LN , EvalCxtP ∪ EvalCxtLV,LN
Finally, we add (co-)data to all of these composite strategies using the method
described in Remark 5.5: we extend every V alueS with all well-kinded co-case
abstractions and terms of the form K( #»E, #»V ) in TermS , extend every CoV alueS with
all well-kinded case abstractions and co-terms of the form O[ #»V , #»E ] in CoTermS , and
close every V alueS and CoV alueS under ς expansion.
The main goal in tracking the strategy in the kinds is to continue to avoid
the fundamental dilemma of classical computation when mixing strategies. Well-
179
kindedness ensures that we cannot have a command between a term and a co-term
following different primitive strategies, so that the kind restriction is enough to
determine a consistent strategy for every substitution and avoid the fundamental
dillema. For example, it is enough for composite strategies like P or LV ,LN , since it
lets us determine the appropriate strategy to use for every substitution, which prevents
re-introducing the critical pair between µ and µ˜. Furthermore, well-kindedness is
preserved by the untyped reduction theory, so that we only need to begin with a
well-kinded command or (co-)term to ensure that every step stays well-kinded.
Theorem 5.4 (Kind preservation). For all strategies #»S = S1, . . . ,Sn:
a) If c ::
(
Γ `G ∆
)
and c→µ #»S µ˜ #»S ηµηµ˜β #»S ς #»S c
′ then c′ :
(
Γ `G ∆
)
.
b) If Γ `G v :: Si | ∆ and v →µ #»S µ˜ #»S ηµηµ˜β #»S ς #»S v
′ then Γ `G v′ :: Si | ∆.
c) If Γ | e :: Si `G ∆ and e→µ #»S µ˜ #»S ηµηµ˜β #»S ς #»S e
′ then Γ | e′ :: Si `G ∆.
Proof. By (mutual) induction on the kinding derivations c ::
(
Γ `G ∆
)
, Γ `G v ::
Si | ∆, and Γ | e :: Si `G ∆. The cases of the compatible closure of the base 
rewriting rules follow directly from the inductive hypothesis, and the base cases for
the  rewriting rules follows from the fact that well-kindedness is preserved under
substitution, i.e. that for any Γ′ `G V :: Si | ∆′ and Γ′ | E :: Si `G ∆′,
1. c ::
(
Γ, x :: Si `G ∆
)
implies c {V/x} ::
(
Γ,Γ′ `G ∆,∆′
)
and c ::(
Γ `G α :: Si,∆
)
implies c {E/α} ::
(
Γ,Γ′ `G ∆,∆′
)
,
2. Γ, x :: Si `G v :: Sj | ∆ implies Γ,Γ′ `G v {V/x} :: Sj | ∆,∆′ and Γ `G v :: Sj |
α :: Si,∆ implies Γ,Γ′ `G v {E/α} :: Sj | ∆,∆′, and
3. Γ, x :: Si | e :: Sj `G ∆ implies Γ,Γ′ | e {V/x} :: Sj `G ∆,∆′ and Γ | e :: Sj `G
α :: Si,∆ implies Γ,Γ′ | e {E/α} :: Sj `G ∆,∆′, and
each of which follows by induction on the kinding derivations for c, v, and e.
This means that we can safely compute the result of any untyped command
or (co-)term so long as it is well-kinded to begin with. Returning to the MixedPair
example, if we begin with the well-kinded command 〈MPair(V, V ′)||µ˜[MPair(x, y).c]〉,
180
then we know that V :: V and V ′ :: N , so V cannot be an output abstraction but V ′
can be due to the kinded definition of ValueP . This gives us the reduction
〈MPair(V, V ′)||µ˜[MPair(x, y).c]〉 →βP c {V/x, V ′/y}
which induces the combined substitution of the V-value V and N -value V ′, resulting
in the command c {V/x, V ′/y} of the same kind that we started with.
Duality of Connectives and Evaluation
Having laid out a general system for both data and co-data and with the possibility
of intermingling multiple evaluation strategies, we now rephrase the duality of the
sequent calculus. In particular, given any instance of the parametric µµ˜-calculus, we are
able to automatically generate its dual instance, such that the two are isomorphic to one
another by the involutive duality operation. Additionally, particular application of the
duality-generating operation recapitulates the previous results of duality in the sequent
calculus, giving a single setting for summarizing the study of computational duality.
Effectively, duality applies in both the static world of types as well as the dynamic
world of programs. In types, duality expresses the opposing purpose of assumption
and conclusion on the two sides of a sequent. In programs, duality expresses the
opposing purpose of production and consumption on the two sides of a command.
Thus, the entailment (`) of a sequent and the dividing line of a command provide the
fundamental pole about which opposing entities turn in their dance of duality.
The main difference from before is that we now have many sources for names that
must be dualized. Types and programs in the parametric sequent calculus contain
a variety of names—free variables and co-variables, constructors and observers, and
connectives for data and co-data types. These names are arbitrary identifiers which
ultimately do not impact the meaning of types or programs. However, to examine
duality we must relate pairs of these arbitrary names. Therefore, we build our duality
on a given relationship between dual names, written as an overline. Recall that in both
the dual calculi (Chapter III) and system L (Chapter IV), duality swaps variables with
co-variables, and vice versa. Formally, this is represented by an assumed bijection, x and
α, between the two dual variable sets. But in the parametric µµ˜-calculus, (co-)variables
aren’t the only names we must think about; we also have to do something about the
names of connectives (F) as well as the names of constructors and observers (K and
181
O). Therefore, we also assume a bijection between constructors and observers, K and
O, as well as a bijection between connective names, F. Additionally, for multi-kinded
programs we need a bijection S between the names for base kinds. As shorthand, we
may use the dual identifier relation ∼ which identifies the chosen duals to the various
bijections, so that x ∼ α means x = α and α = x and so on for the other namespaces.
With the bijections between names at hand, we first consider the duality of types
as shown in Figure 5.22. As before, the static aspect of this duality is exactly the usual
form of logical duality of the sequent calculus, where the input environment, Γ, is
swapped with the output environment, ∆, in a sequent. Duality of the environments is
defined pointwise, so for every variable x : A we associate a dual co-variable denoted x :
A⊥, and likewise every co-variable α : A is associated with a dual variable denoted α :
A⊥. Duality of the kinding environments from Figure 5.15 for multi-kinded programs
is similar, except that instead of types we have base kinds, and the dual of S is S.
In sequents, terms swap places with co-terms and vice versa. For example, the dual
of a closed term, `G v : A | or `G v :: S | , is a closed co-term, | v⊥ : A⊥ `G⊥ or
| v⊥ :: S⊥ `G⊥ . Going the other way, a type derivation of a closed co-term, | e : A `G
or | e :: S `G , is dualized as a type derivation of a closed term, `G⊥ e⊥ : A⊥ |
or `G⊥ e⊥ :: S⊥ | . Commands, which sit outside of the sequent, stay in place and
instead describe the dynamic aspect of dualization inside a program.
Each data type declaration is dual to a co-data type declaration, and vice versa.
On the one hand, the constructors, Ki, of data type declaration become the observers
of the dual co-data type declaration, denoted Ki. On the other hand, the observers,
Oi, of a co-data type declaration become the constructors of the dual data type
declaration, denoted O⊥i . Furthermore, the sequents describing each constructor are
also reversed by the duality operation on sequents, similar to the action of duality
on typing judgements. For example, Figure 5.6 shows several dual data and co-data
declarations side-by-side, so that set of declarations is self-dual, under the following
dualization relationship for names:
⊕ ∼ & ι1 ∼ pi1 ι2 ∼ pi2
⊗ ∼ ` ( , ) ∼ [ , ]
1 ∼ ⊥ () ∼ []
0 ∼ >
− ∼→ · ∼ ·
182
Duality of environments:
( #        »X : k)⊥ ,
#            »
X : k⊥ ( #     »decl)⊥ ,
#        »
decl⊥
( #       »x : A)⊥ ,
#           »
x : A⊥ ( #         »x :: S )⊥ , #         »x :: S
( #        »α : A)⊥ ,
#           »
α : A⊥ ( #         »α :: S )⊥ , #             »α1 :: S1
Duality of sequents:(
c :
(
Γ `ΘG ∆
))⊥
, c⊥ :
(
∆⊥ `Θ⊥G⊥ Γ⊥
) (
c ::
(
Γ `G ∆
))⊥
, c⊥ ::
(
∆⊥ `G⊥ Γ⊥
)
(
Γ `ΘG v : A | ∆
)⊥
, ∆⊥ | v⊥ : A⊥ `Θ⊥G⊥ Γ⊥
(
Γ `G v :: S | ∆
)⊥
, ∆⊥ | v⊥ :: S `G⊥ Γ⊥(
Γ | e : A `ΘG ∆
)⊥
, ∆⊥ `Θ⊥G⊥ e⊥ : A⊥ | Γ⊥
(
Γ | e :: S `G ∆
)⊥
, ∆⊥ `Θ⊥G⊥ e⊥ :: S | Γ⊥
Duality of declarations:

data F( #        »X : k) : Swhere
K1 :
#            »
A1 : T1 ` F( #»X ) | #              »B1 : R1
. . .
Kn :
#             »
An : Tn ` F( #»X ) | #               »Bn : Rn

⊥
,
codata F(
#            »
X : k⊥) : Swhere
K1 :
#               »
B⊥1 : R1 | F(
#»
X ) ` #             »A⊥1 : T1
. . .
K⊥n :
#                »
B⊥n : Rn | F(
#»
X ) ` #              »A⊥n : Tn
codataG(
#            »
X : k⊥) : Swhere
O1 :
#            »
A1 : T1 | G( #»X ) ` #              »B1 : R1
. . .
On :
#             »
An : Tn | G( #»X ) ` #               »Bn : Rn

⊥
,
dataG(
#            »
X : k⊥) : Swhere
O1 :
#               »
B⊥1 : R1 ` G(
#»
X ) | #             »A⊥1 : T1
. . .
On :
#                »
B⊥n : Rn ` G(
#»
X ) | #              »A⊥n : Tn
Duality of types and kinds:
S⊥ , S X⊥ , X F( #»A)⊥ , F( #  »A⊥) G( #»A)⊥ , G( #  »A⊥)
FIGURE 5.22. The duality of types of the parametric µµ˜-calculus.
183
Duality of the core calculus:
〈v||e〉⊥ ,
〈
v⊥
∣∣∣∣∣∣e⊥〉
x⊥ , x
(µα.c)⊥ , µ˜α.c⊥
α⊥ , α
[µ˜x.c]⊥ , µx.c⊥
Duality of data and co-data:
K( #»e , #»v )⊥ , K[
# »
e⊥ ,
# »
v⊥ ] µ˜[K( #»α , #»x ).c | . . .]⊥ , µ
(
K[ #»α , #»x ].c⊥ | · · ·
)
O[ #»v , #»e ]⊥ , O(
# »
v⊥ ,
# »
e⊥) µ(O[ #»x , #»α [.c | . . .)⊥ , µ˜
[
O( #»x , #»α ).c⊥ | · · ·
]
FIGURE 5.23. The duality of programs of the parametric µµ˜-calculus.
∼ ∼ ¬ ∼ ∼ ¬
The duality between types is defined inductively on the structure of the types, such
that all data connectives F are replaced with their dual co-data connectives F, as
described above, and vice versa.
Next, we move on to consider the effect of duality on programs as shown in
Figure 5.23. In the core of the µµ˜-calculus, every command 〈v||e〉 is dual to another
command representing the flipped version of itself
〈
v⊥
∣∣∣∣∣∣e⊥〉, variables are dual to
co-variables, and input abstractions and output abstractions are dual to one another.
On the constructive side of data and co-data, every data structure K( #»e , #»v ) dual is a
co-data observation K[
# »
e⊥ ,
# »
v⊥ ] and every co-data observation O[ #»v , #»e ] is dual to a data
structure O(
# »
v⊥ ,
# »
e⊥). On the destructive side of data and co-data, every case analysis
on a data structure is dual to a co-data object, and every co-data object is dual to a
case analysis on a data structure.
Example 5.1. Let’s consider how to swap the results of a product:
swapx , µ(pi1[α].〈x||pi2[α]〉 | pi2[β].〈x||pi1[β]〉)
swapx,γ , 〈swapx||γ〉 , 〈µ(pi1[α].〈x||pi2[α]〉 | pi2[β].〈x||pi1[β]〉)||γ〉
Given that x stands for a value of B & A, then swapx is a term of A & B such that
whenever we ask for the pi1 of swapx we get the pi2 of x, and whenever we ask for the
184
pi2 of swapx, we get the pi1 of x. The command swapx,γ then represents a program that
sends the request γ : A&B to the swapped product swapx.
The duality of the sequent calculus lets us turn this program around, so that we
are calculating with data instead of co-data. First, we need to specify how names
are treated in order to generate the dualized program. For the connectives and
(co-)constructors, we use the naming convention relating products (&) and sums
(⊕)
⊕ ∼ & ι1 ∼ pi1 ι2 ∼ pi2
along with the following bijection between the variables and co-variables involved:
x ∼ α′ x′ ∼ α y′ ∼ β z′ ∼ γ
What we get out from duality is then a program that swaps an injection:
swap⊥x , µ˜[ι1(x′).〈ι2(x′)||α′〉 | ι2(y′).〈ι1(y′)||α′〉]
swap⊥x,γ ,
〈
z′
∣∣∣∣∣∣swap⊥x 〉 , 〈z′||µ˜[ι1(x′).〈ι2(x′)||α′〉 | ι2(y′).〈ι1(y′)||α′〉]〉
In particular, z′ stands for a value of type A⊥ ⊕ B⊥, and α′ stands for a co-value of
type B⊥ ⊕ A⊥. The co-term swap⊥x consumes an input of type A⊥ ⊕ B⊥ and swaps
the injection tag, turning ι1(x′) into ι2(x′) or turning ι2(y′) into ι1(y′), in order to
pass a value of type B⊥ ⊕ A⊥ along to α′. The whole command swap⊥x,γ then feeds
z′ into the consumer v⊥0 . Notice how even though the roles of input and output have
been exchanged by the duality operation, so that requests become results, the overall
structure in the dual program follows the same pattern as before. End example 5.1.
The final piece to the puzzle is to determine the effect of duality on the
strategy parameter(s) to the parametric µµ˜-calculus. Fortunately, this duality is
straightforward, since the strategy S is just a set of terms and co-terms (the (co-)values
of the substitution strategy component of S) and contexts (the evaluation contexts
of S). Thus, this final duality is achieved by applying the defined duality operation
pointwise. Given a substitution strategy S whose values are given by the set ValueS
and co-values are given by the set CoValueS , then we can automatically generate the
dual substitution strategy S⊥ by swapping values with co-values, so that the values,
185
ValueS⊥ , and co-values, CoValueS⊥ , of S⊥ are defined as:
ValueS⊥ , {E⊥ | E ∈ CoValueS} CoValueS⊥ , {V ⊥ | V ∈ ValueS}
Additionally, for a full evaluation strategy S, we can automatically generate the dual
evaluation strategy by dualizing the substitution strategy component of S as well as
its evaluation contexts EvalCxtS as follows:
EvalCxtS⊥ , {D⊥ | D ∈ EvalCxtS}
where the duality operation is generalized to contexts in the obvious way by taking
⊥ = . For example, dualizing the call-by-value strategy V generates the call-by-
name strategy N and vice versa, and similarly for the call-by-need strategy LV and
its dual:
V⊥ = N N⊥ = V LV⊥ = LN LN⊥ = LV
Also, the unrestricted strategy U is self-dual, so that U⊥ = U .
With all the dualities in place, we can now verify that the duality operation
satisfies the properties we would expect. Firstly, the duality operation is involutive
at all levels, so that the double-dual is an identity operation for any chosen bijection
between dual namespaces.
Theorem 5.5 (Involutive duality). The ⊥ operation on environments, sequents,
declarations, types, commands, and (co-)terms is involutive, so that ⊥⊥ is the identity
transformation.
Proof. By (mutual) induction on the definition of the duality operation ⊥, where
each case follows immediately by the inductive hypothesis.
Secondly, the duality operation respects the static semantics of the parametric
µµ˜-calculus, so that typing of commands and (co-)terms is preserved.
Theorem 5.6 (Static duality). If the typing judgement J (from Figures 5.8, 5.15,
5.16, 5.17 and 5.18) is derivable then J⊥ is.
186
Proof. By induction on the derivation of J , where in each case we must show that for
some conclusion J ′, premises H1, . . . , Hn, and inference rule I, the derivation of
H1 . . . Hn
J ′ I
and the inductive hypothesized derivations of H⊥1 , . . . , H⊥n implies the derivation of
H⊥1 . . . H
⊥
n
J ′⊥
I⊥
where I⊥ is the dual inference rule to I, which we define as follows for both the type
and kind system for programs
VR⊥ , VL VL⊥ , VR AR⊥ , AL AL⊥ , AR Cut⊥ , Cut
FR⊥K , FLK GL⊥O , GRO FL⊥ , FR GR⊥ , GL
WR⊥ ,WL WL⊥ ,WR CR⊥ , CL CL⊥ , CR XR⊥ , XL XL⊥ , XR
and the kind system for types
data⊥ , codata codata⊥ , data TV⊥ , TV FT⊥ , FT
The cases for the left and right rules of (co-)data (FR and FL) follow from the
inductive hypotheses and the fact that substitution of types commutes with duality
(A⊥
{
B⊥/X
}
=α A {B/X}⊥), which is guaranteed because the duality operation is
compositional and hygienic (Downen & Ariola, 2014a). The rest of the cases follow
immediately from the inductive hypotheses.
Thirdly, the duality operation respects the dynamic aspect of the parametric µµ˜-
calculus, so that it preserves the rewriting rules between commands and (co-)terms.
Theorem 5.7 (Equational duality). For any (possibly composite) strategy S and set
of declarations G,
a) if c RGS c
′ then c⊥ 
RG⊥S⊥
c′⊥,
b) if v RGS v
′ then v⊥ 
RG⊥S⊥
v′⊥, and
187
c) if e RGS e
′ then e⊥ 
RG⊥S⊥
e′⊥,
whenever RGS = µS µ˜Sηµηµ˜, RGS = βGηG, or RGS = βSςS .
Proof. By cases on each possible rewriting rules, using the more specific fact that
a) if c R c′ then c⊥ R⊥ c′⊥,
b) if v R v′ then v⊥ R⊥ v′⊥, and
c) if e R e′ then e⊥ R⊥ e′⊥,
where the dual of each rewriting rule R is defined as follows:
µS⊥ , µ˜S⊥ µ˜⊥S , µS⊥ ηµ⊥ , ηµ˜ ηµ˜⊥ , ηµ
βG
⊥ , βG⊥ ηG⊥ , ηG⊥
βS
⊥ , βS⊥ ςS⊥ , ςS⊥
Each case follows by the definition of the rewriting rules, the definition of the
duality operation on strategies S and declarations G, and the fact that substitution
commutes with the duality operation (that c⊥
{
V ⊥/x
}
=α (c {V/x})⊥, c⊥
{
E⊥/α
}
=α
(c {E/α})⊥, and similarly for (co-)terms) which is guaranteed by the fact that the
duality operation is compositional and hygienic (Downen & Ariola, 2014a).
Remark 5.7. Note that the duality operation discussed here does not just compare two
existing languages, as in previous work on computational duality (Curien & Herbelin,
2000; Wadler, 2003), but it actively generates the dual language to any instance of the
parametric sequent calculus. Thus, we can use this operation to create the dual to any
strategy of our choice. For example, applying the duality operation to the call-by-need
strategy LV from Figures 5.3 and 5.10 generates the dual to call-by-need evaluation
from Figures 5.4 and 5.10. Intuitively, the dual of call-by-need delays computation of
consumers and prioritizes producers. Then we switch attention to a consumer only
when we have a value to return to it. However, we do not copy complex consumers, like
the way control operators in Scheme-like languages copy arbitrary call-stacks. Rather,
we memoize such call-stacks, so that control operations cannot duplicate extra work
inside of a continuation. And in fact, this is essentially how the “lazy call-by-name”
evaluation strategy was developed by Ariola et al. (2011). The parametric µµ˜-calculus
generalizes the procedure to any starting evaluation strategy. End remark 5.7.
188
A (De-)Construction of the Dual Calculi
We have now seen a general language of the sequent calculus for studying a
wide variety of types. Each type is characterized by two actions: building up a
structure by construction, and analyzing the shape of a structure by deconstruction.
The types are primarily categorized by the way they orient these actions along
the producer-consumer protocol: data types produce via construction and consume
via deconstruction, and co-data types produce via deconstruction and consume via
construction. This viewpoint aligns neatly with system L from Chapter IV. In fact,
polarized system L corresponds exactly to the P instance of the parametric µµ˜-calculus
with the (co-)data type declarations from Figures 5.13 and 5.14. It also aligns with
the treatment of functions and polymorphism in the dual calculi: implication and the
universal quantifier are both co-data types (in both call-by-value and call-by-name)
that have constructed call-stacks and deconstructive λ- and Λ−abstractions, whereas
the existential quantifier is a data type with constructed packages and deconstructive
Λ˜-abstractions. However, the rest of the types in the dual calculi do not seem to follow
this pattern: both the terms and co-terms of every type appear to be constructed,
with no deconstructive pattern-matching to be found.
As it turns out, however, even the dual calculi’s construction-oriented sequent
calculus still follows the construction-deconstruction discipline, albeit indirectly. More
formally, the simply-typed sub-language of the dual calculi (i.e. without the quantifiers)
and the appropriate instances of the parametric µµ˜-calculus are in equational
correspondence (Sabry & Felleisen, 1992) with one another. This means that every
command and (co-)term of the dual calculi can be translated to the µµ˜-calculus,
and vice versa, such that the two translations are inverses of each other, up to the
equational theory, and the equations of each calculus are preserved by translation. In
other words, the dual calculi can be seen as syntactic sugar by macro-expansion for a
particular use-case of the µµ˜-calculus.
Since the dual calculi really stands for a pair of two separate but dual sequent
calculi—one for call-by-value and one for call-by-name—we need two translations
into two different instances of the parametric µµ˜-calculus. Because we have several
representations of conjunction, disjunction, and negation as (co-)data types in the
µµ˜-calculus, as shown in Figure 5.6, our task requires us to determine which particular
types correspond to the dual calculi’s characterization in both call-by-value and call-by-
name. Furthermore, since we aim to achieve an equational correspondence, our choice
189
of (co-)data types must respect both the computational (β rules) and extensional (η
rules) aspects of the types found in the dual sequent calculi.
First, let’s focus on the call-by-value half of the dual calculi. To represent call-by-
value conjunction, we will use the A⊗B data type. On the one hand, the terms for
conjunction, (v1, v2), translate directly to a constructed pair of A⊗B. On the other
hand, the co-terms for conjunction, pi1[e] and pi2[e], need to be expressed as the basic
deconstructions on an input of type A⊗B which extract one component of a pair:
pi1[e] ≈ µ˜[(x, ).〈x||e〉]
pi2[e] ≈ µ˜[( , y).〈y||e〉]
The representation of call-by-value disjunction is similar, for which we use the A⊕B
data type. On the one hand, the terms for disjunction, ι1(v) and ι2(v), translate
directly to the constructed values of the sum type A ⊕ B. On the other hand, the
co-terms for disjunction, [e1, e2], need to be expressed as the basic deconstruction on
an input of type A⊕B which checks which of the two constructors was used:
[e1, e2] ≈ µ˜[ι1(x).〈x||e1〉 | ι2(y).〈y||e2〉]
Finally, we represent the call-by-value negation with the function-like co-data type
¬A. On the one hand, the terms for negation, not(e), need to be expressed as the
basic deconstruction on an output of type ¬A:
not(e) ≈ µ(¬ [x].〈x||e〉)
On the other hand, the co-terms for negation, not[v], translate directly to the
constructed co-values of the type ¬A. Intuitively, the role of negation in the call-
by-value half of the dual calculi is to represent functions from the call-by-value λ-
calculus, as used by Wadler’s (2003) call-by-value encoding. Thus, we choose the
form of negation that most resembles functions: the values of ¬A are function-like
abstractions that accept an input but do not return a result.
Having seen how to embed the call-by-value half of the dual calculi into the
µµ˜⊕,⊗,¬,→V -calculus, we also need to translate back. As before, the constructed terms of
A⊗B and A⊕B, as well as the constructed co-terms of ¬A, translate directly. The only
interesting part of the translation is in encoding deconstruction as the constructive
190
forms. Translating the deconstructive terms of ¬A is straightforward, and only requires
us to place a generic input abstraction inside of the negation constructor:
µ(¬ [x].c) ≈ not(µ˜x.c)
Likewise, translating the deconstructive co-terms of A ⊕ B requires us to form a
co-term pair of two generic input abstractions:
µ˜[ι1(x).c1 | ι2(y).c2] ≈ [µ˜x.c1, µ˜y.c2]
Translating a deconstructive co-term of A⊗B is the most involved, since it requires
us to copy its input in order to extract both the first and second components one at a
time. This can be achieved by naming its input with an input abstraction, and using
both pi1 and pi2 on it:
µ˜[(x, y).c] ≈ µ˜z. 〈z||pi1[µ˜x. 〈z||pi2[µ˜y.c]〉]〉
The full translation between call-by-value half of the dual calculi and the µµ˜⊕,⊗,¬,→V
instance of the parametric sequent calculus is shown in Figure 5.24.
Second, let’s consider the call-by-name half of the dual calculi. Contrary to the call-
by-value case, we will choose the opposite representations of conjunction, disjunction
and negation from the (co-)data types listed in Figure 5.6: conjunction is A & B,
disjunction is A`B, and negation is ∼A. Likewise, the translations follow an opposite
story as before: the co-terms of A&B and A`B and terms of ∼A translate directly,
whereas the terms of A&B and A`B and co-terms of ∼A require more work. The
disjunctive and conjunctive terms for the call-by-name calculus are translated as:
ι1(v) ≈ µ([α, ].〈α||v〉)
ι2(v) ≈ µ([ , β].〈β||v〉)
(v1, v2) ≈ µ(pi1[α].〈v1||α〉 | pi2[β].〈v2||β〉)
and the negative co-terms for the call-by-name calculus are translated as:
not[v] ≈ µ˜[∼ (α).〈v||α〉]
191
〈v||e〉?v , 〈v?v ||e?v〉
x?v , x
(µα.c)?v , µα.c?v
ιi(v)?v , ιi(v?v)
(v1, v2)?v , (v1?v, v2?v)
not(e)?v , µ(¬ [x].〈x||e?v〉)
(λx.v)?v , µ([x · β].〈v?v ||β〉)
α?v , α
[µ˜x.c]?v , µ˜x.c?v
pii[e]?v , µ˜[(x1, x2).〈xi||e?v〉]
[e1, e2]?v , µ˜[ι1(x).〈x||e1?v〉 | ι2(y).〈y||e2?v〉]
not[v]?v , ¬ [v?v ]
[v · e]?v , v?v · e?v
〈v||e〉v? , 〈vv? ||ev?〉
xv? , x
(µα.c)v? , µα.cv?
ιi(v)v? , ιi(vv?)
(v1, v2)v? , (v1v?, v2v?)
µ(¬ [x].c)v? , not(µ˜x.cv?)
µ([x · β].c)v? , λx.µβ.cv?
αv? , α
[µ˜x.c]v? , µ˜x.cv?
µ˜[ι1(x).c1 | ι2(y).c2]v? , [µ˜x.c1v?, µ˜y.c2v?]
µ˜[(x, y).c]v? , µ˜z. 〈z||pi1[µ˜x. 〈z||pi2[µ˜y.cv?]〉]〉
¬ [v]v? , not[vv? ]
[v · e]v? , vv? · ev?
FIGURE 5.24. Translation between the call-by-value half of the simply-typed dual
calculi and µµ˜⊕,⊗,¬,→V .
192
Going the other way, the deconstructive co-term of ∼A is translated as a negated
output abstraction:
µ˜[∼ (α).c] ≈ not[µα.c]
and the deconstructive term of A&B is translated as a pair of output abstractions:
µ(pi1[α].c1 | pi2[β].c2) ≈ (µα.c1, µβ.c2)
As before with call-by-value conjunction, translating terms of call-by-name disjunction
is more involved in the dual way, requiring us to copy the output in order to extract
both components one at a time. This can be achieved by naming its output with an
output abstraction and using both ι1 and ι2 on it:
µ([α, β].c) ≈ µγ. 〈ι1(µα. 〈ι2(µβ.c)||γ〉)||γ〉
The full translation between the call-by-name half of the dual calculi and the µµ˜&,`,∼,→N
instance of the parametric sequent calculus is shown in Figure 5.25.
With the full translations to and from the dual calculi and instances of the
parametric µµ˜-calculus, we have a correspondence between the dual calculi respecting
their equational theories. In particular, the βη theory of (co-)data in the parametric
µµ˜-calculus corresponds to an appropriate βης theory for the dual calculi. To that
point, we need to extend the dual calculi with η laws as well as with additional values
in call-by-value and additional co-values in call-by-name as shown in Figure 5.26,
which is based on the semantics for the dual calculi by Wadler (2005). The extra
values in V ′ extend those in V to say that the result of projecting out of a value is
itself a value, which makes intuitive sense by the meaning of call-by-value. The extra
co-values in N ′ extend those in N to say that forcing a tagged injection also forces its
payload, which may not be so obvious intuitively, but is still semantically sound by
the interpretation of sum types in the call-by-name dual calculus. With this extension
to the dual calculi, we get an equational correspondence.
Theorem 5.8. – The call-by-value half of the simply-typed dual calculi is in
equational correspondence with the µµ˜⊕,⊗,¬,→V -calculus.
– The call-by-name half of the simply-typed dual calculi is in equational
correspondence with the µµ˜&,`,∼,→N -calculus.
193
〈v||e〉?n , 〈v?n||e?n〉
x?n , x
(µα.c)?n , µα.c?n
ιi(v)?n , µ([α1, α2].〈v||αi〉)
(v1, v2)?n , µ(pi1[α].〈v1||α〉 | pi2[β].〈v2||β〉)
not(e)?n , ∼ (e?n)
(λx.v)?n , µ([x · β].〈v?v ||β〉)
α?n , α
[µ˜x.c]?n , µ˜x.c?n
pii[e]?n , pii[e?n]
[e1, e2]?n , [e1?n, e2?n]
not[v]?n , µ˜[∼ (α).〈v||α〉]
[v · e]?n , v?n · e?n
〈v||e〉n? , 〈vn? ||en? 〉
xn? , x
(µα.c)n? , µα.cn?
µ(pi1[α].c1 | pi2[β].c2)n? , (µα.c1n? , µβ.c2n? )
µ([α, β].c)n? , µγ. 〈ι1(µα. 〈ι2(µβ.cn? )||γ〉)||γ〉
∼ (e)n? , not(en? )
µ([x · β].c)n? , λx.µβ.cn?
αn? , α
[µ˜x.c]n? , µ˜x.cn?
pii[e]n? , pii[en? ]
[e1, e2]n? , [e1n? , e2n? ]
µ˜[∼ (α).c]n? , not[µ˜x.cn? ]
[v · e]n? , vn? · en?
FIGURE 5.25. Translation between the call-by-name half of the simply-typed dual
calculi and µµ˜&,`,∼,→N .
Call-by-value extended values (V ′):
V ∈ ValueV ′ ::= . . . | µα. 〈V ||pi1 [α]〉 | µα. 〈V ||pi2 [α]〉
Call-by-name extended co-values (N ′):
E ∈ CoValueN ′ ::= . . . | µ˜x. 〈ι1 (x)||E〉 | µ˜x. 〈ι2 (x)||E〉
η laws for both call-by-value (S = V ′) and call-by-name (S = N ′):
(η×S ) V : A×B ≺η×S (µα. 〈V ||pi1 [α]〉, µβ. 〈V ||pi2 [β]〉) (α, β /∈ FV (V ))
(η⊕S ) E : A⊕B ≺η⊕S [µ˜x. 〈ι1 (x)||E〉, µ˜y. 〈ι2 (y)||E〉] (x, y /∈ FV (E))
(η¬S ) V : ¬A ≺η¬S not(µ˜x. 〈V ||not[x]〉) (x /∈ FV (V ))
(η→S ) V : A→ B ≺η→S λx.µβ. 〈V ||x · β〉 (x, β /∈ FV (V ))
FIGURE 5.26. The η laws for the dual calculi and extended (co-)values (V ′,N ′).
194
Proof. a) To demonstrate the call-by-value equational correspondence, we must
prove the following conditions
(1) The translations ( )?v and ( )
v
? are inverses up to the respective equational
theories of the two calculi: cv??v = c in the µµ˜
⊕,⊗,¬,→
V -calculus and c?vv? = c in
the call-by-value dual calculus, and similarly for (co-)terms.
(2) The two equational theories are sound under translation with respect to
each other: c = c′ in the µµ˜⊕,⊗,¬,→V -calculus implies cv? = c′
v
? in the call-by-
value dual calculus and c = c′ in the call-by-value dual calculus implies
c?v = c′
?
v in the µµ˜
⊕,⊗,¬,→
V -calculus, and similarly for (co-)terms.
The inversion of the translation follows by induction on the syntax of
both languages. In each direction, the round-trip translation of the core µµ˜
sublanguage (commands, (co-)variables, and µ- and µ˜-abstractions), as well as
the round-trip translation of injections, pairs, negation co-terms, and call stacks,
follows directly by the inductive hypothesis. The other cases for the round-trip
translation of the µµ˜⊕,⊗,¬,→V -calculus are:
µ(¬ [x].c)v??v =IH µ(¬ [x].〈x||µ˜x.c〉) =µ˜V µ(¬ [x].c)
µ([x · β].c)v??v =IH µ([x · β].〈µβ.c||β〉) =µV µ([x · β].c)
µ˜[ι1 (x).c1 | ι2 (y).c2]v??v =IH µ˜[ι1 (x).〈x||µ˜x.c1〉 | ι2 (y).〈y||µ˜y.c2〉]
=µ˜V µ˜[ι1 (x).c1 | ι2 (y).c2]
µ˜[(x, y).c]v?
?
v =IH µ˜z. 〈z||µ˜[(x, ).〈x||µ˜x. 〈z||µ˜[( , y).〈y||µ˜y.c〉]〉〉]〉
=µ˜V µ˜z. 〈z||µ˜[(x, ).〈z||µ˜[( , y).c]〉]〉
=η⊗V µ˜[(x, y).〈(x, y)||µ˜z. 〈z||µ˜[(x, ).〈z||µ˜[( , y).c]〉]〉〉]
=µ˜V µ˜[(x, y).〈(x, y)||µ˜[(x, ).〈(x, y)||µ˜[( , y).c]〉]〉]
=β⊗S µ˜[(x, y).c]
where the most interesting case is for the round-trip of a case abstraction on
a pair, which requires the βη laws for ⊗ to simplify. The other cases for the
round-trip translation of the call-by-value dual calculus are:
not(e)?v
v
? =IH not(µ˜x. 〈x||e〉) =ηµ˜ not(e)
195
λx.v?v
v
? =IH λx.µβ. 〈v||β〉 =ηµ λx.v
pi1 [e]?v
v
? =IH µ˜z. 〈z||pi1 [µ˜x. 〈z||pi2 [µ˜y. 〈x||e〉]〉]〉
=µV µ˜z. 〈z||pi1 [µ˜x. 〈µβ. 〈z||pi2 [β]〉||µ˜y. 〈x||e〉〉]〉
=µ˜V′ µ˜z. 〈z||pi1 [µ˜x. 〈x||e〉]〉 =ηµ˜ pi1 [e]
pi2 [e]?v
v
? =IH µ˜z. 〈z||pi1 [µ˜x. 〈z||pi2 [µ˜y. 〈y||e〉]〉]〉
=µV µ˜z. 〈µα. 〈z||pi1 [α]〉||µ˜x. 〈z||pi2 [µ˜y. 〈y||e〉]〉〉
=µ˜V′ µ˜z. 〈z||pi2 [µ˜y. 〈y||e〉]〉 =ηµ˜ pi2 [e]
[e1, e2]?v
v
? =IH [µ˜x. 〈x||e1〉, µ˜y. 〈y||e2〉] =ηµ˜ [e1, e2]
where the most interesting cases are the pi1 and pi2 projections which requires
the extended notion of values in V ′ to simplify.
The soundness of equations follows by cases on the possible rewrite rules of the
respective equational theories, which may make use of the facts that substitution
commutes with translation (since both translations are compositional and
hygienic (Downen & Ariola, 2014a)) and V (co-)values translate to V (co-)values
in both directions. The cases for the core µV , µ˜V , ηµ, and ηµ˜ rules are immediate
since they are the same in both calculi. The one tricky issue in relating the core
µµ˜ calculus is the extended notion of V ′ value, which does not translate to a
value in µµ˜V . Thankfully, these extra terms are still semantically substitutable
within the µµ˜⊕,⊗,¬,→V equational theory. In particular, we have the following
derived equality for µ˜V ′ within µµ˜⊕,⊗,¬,→V by induction on the values of V ′. The
case for a first projection value is
〈µα. 〈V ||pi1 [α]〉||µ˜z.c〉?v
= 〈µα. 〈V ?v ||µ˜[(x, y).〈x||α〉]〉||µ˜x.c?v〉
=µV 〈V ?v ||µ˜[(x, y).〈z||µ˜z.c?v〉]〉
=µ˜V 〈V ?v ||µ˜[(x, y).c?v {x/z}]〉
=ηµ 〈V ?v ||µ˜[(x, y).c?v {µα. 〈x||α〉/z}]〉
=β⊗V 〈V
?
v ||µ˜[(x, y).c?v {µα. 〈(x, y)||µ˜[(x, y).〈x||α〉]〉/z}]〉
=β⊗V 〈V
?
v ||µ˜[(x, y).〈(x, y)||µ˜z′.c?v {µα. 〈z′||µ˜[(x, y).〈x||α〉]〉/z}〉]〉
=η⊗V 〈V
?
v ||µ˜z′.c?v {µα. 〈z′||µ˜[(x, y).〈x||α〉]〉/z}〉
196
=IH c?v {µα. 〈V ?v ||µ˜[(x, y).〈x||α〉]〉/z}
= c?v {(µα. 〈V ||pi1 [α]〉)?v/z} = (c {µα. 〈V ||pi1 [α]〉/z})?v
and the case for a second projection value is similar. What remains is to check
the soundness of the rewrite rules for each connective. The ς rules are the same in
both calculi, so they translate directly. Going from µµ˜⊕,⊗,¬,→V to the call-by-dual
calculus, we have:
(β⊕) 〈ιi (v)||µ˜[ι1 (x1).c1 | ι2 (x2).c2]〉v?
= 〈ιi (vv?)||[µ˜x1.c1v?, µ˜x2.c2v?]〉 =ς⊕V µ˜V 〈v
v
? ||µ˜z. 〈ιi (z)||[µ˜x1.c1v?, µ˜x2.c2v?]〉〉
=β⊕V 〈v
v
? ||µ˜z. 〈z||µ˜xi.civ?〉〉 =µ˜V 〈vv? ||µ˜xi.civ?〉 = 〈v||µ˜xi.ci〉v?
(β⊗) 〈(v1, v2)||µ˜[(x, y).c]〉v?
= 〈(v1v?, v2v?)||µ˜z. 〈z||pi1 [µ˜x. 〈z||pi2 [µ˜y.cv?]〉]〉〉
=ς×V µV 〈v1
v
?||µ˜x. 〈(x, v2v?)||µ˜z. 〈z||pi1 [µ˜x. 〈z||pi2 [µ˜y.cv?]〉]〉〉〉
=ς×V µV 〈v1
v
?||µ˜x. 〈v2v?||µ˜y. 〈(x, y)||µ˜z. 〈z||pi1 [µ˜x. 〈z||pi2 [µ˜y.cv?]〉]〉〉〉〉
=µ˜V 〈v1v?||µ˜x. 〈v2v?||µ˜y. 〈(x, y)||pi1 [µ˜x. 〈(x, y)||pi2 [µ˜y.cv?]〉]〉〉〉
=β×V 〈v1
v
?||µ˜x. 〈v2v?||µ˜y. 〈x||µ˜x. 〈y||µ˜y.cv?〉〉〉〉
=µ˜V 〈v1v?||µ˜x. 〈v2v?||µ˜y.cv?〉〉 = 〈v1||µ˜x. 〈v2||µ˜y.c〉〉v?
(β¬) 〈µ(¬ [x].c)||¬ [v]〉v? = 〈not(µ˜x.cv?)||not[vv? ]〉 =β¬V 〈vv? ||µ˜x.cv?〉 = 〈v||µ˜x.c〉
v
?
(β→) 〈µ([x · β].c)||v · e〉v? = 〈λx.µβ.cv?||vv? · ev?〉 =ς→V µ˜V 〈vv? ||µ˜x. 〈λx.µβ.cv?||x · ev?〉〉
=β→V 〈vv? ||µ˜x. 〈µβ.cv?||ev?〉〉 = 〈v||µ˜x. 〈µβ.c||e〉〉
v
?
(η⊕) µ˜[ι1 (x).〈ι1 (x)||α〉 | ι2 (y).〈ι2 (y)||α〉]v?
= [µ˜x. 〈ι1 (x)||α〉, µ˜y. 〈ι2 (y)||α〉] =η+V α
(η⊗) µ˜[(x, y).〈(x, y)||α〉]v?
= µ˜z. 〈z||pi1 [µ˜x. 〈z||pi2 [µ˜y. 〈(x, y)||α〉]〉]〉
=η×V β×V′ µ˜z. 〈µβ1. 〈z||pi1 [β1]〉||µ˜x. 〈z||pi2 [µ˜y. 〈(x, y)||α〉]〉〉
=µ˜V′ µ˜z. 〈z||pi2 [µ˜y. 〈(µβ1. 〈z||pi1 [β1]〉, y)||α〉]〉
=η×V β×V′ µ˜z. 〈µβ2. 〈z||pi2 [β2]〉||µ˜y. 〈(µβ1. 〈z||pi1 [β1]〉, y)||α〉〉
197
=µ˜V′ µ˜z. 〈(µβ1. 〈z||pi1 [β1]〉, µβ2. 〈z||pi2 [β2]〉)||α〉
=η×V µ˜z. 〈z||α〉 =ηµ˜ α
(η¬) µ(¬ [x].〈z||¬ [x]〉)v? = not(µ˜x. 〈z||not[x]〉) =η¬V z
(η→) µ([x · β].〈z||x · β〉)v? = λx.µβ. 〈z||x · β〉 =η→V z
Going from the call-by-value dual calculus to µµ˜⊕,⊗,¬,→V , we have:
(β+V ) 〈ιi (V )||[e1, e2]〉?v = 〈ιi (V ?v )||µ˜[ι1 (x).〈x||e1?v〉 | ι2 (x).〈x||e2?v〉]〉
=β⊕V 〈V
?
v ||ei?v〉 = 〈V ||ei〉?v
(β×V ) 〈(V1, V2)||pii [e]〉?v = 〈(V1?v, V2?v)||µ˜[(x1, x2).〈xi||e?v〉]〉
=β⊗V 〈Vi
?
v||e?v〉 = 〈Vi||e〉?v
(β¬V ) 〈not(e)||not(v)〉?v = 〈µ(¬ [x].〈x||e?v〉)||¬ [v?v ]〉
=β¬V 〈v?v ||e?v〉 = 〈v||e〉
?
v
(β→V ) 〈λx.v||V · e〉?v = 〈µ([x · β].〈v?v ||β〉)||V ?v · e?v〉
=β→µV 〈V ?v ||µ˜x. 〈v?v ||e?v〉〉
=µ˜V′ 〈v?v {V ?v /x}||e?v〉 = 〈v {V/x}||e〉?v
(η+V ) [µ˜x. 〈ι1 (x)||e〉, µ˜y. 〈ι2 (y)||e〉]?v
= µ˜[ι1 (x).〈x||µ˜x. 〈ι1 (x)||e?v〉〉 | ι2 (y).〈y||µ˜y. 〈ι2 (y)||e?v〉〉]
=µ˜V µ˜[ι1 (x).〈ι1 (x)||e?v〉 | ι2 (y).〈ι2 (y)||e?v〉] =η⊕V e
?
v
(η×V ) (µα. 〈v||pi1 [α]〉, µβ. 〈v||pi2 [β]〉)?v
= (µα. 〈V ?v ||µ˜[(x, ).〈x||α〉]〉, µβ. 〈V ?v ||µ˜[( , y).〈y||β〉]〉)
=ηµ µγ. 〈(µα. 〈V ?v ||µ˜[(x, ).〈x||α〉]〉, µβ. 〈V ?v ||µ˜[( , y).〈y||β〉]〉)||γ〉
=µ˜V′ µγ. 〈V ?v ||µ˜z. 〈(µα. 〈z||µ˜[(x, ).〈x||α〉]〉, µβ. 〈z||µ˜[( , y).〈y||β〉]〉)||γ〉〉
=η⊗V µγ.
〈
V ?v
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
(x, y).〈(x, y)
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜z.
〈(µα. 〈z||µ˜[(x, ).〈x||α〉]〉,
µβ. 〈z||µ˜[( , y).〈y||β〉]〉)
∣∣∣∣∣∣
∣∣∣∣∣∣γ
〉〉〉
=µ˜V µγ.
〈
V ?v
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
(x, y).〈(µα. 〈(x, y)||µ˜[(x, ).〈x||α〉]〉,
µβ. 〈(x, y)||µ˜[( , y).〈y||β〉]〉)
∣∣∣∣∣∣
∣∣∣∣∣∣γ
〉〉
198
=β⊗V µγ. 〈V
?
v ||µ˜[(x, y).〈(µα. 〈x||α〉, µβ. 〈y||β〉)||γ〉]〉
=ηµ µγ. 〈V ?v ||µ˜[(x, y).〈(x, y)||γ〉]〉
=η⊗ µγ. 〈V ?v ||γ〉 =ηµ V ?v
(η¬V) not(µ˜x. 〈V ||not[x]〉)?v = µ(¬ [x].〈x||µ˜x. 〈V ?v ||¬ [x]〉〉)
=µ˜V µ(¬ [x].〈V ?v ||¬ [x]〉) =η¬V′ V
?
v
(η→V ) λx.µβ. 〈V ||x · β〉?v = µ([x · β].µβ. 〈V ?v ||x · β〉β)
=µ˜V µ([x · β].〈V ?v ||x · β〉) =η→V′ V
?
v
b) This follows from part (a) by duality. More specifically, translation commutes
with duality in the two calculi as
c?v
⊥ = c⊥?n c?n
⊥ = c⊥?v cv?⊥ = c⊥
n
? c
n
?
⊥ = c⊥v?
and similarly for (co-)terms, which follows directly from the definitions of the
duality and translation operations by (mutual) induction on the syntax of
commands and (co-)terms. Therefore, the fact that the translations are inverses
comes from part (a) by applying Theorems 3.6 and 5.5, so
cn?
?
n =Theorem 5.5 cn?
?
n
⊥⊥ = c⊥⊥v?
?
v = c⊥⊥ =Theorem 5.5 c
c?n
n
? =Theorem 3.6 c?n
n
?
⊥⊥ = c⊥⊥?v
v
? = c⊥⊥ =Theorem 3.6 c
and similarly for (co-)terms. Furthermore N is dual to V and &,`,∼,→ is dual
to ⊕,⊗,¬,→, so if we have c = c′ in the µµ˜&,`,∼,→N then c⊥ = c⊥ in µµ˜⊕,⊗,¬,→N
by Theorem 5.7, c⊥v? = c′⊥
v
? in the call-by-value dual calculus by part (a), and
thus
cn? = cn?⊥⊥ = c⊥
v
?
⊥ = c′⊥v?
⊥ = c′n?
⊥⊥ = cn?
in the call-by-name dual calculus by Theorems 3.8 and 3.6 and the above, and
similarly for (co-)terms. Going the other way, if we have c = c′ in the call-by-
value dual calculus then we have c⊥?v = c′⊥
?
v in µµ˜
⊕,⊗,¬,→
V by Theorem 5.7 and
199
part (a), so
c?n = c?n
⊥⊥ = c⊥?v
⊥ = c′⊥?v
⊥ = c′?n
⊥⊥ = c?n
in µµ˜&,`,∼,→N by Theorem 3.8 and the above, and similarly for (co-)terms.
It follows that the idea of distinguishing data and co-data provides a unifying
framework for studying the computational meaning of types in the sequent calculus.
The distinction is baked into polarized languages, like system L as previously seen
in Chapter IV. But even for the dual calculi, in which there is no apparent division
between data types and co-data types, the difference between the two is instead buried
inside the dual call-by-value and call-by-name interpretations of the types. Next in
the following Chapter VI, we will move beyond just the simple types considered
here (variations of products, sums, functions, and so on) to also incorporate more
advanced type features into the data and co-data framework. In particular, Chapter VI
will show how polymorphism in the form of type abstraction, previously seen in
Chapters II and III, can be rephrased in terms of the data and co-data framework
explored here. This extension will serve as a platform to study the duality between
induction and co-induction as two modes of structural recursion which improves the
treatment of co-induction as the equal-and-opposite partner to induction, and also
clarifies the murky issues of “well-foundedness” surrounding co-induction. In particular,
the tendency to view co-inductive objects as “necessarily lazy” comes from the fact
that they are co-data objects. The delicate balance of evaluation order that is required
to combine inductive and co-inductive objects falls out automatically by modeling as
data and co-data which already implies the correct computational meaning.
200
CHAPTER VI
Induction and Co-Induction
This chapter is a revised version of (Downen et al., 2015) to fit in the context
of this dissertation of which I was the primary author and developed the language and
theory of structural recursion in the classical sequent calculus presented in this chapter.
I would like to thank my co-authors Philip Johnson-Freyd and Zena M. Ariola for
their assistance and feedback in writing that publication.
Martin-Löf’s type theory (Martin-Löf, 1998, 1975; Martin-Löf, 1982) taught
us that inductive definitions and reasoning are pervasive throughout proof theory,
mathematics, and computer science. Inductive data types are used in programming
languages like ML and Haskell to represent structures, and in proof assistants and
dependently typed languages like Coq and Agda to reason about finite structures
of arbitrary size. Mendler (1988) showed us how to talk about recursive types and
formalize inductive reasoning over arbitrary data structures. However, the foundation
for the opposite to induction, co-induction, has not fared so well. Co-induction is a
major concept in programming, representing endless processes, but it is often neglected,
misunderstood, or mistreated. As articulated by McBride (Singh et al., 2011):
We are obsessed with foundations partly because we are aware of a number
of significant foundational problems that we’ve got to get right before we
can do anything realistic. The thing I would think of . . . in particular in
that respect is co-induction and reasoning about co-recursive processes.
That’s currently, in all major implementations of type theory, a disaster.
And if we’re going to talk about real systems, we’ve got to actually have
something sensible to say about that.
The introduction of co-patterns for co-induction Abel et al. (2013) is a major step
forward in rectifying this situation. Abel et al. emphasize that there is a dual view
to inductive data types, in which the values of types are defined by how they are
used instead of how they are built, a perspective on co-data types first spurred on by
Hagino (1987, 1989). Co-inductive co-data types are exciting because they may solve
201
the existing problems with representing infinite objects in proof assistants like Coq
(Abel & Pientka, 2013).
Our goal here is to improve the understanding and treatment of co-induction, and
to integrate both induction and co-induction into a cohesive whole for representing
well-founded recursive programs. Our main tools for accomplishing this goal are
the pervasive and overt duality and symmetry that runs through classical logic and
the sequent calculus. By developing a representation of well-founded induction in
a language for the classical sequent calculus, we get an equal and opposite version
of well-founded co-induction “for free.” Thus, the challenges that arise from using
classical sequent calculus as a foundation for induction are just as well the challenges of
co-induction, as the two are inherently developed simultaneously. Later in Chapter IX,
we will translate the developments of induction and co-induction in the classical
sequent calculus to a λ-calculus based language for effect-free programs, to better
relate to the current practice of type theory and functional programming. As the λ-
based style lacks symmetries present in the sequent calculus, some of the constructs for
recursion are lost in translation. Unsurprisingly, the cost of an asymmetrical viewpoint
is blindness to the complete picture revealed by duality.
Our philosophy is to emphasize the disentanglement of the recursion in types
from the recursion in programs, to attain a language rich in both data and co-data
while highlighting their dual symmetries. On the one hand, the Coq viewpoint is that
all recursive types—both inductive and co-inductive—are represented as data types
(positive types in polarized logic (Munch-Maccagnoni, 2009)), where induction allows
for infinitely deep destruction and co-induction allows for infinitely deep construction.
On the other hand, the co-pattern approach (Abel et al., 2013; Abel & Pientka, 2013)
which is inspired by Hagino’s (1987) treatment of co-induction via finiate observations
represents inductive types as data and co-inductive types as co-data. In contrast, we
take the view that separates the recursive definition of types from the types used for
specifying recursive processing loops. Thereby, the types for representing the structure
of a recursive process are given first-class status, defined on their own independently
of any other programming construct. This makes the types more compositional, so
that they may be combined freely in more ways, as they are not confined to certain
restrictions about how they relate to data vs co-data or induction vs co-induction.
More traditional views on the distinction between inductive and co-inductive programs
202
come from different modes of use for the same building blocks, emerging from particular
compositions of several (co-)data types.
We will base our calculus for recursion on the parametric µµ˜-calculus with
data and co-data from Chapter V which corresponds to a classical logic, so it
inherently contains control effects (Griffin, 1990) that allow programs to abstract
over their own control-flow—intuitionistic logic and effect-free functional programs
are later considered as a special case in Chapter IX. As we saw, the fundamental
dilemma of classical computation (Section 3.2) means that the intended evaluation
strategy for a program becomes an essential part of understanding its meaning: even
terminating programs give different results for different strategies. For example, the
functional program length(Cons (error “boom”)Nil) returns 1 under call-by-name
(lazy) evaluation, but goes “boom” with an error under call-by-value (strict) evaluation.
Therefore, a calculus that talks about the behavior of programs needs to consider
the impact of the evaluation strategy. We therefore leverage the parametric nature of
the µµ˜-calculus to disentangle this choice from the calculus itself, boiling down the
distinction as a substitution strategy. Note that, unlike many accounts of co-induction,
we do not rely on a particular choice of evaluation strategy—like some sort of lazy
evaluation which delays computing results until they are needed—but instead the
apt use of data and co-data forces the correct interpretation of infinite objects. We
therefore get a family of calculi, parameterized by the strategy, for reasoning about the
behavior of programs ultimately executed with some evaluation strategy. The issue of
strong normalization is then framed uniformly over this family of calculi by specifying
some basic requirements of the chosen substitution strategy which are inspired by
focusing in logic.
The bedrock on which we build our structures for recursion is the connection
between logic and programming languages, and the cornerstone of the design is the
duality permeating these programming concepts. Induction and co-induction are
clearly dual, and the duality of their opposition shines through in the the symmetric
setting of the sequent calculus. Here, classicality is not just a feature, but an essential
completion of the duality needed to fully express the connections between recursion
and co-recursion. We consider several different types for representing recursion in
programs based on the mathematical principles of primitive and noetherian recursion
which are reflected as pairs of dual data and co-data types. As we will find, both of
these different recursive principles have different strengths as programming features:
203
primitive recursion allows us to depend on the statically-known sizes of constructions
at run-time à la GADTs and simulate seemingly infinite constructed objects, like
potentially infinite lists in Coq or Haskell, whereas noetherian recursion admits type-
erasure. In essence, we demonstrate how this parametric sequent calculus can be used
as a core calculus and compilation target for establishing well-foundedness of recursive
programs, via the computational interpretation of common principles of mathematical
induction.
This chapter covers the following topics:
– A presentation of some basic functional programs, including co-patterns (Abel
et al., 2013), in a sequent based syntax to illustrate how the sequent calculus
gives a language for programming with structures and duality (Section 6.1).
– A language for the higher-order sequent calculus in which all types, including
functions and polymorphism, are treated as user-defined data and co-data types
(Section 6.2).
– Two forms of well-founded recursion in types—based on primitive and noetherian
recursion—along with specific data and co-data types for performing well-
founded recursion in programs (Section 6.3).
– An extension of the language of the sequent calculus with recursion, where the
reduction theory is strongly normalizing for well-typed programs and supports
erasure of computationally irrelevant types at run-time (Section 6.4).
–
Programming with Structures and Duality
Pattern-matching is an integral part of functional programming languages, and is
a great boon to their elegance. However, the traditional language of pattern-matching
can be lacking in areas, especially when we consider dual concepts that arise in all
programs. For example, when defining a function by patterns, we can match on the
structure of the input—the argument given to the function—but not its output—the
observation being made about its result. In contrast, calculi inspired by the sequent
calculus that we’ve seen in Chapters III, IV, and V feature a more symmetric language
which both highlights and restores this missing duality. Indeed, in a setting with such
204
ingrained symmetry, maintaining dualities is natural. We now consider how concepts
from functional programming translate to a sequent-based language, and how programs
can leverage duality by writing basic recursive functional programs in this symmetric
setting.
Example 6.1. One of the most basic functional programs is the function that calculates
the length of a list. We can write this length function in a Haskell- or Agda-like
language by pattern-matching over the structure of the given List a to produce a Nat:
dataNatwhere
Z : Nat
S : Nat→ Nat
data List awhere
Nil : List a
Cons : a→ List a→ List a
length : List a→ Nat
length Nil = Z
length (Cons x xs) = let y = length xs in S y
This definition of length describes its result for every possible call. Similarly, we can
define length in the parametric µµ˜-calculus1 from Chapter V in much the same way.
First, we introduce the types in question by data declarations in the sequent calculus:
dataNatwhere
Z : ` Nat |
S : Nat ` Nat |
data List(X)where
Nil : ` List(X) |
Cons : X, List(X) ` List(X) |
While these declarations give the same information as before, the differences between
these specific data type declarations are largely stylistic. Instead of describing the
constructors in terms of a pre-defined function type, the shape of the constructors are
described via sequents, replacing function arrows with entailment (`) and commas for
separating multiple inputs. Furthermore, the type of the main output produced by
each constructor is highlighted to the right of the sequent between entailment and a
vertical bar, as in ` Nat | or ` List(X) |, and all other types describe the parameters
that must be given to the constructor to produce this output. Thus, we can construct
1Recall that following the notation of Chapter III the symbols µ and µ˜ used here are not related
to recursion, as they sometimes are in other languages, but rather are binders for variables and
co-variables.
205
a list as either Nil or Cons(x, xs), much like in functional languages. Next, we define
length by specifying its behavior for every possible call:
length : List(X)→ Nat
〈length||Nil · α〉 = 〈Z||α〉
〈length||Cons(x, xs) · α〉 = 〈length||xs · µ˜y. 〈S(y)||α〉〉
The main difference is that we consider more than just the argument to length. Instead,
we are describing the action of length with its entire context by showing the behavior
of a command connecting it together with a consumer. For example, in the command
〈Z||α〉, Z is a term producing zero and α is a co-term—specifically a co-variable—that
consumes that number. Besides co-variables, we have other co-terms that consume
information. The call-stack Nil · α consumes a function by supplying it with Nil as its
argument and consuming its returned result with α. The input abstraction µ˜y. 〈S(y)||α〉
names its input y before running the command 〈S(y)||α〉, similarly to the context
let y =  in S(y) from the functional program.
In functional programs, it is common to avoid explicitly naming the result of a
recursive call, especially in such a short program. Instead, we would more likely define
length as:
length : List a→ Nat
length Nil = Z
length (Cons x xs) = S (length xs)
We can mimic this definition in the sequent calculus as:
length : List(X)→ Nat
〈length||Nil · α〉 = 〈Z||α〉
〈length||Cons(x, xs) · α〉 = 〈S(µβ. 〈length||xs · β〉)||α〉
Note that to represent the functional call length xs inside the successor constructor
S, we need to make use of the output abstraction µβ. 〈length||xs · β〉 that names its
output channel β before running the command 〈length||xs · β〉, which calls length with
xs as the argument and β as the return point. As we saw in Section 5.6, output
206
abstractions are exactly dual to input abstractions, and defining length in µµ˜ requires
us to name the recursive result as either an input or an output.
Just as functions can be represented as first-class values through λ-abstractions
in functional languages, their sequent calculus counter-parts can be represented as
first-class values in terms of case abstractions in the µµ˜-calculus. Using a recursively-
defined case abstraction with deep pattern-matching, we can represent length in the
µµ˜-calculus from Chapter 5.2:
length = µ(Nil · α.〈Z||α〉
|Cons(x, xs) · α.〈length||xs · µ˜y. 〈S(y)||α〉〉)
Furthermore, the deep pattern-matching can be mechanically translated to the shallow
case analysis on (co-)data structures:
length = µ(xs · α. 〈xs||µ˜[Nil.〈Z||α〉
|Cons(x, xs′).〈length||xs′ · µ˜y. 〈S(y)||α〉〉]〉)
This case abstraction describes exactly the same specification as the definition for
length according to the reduction theory of the parametric µµ˜-calculus: when run with
the call-stack Nil ·α, the command reduces to 〈Z||α〉, and when run with the call-stack
Cons(x, xs) · α, the command reduces to 〈length||xs · µ˜y. 〈S(y)||α〉〉. However, here we
will favor presenting the example programs in the style of specifying the behavior
of commands using deep pattern-matching, as this gives a higher-level and more
abstract reading of programs, with the understanding that they can be mechanically
compiled down to (recursive) case abstractions with shallow pattern-matching as
above. End example 6.1.
We have seen how to write a recursive function by pattern-matching on the first
argument, x, in a call-stack x ·α. However, why should we be limited to only matching
on the structure of the argument x? If the observations on the returned result must
also follow a particular structure, why can’t we match on α as well? Indeed, in a
symmetric language, there is no such distinction. For example, the function call-stack
itself can be viewed as a structure, so that a curried chain of function applications
f x y z is represented by the pattern x · y · z · α, which reveals the nested structure
down the output side of function application, rather than the input side. Thus, the
sequent calculus reveals a dual way of thinking about information in programs phrased
207
as co-data, as we saw in Chapter V, in which observations follow predictable patterns,
and values respond to those observations by matching on their structure. In such a
symmetric setting, it is only natural to match on any structure appearing in either
inputs or outputs.
Example 6.2. We can consider this view on co-data to understand programs with
“infinite” objects. For example, infinite streams may be defined by the primitive
projections out of streams:
codata Stream(X)where
Head : | Stream(X) ` a
Tail : | Stream(X) ` Stream(X)
Contrarily to data types, the type of the main input consumed by co-data constructors
is highlighted to the left of the sequent in between a vertical bar and entailment,
as in | Stream(X) `. The rest of the types describe the parameters that must be
given to the constructor in order to properly consume this main input. For Streams,
the observation Head[α] requests the head value of a stream which should be given
to α, and Tail[β] asks for the tail of the stream which should be given to β.2 We
can now define a function countUp—which turns an x of type Nat into the infinite
stream x, S(x), S(S(x)), . . .—by pattern-matching on the structure of observations on
functions and streams:
countUp : Nat→ Stream(Nat)
〈countUp||x · Head[α]〉 = 〈x||α〉
〈countUp||x · Tail[β]〉 = 〈countUp||S(x) · β〉
If we compare countUp with length in this style, we can see that there is no fundamental
distinction between them: they are both defined by cases on their possible observations.
The only point of difference is that length happens to match on the structure of its
argument in its call-stack, whereas countUp matches on the return co-data structure
of in its call-stack.
2Keeping the convention from Chapter III, we use square brackets as grouping delimiters in
observations, like the head projection Head[α] out of a stream, as opposed to round parentheses used
as grouping delimiters in results, like the successor number S(y). This helps to disambiguate between
results (terms) and observations (co-terms) in a way that is syntactically apparent independently of
their context.
208
Abel et al. (2013) have carried this intuition back into the functional paradigm.
For example, we can still describe streams by their Head and Tail projections, and
define countUp through co-patterns:
codata Stream awhere
Head : Stream a→ a
Tail : Stream a→ Stream a
countUp : Nat→ Stream(X)
(countUp x).Head = x
(countUp x).Tail = countUp (Sx)
This definition gives the functional program corresponding to the sequent version of
countUp. So we can see that co-patterns arise naturally, in Curry-Howard isomorphism
style, from the computational interpretation of Gentzen’s (1935a) sequent calculus.
End example 6.2.
Example 6.3. Since a symmetric language is not biased against pattern-matching on
inputs or outputs, and indeed the two are treated identically, there is nothing special
about matching against both inputs and outputs simultaneously. For example, we can
model infinite streams with possibly missing elements as
SkipStream(X) = Stream(Maybe(X))
where Maybe(X) corresponds to the Haskell data type of the same name defined as:
dataMaybe(X)where
Nothing : ` Maybe(X) |
Just : X ` Maybe(X) |
with constructors Nothing and Just(x) for x of type X. Then we can define the empty
skip stream which gives Nothing at every position, and the countDown function that
209
transforms Sn(Z) into the stream Sn(Z), Sn−1(Z), . . . ,Z,Nothing, . . . :
empty : SkipStream(Nat)
〈empty||Head[α]〉 = 〈Nothing||α〉
〈empty||Tail[β]〉 = 〈empty||β〉
countDown : Nat→ SkipStream(Nat)
〈countDown||x · Head[α]〉 = 〈Just(x)||α〉
〈countDown||Z · Tail[β]〉 = 〈empty||β〉
〈countDown||S(x) · Tail[β]〉 = 〈countDown||x · β〉 End example 6.3.
Example 6.4. As opposed to the co-data approach to describing infinite objects, there
is a more widely used approach in lazy functional languages like Haskell and proof
assistants like Coq that still favors framing information as data. For example, an
infinite list of zeroes is expressed in this functional style by an endless sequence of
Cons:
zeroes : List(Nat)
zeroes = Cons Z zeroes
We could emulate this definition in sequent style as the expansion of zeros when
observed by any α:
zeroes : List(Nat)
〈zeroes||α〉 = 〈Cons(Z, zeroes)||α〉
Likewise, we can describe the concatenation of two, possibly infinite lists in the
same way, by pattern-matching on the call:
cat : List(X)→ List(X)→ List(X)
〈cat||Nil · ys · α〉 = 〈ys||α〉
〈cat||Cons(x, xs) · ys · α〉 = 〈Cons(x, µβ. 〈cat||xs · ys · β〉)||α〉
The intention is that, so long as we do not evaluate the sub-components of Cons
eagerly, then α receives a result even if xs is an infinitely long list like zeroes.
210
End example 6.4.
In each of these examples, we were only concerned with writing recursive programs,
but have not showed that they always terminate. Termination is especially important
for proof assistants and dependently typed languages, which rely on the absence of
infinite loops for their logical consistency. If we consider the programs in Examples 6.1
and 6.2, then termination appears fairly straightforward by structural recursion
somewhere in a function call: each recursive invocation of length has a structurally
smaller list for the argument, and each recursive invocation of countUp, and countDown
has a smaller stream projection out of its returned result. However, formulating this
argument in general turns out to be more complicated. Even worse, the “infinite data
structures” in Example 6.4 do not have as clear of a concept of “termination:” zeroes
and concatenation could go on forever, if they are not given a bound to stop. To tackle
these issues, we will phrase principles of well-founded recursion in the parametric µµ˜-
calculus, so that we arrive at a core calculus capable of expressing complex termination
arguments (parametrically to the chosen evaluation strategy) inside the calculus itself
(see Section 6.4).
Polymorphism and Higher Kinds
Before we can talk about statically-guaranteed termination arguments in types,
we must first be able to quantify over types. That is to say, we need to extend the
parametric µµ˜-calculus with type quantifiers like ∀ and ∃ that we had seen previously
in natural deduction (Chapter II) and the sequent calculus (Chapter III). We could
just add special connectives with their own separate rules for the quantifiers to the
calculus. However, instead let’s look at how we can enrich the existing mechanisms of
data and co-data to incorporate both ∀- and ∃-style quantifiers as just more declared
(co-)data types like products, sums, and functions.
As it turns out, starting from the multi-kinded parametric sequent calculus from
Sections 5.4 and 5.5 we are almost already there. First of all, we will extend the
syntax of terms and co-terms to let (co-)data structures contain types in addition
to sub-expressions, as shown in Figure 6.1. This change means that the patterns in
case abstractions can now bind type variables in addition to ordinary (co-)variables,
so that (co-)terms can abstract over types as well as other (co-)terms like in the
polymorphic λ-calculus (Section 2.2) or polymorphic sequent calculus (Section 3.3).
In addition, we will also allow types to abstract over types by extending the language
211
X, Y, Z ∈ TypeVariable ::= . . . R,S, T ∈ BaseKind ::= . . . F,G ∈ Connective ::= . . .
k, l ∈ Kind ::= S | k → l A,B,C ∈ Type ::= X | F( #»A) | λX : k.B | A B
x, y, z ∈ Variable ::= . . . α, β, γ ∈ CoVariable ::= . . .
K ∈ Constructor ::= . . . O ∈ Observer ::= . . .
c ∈ Command ::= 〈v||e〉
v ∈ Term ::= x | µα.c | K #»A ( #»e , #»v ) | µ
(
O
#     »
X:k [ #»x , #»α ].c | . . .
)
e ∈ CoTerm ::= α | µ˜x.c | µ˜
[
K
#     »
X:k( #»α , #»x ).c | . . .
]
| O #»A [ #»v , #»e ]
FIGURE 6.1. The syntax of types and programs in the higher-order µµ˜-calculus.
of kinds (denoted by the metavariables k, l) to include arrow kinds k → l in addition
to base kinds S, which gives us type functions also shown in Figure 6.1. The type-level
language of functions uses the notation of the λ-calculus, so that a type function with
the parameter X : k is introduced as the λ-abstraction λX : k.B and a type function
is applied as A B.
Intuitively, the motivation for adding type functions to the language is to let
(co-)data declarations abstract over them, giving us higher-order (co-)data types.
In particular, the addition of type abstraction in both programs and types lets us
extend the multi-kinded (co-)data declaration mechanism and kind system as shown
in Figure 6.2. The main addition is that now the constructors in a data declaration
of F( #»X ) and the observers in a co-data declaration of G( #»X )can introduce hidden
quantified type variables #»Y that do not appear in the externally visible interface #»X
of the connective. For example, for some fixed kind S, we can give declarations for
the universal (∀) and existential (∃) quantification over a type of kind k as follows:
codata ∀k(X : k → S) : Swhere
@ :
(
| ∀k(X) `Y :k X Y : S
) data ∃k(X : k → S) : Swhere
@ :
(
X Y : S `Y :k ∃k(X) |
)
These declarations extend the same notion of quantifiers in the dual calculi to
higher kinds k, where we use the shorthand ∀Y :k.A for ∀k(λY :k.A) and ∃Y :k.A for
3As before, this is shorthand for a (co-)data declaration of F( #        »X : k) : S in G.
212
decl ∈ Declaration ::= data F( #        »X : k) : Swhere
#                                                                       »
K :
(
#         »
A : T ` #   »Y :l F( #»X ) | #         »B : R
)
| codataG( #        »X : k) : Swhere
#                                                                        »
O :
(
#         »
A : T | G( #»X ) ` #   »Y :l #         »B : R
)
G ∈ GlobalEnv ::= #     »decl Θ ∈ TypeEnv ::= #        »X : k
Γ ∈ InputEnv ::= #       »x : A ∆ ∈ OutputEnv ::= #        »α : A
J,H ∈ Judgement ::=
(
Γ `ΘG ∆
)
seq | (G ` decl) | (Θ `G A : k)
Declaration rules:
#                                               »
#        »
X : k, #      »Y : l `G A : T
#                                                »
#        »
X : k, #      »Y : l `G B : R
#                                »(
`
#     »
X:k, #   »Y :l
G
)
seq
G `
data F( #        »X : k) : Swhere
#                                                                       »
K :
(
#         »
A : T ` #   »Y :l F( #»X ) | #         »B : R
)
data
#                                               »
#        »
X : k, #      »Y : l `G A : T
#                                                »
#        »
X : k, #      »Y : l `G B : R
#                                »(
`
#     »
X:k, #   »Y :l
G
)
seq
G `
codataG( #        »X : k) : Swhere
#                                                                       »
O :
(
#         »
A : T | F( #»X ) ` #   »Y :l #         »B : R
)
codata
Kind rules:
Θ, X : k `G X : k TV
#                      »Θ `G C : k (F( #        »X : k) : S)3 ∈ G
Θ `G F( #»C ) : S
FT
Θ, X : k `G A : l
Θ `G λX : k.A : k → l →I
2 Θ `G A : k → l Θ `G B : k
Θ `G A B : l →E
2
Well-formed sequent rules:
( ` ) seq
G ` decl
(
`G
)
seq(
`G,decl
)
seq
(
`ΘG
)
seq(
`Θ,X:kG
)
seq
Θ `G A : S
(
Γ `ΘG ∆
)
seq(
Γ, x : A `ΘG ∆
)
seq
Θ `G A : S
(
Γ `ΘG ∆
)
seq(
Γ `ΘG α : A,∆
)
seq
FIGURE 6.2. The kind system for the higher-order parametric µµ˜ sequent calculus.
213
∃k(λY :k.A). A term of type ∀Y :k.A is introduced as the case abstraction µ(Y :k @ α.c)
that is consumed buy the observation B @ e. Dually, a term of type ∃Y : k.A is
introduced by the construction B @ v that is consumed by the case abstraction
µ˜[Y :k @ x.c].
Note that the kind system in Figure 6.2 also includes an entirely new kind of
judgement
(
Γ `G ∆
)
seq that says a general sequent Γ `ΘG ∆ is well-formed. This
judgement is now necessary because of the addition of type functions, which are a
new kind of type that does not actually classify any term or co-term. In other words,
supposing that a free variable x has type λX:S.X would be nonsensical. Therefore,
we rule any such possibility by the rules of
(
Γ `G ∆
)
seq , which enforce that for every
x : A in Γ and α : A in ∆, A must belong to some base kind S and not some other kind
like k → l. This is the same reason that the declarations for (co-)data types can only
declare connectives of the form F( #        »X : k) : S for some base kind S, and similarly the
sequents that give the types of constructors and observers are well-formed whenever
the declaration is well-formed according to the data and codata rules.
Since we have added new forms of terms and co-terms which package up and
abstract over types, we also need to update the typing rules to accomodate these
new forms in the higher-order parametric µµ˜-calculus, as shown in Figure 6.3. Note
that the judgements and core typing rules are exactly the same as the core typing
rules for the multi-kinded type system from Figure 5.16 plus the addition of the type
conversion rules TCR and TCL. These conversion rules say that any β = equivalent
types (in the sense of the typed βη equational theory of the λ-calculus from Chapter II
Section 2.2 and denoted by the judgement Θ `G A =βη B : S with the rules given in
Figure 6.4) contain exactly the same terms and co-terms.
The only other update is in the left and right introduction rules for particular
(co-)data types, which now account for the possibility that constructions and
observations might include types which are referenced in the components of the pattern.
For (co-)data structures, this means that there is a choice of hidden types
# »
C ′i which
must be substituted for the quantified type variables #        »Yi : li in the sub-(co-)terms
of the structure. For (co-)data case abstractions, we need to extend the local type
environment Θ with the abstracted type variables, just as we must extend the local
input and output environments with the abstracted (co-)variables. For example, the
specific instances of the general typing rules for the two families of quantifiers ∀k and
214
Judgement ::= c :
(
Γ `ΘG ∆
)
| (Γ `ΘG v : A | ∆) | (Γ | e : A `ΘG ∆)
Type conversion rules:
Γ `ΘG v : A | ∆ Θ `G A =βη B : S
Γ `ΘG v : B | ∆
TCR
Γ | e : A `ΘG ∆ Θ `G A =βη B : S
Γ | e : B `ΘG ∆
TCL
Logical rules:
Given data F( #        »X : k) : Swhere
#                                                                                             »
Ki :
(
#              »
Aij : Tijj `
#      »
Yi:li F( #»X ) | #                »Bij : Rijj
)i
∈ G, we have
the rules:
θ =
{
#       »
C/X
}
#                              »
Θ `G C ′iθ : liθ θ′ =
{
#         »
C ′i/Yi
}
θ
#                                             »
Γ′j | e : Bijθ′ `ΘG ∆′j
j #                                             »
Γj | v : Aijθ′ `ΘG ∆j
j
# »Γj
j
,
# »
Γ′j
j `ΘG K
# »
C′
i ( #»e , #»v ) : F(
#»
C ) | #  »∆jj ,
#  »
∆′j
j
FRKi
θ =
{ #       »
C/X
} #                                                                                      »
ci :
(
Γ, #              »xi : Aiθ `Θ,
#         »
Yi:liθ
G
#              »
αi : Biθ,∆
)i
Γ | µ˜
[
#                               »
K
#      »
Yi:li
i ( #»αi , #»xi).ci
i
]
: F( #»C ) `ΘG ∆
FL
Given codataG( #        »X : k) : Swhere
#                                                                                              »
Oi :
(
#              »
Aij : Tijj | G( #»X ) `
#      »
Yi:li #                »Bij : Rijj
)i
∈ G, we
have the rules:
θ =
{ #       »
C/X
} #                                                                                      »
ci :
(
Γ, #              »xi : Aiθ `Θ,
#         »
Yi:liθ
G
#              »
αi : Biθ,∆
)i
Γ `ΘG µ
(
#                              »
O
#      »
Yi:li
i [ #»xi , #»αi ].ci
i
)
: G( #»C ) | ∆
GR
θ =
{ #       »
C/X
} #                        »
Θ `G C ′i : li θ′ =
{ #         »
C ′i/Yi
}
θ
#                                           »
Γj | v : Aijθ′ `ΘG ∆j
j #                                           »
Γ′j | e : Bijθ′ `ΘG ∆′j
j
#»Γj
j
,
#»
Γ′j
j | O
# »
C′i
i [ #»v , #»e ] : G(
#»
C ) `ΘG
#  »∆j
j
,
#  »
∆′j
j
GLOi
FIGURE 6.3. Types of higher-order (co-)data in the parametric µµ˜ sequent calculus.
215
Θ, X : k `G A : l Θ `G B : k
Θ `G (λX:k.A) B =βη A {B/X} : l β
Θ `G A : k → l
Θ `G λX:k.A X =βη A : k → l
η
Θ `G A : k
Θ `G A =βη A : k refl
Θ `G B =βη A : k
Θ `G A =βη B : k
symm
Θ `G A =βη B : k Θ `G B =βη C : k
Θ `G A =βη C : k trans
Θ, X : k `G X =βη X : k TV
Θ `G F( #»C ) : S #                                         »Θ `G C =βη C ′ : k Θ `G F( # »C ′) : S (F( #        »X : k) : S) ∈ G
Θ `G F( #»C ) =βη F( # »C ′) : S
FT
Θ, X : k `G A =βη A′ : l
Θ `G λX:k.A =βη λX:k.A′ : k → l →I
2
Θ `G A =βη A′ : k → l Θ `G B =βη B′ : k
Θ `G A B =βη A′ B′ : l →E
2
FIGURE 6.4. βη conversion of higher-order types.
216
(βF)
〈
K
#»
C
i ( #»e , #»v )
∣∣∣∣∣∣µ˜[· · · | K #   »Y :li ( #»α , #»x ).ci | · · ·]〉 βF 〈µ #»α . 〈 #»v ∣∣∣∣∣∣µ˜ #»x .ci { #       »C/Y }〉∣∣∣∣∣∣ #»e 〉
(βG)
〈
µ
(
· · · | O
#   »
Y :l
i [ #»x , #»α ].ci | · · ·
)∣∣∣∣∣∣O #»Ci [ #»v , #»e ]〉 βG 〈 #»v ∣∣∣∣∣∣µ˜ #»x . 〈µ #»α .ci { #       »C/Y }∣∣∣∣∣∣ #»e 〉〉
(ηF) γ : F( #»C ) ≺ηF µ˜
[
#                                                                »
K
#   »
Y :l
i ( #»α , #»x ).
〈
K
#   »
Y :l
i ( #»α , #»x )
∣∣∣∣∣∣γ〉i]
(ηG) z : G( #»C ) ≺ηG µ
(
#                                                              »
O
#   »
Y :l
i [ #»x , #»α ].
〈
z
∣∣∣∣∣∣O #   »Y :li [ #»x , #»α ]〉i
)
FIGURE 6.5. The βη laws for higher-order data and co-data types.
∃k above are:
c :
(
Γ `Θ,X:kG α : A X,∆
)
Γ `ΘG µ(X : k @ α.c) : ∀k(A) | ∆
∀Rk
Θ `G B : k Γ | e : A B `ΘG ∆
Γ | B @ e : ∀k(A) `ΘG ∆
∀Lk
Θ `G B : k Γ `ΘG v : A B | ∆
Γ `ΘG B @ v : ∃k(A) | ∆
∃Rk
c :
(
Γ, x : A X `Θ,X:kG ∆
)
Γ | µ˜[X : k @ x.c] : ∃k(A) `ΘG ∆
∃Lk
Other than this addition, the rules are the same as before in Section 5.4.
Thus concluding the static semantics of the higher-order parametric µµ˜-calculus,
we must also consider how the extension affects the dynamic semantics. The short
answer is: not much. In general, the types contained in structures must be substituted
for the type variables bound by patterns during pattern-matching, but this does not
significantly alter the behavior of a program. More specifically, the core µS µ˜Sηµηµ˜
theory of substitution does not change at all, since the form of input and output
abstractions remain the same, the typed βη theory of (co-)data accounts for the
presence of types in programs as shown in Figure 6.5, where the connectives F and
G are declared in G as in Figure 6.3, and likewise the untyped βς theory of (co-)data
is extended as shown in Figure 6.6. We must also extend the inference rules from
Figure 5.18 for checking that expressions are well-kinded so that we know which
substitution strategy to use when mixing several within a program, as shown in
Figure 6.7, by just ignoring the additional type annotations on (co-)data structures.
Likewise, the definitions of particular substitution strategies, like V , N , LV , and LN ,
are only changed by annotating structures and patterns with types and type variables,
and otherwise exactly the same as their definitions in Chapter V.
217
(βS)
〈
K
#»
C ( #»E, #»V )
∣∣∣∣∣∣µ˜[· · · | K #   »Y :l( #»α , #»x ).c | · · ·]〉 βS c { #       »C/Y , #      »E/α, #     »V/x}
(βS)
〈
µ
(
· · · | O
#   »
Y :l( #»x , #»α ).c | · · ·
)∣∣∣∣∣∣O #»C ( #»E, #»V )〉 βS c { #       »C/Y , #     »V/x, #      »E/α}
(ςS) K
#»
C ( #»E, e′, #»e , #»v ) ςS µα.
〈
µβ.
〈
K
#»
C ( #»E, β, #»e , #»v )
∣∣∣∣∣∣α〉∣∣∣∣∣∣e′〉
(ςS) K
#»
C ( #»E, #»V , v′, #»v ) ςS µα.
〈
v′
∣∣∣∣∣∣µ˜y. 〈K #»C ( #»E, #»V , y, #»v )∣∣∣∣∣∣α〉〉
(ςS) O
#»
C ( #»V , v′, #»v , #»e ) ςS µ˜x.
〈
v′
∣∣∣∣∣∣µ˜y. 〈x∣∣∣∣∣∣O #»C ( #»V , y, #»v , #»e )〉〉
(ςS) O
#»
C ( #»V , #»E, e′, #»e ) ςS µ˜x.
〈
µβ.
〈
x
∣∣∣∣∣∣O #»C ( #»V , #»E, β, #»e )〉∣∣∣∣∣∣e′〉

v′ /∈ ValueS
e′ /∈ CoValueS
x,y, α, β fresh
FIGURE 6.6. The parametric βSςS laws for arbitrary higher-order data and co-data.
Given data F( #        »X : k) : Swhere
#                                                                                            »
Ki :
(
#              »
Aij : Tijj `
#     »
Y :li F( #»X ) | #                »Bij : Rijj
)i
∈ G, we have:
#                                         »
Γ′j | e :: Rij `G ∆′j
j #                                       »
Γj | v :: Tij `G ∆j
j
#»Γj
j
,
#»
Γ′j
j `G K
#»
C
i ( #»e , #»v ) :: S |
#  »∆j
j
,
#  »
∆′j
j
FRKi
#                                                                           »
ci ::
(
Γ, #             »xi :: Tij `G
#               »
αi :: Rij ,∆
)i
Γ | µ˜
[
#                               »
K
#     »
Y :li
i ( #»αi , #»xi).ci
i
]
:: S `G ∆
FL
Given codataG( #        »X : k) : Swhere
#                                                                                             »
Oi :
(
#              »
Aij : Tijj | G( #»X ) `
#     »
Y :li #                »Bij : Rijj
)i
∈ G, we
have:
#                                                                           »
ci ::
(
Γ, #             »xi :: Tij `G
#               »
αi :: Rij ,∆
)i
Γ `G µ
(
#                              »
O
#     »
Y :li
i [ #»xi , #»αi ].ci
i
)
:: S | ∆
GR
#                                       »
Γj | v :: Tij `G ∆j
j #                                         »
Γ′j | e :: Rij `G ∆′j
j
#»Γj
j
,
#»
Γ′j
j | O #»Ci [ #»v , #»e ] :: S `G
#  »∆j
j
,
#  »
∆′j
j
GLOi
FIGURE 6.7. Type-agnostic kind system for higher-order multi-kinded (co-)data.
218
Well-Founded Recursion Principles
There is one fundamental difficulty in ensuring termination for programs written
in a sequent calculus style: even incredibly simple programs perform their structural
recursion from within some larger overall structure. For example, consider the humble
length function from Example 6.1. The decreasing component in the definition of
length is clearly the list argument which gets smaller with each call. However, in the
sequent calculus, the actual recursive invocation of length is the entire call-stack. This
is because the recursive call to length does not return to its original caller, but to
some place new. When written in a functional style, this information is implicit since
the recursive call to length is not a tail-call, but rather S(length xs). When written
in a sequent style, this extra information becomes an explicit part of the function
call structure, necessary to remember to increment the output of the function before
ultimately returning. This means that we must carry around enough memory to store
our ever increasing result amidst our ever decreasing recursion.
Establishing termination for sequent calculus therefore requires a more finely
controlled language for specifying “what’s getting smaller” in a recursive program,
pointing out where the decreasing measure is hidden within recursive invocations. For
this purpose, we adopt a type-based approach to termination checking (Abel, 2006).
Besides allowing us to abstract over termination-ensuring measures, we can also specify
which parts of a complex type are used as part of the termination argument. As a
consequence for handling simplistic functions like length, we will find that, for free,
the calculus ends up as a robust language for describing more advanced recursion over
structures, including lexicographic and mutual recursion over both data and co-data
structures simultaneously.
In considering the type-based approach to termination in the sequent calculus, we
identify two different styles for the type-level measure indices. The first is an exacting
notion of index with a predictable structure matching the natural numbers and which
we use to perform primitive recursion. This style of indexing gives us a tight control
over the size of structures and depend on the specific structure of the index in the
style of GADTs, allowing us to define types like the fixed-sized vectors of values from
dependently typed languages as well as a direct encoding of “infinite” structures as
found in lazy functional languages. The second is a looser notion that only tracks the
upper bound of indices and which we use to perform noetherian recursion. This style
of indexing is more in tune with typical structurally recursive programs like length
219
and also supports full run-time erasure of bounded indices while still maintaining
termination of the index-erased programs.
Primitive Recursion
We begin with the seemingly more basic of the two recursion schemes: primitive
recursion on a single natural number index. These natural number indices are used in
types in two different ways. First, the indices act as an explicit measure in recursively
defined (co-)data types, tracking the recursive sub-components of their structures
in the types themselves. Second, the indices are abstracted over by the primitive
recursion principle, allowing us to generalize over arbitrary indices and write looping
programs. For simplicity, we will limit ourselves to a single arbitrary base kind S in
the discussion to follow, although using multiple different ones is still admissible.
Let’s consider some examples of using natural number indices for the purpose
of defining (co-)data types with recursive structures. We extend the higher-order
(co-)type declaration mechanism from Section 6.2 with the ability to define new
(co-)data types by primitive recursion over an index, giving a mechanism for describing
recursive (co-)data types with statically tracked measures. Essentially, the constructors
are given in two groups—the constructors for the zero case and the constructors for
the successor case—and may only contain recursive sub-components at the (strictly)
previous index. For example, we may describe vectors of exactly N values of type A,
Vec(N,A), as in dependently typed languages:
dataVec(i : Ix, X : S) : S by primitive recursion on i
where i = 0 Nil : ` Vec(0, X) |
where i = j + 1 Cons : X : S,Vec(j,X) : S ` Vec(j + 1, X) |
where Ix is the kind of type-level natural number indices. Nil builds an empty vector of
type Vec(0, A), and Cons(v, v′) extends the vector v′ : Vec(N,A) with another element
v : A, giving us a vector with one more element of type Vec(N + 1, A). These terms
are typed by the right rules for Vec:
Γ `ΘG Nil : Vec(0, A) | ∆
VecRNil
Γ `ΘG v : A | ∆ Γ′ `ΘG v : Vec(M,A) | ∆′
Γ′,Γ `ΘG Cons(v, v′) : Vec(M + 1, A) | ∆′,∆
VecRCons
220
Other than these restrictions on the instantiations of i : Ix for vectors constructed
by Nil and Cons, the typing rules for terms of Vec(N,A) follow the normal pattern
for declared data types.4 Destructing a vector diverges more from the usual pattern
of non-recursive data types. Since the constructors of vector values are put in two
separate groups, we have two separate case abstractions to consider, depending on
whether the vector is empty or not. On the one hand, to destruct an empty vector,
we only have to handle the case for Nil, as given by the co-term µ˜[Nil.c]. On the other,
destructing a non-empty vector requires us to handle the Cons case, as given by the
co-term µ˜[Cons(x, xs).c]. These co-terms are typed by the two left rules for Vec—one
for both its zero and successor instances:
c :
(
Γ `ΘG ∆
)
Γ | µ˜[Nil.c] : Vec(0, A) `ΘG ∆
VecL0
c :
(
Γ, x : A, xs : Vec(M,A) `ΘG ∆
)
Γ | µ˜[Cons(x, xs).c] : Vec(M + 1, A) `ΘG ∆
VecL+1
As a similar example, we can define a less statically constrained list type by
primitive recursion. The IxList indexed data type is just like Vec, except that the Nil
constructor is available at both the zero and successor cases:
data IxList(i : Ix, X : S)by primitive recursion on i
where i = 0 Nil : ` IxList(0, X) |
where i = j + 1 Nil : ` IxList(j + 1, X) |
Cons : X : S, IxList(j,X) : S ` IxList(j + 1, X) |
Now, destructing a non-zero IxList(N + 1, A) requires both cases, as given in the
co-term µ˜[Nil.c | Cons(x, xs).c′]. IxList has three right rules for building terms: for Nil
at both 0 and M + 1 and for Cons:
Γ `ΘG Nil : IxList(0, A) | ∆
IxListRNil 0 Γ `ΘG Nil : IxList(M + 1, A) | ∆
IxListRNil+1
Γ `ΘG v : A | ∆ Γ′ `ΘG v : IxList(M,A) | ∆′
Γ′,Γ `ΘG Cons(v, v′) : IxList(M + 1, A) | ∆′,∆
IxListRCons
4We can have a vector with an abstract index if we don’t yet know what shape it has, as with
the variable x or abstraction µα.c of type Vec(i, A).
221
It also has two left rules: one for case abstractions handling the constructors of the 0
case and another for the M + 1 case:
c :
(
Γ `ΘG ∆
)
Γ | µ˜[Nil.c] : IxList(0, A) `ΘG ∆
IxListL0
c0 :
(
Γ `ΘG ∆
)
c1 :
(
Γ, x : A, xs : Vec(M,A) `ΘG ∆
)
Γ | µ˜[Nil.c0Cons(x, xs).c1] : IxList(M + 1, A) `ΘG ∆
IxListL+1
To write looping programs over these indexed recursive types, we use a recursion
scheme which abstracts over the index occurring anywhere within an arbitrary type.
As the types themselves are defined by primitive recursion over a natural number, the
recursive structure of programs will also follow the same pattern. The trick then is
to embody the primitive induction principle for proving a proposition P over natural
numbers:
P [0] ∧ (∀j : N.P [j]→ P [j + 1])→ (∀i : N.P [i])
and likewise the refutation of such a statement, as is given by any specific counter-
example—n : N ∧ P [n] → (∀i : N.P [i])—into logical rules of the sequent calculus.5
Recall from the reading of sequents in Chapter III, proofs come to the right of
entailment (` A means “A is true”), whereas refutations come to the left (A `
means “A is false”). Because we will have several recursion principles, we denote
this particular one as ∀ quantification over Ix, ∀Ix, so that the primitive recursive
proposition ∀i : N.P [i] on natural numbers corresponds to the type ∀i : Ix.A which is
shorthand for ∀Ix(λi : Ix.A) with the following inference rules:
` A 0 A j `j:Ix A (j + 1)
` ∀Ix(A)
`M : Ix A M `
∀Ix(A) `
We use this translation of primitive induction into logical rules as the basis for our
primitive recursive co-data type. The refutation of primitive recursion is given as a
specific counter-example, so the co-term is a specific construction. Whereas, proof
by primitive recursion is a process given by cases, the term performs case analysis
over its observations. The canonical counter-example is described by the co-data type
5We use the overbar notation, P , to denote that the proposition P is false. The use of this notation
is to emphasize that we are not talking about negation as a logical connective, but rather the dual
to a proof that P is true, which is a refutation of P demonstrating that it is false.
222
declaration for ∀Ix:
codata ∀Ix(X : Ix → S) : Swhere
@ :
(
| ∀Ix(X) `j:Ix X j : S
)
Notice that this is exactly the same co-data definition of ∀ quantification from
Section 6.2 except that the generic kind k has been specialized to Ix. Therefore,
the general mechanism for co-data automatically generates the same left rule for
constructing the counter-example, and a right rule for extracting the parts of this
construction. However, to give a recursive process for ∀Ix, we need an additional right
rule that gives us access to the recursive argument by performing case analysis on
the particular index. This scheme for primitive recursion is expressed by the term
µ(0:Ix @ α.c0 | j + 1:Ix @x α.c1) which performs case analysis on type-level indices at
run-time, and which can access the recursive result through the extra variable x in
the successor pattern j + 1:Ix @x α. This term has the typing rule:
c0 :
(
Γ `ΘG α : A 0,∆
)
c1 :
(
Γ, x : A j `Θ,j:ixG α : A (j + 1),∆
)
Γ `ΘG µ(0:Ix @ α.c0 | j + 1:Ix @x α.c1) : ∀Ix(A) | ∆
∀IxRrec
Note that this extension of the ∀Ix connective is allowed by the pragmatist view of
co-data types: the observations of a co-data type are fixed up front, but the terms can
be “whatever works” with respect to those observations. Terms of type ∀i:Ix.A describe
a process which is able to produce A {N/i}, for any index N , by stepwise producing
A {0/i}, A {1/i}, . . . , A {N/i} and piping the previous output to the recursive input
x of the next step, thus “inflating” the index in the result arbitrarily high. In essence,
this follows the interface of an infinitary & (an additive conjunction) of the form
A {0/i} & A {1/i} & A {2/i} & . . . . The index of the particular step being handled
is part of the observer pattern, so that the recursive case abstraction knows which
branch to take. In contrast, co-terms of type ∀i:Ix.A hide the particular index at which
they can consume an input, thereby forcing their input to work for any index.
By just applying duality in the sequent calculus and flipping everything about
the turnstyles, we get the opposite notion of primitive recursion as a data type. In
particular, we get the data declaration describing a dual type, named ∃Ix:
data ∃Ix(X : Ix → S)where
@ :
(
X j : S `j:Ix ∃Ix(X) |
)
223
Again, note that this data declaration is just the Ix instance of the general ∃ quantifier
from Section 6.2. The general mechanism for data automatically generates the right
rule for constructing an index-witnessed example case, and a left rule for extracting
the index and value from this structure. Further, as before we need an additional left
rule for performing self-referential recursion for consuming such a construction:
c0 :
(
Γ, x : A 0 `ΘG ∆
)
c1 :
(
Γ, x : A (j + 1) `Θ,j:IxG α : A j,∆
)
Γ | µ˜[0:Ix @ x.c0 | j + 1:Ix @α x.c1] : ∃Ix(A) `ΘG ∆
∃IxLrec
This extension of the ∃Ix connective is allowed by the verificationist view of data types:
the constructions of a data type are fixed up front, but the co-terms can be “whatever
works” with respect to those constructions. Dual to before, the recursive output sink
can be accessed through the extra co-variable α in the pattern j + 1:Ix @α x. The
terms of type ∃Ixi:Ix.A hide the particular index at which they produce an output. In
contrast, it is now the co-terms of the type ∃Ixi:Ix.A which describe a process which
is able to consume A {N/i} for any choice of N in steps by consuming A {N/i}, . . . ,
A {0/i} and piping the previous input to the recursive output α of the next step, thus
“deflating” the index in the input down to 0. In essence, this follows the interface of an
infinitary ⊕ (an additive disjunction) of the form A {0/i} ⊕ A {1/i} ⊕ A {2/i} ⊕ . . . .
Noetherian Recursion
We now consider the more complex of the two recursion schemes: noetherian
recursion over well-ordered indices. As opposed to ensuring a decreasing measure by
matching on the specific structure of the index, we will instead quantify over arbitrary
indices that are less than the current one. In other words, the details of what these
indices look like are not important. Instead, they are used as arbitrary upper bounds
in an ever decreasing chain, which stops when we run out of possible indices below
our current one as guaranteed by the well-foundedness of their ordering. Intuitively,
we may jump by leaps and bounds down the chain, until we run out of places to move.
Qualitatively, this different approach to recursion measures allows us to abstract
parametrically over the index, and generalize so strongly over the difference in the
steps to the point where the particular chosen index is unknown. Thus, because a
process receiving a bounded index has so little knowledge of what it looks like, the
224
index cannot influence its action, thereby allowing us to totally erase bounded indices
during run-time.
Now let’s see how to define some types by noetherian recursion on an ordered
index. Unlike primitive recursion, we do not need to consider the possible cases for
the chosen index. Instead, we quantify over any index which is less than the given one.
For example, recall the recursive definition of the Nat data type from Example 6.1.
We can be more explicit about tracking the recursive sub-structure of the constructors
by indexing Nat with some ordered type, and ensuring that each recursive instance
of Nat has a smaller index, so that we may define natural numbers by noetherian
recursion over ordered indices from a new kind called Ord:
dataNat(i : Ord)by noetherian recursion on iwhere
Z : ` Nat(i) |
S : Nat(j) `j<i Nat(i) |
Note that the kind of indices less than i is denoted by (< i), and we write j <
i as shorthand for j : (< i). Noetherian recursion in types is surprisingly more
straightforward than primitive recursion, and more closely follows the established
pattern for data type declarations:
Γ `ΘG Z : Nat(N) | ∆
NatRZ
Θ `G M : < N Γ `ΘG v : Nat(M) | ∆
Γ `ΘG SM(v) : Nat(N) | ∆
NatRS
Z builds a Nat(N) for any Ord index N , and SM(v) builds an incremented Nat(N) out
of a Nat(M), when M < N . To destruct a Nat(N), for any index N , we have the one
case abstraction that handles both the Z and S cases:
c0 :
(
Γ `ΘG ∆
)
c1 :
(
Γ, x : Nat(j) `Θ,j<NG ∆
)
Γ | µ˜
[
Z.c0 | Sj<N(x).c1
]
: Nat(N) `ΘG ∆
NatL
Like the case abstraction for tearing down an existentially constructed value, the
pattern for S introduces the free type variable j which stands for an arbitrary index
less than N .
225
We can consider some other examples of (co-)data types defined by noetherian
recursion. The definition of finite lists is just an annotated version of the definition
from Example 6.1:
data List(i : Ord, X : S) : S by noetherian recursion on iwhere
Nil : ` List(i,X) |
Cons : X : S, List(j,X) : S `j<i List(i,X) |
Furthermore, the infinite streams from Example 6.2 can also be defined as a co-data
type by noetherian recursion:
codata Stream(i : Ord, X : S) : S by noetherian recursion on iwhere
Head : | Stream(i,X) ` X : S
Tail : | Stream(i,X) `j<i Stream(j, a) : S
Recursive co-data types follow the dual pattern as data types, with finitely built
observations and values given by case analysis on their observations. For Stream(N,A),
we can always ask for the Head of the stream if we have some use for an input of type
A, and we can ask for its tail if we can use an input of type Stream(M,A), for some
smaller index M < N :
Γ | e : A `ΘG ∆
Γ | Head[e] : Stream(N,A) `ΘG ∆
StreamLHead
Θ `G M : < N Γ | e : Stream(M,A) `ΘG ∆
Γ | TailM [e] : Stream(N,A) `ΘG ∆
StreamLTail
Whereas a Stream(N,A) value is given by pattern-matching on these two possible
observations:
c0 :
(
Γ `ΘG α : A,∆
)
c1 :
(
Γ `Θ,j<NG β : Stream(j, A),∆
)
Γ `ΘG µ
(
Head[α].c0 | Tailj<N [β].c1
)
: Stream(N,A) | ∆ StreamR
As before, to write looping programs over recursive types with bounded indices,
we use an appropriate recursion scheme for abstracting over the type index. The proof
principle for noetherian induction by a well-founded relation < on a set of ordinals O
226
is:
(∀j : O.(∀i < j.P [i])→ P [j])→ (∀i : O.P [i])
which can be made more uniform by introducing an upper-bound to the quantifier in
the conclusion as well as in the hypothesis:
(∀j < n.(∀i < j.P [i])→ P [j])→ (∀i < n.→ P [i])
Likewise, a disproof of this argument is again a witness of a counter-example within
the chosen bound:
m < n ∧ P [n]→ (∀i < n.P [i])
We can then translate these principles into inference rules in the sequent calculus,
where we represent this new recursion scheme by a co-data type ∀<Ord:
∀<Ord(j, A) `j<N A j
` ∀<Ord(N,A)
`M < N A M `
∀<Ord(N,A) `
We will write ∀i < N.A as shorthand for the type ∀<Ord(N, λi : Ord.A). We use a
similar reading of these rules as a basis for noetherian recursion as we did for primitive
recursion. A refutation is still a specific counter-example, so it is represented as a
constructed co-term, whereas a proof is a process so is given as a term defined by
matching on its observation. Thus, we declare ∀<Ord as a co-data type of the form:
codata ∀<Ord(i : Ord, X : Ord→ S) : Swhere
@ :
(
| ∀<Ord(i,X) `j<i X j : S
)
Note that, while very similar, this is slightly different than just the Ord instance of the
general ∀ quantifier of a generic kind k, because the ∀<Ord connective also accepts an
additional type parameter i : Ord which serves as the upper bound of the quantification.
Again, the general mechanism for co-data types tells us how to construct the counter-
example with the observation M @ e, and destruct it by simple case analysis. The
recursive form of case analysis is given manually as the term µ(j < N @x α.c), where
x in the pattern is a self-referential variable standing in for the term itself. The typing
rule for this recursive case analysis restricts access to itself by making the type of the
227
self-referential variable have a smaller upper bound:
Θ `G M : < N Γ | e : A M `ΘG ∆
Γ |M @ e : ∀<Ord(N,A) `ΘG ∆
∀<OrdL
c :
(
Γ, x : ∀<Ord(j, A) `Θ,j<NG α : A j,∆
)
Γ `ΘG µ(j < N @x α.c) : ∀<Ord(N,A) | ∆
∀<OrdRrec
In essence, the terms of type ∀i < N.A describe a process which is capable of producing
A {M/i} for any M < N by leaps and bounds: an output of type A {M/i} is built
up by repeating the same process whenever it is necessary to ascending to an index
under M . In contrast, and similar to primitive recursion, co-terms of type ∀i < N.A
hide the chosen index, forcing their input to work for any index.
As always, the symmetry of sequents points us to the dual formulation of
noetherian recursion in programs. Specifically, we get the dual data type, named
∃<Ord, with the usual shorthand ∃j < N.A for ∃<Ord(N, λi:Ord.A), via the following
data declaration:
data ∃<Ord(i : Ord, X : Ord→ S) : Swhere
@ :
(
X j : S `j<i ∃<Ord(i,X) |
)
A {j/i} `j<N ∃<Ordi < j.A
∃<Ordi < N.A `
` A {M/i} `M < N
` ∃<Ordi < N.A
Also dual to the ∀<Ord connective, we have the following pattern construction of ∃<Ord
values as well as the recursive form of pattern-matching where the α in the pattern
j < N @α x refers to the case abstraction itself:
Θ `G M : < N Γ `ΘG v : A M | ∆
Γ `ΘG M @ v : ∃<Ord(N,A) | ∆
∃<OrdR
c :
(
Γ, x : A j `Θ,j<NG α : ∃<Ord(j, A),∆
)
Γ | µ˜[j < N @α x.c] : ∃<Ord(N,A) `ΘG ∆
∃<OrdLrec
Now that the roles are reversed, the terms of ∃i < N.A hide the chosen index M at
which they can produce a result of type A {M/i}. Instead, the co-terms of ∃i < N.A
consume an A {M/i} for any indexM < N : an input of type A {M/i} is broken down
228
by repeating the same process whenever it is necessary to descend from an index under
M .
Indexed Recursion in the Sequent Calculus
We now flesh out the rest of the system for recursive types and structures for
representing recursive programs in the higher-order parametric µµ˜ sequent calculus.
The extended syntax for programs, types, and kinds is shown in Figure 6.8, which
extends the basic higher-order syntax from Figure 6.1 with size kinds (Ix, Ord, and
< M), size types (0, M + 1, and ∞), recursive forms of (co-)data declarations (both
primitive and noetherian), and the special recursive forms of case abstraction for
the primitive (∀Ix and ∃Ix), and noetherian (∀<Ord and ∃<Ord) recursion principles.
The rules for sorting, kinding, and well-formed sequents, are given in Figure 6.9.
Note that the rules for the inequality of Ord, M < N , are enough to derive expected
facts like ` 4 < 6, but not so strong that they force us to consider Ord types above
∞. Specifically, the requirement that every Ord has a larger successor, M < M + 1,
only when there is an upper bound already established, M < N , prevents us from
introducing ∞ < ∞ + 1. Additionally, we have two sorts of kinds, those of erasable
types, , and non-erasable types, . Types (of some base kind S) for program-level
(co-)values and Ord indices are erasable, because they cannot influence the behavior
of a program, whereas the Ix indices are used to drive primitive recursion, and cannot
be erased. The fact that there are two sorts of kinds means that some kinds can
be ill-sorted (analogous to the possibility of ill-typed programs), so we must add an
additional premise to the →E2 rule checking that the arrow kind k → l is well-sorted,
as well as to the corresponding type conversion rules for function application from
Figure 6.4:
Θ, X : k `G A : l Θ `G B : k Θ `G k → l : 
Θ `G (λX:k.A) B =βη A {B/X} : l β
Θ `G A =βη A′ : k → l Θ `G B =βη B′ : k Θ `G k → l : 
Θ `G A B =βη A′ B′ : l →E
2
Thus, this sorting system separates erasable and non-erasable type annotations found
in programs.
229
R,S, T ∈ BaseKind ::= . . .
X, Y, Z, i, j ∈ TypeVariable ::= . . . F,G ∈ Connective ::= . . .
s ∈ Sort ::=  |  k, l ∈ Kind ::= S | k → l | Ix | Ord | (< M)
A,B,C,M,N ∈ Type ::= X | F( #»A) | λX : k.B | A B | 0 |M + 1 | ∞
decl ∈ Declaration ::= data F( #        »X : k) : Swhere
#                                                                       »
K :
(
#         »
A : T ` #   »Y :l F( #»X ) | #         »B : R
)
| codataG( #        »X : k) : Swhere
#                                                                        »
O :
(
#         »
A : T | G( #»X ) ` #   »Y :l #         »B : R
)
| data F(i : Ix, #        »X : k) : S by primitive recursion on i
where i = 0
#                                                                            »
K :
(
#         »
A : T ` #   »Y :l F(0, #»X ) | #         »B : R
)
where i = j + 1
#                                                                                      »
K :
(
#         »
A : T ` #   »Y :l F(j + 1, #»X ) | #         »B : R
)
| codataG(i : Ix, #        »X : k) : S by primitive recursion on i
where i = 0
#                                                                             »
O :
(
#         »
A : T | G(0, #»X ) ` #   »Y :l #         »B : R
)
where i = j + 1
#                                                                                       »
O :
(
#         »
A : T | G(j + 1, #»X ) ` #   »Y :l #         »B : R
)
| data F(i : Ord, #        »X : k) : S by noetherian recursion on i
where K :
(
#         »
A : T ` #   »Y :l F(i, #»X ) | #         »B : R
)
| codataG(i : Ord, #        »X : k) : S by noetherian recursion on i
where O :
(
#         »
A : T | G(i, #»X ) ` #   »Y :l #         »B : R
)
x, y, z ∈ Variable ::= . . . α, β, γ ∈ CoVariable ::= . . .
K ∈ Constructor ::= . . . O ∈ Observer ::= . . .
c ∈ Command ::= 〈v||e〉
v ∈ Term ::= x | µα.c | K #»A ( #»e , #»v ) | µ
(
O
#     »
X:k [ #»x , #»α ].c | . . .
)
| µ(j<M @x α.c) | µ(0:Ix @ α.c0 | i+1:Ix @x α.c1)
e ∈ CoTerm ::= α | µ˜x.c | µ˜
[
K
#     »
X:k( #»α , #»x ).c | . . .
]
| O #»A [ #»v , #»e ]
| µ˜[j<M @α x.c] | µ˜[0:Ix @ x.c0 | i+ 1:Ix @α x.c1]
FIGURE 6.8. The syntax of recursion in the higher-order µµ˜-calculus.
230
G ∈ GlobalEnv ::= #     »decl Θ ∈ TypeEnv ::= #        »X : k
Γ ∈ InputEnv ::= #       »x : A ∆ ∈ OutputEnv ::= #        »α : A
J,H ∈ Judgement ::=
(
Γ `ΘG ∆
)
seq | (Θ `G k : s) | (Θ `G A : k)
Sort rules:
Θ `G S :  Θ `G Ix :  Θ `G Ord : 
Θ `G k :  Θ `G l : 
Θ `G k → l : 
Θ `G M : Ord
Θ `G < M : 
Θ `G k : 
Θ `G k : 
Kind rules:
(X : k) /∈ Θ′
Θ, X : k,Θ′ `G X : k TV
#                      »Θ `G C : k (F( #        »X : k) : S) ∈ G
Θ `G F( #»C ) : S
FT
Θ, X : k `G A : l
Θ `G λX : k.A : k → l →I
2 Θ `G A : k → l Θ `G B : k Θ `G k → l : 
Θ `G A B : l →E
2
Θ `G 0 : Ix
Θ `G M : Ix
Θ `G M + 1 : Ix Θ `G ∞ : Ord Θ `G 0 : <∞
Θ `G M : <∞
Θ `G M + 1 : <∞
Θ `G N : Ord Θ `G M : < N
Θ `G M : < M + 1
Θ `G M : < M ′ Θ `G M ′ : < N
Θ `G M : < N
Θ `G N : Ord Θ `G M : < N
Θ `G M : Ord
Well-formed sequent rules:
( ` ) seq
G ` decl
(
`G
)
seq(
`G,decl
)
seq
Θ `G k : s
(
`ΘG
)
seq(
`Θ,X:kG
)
seq
Θ `G A : S
(
Γ `ΘG ∆
)
seq(
Γ, x : A `ΘG ∆
)
seq
Θ `G A : S
(
Γ `ΘG ∆
)
seq(
Γ `ΘG α : A,∆
)
seq
FIGURE 6.9. The kind system for size-indexed higher-order µµ˜ sequent calculus.
231
Before admitting a user-defined (co-)data type into the system, we need to check
that its declaration actually denotes a meaningful type via the judgement G ` decl, so
we must extend Figure 6.2 with additional rules for when recursively-defined (co-)data
types are well-formed. When checking for well-formedness of (co-)data types defined
by primitive induction on i : Ix, as with the general form
data F(i : Ix, #        »X : k) : S by primitive recursion on i
where i = 0 K1 :
#            »
B1 : T1 ` #       »d1:l1 F(0,
#»
X ) | #              »C1 : R1 . . .
where i = j + 1 K′1 :
#            »
B′1 : T ′1 ` #       »d′1:l′1 F(j + 1,
#»
X ) | #              »C ′1 : R′1 . . .
the i = 0 case proceeds as normal for a non-recursive data declaration without i in
scope, but i = j + 1 case we can allow for the extra rule stating that F(j, #»A) : S for
any #       »A : k.
#                                             »
Θ, j : Ix,Θ′ `G A : k
Θ, j : Ix,Θ′ `G F(j, #»A) : S
Intuitively, in the i = j + 1 case the sequents for the constructors may additionally
refer to smaller instances F(j, #»A) of the type being defined. This gives us the following
rule for primitive recursive data declarations:
G ` data F( #        »X : k) : Swhere
#                                                                         »
K :
(
#         »
A : T `
#   »
Y :l F( #»X ) | #         »B : R
)
#                                              »
Θ, j : Ix,Θ′ `G A : k
Θ, j : Ix,Θ′ `G F(j, #»A) : S....
G ` data F(j : Ix, #        »X : k) : Swhere
#                                                                                      »
K′ :
( #            »
A′ : T ′ `
#   »
Y :l F(j, #»X ) | #             »B′ : R′
)
G ` data F(i : Ix, #        »X : k) : S by primitive recursion on i
where i = 0
#                                                                              »
K :
(
#         »
A : T `
#   »
Y :l F(0, #»X ) | #         »B : R
)
where i = j + 1
#                                                                                                   »
K′ :
( #            »
A′ : T ′ `
#      »
Y ′:l′ F(j + 1, #»X ) | #             »B′ : R′
)
dataprim
And well-formedness of primitive recursive co-data types are the same. If the
declaration is well-formed, we have the typing rules for F similarly to a non-recursive
(co-)data type. The difference is that the (co-)constructors for the i = 0 and i = j + 1
case build a structure of type F(0, #»A) and F(M + 1, #»A) with M substituted for j,
respectively. Additionally, there are two case abstractions: one of type F(0, #»A) that
232
only handles constructors of the i = 0 case, and one of type F(M + 1, #»A) that only
handles constructors of the i = j + 1 case.
Similarly, when checking for well-formedness of (co-)data types F(i : Ord, #        »X : k)
defined by noetherian induction on i : Ord, we get to assume the type is defined for
smaller indices:
Θ, i : Ord,Θ′ `G M : < i #                                                 »Θ, i : Ord,Θ′ `G A : k
Θ, i : Ord,Θ′ `G F(M, #»A) : S
Intuitively, the sequents for the constructors may refer to F(M, #»A), so long as they
introduce quantified type variables #      »Y : l such that #        »X : k, #      »Y : l ` M < i. Other than
this, the typing rules for structures and case statements are exactly the same as for
non-recursive (co-)data types. This gives us the following rule for noetherian recursive
data declarations:
Θ, i : Ord,Θ′ `G M : < i #                                                 »Θ, i : Ord,Θ′ `G A : k
Θ, i : Ord,Θ′ `G F(M, #»A) : S....
G ` data F(i : Ord, #        »X : k) : Swhere
K :
(
#         »
A : T ` #   »Y :l F(i, #»X ) | #         »B : R
)
G ` data F(i : Ord, #        »X : k) : S by noetherian recursion on iwhere
K :
(
#         »
A : T ` #   »Y :l F(i, #»X ) | #         »B : R
)
datanoether
And well-formedness of noetherian recursive co-data types are the same. If the
declaration is well-formed, we have exactly the typing rules for F as non-recursive
(co-)data type.
Having concluded the static semantics of well-founded recursion in the sequent
calculus, we also need to explain the impact on the dynamic semantics. In particular,
there are the two dual pairs of special case abstractions introduced in Figure 6.8 that
allow for self-reference. To compute with recursion, we use the additional rules shown
in Figure 6.10. The recursive case abstractions for ∀<Ord and ∃<Ord are simplified
by “unrolling” their loop via the ν rules: the recursive abstraction reduces to a non-
recursive one by substituting itself inward—with a tighter upper bound—for the
recursive variable. Intuitively, this index-unaware loop unrolling is possible because
the actual chosen index doesn’t matter, the loop must do the same thing each time
233
(ν) µ([j<N @x α].c) ν µ([i<N @ α].c {i/j, µ([j<i@x α].c)/x})
(ν) µ˜[(j<N @α x).c] ν µ˜[(i<N @ x).c {i/j, µ˜[(j<i@α x).c]/α}]
(β∀IxS ) 〈V ||0 @ E〉 β∀IxS c0 {E/α}
(β∀IxS ) 〈V ||M+1 @ E〉 β∀IxS 〈µβ. 〈V ||M @ β〉||µ˜x.c1 {M/j,E/α}〉
whereV = µ([0:Ix @ α].c0 | [j+1:Ix @x β].c1)
(β∃IxS ) 〈0 @ V ||E〉 β∃IxS c0 {V/x}
(β∃IxS ) 〈M+1 @ V ||E〉 β∃IxS 〈µα.c1 {M/j, V/y}||µ˜y. 〈M @ y||E〉〉
whereE = µ˜[(0:Ix @ x).c0 | (j+1:Ix @α y).c1]
FIGURE 6.10. Rewriting theory for recursion in the parametric µµ˜-calculus.
around regardless of the value of the index. In contrast, the ∀Ix and ∃Ix recursors operate
strictly stepwise: they will always go from step 10 to 9 and so on to 0. The indices
used in the constructor really do matter, because they can influence the behavior
of the program. This fact forces us to “unroll” the loop while pattern-matching on
structures like (M+1)@E in tandem; unlike noetherian recursion the two steps cannot
be performed independently.
The “well-foundedness” of the recursion corresponds to strong normalization of
the rewriting theory: every possible reduction sequence is finite. Unfortunately, even
with the indexes controlling the use of self-reference, the naïve reduction theory allows
for infinite reduction sequences, albeit pointless ones. In particular, notice how the ν
rules from Figure 6.10 are self-replication: so long as the self-referential variable x (in
the case of ∀<Ord) or α (in the case of ∃<Ord) occur inside the command c, then the
result of a ν reduction contains yet another self-referential abstraction that can be
unrolled again. Therefore, in the interest of strong normalization, we must impose a
restriction on where reduction can occur to prevent these pointless infinite unrollings.
Intuitively, the restriction should follow the motto “don’t touch unreachable branches.”
In other words, we limit the compatible closure of the rewriting rules, which normally
allows reduction to occur in any context, to be careful about reduction inside a case
abstraction.
234
The simplest such restriction is to use the weak reduction theory, which allows
for reduction inside any context except inside of case abstractions. This corresponds
to weak reduction in the λ-calculus which does not allow for reduction inside of λ-
abstractions. While weak reduction suffices for strong normalization, it can be a bit
too draconian. Therefore, we consider something in between general reduction and
weak reduction that allows for some reduction inside case abstractions, but only those
that satisfy a reachability caveat about the kinds of quantified types introduced by its
patterns. This restriction prevents unnecessary infinite unrolling that would otherwise
occur in simple commands like
〈
length
∣∣∣∣∣∣Risei[α]〉. Intuitively, the reachability caveat
prevents reduction inside a case abstraction which introduces type variables that might
be impossible to instantiate, like i < 0 or j < i. The compatible reductions following
the reachability caveat are defined as:
c→ c′ b : #     »k →(< N) ∈ Θ =⇒ N =∞∨N = M + 1
µ
(
OΘ[ #»x , #»α ].c| . . .
)
→ µ
(
OΘ[ #»x , #»α ].c′| . . .
)
c→ c′ b : #     »k →(< N) ∈ Θ =⇒ N =∞∨N = M + 1
µ˜
[
KΘ( #»α , #»x ).c| . . .
]
→ µ˜
[
KΘ( #»α , #»x ).c′| . . .
]
We call the reduction theory which follows the above caveats the bounded reduction
theory, and note that every weak reduction step is a bounded reduction step, which
is in turn a general reduction step.
We also define the type erasure operation on programs, Erase(c), Erase(v), and
Erase(e) as shown in Figure 6.11, which removes all types from constructors and
patterns in c with an erasable kind, while leaving intact the unerasable Ix types. The
corresponding type-erased reduction theory is the same, except we can no longer rely
on the reachability caveat to maintain strong normalization while reducing inside
of case abstractions. Instead, in the type-erased calculus, we must assume the worst
and can only use the weak reduction theory if we want strong normalization to hold
without the help of the upper bounds on noetherian recursion. In this case every weak
reduction step of a type-erased command is justified by the same weak reduction step
in the original command, so that type-erasure cannot introduce infinite loops.
Since the reduction theory of the µµ˜-calculus is parameterized by a particular
(substitution) strategy S, the strong normalization of well-founded also follows by
235
Erase〈v||e〉 , 〈Erase(v)||Erase[e]〉
Erase(x) , x
Erase(µα.c) , µα.Erase(c)
Erase(K
#»
A ( #»e , #»v )) , KErase(
#»
A )(
#               »
Erase[e],
#                 »
Erase(v))
Erase
(
µ
( #                          »
O
#   »
Y :l [ #»x , #»α ].c
))
, µ
( #                                                           »
OErase(
#   »
Y :l)[ #»x , #»α ].Erase(c)
)
Erase[α] , α
Erase[µ˜x.c] , µ˜x.Erase(c)
Erase[O
#»
A [ #»v , #»e ]] , OErase(
#»
A )(
#                 »
Erase(v),
#               »
Erase[e])
Erase
[
µ˜
[ #                           »
K
#   »
Y :l( #»α , #»x ).c
]]
, µ˜
[ #                                                            »
KErase(
#   »
Y :l)( #»α , #»x ).Erase(c)
]
Erase() , 
Erase(A : k, #       »B : k) , Erase( #       »B : k) if k : 
Erase(A : k, #       »B : k) , A : k,Erase( #       »B : k) if k : 
FIGURE 6.11. Type erasure for the higher-order parametric µµ˜-calculus.
236
an argument that is parametric with respect to S. It doesn’t matter exactly which
(co-)values are substitutable, but only that S meets some general conditions. In
particular, strong normalization of the βςν reduction theory follows for any S which
is focalizing (as in Definition 5.2 discussed in Chapter V Section 5.3) and stable.
Definition 6.1 (Stable strategy). A substitution strategy S is stable if and only if
(co-)values are closed under reduction.
It then follows that the bounded reduction theory is strongly normalizing on
well-typed commands and (co-)terms for any stable and focalizing S, and therefore
so is the weak reduction theory on both higher-order and type-erased programs.
Theorem 6.1. Suppose that S is a stable and focalizing substitution strategy.
a) If c :
(
Γ `ΘG ∆
)
and
(
Γ `ΘG ∆
)
seq , then the bounded µS µ˜Sηµηµ˜βSςSν reduction
theory is strongly normalizing in c and the weak µS µ˜Sηµηµ˜βSςSν reduction theory
is strongly normalizing in Erase(c).
b) If Γ `ΘG v : A | ∆, Θ `G A : S, and
(
Γ `ΘG ∆
)
seq , then the bounded
µS µ˜Sηµηµ˜βSςSν reduction theory is strongly normalizing in v and the weak
µS µ˜Sηµηµ˜βSςSν reduction theory is strongly normalizing in Erase(v).
c) If Γ | e : A `ΘG ∆, Θ `G A : S, and
(
Γ `ΘG ∆
)
seq , then the bounded
µS µ˜Sηµηµ˜βSςSν reduction theory is strongly normalizing in e and the weak
µS µ˜Sηµηµ˜βSςSν reduction theory is strongly normalizing in Erase(e).
Note that the call-by-value (V), call-by-name (N ), call-by-need (LV) and its dual
(LN ) from Section 5.1 are all stable and focalizing, so that as a corollary, we achieve
strong normalization for these particular instances of the parametric µµ˜-calculus.
Furthermore, the non-deterministic strategy U is also stable and focalizing, which
gives another account of strong normalization for the symmetric λ-calculus (Lengrand
& Miquel, 2008) as a corollary and shows its extension to other programming features
including recursion. The details for this proving strong normalization are given in the
extended version of (Downen et al., 2015).
Encoding Recursive Programs via Structures
To see how to encode basic recursive definitions into the sequent calculus using
the primitive and noetherian recursion principles, we revisit the previous examples
237
from Section 6.1 to show how to encode basic recursive definitions into values in the
sequent calculus. For simplicity, we will stick to a single generic base kind S, although
each example can be adapted to use multiple, like the polarized mixture of V and N ,
as desired. We will see how the intuitive argument for termination can be represented
using the type indices for recursion in various ways. In essence, we demonstrate how
the parametric µµ˜S-calculus can be used as a core calculus and compilation target for
establishing well-foundedness of recursive programs.
Example 6.5. Recall the length function from Example 6.1, as written in sequent-style.
As we saw, we could internalize the definition for length into a recursively-defined
case abstraction that describes each possible behavior. Using the noetherian recursion
principle in the µµ˜-calculus, we can give a more precise and non-recursive definition
for length:
length : ∀X : S.∀i <∞. List(i,X)→ Nat(i)
length = µ([X @ i<∞@r Nil · γ].〈Z||γ〉
|[X @ i<∞@r Consj<i(x, xs) · γ].
〈
r
∣∣∣∣∣∣j @ xs · µ˜y. 〈Sj(y)∣∣∣∣∣∣γ〉〉)
The difference is that the polymorphic nature of the length function is made explicit in
system F style, and the recursion part of the function has been made internal through
the ∀<Ord co-data type. Going further, we may unravel the deep patterns into shallow
case analysis, giving annotations on the introduction of every co-variable:
length = µ([X @ α:∀i <∞. List(i,X)→ Nat(i)].
〈µ([i<∞@r:∀j<i. List(j,X)→Nat(j) β: List(i,X)→ Nat(i)].
〈µ([xs:List(i, a) · γ:Nat(i)].
〈xs||µ˜[Nil.〈Z||γ〉
|Consj<i(x:X, ys:List(j,X)).〈r||j @ ys · µ˜y:Nat(j).〈Sj(y)||γ〉〉])|
|β〉)|
|α〉)
Although quite verbose, this definition spells out all the information we need to verify
that length is well-typed and well-founded: no guessing required. Furthermore, this
core definition of length is entirely in terms of shallow case analysis, making reduction
straightforward to implement. Since the correctness of programs is ensured for this core
238
form, which can be elaborated from the deep pattern-matching definition mechanically,
we will favor the more concise pattern-matching forms for simplicity in the remaining
examples. End example 6.5.
Example 6.6. Recall the countUp function from Example 6.2. When we attempt to
encode this function into the parametric µµ˜-calculus, we run into a new problem: the
indices for the given number and the resulting stream do not line up since one grows
while the other shrinks. To get around this issue, we mask the index of the given natural
number using the dual form of noetherian recursion, and say that ANat = ∃i<∞.Nat(i).
We can then describe countUp as a function from ANat to a Stream(i,ANat) by
noetherian recursion on i:
countUp : ∀i <∞.ANat→ Stream(i,ANat)
countUp = µ([i<∞@r x · Head[α]].〈x||α〉
|[i<∞@r (j<i@ x) · Tailk<i[β]].〈r||k @ (j+1 @ Sj(x)) · β〉)
End example 6.6.
Example 6.7. The previous example shows how infinite streams may be modeled
by co-data. However, recall the other approach to infinite objects mentioned in
Example 6.4. Unfortunately, an infinitely constructed list like zeroes would be
impossible to define in terms of noetherian recursion: in order to use the recursive
argument, we need to come up with an index smaller than the one we are given, but
since lists are a data type their observations are inscrutable and we have no place to
look for one. As it turns out, though, primitive recursion is set up in such a way that
we can make headway. Defining infinite lists to be InfList(X) = ∀i : Ix. IxList(i,X), we
can encode zeroes as:
zeroes : InfList(Nat(0))
zeroes = µ([0:Ix @ α: IxList(0,Nat(0))].〈Nil||α〉
|[i+1:Ix @r: IxList(i,Nat(0)) α: IxList(i+ 1,Nat(0))].〈Cons(Z, r)||α〉)
Even more, we can define the concatenation of infinitely constructed lists in terms
of primitive recursion as well. We give a wrapper, cat, that matches the indices of
the incoming and outgoing list structure, and a worker, cat′, that performs the actual
239
recursion:
cat : ∀X : S. InfList(X)→ InfList(X)→ InfList(X)
cat = 〈µ([X:S @ xs · ys · i:Ix @ α].〈xs||i@ µ˜zs. 〈cat′||i@ zs · ys · α〉〉)
cat′ : ∀X : S.∀i : Ix. IxList(i,X)→ InfList(X)→ IxList(i,X)
cat′ = µ([X @ 0 @ Nil · ys · α].〈Nil||α〉
|[X @ i+1 @r Nil · ys · α].〈ys||i+1 @ α〉
|[X @ i+1 @r Cons(x, xs) · ys · α].〈Cons(x, µβ. 〈r||xs · ys · β〉)||α〉)
If we would like to stick with the “finite objects are data, infinite objects
are co-data” mantra, we can write a similar concatenation function over possibly
terminating streams:
codata StopStream(i <∞, X : S) : Swhere
Head : | StopStream(i,X) ` X : S
Tail : | StopStream(i,X) `j<i 1 : S, StopStream(j,X) : S
A StopStream(i,X) object is like a Stream(i,X) object except that asking for its Tail
might fail and return the unit value instead, so it represents an infinite or finite stream
of one or more values. This co-data type makes essential use of multiple conclusions,
which are only available in a language for classical logic. We can now write a general
recursive definition of concatenation in terms of the StopStream co-data type:
〈cat||xs · ys · Head[α]〉 = 〈xs||Head[α]〉
〈cat||xs · ys · Tail[δ, β]〉 = 〈cat||µγ. 〈xs||Tail[µ˜[().〈ys||β〉] , γ]〉 · ys · β〉
This function encodes into a similar pair of worker-wrapper values, where now a
possibly infinite list is represented as a terminating stream InfList(X) = ∀i <
∞. StopStream(i,X):
cat′ : ∀X : S.∀i <∞. StopStream(i,X)→ InfList(X)→ StopStream(i,X)
cat′ = µ([X @ i<∞@r xs · ys · Head[α]].〈xs||α〉
|[X @ i<∞@r xs · ys · Tailj<i[δ, β]].
240
〈r||j @ µγ.〈xs||Tailj[µ˜[().〈ys||j @ β〉] , γ]〉 · ys · β〉)
End example 6.7.
Remark 6.1. It is worth pointing out why our encoding for “infinite” data structures,
like zeroes, avoids the problem underlying the lack of subject reduction for co-induction
in Coq (Oury, 2008). Intuitively, the root of the problem is that Coq’s co-inductive
objects are non-extensional, since the interaction between case analysis and the co-fixed
point operator effectively allows these objects to notice if they are being discriminated
or not. In contrast, we take the extensional view that the presence or absence of case
analysis, in all of its various forms, is unobservable. To ensure strong normalization,
the basic observation is instead a specific message that advertises to the object exactly
how deep it would like to go, thus restoring extensionality and putting a limit on
unfolding. End remark 6.1.
Example 6.8. We now consider an example with a more complex recursive argument
that makes non-trivial use of lexicographic induction. The Ackermann function can
be written as:
〈ack||Z · y · α〉 = 〈S(y)||α〉
〈ack||S(x) · Z · α〉 = 〈ack||x · S(Z) · α〉
〈ack||S(x) · S(y) · α〉 = 〈ack||S(x) · y · µ˜z. 〈ack||x · z · α〉〉
The fact that this function terminates follows by lexicographic induction on both
arguments: to every recursive call of ack, either the first number decreases, or the
first number stays the same and the second number decreases. This argument can be
encoded into the basic noetherian recursion principle we already have by nesting it
twice:
ack : ∀i <∞.∀j <∞.Nat(i)→ Nat(j)→ ANat
ack = µ([i<∞@r1 j<∞@r2 Z · y · α].〈j+1 @ Sj(y)||α〉
|[i<∞@r1 j<∞@r2 Si
′<i(x) · Z · α].〈r1||i′ @ 1 @ x · S0(Z) · α〉
|[i<∞@r1 j<∞@r2 Si
′<i(x) · Sj′<j(y) · α].
〈r2||j′ @ Si′(x) · y · µ˜[(k<∞@ z).〈r1||i′ @ k @ x · z · α〉]〉)
241
Essentially, we get two recursive arguments from nesting ∀<Ord quantification over Ord
indexes:
r1 : ∀i′ < i.∀j <∞.Nat(i′)→ Nat(j)→ ANat
r2 : ∀j′ < j.Nat(i)→ Nat(j′)→ ANat
The first recursive path r1 can be taken whenever the first argument is smaller, in
which case the second argument is arbitrary. The second recursive path r2 can be taken
whenever the second argument is smaller and the first argument has the same index
(the i in the type of r2 matches the index of the original first argument to ack). Again,
we find that the dual form noetherian recursion, ∃<Ord, is useful for masking the index
of the output from ack. Furthermore, it is interesting to note that in the third case of
ack, we must explicitly destruct the ∃-packed result from ack before performing the
second recursive call. In practical terms, this forces the nested recursive call of the
Ackermann function to be strict, even in a lazy language. End example 6.8.
Each of these examples shows how we can phrase many different inductive and
co-inductive arguments in the form of structural recursion on combinations of data
and co-data types, where the forms of structural recursion provided by the calculus
are guaranteed to be well-founded by the strong normalization theorem (Theorem 6.1).
The next Chapter VII will present a parametric model for the parametric sequent
calculus which is suitable for proving strong normalization. The model in Chapter VII
is simplified from the one used in (Downen et al., 2015) which has both a positive
and negative consequence. An unfortunate cost of the simplification is that the model
in Chapter VII only applies to deterministic strategies like V, N , LV, LN , or the
polarized P which resolve the fundamental dilemma of classical computation, and do
not accomodate the type of non-determinism allowed by the U substitution strategy.
However, a technical benefit of the simplification is that the model in Chapter VII
straightforwardly scale to tackle other problems unrelated to strong normalization,
like the question of whether the typed βη theory of (co-)data developed previously
in Chapter V is “sound” with respect to the untyped βς theory in some appropriate
sense.
242
CHAPTER VII
Parametric Orthogonality Models
This chapter incorporates ideas from the proof technique for strong normalization
from the supporting material in the appendix of (Downen et al., 2015) which I developed
in collaboration with Philip Johnson-Freyd. Philip Johnson-Freyd developed a fixed-
point construction for modeling types which extends work by Barbanera & Berardi
(1994) and Lengrand & Miquel (2008) and is found in that appendix. I developed
the extension of work by Munch-Maccagnoni (2009) on focalization and classical
realizability that is presented in this chapter.
We have now studied many languages for the sequent calculus which include
features like simple, higher-order, and recursive types framed as data and co-data.
This study has included both the static (i.e. type systems) and dynamic (i.e. rewriting
rules) semantics for the features in question. The dynamic semantics in particular
was done in two different styles: one typed βη system for determining when programs
using (co-)data are equal, and one untyped βς system for running a program to find
the answer. As we saw in Section 5.3 of Chapter V the two different versions of the
dynamic semantics are related: βη subsumes βς for typed commands, terms, and
co-terms (Theorem 5.2). However, what about the other direction? The meaning of a
program should be defined by how it behaves when it is run. So if we are ultimately
using βς to evaluate (co-)data structures, then where does that leave the extensional
η law? Is ς truly the operational shadow of η that is seen during execution, or does η
mean something else?
This problem is confounded by the fact that purely syntactic methods are not
easy to apply to simplify η down to the operational semantics. That’s because the
η law is often just too strong and breaks any obvious form of confluence, defeating
any syntactic techniques based on it. Therefore, if we want to reconcile the strong η
rule with the operational semantics, we need to employ a different approach. For that
reason, we move to a semantic methodology that builds a model for the language, and
show that the syntactic typing rules adequately reflect the meaning of programs in
the semantic model. In turn, the model lets us capture more challenging properties of
243
the language, like the fact that the untyped operational semantics respects the typed
extensional η law. Taken together, these two parts bridge the static and dynamic
semantics of the language, and let us make bolder claims.
We now seek to build a model for the parametric µµ˜-calculus that lets us capture
the impact of types on the way that programs run. The model that we build is
parameterized by a number of different choices:
1. a safety condition constraining the run-time behavior of programs, so that well-
typed programs don’t “go wrong,”
2. a collection of (co-)data type declarations that define what types are interpreted
as semantic entities of the model, and
3. a collection of evaluation strategies that define how programs are executed.
The first parameter is not too uncommon—there are models of languages that abstract
out a notion of “safety,” which lets them speak about many different properties of
the language all at once. However, the second two parameters are novel—usually,
models consider a language with a fixed set of type formers and a fixed evaluation
mechanism—which is due to the open-ended presentation of (co-)data and evaluation
in Chapter V.
Our parameterized model will be based on an idea by Girard (1987) which goes
by many names: (bi-)orthogonality (Munch-Maccagnoni, 2009), classical realizability
Krivine (2009), and >>-closure Pitts (2000). The basic idea is to capture safety as a
binary predicate (‚) on two opposite entities: answers and questions. We can pose the
two dual problems: “which questions are safe to ask about these answers?” and “which
answers are safe to give to these questions?” This style of formulation matches perfectly
with the language of the sequent calculus. Terms are answers, co-terms are questions,
and commands are the action of asking a particular question about a particular
answer. The ‚ predicate represents a collection of commands that are safe to run.
The safety properties of types are then modeled by a collection of answers (i.e. terms)
and questions (i.e. co-terms) where every possible questions-answers combination is
safe (i.e. a command in ‚). The orthogonality approach gives a heavily test-based
view of language properties, where we use test suits of canonical observations to carve
out a space of valid programs that pass the test for each of those observations, or
alternative a specification of obviously correct results to carve out a space of valid
244
use-cases. The magic of this approach is that we quickly reach a fixed point: after
flipping back and forth between questions and answers with orthogonality twice, we
learn everything we possibly can.
This chapter covers the following topics:
– A general introduction to the idea of orthogonality in an abstract setting of spaces
and poles (Section 7.1) that explores the connection between orthogonality and
negation in intuitionistic logic.
– A representation of types based on orthogonality that are oriented around either
a positive or negative bias (Section 7.2), and a generic presentation of the closure
under expansion property, which is pervasive to semantic models of programming
languages, which is appropriate for many different applications.
– A binary model of the parametric µµ˜-calculus with higher-order and recursive
types that interprets sequents as statements about program behavior
(Section 7.3), which is parameterized by a choice of (co-)data types, evaluation
strategies, and safety condition.
– A proof of the fundamental adequacy lemmas (Section 7.4): the existence of a
syntactic derivation of a sequent implies the truth of the semantic interpretation
of that sequent.
– Several applications of the model, using adequacy to prove language-wide facts
about the parametric µµ˜-calculus (Section 7.5), including: logical consistency,
type safety, strong normalization, and soundness of the extensional βη theory
with respect to the operational βς semantics.
Poles, Spaces, and Orthogonality
We’re going to look at a semantic model of understanding computation in the
sequent calculus in terms of orthogonality. The model hinges on a representation
of commands of the sequent calculus that we deem to be valid execution states for
our purposes. In other words, we isolate some form of commands that can run. We
represent such a set of runnable commands abstractly as a computational pole which
is any set capable of running with a computation relation  . This way, the model is
extensible and does not pin down the precise nature of commands ahead of time.
245
Definition 7.1 (Computational poles). A computational pole P (or just pole for short)
is any set equipped with a relation  between elements of P.
In addition, the terms and co-terms of the sequent calculus are represented as an
interaction space with a positive and negative side oriented around some pole, which
likewise abstracts over their precise form.
Definition 7.2 (Interaction spaces). Given any computational pole P, a P-interaction
space A (or just P-space for short) is a pair of sets (A+,A−) equipped with a cut
operation 〈 || 〉 : A+ → A− → P (i.e. for all v ∈ A+ and e ∈ A−, 〈v||e〉 ∈ P).
We call P the pole of A, A+ the positive side of A, A− the negative side of A,
and use the shorthand v ∈ A to denote v ∈ A+ and e ∈ A to denote e ∈ A−.
Note that, while spaces and poles are quite abstract, we can always substitute the
more concrete syntactic notions of the language of the sequent calculus for better
intuition. For example, consider the (single-kinded) parametric µµ˜-calculus from
Chapter V. The set of untyped commands from Figure 5.7, Command, is a perfectly
fine computational pole, since the untyped operational reductions 7→µS µ˜SβSςS serve as
a computational relation  on commands. Likewise, the sets of untyped terms and
co-terms from Figure 5.7, (Term,CoTerm), is a perfectly fine Command-interaction
space, since we have the syntactic cut 〈v||e〉 formation of commands. This follows from
the intuition that commands are the primary computational entities of the sequent
calculus, whereas terms and co-terms provide a space for possible interactions (via
cuts) that lead to computations. We could just as well limit our attention to closed
programs (those without any free variables) as does Munch-Maccagnoni (2009) by
considering the set of closed, untyped commands as a pole along with the sets of
closed, untyped (co-)terms as an interaction space. Furthermore, if we are instead
interested in strong normalization, then we can start with the sets of all strongly
normalizing, untyped (co-)terms as an all-encompassing interaction space. Therefore,
the appropriate space required for modeling programs really depends on what sort of
outcome we are looking to achieve.
Since interaction spaces are just a pair of sets, we can compare when one
interaction space is contained inside another by considering the two pointwise: all
the positive and negative elements of the contained space must also be positive and
negative elements of the other space, respectively.
246
Definition 7.3 (Containment). Given two P-spaces A = (A+,A−) and B = (B+,B−),
we say that A is inside B (written A v B) if and only if A+ ⊆ B+ and A− ⊆ B−.
Equivalently, we say that B contains A (written B w A) if and only if B+ ⊇ A+ and
B− ⊇ A−.
Containment lets us specify when one interaction space is made up of parts of
another. For example, the set of terms and co-terms of type A→ B is inside the set of
untyped terms and co-terms, since every typed (co-)term is also an untyped (co-)term,
but not vice versa. This relationship is important for setting up a large, encompassing
space as an area of interest, wherein lie many smaller sub-spaces of interest.
We are now ready to tackle the most fundamental operation on interaction spaces:
orthogonality. Intuitively, orthogonality lets us pare down a large interaction space
which may include some undesired interactions by selecting only the parts which pass
some chosen criteria. To start with, we begin with some “plausibly well-behaved” but
overly-permissive computational pole P and P-interaction space A which includes every
interaction and computational behavior we might be interested in observing, but also
may allow for undesired interactions and behaviors. From there, we select a sub-pole
Q of P that serves as a safety condition and only includes the desired computational
behavior that we are interested in, along with a sub-P-space C contained in A which
serves as a specification laying out a set of criteria for evaluating the safety of elements
in A. Together, Q and C can be seen as a test suite for performing quality control and
determining which elements of A are acceptable: each positive element of A (intuitively,
untested programs) must pass the Q test when paired with every negative element of
C (intuitively, vetted use-cases), and dually each negative element of A (intuitively,
untested use-cases) must pass the Q test when paired with every positive element of
C (intuitively, vetted programs).1
Definition 7.4 (Orthogonality). Let P be a pole, Q ⊆ P be a sub-pole of P, A =
(A+,A−) be a P-space, and C = (C+,C−) v A be a P-space inside A. The positive
Q-orthogonal of C− inside A+, written C
QA+
− , consists of the positive elements of A
that form a Q element when cut with every negative element of C and is defined as:
C
QA+
− , {v ∈ A+ | ∀e ∈ C−, 〈v||e〉 ∈ Q}
1Traditionally, these operations are referred to as either C⊥ or C>, but here we use the generalized
notation CQA which lets us vary both the safety condition Q as well as the encompassing space A of
all potential programs in consideration.
247
Dually, the negative Q-orthogonal of C+ inside A−, written C
QA−
+ , consists of all
negative elements of A that form a Q element when cut with every positive element
of C and is defined as:
C
QA−
+ , {e ∈ A− | ∀v ∈ C+, 〈v||e〉 ∈ Q}
Taken together, the Q-orthogonal complement of C inside A, written CQA , is the Q-
space given by both the positive and negative Q-orthogonals of C inside A:
(C+,C−)Q(A+,A−) , (C
QA+
− ,C
QA−
+ )
Example 7.1. For example, suppose we are trying to reason about the execution of
well-typed programs. In other words, we want to model type safety of the operational
semantics. For an all-encompassing interaction space, we can consider all untyped
(co-)terms U = (Term,CoTerm), which is centered around the pole Command
containing all untyped commands. We would then need to design a pole that is a
subset of all untyped commands, ‚ ⊆ Command, representing type safety to contain
all valid states of type-safe execution that eventually leads to an acceptable result,
and excludes stuck states that are caused by type errors. For example, ‚ would
not include commands like 〈True||1 · []〉, 〈µ(x · α.c)||µ˜[(x, y) .c]〉, 〈ι1 (1)||µ˜[(x, y) .c]〉, and
〈µ(x · α.c)||pi1 [β]〉, since they are all stuck on irrecoverable miscommunications like
missing case analysis or data/co-data mismatches. Instead, ‚ would include valid
states where we may not have enough information to take the next step, but execution
could potentially continue if we learn more. These would be states where we are stuck
on a free variable, like 〈f ||1 · []〉 or 〈z||µ˜[(x, y) .c]〉, or on a free co-variable, like 〈True||α〉
or 〈µ(x · α.c)||β〉, and correspond to the “final commands” from Chapters III and IV.
To complete type safety pole‚, we should also ensure that a command that eventually
reaches a valid state in some number of steps is also valid. That is, if c′ is in ‚ and
c 7→ c′ then c is also in ‚. This is commonly referred to as “closure under expansion”
and is found in similar models of program evaluation.
Now, we can consider what the orthogonality operations mean for the above
description of our choice of the safety pole ‚ beginning with every (co-)term in the
all-encompassing Command-space U. For instance, the negative orthogonal {()}‚U−
selects every co-term that runs with the term (). This would include co-terms like
µ˜[() .c] and µ˜ .c for commands c that are in ‚, because they both reduce to the
248
safe state c in one step. However, {()}‚U− would not include co-terms like 1 · [] or
µ˜[ι1 (x) .c | ι2 (y) .c′] since the commands 〈()||1 · []〉 and 〈()||µ˜[ι1 (x) .c | ι2 (y) .c′]〉 are
stuck on an irrecoverable type error which is excluded from ‚. As another example,
{}‚U− would instead select every co-term, since the condition ∀v ∈ {}, 〈v||e〉 ∈ ‚ is
vacuously true for any e. Note that this fact about {}⊥ (or {}‚U+ ) holds regardless of
the definition of ‚, so that the ‚-orthogonal complement of the empty space (∅, ∅)
inside U always gives back all of U, for any ‚ and U. End example 7.1.
While we often have a particular purpose in mind (like the above example of
type-safe execution), we can temporarily ignore the particular details and just leave
the safety ‚ abstract for the time being. As we will see, the nature of orthogonality
itself already gives us some interesting structure independent of our choices, without
knowing anything about the particularities of terms and co-terms.
Orthogonality and intuitionistic negation
As an operation on interaction spaces, orthogonality has some inherently negating
behavior: it selects a collection of positive elements (terms) with respect to a collection
of negative elements (co-terms), and vice versa. We will see that this simple intuition
reveals a fundamental connection between the orthogonality of interaction spaces
and the negation connective in intuitionistic logic. As it turns out, basic properties
of intuitionistic negation, both from a logical and computational perspective, are
shared with the orthogonality operation. Furthermore, classical but non-intuitionistic
properties of negation are invalid for orthogonality.
Recall from Chapter II that in the intuitionistic logic of natural deduction,
negation can be encoded in terms of implication and falsehood: ¬A = A→ ⊥. This
encoding of negation is summarized by the following two derivations for ¬ introduction
and elimination that are derived from the rules for ⊃ and ⊥:
A
x
....⊥
¬A ¬Ix
¬A A
⊥ ¬E
Using the above derived rules for negation, we can give some schematic proofs
involving negation and implication that hold in intuitionistic logic. For example, we
249
have the contrapositive of an implication, (A ⊃ B) ⊃ (¬B ⊃ ¬A),
¬B k
A ⊃ B f A x
B
⊃E
⊥ ¬E
¬A ¬Ix
(¬B) ⊃ (¬A) ⊃Ik
(A ⊃ B) ⊃ ((¬B) ⊃ (¬A)) ⊃If
double negation introduction, A ⊃ (¬¬A),
¬A k A x
⊥ ¬E
¬¬A ¬Ik
A ⊃ (¬¬A) ⊃Ix
and triple negation elimination, (¬¬¬A) ⊃ (¬A),
¬¬¬A w
¬A k A x
⊥ ¬E
¬¬A ¬Ik
⊥ ¬E
¬A ¬Ix
(¬¬¬A) ⊃ (¬A) ⊃Ih
Furthermore, each of these proofs can also be written as a corresponding term in the
simply-typed λ-calculus as follows:
Contra : (A→ B)→ (¬B → ¬A)
Contra = λf : A→ B.λk : ¬B.λx : A.k (f x)
DNI : A→ ¬¬A
DNI = λx : A.λk : ¬A.k x
TNE : ¬¬¬A→ ¬A
TNE = λh : ¬¬¬A.λx : A.h (λk : ¬A.k x)
Remark 7.1. The three terms Contra, DNI , and TNE have an important status for
pure functional programming in languages like Haskell. In particular, they give us
a definition of the continuation monad over the return type ⊥, Cont A = ¬¬A.
250
Double negation introduction,DNI , is the return (a.k.a. unit) function. Triple negation
elimination, TNE , is the join function from Cont (Cont A) → Cont A with a more
general type. And Contra is the contravariant mapping function for the underlying
¬ functor. We can get the Functor mapping function fmap by Contra-mapping a
function twice, fmap f = Contra (Contra f). End remark 7.1.
As it turns out, these three properties of contrapositive mapping, double negation
introduction, and triple negation elimination correspond to similar properties of
orthogonality. In particular, the orthogonal complement of an interaction space takes
on the role of negation, and the containment relation takes on the role of implication.
With this correspondence in mind, we get the following three well-known intuitionistic
orthogonality properties:
Property 7.1 (Intuitionistic orthogonality). For any two poles Q ⊆ P and P-spaces
A, B, and C,
a) contrapositive: A v B implies BQC v AQC ,
b) double orthogonal introduction: A v C implies A v AQCQC , and
c) triple orthogonal elimination: A v C implies AQCQCQC = AQC .
Proof. a) Suppose that v ∈ BQC , so that by the definition of orthogonality, we know
that v ∈ C and 〈v||e〉 ∈ Q for all ∈ B. But since A is contained in B, it follows
that 〈v||e〉 ∈ Q for all ∈ A, meaning that v ∈ AQC as well. Dually, e ∈ BQC
implies that e ∈ AQC by the definition of orthogonality and the fact that A is
contained in B. Therefore, BQC v AQC follows from A v B.
b) Suppose that v ∈ A and e ∈ AQC . Since A v C it must be that v ∈ C, and by
the definition of orthogonality, it must also be that 〈v||e〉 ∈ Q. But this also
means that v ∈ AQCQC by the definition of orthogonality as well. Dually, given
any e ∈ A, we also have that e ∈ C and 〈v||e〉 ∈ Q for all v ∈ AQC , meaning that
e ∈ AQCQC as well. Therefore, A v AQCQC follows from A v C.
c) First, we get the fact thatAQC v AQCQCQC as an immediate consequence of double
orthogonal introduction (Property 7.1 (b)) because AQC v C by definition of
orthogonality. Second, we get A v AQCQC from double orthogonal introduction
(Property 7.1 (b)) again, from which AQCQCQC v AQC follows by contrapositive
(Property 7.1 (a)). Therefore, AQCQCQC = AQC follows from A v C.
251
It is important to point out that when demonstrating the above three properties,
we never needed to know anything specific about the makeup of the computational
poles Q and P or the interaction spaces A, B, or B. No matter what choices me make,
we get to use these intuitionistic reasoning principles when working with orthogonality.
These are well-known properties of orthogonality (also noted by Munch-Maccagnoni
(2009), for example).
Example 7.2. Recall from Remark 3.5 that one difference between negation in
intuitionistic logic versus classical logic is that double negation elimination, i.e.
(¬¬A)→ A, is not assumed to hold generically for any A in the intuitionistic setting.
To see why “double orthogonal elimination”, i.e. AQCQC v A, does not hold in general,
let’s return to our example of type-safe execution from Example 7.1. For the moment,
let’s assume a call-by-value V evaluation strategy, so that every co-term is a co-value
and thus 〈µα.c||e〉 7→µV c {e/α} for any e. Recall that the orthogonal of the empty
interaction space, (∅, ∅)‚U , is U. Now, suppose that the command c is in the type-safe
pole ‚. Notice that 〈µ .c||e〉 7→ c, so that for an arbitrary co-term e, the command
〈µ .c||e〉 reduces in one step to a command in ‚. This means that the term µ .c must
be in the double-orthogonal of the empty space, µ .c ∈ (∅, ∅)‚U‚U . But this also means
that we’ve run into a situation where the double orthogonal of an interaction space
(namely the empty one) includes elements that weren’t originally there. Therefore, in
general we can’t say that the double orthogonal gives back the same space that we
started with.
Since taking the double orthogonal of a set of an interaction space can introduce
new elements, we can view it as a closure operation. Furthermore, since taking the
orthogonal thrice gives the same thing as just once (Property 7.1 (c)), flipping back
and forth more than twice in this way is redundant: AQCQCQC = AQC and AQCQCQCQC =
AQCQC , so only A, AQC and AQCQC are interesting. In this regard, AQCQC can be seen
as the completion of A with respect to the possible candidates in C and the criteria
imposed by the pole Q. End example 7.2.
By adding more connectives into the mix, like conjunction (∧) and disjunction (∨)
from Chapter II, we get additional properties of intuitionistic negation. In particular,
we have the de Morgan law—used as the backbone of logical duality in Chapter III—
that allows us to distribute negation over conjunction in both directions: (¬(A∨B))↔
((¬A) ∧ (¬B)). This law is provable with the rules of NJ natural deduction as two
252
implications:
¬(A ∨B) k
A
x
A ∨B ∨I1
⊥ ¬E
¬A ¬I
x
¬(A ∨B) k
B
y
A ∨B ∨I2
⊥ ¬E
¬B ¬Iy
(¬A) ∧ (¬B) ∧I
(¬(A ∨B)) ⊃ ((¬A) ∧ (¬B)) ⊃Ik
A ∨B x
(¬A) ∧ (¬B) k
¬A ∧E1 A y
⊥ ¬E
⊥
(¬A) ∧ (¬B) k
¬B ∧E2 B z
⊥ ¬E
⊥
⊥ ∨Ey,z
⊥ ¬Ix
((¬A) ∧ (¬B)) ⊃ (¬(A ∨B)) ⊃Ik
We can also write down the terms corresponding to the above proofs in the simply-
typed λ-calculus from Section 2.2, expressing the above de Morgan law as two
functions:
PairNeg : ((¬A)× (¬B))→ (¬(A+B))
PairNeg = λk.λx. casexof ι1 (y)⇒ pi1(k) y | ι2 (z)⇒ pi2(k) z
NegSum : (¬(A+B))→ ((¬A)× (¬B))
NegSum = λk.((λx.k (ι1 (x))), (λy.k (ι2 (y))))
There is another de Morgan law used for logical duality in Section 3.1 for
distributing a negation over a conjunction in both directions: (¬(A ∧ B)) ↔
((¬A)∨ (¬B)). However, in an intuitionistic setting, this law does not hold both ways.
In particular, we can only assume that the right-to-left direction of this law holds in
general: (¬(A ∧ B)) ← ((¬A) ∨ (¬B)). This implication is provable in intuitionistic
253
natural deduction:
(¬A) ∨ (¬B) k
¬A q
A ∧B x
A
∧E1
⊥ ¬E
¬B r
A ∧B x
B
∧E2
⊥ ¬E
⊥
⊥ ∨Eq,r
¬(A ∧B) ⊃Ix
((¬A) ∨ (¬B)) ⊃ (¬(A ∧B)) ⊃Ik
And we also have simplty-typed λ-calculus function that corresponds to the one
direction of the law.
SumNeg : ((¬A) + (¬B))→ (¬(A×B))
SumNeg = λk.λx. case k of ι1 (q)⇒ q (pi1(x)) | ι2 (r)⇒ r (pi2(x))
We are unable to write the inverse function, NegPair : (¬(A×B))→ ((¬A) + (¬B)),
since we don’t know up front which of ¬A or ¬B to return in general.
Just like before, these three de Morgan laws correspond to similar properties of
orthogonality. The following union and intersection operations on interaction spaces
take on the roles of conjunction and disjunction, and they enjoy similar introduction
and elimination properties as in the natural deduction logic of NJ.
Definition 7.5 (Union and intersection). Given two P-spaces A = (A+,A−) and
B = (B+,B−), the union of A and B, written A unionsq B, and the intersection of A and B,
written A u B is defined as:
(A+,A−) unionsq (B+,B−) , (A+ ∪ B+,A− ∪ B−)
(A+,A−) u (B+,B−) , (A+ ∩ B+,A− ∩ B−)
Property 7.2 (Union/intersection introduction/elimination). For any P-spaces A, B,
and C,
a) A v A unionsq B and B v A unionsq B,
b) A unionsq B v C if and only if A v C and B v C,
c) A u B v A and A u B v B, and
d) C v A u B if and only if C v A and C v B.
254
Proof. Each property follows from the definition of unionsq and u in terms of the underlying
set union and intersection operations.
Furthermore, when coupled with orthogonality, union and intersection give the
following intuitionistic de Morgan orthogonality properties.
Property 7.3 (Spacial de Morgan laws). For any poles Q ⊆ P and P-spaces A, B,
and C,
a) (A unionsq B)QC = AQC u BQC , and
b) (A u B)QC w AQC unionsq BQC .
Proof. a) First, we show that (AunionsqB)QC v AQC uBQC . Suppose that v ∈ (AunionsqB)QC ,
so that by definition v ∈ C and 〈v||e〉 ∈ Q for all e ∈ AunionsqB. By the definition of
unionsq, it follows that 〈v||e〉 ∈ Q for all e ∈ A and 〈v||e〉 ∈ Q for all e ∈ B separately,
so we have that both v ∈ AQC and v ∈ BQC . Thus, v ∈ AQC u BQC . Dually, every
e ∈ (A unionsq B)QC also leads to e ∈ AQC u BQC for similar reasons.
Second, we show that (A unionsq B)QC w AQC u BQC . Suppose that v ∈ AQC u BQC . By
the definition of u, it follows that both v ∈ AQC and v ∈ BQC , meaning that
v ∈ C, 〈v||e〉 ∈ Q for all e ∈ A, and 〈v||e〉 ∈ Q for all e ∈ B. Thus, 〈v||e〉 ∈ Q
for all e ∈ A unionsq B, so v ∈ (A unionsq B)QC . Dually, every e ∈ AQC u BQC also leads to
e ∈ (A unionsq B)QC for similar reasons.
b) Suppose that v ∈ AQC unionsqBQC , so that by the definition of unionsq and orthogonality we
know v ∈ C and either 〈v||e〉 ∈ Q for all e ∈ A or 〈v||e〉 ∈ Q for all e ∈ B. For any
e ∈ AuB, it follows that e ∈ A and e ∈ B as well, so it must be that 〈v||e〉 ∈ Q.
Thus, v ∈ (A u B)QC . Dually, every e ∈ AQC unionsq BQC also leads to e ∈ (A u B)QC
for similar reasons.
Again, take notice that the de Morgan properties of orthogonality don’t depend
on what particular elements inhabit the computational poles or interaction spaces.
They are general laws that come out from the definition of orthogonality and the other
basic operations and relations on interaction spaces.
Example 7.3. To see why the de Morgan Property 7.3 (b) does not go both ways like
Property 7.3 (a), let’s return again to type-safe execution from Example 7.1. Now,
suppose we begin with two sets of terms, A+ = {True, ()} and B+ = {False, ()},
so that their intersection is A+ ∩ B+ = {()}. The negative ‚-orthogonal of this
255
intersection in U is (A+ ∩ B+)‚U− = {()}‚U− = {e | 〈()||e〉 ∈ ‚}. By the definition
of ‚ from Example 7.1, given a command c in ‚ we have that the co-term µ˜[() .c]
runs with () because 〈()||µ˜[() .c]〉 7→ c, so µ˜[() .c] is in (A+ ∩ B+)‚U− . However, both
〈True||µ˜[() .c]〉 and 〈False||µ˜[() .c]〉 are not in ‚ since they are stuck on a type error.
This means that the co-term µ˜[() .c] is in neither A‚U−+ nor A
‚U−
+ , since µ˜[() .c] fails
to be type safe when run with some of the terms in A+ and B+. Therefore, we’ve
stumbled onto a situation where a co-term is in the space orthogonal to an intersection,
but does not come from the union of the separate orthogonal spaces. In other words,
taking the orthogonal of an intersection between two sets of terms permits more
possible co-terms than just forming the orthogonal sets in isolation and putting them
together. End example 7.3.
Computation, Worlds, and Types
With the basic building blocks of computational poles, interaction spaces,
and orthogonality at hand, we can now set the stage for constructing models of
programming languages using these concepts. In particular, we will be modeling some
safety condition of the language represented by a pole ‚, which in the context of
the sequent calculus will contain commands exhibiting the desired property. While
we have a lot of leeway in choosing ‚, it cannot be arbitrary, however. Because the
purpose of the programming language is to compute, a safety condition must respect
computation. For this reason, the safety condition is made up of three poles:
– The “top” pole ‚, which is unsafe, and corresponds to everything that can be
written,
– The “bottom” pole ‚, which is the safe subset of ‚, and corresponds to only
those programs which pass our criteria, and
– The “middle” pole ‚‚, which is partially safe and lies between ‚and ‚.
The purpose of ‚‚ is to act as a waypoint toward safety: the elements of ‚‚ are not
quite safe yet, however, we have the assurance that all the elements of ‚‚ that step into
‚ are safe. This is commonly known as closure under expansion, and in our notation
is written as: for all c ∈ ‚‚, if c c′ ∈ ‚ then c ∈ ‚.
Definition 7.6 (Safety condition). A safety condition S is a triple of poles ( ‚, ‚‚,‚)
such that ‚ ⊆ ‚‚ ⊆ ‚and the following condition holds:
256
– Closure under expansion: for all c ∈ ‚‚, if c c′ ∈ ‚ then c ∈ ‚.
We call ‚the unsafe pole, ‚‚ the demisafe pole, and ‚ the safe pole of the safety
condition.
The fact that we are allowed to choose a ‚‚ which is smaller than ‚for the
purposes of constraining closure under expansion is important for some applications,
but not others. For instance, in the following Section 7.5 we can just take ‚‚ = ‚
for the goal of proving logical consistency and type safety. However, to show strong
normalization it is crucial that we constrain ‚‚ to include only the commands which
may not be strongly normalizing themselves, but are formed by cutting together
strongly normalizing (co-)terms. This restriction on ‚‚ gives us a key foothold for
demonstrating the closure under expansion which would be impossible otherwise.
In order to model the semantic meaning of types in terms of a chosen safety
condition, we must delineate the world in which they reside. A world containing
semantic types is represented as an interaction space which holds every possible
element of all the types we’re interested in. Therefore, it may allow for undesired
interactions by mixing elements that belong to different types, but each inhabitant
of the world should act as a well-behaved member of some potential type. To phrase
this requirement, worlds are made up of three interaction spaces that represent the
impact of substitution strategy in the programming language:
– The “untyped” interaction space U corresponds to everything that can be written
without any restrictions,
– The “value” interaction space V is contained within U and corresponds to the
values and co-values of a substitution strategy, and
– The “well-behaved” interaction space W is contained within U and corresponds
to the elements which pass the minimum criteria needed to be considered elegible
for belonging to a type.
The definition of a world in this sense makes heavy use of (co-)value restriction,
since in many places computation can only proceed with (co-)values and not general
(co-)terms. For this purpose, we need to know what it means to restrict one interaction
space by another.
257
Definition 7.7 (Restriction). Given two P-spaces A and B, the B-restriction of A,
written A|B, is their intersection A u B. Likewise, we write A|B = A ∩ B for the
B-restriction of A when A and B are sets.
Note that in the notation of restriction A|‚WV = (A u V)‚W , whereas A‚W
∣∣∣
V
=
A‚W u V. The semantic notion of worlds can now be defined in terms of two criteria:
saturation which forces a sufficient amount of elements from U into W by stating
that the portion of U which steps to a safe command with all “benign” elements is
well-behaved, and generation which states that any interaction space contained in W
has only safe interaction if all the interactions with (co-)values are safe.
Definition 7.8 (Worlds). Given a safety condition S = ( ‚, ‚‚,‚), an S-world is a
triple T = (U,V,W) where both U and V are ‚-spaces andW is a ‚‚-space such that
V v U, W v U, and the following conditions hold:
– Saturation: for all v ∈ U, if 〈v||E〉 c for some c ∈ ‚ and all E ∈ W|‚WV
∣∣∣
V
then
v ∈W. Dually, for all e ∈ U, if 〈V ||e〉 c for some c ∈ ‚ and all V ∈ W|‚WV
∣∣∣
V
then e ∈W. In other words, W|‚WV
∣∣∣‚1U
V
vW where ‚1 = {c ∈ ‚| c c′ ∈ ‚}.
– Generation: for all ‚‚-spaces A vW, if A = A|‚WV then A = A‚W .
We call U the untyped ‚-space, V the value ‚-space, andW the well-behaved ‚‚-space.
As shorthand, for any ‚-space A v U, we write V ∈ A to denote V ∈ A|V and E ∈ A
to denote E ∈ A|V.
Note that the generation property is a rephrasing of Munch-Maccagnoni’s (2009)
generation lemma, where we take it as an assumption instead of proving that it holds
for a particular setting. In a particular world T = (U,V,W), we can say that a semantic
type is any space A contained in the well-behaved W where every positive element of
A is safe when paired with every negative value element of A, and vice versa.
Definition 7.9 (Semantic types). Given a safety condition S = ( ‚, ‚‚,‚) and S-
world T = (U,V,W), a T-type is any ‚-space A such that A = A|‚WV . We denote the
set of all T-types as SemType(T).
Note that by the definition of semantic types, each one must contain some
minimum amount of “benign” elements that belong to every type that lives in its
world. These benign elements are safe when paired with any well-behaved member of
the world, and so they never cause any problems.
258
Lemma 7.1 (Type minimum). For any S-world T = ( ‚, ‚‚,‚) and T-type A,W‚W v
W|‚WV v A.
Proof. First, note that W|V vW. Also, because A is a T-type, we know thatA = A|‚WV ,
so that by definition of orthogonality, A vW and thus A|V v W|V vW. Finally, by
contrapositive (Property 7.1 (a)) we get W‚W v W|‚WV v A|‚WV = A.
It sometimes happens that W‚W is empty, and if that’s the case then there is
not necessarily anything in the positive or negative sides of a semantic type. However,
in some applications, like strong normalization, we find that things like (co-)variables
are benign, and can be safely assumed to inhabit every possible type. The existence
of this minimum of every type lets us prove the type expansion property, which says
that any “untyped” term which steps to a safe place for all co-values of a type must
belong to that type, and vice versa.
Lemma 7.2 (Type expansion). For any safety condition S = ( ‚, ‚‚,‚), S-world
T = (U,V,W), and T-type A,
1. v ∈ A if 〈v||E〉 c ∈ ‚ for all E ∈ A|V, and
2. e ∈ A if 〈V ||e〉 c ∈ ‚ for all V ∈ A|V.
Proof. 1. Note that W|‚WV v A by Lemma 7.1, and so W|‚WV
∣∣∣
V
v A|V by
monotonicity (Property 7.4 (a)), so that v ∈ W by saturation of T because
〈v||E〉 ‚ for all E ∈ W|‚WV
∣∣∣
V
v A|V. Thus, for all E ∈ A|V, 〈v||E〉 ∈
‚‚ since
W is a ‚‚-space, and so 〈v||E〉 ∈ ‚ by closure under expansion of S. Therefore,
v ∈ A|‚WV = A because A is a T-type.
2. Analogous to part 1 by duality.
Note the two steps of this proof, which forms a general procedure of justifying
the presence of elements in a type. First, we must justify that we are dealing with
something generally well-behaved that exists in the ‚‚-space W. Only then can we
use closure under expansion of the safety condition to show that it is also safe with
every (co-)value of the type.
259
The positive construction of types
We now consider two dual methods of constructing particular types inside of
a world. The first is the positive method, which builds a type around a chosen set
of values. In particular, given some world T = (U,V,W), where V = (V+,V−) and
W = (W+,W−), and a chosen set of well-behaved value elements Acons ⊆ W+|V+
serving as the primitive constructions, we have the positive construction of the T-type
PosT(Acons), defined as follows:
PosT(Acons) ,
(
Acons,A
‚W−
cons
)∣∣∣∣‚W
V
∣∣∣∣∣
‚W
V
To show that PosT(Acons) is actually a T-type, we need to demonstrate that
PosT(Acons) = PosT(Acons)|‚WV . To do so, we rely on some facts about restriction, and
how they generalize the basic properties of the orthogonality operation (Property 7.1).
Property 7.4. For all P-spaces A, B, and C,
a) restriction monotonicity: A v B implies A|C v B|C,
b) restriction containment: A|C v A, and
c) restriction idempotency: A|C|C = A|C.
Proof. Each property follows from the definition of restriction in terms of intersection,
and the introduction and elimination facts in Property 7.2. In particular, A @ B
implies A|C = A u C v B u C = A|C, A|C = A u C v A, and A|C|C = A u C u C v
A u C = A|C.
Property 7.5. For any two poles ⊆O P and P-spaces A, B, C, and D
a) restricted orthogonal: AOC
∣∣∣
D
= AOC|D ,
b) restricted contrapositive: A v B implies B|OCD
∣∣∣
D
v A|OCD
∣∣∣
D
,
c) restricted double orthogonal introduction: A v C implies A|D v A|OCD
∣∣∣OC
D
∣∣∣∣
D
, and
d) restricted triple orthogonal elimination: A v C implies A|OCD
∣∣∣OC
D
∣∣∣∣OC
D
∣∣∣∣∣
D
= A|OCD
∣∣∣
D
.
Proof. The restricted orthogonal Property 7.5 (a) follows from the definitions of the
restriction and orthogonality operations on interaction spaces. In particular, supposing
260
A = (A+,A−), C = (C+,C−) and D = (D+,D−), we have
AOC
∣∣∣
D
= AOC u D = (A+,A−)O(C+,C−) u (D+,D−)
=
(
A
OC+
− ,A
OC−
+
)
u (D+,D−) =
(
A
OC+
− ∩ D+,A
OC−
+ ∩ D−
)
= ({v ∈ C+ | ∀e ∈ A− 〈v||e〉 ∈ O} ∩ D+, {e ∈ C− | ∀v ∈ A+ 〈v||e〉 ∈ O} ∩ D−)
= ({v ∈ C+ ∩ D+ | ∀e ∈ A− 〈v||e〉 ∈ O}, {e ∈ C− ∩ D− | ∀v ∈ A+ 〈v||e〉 ∈ O})
=
(
A
OC+∩D+
− ,A
OC−∩D−
+
)
= (A−,A+)
O(C+∩D+,C−∩D−) = AOCuD = AOC|D
The other properties follow from Property 7.5 (a), the intuitionistic facts
of orthogonality in Property 7.1, as well as the monotonicity of restriction
(Property 7.4 (a)). For the restricted contrapositive, A v B implies A|D v B|D by
the monotonicity Property 7.4 (a) which implies B|OCD v A|OCD by the contrapositive
Property 7.1 (a) which implies B|OCD
∣∣∣
D
v A|OCD
∣∣∣
D
by the monotonicity of restriction
again. For the restricted double orthogonal introduction, A v C implies A|D v C|D
by monotonicity (Property 7.4 (a)) which implies A|D v A|
OC|DOC|D
D = A|OCD
∣∣∣OC
D
∣∣∣∣
D
by ordinary double orthogonal introduction (Property 7.1 (b)) and Property 7.5 (a).
For the restricted triple orthogonal elimination, again A v C implies A|D v C|D
by monotonicity (Property 7.4 (a)) which implies A|OCD
∣∣∣OC
D
∣∣∣∣OC
D
∣∣∣∣∣
D
= A|OC|DOC|DOC|DD =
A|OC|DD = A|OCD
∣∣∣
D
by ordinary triple orthogonal elimination (Property 7.1 (c)) and
Property 7.5 (a).
Lemma 7.3 (Positive semantic types). For any safety condition S, W-world T =
(U,V,W), and Acons ⊆ W+|V+, it must be that PosT(Acons) is a T-type.
Proof.
PosT(Acons)|‚WV
=
(
Acons,A
‚W−
cons
)∣∣∣∣‚W
V
∣∣∣∣∣
‚W
V
∣∣∣∣∣∣
‚W
V
(Definition)
=
(
Acons, Acons|‚W−V+
)∣∣∣∣‚W
V
∣∣∣∣∣
‚W
V
∣∣∣∣∣∣
‚W
V
(Property 7.4 (c))
=
 Acons|‚W−V+
∣∣∣∣‚W+
V−
∣∣∣∣∣
‚W−
V+
∣∣∣∣∣∣
‚W+
V−
, Acons|‚W−V+
∣∣∣∣‚W+
V−
∣∣∣∣∣
‚W−
V+
 (Definition)
261
=
Acons|‚W−V+
∣∣∣∣‚W+
V−
, Acons|‚W−V+
∣∣∣∣‚W+
V−
∣∣∣∣∣
‚W−
V+
 (Property 7.5 (d))
=
(
Acons, Acons|‚W−V+
)∣∣∣∣‚W
V
∣∣∣∣∣
‚W
V
(Definition)
=
(
Acons,A
‚W−
cons
)∣∣∣∣‚W
V
∣∣∣∣∣
‚W
V
= PosT(Acons) (Property 7.4 (c))
The Pos construction of types, which involves three applications of orthogonality
interspersed with value restrictions, is more complex than the traditional bi-orthogonal
construction of types which needs only two applications of orthogonality. This is
because we do not assume anything about the chosen substitution strategy, so that
there may be both non-values and non-(co-)values making the value restriction
necessary at every step and inducing an extra application of orthogonality to ensure
that the restricted version triple orthogonal elimination principle (used for showing
that Pos(Acons) is indeed a semantic type) applies. However, if we assume that the
negative side of V is universal (corresponding to the call-by-value V substitution
strategy where all co-terms are co-values), then we can greatly simplify the positive
construction of types to be the more traditional bi-orthogonal definition.
Lemma 7.4 (Positive bi-orthogonality). For any safety condition S, W-world T =
(U,V,W), and Acons ⊆ W+|V+, if V− = U− then PosT(Acons) = (Acons,A
‚W−
cons )‚W =
(A‚W+‚W−cons ,A
‚W−
cons ).
Proof. Note that because PosT(Acons) must be a T-type (Lemma 7.3), we know that
PosT(Acons) = PosT(Acons)‚W by generation of T (Definition 7.8), so we have the
following equality:
PosT(Acons)
=
(
Acons,A
‚W−
cons
)∣∣∣∣‚W
V
∣∣∣∣∣
‚W
V
(Definition)
=
(
Acons|V+ , A
‚W−
cons
∣∣∣∣
V−
)‚W ∣∣∣∣∣∣
‚W
V
(Definition)
=
(
Acons, A
‚W−
cons
∣∣∣∣
V−
)‚W ∣∣∣∣∣∣
‚W
V
(Acons ⊆ V+)
262
=
(
Acons,A
‚W−
cons
)‚W ∣∣∣∣∣
‚W
V
(V− = U−)
=
(
A
‚W+‚W−
cons
∣∣∣∣
V+
, A
‚W−
cons
∣∣∣∣
V−
)‚W
(Definition)
=
(
A
‚W+‚W−
cons
∣∣∣∣
V+
,A
‚W−
cons
)‚W
(V− = U−)
=
(
A
‚W+‚W−
cons , A
‚W+‚W−
cons
∣∣∣∣‚W−
V+
)
(Definition)
=
(
A
‚W+‚W−
cons , A
‚W+‚W−
cons
∣∣∣∣‚W−
V+
)‚W
(PosT(Acons) = PosT(Acons)‚W)
=
(
A
‚W+‚W−
cons
∣∣∣∣‚W+‚W−
V+
,A
‚W−‚W+‚W−
cons
)
(Definition)
=
(
A
‚W+‚W−
cons ,A
‚W−‚W+‚W−
cons
)
(Previous)
=
(
A
‚W+‚W−
cons ,A
‚W−
cons
)
(Property 7.1 (c))
=
(
Acons,A
‚W−
cons
)‚W
(Definition)
The negative construction of types
The dual to the positive method of constructing types is the negative method,
which builds a type around a chosen set of co-values. In particular, given some world
T = (U,V,W), where V = (V+,V−) and W = (W+,W−), and a chosen set of well-
behaved co-value elements Aobs ⊆ W−|V− serving as the primitive observations, we
have the negative construction of the T-type NegTAobs, defined as follows:
NegT(Aobs) ,
(
A
‚W+
obs ,Aobs
)∣∣∣∣‚W
V
∣∣∣∣∣
‚W
V
Lemma 7.5 (Negative semantic types). For any safety condition S, W-world T =
(U,V,W), and Aobs ⊆ W−|V−, it must be that NegT(Aobs) is a T-type.
Proof.
NegT(Aobs)|‚WV
=
(
A
‚W+
obs ,Aobs
)∣∣∣∣‚W
V
∣∣∣∣∣
‚W
V
∣∣∣∣∣∣
‚W
V
(Definition)
263
=
(
Aobs|‚W+V− ,Aobs
)∣∣∣∣‚W
V
∣∣∣∣∣
‚W
V
∣∣∣∣∣∣
‚W
V
(Property 7.4 (c))
=
 Aobs|‚W+V−
∣∣∣∣‚W−
V+
∣∣∣∣∣
‚W+
V−
, Aobs|‚W+V−
∣∣∣∣‚W−
V+
∣∣∣∣∣
‚W+
V−
∣∣∣∣∣∣
‚W+
V+
 (Definition)
=
 Aobs|‚W+V−
∣∣∣∣‚W−
V+
∣∣∣∣∣
‚W+
V−
, Aobs|‚W+V−
∣∣∣∣‚W−
V+
 (Property 7.5 (d))
=
(
Aobs|‚W+V− ,Aobs
)∣∣∣∣‚W
V
∣∣∣∣∣
‚W
V
(Definition)
=
(
A
‚W+
obs ,Aobs
)∣∣∣∣‚W
V
∣∣∣∣∣
‚W
V
= NegT(Aobs) (Property 7.4 (c))
Similar to the fact that the positive construction of types Pos can be simplified
to the traditional bi-orthogonal construction under certain assumptions about V, the
same holds for the negative construction of types Neg. Unsurprisingly, this requires
the dual assumption that the positive side of V is universal (corresponding to the
call-by-name substitution strategy N where all terms are values).
Lemma 7.6 (Negative bi-orthogonality). For any safety condition S, W-world T =
(U,V,W), and Aobs ⊆ W−|V−, if V+ = U+ then NegT(Aobs) = (A
‚W+
obs ,Aobs)‚W =
(A‚W+obs ,A
‚W−‚W+
obs ).
Proof. Analogous to the proof of Lemma 7.4 by duality.
Models
We now build a parameterized model for the µµ˜-calculus in earnest by partially
instantiating the notion of safety conditions and world. Since one of the applications
of interest in Section 7.5 is the binary property of contextual equivalence, we will make
a model out of pairs of commands and (co-)terms. More specifically, our model is
parameterized by an arbitrary safety condition S = ( ‚, ‚‚,‚) as well as a world
TS = (US ,VS ,WS) for every base kind S such that
‚= Command × Command
US = (TermS × TermS ,CoTermS × CoTermS)
VS = (ValueS × ValueS ,CoValueS × CoValueS)
264
from the well-kinded (but untyped) syntax of the µµ˜-calculus, and where the
computation relation  for the top pole ‚is defined as
(c1, c2) (c′1, c′2) if (c1 7→ c′1) ∧ (c2 7→ c′2)
(c1, c2) (c′1, c2) if c1 7→ c′1
(c1, c2) (c1, c′2) if c2 7→ c′2
and the cut operation for each ‚-space US is defined as
〈(v1, v2)||(e1, e2)〉 , (〈v1||e1〉 , 〈v2||e2〉)
Therefore, the definitions of ‚‚, ‚, and WS for each strategy S is arbitrary, so long
as they satisfy the criteria imposed by the safety condition ( ‚, ‚‚,‚) and worlds
(US ,VS ,WS). Since we are dealing with binary relations, not just unary predicates,
we will use the following shorthand:
– c ‚ c′ means (c, c′) ∈ ‚ and c ‚‚ c′ means (c, c′) ∈ ‚‚,
– v A v′ means (v, v′) ∈ A for any A v US , and
– e A e′ means (e, e′) ∈ A for any A v US .
To accomodate the size indexes Ix and Ord, the model is also parameterized by a
size measurement, defined as follows.
Definition 7.10. A size measurement is a set of ordinals O equipped with two
constants 0,∞ ∈ O, a unary operation +1 : O → O, and a well-founded (partial)
order < between elements of O such that the following conditions hold:
1. 0 is less than ∞: 0 <∞,
2. +1 is monotonic: M < N implies +1(M) < +1(N) for all M,N ∈ O,
3. +1 is strictly increasing: M < +1(M) for all M ∈ O, and
4. ∞ is a limit of +1: M <∞ implies +1(M) <∞ for all M ∈ O.
265
Types and Kinds
First, we build a model for the kinds and sorts in the wholly static part of the
higher-order µµ˜-calculus with structural recursion. Since the language of kinds includes
functions and size indexes in addition to base kinds, we need to form the Universe
containing all the semantic representations of the different kinds of syntactic types. In
particular, base kinds are interpreted as the set of semantic types of the corresponding
world, the kind of size type indexes are interpreted as the set of ordinals, and the
kinds of type functions are interpreted as the set of partial functions (denoted by ⇀)
between other members of the universe.
Definition 7.11 (Universe). The Universe is the smallest set such that
1. O ∈ Universe,
2. SemType(TS) ∈ Universe for all base kinds,
3. (K⇀ L) ∈ Universe for all K,L ∈ Universe,
4. K ∈ Universe, for all K ⊆ L ∈ Universe.
For any K,L ∈ Universe, A ∈ K, and B ∈ L, the partial function application A(B) is
defined whenever there exists K1,K2 ∈ Universe such that A ∈ K1 → K2 and B ∈ K1,
and is undefined otherwise.
With the universe in place, we can now define the meaning of kinds as a
relationship between the syntax of types and their semantics (Pitts, 1997) in the
model.
Definition 7.12 (Semantic kinds). A semantic kind K ∈ SemKind is a pair (D,R)
where D ∈ Universe and R ⊆ (Type × D). We refer to D as the domain of K and
R as the syntactic-semantic relationship of K. As shorthand, given K ∈ SemKind,
A ∈ Type, and A ∈ Universe, we write A ∈ K to indicate A ∈ pi1(K) and A K A to
indicate (A,A) ∈ pi2(K).
First, we give an interpretation of sorts in terms of semantic kinds. The sort of
non-erasable kinds, , is interpreted as the whole set of all semantic kinds, and the
sort of erasable kinds, , adds the restriction that the syntactic-semantic relation is
closed under β expansion of the syntactic component.
JK , SemKind JK , {K ∈ SemKind | ∀A′ K A.∀Aβ A′.A K A}
266
A type substitution σ is a partial function from syntactic type variables and
connective names to semantic entities in some kind in the universe, and the set of all
type substitutions is
TypeSubstitution , (TypeVariable ∪ Connective) ⇀⋃Universe
The interpretation of syntactic kinds and types is then a (partial) function from type
substitutions to semantic entities in SemKind and Universe, respectively.
Jk ∈ KindK : TypeSubstitution ⇀ SemKindJA ∈ TypeK : TypeSubstitution ⇀ Universe
This interpretation is mutually defined by structural induction over the syntax.
JXKσ , σ(X)J0Kσ , 0J∞Kσ ,∞JN + 1Kσ , +1(JNKσ)r
F( #»A)
z
σ
, σ(F)( #       »JAKσ)JλX:k.BKσ , λX ∈ pi1(JkKσ).JBKσ{X/X}JA BKσ , JAKσ(JBKσ)
JSKσ , (SemType(TS),Type × SemType(TS))Jk → lKσ , (pi1(JkKσ)→ pi1(JlKσ), {(A,A) | ∀B JkKσ B.A B JlKσ A(B)})JIxKσ , (N,{(M,M) |M ∼N M})JOrdKσ , (O,{(M,M) | ∃M ′ ∈ Type.M β M ′ ∼O M′ ≤M})J< NKσ , ({M ∈ O |M < JNKσ} ,{(M,M) | ∃M ′ ∈ Type.M β M ′ ∼O M′ ≤M})
Note that N is defined as the smallest subset of O containing 0 and closed under +1,
the relation ∼N is defined as the smallest subset of Type ×O such that
1. 0 ∼N 0, and
2. M + 1 ∼N +1(M) for all M ∼N M.
267
and the relation ∼O used in is defined as ∼N ∪{(∞,∞)}.
Declarations
Using the interpretation of kinds above, each (co-)data declaration can be
interpreted as the semantic type representing the connective it defines.
Jdecl ∈ DeclarationK : TypeSubstitution ⇀ Universe
The interpretations revolve around structures: data types are interpreted as the
positive type built around their constructions, and co-data types are interpreted as
the negative type built around their observations. We consider each different form of
(co-)data type declaration introduced previously in Chapters V and VI, first warming
up with simple (co-)data types before moving on to higher-order (co-)data types and
recursive (co-)data types.
Simple (co-)data types
To interpret a data type, we must interpret the meaning of each of its constructors.
In particular, given the signature of a constructor,
K :
(
#         »
A : T ` F( #»X ) | #         »B : R
)
in a data type declaration, we define its interpretation as the relation between the
possible term constructions it can build, where the constructors agree and the sub-
(co-)terms are related.r
K :
(
#         »
A : T ` F( #»X ) | #         »B : R
)z
σ
,
{(
K( #»e , #»v ),K(
#»
e′ ,
#»
v′ )
) ∣∣∣ #                 »e JBKσ e′ , #                 »v JAKσ v′}
The interpretation of a full simple data type declaration is then the function returning
the positive type built around the union of all of its constructions as follows:uwvdata F(
#        »
X : k) : Swhere
#                                                                                    »
Ki :
(
#              »
Aij : Tijj ` F( #»X ) | #                »Bij : Rijj
)i
}~
σ
, #                                »λX ∈ pi1(JkKσ).PosTS
⋃
i

s
Ki :
(
#              »
Aij : Tijj ` F( #»X ) | #                »Bij : Rijj
){
σ{ #       »X/X}


268
Co-data types are dual to data types, and follow the opposite approach. First
we interpret the meaning of each of a co-data type’s observers, given its signature
O :
(
#         »
A : T | G( #»X ) ` #         »B : R
)
, as the relation between the possible co-term observations
it can build where the observers agree and the sub-(co-)terms are related.r
O :
(
#         »
A : T | G( #»X ) ` #         »B : R
)z
σ
,
{(
O[ #»v , #»e ],O[
#»
v′ ,
#»
e′ ]
) ∣∣∣ #                 »v JAKσ v′ , #                 »e JBKσ e′}
The interpretation of a full simple co-data type declaration is then the function
returning the negative type built around the union of all of its observations as follows:uwv codataG(
#        »
X : k) : Swhere
#                                                                                     »
Oi :
(
#              »
Aij : Tijj | G( #»X ) ` #                »Bij : Rijj
)i
}~
σ
, #                                »λX ∈ pi1(JkKσ).NegTS
⋃
i

s
Oi :
(
#              »
Aij : Tijj | G( #»X ) ` #                »Bij : Rijj
){
σ{ #       »X/X}


Higher-order (co-)data types
The only difference between simple and higher-order (co-)data types is that higher-
order (co-)data types can also include hidden quantified types within their structures.
Therefore, when interpreting the meaning of their constructors and observers, we must
also quantify over the possible types that might be included. For a quantified type of
kind l, we extend the relation to quantify over l[σ] twice, choosing a pair of syntactic
types C and C ′ which are related to exactly the same semantic type C. The two
syntactic types are used in syntactic term and co-term structures, whereas the single
semantic type is used for to interpret the types of the remaining components as follows:r
K :
(
#         »
A : T ` #   »Y :l F( #»X ) | #         »B : R
)z
σ
,
{(
K
#»
C ( #»e , #»v ),K
# »
C′ (
#»
e′ ,
#»
v′ )
) ∣∣∣∣ #                »C JlKσ C, #                  »C ′ JlKσ C, #                                »e JBKσ{ #      »C/Y } e′ , #                                »v JAKσ{ #      »C/Y } v′}r
O :
(
#         »
A : T | G( #»X ) ` #   »Y :l #         »B : R
)z
σ
,
{(
O
#»
C [ #»v , #»e ],O
# »
C′ [
#»
v′ ,
#»
e′ ]
) ∣∣∣∣ #                »C JlKσ C, #                  »C ′ JlKσ C, #                                »v JAKσ{ #      »C/Y } v′ , #                                »e JBKσ{ #      »C/Y } e′}
With this extended interpretation of single constructors and observers, the
interpretation of higher-order (co-)data types is effectively the same as the simple
269
case, defined as follows:uwvdata F(
#        »
X : k) : Swhere
#                                                                                                  »
Ki :
(
#              »
Aij : Tijj `
#          »
Yij :lij
j
F( #»X ) | #                »Bij : Rijj
)i
}~
σ
, #                                »λX ∈ pi1(JkKσ).PosTS
⋃
i

s
Ki :
(
#              »
Aij : Tijj `
#          »
Yij :lij
j
F( #»X ) | #                »Bij : Rijj
){
σ{ #       »X/X}


uwv codataG(
#        »
X : k) : Swhere
#                                                                                                   »
Oi :
(
#              »
Aij : Tijj | G( #»X ) `
#          »
Yij :lij
j #                »
Bij : Rijj
)i
}~
σ
, #                                »λX ∈ pi1(JkKσ).NegTS
⋃
i

s
Oi :
(
#              »
Aij : Tijj | G( #»X ) `
#          »
Yij :lij
j #                »
Bij : Rijj
){
σ{ #       »X/X}


Primitive recursive (co-)data types
Interpreting recursively-defined (co-)data types is more interesting, since the
interpretation itself must also be recursive. As it turns out, we can use the same
recursion principle corresponding to the program to define the connective semantically.
In particular, we can interpret primitive-recursive data type asuwwwwwv
data F(i:Ix, #     »X:k) : S by primitive recursion on i
where i = 0
#                                                                               »
Ki :
(
#       »
Ai:Ti `
#      »
Yi:li F(0, #»X ) | #         »Bi:Ri
)i
where i = j+1
#                                                                                         »
K′i :
(
#        »
A′i:T ′i `
#       »
Y ′i :l′i F(j+1, #»X ) | #         »B′i:R′i
)i
}~
σ
, λJ ∈ N.FJσ
where the family of FMσ is defined by primitive recursion on M ∈ N:
F0σ ,
#                              »
λX ∈ pi1(JkK).
PosTS
⋃
i

s
#                                                                               »
Ki :
(
#       »
Ai:Ti `
#      »
Yi:li F(0, #»X ) | #         »Bi:Ri
)i{
σ{ #       »X/X}


F+1(M)σ ,
#                              »
λX ∈ pi1(JkK).
PosTS
⋃
i

t
#                                                                                  »
K′i :
(
#        »
A′i:T ′i `
#       »
Y ′i :l′i F(0, #»X ) | #         »B′i:R′i
)i|
σ{ #       »X/X,(λJ∈N.FMσ )/F}


270
Note that this is well-defined by primitive recursion whenever the declaration of F is
well-formed, because the signature of the constructors K′i can only F(j,
#»
C ), which is
defined by the previous F.
Dually, we can interpret a primitive-recursive co-data type asuwwwwwv
codataG(i:Ix, #     »X:k) : S by primitive recursion on i
where i = 0
#                                                                               »
Oi :
(
#       »
Ai:Ti | G(0, #»X ) `
#      »
Yi:li #         »Bi:Ri
)i
where i = j+1
#                                                                                          »
O′i :
(
#        »
A′i:T ′i | G(j+1,
#»
X ) `
#       »
Y ′i :l′i
#         »
B′i:R′i
)i
}~
σ
, λJ ∈ N.GJσ
where the family of GMσ is defined by primitive recursion on M ∈ N:
G0σ ,
#                              »
λX ∈ pi1(JkK).
NegTS
⋃
i

s
#                                                                               »
Oi :
(
#       »
Ai:Ti | G(0, #»X ) `
#      »
Yi:li #         »Bi:Ri
)i{
σ{ #       »X/X}


G+1(M)σ ,
#                              »
λX ∈ pi1(JkK).
NegTS
⋃
i

t
#                                                                                   »
O′i :
(
#        »
A′i:T ′i | G(0,
#»
X ) `
#       »
Y ′i :l′i
#         »
B′i:R′i
)i|
σ{ #       »X/X,(λJ∈N.GMσ )/G}


Noetherian recursive (co-)data types
Data types defined by noetherian recursion are interpreted by the same recursion
principle, so that we have the following interpretationt
data F(i : Ord, #        »X : k) : S by noetherian recursion on i
where K :
(
#         »
A : T ` #   »Y :l F(i, #»X ) | #         »B : R
)
|
σ
, λJ ∈ O.FJσ
where the family of FMσ is defined by noetherian recursion on M ∈ O:
FMσ ,
#                              »
λX ∈ pi1(JkK).
PosTS
⋃
i

s
#                                                                            »
Ki :
(
#       »
Ai:Ti `
#      »
Yi:li F(i, #»X ) | #         »Bi:Ri
)i{
σ{ #       »X/X,M/i,(λJ∈(<M).FJσ)/F}


where <M is the set {M′ ∈ O |M′ <M}. Note that this is well-defined by noetherian
recursion whenever the declaration of F is well-formed, because each use of F in the
271
signature of the constructors Ki must be on a strictly decreasing i (with respect to
the < ordering on ordinal indexes).
Dually, we can interpret a noetherian-recursive co-data type ast
codataG(i : Ord, #        »X : k) : S by noetherian recursion on i
where O :
(
#         »
A : T | G(i, #»X ) ` #   »Y :l #         »B : R
)
|
σ
, λJ ∈ O.GJσ
where the family of GMσ is defined by noetherian recursion on M ∈ O:
GMσ ,
#                              »
λX ∈ pi1(JkK).
NegTS
⋃
i

s
#                                                                             »
Oi :
(
#       »
Ai:Ti `
#      »
Yi:li G(i, #»X ) | #         »Bi:Ri
)i{
σ{ #       »X/X,M/i,(λJ∈(<M).GJσ)/G}


Sequents
The meaning of sequents is to constrain the potential domain of substitutions.
That is, having x : A in the input environment of a sequent means that we must
substitute a suitable value for x which belongs to the interpretation of A. Since
sequents quantify over several different types of variables (type variables, connectives,
value variables and co-value variables), the substitution they correspond to must map
each variable to an appropriate replacement as follows:
Substitution , ((TypeVariable ∪ Connective) ⇀ (Type ×⋃Universe))
× (Variable ⇀ (Value × Value))
× (CoVariable ⇀ (CoValue × CoValue))
Note that substitutions map (co-)variables to pairs of (co-)values, continuing the
interpretation of types as binary relations on (co-)terms. These big substitutions can
be pared down into more useful parts by extracting only the semantic component
(via sem) used in the interpretation of types, as well as the left and right syntactic
components (via left and right, respectively), as follows:
ValueSubstitution , (TypeVariable ⇀ Type)
× (Variable ⇀ Value)
× (CoVariable ⇀ CoValue)
272
sem : Substitution → TypeSubstitution
left, right : Substitution → ValueSubstitution
We begin with the most fundamental interpretation of sequents, which is on
the judgement
(
Γ `ΘG ∆
)
seq that a sequent is well-formed. This well-formedness
judgement is interpreted as the set of possible substitutions that map each variable
mentioned in
(
Γ `ΘG ∆
)
seq to an appropriate semantic entities dictated by the
previous interpretations:
(
Γ ΘG ∆
)
seq
, {σ ∈ Substitution
| ∀decl ∈ G, decl defines F( #        »X : k) : S =⇒ pi2(σ(F)) = JdeclKsem(σ) ∈ r #       »k → Szsem(σ)
∧ ∀X:k ∈ Θ, JkKsem(σ) ∈ SemKind ∧ σ(X) ∈ pi2(JkKsem(σ))
∧ ∀x:A ∈ Γ,∃S, σ(x) ∈ JAKsem(σ) ∈ SemType(TS)
∧ ∀α:A ∈ ∆,∃S, σ(α) ∈ JAKsem(σ) ∈ SemType(TS)}
Intuitively, finding a particular σ ∈
(
Γ ΘG ∆
)
seq is the semantic evidence that the
sequent is coherent enough to reason about. That’s why the interpretation of every
other sequent that follows is predicated upon finding such evidence before any firm
statements are made.
The sequents for the sorting judgement affirms that the interpretations of the
kind is indeed the stated sort, assuming that there is a well-formed substitution for the
sequent. The sequents for the kinding judgement affirms that, under any well-formed
substitution, the syntactic type is related to the semantic interpretation by its kind.
Furthermore, the sequents for type conversion require that both types be semantically
well-kinded by the previous interpretation, and also that the semantic interpretation
of the types are the same. These are formally defined as follows:
G  decl , ∀σ ∈
(
G
)
seq , JdeclKsem(σ) ∈ Universe
Θ G k : s , ∀σ ∈
(
ΘG
)
seq , JkKsem(σ) ∈ JsK
Θ G A : k , ∀σ ∈
(
ΘG
)
seq , JkKsem(σ) ∈ SemKind =⇒
A {left(σ)} JkKsem(σ) JAKsem(σ)
273
Θ G A = B : k , ∀σ ∈
(
ΘG
)
seq , JkKsem(σ) ∈ SemKind =⇒
A {left(σ)} JkKsem(σ) JAKsem(σ) ∧B {right(σ)} JkKsem(σ) JBKsem(σ)
∧ JAKsem(σ) = JBKsem(σ)
where {left(σ)} and {right(σ)} is notation for applying right(σ) as the simultaneous
capture-avoiding substitution of all type variables in its domain.
We now get down to the typing judgements for programs (i.e. commands and
(co-)terms). First, we define what it means for two commands, terms, or co-terms to
be semantically related to one another with respect to some sequent as follows:
c⇔ c′ :
(
Γ ΘG ∆
)
, ∀σ ∈
(
Γ G ∆
)
seq , c {left(σ)} ‚ c′ {right(σ)}
Γ ΘG v ⇔ v′ : A | ∆ , ∀σ ∈
(
Γ G ∆
)
seq ,∀S, A {left(σ)} JSKpi2◦σ JAKsem(σ) =⇒
v {left(σ)} JAKsem(σ) v′ {right(σ)}
Γ | e⇔ e′ : A ΘG ∆ , ∀σ ∈
(
Γ G ∆
)
seq ,∀S, A {left(σ)} JSKpi2◦σ JAKsem(σ) =⇒
e {left(σ)} JAKsem(σ) e′ {right(σ)}
Intuitively, the semantic judgement that two commands are related means that, under
the left and right components of any well-formed substitution, they are related by ‚.
Likewise the semantic judgement that two terms or co-terms are related means that
(under the left and right components of any well-formed substitution), they are related
by the interpretation of the type they inhabit. Furthermore, we give the semantic
judgement that commands and (co-)terms are well-typed in terms of the above by
affirming they are related to themselves.
c :
(
Γ ΘG ∆
)
, c⇔ c :
(
Γ ΘG ∆
)
Γ ΘG v : A | ∆ , Γ ΘG v ⇔ v : A | ∆
Γ | e : A ΘG ∆ , Γ | e⇔ e : A ΘG ∆
Adequacy
We have built a model for the parametric µµ˜-calculus, but we have not yet seen
that it corresponds at all to the statics of the language. In other words, we need to
demonstrate adequacy: derivations of a syntactic judgement implies the truth of the
274
corresponding semantic judgement. With this in mind, we now show the adequacy of
the model via the following fundamental lemmas.
The first fundamental lemma demonstrates the adequacy of the sorting and
kinding rules.
Lemma 7.7 (Kind adequacy). For any focalizing strategies #»S , safety condition S and
family of S-worlds TS , and size measurement,
a) if G ` decl is derivable in µµ˜ #»S then G  decl,
b) if Θ `G k : s is derivable in µµ˜ #»S then Θ G k : s,
c) if Θ `G A : k is derivable in µµ˜ #»S then Θ G A : k, and
d) if Θ `G A = B : k is derivable in µµ˜ #»S then Θ G A = B : k.
Proof. By mutual induction on the derivations, demonstrating that each inference
rule I is sound: the interpretation of its conclusion follows from the interpretation of
the premises, giving the semantic version of the rule JIK.
The interpretation of the sorting rules are sound as follows:
– To show Θ G S : , note that JSKσ = (SemType(TS),Type × SemType(TS)) ∈
SemKind for any σ and that for all A β A′ ∈ Type and A ∈ SemType(TS),
both A JSKσ A and A′ JSKσ A since all syntactic and semantic types are related
by JSKσ, so JSKσ ∈ JK.
– To show Θ G Ix : , note that JIxKσ = (N, {(M,M) |M ∼N M}) ∈ SemKind
for any σ.
– To show Θ G Ord : , note that
JOrdKσ = (O, {(M,M) | ∃M ′ ∈ Type.M β M ′ ∼O M′ ≤M}) ∈ SemKind
for any σ, and that the syntactic-semantic relation of JOrdKσ is closed under β
expansion by definition.
– To show
Θ G k :  Θ G l : 
Θ G k → l :  suppose that σ ∈
(
ΘG
)
seq , so that we knowJkKσ ∈  and JlKsem(σ) ∈  from the premises. Thus, Jk → lKsem(σ) ∈ SemKind,
and the syntactic-semantic relation of Jk → lKsem(σ) ∈ SemKind is closed under
275
β expansion because the syntact-semantic relation of JlKsem(σ) is: if Aβ A′ and
A′ Jk → lKsem(σ) A then for all B JkKsem(σ) B it follows that A′ B JlKsem(σ) A(B),
so that A B JlKsem(σ) A(B) as well because JlKsem(σ) ∈ JK.
– To show
Θ G M : Ord
Θ G < M :  suppose that σ ∈
(
ΘG
)
seq , so that we know JMKsem(σ) ∈JOrdKsem(σ) from the premise. Thus, the domain {M′ ∈ O | M′ < JMKsem(σ)}
of J< MKsem(σ) is a defined subset of O, and the syntactic-semantic relation ofJMKsem(σ) is closed under β expansion for the same reason that JOrdKsem(σ) is.
– To show
Θ G k : 
Θ G k :  suppose that σ ∈
(
ΘG
)
seq , so that we know JkKsem(σ) ∈ JK
from the premise, and note that this implies that JkKsem(σ) ∈ SemKind = JK.
The interpretation of the kinding rules are sound as follows:
– To show
(X : k) /∈ Θ′
Θ, X : k,Θ′ G X : k
JTV K suppose that σ ∈ (ΘG ) seq and
k[sem(σ)] ∈ SemKind, and note that
(X {left(σ)} , JXK(σ)) = (left(σ)(X), sem(σ)(X)) = σ(X) ∈ pi2(JkKσ)
– To show
#                      »Θ G C : k (F(
#        »
X : k) : S) ∈ G
Θ G F(
#»
C ) : S
JFT K
suppose that σ ∈
(
ΘG
)
seq , so
we know that
#                                                                       »
C {left(σ)} JkKsem(σ) JCKsem(σ) from the premise. Since (F( #        »X : k) :
S) ∈ G, we also know that for the decl ∈ G defining F( #        »X : k) : S, we have
sem(σ)(F) = JdeclKsem(σ) ∈ r #       »k → Szsem(σ) = #                                    »pi1(JkKsem(σ))→ SemType(TS)
Thus,F( #»C ) {left(σ)} = F( #          »C {σ}) JSKsem(σ) rF( #»C )zsem(σ) = sem(σ)(F)( #                  »JCKsem(σ)).
– To show
Θ, X : k G A : l
Θ G λX : k.A : k → l
q→I2y suppose that σ ∈ (ΘG ) seq andJk → lKsem(σ) ∈ SemKind, so we know that JkKsem(σ), JlKsem(σ) ∈ SemKind and
thus A {left(σ), B/X} JlKsem(σ)B/X JAKsem(σ),B/X for all B JkKsem(σ) B from the
premise. Note that
(λX:k.A) {left(σ)} B →β A {left(σ), B/X} JλX:k.AKsem(σ)(B) = JAKsem(σ),B/X
276
Thus, (λX:k.A) {left(σ)} Jk → lKsem(σ) JλX:k.AKsem(σ).
– To show
Θ G A : k → l Θ G B : k Θ G k → l : 
Θ G A B : l
q→E2y suppose that
σ ∈
(
ΘG
)
seq , so we know that Jk → lKsem(σ) ∈ JK ⊆ SemKind (which
implies JkKsem(σ) ∈ JK = SemKind), A {left(σ)} Jk → lKsem(σ) JAKsem(σ),
and B {left(σ)} JkKsem(σ) JBKsem(σ) from the premises. It follows that
(A B) {left(σ)} JlKsem(σ) JAKsem(σ)(JBKsem(σ)) = JA BKsem(σ) by the definition
of Jk → lKsem(σ).
– To show Θ G 0 : Ix note that 0 {σ1} = 0, J0Kσ2 = 0 ∈ N, and 0 JIxKσ2 0 for any
substitutions σ1 and σ2.
– To show
Θ G M : Ix
Θ G M + 1 : Ix suppose that σ ∈
(
ΘG
)
seq , so we know that
M {left(σ)} JIxKsem(σ) JMKsem(σ) from the premise. It follows by definition ofJIxKsem(σ) that JMKsem(σ) ∈ N and M {left(σ)} ∼N JMKsem(σ), so
+1(M) {left(σ)} JIxKsem(σ) J+1(M)Ksem(σ)
– To show Θ G ∞ : Ord note that∞{σ1} =∞, J∞Kσ2 =∞ ∈ O, and∞ JOrdKσ2
∞ for any substitutions σ1 and σ2.
– To show Θ G 0 : <∞ note that 0 {σ1} = 0, J0Kσ2 = 0 ∈ O, and 0 < ∞ is
guaranteed by the size measurement for any substitutions σ1 and σ2.
– To show
Θ G M : <∞
Θ G M + 1 : <∞ suppose that σ ∈
(
ΘG
)
seq , so we know
that M {left(σ)} J<∞Ksem(σ) JMKsem(σ) from the premise. The fact that
+1(JMKsem(σ)) <∞, and thus that +1(M) {left(σ)} J<∞Ksem(σ) +1(JMKsem(σ)),
is guaranteed by the size measurement which forces ∞ to be a limit of +1.
– To show
Θ G N : Ord Θ G M : < N
Θ G M : < M + 1 suppose that σ ∈
(
ΘG
)
seq andJ< M + 1Ksem(σ) ∈ SemKind, so we know that N {left(σ)} JOrdKsem(σ) JNKsem(σ)
(which implies that J< NKsem(σ) ∈ skind) and M {left(σ)} J< NKσ JMKsem(σ)
from the premises. The size measurement guarantees forces +1 to be
strictly increasing, so JMKsem(σ) < +1(JMKsem(σ)) = JM + 1Ksem(σ). Thus
M {left(σ)} J< MKsem(σ) JMKsem(σ).
277
– To show
Θ G M : < M ′ Θ G M ′ : < N
Θ G M : < N suppose that σ ∈
(
ΘG
)
seq andJ< NKsem(σ) ∈ SemKind, so we know that M ′ {left(σ)} J< NKsem(σ) JM ′Ksem(σ)
(which implies J< M ′Ksem(σ) ∈ SemKind) and M {left(σ)} J< M ′Ksem(σ)JMKsem(σ). Note that JMKsem(σ) < JM ′Ksem(σ) < JNKsem(σ) by definition, soJMKsem(σ) < JNKsem(σ) is forced by transitivity of the size measurement, and
thus M {left(σ)} J< NKsem(σ) JMKsem(σ).
– To show
Θ G N : Ord Θ G M : < N
Θ G M : Ord
suppose that σ ∈
(
ΘG
)
seq , so we know that N {left(σ)} JOrdKsem(σ) JNKsem(σ) (which
implies that J< NKsem(σ) ∈ skind) and M {left(σ)} J< NKsem(σ) JMKsem(σ) from the
premises. M {left(σ)} JOrdKsem(σ) JMKsem(σ) follows from the inclusion of the set
M ∈ O |M < JNKsem(σ) in O.
The interpretation of βη conversion of types are sound as follows:
– To show
Θ, X : k G A : l Θ G B : k Θ G k → l : 
Θ G (λX:k.A) B =βη A {B/X} : l JβK
suppose that σ ∈
(
ΘG
)
seq so we know that for all C JkKsem(σ) C
Jk → lKsem(σ) ∈ JK ⊆ SemKindJkKsem(σ) ∈ JK = SemKindJlKsem(σ) ∈ JK ⊆ SemKind
B {left(σ)} JkKsem(σ) JBKsem(σ)
A {left(σ), C/X} Jk → lKsem(σ) JAKsem(σ),C/X
from the premises. Thus, it follows that
((λX:k.A) B) {left(σ)} →β A {left(σ), B/X}J(λX:k.A) BKsem(σ) = JAKsem(σ),JBKsem(σ)/X ∈ JK
A {left(σ), B {left(σ)}/X} JlKsem(σ) JAKsem(σ),JBKsem(σ)/X
((λX:k.A) B) {left(σ)} JlKsem(σ) JAKsem(σ),JBKsem(σ)/X = J(λX:k.A) BKsem(σ)
278
– To show
Θ G A : k → l
Θ G λX:k.A X =βη A : k → l JηK suppose that σ ∈ (ΘG ) seq
and JktolKsem(σ) ∈ SemKind so we know that A {left(σ)} Jk → lKsem(σ)JAKsem(σ) from the premise. It follows that (λX:k.A X) {left(σ)} Jk → lKsem(σ)JλX:k.A XKsem(σ) = JAKsem(σ) by the definition of Jk → lKsem(σ).
– The closure rule
Θ G A : k
Θ G A =βη A : k
JreflK follows immediately from the premise.
– The closure rule
Θ G B =βη A : k
Θ G A =βη B : k
JsymmK follows immediately from the
premise.
– The closure rule
Θ G A =βη B : k Θ G B =βη C : k
Θ G A =βη C : k
JtransK follows immediately
from the premise.
– To show
Θ G F(
#»
C ) : S #                                         »Θ G C =βη C ′ : k Θ G F( #»C ) : S (F( #        »X : k) : S) ∈ G
Θ G F(
#»
C ) =βη F(
#»
C ) : S
JFT K
suppose that σ ∈
(
ΘG
)
seq . Note that F( #»C ) {left(σ)} JSKsem(σ) rF( #»C )zsem(σ)
and F( # »C ′) {left(σ)} JSKsem(σ) rF( # »C ′)zsem(σ) follow immediately from the first and
last premises. We also learn from (F( #        »X : k) : S) ∈ G that #                                               »JkKsem(σ) ∈ SemKind ,
so from the remaining premises, we have that
#                                                  »JCKsem(σ) = JC ′Ksem(σ) , and thusr
F( #»C )
z
sem(σ)
= σ(F)( #                  »JCKsem(σ)) = σ(F)( #                   »JC ′Ksem(σ)) = rF( # »C ′)zsem(σ)
– To show
Θ, X : k G A =βη A′ : l
Θ G λX:k.A =βη λX:k.A′ : k → l
q→I2y suppose that σ ∈ (ΘG ) seq
and Jk → lKsem(σ) ∈ SemKind (which implies that JkKsem(σ), JlKsem(σ) ∈
SemKind), so from the premise, we know that for all B JkKsem(σ) B
A {left(σ), B/X} JlKsem(σ),B/X JAKsem(σ),B/X
A′ {left(σ), B/X} JlKsem(σ),B/X JA′Ksem(σ),B/XJAKsem(σ),B/X = JA′Ksem(σ),B/X
279
Note also that for all B JkKsem(σ) B
(λX:k.A) {left(σ)} B →β A {left(σ), B/X}
(λX:k.A′) {left(σ)} B →β A′ {left(σ), B/X}JλX:k.AKsem(σ)(B) = JAKsem(σ),B/XJλX:k.A′Ksem(σ)(B) = JA′Ksem(σ),B/X
Thus, it follows that
JλX:k.AKsem(σ) = JλX:k.A′Ksem(σ)
(λX:k.A) {left(σ)} →β JλX:k.AKsem(σ)
(λX:k.A′) {left(σ)} →β JλX:k.A′Ksem(σ)
– To show
Θ G A =βη A′ : k → l Θ G B =βη B′ : k Θ G k → l : 
Θ G A B =βη A′ B′ : l
q→E2y
suppose that σ ∈
(
ΘG
)
seq , so we know that
Jk → lKsem(σ) ∈ JK ⊆ SemKind JkKsem(σ) ∈ SemKind
A {left(σ)} Jk → lKsem(σ) JAKsem(σ) B {left(σ)} JkKsem(σ) JBKsem(σ)
A′ {left(σ)} Jk → lKsem(σ) JA′Ksem(σ) B′ {left(σ)} JkKsem(σ) JB′Ksem(σ)JAKsem(σ) = JA′Ksem(σ) JBKsem(σ) = JB′Ksem(σ)
from the premises. It follows from the definition of Jk → lKsem(σ) and the
fact that (A B) {left(σ)} = A {left(σ)} B {left(σ)} and (A′ B′) {left(σ)} =
A′ {left(σ)} B′ {left(σ)} that
(A B) {left(σ)} JlKsem(σ) JA BKleft(σ)
(A′ B′) {left(σ)} JlKsem(σ) JA′ B′Kleft(σ)JAKsem(σ)(JBKsem(σ)) = JA′Ksem(σ)(JB′Ksem(σ))
– The other inference rules for converting inside the context of a successor Ord
type index follow similarly.
280
The soundness of the interpretation of the well-formedness of declaration rules
follows immediately from the definition of the interpretation of declarations and the
premise.
Turning now to the typing rules for programs and program equality, we have that
there is an unfortunate redundancy of rules. In particular, because the compatibility
of the type equality relation quantifies over all contexts, it effectively duplicates all the
typing rules. Additionally, quantifying over arbitrary contexts adds extra complication
to any direct proof. Therefore, instead of handling typed equality as is, we resort to
an equivalent rephrasing of the relation that effectively combines both the typing and
compatibility rules into a single form as shown in Figures 7.1 and 7.2. As stated, this
alternative definition effectively subsumes both the derivations of syntactic equality
and typing, so we can focus on just c⇔R c′ :
(
Γ `ΘG ∆
)
(et al.) instead of both the
c =R c′ :
(
Γ `ΘG ∆
)
and c :
(
Γ `ΘG ∆
)
.
Lemma 7.8. a) c :
(
Γ `ΘG ∆
)
is derivable if and only if c⇔R c :
(
Γ `ΘG ∆
)
is.
b) Γ `ΘG v : A | ∆ is derivable if and only if Γ `ΘG v ⇔R v : A | ∆ is.
c) Γ | e : A `ΘG ∆ is derivable if and only if Γ | e⇔R e : A `ΘG ∆ is.
Proof. By (mutual) induction on the given typing derivations.
Lemma 7.9.
a) c =R c′ :
(
Γ `ΘG ∆
)
is derivable if and only if c⇔R c′ :
(
Γ `ΘG ∆
)
is.
b) Γ `ΘG v =R v′ : A | ∆ is derivable if and only if Γ `ΘG v ⇔R v′ : A | ∆ is.
c) Γ | e =R e′ : A `ΘG ∆ is derivable if and only if Γ | e⇔R e′ : A `ΘG ∆ is.
Proof. By (mutual) induction on the given typing derivations.
To show adequacy of the typing derivations, we need to be assured that enough of
the syntax of the language is well-behaved, in whatever sense of “well-behaved” that
we have chosen. For input and output abstractions, this is straightforward to show by
type expansion (Lemma 7.2) which came from the saturation condition for worlds.
Lemma 7.10 (Strong activation). Given a binary TS-type A,
a) µα.c A µα.c′ if c {E/α} ‚ c′ {E ′/α} for all E A|V E ′, and
281
Conversion rules:
c⇔R c :
(
Γ `ΘG ∆
)
c R c′ c′ ⇔R c′ :
(
Γ `ΘG ∆
)
c⇔R c′ :
(
Γ `ΘG ∆
) 
c⇔R c :
(
Γ `ΘG ∆
)
c ≺R c′ c′ ⇔R c′ :
(
Γ `ΘG ∆
)
c′ ⇔R c :
(
Γ `ΘG ∆
) ≺
Γ `ΘG v ⇔R v : A | ∆ v R v′ Γ `ΘG v′ ⇔R v′ : A | ∆
Γ `ΘG v ⇔R v′ : A | ∆
R
Γ `ΘG v ⇔R v : A | ∆ v ≺R v′ Γ `ΘG v′ ⇔R v′ : A | ∆
Γ `ΘG v′ ⇔R v : A | ∆
≺R
Γ | e⇔R e : A `ΘG ∆ e R e′ Γ | e′ ⇔R e′ : A `ΘG ∆
Γ | e⇔R e′ : A `ΘG ∆
L
Γ | e⇔R e : A `ΘG ∆ e ≺R e′ Γ | e′ ⇔R e′ : A `ΘG ∆
Γ | e′ ⇔R e : A `ΘG ∆
L
Γ `ΘG v ⇔R v′ : A | ∆ Γ `ΘG v′ ⇔R v′′ : A | ∆
Γ `ΘG v ⇔R v′′ : A | ∆
transR
Γ | e⇔R e′ : A `ΘG ∆ Γ | e′ ⇔R e′′ : A `ΘG ∆
Γ | e⇔R e′′ : A `ΘG ∆
transL
Core conversion compatibility:
Γ, x : A `ΘG x⇔R x : A | ∆
VR Γ | α⇔R x : A `ΘG α : A,∆
VL
c⇔R c′ :
(
Γ `ΘG α : A,∆
)
Γ `ΘG µα.c⇔R µα.c′ : A | ∆
AR
c⇔R c′ :
(
Γ, x : A `ΘG ∆
)
Γ | µ˜x.c⇔R µ˜x.c′ : A `ΘG ∆
AL
Γ `ΘG v ⇔R v′ : A | ∆ Θ `G A : S Γ | e⇔R e′ : A `ΘG ∆
〈v||e〉 ⇔R 〈v′||e′〉 :
(
Γ `ΘG ∆
) Cut
Γ `ΘG v ⇔R v′ : A | ∆ Θ `G A =βη B : S
Γ `ΘG v ⇔R v′ : B | ∆
TCR
Γ | e⇔R e′ : A `ΘG ∆ Θ `G A =βη B : S
Γ | e⇔R e′ : B `ΘG ∆
TCL
FIGURE 7.1. Core parallel conversion rules.
282
(Co-)data conversion compatibility:
Given data F( #        »X : k) : Swhere
#                                                                                                  »
Ki :
(
#                »
Aij : Tijj `
#     »
Yi:li F( #»X ) | #                  »Bij : Rijj
)i
∈ G, we have the rules:
θ =
{
#       »
C/X
}
#                               »
Θ `G C ′iθ : liθ θ′ =
{
#         »
C ′i/Yi
}
θ
#                                                          »
Γ | e⇔R e′ : Bijθ′ `ΘG ∆
j #                                                           »
Γ | v ⇔R v′ : Aijθ′ `ΘG ∆
j
Γ `ΘG K
# »
C′
i ( #»e , #»v )⇔R K
# »
C′
i (
#»
e′ ,
#»
v′) : F( #»C ) | ∆
FRKi
θ =
{
#       »
C/X
} #                                                                                                             »
ci ⇔R c′i :
(
Γ, #              »xi : Aiθ `Θ,
#        »
Yi:liθ
G
#               »
αi : Biθ,∆
)i
Γ | µ˜
[
#                                  »
K
#     »
Yi:li
i ( #»αi , #»xi).ci
i
]
⇔R µ˜
[
#                                  »
K
#     »
Yi:li
i ( #»αi , #»xi).c′i
i
]
: F( #»C ) `ΘG ∆
FL
Given codataG( #        »X : k) : Swhere
#                                                                                                   »
Oi :
(
#                »
Aij : Tijj | G( #»X ) `
#     »
Yi:li #                  »Bij : Rijj
)i
∈ G, we have the rules:
θ =
{
#       »
C/X
} #                                                                                                             »
ci ⇔R c′i :
(
Γ, #              »xi : Aiθ `Θ,
#        »
Yi:liθ
G
#               »
αi : Biθ,∆
)i
Γ `ΘG µ
(
#                                 »
O
#     »
Yi:li
i [ #»xi , #»αi ].ci
i
)
⇔R µ
(
#                                 »
O
#     »
Yi:li
i [ #»xi , #»αi ].c′i
i
)
: G( #»C ) | ∆
GR
θ =
{
#       »
C/X
}
#                         »
Θ `G C ′i : li θ′ =
{
#         »
C ′i/Yi
}
θ
#                                                           »
Γ | v ⇔R v′ : Aijθ′ `ΘG ∆
j #                                                          »
Γ | e⇔R e′ : Bijθ′ `ΘG ∆
j
Γ | O
# »
C′i
i [ #»v , #»e ]⇔R O
# »
C′i
i [
#»
v′ ,
#»
e′ ] : G( #»C ) `ΘG ∆
GLOi
FIGURE 7.2. Parallel conversion rules for (co-)data types.
283
b) µ˜x.c A µ˜x.c′ if c {V/x} ‚ c′ {V ′/x} for all V A|V V ′.
Proof. a) Observe that for all E A|V E ′ we have
〈(µα.c, µα.c′)||(E,E ′)〉 = (〈µα.c||E〉 , 〈µα.c′||E ′〉) (c {E/α} , c′ {E ′/α}) ∈ ‚
Therefore, µα.c A µα.c′ by head expansion (Lemma 7.2).
b) Analogous to part 1 by duality.
However, we also need to know that enough of the structures and case abstractions
are also well-behaved. Therefore, we extend the idea that a substitution strategy is
focalizing—that is, it contains enough structures and abstractions—to the worlds of
the model. For any safety condition S = ( ‚, ‚‚,‚), an S-world TS = (US ,VS ,WS) is
focalizing whenever the following conditions hold:
– K
#»
C ( #»E, #»V ) WS K
#»
C (
# »
E ′ ,
# »
V ′) and O
#»
C [ #»V , #»E ] WS O
#»
C [
# »
V ′ ,
# »
E ′ ] if
#                  »
V WS V ′ and
#                  »
E WS E ′ , and
– µ
( #                          »
O
#   »
Y :l [ #»x , #»α ].c
)
WS µ
( #                           »
O
#   »
Y :l [ #»x , #»α ].c′
)
and µ˜
[ #                           »
K
#   »
Y :l( #»α , #»x ).c
]
WS µ˜
[ #                            »
K
#   »
Y :l( #»α , #»x ).c′
]
such that c
{ #       »
C/X,
#     »
V/x,
#      »
E/α
}‚c′ { #       »C/X, #       »V ′/x, #       »E ′/α} for all #              »C JlK C and
#                         »
V W‚WSS V ′ and
#                         »
E W‚WSS E ′ .
From this assumption, we can show the adequacy of the typing rules.
Lemma 7.11 (Parallel conversion adequacy). For any focalizing strategies #»S , safety
condition S = ( ‚, ‚‚,‚) such that ‚ is transitive, family of focalizing S-worlds TS,
and size measurement,
a) if c⇔µ #»S µ˜ #»S ηµηµ˜βGηG c′ :
(
Γ `ΘG ∆
)
is derivable in µµ˜ #»S then c⇔ c′ :
(
Γ ΘG ∆
)
,
b) if Γ `ΘG v ⇔µ #»S µ˜ #»S ηµηµ˜βGηG v′ : A | ∆ is derivable in µµ˜ #»S then Γ ΘG v ⇔ v′ : A | ∆,
and
c) if Γ | e⇔µ #»S µ˜ #»S ηµηµ˜βGηG e′ : A `ΘG ∆ is derivable in µµ˜ #»S then Γ | e⇔ e′ : A ΘG ∆.
Proof. By mutual induction on the derivations for parallel conversion in Figures 7.1
and 7.2, demonstrating that each inference rule is sound similar to Lemma 7.7.
For the conversion rules, JsymmK and JtransK are sound by the symmetry and
transitivity assumed for the ‚ relation. The JRwK rule for commands follows because
284
the  computation relation in ‚allows for only one side or the other to reduce. For
example, to show the soundness of µ #»S rule:
c⇔ c :
(
Γ ΘG ∆
)
c µ #»S c′ c′ ⇔ c′ :
(
Γ ΘG ∆
)
c⇔ c′ :
(
Γ ΘG ∆
) JK
suppose that σ ∈
(
Γ ΘG ∆
)
seq , so from the premises we know that c {left(σ)} ‚
c {right(σ)} and c′ {left(σ)} ‚ c′ {right(σ)}. Thus, since
(c {left(σ)} , c′ {right(σ)}) (c′ {left(σ)} , c′ {right(σ)}) ∈ ‚
we have c {left(σ)} ‚ c {right(σ)} by S’s closure under expansion. Similarly, we could
expand the right-hand side instead of the left-hand side, which gives us the reverse
rewriting rule ≺. Additionally, the R and L rules for terms and co-terms follows
because every semantic TS-type A is equal to A|‚WSVS , and each rewriting rule on
(co-)terms is simulated in the type by a rewriting rule on commands. For terms, we
have the ηµ rule for terms of any type, so to show that
Γ ΘG µα. 〈v||α〉 ⇔ µα. 〈v||α〉 : A | ∆ µα. 〈v||α〉 ηµ v Γ ΘG v ⇔ v : A | ∆
Γ ΘG µα. 〈v||α〉 ⇔ v : A | ∆
JRK
suppose that σ ∈
(
Γ ΘG ∆
)
seq and JAKsem(σ) ∈ SemType(TS) for some S, so from
the premises we know that µα. 〈v||α〉 {left(σ)} JAKsem(σ) µα. 〈v||α〉 {right(σ)} and
v {left(σ)} JAKsem(σ) v {right(σ)}. Letting E JAKsem(σ) E ′, note that we have the
step
(〈µα. 〈v||α〉 {left(σ)}||E〉 , 〈v {right(σ)}||E ′〉) (〈v {left(σ)}||E〉 , 〈v {left(σ)}||E ′〉) ∈ ‚
and so 〈µα. 〈v||α〉||E〉 ‚ 〈v||E ′〉 by S’s closure under expansion. Similarly, the ηG
rewriting rule is justified individually for each (co-)data type because case abstractions
are (co-)values (by the focalization assumption on each strategy) by inspecting the
positive or negative definitions of the type. In particular, the same justification as ηµ
holds for values of a known co-data type defined as Neg(Aobs), where we only need to
check that 〈V ||E〉 β 〈V ′||E〉 for each E ∈ Aobs to conclude that V Aobs|WS+VS− V ′, so
that V Neg(Aobs) V ′ by restricted double orthogonal introduction (Property 7.5 (c)).
The soundness of ≺R follows similarly because either side of the relation may reduce
285
while the other stays the same, and the soundness of L and ≺L follow analogously
by duality.
We now move on to demonstrating the soundness of the many compatibility rules.
The interpretation of the core compatibility typing rules is sound as follows:
– To show
Γ ΘG v ⇔ v′ : A | ∆ Θ G A : S Γ | e⇔ e′ : A ΘG ∆
〈v||e〉 ⇔ 〈v′||e′〉 :
(
Γ ΘG ∆
) JCutK
suppose
that σ ∈
(
Γ ΘG ∆
)
seq , so we know that A {left(σ)} JSKsem(σ) JAKsem(σ),
v {left(σ)} JAKsem(σ) v′ {right(σ)}, and e {left(σ)} JAKsem(σ) e′ {right(σ)}from
the premises. Note that by definition of JSKsem(σ), JAKsem(σ) must be a TS-type,
so that JAKsem(σ) = JAKsem(σ)∣∣∣‚WSVS = JAK‚WSsem(σ). Therefore, since 〈(v, v′)||(e, e′)〉 =
(〈v||e〉 , 〈v′||e′〉) ∈ ‚we have that 〈v||e〉 ‚ 〈v′||e′〉.
– To show Γ, x : A ΘG x⇔ x : A | ∆
JVRK
suppose that σ ∈
(
Γ, x : A ΘG ∆
)
seq ,
so that (x {left(σ)} , x {right(σ)}) = (left(σ)(x), right(σ)(x)) = σ(x) ∈JAKsem(σ).
– The soundness of
c⇔ c′ :
(
Γ ΘG α : A,∆
)
Γ ΘG µα.c⇔ µα.c′ : A | ∆
JARK
follows immediately from the
premise and Lemma 7.10.
– To show
Γ ΘG v ⇔ v′ : A | ∆ Θ G A =βη B : S
Γ ΘG v ⇔ v′ : B | ∆
JTCRK
suppose that σ ∈(
Γ ΘG ∆
)
seq and JAKsem(σ) ∈ SemType(TS) for some S, so by the premise we
know that v {left(σ)} JAKsem(σ) v′ {right(σ)} and JAKsem(σ) = JBKsem(σ), therefore
v {left(σ)} JBKsem(σ) v′ {right(σ)} as well.
– The soundness of JVLK, JALK, and JTCLK follows dually to the cases for JVRK,JARK, and JTCRK.
Given a higher-order data type declaration (which subsumes simple data
declarations),
data F( #        »X : k) : Swhere
#                                                                                             »
Ki :
(
#              »
Aij : Tijj `
#      »
Yi:li F( #»X ) | #                »Bij : Rijj
)i
∈ G
the associated convertibility compatibility rules are sound as follows:
286
– To show
#                        »
Θ G C ′i : li
#                                                      »
Γ | ej ⇔ e′j : Bijθ ΘG ∆
j #                                                      »
Γ | vj ⇔ v′j : Aijθ ΘG ∆
j
Γ ΘG K
# »
C′
i ( #»ej j, #»vjj)⇔ K
# »
C′
i (
#»
e′j
j
,
#»
v′j
j
) : F( #»C ) | ∆
JFRKiK
where θ′ =
{ #         »
C ′i/Yi
}
θ and θ =
{ #       »
C/X
}
, suppose σ ∈
(
#»Γj
j
,
#»
Γ′j
j
ΘG
#  »∆j
j
,
#  »
∆′j
j
)
seq
and F( #»C ) {left(σ)} JSKsem(σ) rF( #»C )zsem(σ) so we know that
Ci {left(σ)} JkiKsem(σ) JCiKsem(σ)
C ′iθ {left(σ)} JliKsem(σ) JC ′iθKsem(σ)
Aijθ {left(σ)} JTijKsem(σ) JAijθKsem(σ)
Bijθ {left(σ)} JRijKsem(σ) JBijθKsem(σ)
#»e jj {left(σ)} JBijθK e′j {right(σ)}
vj {left(σ)} JAijθK v′j {right(σ)}
by the premises. We now proceed by induction on the number of (co-)values
among ej, e′j, vj, v′j:
∗ If ej, e′j, vj, v′j are all (co-)values, then both the constructions K
# »
C′
i ( #»ej j, #»vjj)
and K
# »
C′
i (
#»
e′j
j
,
#»
v′j
j
) are values because S is focalizing. It then follows that
both constructions are in WS |VS since TS is focalizing, so they are related
by
⋃
i
s
Ki :
(
#              »
Aij : Tijj `
#      »
Yi:li F( #»X ) | #                »Bij : Rijj
){
sem(σ),
#                                       »
(C,JCKsem(σ))/X
Thus, both constructions are also related byr
F( #»C )
z
sem(σ)
= Pos
⋃
i
s
Ki :
(
#              »
Aij : Tijj `
#      »
Yi:li F( #»X ) | #                »Bij : Rijj
){
sem(σ),
#                                       »
(C,JCKsem(σ))/X

by restricted double orthogonal introduction (Property 7.5 (c)).
287
∗ If #»ej j = #»E, ej, #»e′ where ej /∈ CoValueRij , then observe that for all E ′ ∈r
F( #»C )
z
sem(σ)
we have the step
〈
K
# »
C′
i (
#»
E, ej,
#»
e′ , #»vj
j)
∣∣∣∣∣∣∣∣E ′〉
7→ς
〈
µα.
〈
µβj.
〈
K
# »
C′
i (
#»
E, βj,
#»
e′ , #»vj
j)
∣∣∣∣∣∣∣∣α〉∣∣∣∣∣∣∣∣ej〉∣∣∣∣∣∣∣∣E ′〉 ∈ ‚
which lands in ‚ by the inductive hypothesis and Lemma 7.10 applied
to the TS-type
r
F( #»C )
z
sem(σ)
and to the TRij -type JBijθKsem(σ). Thus,
K
# »
C′
i (
#»
E, ej,
#»
e′ , #»vj
j) ∈
r
F( #»C )
z
sem(σ)
by Lemma 7.2.
∗ If all of #»ej j = # »Ejj and #»vjj = #»V , vj, #»v′ where vj /∈ CoValueTij , then observe
that for all E ′ ∈
r
F( #»C )
z
sem(σ)
we have the step
〈
K
# »
C′
i (
# »
Ej
j
,
#»
V , vj,
#»
v′ )
∣∣∣∣∣∣∣∣E ′〉
7→ς
〈
µα.
〈
vj
∣∣∣∣∣∣∣∣µyj.〈K # »C′i ( #»#»jEj, #»V , yj, #»v′ )∣∣∣∣∣∣∣∣α〉〉∣∣∣∣∣∣∣∣E ′〉 ∈ ‚
which lands in ‚ by the inductive hypothesis and Lemma 7.10 applied
to the TS-type
r
F( #»C )
z
sem(σ)
and to the TTij -type JAijθKsem(σ). Thus,
K
# »
C′
i (
# »
Ej
j
,
#»
V , vj,
#»
v′ ) ∈
r
F( #»C )
z
sem(σ)
by Lemma 7.2.
∗ The cases where we have a non-(co-)value among #»e′j
j
or
#»
v′j
j
are analogous
to the previous two cases.
– To show
#                                                                                                                                  »
ci ⇔ c′i :
(
Γ,
#                              »
xi : Ai
#             »{C/X} Θ,
#      »
Yi:li
G
#                              »
αi : Bi
#             »{C/X} ,∆
)i
Γ | µ˜
[
#                               »
K
#      »
Yi:li
i ( #»αi , #»xi).ci
i
]
⇔ µ˜
[
#                               »
K
#      »
Yi:li
i ( #»αi , #»xi).c′i
i
]
: F( #»C ) ΘG ∆
JFLK
suppose that σ ∈
(
Γ ΘG ∆
)
seq and F( #»C ) {left(σ)} JSKsem(σ) rF( #»C )zsem(σ) so
we know that
Ci {left(σ)} JliKsem(σ) JCiKsem(σ)
288
Aijθ {left(σ)} JTijKsem(σ) JAijθKsem(σ)
Bijθ {left(σ)} JRijKsem(σ) JBijθKsem(σ)
and for all
C ′ij JlijKsem(σ) C′ij
Eij
r
Bij
{ #       »
C/X
}z
sem(σ),
#              »
C′ij/Yij
E ′ij
Vij
r
Aij
{ #       »
C/X
}z
sem(σ),
#              »
C′ij/Yij
V ′ij
we have c {left(σ)′} ‚ c′ {right(σ)} where
σ′ = σ
#                   »
(C,C)/X,
#                            »
(C ′ij,C′ij)/Yij
ij
,
#                          »
(Vij, V ′ij)/xij
ij
,
#                            »
(Eij, E ′ij)/αij
ij
We know that WTij
∣∣∣‚WTij
VTij
v
r
Aij
{ #       »
C/X
}z
sem(σ),
#              »
C′ij/Yij
and WRij
∣∣∣‚WRij
VRij
vr
Bij
{ #       »
C/X
}z
sem(σ),
#              »
C′ij/Yij
by Lemma 7.1, so the two case abstractions must
be related by WS since TS is focalizing. Furthermore, for each
(V, V ′) ∈⋃
i
s
Ki :
(
#              »
Aij : Tijj `
#      »
Yi:li F( #»X ) | #                »Bij : Rijj
){
sem(σ),
#                                       »
(C,JCKsem(σ))/X
we have a β step
(〈
µ˜
[
#                               »
K
#      »
Yi:li
i ( #»αi , #»xi).ci
i
]
{left(σ)}
∣∣∣∣∣
∣∣∣∣∣V
〉
,
〈
µ˜
[
#                               »
K
#      »
Yi:li
i ( #»αi , #»xi).c′i
i
]
{right(σ)}
∣∣∣∣∣
∣∣∣∣∣V ′
〉)
 (c, c′) ∈ ‚
Thus, by the safety condition S’s closure under expansion, we have that the two
case abstractions are related by
⋃
i
s
Ki :
(
#              »
Aij : Tijj `
#      »
Yi:li F( #»X ) | #                »Bij : Rijj
){
sem(σ),
#                                       »
(C,JCKsem(σ))/X
‚WS−
289
which means they are also related by
r
F( #»C )
z
sem(σ)
= Pos
⋃
i
s
Ki :
(
#              »
Aij : Tijj `
#      »
Yi:li F( #»X ) | #                »Bij : Rijj
){
sem(σ),
#                                       »
(C,JCKsem(σ))/X

by restricted double orthogonal introduction Property 7.5 (c).
Note that the logical rules for (both primitive and noetherian) recursive data
types are the same as those for higher-order data types, so their soundness follows
similarly.
The recursive rules for ∃Ix and ∃<Ord are sound as follows:
– To show
c0 ⇔ c′0 :
(
Γ, x : A 0 ΘG ∆
)
c1 ⇔ c′1 :
(
Γ, x : A (j+1) Θ,j:IxG α : A j,∆
)
Γ | µ˜[0:Ix @ x.c0 | j+1:Ix @α x.c1]⇔ µ˜[0:Ix @ x.c′0 | j+1:Ix @α x.c′1] : ∃Ix(A) ΘG ∆
∃IxLrec
suppose that σ ∈
(
Γ ΘG ∆
)
seq and that ∃Ix(A) {left(σ)} JSKsem(σ)J∃Ix(A)Ksem(σ), so it must be that A {left(σ)} JSKsem(σ) JAKsem(σ) and thus from
the premises we know that
c0 {left(σ), V0/x} ‚ c′0 {right(σ), V ′0/x}
c1 {left(σ), V1/x,E/α} ‚ c′1 {right(σ), V ′1/x,E ′/α}
for all V0 JAKsem(σ)(0) V ′0 , M JIxKsem(σ) M, V1 JAKsem(σ)(+1(M)) V ′1 , and
E JAKsem(σ)(M) E ′. Furthermore, since
J∃Ix(A)Ksem(σ) = Neg (r @ : (A j `j:Ix ∃Ix(A) | )zsem(σ)
)
, we can then proceed by to show that for every
V
r
@ :
(
A j `j:Ix ∃Ix(A) |
)z
sem(σ)
V ′
we step to a pair of commands in ‚ by induction on M ∼N underlying
MJIxKsem(σ)M:
∗ in the case of
(0 @ V0)
r
@ :
(
A j `j:Ix ∃Ix(A) |
)z
sem(σ)
(0 @ V ′0)
290
where V0 JAKsem(σ)(0) V ′0 , note that we have the β step
 〈0 @ V0||µ˜[0:Ix @ x.c0 | j+1:Ix @α x.c1] {left(σ)}〉 ,
〈0 @ V0||µ˜[0:Ix @ x.c′0 | j+1:Ix @α x.c′1] {right(σ)}〉

 (c0 {left(σ), V0/x} , c′0 {right(σ), V ′0/x}) ∈ ‚
Thus, by the safety conditions S’s closure under expansion, we have that
the two case abstractions are related byr
@ :
(
A j `j:Ix ∃Ix(A) |
)z‚WS
sem(σ)
and so they are also related by J∃Ix(A)Ksem(σ) by restricted double orthogonal
introduction (Property 7.5 (c)).
∗ in the case of
(M+1 @ V1)
r
@ :
(
A j `j:Ix ∃Ix(A) |
)z
sem(σ)
(M+1 @ V ′1)
where M+1 JIxKsem(σ) +1(M) and V1 JAKsem(σ)(0) V ′1 , note that we have
the β step
 〈M+1 @ V1||µ˜[0:Ix @ x.c0 | j+1:Ix @α x.c1] {left(σ)}〉 ,
〈M+1 @ V1||µ˜[0:Ix @ x.c′0 | j+1:Ix @α x.c′1] {right(σ)}〉

 
 〈µα.c1 {left(σ),M/j, V1/x}||µ˜[0:Ix @ x.c0 | j+1:Ix @α x.c1] {left(σ)}〉 ,
〈µα.c′1 {right(σ),M/j, V ′1/x}||µ˜[0:Ix @ x.c′0 | j+1:Ix @α x.c′1] {right(σ)}〉

∈ ‚
where the result is related in ‚ by the inductive hypothesis. Thus, by the
safety conditions S’s closure under expansion, we have that the two case
abstractions are related byr
@ :
(
A j `j:Ix ∃Ix(A) |
)z‚WS
sem(σ)
and so they are also related by J∃Ix(A)Ksem(σ) by restricted double orthogonal
introduction (Property 7.5 (c)).
291
– To show
c⇔ c′ :
(
Γ, x : A j `Θ,j<NG α : ∃<Ord(j, A),∆
)
Γ | µ˜[j < N @α x.c]⇔ µ˜[j < N @α x.c′] : ∃<Ord(N,A) `ΘG ∆
∃<OrdLrec
suppose that σ ∈
(
Γ ΘG ∆
)
seq and that ∃<Ord(N,A) {left(σ)} JSKsem(σ)J∃<Ord(N,A)Ksem(σ), so it must be that N {left(σ)} JOrdK [sem(σ)]JNKsem(σ) and
A {left(σ)} JSKsem(σ) JAKsem(σ) and thus from the premises we know that
c {left(σ), V/x,E/α} ‚ c′ {right(σ), V ′/x,E ′/α}
for all M J< NKsem(σ) M, V JAKsem(σ)(M) V ′, and E J∃<Ord(j, A)Ksem(σ),M/j E ′.
We now proceed to show that the two case abstractions step to a pair of related
commands in ‚ when cut with any values related by J∃<Ord(j, A)Ksem(σ),M/j
by noetherian induction on M JOrdKsem(σ) M. In particular, note that for any
V J∃<Ord(N,A)Ksem(σ) V ′ there is the following unrolling ν step:
(〈V ||µ˜[j < N @α x.c] {left(σ)}〉 , 〈V ′||µ˜[j < N @α x.c′] {right(σ)}〉)
 
 〈V ||µ˜[i < N @ x.c {left(σ), µ˜[j < i@α x.c] {left(σ)}/α}]〉 ,
〈V ′||µ˜[i < N @ x.c′ {right(σ), µ˜[j < i@α x.c′] {right(σ)}/α}]〉
 ∈ ‚
where the result is related in ‚ because
µ˜[j < i@α x.c] {left(σ),M ′/j} J∃<Ord(j, A)Ksem(σ),M′/j µ˜[j < i@α x.c′] {right(σ),M ′/j}
for all M ′ J< NKsem(σ) M′ by the inductive hypothesis. Thus, the two case
abstractions are related in J∃<Ord(N,A)Ksem(σ) by Lemma 7.2.
The soundness of the conversion compatibility rules for higher-order, primitive
recursive, noetherian recursive co-data types follows analogously by duality, as do the
recursion rules for universal quantifiers over Ix and Ord.
Lemma 7.12 (Type adequacy). For any focalizing strategies #»S , safety condition
S = ( ‚, ‚‚,‚) such that ‚ is transitive, family of focalizing S-worlds TS, and size
measurement,
a) if c :
(
Γ `ΘG ∆
)
is derivable in µµ˜ #»S then c :
(
Γ ΘG ∆
)
,
292
b) if Γ `ΘG v : A | ∆ is derivable in µµ˜ #»S then Γ ΘG v : A | ∆, and
c) if Γ | e : A `ΘG ∆ is derivable in µµ˜ #»S then Γ | e : A ΘG ∆.
d) if c =µ #»S µ˜ #»S ηµηµ˜βGηG c
′ :
(
Γ `ΘG ∆
)
is derivable in µµ˜ #»S then c⇔ c′ :
(
Γ ΘG ∆
)
,
e) if Γ `ΘG v =µ #»S µ˜ #»S ηµηµ˜βGηG v′ : A | ∆ is derivable in µµ˜ #»S then Γ ΘG v ⇔ v′ : A | ∆,
and
f) if Γ | e =µ #»S µ˜ #»S ηµηµ˜βGηG e′ : A `ΘG ∆ is derivable in µµ˜ #»S then Γ | e⇔ e′ : A ΘG ∆.
Proof. By Lemmas 7.8, 7.9 and 7.11, noting that c⇔ c :
(
Γ ΘG ∆
)
which is
definitionally the same as c :
(
Γ ΘG ∆
)
, and similarly for (co-)terms.
Finally, we demonstrate the adequacy of sequents, by finding an appropriate
substitution for any well-formed sequent quantifying only over declarations and type
variables. Finding a substitution for sequents that quantify over (co-)variables is more
dependent on the instantiation of the model, but can be done in certain circumstances.
Lemma 7.13 (Sequent adequacy). For any focalizing strategies #»S , safety condition
S and family of S-worlds TS, if
(
`ΘG
)
seq is derivable in µµ˜ #»S then there is a size
measurement such that there exists a σ ∈
(
ΘG
)
seq . Furthermore, givenW‚WSS A (∅, ∅)
for all S, then if
(
Γ `ΘG ∆
)
seq is derivable in µµ˜ #»S then there is a size measurement
such that there exists a σ ∈
(
Γ ΘG ∆
)
seq .
Proof. By induction on the derivation of
(
ΘG
)
seq . Note that we can always choose
a big enough size measurement by including a finite number of extra ordinals in O
less than the chosen 0 equal to the length of Θ, which covers the worst case where
we have Θ = in−1 < 0, in−2 < in−1, . . . , i0 < i1 by assigning least ordinal to i0, the
successor of that to i1, and so on. More specifically, whenever we see a i < M in Θ,
we can assign i a value based on its position in Θ.
The second part also follows by induction on the derivation of
(
Γ ΘG ∆
)
seq ,
and relies on the fact that WWSS w A for all TS-types A (Lemma 7.1), so the fact that
both the positive and negative sides of WWSS are non-empty lets us choose something
for every x : A ∈ Γ and α : A ∈ ∆ regardless of A.
293
Applications
Due to the parametric definition of the model in Section 7.3, the adequacy lemmas
from Section 7.4 can be applied to a number of different goals. In particular, many
standard properties in logic and programming languages—like logical consistency, type
safety, strong normalization, and the soundness of extensionality—can be phrased as
a particular application of adequacy by choosing the right notion of safety conditions
and worlds.
In the following sections, we demonstrate each of these properties in turn. The
general procedure for each proof is a three step process guided by the definition of
the model:
1. Choose appropriate definitions for ‚, ‚‚, and WS and show that they result in
a valid safety condition and worlds.
2. Find a valid substitution for the meaning of the well-formed sequent via
Lemma 7.13, which may impose some restrictions on Γ and ∆.
3. Instantiate the semantic judgement via Lemma 7.12 with the substitution from
step 2, proving that commands have the property asserted in ‚.
Note that step 2, where we must exhibit there is a substitution for the well-formed
sequent, is required to “unlock” the hypothetical semantic judgement from step 3.
Thus, the substitutions can act as a gatekeeper, limiting which free (co-)variables in
the input and output environments we can observe for that particular property.
Logical consistency
Logical consistency means there is no derivation of the empty sequent; i.e. there
is no well-typed, closed command. This is the easiest application of the model, since
every parameter is set to be as small or as big as possible. In particular, defining
‚ = ∅ is a valid choice, since then the only criteria of safety conditions (closure under
expansion) is vacuously true.
To prove that we can come up with an empty instance of the model, we first show
that the generativity requirement imposed by the worlds are trivially true in such an
edge case.
294
Lemma 7.14 (Trivial generativity). For any safety condition S = ( ‚, ‚‚,‚), and
‚-spaces U = (U+,U−) and V = (V+,V−), and
‚‚-space W such that V,W v U, if
either U+ = V+, U− = V−, or ‚ = ∅, then all ‚‚-spaces A = A|‚WV are ‚-spaces.
Proof. Let W = (W+,W−), let A = (A+,A−) = A|‚WV , and let v, e ∈ A. Under the
assumption that U+ = V+, it must be that v ∈ A|V so that 〈v||e〉 ∈ ‚ because
A = A|‚WV . Dually, under the assumption that U− = V−, it must be that e ∈ A|V so
that 〈v||e〉 ∈ ‚. Finally, under the assumption that ‚ = ∅, then either A+ = ∅ or
A− = ∅ because A = A|‚WV , so there is no way to have both v ∈ A and e ∈ A making
the requirement that 〈v||e〉 ∈ ‚ trivial.
From the fact that generativity is trivial, we can show logical consistency of
the underlying logic (in terms of the non-derivability of the empty sequent from
Chapter III): there are no well-typed closed commands in µµ˜.
Theorem 7.1 (Logical consistency). c :
(
`ΘG
)
is not derivable if
(
`ΘG
)
seq is.
Proof. We instantiate the model with ‚‚ = ‚, ‚ = ∅ and WS = US . Note that
S = ( ‚, ‚, ∅) is a safety condition since closure under expansion is trivially true, and
likewise each TS = (US ,VS ,US) is an S-world since it is trivially generative because
of Lemma 7.14 and trivially saturated and focalizing because everything in US is in
WS by definition. Note that since  is vacuous due to emptiness of ‚, the actual
evaluation strategies for #»S are irrelevant.
Now, suppose that we have a derivation of c :
(
`ΘG
)
. By adequacy (Lemma 7.12),
it must be that c :
(
ΘG
)
. Additionally, supping a derivation of
(
`ΘG
)
seq implies
there is a σ ∈ (  ) seq by Lemma 7.13. Putting the two together, we get c {pi1 ◦ σ} ‚
c {pi2 ◦ σ}. But this is a contradiction since ‚ is empty! So there can’t be derivations
of both c :
(
`ΘG
)
and
(
`ΘG
)
seq .
Type safety
Now consider type safety, the property that well-typed programs don’t go wrong
(i.e. get stuck). We proved type safety via the syntactic technique of progress and
preservation previously in Chapters III and IV, but we can also use the model to
show a similar result for the much bigger language in Chapter VI in a way that is
general over the chosen evaluation strategy. Type safety relies on a notion of “final
commands” that represent the appropriate end states of a program, which we can
295
characterize more broadly. For an arbitrary collection of strategies #»S , we say that a
subset of commands, denoted by FinalCommand #»S , is
– necessary if c 67→µ #»S µ˜ #»S β #»S ς #»S for all c ∈ FinalCommand #»S ,
– sufficient if for all non-values v :: S either 〈v||e〉 7→µ #»S µ˜ #»S β #»S ς #»S c for some c or
〈v||e〉 ∈ FinalCommand #»S and for all non-co-values e :: S 〈v||e〉 7→µ #»S µ˜ #»S β #»S ς #»S c
for some c or 〈v||e〉 ∈ FinalCommand #»S , and
– constructive if
〈
K
#»
C ( #»E, #»V )
∣∣∣∣∣∣α〉 ∈ FinalCommand #»S and 〈x∣∣∣∣∣∣O #»C [ #»V , #»E ]〉 ∈
FinalCommand #»S .
Also, note that there is no such thing as a well-typed closed command (as we
saw during logical consistency), so we must allow for the ability to run some open
commands. Thus, we characterize which free (co-)variables are allowed for a particular
global environment G. We say that an input environment Γ is observable with respect
to G if for all (x : A) ∈ Γ, A = G( #»C ) where G is declared as a co-data type in G.
Dually, we say that an output environment ∆ is observable with respect to G if for
all (α : A) ∈ ∆, A = F( #»C ) where F is declared as a data type in G. This follows the
intuition that is okay to end with some (co-)data structure as the result, but we don’t
want to be blocked on a case analysis of some free (co-)variable. As a result, we can
safely run any well-typed program typed in an observable environment without fear
of getting stuck.
Theorem 7.2 (Type safety). For any focalizing strategies #»S such that 7→µ #»S µ˜ #»S β #»S ς #»S is
confluent, any necessary, sufficient, and constructive FinalCommandS , environments
Γ and ∆ that are observable with respect to G, and derivations of c :
(
Γ `G
#»∆
)
and
(
Γ `G ∆
)
seq , it follows that c 7→ µ #»S µ˜ #»S β #»S ς #»S c
′ 67→µ #»S µ˜ #»S β #»S ς #»S implies c
′ ∈
FinalCommandS .
Proof. We instantiate the model with
‚‚ = ‚
‚ = {(c1, c2) | ci 7→ c′i 67→ =⇒ ci ∈ FinalCommandS}
WS = US
S = ( ‚, ‚,‚) is a safety condition since ‚ is closed under expansion by definition.
Note that all ‚-spaces A‚WS are closed under 7→ reduction in the following sense:
296
– If (v1, v2) ∈ A‚WS and (〈v1||e1〉 , 〈v2||e2〉)  (〈v′1||e1〉 , 〈v′2||e2〉) for any e1 and e2
then (v′1, v′2) ∈ A‚WS since 〈v′1||e1〉 ‚ 〈v′2||e2〉 for all e1 A e2 by confluence and
the fact that FinalCommandS is necessary so final commands do not reduce
further by the operational semantics.
– Dually, if (e1, e2) ∈ A‚WS and (〈v1||e1〉 , 〈v2||e2〉)  (〈v1||e′1〉 , 〈v2||e′2〉) for any v1
and v2 then (e′1, e′2) ∈ A‚WS .
TS = (US ,VS ,US) is an S-world since saturation and focalization is trivial and
generation follows from the sufficiency of FinalCommandS : for all (v1, v2), (e1, e2) ∈
A = A|‚WSVS , if vi, ei /∈ VS then one of the following holds
– (〈v1||e1〉 , 〈v2||e2〉) 6 implies that 〈v1||e1〉 ∈ FinalCommandS and 〈v1||e1〉 ∈
FinalCommandS because FinalCommandS is sufficient,
– (〈v1||e1〉 , 〈v2||e2〉)  (〈v′1||e1〉 , 〈v′2||e2〉) implies that (v′1, v′2) ∈ A by forward
closure mentioned above,
– (〈v1||e1〉 , 〈v2||e2〉)  (〈v1||e′1〉 , 〈v2||e′2〉) implies that (e′1, e′2) ∈ A by forward
closure mentioned above, or
– (〈v1||e1〉 , 〈v2||e2〉) (c1, c2) by some other means, can only happen when vi or
ei is a (co-)value by the definition of 7→µ #»S µ˜ #»S β #»S ς #»S .
So it can’t happen that 〈vi||ei〉 7→ c′ 67→ where c′ /∈ FinalCommandS .
Now, by Lemma 7.12, we have c :
(
Γ G ∆
)
and by Lemma 7.13, there is a
σ ∈
(
G
)
seq . Note that for every data type F( #     »X:k) : SG, we have that (α, α) ∈r
F( #»C )
z
pi2◦σ
whenever
r
F( #»C )
z
pi2◦σ
∈ JSKpi2◦σ because FinalCommand #»S is constructive,
so (α, α) is orthogonal to every construction of F. Also, for every co-data type G( #     »X:k) :
S ∈ G, we have that (x, x) ∈
r
G( #»C )
z
pi2◦σ
whenever
r
G( #»C )
z
pi2◦σ
∈ JSKpi2◦σ for the
dual reason. So because Γ and ∆ are observable with respect to G, we can extend
σ ∈
(
G
)
seq to σ′ ∈
(
Γ G ∆
)
seq by substituting (co-)variables for themselves.
And so c = c {pi1 ◦ σ} ‚ c {pi2 ◦ σ′} = c, meaning that c 7→ c′ 67→ implies c′ ∈
FinalCommand #»S .
Note that the operational reduction relation 7→ is deterministic for both the V
and N strategies, and while the LV and LN operational reduction relations aren’t
297
strictly deterministic—for example, in LV we can have multiple possible standard
reductions as in
〈x||µ˜x′.c {y/y′}〉 ←[µ˜LV 〈x||µ˜x′. 〈y||µ˜y′.c〉〉 7→µ˜LV 〈y||µ˜y′.c {x/x′}〉
because both are evaluation contexts—they enjoy the diamond property, which
implies confluence. Furthermore, we have the following set of final commands
for the combination of four strategies defined as the smallest set cfin ∈
FinalCommandV,N ,LV,LN such that:
–
〈
K
#»
C ( #»E, #»V )
∣∣∣∣∣∣α〉 ∈ FinalCommandV,N ,LV,LN ,
–
〈
x
∣∣∣∣∣∣O #»C [ #»V , #»E ]〉 ∈ FinalCommandV,N ,LV,LN ,
– 〈v :: LV||µ˜x::LV .cfin〉 ∈ FinalCommandV,N ,LV,LN if v /∈ ValueLV and
µ˜x::LV .cfin /∈ CoValueN , and
– 〈µα::LN .cfin||e :: LN〉 ∈ FinalCommandV,N ,LV,LN if e /∈ CoValueLN and v /∈
ValueLN
which is necessary, sufficient, and constructive. Therefore, we have type safety for any
combination of these four strategies. Unfortunately, however, the non-deterministic
strategy U is not covered by this application on the model which assumes at least
a confluent operational semantics, and so it would require other techniques like the
syntactic approach of progress and preservation.
Strong normalization
We can also demonstrate the strong normalization property: a command c, term
v, and co-term e is strongly normalizing if there are no infinite reduction sequences
starting from c, v, or e. This property relies heavily on the fact that ‚‚ can be smaller
than ‚andWS can be smaller than US . It also relies on the following forward closure
property that types inherit from the safe pole ‚.
Lemma 7.15 (Forward closure). For all safety conditions S = ( ‚, ‚‚,‚), if c ∈ ‚
and c → c′ ∈ ‚ implies c′ ∈ ‚, then v, e ∈ A‚B and v → v′ and e → e′ implies
v′, e′ ∈ A‚B
298
Proof. Since v ∈ A‚B , we know 〈v||e〉 ∈ ‚, and since 〈v||e〉 → 〈v′||e〉 we have 〈v′||e〉 ∈
‚ as well. Therefore, v ∈ A‚B . The dual argument holds for e ∈ A‚B where e→ e′.
Strong normalization is different from type safety in that substitutions are much
easier to come by. In particular, strong normalization is an observable property for all
well-typed, open expressions. This is because (co-)variables act as benign (co-)values
which can be safely paired with any strongly normalizing parter.
Lemma 7.16. v and e are strongly normalizing if and only if 〈v||α〉 and 〈x||e〉 are.
Proof. Note that a command, term, or (co-)term is strongly normalizing if and only
if all its reducts are, so we can perform noetherian induction on the well-founded
(partial) order imposed by the possible reduction sequences starting from any strongly
normalizing expression. When 〈v||α〉 is strongly normalizing than v is as well because it
is a sub-term of 〈v||α〉, so any reduction of v is also a reduction inside 〈v||α〉. So suppose
that v is strongly normalizing, and we can see that 〈v||α〉 is as well by noetherian
induction on the possible reduction sequences starting from v:
– If 〈v||α〉 → 〈v′||α〉 because v → v′, then 〈v′||α〉 is strongly normalizing by the
inductive hypothesis.
– If 〈v||α〉 → c {α/β} because v = µβ.c then c is strongly normalizing because it
is a sub-command of v, and c {α/β} is strongly normalizing as well since any
reduction on c {α/β} is a corresponding reduction on c.
The fact that e is strongly normalizing if and only if 〈x||e〉 is follows dually to the
above.
We therefore derive strong normalization for any combination of sufficiently
deterministic, stable (Definition 6.1) and focalizing (Definition 5.2) strategies from
adequacy by restricting ‚ and WS to only strongly normalizing expressions.
Theorem 7.3 (Strong normalization). Given any stable and focalizing substitution
strategies S such that µ #»S µ˜ #»S is deterministic, and derivations of
(
Γ `ΘG ∆
)
seq and
Θ `G A : k,
a) if c :
(
Γ `ΘG ∆
)
is derivable in µµ˜ #»S then the weak µS µ˜Sηµηµ˜β
SςSν reduction
theory is strongly normalizing in c,
299
b) if Γ `ΘG v : | ∆ is derivable in µµ˜ #»S then the weak µS µ˜Sηµηµ˜βSςSν reduction
theory is strongly normalizing in v, and
c) if Γ | e : `ΘG ∆ is derivable in µµ˜ #»S then the weak µS µ˜Sηµηµ˜βSςSν reduction
theory is strongly normalizing in e.
Proof. We use CommandSN , TermSN , and CoTermSN to denote the sets of strongly
normalizing commands, terms, and co-terms, respectively, and instantiate the model
as follows:
WS = ({(v, v′) ∈ US | v, v′ ∈ TermSN} , {(e, e′) ∈ US | e, e′ ∈ CoTermSN})
‚ = CommandSN × CommandSN
‚‚ = {(〈v||e〉 , 〈v′||e′〉) | v, v′ ∈ TermSN , e, e′ ∈ CoTermSn}
Furthermore, we will form evaluation strategies out of the given substitution strategies
#»S by using the minimal definition of EvalCxts that includes only , 〈||E〉, and 〈V ||〉
S = ( ‚, ‚‚,‚) is a safety condition because it validates closure under expansion,
which can be proved by noetherian induction over the strongly normalizing term and
co-term components of commands in ‚‚. Suppose that (〈v1||e1〉 , 〈v2||e2〉) ∈ ‚‚ and
(〈v1||e1〉 , 〈v2||e2〉)  (c′1, c′2) ∈ ‚. If 〈vi||ei〉  c′′i then c′′i = c′i ∈ ‚ because µ #»S µ˜ #»S is
deterministic. If 〈vi||ei〉 → 〈v′i||ei〉 because vi → v′i, then 〈v′i||ei〉 ∈ ‚ by the inductive
hypothesis. If 〈vi||ei〉 → 〈vi||e′i〉 because ei → e′i, then 〈vi||e′i〉 ∈ ‚ by the inductive
hypothesis. Therefore, 〈vi||ei〉 ∈ ‚.
Note that by Lemma 7.16, WS = (Variable,CoVariable)‚WS . Thus, TS =
(US ,VS ,WS) is an S-world because it is saturated, generative, and focalizing as follows:
– saturation: (Variable,CoVariable) v W‚WSS
∣∣∣∣
VS
by restricted double orthogonal
introduction (Property 7.5 (c)) since (co-)variables are values. Thus, the possible
〈v||α〉 7→ c ∈ CommandSN are
∗ 〈µβ.c||α〉 7→ c {α/β} ∈ CommandSN , so that c ∈ CommandSN and thus
µβ.c ∈ TermSN .
∗
〈
K
#»
C ( #»E, e′, #»e , #»v )
∣∣∣∣∣∣α〉 7→ 〈µα. 〈µβ. 〈K #»C ( #»E, β, #»e , #»v )∣∣∣∣∣∣α〉∣∣∣∣∣∣e′〉∣∣∣∣∣∣α〉 ∈ CommandSN ,
where every reduction in K
#»
C ( #»E, e′, #»e , #»v ) can be traced to a sub-(co-)-term
in µα.
〈
µβ.
〈
K
#»
C ( #»E, β, #»e , #»v )
∣∣∣∣∣∣α〉∣∣∣∣∣∣e′〉, except for the top-level ς reduction
which is already known to lead to a strongly normalizing term.
300
∗
〈
K
#»
C ( #»E, #»V , v′, #»v )
∣∣∣∣∣∣α〉 7→ 〈µα. 〈v′∣∣∣∣∣∣µ˜y. 〈K #»C ( #»E, #»V , y, #»v )∣∣∣∣∣∣α〉〉∣∣∣∣∣∣α〉 ∈ CommandSN ,
is similar to the previous case.
Likewise, for 〈x||e〉 7→ c ∈ CommandSN e ∈ CoTermSN by duality.
– generation: suppose that we have a ‚‚-space A = A|‚WSVS and let (v1, v2), (e1, e2) ∈
A. The fact that 〈vi||ei〉 ∈ CommandSN follows by noetherian induction on the
possible reductions from vi and ei:
∗ If 〈vi||ei〉 7→ c′i then one of vi or ei must be a (co-)value, so 〈vi||ei〉 ∈
CommandSN already.
∗ If 〈vi||ei〉 → 〈v′i||ei〉 because vi → v′i, then (v′i, v1) ∈ A or (v1, v′i) ∈ A by
Lemma 7.15, so 〈v′i||ei〉 ∈ CommandSN by the inductive hypothesis.
∗ If 〈vi||ei〉 → 〈vi||e′i〉 because ei → e′i, then 〈v′i||ei〉 ∈ CommandSN similarly
to the previous case.
– focalization: structures (data structures and co-data observations) made from
strongly normalizing (co-)values are strongly normalizing themselves, since the
only possible reductions are with in those strongly normalizing (co-)values, and
weak reduction does not reduce inside case abstractions, so they are trivially
strongly normalizing.
Now, by Lemma 7.12, we have c :
(
Γ ΘG ∆
)
and by Lemma 7.13, there
is a σ ∈
(
Γ ΘG Γ
)
seq , since WS = (Variable,CoVariable)‚WS and thus
(Variable,CoVariable) v WS |‚WSVS by restricted double orthogonal introduction
(Property 7.5 (c)). Therefore, c {pi1 ◦ σ} ‚ c {pi2 ◦ σ}, which implies that c is strongly
normalizing. The instantiation of well-typed (co-)terms is similar.
We only consider weak reduction, which does not reduce inside case abstractions,
for simplicity since it makes the focalization requirement for case abstractions—that
requires that they be strongly normalizing for all valid substitutions—trivial. However,
we could extend the result to bounded reduction defined in Chapter VI by noting
that there is at least one valid substitution so long as every quantified type of kind
< N is inhabited whenever N is ∞ or M + 1, as was done previously in (Downen
et al., 2015). Also note that each of the four strategies V, N , LV, and LLV are
stable and focalizing and have deterministic µµ˜ rewriting rules, so any combination
of them leads to a strongly normalizing reduction theory. But as with type safety, the
301
non-deterministic U strategy is unfortunately not covered by this application of the
model, because we need to assume determinacy of the primitive rewriting rules for
demonstrating closure under expansion. To extend the strong normalization result to
non-deterministic strategies like U , we would need to utilize a different kind of model
that better handles non-determinism like Barbanera & Berardi’s (1994) symmetric
candidates method.
Soundness of extensionality
Finally, we tackle the more difficult job of showing that the typed extensionality
η axioms are sound with respect to the untyped operational semantics. Note that
adequacy already justifies η in any instance of the model. So the only task left is to
show that an appropriate notion of contextual equivalence is an instance of the model.
The fact that equivalence relates two things, instead of being a predicate over a single
thing, is the reason why we made the model binary instead of unary. Because we will
be referring to the two different sets of rewrite rules, in this section we will use the
shorthand RS for the untyped rewrite rules µ #»S µ˜ #»S β #»S ς #»S ν, and E
G
S for the typed rules
µ #»S µ˜ #»S ηµηµ˜β
GηGν.
First, we define a contextual equivalence relation for commands and (co-)terms of
the µµ˜-calculus based on the idea that we can distinguish the constructors or observers
that were used to build (co-)data structures.
Definition 7.13 (Contextual equivalence). For any strategies #»S , we say that two
commands c1 :
(
Γ `ΘG ∆
)
and c2 :
(
Γ `ΘG ∆
)
are contextually equivalent, written
c1 ∼= #»S c2 :
(
Γ `ΘG ∆
)
, if and only if for all environments Γ′ and ∆′ that are observable
with respect to G and contexts C such that C[c1] :
(
Γ′ `G ∆′
)
and C[c2] :
(
Γ′ `G ∆′
)
,
C[c1] 7→ R #»S D1[c′1] 67→R #»S if and only if C[c2] 7→ R #»S D2[c′2] 67→R #»S for some D1, D2 ∈
EvalCxt #»S and c′1 ∼ω c′2, where ∼ω is a weak equality on structures defined as:
〈
x
∣∣∣∣∣∣O #»B [ #»V , #»E ]〉 ∼ω 〈x∣∣∣∣∣∣∣∣O # »B′ [ # »V ′ , # »E ′ ]〉 〈K #»B ( #»E, #»V )∣∣∣∣∣∣α〉 ∼ω 〈K # »B′ ( # »E ′ , # »V ′)∣∣∣∣∣∣∣∣α〉
Furthermore, contextual equivalence is extended to term and (co-)terms as follows:
Γ `ΘG v ∼= #»S v′ : A | ∆ , ∀Γ′ | e : A `ΘG ∆′, 〈v||e〉 ∼= 〈v′||e〉 :
(
Γ′,Γ `ΘG ∆′,∆
)
Γ′ | e ∼= #»S e′ : A `ΘG ∆′ , ∀Γ `ΘG v : A | ∆, 〈v||e〉 ∼= 〈v||e′〉 :
(
Γ′,Γ `ΘG ∆′,∆
)
302
The requirement that the constructor or observer matches in the final command
is important to prevent contextual equivalence from collapsing: if ι1 () ∼= #»S ι2 (), for
example, then everything is considered equivalent because:
c1 ∼= #»S 〈ι1 ()||µ˜[ι1 ().c1 | ι2 ().c2]〉 ∼= #»S 〈ι2 ()||µ˜[ι1 ().c1 | ι2 ().c2]〉 ∼= #»S c2
Thus, the requirement that the final constructors match means ι1 () 6∼= #» ι2 ().
Contextual equivalence is the “gold standard” equivalence relation for an
operational semantics, as it is well-known that it is the biggest relation that is
compatible (i.e. applies in any context) and doesn’t confuse answers.
Theorem 7.4 (Coarsest structure-preserving congruence). Suppose that R is a
compatible relation between typed commands, terms, and co-terms such that for all
environments Γ and ∆ observable with respect to G, and related commands c1 R c2 :(
Γ `G ∆
)
implies c1 7→ R #»S D1[c′1] 67→R #»S if and only if c2 7→ R #»S D2[c′2] 67→R #»S for some
D1, D2 ∈ EvalCxt #»S and c′1 ∼ω c′2. Then R is included in ∼= #»S .
Proof. For the R relation on commands, suppose that c1 R c2 :
(
Γ `ΘG ∆
)
for
arbitrary environments. Now, let Γ′ and ∆′ be observable with respect to G and
C be any context such that C[ci] :
(
Γ′ `G ∆′
)
. By the compatibility of R, we have
C[c1] R C[c2] :
(
Γ′ `G ∆′
)
, and so by assumption c1 7→ R #»S D1[c′1] 67→R #»S . Therefore,
c1 ∼= #»S c2 :
(
Γ `ΘG ∆
)
. Similarly, compatibility of the (co-)term R relations means that
if Γ `ΘG v R v′ : A | ∆ and Γ′ | e : A `ΘG ∆′ then 〈v||e〉 R 〈v′||e〉 :
(
Γ′,Γ `ΘG ∆′,∆
)
and
thus 〈v||e〉 ∼= #»S 〈v′||e〉 :
(
Γ′,Γ `ΘG ∆′,∆
)
, and dually for co-terms.
We can now show that the typed βη theory of (co-)data is sound with respect to
contextual equivalence of the untyped operational semantics for the polarized strategy
P = V ,N combining both call-by-value and call-by-name evaluation. As with type
safety, we are limited in the Γ and ∆ environments we can observe: so long as the
variables in Γ only stand for co-data abstractions and the co-variables in ∆ only stand
for data abstractions, we cannot see the use of η in the result of the program when
we only observe the top-level constructor or observer.
Theorem 7.5 (Soundness of βη w.r.t. βς in P). Given environments Γ and ∆ that
are observable with respect to G, and a derivation of
(
Γ `G ∆
)
seq , it follows that
a) c =EGP c
′ :
(
Γ `G ∆
)
implies c ∼=P c′ :
(
Γ `G ∆
)
,
303
b) v =EGP v
′ :
(
Γ `G ∆
)
A implies v ∼=P v′ :
(
Γ `G ∆
)
A, and
c) e =EGP e
′ :
(
Γ `G ∆
)
A implies e ∼=P e′ :
(
Γ `G ∆
)
A.
Proof. We instantiate the model as follows:
‚‚ = ‚
‚ = {(c1, c2) ∈ ‚| ∃c′1 ∼ω c′1, c1 7→ RP c′1 ⇐⇒ c2 7→ RP c′2}
WN = UN
WV = UV
S = ( ‚, ‚‚,‚) is a safety condition because it is closed under expansion by definition.
Furthermore, both TN = (UN ,VN ,WN ) and TV = (UV ,VV ,WV) are S-worlds because
they are trivially saturated and focalizing since WS = US , and trivially generative by
Lemma 7.14.
Now, by Lemma 7.12, we have c⇔ c′ :
(
Γ G ∆
)
and by Lemma 7.13, there is
a σ ∈
(
G
)
seq . Similar to the proof of type safety (Theorem 7.2), we show that
variables are related by every co-data type and co-variables are related by every date
type, letting us extend σ ∈
(
G
)
seq to cover
(
Γ G ∆
)
seq . In particular, note that
〈
K
#»
C ( #»E, #»V )
∣∣∣∣∣∣α〉 7→ RP 〈K #»C ( #»E, #»V )∣∣∣∣∣∣α〉 ∼ω 〈K # »C′ ( # »E ′ , # »V ′)∣∣∣∣∣∣∣∣α〉← [RP 〈K # »C′ ( # »E ′ , # »V ′)∣∣∣∣∣∣∣∣α〉
by reflexivity so
〈
K
#»
C ( #»E, #»V )
∣∣∣∣∣∣α〉 ‚ 〈K # »C′ ( # »E ′ , # »V ′)∣∣∣∣∣∣∣∣α〉. Likewise 〈y∣∣∣∣∣∣O #»C [ #»V , #»E ]〉 ‚〈
y
∣∣∣∣∣∣∣∣O # »C′ [ # »V ′ , # »E ′ ]〉. Therefore, each co-data type relates y with y and each data type
relates β with β by restricted double orthogonal introduction (Property 7.5 (c)), which
means that we can extend σ ∈
(
G
)
seq to σ′ ∈
(
Γ G Γ
)
seq by substituting all
(co-)variables in Γ and ∆ with themselves. Thus, we get that there is some c1 ∼ω c′1
such that c = c {pi1 ◦ σ} 7→ RP c1 if and only if c′ = c′ {pi2 ◦ σ} 7→ RP c′1. Therefore
since EGP equality is also compatible, we have that it is included by Theorem 7.4. The
result for (co-)terms follows similarly, using the fact that for all semantic types A and
(v, v′), (e, e′) ∈ A, 〈v||e〉 ‚ 〈v′||e′〉.
As a consequence of the soundness of the typed βη equational theory with respect
to the P instance of contextual equivalence means that βη is coherent (i.e. there are at
least two non-equal expression) in any observable context. In particular, previously we
304
noted that ι1 () and ι2 () must be distinct terms with respect to contextual equivalence,
which is forced by the fact that 〈ι2 ()||α〉 6∼ω 〈ι1 ()||α〉. Therefore, since the βη-based
equational theory is included in contextual equivalence by the above theorem, it must
be that Γ `ΘG ι2 () 6=µP µ˜PηµβGηG ι1 () : A⊕B | ∆ for any observable Γ and ∆, as that
would imply Γ `ΘG ι2 () ∼=P ι1 () : | ∆.
305
CHAPTER VIII
The Polar Basis for Types
How many types do programming languages really need? Mainstream statically
typed programming languages all have some mechanisms for programmers to declare
their own custom types to make writing software easier, so in practice there seems
to be a limitless supply of different types in languages. However, when we are not
working in a language but with a language—for example, to study its theoretical
properties or to develop practical implementations—the fewer constructs and types
the language has the easier it is to work with. This is where functional programming
languages can shine by using their connection with logic to simplifying the language.
For example, logic tells us that we only need a binary conjunction connective since
larger conjunctions can be encoded by nesting the binary connective, and nothing is
gained or lost because either nesting is equivalently provable: (A∧B)∧C has a proof
if an only if A ∧ (B ∧ C) does. Similarly, an implication with multiple assumptions
can be encoded by nesting the simpler single-assumption implication connective in
the right way: (A ∧ B) ⊃ C has a proof if and only if A ⊃ (B ⊃ C) does. This
lets us encode complex propositions and connectives in terms of a smaller number of
more basic connectives. These encodings correspond to common techniques to simplify
models of functional programming languages down to just a handful of primitive types—
conjunction corresponds to pairing, so nested binary pairs can represent n-ary tuples,
and implication corresponds to functions, so single-input functions can represent n-ary
functions by currying—because the rest can be encoded by converting to and fro. But
we don’t just care if there are programs of the right types, we also care about what
those programs do.
Unfortunately, in real functional languages these seemingly obvious encodings
are not accurate because the two types do not actually describe the “same” set of
program behaviors. If we want to say that a type is unnecessary because it can be
encoded away, we should expect a one-for-one correspondence between the programs
of both types, but this is often not the case because of the effects in the language.
For example, encoding triples as nested pairs does not cause issue in a strict language
like SML, but in a lazy language like Haskell we can observe a difference because of
306
divergent (i.e. infinitely looping) or erroneous values like 1/0 or undefined. We can
convert between nested pairs of type (a, (b, c)) and triples of type (a, b, c) as follows:
fromTriple :: (a, b, c)→ (a, (b, c))
fromTriple (x, y, z) = (x, (y, z))
toTriple :: (a, (b, c))→ (a, b, c)
toTriple p = (fst p, fst (snd p), snd (snd p))
However, the nested pair type (Int, (Bool, String)) in Haskell contains both (1, loop())
and (1, (loop(), loop())), where loop is a recursive function that loops forever and never
returns an answer:
loop :: a→ b
loop x = loop x
These two nested tuples can be distinguished by pattern-matching: for example,
casexof ( , ( , ))→ 9 yields 9 when x = (1, (loop(), loop())) but loops forever when
x = (1, loop()). Yet the two different pairs are collapsed in the triple type (Int, b, c)
which can only express (1, loop(), loop()), so a round trip to and from triples doesn’t
give back what we put in.1
The issues with unfaithful encodings are not just isolated to lazy languages;
strict languages like SML have similar problems with different types. For example, the
common technique known as currying in functional languages, which converts between
functions of type (a, b)→ c and a→ (b→ c), lets us represent binary functions with
unary functions nested in the right way as follows:
curry f x y = f (x, y) uncurry f (x, y) = f x y
However, this encoding too is not accurate. The type a → (b → c) contains both
functions λx.loop() and λx.λy.loop() (where the non-terminating loop function raises
is defined similarly in SML), which are observably different because the partial
application f 1 loops forever when f = λx.loop() and returns a function when
1There is also the stricter alternative definition toTriple (x, (y, z)) = (x, y, z), but a round-trip
with this instead collapses (5, loop()) and loop().
307
f = λx.λy.loop(). Yet, uncurry collapses these distinct values into λ(x, y).loop(),
so a round-trip of uncurrying and currying does not give back the same function.2
Both of these counter-examples to simple, well-accepted encodings still cause trouble
with infinite loops in place of exceptions.
The impact of inaccuracy on round-trips means that encodings have limited use
in practice. Clearly we would prefer to have n-ary tuples instead of encoding them
by hand in our source programming languages. Likewise, we want n-ary tuples in
the target representation of lazy languages since they are more efficient to implement
by reducing indirection and excessive thunking. So to utilize the above encodings in
the middle of the compilation process, we need to go back and forth, and inaccurate
encodings means that properties of the source and target language are lost: for example,
the fact that λ(x, y).f (x, y) = f in ML is lost by currying since λx.λy.f x y 6= f . So
does this mean all hope is lost when our functional languages have effects, or even just
general recursion? Must we choose between living with unfaithful encodings or giving
up entirely on the game of encoding complex types into simpler primitives altogether?
Thankfully, we do not have to choose! The root cause of the unfaithfulness
comes from a mismatch in the opportunities for strictness or laziness of programs
that is implicit in types. This inherent connection between types and evaluation
strategy (i.e. strictness and laziness) which we have seen previously under the guise
of polarity (Zeilberger, 2009; Munch-Maccagnoni, 2013) in Chapter IV, which was
originally developed in the setting of proof search not evaluation, as well as the call-
by-push-value strategy (Levy, 2003). And it turns out that this fine-grained approach
where programs can intermingle both strict and lazy evaluation gives us the tools
we need to build strong encodings of types in languages with effects that accurately
represent user-defined types.
For our setting, we will continue to model programs in the classical sequent
calculus developed in the previous chapters (specifically Chapter V). This calculus
has a built in control effect which makes the above issues of strictness and laziness
relevant for encodings (because the root of the issue is programs which fail to return
the expected result, regardless of whether the failure is caused by aborting the current
2Haskell also exhibits problems with currying, but instead due to differences in strictness on pairs.
Specifically, λ( , ).9 and λ .9 are different functions of type (a, b) → Int because they differ on
the input ⊥ :: (a, b), but these two functions are collapsed by a round-trip through currying and
uncurrying (one way or the other, depending on the strictness of uncurry).
308
control path or an infinite loop), and lets us express a set of basic connectives with
pleasant symmetries and relationships to one another.
This chapter covers the following topics:
– A primitive basis of polarized connectives suitable for encoding all simple (non-
polymorphic and non-recursive) user-defined (co-)data types in languages with
effects (Section 8.1).
– A definition of isomorphism between polarized types and simple (co-)data
declarations in the sequent calculus (Section 8.2).
– A syntactic theory of isomorphisms for (co-)data types and their declarations
(Section 8.3).
– A demonstration that the commonly expected algebraic and logical laws are
sound (with respect to type isomorphism) for the primitive basis of polarized
connectives (Section 8.4).
– An encoding of all simple user-defined (co-)data types in terms of the primitive
basis, such that the encoded type is indeed isomorphic to the user-defined one
(Section 8.5).
We stick to simple types for simplicity, to avoid the syntactic overhead of quantified type
annotations from the higher-order parametric µµ˜ sequent calculus from Section 6.2,
and because they have a canonical finite basis in terms of polarized connectives.
However, this development does not rely on simplicity, and could be generalized to
accomodate polymorphic (co-)data declarations like those for the ∀ and ∃ connectives.
Accommodating recursive types from Chapter VI, though, would require a different
analysis.
Polarizing User-Defined (Co-)Data Types
We will express the polarized encoding problem—translating user-defined types
into a fixed set of pre-defined primitive types—within the common framework of
data and co-data in the parametric µµ˜-sequent calculus. Our source language for
representing user-defined types is the parametric µµ˜-calculus from Chapter 5.2. In
contrast to the λ-calculus, this setting makes it easier to include both call-by-name and
call-by-value evaluation within the same program which lets us simultaneously model
309
both ML- and Haskell-like languages within one framework, and to express additional
types (like A ` B and ∼A in Section 8.1) with pleasant symmetric properties (like
the algebraic and logical laws in Section 8.4) that are generally not found in λ-based
languages.
For example, to model pairs from functional languages, we can declare the
following pair type for a kind S—instantiating V for S for ML-like pairs and N
for S for Haskell-like pairs—as follows to get the associated left and right rules:
data (X : S)×S (Y : S) : Swhere
PairS : (X : S, Y : S ` X ×S Y | )
Γ `G v1 : A | ∆ Γ′ `G v2 : B | ∆′
Γ,Γ′ `G PairS(v1, v2) : A×S B | ∆,∆′
×SRPairS
c :
(
Γ, x1 : A, x2 : B `G ∆
)
Γ | µ˜[PairS(x1, x2).c] : A×S B `G ∆
×SL
For this data type, the structure PairS(v1, v2) is just like a pair from the respective
functional language, and the usual case-analysis expression can be written as:
case v of PairS(x1, x2) ⇒ v′ , µα. 〈v||µ˜[PairS(x1, x2).〈v||α〉]〉. As another example,
function types do not need to be primitives in this language, since they can be
declared as co-data types. In particular, we have both strict and lazy functions by
again instantiating the right kind for S—picking V for strict functions and N for lazy
ones—to get the associated left and right rules for functions:
codata (X : S)→S (Y : S) : Swhere
CallS : (X : S | X →S Y ` Y : S)
c :
(
Γ, x : A `G β : B,∆
)
Γ `G µ(CallS [x, β].c) : A→S B | ∆
→SR
Γ `G v : A | ∆ Γ′ | e : B `G ∆′
Γ,Γ′ | CallS [v, e] : A→S B `G ∆,∆′
→SLCallS
310
As discussed in Chapter V, the familiar λ-calculus notation for function abstraction
and application is written as:
λx.v , µ(CallS [x, β].〈v||β〉) v v′ , µβ. 〈v||CallS [v′, β]〉
Our target language, which we will use for encoding all the constructs from the
source, is not really a different language at all. Rather, it is a limited subset of the
source language, consisting of only a fixed number of different constructs. The idea
is to declare just a handful of (co-)data types up front, collectively named P, and
then forget the declaration mechanism entirely to prevent the language from being
extended with any new types. The key, then, is to ensure that all the programs from
the source language can be faithfully encoded into the limited constructs included in
the target, without running into the same troubles of unfaithful encodings from the
introduction. In other words, we want to be able to faithfully encode any collection
of declarations G into the pre-determined basis P .
The brunt of our pre-defined primitive data and co-data types is given in
Figure 8.1. Each of these (co-)data types are chosen as a basis because of their
symmetry—for each one, there is a dual mirror image on the other side—and because
they all perform one, and only one, aspect of the functionality allowed by the
declaration mechanism. The additive (co-)data types capture the use of multiple
different constructors or observers for a type by giving a choice between two (⊕
and &) or a choice of no (0 and >) alternatives. The multiplicative (co-)data types
capture the use of multiple components within structures or observations, by giving
a combination of two (⊗ and `) or no (1 and ⊥) parts. And finally the negation
(co-)data types, which capture the ability for data structures to contain co-terms and
co-data observations to contain terms.
We might think that we have some flexibility in choosing the evaluation strategies
for the declarations in Figure 8.1. But as it turns out, since we want to use these
(co-)data types as the backbone of faithful encodings, our hand is forced. Intuitively,
each of these declarations follows a simple rule of thumb for choosing the kinds for
types: every type to the left (of `) is V and every type to the right is N , except for the
active type whose kind is the reverse. This rule of thumb has a few consequences. The
first is that every data type is call-by-value and every co-data type is call-by-name,
which follows the general wisdom of polarization in computation (Zeilberger, 2009;
Munch-Maccagnoni, 2013). The second consequence is that every data type constructor
311
Additive (co-)data types
data (X : V)⊕ (Y : V) : V where
ι1 : (X : V ` X ⊕ Y | )
ι2 : (Y : V ` X ⊕ Y | )
codata (X : N ) & (Y : N ) : N where
pi1 : ( | X & Y ` X : N )
pi2 : ( | X & Y ` Y : N )
data 0 : V where codata> : N where
Multiplicative (co-)data types
data (X : V)⊗ (Y : V) : V where
( , ) : (X : V , Y : V ` X ⊗ Y | )
codata (X : N )` (Y : N ) : N where
[ , ] : ( | X ` Y ` X : N , Y : N )
data 1 : V where () : ( ` 1 | ) codata⊥ : N where [] : ( | ⊥ ` )
Involutive negation (co-)data types
data∼(X : N ) : V where
∼ : ( ` ∼X | X : N )
codata¬(X : V) : N where
¬ : (X : V | ¬X ` )
FIGURE 8.1. Declarations of the primitive polarized data and co-data types.
builds on V types and every co-data type constructor builds on N types, except for
the negation constructors which are reversed because their underlying (co-)terms are
reversed. The last consequence is that the notion of data type values and co-data type
co-values are hereditarily as restrictive as possible, where a structure or observation is
only a (co-)value if it contains components that are (co-)values in the most restrictive
sense.
The basic (co-)data types from Figure 8.1 are still incomplete, though, for
our purpose of encoding all (co-)data types expressible in the source language. In
particular, how could we possibly represent a type like the call-by-name pair A×N B?
The ⊗ data type constructor won’t do since it operates over the wrong kind of types.
Therefore, we need a mechanism for plainly “shifting” between N and V kinds of
types, and to do that we must break our rule of thumb. One way to do the conversion
is with singleton (co-)data types, declared as follows, that wraps a component of the
312
data ↓S(X : S) : V where
↓S : (X : S ` ↓SX | )
codata ↑S(X : S) : N where
↑S : ( | ↑SX ` X : S)
data S⇑(X : V) : Swhere
S⇑ : (X : V ` S⇑X | )
codata S⇓(X : N ) : Swhere
S⇓ : ( | S⇓X ` X : N )
FIGURE 8.2. Declarations of the shifts between strategies as data and co-data types.
other strategy:
data ↓(X : N ) : V where
↓ : (X : N ` ↓X | )
codata ↑(X : V) : N where
↑ : ( | ↑X ` X : V)
The other possibility is a singleton (co-)data type that is of the other strategy, declared
as follows:
codata ⇓(X : N ) : V where
⇓ : ( | ⇓X ` X : N )
data ⇑(X : V) : N where
⇑ : (X : V ` ⇑X | )
As it turns out, we will use both styles of shifts because they are each useful in different
situations for encoding complex (co-)data types. As a technical device, we will use a
whole family of shifts parameterized by a kind of strategy as defined in Figure 8.2, with
the above as defaults. The idea is that ↓S and ↑S shift to V and N (respectively) from
S, whereas S⇑ and S⇓ shift from V and N (respectively) to S. These parameterized
shifts include some redundancy (as we will see in Section 8.4), but they are useful
notationally for generically manipulating types.
By combining the polarized types from Figure 8.1 with the shifts from Figure 8.2,
we get the polarized basis P for all user-defined (co-)data types. In particular, the
polarized basis is expressive enough to translate programs using any collection G of
user-defined (co-)data types as shown in Figure 8.3, so that if c :
(
Γ `ΘG ∆
)
thenJcKG : (JΓKG `ΘP J∆KG) (where JΓKG and J∆KG are defined pointwise). We informally
use deep pattern matching to aid writing the translation, with the understanding
that it is desugared into several shallow patterns in the obvious way, and to express
the repeated composition of the binary connectives, we define the (“big”) versions of
the polarized additive, multiplicative, and quantifier connectives over n-ary vectors of
313
types and type variables as follows:
⊕
 , 0
⊕
(A, #»B) , A⊕
(⊕ #»
B
) ⊗
 , 1
⊗
(A, #»B) , A⊗
(⊗ #»
B
)
¯
 , >
¯
(A, #»B) , A&
(¯
#»
B
) ¸
 , ⊥
¸
(A, #»B) , A` (¸ #»B)
ιi (v) , ι2 ( i. . .ι1 (v)) pii [e] , pi2 [ i. . .pi1 [e]]
(vn, . . . , v1) , (vn, (. . ., (v1, ()))) [e1, . . . , en] , [e1, [. . ., [en, []]]]
This polarizing encoding is sound so that equalities in the source, including η,
are preserved in the target.
Theorem 8.1 (Polarization soundness). For any (composite) strategy S including V
and N , and i = 1, 2:
a) if ci : (Γ `ΘG ∆) and c1 =µS µ˜Sηµηµ˜βGηG c2 then JciKG : (JΓKG `ΘP J∆KG) andJc1KG =µS µ˜Sηµηµ˜βPηP Jc2KG,
b) if Γ `ΘG vi : A|∆ v1 =µS µ˜Sηµηµ˜βGηG v2 then JΓKG `ΘP JviKG : JAKG|J∆KGJv1KG =µS µ˜Sηµηµ˜βPηP Jv2KG, and
c) if Γ|ei : A `ΘG ∆ and e1 =µS µ˜Sηµηµ˜βGηG e2 then JΓKG|JeiKG : JAKG `ΘP J∆KG andJe1KG =µS µ˜Sηµηµ˜βPηP Je2KG.
Proof. The polarizing encoding of (co-)data types as shown in Figure 8.3 is stated in
terms of deep pattern matching on data structures and co-data observations, which
avoids the terrifying bureaucracy of the many levels of shallow patterns needed to
implement the translation. Thankfully, these deep patterns fit a certain form which
makes them much easier to desugar compared to fully general patterns. In particular,
every pattern used in the encoding begins with a match on a S⇑ or S⇓ shift, then
several nested matches on the additive structure of type A⊕ B or A& B, and then
314
JXKG , XJ〈v||e〉KG , 〈JvKG||JeKG〉JxKG , x JαKG , αJµα.cKG , µα.JcKG Jµ˜x.cKG , µ˜x.JcKG
Given data F(Θ) : Swhere
#                                                                                    »
Ki :
(
#               »
Ai1 : Tijj ` F(Θ) | #                »Bij : Rijj
)i
∈ G:
r
F( #»C )
z
G
, S⇑
⊕ #                                                                                  »⊗( #                                 »∼(↑RijJBijKGθ)j, #                      »↓TijJAijKGθj)i

where θ =
#                      »{JCKG/X}q
Ki( # »eijj, # »vijj)
y
G , S⇑(ιi(
#                                »∼(↑Rij [JeijKG])j, #                       »↓Tij(JvijKG)j))s
µ˜[
#                                »
Ki( #  »αijj, # »xijj).ci
i
]
{
G
, µ˜[
#                                                                                    »
S⇑(ιi(
#                         »∼(↑Rij [αij])
j
,
#               »↓Tij(xij)
j
)).JciKGi]
Given codataG(Θ) : Swhere
#                                                                                    »
Oi :
(
#              »
Aij : Tijj | G(Θ) ` #                »Bij : Rijj
)i
∈ G:
r
G( #»C )
z
G
, S⇓
˘ #                                                                               »˙( #                               »¬(↓TijJAijKGθ)j #                        »↑RijJBijKGθj)i

where θ =
#                      »{JCKG/X}q
Oi[ # »vijj, # »eijj]
y
G , S⇓[pii[
#                               »¬(↓Tij(JvijKG))j, #                       »↑Ri1 [JeijKG]j]]s
µ(
#                               »
Oi[ # »xijj, #  »αijj].ci
i
)
{
G
, µ(
#                                                                                  »
S⇓[pii[
#                      »¬[↓Tij(xij)]
j
,
#              »↑Tij [αij]
j
]].JciKG i)
FIGURE 8.3. A polarizing translation from G into P .
315
concludes with a match on the multiplicative structure of the following form:
p ∈ Pattern ::= S⇑
(
p+
)
p+ ∈ AddPattern ::= p× | ι1
(
p+
)
| ι2
(
p+
)
p× ∈ MultPattern ::= x | () |
(
x, p×
)
| ∼
(
q×
)
| ↓S(x)
q ∈ CoPattern ::= S⇓
[
q+
]
q+ ∈ AddCoPattern ::= q× | pi1
[
q+
]
| pi2
[
q+
]
q× ∈ MultCoPattern ::= α | [] |
[
α, q×
]
| ¬p× | ↑S [α]
We can then easily desugar (co-)patterns of this form by just un-nesting the pattern
one level at a time within the alternatives of every pattern matching (co-)term as
follows:
µ˜
[
#                    »
S⇑
(
p+i
)
.ci
i
]
, µ˜
[
S⇑x.
〈
x
∣∣∣∣∣∣∣∣µ˜[ #      »p∃i .cii]〉]
µ˜

#                  »
ι1
(
p+i
)
.ci
i
#                   »
ι2
(
p′+i
)
.c′i
i
 , µ˜
ι1 (x).
〈
x
∣∣∣∣∣∣∣∣µ˜[ #       »p+i .cii]〉
ι2 (x).
〈
x
∣∣∣∣∣∣∣∣µ˜[ #        »p′+i .c′ii]〉

µ˜[x.c] , µ˜x.c
µ˜
[(
y, p×
)
.c
]
, µ˜
[
(y, x).
〈
x
∣∣∣∣∣∣µ˜[p×.c]〉]
µ˜
[
∼
(
q×
)
.c
]
, µ˜
[
∼ (α).
〈
µ
(
q×.c
)∣∣∣∣∣∣α〉]
µ
(
#                  »
S⇓
[
q+i
]
.ci
i
)
, µ
(
S⇓[α].
〈
µ
(
#      »
q∀i .ci
i)∣∣∣∣∣∣∣∣α〉)
µ

#                  »
pi1
[
q+i
]
.ci
i
#                   »
pi2
[
q′+i
]
.c′i
i
 , µ
pi1 [α].
〈
µ
(
#       »
q+i .ci
i
)∣∣∣∣∣∣∣∣α〉
pi2 [α].
〈
µ
(
#        »
q′+i .c
′
i
i
)∣∣∣∣∣∣∣∣α〉

µ(α.c) , µα.c
µ
([
β, q×
]
.c
)
, µ
(
[β, α].
〈
µ
(
q×.c
)∣∣∣∣∣∣α〉)
µ
(
¬
[
p×
]
.c
)
, µ
(
¬ [x].
〈
x
∣∣∣∣∣∣µ˜[p×.c]〉)
Additionally, in order to prove the soundness of the η law for (co-)data types
with respect to the encoding, we use a couple helpful tricks with η. First, note that
316
the seemingly stronger version of the η law for co-data types which applies to values
(or the stronger η law for data types that applies to co-values)
(ηFS) E : F(
#»
C ) = µ˜
[ #                                                   »K( #»α , #»x ).〈K( #»α , #»x )||E〉]
(ηGS ) V : G(
#»
C ) = µ
( #                                                 »O[ #»x , #»α ].〈V ||O[ #»x , #»α ]〉)
can be derived from the η law on (co-)variables by combining with the ηµ and ηµ˜ rules
for µ- and µ˜-abstractions as follows:
E : F( #»C ) =ηµηµ˜ µ˜y:F(
#»
C ).
〈
µβ:F( #»C ). 〈y||β〉
∣∣∣∣∣∣E〉
=ηF µ˜y:F(
#»
C ).
〈
µβ:F( #»C ).
〈
y
∣∣∣∣∣∣µ˜[ #                                                  »K( #»α , #»x ).〈K( #»α , #»x )||β〉]〉∣∣∣∣∣∣E〉
=µ µ˜y:F(
#»
C ).
〈
y
∣∣∣∣∣∣µ˜[ #                                                   »K( #»α , #»x ).〈K( #»α , #»x )||E〉]〉
=ηµ˜ µ˜
[ #                                                   »K( #»α , #»x ).〈K( #»α , #»x )||E〉]
V : G( #»C ) =ηµηµ˜ µβ:G(
#»
C ).
〈
V
∣∣∣∣∣∣µ˜y:G( #»C ). 〈y||β〉〉
=ηG µβ:G(
#»
C ).
〈
V
∣∣∣∣∣∣µ˜y:G( #»C ). 〈µ( #                                                »O[ #»x , #»α ].〈y||O[ #»x , #»α ]〉)∣∣∣∣∣∣β〉〉
=µ˜ µβ:G(
#»
C ).
〈
µ
( #                                                 »O[ #»x , #»α ].〈V ||O[ #»x , #»α ]〉)∣∣∣∣∣∣β〉
=ηG µ
( #                                                 »O[ #»x , #»α ].〈V ||O[ #»x , #»α ]〉)
Second, note that we have the following equalities
(µ˜ηFS) c =
〈
z
∣∣∣∣∣∣µ˜[ #                                                       »K( #»α , #»x ).c {K( #»α , #»x )/z}]〉 (µ˜z.c : F( #»C ) ∈ CoV alueS)
(µ˜ηGS ) c =
〈
µ
( #                                                     »H[ #»x , #»α ].c {H[ #»x , #»α ]/γ})∣∣∣∣∣∣γ〉 (µγ.c : G( #»C ) ∈ V alueS)
the first of which is derived from the η law of the F as follows:
c =µ˜S 〈z||µ˜z.c〉
=ηFS
〈
z
∣∣∣∣∣∣µ˜[ #                                                         »K( #»α , #»x ).〈K( #»α , #»x )||µ˜z.c〉]〉 (µ˜z.c : F( #»C ) ∈ CoV alueS)
=µ˜S
〈
z
∣∣∣∣∣∣µ˜[ #                                                       »K( #»α , #»x ).c {K( #»α , #»x )/z}]〉
and the second of which is likewise derived from the η law of G as follows:
c =µS 〈µγ.c||γ〉
=ηGS
〈
µ
( #                                                       »H[ #»x , #»α ].〈µγ.c||H[ #»x , #»α ]〉)∣∣∣∣∣∣γ〉 (µγ.c : G( #»C ) ∈ V alueS)
317
=µS
〈
µ
( #                                                     »H[ #»x , #»α ].c {H[ #»x , #»α ]/γ})∣∣∣∣∣∣γ〉
As examples, the particular instances of this rule for the polarized data types are:
(µ˜η⊗V ) c = 〈z||µ˜[(x, y).c {(x, y)/z}]〉 (z : A⊗B)
(µ˜η⊕V ) c = 〈z||µ˜[ι1 (x).c {ι1 (x)/z}|ι2 (y).c {ι2 (y)/z}]〉 (z : A⊕B)
(µ˜η1V) c = 〈z||µ˜[().c {()/z}]〉 (z : 1)
(µ˜η0V) c = 〈z||µ˜[]〉 (z : 0)
(µ˜η∼V ) c = 〈z||µ˜[∼ (α).c {∼ (α)/z}]〉 (z : ∼A)
and the instances for the polarized co-data types are:
(µ˜ηN` ) c = 〈µ([α, β].c {[α, β]/γ})||γ〉 (γ : A`B)
(µ˜η&N ) c = 〈µ(pi1 [α].c {pi1 [α]/γ}|pi2 [β].c {pi2 [β]/γ})||γ〉 (γ : A&B)
(µ˜η⊥N ) c = 〈µ([].c {[]/γ})||γ〉 (γ : ⊥)
(µ˜η>N ) c = 〈µ()||γ〉 (γ : >)
(µ˜η¬N ) c = 〈µ(¬ [x].c {¬ [x]/γ})||γ〉 (γ : ¬A)
With the above observations about pattern matching and extensionality, we
are now ready to prove that the translation is sound. The fact that well-typed
commands and (co-)terms have the associated translated type follows straightforwardly
by (mutual) induction on their typing derivations. More interesting is the translation
of equalities across the encoding. Note that since the translation is compositional and
hygienic, the reflexive, symmetric, transitive, and (importantly) congruent closure of
the equational theory is guaranteed (Downen & Ariola, 2014a). Therefore, we only
need to check that each axiom is preserved by the translation. In that regard, it
is important to note the fact that (1) that (co-)values translate to (co-)values and
(2) substitution distributes over translation (that is, JcKG {JV KG/x} =α Jc {V/x}KG,
etc.), both of which can be confirmed by induction on the syntax of (co-)terms.
The substitution axioms translate directly without change because of the above
mentioned two facts about (co-)values and substitution, like so:
(ηµ) µα. 〈v||α〉 = v translates to Jµα. 〈v||α〉KG , µα.〈JvKG||α〉 =ηµ JvKG
(ηµ˜) µ˜x. 〈x||e〉 = e translates to Jµ˜x. 〈x||e〉KG , µ˜x.〈x||JeKG〉 =ηµ˜ JeKG
318
(µ) 〈µα.c||E〉 = c {E/α} translates to
J〈µα.c||E〉K , 〈µα.JcKG∣∣∣∣∣∣JEKG〉 =µ JcKG {JEKG/α} =α Jc {E/α}KG
(µ˜) 〈V ||µ˜x.c〉 = c {V/x} translates to
J〈V ||µ˜x.c〉K , 〈JV KG∣∣∣∣∣∣µ˜x.JcKG〉 =µ˜ JcKG {JV KG/x} =α Jc {V/x}KG
Given data F(Θ) : Swhere
#                                                                                    »
Ki :
(
#               »
Ai1 : Tijj ` F(Θ) | #                »Bij : Rijj
)i
∈ G we have:
(βF)
〈
Ki( # »eijj, # »vijj)
∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                      »
K
#    »
Y :k
i ( #  »αijj, # »xijj).ci
i
]〉
=
〈
µ #  »αij
j.
〈
# »vij
∣∣∣∣∣∣µ˜ # »xijj.ci〉∣∣∣∣∣∣ # »eijj〉 translates
by induction on the pattern S⇑
(
ιi
(
#                         »
¬
[
↑Rij [αij]
]j
,
#               »↓Tij(xij)
j
))
to:
s〈
Ki( # »eijj , # »vijj)
∣∣∣∣∣∣∣∣µ˜[ #                                  »Ki( #  »αijj , #  »xijj).cii]〉
{
G
,
〈
S⇑
(
ιi
(
#                                   »
¬
[
↑Rij
[JeijKG]]j, #                          »↓Tij(JvijKG)j
))
∣∣∣∣∣∣∣
∣∣∣∣∣∣∣µ˜
 #                                                                                          »S⇑
(
ιi
(
#                          »
¬
[
↑Rij[αij ]
]j
,
#                »↓Tij (xij)
j
))
.JciKG
i
〉
=βS⇑ηµ˜
〈
ιi
(
#                                    »
¬
[
↑Rij
[JeijKG]]j, #                          »↓Tij(JvijKG)j
)∣∣∣∣∣∣∣
∣∣∣∣∣∣∣µ˜
 #                                                                              »ιi
(
#                           »
¬
[
↑Rij [αij ]
]j
,
#                »↓Tij (xij)
j
)
.JciKG
i
〉
=β⊕ηµ˜
〈(
#                                    »
¬
[
↑Rij
[JeijKG]]j , #                          »↓Tij(JvijKG)j
)∣∣∣∣∣
∣∣∣∣∣µ˜
[(
#                           »
¬
[
↑Rij [αij ]
]j
,
#                »↓Tij (xij)
j
)
.JciKG]〉
,
〈(
¬
[
↑Ri1
[Jei1KG]] , . . . ,¬ [↑Rim[JeimKG]] , #                          »↓Tij(JvijKG)j
)
∣∣∣∣∣∣∣∣µ˜[(¬ [↑Ri1 [αi1]] , . . . ,¬ [↑Rim [αim]] , #                »↓Tij (xij)j).JciKG]〉
=β⊗ηµ˜
〈
¬
[
↑Ri1
[Jei1KG]]∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

¬ [↑Ri1 [αi1]]. . . .〈
¬
[
↑Rim
[JeimKG]]∣∣∣∣∣∣∣∣µ˜
[
¬ [↑Rim [αim]].
〈(
#                          »
↓Tij
(JvijKG)j
)∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                »↓Tij (xij)
j
.JciKG]〉]〉

〉
319
=β¬ηµ
〈
↑Ri1
[Jei1KG]∣∣∣∣∣∣∣∣∣∣∣µ˜

↑Ri1 [αi1]. . . .
〈
↑Rim
[JeimKG]∣∣∣∣∣∣∣∣µ˜
[
↑Rim [αim].
〈(
#                          »
↓Tij
(JvijKG)j
)∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                »↓Tij (xij)
j
.JciKG]〉]〉

〉
=β↑ηµ
〈
µαi1. . . .
〈
µα1m.
〈(
#                          »
↓Tij
(JvijKG)j
)∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                »↓Tij (xij)
j
.JciKG]〉
∣∣∣∣∣
∣∣∣∣∣JeimKG
〉∣∣∣∣∣
∣∣∣∣∣Jei1KG
〉
,
〈
µ #  »αij
j .
〈(
#                          »
↓Tij
(JvijKG)j
)∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                »↓Tij (xij)
j
.JciKG]〉
∣∣∣∣∣
∣∣∣∣∣ #         »JeijKGj
〉
=β⊗β1ηµ˜
〈
µ #  »αij
j .
〈
↓Ti1
(Jvi1KG)∣∣∣∣∣∣µ˜[↓Ti1(xi1).. . .〈↓Tin(JvinKG)∣∣∣∣∣∣µ˜[↓Tin(xin).JciKG]〉]〉∣∣∣∣∣∣∣∣ #         »JeijKGj〉
=β↓ηµ˜
〈
µ #  »αij
j .
〈Jvi1KG∣∣∣∣∣∣µ˜xi1. . . .〈JvinKG∣∣∣∣∣∣µ˜xin.JciKG〉〉∣∣∣∣∣∣∣∣ #         »JeijKGj〉
,
〈
µ #  »αij
j .
〈
#          »JvijKGj∣∣∣∣∣∣∣∣µ˜ #  »xijj .JciKG〉∣∣∣∣∣∣∣∣ #         »JeijKGj〉
,
r〈
µ #  »αij
j .
〈
# »vij
∣∣∣∣∣∣µ˜ #  »xijj .ci〉∣∣∣∣∣∣ # »eijj〉zG
(ηF) β : F( #»C ) = µ˜
[
#                                                                        »
Ki( #  »αijj, # »xijj).
〈
Ki( #  »αijj, # »xijj)
∣∣∣∣∣∣β〉i] translates by induction on the
pattern S⇑
(
ιi
(
#                         »
¬
[
↑Rij [αij]
]j
,
#               »↓Tij(xij)
j
))
to:
t
µ˜
[
#                                                                            »
Ki( #  »αijj , #  »xijj).
〈
Ki( #  »αijj , #  »xijj)
∣∣∣∣∣∣β〉i]
|
G
, µ˜

#                                                                                                                   »
S⇑
(
ιi
(
#                             »
∼
(
↑Rij [αij ]
)j
,
#                »↓Tij (xij)
j
))
.〈
S⇑
(
ιi
(
#                             »
∼
(
↑Rij [αij ]
)j
,
#                »↓Tij (xij)
j
))∣∣∣∣∣
∣∣∣∣∣β
〉
i
=
µ˜η
↓
V
µ˜
 #                                                                                                                                                                          »S⇑
(
ιi
(
#                             »
∼
(
↑Rij [αij ]
)j
, #  »xij
j
))
.
〈
S⇑
(
ιi
(
#                             »
∼
(
↑Rij [αij ]
)j
, #  »xij
j
))∣∣∣∣∣
∣∣∣∣∣β
〉i
=
µ˜η
↑
V
µ˜
 #                                                                                                                                      »S⇑(ιi ( #             »∼ (αij)j , #  »xijj)).〈S⇑(ιi ( #             »∼ (αij)j , #  »xijj))∣∣∣∣∣∣∣∣β〉i

=µ˜η∼V µ˜
[
#                                                                                                        »
S⇑
(
ιi
(
# »yij
j , #  »xij
j
))
.
〈
S⇑
(
ιi
(
# »yij
j , #  »xij
j
))∣∣∣∣∣∣β〉i]
=µ˜η⊗V η1V µ˜
[
#                                                         »
S⇑(ιi (x)).〈S⇑(ιi (x))||β〉
i
]
320
=µ˜η⊕V µ˜[S⇑(x).〈S⇑(x)||β〉]
=η⇑ β
Given codataG(Θ) : Swhere
#                                                                                    »
Oi :
(
#              »
Aij : Tijj | G(Θ) ` #                »Bij : Rijj
)i
∈ G we have:
(βG) is analogous to the translation of βF by duality.
(ηG) is analogous to the translation of ηF by duality.
But is the converse statement of completeness, that if the encodings of two
commands or (co-)terms are equal then they are equal to begin with, also true?
Unfortunately not so directly; the polarizing encoding has the effect of “anonymizing”
types by moving away from a nominal style, where the different declarations lead
to distinct types, to a more structural style, where differently declared types can
be collapsed if they share a common underlying pattern of (co-)term structures.
This collapse of types doesn’t mean that all hope is lost, however, because the
(co-)terms are only collapsed between types not within types; there is still a one-
for-one correspondence between typed (co-)terms of the same type in the source with
the encoded (co-)terms in the target. To argue this case, we turn to applying the idea
of isomorphisms between types (Di Cosmo, 1995).
Type Isomorphisms
Usually, we can say that two types are isomorphic when there are mappings to
and from both of them whose composition is the identity. In the sequent setting, we
interpret “mappings” as open commands with a free variable and co-variable, and the
“identity” mapping is the simple command 〈x||α〉 connecting its free (co-)variables.
Definition 8.1 (Type isomorphism). Two closed types A and B are isomorphic,
written A ≈ B, if and only if there exist commands c : (x : A ` β : B) and c′ :
(y : B ` α : A) for any x, y, α, β such that the following equalities hold:
〈µβ.c||µ˜y.c′〉 = 〈x||α〉 : (x : A ` α : A) 〈µα.c′||µ˜x.c〉 = 〈y||β〉 : (y : B ` β : B)
Moreover, two open types A and B with free type variables #         »X : S are isomorphic,
written #         »X : S  A ≈ B, if and only if for all types #        »C : S , it follows that A #             »{C/X} ≈
B
#             »{C/X}.
321
Note that this definition of isomorphism between types is equivalent to a more
traditional presentation in terms of inverse functions within the language. In particular,
two types A : S and B : S are isomorphic in the sense of Definition 8.1 if and only
if there are two closed function values V : A →S B and V ′ : B →S A such that
V ′ ◦ V = id : A →S A and V ◦ V ′ = id : B →S B, because we can always abstract
over the open commands to get a pair of closed functions, or call the functions to
retrieve a pair of open commands, where one is inverse whenever the other is. However,
Definition 8.1 has the advantage of not assuming that our language has a primitive
function type (since they are just user-defined co-data types like any other), and of
avoiding the awkwardness of mapping between the different kinds of types that might
be isomorphic to one another.
Having defined what is an isomorphism between types, we should ask if it
is actually an equivalence relation as expected; are type isomorphisms closed
under reflexivity, symmetry, and transitivity? The reflexivity and symmetry of the
isomorphism relation between types is rather straightforward to show.
Theorem 8.2 (Reflexivity and Symmetry). For all types A and B, (a) A ≈ A, and
(b) A ≈ B implies B ≈ A.
Proof. The symmetry of type isomorphism follows immediately from its symmetric
definition. More interestingly, we can establish the reflexive isomorphism of any type
with the extensionality laws of µ- and µ˜-abstractions. In particular, for a given A, we
have the command 〈x||α〉 : (x : A ` α : A) which serves as both open commands of the
isomorphism A ≈ A. The fact that the self-composition of this command is equal to
itself comes from the ηµ and ηµ˜ axioms: 〈µα. 〈x||α〉||µ˜x. 〈x||α〉〉 =ηµ 〈x||µ˜x. 〈x||α〉〉 =ηµ˜
〈x||α〉.
In contrast, transitivity of type isomorphisms is tricker, and in fact it is not
guaranteed to hold in every possible situation. In particular, the transitivity of
isomorphism relies on the exchange of µ- and µ˜-bindings, which reassociates the
composition of commands, but this not always valid in the multiple-strategy scenario.
Specifically, given two any two (co-)terms of different kinds (recall the system for
distinguishing between multiple base kinds in Figures 5.17 and 5.18 from Section 5.5),
v :: S and e :: T , the exchange law χS`T is:
(χS`T ) 〈v||µ˜x::S. 〈µα::T .c||e〉〉 = 〈µα::T . 〈v||µ˜x::S.c〉||e〉
322
And when exchanging bindings of the same S, we just write χS for χS`S . Thankfully,
even though exchange is not necessarily guaranteed, it is still valid for many
combinations of strategies. For any S, χN`S is derivable from the universal strength of
the µ˜N axiom and likewise χS`V is derivable from the µV axiom. So for all combinations
ofN and V , each of χN`N , χV`V , and χN`V hold, but χV`N is invalidated by the counter
example:3
〈µ ::V .c1||µ˜x::V . 〈µα::N .c||µ˜ :B:N .c2〉〉 =µV c1
6= c2 =µ˜N 〈µα::N . 〈µ ::V .c1||µ˜x::V .c〉||µ˜ ::N .c2〉
As it turns out, transitivity of type isomorphisms can be built on χ. The consequence
of χV`N being invalid is that, in general, isomorphisms between types of the same
kind S are always transitive because χS holds, but isomorphisms between different
kinds of types might not be because we can’t rely on both χS`T and χT `S .
Theorem 8.3 (Homogeneous transitivity). For all strategies S such that χS holds
and types A : S, B : S, and C : S, if A ≈ B and B ≈ C then A ≈ C.
Proof. Let c1 : (x : A ` β : B) and c2 : (y : B ` α : A) be the commands from the
isomorphism A ≈ B, and let c3 : (y′ : B ` γ : C) and c4 : (z : C ` β′ : B) be the
commands from the isomorphism B ≈ C. We now establish the isomorphism A ≈ C
by composing c1 and c3 to get c5, and composing c2 and c4 to get c6:
c5 , 〈µβ.c1||µ˜y′.c3〉 : (x : A ` γ : C) c6 , 〈µβ′.c4||µ˜y.c2〉 : (z : C ` α : A)
With the help of χS , we get that the composition of c5 and c6 is the identity command
〈x||α〉 : (x : A ` α : A):
〈µγ.c5||µ˜z.c6〉 , 〈µγ. 〈µβ.c1||µ˜y′.c3〉||µ˜z. 〈µβ′.c4||µ˜y.c2〉〉
=χS 〈µβ′. 〈µγ. 〈µβ.c1||µ˜y′.c3〉||µ˜z.c4〉||µ˜y.c2〉
=χS 〈µβ′. 〈µβ.c1||µ˜y′. 〈µγ.c3||µ˜z.c4〉〉||µ˜y.c2〉
=Iso 〈µβ′. 〈µβ.c1||µ˜y′. 〈y′||β′〉〉||µ˜y.c2〉
=ηµ˜ 〈µβ′. 〈µβ.c1||β′〉||µ˜y.c2〉
3The invalidity of χV`N exactly corresponds to the failure of associativity in categorical models
of polarity Munch-Maccagnoni (2013).
323
=ηµ 〈µβ.c1||µ˜y.c2〉 =Iso 〈x||α〉
In the other direction, the composition of c6 and c5 is the identity command 〈z||γ〉 :
(z : C ` γ : C):
〈µα.c6||µ˜x.c5〉 , 〈µα. 〈µβ′.c4||µ˜y.c2〉||µ˜x. 〈µβ.c1||µ˜y′.c3〉〉
=χS 〈µβ. 〈µα. 〈µβ′.c4||µ˜y.c2〉||µ˜x.c1〉||µ˜y′.c3〉
=χS 〈µβ. 〈µβ′.c4||µ˜y. 〈µα.c2||µ˜x.c1〉〉||µ˜y′.c3〉
=Iso 〈µβ. 〈µβ′.c4||µ˜y. 〈y||β〉〉||µ˜y′.c3〉
=ηµ˜ 〈µβ. 〈µβ′.c4||β〉||µ˜y′.c3〉
=ηµ 〈µβ′.c4||µ˜y′.c3〉 =Iso 〈z||γ〉
So isomorphisms between types of the same kind are an equivalence relation. But
do they give us the right sense of a one-for-one correspondence between (co-)terms
of those types? As it turns out, an isomorphism A ≈ B of S-kinded types provides
just enough structure to convert all equalities between A-typed (co-)terms to B-typed
(co-)terms, and vice versa, which also relies on the χS axiom to exchange (co-)variable
bindings.
Theorem 8.4. For all strategies S such that χS, types A : S ≈ B : S, and
environments Γ and ∆, there are contexts C and C ′ such that if Γ `G vi : A | ∆
and Γ `G ei : A | ∆ (for i = 1, 2), then Γ `G C[vi] : B | ∆ and Γ `G C ′[ei] : B | ∆ (for
i = 1, 2), v1 = v2 if and only if C[v1] = C[v2], and e1 = e2 if and only if C ′[e1] = C ′[e2].
Proof. Let c : (x : A ` β : B) and c′ : (y : B ` α : A) be the commands from the
isomorphism A ≈ B, where x, y /∈ Γ and α, β /∈ ∆ (renaming c and c′ as necessary).
The contexts are then C , µβ. 〈||µ˜x.c〉 and C ′ , µ˜y. 〈µα.c′||〉. C[v1] = C[v2]
follows from v1 = v2 and C ′[e1] = C ′[e2] follows from e1 = e2 by just applying the
same equalities within the larger context. More interestingly, we can derive v1 = v2
from C[v1] = C[v2] from the definition of the isomorphism by placing them in a larger
context, so we have the following equality via χS :
µα. 〈C[vi]||µ˜y.c′〉 , µα. 〈µβ. 〈vi||µ˜x.c〉||µ˜y.c′〉
=χS µα. 〈vi||µ˜x. 〈µβ.c||µ˜y.c〉〉
=Iso µα. 〈vi||µ˜x. 〈x||α〉〉 =ηµηµ˜ vi
324
And since we assumed C[v1] = C[v2], we have
v1 = µα. 〈C[v1]||µ˜y.c′〉 = µα. 〈C[v2]||µ˜y.c′〉 = v2
e1 = e2 follows similarly from C ′[e1] = C ′[e2] because of the fact that
µ˜x. 〈µβ.c||C[ei]〉 = ei.
Finally, we extend the idea of isomorphisms between types to isomorphisms
between (co-)data declarations.
Definition 8.2 (Declaration isomorphism). We say that two data declarations are
isomorphic, written4
data F(Θ) : Swhere
#                                               »
K :
(
Γ `Θ′ F(Θ) | ∆
) ≈ data F′(Θ) : S ′where#                                                   »
K′ :
(
Γ `Θ′′ F′(Θ) | ∆
)
if and only if Θ  F(Θ) ≈ F′(Θ). Dually, we say that two co-data declarations are
isomorphic, written
codataG(Θ) : Swhere
#                                                »
O :
(
Γ | G(Θ) `Θ′ ∆
) ≈ codataG′(Θ) : S ′where#                                                       »
O′ :
(
Γ′ | G′(Θ) `Θ′′ ∆′
)
if and only if Θ  G(Θ) ≈ G′(Θ).
Theorem 8.5 (Declaration isomorphism equivalence). The (co-)data declaration
isomorphism relation is (a) reflexive, (b) symmetric, and (c) transitive for any
(co-)data types of the same strategy S, such that χS holds.
Proof. Follows from the reflexivity (Theorem 8.2 (a)), symmetry (Theorem 8.2 (b)),
and transitivity (Theorem 8.3) of the type isomorphism relation underlying
Definition 8.2.
This more specific notion of type-based isomorphism is the backbone of the
syntactic theory that we will develop for the purpose of reasoning more easily about
(co-)data types in general, the polarized basis P of (co-)data types, and eventually
the faithfulness of the polarization translation.
4Note that we reuse Γ and ∆ as shorthand for the list of types #         »A : T and #         »B : R within the
signatures of constructors and observers.
325
A Syntactic Theory of (Co-)Data Type Isomorphisms
Before turning to our main result—that every user-defined (co-)data type can
be represented by an isomorphic type composed solely from the polarized basic
connectives—we first explore a theory for type isomorphisms based on data and
co-data declarations. The advantage of using (co-)data type declarations for studying
type isomorphisms is that the declarations themselves provide a larger context for
localized manipulations surrounded by extra alternatives (of other constructors and
observers) and extra components (within the same constructor or observer). The
end result is that we only need to manually verify a few fundamental (co-)data type
isomorphisms by hand, while the particular isomorphisms of interest can be easily
composed out of these basic building blocks.
Structural laws of declarations
We present an isomorphism theory for the structural laws of data and co-data
types in Figures 8.4 and 8.5, which are exactly dual to one another and capture several
facts about isomorphic ways to declare (co-)data types.
– Commute: The first group of laws state that the parts of any declaration may
be reordered, including (1) the order of components within the signature of a
constructor or observer, and (2) the order of constructor or observer alternatives
within the declaration. These axioms are useful to show that the listed orders
of the various parts of a declaration don’t matter.
– Mix : The second group of laws states how two isomorphism between (co-)data
type declarations may be combined together. In particular, there are two ways
to mix declaration isomorphisms: (1) an isomorphic pair of single-alternative
declarations can have the components of their single constructor or observer
mixed into the signature of all the alternatives of another declaration
isomorphism to form a larger declaration isomorphism, and (2) a pair of
declaration isomorphism can have their respective alternatives mixed together
to form a larger isomorphism. These inference rules let us use localized reasoning
within a small (co-)data type declaration, and then compose the results together
into a large declaration isomorphism that does everything.
326
– Shift: The third group of laws state that every call-by-value (V) data declaration
isomorphism and every call-by-name (N ) co-data declaration isomorphism may
be generalized to (co-)data types of any strategy.
– Interchange: The fourth group of laws show how isomorphisms between data
type declarations and co-data type declarations can be interchanged one-for-one
with one another, so long as the data type is call-by-value (V) and the co-data
type is call-by-name (N ).
– Compatibility: The final group of laws state that an isomorphism between types
can be lifted into an isomorphism between data and co-data type declarations
with constructors and observations containing a component of that type as either
an input or an output.
These laws let us derive other facts about isomorphisms between (co-)data types. As
a simple example, applying the shift laws to the trivial cases of the commute laws for
data declarations lets us rename constructor and type names, which effectively tells
us that there is only one empty and unit type for any strategy S:
data F(Θ) : SwhereK : ( ` F(Θ) | ) ≈ data F′(Θ) : SwhereK′ : ( ` F′(Θ) | )
data F(Θ) : Swhere ≈ data F′(Θ) : Swhere
Additionally, the mix laws let us extend an existing isomorphism by combining it
with a reflexive isomorphism of any declaration, letting us add on arbitrary other
alternatives or components to two isomorphic data declarations:
data F1(Θ) : Vwhere
#                                                  »K1 : (Γ1 ` F(Θ) | ∆1)
≈
data F′1(Θ) : Vwhere
#                                                    »
K′1 :
(
Γ′1 ` F′(Θ) | ∆′1
)
data F2(Θ) : Vwhere
#                                                  »K2 : (Γ2 ` F(Θ) | ∆2)
≈
data F2(Θ) : Vwhere
#                                                  »K2 : (Γ2 ` F(Θ) | ∆2)
data F(Θ) : Vwhere
#                                                  »K1 : (Γ1 ` F(Θ) | ∆1)
#                                                  »K2 : (Γ2 ` F(Θ) | ∆2)
≈
data F′(Θ) : Vwhere
#                                                    »
K′1 :
(
Γ′1 ` F′(Θ) | ∆′1
)
#                                                  »K2 : (Γ2 ` F(Θ) | ∆2)
327
data F1(Θ) : Vwhere
#                                                  »K1 : (Γ1 ` F(Θ) | ∆1)
≈
data F′1(Θ) : Vwhere
#                                                    »
K′1 :
(
Γ′1 ` F′(Θ) | ∆′1
)
data F2(Θ) : Vwhere
K2 : (Γ2 ` F(Θ) | ∆2)
≈
data F2(Θ) : Vwhere
K2 : (Γ2 ` F(Θ) | ∆2)
data F(Θ) : Vwhere
#                                                                     »K1 : (Γ2,Γ1 ` F(Θ) | ∆1,∆1)
≈
data F′(Θ) : Vwhere
#                                                                       »
K′1 :
(
Γ2,Γ′1 ` F′(Θ) | ∆′1,∆2
)
We can justify the laws in Figures 8.4 and 8.5 in terms of the definitions of
type and (co-)data declaration isomorphisms. In particular, we can calculate when
specific instances of two (co-)data types happen to be isomorphic, so that the laws
for declaration isomorphisms are sound when the specific instances hold in general for
any matching choice of types. These calculations can be done for the commute, mix,
compatibility, interchange and shift laws for data declarations as follows.
Lemma 8.1 (Data commute instance). For any types #        »C : S , #           »C ′ : S ′, F( #»C ) ≈ F′( # »C ′)
for the declarations
a)
data F( #         »X : S ) : V where
K :
(
Γ2,Γ1 ` F( #»X ) | ∆1,∆2
) and data F′(
#            »
X ′ : S ′) : V where
K′ :
(
Γ′1,Γ′2 ` F′(
# »
X ′) | ∆′2,∆′1
)
such that Γ1θ = Γ′1θ′, Γ2θ = Γ′2θ′, ∆1θ = ∆′1θ′, and ∆1θ = ∆′1θ′ where θ =
#             »{C/X} and θ′ = #                »{C ′/X ′}
b)
data F( #         »X : S ) : V where
#                                                   »
K1 :
(
Γ1 ` F( #»X ) | ∆1
)
#                                                   »
K2 :
(
Γ2 ` F( #»X ) | ∆2
) and
data F′(
#            »
X ′ : S ′) : V where
#                                                      »
K′2 :
(
Γ′2 ` F′(
# »
X ′) | ∆′2
)
#                                                      »
K′1 :
(
Γ′1 ` F′(
# »
X ′) | ∆′1
)
such that
#                      »
Γ1θ = Γ′1θ′,
#                      »
Γ2θ = Γ′2θ′,
#                        »
∆1θ = ∆′1θ′, and
#                        »
∆2θ = ∆′2θ′ where θ =
#             »{C/X} and θ′ = #                »{C ′/X ′}
Proof. The isomorphisms between F( #»C ) and F′( # »C ′) are established by the commands
c : (x : F( #»C ) ` α′ : F′( # »C ′)) and c′ : (x′ : F′( # »C ′) ` α : F( #»C )) as follows:
a)
c , 〈x||µ˜[K( #»β1 , #»β2 , #»y2 , #»y1).〈K′( #»β2 , #»β1 , #»y1 , #»y2)||α′〉]〉
c′ , 〈x′||µ˜[K′( #»β2 , #»β1 , #»y1 , #»y2).〈K( #»β1 , #»β2 , #»y2 , #»y1)||α〉]〉
b) c ,
〈
x
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
K1( #»β1 , #»y1).〈K′1( #»β1 , #»y1)||α′〉
K2(
#»
β2 ,
#»y2).〈K′2(
#»
β2 ,
#»y2)||α′〉
〉 c′ , 〈x′
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
K′2( #»β2 , #»y2).〈K2( #»β2 , #»y2)||α〉
K′1(
#»
β1 ,
#»y1).〈K1( #»β1 , #»y1)||α〉
〉
328
Data Commute
data F(Θ) :Vwhere
K : (Γ2,Γ1 ` F(Θ)|∆1,∆2)
≈ data F
′(Θ) :Vwhere
K′ :
(
Γ1,Γ2 ` F′(Θ)|∆2,∆1
)
data F(Θ) :Vwhere
#                                             »K1 : (Γ1 ` F(Θ)|∆1)
#                                             »K2 : (Γ2 ` F(Θ)|∆2)
≈
data F′(Θ) :Vwhere
#                                               »
K′2 :
(
Γ2 ` F′(Θ)|∆2
)
#                                               »
K′1 :
(
Γ1 ` F′(Θ)|∆1
)
Data Mix
data F1(Θ) :Vwhere
#                                                »K1 : (Γ1 ` F1(Θ)|∆1)
≈
data F′1(Θ) :Vwhere
#                                                »
K′1 :
(
Γ′1 ` F′1(Θ)|∆′1
)
data F2(Θ) :Vwhere
K2 : (Γ2 ` F2(Θ)|∆2)
≈ data F
′
2(Θ) :Vwhere
K′2 :
(
Γ′2 ` F′2(Θ)|∆′2
)
data F(Θ) :Vwhere
#                                                              »K : (Γ2,Γ1 ` F(Θ)|∆1,∆2)
≈
data F′(Θ) :Vwhere
#                                                                 »
K′ :
(
Γ′2,Γ′1 ` F′(Θ)|∆′1,∆′2
)
data F1(Θ) :Vwhere
#                                                »K1 : (Γ1 ` F1(Θ)|∆1)
≈
data F′1(Θ) :Vwhere
#                                                »
K′1 :
(
Γ′1 ` F′1(Θ)|∆′1
)
data F2(Θ) :Vwhere
#                                                »K2 : (Γ2 ` F2(Θ)|∆2)
≈
data F′2(Θ) :Vwhere
#                                                »
K′2 :
(
Γ′2 ` F′2(Θ)|∆′2
)
data F(Θ) :Vwhere #                                             »K1 : (Γ1 ` F(Θ)|∆1)
#                                             »K2 : (Γ2 ` F(Θ)|∆2)
≈ data F′(Θ) :Vwhere #                                               »K′1 :
(
Γ′1 ` F′(Θ)|∆′1
)
#                                               »
K′2 :
(
Γ′2 ` F′(Θ)|∆′2
)
Data Shift
data F(Θ) :Vwhere #                                     »K : (Γ` F(Θ)|∆) ≈ data F′(Θ) :Vwhere #                                            »K′ : (Γ′ ` F′(Θ)|∆′)
data F(Θ) :Swhere #                                     »K : (Γ` F(Θ)|∆) ≈ data F′(Θ) :S ′where #                                            »K′ : (Γ′ ` F′(Θ)|∆′)
Co-data-Data Interchange
codataG(Θ) :N where
#                                         »
O :
(
Γ|G( #»X )`∆
)
≈ codataG′(Θ) :N where
#                                                 »
O′ :
(
Γ′|G′( # »X ′)`∆′
)
data F(Θ) :Vwhere #                                     »K : (Γ` F(Θ)|∆) ≈ data F′(Θ) :Vwhere
#                                                »
K′ :
(
Γ′ ` F′( # »X ′)|∆′
)
Data Compatibility
Θ`A : S Θ`B : S Θ  A ≈ B
data F(Θ) :VwhereK : (A :S ` F(Θ)|) ≈ data F′(Θ) :VwhereK′ : (B :S ` F′(Θ)|)
Θ`A : S Θ`B : S Θ  A ≈ B
data F(Θ) :VwhereK : (` F(Θ)|A :S) ≈ data F′(Θ) :VwhereK′ : (` F′(Θ)|B :S)
FIGURE 8.4. A theory for structural laws of data type declaration isomorphisms.
329
Co-data Commute
codataG(Θ) :N where
O : (Γ2,Γ1|G(Θ)`∆1,∆2)
≈ codataG
′(Θ) :N where
O′ :
(
Γ1,Γ2|G′(Θ)`∆2,∆1
)
codataG(Θ) :N where
#                                              »O1 : (Γ1|G(Θ)`∆1)
#                                              »O2 : (Γ2|G(Θ)`∆2)
≈
codataG′(Θ) :Swhere
#                                                »
O′2 :
(
Γ2|G′(Θ)`∆2
)
#                                                »
O′1 :
(
Γ1|G′(Θ)`∆1
)
Co-data Mix
codataG1(Θ) :N where
#                                                 »O1 : (Γ1|G1(Θ)`∆1)
≈
codataG′1(Θ) :N where
#                                                 »
O′1 :
(
Γ′1|G′1(Θ)`∆′1
)
codataG2(Θ) :N where
O2 : (Γ2|G2(Θ)`∆2)
≈ dataG
′
2(Θ) :N where
O′2 :
(
Γ′2|G′2(Θ)`∆′2
)
codataG(Θ) :N where
#                                                               »O : (Γ2,Γ1|G(Θ)`∆1,∆2)
≈
codataG′(Θ) :N where
#                                                                  »
O′ :
(
Γ′2,Γ′1|G′(Θ)`∆′1,∆′2
)
codataG1(Θ) :N where
#                                                 »O1 : (Γ1|G1(Θ)`∆1)
≈
codataG′1(Θ) :N where
#                                                 »
O′1 :
(
Γ′1|G′1(Θ)`∆′1
)
codataG2(Θ) :N where
O2 : (Γ2|G2(Θ)`∆2)
≈ dataG
′
2(Θ) :N where
O′2 :
(
Γ′2|G′2(Θ)`∆′2
)
codataG(Θ) :N where #                                              »O1 : (Γ1|G(Θ)`∆1)
#                                              »O2 : (Γ2|G(Θ)`∆2)
≈ codataG′(Θ) :N where #                                                »O′1 :
(
Γ′1|G′(Θ)`∆′1
)
#                                                »
O′2 :
(
Γ′2|G′(Θ)`∆′2
)
Co-data Shift
codataG(Θ) :N where #                                      »O : (Γ|G(Θ)`∆) ≈ codataG′(Θ) :N where #                                             »O′ : (Γ′|G′(Θ)`∆′)
codataG(Θ) :Swhere #                                      »O : (Γ|G(Θ)`∆) ≈ codataG′(Θ) :S ′where #                                             »O′ : (Γ′|G′(Θ)`∆′)
Data-Co-data Interchange
data F(Θ) :Vwhere #                                     »K : (Γ` F(Θ)|∆) ≈ data F′(Θ) :Vwhere
#                                                »
K′ :
(
Γ′ ` F′( # »X ′)|∆′
)
codataG(Θ) :N where
#                                         »
O :
(
Γ|G( #»X )`∆
)
≈ codataG′(Θ) :N where
#                                                 »
O′ :
(
Γ′|G′( # »X ′)`∆′
)
Co-data Compatibility
Θ`A : S Θ`B : S Θ  A ≈ B
codataG(Θ) :N whereO : (|G(Θ)`A :S) ≈ codataG′(Θ) :N whereO′ : (|G′(Θ)`B :S)
Θ`A : S Θ`B : S Θ  A ≈ B
codataG(Θ) :N whereO : (A :S|G(Θ)` ) ≈ codataG′(Θ) :N whereO′ : (B :S|G′(Θ)` )
FIGURE 8.5. A theory for structural laws of co-data type declaration isomorphisms.
330
For part (a), the composition of c′ and c along α and x of type F( #»C ) is equal to
the identity command 〈x′||α′〉 via the βF and ηF′ axioms as follows:
〈
µα.c′
∣∣∣∣µ˜x.c〉
,
〈
µα.〈x′||µ˜[K′( #»β2 , #»β1 , #»y1 , #»y2).〈K( #»β1 , #»β2 , #»y2 , #»y1)||α〉]〉∣∣∣∣∣∣µx.〈x||µ˜[K( #»β1 , #»β2 , #»y2 , #»y1).〈K′( #»β2 , #»β1 , #»y1 , #»y2)||α′〉]〉〉
=ηµ˜
〈
µα.〈x′||µ˜[K′( #»β2 , #»β1 , #»y1 , #»y2).〈K( #»β1 , #»β2 , #»y2 , #»y1)||α〉]〉∣∣∣∣∣∣µ˜[K( #»β1 , #»β2 , #»y2 , #»y1).〈K′( #»β2 , #»β1 , #»y1 , #»y2)||α′〉]〉
=µV
〈
x′
∣∣∣∣∣∣µ˜[K′( #»β2 , #»β1 , #»y1 , #»y2).〈K( #»β1 , #»β2 , #»y2 , #»y1)||µ˜[K( #»β1 , #»β2 , #»y2 , #»y1).〈K′( #»β2 , #»β1 , #»y1 , #»y2)||α′〉]〉]〉
=βFµαµ˜x 〈x′||µ˜[K′(
#»
β2 ,
#»
β1 ,
#»y1 ,
#»y2).〈K′( #»β2 , #»β1 , #»y1 , #»y2)||α′〉]〉
=ηF′
〈
x′
∣∣∣∣α′〉
And the reverse composition of c and c′ along α′ and x′ of type F′( # »C ′) is similarly
equal to the identity command 〈x||α〉 via the βF′ and ηF.
For part (b), the composition of c′ and c along α and x of type F( #»C ) is equal to
the identity command 〈x′||α′〉 via the βF and ηF′ axioms as follows:
〈
µα.c′
∣∣∣∣µ˜x.c〉
,
〈
µα.
〈
x′
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
K′2( #»β2 , #»y2).〈K2( #»β2 , #»y2)||α〉
K′1(
#»
β1 ,
#»y1).〈K1( #»β1 , #»y1)||α〉
〉∣∣∣∣∣∣
∣∣∣∣∣∣µ˜x.
〈
x
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
K1( #»β1 , #»y1).〈K′1( #»β1 , #»y1)||α′〉
K2(
#»
β2 ,
#»y2).〈K′2(
#»
β2 ,
#»y2)||α′〉
〉〉
=ηµ˜
〈
µα.
〈
x′
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
K′2( #»β2 , #»y2).〈K2( #»β2 , #»y2)||α〉
K′1(
#»
β1 ,
#»y1).〈K1( #»β1 , #»y1)||α〉
〉∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
K1( #»β1 , #»y1).〈K′1( #»β1 , #»y1)||α′〉
K2(
#»
β2 ,
#»y2).〈K′2(
#»
β2 ,
#»y2)||α′〉
〉
=µV
〈
x′
∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

K′2(
#»
β2 ,
#»y2).
〈
K2(
#»
β2 ,
#»y2)
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
K1( #»β1 , #»y1).〈K′1( #»β1 , #»y1)||α′〉
K2(
#»
β2 ,
#»y2).〈K′2(
#»
β2 ,
#»y2)||α′〉
〉
K′1(
#»
β1 ,
#»y1).
〈
K1(
#»
β1 ,
#»y1)
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
K1( #»β1 , #»y1).〈K′1( #»β1 , #»y1)||α′〉
K2(
#»
β2 ,
#»y2).〈K′2(
#»
β2 ,
#»y2)||α′〉
〉

〉
=βFµαµ˜x
〈
x′
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
K′2( #»β2 , #»y2).〈K′2( #»β2 , #»y2)||α′〉
K′1(
#»
β1 ,
#»y1).〈K′1(
#»
β1 ,
#»y1)||α′〉
〉 =ηF′ 〈x′∣∣∣∣α′〉
And the reverse composition of c and c′ along α′ and x′ of type F′( # »C ′) is equal to the
identity command 〈x||α〉 via the βF′ and ηF.
331
Lemma 8.2 (Data mix instance). For any data types declared as
data F1(
#        »
X :S ) :V where
#                                                   »
K1 :
(
Γ1 ` F1( #»X ) |∆1
) data F
′
1(
#          »
X ′ :S ′) :V where
#                                                    »
K′1 :
(
Γ′1 ` F′1(
# »
X ′) |∆′1
)
data F2(
#        »
X :S ) :V where
#                                                   »
K2 :
(
Γ2 ` F2( #»X ) |∆2
) data F
′
2(
#          »
X ′ :S ′) :V where
#                                                    »
K′2 :
(
Γ′2 ` F′2(
# »
X ′) |∆′2
)
data F3(
#        »
X :S ) :V where
K3 :
(
Γ3 ` F3( #»X ) |∆3
) data F′3(
#          »
X ′ :S ′) :V where
K′3 :
(
Γ′3 ` F′3(
# »
X ′) |∆′3
)
data F( #        »X :S ) :V where
#                                                                   »
K4 :
(
Γ3,Γ1 ` F( #»X ) |∆1,∆3
)
#                                                                   »
K5 :
(
Γ3,Γ2 ` F( #»X ) |∆2,∆3
)
data F′(
#         »
X ′ :S ) :V where
#                                                                      »
K′4 :
(
Γ′3,Γ′1 ` F′(
# »
X ′) |∆′1,∆′3
)
#                                                                      »
K′5 :
(
Γ′3,Γ′2 ` F′(
# »
X ′) |∆′2,∆′3
)
and types #        »C : S , #           »C ′ : S ′, if F1( #»C ) ≈ F′1(
# »
C ′), F2(
#»
C ) ≈ F′2(
# »
C ′), and F3(
#»
C ) ≈ F′3(
# »
C ′),
then F( #»C ) ≈ F′( # »C ′).
Lemma 8.3 (Data compatibility instance). For types A : T , A′ : T , #        »C : S , #           »C ′ : S ′,
if A
#             »{C/X} ≈ A′ #                »{C ′/X ′} then F( #»C ) ≈ F′( # »C ′) for the following declarations:
a)
data F( #         »X : S ) : V where
K :
(
A : T ` F( #»X ) |
) and data F′(
#            »
X ′ : S ′) : V where
K′ :
(
A′ : T ` F′( # »X ′) |
)
b)
data F( #         »X : S ) : V where
K :
(
` F( #»X ) | A : T
) and data F′(
#            »
X ′ : S ′) : V where
K′ :
(
` F′( # »X ′) | A′ : T
)
Proof. Suppose that the isomorphisms F1(
#»
C ) ≈ F′1(
# »
C ′), F2(
#»
C ) ≈ F′2(
# »
C ′), and F3(
#»
C ) ≈
F′3(
# »
C ′) are witnessed by the commands
c1 : (x1 : F1(
#»
C ) ` α′1 : F′1(
# »
C ′)) c′1 : (x′1 : F′1(
# »
C ′) ` α1 : F1( #»C ))
c2 : (x2 : F2(
#»
C ) ` α′2 : F′2(
# »
C ′)) c′2 : (x′2 : F′2(
# »
C ′) ` α2 : F2( #»C ))
c3 : (x3 : F3(
#»
C ) ` α′3 : F′3(
# »
C ′)) c′3 : (x′3 : F′3(
# »
C ′) ` α3 : F3( #»C ))
332
respectively. Then isomorphisms between F( #»C ) and F′( # »C ′) are established by the
commands c : (x : F( #»C ) ` α′ : F′( # »C ′)) and c′ : (x′ : F′( # »C ′) ` α : F( #»C )) as follows:
c ,
〈
x
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

#                                                                                                                                                                                          »
K4i(
#  »
β1i ,
#»
β3 ,
#»y3 ,
#  »y1i).〈
v′1i
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
 #                                                                                                                                                    »K′1i( #   »β′1j , #  »y′1j ).〈v′3∣∣∣∣∣∣∣∣µ˜[K′3( #»β′3 , #»y′3).〈K′4j( #                               »β′1j , #»β′3 , #»y′3 , #  »y′1j )∣∣∣∣∣∣∣∣α′〉]〉j
〉
i
#                                                                                                                                                                                          »
K5i(
#  »
β2i ,
#»
β3 ,
#»y3 ,
#  »y2i).〈
v′2i
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
 #                                                                                                                                                    »K′2i( #   »β′2j , #  »y′2j ).〈v′3∣∣∣∣∣∣∣∣µ˜[K′3( #»β′3 , #»y′3).〈K′5j( #                               »β′2j , #»β′3 , #»y′3 , #  »y′2j )∣∣∣∣∣∣∣∣α′〉]〉j
〉
i

〉
c′ ,
〈
x′
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

#                                                                                                                                                                                        »
K′4i(
#  »
β′1i ,
#»
β′3 ,
#»
y′3 ,
#  »
y′1i).〈
v3
∣∣∣∣∣
∣∣∣∣∣µ˜
[
K3(
#»
β3 ,
#»y3).
〈
v1i
∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                                                                            »
K1j(
#   »
β1j ,
#  »y1j ).
〈
K4j(
#   »
β1j ,
#»
β3 ,
#»y3 ,
#   »
β1j )
∣∣∣∣∣∣α〉j]〉]〉
i
#                                                                                                                                                                                        »
K′5i(
#  »
β′2i ,
#»
β′3 ,
#»
y′3 ,
#  »
y′2i).〈
v3
∣∣∣∣∣
∣∣∣∣∣µ˜
[
K3(
#»
β3 ,
#»y3).
〈
v2i
∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                                                                            »
K2j(
#   »
β2j ,
#  »y2j ).
〈
K5j(
#   »
β1j ,
#»
β3 ,
#»y3 ,
#   »
β1j )
∣∣∣∣∣∣α〉j]〉]〉
i

〉
where we make use of the following shorthand:
v1i , µα1.
〈
K′1i(
#  »
β′1i ,
# »
y′1i)
∣∣∣∣∣∣µ˜x′1.c′1〉 v′1i , µα′1. 〈K1i( #  »β1i , # »y1i)∣∣∣∣∣∣µ˜x1.c1〉
v2i , µα2.
〈
K′2i(
#  »
β′2i ,
# »
y′2i)
∣∣∣∣∣∣µ˜x′2.c′2〉 v′2i , µα′2. 〈K2i( #  »β2i , # »y2i)∣∣∣∣∣∣µ˜x2.c2〉
v3 , µα3.
〈
K′3(
#»
β′3 ,
#»
y′3)
∣∣∣∣∣∣µ˜x′3.c′3〉 v′3 , µα′3. 〈K3( #»β3 , #»y3)∣∣∣∣∣∣µ˜x3.c3〉
The composition of c and c′ along α′ and x′ of type F′( # »C ′) is equal to the identity
command 〈x||α〉 via the combined strength of the µ˜ and η axioms for the call-by-value
data types F′1, F′2, and F′3, as previously discussed in Section 8.1, as well as the call-by-
value χ axiom to reassociate the bindings to bring the isomorphisms for those data
types together, as follows:
〈µα′.c||µ˜x′.c′〉
333
=ηµ˜
〈
µα′.
〈
x
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

#                                                                                                                                                                                       »
K4i(
#  »
β1i ,
# »
β3 ,
#»y3 ,
#  »y1i).〈
v′1i
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
 #                                                                                                                                                    »K′1i( #   »β′1j , #  »y′1j ).〈v′3∣∣∣∣∣∣∣∣µ˜[K′3( # »β′3 , #»y′3).〈K′4j( #                               »β′1j , # »β′3 , #»y′3 , #  »y′1j )∣∣∣∣∣∣∣∣α′〉]〉j
〉
i
#                                                                                                                                                                                       »
K5i(
#  »
β2i ,
# »
β3 ,
#»y3 ,
#  »y2i).〈
v′2i
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
 #                                                                                                                                                    »K′2i( #   »β′2j , #  »y′2j ).〈v′3∣∣∣∣∣∣∣∣µ˜[K′3( # »β′3 , #»y′3).〈K′5j( #                               »β′2j , # »β′3 , #»y′3 , #  »y′2j )∣∣∣∣∣∣∣∣α′〉]〉j
〉
i

〉
∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣
µ˜

#                                                                                                                                                                                                                                   »
K′4i(
#  »
β′1i ,
# »
β′3 ,
#»
y′3 ,
#  »
y′1i).
〈
v3
∣∣∣∣∣
∣∣∣∣∣µ˜
[
K3(
# »
β3 ,
#»y3).
〈
v1i
∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                                                                            »
K1j(
#   »
β1j ,
#  »y1j ).
〈
K4j(
#   »
β1j ,
# »
β3 ,
#»y3 ,
#   »
β1j )
∣∣∣∣∣∣α〉j]〉]〉i
#                                                                                                                                                                                                                                   »
K′5i(
#  »
β′2i ,
# »
β′3 ,
#»
y′3 ,
#  »
y′2i).
〈
v3
∣∣∣∣∣
∣∣∣∣∣µ˜
[
K3(
# »
β3 ,
#»y3).
〈
v2i
∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                                                                            »
K2j(
#   »
β2j ,
#  »y2j ).
〈
K5j(
#   »
β1j ,
# »
β3 ,
#»y3 ,
#   »
β1j )
∣∣∣∣∣∣α〉j]〉]〉i

〉
=µαµ˜xβF′
〈
x
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

#                                                                                                                                                                                                             »
K4i(
#  »
β1i ,
# »
β3 ,
#»y3 ,
#  »y1i). 〈v′1i|∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

#                                                                                                                                                                                       »
K′1i(
#   »
β′1j ,
#  »
y′1j ).〈v′3|∣∣∣∣∣∣∣µ˜
K
′
3(
# »
β′3 ,
#»
y′3).〈v3|∣∣∣∣µ˜[K3( # »β3 , #»y3).〈v1j ||µ˜[ #                                                                                              »K1j( #   »β1k , #   »y1k ).〈K4k( #   »β1k , # »β3 , #»y3 , #   »β1k )||α〉k]〉]〉
〉
j

〉
i
#                                                                                                                                                                                                             »
K5i(
#  »
β2i ,
# »
β3 ,
#»y3 ,
#  »y2i). 〈v′2i|∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

#                                                                                                                                                                                       »
K′2i(
#   »
β′2j ,
#  »
y′2j ).〈v′3|∣∣∣∣∣∣∣µ˜
K
′
3(
# »
β′3 ,
#»
y′3).〈v3|∣∣∣∣µ˜[K3( # »β3 , #»y3).〈v2j ||µ˜[ #                                                                                              »K2j( #   »β2k , #   »y2k ).〈K5k( #   »β2k , # »β3 , #»y3 , #   »β2k )||α〉k]〉]〉
〉
j

〉
i

〉
=
µ˜Vη
F′3
V
〈
x
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

#                                                                                                                                                                                                                      »
K4i(
#  »
β1i ,
# »
β3 ,
#»y3 ,
#  »y1i).〈K3( # »β3 , #»y3)||µ˜x3. 〈v′1i|∣∣∣∣∣∣∣∣µ˜

#                                                                                                                                                                                                 »
K′1i(
#   »
β′1j ,
#  »
y′1j ).〈µα′3.c3|∣∣∣∣µ˜x′3.〈µα3.c′3||µ˜[K3( # »β3 , #»y3).〈v1j ||µ˜[ #                                                                                              »K1j( #   »β1k , #   »y1k ).〈K4k( #   »β1k , # »β3 , #»y3 , #   »β1k )||α〉k]〉]〉〉
j
〉
i
#                                                                                                                                                                                                                      »
K5i(
#  »
β2i ,
# »
β3 ,
#»y3 ,
#  »y2i).〈K3( # »β3 , #»y3)||µ˜x3. 〈v′2i|∣∣∣∣∣∣∣∣µ˜

#                                                                                                                                                                                                 »
K′2i(
#   »
β′2j ,
#  »
y′2j ).〈µα′3.c3|∣∣∣∣µ˜x′3.〈µα3.c′3||µ˜[K3( # »β3 , #»y3).〈v2j ||µ˜[ #                                                                                              »K2j( #   »β2k , #   »y2k ).〈K5k( #   »β2k , # »β3 , #»y3 , #   »β2k )||α〉k]〉]〉〉
j
〉
i

〉
334
=χV
〈
x
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

#                                                                                                                                                                                        »
K4i(
#  »
β1i ,
# »
β3 ,
#»y3 ,
#  »y1i).〈K3( # »β3 , #»y3)||µ˜x3. 〈v′1i|∣∣∣∣∣∣∣∣µ˜

#                                                                                                                                                                »
K′1i(
#   »
β′1j ,
#  »
y′1j ).〈µα3. 〈µα′3.c3||µ˜x′3.c3〉|∣∣∣∣µ˜[K3( # »β3 , #»y3).〈v1j ||µ˜[ #                                                                                              »K1j( #   »β1k , #   »y1k ).〈K4k( #   »β1k , # »β3 , #»y3 , #   »β1k )||α〉k]〉]〉
j
〉
i
#                                                                                                                                                                                        »
K5i(
#  »
β2i ,
# »
β3 ,
#»y3 ,
#  »y2i).〈K3( # »β3 , #»y3)||µ˜x3. 〈v′2i|∣∣∣∣∣∣∣∣µ˜

#                                                                                                                                                                »
K′2i(
#   »
β′2j ,
#  »
y′2j ).〈µα3. 〈µα′3.c3||µ˜x′3.c′3〉|∣∣∣∣µ˜[K3( # »β3 , #»y3).〈v2j ||µ˜[ #                                                                                              »K2j( #   »β2k , #   »y2k ).〈K5k( #   »β2k , # »β3 , #»y3 , #   »β2k )||α〉k]〉]〉
j
〉
i

〉
=Isoηµ
〈
x
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

#                                                                                                                                                                                                                     »
K4i(
#  »
β1i ,
# »
β3 ,
#»y3 ,
#  »y1i).〈K3( # »β3 , #»y3)||µ˜x3. 〈v′1i|
|µ˜[
#                                                                                                                                                                                                      »
K′1i(
#   »
β′1j ,
#  »
y′1j ).〈x3||µ˜[K3(
# »
β3 ,
#»y3).〈v1j ||µ˜[
#                                                                                              »
K1j(
#   »
β1k ,
#   »y1k ).〈K4k( #   »β1k , # »β3 , #»y3 , #   »β1k )||α〉
k
]〉]〉
j
]〉〉
i
#                                                                                                                                                                                                                     »
K5i(
#  »
β2i ,
# »
β3 ,
#»y3 ,
#  »y2i).〈K3( # »β3 , #»y3)||µ˜x3. 〈v′1i|
|µ˜[
#                                                                                                                                                                                                      »
K′2i(
#   »
β′2j ,
#  »
y′2j ).〈x3||µ˜[K3(
# »
β3 ,
#»y3).〈v2j ||µ˜[
#                                                                                              »
K2j(
#   »
β2k ,
#   »y2k ).〈K5k( #   »β2k , # »β3 , #»y3 , #   »β2k )||α〉
k
]〉]〉
j
]〉〉
i

〉
=µ˜xµαβF3
〈
x
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

#                                                                                                                                                                       »
K4i(
#  »
β1i ,
# »
β3 ,
#»y3 ,
#  »y1i). 〈v′1i|∣∣∣∣∣∣µ˜[
#                                                                                                                                                      »
K′1i(
#   »
β′1j ,
#  »
y′1j ).〈v1j ||µ˜[
#                                                                                              »
K1j(
#   »
β1k ,
#   »y1k ).〈K4k( #   »β1k , # »β3 , #»y3 , #   »β1k )||α〉
k
]〉
j
]
〉
i
#                                                                                                                                                                       »
K5i(
#  »
β2i ,
# »
β3 ,
#»y3 ,
#  »y2i). 〈v′2i|∣∣∣∣∣∣µ˜[
#                                                                                                                                                      »
K′2i(
#   »
β′2j ,
#  »
y′2j ).〈v2j ||µ˜[
#                                                                                              »
K2j(
#   »
β2k ,
#   »y2k ).〈K5k( #   »β2k , # »β3 , #»y3 , #   »β2k )||α〉
k
]〉
j
]
〉
i

〉
=
µ˜Vη
F′1
V η
F′2
V
〈
x
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

#                                                                                                                                                                                       »
K4i(
#  »
β1i ,
# »
β3 ,
#»y3 ,
#  »y1i). 〈K1i( #  »β1i , #  »y1i)|∣∣∣∣µ˜x1.〈µα′1.c1||µ˜x′1.〈µα1.c′1||µ˜[ #                                                                                              »K1j( #   »β1k , #   »y1k ).〈K4k( #   »β1k , # »β3 , #»y3 , #   »β1k )||α〉k]〉〉〉
i
#                                                                                                                                                                                       »
K5i(
#  »
β2i ,
# »
β3 ,
#»y3 ,
#  »y2i). 〈K2i( #  »β2i , #  »y2i)|∣∣∣∣µ˜x2.〈µα′2.c2||µ˜x′2.〈µα2.c′2||µ˜[ #                                                                                              »K2j( #   »β2k , #   »y2k ).〈K5k( #   »β2k , # »β3 , #»y3 , #   »β2k )||α〉k]〉〉〉
i

〉
=χV
〈
x
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

#                                                                                                                                                                                         »
K4i(
#  »
β1i ,
# »
β3 ,
#»y3 ,
#  »y1i). 〈K1i( #  »β1i , #  »y1i)|∣∣∣∣µ˜x1.〈µα1. 〈µα′1.c1||µ˜x′1.c′1〉 ||µ˜[ #                                                                                              »K1j( #   »β1k , #   »y1k ).〈K4k( #   »β1k , # »β3 , #»y3 , #   »β1k )||α〉k]〉〉
i
#                                                                                                                                                                                         »
K5i(
#  »
β2i ,
# »
β3 ,
#»y3 ,
#  »y2i). 〈K2i( #  »β2i , #  »y2i)|∣∣∣∣µ˜x2.〈µα2. 〈µα′2.c2||µ˜x′2.c′2〉 ||µ˜[ #                                                                                              »K2j( #   »β2k , #   »y2k ).〈K5k( #   »β2k , # »β3 , #»y3 , #   »β2k )||α〉k]〉〉
i

〉
335
=Iso
〈
x
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

#                                                                                                                                                                    »
K4i(
#  »
β1i ,
# »
β3 ,
#»y3 ,
#  »y1i). 〈K1i( #  »β1i , #  »y1i)|∣∣∣∣µ˜x1.〈µα1. 〈x1||α1〉 ||µ˜[ #                                                                                              »K1j( #   »β1k , #   »y1k ).〈K4k( #   »β1k , # »β3 , #»y3 , #   »β1k )||α〉k]〉〉
i
#                                                                                                                                                                    »
K5i(
#  »
β2i ,
# »
β3 ,
#»y3 ,
#  »y2i). 〈K2i( #  »β2i , #  »y2i)|∣∣∣∣µ˜x2.〈µα2. 〈x2||α2〉 ||µ˜[ #                                                                                              »K2j( #   »β2k , #   »y2k ).〈K5k( #   »β2k , # »β3 , #»y3 , #   »β2k )||α〉k]〉〉
i

〉
=ηµηµ˜
〈
x
∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣µ˜

#                                                                                                                                                                                              »
K4i(
#  »
β1i ,
# »
β3 ,
#»y3 ,
#  »y1i).〈K1i( #  »β1i , #  »y1i)||µ˜[
#                                                                                              »
K1j(
#   »
β1k ,
#   »y1k ).〈K4k( #   »β1k , # »β3 , #»y3 , #   »β1k )||α〉
k
]〉
i
#                                                                                                                                                                                              »
K5i(
#  »
β2i ,
# »
β3 ,
#»y3 ,
#  »y2i).〈K2i( #  »β2i , #  »y2i)||µ˜[
#                                                                                              »
K2j(
#   »
β2k ,
#   »y2k ).〈K5k( #   »β2k , # »β3 , #»y3 , #   »β2k )||α〉
k
]〉
i

〉
=µµ˜βF1βF2
〈
x
∣∣∣∣∣∣∣
∣∣∣∣∣∣∣µ˜

#                                                                                                          »
K4i(
#  »
β1i ,
# »
β3 ,
#»y3 ,
#  »y1i).〈K4i( #  »β1i , # »β3 , #»y3 , #  »β1i)||α〉
i
#                                                                                                          »
K5i(
#  »
β2i ,
# »
β3 ,
#»y3 ,
#  »y2i).〈K5i( #  »β2i , # »β3 , #»y3 , #  »β2i)||α〉
i
〉
=ηF〈x||α〉
And the reverse composition of c′ and c along α and x of type F( #»C ) is equal to the
identity command 〈x′||α′〉 similarly.
Lemma 8.4 ((Co-)Data interchange shift instance). For any types #        »C : S , #           »C ′ : S ′ and
(co-)data declarations
data F( #         »X : S ) : T where
#                                                     »
K :
(
Γ ` F( #         »X : S ) | ∆
) data F
′(
#            »
X ′ : S ′) : T where
#                                                             »
K′ :
(
Γ′ ` F′( #            »X ′ : S ′) | ∆′
)
codataG( #         »X : S ) : Rwhere
#                                                     »
O :
(
Γ | F( #         »X : S ) ` ∆
) codataG
′(
#            »
X ′ : S ′) : Rwhere
#                                                              »
O′ :
(
Γ′ | G′( #            »X ′ : S ′) ` ∆′
)
F( #»C ) ≈ F′( # »C ′) implies G( #»C ) ≈ G′( # »C ′) when T = V and F( #»C ) ≈ F′( # »C ′) implies
G( #»C ) ≈ G′( # »C ′) when R = N .
Proof. First, suppose that the commands
c1 : (x1 : F(
#»
C ) ` α′1 : F′(
# »
C ′)) and c′1 : (x′1 : F′(
# »
C ′) ` α1 : F( #»C ))
336
witness the isomorphism F( #»C ) ≈ F′( # »C ′). Then the isomorphism between G( #»C ) and
G′( # »C ′) is established by:
c2 ,
〈
µ
 #                                                                                                                                                                    »O′i[ #»y′i , #»β′i ].
〈
µα1.
〈
K′i(
#»
β′i ,
#»
y′i )
∣∣∣∣∣∣µ˜x′1.c′1〉
∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                                          »
Kj(
#»
βj ,
#»yj ).〈x2||Oj[ #»yj , #»βj ]〉
j
]〉i
∣∣∣∣∣∣∣
∣∣∣∣∣∣∣α′2
〉
: (x2:G(
#»
C ) ` α′2:G′(
# »
C ′))
c′2 ,
〈
µ
 #                                                                                                                                                                    »Oi[ #»yi , #»βi ].
〈
µα′1.
〈
Ki(
#»
βi ,
#»yi )
∣∣∣∣∣∣µ˜x1.c1〉
∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                                          »
K′j(
#»
β′j ,
#»
y′j ).〈x′2||O′j[
#»
y′j ,
#»
β′j ]〉
j
]〉i
∣∣∣∣∣∣∣
∣∣∣∣∣∣∣α2
〉
: (x′2:G′(
# »
C ′) ` α2:G( #»C ))
The composition of c′2 and c2 along α2 and x2 of type G(
#»
C ) is equal to the identity
command 〈x′2||α′2〉 via the βG, ηFV , βF′ , and ηG′ axioms as follows:
〈
µα2.c
′
2
∣∣∣∣µ˜x2.c2〉
,
〈
µα2.
〈
µ
#                                                                                                                                                                        »Oi[ #»yi , #»βi ].
〈
µα′1.
〈
Ki(
#»
βi ,
#»yi)
∣∣∣∣∣∣µ˜x1.c1〉
∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                                           »
K′j(
#»
β′j ,
#»
y′j ).〈x′2||O′j [
#»
y′j ,
#»
β′j ]〉
j
]〉i
∣∣∣∣∣∣∣
∣∣∣∣∣∣∣α2
〉
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜x2.
〈
µ
#                                                                                                                                                                       »O′i[ #»y′i , #»β′i ].〈µα1.〈K′i( #»β′i , #»y′i)∣∣∣∣∣∣µ˜x′1.c′1〉∣∣∣∣∣∣∣∣µ˜[ #                                                           »Kj( #»βj , #»yj ).〈x2||Oj [ #»yj , #»βj ]〉j]〉
i
∣∣∣∣∣∣
∣∣∣∣∣∣α′2
〉〉
=ηµ
〈
µ
#                                                                                                                                                                        »Oi[ #»yi , #»βi ].
〈
µα′1.
〈
Ki(
#»
βi ,
#»yi)
∣∣∣∣∣∣µ˜x1.c1〉
∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                                           »
K′j(
#»
β′j ,
#»
y′j ).〈x′2||O′j [
#»
y′j ,
#»
β′j ]〉
j
]〉i∣∣∣∣∣∣
∣∣∣∣∣∣µ˜x2.
〈
µ
#                                                                                                                                                                       »O′i[ #»y′i , #»β′i ].〈µα1.〈K′i( #»β′i , #»y′i)∣∣∣∣∣∣µ˜x′1.c′1〉∣∣∣∣∣∣∣∣µ˜[ #                                                           »Kj( #»βj , #»yj ).〈x2||Oj [ #»yj , #»βj ]〉j]〉
i
∣∣∣∣∣∣
∣∣∣∣∣∣α′2
〉〉
=µ˜N
〈
µ

#                                                                                                                                                                                              »
O′i[
#»
y′i ,
#»
β′i ].
〈
µα1.
〈
K′i(
#»
β′i ,
#»
y′i)
∣∣∣∣∣∣µ˜x′1.c′1〉∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

#                                                                                                                                                 »
Kj(
#»
βj ,
#»yj ).
〈
µ

#                                                                                                    »
Ok[ #»yk ,
# »
βk ].
〈
µα′1.
〈
Kk(
# »
βk ,
#»yk)
∣∣∣∣∣∣µ˜x1.c1〉∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                                       »
K′l(
#»
β′l ,
#»
y′l ).〈x′2||O′l[
#»
y′l ,
#»
β′l ]〉
l
]〉k
∣∣∣∣∣∣Oj [ #»yj , #»βj ]〉
j

〉
i
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
α′2
〉
337
=βGµαµ˜x
〈
µ

#                                                                                                                                                    »
O′i[
#»
y′i ,
#»
β′i ].
〈
µα1.
〈
K′i(
#»
β′i ,
#»
y′i)
∣∣∣∣∣∣µ˜x′1.c′1〉∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣µ˜

#                                                                                                       »
Kj(
#»
βj ,
#»yj ).
〈
µα′1.
〈
Kj(
#»
βj ,
#»yj )
∣∣∣∣∣∣µ˜x1.c1〉∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                                       »
K′l(
#»
β′l ,
#»
y′l ).〈x′2||O′l[
#»
y′l ,
#»
β′l ]〉
l
]〉 j

〉
i

∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣
α′2
〉
=µ˜VηFV
〈
µ

#                                                                                                                                              »
O′i[
#»
y′i ,
#»
β′i ].
〈
µα1.
〈
K′i(
#»
β′i ,
#»
y′i)
∣∣∣∣∣∣µ˜x′1.c′1〉∣∣∣∣∣
∣∣∣∣∣µ˜x1.
〈
µα′1.c1
∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                                         »
K′l(
#»
β′l ,
#»
y′l ).〈x′2||O′l[
#»
y′l ,
#»
β′l ]〉
l
]〉〉i

∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣α
′
2
〉
=χV
〈
µ

#                                                                                                                                                     »
O′i[
#»
y′i ,
#»
β′i ].
〈
K′i(
#»
β′i ,
#»
y′i)
∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣
µ˜x′1. 〈µα′1. 〈µα1.c′1||µ˜x1.c1〉∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                                         »
K′l(
#»
β′l ,
#»
y′l ).〈x′2||O′l[
#»
y′l ,
#»
β′l ]〉
l
]〉〉i

∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣α
′
2
〉
=Iso
〈
µ
 #                                                                                                                                                                                       »O′i[ #»y′i , #»β′i ].
〈
K′i(
#»
β′i ,
#»
y′i)
∣∣∣∣∣
∣∣∣∣∣µ˜x′1.
〈
µα′1.
〈
x′1
∣∣∣∣α′1〉
∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                                         »
K′l(
#»
β′l ,
#»
y′l ).〈x′2||O′l[
#»
y′l ,
#»
β′l ]〉
l
]〉〉i
∣∣∣∣∣∣∣
∣∣∣∣∣∣∣α′2
〉
=ηµηµ˜
〈
µ
 #                                                                                                                                 »O′i[ #»y′i , #»β′i ].
〈
K′i(
#»
β′i ,
#»
y′i)
∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                                         »
K′l(
#»
β′l ,
#»
y′l ).〈x′2||O′l[
#»
y′l ,
#»
β′l ]〉
l
]〉i
∣∣∣∣∣∣∣
∣∣∣∣∣∣∣α′2
〉
=βF′µµ˜
〈
µ
(
#                                                        »
O′i[
#»
y′i ,
#»
β′i ].〈x′2||O′i[
#»
y′i ,
#»
β′i ]〉
i
)∣∣∣∣∣
∣∣∣∣∣α′2
〉
=ηG′
〈
x′2
∣∣∣∣α′2〉
The composition of c2 and c′2 along α′2 and x′2 of type G′(
#»
C ) is equal to the identity
command 〈x2||α2〉 via the βG′ , ηF′V , βF, and ηG similarly.
Second, suppose that the commands c2 : (x2 : G(
#»
C ) ` α′2 : G′(
# »
C ′)) and c′2 : (x′2 :
G′( # »C ′) ` α2 : G( #»C )) witness the isomorphism G( #»C ) ≈ G′( # »C ′). Then the isomorphism
between F( #»C ) and F′( # »C ′) is established by:
c1 ,
〈
x1
∣∣∣∣∣∣∣
∣∣∣∣∣∣∣µ˜
 #                                                                                                                                                                   »Ki( #»βi , #»yi ).
〈
µ
(
#                                                       »
O′j[
#»
y′j
#»
β′j ].〈K′j(
#»
y′j ,
#»
β′j )||α′1〉
j
)∣∣∣∣∣
∣∣∣∣∣µ˜x′2. 〈µ˜α2.c′2∣∣∣∣∣∣Oi[ #»yi , #»βi ]〉
〉i〉
: (x1:F(
#»
C ) ` α′1:F′(
# »
C ′))
c′1 ,
〈
x′1
∣∣∣∣∣∣∣
∣∣∣∣∣∣∣µ˜
 #                                                                                                                                                                   »Ki( #»βi , #»yi ).
〈
µ
(
#                                                       »
Oj[ #»yj
#»
βj ].〈Kj( #»yj , #»βj )||α1〉
j
)∣∣∣∣∣
∣∣∣∣∣µ˜x2. 〈µ˜α′2.c2∣∣∣∣∣∣O′i[ #»y′i , #»β′i ]〉
〉i〉
: (x′1:F′(
# »
C ′) ` α1:F( #»C ))
338
Both compositions of c and c′ are equal to the identity command analogously to the
previous part by duality.
With the above two isomorphisms established, we now have enough to justify the
soundness of all the proposed structural laws for (co-)data declarations.
Theorem 8.6 (Structural law soundness). The declaration isomorphism laws in
Figures 8.4 and 8.5 are all sound.
Proof. The data laws all follow by generalizing the particular instances where the
data types are isomorphic: the commute, interchange, and compatibility laws are
all immediate consequence of Lemmas 8.1, 8.3 and 8.3, both mix laws follow from
Lemma 8.2 by taking either F1 and F′1 to be the empty data declaration of no
alternatives or taking F3 and F′3 to be the unit data declaration of one alternative
with no components (both of which are isomorphic by reflexivity), and the shift law
follows by applying Lemma 8.4 twice. The co-data laws follow from the data laws by
Lemma 8.4.
Finally, there is one more property about (co-)data declarations that will be
extremely useful in the following section. Namely that certain singleton (co-)data
types are just trivial wrappers around another type. In the right circumstances, these
wrappers can be identified with their underlying types, up to isomorphism, which lets
us connect the world of (co-)data declarations with the world of actual types.
Lemma 8.5 ((Co-)Data identity).
a) For any data F(Θ) : RwhereK : (A : T ` F(Θ) | ), if either T = V or R = V
then #»Θ  F(Θ) ≈ A.
b) For any codataG(Θ) : RwhereO : ( | G(Θ) ` A : T ), if either T = N or
R = N then #»Θ  G( #»Θ) ≈ A.
Proof. a) Let Θ = #         »X : S , suppose #        »C : S , and let B = A #             »{C/X}. F( #»C ) ≈ B is
established by the commands:
c1 , 〈x||µ˜[K(y).〈y||β〉]〉 : (x : F( #»C ) ` β : B) c2 , 〈K(y)||α〉 : (y : B ` α : F( #»C ))
First, the composition of c2 and c1 along α and x of type F(
#»
C ) : R is equal to
the identity command 〈y||β〉 by using the ηµ and ηµ˜ axioms to reveal the βF
339
redex as follows:
〈µα::R.c2||µ˜x::R.c1〉 , 〈µα::R. 〈K(y)||α〉||µ˜x::R. 〈x||µ˜[K(y).〈y||β〉]〉〉
=ηµηµ˜ 〈K(y)||µ˜[K(y).〈y||β〉]〉
=βF 〈y||µ˜y::T . 〈y||β〉〉 =µ˜T 〈y||β〉
Next, suppose that T = V . The composition of c1 and c2 along β and y of type
B : V is equal to the identity command 〈x||α〉 by using the strength of the µV
axiom to reveal the ηF redex as follows:
〈µβ::V .c1||µ˜y::V .c2〉 , 〈µβ::V . 〈x||µ˜[K(y).〈y||β〉]〉||µ˜y::V . 〈K(y)||α〉〉
=µV 〈x||µ˜[K(y).〈y::V||µ˜y. 〈K(y)||α〉〉]〉
=µ˜V 〈x||µ˜[K(y).〈K(y)||α〉]〉 =ηF 〈x||α〉
Otherwise, suppose that R = V. The composition of c1 and c2 along β and y
of type B : T is equal to the identity command 〈x||α〉 by using the combined
strength of the µV and ηF axioms to percolate out the case analysis on x and
create an inner βF redex:
〈µβ::T .c1||µ˜y::T .c2〉
, 〈µβ::T . 〈x||µ˜[K(y).〈y||β〉]〉||µ˜y::T . 〈K(y)||α〉〉
=µV 〈x||µ˜x. 〈µβ::T . 〈x||µ˜[K(y).〈y||β〉]〉||µ˜y::T . 〈K(y)||α〉〉〉
=µVηF 〈x||µ˜[K(y).〈K(y)||µ˜x. 〈µβ::T . 〈x||µ˜[K(y).〈y||β〉]〉||µ˜y::T . 〈K(y)||α〉〉〉]〉
=µ˜V 〈x||µ˜[K(y).〈µβ::T . 〈K(y)||µ˜[K(y).〈y||β〉]〉||µ˜y::T . 〈K(y)||α〉〉]〉
=βFµ˜T 〈x||µ˜[K(y).〈µβ::T . 〈y||β〉||µ˜y::T . 〈K(y)||α〉〉]〉
=ηµµ˜T 〈x||µ˜[K(y).〈K(y)||α〉]〉 =ηF 〈x||α〉
b) Analogous to the proof of Lemma 8.5 (a) by duality.
Internal polarized laws of declarations
Now that we have established some basic structural laws about isomorphisms
between general user-defined (co-)data types, we can focus on some more specific
laws about the polarized types in Figure 8.1. In particular, we can show that these
340
Additive laws
data F(Θ) : V where
K1 : (A : V ` F(Θ) |)
K2 : (B : V ` F(Θ) |)
≈⊕L
data F′(Θ) : V where
K′ : (A⊕B : V ` F′(Θ) | )
data F(Θ) : V where ≈0L
data F′(Θ) : V where
K′ : (0 : V ` F′(Θ) |)
Multiplicative laws
data F(Θ) : V where
K : (A : V , B : V ` F(Θ) |) ≈⊗L
data F′(Θ) : V where
K′ : (A⊗B : V ` F′(Θ) |)
data F(Θ) : V where
K : (` F(Θ) |) ≈1L
data F′(Θ) : V where
K′ : (1 : V ` F′(Θ) |)
Negation Laws
data F(Θ) : V where
K : (` F(Θ) | A : N ) ≈∼L
data F′(Θ) : V where
K′ : (∼A : V ` F(Θ) |)
Shift Laws
data F(Θ) : V where
K : (A : S ` F(Θ) |) ≈↓SL
data F′(Θ) : V where
K′ : (↓SA : V ` F′(Θ) |)
FIGURE 8.6. Isomorphism laws of positively polarized data sub-structures.
polar types play a part in a family of isomorphisms that closely resemble some of the
logical rules of the sequent calculus. Namely, each of the left rules for the positive data
types and the right rules for the negative co-data types correspond to an isomorphism
between (co-)data declarations with signatures matching the premises and conclusion
of the rules, as shown in Figures 8.6 and 8.7. The role of using declarations for this
purpose is to give enough structural substrate for stating these rules: the sequents
containing multiple inputs and multiple outputs in the rules can be expressed by the
types of constructors or observers, and multiple premises can be expressed by multiple
alternatives for constructors or observers. And as a result, we can reason about the
polarized types as sub-components within the structure of larger (co-)data types.
Theorem 8.7 (Polarized sub-structure laws). The declaration isomorphism laws in
Figures 8.6 and 8.7 are all sound.
341
Additive laws
codataG(Θ) : V where
O1 : ( | G(Θ) ` A : N )
O2 : ( | G(Θ) ` B : N )
≈&R
codataG′(Θ) : N where
O′ : ( | G′(Θ) ` A&B : N )
codataG(Θ) : N where ≈>R
codataG′(Θ) : N where
O′ : ( | G′(Θ) ` > : N )
Multiplicative laws
codataG(Θ) : N where
O : (` G(Θ) | A : N , B : N ) ≈`R
codataG′(Θ) : N where
O′ : (` G′(Θ) | A`B : N )
codataG(Θ) : N where
O : ( | G(Θ) ` ) ≈⊥R
codataG′(Θ) : N where
O′ : ( | G′(Θ) ` ⊥ : N )
Negation Laws
codataG(Θ) : N where
O : (A : V | G(Θ) ` ) ≈¬R
codataG′(Θ) : N where
O′ : ( | G(Θ) ` ¬A : N )
Shift Laws
codataG(Θ) : N where
K : ( | G(Θ) ` A : S) ≈↑SR
codataG′(Θ) : N where
O′ : ( | G′(Θ) ` ↑SA : N )
FIGURE 8.7. Isomorphism laws of negatively polarized co-data sub-structures.
342
Proof. Due to the (co-)data interchange laws from Figures 8.4 and 8.5 we only need
to demonstrate half of the isomorphisms in Figures 8.6 and 8.7 since each side implies
the other. So let us focus only on the more familiar data type declarations, because all
the laws for polarized co-data sub-structures are derived from those. In each case, the
main technique for establishing these laws is that, for any substitution θ matching the
environment Θ, the data type F′(Θ)θ on each right-hand side is isomorphic to the single
component of the single alternative under the substitution θ according to Lemma 8.5
because each of the data types are call-by-value (i.e. F′(Θ) : V). What remains is to
then demonstrate that in each case, the data type F(Θ)θ is also isomorphic to that
same type.
The sub-structure laws for the nullary data types (0, 1) are the easiest to
show. Note how for the 0L law we directly have that F(Θ) ≈ 0 as a trivial
case of Lemma 8.1 (b), and F′(Θ) ≈ 0 by Lemma 8.5 (a), so together we know
F(Θ) ≈ 0 ≈ F′(Θ). Similarly for the 1L law, we have F(Θ) ≈ 1 as a trivial case of
Lemma 8.1 (a), and so we get F(Θ) ≈ 1 ≈ F′(Θ) from Lemma 8.5 (a) as well.
The sub-structural laws for the unary data types (∼, ↓S) follow a different line
of reasoning, but are not much more difficult to demonstrate. For instance, consider
the negating ∼L law, where we know that F(Θ)θ ≈ ∼Aθ by Lemma 8.3 (b) because
Aθ ≈ Aθ by reflexivity, and as usual F′(Θ)θ ≈ ∼Aθ by Lemma 8.5 (a). Additionally,
the shifting ↓SL law is sound because we know that F(Θ)θ ≈ ↓SAθ by Lemma 8.3 (a)
because of the reflexive isomorphism Aθ ≈ Aθ, and F′(Θ)θ ≈ ↓SAθ by Lemma 8.5 (a).
And finally, the sub-structural laws for the binary (co-)data types (⊕,⊗) require
the most effort. This is because each of these types have two parts, and so we must
relate one part at a time and then mix the result together. In particular, we know
that F1(Θ)θ ≈ F′1(A,B)θ and F2(Θ)θ ≈ F′2(A,B)θ for the declarations
data F1(Θ) : V where
K1 : (A : V ` F1(Θ) | )
data F′1(X : V , Y : V) : V where
K′1 : (X : V ` F′1(Θ) | )
data F2(Θ) : V where
K2 : (B : V ` F2(Θ) | )
data F′2(X : V , Y : V) : V where
K′2 : (Y : V ` F′2(Θ) | )
by applying Lemma 8.3 (a) to the reflexive isomorphisms Aθ ≈ X {Aθ/X} and
Bθ ≈ Y {Bθ/Y }. Now note the two different ways to mix these isomorphisms together
with Lemma 8.2. First, we could mix the above F1(Θ)θ ≈ F′1(A,B)θ and F2(Θ)θ ≈
343
F′2(A,B)θ as the first two isomorphisms while the third is F3(Θ)θ ≈ F′3(A,B)θ given
by Lemma 8.1 (a) of the trivial data declarations
data F3(Θ) : V where
K3 : ( ` F3(Θ) | )
data F′3(X : V , Y : V) : V where
K′3 : ( ` F′3(X, Y ) | )
which tells us that F(Θ)θ ≈ A ⊕ B as required by the ⊕L law. Second, we could
mix F1(Θ)θ ≈ F′1(A,B)θ and F2(Θ)θ ≈ F′2(A,B)θ as the second two isomorphisms
while the first is F0(Θ)θ ≈ F′0(A,B)θ by Lemma 8.1 (b) of the trivial data
declarations data F3(Θ) : V where and data F′3(X : V , Y : V) : V where which tells
us that F(Θ)θ ≈ A⊗B as required by the ⊗L law.
In addition to the specific laws of Figures 8.6 and 8.7, each of the polarized
connectives is compatible with isomorphism. For example, if we have A ≈ A′, then
we also have A ⊕ B ≈ A′ ⊕ B and B ⊕ A ≈ B ⊕ A′. This fact lets us apply type
isomorphisms within the context of certain larger types: if two types are isomorphic,
then we can build on them with polarized connectives however we want and still have
an isomorphism. Said another way, for any type A made from polarized connectives
and any other isomorphic types B ≈ C, we can substitute both B and C for X in A
and still have the isomorphism A {B/X} ≈ A {C/X}.
Theorem 8.8 (Polarized isomorphism substitution). For any Θ, X : S `P A : T ,
Θ `G B : S, and Θ `G C : S, if Θ  B ≈ C then Θ  A {B/X} ≈ A {C/X}.
Proof. By induction on the typing derivation of Θ, X : S `P A : T and the fact that
each polarized connective is compatible with isomorphism. For example, in the case
of ⊗, given that A1 ≈ A′1 and A2 ≈ A′2 we have that
data F1() : V where
K : (A1 ⊗ A2 : V ` F1() | )
≈⊗L
data F2() : V where
K : (A1 : V , A2 : V ` F2() | )
≈ data F3()
: V where
K : (A′1 : V , A′2 : V ` F3() | )
≈⊗L
data F4() : V where
K : (A′1 ⊗ A′2 : V ` F4() | )
344
by the ⊗L, data compatibility, and data mix laws, and so A1 ⊗ A2 ≈ F1() ≈ F4() ≈
A′1 ⊗ A′2 by Lemma 8.5. The compatibility of the other polarized connectives follows
similarly.
Laws of the Polarized Basis
We have just seen in the previous section that there is an encoding of user-defined
(co-)data types solely in terms of the basic polarized connectives. However, how do
we know that this encoding is canonical, or that there are not many different and
unrelated encodings for the same purpose? Does it matter what order in which the
components of (co-)data types are put together, or in which way they are nested?
Or maybe we could instead encode (co-)data types in terms of the positive ⊕ and ⊗
connectives instead of the negative & and `?
As it turns out, none of these differences matter. The advantage of using the
polarized connectives, as declared in Figure 8.1, as the basis for encodings is that they
exhibit many pleasant—if none too surprising—properties, some of which have been
explored previously by Zeilberger (2009) and Munch-Maccagnoni (2013). That is, in
contrast with types like call-by-name tuples or call-by-value functions, the relationships
between types that we should expect—corresponding to common and well-known
relationships from algebra and logic—are actually isomorphisms between polarized
types even in the face of effects that let terms avoid giving a result.
Algebraic laws
Let’s begin by first exploring the algebraic properties of the polarized connectives.
In particular, the isomorphic relationship between the additive and multiplicative
connectives from Figure 8.1
– On the positive side, the ⊕ and 0 connectives form a commutative monoid of
types up to isomorphism—meaning they satisfy commutative, associative, and
unit laws as isomorphisms between types—and so do the ⊗ and 1 connectives.
Furthermore, all four together form a commutative semiring up to isomorphism—
meaning the “multiplication” ⊗ distributes over the “addition” ⊕ and is
annihilated by the “zero” 0.
345
A⊕B ≈ B ⊕ A
(A⊕B)⊕ C ≈ A⊕ (B ⊕ C)
0⊕ A ≈ A ≈ A⊕ 0
A⊗B ≈ B ⊗ A
(A⊗B)⊗ C ≈ A⊗ (B ⊗ C)
1⊗ A ≈ A ≈ A⊗ 1
A⊗ (B ⊕ C) ≈ (A⊗B)⊕ (A⊗ C)
(A⊕B)⊗ C ≈ (A⊗ C)⊕ (B ⊗ C)
A⊗ 0 ≈ 0 ≈ 0⊗ A
A&B ≈ B & A
(A&B) & C ≈ A& (B & C)
>& A ≈ A ≈ A&>
A`B ≈ B ` A
(A`B)` C ≈ A` (B ` C)
⊥` A ≈ A ≈ A`⊥
A` (B & C) ≈ (A`B) & (A` C)
(A&B)` C ≈ (A` C) & (B ` C)
A`> ≈ > ≈ >` A
FIGURE 8.8. Algebraic laws of the polarized basis of types.
– On the negative side, the & and > connectives form a commutative monoid up
to isomorphism and ` and ⊥ do as well. All four together form a commutative
semiring with & as addition and ` as mutiplication.
These properties of the additive and multiplicative connectives are summarized in
Figure 8.8.
We can verify that each of these isomorphisms are, in fact, isomorphisms using the
previously-established laws of (co-)data declarations in general and internal polarized-
substructures in particular from Figures 8.4, 8.5, 8.6 and 8.7. The general technique
follows the observation that, because of Lemma 8.5, if we have either a singleton data
declaration isomorphism or a singleton (co-)data declaration isomorphism of the form:
data F() : V where
K : (A : V ` F() | )
≈ data F
′() : V where
K′ : (A′ : V ` F′() | )
or
codataG() : N where
O : ( | G() ` A : N )
≈ codataG
′() : N where
O′ : ( | G′() ` A′ : N )
then we have A ≈ A′ by composing A ≈ F() ≈ F′() ≈ A′ or A ≈ G() ≈ G′() ≈
A′. Therefore, we can prove isomorphism laws about the polarized (co-)data types
by (1) placing both sides of the proposed isomorphism within a singleton data or
346
co-data type, as appropriate, (2) “unpacking” the two sides within the structure
of the containing (co-)data type declaration, and (3) use the laws of declaration
isomorphisms to show the two sides are indeed isomorphic. Each of these algebraic
laws can be derived from the laws in Figures 8.6 and 8.7 as follows.
Commutativity
The commutativity laws for reordering the binary connectives, unsurprisingly,
follows from the commutativity laws for reordering the parts of declarations. For the
multiplicative ⊗ and `, we use the first commute law to reorder the components
within a single constructor or observer, as follows:
data F1() : V whereK : (A⊗B : V ` F1() | )
≈⊗L data F2() : V whereK : (A : V , B : V ` F2() | )
≈ data F3() : V whereK : (B : V , A : V ` F3() | )
≈⊗L data F4() : V whereK : (B ⊗ A : V ` F4() | )
codataG1() : N whereO : ( | G1() ` A`B : N )
≈`R codataG2() : N whereO : ( | G2() ` A : N , B : N )
≈ codataG3() : N whereO : ( | G3() ` B : N , A : N )
≈`L codataG4() : N whereO : ( | G4() ` B ` A : N )
Whereas for the additive ⊕ and &, we use the second commute law to reorder the
alternatives within a declaration as shown in the following isomorphism:
data F1() : V whereK : (A⊕B : V ` F1() | )
≈⊕L data F2() : V whereK1 : (A : V ` F2() | )
K2 : (B : V ` F2() | )
≈ data F3() : V whereK2 : (B : V ` F3() | )
K1 : (A : V ` F3() | )
≈⊕L data F4() : V whereK : (B ⊕ A : V ` F4() | )
347
codataG1() : N whereO : ( | G1() ` A&B : N )
≈&R codataG2() : N whereO1 : ( | G2() ` A : N )
O2 : ( | G2() ` B : N )
≈ codataG3() : N whereO2 : ( | G3() ` B : N )
O1 : ( | G3() ` A : N )
≈&R codataG4() : N whereO : ( | G4() ` B & A : N )
Unit
Combining the binary connectives with their corresponding units is an identity
operation that leaves types unchanged, up to isomorphism. These unit laws rely
on the fact that the right and left laws for the nullary connectives “cancel out,”
in an appropriate way, any occurence of the nullary connective within a (co-)data
declaration as described by the 1L, 0L, ⊥R, and >R laws. For the multiplicative 1 and
⊥ connectives, we use the fact that 1 vanishes from the left-hand side of a constructor
and ⊥ vanishes from the right-hand side of an observer:
data F1() : V whereK : (1⊗ A : V ` F1() | )
≈⊗L data F2() : V whereK : (1 : V , A : V ` F2() | )
≈1L data F3() : V whereK : (A : V ` F3() | )
≈1L data F4() : V whereK : (A : V , 1 : V ` F4() | )
≈⊗L data F5() : V whereK : (A⊗ 1 : V ` F5() | )
codataG1() : N whereO : ( | G1() ` ⊥` A : N )
≈`R codataG2() : N whereO : ( | G2() ` ⊥ : N , A : N )
≈⊥R codataG3() : N whereO : ( | G3() ` A : N )
≈⊥R codataG4() : N whereO : ( | G4() ` A : N ,⊥ : N )
≈`R codataG5() : N whereO : ( | G5() ` A`⊥ : N )
Note the use of the mix law to extend 1L and ⊥R to allow for an extra component
along side the unit connective. Alternatively, for the additive 0 and > connectives,
we use the fact that any constructor containing a 0 on its left-hand side completely
348
vanishes itself, whereas an observer containing a > on its right-hand side vanishes:
data F1() : V whereK : (0⊕ A : V ` F1() | )
≈⊕L data F2() : V whereK1 : (0 : V ` F2() | )
K2 : (A : V ` F2() | )
≈0L data F3() : V whereK : (A : V ` F3() | )
≈0L data F4() : V whereK1 : (A : V ` F4() | )
K2 : (0 : V ` F4() | )
≈⊕L data F5() : V whereK : (A⊕ 0 : V ` F5() | )
codataG1() : N whereO : ( | G1() ` >& A : N )
≈&R codataG2() : N whereO1 : ( | G2() ` > : N )
O2 : ( | G2() ` A : N )
≈>R codataG3() : N whereO : ( | G3() ` A : N )
≈>R codataG4() : N whereO1 : ( | G4() ` A : N )
O2 : ( | G4() ` > : N )
≈&R codataG5() : N whereO : ( | G5() ` A&> : N )
Again, the mix law is used to extend 0L and >R for (co-)data declarations with
another alternative.
Associativity
Nested applications of the same binary connective can be reassociated, up to
isomorphism. This is because (co-)data declarations are “flat:” there is a single, flat list
of alternative, with each one containing a single, flat list of components on either side
of the turnstyle. Therefore, after we fully unpack a nested application of a connective,
it flattens out, so that we may repack the same parts back together in the other order.
349
For the multiplicative ⊗ and `, we have the following isomorphism:
data F1() : V whereK : ((A⊗B)⊗ C : V ` F1() | )
≈⊗L data F2() : V whereK : (A⊗B : V , C : V ` F2() | )
≈⊗L data F3() : V whereK : (A : V , B : V , C : V ` F3() | )
≈⊗L data F4() : V whereK : (A : V , B ⊗ C : V ` F4() | )
≈⊗L data F5() : V whereK : (A⊗ (B ⊗ C) : V ` F5() | )
codataG1() : N whereO : ( | G1() ` (A`B)` C : N )
≈`R codataG2() : N whereO : ( | G2() ` A`B : N , C : N )
≈`R codataG3() : N whereO : ( | G3() ` A : N , B : N , C : N )
≈`R codataG4() : N whereO : ( | G4() ` A : N , B ` C : N )
≈`R codataG4() : N whereO : ( | G4() ` A` (B ` C) : N )
Note that the mix law is used to extend ⊗L and `R to allow for an extra component
on either side of the main pair. For the additive ⊕ and &, we have the following
isomorphisms, again using mix to extend ⊕L and &R to allow for an extra alternative
before or after the main pair:
data F1() : V whereK : ((A⊕B)⊕ C : V ` F1() | )
≈⊕L data F2() : V whereK1 : (A⊕B : V ` F2() | )
K2 : (C : V ` F2() | )
≈⊕L data F3() : V whereK1 : (A : V ` F3() | )
K2 : (B : V ` F3() | )
K3 : (C : V ` F3() | )
≈⊕L data F4() : V whereK1 : (A : V ` F4() | )
K2 : (B ⊕ C : V ` F4() | )
≈⊕L data F5() : V whereK : (A⊕ (B ⊕ C) : V ` F5() | )
350
codataG1() : N whereO : ( | G1() ` (A&B) & C : N )
≈&R codataG2() : N whereO1 : ( | G2() ` A&B : N )
O2 : ( | G2() ` C : N )
≈&R codataG3() : N whereO1 : ( | G3() ` A : N )
O2 : ( | G3() ` B : N )
O3 : ( | G3() ` C : N )
≈&R codataG4() : N whereO1 : ( | G4() ` A : N )
O2 : ( | G4() ` B & C : N )
≈&R codataG5() : N whereO : ( | G5() ` A& (B & C) : N )
Distributivity
Distributing a multiplication over an addition also arises from the flat nature
of (co-)data declarations much like reassociating a binary connective. The difference
is that when the addition is flattened out into the structure of the declaration, the
multiplied type is carried along for the ride (via the mix law) and copied across both
alternatives, as shown in the following isomorphisms:
data F1() : V whereK : (A⊗ (B ⊕ C) : V ` F1() | )
≈⊗L data F2() : V whereK : (A : V , B ⊕ C : V ` F2() | )
≈⊕L data F3() : V whereK1 : (A : V , B : V ` F3() | )
K2 : (A : V , C : V ` F3() | )
≈⊗L data F4() : V whereK1 : (A⊗B : V ` F4() | )
K2 : (A : V , C : V ` F4() | )
≈⊗L data F5() : V whereK1 : (A⊗B : V ` F5() | )
K2 : (A⊗ C : V ` F5() | )
≈⊕L
data F6() : V where
K : ((A⊗B)⊕ (A⊗ C) : V ` F6() | )
351
codataG1() : N whereO : ( | G1() ` A` (B & C) : N )
≈`R codataG2() : N whereO : ( | G2() ` A : N , B & C : N )
≈&R codataG3() : N whereO1 : ( | G3() ` A : N , B : N )
O2 : ( | G3() ` A : N , C : N )
≈`R codataG4() : N whereO1 : ( | G4() ` A`B : N )
O2 : ( | G4() ` A : N , C : N )
≈`R codataG5() : N whereO1 : ( | G5() ` A`B : N )
O2 : ( | G5() ` A` C : N )
≈&R
codataG6() : N where
O : ( | G6() ` (A`B) & (A` Cx) : N )
Annihilation
When a type is multiplied by the additive unit, it is cancelled out. This occurs
because, unlike an addition, the multiplication places the type next to the unit where
it is in harms way. Thus, when 0L and >R are extended (by the mix law) to allow
for an extra component alongside the units, it is swept aside as the entire alternative
is deleted, as in the following isomorphisms:
data F1() : V whereK : (A⊗ 0 : V ` F1() | )
≈⊗L data F2() : V whereK : (A : V , 0 : V ` F2() | )
≈0L data F3() : V where
≈0L data F4() : V whereK : (0 : V , A : V ` F4() | )
≈⊗L data F5() : V whereK : (0⊗ A : V ` F5() | )
codataG1() : N whereO : ( | G1() ` A`> : N )
≈`R codataG2() : N whereO : ( | G2() ` A : N ,> : N )
≈>R codataG3() : N where
≈>R codataG4() : N whereO : ( | G4() ` > : N , A : N )
≈`R codataG5() : N whereO : ( | G5() ` >` A : N )
352
∼(A&B) ≈ (∼A)⊕ (∼B)
∼> ≈ 0
∼(A`B) ≈ (∼A)⊗ (∼B)
∼⊥ ≈ 1
∼(¬A) ≈ A
¬(A⊕B) ≈ (¬A) & (¬B)
¬0 ≈ >
¬(A⊗B) ≈ (¬A)` (¬B)
¬1 ≈ ⊥
¬(∼A) ≈ A
FIGURE 8.9. De Morgan duality laws of the polarized basis of types.
Duality laws
Isomorphism of types also gives us common logical properties of the polarized
connectives based on duality, established with the same technique used in Section 8.4.
In particular, we get two parallel copies of the De Morgan laws—one for ∼ negation
and the other for ¬ negation—that relates the positive data types with the negative
co-data types as shown in Figure 8.9. The positive “or” (⊕) is dualized into the
negative “and” (&) and the positive “and” (⊗) is dualized into the negative “or” (`).
Additionally, the two negation connectives cancel each other out, up to isomorphism.
That is to say, they are a characterization of involutive negation as (co-)data types.5
Remark 8.1. Note that, while the polarized basis for (co-)data types from Figure 8.1 is
nicely symmetric, it is also somewhat redundant. The fact that the negation connectives
are involutive and follow the de Morgan laws means that we get the following derived
type isomorphisms:
A⊕B ≈ ∼((¬A) & (¬B)) A&B ≈ ¬((∼A)⊕ (∼B))
A⊗B ≈ ∼((¬A)` (¬B)) A`B ≈ ¬((∼A)⊗ (∼B))
The consequence of these isomorphisms is that we only really need half the listed
additive and multiplicative connectives. We could, for example, take only ⊕, 0, ⊗, 1,
∼, and ¬ and primitive connectives and encode &, >, `, and ⊥ in terms of them
as above. Dually, we could take &, >, `, ⊥, ∼, and ¬ as primitive and encode ⊕, 0,
⊗, 1. Or we could instead mix and match between the positive (data) and negative
5This fact was noticed by Zeilberger (2009) and further brought to the forefront by Munch-
Maccagnoni (2014). The key is to have two dual negations, where one can be encoded with implication
(¬A ≈ A→ ⊥) and its dual can be encoded with subtraction (∼A ≈ 1−A).
353
(co-data) forms as desired. The only requirement is that we have at least one binary
and nullary additive connective, one binary and nullary multiplicative connective, and
both dual involutive negations. End remark 8.1.
Involutive negation
Double negation elimination (both the positive ∼(¬A) ≈ A and negative
¬(∼A) ≈ A forms) is perhaps deceptively simple: the ∼L and ¬R laws just flip
the double-negated type back and forth across the turnstyle until both negations
disappear:
data F1() : V whereK : (∼(¬A) : V ` F1 | )
≈∼L data F2() : V whereK : ( ` F2 | ¬A : N )
≈¬R data F3() : V whereK : (A : V ` F3 | )
codataG1() : V whereO : ( | G1 ` ¬(∼A) : N )
≈¬R codataG2() : V whereO : (∼A : V | G2 ` )
≈∼L codataG3() : V whereO : ( | G3 ` A : N )
Note that in the case of ∼(¬A), ¬R must be used in a data declaration instead of a
co-data declaration, and likewise ∼L must be used in a co-data declaration for ¬(∼A).
This can be accomplished with the (co-)data interchange laws, that let us convert
each data isomorphism from Figures 8.6 and 8.6 into a co-data isomorphism and vice
versa.
Constant negation
Involutive negation converts “true” into “false” and “false” into “true,” but it
also swaps between the data and co-data formulations of each. For the multiplicative
units, the data type 1 for true is the negation of the co-data type ⊥ for false because
both represent a (co-)data type with one alternative containing nothing:
data F1() : V whereK : (∼⊥ : V ` F1 | )
≈∼L data F2() : V whereK : ( ` F2 | ⊥ : N )
≈⊥R data F3() : V whereK : ( ` F3 | )
≈1L data F4() : V whereK : (1 : V ` F4 | )
354
codataG1() : N whereO : ( | G1 ` ¬1 : N )
≈¬R codataG2() : N whereO : (1 : V | G2 ` )
≈1L codataG3() : N whereO : ( | G3 ` )
≈⊥R codataG4() : N whereO : ( | G4 ` ⊥ : N )
For the additive units, the data type 0 for false is the negation of the co-data type >
for true because both represent a (co-)data type with no alternatives:
data F1() : V whereK : (∼> : V ` F1() | )
≈∼L data F2() : V whereK : ( ` F2() | > : N )
≈>R data F3() : V where
≈0L data F4() : V whereK : (0 : V ` F4() | )
codataG1() : V whereO : ( | G1() ` ¬0 : N )
≈¬R codataG2() : V whereO : (0 : V | G2() ` )
≈0L codataG3() : V where
≈>R codataG4() : V whereO : ( | G4() ` > : N )
De Morgan laws
Involutive negation also converts “and” into “or” and “or” into “and” while
interchanging data with co-data. For the multiplicatives, the connective ⊗ is an “and”
pair that amalgamates two pieces of data into a single structure, and this is the
negation of ` which is an “or” pair that conjoins two observers together:
data F1() : V whereK : (∼(A`B) : V ` F1() | )
≈∼L data F2() : V whereK : ( ` F2() | A`B : N )
≈`R data F3() : V whereK : ( ` F3() | A : N , B : N )
≈∼L data F4() : V whereK : (∼A : V ` F4() | B : N )
≈∼L data F5() : V whereK : (∼A : V ,∼B : V ` F5() | )
≈⊗L data F6() : V whereK : ((∼A)⊗ (∼B) : V ` F6() | )
355
codataG1() : N whereO : ( | G1() ` ¬(A⊗B) : N )
≈¬R codataG2() : N whereO : (A⊗B : V | G2() ` )
≈⊗L codataG3() : N whereO : (A : V , B : V | G3() ` )
≈¬R codataG4() : N whereO : (A : V | G4() ` ¬B : N )
≈¬R codataG5() : N whereO : ( | G5() ` ¬A : N ,¬B : N )
≈`R codataG6() : N whereO : ( | G6() ` (¬A)` (¬B) : N )
For the additives, the connective ⊕ is an “or” that yields one of two possible alternative
types of answers, and this is the negation of & which gives observers the option of
one of two possible types of questions:
data F1() : V whereK : (∼(A&B) : V ` F1() | )
≈∼L data F2() : V whereK : ( ` F2() | A&B : N )
≈&R data F3() : V whereK1 : ( ` F3() | A : N )
K2 : ( ` F3() | B : N )
≈∼L data F4() : V whereK1 : (∼A : V ` F4() | )
K2 : ( ` F4() | B : N )
≈∼L data F5() : V whereK1 : (∼A : V ` F5() | )
K2 : (∼B : V ` F5() | )
≈⊕L data F6() : V whereK : ((∼A)⊕ (∼B) : V ` F6() | )
codataG1() : N whereO : ( | G1() ` ¬(A⊕B) : N )
≈¬R codataG2() : N whereO : (A⊕B : V | G2() ` )
≈⊕L codataG3() : N where
O1 : (A : V | G3() ` )
O2 : (B : V | G3() ` )
≈⊕L codataG4() : N where
O1 : ( | G4() ` ¬A : N )
O2 : (B : V | G4() ` )
≈⊕L codataG5() : N where
O1 : ( | G5() ` ¬A : N )
O2 : ( | G5() ` ¬B : N )
≈&R codataG6() : N whereO : ( | G6() ` (¬A) & (¬B) : N )
356
Shift laws
The last group of polarized connectives, the shifts, have not appeared in any of the
algebraic or duality laws here. That is partially because their role is not to represent the
structural aspects of (co-)data types—like the ability to contain several components
or offer multiple alternatives—but instead serve to explicitly signal the mechanisms,
like the ability to delay a computation and force it later, that integrate different
evaluation strategies. In fact, the presence of shifts have the effect of prohibiting the
usual algebraic and dual laws of polarized types which we see in practice in functional
programming languages.
Returning to the examples of unfaithful encodings from the introduction, consider
again the problem of encoding triples in terms of pairs in a call-by-name language
like Haskell, where lazy pairs are described by the ×N data type declared previously
in Section 8.1, and lazy triples are represented as:
data LazyTriple(X:N , Y :N , Z:N ) : N where
L3 : (X:N , Y :N , Z:N ` LazyTriple(X, Y, Z) | )
By applying the polarization encoding from Figure 8.3 to a collection of declarations
G containing both ×N and LazyTriple, we get that6
X : N , Y : N , Z : N  JLazyTriple(X, Y, Z)KG ≈ ⇑(↓X ⊗ (↓Y ⊗ ↓Z))
X : N , Y : N , Z : N  JX ×N (Y ×N Z)KG ≈ ⇑(↓X ⊗ ↓⇑(↓Y ⊗ ↓Z))
but these two types represent very different space of possible program behaviors
because of the extra shifts in the encoding of X ×N (Y ×N Z). In other words, the
difference between the two is that the type X ×N (Y ×N Z) allows for extra values
like PairN (x, µ . 〈y||β〉), where µ . 〈y||β〉 is a term that does not return any result,
but LazyTriple(X, Y, Z) does not, which is explicitly expressed by the presence or
absence of shifts in their encoding. Furthermore, whereas we can apply properties like
associativity of ⊗ within the encoding of LazyTriple(X, Y, Z), where
X : N , Y : N , Z : N  ⇑(↓X ⊗ (↓Y ⊗ ↓Z)) ≈ ⇑((↓X ⊗ ↓Y )⊗ ↓Z)
6More specifically, the immediate output of translation is JLazyTriple(X,Y, Z)KG , ⇑((↓X⊗(↓Y ⊗
(↓Z ⊗ 1))) ⊕ 0) and JX ×N Y KG , ⇑((↓X ⊗ (↓Y ⊗ 1)) ⊕ 0), which is cleaned up as shown by the
laws in Section 8.4.
357
↓VA ≈ A ≈ V⇑A ↑NA ≈ A ≈ N⇓A
FIGURE 8.10. Identity laws of the redundant self-shift connectives.
this is blocked by the extra shifts in ⇑(↓X ⊗ ↓⇑(↓Y ⊗ ↓Z)), which prevent the law
from applying.
We can also view the troubles with currying in a call-by-value language like ML
in terms of extra shifts by with the representation of call-by-value functions as the
→V co-data type previously declared in Section 8.1, whose encoding simplifies to
X : V , Y : V  JX →V Y KG ≈ ⇓(¬X ` ↑Y )
Again, the shifts get in the way when we try to apply the algebraic or logical laws of
the polarized connectives. The type of uncurried call-by-value functions is
X : V , Y : V , Z : V  J(X ⊗ Y )→V ZK ≈ ⇓(¬(X ⊗ Y )` ↑Z) ≈ ⇓(¬X ` ¬Y ` ↑Z)
whereas the type of curried call-by-value functions is
X : V , Y : V , Z : V  JX →V (Y →V Z)K ≈ ⇓(¬X ` ↑⇓(¬Y ` ↑Z))
which is not the same because of the extra shifts.
This does not mean that the shifts are completely lawless, however. Since we began
with a large family of shifts—singleton data and co-data type constructors mapping
between any kind S and V or N—some of them turn out to be redundant as shown in
Figure 8.10. The data shifts ↓V and V⇑ for wrapping call-by-value type as another call-
by-value type and the co-data shifts ↑N and N⇓ for doing the same to call-by-name
types are all identity operations on types, up to isomorphism. In particular, the data
declarations for ↓V and V⇑ are the simplest instance of Lemma 8.5 (a) which means
that ↓VA ≈ A ≈ V⇑A, and likewise ↑NA ≈ A ≈ N⇓A because of Lemma 8.5 (b).
This fact tells us that the polarizing translation on already-polarized types is actually
an identity up to isomorphism, i.e. for any Θ `P A : S, it follows that Θ  JAKP ≈ A.
For example, we have
X : V , Y : V  JX ⊕ Y KP , V⇑(((↓VX ⊗ 1)⊕ (↓VY ⊗ 1))⊕ 0) ≈ X ⊕ Y
358
for the additive data type, and
X : V  J¬XKP , N⇓(>& (⊥` (¬↓VX))) ≈ ¬X
for the negation co-data type, justifying our rule of thumb for deciding the appropriate
strategies for the polarized basis P of (co-)data types.
Functional laws
So far, our attention has been largely focused on properties of the polarized
(co-)data types from Figure 8.1, some of which, like `, are unfamiliar as programming
constructs. But what about a more familiar construct like functions? We have seen that
call-by-value functions don’t behave as nicely as we’d like, which can be understood
as unfortunate extra shifts between evaluation strategies. So is there a type of
function that avoids these problems? As it turns out, the mixed-polarity, “primordial”
(Zeilberger, 2009) function type that we considered in Chapter IV captures the best of
both the call-by-value and call-by-name worlds, which is represented by the co-data
declaration:
codata (X:V → Y :N ) : N where
· : (X : V | X → Y ` Y : N )
The particular placement of V and N again follows the rule of thumb from Section 8.1,
so as a consequence the polarized encoding for A → B avoids any impactful shifts.
Because of the identity laws for shifts from Figure 8.10, the polarizing encoding for
the above declaration G simplifies down to just ¬ and `:
X : V , Y : N  JX → Y KG ≈ N⇓(¬(↓VX)` (↑NY )) ≈ ¬X ` Y
This gives us the most primitive expression of functions in our multi-strategy language;
the rest can be encoded in terms of the above polarized function type by adding back
the extra shifts.
Alternatively, we could have chosen to replace the unfamiliar ` with this function
type. Because of the involutive nature of the ¬ and ∼ negations, we have the following
encoding of ` in terms of → and ∼: A ` B ≈ ¬(∼A) ` B ≈ (∼A) → B. Certainly
functions are more familiar than ` as a programming construct, but the cost of leaning
on this familiarity is the loss of symmetry because functions are a “half-negated or.” In
359
A→ B ≈ (∼B)→ (¬A)
(A⊗B)→ C ≈ A→ (B → C)
1→ A ≈ A
A→ ⊥ ≈ ¬A
A→ (B & C) ≈ (A→ B) & (A→ C)
(A⊕B)→ C ≈ (A→ C) & (B → C)
A→ > ≈ >
0→ A ≈ >
∼(A→ B) ≈ A⊗ (∼B)
A→ (¬B) ≈ ¬(A⊗B)
FIGURE 8.11. Derived laws of polarized functions.
particular, we can recast all of the algebraic and logical laws about ` in terms of→ as
shown in Figure 8.11, some of which are familiar properties of implication, that are all
derived from the encoding A→ B ≈ ¬A`B. The commutativity, associativity, and
unit laws of the underlying ` give us contrapositive, currying, thunking, and negating
laws:
A→ B ≈ (¬A)`B ≈ B ` (¬A) ≈ (¬(∼B))` (¬A) ≈ (∼B)→ (¬A)
(A⊗B)→ C ≈ (¬(A⊗B))` C ≈ ((¬A)` (¬B))` C
≈ (¬A)` ((¬B)` C) ≈ A→ (B → C)
1→ A ≈ (¬1)` A ≈ ⊥` A ≈ A
A→ ⊥ ≈ (¬A)`⊥ ≈ ¬A
Likewise, distributing ` over & and annihilating it with > recognize certain functions
as products or trivial units:
A→ (B & C) ≈ (¬A)` (B & C) ≈ ((¬A)`B) & ((¬A)` C)
≈ (A→ B) & (A→ C)
(A⊕B)→ C ≈ (¬(A⊕B))` C ≈ ((¬A) & (¬B))` C
≈ ((¬A)` C) & ((¬B)` C) ≈ (A→ C) & (B → C)
A→ > ≈ (¬A)`> ≈ >
0→ A ≈ (¬0)` A ≈ >` A ≈ >
360
And finally, the De Morgan duality between ` and ⊗ tells us that the continuation
of a function is a pair, and that a continuation for a pair is a function:
∼(A→ B) ≈ ∼((¬A)`B) ≈ (∼(¬A))⊗ (∼B) ≈ A⊗ (∼B)
A→ (¬B) ≈ (¬A)` (¬B) ≈ ¬(A⊗B)
The Faithfulness of Polarization
Now that we have laid down some laws for declaration isomorphisms, we can
put them to use for encoding user-defined (co-)data types in terms of the polarized
connectives from Figure 8.1. In particular, we can extend the laws from Figures 8.6
and 8.7 for polarized sub-structures appearing within a simple singleton declaration
to apply to any general (co-)data type using the mix laws from Figures 8.4 and 8.5.
For example, given a declaration of the form
data F(Θ) : V where
K0 : (Γ0, A : V , B : V ` F(Θ) | ∆0)
#                                         »K : (Γ ` F(Θ) | ∆)
we can combine the A and B components of the K0 constructor with the ⊗ connective
by starting with the ⊗L law, and then building up to the full declaration of F
by applying the mix law to the appropriate reflexive isomorphisms as discussed in
Section 8.3 as follows:
data F(Θ) : V where
K : (A : V , B : V ` F(Θ) | )
≈ data F
′(Θ) : V where
K′ : (A⊗B : V ` F′(Θ) | )
data F(Θ) : V where
K0 : (A : V , B : V ,Γ0 ` F(Θ) | ∆0)
≈ data F
′(Θ) : V where
K′0 : (A⊗B : V ,Γ0 ` F′(Θ) | ∆0)
data F(Θ) : V where
K0 : (A : V , B : V ,Γ0 ` F(Θ) | ∆0)
#                                         »K : (Γ ` F(Θ) | ∆)
≈
data F′(Θ) : V where
K′0 : (A⊗B : V ,Γ0 ` F′(Θ) | ∆0)
#                                           »
K : (Γ ` F′(Θ) | ∆)
Other combinations of components at different positions in constructors of F can
be targeting with the commute laws for data declarations. This idea is the central
technique of the encoding, which just repeats the above procedure until we are left
361
with only a singleton (co-)data type that “wraps” its encoding. First we consider how
to encode a just one (co-)data type declaration in terms of polarized connectives.
Theorem 8.9 (Polarizing (co-)data declarations). For any S validating χS ,
a) given data F(Θ) : Swhere
#                                                                                   »
Ki :
(
#              »
Aij : Tijj ` F(Θ) | #                »Bij : Rijj
)i
∈ G, we have
Θ  F(Θ) ≈ JF(Θ)KG, and
b) given codataG(Θ) : Swhere
#                                                                                    »
Oi :
(
#              »
Aij : Tijj | G(Θ) ` #                »Bij : Rijj
)i
∈ G, we have
Θ  G(Θ) ≈ JG(Θ)KG.
Proof. a) Observe that we have the following data isomorphism by extending the
polarized laws from Figure 8.6 with the mix and commute laws from Figure 8.4:
data F1(Θ) : V where
#                                                                                      »
Ki :
(
#              »
Aij : Tijj ` F1(Θ) | #                »Bij : Rijj
)i
≈↓L data F2(Θ) : V where
#                                                                                            »
Ki :
(
#                     »↓TijAij : V
j ` F2(Θ) | #                »Bij : Rijj
)i
≈↑R data F3(Θ) : V where
#                                                                                                    »
Ki :
(
#                     »↓TijAij : V
j ` F3(Θ) | #                        »↑RijBij : N
j
)i
≈∼L data F4(Θ) : V where
#                                                                                                              »
Ki :
(
#                     »↓TijAij : V
j
,
#                               »∼(↑RijBij) : V
j ` F4(Θ) |
)i
≈1L,⊗L data F5(Θ) : V where
#                                                                                                                     »
Ki :
(⊗( #            »↓TijAijj, #                      »∼(↑RijBij)j) : V ` F5(Θ) | )
i
≈0L,⊕L
data F6(Θ) : V where
K :
⊕ #                                                              »⊗( #            »↓TijAijj, #                      »∼(↑RijBij)j)
i
 : V ` F6(Θ) |

With the above isomorphism between F1 and F6, it follows from the data shift
law that:
data F(Θ) : Swhere
#                                                                                   »
Ki :
(
#              »
Aij : Tijj ` F(Θ) | #                »Bij : Rijj
)i
≈ data F′(Θ) : SwhereK :
⊕ #                                                              »⊗( #            »↓TijAijj, #                      »∼(↑RijBij)j)
i
 : V ` F′(Θ) |

362
We then get that
Θ  F′(Θ) ≈ S⇑
⊕ #                                                              »⊗( #            »↓TijAijj, #                      »∼(↑RijBij)j)
i
 , JF(Θ)KG
by applying Lemma 8.3 (a) to the reflexive isomorphism of
⊕ #                                                              »⊗( #            »↓TijAijj, #                      »∼(↑RijBij)j)
i

so by transitivity Θ  F(Θ) ≈ JF(Θ)KG.
b) Analogous to the proof of Theorem 8.9 (a) by duality.
Now that we know how to encode individual (co-)data types in isolation, we
look to a global encoding of arbitrary types made out of a collection G of (co-)data
declarations. The only limitation on the group of declarations G is that they be
well-formed and non-cyclic, which is a consequence of the judgement
(
`G
)
seq from
Section 6.2. The non-cyclic requirement ensures that the dependency chains between
declarations is well-founded, so the process of inlining the encodings of (co-)data types
will eventually terminate and give a final, fully-expanded encoding.
Theorem 8.10 ((Co-)Data Polarization). Given derivations of both
(
`G
)
seq and
Θ `G A : S, it follows that Θ  A ≈ JAKG.
Proof. By lexicographic induction on (1) the derivation of
(
`G
)
seq , and (2) the
derivation of Θ `G A : S. The case when A is a variable is immediate. The case where
A = F( #»C ) for some F declared in G as
data F(
#          »
X : S ′) : Swhere
#                                                                 »
K :
(
#         »
A : T ` F( #»X ) | #         »B : R
)
follows from Theorems 8.9 and 8.8. In particular, we have
#                               »
Θ  C ≈ JCKG #          »X : S ′  Aij ≈ JAijKG′ #          »X : S ′  Bij ≈ JBijKG′
from the inductive hypothesis for some G ′ strictly smaller than G. From Theorem 8.9,
we have
F( #»C ) ≈ S⇑(
⊕
(
#                                                               »⊗
( #               »↓TijAijθ
j
,∼( #                »↑RijBijθ
j
))
i
))
363
where θ =
#             »{C/X}, and from Theorem 8.8 we know that
Θ  S⇑
⊕ #                                                                    »⊗( #               »↓TijAijθj,∼( #                »↑RijBijθj))
i

≈ S⇑
⊕ #                                                                                                    »⊗( #                              »↓TijJAijKGJθKGj, #                                         »∼(↑RijJBijKGJθKG)j)i

where JθKG = #                      »{JCKG/X}. Therefore, we have Θ  F( #»C ) ≈ rF( #»C )zG by distributing
the substitution θ over translation. The case where A = G( #»C ) for some G declared in
G as co-data follows similarly.
Note that as an immediate consequence of the full (co-)data polarization
encoding (Theorem 8.10), we can generalize the fact that isomorphism distributes
over substitution into a type made from polarized connectives (Theorem 8.8) to
conclude that isomorphism distributes over substitution into any type built from
(non-cyclic) (co-)data type constructors. In particular, for any non-cyclically well-
formed G, Θ, X : S `G A : T , Θ `G B : S, and Θ `G C : S, if Θ  B ≈ C then
Θ  A {B/X} ≈ A {C/X}. This fact means that we can apply any isomorphism
within the context of any encodable (co-)data type.
364
CHAPTER IX
Representing Functional Programs
At this point, we have looked at the design and theory of many different
programming language features in the setting of the sequent calculus. We looked at a
symmetric mechanism for user-defined data and co-data types, an abstract treatment
of evaluation strategy in terms of substitution that lets us mix multiple evaluation
orders within a program (Chapter V). We also looked at ways to do capture type
abstraction as higher-order data and co-data types, and well-founded recursion in both
types and programs (Chapter VI). But how has what we learned impact functional
programming languages which are based on natural deduction and the λ-calculus? Is
there some way to transfer ideas born in the sequent calculus over to more traditional
programming languages?
Yes! The sequent calculus was developed as a tool for studying natural deduction
(Gentzen, 1935a), so the two already have a well-established relationship (Gentzen,
1935a). Here, we will now employ that canonical relationship between the two logics
to develop the natural deduction, λ-based counterpart to everything we have done
in the µµ˜ sequent calculus. This lets us talk about ideas like user-defined (co-)data
types, multiple kinds of evaluation strategies, and mixed induction and co-induction
in a core language that is much closer to pure functional programming.
The pure λ-calculus counterpart is not just loosely based on the ideas developed in
the µµ˜-calculus, the two are in a close correspondence between their static and dynamic
semantics. In particular, by limiting ourselves to just one consequence, which eliminates
the possibility of control effects, typability and equations between program fragments
in the two languages are in a one-for-one correspondence. The correspondence between
sequent calculus and natural deduction has two applications. First, it lets us compile
functional programs, which come from languages based on the λ-calculus, so a sequent
calculus representation. Second, it lets us transfer results, such as strong normalization
(in Chapter VII) found in the sequent-based language to the natural deduction one.
So the single-consequence µµ˜-calculus can be seen as a more machine-like version of
a pure λ-calculus core language which serves as a compile target as well as a good
vehicle for studying the properties of functional programs.
365
Next, we consider what natural deduction language corresponds to the entire,
multiple-consequence, µµ˜-calculus. Multiple consequences are neatly accommodated
by Parigot’s (1992) λµ-calculus, which is a form of natural deduction for classical logic,
and a term language for first-class control effects. This extension sacrifices purity for
greater expressivity (Felleisen, 1991) which has practical applications in compilers.
In particular, optimizing compilers rely on the idea of join points—a representation
of shared control flow in a program which joins back together after diverging across
different branching paths, like after an if-then-else construct—for preventing code size
explosion while transforming programs.
In common compiler intermediate languages, join points are represented as φ-
nodes in static single assignment (SSA) (Cytron et al., 1991) and as just continuations
in continuation-passing style (CPS) (Appel, 1992). However, the proper treatment of
join points is typically a troublesome issue in languages based on a pure direct-style
λ-calculus (Kennedy, 2007). In contrast, the direct-style λµ-calculus presented here
can be used as a basis for a compiler intermediate language for functional languages
that allows for the proper expression of shared control flow in terms of general first-
class control. If we are interested in compiling pure functional programs in particular,
we can restrict this calculus down to the pure subset without loosing the correct
treatment of join points by limiting the types and occurrences of co-variables (Maurer
et al., 2017; Downen et al., 2016). This restriction also makes it more direct to give
a good account of recursive join points which are helpful for implementing efficient
loops for functional languages.
This chapter covers the following topics:
– A λ-calculus based natural deduction counterpart to the µµ˜ sequent calculus,
λlet (Section 9.1), including all of the language features we have considered so
far: user-defined, higher-order (co-)data types, mixed evaluation strategies, and
well-founded recursion.
– The direct correspondence between the static and dynamic semantics of λlet
and the single-consequence restriction µµ˜ (Section 9.2), which is “direct” in the
sense that the translation between the two languages is local, not global, so that
the types of terms are exactly the same in both languages (unlike continuation-
passing style transformations).
366
– The multipe-consequence extension of the pure natural deduction λlet -calculus,
λµlet (Section 9.3), which heightens the correspondence to cover the full µµ˜
sequent calculus and allows for the direct representation of shared control flow.
Pure Data and Co-Data in Natural Deduction
So far, we have looked at a calculus in sequent style, which corresponds to a
classical logic and thus includes control effects (Griffin, 1990). Let’s now shift focus,
and see how the intuition we gained from the sequent calculus can be reflected back into
a more traditional core calculus for representing functional programs. The goal here is
to see how the principles we have developed in the sequent setting can be incorporated
into a λ-calculus based language: using the traditional connection between natural
deduction and the sequent calculus, we show how to translate our primitive and
noetherian recursive types and programs into natural deduction style. In essence, we
will consider a functional calculus based on an effect-free subset of the µµ˜-calculus
corresponding to Gentzen’s (1935a) LJ sequent calculus for intuitionistic logic.
Static semantics
Essentially, the intuitionistic restriction of the µµ˜ sequent calculus for representing
effect-free programs follows a single mantra, based on the connection between the
classical and intuitionistic logics LK and LJ: there is always exactly one conclusion. In
the type system, this means that the sequent for typing terms has the more restricted
form Γ `ΘG v : A, where the active type on the right is no longer ambiguous and does
not need to be distinguished (with |), as is more traditional for functional languages.
Notice that this limitation on the form of sequents impacts which data and co-data
types we can express. For example, the common sums and products, which were
declared as
dataX ⊕ Y where
ι1 : X ` X ⊕ Y |
ι2 : X ` X ⊕ Y |
codataX & Y where
pi1 : | X & Y ` X
pi2 : | X & Y ` Y
fit into this restricted typing discipline, because each of their constructors and observers
involves exactly one type to the right of entailment. However, the (co-)data types for
367
representing more exotic connectives like the two negations
data∼X where
∼ : ` ∼X | X
codata¬Y where
¬ : X | ¬Y `
or the binary and nullary disjunctive co-data types
codataX ` Y where
[ , ] : | X ` Y ` X, Y
codata⊥where
[] : | ⊥ `
do not fit, because they require placing zero or two types to the right of entailment.
In sequent style, this means these pure data types can never contain a co-value, and
pure co-data types must always involve exactly one co-value for returning the unique
result. In functional style, the data types are exactly the algebraic data types used in
functional languages, with the corresponding constructors and case expressions, and
the co-data types can be thought of as merging functions with records into a notion
of abstract “objects” which compute and return a value when observed. For example,
to observe a value of type X & Y , we could access the first component as a record
field, v pi1, and we describe an object of this type by saying how it responds to all
possible observations, λ{pi1 ⇒ v1 | pi2 ⇒ v2}, with the typing rules:
Γ ` v1 : A Γ ` v2 : B
Γ ` λ{pi1 ⇒ v1 | pi2 ⇒ v2} : A&B
Γ ` v : A&B
Γ ` v pi1 : A
Γ ` v : A&B
Γ ` v pi2 : B
Likewise, the traditional λ-abstractions and type abstractions from system F (as seen
previously in Chapter II) can be expressed by objects of these form. Specifically, since
they definable (as seen previously in Sections 5.2 and 6.2) as pure co-data types with
one observer, · : (X | X → Y ` Y ) and @ :
(
| ∀(X) `Y :k X Y
)
respectively,
so that the application of a function v to an argument v′ is written as v · v′, the
specification of the polymorphic v to a type A is written as v @ A, and the basic
λ-abstractions are syntactic sugar that removes the extra generality:
λx.v , λ{ · x⇒ v} ΛY :k.v , λ{@ Y :k ⇒ v}
Thus, these objects serve as “generalized λ-abstractions” (Abel & Pientka, 2013)
defined by shallow case analysis rather than deep pattern-matching.
368
v ∈ Term ::= x | letx = v in v′ (core)
| K( #»B, #»v ) | case v′ of
#                                »
K( #   »Y :l, #»x )⇒ v (data)
| λ
{
#                               »
O[ #   »Y :l, #»x ]⇒ v
}
| v′ O[ #»B, #»v ] (co-data)
F ∈ FrameCxt ::=  | letx = F in v (core frames)
| F O[ #»B, #»v ] (co-data frames)
| caseF of
#                                »
K( #   »Y :l, #»x )⇒ v (data frames)
FIGURE 9.1. Untyped syntax for a natural deduction language of data and co-data.
Putting this more formally, the untyped syntax of λlet , a natural deduction style
pure λ-calculus, is given in Figure 9.1. At its core, the λlet -calculus includes variables
and let expressions, which allow for the binding and reference of names without
imposing any particular structure. In addition, the untyped syntax of λlet includes
arbitrary data structures and case analysis on data structures (of the form K( #»B, #»v )
and case v′ of
#                                »
K( #   »Y :l, #»x )⇒ v) as well as arbitrary co-data objects and observations
of those objects (of the form λ
{
#                               »
O[ #   »Y :l, #»x ]⇒ v
}
and v′ O[ #»B, #»v ]). The observations of
the results of terms are given by the syntax of frame contexts F . While these contexts
are a meta-syntactic construct (that is, they are contexts of the syntax of terms, but
not syntax themselves), they will soon play a crucial in the dynamic semantics of λlet
to come.
On top of the untyped syntax, we have the static typing rules. The rules for the
type-level upward and well-formed sequents are exactly the same as in the µµ˜-calculus,
so we do not repeat them here, and instead only present the typing rules for terms.
First, there are the core typing rules in Figure 9.2 which correspond to the core of
the µµ˜-calculus: the Var rule corresponds to right-variable rule VR, the Let rule
corresponds to the Cut rule, and the TC rule corresponds to the right-type conversion
rule TCR. Note that weakening and contraction are built into these rules, following
the style of natural deduction which makes structural inferences implicit. Next, we
have the typing rules for pure data and co-data types in the λlet -calculus: the rules for
simple multi-kinded (co-)data types are shown in Figure 9.3 and the more advanced
rules for higher-order (co-)data types are shown in Figure 9.4. Intuitively, these rules
369
Judgement ::= Γ `ΘG v : A
Γ, x : A `ΘG x : A
Var
Γ `ΘG v : A Θ `G A : S Γ, x : A `ΘG v′ : C
Γ `ΘG letx = v in v′ : C
Let
Θ `G A =βη B : k Γ `ΘG v : A
Γ `ΘG v : B
TC
FIGURE 9.2. A natural deduction language for the core calculus.
generalize the typing rules from the λ-calculus in Chapter II. Also note that for a case
expression which introduces type variables in its branches, the associated elimination
rule implicitly imposes the restriction that the return type cannot reference those type
variables. The implicit restriction comes from the fact that, for the conclusion of the
elimination rule to be well formed. That is, if we know that the sequent corresponding
to Γ `ΘG case v of
#                                »
Ki(
#   »
Y :l #»x )⇒ vi
i
: C is well formed, i.e. if we have a derivation of(
Γ `ΘG C
)
seq , then that implies that none of #»Y are free in C, since they are not
already in Θ because we are able to extend the typing environment to Θ, #      »Y : l in the
premise.
Dynamic semantics
With the static semantics for how natural deduction programs are formed, we
now consider the dynamic semantics for how programs behave. As with the µµ˜-sequent
calculus, we will characterize the impact of evaluation strategy on substitution as a
parameter to the language. In λlet , the corresponding notion of a substitution strategy
T is a subset of terms called values (V ∈ ValueT ) and a subset of frame contexts
called co-values (E ∈ CoValueT ), such that variables are values, the empty context
is a co-value, and co-values compose (i.e. if E and E ′ are co-values then so is E[E ′]).
Next, an evaluation strategy T includes a substitution strategy as well as a subset
of all contexts called evaluation contexts (D ∈ EvalCxtT ) such that every co-value is
an evaluation context. Note that the scope of potential evaluation contexts is quite
large, and co-values point out a special subset of all evaluation contexts. In essence,
370
Given data F( #        »X : k) : Swhere
#                                                »
Ki :
#              »
Aij : Tijj ` F( #»X )
i
∈ G, we have the rules:
#                                              »
Γ `ΘG v : Aij
{ #        »
B/X
}j
Γ `ΘG Ki( #»v ) : F(
#»
B)
FIKi
Θ `G F( #»B) : S Γ `ΘG v : F(
#»
B)
#                                                                 »
Γ,
#                                »
x : Aij
{ #        »
B/X
}j
`ΘG vi : C
i
Γ `ΘG case v of
#                        »Ki( #»x )⇒ vi
i
: C
FE
Given codataG( #     »X:k) : Swhere
#                                                                        »
Oi :
#              »
Aij : Tijj | G( #»X ) ` A′i : Ri
i
∈ G, we have the
rules:
#                                                                                       »
Γ,
#                                »
x : Aij
{ #        »
B/X
}j
`ΘG vi : A′i
{ #        »
B/X
}i
Γ `ΘG λ
{
#                       »Oi[ #»x ]⇒ vi
i
}
: G( #»B)
GI
Θ `G G( #»B) : S Γ `ΘG v : G(
#»
B)
#                                                »
Γ `ΘG vj : Aij
{ #        »
B/X
}j
Γ `ΘG v Oi[ #»vjj] : A′i
GEOi
FIGURE 9.3. Natural deduction typing rules for simple (co-)data.
371
Given data F( #        »X : k) : Swhere
#                                                              »
Ki :
#               »
Aij : Tijj `
#       »
Y :lij
j
F( #»X )
i
∈ G, we have the rules:
#                                                 »
Θ `G B′ : lij
{
#        »
B/X
}j #                                                                 »
Γ `ΘG v : Aij
{
#         »
B′/Y ,
#        »
B/X
}j
Γ `ΘG Ki(
# »
B′ , #»v ) : F( #»B)
FIKi
Θ `G F( #»B) : S Γ `ΘG v : F(
#»
B)
#                                                                                 »
Γ,
#                                 »
x : Aij
{
#        »
B/X
}j
`Θ,
#       »
Y :lij
j
G vi : C
i
Γ `ΘG case v of
#                                          »
Ki(
#       »
Y :lij
j
, #»x )⇒ vi
i
: C
FE
Given codataG( #     »X:k) : Swhere
#                                                                                      »
Oi :
#               »
Aij : Tijj | G( #»X ) `
#       »
Y :lij
j
A′i : Ri
i
∈ G, we have the
rules:
#                                                                                                        »
Γ,
#                                 »
x : Aij
{
#        »
B/X
}j
`Θ,
#       »
Y :lij
j
G vi : A′i
{
#        »
B/X
}i
Γ `ΘG λ
{
#                                         »
Oi[
#       »
Y :lij
j
, #»x ]⇒ vi
i
}
: G( #»B)
GI
Θ `G G( #»B) : S Γ `ΘG v : G(
#»
B)
#                                                 »
Θ `G B′ : lij
{
#        »
B/X
}j #                                                                   »
Γ `ΘG vj : Aij
{
#         »
B′/Y ,
#        »
B/X
}j
Γ `ΘG v Oi[
# »
B′
j
, #»vj
j ] : A′i
GEOi
FIGURE 9.4. Natural deduction typing rules for higher-order (co-)data.
372
(letT ) letx = V in v letT v {V/x} (V ∈ ValueT )
(ηletT ) letx = v inE[x] ηletT E[v] (E ∈ CoValueT , x /∈ FV (E))
(ccT ) E[letx = v′ in v] ccT letx = v′ inE[v] ( 6= E ∈ CoValueT , x /∈ FV (E))
(ccT ) E
 case v′ of#                        »
K( #   »Y :l)⇒ v
 ccT case v
′ of
#                                »
K( #   »Y :l)⇒ E[v]
FIGURE 9.5. A core parametric theory for the natural deduction calculus.
co-values are evaluation contexts with some additional properties: evaluation contexts
in general are stationary, but co-values are mobile.
With the concept of λlet evaluation strategies in mind, lets look at the strategy-
parametric rewriting rules for the λlet -calculus. The core rewriting rules, which
correspond to the core theory of the µµ˜-calculus, are given in Figure 9.5. These rules
are responsible for interpreting let expressions: the letT rule substitutes a let -bound
T -value for the bound variable, and the ηletT rule eliminates a trivial let expression of
the form letx = v inE[x] which introduces a name only to use it exactly once in the
eye of a T -co-value. The core theory also includes commuting conversions ccT which
push co-values inside of the block structures of let and case expressions. Intuitively,
in the term E[letx = v′ in v], the result of v is passed to the co-value E, however
the two are separated by an intermediate let . A ccT reduction is thus needed to
push the co-value inward and bring the question E in contact with the answer v as in
letx = v′ inE[v]. The same situation happens with a case in place of a let , so there
is a commuting conversion for case , too. Unfortunately, this means that the core λlet
theory must know about and manipulate language constructs revolving around data
types, unlike the core theory of the µµ˜ sequent calculus which made no assumptions
about specific types.
Next, we have the rewriting rules for data and co-data in the natural deduction
λlet -calculus. First are the untyped and strategy-parameterized β and ς laws in
Figure 9.6, which mimic the similar β and ς in the µµ˜-calculus. The β laws generalize
the β laws for the λ-calculus from Chapter II to accomodate arbitrary data and
co-data types and arbitrary substitution strategies. The ς lift laws are necessary to
keep evaluation moving forward when non-values are found in unfortunate contexts.
373
(βT )
case Ki(
#»
B,
#»
V )of
#                                   »
Ki(
#   »
Y :l, #»x )⇒ vi
i βT vi
{ #     »
V/x,
#       »
B/Y
}
(βT ) λ
{
#                                  »
Oi[
#   »
Y :l, #»x ]⇒ vi
i
}
O1[
#»
B,
#»
V ] βT vi
{ #     »
V/x,
#       »
B/Y
}
(ςT ) Ki(
#»
B,
#»
V , v′, #»v ) ςT
letx = v′
in Ki(
#»
B,
#»
V , x, #»v )
(v′ /∈ ValueT , x fresh)
(ςT ) V ′ O[
#»
B,
#»
V , v′, #»v ] ςT
letx = v′
inV ′ O[ #»B, #»V , x, #»v ]
(v′ /∈ ValueT , x fresh)
FIGURE 9.6. The untyped parametric βς laws for arbitrary data and co-data types.
For example, call-by-value λ-calculi often have issues with getting stuck prematurely on
open terms, where evaluation should still continue even though the value of everything
isn’t known yet. For example, in the open λ-calculus term (λx.λy.x) (f 1) 2, plain
call-by-value β-reduction is stuck because f 1 is not a value and cannot be substituted
for x even though the result of the term must be f 1 for any value of f . However, the
ς rules can lift the inconveniently-placed f 1 out of the way, letting reduction proceed
as follows:
(λx.λy.x) (f 1) 2→ςV let z = f 1 in (λx.λy.x) z 2
→βV let z = f 1 in (λy.z) 2
→βV let z = f 1 in z
→ηletV f 1
We also have the typed and strategy-independent β and η laws in Figure 9.6.
The β laws work for any evaluation strategy by binding unevaluated sub-terms in let
expressions, and the η laws expand terms based on their type. Note that, as in the
µµ˜-calculus, the η law for co-data types acts on variables, but the more commonly
seen generalization to values is derivable with the help of the core theory for let :
V : G( #»C ) =letT let y = V in y
374
(βF) case Ki(
#»
B, #»v )of
#                                   »
Ki(
#           »
Y :l, #»x )⇒ vi βF let #          »x = vi in vi
{ #       »
B/Y
}
(βG) λ
{
#                                  »
Oi[
#   »
Y :l, #»x ]⇒ vi
}
Oi[
#»
B, #»v ] βG let #         »x = v in vi
{ #       »
B/Y
}
(ηF) v : F( #»C ) ≺ηF case v of
#                                                      »
K( #   »Y :l, #»x )⇒ K( #   »Y :l, #»x )
(ηG) y : G( #»C ) ≺ηG λ
{
#                                                          »
O[ #   »Y :l, #»x ]⇒ y O[ #   »Y :l, #»x ]
}
FIGURE 9.7. The typed βη laws for declared data and co-data types.
V ∈ ValueV ::= x | K( #»A, #»V ) | λ
{
#                                »
O[ #   »X:l, #»x ]⇒ v
}
E ∈ CoValueV ::= F
D ∈ EvalCxtV ::= E
FIGURE 9.8. Call-by-value (V) strategy in natural deduction.
=ηG let y = V inλ
{
#                                                          »
O[ #   »Y :l, #»x ]⇒ y O[ #   »Y :l, #»x ]
}
=letT λ
{
#                                                            »
O[ #   »Y :l, #»x ]⇒ V O[ #   »Y :l, #»x ]
}
where the definition of capture-avoiding substitution enforces the side condition that
the variables #»Y and #»x are not free in V . In this sense, the strategy-independent η
laws in Figure 9.7 also generalize the η laws for the λ-calculus from Chapter II.
Let’s now consider some example evaluation strategies, corresponding to the ones
we defined for the µµ˜ sequent calculus. First, we have the call-by-value strategy V
shown in Figure 9.8, which says that only variables, data structures made from values,
and co-data objects are values. However, all frame contexts are co-values, which also
exactly spell out the set of evaluation contexts. This definition essentially follows the
normal notion of values and evaluation contexts in the call-by-value λ-calculus, except
that evaluation does not descend into data structures (like pairs) or the arguments
of co-data observations (like function calls). Instead, the ς rules lift unevaluated
components out of these contexts and bind them to a variable with a let expression
which is a co-value so evaluation can continue on them. Next, we have the call-by-
name strategy N shown in Figure 9.8, which says that every term is a value, but
375
V ∈ ValueN ::= v
E ∈ CoValueN ::=  | caseE of
#                                »
K( #   »X:l, #»x )⇒ v | E O[ #»A, #»v ]
D ∈ EvalCxtN ::= E
FIGURE 9.9. Call-by-name (N ) strategy in natural deduction.
only the empty context, a case on a co-value, and an observation on a co-value are
co-values. This evaluation strategy more closely matches the call-by-name λ-calculus,
where every term is substitutable and evaluation contexts, which are exactly co-values,
are only those contexts which force an answer to be given for computation to continue.
Finally, we have the most subtle strategy of the three: the call-by-need strategy
LV shown in Figure 9.10. Note how the values of call-by-need are exactly the values
of call-by-value. However, the co-values of call-by-need lie in between call-by-value
and call-by-name. In particular, every call-by-name co-value is a call-by-need co-value,
but there are extra co-values of the form letx = E inD[E ′[x]], where the evaluation
context D can include extra let -bindings around E ′[x]. Note that this is the first
evaluation strategy with an interesting difference between co-values and evaluation
contexts: evaluation contexts can wrap a co-value with extra let -bindings as in
letx1 = v1 in . . . letxn = vn inE. This is because those bindings are delayed until
their value is needed, as in the call-by-need λ-calculus. For example, if we have the
term letx = 1 + 1 in v, the right-hand-side of x = 1 + 1 is delayed, and instead v is
evaluated in the context letx = 1 + 1 in v. If it happens that v reduces to a term
that needs x, that is to say v 7→ D[E[x]], then letx =  inD[E[x]] is an evaluation
context, so that 1 + 1 is evaluated and substituted for x as in:
letx = 1 + 1 in v 7→ letx = 1 + 1 inD[E[x]]
7→ letx = 2 inD[E[x]]
7→ D[E[2]] {2/x}
However, the bindings are not mobile; they should not be pushed inward by commuting
conversions like co-values are.
376
V ∈ ValueLV ::= x | K( #»A, #»V ) | λ
{
#                                »
O[ #   »X:l, #»x ]⇒ v
}
E ∈ CoValueLV ::=  | caseE of
#                                »
K( #   »X:l, #»x )⇒ v | E O[ #»A, #»V ] | letx = E inD[E ′[x]]
D ∈ EvalCxtLV ::= E | letx = v inD
FIGURE 9.10. Call-by-need (LV) strategy in natural deduction.
Finally, we can combine multiple evaluation strategies within a single program
using as similar technique as in the µµ˜ sequent calculus. That is to say, evaluation
strategies can be combined by taking the disjoint union of the respective substitution
strategies and composing together each of their evaluation contexts to get a single set
of operational contexts. The disjointness of the union can be regulated by kinds, as
a looser form of typing discipline as shown in Figure 9.11. As before, the statement
v :: T intuitively means that v : A and A : T for some unknown type A. We can then
disjointly union several substitution strategies #»T based on kinds by associating each
strategy with a kind, and distinguishing values based on the kind of their output and
co-values based on the kind of their input. That is to say, the combined set of values
Value #»T contains any value V ∈ ValueTi such that v :: Ti. In contrast, the combined
set of co-values CoValue #»T contains any co-value E ∈ CoValueTi such that E[v] :: Tj
for all v :: Ti. For example, to combine the three strategies above, we would get the
following composite strategy:
V :: V V ∈ ValueV
V ∈ ValueV,N ,LV
V :: N V ∈ ValueN
V ∈ ValueV,N ,LV
V :: LV V ∈ ValueLV
V ∈ ValueV,N ,LV
E ∈ CoValueV ∀v :: V .∃R.E[v] :: R
E ∈ CoValueV,N ,LV
E ∈ CoValueN ∀v :: N .∃R.E[v] :: R
E ∈ CoValueV,N ,LV
E ∈ CoValueLV ∀v :: LV .∃R.E[v] :: R
E ∈ CoValueV,N ,LV
E ∈ CoValueV,N ,LV
E ∈ EvalCxtV,N ,LV
D ∈ EvalCxtV,N ,LV
letx :: LV = v inD ∈ EvalCxtV,N ,LV
377
Judgement ::= Γ `ΘG v :: A
Core kinding rules:
Γ, x :: T `ΘG x :: T
Var
Γ `ΘG v :: T Γ, x :: T `ΘG v′ :: R
Γ `ΘG letx = v in v′ :: R
Let
Given data F( #        »X : k) : Swhere
#                                                           »
Ki :
#              »
Aij : Tijj `
#       »
Y :lij
j
F( #»X )
i
∈ G, we have the rules:
#                         »Γ `G v :: Tij
j
Γ `G Ki(
# »
B′ , #»v ) :: S
FIKi
Γ `G v :: S
#                                              »
Γ, #           »x :: Tijj `G vi :: R
i
Γ `G case v of
#                                         »
Ki(
#      »
Y :lij
j
, #»x )⇒ vi
i
:: R
FE
Given codataG( #     »X:k) : Swhere
#                                                                                   »
Oi :
#              »
Aij : Tijj | G( #»X ) `
#       »
Y :lij
j
A′i : Ri
i
∈ G, we have
the rules:
#                                              »
Γ, #          »x : Tijj `G vi :: Ri
i
Γ `G λ
{
#                                        »
Oi[
#      »
Y :lij
j
, #»x ]⇒ vi
i}
:: S
GI Γ `G v :: S
#                           »Γ `G vj :: Tij
j
Γ `G v Oi[
# »
B′
j
, #»vj
j] :: Ri
GEOi
FIGURE 9.11. Type-agnostic kind system for multi-kinded natural deduction terms.
378
Note that, the statement of a combination of evaluation strategies is not as clean
in the λlet natural deduction calculus as it was in the µµ˜ sequent calculus because
the term-heavy syntax of λlet makes it more indirect to talk about concepts such as
co-values.
Remark 9.1. It is worth considering if the η laws in Figure 9.7 really say all that needs
to be said about extensionality for data types. For example, the instance of the η law
for sum types A⊕B is
v : A⊕B =η⊕ case v of
ι1 (x)⇒ ι1 (x)
ι2 (y)⇒ ι2 (y)
and after all, there are apparently much stronger extensionality laws for sums, like
C[v : A⊕B] = case v of
ι1 (x)⇒ C[ι1 (x)]
ι2 (y)⇒ C[ι2 (y)]
which generalizes η⊕ so that the term v : A ⊕ B may appear in any context
C.1 Unfortunately, this strong sum η law is deeply troublesome when faced with
computational effects like nontermination. For example, if we apply the strong η law
for sums with C = λx.x  and v = Ω : A ⊕ B, where Ω is a term which loops
forever without returning a result, then the stronger η⊕ law is completely unsound
with respect to contextual equivalence, since
λz.z Ω 6∼= caseΩof ι1 (x)⇒ λz.z ι1 (x) | ι2 (y)⇒ λz.z ι2 (y) ∼= Ω
Thus, the exceptionally strong version of η⊕ only makes sense in a pure and normalizing
language, where everything terminates and all terms evaluate to some result.
Dealing with a strong extensionality law like this, which places very strict
requirements on the language like strong normalization that can only be deep
properties of the language as a whole, is difficult to handle directly. As an alternative
1This strong sum law is sometimes also written in terms of a substitution v′ {v/z} where v can
occur in many places instead of the context C[v] where v occurs in exactly one place, but these
amount to the same thing since C might be let z =  in v′ so that C[v] = let z = v in v′.
379
approach, Munch-Maccagnoni & Scherer (2015) propose the use of polarization to
tame the strong sum extensionality law to make it more manageable without loosing
anything important. In particular, the polarization hypothesis suggests that call-by-
value is the most fitting evaluation strategy for a data type like A ⊕ B, and so in
the strong η⊕ would be more appropriate to restrict the term in question to be a
(call-by-value) value as in:
C[V : A⊕B] = caseV of ι1 (x)⇒ C[ι1 (x)] | ι2 (y)⇒ C[ι2 (y)]
This equation does not suffer from the same troubles when effects like nontermination
are introduced: an infinitely looping term is never a value, so the above extensionality
law does not cause the same sort of counter-example. In other words, this restricted
version of the strong extensionality law for sums is sound even in the presence of
effects. And as Munch-Maccagnoni & Scherer note, if the language happens to be
strongly normalizing, then every (closed) term reduces to a value, anyway, and so the
unrestricted sum extensionality law can be derived after the fact as a deeper property
of the language.
So where does this leave our simplistic treatment of extensionality for data types
taken here? As it turns out, the strong sum law is derivable from the simplistic η⊕
law in the equational theory with the help of commuting conversions. First, note that
Munch-Maccagnoni & Scherer’s extensionality law for call-by-value sums is derivable
as follows:
caseV of
ι1 (x)⇒ C[ι1 (x)]
ι2 (y)⇒ C[ι2 (y)]
=letV caseV of
ι1 (x)⇒ let z = ι1 (x) inC[z]
ι2 (y)⇒ let z = ι2 (y) inC[z]
=ccV let z = caseV of
ι1 (x)⇒ ι1 (x)
ι2 (y)⇒ ι2 (y)
inC[z]
=η⊕ let z = V inC[z]
=letV C[V ]
380
Γ `ΘG v0 : A 0 Γ, x : A j `Θ,j:IxG v1 : A (j + 1)
Γ `ΘG λ{@ 0:Ix ⇒ v0 | @x (j+1):Ix ⇒ v1} : ∀Ix(A)
∀IxRrec
Θ `G ∃Ix(A) : S Γ `ΘG v : ∃Ix(A) Γ, x : A 0 `ΘG v0 : C Γ, x : A (j + 1) `Θ,j:IxG v1 : A j
Γ `ΘG loop v of 0:Ix @ x⇒ v0 | (j+1) : Ix @ x⇒ v1 : C
∃IxLrec
Γ, x : ∀<Ord(j, A) `Θ,j<NG v : A j
Γ `ΘG λ{@x j<N ⇒ v} : ∀<Ord(N,A)
∀<OrdRrec
FIGURE 9.12. The pure, recursive size abstractions in natural deduction.
This derivation makes two key observations about the call-by-value equational laws in
the core theory: (1) every context C is equivalent to a V co-value evaluation context
let z =  inC[z], and (2) this co-value commutes with a case expression. In total,
we can effectively transport any such context out of the branches of a case , leaving
just a trivial case expression that is handled by the simple η⊕ law. We can then
deduce that the simple βη equational theory for (co-)data types includes the strong
sum law from Munch-Maccagnoni & Scherer’s observation: so long as the language is
strongly normalizing, then the strong sum extensionality law is sound with respect to
contextual equivalence by the above derivation since every closed term is equivalent
to a value. End remark 9.1.
Well-founded recursion
To incorporate well-founded recursion into the natural deduction λlet -calculus, we
only need to allow for the same forms of recursively-defined types as in the µµ˜-calculus
in Chapter VI as well as the recursive abstraction over size indexes. Specifically, the
typing and rewriting rules for the extra recursive terms for size abstractions are shown
in Figures 9.12 and 9.13. Intuitively, the objects of ∀Ix(A) are stepwise loops that
can return any A N by counting up from 0 and using the previous instances of itself,
whereas we can write looping case expressions over values of ∃Ix(A) to count down from
any A N to 0. Similarly, values of ∀<Ord(N,A) are self-referential objects that always
behave the same no matter the number of recursive invocations. Curiously though, the
recursive forms for ∃<Ord(N,A) are conspicuously missing from the functional calculus.
In essence, the recursive form for ∃<Ord(N,A) is a case expression that introduces a
continuation variable for the recursive path out of the expression in addition to the
381
(β∀IxT ) λ
{
@ [0:Ix]⇒ v0
@x [j+1:Ix]⇒ v1
}
@ 0 
β
∀Ix
T
v0
(β∀IxT ) λ
{
@ [0:Ix]⇒ v0
@x [j+1:Ix]⇒ v1
}
@ [M + 1] 
β
∀Ix
T
letx = λ
{
@ [0:Ix]⇒ v0
@x [j+1:Ix]⇒ v1
}
@M
in v1 {M/j}
(β∃IxT )
loopV @ 0of
x@ 0:Ix ⇒ v0
x@ (j+1:Ix)⇒ v1
β∃Ix v0 {V/x}
(β∃IxT )
loopV @ (M + 1)of
x@ (0:Ix)⇒ v0
x@ (j+1:Ix)⇒ v1
β∃Ix
loop v1 {V/x,M/j}@M of
x@ (0:Ix)⇒ v0
x@ (j+1:Ix)⇒ v1
(ν∀<Ord) λ{@x [j<M ]⇒ v} ν∀<Ord λ{@ [i<M ]⇒ v {λ{x@ [j<i]⇒ v}/x}}
FIGURE 9.13. The β and ν laws for recursion.
normal return path, effectively requiring a form of subtraction type C − ∃<Ord(M,A)
for smaller indices M to be useful. If we were to try to come up with a pure looping
term for ∃<Ord(N,A) in the λlet , we would end up with something like
Θ `G ∃<OrdN,A : S Γ `ΘG v : ∃<Ord(N,A) Γ, x : A j `Θ,j<NG v′ : ∃<Ord(j, A)
Γ `ΘG loop v of x@ j<N ⇒ v′ : C
∃<OrdLrec
which is not a useful construct since there is no way to return a result of type C
from this term, making it unobservable. So while ∃<Ord can still be used to hide Ord
indices, its recursive nature lies outside the pure functional paradigm. This follows
the frequent situation where one of four classical principles gets lost in translation to
intuitionistic settings. It occurs with De Morgan laws (¬(A ∧ B) → (¬A) ∨ (¬B) is
not intuitionistically valid), the conjunctive and disjunctive connectives (` requires
multiple conclusions so it does not fit the pure mold), and here as well, and also in the
orthogonality semantics of programs from Chapter VII where the space orthogonal to
an intersection contains more than the union of the orthogonal parts.
Example 9.1. Intuitively, we can think of the values of ∀Ix(A) as a dependently typed
version of the recursion operator for natural numbers in Gödel’s (1980) System T.
382
Indeed, we can encode such an operator using deep co-pattern matching as:
rec : ∀X : Ix → R.X 0→ (∀i : Ix.X i→ X (i+ 1))→ ∀i : Ix.X i
rec = λ
@X · x · f @ (0:Ix) ⇒ x@X · x · f @r (j+1:Ix)⇒ f @ j · r

which desugars into the shallow co-patterns:
rec : ∀X : Ix → R.X 0→ (∀i : Ix.X i→ X (i+ 1))→ ∀i : Ix.X i
rec = ΛX.λx.λf.λ{@ (0:Ix)⇒ x | @r (j+1:Ix)⇒ f @ j · r}
So essentially, we are using the natural number index to drive the recursion upward to
compute some value, where the type of that returned value can depend on the number
of steps in the chosen index. In a call-by-name setting, where we choose a maximal
set of values so that V can be any term, then the behavior of rec implements the
recursor: given that rec X x f βN recX,x,f we have
recX,x,f @ 0→β∀IxN x recX,x,f @ (M + 1)→β∀IxN f @M · (recX,x,f @M)
Contra-posed, ∃Ix(A) implements a dependently-typed, stepwise recursion in the other
direction. The looping form breaks down a value depending on an arbitrary index N
by counting down until that index reaches 0, finally returning some value which does
not depend on the index. For instance, we can sum the values in any vector of numbers,
v : Vec(N,ANat), where Vec and Nat are declared as in Chapter VI, in accumulator
style by looping over the recursive structure ∃i : Ix .ANat⊗Vec(i,ANat):2
loopN @ ((0 @ Z), v)of
(0:Ix) @ (acc,Nil) ⇒ acc
(i+1:Ix) @ (acc,Cons(x, xs))⇒ (x+ acc, xs)
End example 9.1.
Example 9.2. Instead, values of ∀<Ord are useful for representing stronger induction
that recurses on deeply nested sub-structures. For example, we can convert a list
2Note, we assume an addition operator + : ANat→ ANat→ ANat.
383
x1, x2, . . . , xn, where List is declared the same as in Chapter VI, into a list of its
adjacency pairs (x1, x2), (x3, x4), . . . , (xn−1, xn) by
pairs Nil = Nil
pairs Cons(x, ys) = Nil
pairs Cons(x,Cons(y, zs)) = Cons((x, y), pairs zs)
where we silently drop the final element if the list is odd. The pairs function can be
straightforwardly encoded using ∀<Ord as:
pairs : ∀X : ?.∀i <∞. List(i,X)→ List(i,X ⊗X)
pairs = ΛX.λ

@r i<∞⇒ λx:List(i,X). casexsof
Nil⇒ Nil
Consj<i(x:X, ys:List(j,X))⇒ case ysof
Nil⇒ Nil
Consk<j(y:X, zs:List(k,X))⇒ Consk((x, y), (r @ k) · zs)

Note that the type of the recursive argument r is ∀i′ < i. List(i′, X)→ List(i′, X ⊗X).
Thus, the recursive self-invocation r @ k : List(k,X)→ List(k,X ⊗X) is well-typed,
since we learn that j < i and k < j by analyzing the Cons structure of the list, and
can infer that k < i by transitivity. End example 9.2.
Example 9.3. As a comparison between recursive data and co-data, consider the
following two alternative methods of encoding branching structure.
dataTree(i : Ix, X : S) : S by primitive recursion on i
where i = 0 Leaf : ` Tree(0, X) |
where i = j + 1 Branch : Tree(j,X) : S, X : S,Tree(j,X) : S ` Tree(j + 1, X) |
codataRiver(i : Ix, X : S) : S by primitive recursion on i
where i = j + 1 ForkL : | River(j + 1, X) ` River(j,X) : S
Curr : | River(j + 1, X) ` X : S
ForkR : | River(j + 1, X) ` River(j,X) : S
384
The first is the data type Tree(i,X) defined by primitive recursion on i that represents
balanced binary trees of exactly height i containing an X element in each internal
Branch node. The second is the co-data type River(i,X) defined by primitive recursion
on i that represents a branching stream which has a current element (accessed by
Curr), a left fork (accessed by ForkL) and a right fork (accessed by ForkR). The index
i in River(i,X) dictates how far we can flow down the forking river before it runs dry.
The two types are effectively different representations of the same information.
From a River(i,X) of a given depth i, we can grow a Tree(i,X) of the same height i
by flowing down each forking path in each branch as follows:
grow : ∀X : S.∀i : Ix.River(i,X)→ Tree(i,X)
grow = λ
@X @ (0:Ix) · s ⇒ Leaf@X @f (j+1:Ix) · s⇒ Branch (f · (s ForkL), s Curr, f · (s ForkR))

Conversely, we can chop down a Tree(i,X) of height i and have its elements flow down
a River(i,X) of the same depth i by traversing each branch, keeping the branch’s X
element as the current element of the river, and having the left and right sides of the
branch flow down the left and right forks of the river as follows:
chop : ∀X : S.∀i : Ix.Tree(i,X)→ River(i,X)
chop = λ

@X @f (j+1:Ix) · Branch(l, x, r) ForkL⇒ f · l
@X @f (j+1:Ix) · Branch(l, x, r) Curr ⇒ x
@X @f (j+1:Ix) · Branch(l, x, r) ForkL⇒ f · r

One way to interpret chop is that the ForkL, Curr, and ForkR observers from River
form the constructors of a zipper (Huet, 1997) into Trees. Indeed, in the µµ˜ sequent
calculus, chop would be written as
chop = µ

[X @ (j+1:Ix) @f Branch(l, x, r) · ForkL[β1]] . 〈f ||l · β1〉
[X @ (j+1:Ix) @f Branch(l, x, r) · Curr[α]] . 〈x||α〉
[X @ (j+1:Ix) @f Branch(l, x, r) · ForkR[β2]] . 〈f ||r · β2〉

where a River observation like ForkL[ForkR[ForkR[ForkL[Curr[α]]]]] is a first-class
co-data structure detailing a path for chop to descend into a tree.
385
Both of these objects above were written using deep pattern matching, which
we can desugar into shallow patterns as usual. The desugared version of grow is
not terribly surprising, but chop is more interesting. In particular, by spelling out
each case separately, it is clear that a 0 height tree is converted into a nullary river
λ{} : River(0, X) that does not have to respond to any observation, because there are
none in the 0 case.
grow = ΛX.λ
@ (0:Ix) ⇒ λs. Leaf@f (j+1:Ix)⇒ λs.Branch (f · (s ForkL), s Curr, f · (s ForkR))

chop = ΛX.λ

@ (0:Ix) ⇒ λt.λ{}
@f (j+1:Ix)⇒ λt. case tof Branch(l, x, r)⇒ λ

ForkL ⇒ f · l
Curr ⇒ x
ForkR⇒ f · r


End example 9.3.
Natural Deduction versus Sequent Calculus
The natural deduction based λlet is heavily influenced by language features that
have been developed in the µµ˜ sequent calculus, but how do the two really compre? For
example, µµ˜ allows for the first-class treatment of control flow with its µ-abstractions,
which are entirely missing from λlet , so clearly the intuitionistic λlet cannot directly
correspond to all of the classical µµ˜. Instead, the λlet -calculus corresponds to a well-
established subset of µµ˜ based off Gentzen’s (1935a) intuitionistic LJ sequent calculus.
LJ is exactly the same as LK that we saw in Chapter III, except that every sequent
is limited to exactly one consequence. In terms of the µµ˜-calculus, this corresponds a
restriction to forcing exactly one free co-variable in every command and co-term (which
have no active consequence), and zero free co-variables in every term (which already
have an active consequence). It turns out this restriction is enough to establish a static
and dynamic correspondence between the λlet -calculus based on natural deduction
and the single-consequence µµ˜-calculus based on the sequent calculus.
First, we consider how to translate between expressions of λlet and µµ˜. The
canonical, compositional translations are given in Figure 9.14. The translation LJ JvK
386
converts λlet terms into µµ˜ terms and LJ JF K converts λlet frames into µµ˜ co-terms.
Variables and introduction forms (i.e. data structures and co-data objects) translate
directly to similar forms in the sequent calculus, whereas let and elimination forms
(i.e. data case expressions and co-data observations) require the help of a µ and cut
in the sequent calculus to represent the elimination. Going the other way, the NJ J K
family of translations convert each µµ˜ command and (co-)term into a λlet term. Also
note that for commands and co-terms, NJ JcKα and NJ JeKα denotes the single free
co-variable found in c and e. Terms translate directly to similar forms in natural
deduction, whereas co-terms translate to contexts, with the singled-out co-variable α
being the empty context, µ˜-abstractions being a let context, and the left co-terms for
(co-)data being elimination forms of (co-)data types. Finally, a cut 〈v||e〉 is translated
as plugging the translation of v, which is a term, into the translation of e, which is a
context. Finally, we also extend the translation to substitution strategies pointwise,
by saying that the values and co-values of a strategy in one language translate to the
values and co-values of the other, which we close under reduction. In other words,
given a λlet substitution strategy T = (ValueT ,CoValueT ), then the translation of T
to µµ˜ is
LJ JT K = ({v ∈ Term | ∃V ∈ ValueT ,LJ JV K  v},
{e ∈ CoTerm | ∃E ∈ CoValueT ,LJ JEK  e})
and similarly in the other direction, given a µµ˜ substitution strategy S =
(ValueS ,CoValueS), the translation of S is
NJ JSK = ({v ∈ Term | ∃V ∈ ValueS ,LJ JV K  v},
{F ∈ FrameCxt | ∃E,α ∈ CoValueS ,∀v ∈ Term,LJ JEKα[v] F [v]})
We can check that these translations preserve the semantics of the two languages.
In terms of the static semantics, the two translations preserve types on the nose. That
is to say, if we have a term v : A in λlet then LJ JvK : A with exactly the same type
A in µµ˜, and if v : A in µµ˜ then NJ JvK : A in λlet . Note that this shows that the
translations are direct, instead of indirect. For example, continuation-passing style
(CPS) translations, which are an alternative to the kinds of translations in Figure 9.14,
alter the interface of terms in a way seen in their types. That is to say, a CPS translation
387
Natural deduction to sequent calculus:
LJ JxK , x
LJ Jletx = v in v′K , µα. 〈LJ JvK||µ˜x. 〈LJ Jv′K||α〉〉
LJ
r
K( #»B, #»v )
z
, K
#»
B (
#          »
LJ JvK)
LJ
s
case v′ of
#                                »
K( #   »Y :l, #»x )⇒ v
{
, µα.
〈
LJ Jv′K∣∣∣∣∣∣∣∣µ˜[ #                                           »K #   »Y :l( #»x ).〈LJ JvK||α〉]〉
LJ
s
λ
{
#                               »
O[ #   »Y :l, #»x ]⇒ v
}{
, µ
(
#                                                »
O
#   »
Y :l [ #»x , α].〈LJ JvK||α〉i)
LJ
r
v′ O[ #»B, #»v ]
z
, µα.
〈
LJ Jv′K∣∣∣∣∣∣O #»B [ #          »LJ JvK , α]〉
LJ JKα , α
LJ Jletx = F in vKα , LJ JF Kα {µ˜x. 〈LJ JvK||α〉/α}
LJ
s
caseF of
#                                »
K( #   »Y :l, #»x )⇒ v
{
α
, LJ JF Kα {µ˜[ #                                           »K #   »Y :l( #»x ).〈LJ JvK||α〉]/α}
LJ
r
F O[ #»B, #»v ]
z
α
, LJ JF Kα {O #»B [LJ J #»v K, α]/α}
Sequent calculus to natural deduction:
NJ J〈v||e〉Kα , NJ JeKα[NJ JvK]
NJ JxK , x
NJ Jµα.cK , NJ JcKα
NJ
r
K
#»
B ( #»v )
z
, K( #»B, #           »NJ JvK)
NJ
s
µ
( #                        »
O
#   »
Y :l [ #»x , α].c
){
, λ
{
#                                              »
O[ #   »Y :l, #»x ]⇒ NJ JcKα}
NJ JαKα , 
NJ Jµ˜x.cKα , letx =  inNJ JcKα
NJ
r
O
#»
B [ #»v , e]
z
α
, NJ JeKα[ O[ #»B, #           »NJ JvK ]]
NJ
s
µ˜
[ #                   »
K
#   »
Y :l( #»x ).c
]{
α
, caseof
#                                               »
K( #   »Y :l, #»x )⇒ NJ JcKα
FIGURE 9.14. Translations between λlet and single-consequence µµ˜.
388
may convert a term of type Int to a term of type ¬¬ Int, whereas here a term of type
Int would be converted to another term of type Int.
Theorem 9.1 (LJ J K preserves types). For any λlet derivations of Θ `G C : T and
Γ `ΘG v : A, there is a single-consequence µµ˜ derivation of Γ `ΘG LJ JvK : A | .
Proof. By induction on the given λlet derivation. For example, the translation of the
Let rule is:
.... IH
Γ `ΘG LJ JvK : A | Θ `G A : S
.... IH
Γ, x : A `ΘG LJ Jv′K : C | Θ `G C : T Θ | α : C : α `G C VL
〈LJ Jv′K||α〉 : (Γ, x : A `ΘG α : C) Cut
Γ | µ˜x. 〈LJ Jv′K||α〉 : A `ΘG α : C AL
〈LJ JvK||µ˜x. 〈LJ Jv′K||α〉〉 : (Γ,Γ `ΘG α : C) Cut
〈LJ JvK||µ˜x. 〈LJ Jv′K||α〉〉 : (Γ `ΘG α : C) CL
Γ `ΘG µα. 〈LJ JvK||µ˜x. 〈LJ Jv′K||α〉〉 : C | AR
Where the additional premise that Θ `G C : T is satisfied by assumption. The other
cases follow similarly.
Theorem 9.2 (NJ J K preserves types).
a) For any single-consequence µµ˜ derivation of Γ `ΘG v : A | , there is a λlet
derivation of Γ `ΘG NJ JvK : A.
b) For any single-consequence µµ˜ derivation of c :
(
Γ `ΘG α : A
)
, there is a λlet
derivation of Γ `ΘG NJ JcKα : A.
c) For any single-consequence µµ˜ derivation of Γ | e : A `ΘG α : B, λlet derivation
of Γ `ΘG v : A, and derivation of Θ `G A : S, there is λlet derivation of
Γ `ΘG NJ JeKα[v] : B.
Proof. By mutual induction on the given µµ˜ derivations. For example, the AL rule is
translated as
Γ `ΘG v : A Θ `G A : S
.... IH
Γ, x : A `ΘG NJ JcKα : B
Γ `ΘG letx = v inNJ JcKα : A Let
The rest of the cases are similar.
389
In terms of the dynamic semantics, the two translations form an equational
correspondence (Sabry & Felleisen, 1992) between the two languages. Recall from
Chapter V that an equational correspondence between λlet and µµ˜ means that the
equations between expressions in the two languages are preserved by both translations,
and the translations are inverses of one another up to the respective equational theories.
This gives us a one-for-one correspondence between the two languages—two terms are
equal in λlet if and only if their translation is equal in µµ˜, and so on. However,
the correspondence does not hold for all instances of the parametric equational
theories; only certain pairs of substitution strategies correspond to one another. More
specifically, besides the fact that a call-by-value strategy cannot correspond to a
call-by-name one, the correspondence holds for similar strategies which are strongly
focalizing. The idea of focalization can be translated to natural style as follows.
Definition 9.1 (Focalizing strategy). A λlet substitution strategy T is focalizing if
and only if
– variables are values and the empty context is a co-value (as assumed to hold for
all strategies),
– data structures built from values and co-data objects are values (i.e. K( #»A, #»V )
and λ
{
#                                 »
O[ #     »X:k, #»x ]⇒ v
}
are values), and
– case analysis on data and co-data observations built from values are co-values
(i.e. caseof
#                                  »
K( #     »X:k, #»x )⇒ v and  O[ #»A, #»V ] are co-values).
But it turns out this still rather loose criteria of focalization isn’t enough. We
need to impose a stronger focalization criteria which makes sure that the focusing ς
rules can always make progress to a (co-)value, and in natural deduction, co-value can
be decomposed into smaller co-values.
Definition 9.2 (Strong focalization). A µµ˜ substitution strategy S is strongly
focalizing if and only if every (co-)term is either a (co-)value or a ςS redex.
A λlet substitution strategy T is strongly focalizing if and only if every term is
either a value or a ςT redex, and if E[F ] is a co-value then so is F .
Note how unlike terms, the correspondence involving co-terms is rather implicit,
since the natural deduction based λlet has no syntactic notion of co-terms, only the
meta-syntactic notion of contexts. With the idea of strong focalization, we establish the
390
correspondence between co-terms in the µµ˜-calculus and frame contexts in the λlet -
calculus. That is to say, going from sequent calculus to natural deduction, co-terms
translate into frame contexts.
Lemma 9.1 (Framing co-terms). For all e in the single-consequence µµ˜-calculus, if
α ∈ FV (e) then NJ JeKα ∈ FrameCxt.
Proof. By induction on the syntax of co-terms. The base cases for co-variables, input
abstractions, and case abstractions are immediate, while the case for co-data structures
follows from the inductive hypothesis and the fact that FrameCxts compose.
Whereas going from natural deduction to the sequent calculus, co-value contexts
are factored out and revealed as co-value co-terms, but only for strongly focalizing
λlet substitution strategies. Note that ÏR denotes the reflexive-transitive (but not
compatible) closure of the rewrite relation R.
Lemma 9.2 (Factoring co-values). For all strongly focalizing λlet -calculus
substitution strategies T , λletT -calculus co-values E, terms v, and co-variables α,
〈LJ JE[v]K||E ′〉 ÏµLJJT K 〈LJ JvK||LJ JEKα {E ′/α}〉 for all µµ˜LJJT K-calculus co-values E ′.
Proof. By induction on the syntax of CoValueT as a subset of FrameCxt:
– : 〈LJ Jv′K||E ′〉 = 〈LJ Jv′K||α {E ′/α}〉 = 〈LJ Jv′K||LJ JKα {E ′/α}〉
– letx = E in v:
〈LJ Jletx = E[v′] in vK||E ′〉 µLJJT K 〈LJ JE[v′]K||µ˜x. 〈LJ JvK||E ′〉〉
ÏIH 〈LJ Jv′K||LJ JEKα {µ˜x. 〈LJ JvK||E ′〉/α}〉
= 〈LJ Jv′K||LJ Jletx = E in vKα {E ′/α}〉
which follows since letx =  in v ∈ CoValueT because T is strongly focalizing.
391
– caseE of
#                                »
K( #   »Y :l, #»x )⇒ v:
〈
LJ
s
caseE[v′]of
#                                »
K( #   »Y :l, #»x )⇒ v
{∣∣∣∣∣
∣∣∣∣∣E ′
〉
µLJJT K
〈
LJ JE[v′]K∣∣∣∣∣∣∣∣µ˜[ #                »K #   »Y :l( #»x ).〈LJ JvK||E ′〉]〉
ÏIH
〈
LJ Jv′K∣∣∣∣∣∣∣∣LJ JEKα {µ˜[ #                »K #   »Y :l( #»x ).〈LJ JvK||E ′〉]/α}〉
=
〈
LJ Jv′K∣∣∣∣∣
∣∣∣∣∣LJ
s
caseE of
#                                »
K( #   »Y :l, #»x )⇒ v
{
α
{E ′/α}
〉
which follows since caseof
#                                »
K( #   »Y :l, #»x )⇒ v in v ∈ CoValueT because T is
strongly focalizing.
– E O[ #»B, #»v ]:
〈
LJ
r
E[v′] O[ #»B, #»v ]
z∣∣∣∣∣∣∣∣E ′〉 µLJJT K 〈LJ JE[v′]K∣∣∣∣∣∣O #»B [ #          »LJ JvK , E ′]〉
ÏIH
〈
LJ Jv′K∣∣∣∣∣∣LJ JEKα {O #»B [ #          »LJ JvK , E ′]/α}〉
=
〈
LJ Jv′K∣∣∣∣∣∣∣∣LJrE O[ #»B, #»v ]z
α
{E ′/α}
〉
which follows since  O[ #»B, #»v ] ∈ CoValueT because T is strongly focalizing.
By making the correspondence between the different forms of co-values in the
two languages more concrete, we can demonstrate the first part of the equational
correspondence that the two language’s rewriting rules are sound under translation
with respect to each other.
Lemma 9.3 (Soundness of LJ J K). For any strongly focalizing λlet substitution
strategy T and pure declarations G, if v R v′ then LJ JvK LJ JRK LJ Jv′K where
the translation of the rewrite relation R is defined by cases on R as:
LJ JletT K ,→µ˜LJJT K→ηµ LJ JηletT K , =ηµ˜µSηµ LJ JccT K , =µLJJT Kηµ
LJ
qβT y ,→βLJJT K→ηµ LJrν∀<Ordz , ν∀<Ord
LJ
qβFy ,→βF ← µLJJT K LJq≺ηFy ,← ηFηµµLJJT K
392
Proof. Note that substitution commutes with translation (i.e. LJ JvK {LJ JV K/x} =
LJ Jv {V/x}K and LJ JvK {A/X} = LJ Jv {A/X}K). Soundness follows by cases on the
rewrite rule R, where we let S = LJ JT K.
– letT :
LJ Jletx = V in vK = µα. 〈LJ JV K||µ˜x. 〈LJ JvK||α〉〉
→µ˜S µα. 〈LJ JvK {LJ JV K/x}||α〉
→ηµ LJ JvK {LJ JV K/x}
= LJ Jv {V/x}K
– ηletT :
LJ Jletx = v inE[x]K = µα. 〈LJ JvK||µ˜x. 〈LJ JE[x]K||α〉〉
µS µα. 〈LJ JvK||µ˜x. 〈x||LJ JEKα〉〉 (Lemma 9.2)
→ηµ˜ µα. 〈LJ JvK||LJ JEKα〉
← µS µα. 〈LJ JE[v]K||α〉 (Lemma 9.2)
→ηµ LJ JE[v]K
– ccT :
LJ JE[letx = v′ in v]K
←ηµ µα. 〈LJ JE[letx = v′ in v]K||α〉
µS µα. 〈LJ Jletx = v′ in vK||LJ JEKα〉 (Lemma 9.2)
= µα. 〈µβ. 〈LJ Jv′K||µ˜x. 〈LJ JvK||β〉〉||LJ JEKα〉
→µS µα. 〈LJ Jv′K||µ˜x. 〈LJ JvK||LJ JEKα〉〉
← µS µα. 〈LJ Jv′K||µ˜x. 〈LJ JE[v]K||α〉〉 (Lemma 9.2)
= LJ Jletx = v′ inE[v]K
– βT for data types:
LJ
s
case K( #»B, #»V )of
#                                   »
Ki(
#   »
Y :l, #»x )⇒ vi
i
{
393
= µα.
〈
K
#»
B
i (
#            »
LJ JV K)∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                            »
K
#   »
Y :l
i ( #»x ).〈LJ JviK||α〉i]〉
→βS µα.
〈
LJ JviK { #       »B/Y , #                  »LJ JV K/x}∣∣∣∣∣∣α〉
→ηµ LJ JviK { #       »B/Y , #                  »LJ JV K/x}
= LJ
r
vi
{ #       »
B/Y ,
#     »
V/x
}z
– βT for co-data types:
LJ
s
λ
{
#                                  »
Oi[
#   »
Y :l, #»x ]⇒ vi
i
}
Oi[
#»
B,
#»
V ]
{
= µα.
〈
µ
(
#                                                  »
O
#   »
Y :l
i [ #»x , α].〈LJ JviK||α〉i)∣∣∣∣∣
∣∣∣∣∣O #»Bi [ #            »LJ JV K , α]
〉
→βS µα.
〈
LJ JviK { #       »B/Y , #                  »LJ JV K/x}∣∣∣∣∣∣α〉
→ηµ LJ JviK { #       »B/Y , #                  »LJ JV K/x}
= LJ
r
vi
{ #       »
B/Y ,
#     »
V/x
}z
– ν[∀<Ord]:
LJ Jλ{x@ j<N ⇒ v}K
= µ([j<N @x α].〈LJ JvK||α〉)
ν∀<Ord µ([i<N @ α].〈LJ JvK {µ([j<i@x α].〈LJ JvK||α〉)/x}||α〉)
= LJ Jλ{@ i<N ⇒ v {λ{x@ j<i⇒ v}/x}}K
– βF for a data type F:
LJ
s
case K( #»B,
#»
v′ )of
#                                   »
Ki(
#   »
Y :l, #»x )⇒ vi
i
{
= µα.
〈
K
#»
B
i (
#            »
LJ Jv′K)∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                            »
K
#   »
Y :l
i ( #»x ).〈LJ JviK||α〉i]〉
→βF µα.
〈 #            »
LJ Jv′K ∣∣∣∣∣∣µ˜ #»x . 〈LJ JviK { #       »B/Y }∣∣∣∣∣∣α〉〉
← µS LJ
r
let
#          »
x = v′ in vi
{ #       »
B/Y
}z
394
– βG for a co-data type G:
LJ
s
λ
{
#                                  »
Oi[
#   »
Y :l, #»x ]⇒ vi
i
}
Oi[
#»
B,
#»
v′ ]
{
= µα.
〈
µ
(
#                                                  »
O
#   »
Y :l
i [ #»x , α].〈LJ JviK||α〉i)∣∣∣∣∣
∣∣∣∣∣O #»Bi [ #            »LJ Jv′K , α]
〉
→βG µα.
〈 #            »
LJ Jv′K ∣∣∣∣∣∣µ˜ #»x . 〈LJ JviK { #       »B/Y }∣∣∣∣∣∣α〉〉
← µS LJ
r
let
#          »
x = v′ in vi
{ #       »
B/Y
}z
– ηF for a data type F:
LJ
s
case v of
#                                                         »
Ki(
#   »
Y :l, #»x )⇒ Ki( #   »Y :l, #»x )
i
{
= µα.
〈
LJ JvK∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                                                 »
K
#   »
Y :l
i ( #»x ).
〈
K
#   »
Y :l
i ( #»x )
∣∣∣∣∣∣α〉i]〉
→ηG µα. 〈LJ JvK||α〉
→ηµ LJ JvK
– ηG for a co-data type G:
LJ
s
λ
{
#                                                             »
Oi[
#   »
Y :l, #»x ]⇒ y Oi[ #   »Y :l, #»x ]
i
}{
= µ
(
#                                                                                 »
O
#   »
Y :l
i [ #»x , α].
〈
µα.
〈
y
∣∣∣∣∣∣O #   »Y :li [ #»x , α]〉∣∣∣∣∣∣α〉i
)
→µS µ
(
#                                                           »
O
#   »
Y :l
i [ #»x , α].
〈
y
∣∣∣∣∣∣O #   »Y :li [ #»x , α]〉i
)
→ηG y = LJ JyK
Lemma 9.4. For any S and T = NJ JSK, and any c and e in the single-consequence
µµ˜-calculus,
a) NJ JEKα[NJ JcKβ]ccT NJ Jc {E/β}Kα, and
b) NJ JEKα[NJ JeKβ[NJ JvK]]ccT NJ Je {E/β}Kα[v].
395
Proof. By mutual induction on the syntax of commands and co-terms:
NJ JEKα[NJ J〈v||e〉Kβ] = NJ JEKα[NJ JeKβ[NJ JvK]
IH NJ Je {E/β}Kα[NJ JvK]
= NJ J〈v||e〉 {E/β}Kα
NJ JEKα[NJ JβKβ[v′]] = NJ JEKα[v′]
NJ JEKα[NJ Jµ˜x.cKβ[v′]] = NJ JEKα[letx = v′ inNJ JcKβ]
→ccT letx = v′ inNJ JEKα[NJ JcKβ]
IH letx = v′ inNJ Jc {E/β}Kα
= NJ Jµ˜x.c {E/β}Kα[v′]
NJ JEKα [NJsµ˜[ #                    »K #   »Y :l( #»x ).c]{
β
[v′]
]
= NJ JEKα [case v′ of #                                               »K( #   »Y :l, #»x )⇒ NJ JcKβ]
→ccT case v′ of
#                                                                       »
K( #   »Y :l, #»x )⇒ NJ JEKα[NJ JcKβ]
IH case v′ of
#                                                                  »
K( #   »Y :l, #»x )⇒ NJ Jc {E/β}Kα
= NJ
s
µ˜
[
#                                     »
KY :l( #»x ).c {E/β}
]{
α
[v′]
NJ JEKα [NJrO #»B [ #»v , e]z
β
[v′]
]
= NJ JEKα [NJ JeKβ [v′ O[ #»B, #           »NJ JvK ]]]
IH NJ Je {E/β}Kα [v′ O[ #»B, #           »NJ JvK ]]
= NJ
r
O
#»
B [ #»v , e {E/β}]
z
α
[v′]
Lemma 9.5 (Soundness of NJ J K). For any strongly focalizing substitution strategy
S and pure declarations G in the single-consequence µµ˜-calculus,
a) if v is ς-normal and v R v′ then NJ JvK NJ JRK NJ Jv′K,
b) if c is ς-normal and c R c′ then NJ JcKα NJ JRK NJ Jc′Kα,
c) if e is ς-normal and e R e′ then NJ JeKα[v] NJ JRK NJ Je′Kα[v] for all v,
396
where the translation of the rewrite relation R is defined by cases on R as:
NJ JµSK , ccNJJSK NJ Jµ˜SK , letNJJSK
NJ
qηµy , =α NJqηµ˜y , ηletNJJSK letNJJSK
NJ
qβSy ,→βNJJSKccNJJSK NJrν∀<Ordz , ν∀<Ord
NJ
qβFy ,→βF NJq≺ηFy , ≺ηF
Proof. Note that S-values are translated to NJ JSK-values and substitution for value
and type variables commutes with translation (i.e. NJ JvK {NJ JV K/x} = NJ Jv {V/x}K,
etc.). Soundness follows by cases on the rewrite rule R, where we let T = NJ JSK.
– µS :
NJ J〈µβ.c||E〉Kα = NJ JEKα[NJ JcKβ]ccT NJ Jc {E/β}Kα (Lemma 9.4)
– µ˜S :
NJ J〈V ||µ˜x.c〉Kα = letx = NJ JV K inNJ JcKα
letT NJ JcKα {NJ JV K/x}
= NJ Jc {V/x}Kα
– ηµ: NJ Jµα. 〈v||α〉K = NJ JαKα[NJ JvK] = NJ JvK
– ηµ˜: by assuming ς-normality, there are only two cases
NJ Jµ˜x. 〈x||E〉Kα[v] = letx = v inNJ JEKα[v] ηletT NJ JEKα[v]
NJ Jµ˜x. 〈x||µ˜y.c〉Kα[v] = letx = v in let y = x inNJ JcKα
letT letx = v inNJ JcKα {x/y}
=α let y = v inNJ JcKα
= NJ Jµ˜y.cKα[v]
397
– βS for data types:
NJ
s〈
K
#»
B
i (
#»
V )
∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                     »
K
#   »
Y :l
i ( #»x ).ci
i
]〉{
α
= case Ki(
#»
B,
#            »
NJ JV K)of #                                                   »Ki( #   »Y :l, #»x )⇒ NJ JciKαi
βT NJ JciKα { #       »B/Y ,NJ JV K/x}
= NJ
r
ci
{ #       »
B/Y , V/x
}z
α
– βS for co-data types:
NJ
s〈
µ
(
#                          »
O
#   »
Y :l
i [ #»x , β].ci
i
)∣∣∣∣∣
∣∣∣∣∣O #»Bi [ #»V ,E]
〉{
α
= NJ JEKα [λ{ #                                                  »Oi[ #   »Y :l, #»x ]⇒ NJ JciKβi Oi[ #»B, #            »NJ JV K ]}]
→βT NJ JEKα [NJ JciKβ { #       »B/Y , #                   »NJ JV K/x}]
= NJ JEKα [NJrci { #       »B/Y , #     »V/x}z
β
]
ccT NJ
r
ci
{ #       »
B/Y ,
#     »
V/x,E/β
}z
α
(Lemma 9.4)
– ν∀<Ord :
NJ Jµ(j<N @x α.c)K = λ{x@ j<N ⇒ NJ JcKα}
ν∀<Ord λ{@ i<N ⇒ NJ JcKα {λ{x@ j<i⇒ NJ JcKα}/x}}
= NJ Jµ(i<N @ α.c {µ(j<i@x α.c)/x})K
– βF for a data type F:
NJ
s〈
K
#»
B
i ( #»v )
∣∣∣∣∣
∣∣∣∣∣µ˜
[
#                     »
K
#   »
Y :l
i ( #»x ).ci
i
]〉{
α
= case Ki(
#»
B,
#           »
NJ JvK)of #                                                   »Ki( #   »Y :l, #»x )⇒ NJ JciKαi
βF let
#                      »
x = NJ JvK inNJ JciKα { #       »B/Y }
= NJ
r〈
#»v
∣∣∣∣∣∣µ˜ #»x .ci { #       »B/Y }〉z
α
398
– βG for a co-data type G:
NJ
s〈
µ
(
#                          »
O
#   »
Y :l
i [ #»x , β].ci
i
)∣∣∣∣∣
∣∣∣∣∣O #»Bi [ #»v , e]
〉{
α
= NJ JeKα [λ{ #                                                  »Oi[ #   »Y :l, #»x ]⇒ NJ JciKβi Oi[ #»B, #           »NJ JvK ]}]
→βG NJ JeKα [let #                      »x = NJ JvK inNJ JciKβ { #       »B/Y }]
= NJ
r〈
µβ.
〈
#»v
∣∣∣∣∣∣µ˜ #»x .ci { #       »B/Y }〉∣∣∣∣∣∣e〉z
α
– ηF for a data type F:
NJ
s
µ˜
[
#                                                 »
K
#   »
Y :l
i ( #»x ).
〈
K
#   »
Y :l
i ( #»x )
∣∣∣∣∣∣α〉i]{
α
[v] = case v of
#                                                         »
Ki(
#   »
Y :l, #»x )⇒ Ki( #   »Y :l, #»x )
i
ηF v = NJ JαKα[v]
– ηG for a co-data type G:
NJ
s
µ
(
#                                                           »
O
#   »
Y :l
i [ #»x , α].
〈
y
∣∣∣∣∣∣O #   »Y :li [ #»x , α]〉i
){
= λ
{
#                                                             »
Oi[
#   »
Y :l, #»x ]⇒ y Oi[ #   »Y :l, #»x ]
i
}
ηG y = NJ JyK
The second part of the equational correspondence is to demonstrate that the two
translations are inverses of one another, meaning that every round-trip translation is
equal to the starting point. This round-trip equality does not depend on the focalization
properties of substitution strategies, but instead assumes that we work with pre-
focused, ς-normal expressions. The reason for this assumption is to midigate the
subtle difference between co-terms like v · e in µµ˜ and frame contexts like F [ · v] in
λlet . In particular, due to the ordering imposed by the µµ˜ ς rules, the term v has the
first priority in v · e and e is lifted out and evaluated second. In λlet , however, the
context F always has implicit priority F [ · v] if F is not an evaluation context, and
v only gets discovered and evaluated when it lands in the eye of an evaluation context.
By ς-normalizing beforehand, this difference in discovery and evaluation is eliminated
entirely.
Lemma 9.6 (Natural deduction round trip). For any T , if v is ς-normal then
NJ JLJ JvKK = v.
399
Proof. By induction on the syntax of terms.
– x: NJ JLJ JxKK = x
– letx = v′ in v:
NJ JLJ Jletx = v′ in vKK = NJ Jµα. 〈LJ Jv′K||µ˜x. 〈LJ JvK||α〉〉K
= letx = NJ JLJ Jv′KK inNJ JLJ JvKK
=IH letx = v′ in v
– K( #»B, #»V ): NJ
r
LJ
r
K( #»B, #»V )
zz
= K( #»B, #                         »NJ JLJ JV KK) =IH K( #»B, #»V )
– case v′ of
#                                »
K( #   »Y :l, #»x )⇒ v:
NJ
s
LJ
s
case v′ of
#                                »
K( #   »Y :l, #»x )⇒ v
{{
= NJ
s
µα.
〈
LJ Jv′K∣∣∣∣∣∣∣∣µ˜[ #                                           »K #   »Y :l( #»x ).〈LJ JvK||α〉]〉{
= caseNJ JLJ Jv′KK of #                                                          »K( #   »Y :l, #»x )⇒ NJ JLJ JvKK
=IH case v′ of
#                                »
K( #   »Y :l, #»x )⇒ v
– λ
{
#                               »
O[ #   »Y :l, #»x ]⇒ v
}
:
NJ
s
LJ
s
λ
{
#                               »
O[ #   »Y :l, #»x ]⇒ v
}{{
= NJ
s
µ
( #                                                »
O
#   »
Y :l [ #»x , α].〈LJ JvK||α〉){
= λ
{
#                                                         »
O[ #   »Y :l, #»x ]⇒ NJ JLJ JvKK}
=IH λ
{
#                               »
O[ #   »Y :l, #»x ]⇒ v
}
– v′ O[ #»B, #»V ]:
NJ
r
LJ
r
v′ O[ #»B, #»V ]
zz
= NJ
r
µα.
〈
LJ Jv′K∣∣∣∣∣∣O #»B [ #»V , α]〉z
= NJ JLJ Jv′KK O[ #»B, #                         »NJ JLJ JV KK ]
=IH v′ O[
#»
B,
#»
V ]
400
Lemma 9.7 (Sequent calculus round trip). For any S in the single-consequence,
higher-order µµ˜-calculus,
a) if v is ς-normal then LJ JNJ JvKK =µSηµ v,
b) if c is ς-normal then LJ JNJ JcKαK =µSηµ µα.c,
c) if e is ς-normal then for all v, LJ JNJ JeKα[v]K =µSηµ µα. 〈LJ JvK||e〉
Proof. My (mutual) induction on the syntax of commands, terms, and co-terms.
– x: LJ JNJ JxKK = x
– µα.c: LJ JNJ Jµα.cKK = LJ JNJ JcKαK =IH µα.c
– K
#»
B ( #»V ): LJ
r
NJ
r
K
#»
B ( #»V )
zz
= K
#»
B (
#                         »
LJ JNJ JV KK) =IH K #»B ( #»V )
– µ
(
#                        »
OY :l[ #»x , α].c
)
:
LJ
s
NJ
s
µ
(
#                        »
OY :l[ #»x , α].c
){{
= µ
( #                                                                 »
O
#   »
Y :l [ #»x , α].
〈
LJ JNJ JcKαK∣∣∣∣∣∣α〉)
=IH µ
( #                                            »
O
#   »
Y :l [ #»x , α].〈µα.c||α〉
)
=µS µ
( #                        »
O
#   »
Y :l [ #»x , α].c
)
– 〈v||e〉:
LJ JNJ J〈v||e〉KαK = LJ JNJ JeKα[NJ JvK]K =IH µα. 〈LJ JNJ JvKK∣∣∣∣∣∣e〉 =IH µα. 〈v||e〉
– α: LJ JNJ JαKα[v]K = LJ JvK =ηµ µα. 〈LJ JvK||α〉
– µ˜x.c:
LJ JNJ Jµ˜x.cKα[v]K = LJ Jletx = v inNJ JcKαK
= µα.
〈
LJ JvK∣∣∣∣∣∣µ˜x. 〈LJ JNJ JcKαK∣∣∣∣∣∣α〉〉
=IH µα. 〈LJ JvK||µ˜x. 〈µα.c||α〉〉
=µS µα. 〈LJ JvK||µ˜x.c〉
401
– µ˜
[ #                   »
K
#   »
Y :l( #»x ).c
]
:
LJ
s
NJ
s
µ˜
[ #                   »
K
#   »
Y :l( #»x ).c
]{
α
[v]
{
= LJ
s
case v of
#                                               »
K( #   »Y :l, #»x )⇒ NJ JcKα{
= µα.
〈
LJ JvK∣∣∣∣∣∣∣∣µ˜[ #                                                           »KY :l( #»x ).〈LJ JNJ JcKαK∣∣∣∣∣∣α〉]〉
=IH µα.
〈
LJ JvK∣∣∣∣∣∣∣∣µ˜[ #                                      »KY :l( #»x ).〈µα.c||α〉]〉
=µS µα.
〈
LJ JvK∣∣∣∣∣∣∣∣µ˜[ #                   »KY :l( #»x ).c]〉
– O
#»
B [ #»V ,E]:
LJ
r
NJ
r
O
#»
B [ #»V ,E]
z
α
[v]
z
= LJ
r
NJ JEKα[v O[ #»B, #            »NJ JV K ]]z
=IH µα.
〈
LJ
r
v O[ #»B, #            »NJ JV K ]z∣∣∣∣∣∣∣∣E〉
= µα.
〈
µα.
〈
LJ JvK∣∣∣∣∣∣O #»B [ #                         »LJ JNJ JV KK , α]〉∣∣∣∣∣∣E〉
=IH µα.
〈
µα.
〈
LJ JvK∣∣∣∣∣∣O #»B [ #»V , α]〉∣∣∣∣∣∣E〉
=µS µα.
〈
LJ JvK∣∣∣∣∣∣O #»B [ #»V ,E]〉
With the soundness of equations and the round-trip equalities established for
ς-normal forms, we already have an equational correspondence between the ς-normal
sub-languages of λlet and µµ˜. This correspondence can be generalized to the full
languages by noting that ς-normalization commutes with all other rewriting rules.
Theorem 9.3 (Parametric equational correspondence). For any stable substitution
strategies S and T = NJ JSK, the λletT -calculus is in equational correspondence with
the single-consequence µµ˜S-calculus.
Proof. First, note that there is an equational correspondence between the ς-normal
sub-languages of the two calculi. That the equations are sound with respect to
translation of ς-normal forms follows from Lemmas 9.3 and 9.5 since the translations
are compositional, so compatibility holds automatically (Downen & Ariola, 2014a).
That the translations are inverses on ς-normal forms up to the equational theory is
shown in Lemmas 9.6 and 9.7.
Additionally, note that ς reduction is strongly normalizing (because each reduction
strictly decreases the number of non-(co-)values in a data structure or observation),
402
so every expression has a ς-normal form. Also note that because of stability, ς-
normalization commutes with all other reductions: if v ς v1 6→ς and v →R v2
then v1 R v′ and v2 ς v′ for some v′, and similar for commands and co-terms.
This follow by induction on the syntax of (co-)terms and commands, where the only
interest case is when a lifted non-(co-)value reduces to a (co-)value, which is undone by
µS µ˜Sηµηµ˜ in the µµ˜-calculus (as a generalization of Lemma 1 in Johnson-Freyd et al.
(2017)) and letT ηlet in λletT . For example, we could have the divergent reductions
ι1 (x)ς letx = v′ in ι1 (x) 6→ς
ι1 (v)→β ι1 (V )
which is brought back together from the inductive hypothesis v′  V ′← V as follows:
letx = v in ι1 (x) letx = V ′ in ι1 (x)→letT ι1 (V ′)← ς ι1 (V )
Therefore, ς-normalization also forms an equational correspondence between the
full languages and the ς-normal sub-languages, and thus the full µµ˜S and λletT calculi
are in equational correspondence, because equational correspondences compose (Sabry
& Felleisen, 1992).
As a result of this static and dynamic correspondence, we can transfer results from
the sequent calculus to natural deduction. In particular, many of the applications of
orthogonality models from Chapter VII can be directly used to prove similar properties
about λlet . For example, the logic of λlet is consistent by composing Theorems 9.1
and 7.1, so that there’s no well-typed closed term of an arbitrary type variable, i.e.
`X:SG v : X is not derivable, since that would imply a contradictory well-typed closed
command in µµ˜ by instantiating X with A and ¬A for example. Note that this is the
same as saying there is no well-typed closed term of the empty data type `G v : 0,
due to the strength of the 0E elimination rule corresponding to the ex falso quodlibet
principle: “from false, anything follows.” We also get strong normalization of the (weak
or bounded) letT βT ν∀<Ord reduction theory for any strongly focalizing substitution
strategy T by composing Lemma 9.3 and Theorem 7.3 because they translate to at
least one reduction in µµ˜, which can be extended to include ςT reductions by noting
that they are strongly normalizing on their own and commute with the other reductions
as in the proof of Theorem 9.3 so they can’t introduce an infinite reduction sequence.
403
Additionally, we know that the entire λlet equational theory, including the η law for
(co-)data types, is coherent for the polarized P = V ,N strategy since ι1 () 6= ι2 () in
a closed environment as that would cause a contradiction with Theorems 9.3 and 7.5.
Remark 9.2. Note that, while here we apply the results from orthogonality models
of the sequent calculus to a natural deduction language via translation, we could
also build an orthogonality model for the natural deduction λlet -calculus directly.
The way that we have phrased the definitions (of computational poles, interaction
spaces, orthogonality, safety conditions, worlds, and types) in Chapter VII is quite
general, and can be applied to other families of languages, too. For example, we could
build a model directly the call-by-name instance of the λlet -calculus by setting the
‚pole in the assumed safety condition to be the set of all λ-calculus terms so that
the computation relation is the reduction of terms. The untyped space U = (U+,U−)
of the world can be defined so that the positive side U+ is the set of all λlet -calculus
terms and the negative side U− is the set of all λlet -calculus frame contexts. That way,
the cut operation can be defined as context filling: given v, F ∈ U, then 〈v||F 〉 = F [v].
Furthermore, the value space V of the world is the set of terms (because all terms are
values in N ) and the set of call-by-name evaluation contexts (which are N co-values).
In this setup, we can construct the model for functions as the negatively-
constructed type built around applicative contexts:
A→− B = Neg({E[ v] | v ∈ A, E ∈ B})
As it turns out, this negative definition of function types in terms of orthogonality
and applicative contexts is logically the same as the more traditional (unary) logical
relation model for functions:
A→LR B = {v ∈ Term | ∀v′ ∈ A, v v′ ∈ B}
In particular, suppose v ∈ A→− B for some semantic types A and B, so the fact that
〈v||E[ v]〉 ∈ ‚ implies that 〈v v′||E〉 ∈ ‚ because cutting is context filling in the λ-
calculus: 〈v||E[ v′]〉 = E[v v′] = 〈v v′||E〉. That means v v′ ∈ B since B = B‚W , which
in turn means that v ∈ A→LR B. Going the other way, suppose v ∈ A→LR B, so the
fact that v v′ ∈ B for all v′ ∈ A (which comes by definition) implies that 〈v v′||E〉 ∈ ‚
since B is a ‚-space. As before, we have 〈v v′||E〉 = E[v v′] = 〈v||E[ v′]〉 ∈ ‚,
meaning that v ∈ A→− B by Lemma 7.6. End remark 9.2.
404
v ∈ Term ::= . . . | µα.c c ∈ Command ::= 〈v||α〉
Judgement ::= (Γ `ΘG v : A | ∆) | c :
(
Γ `ΘG ∆
)
Control rules:
Γ `ΘG v : A | α : A,∆
〈v||α〉 :
(
Γ `ΘG α : A,∆
) Pass c :
(
Γ `ΘG α : A,∆
)
Γ `ΘG µα.c : A | ∆
Act
Control kinding rules:
Γ `G v :: R | α :: R,∆
〈v||α〉 ::
(
Γ `G α :: R,∆
) Pass c ::
(
Γ `G α :: R,∆
)
Γ `G µα.c :: R | ∆ Act
FIGURE 9.15. λµ: adding multiple consequences to natural deduction.
Multiple Consequences
The natural deduction λlet -calculus corresponds to the µµ˜ sequent calculus
restricted to a single consequence. What, then, is the natural deduction equivalent to
the entire µµ˜-calculus? Since multiple consequences in the classical sequent calculus let
us write additional programs, we need some way to represent multiple consequences in
natural deduction as well. As it turns out, Parigot’s (1992) λµ-calculus for representing
classical logic in natural deduction style gives us exactly what we need to extend the
pure λ-calculus with co-variables representing other side consequences besides the
main return value of the term.
The extension of the pure λlet -calculus with co-variables is the parametric λµlet -
calculus whose additional syntax and typing rules are shown in Figure 9.15. The
λµlet -calculus extend λlet with commands of the form 〈v||α〉 and µ-abstractions of
the form µα.c. Note that commands and µ-abstractions are similar to constructs of
the same name in the sequent calculus, except that a command is always between a
term and a co-variable: λµlet still has no notion of co-terms which represent the left
rules exclusive to the sequent calculus. The two new typing rules are for activation
(Act) that forms a term by abstracting over a co-variable in a command (and thus
activating its output as the output of the term), and passivation (Pass) that forms a
command by throwing the output of a term to some free co-variable in scope.
405
(µT ) E[µα.c] µT µβ.c {〈E||β〉/〈||α〉} ( 6= E ∈ CoValueT
β /∈ FV (E))
(µα) 〈µα.c||β〉 µα c {β/α}
(ηµ) µα. 〈v||α〉 ηµ v (α /∈ FV (v))
(ccµ) µα. 〈letx = v′ in v||β〉 ccµ letx = v′ inµα. 〈v||β〉 (α /∈ FV (v′))
(ccµ) µα.
〈 case v′ of
#                                »
K( #   »Y :l, #»x )⇒ v
∣∣∣∣∣∣
∣∣∣∣∣∣β
〉
ccµ
case v′ of
#                                                    »
K( #   »Y :l, #»x )⇒ µα. 〈v||β〉
(α /∈⋃{ #             »FV (v′)})
FIGURE 9.16. The laws of control in λµ.
The λµlet also extend λlet ’s dynamic semantics with the additional rewriting
rules in rewriting rules shown in Figure 9.16, which are also parameterized by the
same concept of substitution strategy as λlet . The µT law incrementally captures
co-value contexts, representing a portion of the full co-value, by performing a structural
substitution (Ariola & Herbelin, 2008) of command-delimited co-values for commands
containing co-variables. The substitution {〈E||β〉/〈||α〉} means to follow the usual
capture-avoiding substitution rules to replace every command of the form 〈v||α〉,
where v can be any term, with 〈E[v′]||α〉 where v′ = {〈E||β〉/〈||α〉} is the structural
substitution applied to that sub-term v. The µα law finishes off capturing a co-value
by substituting one co-variable for another. The ηµ law is analogous to the ηµ law
in the sequent calculus, which eliminates a redundant µ-abstraction. And finally, we
have the ccµ laws for commuting conversions of control, which are analogous to the
ccT commuting conversions which push co-values into let and case expressions. Since
the term µα. 〈v||β〉 means to swap co-values, naming the surrounding co-value α and
instead use the one named β for evaluating v, it is subject to the same control flow
as a co-value.
Sharing, control, and join points
Compilers must balance many conflicting concerns when optimizing programs.
For example, there is often a tradeoff between space and time which impacts the way
in which programs can be effectively transformed. In particular, in compilers that use
term-based representations like the λ-calculus, this tradeoff can be most easily seen
in reductions that duplicate terms. The most common form of duplication is caused
406
by substitution used by β and other laws, which repeats the same full value, whose
description may be very large, for every occurence of the substituted variable. Because
of duplication, “simpler” terms do not always result in “better” code.
Consider the following term, where the locally bound variable f is referenced
three times to form the three components of the resulting triple and where the term
v in the body of f is something very large.
let f = λx. case even xof
True ⇒ v
False⇒ Nothing
in (f, f z, f 1)
A straightforward let reduction would replicate the abstraction λx. case even xof . . .
three times to place it in each context where f is used. Often, putting a value in its
context can open up new opportunities for simplifications or optimizations which lead
to better code, but not always. In this example, the first use of f is as a component
in a data structure (the triple), and so substituting the value for f opens up no
new opportunities for simplification whatsoever. The second use of f is provided the
argument z, so substituting a value for this f does open up a new β reduction, but
the term gets immediately stuck on the free variable z after, so this, too, is not helpful.
The second use of f , which is provided the concrete argument 1, is very different to
the first two cases, however. After substituting the value for f , a β reduction plugs
in 1 for its parameter x, and the body of the function can then be simplified down to
the simple result Nothing. Therefore, in the context of compiling programs, it is often
better to selectively inline values for only specific references to variables that might
lead to useful further simplifications. In the above example, we would only want to
inline a value for the third occurence of f , to get the term
let f = λx. case even xof
True ⇒ v
False⇒ Nothing
in (f, f z,Nothing)
407
which simplifies the known call to f without unhelpfully duplicating the large sub-term
v.
Substitution of values for variables is not the only cause for replication and
increase of code size, however. For example, consider the following λlet -term which
corresponds to expressions that routinely arise when simplifying functional programs.
case

case z of
A(x, y)⇒ y
B(x) ⇒ Just(x)
C ⇒ vc
of
Just(x) ⇒ v1
Nothing⇒ v0
In this term, we have a case that is discriminating the result of another case
expression. The thing to notice is that, in the case that z is B(x), the inner case
returns Just(x) to the outer case , which can then be simplified to v0. It would be
better to make this observation about information and control flow explicit in the
term and to shortcut the production and consumption of the intermediate short-lived
value Just(x) which creates extra garbage only to directly marshall a result from one
case to another. But the way that the two case s are nested keeps Just(x) away from
its surrounding use. But this is exactly what the commuting conversions are good for!
A case context is a co-value in any focalizing strategy T , so ccT pushes the outer
case into the branches of the inner one, opening up the opportunity to simplify the
middle B branch as follows:
case

case z of A(x, y)⇒ y
B(x) ⇒ Just(x)
C ⇒ z′
of
Just(x) ⇒ v
Nothing⇒ 0
408
→ccT case z of A(x, y)⇒ case y of Just(x) ⇒ v
Nothing⇒ 0
B(x) ⇒ case Just(x)of Just(x) ⇒ v
Nothing⇒ 0
C ⇒ case z′ of Just(x) ⇒ v
Nothing⇒ 0
→βT case z of A(x, y)⇒ case y of Just(x) ⇒ v
Nothing⇒ 0
B(x) ⇒ v
C ⇒ case z′ of Just(x) ⇒ v
Nothing⇒ 0
The ccT commuting conversion has allowed us to simplify the B branch and eliminate
an unnecessary intermediate value. However, notice how the ccT commuting conversion
has also unfortunately duplicated the full outer case into both the A and B branches
to no avail, since the case is just stuck on the variables y and z′. But worse, the sub-
term v, which could be very large in practice, has also been inadvertently duplicated.
This is not just a mild inconvenience, but actually represents a fatal problem to the
use of unrestrained commuting conversions for compilation; unfortunate nestings of
case can create a chain reaction where commuting conversions increase the size of
the program exponentially thereby making compilation unfeasible Lindley (2005).
Lets instead change our viewpoint and look at the problem from the perspective
of the sequent calculus. The above nested case s in λlet translates to a µµ˜ term of
the following form:
µα.
〈
µβ.
〈
z
∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣
µ˜

A(x, y). 〈y||β〉
B(x) . 〈Just(x)||β〉
C . 〈z′||β〉

〉∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣
µ˜
 Just(x) . 〈v||α〉
Nothing. 〈0||α〉
〉
Note the key difference here, the outer case has been given a name via the co-variable
β which is explicitly referenced in the inner case . Because of the symmetry and
duality of the sequent calculus, we can therefore use exactly the same intuition that
409
we had for ordinary variables: only inline co-values for co-variables that appear in
contexts that could lead to useful further simplifications. In this example, only inlining
into the B branch is helpful to eliminate the intermediate Just(x), which leads to the
following simplification:
µα.
〈
µβ.
〈
z
∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣
µ˜

A(x, y). 〈y||β〉
B(x) . 〈Just(x)||β〉
C . 〈z′||β〉

〉∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣
µ˜
 Just(x) . 〈v||α〉
Nothing. 〈0||α〉
〉
=µS µα.
〈
µβ.
〈
z
∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜

A(x, y). 〈y||β〉
B(x) .
〈
Just(x)
∣∣∣∣∣∣
∣∣∣∣∣∣µ˜
 Just(x) . 〈v||α〉
Nothing. 〈0||α〉
〉
C . 〈z′||β〉

〉
∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣
µ˜
 Just(x) . 〈v||α〉
Nothing. 〈0||α〉
〉
=βS µα.
〈
µβ.
〈
z
∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣
µ˜

A(x, y). 〈y||β〉
B(x) . 〈v||α〉
C . 〈z′||β〉

〉∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣
µ˜
 Just(x) . 〈v||α〉
Nothing. 〈0||α〉
〉
This use of explicit control flow in this way to tame commuting conversions mimics
Kennedy’s (2007) analysis and promotion of continuation-passing style (CPS) λ-calculi
as a favorable representation compared to direct-style λ-calculi, where co-values take
the role of the functional representation of continuations as λ-abstractions in CPS.
A CPS aficionado could then view the sequent calculus as a foundation for how
defunctionalization (Reynolds, 1998) can be strategically employed in a CPS language,
where only the continuations for data types like sums and pairs are abstractions, and
the continuations for co-data types like functions are concrete structures that more
directly correspond to the original structure of functional programs.
However, there is something interesting to note about the selective inlining of
co-values for co-variables. Even though we started with a single-consequence µµ˜ term,
coming from a pure λlet term, the result of simplification steps outside of the simplistic,
410
single-consequence regime. Note that the case abstraction
µ˜

A(x, y). 〈y||β〉
B(x) . 〈v||α〉
C . 〈z′||β〉

has two free co-variables, not one, meaning that it no longer lies in the single-
consequence subset of µµ˜. This means that if we want to mimic this same sequent
calculus simplification in direct natural deduction style, we also have to step outside
the pure λlet -calculus. The need to be more explicit about control flow is one possible
application of the classical λµlet -calculus, where we can introduce µ-abstractions
which let us speak more clearly about the flow of control in a program and skip over
uninteresting control paths. For example, the above µµ˜ simplification can be restated
in λµlet as follows:
case

case z of A(x, y)⇒ y
B(x) ⇒ Just(x)
C ⇒ z′
of
Just(x) ⇒ v
Nothing⇒ 0
=µT ηµccµ µα.

caseµβ.

case z of A(x, y)⇒ µδ. 〈y||β〉
B(x) ⇒ µδ. 〈Just(x)||β〉
C ⇒ µδ. 〈z′||β〉
∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣
β
of
Just(x) ⇒ v
Nothing⇒ 0
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
α

411
=µT µα.

caseµβ.

case z of A(x, y)⇒ µδ. 〈y||β〉
B(x) ⇒ µδ.

case Just(x)of
Just(x) ⇒ v
Nothing⇒ 0
∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣
α

C ⇒ µδ. 〈z′||β〉
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
β

of
Just(x) ⇒ v
Nothing⇒ 0
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
α

=βT µα.

caseµβ.

case z of A(x, y)⇒ µδ. 〈y||β〉
B(x) ⇒ µδ. 〈v||α〉
C ⇒ µδ. 〈z′||β〉
∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣
β
of
Just(x) ⇒ v
Nothing⇒ 0
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
α

The term that we get at the end corresponds to the same term that we got in µµ˜,
where the B branch jumps directly to the surrounding α, skipping the second case
since it was already done in that branch, while the other branches go to the second
case via β. In this sense, the co-variable α serves as an explicit join point, where the
branching control induced by case analysis eventually meets back up; not every case
flows through β, but no matter the value of z, eventually some result will be output
through α. So from the outside, this term appears to be pure—returning the same
result regardless of the context in which it is used—even though internally might use
impure jumps to skip past the normal control flow for programs.
However, notice how describing this control flow goes against the natural
predilections of the λµlet natural deduction calculus. In natural deduction everything
is a term that must return some result, forming an implicit call-and-return control flow
in the program. But here we must use µ-abstractions that bind an unused co-variable
δ to skip past the implied control flow of case expressions and insert our own. In
the µµ˜ sequent calculus, in contrast, the control flow of case analysis (on either data
structures or co-data observations) is explicit, so selective inlining for co-variables is
more natural. This is an example of how not all representations are equal, even if
they are equivalently expressive. Representing join points as first-class control is more
natural in the sequent style (or continuation-passing style, for the same reasons). For
412
imperative programs, static single assignment (SSA) (Cytron et al., 1991) represents
join points with φ-nodes that change their value based on the history of run-time
control flow that leads to them.
Alternatively, we can come up with a representation of join points that fits more
closely with the idioms of pure functional programming, based on either the sequent
calculus (Downen et al., 2016) or natural deduction (Maurer et al., 2017). The purely
functional view of join points limits the expressive power over control flow by restricting
co-variables based on their static scope (so that co-variables cannot be captured in
closure, making them unreachable when they go out of scope) and their static types
(limiting co-variables to only data types of the form ∃X1:k1 . . . ∃Xn:kn.(A1⊗· · ·⊗Am)
in the sequent calculus or co-data types of the form ∀Z.∀X1:k1 . . . ∀Xn:kn.(A1 →
· · · → Am → Z) in natural deduction). This alternative view of join points based
on the sequent calculus lets us integrate them directly into common intermediate
languages based on the λ-calculus that are used for compiling functional programs, for
example the core intermediate language for the Glasgow Haskell Compiler,3 to improve
existing optimizations and enable new ones without disrupting the rest of the language.
In other words, there are is a space for languages in between λlet and λµlet for
representing purely functional join points that serve as a practical compromise between
the expressiveness of the classical sequent calculus and the referential transparency of
the intuitionistic sequent calculus.
Natural deduction vs sequent calculus with multiple consequences
Since we have extended the natural deduction calculus to incorporate co-variables,
we can extend the correspondence to cover the entire classical sequent calculus. The
translations between the two languages are likewise extended for λµlet as shown in
Figure 9.17. The definition of LKJvK is the same as LJ JvK in every case except for
the new µ-abstractions and commands of λµlet . In contrast, the reverse translation
NKJ K requires more changes from NJ J K. In particular, co-terms are now translated
as command-delimited frame contexts, and not just frame contexts. For example,
instead of translating the co-variable α as the empty context, it is now 〈||α〉 because
there might be a choice between many different co-variables, so we need to spell out
which one to not confuse them. As a consequence of this change, the translations of
3See https://ghc.haskell.org/trac/ghc/wiki/SequentCore for more information.
413
let and case expressions are now also different; they both make use of a dummy
co-variable δ, listed explicitly as the extra parameter in NKJ Kδ, which is effectively
ignored at run-time but convenient for coercing commands to terms (via µδ.c) and
terms to commands (via 〈v||δ〉). For example, in the command 〈letx = v inµδ.c||δ〉, if
δ is not free in c, then any co-value substituted for the outer δ is discarded as in the
following reduction sequence:
〈letx = v inµδ.c||δ〉 〈letx = V inµδ.c||δ〉
→letT 〈µδ.c {V/x}||δ〉
→µα c {V/x}
So the co-value of the dummy δ does not influence program behavior. However, it
is extremely useful for translating input and especially case abstractions, where the
commands in the branches of the case may all “return” a result to different places
denoted by different co-variables, which goes against the more predictable control flow
of the λ-calculus syntax.
Despite the apparent differences between the single-consequence NJ J K and
multiple-consequence NKJ K translations, the two still coincide up to the µµ˜ equational
theory for single-consequence commands and (co-)terms. In this sense, NKJ K is a
conservative extension of the NJ J K translation.
Lemma 9.8 (Natural purification). For any single-consequence µµ˜ command c, term
v, or co-term e,
a) NKJvKδ =ηµccµ NJ JvK where β /∈ FV (v) for all co-variables β,
b) NKJcKδ =ηµccµ 〈NJ JcKα||α〉 for some α such that β /∈ FV (NJ JcKα) for all
co-variables β, and
c) NKJeKδ[v] =ηµccµ 〈NJ JeKα[v]||α〉 for some α and all v such that β /∈ FV (NJ JeKα)
for all co-variables β when δ /∈ FV (e).
Proof. By mutual induction on the syntax of commands, terms, and co-terms.
The cases for cuts, (co-)variables, data structures, and co-data observations follow
immediately from the inductive hypothesis. The remaining cases are:
– µα.c: NKJµα.cKδ = µα.NKJcKδ =IH µα. 〈NJ JcKα||α〉 =ηµ NJ JcKα = NJ Jµα.cK
414
Natural deduction to sequent calculus:
LKJµα.cK , µα.LKJcK LKJ〈v||α〉K , 〈LKJvK||α〉
Sequent calculus to natural deduction:
NKJ〈v||e〉Kδ , NKJeKδ[NKJvKδ]
NKJxKδ , x
NKJµα.cKδ , µα.NKJcKδ
NK
r
K
#»
B ( #»v )
z
δ
, K( #»B, #              »NKJvKδ)
NK
s
µ
( #                        »
O
#   »
Y :l [ #»x , α].c
){
δ
, λ
{
#                                                        »
O[ #   »Y :l, #»x ]⇒ µα.NKJcKδ}
NKJαKδ , 〈||α〉
NKJµ˜x.cKδ , 〈letx =  inµδ.NKJcKδ||δ〉
NK
r
O
#»
B [ #»v , e]
z
δ
, NKJeKδ[ O[ #»B, #              »NKJvKδ ]]
NK
s
µ˜
[ #                   »
K
#   »
Y :l( #»x ).c
]{
δ
,
〈
caseof
#                                                        »
K( #   »Y :l, #»x )⇒ µδ.NKJcKδ ∣∣∣∣∣∣∣∣δ〉
FIGURE 9.17. Translations between natural deduction and the sequent calculus with
many consequences.
415
– µ
( #                        »
O
#   »
Y :l [ #»x , α].c
)
:
NK
s
µ
( #                        »
O
#   »
Y :l [ #»x , α].c
){
δ
= λ
{
#                                                        »
O[ #   »Y :l, #»x ]⇒ µα.NKJcKδ}
=IH λ
{
#                                                                   »
O[ #   »Y :l, #»x ]⇒ µα. 〈NJ JcKα||α〉}
=ηµ λ
{
#                                              »
O[ #   »Y :l, #»x ]⇒ NJ JcKα}
= NJ
s
µ
( #                        »
O
#   »
Y :l [ #»x , α].c
){
– µ˜x.c where α ∈ FV (c):
NKJµ˜x.cKδ[v] = 〈letx = v inµδ.NKJcKδ||δ〉
=IH 〈letx = v inµδ. 〈NJ JcKα||α〉||δ〉
=ccµ 〈µδ. 〈letx = v inNJ JcKα||α〉||δ〉
=µα 〈letx = v inNJ JcKα||α〉
= 〈NJ Jµ˜x.cKα[v]||α〉
– µ˜
[ #                   »
K
#   »
Y :l( #»x ).c
]
:
NK
s
µ˜
[ #                   »
K
#   »
Y :l( #»x ).c
]{
δ
[v] =
〈
case v of
#                                                        »
K( #   »Y :l, #»x )⇒ µδ.NKJcKδ ∣∣∣∣∣∣∣∣δ〉
=IH
〈
case v of
#                                                                   »
K( #   »Y :l, #»x )⇒ µδ. 〈NJ JcKα||α〉∣∣∣∣∣∣∣∣δ〉
=ccµ
〈
µδ.
〈
case v of
#                                               »
K( #   »Y :l, #»x )⇒ NJ JcKα ∣∣∣∣∣∣∣∣α〉∣∣∣∣∣∣∣∣δ〉
=µα
〈
case v of
#                                               »
K( #   »Y :l, #»x )⇒ NJ JcKα ∣∣∣∣∣∣∣∣α〉
=
〈
NJ
s
µ˜
[
#                   »
KY :l( #»x ).c
]{
α
∣∣∣∣∣
∣∣∣∣∣α
〉
Now that we can translate the full classical µµ˜ sequent calculus to the natural
deduction λµlet -calculus, the two translations still preserve the static and dynamic
semantics of the languages. This correspondence follows similar to the single-
consequence one, where the types correspond on the nose and there is an equational
correspondence between the two languages. The one wrinkle with the preservation of
416
types is that the NKJvKδ translation introduces δ as an extra free co-variable, whose
type can be anything because its co-value is never used, which should be different from
any free or bound co-variable in v to make sure that the translation is still well-typed.
Theorem 9.4 (LKJ K preserves types). a) For any λµlet derivation c : (Γ `ΘG ∆),
there is a µµ˜ derivation of LKJcK : (Γ `ΘG ∆).
b) For any λµlet derivations of Θ `G C : T and Γ `ΘG v : A | ∆, there is a µµ˜
derivation of Γ `ΘG LKJvK : A | ∆.
Proof. By induction on the given λµlet derivation. The cases for the subset of λµlet
rules that in λlet are the same as Theorem 9.1. The two new cases are for the Pass
and Act rules which translate as follows:
.... IH
Γ `ΘG LKJvK : A | α : A,∆ | β : A `ΘG β : A VL
〈LKJvK||β〉 : (Γ `ΘG β : A,α : A,∆) CR,XR
〈LKJvK||α〉 : (Γ `ΘG α : A,∆) Cut
.... IH
LKJcK : (Γ `ΘG α : A,∆)
Γ `ΘG µα.LKJcK : A | ∆ AR
Where the additional premise that Θ `G C : T is satisfied by assumption. The other
cases follow similarly.
Theorem 9.5 (NKJ K preserves types). Given any (δ : C) /∈ ∆,
a) For any µµ˜ derivation of Γ `ΘG v : A | ∆, if δ /∈ BV (v) then there is a λµlet
derivation of Γ `ΘG NKJvKδ : A | δ : C,∆.
b) For any µµ˜ derivation of c :
(
Γ `ΘG ∆
)
, if δ /∈ BV (c) then there is a λµlet
derivation of NKJcKδ : (Γ `ΘG δ : C,∆).
c) For any µµ˜ derivation of Γ | e : A `ΘG ∆, λµlet derivation of
Γ `ΘG v : A | δ : C,∆, and derivation of Θ `G A : S, if δ /∈ BV (e) then there is
a λµlet derivation of NKJeKδ[v] : (Γ `ΘG δ : C,∆).
417
Proof. By mutual induction on the given µµ˜ derivations, similar to Theorem 9.2. For
example, the AL rule is now translated as
Γ `ΘG v : A | δ : C,∆ Θ `G A : S
.... IH
NKJcK : (Γ, x : A `ΘG δ : C,∆)
Γ, x : A `ΘG µδ.NKJcK : C | δ : C,∆ Act
Γ `ΘG letx = v inµδ.NKJcK : C | δ : C,∆ Let
〈letx = v inµδ.NKJcK||δ〉 : (Γ `ΘG δ : C,∆) Pass
Also, the AR rule is translated as:
.... IH
c :
(
Γ `ΘG δ : B,α : A,∆
)
Γ `ΘG µα.NKJcKδ : A | δ : B,∆ Act
where we rely on the side condition that δ /∈ BV (µα.c) which implies that δ 6= α. The
rest of the cases are similar.
Lemma 9.9 (Soundness of LKJ K). For any T , if v is ς-normal and v R v′ then
LKJvK LKJRK LKJv′K where the translation of the rewrite relation R is defined by
cases on R as:
LKJµT K , =µLKJT Kηµ LKJµαK , µLKJSK
LK
qηµy , ηµ LKqccµy , =µα
Proof. The case for ηµ is immediate and µα follows from the fact that co-variable
substitution commutes with translation (i.e. LKJcK {β/α} = LKJc {β/α}K). The case
for µT follows from (the multiple-consequence generalization of) Lemma 9.2:
LKJE[µα.c]K =µLKJT Kηµ µα. 〈µα.LKJcK||Eα〉
=µLKJT Kηµ µα.LKJc {〈E||α〉/〈||α〉}K
= LKJµα.c {〈E||α〉/〈||α〉}K
where the commutation of substitution and translation in second-to-last step follows by
induction on the structure of commands and (co-)terms; every step follows immediately
from the definition of structural substitution and the inductive hypothesis with the
418
base cases:
LKJ〈v||α〉K {Eα/α} = 〈LKJvK||α〉 {Eα/α}
= 〈LKJvK {Eα/α}||Eα〉
=IH 〈LKJv {〈E||α〉/〈||α〉}K||Eα〉
=µSηµ 〈LKJE[v {〈E||α〉/〈||α〉}]K||α〉
= LKJ〈E[v {〈E||α〉/〈||α〉}]||α〉K
Likewise, the case for ccµ follows from (the multiple-consequence generalization of)
Lemma 9.2, where the case for a let expression is:
LKJµα. 〈letx = v′ in v||β〉K = µα. 〈µγ. 〈LKJv′K||µ˜x. 〈LKJvK||γ〉〉||β〉
=µα µα. 〈LKJv′K||µ˜x. 〈LKJvK||β〉〉
=α µγ. 〈LKJv′K||µ˜x. 〈LKJvK {γ/α}||β〉〉
=µα µγ. 〈LKJv′K||µ˜x. 〈µα. 〈LKJvK||β〉||γ〉〉
= LKJletx = v′ inµα. 〈v||β〉K
And the case for a case expression is similar.
Lemma 9.10 (Soundness of NKJ K). For any strongly focalizing S and declarations
G,
a) if v is ς-normal and v R v′ then NKJvKδ R NKJv′Kδ,
b) if c is ς-normal and c R c′ then NKJcKδ R NKJc′Kδ,
c) if e is ς-normal and e R e′ then NKJeKδ[v] R NKJe′Kδ[v] for all v,
where the translation of the rewrite relation R is defined by cases on R as:
NKJµSK ,→µNKJSKµα NKJµ˜SK ,→letNKJSKµα
NK
qηµy , ηµ NKqηµ˜y , =letSηletµαccµccNKJSKσNKJSK
NK
qβSy ,→βNKJSKµNKJSK→µα NKrν∀<Ordz , ν∀<Ord
NK
qβFy ,→βF=µαccµ NKq≺ηFy ,←ηF=ηµccµ
Proof. First, note that co-value substitution commutes with translation as
419
– NKJvKδ {NKJEKδ/〈||α〉} = NKJv {E/α}Kδ,
– NKJcKδ {NKJEKδ/〈||α〉} = NKJc {E/α}Kδ, and
– NKJeKδ[v] {NKJEKδ/〈||α〉} = NKJe {E/α}K{NKJEKδ/〈||α〉},
which follows by induction on the syntax of commands, terms, and co-terms, where
the base case is:
NKJαKδ[v] {NKJEKδ/〈||α〉} = 〈v||α〉 {NKJEKδ/〈||α〉}
= NKJEKδ[v {NKJEKδ/〈||α〉}]
= NKJα {E/α}Kδ[v {NKJEKδ/〈||α〉}]
Likewise, value substitution commutes with translation as NKJvKδ {NKJV Kδ/x} =
NKJv {V/x}Kδ, and so on.
The cases for ηµ, βF, and ηF are calculations, and the cases for µS , µ˜S , βS , and
ν∀<Ord follow by the above commutation of substitution and translation similarly as
in Lemma 9.5. The remaining case for ηµ˜ is similar to Lemma 9.5 and follows from
the fact that for every covalue E, NKJEKδ = 〈E ′||β〉 for some co-value context E ′ and
co-variable β according the two cases coming from the strong focalization of S:
NKJµ˜x. 〈x||E〉Kδ[v] = 〈letx = v inµδ.NKJEKδ[x]||δ〉
= 〈letx = v inµδ. 〈E ′[x]||β〉||δ〉
=ccµ 〈µδ. 〈letx = v inE ′[x]||β〉||δ〉
=µα 〈letx = v inE ′[x]||β〉
=ηletNKJSK 〈E ′[x]||β〉
= NKJEKδ[x]
NKJµ˜x. 〈x||µ˜y.c〉Kδ[v] = 〈letx = v inµδ. 〈let y = x inµδ.NKJcKδ||δ〉||δ〉
=letNKJSK 〈letx = v inµδ. 〈µδ.NKJcKδ {x/y}||δ〉||δ〉
=µα 〈letx = v inµδ.NKJcKδ {x/y}||δ〉
=α 〈let y = v inµδ.NKJcKδ||δ〉
= NKJµ˜y.cKδ[v]
Lemma 9.11 (Natural deduction round trip). For any T ,
420
a) if v is ς-normal then NKJLKJvKKδ =µαηµccα v, and
b) if c is ς-normal then NKJLKJcKKδ =µαηµccα c.
Proof. By mutual induction on the syntax of commands, terms, and co-terms. The
cases for variables, µ-abstractions, data structures, and cuts follow immediately by
the inductive hypothesis, while the remaining cases are:
– letx = v′ in v:
NKJLKJletx = v′ in vKKδ = NKJµα. 〈LKJv′K||µ˜x. 〈LKJvK||α〉〉Kδ
=IH µα. 〈letx = v′ inµδ. 〈v||α〉||δ〉
=ccµ µα. 〈µδ. 〈letx = v′ in v||α〉||δ〉
=µα µα. 〈letx = v′ in v||α〉
=ηµ letx = v′ in v
– case v′ of
#                                »
K( #   »Y :l, #»x )⇒ v:
NK
s
LK
s
case v′ of
#                                »
K( #   »Y :l, #»x )⇒ v
{{
δ
= NK
s
µα.
〈
LKJv′K∣∣∣∣∣∣∣∣µ˜[ #                                            »K #   »Y :l( #»x ).〈LKJvK||α〉]〉{
δ
=IH µα.
〈
case v′ of
#                                                   »
K( #   »Y :l, #»x )⇒ µδ. 〈v||α〉
∣∣∣∣∣∣∣∣δ〉
=ccµ µα.
〈
µδ.
〈
case v′ of
#                                »
K( #   »Y :l, #»x )⇒ v
∣∣∣∣∣∣∣∣α〉∣∣∣∣∣∣∣∣δ〉
=µα µα.
〈
case v′ of
#                                »
K( #   »Y :l, #»x )⇒ v
∣∣∣∣∣∣∣∣α〉
=ηµ case v′ of
#                                »
K( #   »Y :l, #»x )⇒ v
– λ
{
#                               »
O[ #   »Y :l, #»x ]⇒ v
}
:
NK
s
LK
s
λ
{
#                               »
O[ #   »Y :l, #»x ]⇒ v
}{{
δ
= NK
s
µ
( #                                                  »
O
#   »
Y :l [ #»x , α].〈LKJvK||α〉){
δ
=IH λ
{
#                                                    »
O[ #   »Y :l, #»x ]⇒ µα. 〈v||α〉
}
=ηµ λ
{
#                               »
O[ #   »Y :l, #»x ]⇒ v
}
421
– v′ O[ #»B, #»V ]:
NK
r
LK
r
v′ O[ #»B, #»V ]
zz
δ
= NK
r
µα.
〈
LKJv′K∣∣∣∣∣∣O #»B [ #             »LKJV K , α]〉z
δ
=IH µα.
〈
v′ O[ #»B, #»V ]
∣∣∣∣∣∣α〉
=ηµ v′ O[
#»
B,
#»
V ]
Lemma 9.12 (Sequent calculus round trip). For any S,
a) if v is ς-normal then LKJNKJvKδK =µS v,
b) if c is ς-normal then LKJNKJcKδK =µS c,
c) if e is ς-normal then for all v, LKJNKJeKδ[v]K =µS 〈LKJvK||e〉
Proof. By mutual induction on the syntax of commands, terms, and co-terms. The
cases for cuts, (co-)variables, µ-abstractions, and data structures follow immediately
by the inductive hypothesis, while the remaining cases are:
– µ
(
#                        »
OY :l[ #»x , α].c
)
:
LK
s
NK
s
µ
(
#                        »
OY :l[ #»x , α].c
){
δ
{
= LK
s
λ
{
#                                                        »
O[ #   »Y :l, #»x ]⇒ µα.NKJcKδ}{
=IH µ
( #                                            »
O
#   »
Y :l [ #»x , α].〈µα.c||α〉
)
=µα µ
( #                        »
O
#   »
Y :l [ #»x , α].c
)
– µ˜x.c:
LKJNKJµ˜x.cKδ[v]K = LKJ〈letx = v inµδ.NKJcKδ||δ〉K
=IH 〈µα. 〈LKJvK||µ˜x. 〈µδ.c||α〉〉||δ〉
=µα 〈LKJvK||µ˜x.c〉
422
– µ˜
[
#                   »
KY :l( #»x ).c
]
:
LK
s
NK
s
µ˜
[
#                   »
KY :l( #»x ).c
]{
δ
[v]
{
= LK
s〈
case v of
#                                                        »
K( #   »Y :l, #»x )⇒ µδ.NKJcKδ ∣∣∣∣∣∣∣∣δ〉{
=IH
〈
µα.
〈
LKJvK∣∣∣∣∣∣∣∣µ˜[ #                                     »KY :l( #»x ).〈µδ.c||α〉]〉∣∣∣∣∣∣∣∣δ〉
=µα
〈
LKJvK∣∣∣∣∣∣∣∣µ˜[ #                   »KY :l( #»x ).c]〉
– O
#»
B [ #»V ,E]:
LK
r
NK
r
O
#»
B [ #»V ,E]
z
δ
[v]
z
= LK
r
NKJEKδ[v O[ #»B, #                »NKJV Kδ ]]z
=IH
〈
µα.
〈
LKJvK∣∣∣∣∣∣O #»B [ #»V , α]〉∣∣∣∣∣∣E〉
=µS
〈
LKJvK∣∣∣∣∣∣O #»B [ #»V ,E]〉
Theorem 9.6 (Parametric equational correspondence). For any stable substitution
strategies S and T = NKJSK, the λµletT -calculus is in equational correspondence with
the µµ˜S-calculus.
Proof. Analogous to the proof of Theorem 9.3, relying on Lemmas 9.9, 9.10, 9.11
and 9.12 and the fact that ς-normalization still commutes with all other reductions.
423
CHAPTER X
Conclusion
This dissertation has explored how a logical foundation based on the classical
sequent calculus can be used to address issues in the design, theory, and implementation
of programming languages, especially functional ones. We looked at how all the usual
logical connectives that we study individually in the theory of computation are all
instances of the general idea of data and co-data. We also looked at how the evaluation
strategy of a programming language, which is usually inextricably woven into its
very being, can be abstracted away as a parameter to the language description in
terms of their impact on substitution, so that we can reason about the behavior of
programs by just considering what variables and co-variables might stand for. This
treatment of evaluation strategy extends beyond simplistic call-by-value and call-
by-name evaluation to more complex evaluation methods like call-by-need, and also
provides a general mechanism for mixing two or more different evaluation strategies
within a single program.
Next, we looked at how the simple framework of data and co-data could be
extended to more advanced typed features found in programming languages and
proof assistants. In particular, type abstraction and polymorphism (in the form of
either generics or modules) is represented as hidden types contained within (co-)data
structures. Additionally, well-founded recursive types (like lists, trees, and streams)
can be represented by (co-)data indexed with a measure of their size, whose definition
follows by a well-founded recursion scheme over that index. Well-founded recursive
programs over those types are then represented by quantifying over the size of the
type. This technique allows for multiple different recursion schemes with their own
advantages and disadvantages. Here, we considered both noetherian recursion which
prevents programs from depending on the size index so that they can be erased at
run-time, as well as primitive recursion which allows programs to depend on and react
to the size index at run-time, similar to GADTs or index-dependent types.
With the collection of language features in place, we then studied their theory
and application to programming. On the theory side, we developed a model for the
language based on the idea of orthogonality which was general enough to capture
424
several language-wide safety features, including type safety, strong normalization, and
soundness of the typed extensional equational theory used to freely reason about
program behavior with respect to the untyped operational semantics used to run
programs. Unlike the common formulation of such models, the one presented here was
parametric in two unusual dimensions (1) the connectives used to build types in the
language, and (2) the evaluation strategy used to run programs in the language. These
extra dimensions of generality are due to the parametric formulation of the underlying
language, and allows us to establish results for many instances of the parametric
language in one fell swoop.
On the application side, we looked at some ways that the ideas developed here
could apply to the compilation of programs. We used the idea of polarity to come
up with a small basis of polarized types which can faithfully encode all the simple
(co-)data types that could possibly be declared based on a theory of type isomorphisms.
The “faithfulness” criteria that we demand of encodings is important if they are to
be useful for the purpose of optimizing compilation. In particular, we should expect
that exactly the same equations between programs should hold both before and after
encoding, so that we do not break the semantics of programs or lose out on optimization
opportunities due to leaky abstractions. To achieve these faithful encodings, we employ
the general wisdom from polarized logic and the call-by-push-value paradigm that
types from call-by-name and call-by-values can be represented by sprinkling polarity
shifts in the appropriate places. However, because in our case we are interested in
evaluation strategies beyond just these two, we introduce a family of four distinct
shifts that are used for encoding the correct evaluation order into types.
Completing the circle, we come back to the natural deduction world by introducing
a λ-calculus based language corresponding to everything we have done in the setting
of the classical sequent calculus. The two different viewpoints are then related in two
ways. First, we restrict the sequent calculus to just intuitionistic logic by limiting all
sequents to a single consequence. This lets us establish a correspondence between
the static and dynamic semantics of pure λ-calculus and the intuitionistic language
of the sequent calculus. Second, we generalize the natural deduction language to
classical logic by introducing multiple consequences to the λ-calculus, which lets
us extend the correspondence by relating the full language of the classical sequent
calculus with the λ-calculus with first-class control. The two-way correspondence is
applicable to functional programming by showing how functional programs based on
425
the λ-calculus can be compiled to the sequent calculus, and also revealing how ideas
born in the sequent calculus can be translated back into their analog in functional
programming. The classical correspondence also has the additional application in
optimizing compilation by giving a more expressive representation of control that
can maintain sharing under optimization, avoiding the need to duplicate code while
transforming programs to reveal simplifications.
Future Work
Types
We considered some advanced type features found in programming languages
and proof assistants that go beyond just simple types. However, there are still many
more advanced type features that could be included in the analysis and framework
presented here.
Sub-typing
Intuitively, co-data types are a lot like objects and interfaces from object-oriented
languages removed from assumptions about mutable state. But we would be remiss
to not mention sub-typing which is an essential feature in any statically-typed object-
oriented language. For example, if we declare an interface, we should be able to extend
that interface with additional messages, so that any object following the extended
interface can just as well be considered an object of the original one.
In the language of co-data, this idea corresponds to extending a co-data
declaration with additional alternatives. For example, given our usual simple definition
of the binary product co-data type A&B
codataX & Y where
pi1 : | X & Y ` X
pi2 : | X & Y ` Y
we can extend binary products to ternary products by extending the co-data
declaration of & to a co-data declaration that includes an additional projection
426
observation as follows:
codataTern(X, Y, Z) extendsX & Y where
pi3 : | Tern(X, Y, Z) ` Z
So the type Tern(A1, A2, A3) includes the same projections pi1 [e1] and pi2 [e2] (where
ei : Ai) as A1 ⊕ A2 does, but also includes the third projection pi3 [e3]. That means
an object of Tern(A1, A2, A3) must respond to all three messages, as in the term
µ(pi1 [α1].c1 | pi2 [α2].c2 | pi3 [α3].c3), which is an extension of the objects of type A&B.
The dual of this concept of sub-typing for co-data is sub-typing for data. For example,
the simple sum data type that we have used
dataX ⊕ Y where
ι1 : X ` X ⊕ Y |
ι2 : Y ` X ⊕ Y |
can be extended with an additional third alternative as follows:
dataOneOfThree(X, Y, Z) extendsX ⊕ Y where
ι3 : Z ` OneOfThree(X,Y,Z) |
So the type OneOfThree(A1, A2, A3) includes the same injections ι1 (v1) and ι2 (v2)
(where vi : Ai) as A1 ⊕ A2 does, but also includes the third injection ι3 (v3). That
means that case analysis of oneOfThree(A1, A2, A3) must consider all three possible
cases, as in the co-term µ˜[ι1 (x1).c1 | ι2 (x2).c2 | ι3 (x3).c3].
The main purpose of sub-typing is not extension but subsumption. That is, we
expect to be able to use values of a sub-type in contexts expecting the super-type.
Using the above example, we would have the sub-typing relationships (written as
A <: B and read as “A is a subtype of B”) Tern(A,B,C) <: A & B and A ⊕ B <:
OneOfThree(A,B,C), which can then be used to coerce between terms and co-terms of
sub-types. This generalizes the notion of type equality from Chapter VI to a directed
(i.e. non-symmetric) relationship, likewise generalizing the type conversion rules to
the following pair of subsumption rules:
Γ `ΘG v : A | ∆ Θ `G A :< B : k
Γ `ΘG v : B | ∆
SubR
Θ `G A <: B : k Γ `ΘG e : B | ∆
Γ | e : A `ΘG ∆
SubR
427
The right subsumption rule is what we’re used to seeing in programming languages.
It says that if a term v produces a result of type A which is a sub-type of B, then v
can also be said to produce a result of type B. The left subsumption rule is a little
different, though, and runs in the opposite direction due to duality. It says that if
a co-term e consumes any result of type B which is a super-type of A, then e can
also be said to consume any result of type A. These rules give the usual notion of
sub-type polymorphism as coercion. We could also allow for abstraction over sub-
types, analogous to the system F<: Cardelli et al. (1994) extension of system F with
sub-typed bounded polymorphism, by adding the kind <k A of all subtypes of A in
kind k like we did for orderings in Ord.
Also note that the orthogonality models for programs in Chapter VII are
already set up to easily accomodate sub-typing. In particular, recall that we used
the containment ordering relation, A v B on interactions spaces which means
that everything in A is contained in B. However, there is another natural ordering
relationship between interaction spaces which flips directions between the two sides
of the space. That is, given any spaces A = (A+,A−) and B = (B+,B−) (so that A+
and B+ are the positive sides of A and B and A− and B− are the negative sides), we
have the sub-space relation A ≤ B defined as
(A+,A−) ≤ (B+,B−) , (A+ ⊆ B+) ∧ (B+ ⊆ A−)
This follows the intuitive meaning of behavioral sub-typing (Liskov, 1987), where A is
a sub-type of B if every value (i.e. positive element) of A can be used in any context
(i.e. negative element) expecting a B, or conversely every context of B can be used
with any value of A.
Implicit types
The treatment of connectives that we have explored in this dissertation has been
very syntax directed. That is to say, the typing rules specific to every connective only
applies to expressions of certain syntactic forms. For example, the right rules specific
to A⊕B is to construct an injection ιi (v) : A⊗B for i = 1 or i = 2 and the only left
rule specific to A+B is to deconstruct an injection µ˜[ι1 (x).c1 | ι2 (x).c2] : A⊕B.
However, there are also more implicit forms types which are not tied to specific
syntax. For example, union and intersection types combine existing types by directly
428
combining the programs they specify without modification. More specifically, the union
of two types, written A∨B, is like an untagged union in C, so v : A∨B if either v : A
or v : B without tagging it with an injection explaining which was the case. Dually,
the intersection of two types, written A ∧ B, requires membership in both types so
that v : A ∧ B if both v : A and v : B as-is. However, note that these implicit rules
are dangerous: in a language of effects, implicit union and intersection types require
careful value and evaluation context restrictions, depending on the evaluation strategy
of the language, similar to the value restriction in ML to be sound. In particular, the
ML style of implicit polymorphism can be seen as an infinite intersection quantifier,∧
X.A, whereas the dual form of implicit existential quantification can be seen as an
infinite union, ∨X.A, which require the same restrictions as binary intersection and
union types.
Understanding these value and evaluation context restrictions for intersection and
union types was one of the original motivations for studying polarity in computation
(Zeilberger, 2008a). And as noted by Munch-Maccagnoni (2009), the orthogonal model
of programs puts these kinds of implicit types on firmer ground. In particular, these
sorts of types are closely related to the orthogonal model of sub-typing mentioned
previously. In Chapter VII, we used union and intersection operations on interaction
spaces, A unionsq B and A unionsq B respectively, that fit well with the containment relationship.
However, there is an alternative presentation of union and intersection that fits with
sub-spaces. Given any spaces A = (A+,A−) and B = (B+,B−), their sub-spacial union
and intersections, A ∨ B and A ∧ B respectively, are defined as
(A+,A−) ∨ (B+,B−) , (A+ ∪ B+,A− ∩ B−)
(A+,A−) ∧ (B+,B−) , (A+ ∩ B+,A− ∪ B−)
Note that in combination with the containment-based operations, these complete the
set, giving all four combinations of union and intersection of the two sides of interaction
spaces. And like the containment operations (Property 7.3), these operations share
similar de Morgan laws with respect to sub-spaces. This lets us build a model for
binary or infinite intersection and union types similar to Munch-Maccagnoni (2009)
as the appropriately value-restricted positively or negatively constructed types from
Chapter VII.
429
The practical application of these more implicit types is that they allow for more
sub-typing and type equalities since they leave no trace on the underlying programs,
and thus enable more programs to be written. For example, there is Girard’s “shocking”
Munch-Maccagnoni (2009) type equality ∧X.(A ⊕ B) = (∧X.A) ⊕ (∧X.B) which
does not make sense if we interpret ∧ as a traditional universal quantification—just
because either A or B is true for any X doesn’t mean that it must always be that
A is true for every X or B is true for every X—but make sense when viewed as a
cost-free run-time type coercion. So these kinds of implicit types could be useful in
applications that demand both very expressive types with little to no run-time cost.
Dependent types
One of the most active research topics in type theory for programming languages
is dependent types: allowing the compile-time type of a program depend on the run-
time value of the program. This dissertation only considers a limited view of type
dependency in the form of the Ix-indexed types from Chapter VI. The Ix indexes in
programs influence their behavior, so they are a form of run-time value, but also types
can depend on their value as well in the primitive recursive (co-)data declarations.
That gives us a limited form of dependent types over just the natural numbers as
type indexes. However, generalizing the classical sequent calculus to full-spectrum
dependent types that allow for type dependency on run-time values of arbitrary other
types is much more challenging.
For example, in terms of the λlet natural deduction language, consider the
following dependent version of the Let rule:
Γ ` v : A Γ, x : A ` v′ : B
Γ ` letx = v in v′ : B {v/x} Letdep
The dependency comes from the fact that in the premise, the variable x can be seen in
both the term v′ as well as in the consequence B. So in the conclusion of the inference
rule, we bind v to x in the term v′ and also substitute v for x in the consequence
B {v/x}.
Unfortunately, this rule does not fit well with the commuting conversions of the
λlet -calculus. For example, in the call-by-value V instance, we have the following ccV
430
rule for re-associating nested let expressions:
let y = (letx = v in v′) in v′′ →ccV letx = v in (let y = v′ in v′′)
This kind of reduction is essential theoretically for the correspondence between
the intuitionistic sequent calculus and natural deduction, and also has practical
applications in optimizing compilers for functional programming languages. However,
this reduction step is highly dubious when combined with the dependent Letdep typing
rule, since it changes the types of terms. Before reduction, suppose that we have the
following typing derivation using Letdep:
Γ ` v : A Γ, x : A ` v′ : B
Γ ` letx = v in v′ : B {v/x} Letdep Γ, y : B {v/x} ` v′′ : C
Γ ` let y = (letx = v in v′) in v′′ : C {(letx = v in v′)/y} Letdep
Yet, after reduction, the re-associated typing derivation sees different dependencies
on values in types:
Γ ` v : A
Γ, x : A ` v′ : B Γ, x : A, y : B ` v′′ : C
Γ, x : A ` let y = v′ in v′′ : C {v′/y} Letdep
Γ ` letx = v in let y = v′ in v′′ : C {v′/y} Letdep
In contrast, in the sequent calculus we do not have this form of elimination
rule, which pairs an object of interest with some observation on it, but rather input-
independent left rules. The let expression above corresponds to an input abstraction
typed by the left rule
c : (Γ, x : A ` ∆)
Γ | µ˜x.c : A ` ∆ AL
which is then later paired with its input via a Cut as in
Γ′ ` v : A | ∆′
c : (Γ, x : A ` ∆)
Γ | µ˜x.c : A ` ∆ AL
〈v||µ˜x.c〉 : (Γ,Γ′ ` ∆,∆′) Cut
The “return type” of the binding, which was explicitly known in the let expression in
natural deduction, instead implicitly found in the output environment ∆ of the co-term
µ˜x.c. Forming a full-spectrum dependently typed sequent calculus would therefore
431
involve a dependent Cut rule that ties together the (co-)values of the two sides of a
cut to be referenced in the environments of the other side. Finding a solution to this
problem would lead to a dependent type theory that is more robust to extensions, being
able to accomodate effect like the control implicit in the classical sequent calculus,
as well as handling additional simplifications like the commuting conversions above
without altering types in troublesome ways.
Exotic types
Our exploration of the sequent calculus naturally brought up types like negation
(∼A and ¬A) and the disjunctive co-data type (A ` B), and the subtraction data
type (A−B) which are normally unheard of in programming. Unsurprisingly, these
were also exactly the sorts of types that were disregarded in the correspondence
between the (classical and intuitionistic) sequent calculus and natural deduction in
Chapter IX. Each of these types requires having more than one consequence to a
constructer or observer, which does not have a nice, natural analogue in the language
of the λ-calculus.
It would be interesting to find out if all of these types can be represented naturally
in the λ-calculus, and to see if any are useful for ordinary programming or are analogous
to other programming features. For example, subtraction types have been used to
model delimited control (Ariola et al., 2009b). Are par (`) types also related to
delimited control? Do the involutive negation types give a practical model of call
stacks (Munch-Maccagnoni, 2014) in the implementation of functional programming?
Effects
Languages based on the classical sequent calculus include an inherent notion of
first-class control effect without adding anything extra. However, there are many more
kinds of effects that arise in programming languages which could be analysed from
the sequent calculus point of view.
General recursion
In this dissertation, we considered recursion as a language feature, but only
the well-founded kind which could not lead to infinite loops. But general purpose
programming languages include general recursion, which accepts the possibility for
infinite loops in exchange for greater expressive power. General recursion could be
432
added to the µµ˜ sequent calculus by including two new types of general binders,
νx.V and ν˜α.E, which give a self-referential name to values and co-values. These
self-reference abstractions can be unrolled at run-time as follows:
(νS) νx.V νS V {(νx.V )/x} (ν˜S) ν˜α.E ν˜S E {[ν˜α.E]/α}
where we restrict the abstractions to (co-)values so that they are valid for substitute
according to the chosen strategy. Because of this value restriction imposed at run-time,
we get the following two dual typing rules for general recursion:
Γ, x : A `ΘG V : A | ∆
Γ `ΘG νx.V : A | ∆
RecR
Γ | E : A `ΘG α : A,∆
Γ | ν˜α.E : A `ΘG ∆
RecL
Note that the general form of recursion can be seen as subsuming the well-founded
recursion considered here by translating the recursive size abstractions to a general self-
reference abstraction over the non-recursive pattern match as follows for the ∀j < N.A
and ∃j < N.A types:
µ(j<N @x α.c) = νx.µ(j<N @ α.c)
µ˜[j<N @α x.c] = ν˜α.µ˜[j<N @ x.c]
The above encoding of well-founded recursion as general recursion is sensible in any
focalizing strategy because then the case abstractions are always (co-)values. This
is another example where the correct use of data or co-data gives the appropriate
evaluation order independently of the chosen evaluation strategy of the language at
large.
The syntactic theory of general recursion in the sequent calculus is fairly
straightforward since we must already be care of computations which don’t return to
their caller because of control effects, so the developments in Chapters V, VI, VIII,
and IX are largely unaffected by their inclusion. The main point of interest is the
semantic analysis in Chapter VII. As-is, the recursive typing rules RecR and RecL
are not sound with respect to the model in Chapter VII. To be expected, general
recursion imposes extra demands on such models, such as some form of continuity
property of the safety condition in question which lets us justify that the recursive
fixed points are safe because all their approximations (i.e. finite unrollings) are safe.
433
Both Pitts (2000) and Mellies & Vouillon (2005) give orthogonality-based models that
incorporate general recursion. It would be interesting to isolate the minimal extra
assumption that is needed to include general recursive fixed points to the parametric
orthogonality models considered here.
Delimited control
The µµ˜ sequent calculus inherits a form of classical control effects corresponding
to from classical logic (Griffin, 1990), however, there is a more expressive family of
control effects known as delimited control (Felleisen, 1988). Whereas classical control
allows for the first-class manipulation of evaluation contexts, delimited control adds
the ability to delimit the scope of that control, effectively breaking down the evaluation
context into independent segments that can be handled separately. It turns out this
extra functionality radically increases the expressive power of control, to the point
that any monadic effect can be encoded directly in a call-by-value functional language
using delimited control (Filinski, 1994, 1999), giving a form of user-defined effects.
We might then ask, what does delimited control look like in the sequent calculus?
One way to extend the sequent calculus with delimited control effects is to add a
dynamic, as opposed to static, co-variable t̂p (Munch-Maccagnoni, 2014), short for
the “top-level”, in analogy to the type-theoretic account of delimited control in the
λ-calculus (Ariola et al., 2009b). The main idea of the dynamic co-variable acts as a
delimiter by making the η law for the t̂p abstraction
µt̂p.
〈
V
∣∣∣∣∣∣t̂p〉→ V
fire even if t̂p is “free” in V , but only when V is a value. Another approach would
be to collapse the distinction between terms and commands based on the syntactic
relaxation of the λµ-calculus (Herbelin & Ghilezan, 2008; Downen & Ariola, 2014a).
By collapsing terms and commands, we can effectively “stack” many pending co-values
in a single command with a chain of cuts as in
〈〈〈〈v||α〉||β〉||γ〉||δ〉
so that each cut acts as a delimiter separating the co-values α, β, γ, and δ which are
seen by the active term v in that order.
434
Both of these approaches have some unfortunate weaknesses in their current state,
however. First, both views of delimited control present an rather asymmetric language
construct, which goes against the symmetry of the sequent calculus seen in every other
language feature. Why is there only a dynamic co-variable, and no dynamic variable
with the dual behavior? And why are only terms collapsed with commands, instead of
both terms and co-terms, so that the pending stack of cuts can grow in both directions
simultaneously? Both of these extensions to dualize delimited control would bring
it more into the dual character of the sequent calculus, and avoid our natural bias
from the λ-calculus to only think of terms first while co-terms are an afterthought at
best. Second, the classical µµ˜-calculus has a canonical type system that corresponds
to the LK system of classical logic. However, the type system for t̂p in the sequent
calculus given by Munch-Maccagnoni (2014) is, while also canonical, too restrictive to
make any interesting use of extra capabilities of the delimited control effect. And the
approach of collapsing commands and terms by Herbelin & Ghilezan (2008); Downen
& Ariola (2014a) is effectively untyped. While there are expressive type systems for
delimited control, they are behavioral, meaning they depend on the specific evaluation
order of the language, and are type-and-effect systems that are designed specifically
for those programming languages rather than coming from logic like LK or the λ- and
λµ-calculi. This poses the question: what is the logical concept that corresponds with
delimited control, and how does it integrate with duality?
Algebraic effect handlers
Another technique for adding user-defined effects to a language is with algebraic
effect handlers (Cartwright & Felleisen, 1994; Pretnar, 2010; Bauer & Pretnar, 2015).
Effect handlers act as one or more “administrators” that handle requests for primitive
effectful operations and dictate how to evaluate them at run-time. For example, we
could describe mutable state (with a single reference cell) by the get and put operations.
A program could then request to get the current state or put a new one, and the handler
of that program is responsible for interpreting those requests and putting them to
action on the stateful reference cell.
Data and co-data in the sequent calculus may be seen as a way of describing
the interface of interaction between the program and the administrator. In particular,
the dual nature of data and co-data suggests two dual ways of describing the third
party that defines an effect: third-party as an administrator, where there is an external,
435
abstract definition for the behavior of primitive operations and the program must yield
control of execution when reaching an effectful operation, or third-party as a resource,
where the program carries around extra, external information that may be scrutinized
in order to perform arbitrary effect-dependent behavior. The sequent calculus already
gives a rich language for interacting with contexts described as structures and abstract
processes; extending the sequent calculus with a notion of algebraic effects would
extend this language to programmable computational effects, similar to Felleisen’s
proposal for abstract continuations (Felleisen et al., 1988).
The fact that both these interpretations of algebraic effect handlers rely on some
third-party is reminiscent of the approach to delimited control of directly chaining
many Cuts by collapsing terms or co-term with commands. The essence appears to
be that, while only one party is needed to represent general recursion and two parties
are needed to represent classical control, other effects like delimited control or effect
handlers need (at least) three parties to carry out effectful computation. This also
suggests an approach to generalizing the orthogonality models from Chapter VII to
give a unified framework for admitting other kinds of effects. If we can compose more
than two entities into the elements of a computational pole, then we could model the
inclusion of heaps and handlers in the state of computation needed for representing
effects like mutable state and exceptions.
Linearity
While this dissertation has primarily been about languages based on classical
logic, there is still an inherent implicit connection to linear logic (Girard, 1987). In
particular, the dual roles of construction and deconstruction via patterns and pattern
matching gives yet another view of the logical rules of linear logic. Indeed, we used
this analogy to name the two different data and co-data forms of conjunction (⊗ and
&) and disjunction (⊕ and `) based on this connection with the multiplicative and
additive connectives of linear logic. However, since we assume that the structural rules
of weakening and contraction are always valid, we end up with a system equivalent
to classical logic in terms of provability (the ability to build a derivation for a given
judgement). To build a stronger connection with linear logic, we would need a more
refined type system which lets us rule out certain applications of structural rules.
From the standpoint of programming languages, incorporating linearity in the
type system would give a more general account of purity than was given in Chapter IX.
436
That is, linearity as a programming feature is about restricting effects, rather than
enabling them. In particular, we established a pure subset of the classical sequent
calculus (based on Gentzen’s LK) by restricting everything to exactly one consequence
(as in Gentzen’s LJ). The effect of this restriction is to ensure that every co-value
is used exactly once during execution (barring other effects like infinite loops). In
contrast, linear logic allows for multiple consequences while still maintaining this same
run-time invariant, as noted by Girard (1987). Thus, we can still enforce properties
like referential transparency while also accommodating more exotic types like A`B
which step outside the normal functional paradigm. A language based on Girard’s
(1993) logic of unity—which integrates and subsumes each of classical, intuitionistic,
and linear logics—would let us safely integrate first-class control with pure functional
programming, combining the advantages of both.
437
REFERENCES CITED
Abadi, Martín, & Cardelli, Luca. (1996). A theory of objects. 1st edn. Secaucus, NJ,
USA: Springer-Verlag New York, Inc.
Abel, Andreas. (2006). A polymorphic lambda calculus with sized higher-order types.
Ph.D. thesis, Ludwig-Maximilians-Universität München.
Abel, Andreas, Pientka, Brigitte, Thibodeau, David, & Setzer, Anton. (2013).
Copatterns: Programming infinite structures by observations. Pages 27–38 of:
Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on
principles of programming languages. POPL ’13. New York, NY, USA: ACM.
Abel, Andreas M., & Pientka, Brigitte. (2013). Wellfounded recursion with
copatterns: A unified approach to termination and productivity. Pages 185–196
of: Proceedings of the 18th ACM SIGPLAN international conference on
functional programming. ICFP ’13. New York, NY, USA: ACM.
Andreoli, Jean-Marc. (1992). Logic programming with focusing proofs in linear logic.
Journal of logic and computation, 2(3), 297–347.
Appel, Andrew W. (1992). Compiling with continuations. New York, NY, USA:
Cambridge University Press.
Ariola, Zena M., & Felleisen, Matthias. (1997). The call-by-need lambda calculus.
Journal of functional programming, 7(3), 265–301.
Ariola, Zena M., & Herbelin, Hugo. (2003). Minimal classical logic and control
operators. Pages 871–885 of: Baeten, Jos C. M., Lenstra, Jan Karel, Parrow,
Joachim, & Woeginger, Gerhard J. (eds), Automata, languages and
programming: 30th international colloquium. ICALP 2003. Berlin, Heidelberg:
Springer Berlin Heidelberg.
Ariola, Zena M., & Herbelin, Hugo. (2008). Control reduction theories: The benefit
of structural substitution. Journal of functional programming, 18(3), 373–419.
Ariola, Zena M., Maraist, John, Odersky, Martin, Felleisen, Matthias, & Wadler,
Philip. (1995). A call-by-need lambda calculus. Pages 233–246 of: Proceedings
of the 22nd ACM SIGPLAN-SIGACT symposium on principles of programming
languages. POPL ’95. New York, NY, USA: ACM.
Ariola, Zena M., Bohannon, Aaron, & Sabry, Amr. (2009a). Sequent calculi and
abstract machines. ACM transactions on programming languages and systems,
31(4), 13:1–13:48.
438
Ariola, Zena M., Herbelin, Hugo, & Sabry, Amr. (2009b). A type-theoretic
foundation of delimited continuations. Higher-order and symbolic computation,
22(3), 233–273.
Ariola, Zena M., Herbelin, Hugo, & Saurin, Alexis. (2011). Classical call-by-need and
duality. Pages 27–44 of: Typed lambda calculi and applications: 10th
international conference. TLCA’11. Berlin, Heidelberg: Springer Berlin
Heidelberg.
Ariola, Zena M., Downen, Paul, Herbelin, Hugo, Nakata, Keiko, & Saurin, Alexis.
(2012). Classical call-by-need sequent calculi: The unity of semantic artifacts.
Pages 32–46 of: Schrijvers, Tom, & Thiemann, Peter (eds), Functional and logic
programming: 11th international symposium. Lecture Notes in Computer
Science, vol. 7294. Berlin, Heidelberg: Springer Berlin Heidelberg.
Barbanera, Franco, & Berardi, Stefano. (1994). A symmetric lambda calculus for
“classical” program extraction. Pages 495–515 of: Proceedings of the
international conference on theoretical aspects of computer software. TACS ’94.
London, UK, UK: Springer-Verlag.
Barendregt, Hendrik Pieter. (1985). The lambda calculus: Its syntax and semantics.
Studies in Logic and the Foundations of Mathematics. Amsterdam, New-York,
Oxford: North-Holland.
Bauer, Andrej, & Pretnar, Matija. (2015). Programming with algebraic effects and
handlers. Journal of logical and algebraic methods in programming, 84(1),
108–123.
Cardelli, Luca, Martini, Simone, Mitchell, John C., & Scedrov, Andre. (1994). An
extension of system F with subtyping. Information and computation, 109(1),
4–56.
Carraro, Alberto, Ehrhard, Thomas, & Salibra, Antonino. (2012). The stack calculus.
Pages 93–108 of: Proceedings seventh workshop on logical and semantic
frameworks, with applications. LSFA 2012.
Cartwright, Robert, & Felleisen, Matthias. (1994). Extensible denotational language
specifications. Pages 244–272 of: Hagiya, Masami, & Mitchell, John C. (eds),
Theoretical aspects of computer software: International symposium. TACS ’94.
Berlin, Heidelberg: Springer Berlin Heidelberg.
Church, Alonzo. (1932). A set of postulates for the foundation of logic. Annals of
mathematics, 33(2), 346–366.
Church, Alonzo. (1936). An unsolvable problem of elementary number theory.
American journal of mathematics, 58(2), 345–363.
439
Coq 8.4. (2012). The Coq proof assistant reference manual. version 8.4 edn. INRIA.
Coquand, Thierry. (1985). Une théorie des constructions. Ph.D. thesis, Université
Paris 7.
Cousineau, Guy, Curien, Pierre-Louis, & Mauny, Michel. (1987). The categorical
abstract machine. Science of computer programming, 8(2), 173–202.
Curien, Pierre-Louis, & Herbelin, Hugo. (2000). The duality of computation. Pages
233–243 of: Proceedings of the fifth ACM SIGPLAN international conference on
functional programming. ICFP ’00. New York, NY, USA: ACM.
Curien, Pierre-Louis, & Munch-Maccagnoni, Guillaume. (2010). The duality of
computation under focus. Pages 165–181 of: Calude, Cristian S., & Sassone,
Vladimiro (eds), Theoretical computer science: 6th IFIP TC 1/WG 2.2
international conference, TCS 2010, held as part of WCC 2010. TCS 2010.
Berlin, Heidelberg: Springer Berlin Heidelberg.
Curry, Haskell B., Feys, Robert, & Craig, William. (1958). Combinatory logic. Vol. 1.
North-Holland Publishing Company.
Cytron, Ron, Ferrante, Jeanne, Rosen, Barry K., Wegman, Mark N., & Zadeck,
F. Kenneth. (1991). Efficiently computing static single assignment form and the
control dependence graph. ACM transactions on programming languages and
systems, 13(4), 451–490.
Danos, Vincent, Joinet, Jean-Baptiste, & Schellinx, Harold. (1997). A new
deconstructive logic: Linear logic. Journal of symbolic logic, 62(3), 755âĂŞ–807.
David, Rene, & Py, Walter. (2001). λµ-calculus and Bohm’s theorem. Journal of
symbolic logic, 66(1), 407–413.
Davis, Martin. (2004). The undecidable: Basic papers on undecidable propositions,
unsolvable problems, and computable functions. Dover Publication, Incorporated.
de Bruijn, Nicolaas. (1968). AUTOMATH, a language for mathematics. Tech. rept.
66-WSK-05. Technological University Eindhoven.
Di Cosmo, Roberto. (1995). Isomorphisms of types: From λ-calculus to information
retrieval and language design. Basel, Switzerland: Birkhauser Verlag.
Downen, Paul, & Ariola, Zena M. (2012). A systematic approach to delimited
control with multiple prompts. Pages 234–253 of: Seidl, Helmut (ed),
Programming languages and systems: 21st European symposium on
programming, ESOP 2012, held as part of the European joint conferences on
theory and practice of software, ETAPS 2012. Lecture Notes in Computer
Science, vol. 7211. Springer Berlin Heidelberg.
440
Downen, Paul, & Ariola, Zena M. (2014a). Compositional semantics for composable
continuations: From abortive to delimited control. Pages 109–122 of:
Proceedings of the 19th ACM SIGPLAN international conference on functional
programming. ICFP ’14. New York, NY, USA: ACM.
Downen, Paul, & Ariola, Zena M. (2014b). Delimited control and computational
effects. Journal of functional programming, 24, 1–55.
Downen, Paul, & Ariola, Zena M. (2014c). The duality of construction. Pages
249–269 of: Shao, Zhong (ed), Programming languages and systems: 23rd
European symposium on programming, ESOP 2014, held as part of the European
joint conferences on theory and practice of software, ETAPS 2014. Lecture
Notes in Computer Science, vol. 8410. Springer Berlin Heidelberg.
Downen, Paul, Maurer, Luke, Ariola, Zena M., & Varacca, Daniele. (2014).
Continuations, processes, and sharing. Pages 69–80 of: Proceedings of the 16th
international symposium on principles and practice of declarative programming.
PPDP ’14. New York, NY, USA: ACM.
Downen, Paul, Johnson-Freyd, Philip, & Ariola, Zena M. (2015). Structures for
structural recursion. Pages 127–139 of: Proceedings of the 20th ACM SIGPLAN
international conference on functional programming. ICFP ’15. New York, NY,
USA: ACM.
Downen, Paul, Maurer, Luke, Ariola, Zena M., & Peyton Jones, Simon. (2016).
Sequent calculus as a compiler intermediate language. Pages 74–88 of:
Proceedings of the 21st ACM SIGPLAN international conference on functional
programming. ICFP ’16. New York, NY, USA: ACM.
Dummett, Michael. (1991). The logical basis of methaphysics. The William James
lectures, 1976. Harvard University Press, Cambridge, Massachusetts.
Felleisen, Matthias. (1991). On the expressive power of programming languages.
Science of computer programming, 17(1-3), 35–75.
Felleisen, Matthias, & Friedman, Daniel P. (1986). Control operators, the SECD
machine, and the λ-calculus. Pages 193–219 of: Proceedings of the IFIP TC
2/WG2.2 working conference on formal descriptions of programming concepts
part III.
Felleisen, Matthias, & Hieb, Robert. (1992). The revised report on the syntactic
theories of sequential control and state. Theoretical computer science, 103(2),
235–271.
441
Felleisen, Matthias, Wand, Mitch, Friedman, Daniel, & Duba, Bruce. (1988).
Abstract continuations: A mathematical semantics for handling full jumps.
Pages 52–62 of: Proceedings of the 1988 ACM conference on LISP and
functional programming. LFP ’88. New York, NY, USA: ACM.
Felleisen, Mattias. (1988). The theory and practice of first-class prompts. Pages
180–190 of: Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on
principles of programming languages. POPL ’88. New York, NY, USA: ACM.
Filinski, Andrzej. (1989). Declarative continuations and categorical duality. M.Phil.
thesis, Computer Science Department, University of Copenhagen.
Filinski, Andrzej. (1994). Representing monads. Pages 446–457 of: Proceedings of
the 21st ACM SIGPLAN-SIGACT symposium on principles of programming
languages. POPL ’94. New York, NY, USA: ACM.
Filinski, Andrzej. (1999). Representing layered monads. Pages 175–188 of:
Proceedings of the 26th ACM SIGPLAN-SIGACT symposium on principles of
programming languages. POPL ’99. New York, NY, USA: ACM.
Fischer, Michael J. (1993). Lambda-calculus schemata. Lisp and symbolic
computation, 6(3-4), 259–288.
Gentzen, Gerhard. (1935a). Untersuchungen über das logische schließen. I.
Mathematische zeitschrift, 39(1), 176–210.
Gentzen, Gerhard. (1935b). Untersuchungen über das logische schließen. II.
Mathematische zeitschrift, 39(1), 405–431.
Giménez, Eduardo. (1996). Un calcul de constructions infinies et son application a la
vérification dd systèmes communicants. Ph.D. thesis, Ecole Normale Supérieure
de Lyon.
Girard, Jean-Yves. (1971). Une extension de l’interprétation de Gödel à l’analyse, et
son application à l’élimination de coupures dans l’analyse et la théorie des types.
Pages 63–92 of: Fenstad, J. E. (ed), Proceedings of the 2nd Scandinavian logic
symposium. North-Holland.
Girard, Jean-Yves. (1987). Linear logic. Theoretical computer science, 50(1), 1–101.
Girard, Jean-Yves. (1991). A new constructive logic: Classical logic. Mathematical
structures in computer science, 1(3), 255–296.
Girard, Jean-Yves. (1993). On the unity of logic. Annals of pure and applied logic,
59(3), 201–217.
442
Girard, Jean-Yves. (2001). Locus solum: From the rules of logic to the logic of rules.
Mathematical structures in computer science, 11(3), 301–506.
Girard, Jean-Yves, Taylor, Paul, & Lafont, Yves. (1989). Proofs and types. New
York, NY, USA: Cambridge University Press.
Gödel, Kurt. (1934). On undecidable propositions of formal mathematical systems.
Published in Davis (2004). Lecture notes taken by Stephen C. Kleene and J.
Barkley Rosser.
Gödel, Kurt. (1980). On a hitherto unexploited extension of the finitary standpoint.
Journal of philosophical logic, 9(2), 133–142.
Graham-Lengrand, Stéphane. (2015). The Curry-Howard view of classical logic.
Lecture notes for the Master Parisien de Recherche en Informatique (MPRI).
Griffin, Timothy G. (1990). A formulae-as-types notion of control. Pages 47–58 of:
Proceedings of the 17th ACM SIGPLAN-SIGACT symposium on principles of
programming languages. POPL ’90. New York, NY, USA: ACM.
Hagino, Tatsuya. (1987). A typed lambda calculus with categorical type
constructors. Pages 140–157 of: Pitt, David H., Poigné, Axel, & Rydeheard,
David E. (eds), Category theory and computer science. Berlin, Heidelberg:
Springer Berlin Heidelberg.
Hagino, Tatsuya. (1989). Codatatypes in ML. Journal of symbolic computation, 8(6),
629–650.
Herbelin, Hugo. (2005). C’est maintenant qu’on calcule : Au coeur de la dualité.
Habilitation thesis, Université Paris 11.
Herbelin, Hugo, & Ghilezan, Silvia. (2008). An approach to call-by-name delimited
continuations. Pages 383–394 of: Proceedings of the 35th annual ACM
SIGPLAN-SIGACT symposium on principles of programming languages.
POPL ’08. New York, NY, USA: ACM.
Herbelin, Hugo, & Zimmermann, Stéphane. (2009). An operational account of
call-by-value minimal and classical λ-calculus in “natural deduction” form.
Pages 142–156 of: Curien, Pierre-Louis (ed), Typed lambda calculi and
applications: 9th international conference. TLCA 2009. Berlin, Heidelberg:
Springer Berlin Heidelberg.
Howard, William A. (1980). The formulae-as-types notion of constructions. Pages
479–490 of: To H.B. Curry: Essays on combinatory logic, lambda calculus and
formalism. Academic Press. Unpublished manuscript of 1969.
Huet, Gérard. (1997). The zipper. Journal of functional programming, 7(5), 549–554.
443
Johnson-Freyd, Philip, Downen, Paul, & Ariola, Zena M. (2016). First class call
stacks: Exploring head reduction. Proceedings of the workshop on continuations,
WoC 2016, London, UK, April 12th 2015. EPTCS, vol. 212.
Johnson-Freyd, Philip, Downen, Paul, & Ariola, Zena M. (2017). Call-by-name
extensionality and confluence. Journal of functional programming, 27, e12.
Kay, Alan C. (1993). The early history of Smalltalk. Pages 69–95 of: The second
ACM SIGPLAN conference on history of programming languages. HOPL-II.
New York, NY, USA: ACM.
Kelsey, Richard, Clinger, William, & et al., Jonathan Rees. (1998). Revised5 report
on the algorithmic language Scheme. Higher-order and symbolic computation,
11(1), 7–105.
Kennedy, Andrew. (2007). Compiling with continuations, continued. Pages 177–190
of: Proceedings of the 12th ACM SIGPLAN international conference on
functional programming. ICFP ’07. New York, NY, USA: ACM.
Klop, Jan Willem, & de Vrijer, Roel C. (1989). Unique normal forms for lambda
calculus with surjective pairing. Information and computation, 80(2), 97–113.
Krivine, Jean-Louis. (2007). A call-by-name lambda-calculus machine. Higher-order
and symbolic computation, 20(3), 199–207.
Krivine, Jean-Louis. (2009). Realizability in classical logic. Panoramas et synthèses,
27, 197–229.
Lambek, Joachim, & Scott, Philip J. (1986). Introduction to higher order categorical
logic. New York, NY, USA: Cambridge University Press.
Laurent, Olivier. 2002 (Mar.). Étude de la polarisation en logique. Ph.D. thesis,
Université de la Méditerranée - Aix-Marseille II.
Lengrand, Stéphane, & Miquel, Alexandre. (2008). Classical Fω, orthogonality and
symmetric candidates. Annals of pure and applied logic, 153(1), 3–20.
Levy, Paul Blain. (2001). Call-by-push-value. Ph.D. thesis, Queen Mary and
Westfield College, University of London.
Levy, Paul Blain. (2003). Call-by-push-value: A functional/imperative synthesis.
Semantics Structures in Computation, vol. 2. Springer Netherlands.
Lindley, Sam. (2005). Normalisation by evaluation in the compilation of typed
functional programming languages. Ph.D. thesis, University of Edinburgh,
College of Science and Engineering, School of Informatics.
444
Liskov, Barbara. (1987). Keynote address - data abstraction and hierarchy. Pages
17–34 of: Addendum to the proceedings on object-oriented programming systems,
languages and applications (addendum). OOPSLA ’87. New York, NY, USA:
ACM.
Maraist, John, Odersky, Martin, & Wadler, Philip. (1998). The call-by-need lambda
calculus. Journal of functional programming, 8(3), 275–317.
Martin-Löf, Per. (1975). An intuitionistic theory of types: Predicative part. Pages
73–118 of: Logic colloquium ’73. Studies in Logic and the Foundations of
Mathematics, vol. 80. Amsterdam, The Netherlands: North-Holland.
Martin-Löf, Per. (1982). Constructive mathematics and computer programming.
Pages 153–175 of: Proceedings of the sixth international congress for logic,
methodology, and philosophy of science. Amsterdam, The Netherlands:
North-Holland.
Martin-Löf, Per. (1998). An intuitionistic theory of types. Pages 127–172 of:
Twenty-five years of constructive type theory. Oxford Logic Guides, vol. 36.
Oxford, United Kingdom: Clarendon Press.
Maurer, Luke, Downen, Paul, Ariola, Zena M., & Peyton Jones, Simon. (2017).
Compiling without continuations. Pages 482–494 of: Proceedings of the 38th
ACM SIGPLAN conference on programming language design and
implementation. PLDI ’17. New York, NY, USA: ACM.
Mellies, Paul-Andre, & Vouillon, Jerome. (2005). Recursive polymorphic types and
parametricity in an operational framework. Pages 82–91 of: Proceedings of the
20th annual IEEE symposium on logic in computer science. LICS ’05.
Washington, DC, USA: IEEE Computer Society.
Mendler, Nax P. (1988). Inductive definition in type theory. Ph.D. thesis, Cornell
University.
Moggi, Eugenio. (1989). Computational lambda-calculus and monads. Pages 14–23
of: Proceedings of the fourth annual symposium on logic in computer science.
Piscataway, NJ, USA: IEEE Press.
Munch-Maccagnoni, Guillaume. (2009). Focalisation and classical realisability. Pages
409–423 of: Grädel, Erich, & Kahle, Reinhard (eds), Computer science logic:
23rd international workshop, CSL 2009, 18th annual conference of the EACSL.
CSL 2009. Berlin, Heidelberg: Springer Berlin Heidelberg.
Munch-Maccagnoni, Guillaume. (2013). Syntax and models of a non-associative
composition of programs and proofs. Ph.D. thesis, Université Paris Diderot.
445
Munch-Maccagnoni, Guillaume. (2014). Formulae-as-types for an involutive negation.
Pages 70:1–70:10 of: Proceedings of the joint meeting of the twenty-third EACSL
annual conference on computer science logic (CSL) and the twenty-ninth annual
ACM/IEEE symposium on logic in computer science (LICS). CSL-LICS ’14.
New York, NY, USA: ACM.
Munch-Maccagnoni, Guillaume, & Scherer, Gabriel. (2015). Polarised intermediate
representation of lambda calculus with sums. Pages 127–140 of: 2015 30th
annual ACM/IEEE symposium on logic in computer science. LICS 2015.
Ohori, Atsushi. (1999). The logical abstract machine: A Curry-Howard isomorphism
for machine code. Pages 300–318 of: Middeldorp, Aart, & Sato, Taisuke (eds),
Functional and logic programming: 4th Fuji international symposium.
FLOPS ’99. Berlin, Heidelberg: Springer Berlin Heidelberg.
Ohori, Atsushi. (2003). Register allocation by proof transformation. Pages 399–413
of: Degano, Pierpaolo (ed), Programming languages and systems: 12th European
symposium on programming, ESOP 2003 held as part of the joint European
conferences on theory and practice of software, ETAPS 2003. ESOP 2003.
Berlin, Heidelberg: Springer Berlin Heidelberg.
Oury, Nicolas. (2008). Coinductive types and type preservation. Message on the
Coq-club mailing list.
Parigot, Michel. (1992). λµ-calculus: An algorithmic interpretation of classical
natural deduction. Pages 190–201 of: Voronkov, Andrei (ed), Logic
programming and automated reasoning: International conference. LPAR ’92.
Berlin, Heidelberg: Springer Berlin Heidelberg.
Peyton Jones, Simon, Tolmach, Andrew, & Hoare, Tony. (2001). Playing by the
rules: Rewriting as a practical optimisation technique in GHC. Haskell
workshop. ACM SIGPLAN.
Peyton Jones, Simon, Vytiniotis, Dimitrios, Weirich, Stephanie, & Washburn,
Geoffrey. (2006). Simple unification-based type inference for GADTs. Pages
50–61 of: Proceedings of the eleventh ACM SIGPLAN international conference
on functional programming. ICFP ’06. New York, NY, USA: ACM.
Peyton Jones, Simon L., & Launchbury, John. (1991). Unboxed values as first class
citizens in a non-strict functional language. Pages 636–666 of: Hughes, John
(ed), Functional programming languages and computer architecture: 5th ACM
conference. Berlin, Heidelberg: Springer Berlin Heidelberg.
Pfenning, Frank, & Davies, Rowan. (2001). A judgmental reconstruction of modal
logic. Mathematical structures in computer science, 11(4), 511–540.
446
Pierce, Benjamin C. (2002). Types and programming languages. Cambridge,
Massachusetts: The MIT Press.
Pitts, Andrew M. (1997). A note on logical relations between semantics and syntax.
Logic journal of IGPL, 5(4), 589–601.
Pitts, Andrew M. (2000). Parametric polymorphism and operational equivalence.
Mathematical structures in computer science, 10(3), 321–359.
Plotkin, Gordon D. (1975). Call-by-name, call-by-value and the λ-calculus.
Theoretical computer science, 1, 125–159.
Polonovski, Emmanuel. (2004). Explicit substitutions, logic and normalization. Ph.D.
thesis, Université Paris-Diderot - Paris VII.
Prawitz, Dag. (1974). On the idea of a general proof theory. Synthese, 27(1/2),
63–77.
Pretnar, Matija. (2010). The logic and handling of algebraic effects. Ph.D. thesis,
University of Edinburgh.
Reynolds, John C. (1974). Towards a theory of type structure. Pages 408–423 of:
Robinet, Bernard (ed), Programming symposium, proceedings colloque sur la
programmation. Lecture Notes in Computer Science, vol. 19. London, UK, UK:
Springer-Verlag.
Reynolds, John C. (1983). Types, abstraction and parametric polymorphism. Pages
513–523 of: Mason, R. E. A. (ed), Proceedings of the IFIP 9th world computer
congress. Information Processing 83. Amsterdam: Elsevier Science Publishers B.
V. (North-Holland).
Reynolds, John C. (1998). Definitional interpreters for higher-order programming
languages. Higher-order and symbolic computation, 11(4), 363–397.
Ronchi Della Rocca, Simona, & Paolini, Luca. (2004). The parametric λ-calculus: a
metamodel for computation. Springer-Verlag.
Sabry, Amr, & Felleisen, Matthias. (1992). Reasoning about programs in
continuation-passing style. Pages 288–298 of: Proceedings of the 1992 ACM
conference on LISP and functional programming. LFP ’92. New York, NY, USA:
ACM.
Sabry, Amr, & Felleisen, Matthias. (1993). Reasoning about programs in
continuation-passing style. Lisp and symbolic computation, 6(3-4), 289–360.
Sabry, Amr, & Wadler, Philip. (1997). A reflection on call-by-value. ACM
transactions on programming languages and systems (TOPLAS), 19(6), 916–941.
447
Schrijvers, Tom, Peyton Jones, Simon, Sulzmann, Martin, & Vytiniotis, Dimitrios.
(2009). Complete and decidable type inference for GADTs. Pages 341–352 of:
Proceedings of the 14th ACM SIGPLAN international conference on functional
programming. ICFP ’09. New York, NY, USA: ACM.
Selinger, Peter. (2001). Control categories and duality: On the categorical semantics
of the lambda-mu calculus. Mathematical structures in computer science, 11(2),
207–260.
Selinger, Peter. (2003). Some remarks on control categories. Unpublished
Manuscript.
Singh, Satnam, Peyton Jones, Simon, Norell, Ulf, Pottier, François, Meijer, Erik, &
McBride, Conor. (2011). Sexy types — are we done yet? Software Summit.
Turing, Alan M. (1936). On computable numbers, with an application to the
Entscheidungsproblem. Proceedings of the London mathematical society, 42(2),
230–265.
van Oostrom, Vincent. (1994). Confluence by decreasing diagrams. Theoretical
computer science, 126(2), 259–280.
Wadler, Philip. (2003). Call-by-value is dual to call-by-name. Pages 189–201 of:
Proceedings of the eighth ACM SIGPLAN international conference on functional
programming. New York, NY, USA: ACM.
Wadler, Philip. (2005). Call-by-value is dual to call-by-name, reloaded. Pages
185–203 of: Giesl, Jürgen (ed), Term rewriting and applications: 16th
international conference. RTA 2005. Berlin, Heidelberg: Springer Berlin
Heidelberg.
Wadler, Philip. (2015). Propositions as types. Communications of the ACM, 58(12),
75–84.
Wright, Andrew K., & Felleisen, Matthias. (1994). A syntactic approach to type
soundness. Information and computation, 115(1), 38–94.
Zeilberger, Noam. (2008a). Focusing and higher-order abstract syntax. Pages
359–369 of: Proceedings of the 35th annual ACM SIGPLAN-SIGACT symposium
on principles of programming languages. POPL ’08. New York, NY, USA: ACM.
Zeilberger, Noam. (2008b). On the unity of duality. Annals of pure and applied logic,
153(1), 660–96.
Zeilberger, Noam. (2009). The logical basis of evaluation order and pattern-matching.
Ph.D. thesis, Carnegie Mellon University.
448