|
|
||
|
|
Start of topic | Skip to actions
Notes from MEPLS meeting held 2008/09/06.
Predicate-based Testing, Verification, and other Formal Methods in Undergraduate Computer Science CurriculumSpeaker: Rex Page, OU.Verification consists of finding bugs in software (or hardware) by mathematical analysis of its code, which is really just How do we find people who can do formal verification? Incorporating formal methods to undergraduate courses: put the material in
Example of a 1-week module
Example walk-through of verification process
Who will do the work? (also, where? how?)
Q. Do you have experience with ACL2 or other verification tool in commercial environments? A. No. Can't do this Q. You said about "unduely complicated" formulas. What does that mean? A. I mean for example, C++ code is unduely complicated---writing things functionally tends to make it more amenable for analysis. Q. [...] A. I have been working with Ruben Gamboa and Matthias Felleisen to put a textbook together. Q. What exactly does the test (defproperty) generate? A. The test environment will generate random lists with random numbers. This is also used in commercial environments, e.g. Haskell Quick Check, Erlang's proprietary testing utility. Q. Is it better to do case analysis? A. Our ultimate goal is to prove it as a theorem. Scheme is advantageous here, because anyone can pick it up so fast. Comment from audience: we just use the test to increase confidence, weeding out the "obvious" bugs before investing in full-blown formal analysis. A Paradigm for Semi-Formal ProofsSpeaker: J.Nelson Rushton, Texas Tech U.Lightweight Formal Methods (LFM)
Essence of Approach ("Semi-formal Proofs")
Q. Do you mean inference relation = implications? A. Yes. Desired Features
Details
The semi-formal proofs are designed to look like prose proofs with careful annotation. Theorem: "There are infinitely many primes" (1 -> 7) 1. ... 2. ... (1) 3. ... (2, "fundamental theorem of arithmetic") : 7. ... (6, 3) The annotation (1 -> 7) means that the theorem holds from the assertion (1 -> 7). (6, 3) means "by 6 and 3". The check for structural integrity (consistency of the annotations) can be checked mechanically. Each listing and its immediate contingents can be turned into "slides" that allow the user to focus on individual steps of the proof. The individual steps are in prose and informal (though hopefully rigorous). Using the slides to read semi-formal proof is very different from prose proof. Words like "hence", "therefore" are clarified completely, esp. which definition or lemma it's referring to. Q. Isn't this also a property of tree-style proof system? A. Yes. It's actually a property of most other formal systems as well. My approach distinguishes itself in that the style is very close to prose proof while having this property. HypothesesShould we teach this method to students as the thing to use from the cradle to the grave? Tested this by giving students semi-formal and prose proofs, seeing if they can find errors in the proofs. They didn't even have enough proof-background that it was inconclusive. But personal experience is it makes it easier for students to learn proofs. The reason is that in this semi-formal style, we can get them slicing and dicing right from the get-go, which is how we best learn other things, like soccer (we just give them a ball to teach soccer, not the rule book). Q&A, commentsQ. Any traction with math dept.? A. No, not yet. They tend to avoid me. FortressOpen source project with intl participation 1.0 release has synchronized spec and implementation Sun people are expected to grow lang and lib to develop a compiler Language Design & FeaturesParallel language
Q. How do you control (limit) the parallelism? A. Research is going into how much and where parallelism is a win---this is easier than, say, for functional programming languages because the semantics from the ground-up is parallel. So we don't have to figure out where the parallelism is legal before assessing its influence on performance. Growable language (see Guy Steele's keynote)
Mathematical Syntax
Q. This looks more like Haskell type classes than OO. A. In a lot of ways, type classes has advantages over OO. It may turn out to be better ways to go about it, but current research focus is on finding OO-style type systems to encode mlutiple inheritance. Full unicode character set is available for use, including mathematical operators and Greek letters. Use of "funny characters" is under control of libs and therefore users. Q. How do you type them? A. You can use ASCII simulations or unicode character names, for example. Q. Why not use LaTeX? or develop a better version? A. We did, actually. More later. More infovisit projectfortress.sun.com
Librarylibrary is now > 10,000 linesSample code 1: skip lists by Michael Spiegel of U of Virginia [see slides, if available online] Tools`Fortify' code formatter sum: RR64 := 0 -> LaTeX? code -> fed to LaTeX? -> rendered Fortress mode for emacs uses Fortify to render as the user types. However, Sun is mostly focused on getting the compiler and libraries in place. Q. This is already making the language big. Is this in line with "growable"? A. Ultimately, all of this syntax mambo jumbo should be moved into the library. Fortress 1.0Implementation expanded and made more reliable, synced with spec, etc. Automated Testing During Spec Build:
What works NOW:
\Pi is executed with a tree-like semantics: \Pi i (factorial) is a tree with branches tagged with * and leaves with numbers. They are merged in parallel and the implementation automatically balances the tree.
Next Steps:
Initially targeted to JVM for full multithreaded platform independence, but eventually must develop a dedicated VM. Other related works: X10, Chapel (they have taken very different turns) X10 is more strongly based on Java, and Chapel is more strongly based on C; they are more specifically focused on the problem of parallel programming. Q. Transaction memory A. TM may or not be employed by Fortress eventually. But we don't know how fast TM can be made and other implementations may or may not turn out to be more advantageous, depending on the application. SequenceL? and the MulticoreSpeaker: Daniel E. Cooke, Texas Tech U.Changes in hardware = language opportunitiesToday we see multicore computers are becoming ubiquitous. Expressing parallel programs and compiling them are major challenges. SequenceL? focuses on micro-parallelism. TransparencyExample. Conventional matrix multiplication code is hard to read.In SequenceL?, it's just MM_{I,J} (matrix m1, m2) := Sum (m1 (I, all) * m2 (all, J)) m1 (I, all) is "all columns from row I" semanticseverything in the languages are built from:
consume-simplify-produceFind things you can do independently -> simplify -> put them back together to produce new state e.g. initial = (20-5) / (10-5) - CSP -> 15 / 5 - CSP -> 3 -> 3 (final) e.g. [(20-5) / (10-5), when, 10 - 5 Tableau works like a continuation as well. (factorial example) normalize transposeNormalization replicates operators for each element in a list. Transpose tranposes the generated normalized tableau, effectively applying the operators. fact ([2,3,4]) -> [fact (2), fact (3), fact (4)] N = [[fact, fact, fact], [2, 3, 4]] (normalization) T = [fact (2), fact (3), fact (4)] (transpose) Normalize transpose preserves nesting structure of the input by default. It also works on function bodies: Even (vector number) := number when number mod 2 = 0 [even, [2, 3, 4]] -> [2, 3, 4] when [2,3,4] mod 2 = 0 -> [2, 3 ,4] when [2 mod 2, 3 mod 2, 4 mod 2] = 0 -> [2, 3, 4] when [0, 1, 0] = 0 -> [2, 3, 4] when [0 = 0, 1 = 0, 0 = 0] -> [2, 3, 4] when [true, false, true] -> [2 when true, 3 when false, 4 when true] -> [2, empty, 4] -> [2, 4] Q. You can't overload operators (like =) over int and over int list? A. Actually you can. The normalization works until both arguments' types match. Used to implement free variabls: MM_{I,J} (matrix m1, m2) := Sum (m1 (I, all) * m2 (all, J)) I, J are free variables. Automatic parallelismSequenceL? code exposes parallelism in your program. SequenceL? translator generates multithreaded C code. Map-Reduce Example grep (vector Line, Key) = Line when and (Line [I, ..., I + size (Key) - 1]) = Key normalize-transpose effectively does Map-Reduce automatically. ScaleLanguage is concise (you can keep everything in your head) Not a toy language: wrote some NASA space shuttle control code. Further information"Normalize, transpose, and distribute: An automatic approach for handling nonscalars" Mar 2008 is probably the most complete reference as of now. Q. The code is very concise. Is anyone working on applying formal methods? A. A Ph.D student is working on using formal methods to verify SequenceL? code. Strategic Programming by Model InterpretationSpeaker: William R. Cook, UT AustinAcrobat API is huge:
Likewise for MS Word automation API Implementations usually have a certain strategy, which is informal and implicit so becomes a rigid component of the system. This is less than ideal. Other strategies
Q. Is it really necessary for SAP to have 50+ attributes etc? A. They are interfaces to relational DB, and it's at least better than association tables which is an uncontrolled alternative. Using Datamodels>||BoundedValue? x: Integer enforce min <= x and x <= max min: read-only Integer max: Integer ||< write generic operations on the model >|| Equal_{Tree} (a, b) return Equal_{Integer} (a.value, b.value) and Equal_{Tree} (a.left, b.left) and Equal_{Tree} (a.right, b.right) ||< Can we write a general strategy of equality? Pass type in as parameter. >|| Now instantiate the generic Equal by partial evaluation. Closely related to Polytypic Programming. Partially evaluating dataIf you use BoundedValue?, there's a clear strategy as to how you insert boundedness checks. Again, you can use partial evaluation for compiled languages. How do we create a factory that converts datamodel to message processor? Factory : Model -> Message -> Value The message interpreter takes a representation of a datamodel and generates a class that has the necessary getters and setters that check constraints etc. general strategy = interpreter specifics of problem = model partial evaluation, supercompilation Most model-driven research uses translators. For example yacc is a translator (code generator) and replacing that by partial evaluation Examples of real-world modelsweb (UI, Schema, db, request): HTML
Key problem is how to integrate multiple models. -> pass them in as arguments to a parametrized model Programming environmentSelf-contained, self-hosting A big designed to solve easy problems very quickly PhilosophyObject --reflection--> description is how it's usually doneDescription --interpretation + partial evaluation--> object Don't design your programs, but design your programs AsidesBackground of speaker: denotational semantics of inheritance, mixins, F-bounded polymorphism, ADT vs OOP e.g. web-based apps to track intermediate companies selling HP printersThe question: What is the relationship between ADT and objects? The untyped lambda-calculus is the first OO language! e.g. Smalltalk bools are isomorphic to Church bools. Q&AQ. What's the new thing provided by this approach? A. Provide tools to write your programs. Also program-writing tools can be made into libraries. We're spending too much time writing the general model-compliant parts; we should write the model and tweak the anomalies. As interpreters get more complex, it may also get out of hand; we'll see. Q. How much overlap is there with Lisp macros? A. Because the models are explicit, separate values, it can be passed to different interpreters, giving different code. This approach can hence easily integrate multiple languages. Q. Type-checking of partial evaluator? A. I'm avoiding it for now because I'm going for expressiveness first. But it will have to be done in the future. Domain Specific LanguagesSpeaker: Walid Taha, Rice U.General purpose programming languages try to be "one lang for everything". Results:
Solution: domain-specific languages
Examples of DSLs
Benefits of DSLs
Creating DSLsDesigning a DSL also gives a formalization of the problem domain. It's easier to analyze the DSL than its encoding in a general-purpose language. Selling a play/drama script: write a script-player! Work with a domain expert, and build them a DSL. Collaboration: we do the technical stuff and formal semantics, the domain expert worries about the domain. Make it easier for other computer scientists to build DSLs cheaply and correctly. e.g. DSLs look more like spec
We need extensible grammars, type systems, semantics, ...; better ways to structure interpreters to implement them quickly (monads and monad transformers) Think: is a visual language better? Should we build in reasoning facilities about some of the things being modeled? Domain-specific proof/specification languages? DSL programs look more like specifications than "programs" (which is an advantage). How do you check the soundness of the domain-specific proofs and theorems, that it's what the user wants? Make it more intuitive? Further informationConferences coming up on DSL:
Google for "domain-specific language" Q&A, commentsQ. Delineation between a language and a library? A. A library is a language. Q. I propose a distinction: certain growth can't be characterized as library. When you have a new informal specification. e.g. ZF is a library for predicate calculus but second-order calculus is a language extension. You show something and ask what it is. Different people say different things: snake, tree, wall. Everyone seems to be working on the same thing but with different emphases. Java Type Inference is Broken: Can We Fix It?Speaker: Daniel Smith (collaboration with Corky Cartwright), Rice U.Type inference in Java: programmer should be able to remove <> annotations and have the compiler infer them. The inference algorithm assigns type variables and generates constraints from the code, then solves it, much like Hindley-Milner style type inference. Each type parameter has an upper bound, and we must find a type satisfying the constraints. The JLS algorithm is broken:
Q. what's been reaction to discoveries of those bugs? A. They're aware of the bugs, but it's not a high-priority issue for them since it works for many uses for now. Q. what is a bug in this context? A. inconsistencies comment: some people at Sun are really interested in this work. Class subtyping example
Examples of bugs: `join' finds a least upper bound, but join (List<? extends Number>, List super Number>) = List You can write code that produces ClassCastException? when there's no cast Unsoundness/Incompleteness
Prohibited languages features
Proposals for Fixing
Disjunctive constraints
Eliminating Recursive References
Open Problems
Q&A CommentsYou began too slowly. You should immediately start with the problem statement. Q. How does the quality of error messages compare with and without those enhancements? A. Inference shouldn't affect type-error messages. -- JunInoue - 09 Sep 2008
Topic Actions: Edit | Attach | Printable | Raw View | Backlinks: Web, All Webs | History: r4 < r3 < r2 < r1 | More topic actions
Webs: Main | TWiki | Africa | EmbeddedSystems | Gpce | Houston | International | K12 | MetaOCaml | MulticoreOCR | ProgrammingLanguages | RAP | RIDL | Sandbox | SpeechClub | Teaching | Texbot | WG211 Web Actions: |
|
This work is licensed under a Creative Commons Attribution 2.5 License. Please follow our citation guidelines.