Scribe: Angela Zhu Date: 2007-04-09

Sorting Algorithms

Today Daniel and Jun talked about sorted lists in Concoqtion. For any sorting algorithm, which takes an unsorted list to a sorted list, two things need to be guaranteed about the output list. One is the order: the resulting list is monotonically non-decreasing. The other is that the output is a permutation of the input, meaning the lists before and after sorting have exactly the same elements with the same multiplicities.
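The two requirements can be written down as run-time checks in plain OCaml. This is only a sketch to make the specification concrete; the names is_sorted, remove_one and is_permutation are ours, not from the meeting, and the checks run dynamically rather than being enforced by types as in Concoqtion.

```ocaml
(* Order: every adjacent pair satisfies leq. *)
let rec is_sorted leq = function
  | [] | [ _ ] -> true
  | x :: (y :: _ as rest) -> leq x y && is_sorted leq rest

(* Remove one occurrence of x from a list, or return None if x is absent. *)
let rec remove_one x = function
  | [] -> None
  | y :: ys when y = x -> Some ys
  | y :: ys ->
    (match remove_one x ys with
     | None -> None
     | Some ys' -> Some (y :: ys'))

(* Permutation: same elements with the same multiplicities. *)
let rec is_permutation xs ys =
  match xs with
  | [] -> ys = []
  | x :: xs' ->
    (match remove_one x ys with
     | None -> false
     | Some ys' -> is_permutation xs' ys')
```

A correct sort must make both checks succeed on its input/output pair.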

Because the "order" requirement needs only local information (each element is compared with its neighbor), it is easy to guarantee. The "permutation" requirement, however, needs non-local information and is thus not easy to prove, though it can be approximated by a guarantee that the size doesn't change.
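To see why size preservation is only an approximation, consider the following deliberately wrong function (bogus_sort is a made-up name used for illustration). Its output is ordered and has the same length as the input, yet it is not a permutation of the input.

```ocaml
(* A wrong "sort": replaces every element with the minimum of the input.
   The result is sorted and has the right length, but the elements differ. *)
let bogus_sort = function
  | [] -> []
  | x :: xs as l ->
    let m = List.fold_left min x xs in
    List.map (fun _ -> m) l
```

A type that tracks only order and length would accept this function, which is the gap the permutation requirement is meant to close.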

Problem with the compare function

To guarantee that a list is ordered, one needs to pass in a compare function appropriate to the type of the elements in the list. There are several problems to consider when typing this function. In discussing this issue, it's easier to talk about the operator <= (also called leq when used as a function name), which returns just a yes or no, rather than a general comparison operator cmp that tells us whether the first argument is greater than, less than, or equal to the second (e.g. consider C's strcmp()). Once we can type <=, the general comparison is easy to type using the same strategy.
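At the level of plain values, the step from leq to a three-way comparison can be sketched as follows. The type ordering and the function cmp_of_leq are hypothetical names, and the sketch assumes leq is a total order; the typed version with certificates would follow the same case split.

```ocaml
type ordering = Less | Equal | Greater

(* Build a three-way comparison out of a yes/no leq.
   Assumes leq is total: at least one of leq x y and leq y x holds. *)
let cmp_of_leq leq x y =
  match leq x y, leq y x with
  | true, true -> Equal
  | true, false -> Less
  | false, _ -> Greater
```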

First, the return value must contain a Coq term somewhere that serves as a certificate of x <= y or x >= y, whichever the result turned out to be. A second, tightly related problem is that the parameters are given as OCaml values whereas the certificate is a Coq term, so it can only talk about kinds. So we have to somehow map the OCaml values to something Coq can compare.

For snats, this is very easy: we just compare their indices. This is because the index (which is a nat) directly tells us how "big" that snat is under the usual total order for natural numbers. But this doesn't always work. For example, we can't compare OCaml ints this way, because they have no index to rely on.
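The idea of an snat whose type carries its own size can be approximated in plain OCaml with GADTs. This is an analogy, not Concoqtion syntax: the index here is a phantom type built from z and s rather than a Coq nat, and the names z, s, Zero, Succ, int_of_snat and two are our own.

```ocaml
(* Type-level naturals used only as indices. *)
type z = Zero
type 'n s = Succ of 'n

(* A singleton natural: the type index mirrors the run-time value. *)
type _ snat =
  | Z : z snat
  | S : 'n snat -> 'n s snat

(* Forget the index and recover an ordinary int. *)
let rec int_of_snat : type n. n snat -> int = function
  | Z -> 0
  | S m -> 1 + int_of_snat m

(* The type says "two" before the program runs. *)
let two : z s s snat = S (S Z)
```

A plain int has no such index in its type, so there is nothing for a type-level certificate to refer to.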

One solution is to restrict the sorting function to lists of objects that are indexed by their ordinal, or informally their "ranks" within the ordering. That means that the type correlates to ordering. Such a function would look like:

type ('x:'(nat), 'y:'(nat)) cmp_result =
  | True of let 'x:'(nat) 'y:'(nat) 'certificate:'(x <= y) in ()
  | False (* which means x > y *)
;;

leq ('x:'(nat), 'y:'(nat)): '(x) a * '(y) a -> ('(x), '(y)) cmp_result

The type "'(x) a" is indexed by naturals, as an example here. The relationship between values of type '(something) a is "x < y -> '(x) a < '(y) a". For example, the "a" can be snat.

Walid suggests that, instead of taking indexed elements and returning a boolean, leq could take non-indexed elements and wrap the result in an existential.

indexed values -> boolean
non-indexed values -> exists boolean

leq: '(x) a * '(y) a -> '(x <= y) modified_bool option
leq_walid: 'a * 'a -> \exists x y, ('(x) a, '(y) a, '(x <= y)) more_modified_bool
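A rough plain-OCaml analogue of the existential version is sketched below, using GADTs in place of Coq indices. Everything here is an assumption made for illustration: the le certificate type, the packaging types any_snat and packed_leq, and the choice to reject negative ints. Only the name leq_walid comes from the notes.

```ocaml
type z = Zero
type 'n s = Succ of 'n
type _ snat = Z : z snat | S : 'n snat -> 'n s snat

(* A certificate that one index is <= another. *)
type (_, _) le =
  | LeZ : (z, 'n) le
  | LeS : ('m, 'n) le -> ('m s, 'n s) le

(* Existential package: some snat whose index is hidden. *)
type any_snat = Any : 'n snat -> any_snat

let rec snat_of_int (i : int) : any_snat =
  if i < 0 then invalid_arg "snat_of_int: negative"
  else if i = 0 then Any Z
  else match snat_of_int (i - 1) with Any n -> Any (S n)

(* Indexed version: returns a certificate when x <= y, None otherwise. *)
let rec leq_snat : type x y. x snat -> y snat -> (x, y) le option =
  fun x y ->
    match x, y with
    | Z, _ -> Some LeZ
    | S _, Z -> None
    | S m, S n ->
      (match leq_snat m n with
       | Some p -> Some (LeS p)
       | None -> None)

(* Existential result: "there exist x and y with x <= y". *)
type packed_leq = Packed : 'x snat * 'y snat * ('x, 'y) le -> packed_leq

(* Non-indexed inputs; the indices only appear inside the package. *)
let leq_walid (a : int) (b : int) : packed_leq option =
  match snat_of_int a, snat_of_int b with
  | Any x, Any y ->
    (match leq_snat x y with
     | Some p -> Some (Packed (x, y, p))
     | None -> None)
```

The caller passes ordinary ints and must open the package to use the certificate.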

index type of order and order_f function

(* Using snat's: *)
type ('x:'(nat), 'y:'(nat)) order =
  | Le of 
     let 'x:'(nat) 'y:'(nat) 'xley:'(x <= y) in
     '(x) snat * '(y) snat : ('(x), '(y)) order
  | Ge of
     let 'x:'(nat) 'y:'(nat) 'ylex:'(y <= x) in
     '(x) snat * '(y) snat : ('(x), '(y)) order;;     

(* Using snat's type *)
let rec order_f x y : ('(x), '(y)) order =
    match (x, y) with
       | Z, Z -> Le Z Z '(O <= O)
       | S m, Z -> Ge (S m) Z '(O <= S m)
       | Z, S m -> Le Z (S m) '(O <= S m)
       | S m, S n -> match order_f m n with
           | Le m' n' '(m' <= n') -> Le (S m') (S n') '(S m' <= S n')
           (* symmetric case, not recorded in the notes *)
           | Ge m' n' '(n' <= m') -> Ge (S m') (S n') '(S n' <= S m')

where

carrier =
   |True .|n, m, p:(n<=m)| ()
   |False

Walid asked if they need both cases in their proof. The answer is no; they only wrote two different versions because Jun could get away with true (with proof) vs. false (without proof), whereas Dan needed <= vs. >= (both with proof). Dan's version can serve both.
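The version with a proof in both branches can be imitated in plain OCaml with GADTs. This is a sketch under assumptions: the phantom types z and s stand in for Coq's nat, le is a hand-written certificate type, the constructors carry only the proof rather than the two snats, and is_le is a helper of our own that forgets the proof.

```ocaml
type z = Zero
type 'n s = Succ of 'n
type _ snat = Z : z snat | S : 'n snat -> 'n s snat

(* Certificate that the first index is <= the second. *)
type (_, _) le =
  | LeZ : (z, 'n) le
  | LeS : ('m, 'n) le -> ('m s, 'n s) le

(* Either x <= y or y <= x, each with its certificate. *)
type (_, _) order =
  | Le : ('x, 'y) le -> ('x, 'y) order
  | Ge : ('y, 'x) le -> ('x, 'y) order

let rec order_f : type x y. x snat -> y snat -> (x, y) order =
  fun x y ->
    match x, y with
    | Z, _ -> Le LeZ
    | S _, Z -> Ge LeZ
    | S m, S n ->
      (match order_f m n with
       | Le p -> Le (LeS p)
       | Ge p -> Ge (LeS p))

(* Drop the certificate and keep only the yes/no answer. *)
let is_le : type x y. (x, y) order -> bool = function
  | Le _ -> true
  | Ge _ -> false
```

Collapsing the result with is_le gives back the yes/no answer of the true/false version, which is one way to read the remark that the two-proof version can serve both.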

The input list was restricted to lists of naturals (for simplicity), typed as:

type nlist = 
    |NNil
    |NCons .|k| '(k) snat * nlist

leq : '(n) snat -> '(m) snat -> '(n<=m) carrier

Bubble sort and insertion sort

(* ensure_sorted: renamed to f because the whiteboard was too small *)
 let rec f (ls: nlist) =
     match ls in \ex olist with
          | NCons .|n| (car, cdr) ->
            match (f cdr) in \ex olist with
              | E .|m| new_cdr -> match new_cdr with
                | OCons .|m, d, p:'(m<=d)| (cadr, _) ->
                  match leq car cadr with
                    | True .|j, k, p2:'(j<=k)| -> E .|n| (OCons .|n, m, p2| (car, new_cdr))

type 'b:'(nat) olist =
     | ONil of '(b) olist
     | OCons of let 'x:'(nat) 'b:'(nat) 'p:'(x<=b) in '(x) snat * '(b) olist : '(x) olist

The indexed type pushes one from bubble sort to insertion sort. In bubble sort, the intermediate data structure that the program works on only gradually gets sorted, and at any given moment, only the tail is sorted (unless the algorithm is about to finish). If we try to build the (guaranteed-)sorted list greedily, we end up building a separate list from the input list, which is exactly what insertion sort does. A similar situation arises in almost any other sort except insertion sort itself.
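The following plain-OCaml sketch shows how an order-indexed list leads to insertion sort. It uses GADTs as a stand-in for Concoqtion's Coq indices, so it is an analogy rather than the code from the meeting. The types bounded and le, and the helpers le_refl, insert, sort, int_of_snat and ints_of_olist are assumptions of ours. As in the notes, only the order is guaranteed by the types, not the permutation property.

```ocaml
type z = Zero
type 'n s = Succ of 'n
type _ snat = Z : z snat | S : 'n snat -> 'n s snat

type (_, _) le =
  | LeZ : (z, 'n) le
  | LeS : ('m, 'n) le -> ('m s, 'n s) le

type (_, _) order =
  | Le : ('x, 'y) le -> ('x, 'y) order
  | Ge : ('y, 'x) le -> ('x, 'y) order

let rec order_f : type x y. x snat -> y snat -> (x, y) order =
  fun x y ->
    match x, y with
    | Z, _ -> Le LeZ
    | S _, Z -> Ge LeZ
    | S m, S n ->
      (match order_f m n with Le p -> Le (LeS p) | Ge p -> Ge (LeS p))

let rec le_refl : type n. n snat -> (n, n) le = function
  | Z -> LeZ
  | S m -> LeS (le_refl m)

(* Unsorted input: each element hides its own index. *)
type nlist =
  | NNil : nlist
  | NCons : 'k snat * nlist -> nlist

(* Sorted list indexed by its head; each cons carries a proof head <= next. *)
type _ olist =
  | ONil : 'b olist
  | OCons : 'x snat * ('x, 'b) le * 'b olist -> 'x olist

(* Some sorted list whose head index is at least 'a. *)
type 'a bounded = Bounded : ('a, 'm) le * 'm olist -> 'a bounded

(* Insert x into a sorted list, keeping a known lower bound a. *)
let rec insert : type a x b.
  (a, x) le -> (a, b) le -> x snat -> b olist -> a bounded =
  fun pax pab x l ->
    match l with
    | ONil -> Bounded (pax, OCons (x, le_refl x, ONil))
    | OCons (y, pyr, rest) ->
      (match order_f x y with
       | Le pxy -> Bounded (pax, OCons (x, pxy, l))
       | Ge pyx ->
         (match insert pyx pyr x rest with
          | Bounded (pym, l') -> Bounded (pab, OCons (y, pym, l'))))

(* Insertion sort: the output is a fresh list, built sorted from the start. *)
let rec sort (ls : nlist) : z bounded =
  match ls with
  | NNil -> Bounded (LeZ, ONil)
  | NCons (x, rest) ->
    (match sort rest with
     | Bounded (pzm, sorted) -> insert LeZ pzm x sorted)

let rec int_of_snat : type n. n snat -> int = function
  | Z -> 0
  | S m -> 1 + int_of_snat m

let rec ints_of_olist : type b. b olist -> int list = function
  | ONil -> []
  | OCons (x, _, rest) -> int_of_snat x :: ints_of_olist rest

let sorted_ints (ls : nlist) : int list =
  match sort ls with Bounded (_, l) -> ints_of_olist l
```

Every intermediate value of type olist is already sorted, so there is no place in this program for the partially sorted working list that bubble sort relies on.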

Something about theorem provers

Theorem provers don't always succeed, so sometimes we need to turn to an alternative. For example, first prove a similar statement and then relate the original problem to it.


Creative Commons License: This work is licensed under a Creative Commons Attribution 2.5 License. Please follow our citation guidelines.