Yao Lirong's Blog

Standard Data Types

2020/01/30

From Textbook: Standard Data Types


Lists

Building Lists

The empty list is written [] and is pronounced “nil”, a name that comes from Lisp. Given a list lst and element elt, we can prepend elt to lst by writing elt::lst. The double-colon operator is pronounced “cons”

cons always prepend things, so cons is actually right-associative. The following code has the same effect.

1
2
1::2::3::[];;
1::(2::(3::[]));;

All the elements of a list must have the same type. The word list itself here is not a type. For example, given int, it produces the type int list. You could think of type constructors as being like functions that operate on types, instead of functions that operate on values. (We mentioned this idea of thinking constructor as a function on type in CS2112)

Accessing Lists

The following code computes the sum of a list.

1
2
3
4
let rec sum lst = 
match lst with
| [] -> 0
| h::t -> h + sum t

The following code computes the length of a list. _, the underscore character is used when we want to indicate the presence of some value in a pattern without actually giving it a name.

1
2
3
4
5
let rec length lst = 
match lst with
| [] -> 0
| _::t -> 1 + length t

The following code appends one list onto the beginning of another list.

1
2
3
4
let rec append lst1 lst2 = 
match lst1 with
| [] -> lst2
| h::t -> h::(append t lst2)

Note: every natural number is either 0 or is 1 greater than some other natural number n, and so a proof by induction has a base case for 0 and an inductive case for n+1. Likewise all our functions have a base case for the empty list and a recursive case for the list that has one more element than another list. This similarity is no accident. There is a deep relationship between induction and recursion; we’ll explore that relationship in more detail later in the course.

Mutating Lists

Values in OCaml are immutable. The following code increments the head by 1.

1
2
3
4
let inc_first lst =
match lst with
| [] -> []
| h::t -> (h+1)::t

This code looks extremely similar with C or Java operating on pointers. The implementation of list in OCaml works in the way that it shares the tail list t between the old list and the new list, such that the amount of memory in use does not increase (beyond the one extra piece of memory needed to store h+1). The reason that it’s quite safe for the compiler to implement sharing is exactly that list elements are immutable.

Pattern Matching with Lists

Basics

Each of the clauses pi -> ei is called a branch or a case of the pattern match. The p‘s here are a new syntactic form called a pattern.

  • a variable name, e.g. x
  • the underscore character _, which is called the wildcard (we don’t care what it is)
  • the empty list []
  • p1::p2
  • [p1; ...; pn]
1
2
3
4
let length_is lst n =
match length lst with
| n -> true
| _ -> false

The code above always returns true, because suppose that the length of lst is 5. Then the pattern match becomes: match 5 with n -> true | _ -> false. And n matches 5. A variable pattern matches any value and here produces the binding n->5. The correct codes are written below.

1
2
3
4
5
6
7
8
9
10
11
12
let length_is lst n =
match length lst with
| m -> if m=n then true else false
| _ -> false

let length_is lst n =
match length lst with
| m -> m=n
| _ -> false

let length_is lst n =
length lst = n

However, this doesn’t mean patterns are not the variable values as in switch statement. Yes they are general “patterns”. But you can match them to specific values. e.g.

1
2
3
4
5
6
7
8
9
match 5 with
| 6 -> true
| _ -> false;;
- : bool = false

match 5 with
| 5 -> true
| _ -> false;;
- : bool = true

Advanced Pattern Matching

  • p1 | ... | pn: an “or” pattern; matching against it succeeds if a match succeeds against any of the individual patterns pi, which are tried in order from left to right. All the patterns must bind the same variables.
  • (p : t): a pattern with an explicit type annotation.
  • c: here, c means any constant, such as integer literals, string literals, and booleans.
  • 'ch1'..'ch2': here, ch means a character literal. For example, 'A'..'Z' matches any uppercase letter.
  • p when e: matches p but only if e evaluates to true.

Tuples and Records

both represent heterogeneous types of values, both sizes are fixed

Records

Works like struct in C++. Each field is identified by names.

Definition

1
2
3
type student ={name: string; gpa : float; year :int;} (*defining a type*)
let rbg = {name = "R B"; gpa = 4.0; year = 1954;} (*declare an instance of that type*)
let s = rbg.name (*accessing field in the record*)

Patter Matching

1
2
3
4
5
6
match rbg with 
| {name=n; gpa=g; year=y} -> y

(*syntactic sugar of codes above*)
match rgb with
| {name;gpa;year} -> name

Tuples

Tuples are identified by position, instead of naming the components.

Definition

1
2
3
4
let t = (10,"am") (*t has type: int * string*) 
type time = int * string
let t:time = (10,"am") (*t has type: time*)
fst t;; snd t;; (*predefined functions to access the first and second element of a tuple*)

Pattern Matching

if we use a pattern in a let expression (or definition), we are really just doing pattern matching with a single clause.

1
2
3
4
5
6
7
8
9
let tick t =
let (t,s) = t in (t+1,s) (*tick : int * 'a -> int * 'a = <fun>*)

let tick (t:time):time =
let (t,s) = t in (t+1,s) (*tick : time -> time = <fun>*)

let tick (t:time):time =
match t with
| (t,s) -> (t+1,s)

Pattern Matching in a Nutshell

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
(* Pokemon types *)
type ptype =
TNormal | TFire | TWater

(* A record to represent Pokemon *)
type mon = {name: string; hp : int; ptype: ptype}

(*********************************************
* Several ways to get a Pokemon's hit points:
*********************************************)

(* OK *)
let get_hp m =
match m with
| {name=n; hp=h; ptype=t} -> h

(* better *)
let get_hp m =
match m with
| {name=_; hp=h; ptype=_} -> h

(* better *)
let get_hp m =
match m with
| {name; hp; ptype} -> hp

(* better *)
let get_hp m =
match m with
| {hp} -> hp

(* best *)
let get_hp m = m.hp

(**************************************************
* Several ways to get the 3rd component of a tuple
**************************************************)

(* OK *)
let thrd t =
match t with
| (x,y,z) -> z

(* good *)
let thrd t =
let (x,y,z) = t in z

(* better *)
let thrd t =
let (_,_,z) = t in z

(* best *)
let thrd (_,_,z) = z

(*************************************
* How to get the components of a pair
*************************************)

let fst (x,_) = x
let snd (_,y) = y


(************************
* take tuple as a whole
************************)
let rep_ok ((n,lst) as v) =
if List.length lst = n then v
else failwith "RI violated"
CATALOG
  1. 1. Lists
    1. 1.1. Building Lists
    2. 1.2. Accessing Lists
    3. 1.3. Mutating Lists
    4. 1.4. Pattern Matching with Lists
      1. 1.4.1. Basics
      2. 1.4.2. Advanced Pattern Matching
  2. 2. Tuples and Records
    1. 2.1. Records
      1. 2.1.1. Definition
      2. 2.1.2. Patter Matching
    2. 2.2. Tuples
      1. 2.2.1. Definition
      2. 2.2.2. Pattern Matching
    3. 2.3. Pattern Matching in a Nutshell