Posted in Functional Programming in Scala

Functional Program Design

 

JSON

Case classes are Scala’s preferred way to define complex data. Below is one way to represent JSON data

The left-hand side of a generator in a for-expression can also be a pattern. The example below lists the first and last names of people who have a phone number starting with 212:

Here JObj(bindings) <- data acts as an implicit filter: elements of data that do not match the pattern are skipped.
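The original listings are not reproduced here, so the following is only a minimal sketch of the idea. The case class names follow the JObj(bindings) pattern mentioned above; the person helper, the field names and the sample data are assumptions made up for illustration.

```scala
// One possible JSON representation using case classes.
sealed trait JSON
case class JSeq(elems: List[JSON]) extends JSON
case class JObj(bindings: Map[String, JSON]) extends JSON
case class JNum(num: Double) extends JSON
case class JStr(str: String) extends JSON
case class JBool(b: Boolean) extends JSON
case object JNull extends JSON

// Hypothetical helper that builds a person record with phone numbers.
def person(first: String, last: String, numbers: String*): JSON =
  JObj(Map(
    "firstName" -> JStr(first),
    "lastName" -> JStr(last),
    "phoneNumbers" -> JSeq(numbers.toList.map(n => JObj(Map("number" -> JStr(n)))))
  ))

val data: List[JSON] = List(
  person("Ann", "Lee", "212 555-1234", "646 555-4567"),
  person("Bob", "Ray", "415 555-0000"),
  JNull // not a JObj, so the first generator skips it
)

// Every generator uses a pattern on its left-hand side;
// values that do not match the pattern are filtered out.
val namesWith212: List[(JSON, JSON)] = for {
  JObj(bindings) <- data
  JSeq(phones)   <- bindings.get("phoneNumbers").toList
  JObj(phone)    <- phones
  JStr(digits)   <- phone.get("number").toList
  if digits.startsWith("212")
} yield (bindings("firstName"), bindings("lastName"))
```

Newer Scala 3 compilers may ask for a `case` prefix on such refutable generator patterns.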

distinct is needed to remove duplicate authors, i.e. those who appear in the result list more than once

Partial Functions

A partial function, as opposed to a total function, provides an answer only for a subset of the possible inputs, and defines the inputs it can handle through isDefinedAt as follows:

A more common way of writing partial functions is to use a "case" block, for which the compiler generates the implementation of isDefinedAt:

Note that our exception changed to MatchError instead of ArithmeticException when using "case".

Partial-Functions are well handled by "collect" as shown below:

The magic here is that collect expects a PartialFunction and invokes its isDefinedAt method. If we define the partial function inline, the compiler knows that it is a partial function and we avoid spelling out the PartialFunction trait explicitly.
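As a hedged sketch of the three points above (the names fraction and fractionCase are chosen for illustration and are not necessarily the ones used in the original post):

```scala
import scala.util.Try

// Explicit PartialFunction: isDefinedAt is written by hand.
val fraction = new PartialFunction[Int, Int] {
  def apply(d: Int): Int = 42 / d
  def isDefinedAt(d: Int): Boolean = d != 0
}

// The same function written with `case`; the compiler derives isDefinedAt.
val fractionCase: PartialFunction[Int, Int] = { case d if d != 0 => 42 / d }

// Outside the domain, the two versions fail differently.
val explicitFailure = Try(fraction(0))     // fails with ArithmeticException
val caseFailure     = Try(fractionCase(0)) // fails with MatchError

// collect applies the partial function only where it is defined.
val collected = List(0, 6, 7).collect(fractionCase) // List(7, 6)
```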

Seq and Map are also partial functions (a Set is a total function from elements to Boolean)

Checking isDefinedAt can be painful; luckily, Scala supports the lift method, which converts a partial function into a total function that returns an Option

(the above notes on partial functions have been taken from this blog)

We can chain together partial-functions using orElse or andThen which are defined in PartialFunction trait just like lift.
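A small sketch of lift, orElse and andThen, assuming an illustrative partial function fraction that is undefined at 0:

```scala
val fraction: PartialFunction[Int, Int] = { case d if d != 0 => 42 / d }

// lift turns PartialFunction[Int, Int] into Int => Option[Int].
val safeFraction: Int => Option[Int] = fraction.lift
val defined   = safeFraction(6) // Some(7)
val undefined = safeFraction(0) // None

// orElse falls back to a second partial function where the first is undefined.
val zeroCase: PartialFunction[Int, Int] = { case 0 => 0 }
val total: PartialFunction[Int, Int] = fraction.orElse(zeroCase)

// andThen post-processes the result of a partial function.
val describe: Int => String = n => s"result: $n"
val described: PartialFunction[Int, String] = total.andThen(describe)
```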

withFilter

Scala provides another variation of filter called withFilter which doesn't create a new collection like filter does. It acts like a view that filters the result to be passed on to subsequent calls to map/flatMap etc.

We can't use filter after applying withFilter but we can apply multiple withFilters
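For illustration, a minimal comparison of the two approaches (the sample list is made up):

```scala
val numbers = List(1, 2, 3, 4, 5, 6)

// filter builds an intermediate list at every step.
val viaFilter = numbers.filter(_ % 2 == 0).filter(_ > 2).map(_ * 10)

// withFilter only records the predicates; they are applied when map runs.
val viaWithFilter = numbers.withFilter(_ % 2 == 0).withFilter(_ > 2).map(_ * 10)
```

Both expressions produce the same list; only the intermediate allocations differ.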

The Scala compiler translates for-expressions into higher-order functions using map, flatMap and withFilter (a lazy variant of filter). For example, for (x <- e1) yield e2 is translated to e1.map(x => e2)

Translation of for is not limited to lists or sequences or even collections.
It is based solely on the presence of the methods map, flatMap and withFilter. This lets us use for syntax for our own types as well – you must only define map, flatMap and withFilter for these types.
There are many types for which this is useful: arrays, iterators, databases, XML data, optional values, parsers, etc.
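As a sketch of using for syntax with our own type, assume a hypothetical single-value container Box (not part of the standard library). Since the example has no `if` guard, map and flatMap are enough; withFilter would be needed only for guards or refutable patterns.

```scala
// Hypothetical container type, defined only for this example.
final case class Box[A](value: A) {
  def map[B](f: A => B): Box[B] = Box(f(value))
  def flatMap[B](f: A => Box[B]): Box[B] = f(value)
}

// The for-expression works because Box defines map and flatMap.
val boxedSum: Box[Int] = for {
  x <- Box(1)
  y <- Box(2)
} yield x + y

// What the compiler generates for the expression above.
val desugared: Box[Int] = Box(1).flatMap(x => Box(2).map(y => x + y))
```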

As long as the client interface to the database defines the methods map, flatMap and withFilter, we can use for syntax for querying the database.
This is the basis of Scala database connection frameworks ScalaQuery and Slick.
Similar ideas underlie Microsoft’s LINQ.

Streams

Streams are similar to Lists except that their tails are evaluated only on demand.

Streams are defined from a constant Stream.empty and a constructor Stream.cons.

They can also be defined like the other collections by using the object Stream as a factory: Stream(1, 2, 3)

The toStream method on a collection will turn the collection into a stream:
(1 to 1000).toStream // res0: Stream[Int] = Stream(1, ?)

Stream supports almost all methods of List except for :: which always produces a List. #:: should be used instead to produce a Stream which can be used in Expressions as well as Patterns.

To find the second prime number between 1000 and 10000:
((1000 to 10000).toStream filter isPrime)(1)
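The expression above relies on an isPrime function that is not shown; the definition below is one simple assumption. Note also that in Scala 2.13 and later Stream is deprecated in favour of LazyList, so this sketch may compile with deprecation warnings there.

```scala
// Assumed helper: trial division up to the square root of n.
def isPrime(n: Int): Boolean =
  n >= 2 && (2 to math.sqrt(n.toDouble).toInt).forall(n % _ != 0)

// Only as many candidates as needed are tested, because the stream is lazy.
val secondPrime: Int = (1000 to 10000).toStream.filter(isPrime)(1)
```

The first two primes above 1000 are 1009 and 1013, so only the numbers up to 1013 are ever tested.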

x #:: xs == Stream.cons(x, xs)

Even the implementation of Streams is very close to that of Lists; the only major difference is the use of call-by-name for the second parameter of cons (and in operations such as filter), which causes lazy evaluation, as follows:
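A simplified sketch of that idea, using a hypothetical LazySeq type rather than the real Stream source. The only point it illustrates is the call-by-name tail parameter of cons.

```scala
// Hypothetical, simplified stand-in for Stream.
trait LazySeq[+A] {
  def isEmpty: Boolean
  def head: A
  def tail: LazySeq[A]
}

object LazySeq {
  val empty: LazySeq[Nothing] = new LazySeq[Nothing] {
    def isEmpty = true
    def head = throw new NoSuchElementException("empty.head")
    def tail = throw new NoSuchElementException("empty.tail")
  }

  // `tl` is call-by-name: it is not evaluated when cons is called.
  def cons[A](hd: A, tl: => LazySeq[A]): LazySeq[A] = new LazySeq[A] {
    def isEmpty = false
    def head = hd
    lazy val tail: LazySeq[A] = tl // evaluated on first access, then cached
  }
}

// An infinite sequence is fine because the tail is only built on demand.
def from(n: Int): LazySeq[Int] = LazySeq.cons(n, from(n + 1))

val third: Int = from(1).tail.tail.head
```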


Transition to Functional programming with Scala

This blog lists key points I noticed while trying to shift gears to functional programming with Scala, a JVM language, after more than a decade of traditional object-oriented programming. I’ve made extensive use of Coursera’s Scala course and online documentation and blogs to come up with this content.

What is Functional programming

Mainstream languages like C, C++, Java etc were based on the idea of update in place. They presuppose that the way to solve a programming problem is to have procedures that read and write values to memory locations, be it directly visible to the programmer (by means of pointers) or not.

Other languages, like Lisp, are based on a different assumption: a computation is best expressed as a set of functions over immutable values. So, in order to produce any kind of repetitive computation, recursion is essential. This assumption has strong roots in lambda calculus and related mathematical formalisms.

Function evaluation strategies

A Scala function argument can be evaluated either call-by-name or call-by-value. Call-by-value has the advantage that it evaluates every function argument only once. Call-by-name has the advantage that a function argument is not evaluated if the corresponding parameter is unused in the evaluation of the function body.

Scala normally uses call-by-value, unless the type of a function parameter starts with ‘=>’, in which case call-by-name is used for that parameter.
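A minimal sketch of the difference; the names loop and firstByName are illustrative:

```scala
// A computation that never terminates if it is evaluated.
def loop: Int = loop

// y is call-by-name (its type starts with =>) and is never used in the body.
def firstByName(x: Int, y: => Int): Int = x

// Terminates, because the argument `loop` is never evaluated.
val result: Int = firstByName(1, loop)
```

With a call-by-value parameter `y: Int`, the same call would never return.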

Tail-recursion

Recursion has been looked down upon in imperative-style programming, but it is the basic building block of functional programming. Deep recursion can still overflow the stack, and that is where tail recursion comes into play. Many recursive functions can be converted to tail-recursive form by using a helper function that carries an accumulator. When the recursive call is the last action of the function, the compiler can reuse the same stack frame for every call instead of creating a new one each time. Below is an example of a factorial definition.

A tail-recursive solution is as follows:
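The original listings are missing, so here is one possible version of both definitions. The use of BigInt and the helper name loop are assumptions.

```scala
import scala.annotation.tailrec

// Plain recursion: the multiplication happens after the recursive call
// returns, so every call needs its own stack frame.
def factorial(n: Int): BigInt =
  if (n == 0) BigInt(1) else factorial(n - 1) * n

// Tail recursion: the accumulator carries the partial result, so the
// recursive call is the last action and the stack frame can be reused.
def factorialTailRec(n: Int): BigInt = {
  @tailrec
  def loop(acc: BigInt, k: Int): BigInt =
    if (k == 0) acc else loop(acc * k, k - 1)
  loop(BigInt(1), n)
}
```

The @tailrec annotation makes the compiler reject the definition if the call is not actually in tail position.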

Note that recursive functions need an explicit return type in their definition, whereas it is optional for non-recursive functions.

Higher-order functions and Currying

Functions are first-class citizens and are treated just like values of any other type, i.e. you can pass a function as a parameter, return it from a function, and define a variable to hold it.

Currying is when you break down a function that takes multiple arguments into a series of functions that each take part of the arguments. The advantage is that it makes a function more expressive and reusable. Function application associates to the left in the context of currying. Basically, this lets you express your solution just as you would write a mathematical equation.

We’ll develop motivation for Currying step-by-step:

Since a and b are getting passed unchanged from sumInts and sumCubes into sum, we can get their definitions even shorter by getting rid of params as follows:

Here sum is now a function that returns another function. The returned function sumF applies the given function parameter f and sums the results. We can now define the earlier three functions as follows:
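A sketch of this step; the names sum, sumF, cube, sumInts, sumCubes and sumFactorials come from the text, while the bodies are reconstructed and should be read as an assumption:

```scala
// sum takes a function f and returns a function over the interval [a, b].
def sum(f: Int => Int): (Int, Int) => Int = {
  def sumF(a: Int, b: Int): Int =
    if (a > b) 0 else f(a) + sumF(a + 1, b)
  sumF
}

def cube(x: Int): Int = x * x * x
def factorial(x: Int): Int = if (x == 0) 1 else x * factorial(x - 1)

// The three earlier functions no longer mention a and b at all.
val sumInts       = sum(x => x)
val sumCubes      = sum(cube)
val sumFactorials = sum(factorial)
```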

These functions can in turn be applied like any other function:

In above example, we can avoid sumInts, sumCubes & sumFactorials middlemen as follows:

sum (cube) (1,10)

  • sum(cube) applies sum to cube and returns  ‘sum of cubes’ function
  • sum(cube) is therefore equivalent to sumCubes
  • This function is next applied to arguments (1,10)

Generally, function application associates to the left:

sum (cube)(1,10) == (sum (cube)) (1,10)

The definition of functions that return functions is so useful in functional programming that there is a special syntax for it in Scala. For example, the following definition of sum is equivalent to the one with the nested sumF function, but shorter:
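A sketch of that shorter form, using multiple parameter lists (the body is reconstructed, not taken from the original listing):

```scala
// Same behaviour as the version with the nested sumF function.
def sum(f: Int => Int)(a: Int, b: Int): Int =
  if (a > b) 0 else f(a) + sum(f)(a + 1, b)

def cube(x: Int): Int = x * x * x

// Applying both parameter lists at once...
val direct: Int = sum(cube)(1, 10)

// ...or only the first, which yields a function of the remaining arguments.
val sumCubes: (Int, Int) => Int = sum(cube)
```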

Thus Currying resembles the process of evaluating a function of multiple variables, when done by hand, on paper as follows:

For example, given the function f(x,y)=y/x

  1. To evaluate f(2,3), first replace x with 2
  2. Since the result is a function of y, this new function g(y) can be defined as g(y)=f(2,y)=y/2
  3. Next, replace the y argument with 3, producing g(3)=f(2,3)=3/2

On paper, using classical notation, this is usually done all in one step. However, each argument can be replaced sequentially as well. Each replacement results in a function taking exactly one argument. Below is an example where we can use wildcard as a placeholder for unknown param that’ll be provided later:
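For instance, the f(x, y) = y/x example above could be sketched like this (the Double return type is an assumption so that 3/2 is not truncated):

```scala
def f(x: Int, y: Int): Double = y.toDouble / x

// The wildcard leaves y open; g is the function g(y) = f(2, y).
val g: Int => Double = f(2, _)

val value: Double = g(3) // 1.5
```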

Example: MapReduce

The screenshot that accompanied this section implemented the product function using the currying principles above; mapReduce generalizes it further by supporting any combining operation, such as sum or product.

With above context, we can re-implement product in terms of mapReduce as follows:
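Since the screenshot is not available, the following is one way mapReduce and product could look. The parameter names combine and zero are assumptions.

```scala
// Maps f over [a, b] and combines the results, starting from `zero`.
def mapReduce(f: Int => Int, combine: (Int, Int) => Int, zero: Int)(a: Int, b: Int): Int =
  if (a > b) zero
  else combine(f(a), mapReduce(f, combine, zero)(a + 1, b))

// product and sum are both special cases of mapReduce.
def product(f: Int => Int)(a: Int, b: Int): Int =
  mapReduce(f, (x, y) => x * y, 1)(a, b)

def sum(f: Int => Int)(a: Int, b: Int): Int =
  mapReduce(f, (x, y) => x + y, 0)(a, b)

// factorial expressed through product
def factorial(n: Int): Int = product(x => x)(1, n)
```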

Function

Function-type:

Every function has a type that is declared as A => B (in its simplest form). This function takes argument of type A and returns result of type B.

Here f is an anonymous function.

We can also declare function type as follows where MyFunction is a function-type that takes two Ints and returns a Boolean:

Note that function types associate to the right: A => B => C is read as A => (B => C)
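A short sketch of these declarations; f and MyFunction are named in the text, the other names are illustrative:

```scala
// f is an anonymous function of type Int => Int.
val f: Int => Int = x => x + 1

// A named function type that takes two Ints and returns a Boolean.
type MyFunction = (Int, Int) => Boolean
val lessThan: MyFunction = (a, b) => a < b

// Int => Int => Int means Int => (Int => Int).
val add: Int => Int => Int = a => b => a + b
val addFive: Int => Int = add(5)
```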

Persistent Data-structures:

Classes:

Scala will create a new Type and a Constructor for every class definition as in below examples

Functions defined within a class are called ‘methods’. They differ quite a bit from functions, which we’ll see later.

We can add ‘require‘ method to the class definition to enforce some restrictions on class variables during constructions.


require is a predefined function that takes a condition and an optional message string. An IllegalArgumentException is thrown if the condition is not met.

Similarly, there is an assert method with the same signature, which throws an AssertionError.

  • require enforces pre-condition on caller of function
  • assert checks the function code itself

Every class def introduces an implicit constructor called ‘primary‘ constructor that takes all parameters and executes all statements in class body such as ‘require’ etc.

An Auxiliary constructor lets us add new constructor by using this keyword as shown below. We can define multiple auxiliary constructors.

Any method with a parameter can be used as an Infix operator such as:

x.add(y) can be re-written as x add y

Symbols can be used as Identifiers for Variable or Methods such as +, *&^%, etc

We can define prefix operators like -y by declaring it as unary operator as follows:
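The class features described in this section can be put together in one sketch. The Rational class below is an assumption modelled on the usual rational-number example; only the method name add appears in the text.

```scala
class Rational(x: Int, y: Int) {
  // Precondition checked by the primary constructor.
  require(y != 0, "denominator must be nonzero")

  // Auxiliary constructor: a whole number is x / 1.
  def this(x: Int) = this(x, 1)

  private def gcd(a: Int, b: Int): Int = if (b == 0) a else gcd(b, a % b)
  private val g = gcd(x.abs, y.abs)

  val numer: Int = x / g
  val denom: Int = y / g

  // Can be called as x.add(y) or, in infix form, x add y.
  def add(that: Rational): Rational =
    new Rational(numer * that.denom + that.numer * denom, denom * that.denom)

  // Prefix operator: enables -r.
  def unary_- : Rational = new Rational(-numer, denom)

  override def toString: String = s"$numer/$denom"
}

val half  = new Rational(1, 2)
val third = new Rational(1, 3)
val total = half.add(third) // 5/6
val minus = -half           // -1/2
```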

You can create a singleton class by replacing “class” identifier with “object”

Traits

Traits let us achieve a form of multiple inheritance in Scala. They are similar to Java interfaces except that traits can contain method definitions and fields (Java 8 also supports default methods in interfaces). Unlike classes, traits in Scala 2 cannot take constructor parameters.

We can instantiate Traits using Anonymous classes as follows:

We are not instantiating Trait directly, but rather creating a suitable object for the Trait to attach itself to so you can use the Trait’s functionality without needing to define a class that extends the Trait.
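A minimal sketch, with a made-up Greeter trait:

```scala
trait Greeter {
  def name: String                       // abstract member
  def greet: String = s"Hello, $name"    // concrete method defined in the trait
}

// An anonymous class that mixes in Greeter and supplies the missing member.
val greeter = new Greeter { def name = "Scala" }

val message: String = greeter.greet
```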

Scala class Hierarchy

Every Scala source file automatically imports the members of the packages scala and java.lang and of the object scala.Predef

_ is the wildcard in Scala

[Figure: Scala class hierarchy]

“Any” is the base type of all types and defines methods like ‘==’

“AnyRef” is just an alias of java.lang.Object and is the base type of all reference types

“AnyVal” is the base type of all primitive (value) types

“Nothing” is a subtype of every other type. It is used to signal abnormal termination (for example, the expression ‘throw Exc’ aborts evaluation and has type Nothing) and also as the element type of empty collections, as in Set[Nothing]

“Null” is a subtype of every type that inherits from AnyRef, so it is incompatible with subtypes of AnyVal. Every reference type has null as a value.

As in Java, Polymorphism is achieved by both sub-classing and generics. Below is an example:

Note that:

  1. Type parameters do not affect Scala at runtime because of type erasure (type information is used only by the compiler and is not preserved at runtime).
  2. If we declare a class parameter with val, it implicitly defines a field with the same name in the class and assigns the parameter to it.
  3. Generics in methods: the singleton method can be invoked with explicit type information as follows:

    Alternatively, we can let Scala deduce the type from the context, which lets us invoke singleton as follows:
  4. The code above creates what is called an immutable linked list. This structure is fundamental to functional programming and is built from two basic elements:
    Nil:  the empty list
    Cons: a cell containing an element and the remainder of the list

Below is a graphical representation of immutable linked lists:

[Figure: Immutable linked list]
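The listing these notes refer to is missing; a sketch along the same lines is shown here. The names MyList and Empty are assumptions chosen to avoid clashing with Scala's own List and Nil.

```scala
trait MyList[T] {
  def isEmpty: Boolean
  def head: T
  def tail: MyList[T]
}

// `val` parameters define the fields head and tail directly.
class Cons[T](val head: T, val tail: MyList[T]) extends MyList[T] {
  def isEmpty = false
}

class Empty[T] extends MyList[T] {
  def isEmpty = true
  def head: Nothing = throw new NoSuchElementException("Empty.head")
  def tail: Nothing = throw new NoSuchElementException("Empty.tail")
}

// A generic method: T is a type parameter of the method itself.
def singleton[T](elem: T): MyList[T] = new Cons[T](elem, new Empty[T])

val explicitType = singleton[Int](1) // type argument given explicitly
val inferredType = singleton(true)   // T = Boolean is inferred
```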

 

Functions as Objects

Function types relate to classes, and function values relate to objects. The function type A => B is just an abbreviation for scala.Function1[A, B], which is defined as follows:

There are also traits Function2, Function3, etc., supporting up to 22 parameters.

Anonymous function such as (x: Int) => x * x would be expanded as follows:

This in turn gets converted to as follows using Java anonymous class syntax:

Function calls such as f(a, b) are expanded to f.apply(a, b). For example:

would be translated to:

ETA Expansion

Note that a method such as def f(x: Int): Boolean = ... is not itself a function value: if every method were an object with an apply method, that apply method would in turn have to be an object, leading to an infinite expansion. But when the name of a method is used in a place where a function type is expected, it is converted to the function value (x: Int) => f(x), which is expanded as below. This is known as eta expansion:
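The expansions described in this section can be sketched as follows; isEven is an illustrative method name:

```scala
// An anonymous function...
val square: Int => Int = (x: Int) => x * x

// ...is roughly equivalent to an instance of an anonymous Function1 class.
val squareObject: Function1[Int, Int] = new Function1[Int, Int] {
  def apply(x: Int): Int = x * x
}

// f(a) is shorthand for f.apply(a).
val viaSugar: Int = square(7)
val viaApply: Int = squareObject.apply(7)

// A method is not a function value...
def isEven(x: Int): Boolean = x % 2 == 0

// ...but it is eta-expanded to (x: Int) => isEven(x) where a function is expected.
val isEvenFn: Int => Boolean = isEven
val evens: List[Int] = List(1, 2, 3, 4).filter(isEven)
```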

This blog has a great explanation about ETA expansion.

In summary, functions and methods are not the same in Scala (although in most cases we can ignore the difference). A Scala method, as in Java, is part of a class, whereas a function is a complete object (an instance of a class that implements one of the traits Function0, Function1, Function2, ...).

Pattern matching

Pattern matching provides a functional way of decomposing objects. Declaring a class with the case modifier makes it possible to pattern match on it. The modifier also adds the following:

  1. a companion ‘object’ with factory methods, for syntactic convenience (instances can be created without new)
  2. a concrete subclass can be declared with no body
  3. ‘==’: performs a structural equality comparison, i.e. two case class instances with the same parameters are equal
  4. toString: a method that gives a string representation of the class members
  5. copy(): copies the values of a case class instance into a new instance, optionally changing some of them
  6. all parameters are declared with ‘val’ by default. Note that with multiple (curried) parameter lists, only the parameters in the first list get ‘val’

Since the concrete sub-class has no body, we will be using Pattern matching using ‘match’ keyword.

For eg,

Note that a match expression is evaluated sequentially: cases are tried from top to bottom, and the remaining cases are skipped once one matches.
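As an illustration, assume a small expression hierarchy (the names Expr, Num and Sum are made up for this sketch):

```scala
sealed trait Expr
case class Num(n: Int) extends Expr              // no class body needed
case class Sum(e1: Expr, e2: Expr) extends Expr

// Decomposition with match: cases are tried from top to bottom.
def eval(e: Expr): Int = e match {
  case Num(n)      => n
  case Sum(e1, e2) => eval(e1) + eval(e2)
}

val expr    = Sum(Num(1), Num(2))     // created without `new`
val value   = eval(expr)              // 3
val same    = Num(1) == Num(1)        // structural equality
val text    = expr.toString           // "Sum(Num(1),Num(2))"
val changed = expr.copy(e2 = Num(5))  // copy with one field replaced
```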

A good side-effect of such a matching is that it provides an alternative way to design functions. For example, consider the factorial function. If you choose the recursive version, usually, you would define it like this:

Where-as a Pattern matching based solution is:

Lists

Lists are immutable and are recursive whereas Arrays are mutable and flat.

val l = List(1, 2, 3) is equivalent to using the Cons (short form for constructor) operator: 1 :: 2 :: 3 :: Nil.
Note that operators ending in ‘:’ are right-associative and are invoked on their right operand, so the above is translated to Nil.::(3).::(2).::(1)

All operations on lists can be expressed in terms of ‘head’, ‘tail’ and ‘isEmpty’

Lists can also be used in pattern matching as follows:
1 :: 2 :: xs           Lists that start with 1 and then 2
x :: Nil or List(x)    Lists of length 1
List() or Nil          The empty list
List(2 :: xs)          A list whose only element is another list that starts with 2

Below is an example of a typical way to decompose lists with pattern-matching using Insertion sort as an example:
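A sketch of insertion sort written this way (reconstructed, since the original listing is missing):

```scala
// Sort the tail recursively, then insert the head at the right position.
def isort(xs: List[Int]): List[Int] = xs match {
  case Nil     => Nil
  case y :: ys => insert(y, isort(ys))
}

// Insert x into an already sorted list.
def insert(x: Int, xs: List[Int]): List[Int] = xs match {
  case Nil     => List(x)
  case y :: ys => if (x <= y) x :: xs else y :: insert(x, ys)
}

val sortedList = isort(List(7, 3, 9, 2))
```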

More methods on lists: xs.length; xs.last; xs.init; xs take n; xs drop n; xs(n); xs ++ ys; xs ::: ys (concatenating lists); xs.reverse; xs updated (n, x); xs indexOf x; xs contains x

Pairs and Tuples

A pair consisting of x and y is written as (x, y).

val pair = ("answer", 42)
The type of the above pair is (String, Int). Pairs can be used as patterns.
val (label, value) = pair gives label: String = answer and value: Int = 42

A tuple expression (e1, …, en) is an abbreviation of the function application scala.Tuplen(e1, ..., en), and the tuple type (T1, …, Tn) is an abbreviation of the parameterized type scala.Tuplen[T1, ..., Tn]

Type inference

Consider below code

The msort call can be rewritten as msort(nums)((x, y) => x < y), without parameter types, because the Scala compiler can infer the types of x and y from the type of nums.

Implicit parameters

In above example, Scala compiler searches for implicit definition based on below rules:

– is marked ‘implicit’
– has a type compatible with the type of the implicit parameter
– is visible at point of function call or is defined in companion object associated with implicit’s type
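The msort listing itself is missing. The sketch below shows a merge sort with an explicit comparison function, where the parameter types of the lambda are inferred, and a variant with an implicit Ordering. The name msortOrd is an assumption.

```scala
// Merge sort with an explicit "less than" function in a second parameter list.
def msort[T](xs: List[T])(lt: (T, T) => Boolean): List[T] = {
  val n = xs.length / 2
  if (n == 0) xs
  else {
    def merge(as: List[T], bs: List[T]): List[T] = (as, bs) match {
      case (Nil, _) => bs
      case (_, Nil) => as
      case (a :: as1, b :: bs1) =>
        if (lt(a, b)) a :: merge(as1, bs) else b :: merge(as, bs1)
    }
    val (fst, snd) = xs.splitAt(n)
    merge(msort(fst)(lt), msort(snd)(lt))
  }
}

// Variant with an implicit parameter: the compiler supplies Ordering[T].
def msortOrd[T](xs: List[T])(implicit ord: Ordering[T]): List[T] =
  msort(xs)(ord.lt)

val nums   = List(2, -4, 5, 7, 1)
val fruits = List("apple", "pineapple", "orange", "banana")

// T = Int is fixed by nums, so x and y need no type annotations.
val sortedNums   = msort(nums)((x, y) => x < y)
val sortedFruits = msortOrd(fruits) // Ordering[String] found implicitly
```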

Higher order List functions

Normal List operations include the following:

  • transforming each element of a list
  • retrieving list of all elements satisfying a criteria
  • combining elements of a list

Scala allows generic functions to implement above patterns using higher-order functions.

Below is a screenshot of my IntelliJ IDEA workspace with some examples:


Reduction of Lists

Common operation on Lists is to combine elements of List with a given operator.

reduceLeft inserts given binary operator between adjacent elements of a list

List(x1,….,xn) reduceLeft op = (…(x1 op x2) op …) op xn

Using reduceLeft, we define sum and product op as follows:

Above ops can be re-written using wildcard pattern as follows

foldLeft is similar to reduceLeft except that it takes an accumulator as an additional parameter, which is returned when foldLeft is called on an empty list.

foldLeft and foldRight are equivalent (possible differences in efficiency) only if the operators are associative and commutative
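A sketch of these operations; the function names are illustrative. Subtraction is used at the end as an operator that is neither associative nor commutative, so the two folds disagree.

```scala
// reduceLeft with an explicit lambda and with the wildcard shorthand.
// Prepending the neutral element makes them safe for empty lists.
def sum(xs: List[Int]): Int     = (0 :: xs).reduceLeft((x, y) => x + y)
def product(xs: List[Int]): Int = (1 :: xs).reduceLeft(_ * _)

// foldLeft takes the accumulator's start value as an extra argument.
def sumFold(xs: List[Int]): Int     = xs.foldLeft(0)(_ + _)
def productFold(xs: List[Int]): Int = xs.foldLeft(1)(_ * _)

val left  = List(1, 2, 3).foldLeft(0)(_ - _)  // ((0 - 1) - 2) - 3 = -6
val right = List(1, 2, 3).foldRight(0)(_ - _) // 1 - (2 - (3 - 0)) = 2
```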

Other Collections

Lists are linear, i.e access to first element is faster than middle or end element.

Scala offers alternative sequence implementation: Vector which is immutable and offers more balanced access patterns.


When is a List preferred to a Vector ?

List: if our operations work on the head and tail of a sequence, as these take constant time with Lists whereas they are more complicated with a Vector

Vector: when we need bulk operations like Map, Filter, Fold

Vectors are created analogously to Lists and support the same operations. The only exception is ::, which is replaced by +: and :+. x +: xs creates a new Vector with leading element x followed by all elements of xs, and xs :+ x creates a new Vector with trailing element x preceded by all elements of xs.
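For example (sample values made up):

```scala
val xs = Vector(1, 2, 3)

val front   = 0 +: xs       // Vector(0, 1, 2, 3); the colon points at the Vector
val back    = xs :+ 4       // Vector(1, 2, 3, 4)
val doubled = xs.map(_ * 2) // bulk operations work as on Lists
```

The original xs is unchanged by all three operations, since Vectors are immutable.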


More functions on Sequences


xs zip ys       A sequence of pairs drawn from corresponding elements of sequences xs and ys

xs.unzip        Splits a sequence of pairs xs into two sequences consisting of the first and second halves of all pairs

The above function lists all combinations of numbers x and y, where x ranges from 1 to 2 and y from 3 to 4

Below function:

is equivalent to following def which uses case

Note that {case p1 => e1, ... case pn => en} is same as x => x match {case p1 => e1 ... case pn => en}

We can combine Sequence of Sequences using flatten

It's also true that xs flatMap f is equal to (xs map f).flatten
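The sequence operations discussed in this section can be sketched together as follows. The scalarProduct name appears later in the text; the other names and the sample data are assumptions.

```scala
val pairs    = List(1, 2, 3).zip(List("a", "b", "c"))
val unzipped = pairs.unzip // (List(1, 2, 3), List("a", "b", "c"))

// All combinations of x in 1..2 and y in 3..4
val combos = (1 to 2).flatMap(x => (3 to 4).map(y => (x, y)))

// The same function with tuple accessors and with a `case` pattern
def scalarProduct(xs: Vector[Double], ys: Vector[Double]): Double =
  xs.zip(ys).map(xy => xy._1 * xy._2).sum

def scalarProductCase(xs: Vector[Double], ys: Vector[Double]): Double =
  xs.zip(ys).map { case (x, y) => x * y }.sum

// flatten, and the flatMap identity
val nested     = List(List(1, 2), List(3))
val flat       = nested.flatten
val viaFlatMap = nested.flatMap(ys => ys.map(_ * 10))
val viaMap     = nested.map(ys => ys.map(_ * 10)).flatten
```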

All collection types share a common set of core methods: map, flatMap, filter, foldLeft and foldRight. The latter two reduce a collection to a single value

For Expressions

Higher-order functions like map,flatMap and filter might make program difficult to understand in which case we can use for expressions.

For eg: if persons is a list of elements of class Person with fields name and age,

we can obtain names of persons over 20 as follows:

which is equivalent to below non-intuitive expression using map.

A for expression is of the form:

where s is a sequence of generators and filters and e is expression whose value is returned by an iteration.

We can also use {s} instead of (s) which lets us write sequence of generators and filters in multiple lines without need of semi-colons.

With For expression, we can re-write scalarProduct as follows:
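A sketch of the for-expressions from this section. Person, name, age and scalarProduct are named in the text; the sample data is made up.

```scala
case class Person(name: String, age: Int)

val persons = List(Person("Ann", 31), Person("Bob", 17), Person("Cid", 25))

// Names of persons over 20, as a for-expression...
val viaFor = for (p <- persons if p.age > 20) yield p.name

// ...and the equivalent expression with filter and map.
val viaMap = persons.filter(p => p.age > 20).map(p => p.name)

// Braces allow several generators and filters on separate lines.
val pairsWithEvenSum = for {
  x <- 1 to 3
  y <- 1 to x
  if (x + y) % 2 == 0
} yield (x, y)

// scalarProduct written with a for-expression
def scalarProduct(xs: List[Double], ys: List[Double]): Double =
  (for ((x, y) <- xs.zip(ys)) yield x * y).sum
```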

More examples from Coursera scala course:

Sets

Set is written analogous to a sequence:

Most ops on sequences are available for Sets as it inherits them from Iterable as follows:

Differences between Set and Sequence:

    1. Sets are un-ordered
    2. No duplicate elements. s map (_ / 2) //Set(2,0,3,1)
    3. Fundamental operation on Set is contains just as head/tail for Lists and index for Vector

Maps

Class Map[Key, Value] extends collection type Iterable[(Key, Value)] thus it supports all operations that iterables do.

Note that maps extend iterables of key/value pairs, so key -> value is just an alternative way of writing (key, value)

Class Map[Key, Value] also extends the function type Key => Value so maps can be used anywhere functions can. In particular, it can be applied to key arguments.

Applying a map to a non-existent key gives error:

capitalOfCountry("Andorra")//java.util.NoSuchElementException: key not found: Andorra

To query a map without knowing whether the key exists, we can use get, which returns an Option value

Since Option classes are case classes, we can decompose them using pattern matching

Maps are partial functions as they could lead to exception if no key found. The operation withDefaultValue turns a map into a total function.
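A sketch of these map operations; capitalOfCountry is the name used above, while the entries and the showCapital helper are assumptions:

```scala
val capitalOfCountry = Map("US" -> "Washington", "Switzerland" -> "Bern")

// A map can be applied like a function.
val known: String = capitalOfCountry("US")

// get returns an Option instead of throwing for a missing key.
val missing: Option[String] = capitalOfCountry.get("Andorra")

// Options can be decomposed with pattern matching.
def showCapital(country: String): String = capitalOfCountry.get(country) match {
  case Some(capital) => capital
  case None          => "missing data"
}

// withDefaultValue turns the map into a total function.
val cap1 = capitalOfCountry.withDefaultValue("<unknown>")
val fallback: String = cap1("Andorra")
```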

Repeated parameter

For convenience, we can convert below function call:

to

Inside Polynom function, binding is seen as a Seq[(Int,Double)]
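The two calls are not shown in the extracted text. One plausible reading, sketched here as an assumption, is a Polynom class whose auxiliary constructor takes a repeated parameter:

```scala
// terms maps exponents to coefficients.
class Polynom(val terms: Map[Int, Double]) {
  // Repeated parameter: inside the constructor, bindings is a Seq[(Int, Double)].
  def this(bindings: (Int, Double)*) = this(bindings.toMap)
}

// Passing an explicit Map...
val viaMap = new Polynom(Map(1 -> 2.0, 3 -> 4.0, 5 -> 6.2))

// ...or, more conveniently, the bindings themselves.
val viaVarargs = new Polynom(1 -> 2.0, 3 -> 4.0, 5 -> 6.2)
```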

sortWith and groupBy

groupBy partitions a collection into a map of collections according to a discriminator function f
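For example, with a made-up list of fruit names:

```scala
val fruit = List("apple", "pear", "orange", "pineapple")

// sortWith takes the comparison to use; sorted uses the natural ordering.
val byLength = fruit.sortWith(_.length < _.length)
val natural  = fruit.sorted

// groupBy uses the first letter as the discriminator.
val byFirstLetter: Map[Char, List[String]] = fruit.groupBy(_.head)
```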