I'm currently working through the book Programming in Scala, 2nd ed. I'm in chapter 26, which covers extractors.
What are Extractors?
Earlier in the book case classes were introduced. One of the things you can do with case classes is use them in pattern matching expressions. For example, you might have an Email case class to describe an email address:
    case class Email(username: String, domain: String)

    def isEmail(e: Any): Boolean = e match {
      case Email(_, _) => true
      case _ => false
    }
But what if you are dealing with values that aren't instances of case classes? Extractors let you define arbitrary patterns for match expressions without needing to set up case classes.
To understand extractors I find it helpful to first understand their inverse, injections. Injections take inputs and produce an output of some type, not unlike a case class constructor. Injections have an apply method. So instead of our Email case class above, we could define a somewhat similar injection like so:
    object Email {
      def apply(username: String, domain: String) = username + "@" + domain
    }
(This is basically the example given in the book.)
An extractor, on the other hand, works in the reverse direction. In this case it defines an unapply method that takes a candidate string and, if it is an email address, returns the parts that correspond to the arguments of the apply method:
    object Email {
      def apply(username: String, domain: String) = username + "@" + domain

      def unapply(s: String): Option[(String, String)] = {
        val parts = s split "@"
        if (parts.length == 2) Some((parts(0), parts(1))) else None
      }
    }
An extractor requires the unapply method; the apply method is optional.
Also note that the unapply method returns an Option and will return None if the string is not an email address.
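To see the extractor at work in a match expression, here is a minimal sketch. The Email object repeats the definition above (with explicit return types); EmailDemo and describe are names I made up for the illustration.

```scala
object Email {
  // Injection: build the address string from its parts.
  def apply(username: String, domain: String): String = username + "@" + domain

  // Extraction: recover the parts, or None if the string does not
  // split into exactly two pieces around "@".
  def unapply(s: String): Option[(String, String)] = {
    val parts = s.split("@")
    if (parts.length == 2) Some((parts(0), parts(1))) else None
  }
}

object EmailDemo {
  // `case Email(user, domain)` calls Email.unapply on the matched value.
  // Values that are not Strings fall through to the default case.
  def describe(x: Any): String = x match {
    case Email(user, domain) => "user " + user + " at " + domain
    case _                   => "not an email address"
  }
}
```

A None from unapply simply means "this case does not match", so the match moves on to the next case.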
For more details see Extractor Objects. For example, you can have an unapply method that returns a Boolean, and there is the unapplySeq method for handling a variable number of arguments to an extractor.
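As a rough sketch of those two variants: a Boolean unapply only tests the value and binds nothing, while unapplySeq returns a sequence so that patterns of different lengths can match. UpperCase, Domain, and ExtractorDemo are illustrative names, not part of any library.

```scala
// Boolean extractor: the pattern either matches or not, and binds no variables.
object UpperCase {
  def unapply(s: String): Boolean = s.toUpperCase == s
}

// Variable-length extractor: splits a host name on dots and reverses the parts,
// so "www.example.org" yields List("org", "example", "www").
object Domain {
  def unapplySeq(whole: String): Option[Seq[String]] =
    Some(whole.split("\\.").reverse.toList)
}

object ExtractorDemo {
  // A Boolean extractor is written with empty parentheses in the pattern.
  def shout(s: String): String = s match {
    case UpperCase() => "all upper case"
    case _           => "mixed or lower case"
  }

  // The same extractor matches patterns with two or three sub-patterns.
  def classify(host: String): String = host match {
    case Domain("org", name)      => name + " (org, no subdomain)"
    case Domain("org", name, sub) => name + " (org, subdomain " + sub + ")"
    case _                        => "something else"
  }
}
```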
When to use extractors vs case classes?
So extractors let you define match expression patterns in much the same way as can be achieved with case classes. When do you use extractors instead of case classes?
The advantage that extractors have over case classes is representation independence. Case classes expose the implementation type of an API's data model. An extractor lets you hide the implementation type(s) of the data model and change them later without requiring client code to change as well.
For example, we can use our Email extractor alongside an email class that is not a case class, like so:
    class EmailImpl(val username: String, val domain: String) {
      override def toString: String = username + "@" + domain
    }

    object Email {
      def unapply(s: String): Option[(String, String)] = {
        val parts = s split "@"
        if (parts.length == 2) Some((parts(0), parts(1))) else None
      }
    }
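Note that this unapply still takes a String, so it matches the string form of an address rather than an EmailImpl value. A variant that matches EmailImpl values directly might look like the following sketch. It assumes we are free to make apply return an EmailImpl, which is my own change and not the book's example; ClientCode and domainOf are made-up names.

```scala
class EmailImpl(val username: String, val domain: String) {
  override def toString: String = username + "@" + domain
}

object Email {
  // Assumption for this sketch: apply builds an EmailImpl rather than a String.
  def apply(username: String, domain: String): EmailImpl =
    new EmailImpl(username, domain)

  // Patterns only ever see the (username, domain) pair, so EmailImpl could be
  // replaced by another class later without touching the client's patterns.
  def unapply(e: EmailImpl): Option[(String, String)] =
    Some((e.username, e.domain))
}

object ClientCode {
  // Client code names only the Email extractor, never EmailImpl.
  def domainOf(x: Any): Option[String] = x match {
    case Email(_, domain) => Some(domain)
    case _                => None
  }
}
```

With this version a plain string no longer matches, because unapply is only tried on values that are EmailImpl instances.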
Case classes do have advantages, though. They are simpler, can be optimized better by the Scala compiler, and with sealed case class hierarchies the compiler can also catch missed cases.
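A small sketch of the sealed hierarchy point, using a hypothetical Contact type that is not from the book:

```scala
// Sealed: every subclass must be defined in this file, so the compiler
// knows the complete set of cases.
sealed trait Contact
case class EmailContact(username: String, domain: String) extends Contact
case class PhoneContact(number: String) extends Contact

object ContactDemo {
  // Both subclasses are handled here. If one of the cases were deleted,
  // the compiler should warn that the match may not be exhaustive.
  def render(c: Contact): String = c match {
    case EmailContact(user, domain) => user + "@" + domain
    case PhoneContact(number)       => "tel:" + number
  }
}
```

With hand-written extractors the compiler generally cannot do this check, since it has no way of knowing which values a given unapply accepts.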
You can always start with case classes and then switch to extractors once there is a need to change the data model's concrete representation type.
An example: Date
Extractors seem like a great way of working with unstructured data and APIs where you don't have control over the source code. As an example, consider the following extractor for getting the individual time values out of a java.util.Date instance:
    import java.util.{Calendar => JCal, Date => JDate}

    object Date {
      def unapply(d: JDate): Option[(Int, Int, Int, Int, Int, Int, Int)] = {
        val cal = JCal.getInstance
        cal.setTime(d)
        Some((cal get JCal.YEAR, cal get JCal.MONTH, cal get JCal.DAY_OF_MONTH,
              cal get JCal.HOUR_OF_DAY, cal get JCal.MINUTE, cal get JCal.SECOND,
              cal get JCal.MILLISECOND))
      }
    }
This allows you to write match expressions against java.util.Date instances, like so:
    def isFebruary(d: JDate): Boolean = d match {
      // Calendar.MONTH is zero-based, so 1 means February
      case Date(_, 1, _, _, _, _, _) => true
      case _ => false
    }
You can also just use them to unpack values:
    val Date(year, month, day, hour, minutes, seconds, milliseconds) = new JDate
Now you have a much simpler way to work with Java date instances.
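One thing to watch for is that Calendar.MONTH is zero-based, so the month the extractor hands back is one less than the number you would write on paper. The sketch below repeats the Date extractor and checks it against a known date; DateDemo, at, and monthNumber are hypothetical helpers added only for the illustration.

```scala
import java.util.{Calendar => JCal, Date => JDate}

object Date {
  def unapply(d: JDate): Option[(Int, Int, Int, Int, Int, Int, Int)] = {
    val cal = JCal.getInstance
    cal.setTime(d)
    Some((cal.get(JCal.YEAR), cal.get(JCal.MONTH), cal.get(JCal.DAY_OF_MONTH),
          cal.get(JCal.HOUR_OF_DAY), cal.get(JCal.MINUTE), cal.get(JCal.SECOND),
          cal.get(JCal.MILLISECOND)))
  }
}

object DateDemo {
  // Hypothetical helper: build a JDate for a known local date and time.
  // The month argument is zero-based, as Calendar expects.
  def at(year: Int, month: Int, day: Int, hour: Int, minute: Int, second: Int): JDate = {
    val cal = JCal.getInstance
    cal.clear() // reset every field, so milliseconds start at 0
    cal.set(year, month, day, hour, minute, second)
    cal.getTime
  }

  // Human-readable month number: add 1 to the zero-based Calendar month.
  def monthNumber(d: JDate): Option[Int] = d match {
    case Date(_, month, _, _, _, _, _) => Some(month + 1)
    case _                             => None // safety net; unapply above always returns Some
  }
}
```

Both the helper and the extractor use the default time zone, so the values that go in are the values that come back out.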
Final thoughts
Extractors are kind of hard to think about because you have to think in terms of turning outputs back into inputs. Also, it's not always obvious what would be useful for unapply to return. When working through the examples and making up some of my own, I found it useful to think about how I would implement the injection (i.e., the apply method) first.
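One way to check that an apply and unapply pair really are inverses is to feed the output of apply straight back into unapply. A minimal sketch using the string-based Email object from earlier (RoundTrip and holds are made-up names):

```scala
object Email {
  def apply(username: String, domain: String): String = username + "@" + domain

  def unapply(s: String): Option[(String, String)] = {
    val parts = s.split("@")
    if (parts.length == 2) Some((parts(0), parts(1))) else None
  }
}

object RoundTrip {
  // True when extracting what was just injected gives back the original parts.
  def holds(username: String, domain: String): Boolean =
    Email.unapply(Email(username, domain)) == Some((username, domain))
}
```

The check fails for inputs the injection should arguably never have accepted, such as a username that itself contains "@", which is a handy way to spot gaps in either method.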