Introduction
In this post we are going to look at adding validation to the code we worked on in Part 6. We will use active patterns that we looked at in the last post (Part 7) and we will see how you can easily model domain errors.
Setting Up
We are going to use the code from Part 6.
Solving the Problem
The first thing we need to do is create a new record type that will store our validated data. Add this type below the Customer type definition:
type ValidatedCustomer = {
    CustomerId : string
    Email : string option
    IsEligible : bool
    IsRegistered : bool
    DateRegistered : DateTime option
    Discount : decimal option
}
Have a look at the source data to understand why some of the parts are optional.
We now need to add a new helper function to create a ValidatedCustomer:
let create customerId email isEligible isRegistered dateRegistered discount =
    {
        CustomerId = customerId
        Email = email
        IsEligible = isEligible
        IsRegistered = isRegistered
        DateRegistered = dateRegistered
        Discount = discount
    }
Now we need to think about how we handle validation errors. The obvious choices are Option, but then we lose the reason for the error, or Result, but that makes our new create function harder to use because it expects plain values rather than Results. We will use Result. We also need to consider what types of error we expect. The obvious ones are missing data (an empty string) and invalid data (a string that cannot be converted to a DateTime, decimal or boolean). We want to return all of the errors, not just the first one, so we need a list of errors in the output. Let's create the error type as a discriminated union with tupled data:
type ValidationError =
    | MissingData of name: string
    | InvalidData of name: string * value: string
For missing data, we only need the name of the item but for invalid data, we want the name and the value that failed.
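To see how the two cases carry different data, here is a minimal sketch that turns each error into a message. The describeError function and its message wording are assumptions for illustration and are not part of the code from this series.

```fsharp
type ValidationError =
    | MissingData of name: string
    | InvalidData of name: string * value: string

// Hypothetical helper: converts an error into a readable message.
let describeError error =
    match error with
    | MissingData name -> sprintf "%s is missing" name
    | InvalidData (name, value) -> sprintf "%s has an invalid value: '%s'" name value

// A list of errors, which is the shape we want validation to return.
let errors = [ MissingData "CustomerId"; InvalidData ("Discount", "abc") ]

errors |> List.map describeError |> List.iter (printfn "%s")
```

Pattern matching gives us the name for MissingData and both the name and the failing value for InvalidData, so nothing about the failure is lost.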
Now we will create some helper functions using active patterns to handle empty string, email regex and booleans:
let (|ParseRegex|_|) regex str =
    let m = Regex(regex).Match(str)
    if m.Success then Some (List.tail [ for x in m.Groups -> x.Value ])
    else None

let (|IsValidEmail|_|) input =
    match input with
    | ParseRegex ".*?@(.*)" [ _ ] -> Some input
    | _ -> None

let (|IsEmptyString|_|) (input:string) =
    if input.Trim() = "" then Some () else None

let (|IsBoolean|_|) (input:string) =
    match input with
    | "1" -> Some true
    | "0" -> Some false
    | _ -> None
You will need to add 'open System.Text.RegularExpressions' in the declarations at the top of the file.
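As a quick way to try the active patterns out, the sketch below repeats them and uses them in a single match expression. The classify function is a hypothetical helper added only for this example, and the type annotations on ParseRegex are an addition to keep the snippet self-contained. It assumes .NET Core or later, where the regex groups can be enumerated as Group values.

```fsharp
open System.Text.RegularExpressions

let (|ParseRegex|_|) (regex: string) (str: string) =
    let m = Regex(regex).Match(str)
    if m.Success then Some (List.tail [ for x in m.Groups -> x.Value ])
    else None

let (|IsValidEmail|_|) input =
    match input with
    | ParseRegex ".*?@(.*)" [ _ ] -> Some input
    | _ -> None

let (|IsEmptyString|_|) (input: string) =
    if input.Trim() = "" then Some () else None

let (|IsBoolean|_|) (input: string) =
    match input with
    | "1" -> Some true
    | "0" -> Some false
    | _ -> None

// Hypothetical helper: reports which pattern matched first.
let classify (input: string) =
    match input with
    | IsEmptyString -> "empty"
    | IsBoolean b -> sprintf "boolean %b" b
    | IsValidEmail email -> sprintf "email %s" email
    | _ -> "something else"

[ "   "; "1"; "john@test.com"; "hello" ]
|> List.iter (fun s -> printfn "'%s' -> %s" s (classify s))
```

Note that the email pattern only requires an '@' somewhere in the string, so it is a very loose check rather than a full email validator.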
Now let's create our validate functions using our new ValidationError discriminated union:
let validateCustomerId customerId = // string -> Result<string, ValidationError>
    if customerId <> "" then Ok customerId
    else Error (MissingData "CustomerId")

let validateEmail email = // string -> Result<string option, ValidationError>
    if email <> "" then
        match email with
        | IsValidEmail _ -> Ok (Some email)
        | _ -> Error (InvalidData ("Email", email))
    else
        Ok None

let validateIsEligible (isEligible:string) = // string -> Result<bool, ValidationError>
    match isEligible with
    | IsBoolean b -> Ok b
    | _ -> Error (InvalidData ("IsEligible", isEligible))

let validateIsRegistered (isRegistered:string) = // string -> Result<bool, ValidationError>
    match isRegistered with
    | IsBoolean b -> Ok b
    | _ -> Error (InvalidData ("IsRegistered", isRegistered))

let validateDateRegistered (dateRegistered:string) = // string -> Result<DateTime option, ValidationError>
    match dateRegistered with
    | IsEmptyString -> Ok None
    | _ ->
        let (success, value) = dateRegistered |> DateTime.TryParse
        if success then Ok (Some value)
        else Error (InvalidData ("DateRegistered", dateRegistered))

let validateDiscount discount = // string -> Result<decimal option, ValidationError>
    match discount with
    | IsEmptyString -> Ok None
    | _ ->
        try
            discount
            |> decimal
            |> Some
            |> Ok
        with
        | _ -> Error (InvalidData ("Discount", discount))
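To get a feel for what these functions return, the following sketch repeats two of them along with the types they need and calls them with a few sample inputs. The sample values are made up for illustration.

```fsharp
type ValidationError =
    | MissingData of name: string
    | InvalidData of name: string * value: string

let (|IsEmptyString|_|) (input: string) =
    if input.Trim() = "" then Some () else None

let validateCustomerId customerId =
    if customerId <> "" then Ok customerId
    else Error (MissingData "CustomerId")

let validateDiscount discount =
    match discount with
    | IsEmptyString -> Ok None
    | _ ->
        try
            discount |> decimal |> Some |> Ok
        with
        | _ -> Error (InvalidData ("Discount", discount))

// A missing required value is an error.
printfn "%A" (validateCustomerId "")
// A missing optional value is fine and becomes None.
printfn "%A" (validateDiscount "")
// A value that converts becomes Some.
printfn "%A" (validateDiscount "10")
// A value that does not convert is reported with its name and content.
printfn "%A" (validateDiscount "abc")
```

The important distinction is between required fields, where an empty string is an error, and optional fields, where an empty string is valid and maps to None.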
We now need to create a validation function. Notice that I have added the expected return type:
let validate (input:Customer) : Result<ValidatedCustomer, ValidationError list> =
    let customerId = input.CustomerId |> validateCustomerId
    let email = input.Email |> validateEmail
    let isEligible = input.IsEligible |> validateIsEligible
    let isRegistered = input.IsRegistered |> validateIsRegistered
    let dateRegistered = input.DateRegistered |> validateDateRegistered
    let discount = input.Discount |> validateDiscount
    create customerId email isEligible isRegistered dateRegistered discount // We have a problem
As noted earlier, we now have a problem: the create function isn't expecting Result types. With the skills and knowledge that we have from this series, we can solve this, but not in a very elegant way!
Firstly, we create a couple of helper functions to extract Error and Ok data:
let getError input =
    match input with
    | Ok _ -> []
    | Error ex -> [ ex ]

let getValue input =
    match input with
    | Ok v -> v
    | _ -> failwith "Oops, you shouldn't have got here!"
Now we create a list of potential errors, concatenate them and then check to see if there are any:
let validate (input:Customer) : Result<ValidatedCustomer, ValidationError list> =
    let customerId = input.CustomerId |> validateCustomerId
    let email = input.Email |> validateEmail
    let isEligible = input.IsEligible |> validateIsEligible
    let isRegistered = input.IsRegistered |> validateIsRegistered
    let dateRegistered = input.DateRegistered |> validateDateRegistered
    let discount = input.Discount |> validateDiscount
    let errors =
        [
            customerId |> getError;
            email |> getError;
            isEligible |> getError;
            isRegistered |> getError;
            dateRegistered |> getError;
            discount |> getError
        ]
        |> List.concat
    match errors with
    | [] -> Ok (create (customerId |> getValue) (email |> getValue) (isEligible |> getValue) (isRegistered |> getValue) (dateRegistered |> getValue) (discount |> getValue))
    | _ -> Error errors
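The same collect-then-check approach can be seen in isolation with a smaller example. The sketch below uses a hypothetical two-field Login record and validators that are not part of the customer code. They exist only to keep the example short enough to run on its own.

```fsharp
type ValidationError =
    | MissingData of name: string
    | InvalidData of name: string * value: string

// Hypothetical record, used only for this sketch.
type Login = { UserName : string; IsActive : bool }

let validateUserName userName =
    if userName <> "" then Ok userName
    else Error (MissingData "UserName")

let validateIsActive isActive =
    match isActive with
    | "1" -> Ok true
    | "0" -> Ok false
    | _ -> Error (InvalidData ("IsActive", isActive))

let getError input =
    match input with
    | Ok _ -> []
    | Error ex -> [ ex ]

let getValue input =
    match input with
    | Ok v -> v
    | _ -> failwith "Oops, you shouldn't have got here!"

let validateLogin userName isActive =
    let userName = userName |> validateUserName
    let isActive = isActive |> validateIsActive
    // Gather the errors from every field before deciding the outcome.
    let errors =
        [ userName |> getError; isActive |> getError ]
        |> List.concat
    match errors with
    | [] -> Ok { UserName = getValue userName; IsActive = getValue isActive }
    | _ -> Error errors

printfn "%A" (validateLogin "bob" "1")
printfn "%A" (validateLogin "" "x")
```

Because every field is validated before the errors are checked, the second call reports both problems rather than stopping at the first. getValue is only called when the error list is empty, which is why its failure branch should never run.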
Finally, we need to plug the validation into the pipeline:
let parse (data:string seq) = // seq<string> -> seq<Result<ValidatedCustomer, ValidationError list>>
    data
    |> Seq.skip 1
    |> Seq.map parseLine
    |> Seq.choose id
    |> Seq.map validate
If you run the code using 'dotnet run', you should get some typed data as output.
If you want to see how we can solve this in a more idiomatically functional way, have a look at my post on Functional Validation in F# Using Applicatives that I did for the 2019 F# Advent Calendar.
Final Code
This is what you should have ended up with:
open System
open System.IO
open System.Text.RegularExpressions

type Customer = {
    CustomerId : string
    Email : string
    IsEligible : string
    IsRegistered : string
    DateRegistered : string
    Discount : string
}

type ValidatedCustomer = {
    CustomerId : string
    Email : string option
    IsEligible : bool
    IsRegistered : bool
    DateRegistered : DateTime option
    Discount : decimal option
}

type ValidationError =
    | MissingData of name: string
    | InvalidData of name: string * value: string

type FileReader = string -> Result<string seq, exn>
let readFile : FileReader =
    fun path ->
        try
            seq {
                use reader = new StreamReader(File.OpenRead(path))
                while not reader.EndOfStream do
                    yield reader.ReadLine()
            }
            |> Ok
        with
        | ex -> Error ex

let parseLine (line:string) : Customer option =
    match line.Split('|') with
    | [| customerId; email; eligible; registered; dateRegistered; discount |] ->
        Some {
            CustomerId = customerId
            Email = email
            IsEligible = eligible
            IsRegistered = registered
            DateRegistered = dateRegistered
            Discount = discount
        }
    | _ -> None
let create customerId email isEligible isRegistered dateRegistered discount =
    {
        CustomerId = customerId
        Email = email
        IsEligible = isEligible
        IsRegistered = isRegistered
        DateRegistered = dateRegistered
        Discount = discount
    }

let (|ParseRegex|_|) regex str =
    let m = Regex(regex).Match(str)
    if m.Success then Some (List.tail [ for x in m.Groups -> x.Value ])
    else None

let (|IsValidEmail|_|) input =
    match input with
    | ParseRegex ".*?@(.*)" [ _ ] -> Some input
    | _ -> None

let (|IsEmptyString|_|) (input:string) =
    if input.Trim() = "" then Some () else None

let (|IsBoolean|_|) (input:string) =
    match input with
    | "1" -> Some true
    | "0" -> Some false
    | _ -> None
let validateCustomerId customerId =
    if customerId <> "" then Ok customerId
    else Error <| MissingData "CustomerId"

let validateEmail email =
    if email <> "" then
        match email with
        | IsValidEmail _ -> Ok (Some email)
        | _ -> Error (InvalidData ("Email", email))
    else
        Ok None

let validateIsEligible (isEligible:string) =
    match isEligible with
    | IsBoolean b -> Ok b
    | _ -> Error (InvalidData ("IsEligible", isEligible))

let validateIsRegistered (isRegistered:string) =
    match isRegistered with
    | IsBoolean b -> Ok b
    | _ -> Error (InvalidData ("IsRegistered", isRegistered))

let validateDateRegistered (dateRegistered:string) =
    match dateRegistered with
    | IsEmptyString -> Ok None
    | _ ->
        let (success, value) = dateRegistered |> DateTime.TryParse
        if success then Ok (Some value)
        else Error (InvalidData ("DateRegistered", dateRegistered))

let validateDiscount discount =
    match discount with
    | IsEmptyString -> Ok None
    | _ ->
        try
            discount
            |> decimal
            |> Some
            |> Ok
        with
        | _ -> Error (InvalidData ("Discount", discount))
let getError input =
    match input with
    | Ok _ -> []
    | Error ex -> [ ex ]

let getValue input =
    match input with
    | Ok v -> v
    | _ -> failwith "Oops, you shouldn't have got here!"

let validate (input:Customer) : Result<ValidatedCustomer, ValidationError list> =
    let customerId = input.CustomerId |> validateCustomerId
    let email = input.Email |> validateEmail
    let isEligible = input.IsEligible |> validateIsEligible
    let isRegistered = input.IsRegistered |> validateIsRegistered
    let dateRegistered = input.DateRegistered |> validateDateRegistered
    let discount = input.Discount |> validateDiscount
    let errors =
        [
            customerId |> getError;
            email |> getError;
            isEligible |> getError;
            isRegistered |> getError;
            dateRegistered |> getError;
            discount |> getError
        ]
        |> List.concat
    match errors with
    | [] -> Ok (create (customerId |> getValue) (email |> getValue) (isEligible |> getValue) (isRegistered |> getValue) (dateRegistered |> getValue) (discount |> getValue))
    | _ -> Error errors
let parse (data:string seq) =
    data
    |> Seq.skip 1
    |> Seq.map parseLine
    |> Seq.choose id
    |> Seq.map validate

let output data =
    data
    |> Seq.iter (fun x -> printfn "%A" x)

let import (fileReader:FileReader) path =
    match path |> fileReader with
    | Ok data -> data |> parse |> output
    | Error ex -> printfn "Error: %A" ex

[<EntryPoint>]
let main argv =
    import readFile @"D:\temp\customers.csv"
    0
Conclusion
In this post we have looked at how we can add validation by using active patterns and how easy it is to add additional functionality into the data processing pipeline.
In the next post we will look at improving the code from the first post we worked on by using more domain terminology.
If you have any comments on this series of posts or suggestions for new ones, send me a tweet (@ijrussell) and let me know.