Introduction to Functional Programming in F# – Part 6

Introduction

Welcome to the sixth post in this introductory series on functional programming in F#. In this post, we will introduce the basics of reading and parsing external data using sequences and the Seq module, and show how to isolate code that talks to external services so that our codebase stays as testable as possible.

Setting Up

Copy the following data into a file. I have created a file called "customers.csv" and stored it in "D:\temp".

CustomerId|Email|Eligible|Registered|DateRegistered|Discount
John|john@test.com|1|1|2015-01-23|0.1
Mary|mary@test.com|1|1|2018-12-12|0.1
Richard|richard@nottest.com|0|1|2016-03-23|0.0
Sarah||0|0||

Now create a console application in a new folder.

dotnet new console -lang F#

Now we are ready to start.

Solving the Problem

We are going to use features of the built-in System.IO classes, so first we need to open the namespace;

open System.IO

and then we need a function that takes a path as a string and returns a collection of strings from the file;

let readFile path = // string -> seq<string>
    seq { use reader = new StreamReader(File.OpenRead(path))
          while not reader.EndOfStream do
              yield reader.ReadLine() 
    }

There are a few new things in this simple function!

seq is called a Sequence Expression. The code inside the curly brackets creates a sequence of strings. For example, seq { 1..5 } creates the sequence { 1; 2; 3; 4; 5 }.

StreamReader implements the IDisposable interface. F# deals with that through the 'use' keyword, which guarantees that the reader is disposed of when it goes out of scope, and the 'new' keyword, which the compiler expects when constructing a disposable type.

'yield' adds that item to the sequence.
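
To see these features on their own, here is a minimal sketch (the linesFromText function name is made up for illustration) that builds a sequence from an in-memory string using System.IO.StringReader, which also implements IDisposable;

open System.IO

// A small sketch of a sequence expression.
// 'use' ensures the StringReader is disposed when the sequence has been consumed,
// and 'yield' adds each line to the sequence.
let linesFromText (text:string) = // string -> seq<string>
    seq { use reader = new StringReader(text)
          while reader.Peek() >= 0 do
              yield reader.ReadLine()
    }

linesFromText "one\ntwo\nthree"
|> Seq.iter (fun x -> printfn "%s" x) // prints one, two, three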

Now we need to write some code in the main function to call our readFile function and output the data to the Terminal window;

@"D:\temp\customers.csv"
|> readFile 
|> Seq.iter (fun x -> printfn "%s" x)

You must leave the '0' at the end of the main function; it is the exit code the program returns to the operating system.

Seq is the sequence module, which has a wide range of functions available, similar to List and Array. Seq.iter iterates over the sequence, applying the supplied function to each item, and returns unit.
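
As a quick illustration of working with the Seq module on its own (the numbers are made up purely for demonstration);

// Lazily generate the numbers 1 to 10, keep the even ones,
// square them and print each result. Seq.iter returns unit.
seq { 1..10 }
|> Seq.filter (fun n -> n % 2 = 0)
|> Seq.map (fun n -> n * n)
|> Seq.iter (fun n -> printfn "%i" n)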

The code in Program.fs should now look like this;

open System.IO

let readFile path = // string -> seq<string>
    seq { use reader = new StreamReader(File.OpenRead(path))
          while not reader.EndOfStream do
              yield reader.ReadLine() 
    }

[<EntryPoint>]
let main argv =
    @"D:\temp\customers.csv"
    |> readFile 
    |> Seq.iter (fun x -> printfn "%s" x)
    0

Run the code by typing 'dotnet run' in the Terminal.

To handle potential errors from loading a file, we are going to add some error handling to the readFile function;

let readFile path = // string -> Result<seq<string>, exn>
    try
        seq { use reader = new StreamReader(File.OpenRead(path))
              while not reader.EndOfStream do
                  yield reader.ReadLine() 
        }
        |> Ok
    with
    | ex -> Error ex

To handle the change to the return type of the readFile function, we will introduce a new function;

let import path =
    match path |> readFile with
    | Ok data -> data |> Seq.iter (fun x -> printfn "%A" x)
    | Error ex -> printfn "Error: %A" ex.Message

and replace the code in the main function with;

import @"D:\temp\customers.csv"

Run the program to check it still works.

Now we want to create a type to read in the data;

type Customer = {
    CustomerId : string
    Email : string
    IsEligible : string
    IsRegistered : string
    DateRegistered : string
    Discount : string
}

and create a function that takes a sequence of strings as input and returns a sequence of Customer;

let parse (data:string seq) = // seq<string> -> seq<Customer>
    data
    |> Seq.skip 1 // Ignore the header row
    |> Seq.map (fun line -> 
        match line.Split('|') with
        | [| customerId; email; eligible; registered; dateRegistered; discount |] ->
            Some { 
                CustomerId = customerId
                Email = email
                IsEligible = eligible
                IsRegistered = registered
                DateRegistered = dateRegistered
                Discount = discount
             }
        | _ -> None
    )
    |> Seq.choose id // Ignore None and unwrap Some

There are some new features in this function:

'Seq.skip 1' will ignore the first item in the sequence as it is not a Customer.

The Split function creates an array of strings. We then pattern match the array to get the data, which we use to populate a Customer. If you aren't interested in all of the data, you can use '_' for those parts. We have now met the three primary collection types in F#: List ([..]), Seq (seq {..}) and Array ([|..|]).

'Seq.choose id' will ignore any item in the sequence that is None and will unwrap the Some items to return a sequence of Customers.
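
As a small standalone sketch of 'Seq.choose id' (the option values are made up for demonstration);

// Seq.choose applies a function that returns an option and keeps only the Some values.
// Passing 'id' means the items are already options: None is dropped, Some is unwrapped.
[ Some 1; None; Some 2; None; Some 3 ]
|> Seq.choose id
|> Seq.iter (fun n -> printfn "%i" n) // prints 1, 2 and 3

Note that the Seq functions happily accept a List here because F# collections can be treated as sequences.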

We also need to add a function to output the sequence of customers to the Terminal window;

let output data =
    data 
    |> Seq.iter (fun x -> printfn "%A" x)

and add this function to the Ok path in the import function;

let import path =
    match path |> readFile with
    | Ok data -> data |> parse |> output
    | Error ex -> printfn "Error: %A" ex.Message

The next stage is to extract the code from the map in the parse function to its own function;

let parseLine (line:string) : Customer option =
    match line.Split('|') with
    | [| customerId; email; eligible; registered; dateRegistered; discount |] ->
        Some { 
            CustomerId = customerId
            Email = email
            IsEligible = eligible
            IsRegistered = registered
            DateRegistered = dateRegistered
            Discount = discount
        }
    | _ -> None

and modify the parse function to use the parseLine function;

let parse (data:string seq) =
    data
    |> Seq.skip 1
    |> Seq.map (fun x -> parseLine x)
    |> Seq.choose id

We can simplify this function by removing the lambda;

let parse (data:string seq) =
    data
    |> Seq.skip 1
    |> Seq.map parseLine
    |> Seq.choose id

Whilst we have improved the code a lot, it is difficult to test without having to load a file. In addition, the signature of the readFile function is 'string -> Result<seq<string>, exn>', which means that it could just as easily have been a URL to a web service rather than a path to a file on disk.

To make this testable and extensible, we can use Higher Order Functions and pass a function as a parameter into the import function;

let import (fileReader:string -> Result<seq<string>, exn>) path =
    match path |> fileReader with
    | Ok data -> data |> parse |> output
    | Error ex -> printfn "Error: %A" ex.Message

This means that we can now pass any function with this signature into the import function.

This signature is quite simple, but signatures can get quite complex, so we can create a named type for the signature and use that instead;

type FileReader = string -> Result<seq<string>, exn>

and replace the function signature in import with it;

let import (fileReader:FileReader) path =
    match path |> fileReader with
    | Ok data -> data |> parse |> output
    | Error ex -> printfn "Error: %A" ex.Message

We can also use it rather like an interface in the readFile function, but it does mean modifying our code a little;

let readFile : FileReader =
    fun path ->
        try
            seq { use reader = new StreamReader(File.OpenRead(path))
                  while not reader.EndOfStream do
                      yield reader.ReadLine() }
            |> Ok
        with
        | ex -> Error ex

We need to make a small change to our call in main to tell it to use the readFile function;

import readFile @"D:\temp\customers.csv"

If we use import with readFile regularly, we can use partial application to create a new function that does that for us;

let importWithFileReader = import readFile

To use it we would simply call;

importWithFileReader @"D:\temp\customers.csv"

The payoff for the work we have done using Higher Order Functions and Type Signatures is that we can easily pass in a fake function for testing like the following;

let fakeFileReader : FileReader =
    fun _ ->
        seq {
            "CustomerId|Email|Eligible|Registered|
DateRegistered|Discount"
            "John|john@test.com|1|1|2015-01-23|0.1"
            "Mary|mary@test.com|1|1|2018-12-12|0.1"
            "Richard|richard@nottest.com|0|1|2016-03-23|0.0"
            "Sarah||0|0||"
        }
        |> Ok

import fakeFileReader "_"

or any other function that satisfies the Type Signature.
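
For example, a hypothetical reader that always fails lets us exercise the Error path of the import function without touching the file system;

// A made-up reader that always returns an Error, useful for testing the Error branch.
let failingFileReader : FileReader =
    fun _ -> Error (exn "Fake reader: could not read the file")

import failingFileReader "any path" // prints the error message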

Final Code

What we have ended up with is the following;

open System.IO

type Customer = {
    CustomerId : string
    Email : string
    IsEligible : string
    IsRegistered : string
    DateRegistered : string
    Discount : string
}

type FileReader = string -> Result<seq<string>, exn>

let readFile : FileReader =
    fun path ->
        try
            seq { use reader = new StreamReader(File.OpenRead(path))
                  while not reader.EndOfStream do
                      yield reader.ReadLine() }
            |> Ok
        with
        | ex -> Error ex

let parseLine (line:string) : Customer option =
    match line.Split('|') with
    | [| customerId; email; eligible; registered; dateRegistered; discount |] ->
        Some { 
            CustomerId = customerId
            Email = email
            IsEligible = eligible
            IsRegistered = registered
            DateRegistered = dateRegistered
            Discount = discount
        }
    | _ -> None

let parse (data:string seq) =
    data
    |> Seq.skip 1
    |> Seq.map parseLine
    |> Seq.choose id

let output data =
    data 
    |> Seq.iter (fun x -> printfn "%A" x)

let import (fileReader:FileReader) path =
    match path |> fileReader with
    | Ok data -> data |> parse |> output
    | Error ex -> printfn "Error: %A" ex.Message

[<EntryPoint>]
let main argv =
    import readFile @"D:\temp\customers.csv"
    0

In a future post, we will extend this code by adding data validation.

Conclusion

In this post, we have looked at how we can import data using some of the most useful functions from the Seq module, along with Sequence Expressions and Type Signatures.

In the next post we will look at another exciting F# feature - Active Patterns.

If you have any comments on this series of posts or suggestions for new ones, send me a tweet (@ijrussell) and let me know.
