Tuesday, October 18, 2011

10 reasons to use the F# runtime in your C# app

Most people have at least noticed that F# shipped with Visual Studio 2010. I mean, you click File –> New Project and there are the F# project templates; you can't miss them.

What most people probably didn't realize is that even if you don't use the F# language or aren't even interested in it, you can still profit from using the F# runtime in C# / VB.NET projects. The F# runtime is just a regular DLL named FSharp.Core.dll you can reference just like any other assembly. It's available for .NET 2.0 and 4.0 (separate DLLs). This availability for .NET 2.0 is particularly valuable for projects that for some reason can't be upgraded to newer versions of .NET.

Granted, the library is designed to be used from F#, so sometimes it looks weird in C#, but we'll see how to work around some of the oddities with FSharpx.

Let's see some of the things FSharp.Core gives you, in no particular order:

Tuples

So you want to use tuples but you can't upgrade to .NET 4 because of company policies or some obscure dependency that breaks. No problem, FSharp.Core.dll implements them, so you can use tuples in .NET 2.0 with exactly the same API and namespace as .NET 4 tuples. If you then upgrade you don't have to change anything.

Tuples are simple but not trivial to implement, for example some forget to implement equality / hashing so you'd end up with "WTF moments" at some point. It's worth using a library that implements them properly.
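To make the equality / hashing point concrete, here's a minimal sketch. It assumes a Tuple<T1, T2> is in scope, either from .NET 4 or from FSharp.Core on .NET 2.0; the class and method names are made up for this demo.

```csharp
using System;
using System.Collections.Generic;

static class TupleEqualityDemo {
    // Two separately constructed tuples with equal items should be equal
    // and should produce the same hash code.
    public static bool EqualByValue() {
        var a = new Tuple<int, string>(1, "one");
        var b = new Tuple<int, string>(1, "one");
        return a.Equals(b) && a.GetHashCode() == b.GetHashCode();
    }

    // Because equality and hashing are structural, a tuple works as a
    // dictionary key: the lookup uses a different but equal instance.
    public static bool UsableAsKey() {
        var cache = new Dictionary<Tuple<int, string>, string>();
        cache[new Tuple<int, string>(1, "one")] = "found";
        return cache.ContainsKey(new Tuple<int, string>(1, "one"));
    }

    static void Main() {
        Console.WriteLine(EqualByValue()); // True
        Console.WriteLine(UsableAsKey());  // True
    }
}
```

A hand-rolled tuple class that forgets to override Equals and GetHashCode would return False for both checks, which is exactly the kind of "WTF moment" mentioned above.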

As usual, keep in mind that tuples are essentially anonymous. Item1, Item2, etc don't convey any information about what they're holding, only their types.

Persistent collections

Persistent lists are one of the most frequently used data structures in functional programming. They're so prevalent that F# has special syntax for them. For example, to define an empty list in F#:

let empty = []

F# infers the list element type. In C# things are more verbose:

var empty = FSharpList<int>.Empty;

To add an element to a list you actually create a new list that has the new element as head and the other list as tail. Again, F# has special syntax:

let a = 1::empty

While in C#:

var a = new FSharpList<int>(1, empty);

or:

var a = FSharpList<int>.Cons(1, empty);

FSharpx helps with a little sugar here:

var a = empty.Cons(1);

You can also create an immutable list from any IEnumerable<T>:

var b = SeqModule.ToList(new[] { 1, 2, 3 });

Again, FSharpx adds some sugar:

var b = new[] { 1, 2, 3 }.ToFSharpList();

or:

var b = FSharpList.Create(1, 2, 3);

How do you use an FSharpList? You can access a particular element just as with a regular mutable list:

Console.WriteLine(b[2]); // prints "3"

Be aware that random access in an immutable linked list is O(n).

FSharpList implements IEnumerable<T>, so you can traverse it with foreach and use all LINQ functions (Aggregate, Select, Where, etc).

Functional languages often use pattern matching and recursion to process a list. The F# wikibook has a great chapter explaining it. FSharpx implements basic pattern matching on lists for C#, so you can write this to reverse a list:

[Test]
public void Reverse() {
    var a = Enumerable.Range(0, 1000).ToFSharpList();
    var r = Loop(FSharpList<int>.Empty, a);
    Console.WriteLine(r);
}

static FSharpList<T> Loop<T>(FSharpList<T> acc, FSharpList<T> l) {
    return l.Match(() => acc,
                   (head, tail) => Loop(acc.Cons(head), tail));
}

But be careful! F# compiles the equivalent code using tail call optimization, while C# doesn't have that feature, so the above code blows up with a StackOverflowException when given a sufficiently big list (unless you've compiled with optimizations and are running on a 64-bit CLR!).

When recursively processing lists, it's best to use Aggregate() (usually called fold in functional languages) instead if possible. It encapsulates the recursion without blowing the stack, and it's also simpler:

var a = Enumerable.Range(0, 1000000).ToFSharpList();
var r = a.Aggregate(FSharpList<int>.Empty, (acc, i) => acc.Cons(i));

Of course, this is just demo code. If you really want to reverse a list just call ListModule.Reverse(a);

FSharp.Core also implements a persistent set and dictionary.
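Here's a small sketch of what using them from plain C# (no FSharpx) might look like. It assumes the FSharpSet<T> and FSharpMap<TKey, TValue> classes from Microsoft.FSharp.Collections and their constructors that take a sequence; the demo class and method names are made up.

```csharp
using System;
using Microsoft.FSharp.Collections;

static class PersistentSetAndMapDemo {
    // Add returns a new set; the set it was called on stays as it was.
    public static bool SetAddLeavesOriginalAlone() {
        var before = new FSharpSet<int>(new[] { 1, 2, 3 });
        var after = before.Add(4);
        return !before.Contains(4) && after.Contains(4);
    }

    // Same idea for the map, which is built from key/value tuples.
    public static bool MapAddLeavesOriginalAlone() {
        var before = new FSharpMap<string, int>(
            new[] { new Tuple<string, int>("a", 1) });
        var after = before.Add("b", 2);
        return !before.ContainsKey("b") && after["b"] == 2;
    }

    static void Main() {
        Console.WriteLine(SetAddLeavesOriginalAlone()); // True
        Console.WriteLine(MapAddLeavesOriginalAlone()); // True
    }
}
```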

Imperative programmers might wonder why they should use an immutable collection when the BCL already has several perfectly good mutable collections.

One of the most cited reasons for using persistent collections (and functional programming in general) is multithreading. Indeed you can freely and safely pass persistent collections around threads, which makes multithreaded development easier. However, the same can be said about passing collections around regular functions: you can be sure that no function can ever modify a list, therefore you have one less thing to keep track of in your head and you statically eliminate a whole class of bugs. Immutability makes all kinds of programming simpler, multithreaded or not. Of course, for immutable collections to really work as immutable, the underlying element type must also be immutable.
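As a rough illustration of that guarantee, the sketch below hands a list to a function that "adds" an element. The function name WithExtra and the demo class are hypothetical; only FSharpList<T> and its Cons, Length and Tail members come from FSharp.Core.

```csharp
using System;
using Microsoft.FSharp.Collections;

static class ImmutabilityDemo {
    // A callee can only build a new list on top of the one it received;
    // there is no method on FSharpList<T> that would let it change the original.
    static FSharpList<int> WithExtra(FSharpList<int> list) {
        return FSharpList<int>.Cons(0, list);
    }

    public static bool OriginalUntouched() {
        var original = FSharpList<int>.Cons(1, FSharpList<int>.Empty);
        var extended = WithExtra(original);
        return original.Length == 1
            && extended.Length == 2
            // The new list reuses the old one as its tail instead of copying it.
            && object.ReferenceEquals(extended.Tail, original);
    }

    static void Main() {
        Console.WriteLine(OriginalUntouched()); // True
    }
}
```

The tail sharing is also why "adding" to the front of a persistent list is cheap: nothing gets copied.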

Reactive Extensions also includes an ImmutableList class, although it's internal.

The Option type

I have blogged before about using the F# Option type in C# projects here and here. Options are used pervasively in F#; for example, several functions on collections use options. The problem is that these functions don't take a System.Func but its F# equivalent, an FSharpFunc, which makes them very inconvenient to call from C#.
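To show how inconvenient, here's a sketch of calling List.tryFind from C# with FSharp.Core alone. It assumes the FuncConvert.ToFSharpFunc overload that takes a Converter<T, U>; the demo class name is made up.

```csharp
using System;
using Microsoft.FSharp.Collections;
using Microsoft.FSharp.Core;

static class RawOptionDemo {
    // Returns true if some element of the list is greater than the threshold.
    public static bool AnyGreaterThan(int threshold) {
        var list = ListModule.OfSeq(new[] { 1, 2, 3 });

        // A C# lambda can't be passed directly: it has to be wrapped in a
        // delegate and then converted to an FSharpFunc.
        FSharpFunc<int, bool> predicate =
            FuncConvert.ToFSharpFunc(new Converter<int, bool>(x => x > threshold));

        FSharpOption<int> found = ListModule.TryFind(predicate, list);
        return OptionModule.IsSome(found);
    }

    static void Main() {
        Console.WriteLine(AnyGreaterThan(4)); // False
        Console.WriteLine(AnyGreaterThan(2)); // True
    }
}
```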

FSharpx wraps these F# functions so they can be used with System.Func and System.Action. For example:

var a = FSharpList.Create(1, 2, 3);
a.TryFind(x => x > 4) // returns FSharpOption<int>
    .Match(v => Assert.Fail("shouldn't have found value {0}", v),
           () => { /* nothing found */ });

The Unit type

Many functional languages like F# have a type called "Unit", which is just like "void" in C-like languages, except it's actually usable as a proper type.

By "usable" I mean you can actually define something like a Func<Unit> (you can't have a Func<void>, it's not even syntactically correct even though there is a type System.Void). A Func<Unit> is just like an Action, except it's obviously a Func so it can be used for example in a LINQ expression (i.e. a monad).

FSharpx includes a ToFunc() extension method on Action, Action<T>, Action<T1,T2>, etc. to respectively convert them to Func<Unit>, Func<T,Unit>, Func<T1,T2,Unit> and so on.
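Such a conversion is easy to picture. The following is only a sketch of the idea with hypothetical names (AsUnitFunc), not FSharpx's actual code; it relies on the fact that Microsoft.FSharp.Core.Unit is a reference type whose single value is represented as null.

```csharp
using System;
using Microsoft.FSharp.Core;

static class UnitAdapters {
    // Wraps an Action so it can be used wherever a Func<Unit> is expected.
    public static Func<Unit> AsUnitFunc(Action action) {
        return () => { action(); return null; }; // null is the Unit value
    }

    // Same for a one-argument Action<T>.
    public static Func<T, Unit> AsUnitFunc<T>(Action<T> action) {
        return x => { action(x); return null; };
    }

    // Runs a wrapped action and returns how many times it was invoked.
    public static int RunOnce() {
        var calls = 0;
        Func<Unit> f = AsUnitFunc(() => { calls++; });
        f();
        return calls;
    }

    static void Main() {
        Console.WriteLine(RunOnce()); // 1
    }
}
```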

You can also use it for types like FSharpOption<Unit> as I blogged about before.

Reactive Extensions also includes a Unit type.

Discriminated unions

I have blogged before about using F# discriminated unions in C# here and here, in the context of validation. They're very useful to express things like "either this or that" without having to introduce a whole class hierarchy implementing equality / hash / comparison.

Just as with other things, using them in C# is more verbose than in F#.

Let's see an example:

var a = FSharpChoice<int, string>.NewChoice1Of2(1);
if (a.IsChoice1Of2) {
    var x = ((FSharpChoice<int, string>.Choice1Of2)a).Item;
    Console.WriteLine(x + 2);
} else if (a.IsChoice2Of2) {
    var x = ((FSharpChoice<int, string>.Choice2Of2)a).Item;
    Console.WriteLine(x + ";");
}

Now that looks really ugly. And what's with the downcasting?!

FSharpx makes this more usable by implementing pattern matching (basically a visitor) so you can write instead:

var a = FSharpChoice.New1Of2<int, string>(1);
a.Match(x => Console.WriteLine(x + 2),
        x => Console.WriteLine(x + ";"));
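If you're curious how such a Match could work, here's a minimal sketch of the idea. The helper name Fold and the demo class are hypothetical (this is not FSharpx's implementation); it simply hides the type test and the downcast from the earlier example behind two functions.

```csharp
using System;
using Microsoft.FSharp.Core;

static class ChoiceMatching {
    // Calls 'first' or 'second' depending on which case the choice holds.
    public static R Fold<T1, T2, R>(
        FSharpChoice<T1, T2> choice, Func<T1, R> first, Func<T2, R> second) {
        var c1 = choice as FSharpChoice<T1, T2>.Choice1Of2;
        if (c1 != null)
            return first(c1.Item);
        return second(((FSharpChoice<T1, T2>.Choice2Of2)choice).Item);
    }

    public static string Describe(FSharpChoice<int, string> choice) {
        return Fold(choice, x => (x + 2).ToString(), s => s + ";");
    }

    static void Main() {
        Console.WriteLine(Describe(FSharpChoice<int, string>.NewChoice1Of2(1)));   // 3
        Console.WriteLine(Describe(FSharpChoice<int, string>.NewChoice2Of2("x"))); // x;
    }
}
```

The caller has to supply a function for every case, so it can't forget to handle one of them.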

FSharpx also implements LINQ operators around 2-choice and integrates with Option. Here's an example:

object a = 40;
const string b = "60";
var r = from i in FSharpOption.ParseInt(b).ToFSharpChoice("Invalid value b")
        from j in FSharpChoice.Cast<int>(a).SelectSecond(_ => "Invalid value a")
        select i + j;
r.Match(i => Assert.AreEqual(100, i),
        Assert.Fail);

Just as with tuples, discriminated unions are essentially anonymous. Tuples are the generic, anonymous product types. Discriminated unions are the generic, anonymous sum types.

Reactive Extensions uses an internal Either<TLeft, TRight> type.

Async

Once again, you're stuck with .NET 3.5, drooling over the Task Parallel Library in .NET 4.

Reactive Extensions used to include a backport of System.Threading.dll, but it was unsupported and it's no longer included in recent releases.

F# has asynchronous workflows, which are similar to, yet somewhat different from, C# 5 async/await (see the differences in this series of posts by Tomas Petricek).

FSharpx has LINQ bindings for this so you can write:

static FSharpAsync<string> Get(string u) {
    var web = new WebClient();
    return web.AsyncDownloadString(new Uri(u));
}
var qq = // qq is of type FSharpAsync<string>
    from google in Get("http://www.google.com")
    from bing in Get("http://www.bing.com")
    select google + bing;

string result = qq.Run();

Or you can run multiple requests in parallel:

var urls = FSharpList.Create(
      "http://www.google.com"
    , "http://www.bing.com"
    , "http://www.yahoo.com"
    , "http://www.microsoft.com"
    );
var result = FSharpAsync.Parallel(urls.Select(Get)).Select(s => string.Join("", s)).Run();

It may not be as powerful as F# async workflows, but it's still useful.

BigInteger

Another one for .NET 2.0 / 3.5 users. FSharp.Core includes System.Numerics.BigInteger for arbitrary-precision arithmetic. It doesn't have all of .NET 4 BigInteger's methods, but it implements the basic operations. Want to calculate 23^25 + 4? No problem:

var a = new BigInteger(23);
var b = BigInteger.Pow(a, 25);
b += new BigInteger(4);
Console.WriteLine(b);

Result: 11045767571919545466173812409689947

Lazy

The Lazy<T> type is yet another feature that .NET 4 copied from F#, or so it seems. Are you still writing singletons the "old" way? With Lazy you can just do this in .NET 3.5 (using FSharpx-added sugar):

class MySingleton {
    private MySingleton() {}

    private static readonly Lazy<MySingleton> instance = 
        FSharpLazy.Create(() => new MySingleton());

    public static MySingleton Instance {
        get { return instance.Value; }
    }
}

Although to be honest, I don't think I've ever used this.

Enumerable cache

Sometimes you have a forward-only iterator wrapped in an IEnumerable, like database results or some data parsed lazily from a web request, and you want to traverse it more than once, but you also want to keep it lazy, so ToList() doesn't cut it. With FSharp.Core you can cache it on demand using Seq.cache, named SeqModule.Cache in C# / VB.NET.
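Here's a small sketch of the effect. The iterator below is a made-up stand-in for an expensive forward-only source; it counts how many elements it has actually produced.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.FSharp.Collections;

static class SeqCacheDemo {
    static int produced;

    // Stand-in for an expensive source such as a data reader.
    static IEnumerable<int> ExpensiveSource() {
        for (var i = 0; i < 3; i++) {
            produced++;
            yield return i;
        }
    }

    // Traverses the cached sequence twice and reports how many elements
    // the underlying source had to produce.
    public static int ProducedAfterTwoPasses() {
        produced = 0;
        var cached = SeqModule.Cache(ExpensiveSource());
        foreach (var item in cached) { } // first pass pulls from the source
        foreach (var item in cached) { } // second pass is served from the cache
        return produced;
    }

    static void Main() {
        Console.WriteLine(ProducedAfterTwoPasses()); // 3
    }
}
```

Without the call to SeqModule.Cache the same two loops would run the iterator twice and produce 6 elements.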

System.Interactive also has a function like this, called MemoizeAll, although I like the F# name better, as it seems to be more an application of caching than of memoization.

Enumerable zip

Another nifty operator that is only available in .NET 4+. The one in FSharp.Core is slightly different: Enumerable.Zip includes a mapper, its signature is:

IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
    this IEnumerable<TFirst> first,
    IEnumerable<TSecond> second,
    Func<TFirst, TSecond, TResult> resultSelector)

while the one in F# (also in the static SeqModule class) zips directly to a tuple:

IEnumerable<Tuple<T1, T2>> Zip<T1, T2>(IEnumerable<T1> first, IEnumerable<T2> second)
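A quick usage sketch of the F# version, assuming a reference to FSharp.Core (the demo class name is made up):

```csharp
using System;
using System.Collections.Generic;
using Microsoft.FSharp.Collections;

static class SeqZipDemo {
    // Pairs up numbers with their names and formats each pair.
    public static List<string> Pairs() {
        var numbers = new[] { 1, 2, 3 };
        var names = new[] { "one", "two", "three" };

        // No result selector: each pair comes back as a Tuple<int, string>.
        IEnumerable<Tuple<int, string>> zipped = SeqModule.Zip(numbers, names);

        var result = new List<string>();
        foreach (var pair in zipped)
            result.Add(pair.Item1 + "=" + pair.Item2);
        return result;
    }

    static void Main() {
        foreach (var line in Pairs())
            Console.WriteLine(line); // 1=one, 2=two, 3=three (one per line)
    }
}
```

On .NET 4 you'd get the same pairs from Enumerable.Zip by passing a result selector that builds the tuple.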

Conclusion

If you're working with Visual Studio 2010, the F# runtime is a great library you can take advantage of, even in .NET 2.0 projects. And you already have it, so use it!

If you run .NET 3.5 or better, FSharpx makes it more C# friendly. It also makes it easier to interop with F# projects if you ever need it, since they use the same underlying types.

Even in .NET 4, persistent collections, discriminated unions and Option alone are easily worth the dependency.

Also worth mentioning is the F# PowerPack, a separate library implementing additional collections like HashMultiMap and LazyList, and math-specific facilities such as rational and complex numbers, matrices and vectors.

And it's all open source, Apache-licensed.

PS: did you know the VB.NET runtime has a CSV parser?

7 comments:

Alejandro Varela said...

nice work! thanks!

kurtnele said...

F# is also better for expressing math functions

Pankaj Nikam said...

Awesome read man! It rocks. Thanks for such awesome tips about using F# via C#. One more thing, I like the parser part too. Its really interesting to know such hidden gems in .NET.

Anonymous said...

Great post...

Ivan Towlson said...

Nice tips! Just one heads up -- the link to the F# wikibook seems to be broken at the moment.

Mauricio Scheffer said...

@Ivan thanks! fixed

Used PC Distributor said...

Nice blog Post !