The Roberto Selbach Chronicles

About |  Blog |  Archive

Tag: c#

Returns in Go and C#

Whenever someone posts anything related to the Go programming language on Hacker News, it never takes long before someone complains about error handling. I find it interesting because it is exactly one of things I like the most about Go.

I don’t want to specifically talk about error handling though. I want to talk about a feature that is intrinsically tied to it in Go: the ability of functions to return multiple values

For instance, in Go it is common and idiomatic to write functions like this —

func Divide(a, b float64) (float64, error) {
    if b == 0 {
        return 0.0, errors.New("divide by zero")
    }
    return a / b, nil
}

So the caller would do:

result, err := Divide(x, y)
if err != nil {
    // do error handling...
}

Some people deplore this. I absolutely love it. I find it so much clearer than, for instance, what we often have to do in C#. You see, C# didn’t have multiple returns (until very recently; see below) so you ended up with a few options.

First, you can simple throw exceptions.

public SomeObject GetObjectById(int id) {
    if (!SomeObjectRepo.Has(id))
        throw new ArgumentOutOfRangeException(nameof(id));
    // ...
}
...
try
{
    var obj = GetObjectById(1);
    // do something with obj
}
catch (ArgumentOutOfRangeException ex)
{
    //  error handling
}

I find the flow difficult to read. Particularly because variables are scoped within the try-catch so often you need to first declare something above the try and then test it after the catch.

A second option is to return null:

public SomeObject GetObjectById(int id)
{
    if (!SomeObjectRepo.Has(id))
        return null;

    // go get the object
}
...
var obj = GetObjectById(1);
if (obj == null) 
{
    // do error handling
}

This looks closer to what I like but it still has some serious downsides. You don’t get any error information. What made it fail? I don’t know. As well, this doesn’t work for non-nullable types. A method returning a, say, int cannot return null. Sure, you could return int? instead of int and then test for .HasValue but that’s cumbersome and artificial.

A third option is the use of a generic return type. Something like —

public class Result<T>
{
    public T Value {get;protected set;}
    public Exception Exception {get; protected set;}

    public bool IsError => Exception != null;

    public Result() : this(default(T)) {}
    public Result(T value)
    {
        Value = value;
    }

    public static Result<T> MakeError(Exception exception)
    {
        return new Result<T>
        {
            Value = default(T),
            Exception = exception
        };
    }
}

You could then use this to return values like —

public Result<int> Divide(int a, int b)
{
    if (b == 0)
    {
        return Result<int>.MakeError(new DivideByZeroException());
    }

    return new Result<int>(a / b);
}
...
var res = Divide(8, 4);
if (res.IsError)
{
    // do error handling, e.g.
    throw res.Exception;
}
// do something with res.Value (2)

This works, but it looks artificial. You need to create instances of Result<T> all around all the time. It is not that bad if your codebase uses this throughout and it becomes automatic for all programmers envolved. When it’s an exception to the rule, it is horrible.

A very similar solution is to return something like Tuple<T1, T2, ...>

public Tuple<int,Exception> Divide(int a, int b)
{
    if (b == 0)
        return new Tuple<int,Exception>(0, new DivideByZeroException());
    return new Tuple<int,Exception>(a/b, null);
}
...
var res = Divide(1, 2);
if (res.Item2 != null) // Item2 is the exception
{
    // do error handling
}
// do something with res.Item1

Same principle. It’s ugly and artificial, but it will come back to us.

The way the C# authors found to work around this problem is the idiomatic try-pattern, which consists in creating non-exception-throwing versions of methods. For example, if we go back to the first C# example above (GetObjectById()), we could create a second method like so —

public bool TryGetObjectById(int id, out SomeObject result) {
    try 
    {
        result = GetObjectById(id);
        return true;
    }
    catch
    {
        result = default(SomeObject);
        return false;
    }
}
...
SomeObject result;
if (!TryGetObjectById(1, out result))
{
    // do error handling
}
// do something with result

Note that ever since C# 7.0 you can declare the out variable directly inside the method call as such —

if (!TryGetObjectById(1, out var result))

Which spares you of declaring your out variables arguably at the expense of clarity.

This method is idiomatic and found everywhere in the .NET Framework. I actually like it but it still has the problem of losing important information, namely what caused the method to fail: all you get is true or false.

In C# 7.0, the language authors came up with a new solution: they added syntactic sugar to the language to make the tuple solution a bit more appealing —

public (int, Exception) Divide(int a, int b)
{
    if (b == 0)
        return (0, new DivideByZeroException());

    return (a / b, null);
}
...
var (res, err) = Divide(1, 2);
if (err != null) 
{
    // do error handling
}

Suddenly this becomes very familiar to a Go programmer. In the background, this is using a tuple. In fact, you can check that this is so by using the method above like this —

var res = Divide(1, 2);
if (res.Item2 != null)
    // do error handling
// use res.Item1

You will see that res is of type System.ValueTuple. Also, if you create a library in C# 7.0 and then try to use it with a program in older versions of C#, you will see that the exposed type of the method is a tuple. This is actually nice because it means this big language change is backwards compatible.

All that said, I haven’t seen many uses of the new tuple returns in C# code in the wild. Maybe it’s just early (C# 7.0 has been out for only a few months.) Or maybe the try-pattern is simply way too ingrained in the way of doing things in C#. It’s more idiomatic.

I sure prefer the new (Go-like) way.

Casting objects and boxing

I’m back from a trip to a customer.

How was it?

Okay. I got more snow that I expected on the way there, so the drive wasn’t much fun. Then again, a part of the trip goes through a beautiful forest that was worth everything else.

Cool!

Also, while showing the customer a new feature, the app crashed.

Typical. Blame it on Murphy!

That’s what I did at first. Then I blamed it on the developer. And then I finally went looking at the C# code to find out why it happened.

What was it?

It turned out to be a rather common but not obvious mistake. See the code below and tell me what is the value of each of doubleF, doubleI, and doubleO.

float f = 0.0;
int i = (int)f;
object o = f;

var doubleF = (double) f;
var doubleI = (double) i;
var doubleO = (double) o;

I’m sensing a catch here, but I’ll bite. They’re all cast from the same original variable f so I’m guessing they’d all end up 0.0…?

You would, wouldn’t you? But you’re wrong.

Waaat?

The final line in that code will throw an InvalidCastException at you — and crash your app if you don’t catch it, as was the case in our app.

Wait what? How? How come you can’t cast 0.0 to double?

Well, you can. For instance, this works perfectly —

var d = (double) 0.0f;

But this doesn’t —

object f = 0.0f;
var d = (double) d;

It makes no sense!

Actually it does. The problem is taking object to mean “anything.” Which incidentally it does, just not the way most people think. You see, object is a type representing Object, which is a class other types inherit from but not all. You can store anything as object because Object boxes whatever object you put in it. It stores the value internally but the compiler doesn’t know what type is stored there.

No no no! I know for a fact that you can too check what type is stored in an object

You’re right, you can. For instance —

object o = /* something */
Console.WriteLine(o.GetType());

This will print the type of whatever you put in the variable o. But this is at run time: the compiler doesn’t know.

That’s why we using casting. If we know for a fact that variable o will contain a, say, int, we can help the compiler and tell it about it with a cast. Remember, when you cast something, you are telling the compiler what type will be stored in the variable. The compiler can’t be held responsible if you lie to it.

Let’s get back at the original problem —

object o = 1.0f;
var d = (double) o;

You told the compiler that o will be a double, but it isn’t. Remember a double is a shorthand for the struct Double as float is for Single. And guess what? A Double is not a Single. When you stored a float in the variable o of type object, the float value was boxed inside an object of type Object. When you cast, the compiler has to unbox whatever was inside o and guess what, the value stored in o is of a different structure, with different methods and storage, than what you told it it was. You could convert between them, but they are not the same.

So the compiler expects an object of type Double but it has a Single and things fail miserably.

But you just said that we can convert between them! Why don’t the compiler does it?

It could. But think of how this would work out in real life. Remember the compiler doesn’t know what will be inside o so it needs to test what the value is. It would need to test if the type is a, say, string. If it is, then convert string to Double. If it isn’t then check if it is a Int32. Then a Int64. Then a DateTime. The number of possibilities is enormous and the compiler would have to generate all this code every time it needs it finds a cast. This would be a lot of code. It would be so much code in fact that you’d be mad not to put it all in separate methods. It would also be slow so the compiler won’t do this by default.

That’s why we have the Convert class, which in turn depends on types implementing the IConvertible interface. Whenever you want to convert a value of TypeA to TypeB, you can use this conversion methods. You can do —

object o = 1.0f;
var d = Convert.ToDouble(o);

The compiler authors had to make a decision: either they’d generate lots of slow code to test for the type and convert the value, or they’d leave the decision for the programmer who can call Convert.ToSomething when needed.

And they chose the former.

Exactly. I believe it was reasonable. If you know something will be of a given type at run time, you can still cast it. Otherwise, you should convert it.

New stuff coming in C# 7.0

Mads Torgersen wrote a blog post highlighting what’s new in C# 7.0:

C# 7.0 adds a number of new features and brings a focus on data consumption, code simplification and performance.

The changes all seem to be in line with the recent trends of borrowing syntax sugar from other languages to C#. Nothing wrong with that: copy what’s good and shed what’s bad.

One of the changes is related to out variables. These are the C# way to deal with not being able to return multiple values (see below for good news on that front). It’s basically the same as passing by reference in, say, C. For instance:

int myOutVar;
changeMyOutVar(out myOutVar);

You could have the value of myOutVar set inside changeMyOutVar. Simple. What is changing in C# 7.0 is that you would no longer need to predeclare myOutVar:

changeMyOutVar(out int myOutVar);

The scope of the new variable will be the enclosing block. I have to say it: I don’t like it. It feels obfuscated to me. The variable doesn’t look like it should be in that scope. Compare with this popular Go idiom:

if err := DoSomething(); err != nil {
    return err;
}

The variable err is created inside the if and its scope is there as well. I know a lot of people who hate this for the same reason I don’t like the way the new out variables are to be created in C#. I find it much more clear in Go though.

The feature I absolutely loved to read about is tuples. Error handling in .NET is often done with exceptions, which I find clunky and cumbersome. With tuples, we might be able to move to something more Go-like:

(User, bool) FindUser(string username) {
    var found = _userList.Find(u => u.Username == username);
    if (found == null)
        return null, false;
    return found, true;
}

So we could do something like:

var (user, ok) = FindUser("someusername");
if (!ok) {
    // user not found, deal with this
}

Check his post for more features.

Clashing method names in Go interfaces

I wrote about how the Go and C# compilers implement interfaces and mentioned how C# deals with clashing method names but I didn’t talk about how Go does it, so here it is.

Two interfaces with the same method name that need to behave differently is very likely a sign of bad API design and we should fix it instead. But sometimes we can’t help it (e.g. the interfaces are part of a third-party package). How do we deal with it then? Let’s see.

Given two interfaces

type Firster interface {
    DoSomething()
}

type Seconder interface {
    DoSomething()
}

We implement them like this

type MyStruct struct{}

func (ms MyStruct) DoSomething() {
    log.Println("Doing something")
}

We can run this little test here to verify that MyStruct implements both interfaces.

But what if we need DoSomething() to do something different depending on whether MyStruct is being cast as Firster or Seconder?

Go doesn’t support any kind of explicit declaration of interfaces when implementing methods. We can’t do something like, say

type MyStruct struct{}

func (ms MyStruct) Firster.DoSomething() {
    log.Println("Doing something")
}

func (ms MyStruct) Seconder.DoSomething() {
    log.Println("Doing something")
}

That won’t work. The solution is to wrap MyStruct with a new type that reimplements DoSomething(). Like this

type SeconderWrapper struct {
    MyStruct
}

func (sw SeconderWrapper) DoSomething() {
    log.Println("Doing something different")
}

Now when we need to pass it to a function expecting Seconder, we can wrap MyStruct

ms := MyStruct{}
useSeconder(SeconderWrapper{ms})

That will run the DoSomething() from SeconderWrapper. When passing it as Firster, the original DoSomething() will be called instead. You can see this in action here.

Interfaces in Go and C#

I make no secret of the fact that I love Go. I think it’s a wonderfully designed language and it gave me nothing but pleasure in the years I’ve been working with it full time.

Now however I am working on a project that requires the use of C#, which prompted me to realize something.

When people ask about Go, most people talk about channels and concurrency but I think one of the most beautiful aspects of Go is its implementation of interfaces.

To see what I mean, let’s define a simple interface in Go.

type Greeter interface {
    Hello() string
}

Now say we have a type that implements this interface

type MyGreeter struct {}

func (mg MyGreeter) Hello() string {
    return "Hello World"
}

This is it. myGreeter implicitly implements the Greeter interface and we can pass it along to any function that accepts a Greeter.

func doSomethingWithGreeter(g Greeter) {
    // do something
}

func main() {
    mg := myGreeter{}
    doSomethingWithGreeter(mg)
}

The fact that the implementation is actually important, but we’ll get to that. Let’s first do the same thing in C#.

interface Greeter
{
    string Hello();
}

And then we create a class that implements the interface.

class MyGreeter
{
    public string Hello()
    {
        return "Hello world";
    }
}

We quickly find out that the compiler doesn’t like this.

public static string doSomething(Greeter g)
{
    return g.Hello ();
}

public static void Main (string[] args)
{
    MyGreeter mg = new MyGreeter ();
    doSomething (mg);
}

The compiler complains that it cannot convert MyGreeter to type Greeter. That’s because the C# compiler requires classes to explicitly declare the interfaces they implement. Changing MyGreeter as below solves the problem.

class MyGreeter : Greeter
{
    public string Hello()
    {
        return "Hello world";
    }
}

And voilà, everything works.

Now, we might argue that it is not much different. All you have to do is to declare the interface implemented by the class, right? Except it does make a difference.

Imagine that you cannot change MyGreeter, be it because it’s from a third-party library or it was done by another team.

In Go, you could declare the Greeter interface in your own code and the MyGreeter that is part of somebody else’s package would “magically” implement it. It is great for mocking tests, for example.

Implicit interfaces is an underrated feature of Go.

Update: someone on Twitter pointed me to the fact that C# not only also has implicit interfaces but that this is the default state of things. That is true in a literal sense, but it’s not the same thing.

Imagine in our C# above, we have a second interface.

interface IAgreeable
{
    string Hello();
    string Bye();
}

(Yes, I was also told that in C# we should always name interfaces like ISomethingable. I disagree but there it is.)

We then implement our class thusly

class MyClass : Greeter, IAgreeable
{
    public string Hello() {
        return "Hello world";
    }
    public string Bye() {
        return "Bye world";
    }
}

It now correctly implements both interfaces and all is well with the world. Until, that is, the Hello() method needs to be different for each interface in which case you will need to do an explicit implementation.

class MyClass : Greeter, IAgreeable
{
    string Greeter.Hello() {
        return "Hello world from Greeter";
    }
    public string Hello() {
        return "Hello world from !Greeter";
    }
    public string Bye() {
        return "Bye world";
    }
 }

And then the compiler will call the appropriate Hello() depending on the cast.

public static string doSomething(Greeter g)
{
     return g.Hello ();
}
public static string doSomethingElse(IAgreeable g)
{
    return g.Hello ();
}
public static void Main (string[] args)
{
    MyClass mg = new MyClass ();
    Console.Out.WriteLine(doSomething (mg));
    Console.Out.WriteLine(doSomethingElse (mg));
}

This will print

Hello world from Greeter
Hello world from !Greeter

Personally I consider two interfaces with clashing method names that need to behave differently a design flaw in the API, but reality being what it is, sometimes we need to deal with this.

Container changes in C++11

The recently approved C++11 standard brings a lot of welcome changes to C++ that modernize the language a little bit. Among the many changes, we find that containers have received some special love.

Initialization

C++ was long behind modern languages when it came to initializing containers. While you could do

[cpp]int a[] = {1, 2, 3};[/cpp]

for simple arrays, things tended to get more verbose for more complex containers:

[cpp]
vector v;
v.push_back(“One”);
v.push_back(“Two”);
v.push_back(“Three”);
[/cpp]

C++11 has introduced an easier, simpler way to initialize this:

[cpp]
vector v = {“One”, “Two”, “Three”};
[/cpp]

The effects of the changes are even better for things like maps, which could get cumbersome quickly:

[cpp]
map<string, vector > m;
vector v1;
v1.push_back(“A”);
v1.push_back(“B”);
v1.push_back(“C”);

vector v2;
v2.push_back(“A”);
v2.push_back(“B”);
v2.push_back(“C”);

m[“One”] = v1;
m[“Two”] = v2;
[/cpp]

This can now be expressed as:

[cpp]
map<string, vector> m = One,

                             {"Two", {"Z", "Y", "X"}}};

[/cpp]

Much simpler and in line with most modern languages. As an aside, there’s another change in C++11 that would be easy to miss in the code above. The declaration

[cpp]map<string, vector> m;[/cpp]

was illegal until now due to >> always being evaluated to the right-shift operator; a space would always be required, like

[cpp]map<string, vector > m[/cpp]

No longer the case.

Iterating

Iterating through containers was also inconvenient. Iterating the simple vector v above:

[cpp]
for (vector::iterator i = v.begin();

 i != v.end(); i++)
cout << i << endl;[/cpp]

Modern languages have long had some foreach equivalent that allowed us easier ways to iterate through these structures without having to explicitly worry about iterators types. C++11 is finally catching up:

[cpp]
for (string s : v)

cout << s << endl;

[/cpp]

As well, C++11 brings in a new keyword, auto, that will evaluate to a type in compile-type. So instead of

[cpp]
for (map<string, vector >::iterator i = m.begin();

 i != m.end(); i++) {

[/cpp]

we can now write

[cpp]
for (auto i = m.begin(); i != m.end(); i++) {
[/cpp]

and auto will evaluate to map<string, vector>::iterator.

Combining these changes, we move from the horrendous

[cpp]
for (map<string, vector >::iterator i = m.begin();

 i != m.end(); i++)
for (vector<string>::iterator j = i->second.begin();
     j != i->second.end(); j++)
    cout << i->first << ': ' << *j << endl;

[/cpp]

to the much simpler

[cpp]
for (auto i : m)

for (auto j : i.second)
    cout << i.first << ': ' << j << endl;

[/cpp]

Not bad.

C++11 support varies a lot from compiler to compiler, but all of the changes above are already supported in the latest versions of GCC, LLVM, and MSVC compilers.

Euler 9 in C

So I recently wrote an ugly solution to Project Euler’s problem #9. It was written in Python and even though every other Project Euler solution I wrote ran in less than one second, this one took over 18 seconds to get to the answer. Obviously it’s my fault for using a purely brute force algorithm.

Anyway, boredom is a great motivator and I just rewrote the exact same algorithm in C, just to see how much faster it would be.

#include <stdio.h>

int is_triplet(int a, int b, int c)
{
        return (((a < b) && (b < c)) && ((a * a + b * b) == (c * c)));
}

int main(void)
{
        for (int a = 0; a < 1000; a++)
                for (int b = a + 1; b < 1000; b++)
                        for (int c = b + 1; c < 1000; c++)
                                if ((a + b + c == 1000) && is_triplet(a, b, c)) {
                                        printf("%d %d %d = %dn", a, b, c,
                                               a * b * c);
                                        return 0;
                                }
}

It’s the exact same algorithm, but instead of 18 seconds, it ran in 0.072s.

Update: And here’s the version suggested by Gustavo Niemeyer:

#include <stdio.h>

int is_triplet(int a, int b, int c)
{
        return ((a * a + b * b) == (c * c));
}

int main(void)
{
        for (int a = 1; a < 1000; a++)
                for (int b = a + 1; b < (1000 - a) / 2; b++) {
                        int c = 1000 - a - b;
                        if (is_triplet(a, b, c)) {
                                printf("%d %d %d = %dn", a, b, c,
                                       a * b * c);
                                return 0;
                        }
                }
}

It now runs in 0.002s :–)