Sunday 4 February 2018

Structs in C#

I've never been particularly interested in using structs rather than classes in C#. I guess it comes from the reasoning "JavaScript, Python, Groovy... do not have them, so why should I care?". They exist mainly (or only) for performance reasons that in most cases probably won't affect us, so why should we care too much?

The fact that the new tuples introduced in C# 7 (ValueTuple), which make it so nice to work with multiple return values, are structs rather than classes has slightly woken up my interest in them. I remember that one important performance feature is that structs lack the object header that all instances of reference types (instances of a class) have. This header is 8 bytes or 16 bytes depending on whether you are on a 32-bit or a 64-bit system. If you are working with many instances of that class or struct, this can make a huge difference. Another thing that I had in mind is that assignments and modifications can be a bit tricky. I've been reading about it and doing some minor tests, so I'll write a short summary. This entry in MSDN already gives us most of what we need:

A struct is a value type. When a struct is created, the variable to which the struct is assigned holds the struct's actual data. When the struct is assigned to a new variable, it is copied. The new variable and the original variable therefore contain two separate copies of the same data. Changes made to one copy do not affect the other copy.

Value type variables directly contain their values, which means that the memory is allocated inline in whatever context the variable is declared. There is no separate heap allocation or garbage collection overhead for value-type variables.

Heap vs Stack. When you declare a local variable of struct type, the struct is normally created directly on the stack: var myStruct = new MyStruct();
However, when a class has a struct field or property, the struct is stored on the heap, as part of the object that contains it. The difference here with normal classes is that the field gets the struct inlined right there, rather than holding a reference to another memory location.
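
A simple way to see this inlining is to look at what a freshly created object contains. This is just a minimal sketch, and the ValueBox, RefBox and Holder types are hypothetical, made up for the illustration: the struct field is usable right away because its storage is part of the Holder object itself, while the class field is only a null reference until we allocate an object for it.

```csharp
using System;

struct ValueBox
{
    public int Value;
}

class RefBox
{
    public int Value;
}

class Holder
{
    // stored inline, inside the Holder object on the heap
    public ValueBox Inlined;
    // only a reference is stored here; the RefBox object lives elsewhere
    public RefBox Referenced;
}

static class InlineDemo
{
    public static void Main()
    {
        var holder = new Holder();

        // the struct is already there, zero-initialized
        Console.WriteLine(holder.Inlined.Value); //0
        holder.Inlined.Value = 5;
        Console.WriteLine(holder.Inlined.Value); //5

        // the reference field points nowhere until we allocate an object
        Console.WriteLine(holder.Referenced == null); //True
        holder.Referenced = new RefBox();
        holder.Referenced.Value = 7;
        Console.WriteLine(holder.Referenced.Value); //7
    }
}
```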

Given that structs are either directly located on the stack or inlined, assignments are based on copying. We make a copy of the struct when we assign a local struct variable to another one, when we assign it to a struct field or property, or when we pass it as a parameter. Let's see some examples:

    struct Address
    {
        public Address(string city, string street)
        {
            this.City = city;
            this.Street = street;
        }

        public string City;
        public string Street;

        public override string ToString()
        {
            return this.City + ", " + this.Street;
        }
    }

    class Person
    {
        public string Name {get;set;}
        public Address Location {get;set;}

        public Person(string name, Address location)
        {
            this.Name = name;
            this.Location = location;
        }

        public override string ToString()
        {
            return this.Name + " - " + this.Location.ToString();
        }
    }

 
    //main
            var address1 = new Address("Marseille", "Port Vieux");
            //a copy of address1 is done and assigned to address2
            var address2 = address1;

            address1.Street = "Rue de la Republique";

            Console.WriteLine(address1.ToString()); //Rue de la Republique
            Console.WriteLine(address2.ToString()); //Port Vieux

            Console.WriteLine("----------------");
            
            //when we pass the struct as a parameter, it's also a copy that gets passed
            var p1 = new Person("Francois", address1);

            Console.WriteLine(p1.ToString());
            //Republique

            address1.Street = "Rue du Temps";

            Console.WriteLine(p1.ToString());
            //Republique

            //with this assignment we are doing another copy
            p1.Location = address1;
            Console.WriteLine(p1.ToString());
            //Temps

            address1.Street = "Rue de la Liberte";
            Console.WriteLine(p1.ToString());
            //Temps
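
The same copy happens with any method call, not only with constructors. Here is a minimal sketch, assuming a cut-down Address with just the Street field and two hypothetical helper methods (Rename and RenameInPlace are not part of the code above). The second one uses the ref keyword to ask explicitly for the caller's variable rather than a copy:

```csharp
using System;

struct Address
{
    public string Street;
}

static class ParameterDemo
{
    // receives a copy: the caller's struct is not touched
    public static void Rename(Address address, string street)
    {
        address.Street = street;
    }

    // receives the caller's variable itself, thanks to ref
    public static void RenameInPlace(ref Address address, string street)
    {
        address.Street = street;
    }

    public static void Main()
    {
        var address = new Address { Street = "Port Vieux" };

        Rename(address, "Rue du Temps");
        Console.WriteLine(address.Street); //Port Vieux

        RenameInPlace(ref address, "Rue du Temps");
        Console.WriteLine(address.Street); //Rue du Temps
    }
}
```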

There's an important difference between modifying a struct through a field and through a property. If my class has a struct field and I modify one of that struct's fields, the struct itself is modified inline. I mean, given:

    struct Address
    {
        public Address(string city, string street)
        {
            this.City = city;
            this.Street = street;
        }

        public string City;
        public string Street;

        public override string ToString()
        {
            return this.City + ", " + this.Street;
        }
    }

    class Container
    {
        public Address AddressProp {get;set;}
        public Address AddressField;
        
        public override string ToString()
        {
            return "Prop: " + this.AddressProp.ToString() + " - Field: " + this.AddressField.ToString();
        }
    }

We can do this: instance.structField.field = value;

            var container = new Container();
            container.AddressField = new Address("Lyon", "Rue Victor Hugo");

            Console.WriteLine(container.ToString());
            //here I'm modifying the struct contents, the inlined values
            container.AddressField.Street = "Rue du Rhone";
            Console.WriteLine(container.ToString()); //Rhone

However, if it is a property, we'll get a compilation error when trying this: instance.structProperty.field = value;

            container.AddressProp = new Address("Paris", "Boulevard Voltaire");
            //this line does not compile!!!
            container.AddressProp.Street = "Rue de Belleville";
            //Error CS1612: Cannot modify the return value of 'Container.AddressProp' because it is not a variable

Getting such a compilation error makes sense. Accessing the property returns a copy of the struct that is not stored in any variable, so the ensuing assignment would only change that temporary copy and then be lost. The compiler just prevents it.

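On the other hand, both for properties and fields, if I assign them to a variable I'll get a copy of the original struct that I can modify with no issue. The struct held by the container is not affected by those changes unless I assign the modified copy back to it.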
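
So, to change one field of a struct exposed through a property, the usual way is to take the copy, modify it and assign it back. A minimal sketch, using cut-down versions of the Address and Container types above (only the members needed here; PropertyDemo is just a name for the example):

```csharp
using System;

struct Address
{
    public string City;
    public string Street;
}

class Container
{
    public Address AddressProp { get; set; }
}

static class PropertyDemo
{
    public static void Main()
    {
        var container = new Container();
        container.AddressProp = new Address { City = "Paris", Street = "Boulevard Voltaire" };

        // the getter hands us a copy, stored in a local variable
        var copy = container.AddressProp;
        copy.Street = "Rue de Belleville";
        Console.WriteLine(container.AddressProp.Street); //Boulevard Voltaire

        // assigning the modified copy back goes through the setter
        container.AddressProp = copy;
        Console.WriteLine(container.AddressProp.Street); //Rue de Belleville
    }
}
```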
