.NET Memory Management 101

I watched an excellent video by Maarten Balliauw recently about dotMemory and the ClrMd framework and how .NET manages its memory.

I put up my crude notes in my previous post.

Here I thought I’d have a play and see what I can replicate myself. The source code is available in my Git repo.

Running it will give you four options for tests which we can use to see how objects and values are assigned in .NET.

Console menu

When you run a test there’ll be notifications of when to take a snapshot. Once you have, press Enter for the test to either continue or tell you it’s finished.

Reference Types & Value Types

The .NET framework has two main kinds of type, which it handles differently when it comes to memory management. These are:

  • Value type
  • Reference type

The key differences between the two from a coding perspective are:

                                           Value Type   Reference Type
Can be null?                               No           Yes
Create new instance on every method call?  Yes          No

Consider the following:

using System;

// One reference type and one value type to compare.
object referenceType = new object();
DateTime valueType = new DateTime(1, 2, 3);

// o receives a copy of the reference; d receives a copy of the value.
void AreTheyEqual(object o, DateTime d)
{
    // == compares the two arguments with the originals.
    Console.WriteLine("Reference types equal? {0}", referenceType == o);
    Console.WriteLine("Value types equal?     {0}", valueType == d);
    // ReferenceEquals asks whether both sides are the very same object.
    Console.WriteLine("Reference types same?  {0}", Object.ReferenceEquals(o, referenceType));
    Console.WriteLine("Value types same?      {0}", Object.ReferenceEquals(d, valueType));
}

AreTheyEqual(referenceType, valueType);

Try it on dotFiddle.

We create one value type and one reference type. We pass these into the method which then tests if they are equal using == and whether they’re actually the same thing with Object.ReferenceEquals().

The output is:

Reference types equal? True
Value types equal?     True
Reference types same?  True
Value types same?      False

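You can see that, whilst both the reference type and the value type equate as expected, the value type inside the method is not the same entity as the one passed into it. This is because it has been copied, whereas with the reference type we just passed a reference (pointer) to the existing object. (Strictly, Object.ReferenceEquals() boxes each value-type argument into a fresh object, so it returns False for any two value types; the point about copying holds regardless.)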
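
Because Object.ReferenceEquals() isn't the clearest way to see the copy, here is a minimal sketch that shows it through mutation instead. The PointValue and PointReference types and the CopySemanticsDemo class are made up for illustration; they are not part of the repo.

```csharp
using System;

public struct PointValue { public int X; }      // hypothetical value type
public class PointReference { public int X; }   // hypothetical reference type

public static class CopySemanticsDemo
{
    // Receives a copy of the struct, so the caller's value is untouched.
    public static void Mutate(PointValue p) { p.X = 99; }

    // Receives a copy of the reference, so both point at the same object.
    public static void Mutate(PointReference p) { p.X = 99; }

    public static void Main()
    {
        var value = new PointValue { X = 1 };
        var reference = new PointReference { X = 1 };

        Mutate(value);
        Mutate(reference);

        Console.WriteLine("Value type X:     {0}", value.X);     // 1
        Console.WriteLine("Reference type X: {0}", reference.X); // 99
    }
}
```

The change made inside the method is lost for the struct and kept for the class, which is the copying behaviour described above.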

Stack and Heap

You will hear talk of the stack and the heap quite a bit. In fact, they’re technically just implementation details that could change; only the functional behaviour of the types is guaranteed.

“I find this characterization of a value type based on its implementation details rather than its observable characteristics to be both confusing and unfortunate. Surely the most relevant fact about value types is not the implementation detail of how they are allocated, but rather the by-design semantic meaning of “value type”, namely that they are always copied “by value”. If the relevant thing was their allocation details then we’d have called them “heap types” and “stack types”. But that’s not relevant most of the time. Most of the time the relevant thing is their copying and identity semantics.” - Eric Lippert

Anyhow, in this case we are looking under the hood of the memory management system (which is an implementation detail as well), so it makes sense for us to talk about stacks and heaps.

The Stack

The stack is where value types go!

Well, no, not exactly. It’s a good start though. The .NET primitives are all value types and, when declared as local variables, will sit on the stack along with references to objects in the heap.

Let’s have a look at the local primitives test. This simply creates a bunch of local integers (i0 to i59 in the listing below), each assigned a random value. An integer is a value type and, as I just said, a local value type will be pushed onto the stack.

The code is very simple:

public void Start()
{
    _presenter.PromptForSnapshot();

    var i0 = RANDOM.Next();
    var i1 = RANDOM.Next();
    var i2 = RANDOM.Next();
    var i3 = RANDOM.Next();
    var i4 = RANDOM.Next();
    var i5 = RANDOM.Next();
    var i6 = RANDOM.Next();

    // Removed for space

    var i59 = RANDOM.Next();

    _presenter.PromptForSnapshot();
}

If we run the test and connect a memory profiler like dotMemory then we can look at the heap before and after the variables are created. If you’re using dotMemory then you should have two snapshots and see something similar to this:

Primitives Memory Usage

You can see that there’s no interesting activity recorded in the heap at all. All we can see is basic operating traffic by the application itself. There were 382 objects before we created the integers and 381 after. This is, of course, as expected. All the integers were added to the stack so don’t show in the heap at all.
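
If you don't have a profiler to hand, a rough cross-check is to ask the runtime how many bytes the current thread has allocated on the managed heap. The sketch below assumes your runtime provides GC.GetAllocatedBytesForCurrentThread(); the AllocationDemo class and its methods are hypothetical and only loosely mirror the tests in the repo.

```csharp
using System;

public static class AllocationDemo
{
    private static readonly Random RANDOM = new Random();
    private static int _sink; // stops the compiler discarding the locals

    // Bytes allocated on the managed heap by this thread while 'action' runs.
    public static long Measure(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // Local value types only: nothing here should need the heap.
    public static void LocalInts()
    {
        int i0 = RANDOM.Next();
        int i1 = RANDOM.Next();
        int i2 = RANDOM.Next();
        _sink = i0 ^ i1 ^ i2;
    }

    // 100 strings plus the array holding them: all heap allocations.
    public static void Strings()
    {
        var strings = new string[100];
        for (int i = 0; i < strings.Length; i++)
        {
            strings[i] = Guid.NewGuid().ToString("D");
        }
        GC.KeepAlive(strings);
    }

    public static void Main()
    {
        // Warm up once so one-off start-up costs don't muddy the numbers.
        LocalInts();
        Strings();

        Console.WriteLine("Local ints: {0} bytes", Measure(LocalInts));
        Console.WriteLine("Strings:    {0} bytes", Measure(Strings));
    }
}
```

I'd expect the first figure to be zero, or very close to it, and the second to run to several kilobytes, but the exact numbers depend on the runtime and platform.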

Why not put values on the heap?

Well, the heap is quite slow by comparison. When we add something to the stack we just write it in the next slot; then, when we are done, we just move the pointer that marks where the framework will read from back down. There are no clean-up costs at all: the memory is just overwritten rather than reclaimed.
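
To make the "just move the pointer" idea concrete, here is a toy model of a stack. It is only an illustration of the principle, not how the CLR actually lays out its stack, and the ToyStack class is invented for this example.

```csharp
using System;

// Toy model: a fixed block of slots plus a pointer to the next free slot.
public class ToyStack
{
    private readonly int[] _slots = new int[8];
    private int _top; // index of the next free slot

    // Pushing writes into the next slot and bumps the pointer.
    public void Push(int value) { _slots[_top++] = value; }

    // Popping only moves the pointer back. The old value is left in place
    // and is simply overwritten by the next push.
    public int Pop() { return _slots[--_top]; }

    // Exposed only so the demo can look at a slot directly.
    public int PeekRaw(int index) { return _slots[index]; }
}

public static class ToyStackDemo
{
    public static void Main()
    {
        var stack = new ToyStack();
        stack.Push(10);
        stack.Push(20);
        stack.Pop();                          // "frees" the slot holding 20
        Console.WriteLine(stack.PeekRaw(1));  // 20: nothing was cleaned up
        stack.Push(30);                       // reuses the same slot
        Console.WriteLine(stack.PeekRaw(1));  // 30: the stale value is overwritten
    }
}
```

Freeing a slot costs one decrement and nothing else, which is why there is no clean-up work to do.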

The Heap

All reference types end up in the heap, with one or more references to them on the stack. As I’m sure you’re well aware, strings are reference types despite the fact that, when coding, we don’t need to use the new keyword.
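
A quick way to convince yourself that a string really is a reference type is to ask the type system. This is a small sketch using Type.IsValueType; the StringIsReferenceTypeDemo class and IsReferenceType helper are made up for the example.

```csharp
using System;

public static class StringIsReferenceTypeDemo
{
    // Hypothetical helper: anything that is not a value type is a reference type.
    public static bool IsReferenceType(Type t) { return !t.IsValueType; }

    public static void Main()
    {
        string first = "created without the new keyword";
        string second = first; // copies the reference, not the characters

        Console.WriteLine(IsReferenceType(typeof(string)));        // True
        Console.WriteLine(IsReferenceType(typeof(int)));           // False
        Console.WriteLine(Object.ReferenceEquals(first, second));  // True
    }
}
```

Both variables point at the same string object on the heap, which is why the last line prints True.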

If you run the program again and this time select the Strings option then we’ll see how these get stored.

The console will output:

Creating strings
================

----------------------------------------------------------------
Take a snapshot of the memory and press 'Enter' when you're done
----------------------------------------------------------------
Now we are going to create a set of 100 unique strings.

----------------------------------------------------------------
Take a snapshot of the memory and press 'Enter' when you're done
----------------------------------------------------------------
Now, we'll call the garbage collector

----------------------------------------------------------------
Take a snapshot of the memory and press 'Enter' when you're done
----------------------------------------------------------------
All done, have a look at the snapshots.

Done...

If you’ve created the snapshots in dotMemory you’ll see:
strings dotMemory

I’ve named the snapshots, yours might just be called Snapshot #1, Snapshot #2 etc.

You can see straight away from the snapshot boxes that 104 objects were created between the first snapshot and the second. We expect that 100 of these will be the 100 strings we created, 1 will be the array holding them and the rest will be ‘something else’. If we click on the compare we can see the details.

Compare before strings created to afterwards

If we filter on String then we can see that 102 new strings were created and 1 new string array. So far, so good. We can also see what the strings were.

String instances

There are a lot of 90 byte strings there; they’ll be the ones we created for this test. The other two are:

  • 124 bytes: "Now we are going to create a set of 100 unique strings."
  • 16 bytes: "D"

These are, respectively, the text I write to the console and the format specifier used to turn each GUID into a string.

If we compare the second and third snapshots now, after GC.Collect() was called, we can see that the objects have all been destroyed:

Strings destroyed
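
The same effect can be observed without a profiler by holding the array through a WeakReference, which doesn't keep its target alive. This is a hedged sketch rather than the code from the repo: the CollectDemo class is hypothetical, and whether the array is reported as collected can depend on the runtime and build configuration.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class CollectDemo
{
    // Created in a separate, non-inlined method so that no local variable
    // in the caller keeps the array alive after this returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateStrings()
    {
        var strings = new string[100];
        for (int i = 0; i < strings.Length; i++)
        {
            strings[i] = Guid.NewGuid().ToString("D");
        }
        return new WeakReference(strings);
    }

    // Returns true if the array is gone once the collector has run.
    public static bool IsCollectedAfterGc()
    {
        WeakReference weak = CreateStrings();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return !weak.IsAlive;
    }

    public static void Main()
    {
        Console.WriteLine("Array collected? {0}", IsCollectedAfterGc());
    }
}
```

Once nothing holds a strong reference to the array, the strings inside it are unreachable too, so the collector is free to reclaim the lot.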

Conclusion

Reference types always go on the heap; local value types go on the stack. There is more to it than this, for example:

  • What happens to value types in reference types?
  • What happens to reference types in value types?

Also, we’ve only shown that garbage collection cleans up objects that are no longer referenced. The garbage collector does more than this too, but we’ll look at that in a future post.

Hope you enjoyed this.

Thanks for reading.

Feel free to contact me @BanksySan!
