dotNoted

Observations of .Net development in the wild

.Net memory copy performance

I couldn’t find any articles on the performance of .Net memory copy techniques, so I crafted a very rough test to see for myself. Note: this is a very rough test, in case I hadn’t mentioned it. As such, there are lots of problems with it. I don’t really care, other than to acknowledge that dealing with them is the hallmark of a well-done test. This isn’t a well-done test. It is a hack to convince myself that Array.Copy isn’t an evil performance hog. It’s what I have to work with, and it’s an improvement over having no numbers at all.

 

I thought of four different ways to copy memory that don’t use the Marshal class and wrote a quick routine to test the speed of each one. You’ll find the code below. Here are the results from three runs of a typical test:

 

Array.Copy took: 0.0200288 seconds

ArrayList took: 3.6051840 seconds

MemoryStream took: 0.6709648 seconds

P/Invoke copy took: 0.0600864 seconds

Array.Copy took: 0.0300432 seconds

ArrayList took: 4.3462496 seconds

MemoryStream took: 0.0300432 seconds

P/Invoke copy took: 0.0300432 seconds

Array.Copy took: 0.0400576 seconds

ArrayList took: 4.9971856 seconds

MemoryStream took: 0.0100144 seconds

P/Invoke copy took: 0.0400576 seconds

Each test is run ten times on a new set of data. Oh, yeah, the data is a byte array of 10,000,000 randomly chosen values. It gets recreated on each run so that caching has less of an effect.

So what are we seeing here? Well, Array.Copy is about as fast as both MemoryStream (which was surprising: I thought wrapping a stream around a data buffer and copying in the data would take a while) and P/Invoke. All of them are pretty quick for normal application use. Copying the data into an ArrayList took anywhere from 100 to 1000 times longer than any of the other methods. This is most likely due to boxing: ArrayList stores object references, so each of the 10,000,000 bytes gets wrapped in its own heap object, which is a big hit. If you let the test run a couple of times, you may also see some big (~2x) variations in the ArrayList time, possibly from GC activity under memory pressure. It depends on your system, however (which is one of the problems with this test: I ran it on my regular workstation).

Another interesting thing is that the MemoryStream method took an order of magnitude more time on the first run. I’d chalk this up to the great instruction cache on the Intel chip I’m running. I don’t really know, I’m speculating, but then I didn’t really set out to answer this question, so I’m good with it for now. If you have an answer, post a comment and I’ll concede ignorance if your explanation is convincing. One caveat on the MemoryStream figures: they were measured with a Read call on a newly created, empty stream, and Read on an empty stream transfers no bytes. Those numbers therefore likely reflect allocating and resizing the stream’s buffer rather than copying 10,000,000 bytes, so treat them with suspicion.

Finally, the times that MemoryStream and P/Invoke took are surprisingly (to me) short. I already mentioned MemoryStream, but I also thought that the extra baggage of a switch into unmanaged code for P/Invoke would be more significant. As it is, there doesn’t seem to be any significant difference between these methods and Array.Copy (which is another problem: there is no measurement of variance or significance levels).
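One thing that stands out in the numbers above is that every figure is a whole multiple of 0.0100144 seconds, which suggests the processor-time counter only ticks about every 10 ms, so the fast methods are being measured in one to six ticks. A rough sketch of how the variance gap could be closed follows, assuming .NET 2.0 or later for System.Diagnostics.Stopwatch. The names CopyTiming, TimeArrayCopyMs, Mean and StdDev are mine, invented for illustration, and unlike the original test this version times only the copy, not the allocation of the target array.

```csharp
using System;
using System.Diagnostics;

static class CopyTiming
{
    // Times a single Array.Copy of `source` into a fresh buffer, in milliseconds.
    // The target array is allocated before the stopwatch starts.
    public static double TimeArrayCopyMs(byte[] source)
    {
        byte[] target = new byte[source.Length + 1];
        Stopwatch watch = Stopwatch.StartNew();
        Array.Copy(source, 0, target, 0, source.Length);
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds;
    }

    public static double Mean(double[] samples)
    {
        double sum = 0;
        foreach (double s in samples) sum += s;
        return sum / samples.Length;
    }

    // Sample standard deviation (divides by n - 1), so it needs at least two samples.
    public static double StdDev(double[] samples)
    {
        double mean = Mean(samples);
        double sumSquares = 0;
        foreach (double s in samples) sumSquares += (s - mean) * (s - mean);
        return Math.Sqrt(sumSquares / (samples.Length - 1));
    }

    static void Main()
    {
        const int runs = 10;
        double[] samples = new double[runs];
        Random random = new Random();
        for (int i = 0; i < runs; i++)
        {
            // Fresh random data per run, as in the original test.
            byte[] data = new byte[10000000];
            random.NextBytes(data);
            samples[i] = TimeArrayCopyMs(data);
        }
        Console.WriteLine("Array.Copy: mean {0:N3} ms, std dev {1:N3} ms",
            Mean(samples), StdDev(samples));
    }
}
```

With a mean and a standard deviation per method, it becomes possible to say whether two methods really differ or whether the gap is just noise. The same wrapper could be pointed at the other three techniques.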

I’m now rather convinced that using any of these methods (ArrayList excepted) to push bytes around in memory is OK, and I’ll use whichever one lends itself to the problem at hand. In the overwhelming majority of cases, that will most likely be Array.Copy.
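For the record, the operation being timed in every case is "copy the array and tack one byte on the end". A minimal sketch of that as a reusable helper built on Array.Copy might look like this. ByteArrays and Append are hypothetical names of my own, not part of the framework.

```csharp
using System;

static class ByteArrays
{
    // Hypothetical helper: returns a new array holding `source` followed by `value`.
    public static byte[] Append(byte[] source, byte value)
    {
        byte[] result = new byte[source.Length + 1];
        Array.Copy(source, 0, result, 0, source.Length);
        result[result.Length - 1] = value;
        return result;
    }

    static void Main()
    {
        byte[] extended = Append(new byte[] { 10, 20, 30 }, 1);
        Console.WriteLine(BitConverter.ToString(extended)); // prints 0A-14-1E-01
    }
}
```

The source array is left untouched, so callers holding a reference to the original never see it change.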

As mentioned, here’s the whole test:

using System;

namespace MemcopyTests
{
    /// <summary>
    /// Rough timing comparison of four ways to copy a byte array.
    /// Compile with /unsafe (the P/Invoke test pins arrays with fixed).
    /// </summary>
    class Class1
    {
        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static void Main(string[] args)
        {
            TimeSpan startTime, endTime;

            for(int i=0; i<10; i++)
            {
                // Fresh random data on every run so caching has less of an effect.
                byte[] data = createTestData();

                // 1. Array.Copy into a new array that is one byte longer.
                startTime = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
                byte[] newData = new byte[data.Length+1];
                Array.Copy(data, 0, newData, 0, data.Length);
                newData[newData.Length-1] = (byte)1;
                endTime = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
                System.Diagnostics.Debug.WriteLine(String.Format(
                    "Array.Copy took: {0:N7} seconds", endTime.TotalSeconds - startTime.TotalSeconds));

                // 2. ArrayList: every byte is boxed into its own object.
                startTime = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
                System.Collections.ArrayList dataList =
                    new System.Collections.ArrayList(data);
                dataList.Add((byte)1);
                endTime = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
                System.Diagnostics.Debug.WriteLine(String.Format(
                    "ArrayList took: {0:N7} seconds", endTime.TotalSeconds - startTime.TotalSeconds));

                // 3. MemoryStream: write the data into the stream, then append one byte.
                startTime = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
                System.IO.MemoryStream dataStream =
                    new System.IO.MemoryStream(data.Length);
                // Write (not Read) is what copies data into the stream. The timings
                // published above were taken with Read, which copies nothing from
                // an empty stream.
                dataStream.Write(data, 0, data.Length);
                dataStream.Capacity += 1;
                dataStream.Position = dataStream.Capacity-1;
                dataStream.WriteByte((byte)1);
                endTime = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
                System.Diagnostics.Debug.WriteLine(String.Format(
                    "MemoryStream took: {0:N7} seconds", endTime.TotalSeconds - startTime.TotalSeconds));

                // 4. P/Invoke: call the native RtlMoveMemory on the pinned arrays.
                startTime = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
                byte[] newData2 = new byte[data.Length+1];
                unsafe
                {
                    // Pin both arrays so the GC cannot move them during the native call.
                    fixed(void* pData = data)
                    fixed(void* pNewData2 = newData2)
                        MoveMemory(new IntPtr(pNewData2), new IntPtr(pData),
                            new UIntPtr((uint)data.Length));
                }
                newData2[newData2.Length-1] = (byte)1;
                endTime = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
                System.Diagnostics.Debug.WriteLine(String.Format(
                    "P/Invoke copy took: {0:N7} seconds", endTime.TotalSeconds - startTime.TotalSeconds));
            }

            Console.ReadLine();
        }

        // The native length parameter is a SIZE_T, so it is declared as UIntPtr
        // to stay correct on both 32-bit and 64-bit processes.
        [System.Runtime.InteropServices.DllImport(
            "Kernel32.dll", EntryPoint="RtlMoveMemory", SetLastError=false)]
        static extern void MoveMemory(IntPtr dest, IntPtr src, UIntPtr size);

        // Builds 10,000,000 random bytes.
        private static byte[] createTestData()
        {
            byte[] data = new byte[10000000];
            Random random = new Random();
            random.NextBytes(data);
            return data;
        }
    }
}
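Since the direction of the MemoryStream call matters for what the test measures, here is a small sketch of the difference between Read and Write on a fresh stream. It uses only standard MemoryStream members; the StreamCopy class and CopyAndAppend method are illustrative names of my own.

```csharp
using System;
using System.IO;

static class StreamCopy
{
    // Copies `source` into a MemoryStream, appends `value`, and returns the contents.
    public static byte[] CopyAndAppend(byte[] source, byte value)
    {
        using (MemoryStream stream = new MemoryStream(source.Length + 1))
        {
            stream.Write(source, 0, source.Length); // copies source INTO the stream
            stream.WriteByte(value);
            return stream.ToArray();                // a second copy, out of the stream
        }
    }

    static void Main()
    {
        byte[] data = { 10, 20, 30 };

        // Read on a freshly created (empty) stream transfers nothing.
        MemoryStream empty = new MemoryStream(data.Length);
        int bytesRead = empty.Read(new byte[data.Length], 0, data.Length);
        Console.WriteLine(bytesRead); // prints 0

        Console.WriteLine(CopyAndAppend(data, 1).Length); // prints 4
    }
}
```

Note that getting a byte[] back out with ToArray costs a second copy, so a fair comparison against Array.Copy would probably need to decide whether that step counts.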

Filed under: Code Kaizen
