Notes on Array Layout

From: andrew cooke <andrew@...>

Date: Wed, 9 Dec 2009 11:22:56 -0300

2D array data are typically stored in computer memory in one of two
layouts: row major or column major.

These names explain nothing until you know which index is "row" and
which is "column".  It turns out that the conventional indexing is
[y,x] (sigh).  So the first index is row (where you are in a column)
and the second is column (where you are in a row).

For example, this is how the indices of a 2x3 (2 rows x 3 columns)
matrix, called A, are conventionally visualised:

A[0,0]  A[0,1]  A[0,2]
A[1,0]  A[1,1]  A[1,2]

Here I am using 0-based indexing (normal for C).  In Fortran and
Matlab, things are similar but indexing is 1-based, so you need to add
1 to all the indices (so A[0,0] becomes A[1,1] etc).

If we put numbers in the array it might look like:

1 2 3
4 5 6

Row Major

With row major layout, rows are stored together.  So the data above
would be stored as a sequence in memory as:

A[0,0]  A[0,1]  A[0,2]  A[1,0]  A[1,1]  A[1,2]

In terms of numbers in the array:

1 2 3 4 5 6

This is how arrays are stored in C.  As you move through the data, the
last index increases most quickly, like a car's mileometer (mnemonic: C
is Car).
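
As a minimal sketch of the row major mapping (the helper names
row_major_offset and flatten_c_array are made up for illustration),
the position of A[row,col] in memory is row * ncols + col.  The second
function copies the example array out of C's own storage, in memory
order, so the layout can be checked directly:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 0-based position of element [row, col] in a
   row major array that has ncols columns. */
size_t row_major_offset(size_t row, size_t col, size_t ncols)
{
    assert(col < ncols);
    return row * ncols + col;
}

/* Hypothetical helper: copy the 2x3 example array into out[] exactly
   as C lays it out in memory. */
void flatten_c_array(int out[6])
{
    int a[2][3] = { {1, 2, 3}, {4, 5, 6} };
    memcpy(out, a, sizeof a);
}
```

Calling flatten_c_array fills out[] with 1 2 3 4 5 6, and
row_major_offset(1, 0, 3) gives 3, which is where the 4 sits.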

Column Major

With column major layout, columns are stored together.  So the data
above would be stored as a sequence in memory as:

A[0,0]  A[1,0]  A[0,1]  A[1,1]  A[0,2]  A[1,2]

In terms of numbers in the array:

1 4 2 5 3 6

This is how arrays are stored in Fortran and Matlab (except that
indexing is 1-based).  As you move through the data, the first index
varies most quickly (mnemonic: Fortran First Fastest).
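
The column major mapping can be sketched in C too, by computing the
position by hand.  This is only an illustration with 0-based indices
(the helper names col_major_offset and store_col_major are made up);
it is not how Fortran or Matlab are implemented.  The position of
A[row,col] is col * nrows + row:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: 0-based position of element [row, col] in a
   column major array that has nrows rows. */
size_t col_major_offset(size_t row, size_t col, size_t nrows)
{
    assert(row < nrows);
    return col * nrows + row;
}

/* Hypothetical helper: write the 2x3 example array into out[] in
   column major order. */
void store_col_major(int out[6])
{
    const int a[2][3] = { {1, 2, 3}, {4, 5, 6} };
    for (size_t row = 0; row < 2; row++)
        for (size_t col = 0; col < 3; col++)
            out[col_major_offset(row, col, 2)] = a[row][col];
}
```

Calling store_col_major fills out[] with 1 4 2 5 3 6, matching the
sequence above.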

Note that there are two separate issues here.  First, there's the
convention of which array index is interpreted as row and which as
column.  Second, there's the convention of how you map the indices to
memory.  These two are mixed up by the names used.

We can separate these two issues by focussing on the indexing.  Then we have:

In C, the last index varies most quickly as we move through memory.
In Fortran and Matlab, the first index varies most quickly as we move
through memory.

And, separately:

It is conventional for the first index to indicate "row" and the
second "column".

Putting these together we get "row major" for C and "column major" for
Fortran and Matlab.
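
The memory convention can be written down without mentioning rows or
columns at all.  The sketch below (function names are made up, indices
are 0-based) takes a position k in memory for an array of extent
n0 x n1 and recovers the two indices under each convention:

```c
#include <assert.h>
#include <stddef.h>

/* Last index fastest (the C convention): recover indices [i0, i1]
   from memory position k. */
void last_fastest(size_t k, size_t n0, size_t n1, size_t *i0, size_t *i1)
{
    assert(k < n0 * n1);
    *i0 = k / n1;
    *i1 = k % n1;
}

/* First index fastest (the Fortran and Matlab convention, but with
   0-based indices here): recover indices [i0, i1] from position k. */
void first_fastest(size_t k, size_t n0, size_t n1, size_t *i0, size_t *i1)
{
    assert(k < n0 * n1);
    *i0 = k % n0;
    *i1 = k / n0;
}
```

For the 2x3 example, position 1 is A[0,1] with last_fastest and A[1,0]
with first_fastest.  Only once the first index is read as "row" do
these become "row major" and "column major".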

Andrew