Implementing Markov Chains

Markovian Linked List Tutorial for Programmers and Game Developers

© Guy Lecky-Thompson

Tutorial showing how to implement Markovian Linked Lists (Chains) based on examples from the game development industry for programmers and developers of all genres.

Introduction

In programming terms, being able to generate samples of data on the fly such that it looks like natural occurring phenomena is very useful. From financial modeling, to language generation, through to landscape creation for video games, taking real data, sampling it and using that to generate convincing synthetic versions is key in many areas. One technique for doing this is known as the Markovian Linked List.

One particular application is in generating natural-sounding names based on samples of real country names. This is the example we are going to be using to illustrate the article; without the actual data, but as a theoretical application.

Weighted Linked Lists

A linked list is one in which each node is pointing to another. This could be a sibling, parent, or just the next node in the list, depending on whether we are building a list or tree. Each link has equal weight. In other words, if we assume that we have a node that connects to 10 others, each possible link would have an equal chance of being chosen at random.

On the other hand, a weighted linked list contains links which have a specific weight depending on how 'strong' that link is. To illustrate this, let us consider a list of country names. Each one consists of letters, one following the other.

We could analyze this list of countries, and tabulate the frequencies of each pair of letters in an adjacency table. This is just a two-dimensional array of numbers, where each number indicates the amount of times that the letters are paired in the sample data. The array is 28x28 cells, allowing for each letter of the alphabet, and leading and trailing spaces.

When we build the table, we run through each country name, and analyze the letter pairs. So, if the first letter is an 'A', we can increment the cell referenced by ' A', or [0][1]. Assuming the country is Australia, we can then increment the cell referenced by 'Au', or [1][21]. We then proceed through the country, ending with 'a ', or [1][0].

We can also make a linked list version, in which the link count is incremented. The abstract data type for the node might look like:

struct node
{
node link_target[MAX_NODES];
long link_count[MAX_NODES];
char data;
};

When we build the list, we add it to the link_target array, and increment the appropriate link_count. We can then follow the link, and do the same analysis again with the next data pair.

Generating Sample Data

When we generate sample data, we need to establish the chances of choosing one of the links at random. To do this, we need first to sum all the valid link_count entries. This gives us a total, which we can then use to generate a random number.

Once we have this random number in hand, we can move through the link_count array, adding the values to a running total. When the running total passes the random number chosen, we select the appropriate link_target. This might be coded thus:

link_target PickANode ( node parent_node )
{
long total = 0;
for (i = 0; i < MAX_NODES; i++)
{
if (link_count[i] == 0) break;
total += link_count[i];
}
long random = rand() % total;
total = 0;
for (i = 0; i < MAX_NODES; i++)
{
if (link_count[i] == 0) break;
total += link_count[i];
if ( total >= random ) break;
}
return link_target[i];
}

Using this algorithm, almost any data pair, or even tuple set, can be analyzed and natural looking synthetic data sets generated. It can be used for language, terrain, or behavior generation and with some experimentation can produce startling results.


The copyright of the article Implementing Markov Chains in Computer Programming Tutorials is owned by Guy Lecky-Thompson. Permission to republish Implementing Markov Chains must be granted by the author in writing.




Post this Article to facebook Add this Article to del.icio.us! Digg this Article furl this Article Add this Article to Reddit Add this Article to Technorati Add this Article to Newsvine Add this Article to Windows Live Add this Article to Yahoo Add this Article to StumbleUpon Add this Article to BlinkLists Add this Article to Spurl Add this Article to Google Add this Article to Ask Add this Article to Squidoo