Coding a Database-Driven Tag Cloud Using PHP, MySQL, and CSS

I’m currently in the middle of a personal project to create a dynamic grid of photos, randomly generated each time the page is loaded.  I thought a good way of showing the individual photo albums would be to use a tag cloud.  For those who don’t know, a tag cloud is a way of displaying labels or words that emphasizes the most common ones by making their font size larger.  It shows at a glance which tags are used most often and what the content is about.  I’ve never built a tag cloud before, so this is a learning process for me, too.

My goal is to use PHP to build a tag cloud based on tags in a MySQL table.  Ideally, the table will have a “tags” field for each row/entry, and we’ll use all these tags to generate our tag cloud.  We’ll end up with something like this:

[Image: Tag Cloud goal (caption: Final Result)]

Brainstorming, Planning, and Getting Started

When I start coding something, I usually start by listing out the high-level steps that need to be done in order to get the desired results.  Try to think of what steps you think we’ll need.  What are our overall requirements and constraints? How would you bring the data in from the database?  How would you figure out what size each tag should be relative to the others?  How would you display the results?

For me, the requirements are:

  • Font size should be a linear function of the number of times that tag appears
  • It should read tags from a MySQL database
  • The maximum and minimum font sizes should be easily changeable so it can be adapted to any page
  • It should be output in text form so that it can fit in a container of any size
  • We should be able to choose the maximum number of tags to include (rather than listing every tag), if desirable

Based on these requirements, I come up with a high-level outline of what needs to be done. Don’t worry if the process doesn’t make sense at first — I’ll explain it as we go.

  • Define user variables and calculate font size range
  • Create a few arrays to store data
  • Connect to and query the MySQL database
  • Loop through each result of the query
    • Split each result on “, ” (our tags will be in the form “tag 1, tag 2, tag 3”, etc.) and store the pieces in an array
    • Loop through all items of this string array (i.e. loop through each tag)
    • Check if it exists in tag array
      • If it isn’t already in the array, add it to the array with a value of 1
      • If it is already in the array, increase that tag’s value by 1
  • Sort the tag array by value, descending
  • Add the first “n” tags to a new array, where n is the number of tags we want to display
  • Alphabetize this new array
  • Calculate the tag count/value range
  • Create a function to calculate font size based on tag count/value for a given tag
  • Loop through the array to calculate the font size for each tag and echo the results

Right. So before we get started coding, make sure you have your MySQL database set up properly. Basically, you just need a table with a text field called “tags” (or if it’s not “tags,” make sure you change it in the code later). Each row’s “tags” field should contain tags in the following form: “tag 1, tag2, tag three” etc.

[Image: Database of tags (caption: My "tags" field in my database)]

Getting the Tags from the Database

First, we need to define some user variables for font sizes and so on, create a few arrays to hold and process our data, and connect to our database. If you don’t have much experience with MySQL, I recommend checking out the great tutorial at Tizag.com. To establish a connection, we use the mysql_connect function (with your own hostname, username, and password) and select the database with mysql_select_db (with your own database name). To get data from the table, we use the mysql_query function (with your own table name). Note that the mysql_* functions used here were removed in PHP 7; on current PHP versions, the mysqli and PDO extensions provide the equivalent functionality.

<?php
// Define user variables
$fontSizeUnit = "px"; // Will get appended to font size in CSS
$maxFontSize = 80; // Maximum font size (for most common tag)
$minFontSize = 12; // Minimum font size (for least common tag)
$tagsToDisplay = 20; // The number of top tags to display
$tagDivider = "/"; // Placed between tags.  Leave blank for none

// Calculations
$fontSizeSpread = $maxFontSize - $minFontSize; // Size range

// Create a few blank arrays
$tagArray = array();
$finalTagArray = array(); // Will hold only the top tags

// Connect to and query mySQL Database (NOTE: change "hostname," "username,"
// "password," and "databasename" to your information)
mysql_connect("hostname", "username", "password") or die(mysql_error());
mysql_select_db("databasename");
// Get all rows that don't have blank tags (NOTE: change "tablename" to your table)
$result = mysql_query("SELECT * FROM tablename WHERE tags!=''");
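
The mysql_* functions above no longer exist in PHP 7 and later. As a rough sketch of the same step on a current PHP version, here is one way to fetch the tag strings with PDO. The helper name fetchTagStrings is made up for this example, and "tablename" is the same placeholder used above.

```php
<?php
// Hypothetical helper (not part of the original script): returns the
// "tags" column of every row whose tags field is not blank.
function fetchTagStrings(PDO $pdo) {
    // NOTE: "tablename" is a placeholder; change it to your table
    $statement = $pdo->query("SELECT tags FROM tablename WHERE tags != ''");
    // FETCH_COLUMN returns a flat array of the first selected column
    return $statement->fetchAll(PDO::FETCH_COLUMN);
}
```

You would call it with a connection such as new PDO("mysql:host=hostname;dbname=databasename", "username", "password"), then loop over the returned strings with foreach instead of calling mysql_fetch_array.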

Looping Through Each Result and Storing Individual Tags to an Array

Looping through each result is accomplished like this:

// Loop through each result of the query
while($row = mysql_fetch_array($result)){
	/* Code to
	   loop*/
}

Now that we’re looping through each result, we need to further decompose each row’s string of tags into individual tags. That is, we need to convert the single string “tag 1, tag 2, tag 3” into three separate strings, “tag 1”, “tag 2”, and “tag 3”.  To do this, we have to take each tag result and split it into an array, with “, ” as the delimiter. In PHP, the preg_split function handles this. Note: this code goes inside the “while” loop above.

	// Split the result of each query ", " (our tags will be in the form "tag 1,
	// tag 2, tag 3" etc.) and store the results to an array
	$tempStringArray = preg_split("/, /", $row["tags"]);
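
The pattern "/, /" only matches a comma followed by exactly one space, so a row like “tag 1,tag 2” would not be split. If your data is less tidy, a slightly more forgiving pattern might help. This is just a sketch; splitTags is a hypothetical helper name, not something the rest of this tutorial relies on.

```php
<?php
// Hypothetical helper: split a comma-separated tag string into
// individual tags, tolerating missing or extra spaces around commas.
function splitTags($tagString) {
    // \s*,\s* matches a comma plus any surrounding whitespace;
    // PREG_SPLIT_NO_EMPTY drops empty pieces (e.g. from a trailing comma)
    return preg_split('/\s*,\s*/', trim($tagString), -1, PREG_SPLIT_NO_EMPTY);
}
```

For example, splitTags("tag 1, tag 2,tag 3 ") gives the three tags “tag 1”, “tag 2”, and “tag 3” with no stray spaces.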

Looping Through Each Tag

Our next task is to go through each tag and check whether or not it exists in the array of all tags. First, it helps to convert each tag to lowercase with the strtolower function (if you want case-sensitive matching, you can skip the lowercase conversion). Still inside the “while” loop, we loop through each element in the $tempStringArray (the array that stores all tags for a given database row):

    // Loop through all items of this string array (i.e. loop through each tag)
    // Use a "for" loop
    for ($a = 0; $a < count($tempStringArray); $a++) {
        // Convert to lowercase
        $tempStringArray[$a] = strtolower($tempStringArray[$a]);
        // Check if it exists in tag array
        if (!isset($tagArray[$tempStringArray[$a]])) {
            // If it doesn't exist, create it with value 1
            $tagArray[$tempStringArray[$a]] = 1;
        } else {
            // If it does exist, increase the value by 1
            $tagArray[$tempStringArray[$a]] += 1;
        }
    }
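
To see the counting logic on its own, without a database, here is a small self-contained sketch. The function name countTags and the sample rows are invented for illustration; the logic mirrors the loop above, using isset to test whether a tag has been seen yet.

```php
<?php
// Hypothetical helper: count how often each tag appears across rows.
// $rows is an array of strings such as "tag 1, tag 2, tag 3".
function countTags(array $rows) {
    $tagArray = array();
    foreach ($rows as $tagString) {
        foreach (preg_split("/, /", $tagString) as $tag) {
            $tag = strtolower($tag); // Treat "PHP" and "php" as the same tag
            if (!isset($tagArray[$tag])) {
                $tagArray[$tag] = 1;  // First time we've seen this tag
            } else {
                $tagArray[$tag] += 1; // Seen before, so bump the count
            }
        }
    }
    return $tagArray;
}

// Made-up sample data standing in for database rows
$counts = countTags(array("PHP, css", "php, MySQL"));
// $counts is now array("php" => 2, "css" => 1, "mysql" => 1)
```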

So far, your code should look like this:

<?php
// Define user variables
$fontSizeUnit = "px";
$maxFontSize = 80;
$minFontSize = 12;
$tagsToDisplay = 20;
$tagDivider = "/"; // Leave blank for none

// Calculations
$fontSizeSpread = $maxFontSize - $minFontSize;

// Create a few blank arrays
$tagArray = array();
$finalTagArray = array(); // Will hold only the top tags

// Connect to and query mySQL Database (NOTE: change "hostname," "username,"
// "password," and "databasename" to your information)
mysql_connect("hostname", "username", "password") or die(mysql_error());
mysql_select_db("databasename");
// Get all rows that don't have blank tags (NOTE: change "tablename" to your table)
$result = mysql_query("SELECT * FROM tablename WHERE tags!=''");

// Loop through each result of the query
while($row = mysql_fetch_array($result)){

	// Split the result of each query ", " (our tags will be in the form "tag 1,
	// tag 2, tag 3" etc.) and store the results to an array
	$tempStringArray = preg_split("/, /", $row["tags"]);

	// Loop through all items of this string array (i.e. loop through each tag)
	// Use a "for" loop
	for ($a = 0; $a < count($tempStringArray); $a++) {
		// Convert to lowercase
		$tempStringArray[$a] = strtolower($tempStringArray[$a]);
		// Check if it exists in tag array
		if (!isset($tagArray[$tempStringArray[$a]])) {
			// If it doesn't exist, create it with value 1
			$tagArray[$tempStringArray[$a]] = 1;
		} else {
			// If it does exist, increase the value by 1
			$tagArray[$tempStringArray[$a]] += 1;
		}
	}

}

Getting and Alphabetizing the Top “n” Tags

Next, we want to limit our display to a certain number of tags. That is, we only want to display the 20 (“n”) most common tags, for example. To do this, we sort our tag array in descending order of count using the arsort function so that our most common tags come first.  Next, we create a tag counting variable ($numberOfTags) and a foreach loop to loop through the tags until we reach the limit (once we reach the limit, we break out of the loop).  During the loop, we store the included tags in a new array, $finalTagArray; this array will contain only the top tags that we want to use.  Finally, we alphabetize the tag names using the ksort function so that our tag cloud is alphabetical.

// Sort the tag array by count, descending, then copy the top tags
// into a new array

arsort($tagArray);
$numberOfTags = 0;
foreach ($tagArray as $key => $val) {
	$numberOfTags++;

	if ($numberOfTags > $tagsToDisplay) {
		break;
	}

	$finalTagArray[$key] = $val;
}
ksort($finalTagArray);
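
The same top-“n” selection can also be written without a manual counter. The sketch below uses array_slice to keep the first n entries after sorting; topTags is a hypothetical name and the counts are sample values. Note that when several tags tie at the cutoff, which of them makes the cut depends on how arsort orders equal values.

```php
<?php
// Hypothetical helper: keep the $n most common tags, alphabetized.
function topTags(array $tagArray, $n) {
    arsort($tagArray);                                     // Highest counts first
    $finalTagArray = array_slice($tagArray, 0, $n, true);  // Keep first $n, preserving keys
    ksort($finalTagArray);                                 // Alphabetize by tag name
    return $finalTagArray;
}

$top = topTags(array("beach" => 5, "zoo" => 3, "art" => 9, "cat" => 1), 3);
// $top is now array("art" => 9, "beach" => 5, "zoo" => 3)
```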

Calculating Each Tag’s Font Size

In order to calculate the output font size, we first need to find the range of tag counts by finding the maximum and minimum tag counts and finding the difference.  Once we have this range, we can find out how much the font size will increase with each tag count increase by dividing the font size range (which we found earlier) by the tag count range.

$maxTagCount = max($finalTagArray);
$minTagCount = min($finalTagArray);

$tagCountSpread = $maxTagCount - $minTagCount;

// Guard against division by zero when every tag has the same count
$unitsPerCount = ($tagCountSpread > 0) ? $fontSizeSpread / $tagCountSpread : 0;

Next, we create a function, calcSize, which takes a tag count as a parameter and returns the font size.  Inside the function, we import the global variables $minTagCount, $minFontSize, $fontSizeUnit, and $unitsPerCount.  We calculate the font size by finding how many counts the tag has above the minimum count, multiplying by the font size per tag count, then adding the minimum font size.  Got that?  If not, see if the code below helps.  Finally, we round the font size to the nearest integer, append the font unit, then return the font size as a string.

// Function to calculate the font size
function calcSize($thisTagCount) {

	// Import necessary global variables
	global $minTagCount, $minFontSize, $fontSizeUnit, $unitsPerCount;

	// Calculate the font size
	$thisFontSize = $minFontSize+($unitsPerCount*($thisTagCount-$minTagCount));

	// Round the font size and add units
	$thisFontSize = round($thisFontSize) . $fontSizeUnit;

	// Return the font size
	return $thisFontSize;

}
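
As a worked example of the arithmetic: with font sizes from 10px to 80px and tag counts from 1 to 15, each extra count adds 70 / 14 = 5px, so a tag used 8 times gets 10 + 5 × 7 = 45px. The sketch below does the same calculation in a version of the function that takes its inputs as parameters instead of globals. The name calcSizeFor, the parameter list, and the choice to fall back to the minimum size when all counts are equal are assumptions made for this example.

```php
<?php
// Hypothetical variant of calcSize that avoids global variables.
function calcSizeFor($count, $minCount, $maxCount, $minSize, $maxSize, $unit = "px") {
    $countSpread = $maxCount - $minCount;
    // If every tag has the same count there is no range to scale over,
    // so fall back to the minimum size (an assumption, not a rule)
    $unitsPerCount = ($countSpread > 0) ? ($maxSize - $minSize) / $countSpread : 0;
    $size = $minSize + $unitsPerCount * ($count - $minCount);
    return round($size) . $unit;
}

echo calcSizeFor(8, 1, 15, 10, 80); // prints 45px
```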

Outputting the Results

In order to print out the results, we again loop through each item in the final tag array.  As we loop through, we echo the “span” HTML element with “font-size” in the style attribute and call the calcSize function with each tag’s count.  Finally, we use a variable $b to determine whether we’re at the last element.  If we’re not, we add the tag divider character to separate each tag:

$b = 1;
foreach ($finalTagArray as $key => $val) {
	echo "<span style='font-size: ";
	echo calcSize($val);
	// Escape the tag name so characters like & or < display correctly
	echo "'>" . htmlspecialchars($key) . "</span>";
	if ($b != count($finalTagArray)) {
		echo " " . $tagDivider . " ";
	}
	$b++;
}
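
If you would rather not track the position with a counter, one alternative is to build each span first and join them with implode, which only puts the divider between items. This is a sketch under a few assumptions: renderTagCloud is a hypothetical name, it expects an array of tag name => font size string (as calcSize would return), and it escapes tag names with htmlspecialchars.

```php
<?php
// Hypothetical helper: build the tag cloud HTML as a single string.
// $tagSizes maps tag name => font size string, e.g. "art" => "20px".
function renderTagCloud(array $tagSizes, $tagDivider) {
    $spans = array();
    foreach ($tagSizes as $tag => $fontSize) {
        // Escape the tag so characters like & or < display as text
        $label = htmlspecialchars($tag, ENT_QUOTES, 'UTF-8');
        $spans[] = "<span style='font-size: " . $fontSize . "'>" . $label . "</span>";
    }
    // implode only places the divider between spans, never after the last
    return implode(" " . $tagDivider . " ", $spans);
}

echo renderTagCloud(array("art" => "20px", "r&b" => "12px"), "/");
```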

Final Code

And with that, we’re done! Close your PHP tags, and your final code should look something like this:

<?php
// Define user variables
$fontSizeUnit = "px";
$maxFontSize = 50;
$minFontSize = 12;
$tagsToDisplay = 50;
$tagDivider = "/"; // Leave blank for none

// Calculations
$fontSizeSpread = $maxFontSize - $minFontSize;

// Create a few blank arrays
$tagArray = array();
$finalTagArray = array(); // Will hold only the top tags

// Connect to and query mySQL Database (NOTE: change "hostname," "username,"
// "password," and "databasename" to your information)
mysql_connect("hostname", "username", "password") or die(mysql_error());
mysql_select_db("databasename");
// Get all rows that don't have blank tags (NOTE: change "tablename" to your table)
$result = mysql_query("SELECT * FROM tablename WHERE tags!=''");

// Loop through each result of the query
while($row = mysql_fetch_array($result)){

	// Split the result of each query ", " (our tags will be in the form "tag 1,
	// tag 2, tag 3" etc.) and store the results to an array
	$tempStringArray = preg_split("/, /", $row["tags"]);

	// Loop through all items of this string array (i.e. loop through each tag)
	// Use a "for" loop
	for ($a = 0; $a < count($tempStringArray); $a++) {
		// Convert to lowercase
		$tempStringArray[$a] = strtolower($tempStringArray[$a]);
		// Check if it exists in tag array
		if (!isset($tagArray[$tempStringArray[$a]])) {
			// If it doesn't exist, create it with value 1
			$tagArray[$tempStringArray[$a]] = 1;
		} else {
			// If it does exist, increase the value by 1
			$tagArray[$tempStringArray[$a]] += 1;
		}
	}
}
// Sort the tag array by count, descending, then copy the top tags
// into a new array
arsort($tagArray);
$numberOfTags = 0;
foreach ($tagArray as $key => $val) {
	$numberOfTags++;

	if ($numberOfTags > $tagsToDisplay) {
		break;
	}

	$finalTagArray[$key] = $val;
}
ksort($finalTagArray);

$maxTagCount = max($finalTagArray);
$minTagCount = min($finalTagArray);

$tagCountSpread = $maxTagCount - $minTagCount;

// Guard against division by zero when every tag has the same count
$unitsPerCount = ($tagCountSpread > 0) ? $fontSizeSpread / $tagCountSpread : 0;

// Function to calculate the font size
function calcSize($thisTagCount) {

	// Import necessary global variables
	global $minTagCount, $minFontSize, $fontSizeUnit, $unitsPerCount;

	// Calculate the font size
	$thisFontSize = $minFontSize+($unitsPerCount*($thisTagCount-$minTagCount));

	// Round the font size and add units
	$thisFontSize = round($thisFontSize) . $fontSizeUnit;

	// Return the font size
	return $thisFontSize;

}

$b = 1;
foreach ($finalTagArray as $key => $val) {
	echo "<span style='font-size: ";
	echo calcSize($val);
	// Escape the tag name so characters like & or < display correctly
	echo "'>" . htmlspecialchars($key) . "</span>";
	if ($b != count($finalTagArray)) {
		echo " " . $tagDivider . " ";
	}
	$b++;
}

?>

Of course, you can style this however you want using CSS. With some style tweaks, I ended up with this:

[Image: Tag Cloud Styled (caption: My styled Tag Cloud)]



8 Comments.

  1. Nice! Your code is well commented too, which is something that many programmers forget to do, despite how important it is. =\

  2. I have been looking for a way to do this with and without a database.

    I am still at the beginning of learning PHP, but I am going to look over the code for a little bit and see if I can make it work without a database.

    I know it’s odd, but I am trying to code a little CMS that will not use a database so it is just upload and go.

    Thanks for the information.

  3. It is actually kind of bad if you look at it from a normalization point of view. Those comma-separated values mean a tag is stored multiple times, eating up memory/disk space.
    When you use a many-to-many relationship you can use MySQL GROUP BY and COUNT to quickly retrieve all active tags that are in use, or to get all active tags that are not in use.
    This way PHP won’t have to loop through the entire result set, store it in an array, sort it, etc., as MySQL is more optimized for this kind of task.

  4. Hi Matt, works great, except I’m not quite sure how to call the calcSize function. Can you help me with this line where it echos the tag…

    echo “” . $key . ““;

    Thank you :wink:

    • Hi Phil,

      Ah, good catch. A few weeks ago, the formatting of this post randomly got really messed up, and it looks like some things were still bad after I fixed it. There was supposed to be a “span” tag in the code, but when I fixed it, I guess WordPress interpreted it as actual HTML code rather than content.

      I think it should be good now. See the last few lines and you should be able to see where calcSize fits in now.

      Thanks for noticing this!

      Matt

  5. this was great, so easy to implement just what I needed for a beta website project, greatly appreciated.
