Scripting Snippets

Bits of scripting I use infrequently and always have to look up!

Looping and reading a line at a time

while IFS= read -r line; do
  echo "$line"
done < test.txt

Remove annoying .DS_Store files, even in directories that have spaces in their names

find . -name .DS_Store -exec rm {} \;

Find differences between two directories (with subdirectories), stating if files differ

diff -rq "$DIR1" "$DIR2"

Count files in current and all subdirectories

find . -type f | wc -l

JasperReports Tips

JasperReports is a Java-based open source reporting framework with functionality similar to, if not greater than, that of Crystal Reports. There is a visual report designer called iReport, but in my experience using it for anything but the simplest report has been a waste of time. It is useful for dragging and dropping report layouts around, but once you get the hang of the format, it’s far easier to edit the XML directly.

Documentation is available, but it’s often difficult to get simple answers via a Google search, as is usually possible for open source software. The current books also lack good examples when they explain some of the more powerful features. Simple things like “I don’t want my report to have nulls printed on it, what are my options?” are surprisingly awkward to locate! My own FAQs …

  • Null values in reports, what can I do?

    If you want to replace the null value with a blank entry this is easy to achieve with the isBlankWhenNull attribute of textField

    <textField isBlankWhenNull="true"> ... </textField>

    If you need to replace the null value with something else, you can do this directly in the report as shown below. Using the ternary operator, the GROUP_NAME field is checked for null; if it is null, the text None is used instead.

    <textFieldExpression>(($F{GROUP_NAME} != null) ? $F{GROUP_NAME} : "None")</textFieldExpression>
  • How to add a row count?

    If you need a row count and don’t want to add unnecessary fields to the SQL statement, you can use a variable. Straight after your field definitions include the following, assuming you have a field named USER_NAME in your field definitions!

    <variable name="row_count" class="java.lang.Integer" calculation="Count">
        <variableExpression><![CDATA[$F{USER_NAME}]]></variableExpression>
        <initialValueExpression><![CDATA[new java.lang.Integer(0)]]></initialValueExpression>
    </variable>

    This can be used in the report as follows:

    <textField>
        <reportElement x="0" y="0" width="30" height="20"/>
        <textFieldExpression class="java.lang.Integer"><![CDATA[$V{row_count}]]></textFieldExpression>
    </textField>
  • Highlight certain rows depending on the value of other data?

    If you need to highlight some text, the trick is to create two versions of the textField that displays it and use the <printWhenExpression> tag to determine which one displays the text. The example below highlights the text in red if the PASS_DATE field is null.

    <!-- main textField -->
    <textField>
        <reportElement x="466" y="0" width="228" height="20">
            <printWhenExpression>new Boolean($F{PASS_DATE} != null)</printWhenExpression>
        </reportElement>
        <textFieldExpression class="java.lang.String"><![CDATA[$F{USER_NAME}]]></textFieldExpression>
    </textField>
    
    <!-- highlight textField -->
    <textField>
        <reportElement x="466" y="0" width="228" height="20" forecolor="red">
            <printWhenExpression>new Boolean($F{PASS_DATE} == null)</printWhenExpression>
        </reportElement>
        <textFieldExpression class="java.lang.String"><![CDATA[$F{USER_NAME}]]></textFieldExpression>
    </textField>

JSTL Internationalisation

Some notes on how to get JSTL pages ready.

Setup

Put the following inside a JSP page that is included by every other page, for example a header JSP file.

<fmt:setLocale value="${param.locale}" scope="request" />
<fmt:setTimeZone value="${param.timeZone}" scope="request" />
<fmt:setBundle basename="Messages"/>

At your classpath root create a fallback localised file called Messages.properties in which to store key/value pairs. Files for other languages take the form of Messages_en.properties, Messages_en_US.properties, etc.

page.text.title=Page Title
page.text.message=Page Message
page.text.welcomeMessage=Welcome {0}
page.text.returnLink=<a href="{0}" title="Return">Return</a>
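The {0} placeholders follow the java.text.MessageFormat convention, so the substitution can be tried outside a JSP. The sketch below is only an illustration: the patterns are copied from the properties file above, while the class name MessageDemo and the argument values are made up.

```java
import java.text.MessageFormat;

class MessageDemo {
    // Substitutes positional arguments into a pattern such as "Welcome {0}".
    static String format(String pattern, Object... args) {
        return MessageFormat.format(pattern, args);
    }

    public static void main(String[] args) {
        // Patterns taken from Messages.properties; the values are hypothetical.
        System.out.println(format("Welcome {0}", "Darren"));
        System.out.println(format("<a href=\"{0}\" title=\"Return\">Return</a>", "/home"));
    }
}
```

One thing to watch for with MessageFormat patterns is that a single quote is a special character and has to be doubled to appear in the output.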

Examples

<h1>Page Title</h1>
<h1><fmt:message key="page.text.title"/></h1>

<h1>Welcome <c:out value="${person.name}"/></h1>
<h1><fmt:message key="page.text.welcomeMessage"><fmt:param value="${fn:escapeXml(person.name)}"/></fmt:message></h1>

Pull phpBB2 forum posts (basically a simple CMS)

phpBB2 is currently one of the most popular bulletin boards on the internet (yes, this statement is old, but the post is a repost of old content from my previous website).

Forum Powered Website

A forum is the most popular section of many websites so I was looking for a way to drive the website content from the forum. In the end I decided to write the following code to get topics from a forum which I could then display on other pages.

/*
 * PHPBB2 extension to get posts from a forum.  This allows you to power a news section of
 * a website by posting in the forum.  Any posts in reply to the topic are classed as
 * comments.
 *
 * $Id$
 */

define('IN_PHPBB', true);

$phpbb_root_path = 'forum/';
$phpEx = 'php';

include($phpbb_root_path . 'config.' . $phpEx);
include($phpbb_root_path . 'includes/constants.' . $phpEx);
include($phpbb_root_path . 'includes/db.' . $phpEx);
include($phpbb_root_path . 'includes/template.' . $phpEx);
include($phpbb_root_path . 'includes/functions.' . $phpEx);
include($phpbb_root_path . 'includes/bbcode.' . $phpEx);

// need template defined as the bbcode uses it
$template = new Template($phpbb_root_path . 'templates/Saphic');

function &getForumPosts($forum_id) {
  global $db;

  $posts_array = array();

  // get list of topics
  $topic_sql = 'SELECT t.topic_id, t.topic_first_post_id, t.topic_replies FROM ' . TOPICS_TABLE . ' t WHERE t.forum_id = ' . intval($forum_id) . ' ORDER BY t.topic_time DESC';

  if (!($topic_result = $db->sql_query($topic_sql))) {
    return $posts_array;  // query failed, return the empty array
  }

  $topics = $db->sql_fetchrowset($topic_result);
  $db->sql_freeresult($topic_result);
  if (!$topics) {
    return $posts_array;  // no topics in this forum
  }
  
  foreach($topics as $topic_row) {
    // get post for topic
    $post_sql = 'SELECT p.post_time, pt.post_subject, pt.post_text, pt.bbcode_uid FROM ' . POSTS_TABLE . ' p, ' . POSTS_TEXT_TABLE . ' pt WHERE p.post_id = ' . $topic_row['topic_first_post_id'] . ' AND p.post_id = pt.post_id';
    
    if (!($post_result = $db->sql_query($post_sql))) {
      continue;
    }
    
    if ($posts = $db->sql_fetchrowset($post_result)) { 
      $db->sql_freeresult($post_result); 
    }
    
    $post_row = $posts[0];  // should always have a result
    $text = bbencode_second_pass($post_row['post_text'], $post_row['bbcode_uid']);

    array_push($posts_array, array('topic_id' => $topic_row['topic_id'],
                                   'topic_replies' => $topic_row['topic_replies'],
                                   'post_time' => $post_row['post_time'],
                                   'post_subject' => $post_row['post_subject'],
                                   'post_text' => $text));
  }

  return $posts_array;
}

In the pages for your actual website, you can include the file above and then do the following.

<dl>
<?php
foreach(getForumPosts(13) as $post) {
  echo('<dt>' . date("jS F, Y", $post['post_time']) . '</dt>' . "\n");
  echo('<dd>' . nl2br($post['post_text']) . '<br/>');
  echo($post['topic_replies'] . ' comment/s (<a href="forum/viewtopic.php?t=' . $post['topic_id'] . '">view</a>, <a href="http://www.example.com/forum/posting.php?mode=reply&amp;t=' . $post['topic_id'] . '">add</a>)</dd>' . "\n");
}?>
</dl>

Java Authentication – LDAP and Active Directory

Been asked to integrate your application’s authentication with an LDAP directory (Active Directory is LDAP v3 compliant)? Me too! There is a fair amount of information about this topic available by searching, but when I was doing this I couldn’t find one place that had everything explained in detail, so I decided to document how I did it.

What follows will explain how to validate a username/password combination against an LDAP compliant directory server using Java and the open source LDAP library called jldap.

First off, go and download the jldap jar file and browse around the code samples, as it’s a well documented library. Second, take a look at how easy it is to make a connection to an LDAP directory server.

LDAPConnection conn = new LDAPConnection();
conn.connect("localhost", 389);
conn.bind("cn=admin,dc=tilion,dc=org,dc=uk", "password");

If you’re not too sure about LDAP syntax you may like to read the Wikipedia LDAP entry. In short, LDAP uses a tree structure where each entry has a unique identifier, known as its Distinguished Name (DN). In the username above cn=admin is the Relative Distinguished Name (RDN) and dc=tilion,dc=org,dc=uk is the DN of its parent entry. Put together these form the DN for a user with privileges to bind to the LDAP directory (DC stands for Domain Component). Entries generally have a CN attribute, known as the Common Name, along with a whole load of more familiarly named attributes.
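To see the RDN and parent DN split in code, here is a small sketch. It uses the JDK class javax.naming.ldap.LdapName purely for parsing, which is my own choice and not part of jldap; the class name DnDemo and its helper methods are made up for illustration.

```java
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;
import javax.naming.ldap.Rdn;

class DnDemo {
    // Parses a DN, converting the checked exception so callers stay simple.
    static LdapName parse(String dn) {
        try {
            return new LdapName(dn);
        } catch (InvalidNameException e) {
            throw new IllegalArgumentException("Not a valid DN: " + dn, e);
        }
    }

    // Returns the leftmost RDN, which names the entry itself.
    static Rdn relativeName(String dn) {
        LdapName name = parse(dn);
        // LdapName numbers components from the right, so the leftmost RDN has the highest index.
        return name.getRdn(name.size() - 1);
    }

    // Returns the DN of the parent entry: everything except the leftmost RDN.
    static String parentDn(String dn) {
        LdapName name = parse(dn);
        return name.getPrefix(name.size() - 1).toString();
    }

    public static void main(String[] args) {
        String dn = "cn=admin,dc=tilion,dc=org,dc=uk";
        Rdn rdn = relativeName(dn);
        System.out.println("RDN type:  " + rdn.getType());
        System.out.println("RDN value: " + rdn.getValue());
        System.out.println("Parent DN: " + parentDn(dn));
    }
}
```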

If binding to an Active Directory server, the username is more likely to be of the format cn=Administrator,cn=Users,dc=tilion,dc=org,dc=uk. Part of the complexity with LDAP queries is that there is no fixed format for where particular types of entry live from server to server. Most Active Directory servers will be alike, but won’t be the same when compared to a Novell directory or an OpenLDAP server. For the purposes of authentication we need to locate where in the directory the entries that represent a user object live.

  • cn=Users,dc=tilion,dc=org,dc=uk is the default for Active Directory
  • ou=People,dc=tilion,dc=org,dc=uk is the default for an OpenLDAP server storing unix user accounts (the one I have anyway!)

Whoever set up the LDAP server should be able to tell you the base DN for your environment. For example, when setting up Active Directory you specify the name (check terminology) in the format machine.domain.ext, which would lead to the base DN of dc=machine,dc=domain,dc=ext.

Here is where we get to the quirk. Imagine we need to check a login where the username is darren and the password, well, let’s just say it’s the right password for the username. On an OpenLDAP server, all you need to do is try binding to the directory as shown below.

conn.bind("uid=darren,ou=People,dc=tilion,dc=org,dc=uk", "password");

uid is the attribute that holds the actual username value (the CN, or Common Name is often different to the actual username).

Unfortunately, Active Directory is different, in that you can only bind to it using a DN, which references the actual entry via its CN. What you have to do is perform a query to check if the username exists, grab its CN and then perform a bind using the CN and the given password. So, let’s see this in action …

// assume conn is an LDAPConnection that is already connected and bound, and that
// username and password hold the values entered by the user trying to login
LDAPSearchResults searchResults = conn.search(
        "cn=Users,dc=tilion,dc=org,dc=uk",
        LDAPConnection.SCOPE_ONE,
        "(sAMAccountName=" + username + ")",  // search filter matching the login name
        null,    // return all attributes
        false);  // return attribute values as well as names

if (!searchResults.hasMore()) {
    // return auth failed, the username doesn't exist
}

// the username is valid, let's pull out the CN from the attributes
LDAPEntry entry = searchResults.next();
LDAPAttribute cnAttr = entry.getAttribute("cn");
if (cnAttr == null) {
    // return auth failed, the entry has no CN to bind with
}
String cnValue = cnAttr.getStringValue();

// attempt a bind with CN and given password
LDAPConnection tmp = new LDAPConnection();
tmp.connect(HOST, PORT);
tmp.bind("cn=" + cnValue + "," + "cn=Users,dc=tilion,dc=org,dc=uk", password);

// an LDAPException is thrown if the credentials are invalid,
// otherwise return auth successful, username and password are valid
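Because the login name typed by the user ends up inside an LDAP search filter, it is worth escaping the characters that have a special meaning in filters before building the string. The sketch below follows the escaping rules of RFC 4515; the class LdapFilterDemo and its method names are hypothetical helpers of mine, not part of jldap.

```java
class LdapFilterDemo {
    // Escapes the characters RFC 4515 treats as special inside a filter value.
    static String escapeFilterValue(String value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\\': sb.append("\\5c"); break;
                case '*':  sb.append("\\2a"); break;
                case '(':  sb.append("\\28"); break;
                case ')':  sb.append("\\29"); break;
                case '\0': sb.append("\\00"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds the Active Directory username filter from untrusted input.
    static String accountFilter(String username) {
        return "(sAMAccountName=" + escapeFilterValue(username) + ")";
    }

    public static void main(String[] args) {
        System.out.println(accountFilter("darren"));
        // A wildcard attempt is neutralised rather than matching every account.
        System.out.println(accountFilter("*"));
    }
}
```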

With the concepts covered, you’re probably wondering how you are going to find all those cn, dn, dc, xyz details about your particular LDAP directory. That’s exactly why I created a standalone application to query an LDAP server when I was learning this stuff. You can download the LDAP test application (NOT UPLOADED YET!), which includes the compiled jar, full source and a maven pom.xml.

The code shown here is for illustration purposes only and should not be used in production without proper error handling additions. It is as concise as possible to illustrate a point.

Useful Attributes

A quick round up of useful attributes in various LDAP compliant servers.

Active Directory

  • sAMAccountName holds the username
  • displayName holds the full name
  • mail holds the email address

OpenLDAP (holding unix user accounts)

  • uid holds the username
  • cn holds the full name
  • mailacceptinggeneralid holds the email address

Computer networks and the internet for “normal people”

There is a huge number of buzzwords and confusing terms surrounding the internet and computer networking, which makes the subject very difficult for the everyday person to understand. This article attempts to shed some light on the confusion and give you a broad overview of what happens when your computer connects to the internet.

Let’s think about a network that most of us navigate and use almost every day without thinking of it as complex – the road network. The roads provide the way to get from source to destination and the cars provide the mode of transport. Keep that image in your head as you read through the following paragraphs.

Source and Destination

In the road network a source or destination can be thought of as a house or building address. In computing terms this is called an IP address and is of the form 123.123.123.123 – that is, 4 numbers from 0 to 255 separated by dots. Other examples are 192.168.1.23, 10.0.0.1 and 62.34.12.67. Just as each house address is unique, each IP address is unique.

How do you get an IP address then? This will differ from computer to computer depending on how yours is set up and what you use it for. One thing is for sure: if you want to communicate on a computer network (like the internet) you will definitely have an IP address.
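For the more technically minded, the “4 numbers from 0 to 255” rule can be written as a short check. This is only a sketch of that rule; the class name Ipv4Demo and its methods are made up for illustration.

```java
class Ipv4Demo {
    // Splits a dotted address into its four numbers, rejecting anything outside 0 to 255.
    static int[] parse(String address) {
        String[] parts = address.split("\\.", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 numbers separated by dots: " + address);
        }
        int[] numbers = new int[4];
        for (int i = 0; i < 4; i++) {
            // parseInt throws NumberFormatException (an IllegalArgumentException) for non-numeric text.
            int n = Integer.parseInt(parts[i]);
            if (n < 0 || n > 255) {
                throw new IllegalArgumentException("Out of range (0 to 255): " + parts[i]);
            }
            numbers[i] = n;
        }
        return numbers;
    }

    // Convenience wrapper that reports validity instead of throwing.
    static boolean isValid(String address) {
        try {
            parse(address);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("192.168.1.23"));
        System.out.println(isValid("256.1.1.1"));
    }
}
```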

Unlike a house address which never changes, your computer IP address may change from time to time. There is a limited set of unique IP addresses so they can’t just give them out to everybody. What normally happens is you lease an IP address for a set period of time while you need to use your computer and then free it up for someone else to use while you don’t need it. This process is hidden away from users so you don’t have to concern yourself with it. If you really want to know – something called a DHCP server is responsible for allocating you an IP address and telling you how long you can lease it for.

Domain Names

You’re probably thinking that you’ve never come across an IP address before, yet you’ve been happily surfing the internet for months. Humans aren’t so good at remembering strings of numbers so we use domain names instead. Examples of domain names are www.google.com, www.bbc.co.uk and tilion.org.uk. Notice that they don’t necessarily have to start with www and they don’t all end the same (the endings are actually predefined and vary from country to country, apart from the main three, which are .com, .net and .org).

A domain name can be thought of as an easy way to remember an IP address. Servers known as Domain Name Servers (DNS for short) are responsible for converting the domain name you enter into a web browser, into the IP address to locate the machine hosting it. Again this happens behind the scenes and most users are totally unaware.

Transport

Unlike the road network, where you can see the cars travelling around from source to destination, a computer network sends electrical signals along wires (or wirelessly). The road network of the computer world is known as TCP/IP, which describes a way for your communication to travel around. All you really need to know is that if you try to contact another computer at a specified IP address (or domain name) the communication will succeed providing there are no broken roads between you and the destination.

Ports

Remember how an IP address identifies a particular computer somewhere in the world? Most computers do more than one thing at once, so there needs to be a way to differentiate between the services offered so the computer knows how to reply – this is where ports come in. If an IP address can be likened to a house address, a port can be likened to a way to enter the house (front door, rear door, window, skylight). A computer port is a number from 1 to 65535 (yep, it has a lot of ways to get in!).

Web Servers

One of the most popular services offered by other computers is serving web pages (known as HTTP or Hypertext Transfer Protocol). The HTTP bit is the way you have to speak to the server at the other end so it can understand you – think of it as speaking English or Spanish, where you must both be speaking the same one. Don’t worry too much about speaking HTTP as your web browser understands how to do that for you. Web servers typically listen on port 80, so when you type http://www.google.com into your web browser, 80 is the default port it uses to communicate with the other computer. http://www.google.com is exactly the same as typing in http://www.google.com:80 (the :80 means use port 80). If you try to communicate on a different port the chances are no one will be listening, or maybe the service that is listening doesn’t speak the same language as you, e.g. http://www.tilion.org.uk:22
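The default port rule can be checked with the standard java.net classes without contacting any server. The sketch below is only an illustration; the class PortDemo and the method effectivePort are names I have made up.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

class PortDemo {
    // Returns the port a web address would use: the explicit one if given,
    // otherwise the default for its protocol (80 for http).
    static int effectivePort(String address) {
        try {
            URL url = URI.create(address).toURL();
            // getPort() returns -1 when the address does not name a port.
            return url.getPort() != -1 ? url.getPort() : url.getDefaultPort();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable web address: " + address, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(effectivePort("http://www.google.com"));
        System.out.println(effectivePort("http://www.google.com:80"));
        System.out.println(effectivePort("http://www.tilion.org.uk:22"));
    }
}
```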

Remember that the specified (or default) port is the one used on the server. Your computer is also using a port to talk to the server, but which one you shouldn’t worry about aside from the fact it will NOT be the same port as the server is using.

Other Services

Some other services you’ve probably used already without realising are:

  • port 110 – POP3 which is used to get your emails
  • port 443 – HTTP in SSL mode which is a secure way to view a web page and is used for sensitive information like banking

Technical readers can probably spot mistakes in the analogies, but I’ve tried to keep technical detail to a minimum for the sake of understandability!

Bits and Bytes

A gathering of data related to bits, bytes, endianness, etc …

Number Systems

Decimal

The numbering system we use in everyday life is called decimal (base 10). Each column can use a number between 0 and 9 (10 possibilities).

10^3 10^2 10^1 10^0
1    8    2    5

The top row is the value of the column and the bottom row is a decimal number. For anyone who’s a little rusty on their maths – x^y means x multiplied by itself y times. So, 10^4 = 10 * 10 * 10 * 10.

The decimal number 1825 is one thousand, eight hundred and twenty five or

(1 * 10^3) + (8 * 10^2) + (2 * 10^1) + (5 * 10^0)
(1 * 10 * 10 * 10) + (8 * 10 * 10) + (2 * 10) + (5 * 1)
(1 * 1000) + (8 * 100) + (2 * 10) + (5 * 1)
1000 + 800 + 20 + 5 = 1825

Remember that any number to the power of 0 is 1, so 10^0 = 1

Binary

Computers use the binary system (base 2), where each column can have a 0 or a 1 (2 possibilities). This works well for machinery as it can use positive and negative electrical signals to represent 0 and 1.

2^3 2^2 2^1 2^0
1   0   1   1

The binary number 1011 is equivalent to 11 in decimal.

(1 * 2^3) + (0 * 2^2) + (1 * 2^1) + (1 * 2^0)
(1 * 2 * 2 * 2) + (0 * 2 * 2) + (1 * 2) + (1 * 1)
(1 * 8) + (0 * 4) + (1 * 2) + (1 * 1)
8 + 0 + 2 + 1 = 11
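The same column arithmetic works for any base, so it can be written once and reused. Here is a minimal sketch; the class PositionalDemo and the method valueOf are made-up names for illustration.

```java
class PositionalDemo {
    // Works out the value of a string of digits written in the given base.
    static int valueOf(String digits, int base) {
        int total = 0;
        for (int i = 0; i < digits.length(); i++) {
            int digit = Character.digit(digits.charAt(i), base);
            if (digit < 0) {
                throw new IllegalArgumentException("Not a base " + base + " digit: " + digits.charAt(i));
            }
            // Multiplying the running total by the base moves every earlier digit
            // one column to the left, giving the same sum as digit * base^column.
            total = total * base + digit;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(valueOf("1825", 10));
        System.out.println(valueOf("1011", 2));
    }
}
```

The JDK offers the same conversion as Integer.parseInt("1011", 2), which is what you would normally use.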

As you can see you need a lot more columns to represent a number in binary than you do in decimal. If you had to convert 1825 into binary it would be

011100100001
2^10 + 2^9 + 2^8 + 2^5 + 2^0
1024 + 512 + 256 + 32 + 1 = 1825

When talking about binary, each column is referred to as a bit (so each 0 or 1 is a bit) and each collection of 8 bits is referred to as a byte. A kilobyte or KB is 1024 bytes. A megabyte or MB is 1024 kilobytes or 1048576 bytes. Now that’s a lot of 1’s and 0’s!

Hexadecimal

Binary is a very verbose format so it’s common for binary data to be represented using hexadecimal (hex for short), which is base 16. That means every column can contain one of 16 unique values. As we only have 10 numerical digits, hex uses some letters from the alphabet to give a full set of 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F.

16^3 16^2 16^1 16^0
1    b    3    a

The hex number 1b3a is equivalent to 6970 in decimal.

For the following calculations the alphabetical characters map to their decimal equivalents: a = 10, b = 11, c = 12, d = 13, e = 14 and f = 15

(1 * 16^3) + (11 * 16^2) + (3 * 16^1) + (10 * 16^0)
(1 * 16 * 16 * 16) + (11 * 16 * 16) + (3 * 16) + (10 * 1)
(1 * 4096) + (11 * 256) + (3 * 16) + (10 * 1)
4096 + 2816 + 48 + 10 = 6970

So, how can hex make binary look more attractive? Every 4 bits of a binary number can be represented as 1 column of a hex number! Let’s take 011100100001 from earlier and break it up:

011100100001
0111 0010 0001 (in binary)
7 2 1 (in hex)

So 011100100001 in binary becomes 721 in hex. Another example

1110011011010010
1110 0110 1101 0010 (in binary)
e 6 d 2 (in hex)

So 1110011011010010 in binary becomes e6d2 in hex.
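The 4-bits-per-hex-digit trick translates directly into code. The following is a small sketch of that grouping; the class HexDemo and the method binaryToHex are hypothetical names.

```java
class HexDemo {
    // Converts a binary string to hex by translating each group of 4 bits into one hex digit.
    static String binaryToHex(String bits) {
        // Pad with leading zeros so the bits split evenly into groups of 4.
        StringBuilder padded = new StringBuilder(bits);
        while (padded.length() % 4 != 0) {
            padded.insert(0, '0');
        }
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < padded.length(); i += 4) {
            int group = Integer.parseInt(padded.substring(i, i + 4), 2);
            hex.append(Character.forDigit(group, 16));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(binaryToHex("011100100001"));
        System.out.println(binaryToHex("1110011011010010"));
        // The hex number from the earlier calculation, converted back to decimal.
        System.out.println(Integer.parseInt("1b3a", 16));
    }
}
```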

Endianness

Endianness refers to the order in which the bytes of a multi-byte value are stored, that is, at which end the most significant byte lies. SPARC processors use the big endian storage format, which stores the most significant byte first. Intel processors (the common PC) store data in little endian format, with the least significant byte first.

Java uses big endian format whereas a lot of C code I’ve come across uses little endian. If you need to read something in little endian using Java it’s best to use the NIO classes.

ByteBuffer.wrap( buffer ).order( ByteOrder.LITTLE_ENDIAN ).getLong();

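Decimal (and the above discussions on binary and hexadecimal) is written big endian style, as the most significant digit comes first (on the left-hand side). Strictly speaking endianness is about the order of bytes, but reading the bits of a single byte from either end shows the effect: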

Big Endian     00110110 = 32 + 16 + 4 + 2 = 54
Little Endian  00110110 = 4 + 8 + 32 + 64 = 108
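At the byte level, the same four bytes give two different numbers depending on the order they are read in. The sketch below uses the NIO ByteBuffer mentioned above; the class name EndianDemo and the sample bytes are made up.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class EndianDemo {
    // Interprets the first four bytes of the array as an int in the given byte order.
    static int readInt(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] bytes = {0x12, 0x34, 0x56, 0x78};
        // Big endian: the first byte is the most significant.
        System.out.println(Integer.toHexString(readInt(bytes, ByteOrder.BIG_ENDIAN)));
        // Little endian: the first byte is the least significant.
        System.out.println(Integer.toHexString(readInt(bytes, ByteOrder.LITTLE_ENDIAN)));
    }
}
```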

There is a more thorough description and good C example highlighting the problems with endianess differences located at Wikipedia

Hex Editing

A good Guide to Hex Editing

Reading/writing mixed endian binary files in java

Shameless reproduction of content from http://www.heatonresearch.com/articles/22/page2.html for my own reference.

The BinaryFile class can be seen in BinaryFile.java. To use the BinaryFile class, create a RandomAccessFile object for the file that you would like to work with. This file can be opened for read or write access. Then construct a BinaryFile object, passing in your RandomAccessFile object to the constructor. The following two lines prepare to read/write to a file called “test.dat”.

RandomAccessFile file = new RandomAccessFile("test.dat", "rw");
BinaryFile bin = new BinaryFile(file);

Once this is complete, you can call the various methods provided to access different data types. The methods to access the various data types are prefixed with either read or write and then the type. For example, the method to read a fixed length string is readFixedString. The complete class is shown in Listing 1.

Listing 1: Reading Java Binary Files

import java.io.*;

/**
 * @author Jeff Heaton (http://www.jeffheaton.com)
 * @version 1.0
 */
class BinaryFile
{

  /**
   * Use this constant to specify big-endian integers.
   */
  public static final short BIG_ENDIAN = 1;

  /**
   * Use this constant to specify little-endian integers.
   */
  public static final short LITTLE_ENDIAN = 2;

  /**
   * The underlying file.
   */
  protected RandomAccessFile _file;

  /**
   * Are we in LITTLE_ENDIAN or BIG_ENDIAN mode.
   */
  protected short _endian;

  /**
   * Are we reading signed or unsigned numbers.
   */
  protected boolean _signed;

  /**
   * The constructor.  Use to specify the underlying file.
   *
   * @param f The file to read/write from/to.
   */
  public BinaryFile(RandomAccessFile f)
  {
    _file = f;
    _endian = LITTLE_ENDIAN;
    _signed = false;
  }

  /**
   * Set the endian mode for reading integers.
   *
   * @param i Specify either LITTLE_ENDIAN or BIG_ENDIAN.
   * @exception java.lang.Exception Will be thrown if this method is 
   * not passed either BinaryFile.LITTLE_ENDIAN or BinaryFile.BIG_ENDIAN.
   */
  public void setEndian(short i) throws Exception
  {
    if ((i == BIG_ENDIAN) || (i == LITTLE_ENDIAN))
      _endian = i;
    else
      throw (new Exception(
          "Must be BinaryFile.LITTLE_ENDIAN or BinaryFile.BIG_ENDIAN"));
  }

  /**
   * Returns the endian mode.  Will be either BIG_ENDIAN or LITTLE_ENDIAN.
   *
   * @return BIG_ENDIAN or LITTLE_ENDIAN to specify the current endian mode.
   */
  public int getEndian()
  {
    return _endian;
  }

  /**
   * Sets the signed or unsigned mode for integers.  true for signed, false for unsigned.
   *
   * @param b True if numbers are to be read/written as signed, false if unsigned.
   */
  public void setSigned(boolean b)
  {
    _signed = b;
  }

  /**
   * Returns the signed mode.
   *
   * @return Returns true for signed, false for unsigned.
   */
  public boolean getSigned()
  {
    return _signed;
  }

  /**
   * Reads a fixed length ASCII string.
   *
   * @param length How long of a string to read.
   * @return The string that was read.
   * @exception java.io.IOException If an IO exception occurs.
   */
  public String readFixedString(int length) throws java.io.IOException
  {
    String rtn = "";

    for (int i = 0; i < length; i++)
      rtn += (char) _file.readByte();
    return rtn;
  }

  /**
   * Writes a fixed length ASCII string.  Will truncate the string if it does not fit in the specified buffer.
   *
   * @param str The string to be written.
   * @param length The length of the area to write to.  Should be larger than the length of the string being written.
   * @exception java.io.IOException If an IO exception occurs.
   */
  public void writeFixedString(String str, int length)
      throws java.io.IOException
  {
    int i;

    // trim the string back some if needed

    if (str.length() > length)
      str = str.substring(0, length);

    // write the string

    for (i = 0; i < str.length(); i++)
      _file.write(str.charAt(i));

    // buffer extra space if needed

    i = length - str.length();
    while ((i--) > 0)
      _file.write(0);
  }

  /**
   * Reads a string that stores one length byte before the string.  
   * This string can be up to 255 characters long.  Pascal stores strings this way.
   *
   * @return The string that was read.
   * @exception java.io.IOException If an IO exception occurs.
   */
  public String readLengthPrefixString() throws java.io.IOException
  {
    short len = readUnsignedByte();
    return readFixedString(len);
  }

  /**
   * Writes a string that is prefixed by a single byte that specifies the length of the string.  This is how Pascal usually stores strings.
   *
   * @param str The string to be written.
   * @exception java.io.IOException If an IO exception occurs.
   */
  public void writeLengthPrefixString(String str) throws java.io.IOException
  {
    writeByte((byte) str.length());
    for (int i = 0; i < str.length(); i++)
      _file.write(str.charAt(i));
  }

  /**
   * Reads a fixed length string that is zero(NULL) terminated.  This is a type of string used by C/C++.  For example char str[80].
   *
   * @param length The length of the string.

   * @return The string that was read.
   * @exception java.io.IOException If an IO exception occurs.
   */
  public String readFixedZeroString(int length) throws java.io.IOException
  {
    String rtn = readFixedString(length);
    int i = rtn.indexOf(0);
    if (i != -1)
      rtn = rtn.substring(0, i);
    return rtn;
  }

  /**
   * Writes a fixed length string that is zero terminated.  This is the format generally used by C/C++ for string storage.
   *
   * @param str The string to be written.
   * @param length The length of the buffer to receive the string.
   * @exception java.io.IOException If an IO exception occurs.
   */
  public void writeFixedZeroString(String str, int length)
      throws java.io.IOException
  {
    writeFixedString(str, length);
  }

  /**
   * Reads an unlimited length zero(null) terminated string.
   *
   * @return The string that was read.
   * @exception java.io.IOException If an IO exception occurs.
   */
  public String readZeroString() throws java.io.IOException
  {
    String rtn = "";
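    int ch;

    // read until the terminating zero byte; read() returns -1 at the end
    // of the file, which also stops the loop instead of spinning forever
    while ((ch = _file.read()) > 0)
      rtn += (char) ch;
    return rtn;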
  }

  /**
   * Writes an unlimited zero(NULL) terminated string to the file.
   *
   * @param str The string to be written.
   * @exception java.io.IOException If an IO exception occurs.
   */
  public void writeZeroString(String str) throws java.io.IOException
  {
    for (int i = 0; i < str.length(); i++)
      _file.write(str.charAt(i));
    writeByte((byte) 0);
  }

  /**
   * Internal function used to read an unsigned byte.  External classes should use the readByte function.
   *
   * @return The byte, unsigned, as a short.
   * @exception java.io.IOException If an IO exception occurs.
   */
  protected short readUnsignedByte() throws java.io.IOException
  {
    return (short) (_file.readByte() & 0xff);
  }

  /**
   * Reads an 8-bit byte.  Can be signed or unsigned depending on the signed property.
   *
   * @return A byte stored in a short.
   * @exception java.io.IOException If an IO exception occurs.
   */
  public short readByte() throws java.io.IOException
  {
    if (_signed)
      return (short) _file.readByte();
    else
      return (short) _file.readUnsignedByte();
  }

  /**
   * Writes a single byte to the file.
   *
   * @param b The byte to be written.
   * @exception java.io.IOException If an IO exception occurs.
   */
  public void writeByte(short b) throws java.io.IOException
  {
    _file.write(b & 0xff);
  }

  /**
   * Reads a 16-bit word.  Can be signed or unsigned depending on the signed property.  
   * Can be little or big endian depending on the endian property.
   *
   * @return A word stored in an int.
   * @exception java.io.IOException If an IO exception occurs.
   */
  public int readWord() throws java.io.IOException
  {
    short a, b;
    int result;

    a = readUnsignedByte();
    b = readUnsignedByte();

    if (_endian == BIG_ENDIAN)
      result = ((a << 8) | b);
    else
      result = (a | (b << 8));

    if (_signed)
      if ((result & 0x8000) == 0x8000)
        result = -(0x10000 - result);

    return result;
  }

  /**
   * Write a word to the file.
   *
   * @param w The word to be written to the file.
   * @exception java.io.IOException If an IO exception occurs.
   */

  public void writeWord(int w) throws java.io.IOException
  {
    if (_endian == BIG_ENDIAN)
    {
      _file.write((w & 0xff00) >> 8);
      _file.write(w & 0xff);
    } else
    {
      _file.write(w & 0xff);
      _file.write((w & 0xff00) >> 8);
    }
  }

  /**
   * Reads a 32-bit double word.  Can be signed or unsigned 
   * depending on the signed property.  Can be little or big endian depending on the endian property.
   *
   * @return A double word stored in a long.
   * @exception java.io.IOException If an IO exception occurs.
   */
  public long readDWord() throws java.io.IOException
  {
    short a, b, c, d;
    long result;

    a = readUnsignedByte();
    b = readUnsignedByte();
    c = readUnsignedByte();
    d = readUnsignedByte();

    // Widen the top byte to long before shifting so bit 31 is not sign-extended.
    if (_endian == BIG_ENDIAN)
      result = (((long) a << 24) | (b << 16) | (c << 8) | d);
    else
      result = (a | (b << 8) | (c << 16) | ((long) d << 24));

    if (_signed)
      if ((result & 0x80000000L) == 0x80000000L)
        result = -(0x100000000L - result);

    return result;
  }

  /**
   * Writes a double word to the file.
   *
   * @param d The double word to be written to the file.
   * @exception java.io.IOException If an IO exception occurs.
   */
  public void writeDWord(long d) throws java.io.IOException
  {
    if (_endian == BIG_ENDIAN)
    {
      _file.write((int) (d & 0xff000000) >> 24);
      _file.write((int) (d & 0xff0000) >> 16);
      _file.write((int) (d & 0xff00) >> 8);
      _file.write((int) (d & 0xff));
    } else
    {
      _file.write((int) (d & 0xff));
      _file.write((int) (d & 0xff00) >> 8);
      _file.write((int) (d & 0xff0000) >> 16);
      _file.write((int) (d & 0xff000000) >> 24);
    }
  }

  /**
   * Aligns the file pointer to the specified byte boundary.
   * For example, if 4 (a double word) is specified, the file pointer is
   * moved forward to the next double word boundary.
   *
   * @param a The byte-boundary to align to.
   * @exception java.io.IOException If an IO exception occurs.
   */
  public void align(int a) throws java.io.IOException
  {
    if ((_file.getFilePointer() % a) > 0)
    {
      long pos = _file.getFilePointer() / a;
      _file.seek((pos + 1) * a);
    }
  }
}
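
The word decoding in readWord can be cross-checked against the standard library. This is only a sketch: EndianCheck and its two methods are hypothetical names of mine, not part of the class above, and the bytes are passed in directly rather than read from a file.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of the class above.
final class EndianCheck {
    /** Combines two unsigned byte values (0..255) the same way readWord does. */
    static int word(int a, int b, boolean bigEndian, boolean signed) {
        int result = bigEndian ? ((a << 8) | b) : (a | (b << 8));
        if (signed && (result & 0x8000) != 0) {
            result -= 0x10000; // same value as -(0x10000 - result)
        }
        return result;
    }

    /** The same decode done by java.nio.ByteBuffer, for comparison. */
    static int wordViaByteBuffer(int a, int b, boolean bigEndian, boolean signed) {
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {(byte) a, (byte) b});
        buf.order(bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
        short s = buf.getShort();
        return signed ? s : (s & 0xffff);
    }
}
```

For any pair of byte values both methods should return the same number, which makes this a cheap sanity test for the hand-rolled shifts.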

Quake 3 network protocol

This document describes the network protocol that Quake 3 uses to converse with clients and the outside world (query servers). It is currently more of a work in progress.

I recently (August 2012) added a blog entry with a revised version of my protocol 43 proxy server in case anyone finds it useful to continue experimenting.

Query

Querying a server is very simple: send a connectionless (UDP) packet containing the 4 OOB (out-of-band) header bytes (0xff each) followed by the text string getstatus. Many sites already describe this thoroughly, so I won’t go into details.
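
A minimal sketch of building that query packet. OobPacket and build are names I made up for illustration. Sending the bytes is left out, since the server address and port depend on your setup.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration only.
final class OobPacket {
    /** Builds an out-of-band packet: four 0xff bytes followed by the ASCII command. */
    static byte[] build(String command) {
        byte[] text = command.getBytes(StandardCharsets.US_ASCII);
        byte[] packet = new byte[4 + text.length];
        Arrays.fill(packet, 0, 4, (byte) 0xff);            // OOB header
        System.arraycopy(text, 0, packet, 4, text.length); // command text
        return packet;
    }
}
```

OobPacket.build("getstatus") gives a 13-byte array that can be wrapped in a java.net.DatagramPacket and sent with a java.net.DatagramSocket.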

Game Protocol 68 – used by 1.32

All game packets are connectionless (UDP), but there is still a handshaking process which must occur before you are allowed to join the server.

The client sends a challenge request (sometimes you need to send multiple requests before the server will respond).

+------------+----------------+
| Header     | Content        |
+------------+----------------+
| 0xffffffff | getchallenge   |
+------------+----------------+

If the server is able to accept more connections it will reply with:

+------------+------------------------+
| Header     | Content                |
+------------+------------------------+
| 0xffffffff | challengeResponse <ID> |
+------------+------------------------+
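
A sketch of pulling the <ID> out of that reply. It assumes the ID is a decimal integer that fits in an int and that any extra tokens after it can be ignored; ChallengeReply is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.util.OptionalInt;

// Hypothetical helper for illustration only.
final class ChallengeReply {
    private static final String COMMAND = "challengeResponse";

    /** Returns the challenge ID, or empty if the datagram is not a challengeResponse. */
    static OptionalInt parse(byte[] data, int length) {
        if (length < 4) {
            return OptionalInt.empty();
        }
        for (int i = 0; i < 4; i++) {
            if (data[i] != (byte) 0xff) {
                return OptionalInt.empty(); // not an OOB packet
            }
        }
        // trim() also drops a trailing NUL or newline if the server sends one
        String text = new String(data, 4, length - 4, StandardCharsets.US_ASCII).trim();
        String[] tokens = text.split("\\s+");
        if (tokens.length < 2 || !tokens[0].equals(COMMAND)) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(tokens[1]));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }
}
```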

Once the client has the <ID> it can send a connect request. Note that in protocol 68 the connection string (<CS>) is Huffman compressed and is NOT clear text as shown below; protocol 43 sends the plain-text version.

+------------+----------------+
| Header     | Content        |
+------------+----------------+
| 0xffffffff | connect "<CS>" |
+------------+----------------+

<CS> represents a connection string containing the player details, e.g.
\cg_predictItems\1\sex\male\handicap\100\color\3\snaps\40\rate\10000\model\doom/red\name\UnnamedPlayer\protocol\68\qport\<PORT>\challenge\<ID>
<PORT> represents the local port used to send this packet
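
A sketch of assembling the clear-text connect packet, i.e. the protocol 43 form; the Huffman step that protocol 68 needs is not attempted here. ConnectRequest, infoString and build are hypothetical names, and the player keys are whatever you choose to pass in.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
final class ConnectRequest {
    /** Joins key/value pairs into the backslash-delimited form shown above. */
    static String infoString(Map<String, String> pairs) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : pairs.entrySet()) {
            sb.append('\\').append(e.getKey()).append('\\').append(e.getValue());
        }
        return sb.toString();
    }

    /** Builds the OOB header followed by: connect "<CS>" (clear text, no compression). */
    static byte[] build(Map<String, String> player, int protocol, int qport, int challenge) {
        Map<String, String> pairs = new LinkedHashMap<>(player); // keeps insertion order
        pairs.put("protocol", Integer.toString(protocol));
        pairs.put("qport", Integer.toString(qport));
        pairs.put("challenge", Integer.toString(challenge));
        byte[] body = ("connect \"" + infoString(pairs) + "\"")
                .getBytes(StandardCharsets.US_ASCII);
        byte[] packet = new byte[4 + body.length];
        Arrays.fill(packet, 0, 4, (byte) 0xff);
        System.arraycopy(body, 0, packet, 4, body.length);
        return packet;
    }
}
```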

If the connect is successful the server will reply with the following:

+------------+-----------------+
| Header     | Content         |
+------------+-----------------+
| 0xffffffff | connectResponse |
+------------+-----------------+

The server will now place you in the CNCT (connecting) state and start sending you game updates.

This is where the communication gets substantially more difficult, so be warned: what follows may be incomplete and perhaps incorrect, although I hope not!

Client to Server

+-------------------------+-----+--------------+------------------------------------------------+
| NAME                    | LEN | ENCODING     | COMMENT                                        |
+-------------------------+-----+--------------+------------------------------------------------+
| sequenceNumber          | 32  | None         | MSG_ReadLong                                   |
| qport                   | 16  | None         | MSG_ReadShort                                  |
| serverId                | 32  | Huff (1)     | MSG_ReadLong                                   |
| messageAcknowledge      | 32  | Huff (1)     | MSG_ReadLong                                   |
| reliableAcknowledge     | 32  | Huff (1)     | MSG_ReadLong                                   |
+-------------------------+-----+--------------+------------------------------------------------+
| clientCommand           | 8   | (1), XOR (2) | MSG_ReadByte                                   |
| ...                     |     |              |                                                |
+-------------------------+-----+--------------+------------------------------------------------+
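
A sketch of reading the two unencoded fields at the front of a client packet. It assumes both are little-endian, which is my reading rather than something these notes confirm. The Huffman-coded fields that follow are not handled, and ClientPacketHeader is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only.
final class ClientPacketHeader {
    final int sequenceNumber; // 32 bits, unencoded
    final int qport;          // 16 bits, unencoded, kept as 0..65535

    private ClientPacketHeader(int sequenceNumber, int qport) {
        this.sequenceNumber = sequenceNumber;
        this.qport = qport;
    }

    /** Reads the first 6 bytes of a client packet; little-endian is an assumption. */
    static ClientPacketHeader parse(byte[] data) {
        if (data.length < 6) {
            throw new IllegalArgumentException("need at least 6 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int sequenceNumber = buf.getInt();
        int qport = buf.getShort() & 0xffff;
        return new ClientPacketHeader(sequenceNumber, qport);
    }
}
```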

Client commands

0 - clc_bad
1 - clc_nop
2 - clc_move
3 - clc_moveNoDelta
4 - clc_clientCommand
5 - clc_EOF

Server to Client

+-------------------------+-----+--------------+------------------------------------------------+
| NAME                    | LEN | ENCODING     | COMMENT                                        |
+-------------------------+-----+--------------+------------------------------------------------+
| sequenceNumber          | 32  | None         | MSG_ReadLong                                   |
| reliableAcknowledge     | 32  | Huff (1)     | MSG_ReadLong                                   |
+-------------------------+-----+--------------+------------------------------------------------+
| serverCommand           | 8   | (1), XOR (3) | MSG_ReadByte                                   |
| ...                     |     |              |                                                |
+-------------------------+-----+--------------+------------------------------------------------+

Server Commands

0 - svc_bad
1 - svc_nop
2 - svc_gamestate
3 - svc_configstring
4 - svc_baseline
5 - svc_serverCommand
6 - svc_download
7 - svc_snapshot
8 - svc_EOF

Details

(1) – Huffman compression using a predefined set of nodes to further reduce message length (detailed below).

int msg_hData[256] = {
250315,// 0
41193,// 1
6292,// 2
7106,// 3
3730,// 4
3750,// 5
6110,// 6
23283,// 7
33317,// 8
6950,// 9
7838,// 10
9714,// 11
9257,// 12
17259,// 13
3949,// 14
1778,// 15
8288,// 16
1604,// 17
1590,// 18
1663,// 19
1100,// 20
1213,// 21
1238,// 22
1134,// 23
1749,// 24
1059,// 25
1246,// 26
1149,// 27
1273,// 28
4486,// 29
2805,// 30
3472,// 31
21819,// 32
1159,// 33
1670,// 34
1066,// 35
1043,// 36
1012,// 37
1053,// 38
1070,// 39
1726,// 40
888,// 41
1180,// 42
850,// 43
960,// 44
780,// 45
1752,// 46
3296,// 47
10630,// 48
4514,// 49
5881,// 50
2685,// 51
4650,// 52
3837,// 53
2093,// 54
1867,// 55
2584,// 56
1949,// 57
1972,// 58
940,// 59
1134,// 60
1788,// 61
1670,// 62
1206,// 63
5719,// 64
6128,// 65
7222,// 66
6654,// 67
3710,// 68
3795,// 69
1492,// 70
1524,// 71
2215,// 72
1140,// 73
1355,// 74
971,// 75
2180,// 76
1248,// 77
1328,// 78
1195,// 79
1770,// 80
1078,// 81
1264,// 82
1266,// 83
1168,// 84
965,// 85
1155,// 86
1186,// 87
1347,// 88
1228,// 89
1529,// 90
1600,// 91
2617,// 92
2048,// 93
2546,// 94
3275,// 95
2410,// 96
3585,// 97
2504,// 98
2800,// 99
2675,// 100
6146,// 101
3663,// 102
2840,// 103
14253,// 104
3164,// 105
2221,// 106
1687,// 107
3208,// 108
2739,// 109
3512,// 110
4796,// 111
4091,// 112
3515,// 113
5288,// 114
4016,// 115
7937,// 116
6031,// 117
5360,// 118
3924,// 119
4892,// 120
3743,// 121
4566,// 122
4807,// 123
5852,// 124
6400,// 125
6225,// 126
8291,// 127
23243,// 128
7838,// 129
7073,// 130
8935,// 131
5437,// 132
4483,// 133
3641,// 134
5256,// 135
5312,// 136
5328,// 137
5370,// 138
3492,// 139
2458,// 140
1694,// 141
1821,// 142
2121,// 143
1916,// 144
1149,// 145
1516,// 146
1367,// 147
1236,// 148
1029,// 149
1258,// 150
1104,// 151
1245,// 152
1006,// 153
1149,// 154
1025,// 155
1241,// 156
952,// 157
1287,// 158
997,// 159
1713,// 160
1009,// 161
1187,// 162
879,// 163
1099,// 164
929,// 165
1078,// 166
951,// 167
1656,// 168
930,// 169
1153,// 170
1030,// 171
1262,// 172
1062,// 173
1214,// 174
1060,// 175
1621,// 176
930,// 177
1106,// 178
912,// 179
1034,// 180
892,// 181
1158,// 182
990,// 183
1175,// 184
850,// 185
1121,// 186
903,// 187
1087,// 188
920,// 189
1144,// 190
1056,// 191
3462,// 192
2240,// 193
4397,// 194
12136,// 195
7758,// 196
1345,// 197
1307,// 198
3278,// 199
1950,// 200
886,// 201
1023,// 202
1112,// 203
1077,// 204
1042,// 205
1061,// 206
1071,// 207
1484,// 208
1001,// 209
1096,// 210
915,// 211
1052,// 212
995,// 213
1070,// 214
876,// 215
1111,// 216
851,// 217
1059,// 218
805,// 219
1112,// 220
923,// 221
1103,// 222
817,// 223
1899,// 224
1872,// 225
976,// 226
841,// 227
1127,// 228
956,// 229
1159,// 230
950,// 231
7791,// 232
954,// 233
1289,// 234
933,// 235
1127,// 236
3207,// 237
1020,// 238
927,// 239
1355,// 240
768,// 241
1040,// 242
745,// 243
952,// 244
805,// 245
1073,// 246
740,// 247
1013,// 248
805,// 249
1008,// 250
796,// 251
996,// 252
1057,// 253
11457,// 254
13504,// 255
};

(2) – XOR algorithm used by the server to decode the message content.

#define CL_ENCODE_START 12
byte key, *string;
int i, index;

string = (byte *)clc.serverCommands[ reliableAcknowledge & (MAX_RELIABLE_COMMANDS-1) ];
index = 0;
//
key = clc.challenge ^ serverId ^ messageAcknowledge;
for (i = CL_ENCODE_START; i < msg->cursize; i++) {
    // modify the key with the last received now acknowledged server command
    if (!string[index]) {
        index = 0;
    }

    if (string[index] > 127 || string[index] == '%') {
        key ^= '.' << (i & 1);
    } else {
        key ^= string[index] << (i & 1);
    }
    index++;
    // encode the data with this key
    *(msg->data + i) = (*(msg->data + i)) ^ key;
}
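
The same loop transliterated into Java, as a sketch only. XorCodec and apply are hypothetical names. The command is passed as a byte array without a NUL terminator, so the wrap-around test uses the array length instead. Because the key stream never depends on the data being encoded, applying the routine a second time with the same inputs restores the original bytes, which is presumably how the receiving side decodes.

```java
// Hypothetical port of the C fragment above, for illustration only.
final class XorCodec {
    static final int ENCODE_START = 12;

    /** XORs data[ENCODE_START..length) in place with the rolling key. */
    static void apply(byte[] data, int length, int challenge, int serverId,
                      int messageAcknowledge, byte[] command) {
        // the C key is a single byte, so keep only the low 8 bits
        int key = (challenge ^ serverId ^ messageAcknowledge) & 0xff;
        int index = 0;
        for (int i = ENCODE_START; i < length; i++) {
            // wrap to the start of the command, as the C code does at the terminator
            if (index >= command.length) {
                index = 0;
            }
            int c = command.length == 0 ? 0 : (command[index] & 0xff);
            if (c > 127 || c == '%') {
                c = '.';
            }
            key = (key ^ (c << (i & 1))) & 0xff;
            index++;
            data[i] ^= (byte) key;
        }
    }
}
```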

(3) – XOR algorithm used by the client to decode the message content.

Notes

Data appears to have mixed endianness – is this right?

Packet fragmentation is detected by testing sequenceNumber & FRAGMENT_BIT (where FRAGMENT_BIT is 1<<31). If the packet is fragmented, clear the bit to 0 to recover the real sequence number.
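
The fragment check described above, as a small sketch. FragmentBit and its method names are hypothetical; the sequence number is assumed to have been read into a 32-bit int already.

```java
// Hypothetical helper for illustration only.
final class FragmentBit {
    static final int FRAGMENT_BIT = 1 << 31;

    /** True when the top bit of the sequence number marks a fragment. */
    static boolean isFragmented(int sequenceNumber) {
        return (sequenceNumber & FRAGMENT_BIT) != 0;
    }

    /** Clears the fragment bit, leaving the real sequence number. */
    static int clearFragmentBit(int sequenceNumber) {
        return sequenceNumber & ~FRAGMENT_BIT;
    }
}
```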

cl_shownet 1 – MSG_SIZE
cl_shownet 2|3 – READ_COUNTCMD
showpackets 1 – WHO recv MSG_SIZE : s=SEQ_NO (optional fragment info)

Acknowledgements