You are here

January 2009

Error message

Deprecated function: Function create_function() is deprecated in GeSHi->_optimize_regexp_list_tokens_to_string() (line 4716 of /home/web/ithiriel.com/www/web/sites/all/libraries/geshi/geshi.php).

Implementations: Introduction

While trying to decide how to approach a given project a few weeks ago, I had the choice of using one of several programming languages. This prompted me to look around for ideas of what other people had said comparing languages.

In my opinion, the best comparison I found was David Howard's essay C++ vs Java vs Python vs Ruby : a first impression. While mostly devoid of bias, what really won me over was the ability to compare the implementations of his example matter, a "Red-Black tree algorithm."

Fortunately, for me, the project I was working on did not need anything spectacular for a user interface or deployment. However, most "real" projects are limited by some requirement based around usage or target audience. Writing custom web applications is usually limited by the abilities of the target webserver. Writing applications for deployment on desktop machines is limited by the support for the target OS and its GUI in the programming language or libraries. Writing applications for deployment on a security-audited machine can require the application only being able to use what is already installed.

Since I do not plan to be limited to a particular platform, application, etc. in the future, I find it behooves me to learn how to implement tasks (some "real-world", some "toy") in various languages (and, in some instances, various libraries within the languages). Each implementation will have its own article and will be written with my thoughts as I go. (Or that's the intention anyway.)

By posting what I come up with publicly, I hope to help others looking to fulfill these tasks and I hope to get people more knowledgeable than I to comment on what I've done wrong.

XML Parser: Statement of Problem

Background

World of Warcraft is an enormously popular MMORPG. Recently, Blizzard Entertainment announced that they had surpassed 11.5 million subscribers worldwide.

One of the many services offered by Blizzard Entertainment is the Armory which allows players to view their characters and others online. Items are also searchable through a recent update. All of the dynamic content (characters and items) are served as XML content which is then passed through XSL stylesheets to present the HTML content. This allows those who have the interest to query the Armory for the XML representation of an item or a character and then parse it locally.

Problem

Write an XML parser that processes an item's XML data and generates a data representation of that item.

Implementation Details

The XML data is stored in files with one paragraph per file.

The parser function should accept a single parameter, a filename, and return the expected object.

Elements that contain data as references not in the XML paragraphs can be left in the format present in the elements. The damageData element and the current attribute of the durability element may be ignored.

Implementations do not need to be object-oriented.

Discussion

Four items have been randomly selected as test data:

  • 41660: Deadly Gladiator's Dragonhide Robes
  • 40503: Valorous Cryptstalker Tunic
  • 40526: Gown of the Spell-Weaver
  • 44006: Obsidian Greathelm

This data is by no means representative of what would need to be done to fully implement a parser for all equippable items in World of Warcraft. However, for the purpose of this exercise, it will suffice.

Based on the XML paragraphs, a generated item should have the following basic structure (expressed in "pseudo-Java"):

  1. class Item {
  2.     int id;                     /* id */
  3.     String name;                /* name */
  4.     String icon;                /* icon */
  5.     int quality;                /* overallQualityId */
  6.  
  7.     int bonding;                /* bonding */
  8.     int classId;                /* classId */
  9.     int equipInventoryType;     /* equipData/inventoryType */
  10.     String equipSubclass;       /* equipData/subclassName */
  11.  
  12.     int strength;               /* bonusStrength */
  13.     int agility;                /* bonusAgility */
  14.     int stamina;                /* bonusStamina */
  15.     int intellect;              /* bonusIntellect */
  16.     int spirit;                 /* bonusSpirit */
  17.     int armor;                  /* armor */
  18.     int addedArmor;             /* armor[armorBonus] */
  19.     String[] sockets;           /* one per socketData/socket */
  20.  
  21.     int maxDurability;          /* durability[max] */
  22.     String[] allowedClasses;    /* one per allowableClasses/class */
  23.     int requiredLevel;          /* requiredLevel */
  24.     int attackPower;            /* bonusAttackPower */
  25.     int critRating;             /* bonusCritRating */
  26.     int expertiseRating;        /* bonusExpertiseRating */
  27.     int hasteRating;            /* bonusHasteRating */
  28.     int hitRating;              /* bonusHitRating */
  29.     int spellpower;             /* bonusSpellPower */
  30.     int resilience;             /* bonusResilience */
  31.  
  32.     Spells[] spells;            /* one per spellData/spell */
  33.  
  34.     ItemSet itemSet;            /* setData */
  35.     ItemSource source;          /* source */
  36. }
  37.  
  38. class Spell {
  39.     int trigger;                /* trigger */
  40.     String desc;                /* desc */
  41. }
  42.  
  43. class ItemSet {
  44.     String name;                /* name */
  45.     String[] items;             /* one per item */
  46.     Map<int, String> bonuses;   /* setBonus[ threshold ], setBonus[ desc ] */
  47. }
  48.  
  49.  
  50. class ItemSource {
  51.     String sourceType;          /* [value] */
  52.  
  53.     /* These are only defined when sourceType == "sourceType.creatureDrop" */
  54.     int areaId;                 /* [areaId] */
  55.     String areaName;            /* [areaName] */
  56.     int creatureId;             /* [creatureId] */
  57.     String creatureName;        /* [creatureName] */
  58.     String difficulty;          /* [difficulty] */
  59.     int dropRate;               /* [dropRate] */
  60. }

If not specified by the XML data:

  • All integer values should return 0.
  • All strings should return null or the appropriate language equivalent thereof.
  • All arrays should be empty.

The returned data representations should match the above access terms unless language convention dictates otherwise. (For example, in a Java implementation, the methods would use the standard get... names, e.g. getStrength() in place of strength.)

Edits:

  1. Clarified how the XML is stored. (2009/01/14)
  2. Removed object-oriented-specific terminology. (2009/01/14)
  3. Changed requirement on the value of unspecified arrays. (2009/01/14)
  4. Changed requirement on the durability element. (2009/01/25)

XML Parser: Rendering the Item in Ruby

The first language to be used for XML parsing is Ruby. Ruby appears to have three main XML libraries: REXML, Hpricot, and libxml-ruby. As time allows, each will be tested.

Before starting the testing, several things need to be done. First, there needs to be a representation of the Item class in Ruby. From the pseudocode initially shown in the problem statement, I derive the Ruby representation of the Item class and its helper classes.

Since no limitations have been set for accessing or modifying the object's data, all fields are implemented with attr_accessor. Unfortunately, since attr_accessor does not have the ability to set default values, the variable initialization needs to be done in the constructor.

One notable difference between the pseudocode and the Ruby version is the casing. While the pseudocode uses "camel case" variable names as standard for Java, the Ruby version uses "underscore" variable names as per the conventions of the community. This creates small differences like critRating vs. crit_rating. There is an ongoing debate as to which is easier to read.

This represents the bare minimum thus far needed to implement the Item object. When discussing testing, other methods will be added to the Item class and helper classes.

How I Got Started Programming

It's an old meme but it still offers insight, I think.

How old were you when you started programming?

Eight.

Before then, I had entered in programs from magazines (they had such things back in the early 1980's) but I had not actually endeavored to try to create something of my own. What I did create was a colossal din of ugly graphics and noise but it was something I actually created.

How did you get started programming

I grew up around computers. We had several TI-99/4A computers.

My father was a programmer as well but, to this day, I'm not sure if he did it as part of his day job or if he did it as a hobby. I think it was mostly his influence that kept me working with computers and programming. I don't know if that was deliberate or not.

What was your first language?

BASIC.

What was the first real program you wrote?

This depends on your definition of "real."

My first "serious" program that I remember was a program for playing Yahtzee in TI BASIC.

Discounting that, the next would be a BattleMech "database" project I did in Turbo Pascal 5.0. Exhibiting a lot of bad programming practices, it at least worked for what I needed it to do. However, there were issues and I was planning on rewriting it from scratch. I never quite got there.

What languages have you used since you started programming?

Trying to be chronologically accurate: BASIC, Pascal, C, sh, C++, Perl, Java, Python, SQL, PHP, VBScript, Cold Fusion, Ruby, Lua

Points of contention are:
Is LOGO a language?
Do different dialects of BASIC qualify as different languages?

What was your first professional programming gig?

I did a very small amount of Java during my internship with IBM.

If there is one thing you learned along the way that you would tell new developers, what would it be?

Always ask "Why?" If someone tells you that x is a best practice, find out why it is before blindly using it.

What's the most fun you've ever had programming?

Rewriting the dice rolling routine for my modular Python IRC bot. Different games have different dice rolling requirements so I ended up making the dice routine itself modular too. The first draft of that rewrite was done on notebook paper early one morning when I didn't have computer access.

Project Euler

I discovered Project Euler yesterday via a post on the Ruby-Talk mailing list. If you're the sort of person who enjoys puzzles and doesn't mind math, I suggest that you try out the problems for yourself.

I've been working through the problems slowly and in numerical order. While most of my solutions have been "brute force," it has given me an opportunity to learn how to do things in Ruby.

Why slowly? Because I haven't had a lot of time.

Why in numerical order? Later problems build on earlier ones. As I reach a problem that requires a given function or routine, I can go "Hey, I just did that in problem #x" and reuse it.

Why Ruby? Ruby was my "language to learn" last year but I neglected it so I'm hoping I can get more done before I work on this year's language (which will probably be Erlang or Scheme).

A lot of what I've learned to do is quick one-liners with arrays, much like what Drew Olson posted almost two years ago. (I love reinventing the wheel, don't you?) And then there have been neat numerical tricks.

For example, to find out if a number is prime with a somewhat naive algorithm:

  1. class Fixnum
  2.   def prime?
  3.     return false if self < 2
  4.     return false if self == 2
  5.     return false if self % 2 == 0
  6.  
  7.     i = 3
  8.     bound = Math.sqrt( self )
  9.     while i <= bound and i < self
  10.       return false if self % i == 0
  11.       i += 2
  12.     end
  13.  
  14.     true
  15.   end
  16. end

Monkeypatching this into Fixnum is probably not the greatest idea to some people but it allows use of syntax like 31.prime? (useful for testing) and ( n + 1 ).prime?, which is theoretically useful. For example, to find if the Mersenne number for a given power of two is prime, you could write: ( 2 ** n - 1 ).prime?

And, most importantly, I find n.prime? to read better than prime?( n ). Readability always trumps cleverness. Luckily, here, it's both readable and clever.

If you want to learn how to use a language better, I suggest you use it for the Project Euler problems. After a while, you'll find ways to do things you've probably not thought of before (or, possibly, ever needed to do before). And for Ruby fans, you can find a use for Array#each_cons.