


xml_to_json()



#1 Juju


Posted 10 March 2016 - 10:14 AM

xml_to_json()

 

A no-dependency tool to decode XML data, for GM:S v1.4.1567 and later (some earlier versions may work).

 

Direct link to script

 


 

  • Title:  xml_to_json()
  • Description:  A no-dependency tool to decode XML data, for GM:S v1.4.1567 and later
  • GM Version:  GM:Studio (v1.4.1567)
  • Registered: no
  • File Type: text file of GML script
  • File Size:  <30 kB
  • File Link:  Direct link to script
  • Required Extension: no
  • Required DLL: no

 


 

Key Points

 

  1. Loads basic XML into GM’s JSON data structure format

  2. Great for content management, localisation and modding support

  3. One-shot script with no dependencies, requires no initialisation

  4. Automatically detects lists of data

  5. Differentiates child types (ds_map vs. ds_list) by using non-integer ds indexes

  6. Nodes are ds_maps, children are nested ds_maps

  7. Attributes are keys with an @ prefix

 

 

 

Recommended reading:

  • W3Schools’ introduction to XML
  • GameMaker: Studio’s buffer functions
  • GameMaker: Studio’s JSON-related functions

 

XML is a commonly used, flexible, branching data storage format found in web, game and business/industrial development. It is by no means a concise format, but its broad range of uses and adaptability have made it a staple of human-readable data storage. A grasp of XML is invaluable not only for basic content creation (for example, weapon parameters or a shopkeeper’s inventory) but also for localisation/translation and, excitingly, mod support for GameMaker games. XML is also occasionally returned by RESTful APIs in preference to JSON.

 

XML has been approached in the past within the GameMaker world. The majority of implementations are now years old and/or rely on deprecated functions. The only major remaining XML library, gmXML, requires a small amount of porting to work on modern versions of GM:S. This isn’t too tricky for most experienced GM devs. Unfortunately, gmXML is extremely heavy - it has a myriad of detailed functions for the traversal and editing of XML files. Whilst this might be attractive from a technical point of view, the vast majority of projects import XML files as content rather than manipulating them, so the clever database-like functions are unnecessary. However, the biggest flaw with gmXML is that it is object-based, something that limits its application significantly due to the large overhead associated with objects in GM.

 

This script is a pragmatic tool designed to turn XML into a JSON-styled data structure. It broadly obeys the rules of JSON in GameMaker, that is, it returns a tree of ds_map and ds_list structures nested inside each other. Indeed, the nested data structure that this script creates can be directly exported as JSON using the native json_encode() function. Traversing this data structure is very similar to traversing JSON. However, unlike code that reads JSON with a known layout, code reading the output of this script regularly finds itself not knowing whether the value in a key:value pair is a string, a nested ds_list or a nested ds_map. The script solves this problem using a simple method, described further down. This script seeks to implement basic XML-to-JSON functionality. It does not have error checking, it does not support XML format files, and it does not seek to emulate DOM or DOM nomenclature (though it bears a resemblance to it). This script is not universal but, equally, it does not need to be.
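
A minimal usage sketch of the above, written for GM:S 1.4. It assumes the script returns the id of the root ds_map and that nested structures are attached in the usual JSON-compatible way; the file name "weapons.xml" and the variable names are made up for illustration.

```gml
/// Hypothetical usage; "weapons.xml" and the variable names are illustrative.
var root = xml_to_json( "weapons.xml" );   // assumed to return the root ds_map

// The tree is made of ds_maps and ds_lists, so the native encoder
// can turn it into a JSON string for inspection.
var json_string = json_encode( root );
show_debug_message( json_string );

// Assumption: children were attached with ds_map_add_map / ds_map_add_list,
// in which case destroying the root also frees the nested structures.
ds_map_destroy( root );
```

Dumping the encoded string to the console is a quick way to see how a given XML file has been laid out before writing code that traverses it.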

 

 

 

The script takes one or two arguments: the data source and, optionally, a “verbose” setting. The verbose setting provides extra output to the Compile Form, using show_debug_message, to help with tracking bugs. The data source can be one of two types: a buffer or an external file. It’s worth mentioning at this point that a text file is, in reality, just a buffer. A standard ASCII file is a string of bytes (uint8 a.k.a. buffer_u8 in GM) that represent letters, numbers, symbols and so on. The numerical representation of the letter “A”, for example, is 65. We can find the numerical value of any character with GM’s function ord(). If a file path is passed to the script then the file is loaded into a buffer (which is unloaded at the end of the script) ready for processing.
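
To make the "a text file is just a buffer" point concrete, here is a rough sketch of walking a file one byte at a time with GM's buffer functions. It is not lifted from the script; "example.xml" is a hypothetical file name and the loop only reports where tags open.

```gml
/// Sketch only: read a text file byte by byte.
/// "example.xml" is a hypothetical file name.
var buffer = buffer_load( "example.xml" );
var size = buffer_get_size( buffer );

buffer_seek( buffer, buffer_seek_start, 0 );
for( var i = 0; i < size; i++ ) {
    var value = buffer_read( buffer, buffer_u8 );   // one byte, 0-255
    if ( value == ord( "<" ) ) {
        show_debug_message( "open tag at byte " + string( i ) );
    }
}

buffer_delete( buffer );   // unload the buffer once processing is done
```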

 

The script is formed around a loop that iterates over each character in the file. Depending on what character is read from the buffer, and depending on what characters have come before it, the script makes decisions on how to build its data structures. This is achieved using a number of different states. If a state-changing symbol is read (opening a tag, starting a string, assigning an attribute etc.) then suitable actions are performed, typically adding cache data to the JSON. If a non-special symbol is read then that symbol is appended as a character to the string cache.

 

This is best demonstrated with a simple piece of XML:

<test attrib="example" />

The first character is an “open tag” symbol, so the script creates a new ds_map ready for data assignment. In the XML format, a name follows a new tag when creating a block; characters are cached until the first space. This means “test” is cached. Once the script reads a space, it knows that the name has finished and copies the cache across as “_GMname : test” in the ds_map already created. Whilst opening a tag creates a ds_map, a node isn’t added to a parent until it is given a name.

 

Any data that's left inside the tag is an attribute. “attrib” from the XML source is cached next. When the script reads an equals sign (and is inside a tag), it transfers the cache to keyName in preparation. Upon reading a quotation mark ("), the script sets the insideString state to true. Whilst inside a string, all data is considered to be text and is cached as such until another quotation mark is seen. As a result, “example” is cached. When the next space is read, the script adds “@attrib : example” to the ds_map. Note that attributes have the @ symbol prefix to avoid collisions.

 

The “/” symbol acts as a terminator in XML; the script sets the state terminating to true which tells the script to close the block at the next close tag symbol. The next symbol is indeed “>”, “close tag”, and the script sets insideTag to false. When a block is terminated (in this case it is created and terminated in the same tag), the script navigates up a layer ready for further input.
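
The walkthrough above can be condensed into a simplified sketch. This is not the script's actual code: it reads from a string rather than a buffer, copes only with a single self-closing tag, and borrows the state names (insideTag, insideString, terminating, keyName) from the description. The single-quoted string literals rely on GM:S 1.x syntax.

```gml
/// Simplified sketch of the state handling; not the script's actual code.
var source = '<test attrib="example" />';   // single quotes: GM:S 1.x syntax
var node = -1;
var cache = "";
var keyName = "";
var insideTag = false;
var insideString = false;
var terminating = false;

for( var i = 1; i <= string_length( source ); i++ ) {
    var symbol = string_char_at( source, i );

    if ( insideString ) {
        // Inside quotes everything is text until the closing quotation mark.
        if ( symbol == '"' ) {
            insideString = false;
        } else {
            cache += symbol;
        }
    } else if ( symbol == "<" ) {
        node = ds_map_create();   // open tag: new node
        insideTag = true;
    } else if ( symbol == '"' ) {
        insideString = true;
    } else if ( symbol == "=" ) {
        keyName = cache;          // the cache held the attribute name
        cache = "";
    } else if ( symbol == " " || symbol == "/" || symbol == ">" ) {
        if ( keyName != "" ) {
            ds_map_add( node, "@" + keyName, cache );   // attribute value
            keyName = "";
        } else if ( cache != "" ) {
            ds_map_add( node, "_GMname", cache );       // tag name
        }
        cache = "";
        if ( symbol == "/" ) terminating = true;
        if ( symbol == ">" ) insideTag = false;
    } else {
        cache += symbol;          // non-special symbol: keep caching
    }
}

show_debug_message( ds_map_find_value( node, "_GMname" ) );   // test
show_debug_message( ds_map_find_value( node, "@attrib" ) );   // example
ds_map_destroy( node );
```

A real parser also has to track the parent stack, closing tags and text content; the sketch leaves all of that out to keep the state transitions visible.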

 

 

 

A crucial part of XML is being able to specify lists of data. Let’s look at an example:

<parent>
   <child/>
   <child/>
   <child/>
</parent>

Whilst it’s clear to us that this is a list of three children nested inside a parent, the script only analyses XML one character at a time and does not forward-predict. After terminating the first child block, the script adds the first child node to the parent using ds_map_add_map. The script expects that each new tag has a unique name under the same parent.

 

However, the second “child” tag would cause a naming conflict with the first “child” tag. The script knows this as the key “child” already exists with a numeric value. The XML file is trying to define a list; in this case, the script deletes the entry created with ds_map_add_map and replaces it with a ds_list added using ds_map_add_list. The previous child node and the second, new, child node are added to that ds_list.

 

The third identical tag causes a problem, however. The script can see that a tag with the name “child” already exists but, using typical methods, it’s impossible to determine whether the numeric value stored under that key is an identifier for a ds_map or an identifier for a ds_list. A ds_map and a ds_list can share the same numerical identifier despite being very different data structures. As such, the script doesn’t know whether to replace the key:value pair with a new list or to add to an existing list.

 

The solution is to use GameMaker’s relaxed data typing to sneak extra information into that numerical value. In this case, we can add 0.1 to the identifier of every ds_list so that scripts can differentiate a ds_map from a ds_list with a simple conditional:

if ( floor( ident ) == ident ) {
   //ds_map
} else {
   //ds_list
}

This method can be expanded to encompass all data structure types. This method won’t cause errors when reading from data structures as the index argument(s) for those functions are floored internally. When writing scripts that traverse unpredictable data, keep this method in mind to help with structure discovery.
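
As a rough illustration of structure discovery on the reading side, the conditional can be wrapped in a small helper script. The script name xml_debug_child and its arguments are hypothetical and not part of xml_to_json(); the list identifier is floored explicitly here so the sketch does not depend on the internal flooring mentioned above.

```gml
/// xml_debug_child( node, key )
/// Hypothetical helper: reports what is stored under a key in a node.
var node = argument0;
var key  = argument1;

if ( !ds_map_exists( node, key ) ) {
    show_debug_message( key + " not found" );
    exit;
}

var ident = ds_map_find_value( node, key );

if ( is_string( ident ) ) {
    // Plain text or an attribute value.
    show_debug_message( key + " is a string: " + ident );
} else if ( floor( ident ) == ident ) {
    // Whole number: a single nested ds_map.
    show_debug_message( key + " is a single ds_map" );
} else {
    // Fractional part present: a ds_list of nodes.
    var list = floor( ident );
    show_debug_message( key + " is a ds_list with " + string( ds_list_size( list ) ) + " entries" );
}
```

With the parent/child example above, calling xml_debug_child( parent_map, "child" ) would be expected to take the ds_list branch, assuming parent_map holds the node for the parent tag.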




#2 chance


Posted 10 March 2016 - 12:37 PM

This is an interesting tool, although I admit I'm unlikely to use it.  The extent of my JSON encoding will probably never exceed the example given in the Studio manual for posting simple data to a server (json_encode and http_post_string).

 

But as I understand it, your tool resolves some potential ambiguities about data type.  So I hope some members can put this to good use.

