URL Encoding with UTF-8 support (GM8, GMS)


3 replies to this topic

#1 GameGeisha


Posted 02 October 2013 - 02:01 AM

In response to one of Nocturne's status updates regarding URL encoding in GMS, I'm sharing a stand-alone version of the URL encoding script used in JSOnion for single strings:

{
    /**
    url_encode(str): Percent-encode the string <str> as UTF-8 for URLs and x-www-form-urlencoded data. Spaces are written as %20, not +.
    */
    var s, hex_digits, special_chars;
    s = "";
    hex_digits = "0123456789ABCDEF";
    special_chars = "$&+,/:;=?@ " + '"' + "'<>#%{}|\^~[]`!";
    
    //Main loop
    var i, l, c, o, escapes, escape_bytes;
    l = string_length(argument0);
    for (i=1; i<=l; i+=1) {
        c = string_char_at(argument0, i);
        o = ord(c);
        escapes = 0;
        //Single-byte characters
        if (o <= $7F) {
            if (string_pos(c, special_chars) != 0) || (o < 32) {
                escapes = 1;
                escape_bytes[0] = o;
            }
        }
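        //2-byte characters: 110xxxxx 10xxxxxx
        else if (o <= $7FF) {
            escapes = 2;
            escape_bytes[0] = (o>>6)+192; //lead byte carries the top 5 bits
            escape_bytes[1] = (o&63)+128; //continuation byte carries the low 6 bits
        }
        //3-byte characters: 1110xxxx 10xxxxxx 10xxxxxx
        else if (o <= $FFFF) {
            escapes = 3;
            escape_bytes[0] = (o>>12)+224; //top 4 bits
            escape_bytes[1] = ((o>>6)&63)+128; //middle 6 bits
            escape_bytes[2] = (o&63)+128; //low 6 bits
        }
        //Code points above $FFFF would need 4 bytes, which this script does not handle
        else {
            show_error("Invalid character.", true);
        }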
        //Dump in escape characters, if any
        if (escapes == 0) {
            s += c;
        }
        else {
            var j;
            for (j=0; j<escapes; j+=1) {
                s += "%" + string_char_at(hex_digits, (escape_bytes[j]>>4)+1) + string_char_at(hex_digits, (escape_bytes[j]&15)+1);
            }
        }
    }
    
    //Done
    return s;
}

Unlike most other URL-encoding scripts for GML, this one handles multi-byte UTF-8 characters (code points up to U+FFFF) as well as the standard ASCII character set. This means you can use it for URLs and POST requests that contain non-Latin characters.
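
As a rough illustration of the 2-byte branch in the script above, here is the arithmetic traced by hand for the Greek letter τ (U+03C4, i.e. $3C4). The names b0 and b1 exist only for this sketch and are not part of the script.

```gml
// Sketch only: b0 and b1 are illustrative names, not part of url_encode().
var o, b0, b1;
o = $3C4;              // code point of "τ" (964 decimal), falls in the o <= $7FF branch
b0 = (o >> 6) + 192;   // 15 + 192 = 207 = $CF (lead byte: 110xxxxx)
b1 = (o & 63) + 128;   // 4 + 128 = 132 = $84 (continuation byte: 10xxxxxx)
// Written out as hex escapes, these two bytes give "%CF%84".
```

The 3-byte branch works the same way, with one extra 6-bit group in the middle.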

 

Feel free to use this script in any project, and please let me know if you have any questions or comments about it.

 

GameGeisha


Edited by GameGeisha, 13 June 2014 - 05:33 PM.

Latest Releases:
  • GMLinear --- Matrix and vector math in one line!
  • GMAssert --- Debug invalid values and write quick unit tests with ease!
  • KameGMS --- Bring up TortoiseSVN and TortoiseGit dialogs from within the GMS IDE!
  • JSOnion v1.1 --- The stink-free way to handle JSON! (even deeply nested ones)

#2 borut


Posted 17 April 2014 - 06:23 AM

Can you please show a usage example?



#3 alexandervrs


Posted 17 April 2014 - 06:50 AM



Well, you pretty much just do something like this:

 

test = url_encode("http://mysite.com/το λινκ μου/");

and you'll get something like http%3A%2F%2Fmysite.com%2F%CF%84%CE%BF%20%CE%BB%CE%B9%CE%BD%CE%BA%20%CE%BC%CE%BF%CF%85%2F in your "test" variable.
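
One thing to watch out for: passing a whole URL through url_encode() also escapes the ":" and "/" separators, as the output above shows. If the goal is a URL you can actually request, a common approach is to encode only the variable parts. A minimal sketch, assuming GM:Studio's http_get() and a made-up search.php endpoint with a q parameter:

```gml
// Sketch only: mysite.com/search.php and the "q" parameter are made up for illustration.
var query, url;
query = url_encode("το λινκ μου");                // -> "%CF%84%CE%BF%20%CE%BB%CE%B9%CE%BD%CE%BA%20%CE%BC%CE%BF%CF%85"
url = "http://mysite.com/search.php?q=" + query;  // separators in the base URL stay unescaped
request_id = http_get(url);                       // GM:Studio only; the reply arrives in the Async HTTP event
```

Only the query value goes through url_encode() here, so the "://", "/" and "?" in the base URL keep their meaning.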



#4 veta420


Posted 29 May 2015 - 02:52 AM

This is a great script. Is there a decoder anywhere? If not, I'd be willing to make one and post it here.

