Idea: Sharing data between extensions / dlls


85 replies to this topic

#1 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 18 April 2011 - 01:36 PM

Programmers sometimes need to move data between different, unrelated extensions. An example would be an extension for file reading/writing, and one for networking: Both extensions would likely have some sort of buffer data structure, but if you want to read a file and send it over a connection, how do you get the data from the file extension into the networking extension? You will probably need to copy it over GML somehow. 39dll solves this particular problem by having both file access and networking functions, but in my opinion this is not a good solution, unless your ideal goal is to create the "everything extension". I prefer modularity though, which is why I don't plan to include file access functions in my own Faucet Networking extension. In my opinion, the better solution is to solve the problem of data exchange between extensions.

One option would be to create a "Shared Buffers" dll that is linked by all extensions which want to exchange data in this way. Windows will only load the dll into memory once for the game process, so it can be used as a central point to manage and exchange shared data buffers. If both the file and the networking extension in the example above used this dll, the user could simply read a file into a shared buffer and pass that to the networking extension for sending. A more complex implementation could offer interfaces to exchange data from "internal" buffers of extensions which use different implementations, or other types of data exchange, like data streams.

So far, this is only an idea that still needs to be fleshed out, but I want to put it out there to find out what you think. One problem with the shared dll approach is that this dll would have to be in the game directory when running, and some people probably don't like that. Additionally, the dll would have to be adopted by several extensions to be really useful. A major problem is to ensure stability of the Buffer dll's interface, so that you don't get different extensions using different, incompatible versions of the Buffers dll.

But before any work goes into creating this, please tell me your opinion. Do you think this would be useful to you as an extension/dll user? Would you, as an extension/dll creator, support this interface? Discuss! :P
  • 0

#2 Medusar

Medusar

    GMC Member

  • GMC Member
  • 1228 posts
  • Version:GM:Studio

Posted 18 April 2011 - 02:14 PM

I can't really think of many occasions when you really need to transfer data between separate DLLs... Your networking example is one but as you mentioned that's already been accounted for. Apart from that, most DLLs for GM allow complex calculations to be done in compiled code or they provide an interface to an external API. You'd hardly ever even want to share data between unrelated DLLs as they tend to use their own structures. So lots of data would not be compatible.
The developer could decide to expose an API for his DLL but I'd prefer any contributors to PM me so that the extra functionality would end up in the same code base. Either way this would not involve a buffer DLL.
  • 0

#3 Maarten Baert

Maarten Baert

    GMC Member

  • GMC Member
  • 745 posts
  • Version:GM8.1

Posted 18 April 2011 - 02:54 PM

I had exactly the same problem while writing my own networking dll (Http Dll 2), which is why I added buffers. All my other DLLs simply convert the data to a very long hexadecimal string, which can be safely passed through GML. This is clearly not very efficient.

I would prefer letting GM load the Shared Buffers dll, as a normal extension. Then the GEX could have a function 'shared_buffer_handle()' that returns a pointer to a controller class inside the shared buffers DLL. Other DLLs could then store this pointer during initialization, and use the pointer to access the buffers later. Since the DLL is only loaded once (by GM), there can only be one version of the DLL at any time. All other DLLs can use any version of the shared buffers DLL as long as the interface of the DLL never changes. This is possible with function pointers. Virtual functions should work too, but since different compilers use different vtable formats this won't work if one DLL was compiled in VC++ and another was compiled in GCC.

class Buffer; // opaque pointer
class SharedBufferInterface {
    
    private:
    Buffer* (*createbuffer)();
    void (*destroybuffer)(Buffer*);
    // ...
    
    public:
    inline Buffer* CreateBuffer() { return (*createbuffer)(); }
    inline void DestroyBuffer(Buffer* buffer) { (*destroybuffer)(buffer); }
    // ...
    
};

I would probably use this DLL in my own DLLs if we created one. I have code for buffers that are very similar to 39dll's buffers, so I will create a simple proof of concept to test the idea.

@Medusar: It would be really useful for serialization: converting a data structure to a string or loading a data structure from a string (like ds_map_write/ds_map_read). This is useful to save/load the game. Many DLLs use data structures, it would be great if we could save them all to a single file without having to use temp files. This would also make it a lot easier to send the contents of a data structure to another client in a multiplayer game.
  • 0

#4 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 18 April 2011 - 05:16 PM

You make an interesting point about the vtable - the way it looks, you need a C interface instead of a C++ one in order to make GCC and MSVC work together. And using two separate versions for the two compilers would defeat the entire point of course.

Since the DLL is only loaded once (by GM), there can only be one version of the DLL at any time.

I didn't test this, but from my understanding Windows only loads a dll once per process even if it is requested multiple times.
  • 0

#5 Tha_14

Tha_14

    GMC Member

  • New Member
  • 174 posts
  • Version:GM8.1

Posted 18 April 2011 - 05:31 PM

OOOHHHHH,Big Replies!!!
I need a lot of time to read that.
  • 0

#6 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 18 April 2011 - 05:56 PM

Another thought, it's possible to implement this without an extra dll and offer a statically linked library instead. The idea is to use handles which contain a pointer to the buffer descriptor instead of a generated integer as is usually done. As long as everything runs in 32-bit mode you should be able to store the pointer's integer value in a GM real without a problem, but it feels slightly less naughty to encode it to a GM string instead.
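
The string-encoded handle idea above could look roughly like the following. This is only a sketch under assumptions: the names EncodeHandle and DecodeHandle are made up for illustration, and fixed-width lowercase hex is just one encoding that never produces a 0-byte inside the string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: turn a pointer into fixed-width lowercase hex.
// The result contains only '0'-'9' and 'a'-'f', so no embedded 0-byte.
std::string EncodeHandle(const void* ptr) {
    static const char digits[] = "0123456789abcdef";
    std::uintptr_t value = reinterpret_cast<std::uintptr_t>(ptr);
    std::string out(sizeof(value) * 2, '0');
    for (std::size_t i = out.size(); i-- > 0;) {
        out[i] = digits[value & 0xF];
        value >>= 4;
    }
    return out;
}

// Hypothetical helper: reverse of EncodeHandle.
// Returns nullptr if the text contains anything but lowercase hex digits.
void* DecodeHandle(const std::string& text) {
    assert(text.size() == sizeof(std::uintptr_t) * 2);
    std::uintptr_t value = 0;
    for (char c : text) {
        int nibble;
        if (c >= '0' && c <= '9') nibble = c - '0';
        else if (c >= 'a' && c <= 'f') nibble = c - 'a' + 10;
        else return nullptr;
        value = (value << 4) | static_cast<std::uintptr_t>(nibble);
    }
    return reinterpret_cast<void*>(value);
}
```

Because the width follows sizeof(uintptr_t), the same code would keep working if pointers ever became wider than 32 bits, which is the main argument for the string form over a real.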
  • 0

#7 icuurd12b42

icuurd12b42

    Self Formed Sentient

  • GMC Member
  • 15785 posts
  • Version:GM:Studio

Posted 18 April 2011 - 06:06 PM

You can pass along the shared pointer via a function call in GM

1) define external call to sharedmem.dll called CreateMem
2) define external call in dll1 called Dll1SetSharedMem
3) define external call in dll2 called Dll2SetSharedMem


var mem; mem = CreateMem();
Dll1SetSharedMem(mem);
Dll2SetSharedMem(mem);


in sharedmem.dll, CreateMem:
// allocate 1024 zeroed bytes and return the address as a GM real
return (double)(DWORD) GlobalAllocPtr(GMEM_FIXED | GMEM_ZEROINIT, 1024);

in dll1 and dll2:
void* sharedmem = NULL;

void Dll1SetSharedMem(double mem) // same for Dll2SetSharedMem in dll2
{
    // only valid while pointers are 32 bits wide
    sharedmem = (void*)(DWORD) mem;
}

Edited by icuurd12b42, 18 April 2011 - 06:07 PM.

  • 0

#8 paul23

paul23

    GMC Member

  • Global Moderators
  • 3830 posts
  • Version:GM:Studio

Posted 18 April 2011 - 09:38 PM

uhm wait, don't cast pointers like that. - There's no way this is guaranteed to work: with 64-bit computers pointers can be larger than the mantissa of doubles. - And I'm unsure how this works out (I think this depends on the compiler whether they're mem-copied or "copied-by-value"), and as someone raised in this post there are a lot of other "potential" problems when using pointers like this.


Now I am not in favour of the whole idea, instead I would try to redesign and make a "tree-structure" of dependency: GM (root) handles 1 or 2 dlls, and then those dlls handle sub dlls (and the communication between sub-dlls). Dlls don't know about their "roots" or other dlls except for their "children".
If you let memory be shared between dlls and generally do stuff like that, in bigger projects you'll always run into the problem: "who owns what", which can slow development down a lot!
  • 0

#9 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 18 April 2011 - 10:28 PM

uhm wait, don't cast pointers like that. - There's no way this is guaranteed to work: with 64-bit computers pointers can be larger than the mantissa of doubles.

That's why I said "As long as everything runs in 32-bit mode", which is currently the case for Game Maker related things.

- And I'm unsure how this works out (I think this depends on the compiler whether they're mem-copied or "copied-by-value"), and as someone raised in this post there are a lot of other "potential" problems when using pointers like this.

Again, for 32-bit pointers this shouldn't be a problem if done this way, assuming Game Maker's real type always corresponds to double. It doesn't matter if they are stored in FPU registers now and then, because those have an even larger mantissa than the normal double representation. Copying the pointer to a string instead (in some encoding that avoids the 0-byte of course) is a bit more straightforward in terms of correctness though.
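
To make the 32-bit argument concrete, here is a small sketch. It assumes that a GM real is an IEEE 754 double with a 53-bit significand; the function names are invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: carry a 32-bit handle through a GM real (a double).
double HandleToReal(std::uint32_t handle) {
    return static_cast<double>(handle);
}

std::uint32_t RealToHandle(double real) {
    // Converting an out-of-range double to an integer is undefined, so check first.
    assert(real >= 0.0 && real <= 4294967295.0);
    return static_cast<std::uint32_t>(real);
}

// Returns true if a 64-bit value comes back unchanged after a trip through double.
bool SurvivesDouble(std::uint64_t value) {
    const double real = static_cast<double>(value);
    // Values near the top of the range can round up to 2^64, which no longer fits.
    if (real >= 18446744073709551616.0) return false;
    return static_cast<std::uint64_t>(real) == value;
}
```

Every 32-bit value round-trips exactly, while a 64-bit value such as 2^53 + 1 does not, which is the concern with 64-bit pointers.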

Now I am not in favour of the whole idea, instead I would try to redesign and make a "tree-structure" of dependency: GM (root) handles 1 or 2 dlls, and then those dlls handle sub dlls (and the communication between sub-dlls). Dlls don't know about their "roots" or other dlls except for their "children".
If you let memory be shared between dlls and generally do stuff like that, in bigger projects you'll always run into the problem: "who owns what", which can slow development down a lot!

The "Who owns what" question has to be solved once - when designing the dll. After that everyone just has to follow the rules.
I don't see where you are going with your idea. Arranging dlls in a tree structure implies that they are dependent on each other, but what I would like to get is a standard way to exchange information between independent extensions. For example, imagine an SHA256 dll that creates a digest over a buffer. It would be good if that function could be developed to just work on any buffer, without all extension programmers having to add this new dll as a "child" dll.
  • 0

#10 paul23

paul23

    GMC Member

  • Global Moderators
  • 3830 posts
  • Version:GM:Studio

Posted 19 April 2011 - 12:03 AM


I'm saying that there is no need for a "standard" way: GM acts as root and should handle these translations.

Also consider the loading:
if A is a dll and B needs to get memory allocated by A: how can B be sure that A is not freed, and the memory cleared?
  • 0

#11 icuurd12b42

icuurd12b42

    Self Formed Sentient

  • GMC Member
  • 15785 posts
  • Version:GM:Studio

Posted 19 April 2011 - 02:01 AM

@paul... about the casts... The problem occurs when you don't cast down progressively, like char t = (char)MyDouble; You have to cast progressively to the next smaller type: char t = (char)(int)(DWORD)MyDouble;

As for who owns what, in my example, GM owns the memory but SharedMem.dll manages it. You have to decide on what is located at the address and who can do what to it.

The best scenario is to have a garbage system... The simplest is to globally allocate a large chunk that will exist from start to end of the application, and this chunk can be used for data passing. Hence the GMEM_FIXED flag used here.

Remember that data sharing is the goal of this concept. Each dll could copy the passed data into its own space. You just read and write to the buffer as a means to pass along data.


Here I have an elaborate design in mind for shared resources for a GMTOOLS (sharing resources and functions)

http://cid-fba0b7e57.../Public/GMTools

see tools overview

Edited by icuurd12b42, 19 April 2011 - 02:04 AM.

  • 0

#12 Maarten Baert

Maarten Baert

    GMC Member

  • GMC Member
  • 745 posts
  • Version:GM8.1

Posted 19 April 2011 - 09:42 AM

I just finished my proof of concept:
http://gm.maartenbae...red_buffers.zip
I've tested it in GCC and VC++, both versions are compatible.

The main DLL (shared_buffers.dll) exports buffer functions to GM, so games can access the buffers too:
shared_buffers_get_handle()
buffer_create()
buffer_destroy(id)
buffer_exists(id)
buffer_to_string(id)
buffer_get_pos(id)
buffer_get_length(id)
buffer_at_end(id)
buffer_get_error(id)
buffer_clear_error(id)
buffer_clear(id)
buffer_set_pos(id,pos)
buffer_read_from_file(id,filename)
buffer_write_to_file(id,filename)
buffer_read_int8(id)
buffer_read_uint8(id)
buffer_read_int16(id)
buffer_read_uint16(id)
buffer_read_int32(id)
buffer_read_uint32(id)
buffer_read_int64(id)
buffer_read_uint64(id)
buffer_read_float32(id)
buffer_read_float64(id)
buffer_write_int8(id,value)
buffer_write_uint8(id,value)
buffer_write_int16(id,value)
buffer_write_uint16(id,value)
buffer_write_int32(id,value)
buffer_write_uint32(id,value)
buffer_write_int64(id,value)
buffer_write_uint64(id,value)
buffer_write_float32(id,value)
buffer_write_float64(id,value)
buffer_read_string(id)
buffer_write_string(id,string)
buffer_read_data(id,len)
buffer_write_data(id,string)
buffer_read_hex(id,len)
buffer_write_hex(id,string)
buffer_write_buffer(id,id2)
buffer_write_buffer_part(id,id2,pos,len)
Please tell me if I forgot something important. The functions that can be called by the other DLLs are almost the same, but they use pointers instead of ids:
Buffer* Create();
void Destroy(Buffer* buffer);
unsigned int GetID(Buffer* buffer);
Buffer* Find(unsigned int id);
const char* ToString(Buffer* buffer);
char* GetData(Buffer* buffer);
unsigned int GetPos(Buffer* buffer);
unsigned int GetLength(Buffer* buffer);
bool IsAtEnd(Buffer* buffer);
bool GetError(Buffer* buffer);
void ClearError(Buffer* buffer);
void Clear(Buffer* buffer);
void SetPos(Buffer* buffer, unsigned int newpos);
void SetLength(Buffer* buffer, unsigned int newlength);
bool ReadFromFile(Buffer* buffer, const char* filename);
bool WriteToFile(Buffer* buffer, const char* filename);
int8_t ReadInt8(Buffer* buffer);
uint8_t ReadUint8(Buffer* buffer);
int16_t ReadInt16(Buffer* buffer);
uint16_t ReadUint16(Buffer* buffer);
int32_t ReadInt32(Buffer* buffer);
uint32_t ReadUint32(Buffer* buffer);
int64_t ReadInt64(Buffer* buffer);
uint64_t ReadUint64(Buffer* buffer);
float ReadFloat32(Buffer* buffer);
double ReadFloat64(Buffer* buffer);
void WriteInt8(Buffer* buffer, int8_t value);
void WriteUint8(Buffer* buffer, uint8_t value);
void WriteInt16(Buffer* buffer, int16_t value);
void WriteUint16(Buffer* buffer, uint16_t value);
void WriteInt32(Buffer* buffer, int32_t value);
void WriteUint32(Buffer* buffer, uint32_t value);
void WriteInt64(Buffer* buffer, int64_t value);
void WriteUint64(Buffer* buffer, uint64_t value);
void WriteFloat32(Buffer* buffer, float value);
void WriteFloat64(Buffer* buffer, double value);
const char* ReadString(Buffer* buffer);
void WriteString(Buffer* buffer, const char* string);
void ReadData(Buffer* buffer, char* ptr, unsigned int bytes);
void WriteData(Buffer* buffer, const char* ptr, unsigned int bytes);
void ReadHex(Buffer* buffer, char* ptr, unsigned int bytes);
void WriteHex(Buffer* buffer, const char* ptr, unsigned int bytes);
void WriteBuffer(Buffer* buffer, Buffer* source);
void WriteBufferPart(Buffer* buffer, Buffer* source, unsigned int pos, unsigned int bytes);
The functions GetID and Find can be used to convert pointers to ids and vice versa.

The function shared_buffers_get_handle() is used to pass the pointer to the other DLL:
shared_buffers_init();
test_init();

// initialize shared buffers interface for test.dll
test_init_shared_buffers(shared_buffers_get_handle());
Now the other DLL can use the pointer to manipulate buffers:
#include "gm.h"
#include "SharedBuffers.h"

SharedBufferInterface *sbi = NULL;

gmexport double test_init_shared_buffers(double handle) {
	sbi = (SharedBufferInterface*)(gm_double_to_uint(handle));
	return 1;
}

gmexport double test_writetobuffer(double id) {
	if(sbi == NULL) return 0;
	Buffer *b = sbi->Find(gm_double_to_uint(id));
	if(b == NULL) return 0;
	sbi->WriteInt32(b, 12345);
	sbi->WriteString(b, "Hello world");
	sbi->WriteFloat32(b, 42.42f);
	return 1;
}

Memory management is done by shared_buffers.dll, other DLLs can't allocate or free memory. They should call the Resize function.

Using it in other DLLs is very simple, just include SharedBuffers.h. That's it.

This system has the added benefit of modularity: If some user doesn't need the buffer functionality, that user can choose not to add shared_buffers.dll to his/her game. The part of the DLL that doesn't need buffers will still work.

We could also use shared memory to pass the pointer of the struct, or even to store the entire struct, but I think it's easier to simply pass the pointer through GM. If we use shared memory we would still need functions to tell DLLs to load the shared memory at the right time, because it's possible some DLLs are loaded before shared_buffers.dll is loaded.

What do you think? Would you use something like this in your DLLs? What should be changed to make it better?

Edited by Maarten Baert, 19 April 2011 - 10:53 AM.

  • 0

#13 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 19 April 2011 - 10:32 AM

@paul... about the casts... The problem occurs when you don't cast down progressively, like char t = (char)MyDouble; You have to cast progressively to the next smaller type: char t = (char)(int)(DWORD)MyDouble;

That cast has undefined behaviour if the double value is out of range for the DWORD type. And by "undefined behaviour" I don't mean that you could get a wrong result, but that it can cause your program to crash. And that's not theoretical, it happened to me when developing Faucet Networking. Always make sure that the source value fits in the target type when you cast floating point values to anything else - then you don't need this strange cascade of casts either.

To avoid this problem you can use boost::numeric_cast, which throws an exception if the source is out of range, or you can use my own clipped_cast, which results in the highest/lowest legal value for the target if the source is out of range. In your example, you could use it like "char t = clipped_cast<char>(MyDouble)".
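
For readers who have not seen such a helper, a clamping cast could be sketched like this. Note that this is not the actual clipped_cast from Faucet Networking, only a guess at the general shape; it takes a double source only, and mapping NaN to 0 is an arbitrary choice made for this example.

```cpp
#include <cmath>
#include <limits>
#include <type_traits>

// Illustrative clamping cast from double to an integer type.
// Out-of-range input yields the nearest legal value instead of undefined behaviour.
template <typename Target>
Target clipped_cast(double value) {
    static_assert(std::is_integral<Target>::value, "Target must be an integer type");
    typedef std::numeric_limits<Target> Limits;
    if (std::isnan(value)) return Target(0);  // assumption: NaN maps to 0
    if (value <= static_cast<double>(Limits::min())) return Limits::min();
    if (value >= static_cast<double>(Limits::max())) return Limits::max();
    // Only reached when the truncated value is representable in Target.
    return static_cast<Target>(value);
}
```

With this, clipped_cast<unsigned char>(300.0) gives 255 and clipped_cast<unsigned char>(-5.0) gives 0, while in-range values are truncated towards zero as with a plain cast.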

I found this very informative in this matter: http://www.boost.org...efinitions.html

Here I have an elaborate design in mind for shared resources for a GMTOOLS (sharing resources and functions)

That looks very comprehensive, but it's far beyond the scope of what I want. In my opinion, finding a good way to share buffers is difficult enough for a single project :)

Here are two different ways to define the ownership/responsibility and rules. There are more different possibilities of course, but something to think about:

Model 1:
Buffers are owned by the extensions that created them, and can be shared for read-only access by other extensions. The owning extension defines for how long a buffer is valid. If extensions are allowed to keep references to the buffer, they must assume that the buffer can change and become invalid at any time between function calls (it is possible to provide a safe way for checking this). Otherwise the extension has to create a copy to work on.

This is probably the simplest model, and it allows efficient access to buffers of other extensions without (usually) requiring a copy. In order to allow sharing buffers with more sophisticated data structures without copying, a buffer can be modeled as a list of memory blocks instead of a single one. There could be a convenience function for people who don't want to deal with this, which will always give you a buffer as a single block of memory by making a copy if it consists of multiple regions.

This model can be implemented without an extra dll.
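
A rough sketch of the "list of memory blocks" idea from Model 1 follows. All type and function names here are hypothetical, and for simplicity the convenience function always copies, even when there is only a single block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One contiguous region, owned by the exporting extension and read-only for others.
struct BlockView {
    const char* data;
    std::size_t size;
};

// A shared buffer described as an ordered list of regions.
struct BlockList {
    std::vector<BlockView> blocks;

    std::size_t TotalSize() const {
        std::size_t total = 0;
        for (const BlockView& b : blocks) total += b.size;
        return total;
    }
};

// Convenience helper: copy all regions, in order, into one contiguous block.
std::vector<char> Flatten(const BlockList& list) {
    std::vector<char> out;
    out.reserve(list.TotalSize());
    for (const BlockView& b : list.blocks) {
        assert(b.data != nullptr || b.size == 0);
        out.insert(out.end(), b.data, b.data + b.size);
    }
    return out;
}
```

A reader that can work block by block never needs Flatten; the copy is only paid by callers that insist on a single region.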

Model 2:
Buffers are owned by a Shared Buffers dll, and can be read and written by all extensions which have a reference. That means buffers would have to follow a generic implementation provided in the dll, or at least a generic interface which can be implemented by the extensions if they really need something more sophisticated. Also, all extensions would have to assume that buffers can be changed between function calls, unless e.g. a read-only wrapper for buffers is provided. On the upside, you could provide a single set of GM functions for reading/writing bytes, floats etc., which is more difficult if buffers are managed by the extensions.

This is more complex, but allows a unified Buffer implementation for most purposes.

Actually, thinking about this some more, this can also be achieved without an extra dll.

Edit: I was still writing this when Maarten posted :)

Edited by Medo42, 19 April 2011 - 10:34 AM.

  • 0

#14 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 19 April 2011 - 11:03 AM

Great stuff Maarten. Here are a few thoughts:
The finished thing should have a very permissive license so that it can be used with any extension. Something like MIT or (even simpler but equivalent) ISC would be preferable for me.

Now for some technical comments :)

shared_buffers_get_handle()
I don't really like relying on the size of int as you do here... I know we do it anyway in the end, but I'd still change that to uintptr_t :)
And I'd prefer an automatic way to share this information, so that the user doesn't have to care about it. I guess an extension could do it as part of an automatic initialization script though and hide it this way.

buffer_get_error(id)
buffer_clear_error(id)

I don't think it's very useful to handle buffer overruns like this. If someone writes code that reads past the end of a buffer, it is not likely that he will explicitly check for errors either. The only real improvement would be with exceptions, but GM doesn't support anything like that of course. As it is, I'd prefer the behaviour to be undefined but safe. In my own Buffers implementation I just read until the end and then stop there, the rest of the returned data is undefined (unless you try to read a string - in that case a reasonable behaviour is still possible and I just return the first part of the string)

buffer_read_from_file(id,filename)
buffer_write_to_file(id,filename)

IMO, that should go into its own dll/extension - the very purpose of this project is to make things more modular :)

More criticism later :)
Can you upload this to github or some other collaboration platform?

Edited by Medo42, 19 April 2011 - 12:42 PM.

  • 0

#15 Maarten Baert

Maarten Baert

    GMC Member

  • GMC Member
  • 745 posts
  • Version:GM8.1

Posted 19 April 2011 - 12:43 PM

Great stuff Maarten. Here are a few thoughts:
The finished thing should have a very permissive license so that it can be used with any extension. Something like MIT or (even simpler but equivalent) ISC would be preferable for me.

Okay, I'm using ISC now.

shared_buffers_get_handle()
I don't really like relying on the size of int as you do here... I know we do it anyway in the end, but I'd still change that to uintptr_t :)

Changed unsigned int to uintptr_t :).

And I'd prefer an automatic way to share this information, so that the user doesn't have to care about it. I guess an extension could do it as part of an automatic initialization script though and hide it this way.

I still have to try this, I'm not 100% sure extensions can call other extensions during initialization. It might not work correctly if the extensions are loaded in the wrong order.

buffer_get_error(id)
buffer_clear_error(id)

I don't really like the idea that a buffer can have an error. If someone writes code that reads past the end of a buffer, it is not likely that he will explicitly check for errors either. The only real improvement would be with exceptions, but GM doesn't support anything like that of course. As it is, I'd prefer the behaviour to be undefined but safe. In my own Buffers implementation I just read until the end and then stop there, the rest of the returned data is undefined (unless you try to read a string - in that case a reasonable behaviour is still possible and I just return the first part of the string)

They are optional, you can simply ignore the errors if you want. If you read past the end of the buffer, you will simply get the default value for that type (which is either 0, 0.0, an empty string, or a block of null bytes depending on the function you're using). I added them because some users might want to make sure the data they've just read is actual data. This could be important in some situations.
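
The behaviour described here (default value plus an error flag that can be ignored) could be sketched as follows. This is not the code from shared_buffers.zip; the class name is invented, and leaving the read position untouched on a failed read is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Illustrative reader: a read past the end returns 0 and raises a sticky error flag.
class ReadCursor {
public:
    explicit ReadCursor(std::vector<char> bytes) : data_(std::move(bytes)) {}

    std::int32_t ReadInt32() {
        std::int32_t value = 0;
        // pos_ never exceeds data_.size(), so the subtraction cannot wrap.
        if (data_.size() - pos_ < sizeof(value)) {
            error_ = true;  // caller may check this, or ignore it
            return 0;       // default value for the type
        }
        std::memcpy(&value, data_.data() + pos_, sizeof(value));
        pos_ += sizeof(value);
        return value;
    }

    bool GetError() const { return error_; }
    void ClearError() { error_ = false; }

private:
    std::vector<char> data_;
    std::size_t pos_ = 0;
    bool error_ = false;
};
```

Code that never calls GetError behaves the same as with a buffer that has no error flag at all.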

buffer_read_from_file(id,filename)
buffer_write_to_file(id,filename)

IMO, that should go into its own dll/extension - the very purpose of this project is to make things more modular :)

I know, but they're so simple I thought it would be silly not to add them. If I comment them out the DLL becomes 0.5KB smaller, that's hardly noticeable. We can still write a separate extension with more complex file functions.

Can you upload this to github or some other collaboration platform?

I have very little experience with revision control software, could you upload it somewhere?

I just made some changes to the code. The previous version would write everything to the end of the buffer, the new version writes it at the current position and resizes the buffer if needed. This is similar to writing to files. Now it's also possible to write to the middle of the buffer. I also added buffer_set_length (for GM).
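
As an illustration of "write at the current position and resize if needed", here is a minimal sketch. It is not taken from the DLL source; the class name is made up, and clamping SetPos to the current length is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Illustrative writer with file-like semantics: overwrite in the middle, grow at the end.
class WriteCursor {
public:
    void WriteData(const char* ptr, std::size_t bytes) {
        if (bytes == 0) return;
        assert(ptr != nullptr);
        // Grow only when the write would run past the current end.
        if (pos_ + bytes > data_.size()) data_.resize(pos_ + bytes);
        std::memcpy(data_.data() + pos_, ptr, bytes);
        pos_ += bytes;
    }

    // Assumption: positions beyond the end are clamped to the end.
    void SetPos(std::size_t pos) { pos_ = pos < data_.size() ? pos : data_.size(); }
    std::size_t GetPos() const { return pos_; }
    std::size_t GetLength() const { return data_.size(); }
    const char* GetData() const { return data_.data(); }

private:
    std::vector<char> data_;
    std::size_t pos_ = 0;
};
```

Writing "hello", seeking to position 1 and writing "EY" leaves the length at 5 and the content as "hEYlo"; a write that starts at position 4 with three bytes grows the buffer to 7.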

The link is still the same:
http://gm.maartenbae...red_buffers.zip
I'm writing the documentation now.
  • 0

#16 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 19 April 2011 - 01:07 PM

They are optional, you can simply ignore the errors if you want. If you read past the end of the buffer, you will simply get the default value for that type (which is either 0, 0.0, an empty string, or a block of null bytes depending on the function you're using). I added them because some users might want to make sure the data they've just read is actual data. This could be important in some situations.

In this situation you should check *first* whether there is actually enough data left in the buffer.

I just made some changes to the code. The previous version would write everything to the end of the buffer, the new version writes it at the current position and resizes the buffer if needed. This is similar to writing to files. Now it's also possible to write to the middle of the buffer. I also added buffer_set_length (for GM).

These are both things I don't see much practical use for (unless you can come up with a good example). You could always add these functions later if it turns out they are needed, but you can't remove them again once they are included in a release.
  • 0

#17 Maarten Baert

Maarten Baert

    GMC Member

  • GMC Member
  • 745 posts
  • Version:GM8.1

Posted 19 April 2011 - 03:42 PM

In this situation you should check *first* whether there is actually enough data left in the buffer.

Yes, but that's not always possible. If you're reading null-terminated strings from the buffer, you don't know the size in advance - unless you read the string byte by byte, which is cumbersome.

These are both things I don't see much practical use for (unless you can come up with a good example). You could always add these functions later if it turns out they are needed, but you can't remove them again once they are included in a release.

You can also use buffers as large arrays, which is useful to save memory. A 1024x1024 ds_grid uses 24*1024*1024 bytes = 24MB, but if you're only storing bytes this could be done with a 1MB buffer. A 3D particle system DLL could also use buffers to store the position of the particles, so it can be passed to a 3D graphics DLL directly without the overhead of rewriting the entire buffer every time something changes.
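
The "buffer as a large array" use could be sketched like this, with one byte per cell in row-major order. The ByteGrid name and its methods are hypothetical and only show the indexing; the sketch says nothing about how ds_grid stores its cells.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative byte grid stored in one flat block: cell (x, y) lives at y * width + x.
class ByteGrid {
public:
    ByteGrid(std::size_t width, std::size_t height)
        : width_(width), height_(height), cells_(width * height, 0) {}

    std::uint8_t Get(std::size_t x, std::size_t y) const {
        assert(x < width_ && y < height_);
        return cells_[y * width_ + x];
    }

    void Set(std::size_t x, std::size_t y, std::uint8_t value) {
        assert(x < width_ && y < height_);
        cells_[y * width_ + x] = value;
    }

    std::size_t SizeInBytes() const { return cells_.size(); }

private:
    std::size_t width_;
    std::size_t height_;
    std::vector<std::uint8_t> cells_;
};
```

A 1024x1024 grid built this way occupies 1024 * 1024 = 1,048,576 bytes of cell data, i.e. the 1MB figure mentioned above.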

I think it can also be useful for some file formats. Some file formats store 'pointers' to other parts of the file in the file itself, which is cumbersome if you can't 'go back' to set the pointers to parts of the file that have been written later.
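
The "go back and set the pointer" case might look like this in practice. The file layout below (an offset field, a length field, then the payload) is invented purely for the example, and values are stored in the machine's native byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Append a 32-bit value and return the offset at which it was written.
std::size_t AppendUint32(std::vector<char>& out, std::uint32_t value) {
    const std::size_t offset = out.size();
    const char* p = reinterpret_cast<const char*>(&value);
    out.insert(out.end(), p, p + sizeof(value));
    return offset;
}

// Overwrite a previously written 32-bit value (random access write).
void PatchUint32(std::vector<char>& out, std::size_t offset, std::uint32_t value) {
    assert(offset + sizeof(value) <= out.size());
    std::memcpy(out.data() + offset, &value, sizeof(value));
}

std::uint32_t LoadUint32(const std::vector<char>& in, std::size_t offset) {
    assert(offset + sizeof(std::uint32_t) <= in.size());
    std::uint32_t value = 0;
    std::memcpy(&value, in.data() + offset, sizeof(value));
    return value;
}

// Hypothetical layout: [payload offset][payload length][payload bytes].
std::vector<char> BuildFile(const std::string& payload) {
    std::vector<char> file;
    const std::size_t slot = AppendUint32(file, 0);  // placeholder, value not known yet
    AppendUint32(file, static_cast<std::uint32_t>(payload.size()));
    const std::size_t payload_offset = file.size();
    file.insert(file.end(), payload.begin(), payload.end());
    // Now the offset is known, so go back and fill in the placeholder.
    PatchUint32(file, slot, static_cast<std::uint32_t>(payload_offset));
    return file;
}
```

With sequential-only writing, the payload offset would have to be computed before anything is written, which gets awkward once there are many such fields.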

I think it would be great to have a simple 'memory block' data structure in GM, which gives you the same freedom as C++. I don't see why you'd want to limit it to sequential writing. Random access writing doesn't make the DLL significantly more complicated, slower or harder to use. You can still do sequential reading or writing, you just have to set the position to 0 if you want to start reading from the start again.

I've finished the help file, it's included in the ZIP:
http://gm.maartenbae...red_buffers.zip

Edited by Maarten Baert, 19 April 2011 - 03:45 PM.

  • 0

#18 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 20 April 2011 - 01:46 PM

You are moving very fast with the implementation, which is nice, but I hope you don't mind if I try out some ideas and change what you have done so far around a bit. Then you'll finally be able to criticise my stuff instead of me just nagging about yours :D. I hope to be constructive with this though, so don't take it as anything against you - I just want to make sure this is in the best possible shape before people start using it.

The array/file-like usage could be useful in some cases, it is just not what I had in mind originally.

What I want to do is try to find a minimal interface for buffers and separate the existing functions into an implementation of that interface (e.g. the functions for reading/writing blocks of data) and "helper" functions which only use that interface (e.g. reading/writing specific data types).
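
One way to picture that split is sketched below. This is only a guess at the shape, not a worked-out design: BufferCore, VectorBuffer and the helper names are all hypothetical, and the core is reduced to two block operations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Minimal core interface: block read and block write, nothing else.
struct BufferCore {
    void* self;
    void (*write_bytes)(void* self, const char* ptr, std::size_t bytes);
    // Returns the number of bytes actually read.
    std::size_t (*read_bytes)(void* self, char* ptr, std::size_t bytes);
};

// Helpers: typed access built purely on top of the core interface.
inline void WriteUint16(const BufferCore& core, std::uint16_t value) {
    core.write_bytes(core.self, reinterpret_cast<const char*>(&value), sizeof(value));
}

inline bool ReadUint16(const BufferCore& core, std::uint16_t* out) {
    assert(out != nullptr);
    char raw[sizeof(std::uint16_t)];
    if (core.read_bytes(core.self, raw, sizeof(raw)) != sizeof(raw)) return false;
    std::memcpy(out, raw, sizeof(raw));
    return true;
}

// One possible implementation of the core, backed by a std::vector.
struct VectorBuffer {
    std::vector<char> data;
    std::size_t read_pos = 0;
};

void VectorWrite(void* self, const char* ptr, std::size_t bytes) {
    VectorBuffer* b = static_cast<VectorBuffer*>(self);
    b->data.insert(b->data.end(), ptr, ptr + bytes);
}

std::size_t VectorRead(void* self, char* ptr, std::size_t bytes) {
    VectorBuffer* b = static_cast<VectorBuffer*>(self);
    const std::size_t available = b->data.size() - b->read_pos;
    const std::size_t n = bytes < available ? bytes : available;
    if (n > 0) std::memcpy(ptr, b->data.data() + b->read_pos, n);
    b->read_pos += n;
    return n;
}

BufferCore MakeCore(VectorBuffer* buffer) {
    return BufferCore{buffer, VectorWrite, VectorRead};
}
```

An extension with a more sophisticated internal buffer would only have to supply its own pair of block functions; the typed helpers would not change.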

The second point would be to make this all work without a dll, using a very small statically linked library instead. I have a plan for how that would work already, and it would get rid of the initialization problem entirely. Maybe it can even be done with just a .hpp file and no library at all.

And when that works, the GM buffer manipulation functions can be put into their own dll/extension that just uses the buffer sharing code like any other extension. This might seem a little bit pointless, but I'd like to try it because splitting things up into independent components is very often a good idea. If nothing else, it helps avoid the argument whether there should be file access functions included in the dll, because it's now in an exchangeable component, even if no alternative is ever written.

Edited by Medo42, 20 April 2011 - 01:47 PM.


#19 sabriath

sabriath

    12013

  • GMC Member
  • 3175 posts

Posted 20 April 2011 - 02:18 PM

Call me stupid...but why not just simply do this:

//in gm

var temp;

temp = "012345678";

external_call(get_pointer, someid, temp);
external_call(give_pointer, someotherid, temp);

I did a proof-of-concept that GM will actually pass strings byref to their data (not the variable itself, but the data it contains is retained). This allowed me to create hooks into classes by creating enough space for a string to hold an entire class and then calling a 'create' function to move it there. I then made it so that the class could hold pointer data to DLL allocated space, and since it's a pointer, GM wouldn't mess up the actual data for it.

You could use this same technique to pass a pointer reference between DLL extensions by creating a "get_pointer" function on one and a "give_pointer" on another. Since the DLL would allocate the space (and the OS would link it to the running application's handle), it doesn't change its location between extensions.

An example:

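#include <cstring>

gmexportd get_pointer(double id, char* pt)
{
  // pt points at the GM string's data (assuming GM passes it by reference,
  // as described above); the string must hold at least sizeof(buffer*) chars
  buffer* p = mybuffers[static_cast<int>(id)];
  std::memcpy(pt, &p, sizeof p);  // copy the buffer's address into the string
  return 0;
}

gmexportd give_pointer(double id, char* pt)
{
  // read the address back out of the string data
  buffer* p;
  std::memcpy(&p, pt, sizeof p);
  otherbuffers[static_cast<int>(id)] = p;
  return 0;
}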



However, I am not sure how this will help programmers by "modularizing" a standard way of communicating between extensions. It might be great for 2 extensions to combine minds and come up with something together, but trying to come up with a general standard?

#20 Maarten Baert

Maarten Baert

    GMC Member

  • GMC Member
  • 745 posts
  • Version:GM8.1

Posted 20 April 2011 - 04:00 PM

You are moving very fast with the implementation, which is nice, but I hope you don't mind if I try out some ideas and change what you have done so far around a bit. Then you'll finally be able to criticise my stuff instead of me just nagging about yours :D. I hope to be constructive with this though, so don't take it as anything against you - I just want to make sure this is in the best possible shape before people start using it.

Sure, no problem :).

The array/file-like usage could be useful in some cases, it is just not what I had in mind originally.

What I want to do is try to find a minimal interface for buffers and separate the existing functions into an implementation of that interface (e.g. the functions for reading/writing blocks of data) and "helper" functions which only use that interface (e.g. reading/writing specific data types).

I think that's the main difference between my idea and yours: You're trying to solve one problem: "DLLs need a way to transfer data to each other", but I'm trying to solve a second problem at the same time: "Many DLLs need buffers, it would be much simpler if all DLLs could use the same buffers instead of their own implementation".

The reason I think we have to solve both problems at once is that most DLLs won't be compatible, unless they were specifically designed to communicate with each other. If you create a platform to transfer raw data without any formatting, it will be used by only a few DLLs. But if you create a shared buffers system that also allows GML users to read or write data, all DLLs that need buffers can use it. And there are lots of DLLs that could use buffers:
- data structure DLLs can use it to serialize/unserialize data structures
- file IO DLLs can use it to save/load files
- socket DLLs can use it to send/receive data
- HTTP request DLLs can use it to store the data that was downloaded from the server
- cryptography DLLs can use it to encrypt/decrypt binary data or calculate hashes (MD5/SHA1/...)
- compression DLLs can use it to compress/decompress data
- ...
The possibilities are endless, and all those DLLs would instantly be compatible with each other - simply because they're using the same buffers. You could serialize ANY data structure, use ANY compression DLL to compress that data, then use ANY cryptography DLL to encrypt it, and finally use ANY network DLL to send the data to another computer. This is currently not possible without using temp files or converting the data to a hex string every time.

I agree with what you said about a minimal interface. Most DLLs won't need functions like buffer_write_int32, you can easily do that yourself in C++ if you have buffer_write_data, which takes any number of bytes. I will try to rewrite it to make the interface a bit simpler.
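
As a rough sketch of that split, assuming a raw interface with one write and one read function: the typed helpers below only call the raw functions and never touch the storage. `RawBuffer`, `write_int32` and `read_int32` are illustrative names, not the DLL's actual API.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical minimal interface: raw byte transfer only.
struct RawBuffer {
    std::vector<std::uint8_t> bytes;
    std::size_t read_pos = 0;

    // Append n bytes to the end.
    void write_data(const void* src, std::size_t n) {
        const std::uint8_t* p = static_cast<const std::uint8_t*>(src);
        bytes.insert(bytes.end(), p, p + n);
    }

    // Read n bytes from the read position; fails without reading if too few are left.
    bool read_data(void* dst, std::size_t n) {
        if (n > bytes.size() - read_pos) return false;
        if (n > 0) std::memcpy(dst, bytes.data() + read_pos, n);
        read_pos += n;
        return true;
    }
};

// Typed helpers built purely on the raw functions (native byte order).
inline void write_int32(RawBuffer& b, std::int32_t v) { b.write_data(&v, sizeof v); }
inline bool read_int32(RawBuffer& b, std::int32_t& v) { return b.read_data(&v, sizeof v); }
```

Because the helpers depend only on `write_data` and `read_data`, a different buffer implementation would only need to provide those two functions.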

The second point would be to make this all work without a dll, using a very small statically linked library instead. I have a plan for how that would work already, and it would get rid of the initialization problem entirely. Maybe it can even be done with just a .hpp file and no library at all.

I could be wrong, but I don't think that will work that easily. If DLLs have their own runtime libraries (e.g. because they use different compilers), they will also have their own heap. So you can't just allocate memory in DLL_A and resize the same buffer in DLL_B - unless you use functions like GlobalAlloc, I think (but I'm not sure). MSDN says GlobalAlloc is slower than the default malloc, I will have to test this to see if the difference is relevant (if it works in the first place).

By the way, statically linked libraries are not compatible with other compilers, so you would have to create a separate library for every compiler. It would be easier to simply add one .cpp file and one .h file IMHO.

I think we will need the helper functions anyway, so why not keep it simple and just add the functions to the helper functions DLL?

And when that works, the GM buffer manipulation functions can be put into their own dll/extension that just uses the buffer sharing code like any other extension. This might seem a little bit pointless, but I'd like to try it because splitting things up into independent components is very often a good idea. If nothing else, it helps avoid the argument whether there should be file access functions included in the dll, because it's now in an exchangeable component, even if no alternative is ever written.

Good point, but I think the buffers would be almost useless without the helper functions.

Call me stupid...but why not just simply do this:
[...]
I did a proof-of-concept that GM will actually pass strings byref to their data (not the variable itself, but the data it contains is retained). This allowed me to create hooks into classes by creating enough space for a string to hold an entire class and then calling a 'create' function to move it there. I then made it so that the class could hold pointer data to DLL allocated space, and since it's a pointer, GM wouldn't mess up the actual data for it.

I'd prefer not to rely on this, because it could change in the future (if YYG decides to create a new runner for GM9, for example).

However, I am not sure how this will help programmers by "modularizing" a standard way of communicating between extensions. It might be great for 2 extensions to combine minds and come up with something together, but trying to come up with a general standard?

See above :).

Edited by Maarten Baert, 20 April 2011 - 04:02 PM.


#21 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 20 April 2011 - 05:36 PM

I think that's the main difference between my idea and yours: You're trying to solve one problem: "DLLs need a way to transfer data to each other", but I'm trying to solve a second problem at the same time: "Many DLLs need buffers, it would be much simpler if all DLLs could use the same buffers instead of their own implementation".

And I agree - there should definitely be a default implementation that everyone can use. I just want to give people who really want to do it the option of defining their own implementation, which might be backed by an internal data structure. For example, in Faucet Networking you can supply a socket instead of a buffer in all of the read/write operations - writing will append data to the send buffer, and reading will read from the buffer of received data. This was added after actually using the extension for a bit, because it makes especially receiving small amounts of data at a time much less verbose, and I do not want to revert to the old behavior of always passing a buffer to send, and returning one with the received data. However, if we go with your plan of only one buffer implementation, I would either have to do exactly that, or keep my own write/read functions beside the ones of the default buffer dll (which would be confusing), or create new functions "socket_sendbuffer(socket)" and "socket_receivebuffer(socket)". The last option would actually be acceptable, but still a bit less elegant.

The reason I think we have to solve both problems at once is that most DLLs won't be compatible, unless they were specifically designed to communicate with each other. If you create a platform to transfer raw data without any formatting, it will be used by only a few DLLs. But if you create a shared buffers system that also allows GML users to read or write data, all DLLs that need buffers can use it.


So your point is one of gaining widespread acceptance - which is obviously important here. But as I said above, my point is to allow custom implementations, not to exclude the default one, so it should still be easy to just use the default implementation.

I could be wrong, but I don't think that will work that easily. If DLLs have their own runtime libraries (e.g. because they use different compilers), they will also have their own heap. So you can't just allocate memory in DLL_A and resize the same buffer in DLL_B


But it won't be necessary to do so, because the buffers will be managed by the dll which creates them. I'll try to come up with a proof of concept.

By the way, statically linked libraries are not compatible with other compilers, so you would have to create a separate library for every compiler. It would be easier to simply add one .cpp file and one .h file IMHO.


As far as I can make out, static libraries generated by gcc (for C code, not C++) can be linked by Visual Studio. I didn't try this though, and just adding the .cpp/.h files to your project is easy enough as an alternative.

Edited by Medo42, 20 April 2011 - 08:05 PM.


#22 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 20 April 2011 - 08:06 PM

Sorry, I can't concentrate well at the moment for some reason, so I won't manage to change your code to work dll-less today, and I'm busy tomorrow and over the weekend. The idea is this though: Every extension / dll has its own SharedBufferInterface instance. Buffer references passed over GM consist of a string-encoded (e.g. turned into a hex string) struct that looks like this:
struct BufferReference {
	SharedBufferInterface *owner;
	uint32_t handle;
};
To use a buffer reference that is handed in, you reconstruct the struct (heh) and delegate all function calls to the included owner pointer. That way, the buffers are always managed in the heap space of the extension where they were created.
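
A minimal sketch of the string encoding, assuming a fixed-width hex layout (16 digits for the owner address, 8 for the handle). The layout and the `encode_reference` / `decode_reference` names are assumptions made for this example.

```cpp
#include <cctype>
#include <cstdint>
#include <cstdio>
#include <string>

struct SharedBufferInterface;  // opaque here; only its address is encoded

struct BufferReference {
    SharedBufferInterface* owner;
    std::uint32_t handle;
};

// Encode as 24 hex digits: owner address first, then the handle.
inline std::string encode_reference(const BufferReference& ref) {
    char text[16 + 8 + 1];
    std::snprintf(text, sizeof text, "%016llx%08lx",
                  static_cast<unsigned long long>(reinterpret_cast<std::uintptr_t>(ref.owner)),
                  static_cast<unsigned long>(ref.handle));
    return text;
}

// Decode; returns false unless the text is exactly 24 hex digits.
inline bool decode_reference(const std::string& text, BufferReference& out) {
    if (text.size() != 24) return false;
    for (char c : text) {
        if (!std::isxdigit(static_cast<unsigned char>(c))) return false;
    }
    const unsigned long long owner = std::stoull(text.substr(0, 16), nullptr, 16);
    const unsigned long handle = std::stoul(text.substr(16, 8), nullptr, 16);
    out.owner = reinterpret_cast<SharedBufferInterface*>(static_cast<std::uintptr_t>(owner));
    out.handle = static_cast<std::uint32_t>(handle);
    return true;
}
```

The fields are encoded one by one rather than dumping the raw struct, so struct padding never ends up in the string.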

One other problem which I just noticed is that the buffer writing and reading in Faucet Networking needs to be aware of endianness, which can be set on a per-buffer basis. I don't see a way of replicating that functionality without again resorting to provide my own read/write functions, unless it is included in the generic implementation.

Edited by Medo42, 20 April 2011 - 08:23 PM.


#23 Maarten Baert

Maarten Baert

    GMC Member

  • GMC Member
  • 745 posts
  • Version:GM8.1

Posted 21 April 2011 - 12:43 PM

And I agree - there should definitely be a default implementation that everyone can use. I just want to give people who really want to do it the option of defining their own implementation, which might be backed by an internal data structure. For example, in Faucet Networking you can supply a socket instead of a buffer in all of the read/write operations - writing will append data to the send buffer, and reading will read from the buffer of received data. This was added after actually using the extension for a bit, because it makes especially receiving small amounts of data at a time much less verbose, and I do not want to revert to the old behavior of always passing a buffer to send, and returning one with the received data. However, if we go with your plan of only one buffer implementation, I would either have to do exactly that, or keep my own write/read functions beside the ones of the default buffer dll (which would be confusing), or create new functions "socket_sendbuffer(socket)" and "socket_receivebuffer(socket)". The last option would actually be acceptable, but still a bit less elegant.

I would either use the last option, or create functions like socket_read_float32 (instead of buffer_read_float32).

So your point is one of gaining widespread acceptance - which is obviously important here. But as I said above, my point is to allow custom implementations, not to exclude the default one, so it should still be easy to just use the default implementation.

Ah, I understand what you mean now. You want to be able to create buffers that aren't actually memory buffers (it could be files, sockets, cryptographic streams, whatever), but still behave like buffers to all other DLLs. Like polymorphism. That's a great idea, I really like it :). But obviously you don't want to use an interface with tons of functions (like buffer_write_int32 etc.) that aren't really needed, because those will just make it harder to write your own implementation.

As far as I can make out, static libraries generated by gcc (for C code, not C++) can be linked by Visual Studio. I didn't try this though, and just adding the .cpp/.h files to your project is easy enough as an alternative.

I thought it was impossible without some conversion program, but I could be wrong.

Sorry, I can't concentrate well at the moment for some reason, so I won't manage to change your code to work dll-less today, and I'm busy tomorrow and over the weekend. The idea is this though: Every extension / dll has its own SharedBufferInterface instance. Buffer references passed over GM consist of a string-encoded (e.g. turned into a hex string) struct that looks like this:

struct BufferReference {
	SharedBufferInterface *owner;
	uint32_t handle;
};
To use a buffer reference that is handed in, you reconstruct the struct (heh) and delegate all function calls to the included owner pointer. That way, the buffers are always managed in the heap space of the extension where they were created.


One other problem which I just noticed is that the buffer writing and reading in Faucet Networking needs to be aware of endianness, which can be set on a per-buffer basis. I don't see a way of replicating that functionality without again resorting to provide my own read/write functions, unless it is included in the generic implementation.

Good point, I hadn't thought about endianness. Why exactly do you need it? I thought all Windows DLLs were little-endian anyway. You could add a function 'GetEndianness' to the interface, but I don't really see why it is needed. Endianness is only a problem if the program that generates the data does not use the same format as the program that interprets the data. But if a program can interpret the data, I assume it already knows what the data is, so it should also know whether it's little-endian or big-endian. You will have to come up with a fixed format anyway if you want to exchange data, so the endianness should be part of the definition of that format.

I think it's a good idea to add an extra argument to functions like buffer_write_int32 and buffer_read_int32 to set the endianness, so users can read or write in the correct format. But I don't think we should set the endianness on a per-buffer basis, it's the responsibility of the reader and writer to use the correct format.
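
A sketch of what such an extra argument could look like in C++. The `Endian` enum and the function names are invented for this example; the bytes are produced by shifting, so the result does not depend on the byte order of the machine running the code.

```cpp
#include <cstdint>
#include <vector>

enum class Endian { Little, Big };

// Appends a 32-bit value one byte at a time in the requested byte order.
inline void write_uint32(std::vector<std::uint8_t>& out, std::uint32_t v, Endian e) {
    for (int i = 0; i < 4; ++i) {
        const int shift = (e == Endian::Little) ? 8 * i : 8 * (3 - i);
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFFu));
    }
}

// Reads 4 bytes starting at p and assembles them in the requested byte order.
inline std::uint32_t read_uint32(const std::uint8_t* p, Endian e) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) {
        const int shift = (e == Endian::Little) ? 8 * i : 8 * (3 - i);
        v |= static_cast<std::uint32_t>(p[i]) << shift;
    }
    return v;
}
```

Writing 0x01020304 as big-endian yields the bytes 01 02 03 04, and as little-endian 04 03 02 01.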


Now, about the interface. I'm trying to come up with a minimal set of functions we will need for buffers. But there are actually two kinds of buffers we could use:
- stream-like buffers, which can only read from the start and write to the end (like sockets), which I will call streams.
- memory-like buffers, which can read and write at any position (like my buffers, or files), which I will call buffers.
You could use the buffers as if they are streams (writing writes to the 'writing position' of the buffer, reading reads at the 'reading position'). So you could create a set of functions for reading and writing that work for streams AND buffers, and a second set of functions that only work for buffers.

Streams:
- read any type of data (the interface has just one read function, but the helper functions can add more)
- write any type of data (same here).
- a function that returns how many bytes of data are left to be read

Buffers:
- get read position
- get write position
- get length
- set read position
- set write position
- set length

The interface should allow anyone to create their own implementation of streams (e.g. sockets) OR buffers (e.g. files). However, we will still need the GML functions to manipulate the buffers and stream, and also the helper functions, so we will still need a buffer dll. Well, it's not strictly needed anymore, but buffers are rather useless if you can't manipulate them :).
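
The two function lists above could be sketched as a pair of C++ interfaces, with every buffer also being a stream. This is only an illustration under that assumption: the class and method names are made up, and a real cross-DLL interface would probably need a plain C function table rather than C++ virtual classes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Sequential access only: read from the front, write to the end.
class Stream {
public:
    virtual ~Stream() = default;
    virtual std::size_t read(void* dst, std::size_t n) = 0;  // returns bytes actually read
    virtual void write(const void* src, std::size_t n) = 0;
    virtual std::size_t bytes_left() const = 0;
};

// Adds random access on top of the stream operations.
class Buffer : public Stream {
public:
    virtual std::size_t read_pos() const = 0;
    virtual std::size_t write_pos() const = 0;
    virtual std::size_t length() const = 0;
    virtual void set_read_pos(std::size_t p) = 0;
    virtual void set_write_pos(std::size_t p) = 0;
    virtual void set_length(std::size_t n) = 0;
};

// One possible in-memory implementation of Buffer.
class MemoryBuffer : public Buffer {
public:
    std::size_t read(void* dst, std::size_t n) override {
        n = std::min(n, bytes_left());
        if (n > 0) std::memcpy(dst, data_.data() + rpos_, n);
        rpos_ += n;
        return n;
    }
    void write(const void* src, std::size_t n) override {
        if (n == 0) return;
        if (wpos_ + n > data_.size()) data_.resize(wpos_ + n);
        std::memcpy(data_.data() + wpos_, src, n);
        wpos_ += n;
    }
    std::size_t bytes_left() const override { return data_.size() - rpos_; }
    std::size_t read_pos() const override { return rpos_; }
    std::size_t write_pos() const override { return wpos_; }
    std::size_t length() const override { return data_.size(); }
    // Positions are clamped so they never point past the end of the data.
    void set_read_pos(std::size_t p) override { rpos_ = std::min(p, data_.size()); }
    void set_write_pos(std::size_t p) override { wpos_ = std::min(p, data_.size()); }
    void set_length(std::size_t n) override {
        data_.resize(n);
        rpos_ = std::min(rpos_, n);
        wpos_ = std::min(wpos_, n);
    }

private:
    std::vector<std::uint8_t> data_;
    std::size_t rpos_ = 0, wpos_ = 0;
};
```

Code that only takes a `Stream&` works with sockets and memory buffers alike, while code that needs seeking asks for a `Buffer&`.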

There's one thing that worries me though. GM users are used to ids, but now we're forcing them to use real pointers instead. That means they could easily make their game crash if they accidentally pass the wrong variable to a function like buffer_write_int32, or if they accidentally destroy the same buffer twice. I'm not sure that's a good idea. I think we should use the shared dll to keep a list of all buffers and streams, so we can simply pass ids to GM instead of pointers.

#24 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 28 April 2011 - 11:13 AM

Ok, some more discussing then before I go ahead with the implementation because you raise a good point once more. If we keep on like this I think we can work out most of the problems :)

Good point, I hadn't thought about endianness. Why exactly do you need it? I thought all Windows DLLs were little-endian anyway. You could add a function 'GetEndianness' to the interface, but I don't really see why it is needed. Endianness is only a problem if the program that generates the data does not use the same format as the program that interprets the data. But if a program can interpret the data, I assume it already knows what the data is, so it should also know whether it's little-endian or big-endian. You will have to come up with a fixed format anyway if you want to exchange data, so the endianness should be part of the definition of that format.

I think it's a good idea to add an extra argument to functions like buffer_write_int32 and buffer_read_int32 to set the endianness, so users can read or write in the correct format. But I don't think we should set the endianness on a per-buffer basis, it's the responsibility of the reader and writer to use the correct format.

Maybe I expressed that a bit ambiguously - Of course I know whether the data is little- or big-endian, but I don't necessarily have any influence on it. Talking from the standpoint of my networking extension again, many networking protocols are defined with big-endian byte order, while others (e.g. most things based on 39dll) are little-endian. So in order to communicate in any particular protocol you have to set the correct endianness in the buffer you are reading / writing. You could make that a parameter of the reading/writing functions, but since the data from/to any source / destination will usually use the same convention throughout, setting it as a parameter of the buffer saves a lot of redundant typing - and you can still switch it around for some fields if you really need to, which is more verbose than using a parameter but only necessary in very few cases. This applies equally to data formats in files.

Now, about the interface. I'm trying to come up with a minimal set of functions we will need for buffers. But there are actually two kinds of buffers we could use:
- stream-like buffers, which can only read from the start and write to the end (like sockets), which I will call streams.
- memory-like buffers, which can read and write at any position (like my buffers, or files), which I will call buffers.
You could use the buffers as if they are streams (writing writes to the 'writing position' of the buffer, reading reads at the 'reading position'). So you could create a set of functions for reading and writing that work for streams AND buffers, and a second set of functions that only work for buffers.

The buffers I currently use are actually a mix of the two, where you can write only to the end but read anywhere. That would not be possible for all types of streams though, and I want to avoid making things more complicated. I never actually used the option to set the read position in Gang Garrison 2, so it probably isn't a very usual case. It could be removed from the actual receive buffer of the socket to make the socket fit your definition of a stream - if someone does need it, he can copy the received data out to a separate buffer first.

There's one thing that worries me though. GM users are used to ids, but now we're forcing them to use real pointers instead. That means they could easily make their game crash if they accidentally pass the wrong variable to a function like buffer_write_int32, or if they accidentally destroy the same buffer twice. I'm not sure that's a good idea. I think we should use the shared dll to keep a list of all buffers and streams, so we can simply pass ids to GM instead of pointers.

The idea is that users should never look at or change handles, and only call the functions with some kind of handle that was passed out before... but of course, you cannot assume that, and it would be good if it didn't crash the entire game when this rule is disregarded. We could add a "signature" to the beginning of a valid pointer, some magic number that can be used to check that this is actually a buffer handle and not some random string. It would still be possible to craft a bad handle to crash the game, but it would be very unlikely to have it happen by accident. We could also add a magic number to the beginning of a buffers implementation in memory to make malicious crafting of handles more difficult, but that's probably overkill.
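
A simplified sketch of the signature check, assuming the handle is a string that starts with a fixed tag. The tag `SBUF1:` and the function names are arbitrary choices for this example, and the payload here is a plain number rather than an encoded pointer.

```cpp
#include <cstdint>
#include <string>

// Arbitrary signature chosen for this sketch.
const std::string kHandleMagic = "SBUF1:";

inline std::string make_handle(std::uint32_t id) {
    return kHandleMagic + std::to_string(id);
}

// Returns true and sets id only if the text is the signature followed by
// one to nine decimal digits; anything else is rejected.
inline bool parse_handle(const std::string& text, std::uint32_t& id) {
    if (text.compare(0, kHandleMagic.size(), kHandleMagic) != 0) return false;
    const std::string digits = text.substr(kHandleMagic.size());
    if (digits.empty() || digits.size() > 9) return false;
    std::uint32_t value = 0;
    for (char c : digits) {
        if (c < '0' || c > '9') return false;
        value = value * 10 + static_cast<std::uint32_t>(c - '0');
    }
    id = value;
    return true;
}
```

A random string or an unrelated number passed in by accident fails the check and can be turned into an error message instead of a crash.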

I have to admit that while the solution without a central dll is interesting and in a way quite elegant, it does create some practical problems. On the other hand it solves the problem of requiring the user to initialize all extensions with a pointer to the buffers implementation. If we can get rid of that somehow in the central dll solution and also provide a useful error message like "This extension requires the buffers dll / extension, please install it", I'll be happy with that one.

You mentioned that it might not be possible in the extension initialization, but if that is the case there are still convoluted ways to get the result we want. For example, this is what an extension's init code could look like:
if(variable_global_exists("__sharedBuffersInitialized")) {
    __myExtensionInit(shared_buffers_get_handle());
} else {
    if(!variable_global_exists("__sharedBuffersInitCallbacks")) {
        global.__sharedBuffersInitCallbacks = ds_list_create();
    }
    ds_list_add(global.__sharedBuffersInitCallbacks, __myExtensionInit);
}
Then the shared buffers dll / extension could be initialized like this:
global.__sharedBuffersInitialized = true;
if(variable_global_exists("__sharedBuffersInitCallbacks")) {
    var i;
    for(i=0; i<ds_list_size(global.__sharedBuffersInitCallbacks); i+=1) {
        script_execute(ds_list_find_value(global.__sharedBuffersInitCallbacks, i), shared_buffers_get_handle());
    }
    ds_list_destroy(global.__sharedBuffersInitCallbacks);
}
I'm not sure if extension GML scripts can be called that way. If not, one could work around that by using a string with the script name instead.

Hmm, maybe I should have checked first how extensions are initialized and if this is even necessary :)

#25 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 28 April 2011 - 09:27 PM

I set up a git repository at https://github.com/M...shared-buffers/ and added a first change: When you write an integer, the value you pass in is now rounded, and the conversion is now done by a template function.

#26 Maarten Baert

Maarten Baert

    GMC Member

  • GMC Member
  • 745 posts
  • Version:GM8.1

Posted 30 April 2011 - 09:26 PM

Maybe I expressed that a bit ambiguously - Of course I know whether the data is little- or big-endian, but I don't necessarily have any influence on it. Talking from the standpoint of my networking extension again, many networking protocols are defined with big-endian byte order, while others (e.g. most things based on 39dll) are little-endian. So in order to communicate in any particular protocol you have to set the correct endianness in the buffer you are reading / writing. You could make that a parameter of the reading/writing functions, but since the data from/to any source / destination will usually use the same convention throughout, setting it as a parameter of the buffer saves a lot of redundant typing - and you can still switch it around for some fields if you really need to, which is more verbose than using a parameter but only necessary in very few cases. This applies equally to data formats in files.

Yes, you're right - adding that argument every time is just annoying. But we can't really use the buffers to store the endianness, since the new buffer interface has only one write function: WriteData. It doesn't know the exact type, so it can't do the conversion properly (obviously you don't need the conversion for strings, and it's a lot more complicated if you try to write entire structs at once). But we can still make a global helper function to set the endianness, so the other helper functions can use this setting when reading or writing. I don't really like global state either, but it's easier than adding that extra argument every time.

The idea is that users should never look at or change handles, and only call the functions with some kind of handle that was passed out before... but of course, you cannot assume that, and it would be good if it didn't crash the entire game when this rule is disregarded. We could add a "signature" to the beginning of a valid pointer, some magic number that can be used to check that this is actually a buffer handle and not some random string. It would still be possible to craft a bad handle to crash the game, but it would be very unlikely to have it happen by accident. We could also add a magic number to the beginning of a buffers implementation in memory to make malicious crafting of handles more difficult, but that's probably overkill.

That doesn't really solve the destroy-twice problem: the handle is essentially valid, the buffer just happens to be gone. Additionally, I think users will easily get confused by the different types of buffers and streams, so they could accidentally try to destroy one type (e.g. a socket) using the function that was meant to destroy another type (e.g. a memory buffer). That would lead to very weird crashes. Since GM users aren't familiar with debuggers and such, they really have no way to track this type of bug down, so they would quickly give up.

I have to admit that while the solution without a central dll is interesting and in a way quite elegant, it does create some practical problems. On the other hand it solves the problem of requiring the user to initialize all extensions with a pointer to the buffers implementation. If we can get rid of that somehow in the central dll solution and also provide a useful error message like "This extension requires the buffers dll / extension, please install it", I'll be happy with that one.

We could solve the initialization problem simply by delaying the initialization until the first function that actually needs buffers tries to use them. That would be even better, as it allows us to create DLLs that are compatible with Shared Buffers, but don't require it. If you would try to call such a function without the DLL, the DLL could either show an error message or simply do nothing - that's up to the DLL developer.
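
A rough sketch of the delayed initialization, assuming the lookup of the shared DLL is wrapped in a resolver function. `SharedBuffersApi` and `LazyApi` are stand-ins invented here; on Windows the resolver might use GetModuleHandle and GetProcAddress, but that part is left out.

```cpp
#include <functional>
#include <utility>

// Stand-in for whatever function table the shared DLL would expose.
struct SharedBuffersApi { int version; };

// Resolves the API on first use instead of at extension load time.
class LazyApi {
public:
    explicit LazyApi(std::function<SharedBuffersApi*()> resolver)
        : resolver_(std::move(resolver)) {}

    // Returns the API, or nullptr if the shared DLL is not available.
    // The resolver runs at most once; the result is cached.
    SharedBuffersApi* get() {
        if (!tried_) {
            api_ = resolver_();
            tried_ = true;
        }
        return api_;
    }

private:
    std::function<SharedBuffersApi*()> resolver_;
    SharedBuffersApi* api_ = nullptr;
    bool tried_ = false;
};
```

A function that needs buffers would call `get()` first and, on `nullptr`, either show an error message or do nothing, as suggested above.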

You mentioned that it might not be possible in the extension initialization, but if that is the case there are still convoluted ways to get the result we want. For example, this is what an extension's init code could look like:

if(variable_global_exists("__sharedBuffersInitialized")) {
    __myExtensionInit(shared_buffers_get_handle());
} else {
    if(!variable_global_exists("__sharedBuffersInitCallbacks")) {
        global.__sharedBuffersInitCallbacks = ds_list_create();
    }
    ds_list_add(global.__sharedBuffersInitCallbacks, __myExtensionInit);
}
Then the shared buffers dll / extension could be initialized like this:
global.__sharedBuffersInitialized = true;
if(variable_global_exists("__sharedBuffersInitCallbacks")) {
    var i;
    for(i=0; i<ds_list_size(global.__sharedBuffersInitCallbacks); i+=1) {
        script_execute(ds_list_find_value(global.__sharedBuffersInitCallbacks, i), shared_buffers_get_handle());
    }
    ds_list_destroy(global.__sharedBuffersInitCallbacks);
}
I'm not sure if extension GML scripts can be called that way. If not, one could work around that by using a string with the script name instead.

Convoluted indeed :P. But if it works, it would be great. And if we can't call GEX functions like that, we could still use execute_string (finally, a good reason to use that function).

I set up a git repository at https://github.com/M...shared-buffers/ and added a first change: When you write an integer, the value you pass in is now rounded, and the conversion is now done by a template function.

Thanks, I'm downloading Git now. I didn't know about std::numeric_limits, that's really useful. Your cast function is very similar to my gm_cast<X> function I used in ExtremePhysics, I just rewrote it to use std::numeric_limits.
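
For readers who haven't seen this kind of cast, a minimal sketch using std::numeric_limits follows. It is not the code from the repository: the name `round_cast` is made up, and clamping out-of-range values (and mapping NaN to 0) is an assumption of this example.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Converts a GM real (double) to an integer type: rounds to the nearest
// integer (halves away from zero) and clamps to the range of the target type.
template <typename Int>
Int round_cast(double x) {
    if (std::isnan(x)) return 0;  // assumption: treat NaN as 0
    const double lo = static_cast<double>(std::numeric_limits<Int>::min());
    const double hi = static_cast<double>(std::numeric_limits<Int>::max());
    const double r = std::round(x);
    if (r <= lo) return std::numeric_limits<Int>::min();
    if (r >= hi) return std::numeric_limits<Int>::max();
    return static_cast<Int>(r);  // safe: r is strictly inside the range
}
```

The same template covers every integer width, so the helper functions for int8 through int64 don't each need their own range checks.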

During the last few days I've written another proof-of-concept:
http://gm.maartenbae...eam_buffers.zip
It uses separate streams and buffers now. I've included memory buffers in the main DLL for now, but I could move them to a separate DLL (I wasn't sure which was better). The new version allows anyone to create their own implementation of either streams or buffers (all buffers are also streams). I've rewritten the help file too.

The DLL uses a global id table, so you don't have to pass pointers around. You can use the same id for the buffer/stream and your own data structure. For example, if your DLL creates sockets, you can use the id that was generated for the buffer/stream for your socket, so you won't need a function like 'socket_get_stream_id' or similar - the id is the same.

There's also a function 'GetInterface' which you can use to get the pointer to the interface of a specific buffer/stream (not the main interface, just the interface of that type). This address isn't really useful, but you can use it to check whether some stream or buffer has the correct type. That way you don't have to keep your own table of ids: you can just use the shared buffer dll to convert the id into a pointer, and then check the interface address to make sure it's the correct type. You don't have to use it like that, but I thought it could be useful (I used this for the memory buffer implementation).
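
The id table with the interface check could look roughly like this. It is a sketch under assumptions: `Interface`, `Entry` and `IdTable` are invented names, and the real DLL's data layout is not shown in this thread.

```cpp
#include <cstdint>
#include <unordered_map>

// Stand-in for a per-type function table; only its address matters here.
struct Interface { const char* name; };

struct Entry {
    const Interface* iface;  // identifies the implementation type
    void* object;            // the implementation's own data
};

class IdTable {
public:
    // Registers an object and returns its new id (ids start at 1).
    std::uint32_t add(const Interface* iface, void* object) {
        const std::uint32_t id = next_id_++;
        entries_[id] = Entry{iface, object};
        return id;
    }

    // Returns false if the id is unknown, e.g. when destroying twice.
    bool remove(std::uint32_t id) { return entries_.erase(id) > 0; }

    // Returns the object only if the id exists and has the expected interface.
    void* find(std::uint32_t id, const Interface* expected) const {
        const auto it = entries_.find(id);
        if (it == entries_.end() || it->second.iface != expected) return nullptr;
        return it->second.object;
    }

private:
    std::uint32_t next_id_ = 1;
    std::unordered_map<std::uint32_t, Entry> entries_;
};
```

Passing a socket id to a memory buffer function then yields a null pointer that can be reported, and a second destroy of the same id simply fails.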

I haven't added endianness yet, and it still uses my old cast functions. I will change that tomorrow (it's 23:25 now where I live :)).

#27 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 20 May 2011 - 09:16 PM

Sorry for not being more active. This project is still on my mind, but at the moment I am tied up with work and writing a thesis. I will try to make some time for this soon though.

Let me just clear up one point from that last post. I hope to answer some of the others in code soon :)


The idea is that users should never look at or change handles, and only call the functions with some kind of handle that was passed out before... but of course, you cannot assume that, and it would be good if it didn't crash the entire game when this rule is disregarded. We could add a "signature" to the beginning of a valid pointer, some magic number that can be used to check that this is actually a buffer handle and not some random string. It would still be possible to craft a bad handle to crash the game, but it would be very unlikely to have it happen by accident. We could also add a magic number to the beginning of a buffers implementation in memory to make malicious crafting of handles more difficult, but that's probably overkill.

That doesn't really solve the destroy-twice problem: the handle is essentially valid, the buffer just happens to be gone. Additionally, I think users will easily get confused by the different types of buffers and streams, so they could accidentally try to destroy one type (e.g. a socket) using the function that was meant to destroy another type (e.g. a memory buffer). That would lead to very weird crashes. Since GM users aren't familiar with debuggers and such, they really have no way to track this type of bug down, so they would quickly give up.

Check the implementation detail that I gave again, specifically the struct for the buffer handle:
struct BufferReference {
        SharedBufferInterface *owner;
        uint32_t handle;
};
The pointer does not point to a buffer, but rather to an instance of the function pointer interface. We already share the exact same pointer between extensions in your implementation. The only difference is that there can now be several instances of this, each managing its own buffers. This way all of the problems you mention can be avoided.
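One way to read this is sketched below, under the assumption that each owner keeps its own handle table and exposes it through its instance of the function pointer interface. The reduced two-function interface, the membuf namespace and destroyBuffer are hypothetical names chosen for the example, not part of the posted design.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Reduced, hypothetical version of the interface: just enough to show dispatch by owner.
struct SharedBufferInterface {
    bool (*exists)(uint32_t handle);
    bool (*destroy)(uint32_t handle);
};

// The handle struct from the post: the owner's interface instance plus an owner-local number.
struct BufferReference {
    SharedBufferInterface *owner;
    uint32_t handle;
};

// One owner (for example a memory buffer extension) managing its own handles.
namespace membuf {
std::set<uint32_t> liveHandles;
uint32_t nextHandle = 1;

bool exists(uint32_t handle) { return liveHandles.count(handle) != 0; }

// Returns false instead of crashing when the handle is unknown or already destroyed.
bool destroy(uint32_t handle) { return liveHandles.erase(handle) != 0; }

SharedBufferInterface iface = {exists, destroy};

BufferReference create() {
    uint32_t handle = nextHandle++;
    liveHandles.insert(handle);
    return BufferReference{&iface, handle};
}
}  // namespace membuf

// Generic destroy: the call always reaches the owner that created the handle,
// so a buffer can never be destroyed by the wrong type's function.
bool destroyBuffer(const BufferReference &ref) {
    assert(ref.owner != nullptr);
    return ref.owner->destroy(ref.handle);
}
```

In this sketch a second destroy on the same reference simply returns false, because the owner no longer knows the handle.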

#28 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 21 May 2011 - 11:57 PM

Alright, I went over your code and started picking it apart a bit. Maybe I'm just taking your stuff apart and putting it back together in a slightly different way, but if so it at least helps me understand things a bit better :).

I'd still like to have a well-defined "core" whose API is the only place where we actually have to care about compiler compatibility, since this is where all the extensions will call into. Now, I'm still not quite clear on what you can and can't expect to be compatible, but let me start with the proposed API (developed mainly from your interface structs) and then discuss it further:
#include <stdint.h>

typedef struct {
	uint32_t (__stdcall *read)(void*, uint8_t*, uint32_t);
	void (__stdcall *write)(void*, const uint8_t*, uint32_t);
	uint32_t (__stdcall *getBytesLeft)(void*);
	uint8_t (__stdcall *destroy)(void*);
} SharedStreamInterface;

typedef struct {
	uint32_t (__stdcall *read)(void*, uint8_t*, uint32_t);
	void (__stdcall *write)(void*, const uint8_t*, uint32_t);
	uint32_t (__stdcall *getBytesLeft)(void*);
	uint8_t (__stdcall *destroy)(void*);

	uint32_t (__stdcall *getreadpos)(void*);
	uint32_t (__stdcall *getwritepos)(void*);
	uint32_t (__stdcall *getlength)(void*);
	void (__stdcall *setreadpos)(void*, uint32_t);
	void (__stdcall *setwritepos)(void*, uint32_t);
	void (__stdcall *setlength)(void*, uint32_t);
} SharedBufferInterface;

extern "C" {
	// Functions applicable for both buffers and streams
	__stdcall uint32_t readData(uint32_t id, uint8_t* data, uint32_t size);
	__stdcall void writeData(uint32_t id, const uint8_t* data, uint32_t size);
	__stdcall uint32_t getBytesLeft(uint32_t id);
	__stdcall void destroyStreamOrBuffer(uint32_t id);
	__stdcall uint8_t streamOrBufferExists(uint32_t id);

	// Functions only applicable for buffers
	__stdcall uint32_t getReadPos(uint32_t id);
	__stdcall uint32_t getWritePos(uint32_t id);
	__stdcall uint32_t getLength(uint32_t id);
	__stdcall void setReadPos(uint32_t id, uint32_t pos);
	__stdcall void setWritePos(uint32_t id, uint32_t pos);
	__stdcall void setLength(uint32_t id, uint32_t length);
	__stdcall uint8_t bufferExists(uint32_t id);

	// Adding new buffers/Streams
	__stdcall uint32_t addStream(SharedStreamInterface *interface, void *stream);
	__stdcall uint32_t addBuffer(SharedBufferInterface *interface, void *buffer);
}

Most things should be straightforward. Technicalities: I'm not sure stdcall is actually more compatible, so that could possibly be removed. The code is relying on mingw and MSVC to generate the same memory layout for the struct definitions, but since they only contain fields of the same type (function pointers) it will probably work. After all, you relied on the same thing too :)

One difference from your model is that this interface allows a uniform "destroy" function that can be used on any type of stream or buffer. On destruction, the library will first call the destroy()-function provided in the interface to allow the buffer to clean up. If that function returns true, the buffer is removed from the list. Specific implementations can choose to prevent being destroyed by returning false instead, e.g. if the buffer is actually tied to some internal data structure of a different resource. In order to *actually* destroy those buffers, an implementation would have to make sure that true is returned when it really wants to clean up - a bit complicated, but I think it's manageable.
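The destroy rule described above could look roughly like the following sketch. It is a simplified stand-in, not the promised implementation: the interface is reduced to the destroy callback, the SHB_CALL macro and the SocketStream type are hypothetical, and the id table is a plain std::map.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical macro so the sketch also compiles where __stdcall is unavailable.
#if defined(_WIN32)
#define SHB_CALL __stdcall
#else
#define SHB_CALL
#endif

// Reduced to the one callback this example needs.
struct StreamInterface {
    uint8_t (SHB_CALL *destroy)(void *);
};

struct Entry {
    const StreamInterface *iface;
    void *impl;
};

static std::map<uint32_t, Entry> g_entries;
static uint32_t g_nextId = 1;

uint32_t addStream(const StreamInterface *iface, void *impl) {
    assert(iface != nullptr && iface->destroy != nullptr);
    uint32_t id = g_nextId++;
    g_entries[id] = Entry{iface, impl};
    return id;
}

uint8_t streamOrBufferExists(uint32_t id) {
    return g_entries.count(id) ? 1 : 0;
}

void destroyStreamOrBuffer(uint32_t id) {
    auto it = g_entries.find(id);
    if (it == g_entries.end()) return;  // unknown or already destroyed: nothing to do
    // Let the implementation clean up; only forget the id if it agrees to be destroyed.
    if (it->second.iface->destroy(it->second.impl)) {
        g_entries.erase(it);
    }
}

// Example implementation: a stream tied to a socket refuses destruction while the socket is open.
struct SocketStream {
    bool socketClosed;
};

uint8_t SHB_CALL socketStreamDestroy(void *impl) {
    return static_cast<SocketStream *>(impl)->socketClosed ? 1 : 0;
}

static const StreamInterface kSocketStreamInterface = {socketStreamDestroy};
```

Here the socket extension flips socketClosed before calling destroyStreamOrBuffer itself, which is the "make sure that true is returned when it really wants to clean up" step.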

By offering the functions directly instead of returning pointers to buffer structs it is impossible for a user of this api to store a pointer that could become invalid at any time.

In my opinion, this api is the part which has to be most carefully designed, because any change to it would break compatibility with basically everything. We can put some classes on top of it to make it nicer to use, and we should definitely have your memory buffer implementation to go with it, but that is all outside this "core" part and can be written in normal C++ and compiled together with the extension that uses it.

The function names could use some tweaking though. Please criticise every aspect of it that you can, now is the best time for it. I can add an implementation tomorrow.

#29 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 23 May 2011 - 11:23 PM

The promised implementation is actually done, but untested and in need of some extra polish, so I will put it up a bit later. I also wrote a small helper wrapper for the addStream/addBuffer functions in the api posted above, which allows you to write buffer and stream implementations as plain classes, as long as they are derived from one of the interface classes AbstractStream or AbstractBuffer.
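A wrapper of this kind might look like the sketch below. Only the names AbstractStream and SharedStreamInterface come from the posts; the thunk functions, the SHB_CALL macro and the QueueStream example class are assumptions made for illustration, and the real helper may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical macro so the sketch also compiles where __stdcall is unavailable.
#if defined(_WIN32)
#define SHB_CALL __stdcall
#else
#define SHB_CALL
#endif

// The four callbacks of the stream interface from the API proposal.
struct SharedStreamInterface {
    uint32_t (SHB_CALL *read)(void *, uint8_t *, uint32_t);
    void (SHB_CALL *write)(void *, const uint8_t *, uint32_t);
    uint32_t (SHB_CALL *getBytesLeft)(void *);
    uint8_t (SHB_CALL *destroy)(void *);
};

// Plain C++ base class that implementations derive from (assumed shape).
class AbstractStream {
public:
    virtual ~AbstractStream() {}
    virtual uint32_t read(uint8_t *data, uint32_t size) = 0;
    virtual void write(const uint8_t *data, uint32_t size) = 0;
    virtual uint32_t getBytesLeft() = 0;
};

// Thunks: the void* handed to the core is always an AbstractStream*.
inline AbstractStream *asStream(void *p) {
    assert(p != nullptr);
    return static_cast<AbstractStream *>(p);
}
uint32_t SHB_CALL readThunk(void *p, uint8_t *d, uint32_t n) { return asStream(p)->read(d, n); }
void SHB_CALL writeThunk(void *p, const uint8_t *d, uint32_t n) { asStream(p)->write(d, n); }
uint32_t SHB_CALL bytesLeftThunk(void *p) { return asStream(p)->getBytesLeft(); }
uint8_t SHB_CALL destroyThunk(void *p) { delete asStream(p); return 1; }

// One shared interface instance serves every class derived from AbstractStream.
static const SharedStreamInterface kClassStreamInterface = {
    readThunk, writeThunk, bytesLeftThunk, destroyThunk};

// Example implementation: a simple in-memory FIFO stream.
class QueueStream : public AbstractStream {
public:
    uint32_t read(uint8_t *data, uint32_t size) override {
        uint32_t left = getBytesLeft();
        uint32_t n = size < left ? size : left;
        if (n > 0) std::memcpy(data, bytes_.data() + readPos_, n);
        readPos_ += n;
        return n;
    }
    void write(const uint8_t *data, uint32_t size) override {
        bytes_.insert(bytes_.end(), data, data + size);
    }
    uint32_t getBytesLeft() override {
        return static_cast<uint32_t>(bytes_.size() - readPos_);
    }

private:
    std::vector<uint8_t> bytes_;
    std::size_t readPos_ = 0;
};
```

Registering a stream would then amount to passing &kClassStreamInterface together with the AbstractStream pointer to addStream. The pointer has to be converted to AbstractStream* before it becomes a void*, otherwise the cast in asStream is not guaranteed to be valid.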

One thing I am still not quite content with is that we have to rely on the interface pointers that are passed in to remain valid. We could store a copy of the interface struct instead of the pointer to get around this, but that would waste a bit of space. The pointer validity problem takes an especially difficult turn when considering what happens during shutdown, because there may be situations where one DLL has already been unloaded while the other is being destroyed. It might not be as difficult as I make it out at the moment though.

A question about your buffer implementation: The grow/shrink strategy seems sound, but do you have a specific reason for rolling your own instead of relying on std::vector?

#30 Medo42

Medo42

    GMC Member

  • GMC Member
  • 306 posts

Posted 25 May 2011 - 11:30 PM

You can download what I have so far over here: https://github.com/M...-shared-buffers

The destroy function is a bit of a sore spot in this whole thing, since it really shouldn't directly belong to the buffer implementation - it should be implemented by the entity that owns the buffer instead. I might pull it out of the shb_StreamInterface and put it as a parameter into the shareBuffer/shareStream functions instead, or alternatively just remove it from the AbstractStream class and have shareStream take it as an extra parameter.

I added a new function writeOther() to the stream interface that takes the id of another buffer and a size argument, and then reads the requested amount of data from the other stream/buffer directly into its own memory. Transferring data between memory buffers would require an extra copy without a function like this, but I am not totally happy with how this worked out either and would appreciate a better solution.

I'm quite convinced by now that the best approach to the architecture question is to provide the shared buffers library implementation as an extension and have it loaded exactly once. Distributing a pointer to the interface to the other extensions / dlls doesn't really require the complicated solution we discussed earlier; it can be accomplished by having the Shared Buffers extension provide a function like getSharedBufferInterfacePointer that is called by everyone else :) (Reading back I notice that this was your original idea, d'oh! Oh well, I actually tested that it works this way now.)

Please don't be annoyed that I rewrote so much of what you had already done. This project is a bit of a learning exercise for me (I'm not very experienced with C++), and in many cases I had to get my hands dirty myself in order to understand all the issues involved.

Edit: I replaced the buffer implementation (which was adapted from my networking lib) with a modified version of yours now. Also, I removed writeOther again in favor of a function to directly access the memory of a buffer. This makes working with buffers a bit more efficient for tasks like creating a checksum over a whole buffer or appending a buffer to another one, and I think it is fairly safe to assume that most buffer implementations would use a single block of memory. However, we might make this an optional operation and let buffer implementations return NULL there if they don't support this - in that case, the caller would have to use the other access mechanism and create a copy first.
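The optional direct-access idea from the edit could be used by a caller roughly as in this sketch. The reduced BufferInterface, its getData/copyOut members and the two example buffer types are hypothetical; the only thing taken from the post is the convention that a buffer may return NULL when it cannot expose a single block of memory.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reduced, hypothetical buffer interface for this example only.
struct BufferInterface {
    uint32_t (*getLength)(void *);
    void (*copyOut)(void *, uint8_t *dst, uint32_t size);  // copy-based access
    const uint8_t *(*getData)(void *);                      // NULL if not one contiguous block
};

// Checksum over a whole buffer: use direct access when offered, otherwise copy first.
uint32_t checksum(const BufferInterface &iface, void *buffer) {
    assert(iface.getLength && iface.copyOut && iface.getData);
    uint32_t length = iface.getLength(buffer);
    const uint8_t *data = iface.getData(buffer);
    std::vector<uint8_t> copy;
    if (data == nullptr && length > 0) {
        copy.resize(length);
        iface.copyOut(buffer, copy.data(), length);
        data = copy.data();
    }
    uint32_t sum = 0;
    for (uint32_t i = 0; i < length; ++i) sum += data[i];
    return sum;
}

// Implementation 1: a single block of memory, so direct access is supported.
struct ContiguousBuffer { std::vector<uint8_t> bytes; };
uint32_t contiguousLength(void *p) {
    return static_cast<uint32_t>(static_cast<ContiguousBuffer *>(p)->bytes.size());
}
void contiguousCopyOut(void *p, uint8_t *dst, uint32_t size) {
    const std::vector<uint8_t> &bytes = static_cast<ContiguousBuffer *>(p)->bytes;
    for (uint32_t i = 0; i < size && i < bytes.size(); ++i) dst[i] = bytes[i];
}
const uint8_t *contiguousData(void *p) {
    return static_cast<ContiguousBuffer *>(p)->bytes.data();
}
static const BufferInterface kContiguousInterface = {
    contiguousLength, contiguousCopyOut, contiguousData};

// Implementation 2: data split over two chunks, so direct access is refused.
struct ChunkedBuffer { std::vector<uint8_t> first, second; };
uint32_t chunkedLength(void *p) {
    ChunkedBuffer *b = static_cast<ChunkedBuffer *>(p);
    return static_cast<uint32_t>(b->first.size() + b->second.size());
}
void chunkedCopyOut(void *p, uint8_t *dst, uint32_t size) {
    ChunkedBuffer *b = static_cast<ChunkedBuffer *>(p);
    const std::vector<uint8_t> *chunks[2] = {&b->first, &b->second};
    uint32_t copied = 0;
    for (const std::vector<uint8_t> *chunk : chunks) {
        for (uint8_t byte : *chunk) {
            if (copied == size) return;
            dst[copied++] = byte;
        }
    }
}
const uint8_t *chunkedData(void *) { return nullptr; }
static const BufferInterface kChunkedInterface = {
    chunkedLength, chunkedCopyOut, chunkedData};
```

The caller pays for the extra copy only when the implementation declines direct access, which matches the trade-off described in the edit.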

Edited by Medo42, 26 May 2011 - 11:06 AM.




