Structs and the GC
Tagged: Struct
This topic contains 17 replies, has 7 voices, and was last updated by sicilica 2 years, 8 months ago.
July 28, 2016 at 5:28 pm #2498
So, with structs (and I guess also with the ability to get references / manually pass around pointers), theoretically you could completely avoid invoking the GC if you didn’t use classes, because:
- Any structs you create inside of a local scope would be on the stack rather than the heap (I assume)
- Any structs that are passed by value will be on the stack
- When you need pointers, or for any global data, you could create global arrays of structs (not struct pointers) in memory at init, then assign to those structs by value if they need to be replaced, or pass pointers to them into functions when they need to be used. That essentially gives you one large malloc() that you control instead of handing it to the GC (you would, of course, need to write an allocator and would probably want to optimize how you track which slots are in use or free, maybe with bitmasks etc.)
 
What other heap allocations through the GC would be unavoidable? I would assume that any construction of strings (if you concat stuff or whatever) would go on the heap? Are there other “gotchas” you’d have to think about?
If you do it right, you should also be able to improve your cache hits by getting some control over locality this way. Does the overhead of trying to manually control memory in Monkey 2 make sense, or would you be better off either not worrying about the GC and just hoping it’s “fast enough”, or switching to something like C if you actually need that level of performance?
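For illustration, the pool idea in this post can be sketched in C++, used here only as an analogy because its value types and raw pointers behave much like the structs and pointers being discussed. `Particle`, `ParticlePool` and the slot count are made-up names, and this is a minimal sketch under those assumptions, not a description of how Monkey 2 allocates anything:

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical value type; stands in for a Monkey 2 Struct.
struct Particle {
    float x = 0, y = 0;
};

// Fixed-capacity pool: all storage is one contiguous array created up front,
// and a bitmask records which slots are in use.
template <std::size_t N>
class ParticlePool {
public:
    // Returns a pointer to a free slot, or nullptr when the pool is full.
    Particle* acquire() {
        for (std::size_t i = 0; i < N; ++i) {
            if (!used_[i]) {
                used_[i] = true;
                slots_[i] = Particle{};   // assign by value; no heap allocation
                return &slots_[i];
            }
        }
        return nullptr;
    }

    // Marks the slot that p points to as free again.
    void release(Particle* p) {
        assert(p >= slots_ && p < slots_ + N);
        used_[static_cast<std::size_t>(p - slots_)] = false;
    }

    std::size_t in_use() const { return used_.count(); }

private:
    Particle slots_[N];
    std::bitset<N> used_;
};
```

The linear scan for a free bit is the simplest possible policy; a free list or a "first free index" hint would be the obvious next step if acquiring slots ever showed up in a profile.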
July 28, 2016 at 6:30 pm #2501
It would be cool if someone could explain the need for structs.
When do you need them, and are they a lot faster than a class? With my background in C#, Python, Monkey1, JavaScript and Node.js I have never needed them.
http://monkey2.monkey-x.com/language-reference/
To declare a struct:
    Struct Identifier
        ...Struct members...
    End
A struct can contain consts, globals, fields, methods, functions and other user defined types.
Structs are similar to classes, but differ in several important ways:
- A struct is a ‘value type’, whereas a class is a ‘reference type’. This means that when you assign a struct to a variable, pass a struct to a function or return a struct from a function, the entire struct is copied in the process. (What does this mean?)
- Structs are statically typed, whereas classes are dynamically typed.
- Struct methods cannot be virtual.
- A struct cannot extend anything.
 
July 28, 2016 at 7:41 pm #2502
A struct is a ‘value type’, whereas a class is a ‘reference type’. This means that when you assign a struct to a variable, pass a struct to a function or return a struct from a function, the entire struct is copied in the process. (What does this mean?)
 
I will assume here that Mark is saying that the struct data type in MX2 is passed by value. That is expensive in time and memory when you pass a large data object to or from a function or method, as the whole data object is copied onto the stack.
A data object that’s passed by reference, on the other hand, only needs its address in memory to be copied onto the stack, which is a lot faster and more memory efficient.
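The difference between the two calling styles can be shown with a small C++ analogy. `Point`, `move_copy` and `move_in_place` are hypothetical names chosen for this sketch, and nothing here is Monkey 2 syntax:

```cpp
#include <cassert>

// Hypothetical small value type, comparable to a Monkey 2 Struct.
struct Point {
    float x, y;
};

// By value: the callee receives its own copy; the caller's Point is untouched.
void move_copy(Point p, float dx) {
    p.x += dx;   // modifies only the local copy
}

// By pointer: only the address is copied; the callee edits the caller's Point.
void move_in_place(Point* p, float dx) {
    assert(p != nullptr);
    p->x += dx;
}
```

Calling `move_copy(p, 5)` leaves `p` as it was, while `move_in_place(&p, 5)` changes it. For a two-float struct the copy is about as cheap as copying the pointer, so the cost argument only starts to matter for large structs.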
July 28, 2016 at 7:58 pm #2503
That’s exactly what the difference between passing a value and a reference is. @gcmartijn, you rarely need structs in the languages you mentioned, since they are all pretty high level (C# does have a struct keyword, but most code there uses classes). A struct is like if you duct-tape a bunch of variables together: if I pass a struct into a function, for example, it’s the same as if I was taking a parameter for each variable in the struct and passing all of them at the same time. All of the languages you mention follow the “every object is a pointer” mindset, so if you pass an object into a function, you’re not passing its data; instead, you’re passing an integer that represents where in memory the object is.
Anyway, whether you pass values or pointers into a function isn’t really the question I’m getting at here – of course you would never want to copy large structs around, and if you don’t know exactly what you’re doing you might not want to use structs at all. My question is about memory allocation – if I can’t be confident that I have a lot of control over the allocator, then it doesn’t make any sense to mess with what I’m talking about at all.
July 28, 2016 at 9:35 pm #2504
Are there other “gotchas” you’d have to think about?
Don’t think so…
The garbage collector gets involved whenever you call ‘new class’ or ‘new blah[]’. Also, if you ‘Slice’ an array or somehow create a new array, eg: with String.Split().
Strings aren’t GC’d but still need to be malloced/freed.
Would be cool if someone could explain the need for structs. When do you need them, and are they a lot faster than a class?
They’re really just another tool and it’s up to you when to use them!
I like to use them for small-ish data structures that are frequently duplicated, eg: things like ‘Vec3f’ that might be used in complex-ish mathematical expressions such as v=v*scale+offset.
If Vec3f were a class, such expressions would involve considerable memory management overhead. Not so with structs where new values just go on the stack and are pretty much free to create. The downside is the struct contents need to be copied, but this is likely to be true no matter how you write Vec3f as each ‘operation’ usually needs to modify all members.
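As a rough C++ analogy of the `v=v*scale+offset` case, the sketch below defines a value-type `Vec3f` whose operators return new values. The type layout and the `transform` helper are assumptions made for this example, not the actual Monkey 2 `Vec3f`:

```cpp
// Hypothetical Vec3f value type; each operator returns a new value rather than
// allocating an object on the heap.
struct Vec3f {
    float x, y, z;
};

constexpr Vec3f operator*(Vec3f v, float s) { return {v.x * s, v.y * s, v.z * s}; }
constexpr Vec3f operator+(Vec3f a, Vec3f b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }

// v = v*scale + offset: the intermediate results are plain temporaries, which
// the compiler is free to keep on the stack or in registers.
constexpr Vec3f transform(Vec3f v, float scale, Vec3f offset) {
    return v * scale + offset;
}
```

Every intermediate result is copied, but no allocator or collector is involved at any point, which is the trade-off described above.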
re: the docs – is this any better?
A struct is a ‘value type’, whereas a class is a ‘reference type’. This means that when you assign a struct to a variable, pass a struct to a function or return a struct from a function, an entirely new copy of the struct is created. This can be expensive if the struct is large as it involves copying every field from the source struct to the copy. The up-side is that allocating a copy does not involve garbage collection as all copies are created on the stack.
July 28, 2016 at 10:13 pm #2506
I think it removes another onion skin, but you’re still not down to the core.
What’s missing is a to-the-point introduction to the garbage collector, the stack, call by reference/value and memory management, and the influence these things have on performance in m2, so that you get an understanding of when you want to use which option, plus examples.
Btw, I like using them for small structs I calculate with, like mem6f.
July 29, 2016 at 2:45 pm #2522
I don’t see it yet, haha. I know the point about reference and non-reference.
So a struct doesn’t cost memory because it is reset after a loop, and it only uses memory while I am using it inside a loop?
While a Class Field is in use from the beginning till the end of the program.
Look at this non-working example.
Normally I want to keep a position X and Y and update it every loop.
Is it better/faster to keep it in a struct?
    Class MojoTest Extends Window

        Field posX:Float ' it costs memory now

        Method OnRender( canvas:Canvas ) Override

            App.RequestRender()

            ' posX is still using memory
            ' lot of code here

            Local blaX:Float ' decl. is using memory
            Local inst:=New TheStruct ' (decl.) only at this point a struct is using memory

            inst.posX=MOUSEX()
            blaX=MOUSEX() ' only at this point a Local is using memory (same as struct)
            posX=MOUSEX() ' reference to the memory and update the value
        End

        Struct TheStruct
            Field posX:Float
        End
    End
Can I say a Struct acts just like a Local, but you can store Fields inside it?
It only uses memory when you declare it, and it is reset after a loop. If you want to use a variable in multiple classes then maybe you can make a global struct.
And maybe the struct will use memory if a function is using the global struct, but mostly not.
July 29, 2016 at 3:36 pm #2524
I think the stack is a key thing here. If you didn’t have structs, then to store vectors or matrices in a usable way you’d have to use a class, whose instances are objects managed by the garbage collector. So if you have a game entity that has vectors to store coordinates, velocity, direction and so on, that ends up being a lot more work for the GC. But with structs they’re stored by value (on the stack for locals, inline in the entity for fields), so when the game entity is removed the GC has less clean-up work to do.
Yes you could store coordinates as individual x,y float fields in your class and do all your vector math “by hand” as it were, and not see any real benefit to using structs but you make your life a lot easier when you can just do Position += Direction * Speed
Put it this way, I converted my collision code from Monkey1 which stored vectors using a class. When I changed them to structs instead I saw a significant performance boost.
July 29, 2016 at 4:01 pm #2525
It has less to do with how much memory you need to use and how often you need to allocate it than it has to do with managing the heap. Any time you “new up” a class in a language with managed pointers, all you really get is a pointer. Even if I store that class in a Local, which would be on the stack, the stack only contains the pointer – the actual data is somewhere in the heap. Even if you know how many “things” you’ll need and make, say, an array of 20 instances of some class – your array, wherever it’s stored, is an array of 20 pointers. The 20 instances you actually create will all be in the heap, and since they are allocated individually, nothing guarantees that they end up next to each other in memory.
You want to control your allocations because contiguous memory will result in a lot fewer cache misses, and because polluting memory like that leads to fragmentation as objects get created and destroyed essentially ad hoc. For software that doesn’t need to be highly performant you don’t care, but for a game, you have to manage huge amounts of memory, you have a lot of calculations to make many many times per second, and you need all of this to happen as smoothly as possible so your framerate doesn’t have significant dips and spikes.
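The "array of 20 pointers" versus "array of 20 values" contrast can be made concrete with a C++ analogy. `Entity`, `make_boxed` and the two `sum_x_*` helpers are hypothetical names, and `std::unique_ptr` merely stands in for a managed class reference:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Entity { float x, y; };   // hypothetical value type

// Value layout: the elements themselves are stored back to back, so the loop
// walks one contiguous block of memory.
float sum_x_values(const std::vector<Entity>& v) {
    float total = 0.0f;
    for (const Entity& e : v) total += e.x;
    return total;
}

// Reference layout: the vector holds only pointers; every Entity is its own
// heap allocation at an address chosen by the allocator.
std::vector<std::unique_ptr<Entity>> make_boxed(std::size_t n, float x) {
    std::vector<std::unique_ptr<Entity>> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(std::unique_ptr<Entity>(new Entity{x, 0.0f}));
    }
    return out;
}

float sum_x_boxed(const std::vector<std::unique_ptr<Entity>>& v) {
    float total = 0.0f;
    for (const auto& e : v) {
        assert(e != nullptr);   // a reference slot can be null; a value slot cannot
        total += e->x;          // one extra pointer hop per element
    }
    return total;
}
```

Both loops compute the same sum; the difference is purely in where the data lives and how many separate allocations have to be created, tracked and eventually freed.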
@mark – Let me try to think of a couple of the specific questions I have, I’m being super vague.
Is there a difference between declaring an array these two ways?
    Global myArray:MyStruct[20]

    Global myArray:MyStruct[] = New MyStruct[20] ' or with := ...

    Global myArray:MyStruct[]
    ' do stuff, get in a method...
    myArray = New MyStruct[20]
When you say strings need to be malloc’d / free’d – how or when does that happen? Is the GC still the one that tracks their usage?
And on the GC, I presume that it works by spidering memory to see what malloc’d objects are still accessible, similar to the way I think Java does? The only other algorithm I know of is to count references, which I don’t think we’re doing? Does it rely on reflection to know how to follow pointers at runtime?
In addition to giving us structs, M2 gave us the ability to convert a value into a pointer, like getting a reference in C++ (I think we use varptr or some weird keyword, I don’t know off the top of my head…). As in the above example, let’s assume I declare an array of struct values in some global or local scope. I then pass a pointer to one of those structs into another function (because in this case, I didn’t make it an array of values because I wanted to pass by value, what I wanted was to control the allocator – I want the subroutine to be able to write to it still). What happens with the GC here? Now all of a sudden there’s a pointer on the stack in my subroutine – will the GC do anything to try to manage it? What if the reference I passed was from a Local, on the stack, rather than something I had allocated in the heap? In C++, that pointer would be an address in the stack, and nothing would stop me from shooting myself in the foot if the subroutine ends up storing it somewhere and trying to use it after the caller’s stack frame goes out of scope. And again, what is the GC going to think / do about this pointer – especially if I do something nasty like store it?
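The lifetime hazard described here can be sketched in C++. Since C++ has no garbage collector, this says nothing about how Monkey 2's GC treats such pointers; it only illustrates which addresses stay valid. `Body`, `g_bodies`, `body_at`, `integrate` and `make_body` are names invented for this example:

```cpp
#include <cassert>
#include <cstddef>

struct Body { float x, vx; };   // hypothetical value type

// Long-lived storage: a global array of values, comparable to a Global
// array of structs.
static Body g_bodies[20];

// Safe: the returned address points into storage that outlives every caller.
Body* body_at(std::size_t i) {
    assert(i < 20);
    return &g_bodies[i];
}

// The subroutine writes through the pointer, so the array element changes.
void integrate(Body* b, float dt) {
    b->x += b->vx * dt;
}

// Safe alternative to handing out the address of a local: return by value.
// Returning `&local` here instead would leave the caller with a pointer into
// a stack frame that no longer exists.
Body make_body(float x, float vx) {
    Body local{x, vx};
    return local;
}
```

A pointer into the global array can be stored and reused freely, because the array never goes away. A pointer to a local is only usable while the function that owns the local is still running.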
July 29, 2016 at 4:09 pm #2526
Sorry I’m being kinda nit-picky about this. I guess the thing for me is, this is the first time I’ve ever seen any form of unmanaged memory present in a language that also does managed pointers and automatic garbage collection, and it’s probably the first for a lot of people. With a language like C, I know exactly how all of that’s working; with a language like Java, I know that I have no ability to influence things so it’s okay if the inner workings are a magical black box; but in a situation like this, I’m excited to try to use these tools that could allow huge optimization, but it’s hard/scary to do so without understanding exactly how all of the memory management is performed under the hood.
July 29, 2016 at 8:43 pm #2532
I try to visualize it and learn from mx2 code.
And I found this text about an array of structs:
struct
An array of values v encoded by a struct (value type) looks like this in memory:
vvvv
class
An array of values v encoded by a class (reference type) looks like this:
pppp
..v..v...v.v..
where p are the this pointers, or references, which point to the actual values v on the heap. The dots indicate other objects that may be interspersed on the heap. In the case of reference types you need to reference v via the corresponding p; in the case of value types you can get the value directly via its offset in the array.
July 29, 2016 at 8:47 pm #2533
I don’t know how they work under the hood but I think of structs as primitive types.
When you pass ints and floats around their value is copied rather than a pointer to the value being shared.
This is true for structs so I have been able to adopt them happily knowing my managed object count is reduced.
August 2, 2016 at 2:52 am #2673
How structs work under the hood is precisely what I’m asking here. I know very well what a struct is.
So this code, which works in Monkey-X, isn’t valid in M2:
    Class Whatever
        Field arr:Float[4]
    End
I’m guessing that that was just an alias for “:= New Float[4]” anyway. In any case, for structs to make any sense, they need to be able to actually ‘own’ or ‘contain’ their data – for example, a struct for a vector or matrix needs to contain some array of floats. If the struct just contains a pointer, and the array ITSELF is going through the GC anyway, then the struct serves absolutely no purpose and provides no optimization – and Arrays are always Objects, from what I can tell.
Edit:
Actually, I probably found a bug that I should report. Here was the actual struct I was playing with; it throws a runtime error when you init an instance of it, on “data[0] = val” because data.Length is 0 at the time. Maybe it’s trying to initialize the properties and fields, but “data” hasn’t been assigned the pointer to the new array yet?
    Struct Vector3
        Field data:= New Float[3]

        Property x:Float()
            Return data[0]
        Setter(val:Float)
            data[0] = val
        End

        Property y:Float()
            Return data[1]
        Setter(val:Float)
            data[1] = val
        End

        Property z:Float()
            Return data[2]
        Setter(val:Float)
            data[2] = val
        End
    End
August 2, 2016 at 4:33 am #2675
I don’t know how they work under the hood but I think of structs as primitive types.
This is probably the best way to think of them, and they do in fact work very much like primitive types under the hood, eg:
    Struct MyInt
        Field value:Int
    End

    Local t1:Int
    Local t2:MyInt
Here, t1 and t2 will both take 4-ish bytes of stack memory (the size of the ‘int’ struct might end up being aligned) and both will ‘disappear’ when the statement block ends (so using Varptr with either is dangerous!).
Both are also ‘copied’ when assigned to variables or passed to or returned from functions.
Where things get perhaps a little confusing is with…
    Local t3:=New MyInt
Since MyInt is a struct not a class, the ‘new MyInt’ bit doesn’t actually allocate ‘heap’ memory – it creates a new temporary MyInt (on the stack) that is then copied to t3. Which is actually quite similar to what this does…
    Local t4:=5+10
This also creates a temporary int to store the result of 5+10 in (ie: 5+10 is a ‘new Int’), which is then copied to t4.
But really, apart from this wrinkle there’s nothing really that magical about structs – it’s pointers that are the tricky ones!
(Note also that when I say ‘on the stack’, this really just means logically on the stack. Compilers generally try to store as much as possible in cpu registers (which are really another form of stack storage) and this applies to structs as much as ints etc. So code like “Local t3:=New MyInt” may well reduce down to a single move instruction to a cpu register).
Structs and primitives also act the same way when it comes to arrays:
    Local a1:=New Int[100]
    Local a2:=New MyInt[100]
Both of these will allocate about 400 bytes of heap memory (all arrays are stored on the heap regardless of whether the elements are primitives, structs or classes).
And in both cases, Varptr a[i+1]-Varptr a[i] will be 4, ie: the values are stored consecutively in memory.
So structs and primitive types are actually very very similar concepts.
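The same size and stride relationship can be checked in C++ as an analogy. The `MyInt` below mirrors the struct in the post, `stride_of` is a hypothetical helper, and the concrete numbers (4 bytes per element, about 400 per array) assume a platform with a 4-byte int:

```cpp
#include <cassert>
#include <cstddef>

struct MyInt { int value; };   // mirrors the MyInt struct in the post

// A single-field struct adds no header or hidden pointer on typical compilers.
static_assert(sizeof(MyInt) == sizeof(int), "MyInt should be as small as int");

// Number of bytes between the first two elements of an array of T.
template <typename T>
std::size_t stride_of(const T* a, std::size_t n) {
    assert(n >= 2);
    return static_cast<std::size_t>(
        reinterpret_cast<const char*>(&a[1]) - reinterpret_cast<const char*>(&a[0]));
}
```

For both `int a1[100]` and `MyInt a2[100]`, `stride_of` returns `sizeof(int)`, which is the C++ counterpart of the `Varptr a[i+1]-Varptr a[i]` observation above.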
As for…
    Struct Vector3
        Field data:= New Float[3]
    End
This should in fact be causing a compile time error – something along the lines of ‘struct field initializers must be constant’ (another related story…) – if not, there’s a bug or you’re not using the latest version of mx2cc.
August 2, 2016 at 9:37 pm #2686
So there won’t be any way to allocate even fixed-size arrays “inside” a struct? That’s too bad, but it makes sense that arrays need to always be pointers. Guess I’ll have to continue indexing into databuffers and large float arrays.
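The "index into one large float array" workaround might look like the following C++ analogy. `Vec3Buffer` and its accessors are invented for this sketch, and the x/y/z interleaving is just one possible layout:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical store: n three-component vectors packed into one float array
// as x0 y0 z0 x1 y1 z1 ...
class Vec3Buffer {
public:
    explicit Vec3Buffer(std::size_t count) : data_(count * 3, 0.0f) {}

    std::size_t size() const { return data_.size() / 3; }

    float& x(std::size_t i) { return at(i, 0); }
    float& y(std::size_t i) { return at(i, 1); }
    float& z(std::size_t i) { return at(i, 2); }

private:
    // Element i, component c lives at flat index i*3 + c.
    float& at(std::size_t i, std::size_t c) {
        assert(i < size() && c < 3);
        return data_[i * 3 + c];
    }

    std::vector<float> data_;
};
```

The whole buffer is a single allocation, so there is only one array for the memory manager to deal with no matter how many vectors it holds.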
It sounds like my intuition for everything else with struct pointers was right, then, but can you clarify what you said about how strings are handled? You said strings weren’t GC’d – but certainly they are, since they would always be on the heap, no?