Jump to content


i30817's Content

There have been 71 items by i30817 (Search limited from 23-May 23)


By content type

See this member's


Sort by                Order  

#539817 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 17 April 2012 - 10:21 AM in Mega Mod Help

From what i saw in the java sources, they try to keep the load factor at 75% for a good tradeoff of wasted space - speed.
How did you calculate 99%? I don't see any reason (if you had extreme bad luck) why the table couldn't fill up completely.

Lol; learned about a new hashing method (2008!)
http://en.wikipedia....pscotch_hashing



#539812 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 17 April 2012 - 10:02 AM in Mega Mod Help

I see what you mean, i'd forgotten about what resize would really do (rehash the whole table positions) instead of just copy into doubled array.

What was the reason for the persistent slowdowns with the 100% resize? I get it's the linear search of a degenerated hashtable - was it searching the whole table?

It might be "testing" a value that is not on the table, that then is NOT put on it.
So it's a script "peculiarity", exposing a weakness on the find method?

BTW, your "double return", non-initialized nArraySize, and unused sizeBefore look a little... buggy.



#539805 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 17 April 2012 - 07:48 AM in Mega Mod Help

I don't really like the resizing algorithm.

I mean think of what happens when it needs a new slot:
it will grow to twice the size (10/10+10/10) and search for 1/10 of the new size again - 20% of the old size  - which means it might need to grow 2 or 3 times until it finds a free slot, doubling memory 3 times. And if the hashcode starts to line up values, i don't want to think what might happen to performance.

Oh well, if you can't use Boot, you can't.



#539709 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 15 April 2012 - 09:28 PM in Mega Mod Help

Why the hell is setsize public wtf? Is that the cause of the slowdown? Multiple calls of setsize per turn?



#539707 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 15 April 2012 - 08:55 PM in Mega Mod Help

Still not clear on why you can't just use Boost::Map directly or something - it's bound to be optimized to death.
(i don't know if Boost::Map exists, i'm just a lowly java programmer).

Edit:
toupper - BGScript variables are case-insensitive?

Ow, i really dislike datastructures that change the data.

Also it's not a char array?

for(int i = 0; (i < 32) && var.name.GetBuf()[i] != 0; i++)


this is legal C++?

*shoots head*

edit: byte array. duh.

BTW, a rate of occupancy of 10% is... really bad - you don't want to slow down the gets, but you don't want to waste memory either - not to mention the ultra expensive copying, but i suppose it doesn't matter for the low amount of variables needed.
edit2: finally read it correctly, not a rate of occupancy, but only the buckets you'll see in one go before resizing. Eh, still think that a map from a well used library will work much better. Especially if the array is anything approaching full and space is added to the end... multiple resizes?

Can't you just wrap a map variable?



#539669 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 15 April 2012 - 10:29 AM in Mega Mod Help

I think Ascension64 is the man to ask about memory allocation constraints in TobEx. I'm pretty sure i read something about that on the TobEx archive too.



#539666 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 15 April 2012 - 10:05 AM in Mega Mod Help

I'm currently tinkering with overriding their hashmap functionality. It has some hidden mechanism to determine if the hashtable needs to be resized. For example, ::Resize() is not called if Add() returns 0, but it is called under some other circumstances I do not know about. I cannot implement any hashing algorithm until I can say the engine that my map needs to be resized.

Resize is a private, implementation detail of Maps. It normally happens when you need to add a new mapping that doesn't exist yet on the map and it's rate of occupancy is beyond what the designer thought was wise. As it is private, i'd be very surprised if it was called outside the class.

And even if it was, i'd just NOP it and let other; implicit on Add resize of the implementation you snipped in work.


In the theoretical analysis, this can swept under the rug by saying that after a certain number of steps, declare a failure and re-hash everything with new hash functions.

What's the problem? When the table becomes too tight, just rehash it. It will anyway be rehashed when it's fully occupied, why not make it a little earlier?


Because it's not the same hash function. To make this work flawlessly you have to have a way to generate hashfunctions for strings (and there is no guarantee that the generated function doesn't lead to a cuckoo cycle either, so it's another cycle too). The cuckoo hashing amortizes the linear probing of Get() into the Add().
It's like trying to get a perfect hashing function at runtime.

I'd really not use cuckoo hashing for this, unless you find a very pro library that doesn't gloss over this problem and assures you you'll never lose a mapping because of this "failure mode".
I know i suggested it, but it was a "brainfart" due to not remembering it's required low occupancy for best inserts (50% with 2 hash funcions; 75% with 3 hash functions) and it's braindead "failure mode". It's not surprising it's used in caches... caches can ignore the failure mode!



#539659 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 15 April 2012 - 09:40 AM in Mega Mod Help

Because to keep the O(1) property of Gets(), you need to rebuild the table with 2 different hash functions when you find a cycle on the cuckoo bumping.

http://mybiasedcoin....ce-part_15.html

  • The failure mode of cuckoo hashing is unpleasant. One thing I glossed over last time is that there's some (small) chance that cuckoo hashing gets caught in a loop; in moving things around to try to find an empty space, it can get stuck in a cycle, moving the same things back and forth repeatedly without ever being able to squeeze in a new element. In the theoretical analysis, this can swept under the rug by saying that after a certain number of steps, declare a failure and re-hash everything with new hash functions. In practical settings, like a router, this is often not a feasible option.




#539652 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 15 April 2012 - 08:14 AM in Mega Mod Help

Cuckoo hashing is complex to implement. Unless you are a wizard at datastructures, i don't advice it. Instead, if you really want to try it, try using a already working and tested implementation; or at least copy it.

Here's a java example with commentary on the thread:
http://www.java-gami...23102/view.html
Notice that this implementation fails if the hashcode is the same!

the tricky part appears to be the "chose another hash function part" when the table needs to be rebuilt - just for that reason i'd try to continue putting in the Boost map instead, since the problem appears to be the deficient resize, which that should "solve". and it's a implementation that has no corner cases like this.

Any "professional" map implementation will have a resize to avoid the O(n) degeneration. If the Infinity Engine doesn't have it, it's probably because it's hand rolled (why?)



#539597 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 14 April 2012 - 11:04 AM in Mega Mod Help

My approach to see if the cause is the hashtable array being replaced would be to find that Growth function and to insert a breakpoint at that function and check out if it's being invoked before the "stutter" and not being invoked previously to the stutter.

I'm incompetent at both programming and debugging, much less reversing though. My last reversing "accomplishment" was taking ToEE; noting it uses CPython, finding the source of CPython; finding the function in that source that is responsible for jumping from the python dll to the C bindings, making a note of that function name and debugging ToEE with a breakpoint there so i could find the location in the executable of any C-function invoked by the game console, then using that to clarify some "mysterious" function arguments by seeing what the game provides as arguments normally.



#539590 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 14 April 2012 - 09:02 AM in Mega Mod Help

Oh, i see what you did with the lists Ehhh, that's less efficient supposedly.

BTW, have you heard about cuckoo hashing?
http://en.wikipedia..../Cuckoo_hashing

It's cool if you read much more than you write (like in BGScript i think).



#539584 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 14 April 2012 - 08:22 AM in Mega Mod Help

Hashtables - if the domain is not limited, as these strings aren't - can't have constant sizes - they are growable datastructures, and that happens when you introduce a new value beyond a certain limit (they also normally need empty buckets, so the limit - normally - depends on the rate of empty buckets)

So the modify function always has a grow subfunction, sometimes others too, like the read function.
Maybe the extra space is being discarded or growing wrong somehow - imagine that they stupidly increased it by +1.

But that wouldn't happen constantly if the program is only "reading" from the table, as Globals does. Unless the check for new growth is more than the actual amount grown (unbelievably stupid mistake though)



#539440 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 12 April 2012 - 07:58 PM in Mega Mod Help

i30817
c'mon. it starts to sound like magic :/

http://catb.org/jarg...agic-story.html

Maybe it's the area files inside the save that prevent you loading it on the base game?



#539437 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 12 April 2012 - 07:47 PM in Mega Mod Help

It wouldn't explain the stutter, but it would explain the +23 thing. You even have a case (all 'a', which is 62 in ascii, in contrast with the digits you used in the others, which map to the same number) that stutters with 23! So i think the hashcode is failing somehow.

If it's a overflow into negative numbers ("somehow") affecting the stutter, you should be able to test it by doubling again to see if you can get it positive again.

if there's a case where 24 chars stutter and 48-49 don't...



#539434 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 12 April 2012 - 07:30 PM in Mega Mod Help

Mmmm. What would happen if the hashcode overflowed into negative numbers?

-X % in C++ doesn't launch a exception or anything does it? Or result on a negative number?

ISO/IEC 14882:2003 : Programming languages -- C++. 5.6.4: ISO, IEC. 2003. "the binary % operator yields the remainder from the division of the first expression by the second. .... If both operands are nonnegative then the remainder is nonnegative; if not, the sign of the remainder is implementation-defined".

herp.



#539430 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 12 April 2012 - 07:10 PM in Mega Mod Help

I think you'll have trouble to do it in vanilla without stressing the globals (i really really hope it's a map and not a list) map.

Edit: then again, i suppose you can just use the same savegames in vanilla right? The map is full on unused values, should do no harm, as long a mods didn't change default bioware variables to "new" values. And even so, it should do no harm, because Asc64 won't play the game with the save, but try to debug the pauses.



#539427 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 12 April 2012 - 07:03 PM in Mega Mod Help

after 23 characters all strings hashcodes > 5000, so they get all assigned to the last slot, and the hasmap algorithm degenerates into a linear search

Well, I suppose no one truncates hashfunction's value to determine the number of buckets. Usually you do something like this:
int bucketNum = hash("some string") % buckectsCount;

I know. That's why it would be stupid, though i can't think of another reason for the "magic 23".

Because (in a "classical" hashmap) then it has to check all contiguous occupied slots around the hashcode slot for the string, to see if the mapping exists - to avoid hashcode collisions.

Unfortunately, irrelevant. Setting the value to anything does not affect stuttering.

That new post was a observation that the pause would be "normal" if you were to check a string that doesn't exist in a huge fully occupied map without free slots.
Though it should have then happened to ALL non-existing strings, which makes no sense.

Maybe it does? And there aren't so many "non-existing strings on the map" anymore? That one on baldur.bcs could be one of the few that is on a global script and has no value in the globals yet, so it's constantly checking the whole globals map.
edit: but no, you tried with other values < 23 without a mapping and had no stutter right? And besides, hashmaps are normally coded to never let themselves get fully occupied for this very reason.


Stumped.
:WTF:



#539425 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 12 April 2012 - 06:58 PM in Mega Mod Help

Hey, wait a minute. The string doesn't exist in the globals yet??

Because (in a "classical" hashmap) then it has to check all contiguous occupied slots around the hashcode slot for the string, to see if the mapping exists - to avoid hashcode collisions.

So the half a second pause is it checking near all the map (or all the map, if it's badly programmed), for a non-existant map on the mid-game globals map full of strings.

Strange that the size of the string has anything to do with it in that case though. It's a mystery.



#539424 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 12 April 2012 - 06:52 PM in Mega Mod Help

Don't you'll wish IE was open source?
:twisted:

BTW, i edited the post above - i think the last idea is much more likely.



#539421 [SOLVED]Time to deal with the stuttering

Posted by i30817 on 12 April 2012 - 06:30 PM in Mega Mod Help

It might be that the buffer that it uses to copy the string into to check is pre-allocated to a certain size, and that string needs a relocation. Then after the copy and comparison, the memory is freed and a new array with the default (inadequate) size is allocated.

No idea why it would only start manifesting in the middle of the game; unless there is something on baldur.bsc that stopped it from happening before.


The other alternative is that there is a memory leak in the process above, but i'd imagine that would eventually cause a crash from out of memory, not stuttering.

Other possibilities:
The hash function of the underlying hashmap (if any) is defective as soon as it exceeds a certain length - very unlikely.

There's a funky cache that gets invalidated.

You have enough string mappings that a combination of a "bad" hash based on the number of characters, and size of the dictionary/hashmap cause a lots of strings to map to a boundary slot. Say the hash is based on a sum of the chars, the hashmap has a current size 5000 (enough for all strings yet, the problem is not that it hasn't enough slots, but that the hashcode is not zooming on the right slot), after 23 characters all strings hashcodes > 5000, so they get all assigned to the last slot, and the hasmap algorithm degenerates into a linear search + string equality testing - i think this one is very likely.
The cause of this is that the hashcode is being used incorrectly, either truncating the result to fit on the array without modular arithmetic, or that the hascode is giving many strings the same slot.



#539277 TobEx Wish list

Posted by i30817 on 11 April 2012 - 12:37 PM in ToBEx

No