Jump to content


Photo

[SOLVED]Time to deal with the stuttering


  • Please log in to reply
135 replies to this topic

#61 Ascension64

Ascension64
  • Modder
  • 5983 posts

Posted 14 April 2012 - 03:06 PM

Btw does CVariableMap have a vtbl pointer?

No.

You can compile with CRT 9.0 (VC++ 2008) or CRT 1.0 (VC++ 2010). Obviously you need Detours 3.0 and MFC 4.2.

--------------
Retired Modder
Note: I do not respond to profile comments/personal messages in regards to troubleshooting my modifications. Please post on the public forums instead.

Baldur's Gate Trilogy-WeiDU and Mods
Throne of Bhaal Extender (TobEx)

Contributions: (NWN2) A Deathstalker (voice acting) - (IWD2) IWD2 NPC Project (soundset editing) - (Misc) SHS PC Soundsets (voice acting)
Legacy: (BG/Tutu/BGT) Beregost Crash Fixer 1.9 (18 Jul 10) - (BG2) Enable conversations with charmed/dominated creatures (18 Jul 10) - (BG2) Experience Corrections (18 Jul 10) - (Misc) Platform Conversion Utility RC2 (13 Feb 10)


#62 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 14 April 2012 - 03:13 PM

Obviously you need Detours 3.0 and MFC 4.2.

This. Old MFC was causing me so much trouble, of course, not CRT. You did send it to me, but before I have properly tested it, you added "Spell Reflection Fix"/"Spell Reflection Animation Fix" to TobEx yourself : D Btw those fixes work like a charm, I'm happy that I somewhat helped you in that investigation!

I will try again and tell you the results.

Edited by Suslik, 14 April 2012 - 03:17 PM.


#63 Ascension64

Ascension64
  • Modder
  • 5983 posts

Posted 14 April 2012 - 04:42 PM

Just grab the latest build off github.

--------------
Retired Modder
Note: I do not respond to profile comments/personal messages in regards to troubleshooting my modifications. Please post on the public forums instead.

Baldur's Gate Trilogy-WeiDU and Mods
Throne of Bhaal Extender (TobEx)

Contributions: (NWN2) A Deathstalker (voice acting) - (IWD2) IWD2 NPC Project (soundset editing) - (Misc) SHS PC Soundsets (voice acting)
Legacy: (BG/Tutu/BGT) Beregost Crash Fixer 1.9 (18 Jul 10) - (BG2) Enable conversations with charmed/dominated creatures (18 Jul 10) - (BG2) Experience Corrections (18 Jul 10) - (Misc) Platform Conversion Utility RC2 (13 Feb 10)


#64 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 14 April 2012 - 05:29 PM

Just a quick report on current TobEx src:

Solution file is MSVS2010 but default project toolset(project->properties) is set to v90. Bug? Feature? Does not compile in VS2010 until set to v100

Error 1 error C2440: 'return' : cannot convert from 'const void *' to 'void *' \src\lib\cptrlistex.cpp 54 (a few of those)

Error above was since v20 at least, as I recall. Why u no fix it? D: I "fixed" it via commenting those functions in cptrlistex.cpp which return void *. Since they are never called, it works : D

Ok, except for that and some bugginess with a messed detours version(which also requires detoured), I have built tobex and it works. Yeah, you will call me a vagrant, but I have just replaced v23's TobEx.dll with a new one and it still works : D I cannot install the thing properly due to generalized_biffing and >1000 mods and I dunno how to do it manually, but just replacing TobEx.dll works.

Anyway, I'm going to do some code injection here, yarrr!

#65 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 14 April 2012 - 07:11 PM

Update. I have created my own DETOUR_CVariableMap and I have overridden all calls of the default CVariableMap to mine. It works, I can even log these activities and the whole thing does not crash, yay.

Now I need to actually override the logic. And I have a few problems with understanding of the interface:

1) BOOL CVariableMap::Add(CVariable& var)
What string key is this var added with? Does its "name" field count as string key? If yes, it means that name length is limited to 32 characters?

2) What do these do? Are they even public?
void CVariableMap::MarshalGlobal(arg1, arg2)        .text 0064AB6E
void CVariableMap::MarshalLocal(cre)                        .text 0064AE43
And what happens if I do not override them, but I do override information stored in those 8 bytes of CVariableMap? I think bad things will hapen, but I'm not sure.

I mean in case I'm overriding the memory of CVariableMap(I use 8 bytes stored in it in my own way), it's imperative to override all methods that will use this class's memory, but can I be sure that these are all necessary methods and there are not any more I do not know about?

Edited by Suslik, 14 April 2012 - 07:29 PM.


#66 Ascension64

Ascension64
  • Modder
  • 5983 posts

Posted 14 April 2012 - 07:48 PM

Just a quick report on current TobEx src:

Solution file is MSVS2010 but default project toolset(project->properties) is set to v90. Bug? Feature? Does not compile in VS2010 until set to v100

Error 1 error C2440: 'return' : cannot convert from 'const void *' to 'void *' \src\lib\cptrlistex.cpp 54 (a few of those)

Error above was since v20 at least, as I recall. Why u no fix it? D: I "fixed" it via commenting those functions in cptrlistex.cpp which return void *. Since they are never called, it works : D

Ok, except for that and some bugginess with a messed detours version(which also requires detoured), I have built tobex and it works. Yeah, you will call me a vagrant, but I have just replaced v23's TobEx.dll with a new one and it still works : D I cannot install the thing properly due to generalized_biffing and >1000 mods and I dunno how to do it manually, but just replacing TobEx.dll works.

Anyway, I'm going to do some code injection here, yarrr!

I'm not sure if your project settings are same as mine. I don't have the issues you mention. Also compiling under CRT 9.0 is required to work on Win2000 and WinXP SP1 or less. People have complained about using the CRT 10.0. I still haven't gotten confirmation that it actually works, though.

Aren't you jumping the gun a little? You haven't fully diagnosed the cause of the stutters yet? One really easy way is to track down the calls to the CVariableMap constructor and increase the initial size of the initial hash table.

What string key is this var added with? Does its "name" field count as string key? If yes, it means that name length is limited to 32 characters?

Yes, and yes.

What do these do? Are they even public?

I would have though you knew about Unmarshal and Marshal, although probably not in binary to binary sense. The variables need to be saved in the saved game, so global variables marshalled into BALDUR.GAM and .ARE structures, while local variables marshalled into .CRE structures.

I mean in case I'm overriding the memory of CVariableMap(I use 8 bytes stored in it in my own way), it's imperative to override all methods that will use this class's memory, but can I be sure that these are all necessary methods and there are not any more I do not know about?

No, you can never be sure. Test.

--------------
Retired Modder
Note: I do not respond to profile comments/personal messages in regards to troubleshooting my modifications. Please post on the public forums instead.

Baldur's Gate Trilogy-WeiDU and Mods
Throne of Bhaal Extender (TobEx)

Contributions: (NWN2) A Deathstalker (voice acting) - (IWD2) IWD2 NPC Project (soundset editing) - (Misc) SHS PC Soundsets (voice acting)
Legacy: (BG/Tutu/BGT) Beregost Crash Fixer 1.9 (18 Jul 10) - (BG2) Enable conversations with charmed/dominated creatures (18 Jul 10) - (BG2) Experience Corrections (18 Jul 10) - (Misc) Platform Conversion Utility RC2 (13 Feb 10)


#67 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 15 April 2012 - 05:41 AM

would have though you knew about Unmarshal and Marshal, although probably not in binary to binary sense.

I do know about marshalling, but I have no idea how to marshal those variables into .gam/.are/.cre files. And why is there no unmarshal? Hell, why is hashmap responsible for such operations and not some external mechanism? But anyway, I hope the game will not crash if will test it without saving. I will test.

Aren't you jumping the gun a little? You haven't fully diagnosed the cause of the stutters yet?

I am currently about 95% sure that the stuttering is caused by GetGlobals. But I do not know how exactly the hashmap is organised so I cannot tell how the size of the hashmap affects its performance. But I will experiment.

Thanks for info, I will continue testing.

#68 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 15 April 2012 - 06:25 AM

HOLD ONTO YOUR SEATS THIS IS THE FIRST MAJOR SCIENTIFIC BREAKTHROUGH OF THE MILLENIUM

Max size of the hashmap in both stuttering and non-stuttering save is 2048. This hashmap seems like "GLOBALS" one.

If I manually set the size of all hashmaps to 2048, nothing happens. Stuttering game stutters and non-stuttering.. ehh.. non-stutters. But! If I manually set the size of all hashmaps to 4096, the stuttering game stops stuttering!
The game starts consuming extra 50Mb of RAM, but those are minor details : D

Ok, now I have somewhat run out of ideas why that happens. But at least I am sure that stuttering is indeed due to some internal hashmap issues. Testing continues..

Edited by Suslik, 15 April 2012 - 06:28 AM.


#69 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 15 April 2012 - 07:41 AM

It seems I understood what's the issue. When variable hashmap gets overflowed, its Find() complexity degenerates to O(n) due to linear search. Well, Asc64 claimed this one page ago, but now I have understood and confirmed it. I think I will re-implement Add/Find methods using Cockoo algorithm as i30817 suggested, while leaving the structure of the table untouched so that it would be unnecessary to tinker with overriding all those Marshal(and probably other) methods which I do not know how to implement.

#70 i30817

i30817
  • Member
  • 611 posts

Posted 15 April 2012 - 08:14 AM

Cuckoo hashing is complex to implement. Unless you are a wizard at datastructures, i don't advice it. Instead, if you really want to try it, try using a already working and tested implementation; or at least copy it.

Here's a java example with commentary on the thread:
http://www.java-gami...23102/view.html
Notice that this implementation fails if the hashcode is the same!

the tricky part appears to be the "chose another hash function part" when the table needs to be rebuilt - just for that reason i'd try to continue putting in the Boost map instead, since the problem appears to be the deficient resize, which that should "solve". and it's a implementation that has no corner cases like this.

Any "professional" map implementation will have a resize to avoid the O(n) degeneration. If the Infinity Engine doesn't have it, it's probably because it's hand rolled (why?)

Edited by i30817, 15 April 2012 - 08:59 AM.


#71 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 15 April 2012 - 09:28 AM

Cuckoo hashing is complex to implement.

Umm.. It does not look complicated, why? Anyway, implementing the algorithm itself is far easier than injecting its code to the binary. All those DETOUR things, special memory management, Tramp- stuff is a little overwhelming and gives way more trouble than implementing a standard algorithm.

I'm currently tinkering with overriding their hashmap functionality. It has some hidden mechanism to determine if the hashtable needs to be resized. For example, ::Resize() is not called if Add() returns 0, but it is called under some other circumstances I do not know about. I cannot implement any hashing algorithm until I can say the engine that my map needs to be resized.

Edited by Suslik, 15 April 2012 - 09:28 AM.


#72 i30817

i30817
  • Member
  • 611 posts

Posted 15 April 2012 - 09:40 AM

Because to keep the O(1) property of Gets(), you need to rebuild the table with 2 different hash functions when you find a cycle on the cuckoo bumping.

http://mybiasedcoin....ce-part_15.html

  • The failure mode of cuckoo hashing is unpleasant. One thing I glossed over last time is that there's some (small) chance that cuckoo hashing gets caught in a loop; in moving things around to try to find an empty space, it can get stuck in a cycle, moving the same things back and forth repeatedly without ever being able to squeeze in a new element. In the theoretical analysis, this can swept under the rug by saying that after a certain number of steps, declare a failure and re-hash everything with new hash functions. In practical settings, like a router, this is often not a feasible option.


Edited by i30817, 15 April 2012 - 09:50 AM.


#73 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 15 April 2012 - 10:00 AM

In the theoretical analysis, this can swept under the rug by saying that after a certain number of steps, declare a failure and re-hash everything with new hash functions.

What's the problem? When the table becomes too tight, just rehash it. It will anyway be rehashed when it's fully occupied, why not make it a little earlier?

Anyway, there are currently a few things that should be done at work, I'm afraid I won't have time to investigate this issue further in a day or two.

#74 i30817

i30817
  • Member
  • 611 posts

Posted 15 April 2012 - 10:05 AM

I'm currently tinkering with overriding their hashmap functionality. It has some hidden mechanism to determine if the hashtable needs to be resized. For example, ::Resize() is not called if Add() returns 0, but it is called under some other circumstances I do not know about. I cannot implement any hashing algorithm until I can say the engine that my map needs to be resized.

Resize is a private, implementation detail of Maps. It normally happens when you need to add a new mapping that doesn't exist yet on the map and it's rate of occupancy is beyond what the designer thought was wise. As it is private, i'd be very surprised if it was called outside the class.

And even if it was, i'd just NOP it and let other; implicit on Add resize of the implementation you snipped in work.


In the theoretical analysis, this can swept under the rug by saying that after a certain number of steps, declare a failure and re-hash everything with new hash functions.

What's the problem? When the table becomes too tight, just rehash it. It will anyway be rehashed when it's fully occupied, why not make it a little earlier?


Because it's not the same hash function. To make this work flawlessly you have to have a way to generate hashfunctions for strings (and there is no guarantee that the generated function doesn't lead to a cuckoo cycle either, so it's another cycle too). The cuckoo hashing amortizes the linear probing of Get() into the Add().
It's like trying to get a perfect hashing function at runtime.

I'd really not use cuckoo hashing for this, unless you find a very pro library that doesn't gloss over this problem and assures you you'll never lose a mapping because of this "failure mode".
I know i suggested it, but it was a "brainfart" due to not remembering it's required low occupancy for best inserts (50% with 2 hash funcions; 75% with 3 hash functions) and it's braindead "failure mode". It's not surprising it's used in caches... caches can ignore the failure mode!

Edited by i30817, 15 April 2012 - 10:26 AM.


#75 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 15 April 2012 - 10:23 AM

i'd be very surprised if it was called outside the class.

Maybe it's called by another private method of this class, but since its whole interface is unknown, it does not matter if this is class's private method or an outside function.

And even if it was, i'd just NOP it and let other; implicit on Add resize of the implementation you snipped in work.

That was the first thing I tried, but it seems to crash in some other method I have not yet externalized. It seems that the only way to figure this out is either ask Asc64 for a more complete list of class methods or debug the executable myself.

About specific hashing algorithms. First of all I'm trying to implement a simple linear search just as a replacement for the existing one. I was able to replace its Find() with a dummy linear search of my own, but if I try to replace Add with a linear O(n) one-by-one addition, it works well until I try to resize the array. It looks like a simple operator new() causes a memory corruption, but I need more time to investigate it all. And I have doubts that I will posess this time in upcoming 1-2 days.

Edited by Suslik, 15 April 2012 - 10:24 AM.


#76 i30817

i30817
  • Member
  • 611 posts

Posted 15 April 2012 - 10:29 AM

I think Ascension64 is the man to ask about memory allocation constraints in TobEx. I'm pretty sure i read something about that on the TobEx archive too.

#77 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 15 April 2012 - 01:40 PM

Yep. Asc64, am I allowed to allocate memory on BGMain's heap by direct calling of operator new/malloc or do I have to use BGMain's allocator?

Edited by Suslik, 15 April 2012 - 01:43 PM.


#78 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 15 April 2012 - 06:25 PM

Second major scientific breakthrough of the millenia. I have implemented and successfully injected my own hashmap which works by far faster than the standard one, MWAHAHA! It works fast enough so that there's absolutely no any noticeable stuttering when executing my test baldur.bcs and/or loading my stuttering savegame:
static randomNumbers[256][32];
static randomNumbersInitialized = 0;
//alternative hashing for strings of max length 32. never used but looks promising.
unsigned int StrTableHash(const char *str)
{
	if(!randomNumbersInitialized)
	{
		for(int i = 0; i < 256; i++)
		{
			for(int j = 0; j < 32; j++)
			{
				randomNumbers[i][j] = IERand();
			}
		}
		randomNumbersInitialized = 1;
	}

	unsigned int hash = 0;
	for(int i = 0; str[i] != 0; i++)
		hash ^= randomNumbers[str[i]][i];

	return hash;
}

unsigned int StrHash(const char *str)
{
	unsigned int hash = 5381;
	unsigned int c;

	while (c = *str++)
		hash = ((hash << 5) + hash) + c; /* hash * 33 + c */

	return hash;
}

void		DETOUR_CVariableMap::DETOUR_Construct(int nSize)	
{
	//nSize = 4096;
	(this->*Tramp_CVariableMap_Construct)(nSize);
}
void		DETOUR_CVariableMap::DETOUR_Deconstruct()			
{
	return (this->*Tramp_CVariableMap_Deconstruct)();
}

bool IsEmpty(CVariable &var)
{
	return var.name.GetBuf()[0] == 0;
}

BOOL		DETOUR_CVariableMap::DETOUR_Add(CVariable& var)		
{
	int i = 0;
	for(int i = 0; (i < 32) && var.name.GetBuf()[i] != 0; i++)
	{
		var.name.GetBuf()[i] = toupper(var.name.GetBuf()[i]);
	}

	unsigned int start = StrHash((char *)&(var.name));

	for(unsigned int i = 0; i < nArraySize / 10; i++)
	{
		int index = (i + start) % nArraySize;
		if(IsEmpty(pArray[index]) || strcmp(var.name.GetBuf(), pArray[index].name.GetBuf()) == 0)//.ToCString().CompareNoCase(pArray[index].name.ToCString()) == 0) //bucket is empty
		{
			pArray[index] = var;
			return 1;
		}
	}
	
	SetSize(nArraySize * 2);
	return DETOUR_Add(var);
	
	int sizeBefore = nArraySize;
	BOOL res = (this->*Tramp_CVariableMap_Add)(var);
	return res;
}

CVariable&	DETOUR_CVariableMap::DETOUR_Find(IECString sVar)	
{
	char key[32];
	int i = 0;
	do
	{
		key[i] = toupper(sVar[i]);
	}while(sVar[i++] != 0 && i < 32);
	key[31] = 0;

	unsigned int start = StrHash(key);

	for(unsigned int i = 0; i < nArraySize; i++)
	{
		unsigned int index = (start + i) % nArraySize;
		if(strcmp(key, pArray[index].name.GetBuf()) == 0)
		{
			return pArray[index];
		}
		if(IsEmpty(pArray[index])) break;
	}
	return *((CVariable *)0);
	return (this->*Tramp_CVariableMap_Find)(sVar);
}

void		DETOUR_CVariableMap::DETOUR_Empty()					
{
	return (this->*Tramp_CVariableMap_Empty)();
}

void		DETOUR_CVariableMap::DETOUR_SetSize(int nNewSize)	
{
	//console.writef("resizing from size %d to %d\n", nArraySize, nNewSize);
	(this->*Tramp_CVariableMap_SetSize)(nNewSize);

	/*int largestBlockSize = 1;
	int currBlockSize = 0;
	for(int i = 0; i < nArraySize; i++)
	{
		if(IsEmpty(pArray[i])) currBlockSize = 0;
		currBlockSize++;
		if(currBlockSize > largestBlockSize) largestBlockSize = currBlockSize;
	}
	console.writef("after resize largest block is %d\n", largestBlockSize);*/
}
There are two major things I have changed from the original algorithm:
1) I have rewritten the hashfunction. Actually it's the first one for search term "C++ string hash", but dont tell anyone.
2) I have decided to grow the table when its fragmented parts grow to about 10% of the table size. I have tried to grow it under the same circumstances in the Find() method but it works about the same.

Largest chunks of framgented data with those optimizations ^ are about 1/100 of the table size which's by far better than full table size of the original algorithm just before it grows.

Actually the second optimization made it unnecessary to implement a more complicated hashing algorithm since performance is great.

Question to Asc64
If I try to delete [] pArray of CVariableMap , BG crashes. What can be the reason? Due to this problem I was forced to use IE's SetSize() instead of my own resize method.

I dont know about you guys but when I'm done with work I'm finally going to play. Without stuttering, lol.

Edited by Suslik, 15 April 2012 - 07:36 PM.


#79 i30817

i30817
  • Member
  • 611 posts

Posted 15 April 2012 - 08:55 PM

Still not clear on why you can't just use Boost::Map directly or something - it's bound to be optimized to death.
(i don't know if Boost::Map exists, i'm just a lowly java programmer).

Edit:
toupper - BGScript variables are case-insensitive?

Ow, i really dislike datastructures that change the data.

Also it's not a char array?

for(int i = 0; (i < 32) && var.name.GetBuf()[i] != 0; i++)


this is legal C++?

*shoots head*

edit: byte array. duh.

BTW, a rate of occupancy of 10% is... really bad - you don't want to slow down the gets, but you don't want to waste memory either - not to mention the ultra expensive copying, but i suppose it doesn't matter for the low amount of variables needed.
edit2: finally read it correctly, not a rate of occupancy, but only the buckets you'll see in one go before resizing. Eh, still think that a map from a well used library will work much better. Especially if the array is anything approaching full and space is added to the end... multiple resizes?

Can't you just wrap a map variable?

Edited by i30817, 15 April 2012 - 09:42 PM.


#80 i30817

i30817
  • Member
  • 611 posts

Posted 15 April 2012 - 09:28 PM

Why the hell is setsize public wtf? Is that the cause of the slowdown? Multiple calls of setsize per turn?