We’ve reached some sort of plateau …

By xorlabs

I say that because it’s working in grid mode with two regions, and I could add more with a two-minute session in a text editor to create another region XML file.  There is a second user, whose avatar has had a sex change operation, and is a man now.  I named the new user in honor of an old friend who grew up with me in the old neighborhood and with whom I am still in contact.

This doesn’t mean that we’re out of the woods yet.  Software problems are bound to appear, and one already has.  What I would like to be able to do this weekend is tell you how I solved it.  Thanks to a very tiring day yesterday cleaning up the last of the broken fences Ike left in my yard, and some things which must get done today, that won’t happen.  Instead, I’ll tell you what I have found so far, and what I’m going to do about it as I get time during the coming week.

I let OpenSim run in grid mode overnight, after having successfully logged on to it, only to find out that the next day logging on wasn’t possible.  A restart solved the problem.  However, letting OpenSim run overnight a second time produced the same problem.  The command prompt where the User Server was running indicated that a login attempt caused a MySQL exception to be thrown.  It was clear that this was something more than an incorrect setting in a configuration file.

Seeing as how there are a lot of people working on OpenSim, and that there is a bug list on the OpenSim site, it seemed reasonable to look there first.  This is the sort of problem that is bound to have shown up already, and probably someone was at least working the problem.  As it happens, someone had worked the problem and a patch was posted.  The symptoms I experienced fit right in with the bug with problem ID 0002099.

A cursory examination of the problem description and the patch seems to indicate that OpenSim makes a connection to MySQL at startup, and expects that connection to be good as long as OpenSim is running.  However, MySQL has a timer on connections, and failure to see continued activity on a connection causes it to time out.  In a situation where people are logging on and logging off regularly, this wouldn’t cause a problem.  However, running a single user for a while, logging off, and letting OpenSim run all night without other logins causes the connection to time out.
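To make the failure mode concrete, here is a toy simulation in Python.  It is not OpenSim or MySQL code; the class and its names are made up for illustration.  The only real-world number in it is the eight hour limit, which matches MySQL's documented default of 28800 seconds for `wait_timeout`, and I am assuming that default was in effect here.

```python
class IdleTimeoutServer:
    """Toy stand-in for a database server that drops idle connections."""

    def __init__(self, wait_timeout_s):
        self.wait_timeout_s = wait_timeout_s
        self.last_activity = {}  # connection id -> time of last query

    def connect(self, conn_id, now):
        self.last_activity[conn_id] = now

    def query(self, conn_id, now):
        last = self.last_activity.get(conn_id)
        if last is None or now - last > self.wait_timeout_s:
            # The server has already closed this connection.
            self.last_activity.pop(conn_id, None)
            raise ConnectionError("connection timed out on the server side")
        self.last_activity[conn_id] = now
        return "ok"


HOUR = 3600
server = IdleTimeoutServer(wait_timeout_s=8 * HOUR)  # assumed 8-hour limit
server.connect("user-server", now=0)

# Regular logins keep the single long-lived connection alive.
busy_results = [server.query("user-server", now=t * HOUR) for t in range(1, 6)]

# A quiet night: the next login arrives 10 hours after the last one.
try:
    server.query("user-server", now=15 * HOUR)
    overnight_failed = False
except ConnectionError:
    overnight_failed = True
```

The busy stretch never trips the timer because every query resets it.  The single gap longer than the limit is what breaks the connection, which is the pattern a lone user logging off for the night would produce.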

That seems to be the nature of the problem, and no doubt the code modification reflected in the posted patch would solve the problem.  So, where do we go from here?  Presumably the preferred approach is to insert the patch, do a rebuild, and go on with life.  That, however, is not what is going to happen.

Now, this is my first time using Subversion.  In that part of the software world where people get paychecks and work with Microsoft stuff, Visual SourceSafe is the usual bureaucratic tool to use.  So, the question is, what in the Chicken Fried Hell do I do with this patch?  I assume that this somehow works with some Subversion functionality, so perhaps the first step is to do some reading on this aspect of Subversion.

That particular line of action is not very useful for my purposes anyway.  The idea is not just to get the problem fixed and gone.  The idea is to continue to get a better understanding of the inner workings of OpenSim.  So, the first step is to open up OpenSim with Visual Studio, and look at the two classes which are modified in the patch, and fully understand what is going on.  The assumption is that the change causes a new connection to be made every time a user logs in rather than relying on the one time establishment of a connection to MySQL.  An examination of the code will either verify that assumption or will indicate that some other approach was taken.
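Until the code has been read, there are at least two plausible shapes the fix could take, and a small Python sketch shows the difference.  Everything here is hypothetical: `FakeConnection`, `login_fresh_connection` and `ReconnectingStore` are invented names, and neither strategy is claimed to be what the posted patch actually does.

```python
class FakeConnection:
    """Hypothetical stand-in for a MySQL connection; not OpenSim code."""
    opened = 0  # counts how many connections were ever opened

    def __init__(self):
        FakeConnection.opened += 1
        self.alive = True

    def lookup_user(self, name):
        if not self.alive:
            raise ConnectionError("server closed the idle connection")
        return {"name": name}

    def close(self):
        self.alive = False


def login_fresh_connection(name):
    """Strategy A: open a connection for each login, then close it."""
    conn = FakeConnection()
    try:
        return conn.lookup_user(name)
    finally:
        conn.close()


class ReconnectingStore:
    """Strategy B: keep one connection, replace it if it has gone stale."""

    def __init__(self):
        self.conn = FakeConnection()

    def login(self, name):
        try:
            return self.conn.lookup_user(name)
        except ConnectionError:
            self.conn = FakeConnection()  # reconnect once, then retry
            return self.conn.lookup_user(name)


store = ReconnectingStore()
store.conn.alive = False            # simulate the overnight timeout
result_b = store.login("Test User")
result_a = login_fresh_connection("Test User")
```

Strategy A pays the cost of a new connection on every login but can never be caught by an idle timeout.  Strategy B keeps the cheap path for busy periods and only reconnects after a failure.  Which of these the patch resembles, if either, is what the code reading should settle.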

Now there is another alternative.  MySQL has a variable, easily set from the command line, which determines the number of seconds MySQL waits before timing out a connection.  Maybe an alternative is just to set that to some incredibly long time.  Perhaps there is a default value one can use which MySQL interprets as never letting a connection time out.  If there are any DBAs or security folks reading this, no doubt that last suggestion caused a scream of outrage.  Yes, that is not the best security decision to make.  In principle, the database could be located on some other server on the internet, and this could make it vulnerable to attack by the script kiddies.  In my case, however, the database is on the same server as OpenSim, and that server is not accessible from anywhere except my private network.  In the event that I ever set up an instance of OpenSim on the internet and invite people in, this would not be the best way to solve the problem.
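The variable in question is `wait_timeout`, and the arithmetic for "some incredibly long time" is easy to sketch in Python.  The helper below is hypothetical and only builds the statement text.  The two upper bounds are the ones I recall from the MySQL system variable reference (2147483 seconds on Windows, 31536000 elsewhere), so treat them as assumptions to check against the documentation for the installed version.  As far as I can tell the documented minimum is 1, so there is no special value meaning "never".

```python
WINDOWS_MAX_S = 2_147_483   # assumed documented upper bound on Windows
OTHER_MAX_S = 31_536_000    # assumed documented upper bound elsewhere (365 days)


def wait_timeout_statement(days, on_windows=True):
    """Build a SET GLOBAL statement for the requested idle limit."""
    requested = int(days * 24 * 60 * 60)          # days -> seconds
    limit = WINDOWS_MAX_S if on_windows else OTHER_MAX_S
    seconds = max(1, min(requested, limit))       # clamp to the valid range
    return f"SET GLOBAL wait_timeout = {seconds};"


week = wait_timeout_statement(7)
year = wait_timeout_statement(365, on_windows=False)
capped = wait_timeout_statement(365)
```

A statement like this, typed at the `mysql` prompt, only affects connections opened afterwards and is forgotten when the server restarts, so OpenSim would need a restart after the change, and a permanent setting belongs in the MySQL configuration file.  `SHOW VARIABLES LIKE 'wait_timeout';` reports the value currently in effect.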

So, no report of progress this weekend, merely an indication of some paths of investigation which will be taken.  I’ll let you know the eventual result, and how it all came out.
