Sorry for the posting delay. Back to the blog. For now.
Apart from excessive registration activity, we had a few other technical challenges, most related to backhaul and power.
Our backhaul was a point-to-point WiFi link, running from a Nanostation 5 on our tower to a 30' tower in a camp called Fusion Valley, over at the 3:00 plaza. Fusion Valley, in turn, linked us to Center Camp, where we got onto a 10 MB/s microwave relay back to an ISP in the "real world". The people running this network had been very supportive of our project and their system worked most of the time, but keeping anything working reliably on the Playa is hard and they were not granted any magical protection against that. Our static IP address changed on us, and sometimes it just quit working. A few times we lost backhaul completely, like the time someone at Fusion Valley plugged some power tools into the same circuit as the network gear, tripped the breaker and then just walked away for a while. Well, it's Burning Man. Stuff happens.
We were running a mostly-local service. Occasional loss of backhaul should not have been a serious problem, but it was, because Asterisk kept trying to run DNS and reverse-DNS queries. Asterisk would just lock up when it couldn't reach the internet. We tried replacing every hostname we could find with a numeric address, we tried filling out /etc/hosts by hand, we tried setting up a local DNS cache, we really tried our best to understand and trim down our Asterisk configuration, but we just could not stop Asterisk from freezing when the backhaul was down. Since we were using the Asterisk SIP registry as our HLR for location updating and SMS address resolution, a frozen Asterisk server also shut down everything else we were doing.
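For anyone who hasn't been bitten by this, the general failure mode is a blocking name lookup sitting in the middle of a service loop. Here is a minimal Python sketch of the usual defense, which is to run the lookup in a worker thread and give up after a deadline. To be clear, this is not how Asterisk works inside and it is not something we ran on the Playa; the function name, the pool size and the one-second timeout are all made up for illustration.

```python
import concurrent.futures
import socket

# Small shared pool so a stuck lookup ties up a worker, not the caller.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def resolve_with_timeout(host, timeout=1.0, resolver=socket.gethostbyname):
    """Return an address string, or None if the lookup fails or is too slow.

    `resolver` is injectable (hypothetical parameter) so the behavior can be
    exercised without a network.
    """
    future = _pool.submit(resolver, host)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        # Lookup is still running in the worker; the caller moves on.
        return None
    except OSError:
        # Covers socket.gaierror and similar resolution failures.
        return None
```

The caller gets control back after the deadline even though the stuck lookup is still running in the worker thread. The cost is that a long outage can tie up every worker in the pool, so a real service would also want to cache answers and back off.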
Another technical problem was power. Here, we did something dumb: we assumed that new batteries would be topped off, and so we didn't bother checking the acid levels. We had enough battery capacity to run the full system for at least 10 hours but the first time we tried to leave the system on overnight we woke up to a low voltage alarm at 4:00. Since we didn't want to run a generator right next to the tent at 4:00, we shut everything down and went back to bed more than a little concerned about the batteries. The next morning we started the generator but the batteries were not responding well, not taking a charge. By the afternoon, though, we thought to check the acid levels and found they were very low. We topped off the batteries with water and had no more problems that week, except for a little boil-over from over-filling. BTW: Playa dust is excellent for neutralizing battery acid spills.
There were also a few loose ends for John to tie up in the SMS server, like saving and reloading the message queue and "bouncing" undeliverable messages back to their senders. He also added some test features, like a "411" short code that would return system status information, that were really handy for coverage testing. These weren't really problems, though, just straightforward development.
By day 6, Thursday, we had fixed the registration loads, had decent power and backhaul, and a good feature set on the SMS server. We were (finally) starting to have a stable system.