25 September 2009

Burning Man 2009: Days 4-6 (part 2)

Sorry for the posting delay. Back to the blog. For now.

Apart from excessive registration activity, we had a few other technical challenges, most related to backhaul and power.

Our backhaul was a point-to-point WiFi link, running from a Nanostation 5 on our tower to a 30' tower in a camp called Fusion Valley, over at the 3:00 plaza. Fusion Valley, in turn, linked us to Center Camp where we got onto a 10 MB/s microwave relay back to an ISP in the "real world". The people running this network had been very supportive of our project and their system worked most of the time, but keeping anything working reliably on the Playa is hard and they were not granted any magical protection against that. Our static IP address changed, and sometimes it just quit working. A few times we lost backhaul completely, like the time someone at Fusion Valley plugged some power tools into the same circuit as the network gear, tripped the breaker and then just walked away for a while. Well, it's Burning Man. Stuff happens.

We were running a mostly-local service. Occasional loss of backhaul should not have been a serious problem, but it was, because Asterisk kept trying to run DNS and reverse-DNS queries. Asterisk would just lock up when it couldn't reach the internet. We tried replacing every hostname we could find with a numeric address, we tried filling out /etc/hosts by hand, and we tried setting up a local DNS cache. We really tried our best to understand and trim down our Asterisk configuration, but we just could not stop Asterisk from freezing when the backhaul was down. Since we were using the Asterisk SIP registry as our HLR for location updating and SMS address resolution, a frozen Asterisk server also shut down everything else we were doing.
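The failure mode here, a blocking name lookup that stalls everything waiting behind it, is easy to illustrate outside Asterisk. The sketch below is not Asterisk code and not what ran on the Playa. It is a minimal Python illustration of one common defence in software you control: run the lookup in a worker thread and give up after a deadline. The function name `resolve_with_timeout`, the pool size and the default timeout are all assumptions made up for the example.

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as LookupTimeout

# Shared worker pool (size is arbitrary). A stuck lookup ties up one worker
# thread, but the caller gets control back when the deadline passes.
_POOL = ThreadPoolExecutor(max_workers=4)


def resolve_with_timeout(host, timeout=0.5, resolver=socket.gethostbyname):
    """Return an address string for host, or None if the lookup fails or is too slow."""
    future = _POOL.submit(resolver, host)
    try:
        return future.result(timeout=timeout)
    except LookupTimeout:
        return None  # lookup is still running in the background; we stop waiting
    except OSError:
        return None  # resolver reported a failure (socket.gaierror is an OSError)


def slow_resolver(host):
    """Stand-in for a DNS query that hangs because the uplink is down."""
    time.sleep(0.3)
    return "192.0.2.1"


if __name__ == "__main__":
    # A dotted-quad literal needs no DNS query at all.
    print(resolve_with_timeout("127.0.0.1"))
    # The simulated dead uplink gives up after 50 ms instead of blocking.
    print(resolve_with_timeout("example.invalid", timeout=0.05, resolver=slow_resolver))
```

The abandoned lookup still occupies a worker until it returns, so this only contains the stall rather than curing it. A service that must stay up without backhaul would still want to avoid the lookups in the first place.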

Another technical problem was power. Here, we did something dumb: we assumed that new batteries would be topped off, and so we didn't bother checking the acid levels. We had enough battery capacity to run the full system for at least 10 hours but the first time we tried to leave the system on overnight we woke up to a low voltage alarm at 4:00. Since we didn't want to run a generator right next to the tent at 4:00, we shut everything down and went back to bed more than a little concerned about the batteries. The next morning we started the generator but the batteries were not responding well, not taking a charge. By the afternoon, though, we thought to check the acid levels and found they were very low. We topped off the batteries with water and had no more problems that week, except for a little boil-over from over-filling. BTW: Playa dust is excellent for neutralizing battery acid spills.

There were also a few loose ends for John to tie up in the SMS server, like saving and reloading the message queue and "bouncing" undeliverable messages back to their senders. He also added some test features, like a "411" short code that would return system status information, that were really handy for coverage testing. These weren't really problems, though, just straightforward development.
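For readers who want a feel for what those three features involve, here is a toy Python sketch. It is not the OpenBTS SMS server code. The class name `SmsQueue`, the JSON file format, the `deliver` callback and the wording of the status and bounce replies are all invented for illustration. Only the ideas come from the work described above: a queue that survives a restart, a bounce back to the sender, and a "411" status short code.

```python
import json
import os

STATUS_SHORT_CODE = "411"  # short code named in the post; the reply text is invented


class SmsQueue:
    """Toy store-and-forward queue that is reloaded from disk on startup."""

    def __init__(self, path):
        self.path = path
        self.pending = []
        if os.path.exists(path):
            with open(path) as f:
                self.pending = json.load(f)  # pick up messages saved before a restart

    def save(self):
        # Write to a temp file and rename it, so an interrupted write
        # does not clobber the previously saved queue.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.pending, f)
        os.replace(tmp, self.path)

    def submit(self, sender, dest, text):
        self.pending.append({"from": sender, "to": dest, "text": text})
        self.save()

    def process(self, registered, deliver):
        """Handle each pending message once; deliver(to, sender, text) sends it out."""
        queue, self.pending = self.pending, []
        for msg in queue:
            if msg["to"] == STATUS_SHORT_CODE:
                # Status query: answer the sender with some system information.
                status = "%d handsets registered" % len(registered)
                deliver(msg["from"], STATUS_SHORT_CODE, status)
            elif msg["to"] in registered:
                deliver(msg["to"], msg["from"], msg["text"])
            else:
                # Bounce: tell the sender the message could not be delivered.
                notice = "Undeliverable to %s: %s" % (msg["to"], msg["text"])
                deliver(msg["from"], "system", notice)
        self.save()
```

A real server would retry before bouncing and would report more than a handset count, but the shape is the same: persist on every change, reload on startup, and always send the sender some kind of answer.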

By day 6, Thursday, we had fixed the registration loads, had decent power and backhaul, and a good feature set on the SMS server. We were (finally) starting to have a stable system.


12 comments:

  1. My advice would be to avoid Asterisk like the plague. Start using FreeSwitch before you go so far down the road that you become entrenched with Asterisk and can't replace it easily.
    Asterisk is the biggest POS in terms of code. Use something like FreeSwitch, which has learnt from the design errors Asterisk made.

  2. From what I read somewhere else, I think there are plans for an HLR that would not depend on Asterisk. It would be cool if an HLR compatible with both OpenBTS and OpenBSC could be made.

  3. I think the criticism of Asterisk may be a little too harsh. It is built to do a specific thing and it does that thing; it is just not exactly the thing we needed here. This is similar to the criticism of the USRP. It is not ideal for our application, but had we not had use of it, the system might never have been built.

  4. I was happy to see it working. Great job!
    Max RV3DNZ

  5. Would a solution like OpenSIPS be able to replace the functionality of Asterisk here?

  6. I've looked at OpenSIPS and would say that it probably could replace Asterisk. However, at this point, we are still using Asterisk and starting to get better at it.

  7. Would integrating OpenSIPS with Asterisk help with scaling?

  8. Certainly, it should, from my point of view. I would try it anyway.

  9. Is your SMS server code available? Thanks!

  10. sasha -

    The SMS server code is part of the 2.6 public release. See openbts.sf.net for a download.

  11. http://www.airshoes.us

    http://www.frenchtn.com

  12. Hello.. Firstly I would like to send greetings to all readers. After this, I recognize the content so interesting about this article. For me personally I liked all the information. I would like to know of cases like this more often. In my personal experience I might mention a book called Green Parks Costa Rica in this book that I mentioned have very interesting topics, and also you have much to do with the main theme of this article.
