
Maintenance and log management

KioskNet needs to be deployed in areas with little or no other infrastructure. Therefore, one of our key design goals was to build a system that could be maintained with the least possible effort by semi-skilled field technicians. We also desired a means to cheaply, securely, and reliably monitor both ferries and kiosks from an NGO central office. These two features would allow a handful of skilled workers at the central office, helped by a larger number of field technicians, to support hundreds or even thousands of kiosks and ferries. In this section, we describe KioskNet maintenance and monitoring.

Maintenance

Routine software maintenance requires software running on kiosk controllers to be upgraded and patched from time to time. To avoid having technicians travel to each kiosk location to install or upgrade software, we provide a sub-system for centralized management and maintenance of kiosk controllers. This mechanism, similar to the Disruption Tolerant Shell, is described next.


In KioskNet terminology, an update is a zipped and signed file that contains an executable script (the controller script), the recipients' GUIDs, a unique sequence number, and all other files that the script needs for execution (this is similar to a RedHat RPM). When a KioskNet component receives an update, it first checks the signature. An authentic update is uncompressed in a pre-specified location, and the script is then run with root privileges in a forked shell. When the shell terminates, the update's sequence number is recorded along with the exit value of the controller script, and the output logs are submitted to the logging sub-system (described next).
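The handling of a received update can be sketched roughly as follows. This is a minimal illustration, not the KioskNet implementation: the `manifest.json` file and its field names are hypothetical, an HMAC stands in for whatever signature scheme KioskNet actually uses, and the script is run with the current Python interpreter rather than with root privileges in a forked shell.

```python
import hashlib
import hmac
import io
import json
import subprocess
import sys
import zipfile
from pathlib import Path
from typing import Dict, Optional


def verify_signature(payload: bytes, signature: str, key: bytes) -> bool:
    """Assumption: an HMAC-SHA256 stands in for the real signature check."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)


def apply_update(payload: bytes, signature: str, key: bytes, my_guid: str,
                 workdir: Path, records: Dict[int, int]) -> Optional[int]:
    """Verify, unpack, and run one update; return the script's exit value.

    Returns None when the update is rejected (bad signature or wrong recipient).
    """
    if not verify_signature(payload, signature, key):
        return None
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        # Hypothetical manifest carrying recipients, sequence number, script name.
        manifest = json.loads(archive.read("manifest.json"))
        if my_guid not in manifest["recipients"]:
            return None
        archive.extractall(workdir)  # the "pre-specified location"
    # Run the controller script in a child process and wait for it to finish.
    result = subprocess.run([sys.executable, manifest["script"]], cwd=workdir,
                            capture_output=True, text=True)
    # Record the sequence number together with the script's exit value.
    records[manifest["sequence"]] = result.returncode
    return result.returncode
```

A caller would then hand the captured output to the logging sub-system; that step is left out here.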

The controller script performs the following steps:

  • Checking pre-conditions: The script may check the sequence number records, along with other preconditions.
  • Running the main task: In this stage the controller script can use any of the local files in addition to the files shipped with the update.
  • Generating short and long logs: KioskNet requires that updates generate two log files. Short logs are immediately reported back to the central administration, potentially using the SMS control channel, and long logs are treated as normal system logs.
  • Returning a status value: The returned status value is recorded along with the sequence number.
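The four steps above could be organised along these lines. This is only a sketch under stated assumptions: the function name, the log file names, and the status codes (0 for success, 1 for a failed precondition, 2 for a failed main task) are hypothetical, and the only precondition checked is whether the sequence number has already been recorded.

```python
from pathlib import Path
from typing import Callable, Set


def run_controller(sequence: int, applied: Set[int],
                   main_task: Callable[[], str], log_dir: Path) -> int:
    """Run one update's controller logic and return its status value."""
    short_log = log_dir / f"update-{sequence}.short.log"
    long_log = log_dir / f"update-{sequence}.long.log"

    # Step 1: check preconditions against the sequence number records.
    if sequence in applied:
        short_log.write_text(f"seq={sequence} status=1\n")
        long_log.write_text("precondition failed: update already applied\n")
        return 1

    # Step 2: run the main task; it returns a detailed description of its work.
    try:
        details = main_task()
        status = 0
    except Exception as exc:
        details = f"main task failed: {exc!r}"
        status = 2

    # Step 3: write a short log (small enough for SMS) and a long log.
    short_log.write_text(f"seq={sequence} status={status}\n")
    long_log.write_text(details + "\n")

    # Step 4: return the status value so it can be recorded with the sequence number.
    return status
```

Keeping the short log to a single line is a design choice made here so that it would fit in one SMS message.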


Updates can reach KioskNet nodes over one of three channels. The normal DTN/OCMP mechanical backhaul channel is the preferred transmission mechanism. When this channel does not work, the central office can choose to flood updates to all KioskNet nodes. In the rare case that a node cannot be reached over either of these two channels, a field technician can apply the update using a USB key. On detecting an authenticated USB key, the controller reads the update from the key and applies it, just as if it had received it over the wireless link.

Logging

KioskNet has been designed to be robust and tolerant to failures. However, both DTN and OCMP, which are critical software layers, are under active development. Therefore, software failure is a distinct possibility. When a failure does occur, central office technicians require a means to collect and debug system logs that does not rely on OCMP or DTN. We have, therefore, designed and implemented a mechanism that floods logs across a disconnected network to the Internet using opportunistic connections. We call this application log-flood.

Log-flood periodically compresses the contents of /var/log/, timestamps the resulting archive, and signs it with a sequence number. It then periodically sends a broadcast ping to detect neighbouring KioskNet components. When a neighbour is detected, the two components exchange log archives opportunistically using the standard Unix rsync utility. For secure transfer, rsync is tunnelled over ssh using an ssh key that the central office installs when configuring the KioskNet component.
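As a rough illustration of the archiving and transfer steps, the sketch below builds a timestamped, sequence-numbered archive of a log directory and assembles an rsync-over-ssh command line. The archive naming scheme, the spool directory layout, and both function names are assumptions made for this example; signing and neighbour discovery by broadcast ping are not shown, and the command is only constructed, not executed.

```python
import tarfile
import time
from pathlib import Path
from typing import List, Optional


def make_log_archive(log_dir: Path, out_dir: Path, node_id: str,
                     sequence: int, now: Optional[float] = None) -> Path:
    """Compress log_dir into out_dir, tagging the file with sequence and UTC time."""
    stamp = time.strftime("%Y%m%dT%H%M%SZ", time.gmtime(now))
    archive = out_dir / f"{node_id}-{sequence:06d}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(log_dir, arcname="log")  # store entries under a relative "log/" prefix
    return archive


def rsync_command(spool_dir: Path, neighbour: str, remote_dir: str,
                  ssh_key: Path) -> List[str]:
    """Build an rsync invocation tunnelled over ssh with the pre-installed key."""
    return [
        "rsync", "-a",
        "--ignore-existing",        # skip archives the neighbour already holds
        "-e", f"ssh -i {ssh_key}",  # run rsync over ssh using the given identity file
        f"{spool_dir}/",
        f"{neighbour}:{remote_dir}/",
    ]
```

The resulting list could be passed to `subprocess.run` once a neighbour has answered the broadcast ping; running the same command in the opposite direction would make the exchange two-way.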

KioskNet components flood log archives to one another until the files reach a gateway. To prevent redundant flooding, the gateway does not flood logs to neighbouring ferries; it simply forwards log archives to the proxy on the Internet. The proxy acknowledges the delivery of each log archive by sending an acknowledgement file to the gateway. Acknowledgement files are then transferred from the gateway to neighbouring ferries and flooded back across the disconnected network. When a KioskNet component receives an acknowledgement file, it deletes the corresponding log archive. Acknowledgement files eventually expire on each component. In this way, log-flood mimics DTN using rsync and propagates logs robustly without depending on DTN or OCMP.
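The flooding and acknowledgement rules can be illustrated with a small in-memory simulation. It is a sketch of the behaviour described above, not of the rsync-based implementation: the `Node` class and both helper functions are hypothetical, archives and acknowledgements are represented by their names only, and acknowledgement expiry is omitted.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Node:
    name: str
    is_gateway: bool = False
    archives: Set[str] = field(default_factory=set)
    acks: Set[str] = field(default_factory=set)

    def receive_acks(self, incoming: Set[str]) -> None:
        # An acknowledgement deletes the corresponding log archive.
        self.acks |= incoming
        self.archives -= self.acks

    def receive_archives(self, incoming: Set[str]) -> None:
        # Do not re-accept archives already known to have been delivered.
        self.archives |= {a for a in incoming if a not in self.acks}


def exchange(a: Node, b: Node) -> None:
    """One opportunistic contact between two neighbouring components."""
    a_archives, b_archives = set(a.archives), set(b.archives)
    a_acks, b_acks = set(a.acks), set(b.acks)
    a.receive_acks(b_acks)
    b.receive_acks(a_acks)
    if not a.is_gateway:  # a gateway never floods logs back to its neighbours
        b.receive_archives(a_archives)
    if not b.is_gateway:
        a.receive_archives(b_archives)


def gateway_upload(gateway: Node, proxy_store: Set[str]) -> None:
    """Forward archives to the proxy, which acknowledges each one."""
    delivered = set(gateway.archives)
    proxy_store |= delivered
    gateway.receive_acks(delivered)
```

For example, an archive created on a kiosk reaches the proxy after a kiosk-ferry contact, a ferry-gateway contact, and a gateway upload; two further contacts in the reverse direction carry the acknowledgement back and remove the archive from the ferry and the kiosk.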

Retrieved from "http://blizzard.cs.uwaterloo.ca/tetherless/index.php/KioskNet_maintenance"

This page was last modified 19:19, 5 Sep 2007.

