Webob (short for "Web Obsessive") is a unique, application-level systems monitoring tool. There are plenty of apps available for network and os-level operations management, but really they only solve part of the real problem. Application glitches, database performance problems, critical workflow interruptions --- Webob lets you look inside your apps to make sure everything is working together. Use Webob to easily:
Instructions for installing the most current version of Webob are always online at http://www.softwarepoetry.com/webob. Frankly, there's not much to it ... just download the installation package and unpack it onto your system. No muss, no fuss!

The WEBOB.XML file tells Webob everything it needs to know about what your systems look like and who needs to be told if things go awry. The file is in industry-standard "XML" format. XML is a simple text-based format that looks a lot like traditional HTML, and you should have no trouble working with it. However, if you've never worked with XML before you might want to visit a resource such as xml.com to read up a bit.

The following instructions will walk you through constructing a very simple WEBOB.XML file. Subsequent sections of the documentation will go into greater detail about the different pieces that make up the configuration file.

In your Webob installation directory, you'll find the WEBOB.XML file. Open it up with any text editor such as Notepad (on Windows) or vi (on Linux). It should look something like this:

Example 1. An empty WEBOB.XML file
<?xml version="1.0"?>
<webob-config>
<settings>
<http-port>8001</http-port>
</settings>
<actions>
</actions>
<resources>
</resources>
</webob-config>

This file shows the three major sections of a configuration file: settings, actions and resources.

There are dozens of options you can use when configuring the system. However, to get up and running you need only a few. The following configuration will check every 60 seconds to see if a special page at www.softwarepoetry.com is running:

Example 2. A simple WEBOB.XML file, monitoring one resource
<?xml version="1.0"?>
<webob-config>
<settings>
<http-port>8001</http-port>
</settings>
<actions>
</actions>
<resources>
<resource name="webob tutorial 1" type="url">
<path>http://www.softwarepoetry.com/webob/doc/tutorial1.asp</path>
<interval>60</interval>
</resource>
</resources>
</webob-config>

(As we add content to the example file new lines will be emphasized)

Notice the global setting "http-port". This setting tells Webob to listen for status requests on port 8001. We'll point our web browser at this port to see current status, pause and resume monitoring, and so forth.

Go ahead and make these changes to your file now, then refer to the next section to run the Webob application. Remember that each time you make changes to the WEBOB.XML file, you will need to restart the Webob application for your changes to be picked up.

Under Linux, the daemon is started by running the command "webob", either manually from a shell or at system startup in an rc.local file. By default, Webob will look for its configuration file (WEBOB.XML) in the same directory where webob resides; this can be altered by specifying "-c pathtoconfigdir" on the command line, where "pathtoconfigdir" is a path to the WEBOB.XML file you'd like to use. Webob creates a file "webob.pid" in the installation directory (see the reference section for how to place this file elsewhere); you can terminate the daemon by sending a TERM signal to this process, as in "kill -TERM `cat webob.pid`".

Webob registers itself as a service under Windows, so it may be started and stopped from the Services Control Panel. If you would like to provide an alternative configuration file under Windows, create a registry key HKEY_LOCAL_MACHINE\Software\SoftwarePoetry\webob\args. Under this key, create a REG_SZ value with the name "c" whose value is the full path to your WEBOB.XML file.

Once you've started the Webob application, point your web browser at http://localhost:8001/dashboard, replacing "localhost" if necessary with the name of the Webob server. You should see a page similar to the following:
This page shows a summary of all the resources you're currently monitoring, and will refresh every 30 seconds to show the most recently collected information. This page can be completely replaced or customized with a page more tailored to your organization using any dynamic page generation technology that supports XML objects or XSL transformations. We'll talk more about ways to view the current status later in this tutorial.

It's also useful to note that viewing this page will have zero impact on the performance of your production systems. You are seeing the cached results of the latest monitoring pass on each resource. This is important because it means that timeouts or other production problems will not cascade into the monitoring tool --- it will always be available to keep you up to date.

Successfully fetching a web page --- especially a dynamically generated one --- is a great first step at application-level monitoring. However, it really doesn't serve as a very meaningful test. For example, if your database server is down, web pages may be served properly but contain nothing more than an error message. Webob supports "content matching" to help you detect failures at this level.

Hopefully the "tutorial1.asp" resource at softwarepoetry.com will always be available. If not, it means our corporate web servers are having trouble! The page simply contains the first paragraph of the short story "The Offshore Pirate" by F. Scott Fitzgerald. However, for the purposes of this tutorial we have built the page so that approximately one-third of the time, the attribution lines at the end of the page are not included.

The following addition to the WEBOB.XML file will cause Webob to report an error when the text "The Offshore Pirate" is missing from the tutorial1.asp page. Remember to restart the application (we know, it's annoying).

Example 3. Adding a "must-contain" element to a resource
<?xml version="1.0"?>
<webob-config>
<settings>
<http-port>8001</http-port>
</settings>
<actions>
</actions>
<resources>
<resource name="webob tutorial 1" type="url">
<path>http://www.softwarepoetry.com/webob/doc/tutorial1.asp</path>
<interval>60</interval>
<must-contain>The Offshore Pirate</must-contain>
</resource>
</resources>
</webob-config>

Use your web browser to view the dashboard page. As the page refreshes, about one third of the time the "status" should show up as a content failure and the colored indicator will turn red.

Of course, in your application you'll be looking for some sort of meaningful text. Initially you can just create a page that tries to connect to your database, returning the text "db open ok" or "db open failure" depending on the result. You can then use a "must-contain" element (there is also a "must-not-contain" tag) to search for the appropriate text. Here's a sample page in classic Microsoft ASP that would do the trick in that environment:

Example 4. Checking for database connectivity with Classic ASP
<%
On Error Resume Next
Response.Expires = 0
Set db = Server.CreateObject("ADODB.Connection")
Err.Clear
db.Open "DSN=mydsn;UID=someuser;PWD=somepass"
If (Err.Number = 0) Then
Response.Write "db open ok"
db.Close
Else
Response.Write "db open error: " & Err.Description
End If
%>

You can place as many content matching tags in a resource as you like. In addition to database connectivity, you might check for other resources such as files or web services. For each resource you can write out a pass/fail line and then use a separate "must-contain" tag for each one. This makes it easy to see where problems are occurring.

Later in this tutorial we'll discuss "xpath" content matching, which is much more powerful than simple content matching, but is a bit more complicated to get a handle on.

A live status page is a great way to see current status, but of course you won't be watching it 24 hours a day. Webob is built so that, when the status of a resource changes, it can take a number of different actions. The most critical of these from an operational perspective is email. Each time a resource goes from an OK to ERROR status (and from ERROR to OK), Webob can send an email to your mailbox or pager to notify your operational staff. You can configure as many email addresses as you like, and assign them to as many resources as you like.

Make the following changes to your WEBOB.XML file. Be sure to replace "nobody@softwarepoetry.com" with your own email address, and don't forget to set the "smtp-host" and "smtp-port" to appropriate values for your environment. If you're not sure what "smtp-port" your server uses, use "25" which is almost certainly correct. You know the drill --- restart the application:

Example 5. Adding email notification to a resource
<?xml version="1.0"?>
<webob-config>
<settings>
<http-port>8001</http-port>
<smtp-host>YOURMAILSERVER-NAME</smtp-host>
<smtp-port>YOURMAILSERVER-PORT</smtp-port>
</settings>
<actions>
<action name="emailme" type="email">
<address>nobody@softwarepoetry.com</address>
</action>
</actions>
<resources>
<resource name="webob tutorial 1" type="url">
<path>http://www.softwarepoetry.com/webob/doc/tutorial1.asp</path>
<interval>120</interval>
<must-contain>The Offshore Pirate</must-contain>
<do-action name="emailme" />
</resource>
</resources>
</webob-config>

Note that the action is defined in the "actions" section, identified with the name "emailme" and then referenced within the resource definition with a "do-action" tag. You can have as many "do-action" tags as you like, and within an email "action" you can have as many different "address" elements as you like. See the Webob Reference document for an exhaustive description of the different action types and their elements.

One useful tweak you can make to an action is to add a "skip" attribute that causes Webob to wait for a specified number of consecutive failures before triggering an action. For example, you may have a system that returns a timeout once or twice a week and that's ok for the level of service you want to provide. In this case you might want to add a skip value of 1 so that only if the resource fails twice in a row will email be sent. You can place a "skip" attribute on the action tag itself, or on an individual "do-action" tag (a value on the "do-action" tag will take precedence if both are present). The syntax for adding a skip value to an action tag is as follows:

When taking a system down for scheduled (or unscheduled) maintenance, you can suspend monitoring on a resource-by-resource basis to prevent spurious outage reports. Point your web browser to http://localhost:8001/pause to view the very simple UI for doing so.

The pause/resume interface is built to be scriptable. If you have scripts that reboot/restart systems on a regular basis, it is a trivial task to automatically pause monitoring of one or more resources when the script starts, then resume when they come back online.

You can also set up "ignore" windows on a daily or weekly basis during which monitoring will be automatically paused. See the Webob Reference Manual for details on this functionality.

Once you have visibility into current system status and notification of outage events, you've got two of the three pieces of a great monitoring system.
The last piece is a historical record of system performance. A record of outages is great for generating uptime and availability statistics, but more importantly it can be used to help identify outage patterns that may otherwise go unnoticed. If you can identify that your systems puke every first Monday of the month at 11pm, you're probably 90% of the way to solving the problem.

Webob will keep this history for you, either in a comma-separated text file or a relational database (MySQL or SQLite on Linux, any ODBC-compatible database on Windows). In this tutorial we'll keep things simple and log to a text file, but you can find all the details on setting up a database history in the Webob Reference Manual.

The event history actions (types "file" and "db") write events only when they are resolved --- that is, when the system returns to an "ok" status from an error condition. The resource name, start and end time, duration in seconds and most recent error code and description are written to the history file. The following fragment (now that you're familiar with WEBOB.XML we'll quit including the entire file) instructs Webob to write events for the tutorial1 resource to the file /home/webob/history.csv:

This section pertains to the Windows version of Webob only. In addition to web URLs, Webob can monitor performance counter values. These counters will show up on your dashboard, and may have associated minimum and maximum thresholds that will trigger actions. Multiple counters can be listed within each resource element of type "perfmon".

IMPORTANT: If you will be monitoring counters on remote machines, the Webob service must be configured to run as a user that has sufficient privileges to view the counters. The interface for setting this account is slightly different on the various Windows platforms. Open the "Services" Control Panel (under Administrative Tools), right-click the "Webob by Software Poetry" service and choose "Properties". Under the "Log On" tab, choose "This Account" and specify an account with the appropriate rights.

Counters are specified using a "counter path", generally in the following form:

\\Machine\PerfObject(ParentInstance/ObjectInstance#InstanceIndex)\Counter

More information on counter paths is available from Microsoft; you can also open the Performance Monitor application and look at counter properties to see sample syntax.

The following changes will monitor CPU and memory usage on the Webob server every 10 seconds, triggering the "emailme" action if thresholds are crossed for more than four consecutive checks:

Example 8. Adding two performance counters with thresholds
...
<resources>
<resource name="webob performance" type="perfmon">
<interval>10</interval>
<do-action name="emailme" skip="4" />
<counter>
<friendly-name>webob cpu usage</friendly-name>
<path>\\.\Processor(_Total)\% Processor Time</path>
<threshold type="max">90</threshold>
</counter>
<counter>
<friendly-name>webob free memory</friendly-name>
<path>\\.\Memory\Available MBytes</path>
<threshold type="min">50</threshold>
</counter>
</resource>
...

Notice that the interval and action are associated with the group of counters; if any one of the thresholds is crossed the actions will be taken.

The default dashboard is quick, easy and does a fine job at displaying current status. However, it also masks the real power of deep application monitoring with Webob. Point your web browser at http://localhost:8001/status and you'll see the real data behind the dashboard, which should look something like this:
This XML is the data that drives the default dashboard. Webob includes an XSLT engine that can transform the data into other XML forms or, more frequently, HTML. As you may have noticed in the address bar of your web browser, the default dashboard URL http://localhost:8001/dashboard is really just a shortcut to a particular transformation: http://localhost:8001/status?xsl=dashboard.xsl&mime=html. You can modify the default transformation, or specify your own by placing an XSLT file into the "web" directory and referencing it using this syntax. Note that the "mime" parameter is important, because it tells Webob what text format your transform generates.

If XSLT isn't in your preferred bag of tricks, you can create a custom dashboard using any dynamic page generation technology that can reference an XML tree: ASP, PHP, whatever. You can find a complete description of the Webob status XML format in the Webob Reference Manual.

Finally we get to our very favorite feature of Webob: custom XML fragments. Instead of simply returning text such as "db open ok" from your resource pages, you can return an XML tree representing the status of your resource at a very detailed level.

For the purposes of this tutorial, we've created a page that returns a current stock quote as an XML fragment. Point your web browser at the URL http://www.softwarepoetry.com/webob/doc/tutorial2.asp?symbol=msft and choose "View Source" to see the fragment. Now let's add this URL as a resource, using the "include-body-xml" attribute to indicate that we should insert the XML fragment into the status XML:

Example 9. Inserting a custom XML fragment using the "include-body-xml" attribute
...
<resources>
<resource name="msft stock quote" type="url" include-body-xml="1">
<path>http://www.softwarepoetry.com/webob/doc/tutorial2.asp?symbol=msft</path>
<interval>300</interval>
</resource>
...

Restart Webob and take a look at the status XML (http://localhost:8001/status). The "msft stock quote" resource element now contains the body fragment returned by the resource, which can be easily incorporated into your custom dashboard:
Having incorporated custom XML fragments into our system, of course we'd like to be able to validate the data coming back. In our stock quote example, perhaps we want to be notified if the Microsoft share price drops below $50. Webob supports using the XPath syntax for creating arbitrarily complex rules that can trigger actions.

XPath is a syntax for defining parts of an XML document; a great tutorial is available at W3Schools. It can seem a bit daunting at first, but once you get the hang of it the things you can do with XPath in Webob are fantastic.

Webob allows you to specify an "xpath-must-match" (or "xpath-must-not-match") element, the text of which is an XPath statement that defines an element search. If at least one element matching the statement is found, the condition is met. If not, any actions associated with the resource are triggered. Make the following change to your WEBOB.XML file and restart the application:

Example 10. Advanced content matching using the "xpath-must-match" element
...
<resources>
<resource name="msft stock quote" type="url" include-body-xml="1">
<path>http://www.softwarepoetry.com/webob/doc/tutorial2.asp?symbol=msft</path>
<interval>300</interval>
<xpath-must-match info="MSFT &lt; $50">
//stockquote[symbol = 'MSFT' and quote &gt;= 50]
</xpath-must-match>
</resource>
...

This example also shows the optional "info" attribute on the "xpath-must-match" element (it may also be used on a "must-contain" element). If present, this text is used as a description when the test fails --- often much nicer than an email with a long cryptic XPath expression!

Finally, watch for the "&lt;" and "&gt;" entities in the expression. The Webob configuration file must be valid XML, so the raw characters "<", ">", "&" and """ must be "escaped" using their corresponding entity names ("&" should be put in as "&amp;" and a quote mark as "&quot;"). An alternative to this is to wrap elements containing lots of these characters with a "CDATA" tag; we'll see an example of this in the next section when we look at external processes.

Once you have a feel for XML fragments and XPath matching, you can begin to monitor at a level that most organizations never get anywhere near. Perhaps you're an e-commerce company and want to ensure that all orders are closed within 48 hours. Build a resource to select a count of open orders older than 48 hours and create an XML fragment. Use XPath to trigger actions when the value is greater than 0 and you have an incredibly valuable business tool.

One of the critical advantages to Webob is that it allows you to consolidate all of your monitoring activity in one place. We make sure that you can get all the data you need into (and out of) Webob by enabling process-based resources and actions. If you can write a script to do it, you can integrate it with Webob. The "scripts" directory contains a number of process examples that you might find useful either as is or as a starting point for your own system.

There are times when you might like to take action beyond sending email or logging history. For example, you may want to run a script that automatically restarts affected systems. Webob allows you to do this using process-based actions.
In the following example, we'll go back to our "Offshore Pirate" resource and use the Windows Messenger Service to post an alert to the Webob machine (note that you have to be using the Windows version of Webob for this to work!). This example also illustrates the use of printf-style parameter substitution to pass relevant information to the process --- you can find an exhaustive list of supported codes in the Webob Reference Manual.

Example 11. Triggering a message to the local console
...
<actions>
<action name="popup" type="process">
<command>net send localhost webob status change for %n: %d</command>
</action>
...
<resources>
<resource name="webob tutorial 1" type="url">
<do-action name="popup" />
...

The text of the command element is used to spawn a new process. On Windows, the Windows Script Host (cscript.exe) is a great environment for writing actions, although anything you can run from a command window will work as well. On Linux, you've got a million choices. Remember that your process will run with the permissions of the Webob process!

Webob knows how to run an external process and interpret the results. The exit code of the process should be 0 on success; otherwise it is assumed that the script detected an error and actions will be triggered. The standard output of the process is used as the "body", so all of the same body inclusion and content matching techniques can be used to detect issues using process-based resources.

The Linux example below shows how you can track available disk space in the /tmp directory on the Webob machine. Obviously, you will usually want to use rsh or some other device to execute your processes on target machines. The command we will execute is:

df -k /tmp | awk '($1 != "Filesystem") { print "<disk><mnt>" $6 "</mnt><avail>" $4 "</avail></disk>" }'

Note that normally we'd have to do quite a bit of really ugly entity escaping to make this work in WEBOB.XML. Instead we'll use a "CDATA" tag to tell the XML parser to take a valium. The resource element (remember to run this only if you're using the Linux version of Webob!) looks like this (note that I've added a linebreak in the command for readability that should not be present):

Example 12. Adding a process resource to monitor /tmp disk space on Linux
...
<resources>
<resource name="/tmp disk space" type="process" include-body-xml="1">
<path><![CDATA[
df -k /tmp | awk '($1 != "Filesystem") { print "<disk><mnt>"
$6 "</mnt><avail>" $4 "</avail></disk>" }'
]]></path>
<interval>300</interval>
</resource>
...

Disk space is now available to your dashboard, but more importantly it can be monitored using XPath matching. If the available space drops below some threshold (say 100KB), you can trigger actions using the following element:

Example 13. Monitoring /tmp for sufficient disk space
...
<resources>
<resource name="/tmp disk space" type="process" include-body-xml="1">
<path><![CDATA[
df -k /tmp | awk '($1 != "Filesystem") { print "<disk><mnt>"
$6 "</mnt><avail>" $4 "</avail></disk>" }'
]]></path>
<interval>300</interval>
<xpath-must-match info="tmp space low!">
//resource[name = '/tmp disk space']/disk[avail &gt;= 100]
</xpath-must-match>
</resource>
...

If you made it this far, you should have an excellent sense of what Webob can do. The Webob Reference Manual is the place to go for detailed syntax specifications and an exhaustive list of options. If you have any issues, please drop us an email. We'd love to hear what you're doing with Webob!
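
As a recap, the pieces covered in this tutorial can be combined into a single configuration. The following is only a sketch assembled from the examples above, not a file shipped with Webob. The mail server name and email address are placeholders, the port 25 is the fallback suggested earlier, and the "skip" attribute on the "action" tag follows the description in the email section rather than a syntax example reproduced here, so treat its exact form as an assumption and check the Webob Reference Manual.

```xml
<?xml version="1.0"?>
<webob-config>
<settings>
<http-port>8001</http-port>
<!-- placeholder: replace with your own mail server -->
<smtp-host>YOURMAILSERVER-NAME</smtp-host>
<smtp-port>25</smtp-port>
</settings>
<actions>
<!-- assumption: skip="1" here means email is sent only after two consecutive failures -->
<action name="emailme" type="email" skip="1">
<!-- placeholder: replace with your own address -->
<address>nobody@softwarepoetry.com</address>
</action>
</actions>
<resources>
<!-- simple content matching, as in Example 5 -->
<resource name="webob tutorial 1" type="url">
<path>http://www.softwarepoetry.com/webob/doc/tutorial1.asp</path>
<interval>60</interval>
<must-contain>The Offshore Pirate</must-contain>
<do-action name="emailme" />
</resource>
<!-- XML fragment plus XPath matching, as in Example 10 -->
<resource name="msft stock quote" type="url" include-body-xml="1">
<path>http://www.softwarepoetry.com/webob/doc/tutorial2.asp?symbol=msft</path>
<interval>300</interval>
<xpath-must-match info="MSFT &lt; $50">
//stockquote[symbol = 'MSFT' and quote &gt;= 50]
</xpath-must-match>
<do-action name="emailme" />
</resource>
</resources>
</webob-config>
```

Keeping one email action and referencing it from each resource with "do-action" means a change of address or skip value only has to be made in one place. Remember to restart Webob after editing the file.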