Create Custom Collectd Plug-ins for Linux

Objective

To write a custom collectd plugin for Linux instances so that the data can be collected and graphs can be drawn and displayed in the RightScale Dashboard.

Steps

Create the Custom Plugin

Let's try a simple example: a plug-in that collects the CPU load by calling 'uptime' and parsing its output. This is not necessarily the best way to collect the data, but it keeps the example simple. Save the following contents in a file named mycpuload.rb.

Note: The step must be 20 for all RRDs and data must be sent at 20 second intervals. Anything else is a waste of bandwidth and processing.

#!/usr/bin/env ruby
require 'getoptlong'

# The name of the collectd plugin, something like apache, memory, mysql, interface, ...
PLUGIN_NAME = 'mycpuload'

def usage
  puts("#{$0} -h <host_id> [-i <sampling_interval>]")
  exit
end

# Main
begin
  # Sync stdout so that it will flush to collectd properly. 
  $stdout.sync = true

  # Parse command line options
  hostname = nil
  sampling_interval = 20  # in seconds, Default value
  opts = GetoptLong.new(
    [ '--hostid', '-h', GetoptLong::REQUIRED_ARGUMENT ],
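    # The option itself is optional, but when given it needs a value;
    # an empty value would become 0 and turn the loop into a busy loop.
    [ '--sampling-interval', '-i',  GetoptLong::REQUIRED_ARGUMENT ]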
  )
  opts.each do |opt, arg|
    case opt
      when '--hostid'
        hostname = arg
      when '--sampling-interval'
        sampling_interval = arg.to_i
    end
  end
  usage if !hostname

  # Collection loop
  while true do
    start_run = Time.now.to_i
    next_run = start_run + sampling_interval

    # collect data and print the values
    # uptime lists the 1-, 5-, and 15-minute load averages in that order; capture the second
    data = `uptime`[/load average: [\d.]+,\s*([\d.]+)/, 1] # get 5-minute load average
    puts("PUTVAL #{hostname}/#{PLUGIN_NAME}/gauge-5_minute_load #{start_run}:#{data}")

    # sleep to make the interval
    while((time_left = (next_run - Time.now.to_i)) > 0) do
      sleep(time_left)
    end
  end
end

IMPORTANT NOTE: The value you choose for your PLUGIN_NAME variable must be unique and must not conflict with any of the existing RightScale collectd plugin names. For example, using 'haproxy' as a plugin name causes an error and may lead to unexpected behavior.
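
The script relies on the shape of the 'uptime' output, so it can help to test the parsing on its own. The following is a minimal sketch, assuming the usual Linux format in which "load average:" is followed by the 1-minute, 5-minute, and 15-minute values in that order. The parse_load_averages helper and the sample line are hypothetical and not part of the plugin above.

```ruby
# Hypothetical helper for illustration; not part of mycpuload.rb.
# Returns the 1-, 5-, and 15-minute load averages as floats, or nil
# when the line has no "load average:" section.
def parse_load_averages(uptime_line)
  match = uptime_line.match(/load averages?: ([\d.]+),?\s+([\d.]+),?\s+([\d.]+)/)
  return nil unless match
  match.captures.map(&:to_f)
end

# Made-up sample line in the common Linux format (an assumption).
sample = ' 16:00:46 up 3 days,  2:11,  1 user,  load average: 0.01, 0.04, 0.08'
one_min, five_min, fifteen_min = parse_load_averages(sample)
puts "1-minute: #{one_min}, 5-minute: #{five_min}, 15-minute: #{fifteen_min}"
```

Returning nil for an unexpected line lets the caller skip a sample instead of sending an empty value to collectd.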

 

For the next part, you'll need your server's ID. On a v4 image, read it from the /var/spool/ec2/meta-data/instance-id file. On a v5 image, you need to create a very basic RightScript to get it for you. Copy the following code sample into a new RightScript and name it something like "Get My Server ID." Once you've created the RightScript, run it, then log in to the server via SSH and read the /myserverid.txt file, which contains your server's ID. (You can also use this RightScript on v4 images by replacing $RS_INSTANCE_UUID with $EC2_INSTANCE_ID.)

#!/bin/bash

echo $RS_INSTANCE_UUID > /myserverid.txt

Now that we have the Server Instance ID, we can run the plugin from the command line. This is what it looks like: (Replace 01-EHII6FZK7KAAB below with your own Server Instance ID)

[root@ip-10-251-70-47 /]# cd /usr/lib/collectd/plugins
[root@ip-10-251-70-47 plugins]# chmod +x mycpuload.rb  #give permissions to execute the file
[root@ip-10-251-70-47 plugins]# ./mycpuload.rb -h 01-EHII6FZK7KAAB  #Terminate the script with Ctrl+c
PUTVAL 01-EHII6FZK7KAAB/mycpuload/gauge-5_minute_load 1207188959:0.01
PUTVAL 01-EHII6FZK7KAAB/mycpuload/gauge-5_minute_load 1207188979:0.00
PUTVAL 01-EHII6FZK7KAAB/mycpuload/gauge-5_minute_load 1207188999:0.08
./mycpuload:46:in `sleep': Interrupt
	from ./mycpuload:46
[root@ip-10-251-70-47 plugins]# 

 

If you look carefully at the timestamps (the 10-digit number), you'll notice that the three lines are all 20 seconds apart. The number at the end of the line is the load reported by uptime.
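
The spacing can also be checked mechanically. This is a small sketch, using the three sample lines from the run above, that extracts the timestamps and prints the gaps between consecutive samples. The variable names are illustrative only.

```ruby
# Sample lines copied from the command-line run above.
output = <<~LINES
  PUTVAL 01-EHII6FZK7KAAB/mycpuload/gauge-5_minute_load 1207188959:0.01
  PUTVAL 01-EHII6FZK7KAAB/mycpuload/gauge-5_minute_load 1207188979:0.00
  PUTVAL 01-EHII6FZK7KAAB/mycpuload/gauge-5_minute_load 1207188999:0.08
LINES

# Each PUTVAL line ends with "<timestamp>:<value>"; keep the timestamp.
timestamps = output.lines.map { |line| line.split.last.split(':').first.to_i }

# Differences between consecutive timestamps, in seconds.
gaps = timestamps.each_cons(2).map { |earlier, later| later - earlier }
puts gaps.inspect # prints [20, 20]
```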

Now it's time to explain what the 01-EHII6FZK7KAAB/mycpuload/gauge-5_minute_load string represents.

The format of this string is <instance-id>/<plugin>-<plugin_instance>/<type>-<type_instance>

The meaning of each field is:

  • instance-id: This is the ID that identifies the server that is sending the data to the monitoring servers. More information on this follows.
  • plugin: identifies the plugin, which is typically associated with an application or a resource. Examples include apache, mysql, squid, cpu, memory, etc.
  • plugin_instance: identifies the instance of an application or resource when there are several. Examples are cpu-0 and cpu-1 on dual-core servers, or df-mnt and df-root for the two filesystems on small instances.
  • type: identifies the type of data being collected.  Your custom plugin must use one of the Supported Graphs Types, defined in types.db.
  • type_instance: the name of the variable being collected, or the instance of the variable of the given type being collected. Examples are: (for the cpu type) idle, wait, busy; (for the 'mysql_command' type) selects, updates, executes.

All of this can be pretty confusing at first. To try and make the fields easier to understand, note how a '-' separates plugin and plugin_instance or type and type_instance, while an '_' is sometimes used within any of these four items. The best way to understand how all this works is to look at where each of these identifiers shows up on the web pages.

  • Each <plugin>-<plugin_instance> combination results in a menu box at the top of the monitoring page
  • The graph <type> determines how the data is interpreted and how the graphs are drawn by the sketchy servers.  The preceding example uses the 'gauge' type.  You must use one of the Supported Graphs Types; otherwise sketchy will not be able to create any graphs.  The 'counter' and 'gauge' graph types are recommended for custom plug-ins.
  • The <type_instance> shows up in the title of the graph(s)
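
To make the field layout concrete, here is a minimal sketch that splits an identifier into the five fields described above. It assumes that '/' separates the three parts and that the first '-' inside the plugin and type parts separates the name from its instance. The parse_identifier helper is hypothetical; collectd does this parsing itself.

```ruby
# Hypothetical helper for illustration only.
# Splits "<instance-id>/<plugin>-<plugin_instance>/<type>-<type_instance>"
# into its fields. Missing instances come back as nil.
def parse_identifier(identifier)
  instance_id, plugin_part, type_part = identifier.split('/', 3)
  # Only the first '-' is a separator; '_' may appear inside any field.
  plugin, plugin_instance = plugin_part.split('-', 2)
  type, type_instance = type_part.split('-', 2)
  {
    instance_id: instance_id,
    plugin: plugin,
    plugin_instance: plugin_instance,
    type: type,
    type_instance: type_instance
  }
end

fields = parse_identifier('01-EHII6FZK7KAAB/mycpuload/gauge-5_minute_load')
fields.each { |name, value| puts "#{name}: #{value.inspect}" }
```

For the example plugin, plugin_instance is nil because mycpuload reports a single resource, while type is 'gauge' and type_instance is '5_minute_load'.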

Add the New Plugin

Now, you need to link the new plugin into collectd.conf.  Depending on the distribution, create a mycpuload.conf file in either /etc/collectd.d/ (CentOS) or /etc/collectd/conf/ (Ubuntu) with the following content. (Make sure you again replace the 01-EHII6FZK7KAAB identification number in the sample with your own Server Instance ID that you obtained earlier in this tutorial.)

LoadPlugin exec
<Plugin exec>
  #     userid    plugin executable            plugin        args
  Exec "my_user" "/usr/lib/collectd/plugins/mycpuload.rb" "-h" "01-EHII6FZK7KAAB"
</Plugin>
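
If you prefer not to edit the ID by hand, the stanza above can be generated. This is a minimal sketch; the exec_plugin_conf helper is hypothetical, and writing the result to the right directory for your distribution is left to you.

```ruby
# Hypothetical helper: builds the exec-plugin stanza shown above
# for a given user, script path, and Server Instance ID.
def exec_plugin_conf(user, script_path, instance_id)
  <<~CONF
    LoadPlugin exec
    <Plugin exec>
      Exec "#{user}" "#{script_path}" "-h" "#{instance_id}"
    </Plugin>
  CONF
end

# Replace the ID with your own Server Instance ID.
conf = exec_plugin_conf('my_user', '/usr/lib/collectd/plugins/mycpuload.rb', '01-EHII6FZK7KAAB')
puts conf
```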

 

Now, we need to create the user my_user, used by collectd to run the script:

[root@ip-10-251-70-47 /]# cd /usr/lib/collectd/plugins/
[root@ip-10-251-70-47 plugins]# useradd my_user -s /sbin/nologin -M
[root@ip-10-251-70-47 plugins]# chown my_user:my_user mycpuload.rb  #change owner and group
[root@ip-10-251-70-47 plugins]# ls -l
total 12
-rwxr-xr-x 1 haproxy haproxy 7066 Sep 28 10:23 haproxy
-rwxr-xr-x 1 my_user my_user 1220 Sep 30 15:32 mycpuload.rb

Allow my_user in sudoers

Allow my_user to execute mycpuload.rb by using the visudo command and adding the following line after "root    ALL=(ALL)       ALL":

 

my_user ALL=(ALL) NOPASSWD:/usr/lib/collectd/plugins/mycpuload.rb

 

Restart collectd

Now we need to restart collectd and check for correct script execution:

[root@ip-10-251-70-47 plugins]# service collectd restart
Stopping collectd:                                         [  OK  ]
Starting collectd:                                         [  OK  ]
[root@ip-10-251-70-47 plugins]# tail /var/log/messages        #/var/log/syslog on Ubuntu
Sep 30 16:00:46 ip-10-251-70-47 collectd[23768]: Exiting normally
Sep 30 16:00:47 ip-10-251-70-47 collectd[23768]: exec plugin: Sent SIGTERM to 23777
Sep 30 16:00:47 ip-10-251-70-47 collectd[23768]: exec plugin: Sent SIGTERM to 23778
[root@ip-10-251-70-47 plugins]# ps aux | grep mycpuload
my_user  23591  0.0  0.1   3528  1936 ?        S    15:37   0:00 ruby /usr/lib/collectd/plugins/mycpuload.rb -h 01-EHII6FZK7KAAB

 

If the script is being executed by my_user, you should find the "mycpuload" graph in the Dashboard.

Last modified
14:54, 4 Mar 2015

© 2006-2014 RightScale, Inc. All rights reserved.
RightScale is a registered trademark of RightScale, Inc. All other products and services may be trademarks or servicemarks of their respective owners.