
Hacking entop

2010/09/15 mazenharake

entop was built to be extensible, mostly because one can’t build a monitoring tool that suits everyone and every project. In this post I’ll go through how to extend entop so that it shows the information that is important to you and/or your project.

Future releases of entop will have a better way to specify which callback modules to use when starting entop. This will let you use your own modules without recompiling entop every time, and let you keep some generic callback modules around between projects. Currently, though, you have to recompile entop, but I’m getting ahead of myself.

The UI

The entop UI (just like top) has two sections; headers and a table.

The headers consist of three parts. The first part is the static information gathered from the remote node: data that never (or very rarely) changes between polls. It is displayed on the first line and is not intended to be changed (unless you change it in the code). The second part consists of four rows of data that are intended to be customized by the callback function; we will come back to this part later. The third and last part contains some information on how the rows in the table are displayed and how long it took to fetch the data from the remote node. This is displayed on line five (the last line) and is not intended to be changed (again, unless you change the code).

The table consists of two parts: columns and rows (yes, really!). Column titles show what the information in each column is supposed to be, and each row after that is the item (row/data) itself. No-brainer. Both the columns and the rows are intended to be customizable, so you can show as many columns as you want with whatever information you want.

The collector module

When entop starts up it tries to connect to the node that was given to it as an argument. If successful, it reads the static data from that node and then starts polling the node for the dynamic data. Before the polling starts, entop pushes the binary code of the collector module to the remote node, and it does so every time it establishes a connection. Which module is used as the collector module is currently specified directly in the code, but future versions of entop will let you change this through configuration or a CLI argument.

Note: Because a module is pushed and loaded on the other node it is important that the name of the module doesn’t clash with something else so be careful when naming your modules.

The collector module needs to have one function exported, get_data/0. This function is the one that will be called every time the node is polled. This function will return data that is later used by a format module (we’ll get to that module later). The get_data/0 function MUST return a tuple of three:

get_data() ->
    ...
    {ok, HeaderData, TableData}

The first element is the atom ok. The second element can be any Erlang term; it will be used later on to format the header lines, so it must be something that makes sense to the format module. The third element is a list of Erlang terms where each item is the data used to format one row in the table. Each item in the list must also be something that makes sense to the format module. The default collector module in entop is currently entop_collector.erl.

Note: HeaderData can be any Erlang term and TableData must be a list of Erlang terms.
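As a minimal sketch of the contract described above (not entop's own entop_collector.erl), a collector could look like the following. The module name my_minimal_collector and the choice of data (process count and message queue lengths) are assumptions made purely for illustration.

```erlang
%% my_minimal_collector.erl (hypothetical module name, for illustration only)
-module(my_minimal_collector).
-export([get_data/0]).

%% Runs on the remote node every time it is polled.
get_data() ->
    %% HeaderData may be any term; here it is a small proplist.
    HeaderData = [{process_count, erlang:system_info(process_count)}],
    %% TableData must be a list; each element becomes one table row later.
    %% process_info/2 returns undefined for a process that has already died.
    TableData = [{erlang:pid_to_list(Pid),
                  erlang:process_info(Pid, message_queue_len)}
                 || Pid <- erlang:processes()],
    {ok, HeaderData, TableData}.
```

Whatever shape you pick for HeaderData and the TableData elements, the format module has to agree with it.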

The format module

When the data has been fetched entop will call a format module to format the information it will display on the screen. To do this the format module needs to implement the init/1, header/2 and row/2 functions.

The init/1 will be called when the callback is initialized (once when the application is started and a connection was successfully established with the remote node). It is called with one argument which is the node name of the target node and must return a tuple of three:

init(Nodename) ->
    ...
    {ok, {ColumnSpec, DefaultColumn}, State}

The first element is the atom ok. The second element is a tuple of two, in which the first element is a column specification and the second element is the number of the column to sort on by default. The third element is the user-defined state.

A column specification looks like this:

    Columns = [ {"Title", Width, Options}, {"Title2", Width, Options}, ...].

Each column is specified by a tuple of three. The first element, the title, must be a flat io list. The second element, the width, is the number of characters reserved for displaying the column; it should be larger than three, and excess characters will be cut off. The third element, options, is a property list with options that can be applied to that column. Currently there is only one such option: {align, ...}, which specifies whether the text should be aligned right or left (default left) in the column.

After the data has been retrieved from the remote node the header/2 function will be called first. The first argument to this function is the HeaderData which was returned by the get_data/0 function and the second argument is the user state data which was returned by init/1. The return value of this function must be a tuple of three as described below:

header(HeaderData, State) ->
    Line1 = "First row in the header (second after the node info)",
    Line2 = "Second row in the header ...",
    Line3 = "Third row ...",
    Line4 = "...",
    {ok, [Line1, Line2, Line3, Line4], NewState}.

The second element of the return value must be a list of exactly four lines. Each line in that list must be a flattened io list with no newlines, formatted exactly the way you wish it to appear; entop will not touch it. The lines will be shown on rows 1-4 in the header section of the GUI. If a line is too long for the width of the screen it will be cut off.

Side note: I am aware that having a list with a fixed size is not very intuitive and I will probably change this to be a tuple of size four later on but currently this is because I’m lazy in the background implementation 😉
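Since header lines have to be flat and free of newlines, a small helper can take care of that in one place. This is only a sketch; the module name header_line and the function format/2 are hypothetical and not part of entop.

```erlang
%% header_line.erl (hypothetical helper, not part of entop)
-module(header_line).
-export([format/2]).

%% Formats like io_lib:format/2, then flattens the result and drops
%% any newline or carriage-return characters.
format(Fmt, Args) ->
    Flat = lists:flatten(io_lib:format(Fmt, Args)),
    [C || C <- Flat, C =/= $\n, C =/= $\r].
```

For example, header_line:format("Procs: ~p~n", [42]) yields the flat string "Procs: 42".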

After the headers have been processed the table will be populated. For each element in the TableData list, the function row/2 will be called together with the user state.

row(TableDataElement, State) ->
    ...
    {ok, {"Col1", [{test}], atom, 1337}, NewState}.

The second element of the return value must be a tuple whose size is exactly the length of the column list specified in the init/1 function (size(ReturnTuple) == length(Columns)). The values in the tuple populate the cells of that row and can be any term; whatever term is put there will be formatted, made into a string, flattened and truncated to the width of the column. If a particular row is to be skipped (not included in the table), return {ok, skip, State} instead.
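Putting the three callbacks side by side, a bare-bones format module might look like the sketch below. The module name my_queue_format and the row data shape {PidString, QueueInfo} are assumptions for illustration; they are not taken from entop itself.

```erlang
%% my_queue_format.erl (hypothetical module name, for illustration only)
-module(my_queue_format).
-export([init/1, header/2, row/2]).

%% Two columns, sort on the first one, no user state.
init(_Node) ->
    Columns = [{"Pid", 15, []}, {"Queue", 8, []}],
    {ok, {Columns, 1}, undefined}.

%% Always exactly four flat lines without newlines.
header(_HeaderData, State) ->
    {ok, ["Message queues", "-", "-", "-"], State}.

%% Assumed row data: {PidString, {message_queue_len, N}}, or
%% {PidString, undefined} when the process died before it was inspected.
row({_PidStr, undefined}, State) ->
    {ok, skip, State};                 % leave dead processes out of the table
row({PidStr, {message_queue_len, Len}}, State) ->
    {ok, {PidStr, Len}, State}.        % tuple size == number of columns
```

The skip clause comes first so that rows without usable data never reach the clause that builds the two-element tuple.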

Putting it together: An application viewer

So, as a useful example, let’s create a view that shows all the applications that are loaded on the target node, which state they are in (started or just loaded) and whether they have a main process (i.e. the application is not a library).

I recommend starting with the view; this lets you think about what you want to show and how, and it gives you a mock-up to work from. Start by creating the file entop_application_format.erl (never mind the corny name, whatever floats your boat for now :))

First we need to think about which columns we want to show. My idea was to have six columns showing the following information: Name, Description, Version, State (Loaded/Started), Type (permanent/temporary) and Pid (if it has one).

Enter the module and export attribute and implement the init/1 function as follows:

%% entop_application_format.erl
-module(entop_application_format).
-export([init/1, header/2, row/2]).

init(_Node) ->
    Columns = [{"Name", 13, []},
               {"Description", 20, []},
               {"Version", 8, []},
               {"State", 8, []},
               {"Type", 12, []},
               {"Pid", 10, []}],
    {ok, {Columns, 1}, undefined}.

This specifies the six columns and their widths. It also specifies that the first column is the one to sort on by default, and since we don’t care about the state we set it to undefined.

Usually I now create a dummy header/2 and row/2 function to give myself a mockup of how it will be (and perhaps to get the column width right) but I will skip that step here and go straight to implementation. Below is the header function which will be implemented:

header(HeaderProplist, State) ->
    SysMem = proplists:get_value(system, HeaderProplist),
    ProcMem = proplists:get_value(processes, HeaderProplist),
    AtomMem = proplists:get_value(atom, HeaderProplist),
    BinMem = proplists:get_value(binary, HeaderProplist),
    CodeMem = proplists:get_value(code, HeaderProplist),
    EtsMem = proplists:get_value(ets, HeaderProplist),

    Line1 = lists:concat(["System: ", SysMem, ", Process: ", ProcMem, ", Atom: ", AtomMem]),
    Line2 = lists:concat(["Binary: ", BinMem, ", Code: ", CodeMem, ", ETS: ", EtsMem]),
    Line3 = "Machine uptime: ",
    Line4 = proplists:get_value(machine_uptime, HeaderProplist),

    {ok, [Line1, Line2, Line3, Line4], State}.

To make the concept clearer I have spread the function out for readability. Lines 2-7 read various values from a proplist; this means that we expect a proplist from the get_data/0 function, which we implement later. During development I don’t make any assumptions about the data structures here but rather use dummy values, and come back and change them to fit the real data structure after I have implemented the collector module. Lines 9-12 format the four strings that we need, and finally I return them as a list (remember: it must be a list of four!).

After the header has been implemented we need to provide a callback to fill the rows in the table. The row function will be called for every element in the TableData. Here is the row function:

row({Name, Desc, Vers, Pid, Type}, State) ->
    {AppState, PidStr} =
        case Pid of
            not_running ->
                {"Loaded", "-"};
            undefined ->
                {"Started", "-"};
            Pid ->
                [_, Mdl, End] = string:tokens(erlang:pid_to_list(Pid), "."),
                {"Started", lists:concat(["<0.", Mdl, ".", End])}
        end,
    {ok, {Name, Desc, Vers, AppState, Type, PidStr}, State}.

As mentioned before, I wouldn’t normally make any assumptions about the data structure I get in the function head, but this time I happen to know ;). Most of this should be self-explanatory, but a note about lines 9-10: if you retrieve a pid (as a pid type) from a remote node, its representation on the local node indicates that it isn’t a local pid by setting the first identifier to some number other than 0. This is useful but ugly, because in this case you want it to look as if you were on the other node showing its data. One hack to overcome this is to turn the pid into a string on the remote side when you ask for it; the string is then based on a local pid and thus has the local format. Another hack is to fetch the remote pid, turn it into a string, strip away the “remote” part and replace it with a “local” part. The latter is what I’m doing here. The return value of this function contains a tuple with a value for each column in the table, in our case exactly six since we have six columns.
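The two pid hacks described above can be pulled out into a tiny helper to make them easier to see in isolation. This is a sketch only: the module name pid_display and both function names are hypothetical, and the string handling mirrors the string:tokens/2 approach used in row/2.

```erlang
%% pid_display.erl (hypothetical helper, not part of entop)
-module(pid_display).
-export([on_remote_side/1, localize/1]).

%% Hack 1: call this inside the collector, i.e. on the remote node,
%% where the pid is local and therefore prints as "<0.X.Y>".
on_remote_side(Pid) when is_pid(Pid) ->
    erlang:pid_to_list(Pid).

%% Hack 2: take the local string form of a remote pid, for example
%% "<6032.45.0>", and replace its first field with 0.
localize(PidStr) when is_list(PidStr) ->
    [_NodePart, Mdl, End] = string:tokens(PidStr, "."),
    lists:concat(["<0.", Mdl, ".", End]).
```

With this, pid_display:localize("<6032.45.0>") gives "<0.45.0>". The first variant avoids string surgery altogether, at the cost of shipping strings instead of real pids back to the local node.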

The final part is to provide the collector module, call it entop_application_collector.erl:

%% entop_application_collector.erl
-module(entop_application_collector).
-export([get_data/0]).

get_data() ->
    HeaderProplist = [{machine_uptime, os:cmd("uptime") -- "\n"} | % header lines must not contain newlines
                      erlang:memory([system, processes, atom, binary, code, ets])],

    AppInfo = application:info(),
    Loaded = proplists:get_value(loaded, AppInfo),
    Running = proplists:get_value(running, AppInfo),
    Started = proplists:get_value(started, AppInfo),

    MapFun = fun({Name, Desc, Vers}, Acc) ->
                      AppPid = proplists:get_value(Name, Running, not_running),
                      StartType = proplists:get_value(Name, Started, not_running),
                      [ {Name, Desc, Vers, AppPid, StartType} | Acc ]
                  end,

    Applications = lists:foldl(MapFun, [], Loaded),

    {ok, HeaderProplist, Applications}.

This function first produces a proplist, which is later passed to header/2 (lines 6-7), and then a list of tuples, each of which is later passed to row/2 (one call per element in the list).

One last step: specifying the callbacks. Currently there is no easier way to do this, but it will probably be part of future versions of the tool. To specify the callbacks, edit the file entop.hrl and change the state record definition so that the field callback = entop_application_format and the field remote_module = entop_application_collector.

If you did it all right, it should look something like this:

If you want to continue experimenting try adding a “Crash’n Recovery” notification which says how many times an application has crashed and restarted during the time it has been monitored (Tip: Use the state to monitor the pids).
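As a starting point for that experiment, here is one possible way to keep restart counts in the user state. It is a rough sketch under several assumptions: the module name restart_counter and its functions are hypothetical, the state is a plain list of {Name, {LastSeenPid, Count}} tuples, and any change to a new pid is counted as a restart (including the very first start of an application that was previously only loaded).

```erlang
%% restart_counter.erl (hypothetical helper, not part of entop)
-module(restart_counter).
-export([new/0, observe/3]).

%% The state is a list of {Name, {LastSeenPid, RestartCount}}.
new() -> [].

%% Record what was seen for Name in this poll.
%% Returns {RestartCount, NewState}.
observe(Name, Pid, State) ->
    case lists:keyfind(Name, 1, State) of
        false ->
            %% First time we see this application.
            {0, [{Name, {Pid, 0}} | State]};
        {Name, {Pid, Count}} ->
            %% Same value as in the previous poll: nothing happened.
            {Count, State};
        {Name, {_Old, Count}} when is_pid(Pid) ->
            %% A new pid showed up: count it as a restart.
            New = Count + 1,
            {New, lists:keystore(Name, 1, State, {Name, {Pid, New}})};
        {Name, {_Old, Count}} ->
            %% No pid right now (not_running or undefined): just remember it.
            {Count, lists:keystore(Name, 1, State, {Name, {Pid, Count}})}
    end.
```

In row/2 you would call restart_counter:observe(Name, Pid, State), put the returned count in an extra seventh column (which also has to be added in init/1) and return the new state instead of the old one. The state returned from init/1 would then be restart_counter:new() rather than undefined.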

Good luck, have fun

Categories: entop Tags: cecho, entop, Erlang
