Hi
I think I've found a bug when using the unixsock plugin. The problem is
related to the MISSING state: when the daemon receives no value for a
while, the cache entry is marked as MISSING, but the GETVAL and LISTVAL
commands still return the last value even though the machine is no
longer reporting. Some utilities, such as collectd-nagios, do not work
correctly as a result and report an OKAY status when a host has not
reported for a long time.
I attach a patch which checks the state of a cache entry in
uc_get_names and in uc_get_rate_by_name. This patch works for me, but
it has not been tested much yet, and I'm not sure whether this is a good
way to fix the problem. The patch was tested against the 4.7.2 release.
BTW, a GETSTATE command would be a useful feature too :P
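
For readers who want the idea without the surrounding cache code, here
is a minimal standalone sketch of the check the patch adds. It is not
the collectd implementation: sketch_entry_t, sketch_get_rate,
sketch_count_listed and the numeric state values are hypothetical
stand-ins, and only the STATE_MISSING name and the values_gauge,
values_num and state fields are taken from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef double gauge_t;

/* Assumed values for illustration only; the real constants may differ. */
enum { STATE_OKAY = 0, STATE_MISSING = 1 };

/* Hypothetical, simplified stand-in for a cache entry. */
typedef struct {
  const char *name;
  gauge_t *values_gauge;
  size_t values_num;
  int state;
} sketch_entry_t;

/* Copy the entry's rates into a newly allocated array.
 * Returns 0 on success, -1 if the entry is MISSING or malloc fails.
 * On failure *ret_values is NULL and *ret_num is 0. */
static int sketch_get_rate(const sketch_entry_t *ce,
                           gauge_t **ret_values, size_t *ret_num)
{
  gauge_t *ret;

  assert(ce != NULL);
  *ret_values = NULL;
  *ret_num = 0;

  /* Do not hand out stale values for an entry that stopped reporting. */
  if (ce->state == STATE_MISSING)
    return -1;

  ret = malloc(ce->values_num * sizeof(*ret));
  if (ret == NULL)
    return -1;

  memcpy(ret, ce->values_gauge, ce->values_num * sizeof(*ret));
  *ret_values = ret;
  *ret_num = ce->values_num;
  return 0;
}

/* Count the entries a listing would include, skipping MISSING ones. */
static size_t sketch_count_listed(const sketch_entry_t *entries, size_t n)
{
  size_t listed = 0;
  size_t i;

  for (i = 0; i < n; i++) {
    if (entries[i].state == STATE_MISSING)
      continue;
    listed++;
  }
  return listed;
}
```

With this sketch, a lookup on a MISSING entry fails in the same way as
a lookup on an unknown name, so a caller needs no extra code path to
notice that the host has stopped reporting.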
Regards,
Andres
Signed-off-by: Florian Forster <octo@huhu.verplant.org>
{
assert (ce != NULL);
- ret_num = ce->values_num;
- ret = (gauge_t *) malloc (ret_num * sizeof (gauge_t));
- if (ret == NULL)
+ /* remove missing values from getval */
+ if (ce->state == STATE_MISSING)
{
- ERROR ("utils_cache: uc_get_rate_by_name: malloc failed.");
status = -1;
}
else
{
- memcpy (ret, ce->values_gauge, ret_num * sizeof (gauge_t));
+ ret_num = ce->values_num;
+ ret = (gauge_t *) malloc (ret_num * sizeof (gauge_t));
+ if (ret == NULL)
+ {
+ ERROR ("utils_cache: uc_get_rate_by_name: malloc failed.");
+ status = -1;
+ }
+ else
+ {
+ memcpy (ret, ce->values_gauge, ret_num * sizeof (gauge_t));
+ }
}
}
else
{
char **temp;
+ /* remove missing values when list values */
+ if (value->state == STATE_MISSING)
+ continue;
+
if (ret_times != NULL)
{
time_t *tmp_times;