First, apologies for not posting in a while; I have been terribly busy at work and at home, as we have a new puppy.
When playing around with different LLMs in Ollama, you might want to see how fast a given model is. Ollama won't return any performance statistics in server mode (or does it? Let me know!), so to get the statistics into Zabbix I had to be creative. For those who don't know, Ollama is cross-platform software for running all kinds of LLMs locally.
Let's run Ollama
After installing Ollama, running it from the command line is just this easy:
ollama run gemma "Tell me about Zabbix in one short sentence."
Zabbix is an open-source monitoring platform designed for infrastructure and application performance tracking.
... where gemma is the language model name you want to use.
If you add --verbose to its parameters, it gives you some more info.
ollama run --verbose gemma "Tell me about Zabbix in one short sentence."
Zabbix is an open-source monitoring platform for infrastructure and applications.
total duration: 557.514208ms
load duration: 19.8135ms
prompt eval count: 32 token(s)
prompt eval duration: 48.352667ms
prompt eval rate: 661.80 tokens/s
eval count: 15 token(s)
eval duration: 488.363458ms
eval rate: 30.71 tokens/s
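As a quick sanity check on those numbers: the eval rate is simply the eval count divided by the eval duration. You can verify it with a one-liner using the values above:

```shell
# eval rate = eval count / eval duration (15 tokens / 0.488363458 s)
awk 'BEGIN { printf "%.2f tokens/s\n", 15 / 0.488363458 }'
```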
Now there are many interesting details that would be great to have in Zabbix! But how?
How to get this to Zabbix?
For this demo's purposes, a short and ugly shell snippet with the good old zabbix_sender command will do. It redirects STDERR (which is where Ollama prints the metrics) to a separate text file, which is then fed to zabbix_sender.
export llmmodel="gemma";
echo "model name: $llmmodel" >/tmp/ollama_stats.txt;
ollama run --verbose $llmmodel "Tell me about Zabbix" 2>>/tmp/ollama_stats.txt;
zabbix_sender -z myzabbixserver -s "llmstatistics" -k ollama.masteritem -o "$(cat /tmp/ollama_stats.txt)"
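If you want to check what actually ends up in the file before sending it off, something like this works; the sample lines below are copied from the verbose output above, not generated by Ollama here:

```shell
# Fake a stats file with the same shape as the real capture
cat > /tmp/ollama_stats.txt <<'EOF'
model name: gemma
total duration: 557.514208ms
eval count: 15 token(s)
eval rate: 30.71 tokens/s
EOF

# Everything after the first colon is the value the dependent items will parse out
sed 's/^[^:]*: *//' /tmp/ollama_stats.txt
```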
To try out a different model, just change the llmmodel environment variable to something else.
Zabbix configuration
But where does all this go in Zabbix?
I created a new host called llmstatistics and gave it a few item keys. My Ollama master item receives the text file contents, and then a bunch of dependent items parse all those values in one sweep.

Each of the dependent items looks like the example below, with a simple regular-expression preprocessing step catching the relevant line.
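To give an idea of the pattern involved: a hypothetical preprocessing regexp for the eval rate item would capture the number from its line and output the first group (\1). You can test the same kind of match from the shell:

```shell
# Sample line as produced by ollama run --verbose
line='eval rate: 30.71 tokens/s'
# Roughly the regexp a Zabbix "Regular expression" preprocessing step would use,
# with the captured group (\1) as the output template
echo "$line" | sed -n 's/^eval rate: *\([0-9.]*\) tokens\/s/\1/p'
```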


Does it work?
Sure it does, apart from one item where I have a typo somewhere that I'm not going to hunt down right now; you get the idea for demo purposes, and it's quite late already as I type this.

From here I could go on and do any dashboards I'd like, but that would be repetitive and next I'm going to sleep.
