Splunk Hogs is a quick script written to find which systems are sending too
many events to Splunk and using up the license volume.

This is good for several reasons:
1) Developers will turn debugging on, and this will catch it.
2) INFO messages will get turned on and useless events will be sent.
3) Issues that others were ignoring will be detected.

* WHY PERL AND NOT A SPLUNK SAVED SEARCH:
I am not well versed enough in Splunk's query language to get the output I
am looking for. It was easier and quicker to output the basic information
and massage the data to my needs. Also, since I output the raw data, I can
import it into Nagios or other tools we use in our environment.

* WHAT SPLUNK HOGS DOES:
The Perl script collects raw data from Splunk consisting of the host, the
Splunk server, and the count of events that took place.

   host            splunk_server         count
   --------------  --------------------  ------
   110.56.128.40   splunk0001.foo.com    110627
   110.56.128.42   splunk0001.foo.com     98243
   110.56.128.44   splunk0001.foo.com     88861

The count covers the time window set by "earliest" in $SPLUNKCOMMAND. The
count is divided by $MIN to get the events per minute.

A threshold ($THRESH) sets how many events per minute is considered an
issue that needs to be addressed. For our environment 1000 is a good number.

If the events per minute exceed $HOT, the number is displayed in bold red
and should be addressed immediately. 4000 is a good number for us. Look at
some of the high numbers in your environment and decide what a good setting
for you is.

The raw data is written to $DFILE in case you want to use it for something
else. An email/html file is written out and sendmail sends the email.

* INSTALLATION
1) Put it anywhere you want.
2) Modify the variables at the top.
   $DFILE         - Raw data file.
   $EFILE         - html/email formatted file that is emailed.
   $eMailTO       - Who the report is sent to.
   $eMailFROM     - Who the email is from.
   $SPLUNKCOMMAND - The splunk command used to collect event counts.
                    Note: change "earliest=-15m" to suit your environment.
                    The larger the window, the longer the query will take.
   $MIN           - The number of minutes in the "earliest" window. The
                    count is divided by this to get per-minute results.
   $THRESH        - If a host reaches this many events per minute, it is
                    added to the report.
   $HOT           - If a host exceeds this many events per minute, it is
                    shown in red and bold in the report.
3) Add to cron to generate reports every day.
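
* EXAMPLE: THE PER-MINUTE CHECK
The sketch below illustrates how the count, $MIN, $THRESH and $HOT fit
together as described above. It is not the actual script. The helper names
(classify, parse_rows), the labels 'ok'/'warn'/'hot', and the sample values
($MIN = 15 to match "earliest=-15m", $THRESH = 1000, $HOT = 4000) are
assumptions made for illustration, and the input is assumed to look like the
raw data shown earlier.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed settings. The names mirror the variables described above;
# the values and everything below are illustrative only.
my $MIN    = 15;      # minutes covered by "earliest=-15m"
my $THRESH = 1000;    # events per minute worth reporting
my $HOT    = 4000;    # events per minute shown in bold red

# Hypothetical helper: classify one host's raw event count.
# Returns 'hot' (exceeds $HOT), 'warn' (reached $THRESH) or 'ok'.
sub classify {
    my ($count) = @_;
    my $per_min = $count / $MIN;
    return 'hot'  if $per_min > $HOT;
    return 'warn' if $per_min >= $THRESH;
    return 'ok';
}

# Hypothetical helper: parse raw "host splunk_server count" lines,
# skipping the header and the dashed separator row.
sub parse_rows {
    my (@lines) = @_;
    my @rows;
    for my $line (@lines) {
        my ($host, $server, $count) = split ' ', $line;
        next unless defined $count && $count =~ /^\d+$/;
        push @rows, { host => $host, server => $server, count => $count };
    }
    return @rows;
}

# Sample input copied from the raw data shown earlier.
my @sample = (
    'host            splunk_server         count',
    '--------------  --------------------  ------',
    '110.56.128.40   splunk0001.foo.com    110627',
    '110.56.128.42   splunk0001.foo.com     98243',
    '110.56.128.44   splunk0001.foo.com     88861',
);

for my $row (parse_rows(@sample)) {
    printf "%-15s %-20s %7d/min  %s\n",
        $row->{host}, $row->{server},
        int($row->{count} / $MIN), classify($row->{count});
}
```

With a 15 minute window, the three sample hosts work out to roughly 7375,
6549 and 5924 events per minute, so all three would land in the 'hot' bucket
under these assumed settings.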