mcollective and activemq the 800 node limit

I've been running into the 800 node limit on mcollective and splitting up my nodes into subcollectives. I had a spot where I couldn't split up the nodes, so I started looking at why we were hitting this 800 node wall.

I'm using activemq with the ssl plugin, after turning on all the debugging I could find in activemq, it turns out it's just a simple resource limit problem.

With activemq running, I waited for my nodes to connect and watched the number of threads on the active java process. (This is after increasing the memory limits for activemq as described on puppetlabs website.

Getting the number of threads, two different ways.

$ pgrep java |xargs ps uH |wc -l
1023
$ pgrep java |xargs -I % ls -l /proc/%/task |wc -l
1023

Either way we are seeing around 1024 processes (threads), looks suspiciously like a limit. I increased the limit in /etc/security/limits.d/activemq.conf

activemq soft nofile 16384
activemq hard nofile 16384
activemq soft nproc 4096
activemq hard nproc 4096

Not really sure if the nofile limit is required, but nproc seems to fix my issue.
After restarting activemq

$ pgrep java |xargs ps uH |wc -l
1530
$ pgrep java |xargs -I % ls -l /proc/%/task |wc -l
1530

The number of nodes returned by mco find goes from a random result in the 800-1000 range to the 1400 or so that I was expecting.

I'm going to have to update my section on mcollective in my book

Wordpress category: