I've been using the convolver with a 6-channel setup for a few years. The JRiver convolver seems to use multiple cores, but I don't know the maximum number it will use.
The broader question of how powerful a CPU you need is kind of a "how long is a piece of string" question. The longer and more complex your filters are, the more CPU they use. I run a two-to-six-channel tri-amp setup using convolution on a ten-year-old i7-2600k and it only uses two or three percent CPU during playback, but my filters are also relatively short (I think I'm using 4k taps right now?). With somewhat longer filters I managed to push my usage up to eight or nine percent, but I've never used filters longer than a few hundred milliseconds.
If you use very long/complex filters it will use proportionately more CPU, but there's no way to know without doing some testing with the kinds of filters you want to use. Try setting up a test filter in your convolution filter generating software and then test it on your existing computer while monitoring CPU usage. Then look up your existing CPU's benchmark score on something like PassMark, compare it to the score of the CPU you're considering, and scale your measured usage accordingly.
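Here's a rough sketch of that scaling step. It assumes CPU usage scales inversely with benchmark score, which is only a ballpark approximation, and the function name and the scores in the example are made up for illustration (they are not real PassMark results for any CPU).

```python
def estimate_cpu_percent(measured_percent, current_score, target_score):
    """Ballpark CPU usage on a different CPU.

    Assumes usage scales inversely with benchmark score. Real results
    depend on threading, memory speed, and how the convolver is written.
    """
    if current_score <= 0 or target_score <= 0:
        raise ValueError("benchmark scores must be positive")
    return measured_percent * current_score / target_score


# Hypothetical numbers: 9% measured on the current machine, and a
# candidate CPU that scores three times higher on the same benchmark.
estimate = estimate_cpu_percent(9.0, current_score=5000, target_score=15000)
print(f"Estimated usage on the candidate CPU: {estimate:.1f}%")
```

Going the other way works too: put the slower CPU's score in as the target to see whether a cheaper machine would still have headroom.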
As to "shortening linear-phase filter latencies", that's entirely a filter design issue. JRiver can't change how long the filters are; your convolution filters are the length they are, and that's the latency you get. If you want to reduce filter latencies, your convolution filter generating software may have some optimizations you can do to shave off some milliseconds, but the surest solution is to reduce the complexity of your convolution filters. I do that by pushing as much of my EQ and processing as I can out of convolution and into parametric EQ, which is more or less latency-less. Then I only use convolution for the things that can only be done in convolution (like phase tinkering).
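To put rough numbers on that: a symmetric (linear-phase) FIR filter with N taps delays the signal by (N - 1) / 2 samples. The sketch below just does that arithmetic. The 48 kHz sample rate is an assumption for illustration, and the figure is the filter's own delay only; whatever buffering the player adds comes on top.

```python
def linear_phase_latency_ms(taps, sample_rate_hz):
    """Delay of a symmetric linear-phase FIR filter, in milliseconds.

    A symmetric FIR with `taps` coefficients has a constant group delay
    of (taps - 1) / 2 samples. Player buffering is not included.
    """
    if taps < 1 or sample_rate_hz <= 0:
        raise ValueError("taps must be >= 1 and sample rate must be positive")
    return (taps - 1) / 2 / sample_rate_hz * 1000.0


# Assumed sample rate of 48 kHz; tap counts chosen for illustration.
for taps in (4096, 16384, 65536):
    latency = linear_phase_latency_ms(taps, 48000)
    print(f"{taps:>6} taps -> {latency:.1f} ms")
```

At 48 kHz a 4096-tap filter comes out around 43 ms and a 65536-tap filter around 683 ms, which is why the filter length is the thing to attack if latency matters.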
One other advantage of using convolution only for phase linearization is that if you are bi-amping, tri-amping, etc. (i.e. starting with two channels and finishing with more) you can run the convolution block on just two channels, and then do all the routing and crossovers later in PEQ, which gets you an even larger CPU "savings".
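As a back-of-the-envelope illustration of that saving, the sketch below counts multiply-adds for naive direct-form convolution. That is an assumption made to keep the arithmetic simple: real convolvers typically use FFT-based methods that need far fewer operations, so the absolute numbers are pessimistic, but the ratio between channel counts should hold as long as each channel costs about the same. The tap count and sample rate are illustrative.

```python
def direct_form_macs_per_second(taps, sample_rate_hz, channels):
    """Multiply-adds per second for naive (direct-form) FIR convolution.

    One multiply-add per tap, per output sample, per channel. This is an
    upper bound used only to compare channel counts.
    """
    return taps * sample_rate_hz * channels


# Illustrative settings: 4096 taps at 48 kHz.
two_ch = direct_form_macs_per_second(4096, 48000, channels=2)
six_ch = direct_form_macs_per_second(4096, 48000, channels=6)
print(f"2 channels: {two_ch:,} MAC/s")
print(f"6 channels: {six_ch:,} MAC/s")
print(f"Convolving before the split costs 1/{six_ch // two_ch} as much")
```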