Hi mwillems,
it's just been my preference. I've found that if I perform my crossover / EQ etc. some other way than FIR filters, I have more phase to correct (obviously) and end up with just as many taps. That, and I really like using RePhase; it's an excellent bit of software. I've found that I can keep most of my tap counts below about 2048, which keeps the latency to a minimum for video and is how I've been running my MiniSharc, hence my other threads about keeping other latencies to a minimum. There is little point in doing this if other bits of the software or plugins etc. introduce more latency than is desirable.
That's sort of my point: software will introduce some latency, and so will hardware, so you may find that your total latency budget for convolution shrinks a bit, meaning you might need to do "more with less." 2048 taps may be fine even when added to other unavoidable latencies once you get everything optimized, but it depends quite a bit on your soundcard. 2000 taps at 44.1 kHz is about 20 milliseconds, which isn't a problem by itself, but even 10 additional milliseconds from hardware/software would start making lipsync potentially iffy; 22 milliseconds is the film standard for unacceptable delay, 45 milliseconds for TV broadcast.
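As a rough sketch of the arithmetic above: this assumes a linear-phase filter with a centered impulse, where the delay is about half the filter length, (taps - 1) / 2 samples. The function names and the 10 ms "other latency" figure are just illustrative, and a real convolution engine may add its own buffering on top.

```python
def linear_phase_latency_ms(taps, sample_rate_hz):
    """Delay of a centered, linear-phase FIR: (taps - 1) / 2 samples, in ms."""
    return (taps - 1) / 2 / sample_rate_hz * 1000.0


def fits_budget(taps, sample_rate_hz, other_latency_ms, limit_ms):
    """True if FIR delay plus other (hardware/software) delay stays under limit_ms."""
    total_ms = linear_phase_latency_ms(taps, sample_rate_hz) + other_latency_ms
    return total_ms < limit_ms


if __name__ == "__main__":
    # Assumed example values: 10 ms of other latency, 45 ms broadcast limit.
    for taps in (1024, 2048, 4096, 8192):
        fir_ms = linear_phase_latency_ms(taps, 44100)
        ok = fits_budget(taps, 44100, other_latency_ms=10.0, limit_ms=45.0)
        print(f"{taps} taps: {fir_ms:.1f} ms FIR delay, under 45 ms total: {ok}")
```

Doubling the tap count roughly doubles the delay, so the budget gets used up quickly once steeper slopes or low-frequency phase correction push the filter length up.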
And that's where PEQ can really come to the rescue. I can confirm from my own measurements that doing minimum phase filtering using PEQ and then just using RePhase for phase correction produces, by a wide margin, less total latency than trying to do the exact same thing entirely in RePhase. It may not be an issue for you if you can do everything you want in 2000 taps or less, but there are other advantages to doing the "basics" in PEQ as well (see below).
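To illustrate why minimum phase PEQ costs so little latency, here is a minimal sketch of a single peaking filter built from the widely used RBJ Audio EQ Cookbook biquad formulas. This is not JRiver's actual PEQ implementation, and the 1 kHz / +6 dB / Q = 1 settings are made-up example values. The point is only that the impulse response peaks at the very first sample, whereas a centered linear-phase FIR peaks around the middle of the filter.

```python
import math


def peaking_biquad(f0_hz, gain_db, q, fs_hz):
    """RBJ cookbook peaking EQ; returns (b, a) normalized so that a[0] == 1."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * amp, -2 * math.cos(w0), 1 - alpha * amp]
    a = [1 + alpha / amp, -2 * math.cos(w0), 1 - alpha / amp]
    return [c / a[0] for c in b], [c / a[0] for c in a]


def impulse_response(b, a, n):
    """Run a unit impulse through the biquad (direct form I) for n samples."""
    x = [1.0] + [0.0] * (n - 1)
    y = []
    for i in range(n):
        acc = b[0] * x[i]
        if i >= 1:
            acc += b[1] * x[i - 1] - a[1] * y[i - 1]
        if i >= 2:
            acc += b[2] * x[i - 2] - a[2] * y[i - 2]
        y.append(acc)
    return y


def peak_index(h):
    """Index of the largest-magnitude sample in an impulse response."""
    return max(range(len(h)), key=lambda i: abs(h[i]))


if __name__ == "__main__":
    b, a = peaking_biquad(1000.0, 6.0, 1.0, 44100.0)
    h = impulse_response(b, a, 256)
    print("PEQ impulse peak at sample", peak_index(h))
```

The filter still has frequency-dependent phase shift, which is what the FIR stage is then left to correct, but it adds no fixed block of delay the way a long centered FIR does.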
If you haven't already, you might want to check out my how-to on speaker correction in JRiver over in the Sound Cards Sub forum for some ideas on how a blended IIR/FIR approach might work in practice with JRiver:
http://yabb.jriver.com/interact/index.php?topic=87538.0
I am hoping to experiment with steeper slopes, which will probably require more taps, but then maybe I can set up one zone that uses fewer taps for video and another that uses uber taps for just audio. Is that doable?
Yes, zones are the recommended approach and work fine for that. In addition to long and short filter zones, I actually have a zone where I just turn convolution all the way off for certain kinds of latency-sensitive live content (video games, etc.). Because I run an active system with JRiver as the crossover, that would be impossible if I didn't do most of my crossover work in PEQ: doing what you can in PEQ makes turning off convolution altogether (when necessary) a viable option.