The units are correct, but it's not as simple as that: when you play a file, the buffer can be filled faster than realtime, so you don't have to wait 7 seconds for it to fill.
The first number represents the actual processing latency. The second also includes device buffers and whatnot, which only matters if you're trying to optimize for live playback with WDM or other live sources, where we can't pre-read to fill big buffers.
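
To put rough numbers on the file vs. live difference, here is a small back-of-the-envelope sketch. The 7-second buffer is the figure from above; the 50x realtime read speed and the function name are made up purely for illustration, so plug in whatever your system actually manages.

```python
def fill_time_seconds(buffer_seconds: float, speed_factor: float) -> float:
    """Wall-clock time needed to fill a buffer that holds `buffer_seconds`
    of audio, when the source delivers data at `speed_factor` x realtime."""
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return buffer_seconds / speed_factor


# File playback: the file can be read ahead, here at an assumed 50x realtime.
file_fill = fill_time_seconds(7.0, 50.0)

# Live source (e.g. WDM capture): data only arrives at 1x realtime.
live_fill = fill_time_seconds(7.0, 1.0)

print(f"file: {file_fill:.2f} s, live: {live_fill:.2f} s")
```

With a file the wait is a fraction of a second, while a live source really would need the full 7 seconds, which is why the bigger number only hurts in the live case.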