I used to work on AMD video cards, and I have two theories:
1) Let's say you have a quad-core machine, and Galciv4 is coded to have two threads: one doing the actual AI crunching between turns, and the MEL (main event loop), which basically lets you click your mouse around but not much more than that. The AI thread would sit at 100% of one core and the MEL thread would be very low, which works out to roughly 25-30% total CPU utilization on a quad core.
2) Cache misses. Even a miss in the L1 can easily cost as many as 16 CPU cycles. A miss that goes all the way out to RAM costs far more, and...let's just say your CPU utilization would be very low while it waits for the data to come back from RAM. That's one of the reasons why we multithread: otherwise the CPU is just sitting there, so it might as well be doing something else. But you are running barebones, so there is not much else for it to do.
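To put rough numbers on the first theory, here is a small Python sketch of the arithmetic. The function name and the 10% figure for the MEL thread are my own assumptions for illustration, not anything measured from the game.

```python
def overall_utilization(thread_loads, cores):
    """Average CPU utilization across all cores.

    thread_loads: load of each busy thread as a fraction of one core (0.0-1.0).
    Assumes each thread gets its own core, so there can be no more threads than cores.
    """
    if len(thread_loads) > cores:
        raise ValueError("this sketch assumes at most one busy thread per core")
    return sum(thread_loads) / cores


# Hypothetical numbers: AI thread pegged at 100%, MEL thread at about 10%.
print(f"{overall_utilization([1.0, 0.1], cores=4):.1%}")
```

With the AI thread alone the result is exactly 25%, and anything the MEL thread adds pushes it toward 30%, which is the range described above.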
First I would look at per-core CPU utilization. If every core is also sitting at around 30%, then I would look at how much of the time is spent in an I/O wait state. I am more familiar with doing that kind of analysis on Linux than on Windows.
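On the Linux side, a minimal sketch of that check could look like the following. It takes two snapshots of `/proc/stat` and reports, per core, the fraction of time spent busy and the fraction spent in iowait. The function names are my own, it assumes a reasonably modern kernel that exposes at least the first eight per-CPU fields, and the kernel's iowait counter is documented as not fully reliable, so treat it as a rough hint rather than a precise measurement.

```python
import time


def parse_proc_stat(text):
    """Return {cpu_name: (busy, iowait, total)} in jiffies for each per-core 'cpuN' line."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only per-core lines such as "cpu0"; skip the aggregate "cpu" line and the rest.
        if not parts or not parts[0].startswith("cpu") or parts[0] == "cpu":
            continue
        # Fields: user nice system idle iowait irq softirq steal.
        # Guest time is already counted inside user, so it is left out of the total.
        fields = [int(x) for x in parts[1:9]]
        idle, iowait = fields[3], fields[4]
        total = sum(fields)
        stats[parts[0]] = (total - idle - iowait, iowait, total)
    return stats


def per_core_usage(before, after):
    """Return {cpu_name: (busy_fraction, iowait_fraction)} between two snapshots."""
    usage = {}
    for cpu, (busy1, io1, total1) in after.items():
        if cpu not in before:
            continue
        busy0, io0, total0 = before[cpu]
        elapsed = total1 - total0
        if elapsed <= 0:
            continue  # no time passed for this core, nothing to report
        usage[cpu] = ((busy1 - busy0) / elapsed, (io1 - io0) / elapsed)
    return usage


def sample(interval=1.0):
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open("/proc/stat") as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        after = parse_proc_stat(f.read())
    return per_core_usage(before, after)
```

Calling `sample()` while the turn is processing gives one `(busy, iowait)` pair per core. One core near 1.0 with the rest near zero would point at the single-threaded theory, while every core sitting low with a noticeable iowait share would point at the process waiting on I/O.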