I worked in Systems Validation at Intel when the 8087 was current. Intel had an engineer dedicated to validating customer bug reports and reproducing them. Day in, day out, that's pretty much all he did. Sooooo many corner cases, and so many opinions on what the 'right' thing to do was when you lost precision[1].
[1] I'd say that over half of the bug reports were people who were annoyed that doing fp instructions in one order got them the right answer but in another order got them the wrong answer.
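The order dependence behind those reports is easy to reproduce. Here is a minimal sketch, assuming Python's 64-bit doubles rather than the 8087's 80-bit internal format, so the specific values differ from what an 8087 would produce, but the effect is the same: every intermediate result is rounded, so regrouping the same operations can change the answer.

```python
# Floating-point addition is not associative: each intermediate sum is
# rounded, so the grouping of the operations affects the final result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
print(left_to_right == right_to_left)

# Absorption: adding a small value to a much larger one can lose the small
# value entirely, so it matters whether the large terms cancel first.
big = 1e16
absorbed = (big + 1.0) - big   # the 1.0 is rounded away before the subtraction
preserved = (big - big) + 1.0  # the large terms cancel first, so 1.0 survives
print(absorbed, preserved)
```

Neither result is a hardware bug; both follow from rounding each step to the available precision.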
80 bits always seemed a strange choice for floating point, but as soon as you said there’s a 16-bit sign-and-exponent field (1 sign bit plus a 15-bit exponent) and a 64-bit significand, it made sense.
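To make the 16 + 64 split concrete, here is a rough sketch of decoding an 80-bit extended value. It assumes the x87 layout of 1 sign bit and a 15-bit biased exponent in the top 16 bits, followed by a 64-bit significand with an explicit integer bit. The function name `decode_x87_extended` is made up for illustration, and the result is squeezed into a Python double, so it loses the low significand bits and cannot represent the full exponent range.

```python
import math

EXPONENT_BIAS = 16383  # bias of the 15-bit exponent field


def decode_x87_extended(bits: int) -> float:
    """Decode an 80-bit extended-precision pattern (given as an int) to a float.

    Illustrative only: the result is a 64-bit double, so precision is lost,
    and math.ldexp raises OverflowError for values beyond the double range.
    """
    sign = (bits >> 79) & 0x1                  # top bit of the 16-bit field
    exponent = (bits >> 64) & 0x7FFF           # remaining 15 bits, biased
    significand = bits & 0xFFFFFFFFFFFFFFFF    # 64 bits, explicit integer bit

    if exponent == 0x7FFF:
        # All-ones exponent: infinity if the fraction bits are zero, else NaN.
        fraction = significand & 0x7FFFFFFFFFFFFFFF
        magnitude = math.inf if fraction == 0 else math.nan
    else:
        # A zero exponent field (zero/denormal) uses the same scale as exponent 1.
        unbiased = (exponent if exponent != 0 else 1) - EXPONENT_BIAS
        # The significand is an integer with 63 fraction bits, hence the -63.
        magnitude = math.ldexp(significand, unbiased - 63)

    return -magnitude if sign else magnitude


print(decode_x87_extended(0x3FFF8000000000000000))  # bit pattern for 1.0
print(decode_x87_extended(0xC0008000000000000000))  # bit pattern for -2.0
```

The 64-bit significand is what lets the format hold any 64-bit integer exactly, and the top 16 bits fit neatly in one 16-bit word.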
I assume microcode was a choice for both ease of development/testing/changes and saving die space. Would there come a point later on where performance could be gained by converting the microcode into a full set of discrete logic, or is that not worth the effort?
Usually, it's not worth the effort of converting microcode into discrete logic to get performance. Among other things, fixing a bug in hard-wired logic is a mess.
A few exceptions: The different models of the IBM System/360 mainframe are almost all microcoded, except for the high-end machines, which were hard-wired for performance. The design of the Apollo Guidance Computer is microcode, but the implementation is discrete logic. The 8086 and derivatives are microcoded, except NEC created a faster hard-wired version, the V33.
Wouldn't it be simpler for Intel to have designed a chip with those 8 identical instructions (xfer, shift, add, arith, far jmp, far call, local jmp, misc), but read and executed from normal RAM accessible by the user, perhaps with a tiny cache, instead of all these ROM/microcode special compression/hidden architecture shenanigans?