Fine-tuning an LLM to write docs like it's 1995

(passo.uno)

31 points | by taubek 2 hours ago

3 comments

  • v1ne 23 minutes ago
    The trick about documentation is depth, not prose. You need context and understanding to write documentation "like in the old days". No amount of LLM trickery will free you from that. Once you have that source material, it's easy to re-shape it into an 80's/90's/00's doc format.

    Negative example: I was looking into the German manual of my Canon EOS R5 II, and it is just fluff. Hundreds of pages, full of white space, telling me about features without actually explaining what they mean. Awful automatic translations. Their manuals used to be good (looking at my EOS 6D). But these days: oh boy.

  • vintagedave 47 minutes ago
    I love old-school docs, and this was a fantastic read. But, I couldn't see the three generated doc pages linked anywhere. Did I miss something?

    I'd really like to see the Win2K-style docs on REST, for example.

    Edit: it was right there, in bold, too. https://gist.github.com/theletterf/0b8ee1112fbd087f3141d0cad...

  • mock-possum 49 minutes ago
    > we’re not there yet, in part because of how much more powerful connected frontier models are

    Is that why though? You need a beast of a machine to run a functional local model in my experience.

    I think the big part is there’s significant sticker shock to buying capable hardware.

    That said,

    > weekend. I chose to try fine-tuning on two models, Llama 3.1 8B Instruct and Qwen 2.5 7B Instruct. At their size (around 8B) they run comfortably on a MacBook Air

    Perhaps I spoke too soon?

    Anyway

    > I chose the Microsoft collection as the source of training materials. The collection contains out-of-print docs published between 1977 and 2005: more than 37 million words, covering old systems and SDKs

    this strikes me as a very specific brand of 1995’s prose, spanning about 30 years. It’s a cool article though, so maybe that’s a forgivably clickbaity title.

    • mschild 38 minutes ago
      Running models locally is surprisingly easy and possible even on older hardware.

      Obviously not the largest, up-to-date models but for what I expect most people use them for, even on hn, there are some shockingly good models that dont require €4k machines.

      I have a desktop with an AMD 6900XT and 5600 with 32GB ram. Obviously no slouch but its several years old at this point. I can comfortably run qwen 3.5 9b and get a speedy 60 token/sec output with decent results.

      • mock-possum 35 minutes ago
        idk I can barely field a 14b on my desktop, and it’s rough trying to replicate the agentic pair programming experience I’m accustomed to with Claude. And I don’t mean it doesn’t work as well, I mean it doesn’t work.

        Is there some secret I’m missing? I’ve tried rolling my own harness, and tried a few of the ones the cool kids use - I think pi was the most recent. Not quite my tempo, I’m afraid.

        • mschild 2 minutes ago
          Depends on your desktop specs and specific model.

          The easiest way I have found is to use LM Studio, grab the model you want, and point whatever tooling you're using at the local exposed API.

          You will have to configure the model params (temperature, etc) a bit to get the style you're expecting but it works decently well for me.

        • visha1v 14 minutes ago
          [dead]
    • OJFord 35 minutes ago
      > this strikes me as a very specific brand of 1995’s prose, spanning about 30 years.

      It's probably a fair approach to say the significant influence (training dataset) on writing at a particular time is the preceeding 30 years' material? It's certainly not only what's already written that year (nor anything since).