bbor 5 days ago

Well put! If you/y’all haven’t heard, there’s a popular breakdown of “technical documentation” into four types, and this is one of the axes: https://nick.groenen.me/posts/the-4-types-of-technical-docum...

People’s names for the four types vary, but I’m personally a fan of naming the axes “propositional vs procedural” and “informational vs developmental”, giving us a final four categories (00, 01, 10, 11) of “References”, “Instructions”, “Lessons”, and “Tutorials”. I think the applicability to LLM clearly holds up! Though more so for advanced chatbots than HR widgets TBF, I doubt anyone is looking for developmental content from one of those.

  • shabie 4 days ago

    Thanks a lot for sharing, I have not heard of this before.

niobe 5 days ago

Well put, and separating these would be a good use case for system prompts e.g.

llm -m model --save instructional --system "provide the detailed steps to achieve the outcome, using a suitable example if necessary"

llm -m model --save informational --system "provide a concise conceptual overview but do not provide implementation steps or detailed examples"

  • shabie 4 days ago

    That's actually a pretty interesting point. Not just evals but other components like system prompt should also be tailored to match the expected outcome.

trash_cat 5 hours ago

But then how do you classify a task that the LMM performed, such as a summary? I think you are onto something here but it really depends on what task you want the LMM to perform, search, how to, summary, extraction etc...

rwnspace 9 hours ago

My experience with them doesn't quite fit either: I've primarily used LLMs for giving me hints when I'm struggling with a leetcode problem or similar. They're surprisingly good at it, providing you regularly remind them to provide little clues only.

js8 a day ago

I wish there also was a distinction between truthful and whimsical.

fuzzy_biscuit a day ago

When I was doing SEO full-time, this is one of the ways we used to categorize content - via intent. As a result, my immediate question becomes: how long before those responses start to be subsumed by commercial intent responses? To me, this is an inevitability. A when, not an if.