I'm also wondering how he is able to see calls to AI providers directly in the browser, via client-side API calls? That's strange to me. Also, how is he able to peer into the RAG architectures? I don't get that. Maybe GPT-4.1 allows unauthenticated requests? Is there an OAuth setup that allows client-side requests to OpenAI?
The article is basically a description of where to look for clues. Perhaps they've contracted with some of these companies and don't want to break some NDA by naming them, but still know a lot about how they work.
The thing that drives me nuts is that most "AI Applications" are just adding crappy chat to a web app. A true AI application should have AI driven workflows that automate boring or repetitive tasks without user intervention, and simplify the UI surface of the application.
I'm firmly of the opinion that, as a general rule, if you're directly embedding the output of a model into a workflow and you're not one of a handful of very big players, you're probably doing it wrong.[1]
If we overlook that non-determinism isn't really compatible with a lot of business processes and assume you can make the model spit out exactly what you need, you can't get around the fact that an LLM is going to be a slower and more expensive way of getting the data you need in most cases.
LLMs are fantastic for building things. Use them to build quickly and pivot where needed and then deploy traditional architecture for actually running the workloads. If your production pipeline includes an LLM somewhere in the flow, you need to really, seriously slow down and consider whether that's actually the move that makes sense.
[1] - There are exceptions. There are always exceptions. It's a general rule not a law of physics.
True, even OpenAI built their castle in nVidia's kingdom. And nVidia built their castle in TSMC's kingdom. And TSMC built their castle in ASML's kingdom.
My question with these is always "what happens when the model doesn't need prompting?". For example, there was a brief period where IDE integrations for coding agents were a huge value add - folks spent eons crafting clever prompts and integrations to get the context right for the model. Then... Claude, Gemini, Codex, and Grok got better. All indications are that engineers are pivoting to using foundation model vended coding toolchains and their wrappers.
This is rapidly becoming a more extreme version of the classic "what if google does that?" as the foundation model vendors don't necessarily need to target your business or even think about it to eat it.
5% prompt engineering, 95% orchestration. And no, you cannot vibe-code your way to cloning my apps. I have paid subscriptions, so why aren't you doing it then? Oh, because models degrade severely over 500 lines.
LLMs are the new AJAX. AJAX made pages dynamic, LLMs make pages interactive.
The reason is that VCs need to show that their flagship investments have "traction", so they manufacture ecosystem interest by funding and encouraging ecosystem product usage. It's a small price to pay. If someone builds a wrapper that gets 100 business users, the token usage gets passed down to the foundation layer. Big scheme.
This is a kind of global app store all over again, where all these companies are clients of only a few true AI companies and try to distinguish themselves within the bounds of the underlying models and APIs, just like apps were trying to find niches within the bounds of the APIs and exposed hardware of the underlying iPhones. API version bugs are now model updates. And of course, all are at the mercy of their respective Leviathan.
It's wild. I work with some Fortune 500 engineers who don't spend a lot of time prompting AI, and just a few quick prompts like 'output your code in <code lang="whatever">...</code> tags' (a trick that most people in the prompting world are very familiar with, but that virtually no one outside the bubble knows about) can improve AI code generation outputs to almost 100%.
It doesn't have to be this way and it won't be this way forever, but this is the world we live in right now, and it's unclear how many years (or weeks) it'll be until we don't have to do this anymore
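As a rough illustration of the tag trick above (the prompt wording, the regex, and the canned response are all invented for this sketch, not taken from the comment):

```python
import re

def build_prompt(task: str) -> str:
    # Ask the model to fence its answer in a known tag so the output can be
    # extracted mechanically instead of parsed out of free-form prose.
    return (
        f"{task}\n\n"
        'Output only the final code, wrapped in <code lang="python">...</code> tags.'
    )

def extract_code(response: str):
    # Pull out whatever landed between the tags, ignoring any chatter around it.
    match = re.search(r'<code lang="[^"]*">(.*?)</code>', response, re.DOTALL)
    return match.group(1).strip() if match else None

# Canned response standing in for a real model call:
fake_response = 'Sure!\n<code lang="python">print("hello")</code>\nDone.'
print(extract_code(fake_response))  # -> print("hello")
```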
Another slop article that could probably be good if the author were interested in writing it, but instead they dumped everything into an LLM, and now I can't tell what's real and what's not, and I get no sense of which parts of the findings the author found important or interesting compared to the rest.
I have to wonder, are people voting this up after reading the article fully, and I'm just wrong and this sort of info dump with LLM dressing is desirable? Or are people skimming it and upvoting? Or is it more of an excuse to talk about the topic in the title? What level of cynicism should I be on here, if any?
Interesting article and plausible conclusions but the author needs to provide more details to back up their claims. The author has yet to release anything supporting their approach on their Github.
https://github.com/tejakusireddy
Honestly it sounds about right: at the end of the day, most companies will always be an interesting UI and workflow around some commodity tech, but, that's valuable. Not all of it may be defensible, but still valuable.
I decided to flag this article because it has to be fake.
The author never explains how he is able to intercept these API calls to OpenAI, etc. I definitely believe tons of these companies are just wrappers, but they'd be doing the "wrapping" in their backend, with only a couple (dumb) companies doing the calls directly to OpenAI from the front end where they could be traced.
This article is BS. My guess is it was probably AI generated because it doesn't make any sense.
I find it shocking that most comments here just accept the article as fact and discuss the implications.
The message might not even be wrong. But why is everybody's BS detection on ice in the AI topic space? Come on, people, you can all do better than this!
Thanks for flagging. Though whenever such a made-up thing is flagged, we lose the chance to discuss this (meta) topic. People need to be aware how prevalent this is. By just hiding it every time we notice, we're preventing everybody from reading the kind of comment you wrote and recalibrating their BS-meters.
And 99% of software development is just feeding data into a compiler. But that sort of misses the point, doesn't it?
AI has created a new interface with a higher level abstraction that is easier to use. Of course everyone is going to use it (how many people still code assembler?).
The point is what people are doing with it is still clever (or at least has potential to be).
I disagree. Software development is not limited to LLM-type responses and incorporates proper logic. You are at the mercy of the LLM when you build an "AI" interface on top of the LLM APIs. 73% of these "AI" companies will collapse when the original API-providing company comes up with a simple option (Gemini for Sheets, for example); they will disappear. It is already happening.
AI software is not long-lasting; its results are not deterministic.
First, someone has to develop those models and that's currently being done with VC backing. Second, running those models is still not profitable, even if you self host (obviously true because everything is self hosted eventually).
Burning VC money isn't a long term business model and unless your business is somehow both profitable on Llama 8b (or some such low power model) _and_ your secret sauce can't be easily duplicated, you're in for a rough ride.
The only barrier between AI startups at this point is access to the best models, and that's dependent on being able to run unprofitable models that spend someone else's money.
Investing in a startup that's basically just a clever prompt is gambling on the first mover's advantage because that's the only advantage they can have.
What differentiates a product is not the commodity layer it’s built on (databases, programming languages, open source libraries, OS apis, hosting, etc) but how it all gets glued together into something useful and accessible.
It would be a bad strategy for most startups to do anything other than prompt engineering in their AI implementations for the same reason it would be a bad idea for most startups to write low-level database code instead of SQL queries. You need to spend your innovation tokens wisely.
One of the biggest problems frontier models will face going forward is how many tasks require expertise that cannot be achieved through Internet-scale pre-training.
Any reasonably informed person realizes that most AI start-ups looking to solve this are not trying to create their own pre-trained models from scratch (they will almost always lose to the hyperscale models).
A pragmatic person realizes that they're not fine-tuning/RL'ing existing models (that path has many technical dead ends).
So, a reasonably informed and pragmatic VC looks at the landscape, realizes they can't just put all their money into the hyperscale models (LPs don't want that), and looks for start-ups that take existing hyperscale models and expose them to data that wasn't in their pre-training set, hopefully in a way that's useful to some users somewhere.
To a certain extent, this study is like saying that Internet start-ups in the 90's relied on HTML and weren't building their own custom browsers.
I'm not saying that this current generation of start-ups will be as successful as Amazon and Google, but I just don't know what the counterfactual scenario is.
The question that isn't answered completely in the article is how useful the pipelines are for these startups. The article certainly implies that for at least some of these startups there is very little value added in the wrapper.
Got any links to explanations of why fine tuning open models isn’t a productive solution?
Besides renting the GPU time, what other downsides exist on today’s SOTA open models for doing this?
When people are desperate to invest, they often don't care what someone actually can do but more about what they claim they can do. Getting investors these days is about how much bullshit you can shovel as opposed to how much real shit you shoveled before.
Prompt engineering and using an expensive general model in order to prove your market, and then putting in the resources to develop a smaller(cheaper) specialized model seems like a good idea?
Are people down to have a bunch of specialized models? The expectation set by OpenAI and everyone else is that you will have one model that can do everything for you.
It's like how we've seen basically all gadgets meld into the smartphone. People don't have Garmins and beepers and clock radios anymore (or dedicated phones!). It's all on the screen that fits in your pocket. Any would-be gadget is now just an app.
Having everything in my phone is a great convenience for me as a consumer. Pockets are small, and you only have a small number of them in any outfit.
But cloud services run in... the cloud. It's as big as you need it to be. My cloud service can have as many backing services as I want. I can switch them whenever I want. Consumers don't care.
"One model that can do everything for you" is a nice story for the hyper scalers because only companies of their size can pull that off. But I don't think the smartphone analogy holds. The convenience in that world is for the the developers of user-facing apps. Maybe some will want to use an everything model. But plenty will try something specialized. I expect the winner to be determined by which performs better. Developers aren't constrained by size or number of pockets.
> The expectation set by OpenAI and everyone else has set is that you will have one model that can do everything for you.
I don’t think that’s the expectation set by “everyone else” in the AI space, even if it arguably is for OpenAI (which has always, at least publicly, had something of a focus on eventual omnicapable superintelligence.) I think Google Antigravity is evidence of this: there’s a main, user selected coding model, but regardless of which coding model is used, there are specialized models used for browser interaction and image generation. While more and more capabilities are at least tolerably supported by the big general purpose models, the range of specialized models seems to be increasing rather than decreasing, and seems likely that, for conplex efforts, combining a general purpose model with a set of focussed, task-specific models will be a useful approach for the forseeable future.
I think of foundational models like CPUs. They're the core of powerful, general-purpose computers, and will likely remain popular and common for most computing solutions. But we also have GPUs, microcontrollers, FPGAs, etc. that don't just act as the core of a wide variety of solutions, but are also paired alongside CPUs for specific use cases that need specialization.
Foundational models are not great for many specific tasks. Assuming that one architecture will eventually work for everything is like saying that x86/amd64/ARM will be all we ever need for processors.
Specialized models are cheaper. For a company you're looking for some task that needs to be done millions of times per day, and where general models can do it well enough that people will pay you more than the general model's API cost to do it. Once you've validated that people will pay you for your API wrapper you can train a specialized model to increase your profit and if necessary lower your pricing so people won't pay OpenAI directly.
It's probably the direction it will go, at least in the near term.
It seems right now like there is a tradeoff between creativity and factuality, with creative models being good at writing and chatting, and factuality models being good at engineering and math.
It's why we are getting these specific -code models.
It's really an implementation decision. The end user doesn't need to know their request is routed to a certain model. A smaller specialized model might have identical output to a larger general purpose model, but just be cheaper and faster to run.
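A toy sketch of that kind of routing (the model names and the heuristic are invented purely for illustration):

```python
def route_model(prompt: str) -> str:
    # Hypothetical heuristic: short, template-like requests go to a small,
    # cheap specialized model; open-ended work goes to the big general one.
    looks_simple = len(prompt) < 500 and "step by step" not in prompt.lower()
    return "small-specialized-model" if looks_simple else "large-general-model"

print(route_model("Classify this support ticket: printer is on fire"))
print(route_model("Plan a multi-step refactor of the billing service, step by step, considering..."))
```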
I still use the Garmin I bought in 2010. I refuse to turn on my phone's location tracking. Also the single-purpose interface is better and safer than switching between apps and contexts on a general purpose device.
Not really because the money involved is relatively small. The bubble is where people are using D8s to push square kilometers of dirt around for data centers that need new nuclear power plants built, to house millions of obsolete Nvidia GPUs that need new fabs constructed to make, using yet more D8s..
Not to be too pedantic, but code is a kind of specification. I think making the blanket statement "Prompt is code" is inaccurate, but there does exist a methodology of writing prompts as if they are specifications that can be reliably converted to computational actions, and I believe we're heading toward that.
For me, 2023 was an entire year of weekly demos that, looking back, were basically a "Look at this dank prompt I wrote" followed by thunderous applause from the audience (which was mostly, but not exclusively, upper management).
Hell, man, I attended a session at an AWS event last year that was entirely the presenter opening Claude and writing random prompts to help with AWS stuff... Like, thanks dude... That was a great use of an hour. I left 15 minutes in.
We have a team that's been working on an "Agent" for about 6 months now. It started as prompt engineering; then they were like "no, we need to add more value" and developed a ton of tools and integrations and "connectors" and evals etc. The last couple of weeks were a "repivot" going back full circle to "Let's simplify all that by prompt engineering and give it a sandbox environment to run publicly documented CLIs. You know, like Claude Code."
The funny thing is I know where it's going next...
I can't take anyone seriously who uses prompt engineering unironically. I see those emails come through at work and all I can do is roll my eyes and move on
what level of seriousness does "context engineering" deserve?
But did it work? This is the sticking point with me now. I've seen slides, architecture diagrams, job descriptions, roadmaps and other docs now from about a dozen different companies doing AI Agent projects. And while it's completely feasible to build the systems they're describing, what I have not seen yet is evidence of any of them working.
When you press them on this, they have all sorts of ideas like a judge LLM that takes the outputs, comes up with modified SOPs and feeds those into the prompts of the mixture-of-experts LLMs. But I don't think that works, I've tried closing that loop and all I got was LLMs flailing around.
Wait ...
You mean teams are already building their own solutions to existing solutions? Software development will live on in eternity then.
They are just reselling OpenAI subscriptions at a markup. Surprise!
A long time ago a mentor of mine said,
"In tech, often an expert is someone that know one or two things more than everyone else. When things are new, sometimes that's all it takes."
It's no surprise it's just prompt engineering. Every new tech goes that way - mainly because innovation is often adding one or two things more the the existing stack.
I remember being told that the secret of good consultancy is knowing what to read on your way to the meeting
Very true. And these days it takes a lot less effort than before, getting LLMs to summarize shit, which is one task they inarguably shine at.
They make too many mistakes for me to rely on their summaries for consulting. Repeating one of those is a great way to embarrass yourself in front of a client and damage your reputation
It’s easy to underestimate the amount of testing “just” prompt/context engineering takes to get above average results.
And then you need to see what variations work best with different models.
My POCs for personal AI projects take time to get this right. It’s not like the API calls are the hard portion of the software.
I'm always more interested in the 'less is more' strategy, taking things away from the already hyper-complicated stack, reviewing first principles and simplifying for the same effectiveness. This is ever more rare.
I think this sense of “less is more” roughly means refactoring? I think the reason these go south so often is because we’re likely moving complexity around rather than removing it. Removing a layer from the stack means making a different layer more complex to take over for it.
Why is this post published in November 2025 talking about GPT-4?
I'm suspicious of their methodology:
> Open DevTools (F12), go to the Network tab, and interact with their AI feature. If you see: api.openai.com, api.anthropic.com, api.cohere.ai You’re looking at a wrapper. They might have middleware, but the AI isn’t theirs.
But... everyone knows that you shouldn't make requests directly to those hosts from your web frontend because doing so exposes your API key in a way that can be stolen by attackers.
If you have "middleware" that's likely to solve that particular problem - but then how can you investigate by intercepting traffic?
Something doesn't smell right about this investigation.
It does later say:
> I found 12 companies that left API keys in their frontend code.
So that's 12 companies, but what about the rest?
Providers such as OpenAI have client keys so your client application can call the providers directly. Many developers prefer them as they save roundtrip costs and latency.
https://platform.openai.com/docs/api-reference/realtime-sess...
Do those still only work for the voice APIs though?
I've been hoping they would extend that to other APIs, and I'd love to see the same kind of mechanism for other providers.
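For reference, the flow the realtime-sessions docs describe (as I read them; treat the endpoint payload and response fields here as approximate) is that your backend mints a short-lived client secret and only that ephemeral credential reaches the browser:

```python
import os
import requests

def mint_client_token() -> str:
    # Runs on YOUR server: trade the real API key for a short-lived client
    # secret that the browser can then use to call the provider directly.
    resp = requests.post(
        "https://api.openai.com/v1/realtime/sessions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={"model": "gpt-4o-realtime-preview"},  # illustrative model name
        timeout=10,
    )
    resp.raise_for_status()
    # The ephemeral credential is in the response; the exact field layout may
    # differ from this sketch, so check the linked docs before relying on it.
    return resp.json()["client_secret"]["value"]
```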
That's a big LLM smell when it mentions old models like GPT-4.
> just prompt engineering
This dismisses a lot of actual hard work. The scaffolding required to get SOTA performance is non-trivial!
Eg how do you build representative evals and measure forward progress?
Also, tool calling, caching, etc is beyond what folks normally call “prompt engineering”.
If you think it’s trivial though - go build a startup and raise a seed round, the money is easy to come by if you can show results.
prompt engineering + CRUD is likely much more fair.
And many companies are "just CRUD".
The money is easy to come by because wealthy investors, while they don't want to pay any more in taxes, are desperate to find possible returns in an economy that sucks outside of ballooning healthcare and the AI bubble... not because they need the money but because NUMBER MUST GO UP.
And more so than even most VC markets, raising for an "AI" company is more about who you know than what results you can show.
If anyone is actually showing significant results, where's the actual output of the AI-driven software boom (beyond just LLMs making coders more efficient by being a better Google)? I don't see any real signs of it. All I see is people doing aftermarket modifications on the shovels; I've yet to see any of the end users of these shovels coming down from the hills with sacks of real gold.
What’s your opinion on any of the plethora of unicorns in domain-specific AI, like Harvey? ($100m ARR from what I could find on a cursory search)
https://www.forbes.com/sites/iainmartin/2025/10/29/legal-ai-...
This is like when people say that you should short the market if you think it's going to crash. People have different risk premiums.
> Eg how do you build representative evals and measure forward progress?
This assumes that those companies do evaluations. In my experience, seeing a huge amount of internal AI projects at my company (FAANG), there's not even 5% that have any sort of eval in place.
Yeah, I believe that lots of startups don’t have evals either, but as soon as you get paying customers you’re gonna need something to prevent accidentally regressing as you tune your scaffolding, swap in newer models, etc.
This is a big chasm that I could well believe a lot of founders fail to cross.
It’s really easy to build an impressive-looking tech demo, much harder to get and retain paying customers and continuously improve.
But! Plenty of companies are actually doing this hard work.
See for example this post: https://news.ycombinator.com/item?id=46025683
This should be the top comment.
But ... what else should they be doing? What's the expectation here?
For example, in the 90's, a startup that offered a nice UI for a legacy console-based system would have been a great idea. What's wrong with that?
IMO nothing wrong with it. Just misleading to call yourself an AI company when you actually make a CRUD app. I think if these companies were honest about what they’re doing nobody would be upset. There’s an obvious deliberate attempt to give an impression of technical complexity/competence that isn’t there.
I assume it works because the ecosystem is, as you say, so new. Non-technical observers have trouble distinguishing between LLM companies and CRUD companies
So, what is an AI company? What do they sell? AI models? agents? Are they building these from scratch or using some pre-trained base models/agents?
I don’t have a problem with a company calling themselves an AI company if they use OpenAI behind the scenes.
The thing that annoys me is when clearly non-AI companies try to brand themselves as AI: like how Long Island Iced Tea tried to brand themselves as a blockchain company or WeWork tried to brand themselves as a tech company.
If we’re complaining about AI startups not building their own in house LLMs, that really just seems like people who are not in the arena criticizing those who are.
They should be creating tiny domain specific models, because someday OpenAI will stop selling dollars for a nickel.
They should compete in the crucible of the free market. If prompt engineering is indeed a profitable industry then so be it. I for one am just tired of all things software being dominated by this hype funded AI frenzy.
I think the point is more to point out the inherent danger presented when your platform is just a wrapper, but is being sold as more than that.
A lot of these startups have little to no moat, but they're raking in money like no one's business. That's exactly what happened in the dotcom bubble.
Actual AI. Not being "AI" users.
Being LLM users would be fine but they pretend they do AI.
AI is an ecosystem that includes users at all layers and innovation at all those layers - infra, databases, models, agents, portals, UIs and so on. What do you mean by doing AI?
Btw, the so-called AI devs or model developers are "users" of the databases and all the underlying layers of the stack.
Everything is a spectrum.
At what point can you claim that you did "it"?
Do you have to use an open source model instead of an API? Do you have to fine tune it? How much do you need to? Do you have to create synthetic data for training? Do you have to gather your own data? Do you need to train from scratch? Do you need to come up with a novel architecture?
10 years ago, if you gathered some data, trained a linear model to determine the likelihood your client would default on their loan, and used that to decide how much, if any, to loan them, you're absolutely doing "actual AI".
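That kind of "actual AI" fits in a dozen lines. A minimal sketch on synthetic data (logistic regression being the classic linear model for a default-probability classifier; all numbers here are invented):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic applicants: income (k$), debt-to-income ratio, years employed.
X = rng.normal(loc=[60, 0.3, 5], scale=[20, 0.1, 3], size=(1000, 3))
# Synthetic "defaulted" label, loosely driven by debt-to-income ratio.
y = (X[:, 1] + rng.normal(0, 0.05, size=1000) > 0.35).astype(int)

model = LogisticRegression().fit(X, y)

applicant = [[45, 0.42, 2]]  # income 45k, DTI 0.42, 2 years employed
p_default = model.predict_proba(applicant)[0, 1]
# Use the estimated default risk to decide how much, if anything, to lend.
max_loan = 0 if p_default > 0.5 else int(20_000 * (1 - p_default))
print(f"estimated default risk {p_default:.2f}, offer up to ${max_loan}")
```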
---
For any other software you could ask all the same questions, but about using a high-level language, frameworks, dependencies, hiring consultants / a firm, using an LLM, no-code, etc.
At what point does outsourcing some portion of the end product become no longer doing the thing?
What’s actual AI in this context?
Isn't this true for most startups out there, even before AI? Some sort of bundle/wrapper around existing technology? I worked auditing companies, and we used a particular system that cost tens of thousands of dollars per user per year, and we charged customers up to a million to generate reports with it. The platform didn't have anything proprietary other than the UX; under the hood it was a few common tools, some of them open source. We could have created our own product, but our margins were so huge it didn't make sense to set up a software development unit, or even bother with outsourcing it.
This post hovers on something I came to the week after ChatGPT dropped in 2023.
If an AI company has an AGI, what incentive do they actually have to sell it as a product, especially if it’s a 10x cost/productivity/reliability silicon engineer? Just undercut the competition by building their services from scratch.
That is lower than I expected. There are just a handful of companies that create LLMs. They are all more or less similar. So all the automation is in using them, which is prompt engineering if you see it that way.
The bigger question is, this is the same story as with apps on mobile phones. Apple and Google could easily replicate your app if they wanted to, and they did too. That danger is much higher with these AI startups. The LLMs are already there in terms of functionality; all the creators figured out the value is in vertical integration, and all of them are doing it. In that sense all these startups are just showing them what to build. Even Perplexity and Cursor are in danger.
Do not forget that a product idea needs to meet a certain ROI to be stolen. Big Tech won't go after opportunities that do not generate billion-level revenue. This leaves some room for applications where you can earn decent money.
That is not how companies work. What you said may be true for the immediate short term, but over time every team in the company needs to show improvement and set yearly milestones. All these startups will then become functionality they want to push that quarter. Yes, it doesn't mean the death of the startup, but a struggle.
I can believe that many startups are doing prompt engineering and agents, but in a sense this is like saying 90% of startups are using cloud providers, mainly AWS and Azure.
There is absolutely no point in reinventing the wheel to create a generic LLM and spend a fortune to run GPUs while there are providers giving this power cheaply.
In addition, there may be value in getting to market quickly with existing LLM providers, proving out the concept, then building / training specialized models if needed once you have traction.
See: https://en.wikipedia.org/wiki/Lean_startup
It is beyond annoying that the article is totally generated by AI. I appreciate the author (hopefully) spending effort in trying to figure out these AI systems, but the obviously LLM-written, unedited content makes me not trust the article.
What makes you believe that anything in the article is real?
The author seems to not exist and it's unclear where the data underlying the claims is even coming from since you can't just go and capture network traffic wherever you like.
A little due diligence please.
This makes no sense to me? I don't understand why a company, even if it is using GPT or Claude as their true backend, is going to leave API calls in Javascript that anyone can find. Sure maybe a couple would, but 73% of those tested? Surely your browser is going to talk to their webserver, and yup sure it'll then go off and use Claude etc then return the answer to you, but surely they're not all going to just skin an easily-discoverable website over the big models?
I don't believe any of this. Why aren't we questioning the source of how the author is apparently able to figure out some sites are using REDIS etc etc?
It's very confusing in the text of the article, at times it sounds like the author is using heuristic methods (like timings) but at times it sounds like they somehow have access to network traffic from the provider's backend. I could 100% believe that a ton of these companies are making API calls to providers directly from an SPA, but the flow diagrams in the article seem to specifically rule that out as an explanation.
I might allow them more credit if the article wasn't in such an obviously LLM-written style. I've seen a few cases like this, now, where it seems like someone did some very modest technical investigation or even none at all and then prompted an LLM to write a whole article based on it. It comes out like this... a whole lot of bullet points and numbered lists, breathless language about the implications, but on repeated close readings you can't tell what they actually did.
It's unfortunate that, if this author really did collect this data, their choice to have an LLM write the article and in the process obscure the details has completely undermined their credibility.
It makes perfect sense when you consider that the average Javascript developer does not know that business logic can exist outside of React components.
Prompt engineering isn't as simple as writing prompts in English. It's still engineering the data flow, when data is relevant, the systems that the AI can access and search, the tools that the AI can use, etc.
Is it, though? Apparently the current best practice is just to allow the LLM untethered access to everything and try to control access by preventing prompt injection...
Well it took me 2 full-time weeks to properly implement a RAG-based system so that it found actually relevant data and did not hallucinate. Had to:
- write an evaluation pipeline to automate quality testing
- add a query rewriting step to explore more options during search
- add hybrid BM25+vector search with proper rank fusion (see the sketch below)
- tune all the hyperparameters for best results (like the weight bias for BM25 vs. vector, how many documents to retrieve for analysis, how to chunk documents based on semantics)
- parallelize the search pipeline to decrease wait times
- add moderation
- add a reranker to find best candidates
- add background embedding calculation of user documents
- lots of failure cases to iron out so that the prompt worked for most cases
There's no "just give LLM all the data", it's more complex than that, especially if you want best results and also full control of data (we run all of that using open source models because user data is under NDA)
Sounds like you vibe coded a RAG system in two weeks, which isn't very hard. Any startup can do it.
I've debugged single difficult bugs before for two weeks, a whole feature that takes two weeks is an easy feature to build.
I already had experience with RAG before so I had a head start. You're right that it's not rocket science, but it's not just "press F to implement the feature" either
P.S. No vibe coding was used. I only used LLM-as-a-judge to automate quality testing when tuning the parameters, before passing it to human QA
"did not hallucinate"
Sorry to nitpick, but this is not technically possible no matter how much RAG you throw at it. I assume you just mean "hallucinates a lot less"
You're right, bad wording
whoa, two weeks
@apwell23 while the author didn’t say how s/he measured QA, creating the QA process was literally the first bullet.
You still need to find the correct data, and get it to the LLM. IMO, a lot of it is data engineering work with API calls to an LLM as an extra step. I'm currently doing a lot of ETL work with Airflow (and whatever data {warehouses, lakes, bases} are needed) to get the right data to a prompt engineering flow. The prompt engineering flow is literally a for loop of Google Docs in a Google Drive that non-tech people, but domain experts in their field, can access.
It's up to the domain experts and me to understand where giving it data will tone down the hallucinative nonsense an LLM puts out, and where we should not give data because we need the problem-solving skills of the LLM itself. A similar process applies to tool use, which in our case means pre-selected Python scripts that it is allowed to run.
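To make that shape of pipeline concrete, a stripped-down sketch; the Drive/Docs helper and the final LLM hand-off are placeholders for whatever clients are actually in use, not real library calls:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def list_prompt_docs():
    # Placeholder: fetch the prompt documents the domain experts maintain in a
    # shared Drive folder, using whatever Drive/Docs client you actually have.
    return [{"name": "triage_prompt", "text": "You are a claims triage assistant..."}]

def run_prompt_flow():
    # The "prompt engineering flow": loop over the docs, attach the data the
    # upstream ETL tasks prepared, and hand the result to the model.
    for doc in list_prompt_docs():
        context = "...rows pulled from the warehouse by earlier tasks..."
        prompt = f"{doc['text']}\n\nData:\n{context}"
        print(f"would send {doc['name']} ({len(prompt)} chars) to the LLM")

with DAG(
    dag_id="etl_to_prompt_flow",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # A real pipeline would have extract/transform tasks feeding this one.
    PythonOperator(task_id="run_prompt_flow", python_callable=run_prompt_flow)
```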
Can you describe what the use case is?
Nah. There's no such thing as prompt engineering. It doesn't exist. Engineering involves applying scientific principles to solve real world problems. There are no clear scientific principles to apply here. It's all instinct, hunches, educated guesses, and heuristics with maybe some sort of feedback loop. And that's fine, it can still produce useful results. Just don't call it engineering. Maybe artisanal prompt crafting? Or prompt alchemy?
Prompt engineering is the new Search Engine Optimization.
Not sure if we called it engineering ten years ago.
Human speech is "engineering data flow"
Painting is "engineering data flow"
Directing a movie is "engineering data flow"
Playing the guitar is "engineering data flow"
This statement merely reveals a bias to apply high value to the word "engineering" and to the identity "engineer".
Ironic in that silicon valley lifted that identity and it's not even legally recognized as a licensed profession.
Imagine you are a top-of-the-line engineer...
Engineering data flow... sure, we all like to use big words.
The new 10x engineering is writing "please don't write bugs" in a markdown file.
Where is this guy sitting that he is able to collect all of this data? And why is he able to release it all in a blog post? (my company wouldn't allow me to collect and release customer data like this.)
Another red flag with the article is that the author's LinkedIn profile link at the bottom leads to a non-existing page.
Is Teja Kusireddy a real person? Or is this maybe just an experiment from some AI company (or other actor) to see how far they can push it? A Google search by that name doesn't find anything not related to the article.
The article should be flagged. Otoh, this should get discussed.
He seems real. Goes by Teja K. Seems to be a startup founder.
He may be real, but the article is fake BS. There is simply no way he'd be in a position to intercept the calls, and he never explains it.
There is nothing difficult about monitoring network traffic in this way for desktop or native apps.
Did you read the article? It claims to have knowledge of network traffic between the startup's devices and the AI providers' devices.
Do you have any link that supports this?
It sounds like some of these companies call the OpenAI or Anthropic APIs directly from their frontend. Later, the author also mentions "response time patterns for every major AI API," so maybe there's some information about the backend leaking that way even if the API calls are bridged.
But I'd like to know an actual answer to this, too, especially since large parts of this post read as if they were written by an LLM.
> It sounds like some of these companies call the OpenAI or Anthropic APIs directly from their frontend.
Which would be a major security hole. And sure, lots of startups have major security holes, but not enough that he could come up with these BS statistics.
I'm a little dismayed at how high up this has been voted given the data is guaranteed to be made up.
> > It sounds like some of these companies call the OpenAI or Anthropic APIs directly from their frontend.
> Which would be a major security hole.
An officially supported security hole
https://platform.openai.com/docs/api-reference/realtime-sess...
"I found 12 companies that left API keys in their frontend code. I reported them all. None responded."
They claim to have found that.
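For what it's worth, that kind of leak is usually trivially greppable in a shipped bundle. A minimal sketch (the `sk-` prefix is OpenAI's conventional key format, the URL is hypothetical, and the regex is deliberately loose):

```python
import re
import requests

def find_leaked_keys(bundle_url: str):
    # Download a public JS bundle and flag strings shaped like provider API
    # keys. This only surfaces candidates; it proves nothing by itself.
    js = requests.get(bundle_url, timeout=10).text
    return sorted(set(re.findall(r"sk-[A-Za-z0-9_-]{20,}", js)))

# Hypothetical bundle URL:
# print(find_leaked_keys("https://example-ai-startup.com/static/js/main.js"))
```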
Yeah, TBH my BS detector is going off because this article never explains how he is able to intercept these calls.
To be able to call OpenAI directly from the front end, you'd need to include the OpenAI key, which would be a huge security hole. I don't doubt that many of these companies are just wrappers around the big LLM providers, but they'd be calling the APIs from their backend, where nothing should be interceptable. And sure, I believe a few of them are dumb enough to call OpenAI from the frontend, but that would be a minority.
This whole thing smells fishy, and I call BS unless the author provides more details about how he intercepted the calls.
> Yeah, TBH my BS detector is going off because this article never explains how he is able to intercept these calls.
You mean, except for explaining what he's doing 4-5 times? He literally restates it over and over. Half the article is about the various indicators he used. THERE ARE EXAMPLES OF THEM.
There's this bit:
> Monitored their network traffic for 60-second sessions
> Decompiled and analyzed their JavaScript bundles
Also there's this whole explanation:
> The giveaways when I monitored outbound traffic:
> Requests to api.openai.com every time a user interacted with their "AI"
> Request headers containing OpenAI-Organization identifiers
> Response times matching OpenAI’s API latency patterns (150–400ms for most queries)
> Token usage patterns identical to GPT-4’s pricing tiers
> Characteristic exponential backoff on rate limits (OpenAI’s signature pattern)
Also there's these bits:
> The Methodology (Free on GitHub next week):
> - The complete scraping infrastructure
> - API fingerprinting techniques
> - Response time patterns for every major AI API
One time he even repeats himself by stating what he's doing as Playwright pseudocode, in case plain English isn't enough.
This was also really funny:
> One company’s “revolutionary natural language understanding engine” was literally this: [clientside code with prompt + direct openai API call].
And there's also this bit at the end of the article:
> The truth is just an F12 away.
There's more because LITERALLY HALF THE ARTICLE IS HIM DOING THE THING YOU COMPLAIN HE DIDN'T DO.
In case it's still not clear, he was capturing local traffic while automating with Playwright, as well as analyzing client-side JS.
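A minimal sketch of that setup, assuming Playwright's request event API (the hostname list and the 60-second window just mirror what the article describes; the target URL is hypothetical):

    import { chromium } from "playwright";

    const PROVIDER_HOSTS = ["api.openai.com", "api.anthropic.com", "api.cohere.ai"];

    (async () => {
      const browser = await chromium.launch();
      const page = await browser.newPage();

      // Log every outbound request the page itself makes to a model provider.
      page.on("request", (req) => {
        const { hostname, pathname } = new URL(req.url());
        if (PROVIDER_HOSTS.includes(hostname)) {
          console.log(req.method(), hostname + pathname);
          console.log("  openai-organization:", req.headers()["openai-organization"] ?? "(none)");
        }
      });

      await page.goto("https://example-ai-startup.test"); // hypothetical target
      // ...drive the "AI" feature, then watch traffic for ~60 seconds...
      await page.waitForTimeout(60_000);
      await browser.close();
    })();

Note that this only sees browser-originated traffic; it says nothing about what a startup's backend does, which is exactly the objection being raised upthread.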
> Monitored their network traffic for 60-second sessions
How can he monitor what's going on between a startup's backend and OpenAI's server?
> The truth is just an F12 away
That's just not how this works. You can see the network traffic between your browser and some service. In 12 cases that was OpenAI or similar. Fine. But that's not 73%. What about the rest? He literally has a diagram claiming that the startups contact an LLM service behind the scenes. That's what's not described: how does he measure that?
You are not bothered that the only sign the author even exists is this one article and the previous one? Together with the claim to be a startup founder? Anybody can claim that. It doesn't automatically provide credibility.
I believe he's saying that a large number of the startups he tested did not have their own backend to mediate. It was literally direct front-end calls to openai. And if this sounds insane, remember that openai actually supports this: https://platform.openai.com/docs/api-reference/realtime-sess...
Presumably OpenAI didn't add that for fun, either, so there must be non-zero demand for it.
>Response times matching OpenAI’s API latency patterns (150–400ms for most queries)
This also matches the latency of a large number of DB queries and non-OpenAI LLM inference requests.
>Token usage patterns identical to GPT-4’s pricing tiers
What? Yes this totally smells real.
He also mentions backoff patterns, and I'm not sure how he'd distinguish those from the perfectly standard backoff you see in any normal API.
Given the ridiculousness of these claims, I believe there's a reason he didn't include the fingerprinting methodology in this article.
I'm also wondering how he is able to see calls to AI providers directly in the browser; client-side API calls? That's strange to me. Also, how is he able to peer into the RAG architectures? I don't get that. Maybe GPT-4.1 allows unauthenticated requests? Is there an OAuth setup that allows client-side requests to OpenAI?
Yea I just posted a similar comment. I'm sure some websites just skin OpenAI/Claude etc, but ALL of them? It makes no sense.
There's a link in the preview of TFA that unlocks the rest of the article, looks like this for me:
https://medium.com/@teja.kusireddy23/i-reverse-engineered-20...
The article is basically a description of where to look for clues. Perhaps they've contracted with some of these companies and don't want to break some NDA by naming them, but still know a lot about how they work.
> Perhaps they've contracted with some of these companies and don't want to break some NDA by naming them, but still know a lot about how they work.
This makes literally no sense. Why would any companies (let alone most of them) contract with this guy who seems hell bent on exposing them all.
The article is simply made up, most likely by an LLM.
The thing that drives me nuts is that most "AI Applications" are just adding crappy chat to a web app. A true AI application should have AI driven workflows that automate boring or repetitive tasks without user intervention, and simplify the UI surface of the application.
I'm firmly of the opinion that, as a general rule, if you're directly embedding the output of a model into a workflow and you're not one of a handful of very big players, you're probably doing it wrong.[1]
If we overlook that non-determinism isn't really compatible with a lot of business processes and assume you can make the model spit out exactly what you need, you can't get around the fact that an LLM is going to be a slower and more expensive way of getting the data you need in most cases.
LLMs are fantastic for building things. Use them to build quickly and pivot where needed and then deploy traditional architecture for actually running the workloads. If your production pipeline includes an LLM somewhere in the flow, you need to really, seriously slow down and consider whether that's actually the move that makes sense.
[1] - There are exceptions. There are always exceptions. It's a general rule not a law of physics.
73% of AI startups are building their castle in someone else's kingdom.
It's worse than that, someone else's models, someone else's smartphone operating systems, it's every conceivable disadvantage.
Every city should have its own municipal chip fabrication plant!
If you break up AT&T the Bell System will collapse!
Not sure if you're familiar, but it did collapse. It's all one company again.
-1: there's lots of "kingdoms" (openai, anthropic, google, plus open source) - if one king comes for your castle, you can move in minutes.
True, even OpenAI built their castle in nVidia's kingdom. And nVidia built their castle in TSMC's kingdom. And TSMC built their castle in ASML's kingdom.
Lastly, we need the FDIC meme "Backed by the full faith and credit of the U.S. Government" for good measure, haha.
TSMC bought a huge chunk of ASML's shares before taking the plunge on EUV -- enough to get them a board seat.
My question with these is always "what happens when the model doesn't need prompting?". For example, there was a brief period where IDE integrations for coding agents were a huge value add - folks spent eons crafting clever prompts and integrations to get the context right for the model. Then... Claude, Gemini, Codex, and Grok got better. All indications are that engineers are pivoting to using foundation model vended coding toolchains and their wrappers.
This is rapidly becoming a more extreme version of the classic "what if google does that?" as the foundation model vendors don't necessarily need to target your business or even think about it to eat it.
5% prompt engineering, 95% orchestration. And no, you cannot vibe-code your way in and clone my apps. I have paid subscriptions; why aren't you doing it, then? Oh, because models degrade severely over 500 lines.
LLMs are the new AJAX. AJAX made pages dynamic; LLMs make pages interactive.
I'm surprised by the number of people who are running head first into AI wrapper start-ups.
Either you have a smash-and-grab strategy or you are awful at risk analysis.
Do you want to be right, or do you want to make money? You'll be correct in 5-10 years. Do you wait and do nothing until then?
The reason is that VCs need to show that their flagship investments have "traction", so they manufacture ecosystem interest by funding and encouraging the use of ecosystem products. It's a small price to pay. If someone builds a wrapper that gets 100 business users, that token usage gets passed down to the foundation layer. Big scheme.
This is the global app store all over again: all these companies are clients of only a few true AI companies and try to distinguish themselves within the bounds of the underlying models and APIs, just like apps trying to find niches within the bounds of the APIs and exposed hardware of the underlying iPhones. API version bugs are now model updates. And of course, all are at the mercy of their respective Leviathan.
Flagged. Please don't post items on HN where we have to pay or hand over PII to read it. Thanks.
pls don't create new guidelines
Ditto.
It's wild. I work with some Fortune 500 engineers who don't spend a lot of time prompting AI, and just a few quick prompts like "output your code in <code lang="whatever">...</code> tags" (a trick that most people in the prompting world are very familiar with, but that virtually no one outside the bubble knows about) can improve AI code-generation output to almost 100%.
It doesn't have to be this way and it won't be this way forever, but this is the world we live in right now, and it's unclear how many years (or weeks) it'll be until we don't have to do this anymore
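For anyone outside the bubble, the trick is just to tell the model exactly what wrapper to put its code in and then parse only that wrapper. A minimal sketch (the tag format matches the comment above; the helper name is made up):

    // Ask for output in an explicit wrapper so it can be extracted reliably,
    // instead of fishing code out of free-form prose.
    const SYSTEM_PROMPT =
      'Output your code in <code lang="python">...</code> tags and nothing else.';

    // Pull the code back out of the model's reply. The regex only trusts the
    // tags we asked for, so any surrounding chatter is ignored.
    function extractCode(reply: string): string | null {
      const match = reply.match(/<code lang="[^"]*">([\s\S]*?)<\/code>/);
      return match ? match[1].trim() : null;
    }

    // extractCode('Sure! <code lang="python">print("hi")</code>') -> 'print("hi")'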
Another slop article that could probably be good if the author were interested in writing it, but instead they dumped everything into an LLM, and now I can't tell what's real and what's not, and I get no sense of which parts of the findings the author found important or interesting compared to the rest.
I have to wonder, are people voting this up after reading the article fully, and I'm just wrong and this sort of info dump with LLM dressing is desirable? Or are people skimming it and upvoting? Or is it more of an excuse to talk about the topic in the title? What level of cynicism should I be on here, if any?
Interesting article and plausible conclusions, but the author needs to provide more details to back up their claims. The author has yet to release anything supporting their approach on their GitHub. https://github.com/tejakusireddy
https://archive.ph/Zjs2J
98% of all websites are just database wrappers
2% are unjust?
I don't care how you get to a system that does something useful.
That's actually lower than I would have thought.
73% of startups are just writing computer programs
And 73% of SaaS companies are just CRUD.
Honestly, it sounds about right: at the end of the day, most companies will always be an interesting UI and workflow around some commodity tech, but that's valuable. Not all of it may be defensible, but it's still valuable.
73% of statistics are wrong
I decided to flag this article because it has to be fake.
The author never explains how he is able to intercept these API calls to OpenAI, etc. I definitely believe tons of these companies are just wrappers, but they'd be doing the "wrapping" in their backend, with only a couple (dumb) companies doing the calls directly to OpenAI from the front end where they could be traced.
This article is BS. My guess is it was probably AI generated because it doesn't make any sense.
I find it shocking that most comments here just accept the article as fact and discuss the implications.
The message might not even be wrong. But why is everybody's BS detection on ice in the AI topic space? Come on, people, you can all do better than this!
Thanks for flagging. Though whenever such a made-up thing is flagged, we lose the chance to discuss this (meta) topic. People need to be aware of how prevalent this is. By just hiding it every time we notice, we're preventing everybody from reading the kind of comment you wrote and recalibrating their BS-meters.
And 99% of software development is just feeding data into a compiler. But that sort of misses the point, doesn't it?
AI has created a new interface with a higher level abstraction that is easier to use. Of course everyone is going to use it (how many people still code assembler?).
The point is what people are doing with it is still clever (or at least has potential to be).
I disagree. Software development is not limited to LLM-type responses and incorporates proper logic. You are at the mercy of the LLM when you build an "AI" interface on top of the LLM APIs. 73% of these "AI" companies will collapse when the original API provider comes up with a simple option of its own (Gemini for Sheets, for example); they will disappear. It is already happening.
AI software is not long-lasting; its results are not deterministic.
Maybe one day i can ask my tech in natural language for the weather...could you imagine?
Wait...nvm.
Isn’t it a bit like saying, “X% of startups are just writing code”?
So? 73% of SaaS startups are DB connectors & queries.
The difference is, if your company “moat” is a “prompt” on a commodity engine, there is no moat.
Google even said they have no moat, when clearly the moat is people that trust them and not any particular piece of technology.
the orchestration layer is the moat, ask any LLM and they will give paragraphs explaining why this is...
And 73% of paas are deploy scripts for existing software. It's how the industry works.
If tokens aren't profitable, then prices per token are likely to go up. If that's all these businesses are, they're all very sensitive to token prices.
Not with open-weight models you can deploy yourself. Different economics, but not vulnerable to price increases.
First, someone has to develop those models and that's currently being done with VC backing. Second, running those models is still not profitable, even if you self host (obviously true because everything is self hosted eventually).
Burning VC money isn't a long term business model and unless your business is somehow both profitable on Llama 8b (or some such low power model) _and_ your secret sauce can't be easily duplicated, you're in for a rough ride.
The only barrier between AI startups at this point is access to the best models, and that's dependent on being able to run unprofitable models that spend someone else's money.
Investing in a startup that's basically just a clever prompt is gambling on the first mover's advantage because that's the only advantage they can have.
73% of AI blog post statistics are bogus. Subscribe to learn more.
And out of that 73%, 99% of them don't even do the obvious step of trying to actually optimize/engineer their damn prompts!
https://github.com/zou-group/textgrad
and bonus, my rant about this circa 2023 in the context of Stable Diffusion models: https://gist.github.com/Hellisotherpeople/45c619ee22aac6865c...
Flagged. AI written article with questionable sources behind a wall that requires handing over PII.
The really impressive thing about AI startups is not that they sell wrappers around (whatever), but that they are not complete vaporware.
It’s because the LLM is a commodity.
What differentiates a product is not the commodity layer it’s built on (databases, programming languages, open source libraries, OS apis, hosting, etc) but how it all gets glued together into something useful and accessible.
It would be a bad strategy for most startups to do anything other than prompt engineering in their AI implementations for the same reason it would be a bad idea for most startups to write low-level database code instead of SQL queries. You need to spend your innovation tokens wisely.
Yep, I just use ChatGPT. I can write better prompts and data for my own use cases.
Atlas himself doesn't carry as much weight as "engineering" does in that headline.
That's like saying "73% of business is just meetings"
One of the biggest problems frontier models will face going forward is how many tasks require expertise that cannot be achieved through Internet-scale pre-training.
Any reasonably informed person realizes that most AI start-ups looking to solve this are not trying to create their own pre-trained models from scratch (they will almost always lose to the hyperscale models).
A pragmatic person realizes that they're not fine-tuning/RL'ing existing models (that path has many technical dead ends).
So, a reasonably informed and pragmatic VC looks at the landscape, realizes they can't just put all their money into the hyperscale models (LPs don't want that), and they look for start-ups that take existing hyperscale models and expose them to data that wasn't in their pre-training set, hopefully in a way that's useful to some users somewhere.
To a certain extent, this study is like saying that Internet start-ups in the 90's relied on HTML and weren't building their own custom browsers.
I'm not saying that this current generation of start-ups will be as successful as Amazon and Google, but I just don't know what the counterfactual scenario is.
The question the article doesn't completely answer is how useful these startups' pipelines actually are. It certainly implies that for at least some of these startups there is very little value added in the wrapper.
Got any links to explanations of why fine-tuning open models isn't a productive solution? Besides renting the GPU time, what other downsides exist on today's SOTA open models for doing this?
When people are desperate to invest, they often don't care what someone actually can do but more about what they claim they can do. Getting investors these days is about how much bullshit you can shovel as opposed to how much real shit you shoveled before.
Thus has it always been. Thus will it always be.
Prompt engineering and using an expensive general model in order to prove your market, and then putting in the resources to develop a smaller (cheaper) specialized model, seems like a good idea?
Are people down to have a bunch of specialized models? The expectation OpenAI and everyone else has set is that you will have one model that can do everything for you.
It's like how we've seen basically all gadgets meld into the smartphone. People don't have Garmins and beepers and clock radios anymore (or dedicated phones!). It's all on the screen that fits in your pocket. Any would-be gadget is now just an app.
Having everything in my phone is a great convenience for me as a consumer. Pockets are small, and you only have a small number of them in any outfit.
But cloud services run in... the cloud. It's as big as you need it to be. My cloud service can have as many backing services as I want. I can switch them whenever I want. Consumers don't care.
"One model that can do everything for you" is a nice story for the hyper scalers because only companies of their size can pull that off. But I don't think the smartphone analogy holds. The convenience in that world is for the the developers of user-facing apps. Maybe some will want to use an everything model. But plenty will try something specialized. I expect the winner to be determined by which performs better. Developers aren't constrained by size or number of pockets.
> The expectation set by OpenAI and everyone else has set is that you will have one model that can do everything for you.
I don't think that's the expectation set by "everyone else" in the AI space, even if it arguably is for OpenAI (which has always, at least publicly, had something of a focus on eventual omnicapable superintelligence). I think Google Antigravity is evidence of this: there's a main, user-selected coding model, but regardless of which coding model is used, there are specialized models for browser interaction and image generation. While more and more capabilities are at least tolerably supported by the big general-purpose models, the range of specialized models seems to be increasing rather than decreasing, and it seems likely that, for complex efforts, combining a general-purpose model with a set of focused, task-specific models will be a useful approach for the foreseeable future.
I think of foundational models like CPUs. They're the core of powerful, general-purpose computers, and will likely remain popular and common for most computing solutions. But we also have GPUs, microcontrollers, FPGAs, etc. that don't just act as the core of a wide variety of solutions, but are also paired alongside CPUs for specific use cases that need specialization.
Foundational models are not great for many specific tasks. Assuming that one architecture will eventually work for everything is like saying that x86/amd64/ARM will be all we ever need for processors.
Specialized models are cheaper. For a company you're looking for some task that needs to be done millions of times per day, and where general models can do it well enough that people will pay you more than the general model's API cost to do it. Once you've validated that people will pay you for your API wrapper you can train a specialized model to increase your profit and if necessary lower your pricing so people won't pay OpenAI directly.
It's probably the direction it will go, at least in the near term.
It seems right now like there is a tradeoff between creativity and factuality, with creative models being good at writing and chatting, and factuality models being good at engineering and math.
It's why we are getting these specific -code models.
It's really an implementation decision. The end user doesn't need to know their request is routed to a certain model. A smaller specialized model might have identical output to a larger general purpose model, but just be cheaper and faster to run.
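A toy sketch of that implementation decision, with made-up model names and a deliberately naive routing rule, just to show the shape (the caller never learns which model handled the request):

    type Model = "small-code-model" | "big-general-model"; // hypothetical names

    // Naive router: send code-shaped requests to the cheap specialized model
    // and everything else to the general-purpose one. A real router would use
    // a classifier, but the caller-facing API is identical either way.
    function pickModel(prompt: string): Model {
      const looksLikeCode = /```|function |def |class |SELECT /.test(prompt);
      return looksLikeCode ? "small-code-model" : "big-general-model";
    }

    // callProvider is a stand-in for whatever SDK or HTTP call you actually use.
    declare function callProvider(model: Model, prompt: string): Promise<string>;

    async function complete(prompt: string): Promise<string> {
      return callProvider(pickModel(prompt), prompt);
    }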
Happy with my Garmin :-)
I still use the Garmin I bought in 2010. I refuse to turn on my phone's location tracking. Also the single-purpose interface is better and safer than switching between apps and contexts on a general purpose device.
My coffee maker app is quite disappointing.
I imagine you’re being facetious but I wouldn’t count food-related products for the most part. It’s not like Claude is brewing a pot for me anyway lol
People talk about an AI bubble. I think this is the real bubble.
Not really, because the money involved is relatively small. The bubble is where people are using D8s to push square kilometers of dirt around for data centers that need new nuclear power plants built, to house millions of obsolete Nvidia GPUs that need new fabs, constructed with yet more D8s, to make...
(D8 apparently refers to a specific Caterpillar bulldozer, not some Kubernetes-style abbreviation.)
Why is slop with ridiculous or impossible claims at the top of HN?
Wait til you hear what GPT 5 is
What is it? A gpt-4o wrapper?
Prompt is code.
prompt as code is a pipe-dream.
The machine model for natural language doesn't exist; it is too ambiguous to be useful for many applications.
Hence, we limited natural language to create programming languages whose machine model is well defined.
In math, we created formalism to again limit language to a subset that can be reasoned with.
I've always said, determinism has been holding the field back.
Prompt is specification, not code.
Not to be too pedantic, but code is a kind of specification. I think the blanket statement "Prompt is code" is inaccurate, but there does exist a methodology of writing prompts as if they are specifications that can be reliably converted into computational actions, and I believe we're heading toward that.
Yeah, I assumed someone would say this.
My manager gives me specifications, which I turn into code.
My manager is not coding.
This is how to look at it.
100% of AI startups are just multiplying matrices. 100% of tech startups are just database engineering.
It's still early in the paradigm and most startups will fail but those that succeed will embed themselves in workflows.