OpenClaw had a huge viral marketing campaign. It wasn't a coincidence that everyone on Twitter was suddenly talking about it at the same time. To its credit, it also executed well enough in a few areas to capture people's imagination. Most of the concepts are ideas people have been toying with for years, though.
Steinberg funded or directed a campaign? It looked to me like unconnected parties liked it and promoted it so they could offer their own solutions and services on top of it. You saw that they were paid by Steinberg/his affiliates?
What kind of small tasks do you find it's good at? My non-coding use of agents has been related to server admin, and my local-llm use-case is for 24/7 tasks that would be cost-prohibitive. So my best guess for this would be monitoring logs, security cameras, and general home automation tasks.
That's about it. The harness is still pretty rudimentary so I'm sure the system could be more capable, and that might reveal more interesting opportunities. I don't really know.
So far I've got it orchestrating a few instances to dig through logs, local emails, git repositories, and github to figure out what I've been doing and what I need to do. Opus is waayyy better at it, but Qwen does a good enough job to actually be useful.
I tried having it parse orders in emails and create a CSV of expenses, and that went pretty badly. I'm not sure why. The CSV was invalid and full of bunk entries by the end, almost every time. It also missed a lot of expenses; it would parse out only 5 or 6 of 7 items, for example. Opus and Sonnet do a spectacular job on tasks like this, and do cool things like create a list of emails with orders and then systematically ensure each line item within each email is accounted for, even without being prompted to do so. It's an entirely different category of performance.
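One way to catch that failure mode is to never trust the model's CSV wholesale: validate every row deterministically so bunk entries and missed items surface instead of silently corrupting the file. A minimal sketch, assuming a hypothetical three-column expense format (`date`, `vendor`, `amount`) — the column names and date format here are illustrative, not anything from the thread:

```python
import csv
import io
from datetime import datetime

def validate_expense_csv(text, expected_cols=("date", "vendor", "amount")):
    """Check model-produced CSV: header must match, dates and amounts
    must parse. Returns (good_rows, bad_rows) so bunk entries are
    flagged rather than silently included."""
    reader = csv.DictReader(io.StringIO(text))
    if tuple(reader.fieldnames or ()) != expected_cols:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    good, bad = [], []
    for row in reader:
        try:
            datetime.strptime(row["date"], "%Y-%m-%d")  # date must parse
            float(row["amount"])                        # amount must be numeric
            good.append(row)
        except (ValueError, TypeError, KeyError):
            bad.append(row)
    return good, bad

sample = "date,vendor,amount\n2024-05-01,Acme,19.99\nnot-a-date,Acme,oops\n"
good, bad = validate_expense_csv(sample)
```

Pairing a weaker model with a strict validator like this narrows the gap somewhat: the model only has to get individual rows right, and anything malformed gets kicked back instead of ending up in the final file.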
Automation is something I'd like to dabble in next, but all I can think of it being useful for is mapping commands (probably from voice) to tool calls, and the reality is I'd rather tap a button on my phone. My family might like being able to use voice commands, though. Otherwise, having it parse logs to determine how to act based on thresholds or something would also be far better implemented with simple algorithms. It's hard to find truly useful and clear fits for LLMs.
Oh man you just gave me an idea to use something like qwen 3.5 to categorize a lot of emails. You can keep the context small, do it per email and just churn through a lot of crap.
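The per-email approach keeps each call's context tiny no matter how large the mailbox is. A minimal sketch of that loop — the `classify` callable would normally hit a local model endpoint (e.g. an OpenAI-compatible llama.cpp or Ollama server); here a trivial keyword stand-in keeps the sketch self-contained and runnable, and all names are illustrative:

```python
LABELS = ("order", "newsletter", "personal", "other")

def keyword_classify(subject, body):
    """Stand-in for a local-LLM call: returns one label per message.
    A real version would send subject+body to the model with a short
    prompt constraining output to LABELS."""
    text = (subject + " " + body).lower()
    if "order" in text or "receipt" in text:
        return "order"
    if "unsubscribe" in text:
        return "newsletter"
    return "other"

def categorize(emails, classify=keyword_classify):
    # One call per email: context stays small regardless of mailbox size,
    # and a bad answer only affects that one message.
    return [(e["subject"], classify(e["subject"], e["body"])) for e in emails]

inbox = [
    {"subject": "Your order has shipped", "body": "Receipt attached."},
    {"subject": "Weekly digest", "body": "Click unsubscribe to stop."},
]
results = categorize(inbox)
```

Because each message is independent, this also parallelizes trivially and any single misclassification is cheap to re-run.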
I've been learning to apply these lately and it has been pretty eye opening. Combined with Fourier analysis (for example) you can do what seems kind of like magic, in my opinion. But it has been possible since long before LLMs showed up.
Totally different categories and different use cases, but the more I learn about LLMs the more I discover there's a powerful, deterministic, well-established statistical model or two to do the same thing.
Really, LLMs are kind of like convenient, wildly inefficient proxies for useful processes. But I'm not convinced they should often end up as permanent fixtures of logical pipelines. Unless you're making a chat bot, I guess.
> Really, LLMs are kind of like convenient, wildly inefficient proxies for useful processes. But I'm not convinced they should often end up as permanent fixtures of logical pipelines. Unless you're making a chat bot, I guess.
I think I agree with this. It's made me realise LLMs are great for prototyping processes in the same way that 3D printers are great at prototyping physical things. They make it quick and easy to get something close enough to see the unforeseen problems a proper solution might have.
3d printing is a great analog because there are so many critical considerations that are often missed or can't be accounted for in the prototype, but it's alright because it's a prototype. The strain testing, durability, manufacturing at scale; none of that is properly addressed. Those might involve some serious, expensive challenges, too. But it's alright, because you've got something in your hand that informs you whether or not those challenges are worth contending with. I really love this about LLMs and 3d printing.
IMO the fact that spam detection has devolved into reputation management, rather than working on the content itself, makes me think there's a lot of alpha in an LLM process vs. the most traditional processes we have now.
I was just chatting with a co-worker that wanted to run a LLM locally to classify a bunch of text. He was worried about spending too many tokens though.
I asked him why he didn't just have the LLM build him a python ML library based classifier instead.
The LLMs are great but you can also build supporting tools so that:
- you use fewer tokens
- it's deterministic
- you as the human can also use the tools
- it's faster b/c the LLM isn't "shamboozling" every time you need to do the same task.
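As a sketch of the kind of supporting tool that list describes: a tiny multinomial naive Bayes text classifier, the sort of thing an LLM could write for you once and that then runs deterministically, for free, on every message. This is an illustrative stdlib-only implementation (the `TinyNB` name and the example labels are made up for the sketch); in practice the LLM would more likely hand you a scikit-learn pipeline:

```python
import math
from collections import Counter, defaultdict

class TinyNB:
    """Minimal multinomial naive Bayes over whitespace tokens,
    with Laplace smoothing. Deterministic, fast, token-free."""
    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # smoothing strength
        self.word_counts = defaultdict(Counter) # label -> token counts
        self.class_counts = Counter()           # label -> doc counts
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.class_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total_docs = sum(self.class_counts.values())
        best, best_lp = None, float("-inf")
        for label in self.class_counts:
            # log prior + sum of smoothed log likelihoods
            lp = math.log(self.class_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            for tok in tokens:
                lp += math.log((self.word_counts[label][tok] + self.alpha) / denom)
            if lp > best_lp:
                best, best_lp = label, lp
        return best

clf = TinyNB().fit(
    ["invoice attached please pay", "meeting at noon tomorrow"],
    ["billing", "scheduling"],
)
```

The trade-off is exactly the one above: unlike an LLM, this needs labeled training examples, but once fit it costs nothing per message and gives the same answer every time.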
I use Haiku to classify my mail - it's way overkill, but unlike a classifier it doesn't require training. I receive many dozens of e-mails a day, and it's burned on average ~$3 worth of tokens per month. I'll probably switch to a cheaper model soon, but it's cheap enough that the payoff from spending time optimizing it is a long way off.
Remember when Netflix almost split its brand with "Qwikster"? It was the dying DVD-by-mail service, and the whole debacle did nothing but confuse people.
True, although Netflix knew the DVD business had no permanent future anyway, so they really didn't care. If they'd picked a less silly name like "DVDflix" or something, it wouldn't have become a viral story, but either way it wouldn't have changed NFLX's fortunes.
Everyone has their own hill to die on, that's the thing about personal computing. It's the same if you ask why they can't switch mobile OS. It's some seemingly trivial app or feature that almost nobody cares about.
"Support" just means the manufacturer still releases OS updates. It says absolutely nothing about the quality of those updates: what if those updates simply degrade the situation? Every iPhone user I know says the same without conspiring with each other: on older iPhones, it's better to stop updating to newer major OS releases.
I was really looking for tangible, actionable advice since I'm facing slow adoption in my org. This post seems to hide behind the "secret sauce" that it claims made all of the difference.
Once local models are good enough there will be a $20 cloud provider that can give you more context, parameters, and t/s than you could dream of at home. This is true today with services like Groq.
Not exactly. Those pricing models are based on intermittent usage. If you're an AI engineer using a sophisticated agent flow, the usage is constant and continuous. Over 2 years, that can add up to the price of a dedicated machine at home.
I had 3 projects running today. I hit my Claude Max Pro session limits twice today in about 90 minutes. I'm now keeping it down to 1 project, and I may interrupt it until the evening when I don't need Claude Web. If I could run it passively on my laptop, I would.
Anthropic used to have unlimited subscriptions, then people started running agents 24/7.
Now they have 5 hour buckets of limited use.
Groq most likely stays afloat because they're a bit player - and propped up by VC money.
With a local system I can run it at full blast all the time, nobody can suddenly make it stupid by reallocating resources to training their new model, nobody can censor it or do stealth updates that make it perform worse.
Groq and Cerebras definitely have the t/s, but their hardware is tremendously expensive, even compared to the standard data center GPUs. Worth keeping in mind if we're talking about a $20 subscription.