Why LLMs Might Be A/B Testing Your Thinking at Desktop

August 13, 2025

I’ve been diving deeper into prompt engineering (or prompt design) lately. Like most things on the internet, there’s no shortage of resources—free courses, general tips, videos, and more. The early focus on simply crafting prompts makes sense, but over time, the real value will shift toward specificity and personalization of responses driven by user feedback.

Why? Prompt engineering might sound like a lofty skill or role, but in reality, it’s surprisingly accessible. It feels destined to become a baseline expectation rather than a specialized craft. Techniques like chain-of-thought prompting or prompt chaining are easy to understand and apply. In its current form, I don’t think prompt engineering has a future.
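To show how little machinery these techniques need, here is a minimal Python sketch of both. Everything in it is an assumption made for illustration: `fake_llm` is a hypothetical stand-in for a real model call, and `run_chain`, `COT_SUFFIX`, and the templates are made-up names, not any vendor's API.

```python
from typing import Callable

def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; it only echoes the prompt."""
    return f"[model reply to: {prompt}]"

def run_chain(llm: Callable[[str], str], templates: list[str], user_input: str) -> str:
    """Prompt chaining: the output of each step becomes the input of the next."""
    text = user_input
    for template in templates:
        text = llm(template.format(previous=text))
    return text

# Chain-of-thought prompting, at its simplest, is an extra instruction in the prompt.
COT_SUFFIX = " Think through the problem step by step before answering."

templates = [
    "Summarize the following notes: {previous}",
    "List three open questions raised by this summary: {previous}" + COT_SUFFIX,
]

result = run_chain(fake_llm, templates, "meeting notes about a product launch")
print(result)
```

Swapping `fake_llm` for a real client call is the only change needed to try this against an actual model, which is roughly the point: the technique itself is a loop and a string template.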

What’s more compelling to me is the role personalization and feedback will play in shaping large language model (LLM) behavior. Some personalization already exists—for example, LLM memory features that remember details from past chats to save users from repeating themselves. But the real question is: when will LLMs adapt to a user’s unique intelligence profile or preferred reasoning style?

We’re starting to see early user research in this direction. Gemini, for instance, recently asked me, “Which response is more helpful?”, prompting me to choose between two replies that differed mainly in sentiment (positive vs. negative). What’s surprising is that the prompt only appeared on desktop. If the future of LLMs is shaped by the nuances of how we think, not just what we ask, then personalization may be the new frontier of prompt engineering.


Amha Mogus
