# Did we just hit AGI?
> [!NOTE]
> All opinions expressed in this post are my personal views and do not represent the opinions of my employer. I do not work directly with AI models, and my observations are drawn entirely from using AI products as a consumer.
For years, the definition of Artificial General Intelligence (AGI) has been notoriously slippery. Historically, it has been less of a fixed milestone and more of a moving goalpost. This is often called the "[AI Effect](https://en.wikipedia.org/wiki/AI_effect "null")," where as soon as a machine masters a task, that task is no longer considered "true intelligence." We've moved from the Turing Test to "passing the Bar exam" to "economically valuable work," yet the term remains poorly defined and highly subjective.
I don’t think AGI is going to dramatically announce itself; more likely, we will fly right past it and barely anyone will bat an eyelid. Just look at the [passing of the Turing Test](https://www.popsci.com/technology/chatgpt-turing-test/ "null") for a recent example of exactly this happening.
Because of this lack of a rigorous benchmark, the debate has often focused on what AI _can’t_ do rather than the intelligence it displays. AI models cannot entirely replicate every action of a human, but this is often due to structural and computational constraints. For example, they lack "perfect" infinite memory, they don’t have 24/7 autonomous agency, and they don’t have physical embodiment.
But if we take those constraints into account and look at the cognitive output, I believe we’ve reached the tipping point. When I compare the reasoning, synthesis, and intuition of frontier models like Gemini to the humans I encounter daily, I struggle to find any area where the models fall short. In many ways they are significantly better.
I’m calling it: **we have AGI**.
## Anecdotal AGI
I want to highlight three examples where I have used the latest foundation models (Gemini 3 Pro and Flash) and thought, “how is this not AGI?”.
These aren’t benchmark questions, and they aren’t pushing the models to their limit, but that’s sort of the point. Before the latest round of frontier models, these were tasks where AI results would be insufficient and I would need to turn to a human to complete or augment the answer (even if that human was me).
### 1. The Proactive Personal Assistant
Recently, I asked Gemini to create a task about calling my home insurance company. I then asked it to move that task to tomorrow. Simple enough. But then I asked it to look through my emails to find out who my insurer actually was, as I couldn't remember, find the company’s contact number, and add that to my task.
Gemini scanned through my emails, identified the last renewal notice, found the name of the company, and pulled the contact number. Then, **of its own volition**, it found my specific policy number and added that to the task as well. That leap from "find a phone number" to "he’ll probably need his policy number to make this call a success" is a hallmark of human-level proactive reasoning. That is a personal assistant! That’s the dream!
### 2. Navigating the Grey Areas
Intelligence isn't just about processing data; it’s about understanding the human condition. I recently faced a subtle, nuanced ethical dilemma. I considered discussing it with some people in a similar situation, but I didn’t want to burden them with an ethical quandary they aren't yet aware of.
So, I turned to Gemini. I outlined the issue with all the details to hand, and asked it to provide me with an absolutist position, a more relaxed opinion, and then a balanced opinion with a recommendation. I was really surprised by its ability to grasp the subtlety of the situation. It engaged with the nuance, weighed the potential outcomes, and helped me resolve the issue in a way I could not imagine being improved upon. For an AI model to display ethics and nuance like that unlocks use cases I hadn’t previously considered.
### 3. From "Yes-Man" to Peer
In earlier iterations, LLMs were notorious for being too sycophantic. If you suggested a bad architectural path for a software project, the model would usually try its best to make your bad idea work.
The latest frontier models have broken that habit. I’ve been using them to brainstorm project architectures, and sometimes I’ll suggest approaches that I suspect are overkill, or that I’m not sure are appropriate because I don’t have enough knowledge in that area. In the past I found that AI models would often incorporate all of your suggestions simply because you mentioned them, even if they weren't that appropriate. However, I’ve found the latest models are comfortable calling out approaches that won’t work, or that are unnecessarily complex.
When a system stops just echoing your input and starts acting as a critical collaborator, it has moved beyond a "stochastic parrot." It is thinking.
## Conclusion
Whether the examples above count as “Artificial General Intelligence” will depend a lot on your definition of AGI. But given the constraints they operate within, I think the latest round of frontier models have cognitive abilities that surpass those of any human I have ever met. For me, that counts as AGI.