Blog | Roger Oriol

My name is Roger Oriol, and I am a Software Architect based in Barcelona, Spain. I hold an MSc in Big Data Management, Technologies and Analytics. This blog is my vehicle to share and discuss topics on web development, data architecture, software architecture and much more.

Recent posts

27 Jul 2026

Build a Basic AI Agent From Scratch: Security III

Previous parts of Build a Basic AI Agent From Scratch: Basic Agent, Tools, Long Task Planning, Human in the Loop & Security, Security II. You can find and clone this code in this blog series' GitHub repo. In the previous part we started closing the gaps left open by human-in-t...

[[ Read more ]] · 106 minute read
21 Jul 2026

🔗 HuggingFace Security Incident

Last week, HuggingFace suffered an AI-assisted cyberattack. HuggingFace's experience defending itself is pretty surprising, and damning for anyone who thinks cybersecurity-capable models need to be restricted. From HuggingFace:

the analysis requires submitting large volumes of real attack commands, exploit payloads, and C2 artifacts, and these requests were blocked by the providers' safety guardrails, which cannot distinguish an incident responder from an attacker. We ran the forensic analysis instead on GLM 5.2, an open-weight model, on our own infrastructure. This had a second benefit: no attacker data, and none of the credentials it referenced, left our environment.

HuggingFace was left unable to analyze the attack using frontier models because those refused to work on anything related to cybersecurity, even if it's for defense!

Instead, the Chinese open-weight model was happy to help, and could be deployed locally so the sensitive security data didn't have to go to Anthropic's or OpenAI's servers.

My takeaway from this incident is that blocking frontier models' capabilities doesn't make the world safer from cyberthreats; it does the opposite. Attackers will find workarounds to use the models, or just use other models without the restrictions. Defenders will have to scramble to find a less capable model that will actually help them, because their usual model refuses.

Plus, models with Mythos- and Sol-level cyberattack capabilities are now freely available as open models in Kimi K3 and Qwen 3.8. Right now, it seems pretty pointless for Anthropic and OpenAI to cripple their models this way, and doing so will likely leave the US trailing in cybersecurity capabilities.

As a bonus to the story, it seems that the HuggingFace attacker was actually OpenAI, which published its own security incident report:

The models identified and chained vulnerabilities across OpenAI’s research environment and Hugging Face’s production infrastructure to obtain test solutions directly from Hugging Face’s production database. All evidence suggests that the models were hyperfocused on finding a solution for ExploitGym, going to extreme lengths to achieve a rather narrow testing goal. After gaining Internet access, the models inferred that Hugging Face potentially hosted models, datasets and solutions for ExploitGym. Knowing this, the model searched for and successfully found ways to gain access to secret information that it could use to cheat the evaluation. In one example, the model chained together multiple attack vectors, including using stolen credentials and zero-day vulnerabilities to find a remote code execution path on the Hugging Face servers. OpenAI’s security team discovered this anomalous activity internally.

If I read this correctly, OpenAI's models are so unaligned that they will go to the extreme lengths of hacking both their internal sandbox and the HuggingFace servers just to cheat on an eval and find the answer online.
[[ Visit external link ]]
21 Jul 2026

Build a Basic AI Agent From Scratch: Security II

Previous parts of Build a Basic AI Agent From Scratch: Basic Agent, Tools, Long Task Planning, Human in the Loop & Security. You can find and clone this code in this blog series' GitHub repo. In the previous part we gave our agent a basic safety model: permission modes, an ac...

[[ Read more ]] · 49 minute read
16 Jun 2026

Build A Basic AI Agent From Scratch: Human in the Loop & Security

Previous parts of Build a Basic AI Agent From Scratch: Basic Agent, Tools, Long Task Planning. You can find and clone this code in this blog series' GitHub repo. In the previous part of the Build A Basic AI Agent From Scratch series, we gave our agent the ability to plan and wor...

[[ Read more ]] · 40 minute read
13 Jun 2026

🔗 [Link] Access Fable 5 and Mythos 5 suspended

Just 3 days after the release of Claude Fable 5 and Mythos 5, the US Commerce Department forced Anthropic to suspend access to them. According to the government, a third party reported a method of jailbreaking Anthropic's safeguards. The Commerce Department asked Anthropic to suspend access to Fable and Mythos for all foreign nationals, including those based in the United States and those working at Anthropic.

Anthropic probably doesn't currently have a way for people to prove their citizenship, so it cut access to those models completely, for everyone.

According to Anthropic, the jailbreak is narrow and non-universal. Anthropic also believes that the jailbroken capability is already widely available, and can be jailbroken, in other public models. We will see in the coming hours whether the government agrees.

It's not clear whether the jailbroken safeguards are the model's regular safeguards or the extra safeguards in Fable. Since both models are affected by this directive, and the issue reported to Anthropic consisted just of reading a codebase and finding software flaws, I believe this jailbreak is not in the extra safeguards in Fable but in the regular safeguards present in all models.
[[ Visit external link ]]
09 Jun 2026

🔗 [Link] Claude Fable 5 and Mythos 5

Anthropic has announced its most capable model, named Fable 5. This model was previously hidden from the public and made available only to a select number of companies under the name Claude Mythos Preview. The reported reason for hiding it was that it was "too powerful" to be made available to the general public, and therefore to bad actors.

Apparently, Anthropic is now sufficiently confident in Fable's safeguards to release it to the general public.

Two models have been released. The first is Mythos 5, the same model that was previously released only to a select few, now with slightly better benchmark results but still not publicly available. The second is Fable 5, which is Mythos 5 plus a safeguard. The two share the exact same benchmark results, so they don't look like different or finetuned models. The safeguard appears to be a classifier: if it detects a query on cybersecurity, biology or chemistry, or an attempt to distill the model, it automatically degrades to Opus 4.8.

Still, unless you have a big budget, you will probably not be playing around with this model a lot. Opus was already the big, expensive model from Anthropic, and this one is even bigger and more expensive. The price is $10 per million input tokens and $50 per million output tokens. Opus was $5/$25, so this is double the price. When Mythos Preview was first announced, the price was even steeper, at $25/$125 per million tokens, so it looks like Anthropic has found a way to serve this model more cheaply. If you have a Pro or Max subscription, you will be able to use these models at no cost until June 22; after that, they will cost usage credits.
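To make the price gap concrete, here is a small sketch that applies the per-million-token prices quoted above to a made-up workload. The workload size (2 million input tokens and 500 thousand output tokens) and the helper name `cost_usd` are my own assumptions, purely for illustration.

```python
# Prices in USD per million tokens, as quoted in this post: (input, output).
PRICES = {
    "Fable 5": (10.0, 50.0),
    "Opus": (5.0, 25.0),
    "Mythos Preview (at announcement)": (25.0, 125.0),
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of a workload for the given model."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical workload, not taken from any real usage data.
workload = {"input_tokens": 2_000_000, "output_tokens": 500_000}

for model in PRICES:
    print(f"{model}: ${cost_usd(model, **workload):.2f}")
```

For this workload the bill comes to $45.00 on Fable 5, $22.50 on Opus and $112.50 at the original Mythos Preview price. Since both the input and output prices double, the total doubles no matter how the workload splits between input and output tokens.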

Another interesting point of the presentation is that Anthropic will require a 30-day retention for all traffic to Fable, Mythos and future models, for all platforms where those models are deployed. According to Anthropic, this is to help them defend against attacks and won't be used to finetune models.
[[ Visit external link ]]
08 Jun 2026

Build A Basic AI Agent From Scratch: Long Task Planning

In the previous part of the Build A Basic AI Agent From Scratch series, we added the essential tools to our agent to allow it to work autonomously for us. We gave it the ability to find files, read and write files, run bash commands and get content from the web. We got a very cap...

[[ Read more ]] · 52 minute read
31 May 2026

Build A Basic AI Agent From Scratch: Tools

In the previous part of the Build A Basic AI Agent From Scratch series, we built the most basic AI agent harness possible. It was just a connection to a model, a way to take user input, a store of context of the conversation and a loop that kept the agent running. Of course, this...

[[ Read more ]] · 58 minute read
10 May 2026

Build a Basic AI Agent From Scratch

2026 is without a doubt the year of AI agents. Since the release of Claude Code, the power of these AI agents has become undeniable. Claude Code, Codex and OpenCode are a must for many developers nowadays. OpenClaw and Hermes are becoming many people's AI assistants. Agents are also...

[[ Read more ]] · 8 minute read
07 Aug 2025

🔗 [Link] GPT-5

OpenAI has finally released its GPT-5 model, and as we were already expecting, it's a hybrid reasoning model. Now the model itself chooses how much to think about each task, and you can force the reasoning effort as well. This probably means the end of the o series of reasoning models from OpenAI, as the regular language models and the reasoning models will now be unified.

Of course, the benchmarks look good but saturated. What stands out to me is that they announced a 74.9 score on SWE-bench (with high reasoning effort), which is just a tad over the score from Claude Opus 4.1 just announced this very same week (74.5).

The GPT-5 iteration brings 4 new models: GPT-5, GPT-5-mini, GPT-5-nano and GPT-5 Chat. Free users will be allowed to use GPT-5, although when they hit the maximum quota, they will fall back to GPT-5-mini.

GPT-5 lets you set the reasoning effort using the "reasoning.effort" parameter, although you can also nudge it by telling the model to "Think hard about this". These new models introduce a new reasoning tier called "minimal", which produces as few reasoning tokens as possible before answering. The output length can also be tuned with the "verbosity" parameter, which didn't exist for past models. This parameter can be set to "high", "medium" or "low".
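As a rough sketch of how these two knobs could fit into a request, the snippet below only builds and validates the parameters, without calling the API. The helper name `build_request` is hypothetical, and I am assuming the Responses API layout, where the effort sits under `reasoning` and the verbosity under `text`. Check the official API reference before relying on the exact nesting.

```python
# Values described in this post. "minimal" is the new reasoning tier.
ALLOWED_EFFORT = {"minimal", "low", "medium", "high"}
ALLOWED_VERBOSITY = {"low", "medium", "high"}


def build_request(prompt: str, effort: str = "medium", verbosity: str = "medium") -> dict:
    """Build request parameters for a GPT-5 call (hypothetical helper).

    Assumed layout: effort under "reasoning", verbosity under "text".
    """
    if effort not in ALLOWED_EFFORT:
        raise ValueError(f"unknown reasoning effort: {effort!r}")
    if verbosity not in ALLOWED_VERBOSITY:
        raise ValueError(f"unknown verbosity: {verbosity!r}")
    return {
        "model": "gpt-5",
        "input": prompt,
        "reasoning": {"effort": effort},
        "text": {"verbosity": verbosity},
    }


# A quick, terse answer: as little reasoning as possible, short output.
params = build_request("What is the capital of Spain?", effort="minimal", verbosity="low")
print(params)
```

With the official Python SDK, the resulting dictionary would presumably be passed along as `client.responses.create(**params)`. I kept the network call out of the sketch so it runs without an API key.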

The new models also bring some quality-of-life improvements for tool calling:
- Tool choice: While the models can choose to call zero, one or multiple tools, you can now set "tool_choice" to "required" to force the invocation of at least one tool. You can also set a specific function that must be called by passing {"type": "function", "name": "function name"} to the "tool_choice" parameter. Finally, in "tool_choice" you can also specify a list of allowed tools from the list of tools provided to the model: {"type": "allowed_tools", "mode": "auto", "tools": []}.
- Tool preambles: New feature that makes the models explain the rationale behind why they are invoking a function. This provides transparency and a better understanding of the model's process. By default, this feature is not enabled. To enable it, you have to include a system message like "Before you call a tool, explain why you are calling it.".
- Custom tools: This feature lets you define tools that accept unstructured, free-form text as input, which frees the model from having to build a structured JSON object to call the tool. This might improve the ability of the model to call these tools. This can be even more powerful when paired with Context-Free Grammar.
- Context-Free Grammar: This feature lets you set grammar rules for the free-form text, constraining it to a given syntax. You can define these rules using Lark or a regular expression.
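The tool-calling options above could be put together roughly like this. It is only a sketch that builds the payloads as plain dictionaries, with no API call. The helper names and the tools `get_weather`, `search_docs` and `set_deadline` are hypothetical. The `"required"` value and the shape of the custom tool (`"type": "custom"` with a `"format"` entry for the grammar) are my assumptions about the API, so verify them against the official reference.

```python
import re


def force_function(name: str) -> dict:
    """tool_choice value that forces a call to one specific function."""
    return {"type": "function", "name": name}


def allow_only(names: list[str], mode: str = "auto") -> dict:
    """tool_choice value restricting the model to a subset of the provided tools."""
    return {
        "type": "allowed_tools",
        "mode": mode,
        "tools": [{"type": "function", "name": n} for n in names],
    }


# Assumed string value that forces at least one tool call.
FORCE_ANY_TOOL = "required"

# A custom tool taking free-form text, constrained by a regular expression.
# Assumed shape, see the lead-in. The pattern accepts dates like 2025-08-07.
DATE_PATTERN = r"[0-9]{4}-[0-9]{2}-[0-9]{2}"
date_tool = {
    "type": "custom",
    "name": "set_deadline",
    "description": "Sets a deadline. The input must be a date in YYYY-MM-DD form.",
    "format": {"type": "grammar", "syntax": "regex", "definition": DATE_PATTERN},
}


def matches_grammar(text: str) -> bool:
    """Local check of what the regex grammar would accept."""
    return re.fullmatch(DATE_PATTERN, text) is not None


print(force_function("get_weather"))
print(allow_only(["get_weather", "search_docs"]))
print(matches_grammar("2025-08-07"), matches_grammar("7 Aug 2025"))
```

The local `matches_grammar` check is only there to show what kind of input the grammar is meant to allow. In a real call the constraint would be enforced on the model side.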
The GPT-5 models are now available both in ChatGPT and in the OpenAI API. Give them a try!
[[ Visit external link ]]
