Learning-focused CTFs are Facing a Restructure

There have been a large number of blog posts in the past month or two talking about the end of the CTF as we know it. I've been working on this post since April, and have trimmed it down to ensure it is additive and not redundant.

Background

About a decade ago, I participated in my first Capture the Flag (CTF) competition and was immediately hooked. The rush of solving a challenge, getting the flag, and inching up the scoreboard scratched an itch I didn't know I had.

CTFs became a cornerstone of my career. Each one would teach me something new, and as I progressed there was always headroom to grow and learn. I wouldn't say I truly entered the competitive scene, but I eventually co-ran a hobby team with mild success before life pulled the members in different directions.

As I took on more leadership-oriented positions, it became not just a great way to learn, but also an excellent way to mentor. I often extended an invitation to new team members, hoping they would discover the same passion and competitive drive I had previously found, which many did. They loved the combination of competition and learning, and their knowledge grew quickly as a result.

Things have changed. Here are some screenshots from recent CTFs that still had their old score graphs available.

Before: Dice CTF 2023

dice 2023

After: Dice CTF 2025

dice 2025

(Note: I included the 2025 score graph instead of 2026 because DiceCTF changed their score graph design)

Before: srdnlen CTF 2025

cini 2025

After: srdnlen CTF 2026

cini 2026

Before: UMass CTF 2025

umass 2025

After: UMass CTF 2026

umass 2026

I'm pretty confident you can guess what's going on here.

The Modern Landscape

Agentic workflows have taken over. Whether it's Claude Code, Codex, or even a harness with a local model, teams and individuals are being forced to incorporate agents in order to remain competitive. On top of this, GPT 5.5 and Opus 4.7 have both proven to be a large step above the models from even a year ago.

This reflects the state of the industry. Anthropic's Mythos marketing push has seen a surge in the use of LLMs for discovering real-world vulnerabilities in significant projects such as Firefox, the Linux kernel, and even cURL. There are a lot of caveats nested in there, and I encourage you to read the details if you haven't already. But regardless of how much is hype versus a truly significant capability (it's probably somewhere in the middle), the increase in the capabilities of models is affecting CTFs significantly.

In some aspects, top CTF competitions have always represented the cutting edge, showcasing newly discovered vulnerabilities with a twist or demonstrating new technologies (AIxCC in 2025 or the Cyber Grand Challenge in 2016). So it's no surprise that the trend is continuing. The winning team being decided by who has the most efficient agentic harness is just a reflection of the times, right?

But what about academic settings where CTFs are often used as a learning tool?

Only two years prior, studies identified LLM assistance as a potential issue in academic settings. But due to the lack of capability in models at the time, the study concluded that the assistance provided was limited and there was no need for structural changes. Yet even then, separate studies in 2024 were showing that LLMs were able to achieve a "higher success rate than an average human participant".

In 2026, leveraging a fully automated or hybrid (human-in-the-loop) workflow has become the most effective strategy for winning competitions. And on certain learning platforms, specifically Hack the Box, this has manifested as a significant decrease in average time-to-solve. This suggests participants in large, competitive learning platforms are already using LLMs for automation. Hack the Box has historically been a very competitive platform, so this isn't too much of a surprise.

However, there are significant implications for CTFs intended explicitly for academic and learning purposes, especially those targeted at a beginner-level audience. Examples include PicoCTF (which is now CyLab Security Academy) or those hosted as part of college courses.

CTFs were such an effective learning tool because they leveraged the psychology of competitiveness to motivate new participants to achieve new heights. That leverage evaporates if someone else is forgoing learning and using LLMs to simply win.

The Information Search Process

Some may be familiar with Kuhlthau's Information Search Process. This is the process most CTF participants went through prior to agentic workflows and today's models.

kuhlthau ISP

Using agentic workflows completely removes every part of the ISP. Build, clone, or vibecode a good framework once, point it at a CTF server, and rocket up the leaderboard. A quick git clone of ctf-agent, some first-time setup, and the flags start pouring in.

This compromises two major components of CTFs as a learning tool: the incentive structure and the repetition necessary to form strong pathways in your brain.

Incentive Structure

Currently, CTFs incentivize quick time-to-solve using extrinsic reinforcers such as extra points for first blood or breaking ties based on who solved first. The bottom line is that current competition design primarily rewards how quickly a challenge was solved rather than how it was solved.
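To make that incentive concrete, here is a minimal sketch of a speed-weighted scoreboard. The point values, the `Solve` record, and the `rank` function are all hypothetical and not taken from any particular CTF platform; the sketch also assumes each team solves a given challenge at most once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Solve:
    team: str
    challenge: str
    solved_at: int  # seconds since the competition started


BASE_POINTS = 100       # hypothetical flat value per challenge
FIRST_BLOOD_BONUS = 10  # hypothetical bonus for the first solver


def rank(solves):
    """Return [(team, score, last_solve_time)] in scoreboard order."""
    # The earliest solve of each challenge earns first blood.
    first_blood = {}
    for s in sorted(solves, key=lambda s: s.solved_at):
        first_blood.setdefault(s.challenge, s.team)

    scores = {}
    last_solve = {}
    for s in solves:
        bonus = FIRST_BLOOD_BONUS if first_blood[s.challenge] == s.team else 0
        scores[s.team] = scores.get(s.team, 0) + BASE_POINTS + bonus
        last_solve[s.team] = max(last_solve.get(s.team, 0), s.solved_at)

    # Higher score first; ties go to the team that finished scoring earlier.
    return sorted(
        ((team, scores[team], last_solve[team]) for team in scores),
        key=lambda row: (-row[1], row[2]),
    )


example = [
    Solve("agent", "pwn1", 300),
    Solve("manual", "web1", 600),
    Solve("agent", "web1", 900),
    Solve("manual", "pwn1", 5400),
]
print(rank(example))
```

In this made-up example both teams solve the same two challenges and each takes one first blood, so they tie on points. The tie-break still places the faster team first, and nothing in the ranking records how either team reached the flag.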

Previously, this incentive effectively motivated both beginners and experienced participants to practice consistently, dabble in software development to write light automation, and learn new methodologies. This process built some of the best security practitioners in the industry.

Yet these same motivators are pushing participants towards full automation. Assuming models continue to improve, participants will be pushed to rely less and less on a human in the loop.

The ability for entry-level or junior participants to struggle through manually solving a challenge and feel like they're actually competing is gone. If you want to compete, you must use a model. This removes one of the most effective long-term motivation tools for beginners.

Repetition

Continuously solving progressively harder challenges gives a clear growth path for a participant. It allows them to develop their workflow, identify patterns, and incrementally build mental models. Even if the participant attempts to understand what is going on by reading an LLM-generated solution, there's a good chance they miss the why. Missing out on this over and over will cause atrophy of an already limited skill set.

Not many articles, papers, or studies have touched on the effects these changes will have on those who are just entering the workforce or who are still learning. Combined, the absence of the information search process, the now-ineffective incentive structure, and the lack of repetition are especially dangerous for a beginner's learning. How do we continue to ensure the next generation of security practitioners is capable and not overly reliant on autonomous systems?

Poll Data

I recently polled a graduate-level binary exploitation course on their experience using LLMs during the course and received 10 responses. The course is split into two competitions: one for "labs" that runs the entire semester and requires write-ups, and the other a 24-hour jeopardy-style competition as a capstone. The two competitions combined make up the final grade.

As far as LLM usage goes, the two most popular use cases were using them for information retrieval (similar to web search) and giving the LLM partial context in order to ask for help understanding a challenge. Only two people out of the ten trended towards full automation in the CTF, and only one used heavy automation for the labs.

As I mentioned, the labs required submitting a write-up to validate the points for the solve. Interestingly, 80% of the students said their write-ups had no LLM-generated content, and in long-form responses many stated that they were in the course to learn and that even if they solved the challenge with LLMs, they did the write-up manually to ensure they understood.

question23

However, during the 24-hour CTF competition LLM usage was much more popular. Half of the surveyed students relied more heavily on LLMs, giving full context and asking for a solve, or building automated workflows which either produced flags or even submitted them automatically.

This shift is likely due to the lack of a write-up requirement, meaning the only requirement for points was to retrieve the flag. Additionally, many students didn't seem to care much about the capstone because many only needed to solve one challenge to get the grade they desired. It is unclear how this skewed the data, as it could be the case that more students stated they did not use LLMs heavily simply because they didn't really participate.

This indicates that people learning via CTFs must now rely more on intrinsic motivation or on other extrinsic reinforcers, rather than on competitive psychology. The effect this will have on learners remains unclear.

Next Steps

The CTF format that so many of us know and love is no more. In order for CTFs to continue to be an effective learning tool, the incentive structures need to be tweaked. The time-to-solve incentive shouldn't go away, but other incentives need to be added to balance it out.

Based on the polling data, I believe a write-up requirement is an effective measure. Yes, LLMs can do write-ups too, but based on the limited data, it seems that learners are less willing to submit LLM-generated write-ups than they are LLM-retrieved flags. While this still omits the information search process if the challenge is solved with LLMs, it encourages more learning than vibecoding.
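One way to picture a write-up requirement is as a gate on points, similar to how the labs in the polled course only validated a solve once its write-up was submitted. The sketch below is an illustration under that assumption; the `Submission` fields, the `scoreboard` function, and the flat 100-point value are hypothetical and do not describe the course's actual grading system.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    team: str
    challenge: str
    solved_at: int                  # seconds since the competition started
    writeup_accepted: bool = False  # set by a reviewer, not by the flag checker


def scoreboard(submissions, points_per_challenge=100):
    """Return team names ordered by validated points, then by solve time."""
    totals = {}
    last_validated = {}
    for sub in submissions:
        totals.setdefault(sub.team, 0)
        last_validated.setdefault(sub.team, 0)
        # A flag alone earns nothing until its write-up has been accepted.
        if sub.writeup_accepted:
            totals[sub.team] += points_per_challenge
            last_validated[sub.team] = max(last_validated[sub.team], sub.solved_at)

    # Time-to-solve stays in play, but only as a tie-breaker between
    # teams with the same number of validated points.
    return sorted(totals, key=lambda team: (-totals[team], last_validated[team]))


entries = [
    Submission("agent", "pwn1", 300),
    Submission("agent", "web1", 900),
    Submission("learner", "pwn1", 5400, writeup_accepted=True),
]
print(scoreboard(entries))
```

With this made-up data, the team that collected two flags quickly but never had a write-up accepted ranks below the team with one slower, documented solve. Speed still matters between teams with equal validated points.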

To be honest though, implementing this idea in traditional, non-academic settings will be cumbersome for organizers. It likely will not prevent more competitions from closing their doors for good.

But humans adapt and evolve; it's what we're best at. So whether it's going back to online forums or blog-based discussions shared via link aggregators, our focus should be maintaining the community, because ultimately that's what CTFs were all about.

Most importantly, if you're a more senior member of the field, find one or two beginners to mentor. If you're a beginner, join communities and seek out someone who is willing to mentor you. As the internet becomes more hostile towards humans, we have to be more diligent about seeking out that connection.