
AI thought leadership series 2026

We'll break it all down: AI tech, tactics, and practical steps

Post 1: AI is not new. But what's happening now is terrifyingly different.


Most people think AI just got good recently.
Wrong.
We're not living in the AI arrival. We're living in the aftermath of a breakthrough that happened in 2017.
Here's what changed, and why it matters.

Yep, we've left the realm of fancy autocomplete.

We're watching models:
• Generate full working software
• Understand visual context
• Speak fluently across languages
• Operate tools autonomously

AI's been hiding in plain sight for years, powering everything from credit card fraud detection to GPS.
The shift came when we stopped writing rules and let models learn patterns on their own.

🤖 What we're talking about:
LLMs don't store facts.
They recognise patterns across language, data, and visuals.
And make shockingly accurate predictions.
An LLM doesn't "know."
It predicts, and it's just weirdly good at it.
This leap didn't happen overnight. Let's rewind to see how we got here 👇

📜 Pre-2010: Rules & Statistics
Rigid models, endless if-then logic.
Machine learning meant statistical prediction: think clustering models, decision trees, etc.
Keyword matching ruled the day. Narrow, brittle, and dumb.

💡 2010–2016: Neural Networks & Deep Learning
Models began to learn representations.
AlphaGo beats Lee Sedol using reinforcement learning.
Still narrow. Impractical at scale. But promising.

💥 2017: Transformers
The real breakthrough. An architecture that understands relationships between words.
Attention replaces recurrence, unlocking scalability.
This is the moment everything started shifting.

⚙️ 2018–2020: LLMs, GANs & Diffusion Models
Train once on larger and larger data sets.
Text, images, audio, video through GANs (Generative Adversarial Networks) and Diffusion models.
AI becomes general-purpose.

🌍 2020–2022: Public Breakthrough
ChatGPT (GPT-3.5). AI moves from "cool" to "can't ignore".
Strong zero-shot performance, now without task-specific training.
Reinforcement learning from human feedback becomes standard.
Audio models go multilingual. Image models sharpen with Diffusion.

🚀 2023–now: Emergence & Scale
Models start doing things they weren't trained to do. This is where it gets scary :)
Tool use with MCP. Workflow orchestration. Process automation.
Broad capabilities with even broader application.
We're not just prompting anymore, we're delegating.

Post 2: AI didn't get better gradually. It crossed a threshold.


If you're trying to make sense of why AI progress feels chaotic, here's the uncomfortable truth:
It's not linear.
Large Language Models didn't slowly improve year by year.
They crossed thresholds.

The history of LLMs is defined by step changes, not slow improvement.
For years, models looked... underwhelming.
Then suddenly, sometimes after months of training, they "get it".

This phenomenon is called grokking.

For a long time, a model behaves like this:
• It memorises examples
• Outputs are brittle
• Performance feels inconsistent or random

🧐 Then the model stops memorising. It starts understanding structure.
Not facts. But relationships.

Grokking is one of the clearest examples of why AI progress feels unpredictable. It's a sharp, non-linear jump in capability. A model can appear stuck or mediocre, then abruptly transition from surface-level pattern matching to deep structural understanding. It moves from overfitting to generalisation and its capability spikes almost vertically. Not because it learned new rules, but because it learned the relationships.

In early 2022, researchers formally described this behaviour in the paper "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets". They observed models that showed no meaningful improvement on unseen data, until suddenly they did. Capability didn't improve the way you'd expect; it jumped.
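
To make that tangible, here's a minimal sketch of the paper's modular-arithmetic setup (a + b mod 97) in PyTorch. Note the assumptions: the paper trained a small transformer, this toy uses a tiny MLP, and the hyperparameters are illustrative, not the paper's; whether and when the jump appears is sensitive to weight decay and the train/val split. The pattern to watch for: training accuracy saturates early, validation accuracy sits near chance for a long time, then jumps.

```python
# Toy grokking setup: learn a + b (mod 97) from half of the operation table.
# Illustrative sketch only; the original paper used a small transformer.
import torch
import torch.nn as nn

P = 97
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))  # all (a, b)
labels = (pairs[:, 0] + pairs[:, 1]) % P

perm = torch.randperm(len(pairs))                  # random 50/50 table split
train_idx, val_idx = perm[: len(perm) // 2], perm[len(perm) // 2:]

embed = nn.Embedding(P, 64)
mlp = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, P))
opt = torch.optim.AdamW(
    list(embed.parameters()) + list(mlp.parameters()),
    lr=1e-3, weight_decay=1.0,                     # regularisation matters here
)

def accuracy(idx: torch.Tensor) -> float:
    with torch.no_grad():
        x = embed(pairs[idx]).flatten(1)           # concat the two embeddings
        return (mlp(x).argmax(-1) == labels[idx]).float().mean().item()

for step in range(50_000):                         # far past the overfitting point
    x = embed(pairs[train_idx]).flatten(1)
    loss = nn.functional.cross_entropy(mlp(x), labels[train_idx])
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 1_000 == 0:
        # Watch for: train accuracy ~1.0 early, val accuracy flat... then a jump.
        print(step, f"train={accuracy(train_idx):.2f}", f"val={accuracy(val_idx):.2f}")
```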

🚨 This matters more than most organisations realise.
Because AI pilots don't behave like traditional software.
• Early underperformance doesn't always predict final value
• Breakthroughs often arrive without warning
• Safety, governance, and control frameworks must assume future behaviours, not current ones; this is essential before moving to production.

This is why "wait and see" is a risky strategy.
And why treating AI as just another tool upgrade misses the point.

AI doesn't evolve like people.
Thatโ€™s what makes it powerful.
And risky.
And deeply fascinating.

Post 3: Why prompt engineering won't save your AI strategy


Let's be honest.

Most "AI failures" aren't model failures.
They're use case and data governance failures.

If AI keeps hallucinating, the solution is not: "Write a better prompt."
It is: "Give it clear data to work with and set boundaries."

Prompt = how the model should think
Context = what the model can think about
Model = how well it can think

Prompting shapes behaviour; context determines truth.
What prompting does really well:
• Behaviour: "You are a senior risk consultant advising the CEO."
• Structure: "Return JSON with fields: Risk, Impact, Mitigation."
• Tasks: "First classify, then generate SQL, then explain."

And where it fails:
• Missing information: "Assume the client uses SAP S/4HANA."
• Forcing accuracy: "Only answer if you are 100% sure."
• Domain expertise: "Here's 20 pages of IFRS 17..."

Prompting controls how the model behaves. Not what it knows.
Context is the lever for factual grounding, not for instructions or behaviour.

What ๐—ฐ๐—ผ๐—ป๐˜๐—ฒ๐˜…๐˜ does really well:
โ€ข Facts: โ€œHere are 6 months of incident logsโ€ฆโ€
โ€ข Structure: Schemas, process models, policies.
โ€ข Tool output: SQL results, APIs, metrics.

Context is what turns LLMs from generic into specific, performing in a known environment.
From toy to tool. From chatbot to decision engine. From demo to production.

How to get that context?

RAG (Retrieval Augmented Generation) pulls in external documents at runtime and injects them into the model's context. It's ideal for large or fast-changing knowledge, like searching policy documents or answering questions over recent research. But if retrieval fails, the model confidently makes things up.
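
A minimal RAG sketch, to make the mechanics concrete. The "embedding", the documents, and the LLM call are toy placeholders (not a specific library); real systems use a trained embedding model and a vector database:

```python
# Minimal RAG sketch: retrieve the most relevant snippets at runtime,
# then inject them into the prompt. All names and data are illustrative.
import math
from collections import Counter

DOCS = [
    "Policy 4.2: incidents must be triaged within 15 minutes.",
    "Policy 7.1: all vendor access requires MFA.",
    "Runbook: restart the ingest service before escalating.",
]

def embed(text: str) -> Counter:
    return Counter(text.lower().split())           # toy bag-of-words "embedding"

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, k: int = 2) -> list[str]:
    q = embed(question)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def call_llm(prompt: str) -> str:
    return f"[model answer grounded in {len(prompt)} chars of context]"  # placeholder

def answer(question: str) -> str:
    context = "\n".join(retrieve(question))        # inject retrieved docs at runtime
    return call_llm(
        "Answer ONLY from the context below; say so if the answer is not there.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

print(answer("How fast must incidents be triaged?"))
```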

KAG (Knowledge Augmented Generation) integrates structured knowledge graphs into reasoning. Instead of guessing from text, it reasons over known entities and relationships. Perfect for compliance, diagnostics, or domains where logic beats language.

๐—–๐—”๐—š (Cache Augmented Generation) skips retrieval entirely and injects data straight into the context. Itโ€™s fast, cheap, and reliable, but only works when your knowledge is stable and fits in the context window, good answers for static data like manuals or internal policies.

๐—›๐˜†๐—ฏ๐—ฟ๐—ถ๐—ฑ systems are common with CAG for core rules, RAG for real-time facts, and KAG for reasoning over domain logic.

AI projects fail from over-investment in prompts and under-investment in data pipelines. And it's hybrid setups that deliver, with precision and agility where needed.

Post 4: From bigger models to better behaviour


Training made AI smart. Tuning made it useful.

We're going through a change in model progression. It used to be about throwing ever-larger datasets at models. Now improvements come from teaching them how to behave better, and when to think harder.

This is a shift not just because we've run out of training data, but because people's expectations changed. We've gone beyond summarising or document creation. It's about automating what we do, from concept to implementation, interaction to autonomy.

In the previous posts, we covered:
• Transformers as the start of AI adoption
• Grokking, the sudden jump in capability
• And the value of context and prompting

Now, how models start to bring business value 👇

🧠 Training vs Post-Training
Pre-training creates capability: a generalist, reasonably good at everything... and occasionally wrong with confidence.
Post-training is where models learn how to follow instructions, when to refuse, and what "good" looks like.

Reinforcement Learning (RL) from human feedback, the original post-training method, is what made ChatGPT helpful. Modern AI combines:

⚖️ Direct Preference Optimisation
• Show pairs of answers to the same prompt
• Humans (or rules) select the better one
• The model is trained to raise the probability of the preferred output
RL is slow & expensive; DPO is simple & fast.
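
For the technically curious, the core of the DPO objective fits in a few lines of PyTorch. This is a sketch with made-up variable names, assuming you already have summed per-answer log-probabilities under the trained policy and a frozen reference model:

```python
# Core of the DPO objective: reward the policy for preferring the chosen
# answer over the rejected one, relative to a frozen reference model.
import torch
import torch.nn.functional as F

def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    # How much more the policy prefers each answer than the reference does.
    chosen_margin = pol_chosen - ref_chosen
    rejected_margin = pol_rejected - ref_rejected
    # Push the chosen answer above the rejected one, scaled by beta.
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()

# Toy call with made-up log-probs for two preference pairs:
loss = dpo_loss(torch.tensor([-5.0, -3.0]), torch.tensor([-6.0, -2.5]),
                torch.tensor([-5.5, -3.2]), torch.tensor([-5.8, -2.7]))
print(loss)
```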

📊 Group Relative Policy Optimisation
Pairwise comparisons don't scale when:
• Answers are complex
• Multiple dimensions matter
• You care about relative quality
So not A vs B, but ranking within groups. Good for optimising trajectories.
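
The core trick is a group-relative advantage: sample several answers to the same prompt, score them, and judge each one against its siblings rather than in isolation. A minimal sketch (the reward scores are illustrative):

```python
# Group-relative advantages, the heart of the GRPO idea.
from statistics import mean, stdev

def group_advantages(rewards: list[float]) -> list[float]:
    mu, sigma = mean(rewards), stdev(rewards)
    # Normalise within the group: each answer is judged relative to the others.
    return [(r - mu) / (sigma + 1e-6) for r in rewards]

# Four sampled answers to one prompt, scored by a grader (scores illustrative).
# Positive advantage -> reinforce that answer; negative -> discourage it.
print(group_advantages([0.2, 0.9, 0.4, 0.7]))
```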

🧪 Graders & Verifiers
• Graders score against criteria: "Did it have a source?", "Does the code compile?"
• Verifiers check correctness after generation: reasoning steps, constraint checks.
Graders for quality, verifiers for correctness, and smarter routing for step-based validation. All without touching the base model.
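
A minimal sketch of the pattern; the criteria are illustrative, not a real rubric:

```python
# Grader/verifier sketch. Graders score soft criteria; verifiers run hard
# pass/fail checks after generation. The checks below are illustrative.

def grade(answer: str) -> float:
    criteria = [
        "source:" in answer.lower(),   # "Did it have a source?"
        len(answer.split()) < 200,     # "Is it concise?"
    ]
    return sum(criteria) / len(criteria)

def verify_python(code: str) -> bool:
    try:                               # "Does the code compile?"
        compile(code, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False

print(grade("Source: incident log. Restart the ingest service."))  # 1.0
print(verify_python("def f(:"))                                     # False
```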

⚡ Inference-Time Scaling
Thinking harder only when needed: generating many answers, verifying, and selecting. It avoids retraining, is easy to govern, and only hard cases incur the extra cost.
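
In its simplest form, that's best-of-N sampling. A sketch, with the sampler and scorer as placeholders for a model call and a grader/verifier:

```python
# Best-of-N sketch: spend extra compute only on hard cases.
import random

def sample_answer(prompt: str) -> str:
    return f"candidate-{random.randint(0, 999)}"   # stand-in for an LLM call

def score(answer: str) -> float:
    return random.random()                          # stand-in for a verifier/grader

def best_of_n(prompt: str, hard: bool) -> str:
    n = 8 if hard else 1                            # think harder only when needed
    return max((sample_answer(prompt) for _ in range(n)), key=score)

print(best_of_n("Summarise last night's incidents", hard=True))
```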

There you go: four methods to make modern models more trustworthy.

🚨 The uncomfortable lesson
• Fine-tuning doesn't fix unclear requirements
• Most AI issues are measurement failures, not model issues
• If humans can't agree what "good" looks like, neither can the model

We're not training intelligence anymore. We're creating incentive systems.
Not what the model can do, but what it is rewarded to do.

🎯 Steps
1. Define what "good" is (explicitly)
2. Encode it in graders & evaluations
3. Route to cheap models (limit reasoning)
4. Freeze a regression set (so nothing silently breaks)
5. Train as the last step

AI doesn't optimise for:
• truth
• usefulness
• safety
• customer happiness

It optimises for what you score, rank, reward, or select!

Post 5: The moment we realised AI needed hands


It gave great answers, but nothing really happened...

Well, that's how LLMs started. Now it's ChatGPT Agent & Operator, Claude Cowork, Copilot Studio, or any AI workflow orchestration tool. LLMs have gained the ability to act (to do instead of just say things).

🤖 AI Agents
Where traditional LLMs answer questions, an AI agent plans, executes, checks, and adapts. Agentic AI? Digital workers that behave like teammates, not just another tool.

This requires:
• Reasoning: Break complex goals into small tasks
• Memory: Track past steps and results
• Tool Use: Call APIs, trigger systems, act on data
• Goal Orientation: Stay focused, avoid distractions

Agents plan paths, execute steps, check their own results, and adapt. This represents a shift from a tool you talk to into a system that works with you.
However, effectiveness depends on a valid use case and solid design (incl. the data it can handle, clear constraints, etc.).
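
Under the hood, most agents are a loop around those four capabilities. A bare-bones sketch, with `llm` and `TOOLS` as placeholders for a model call and real integrations:

```python
# Bare-bones agent loop: reason about the next step, act with a tool,
# remember the result, repeat until done or out of budget.

def llm(prompt: str) -> str:
    return "search: latest failed logins"             # stand-in for a model call

TOOLS = {"search": lambda q: f"3 results for '{q}'"}  # Tool use
memory: list[str] = []                                # Memory

def run_agent(goal: str, max_steps: int = 5) -> str:
    for _ in range(max_steps):                        # Goal orientation: bounded
        # Reasoning: ask the model for the next small step, given history.
        step = llm(f"Goal: {goal}\nSo far: {memory}\nNext step?")
        if step.startswith("done"):
            return step
        tool, _, arg = step.partition(": ")
        result = TOOLS.get(tool, lambda a: "unknown tool")(arg)
        memory.append(f"{step} -> {result}")          # adapt based on results
    return "stopped: step budget reached"

print(run_agent("Investigate suspicious logins"))
```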

If agentic AI is the brain that decides to act, tool use is the hands that perform the action. To be truly useful in a business context, AI must be able to interact with the real world by querying databases, triggering workflows, or calling APIs.

📡 MCP
The Model Context Protocol was introduced by Anthropic in 2024 and has quickly become the standard for connecting AI to tools and data. It's an open protocol that uses JSON-RPC messages for communication between tools, data sources, and systems.

Before it, every AI + tool integration was custom. Now MCP tells the AI what tools exist, what they do, and when and how to use them.

MCP is infrastructure (not hype): a route to model-agnostic AI integration. The MCP server acts as the tool provider. The MCP client is the LLM host that:
• Discovers available tools dynamically
• Understands inputs/outputs via schemas
• Calls tools without hard-coded logic
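
For flavour, exposing a tool via an MCP server looks roughly like this with the official Python SDK's FastMCP helper. The tool itself is made up; treat the API details as an assumption and check the SDK docs for the current interface:

```python
# Minimal MCP server sketch using the official Python SDK's FastMCP helper.
# The tool is illustrative; a connected MCP client discovers it via its schema.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("incident-tools")

@mcp.tool()
def open_incidents(severity: str) -> str:
    """Count open incidents at the given severity."""
    return f"2 open incidents at severity '{severity}'"  # placeholder logic

if __name__ == "__main__":
    mcp.run()  # serves over stdio so an MCP host can connect and list tools
```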

โŒ ๐—•๐—ฒ๐—ณ๐—ผ๐—ฟ๐—ฒ ๐— ๐—–๐—ฃ
โ€ข Every LLM had its own tool API
โ€ข Strict coupling between model & tools
โ€ข Re-implementation of the same integration over and over
โ€ข Difficult to swap LLMsโ€ข And brittle if anything changed

✅ After MCP
• Build a connector once
• Swap LLMs without rewriting
• Integration that's easy to configure and change
• Agents become maintainable

At Intellifold, we run our own MCP server to create a seamless bridge between our platform, the client's data model, and the LLM. This makes sure the AI isn't "guessing" or returning generic answers. Instead, it works directly with the data it has access to.

The LLM reads the context, transforms business questions into queries, MCP handles execution, and the LLM returns structured responses. From tough business questions and changing dashboards... to creating new solutions and improvement recommendations.

All made possible through smart design, an understanding of system data, and years of improving business processes.
