<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>The ArchTenet Engineering Playbook Blog</title>
        <link>https://archtenet.dev/blog</link>
        <description>The ArchTenet Engineering Playbook Blog</description>
        <lastBuildDate>Wed, 08 Apr 2026 00:00:00 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <language>en</language>
        <item>
            <title><![CDATA[The Token Tax — How Naive M2M Authentication Quietly Drains Your Cloud Budget]]></title>
            <link>https://archtenet.dev/blog/token-tax-m2m-auth</link>
            <guid>https://archtenet.dev/blog/token-tax-m2m-auth</guid>
            <pubDate>Wed, 08 Apr 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[How a naive M2M authentication pattern silently multiplies cloud identity costs — and the lightweight token proxy architecture that cuts operations by 99%.]]></description>
            <content:encoded><![CDATA[<p>The cloud stopped being just infrastructure a long time ago. It's an economic model where every engineering decision has a direct impact on business costs. And that's exactly where the line between "works fine" and "designed well" is drawn.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="managed-services-what-youre-actually-buying">Managed Services: What You're Actually Buying<a href="https://archtenet.dev/blog/token-tax-m2m-auth#managed-services-what-youre-actually-buying" class="hash-link" aria-label="Direct link to Managed Services: What You're Actually Buying" title="Direct link to Managed Services: What You're Actually Buying" translate="no">​</a></h2>
<p>Cloud platforms take an enormous amount off your team's plate: hardware management, baseline security, fault tolerance, scaling. Managed Kubernetes — Amazon EKS, Google Kubernetes Engine, or Azure Kubernetes Service — lets you spin up a production-ready cluster in a matter of hours. But here's the thing: you haven't eliminated complexity. You've moved it up a level.</p>
<p>The hybrid model — taking managed Kubernetes and deploying your own microservices inside it — is a reasonable trade-off. You don't own servers, you get multi-region deployment, and you pay only for what you use. But if architectural thinking doesn't show up at that higher level, costs start growing faster than load does.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="case-study-identity-in-a-microservice-architecture">Case Study: Identity in a Microservice Architecture<a href="https://archtenet.dev/blog/token-tax-m2m-auth#case-study-identity-in-a-microservice-architecture" class="hash-link" aria-label="Direct link to Case Study: Identity in a Microservice Architecture" title="Direct link to Case Study: Identity in a Microservice Architecture" translate="no">​</a></h2>
<p>One of the most instructive examples is identity management. Using managed identity providers — Amazon Cognito, Auth0, Microsoft Entra External ID, or Okta — is an entirely rational choice. Building your own enterprise-grade authentication and authorization system from scratch means years of investment, continuous security audits, and a high risk of getting it wrong. Delegating that responsibility to the cloud reduces risk and accelerates time-to-market.</p>
<p>The problem doesn't come from choosing the service. It comes from how that service is used inside the architecture.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-token-multiplication-effect">The Token Multiplication Effect<a href="https://archtenet.dev/blog/token-tax-m2m-auth#the-token-multiplication-effect" class="hash-link" aria-label="Direct link to The Token Multiplication Effect" title="Direct link to The Token Multiplication Effect" translate="no">​</a></h3>
<p>In most cloud identity solutions, billing correlates with the number of operations: authentications, token issuances, or monthly active users. In a monolithic system, this is barely noticeable. In a microservice architecture, a multiplicative effect kicks in that is rarely accounted for at design time.</p>
<p>A typical scenario: an incoming user request passes through an API gateway, then touches a chain of 10–20 services. Each of them, following zero-trust principles, requests an access token for its downstream call via the OAuth 2.0 Client Credentials flow. With a naive implementation, every service goes to the identity provider for a fresh token. A single user request generates dozens of operations. As load grows, this becomes an avalanche of token issuances with no connection to the actual business value of the request.</p>
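<p>A minimal Node.js sketch of the naive pattern makes the multiplication effect countable. The IdP exchange is stubbed out here, and all service and client names are placeholders:</p>

```javascript
// Naive M2M pattern: every hop in the service chain buys its own token.
// fetchToken stands in for a real OAuth 2.0 Client Credentials exchange
// (a POST to the IdP's token endpoint); it is stubbed so the
// multiplication effect can be counted.
let idpCalls = 0;

function fetchToken(clientId, scope) {
  idpCalls += 1; // in production: one billable IdP operation
  return { access_token: `tok-${idpCalls}`, expires_in: 3600 };
}

function callDownstream(service) {
  const { access_token } = fetchToken('svc-gateway', `${service}:invoke`);
  // ...the real call would send `Authorization: Bearer ${access_token}`
}

// One incoming user request touching a chain of 15 services:
for (let i = 1; i <= 15; i++) callDownstream(`service-${i}`);

console.log(`IdP operations for a single user request: ${idpCalls}`); // 15
```

<p>One request, fifteen billable identity operations, and none of them visible in the business logic.</p>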
<p><strong>Without a proxy</strong>, every service fetches its own token from the IdP. <strong>With a token proxy</strong>, the IdP is called once per TTL, and every other request is a cache hit.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="putting-a-price-on-it">Putting a Price on It<a href="https://archtenet.dev/blog/token-tax-m2m-auth#putting-a-price-on-it" class="hash-link" aria-label="Direct link to Putting a Price on It" title="Direct link to Putting a Price on It" translate="no">​</a></h2>
<p>Even a relatively modest system puts meaningful pressure on the identity provider.</p>
<p><strong>Load model (example):</strong></p>
<p><em>User traffic:</em></p>
<ul>
<li class="">500 users/day × 5 actions × ~15 services per request → <strong>37,500 token ops/day</strong></li>
</ul>
<p><em>Background processes:</em></p>
<ul>
<li class="">3 syncs/day, each processing 50,000 records in batches of 100 → <strong>500 batches per sync</strong></li>
<li class="">Each batch passes through ~15 microservices → 3 × 500 × 15 = <strong>22,500 token ops/day</strong></li>
</ul>
<p><strong>Total: ~60,000 ops/day → ~1,800,000 ops/month</strong></p>
<p>User traffic accounts for <strong>62% of the load</strong>, while 3 background syncs account for the remaining 38%. This is a conservative model. Real systems with dozens of sync types or higher-frequency processes run 5–10× higher — and the bill scales linearly.</p>
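<p>The load model above is easy to reproduce as a quick sanity check, assuming a 30-day month:</p>

```javascript
// User traffic: 500 users/day x 5 actions x ~15 services per request
const userOps = 500 * 5 * 15;                  // 37,500 token ops/day

// Background processes: 3 syncs/day, 50,000 records in batches of 100,
// each batch passing through ~15 microservices
const batchesPerSync = 50_000 / 100;           // 500 batches per sync
const backgroundOps = 3 * batchesPerSync * 15; // 22,500 token ops/day

const dailyTotal = userOps + backgroundOps;    // 60,000
const monthlyTotal = dailyTotal * 30;          // 1,800,000

console.log(`user share: ${(100 * userOps / dailyTotal).toFixed(1)}%`);             // 62.5%
console.log(`background share: ${(100 * backgroundOps / dailyTotal).toFixed(1)}%`); // 37.5%
```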
<p><strong>Cost at ~1,800,000 token operations/month</strong> <em>(rates as of April 2026)</em>:</p>
<table><thead><tr><th>Provider</th><th>Rate</th><th>Monthly cost</th></tr></thead><tbody><tr><td>Amazon Cognito</td><td>$0.00225 / request</td><td><strong>~$4,050</strong></td></tr><tr><td>Microsoft Entra External ID</td><td>$0.001 / token</td><td><strong>~$1,800</strong></td></tr><tr><td>Auth0 Professional</td><td>$240/mo base + token add-ons</td><td><strong>Enterprise tier</strong></td></tr></tbody></table>
<p>In every case, the same rule applies: <strong>your costs become proportional not to your users, but to the number of internal service calls.</strong></p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-evolution-of-solutions-from-naive-to-correct">The Evolution of Solutions: From Naive to Correct<a href="https://archtenet.dev/blog/token-tax-m2m-auth#the-evolution-of-solutions-from-naive-to-correct" class="hash-link" aria-label="Direct link to The Evolution of Solutions: From Naive to Correct" title="Direct link to The Evolution of Solutions: From Naive to Correct" translate="no">​</a></h2>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="level-1-client-side-caching">Level 1: Client-side caching<a href="https://archtenet.dev/blog/token-tax-m2m-auth#level-1-client-side-caching" class="hash-link" aria-label="Direct link to Level 1: Client-side caching" title="Direct link to Level 1: Client-side caching" translate="no">​</a></h3>
<p>The instinctive fix is to add caching at the HTTP client layer inside each service. This does reduce requests, but the effect is bounded by the process boundary. In Kubernetes, every pod holds its own cache. As you scale horizontally, the number of caches grows linearly — a significant fraction of requests to the identity provider persists. This kind of cache is also hard to observe and gives you no centralized token management policy.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="level-2-centralized-cache-redis">Level 2: Centralized cache (Redis)<a href="https://archtenet.dev/blog/token-tax-m2m-auth#level-2-centralized-cache-redis" class="hash-link" aria-label="Direct link to Level 2: Centralized cache (Redis)" title="Direct link to Level 2: Centralized cache (Redis)" translate="no">​</a></h3>
<p>Moving the cache to Redis looks like the logical next step. But a less obvious problem emerges here: the security model. An access token is a bearer credential — whoever holds it can use it. If you store tokens keyed by <code>client_id</code> or <code>scope</code>, any service with Redis access can potentially retrieve a token it was never meant to have. Adding ACLs at the Redis level partially addresses this, but it introduces complexity quickly and erodes system transparency.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="level-3-token-proxy-service">Level 3: Token proxy service<a href="https://archtenet.dev/blog/token-tax-m2m-auth#level-3-token-proxy-service" class="hash-link" aria-label="Direct link to Level 3: Token proxy service" title="Direct link to Level 3: Token proxy service" translate="no">​</a></h3>
<p>The architecturally sound solution is to introduce a dedicated layer responsible for token lifecycle management. A lightweight proxy service deployed inside the cluster becomes the single controlled entry point for obtaining access tokens. It encapsulates all interaction with the identity provider, implements caching, and — critically — owns the access model.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="token-proxy-architecture-technical-details">Token Proxy Architecture: Technical Details<a href="https://archtenet.dev/blog/token-tax-m2m-auth#token-proxy-architecture-technical-details" class="hash-link" aria-label="Direct link to Token Proxy Architecture: Technical Details" title="Direct link to Token Proxy Architecture: Technical Details" translate="no">​</a></h2>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="token-addressing">Token addressing<a href="https://archtenet.dev/blog/token-tax-m2m-auth#token-addressing" class="hash-link" aria-label="Direct link to Token addressing" title="Direct link to Token addressing" translate="no">​</a></h3>
<p>Instead of storing tokens against plain identifiers, the proxy uses a derived cache key formed by computing <strong>HMAC-SHA256</strong> over the combination of <code>client_id</code>, <code>client_secret</code>, and <code>scope</code>, using an internal service secret. HMAC is deterministic — the same inputs always produce the same key, which is necessary for correct cache lookups — and irreversible without knowledge of the secret. Even if memory is dumped or leaked, the original credentials cannot be recovered. The service itself never handles secrets in plaintext outside the moment of the actual request to the identity provider.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="in-memory-storage">In-memory storage<a href="https://archtenet.dev/blog/token-tax-m2m-auth#in-memory-storage" class="hash-link" aria-label="Direct link to In-memory storage" title="Direct link to In-memory storage" translate="no">​</a></h3>
<p>Tokens are stored in memory, not in an external store. This eliminates network latency on every lookup and reduces the attack surface. The cache TTL is set slightly below the token's actual expiry to guarantee proactive refresh before the token becomes invalid.</p>
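<p>The refresh-before-expiry behaviour can be sketched as a small in-memory cache. This is a simplified illustration: the 60-second skew and the issuer stub are assumptions, and production code would add per-key request coalescing.</p>

```javascript
// key -> { token, expiresAt }; expiresAt is pulled in by SKEW_SECONDS so
// the cache refreshes before the token itself becomes invalid.
const SKEW_SECONDS = 60; // assumed safety margin
const cache = new Map();

function getToken(key, issue) {
  const hit = cache.get(key);
  if (hit && Date.now() < hit.expiresAt) return hit.token; // cache hit: no IdP call

  const { access_token, expires_in } = issue(); // the only path that touches the IdP
  cache.set(key, {
    token: access_token,
    expiresAt: Date.now() + (expires_in - SKEW_SECONDS) * 1000,
  });
  return access_token;
}

// Two back-to-back lookups for the same key hit the IdP exactly once.
let idpHits = 0;
const issue = () => { idpHits += 1; return { access_token: 't1', expires_in: 3600 }; };
getToken('svc-orders/inventory:read', issue);
getToken('svc-orders/inventory:read', issue);
console.log(idpHits); // 1
```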
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="cluster-level-security">Cluster-level security<a href="https://archtenet.dev/blog/token-tax-m2m-auth#cluster-level-security" class="hash-link" aria-label="Direct link to Cluster-level security" title="Direct link to Cluster-level security" translate="no">​</a></h3>
<p>The service is deployed inside Kubernetes with no external ingress, restricted egress rules, and access limited to internal service accounts. Communication is additionally secured via mTLS, eliminating the risk of unauthorized access even within the cluster. The service is physically unreachable from outside the cluster perimeter.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="high-availability">High availability<a href="https://archtenet.dev/blog/token-tax-m2m-auth#high-availability" class="hash-link" aria-label="Direct link to High availability" title="Direct link to High availability" translate="no">​</a></h3>
<p>Since the proxy is a critical dependency for every service in the cluster, it is deployed as 2–3 replicas. Each replica holds its own in-memory cache and independently fetches from the identity provider on a cache miss. A short warm-up period after a replica restart is an acceptable trade-off for eliminating a single point of failure.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-economic-impact">The Economic Impact<a href="https://archtenet.dev/blog/token-tax-m2m-auth#the-economic-impact" class="hash-link" aria-label="Direct link to The Economic Impact" title="Direct link to The Economic Impact" translate="no">​</a></h2>
<p>The fundamental model shift: a token is requested <strong>once per TTL</strong>, then reused by all services authorized to receive it. The number of operations against the identity provider becomes proportional not to the number of internal calls, but to the number of <strong>unique client/scope combinations</strong>.</p>
<p>In the system described above, this means going from 1.8 million operations per month to ~14,400 — <strong>a 99.2% reduction</strong>.</p>
<p><strong>The proxy service itself costs almost nothing</strong>: a few dozen megabytes of RAM per replica, minimal CPU. At typical cloud pricing, that's a few dollars a month across all replicas.</p>
<p>Assuming 20 unique client/scope combinations and 1-hour token TTL:</p>
<ul>
<li class="">Refreshes per day: 20 × 24 = 480</li>
<li class="">Per month: <strong>~14,400 operations</strong> instead of 1,800,000</li>
</ul>
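<p>The same arithmetic in code, assuming a 30-day month:</p>

```javascript
const combos = 20;                       // unique client/scope combinations
const ttlHours = 1;                      // token TTL
const perDay = combos * (24 / ttlHours); // 480 refreshes/day
const perMonth = perDay * 30;            // 14,400

const before = 1_800_000;
const reductionPct = 100 * (1 - perMonth / before);
console.log(perDay, perMonth, `${reductionPct.toFixed(1)}% reduction`);
```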
<table><thead><tr><th></th><th>Without optimization</th><th>With token proxy</th></tr></thead><tbody><tr><td>Token operations/month</td><td>~1,800,000</td><td>~14,400</td></tr><tr><td><strong>Cognito</strong></td><td><strong>~$4,050/mo</strong></td><td><strong>~$32/mo</strong></td></tr><tr><td><strong>Entra External ID</strong></td><td><strong>~$1,800/mo</strong></td><td><strong>~$14/mo</strong></td></tr><tr><td>Auth0</td><td>Enterprise tier</td><td>Self-service plan</td></tr><tr><td>Proxy cost</td><td>—</td><td>~$2–5/mo</td></tr><tr><td><strong>Ops reduction</strong></td><td></td><td><strong>99.2%</strong></td></tr></tbody></table>
<p>In absolute terms: roughly <strong>$21,000–$48,000 per year</strong> — depending on provider — eliminated by a single lightweight service. As the system scales — more syncs, more services — costs grow linearly. The proxy cost does not.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="conclusion">Conclusion<a href="https://archtenet.dev/blog/token-tax-m2m-auth#conclusion" class="hash-link" aria-label="Direct link to Conclusion" title="Direct link to Conclusion" translate="no">​</a></h2>
<p>This is exactly the kind of decision that separates using the cloud as a tool from using it as magic. The cloud genuinely lets you stop thinking about a huge range of low-level concerns. But it doesn't remove the need to think at the architecture level. If anything, it amplifies the consequences of architectural mistakes — because every inefficiency converts to money immediately.</p>
<p>Good cloud architecture isn't about rejecting managed services. It's about understanding their internal billing models and deliberately managing the points where those models start to conflict with your system.</p>
<p>Otherwise, you inevitably arrive at a situation where you're paying not for business growth, but for a lack of control over your own decisions.</p>]]></content:encoded>
            <category>cloud</category>
            <category>architecture</category>
            <category>authentication</category>
            <category>cost-optimization</category>
            <category>microservices</category>
        </item>
        <item>
            <title><![CDATA[The Silent Exfiltration — Why Your CI Pipeline Is an Open Vault]]></title>
            <link>https://archtenet.dev/blog/silent-exfiltration</link>
            <guid>https://archtenet.dev/blog/silent-exfiltration</guid>
            <pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Why your Node.js CI pipeline is an open vault, and how to fix three critical structural vulnerabilities.]]></description>
<content:encoded><![CDATA[<p><em>Modern CI/CD pipelines for Node.js applications exhibit three compounding structural weaknesses — secrets injected into the runner environment at the start of the pipeline, unrestricted npm lifecycle script execution during dependency installation, and open outbound network access on CI runners — which together enable silent, zero-alert credential exfiltration by any malicious package in the dependency tree. These findings are platform-independent: GitLab CI, GitHub Actions, and similar systems ship with the same insecure defaults. The March 2026 compromise of the Axios npm package, a North Korean state-sponsored supply chain attack targeting a library with roughly 100 million weekly downloads, is examined as a case study confirming large-scale exploitation of this attack surface.</em></p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-three-vulnerability-kill-chain">The Three-Vulnerability Kill Chain<a href="https://archtenet.dev/blog/silent-exfiltration#the-three-vulnerability-kill-chain" class="hash-link" aria-label="Direct link to The Three-Vulnerability Kill Chain" title="Direct link to The Three-Vulnerability Kill Chain" translate="no">​</a></h2>
<p>A developer on a feature branch adds a new npm package. The pipeline runs. Within seconds, every credential the organisation owns is on an attacker-controlled server. No merge to the main branch. No code review. No alert. No indication in the logs.</p>
<p>This is not a hypothetical scenario constructed for impact. It is the default behaviour of most Node.js CI pipelines, and it follows from three conditions present simultaneously in the majority of production pipeline configurations — regardless of whether the platform is GitLab CI, GitHub Actions, AWS CodeBuild, CircleCI, or any equivalent system.</p>
<p>Each condition in isolation is manageable. In combination, they constitute a complete exfiltration capability:</p>
<ol>
<li class=""><strong>Secrets injected into the environment —</strong> all CI/CD variables and repository secrets are available in <code>process.env</code> from the first millisecond of pipeline execution, before any security scanning, before any approval gate, on any branch.</li>
<li class=""><strong>Unrestricted lifecycle script execution —</strong> <code>npm install</code> and <code>yarn install</code> execute arbitrary code from every package in the dependency tree, including packages three or four levels deep that no engineer on the team has ever reviewed.</li>
<li class=""><strong>Open outbound network access —</strong> CI runners have unrestricted egress to the public internet by default, and standard CI images include multiple network tools (<code>curl</code>, <code>wget</code>, <code>node</code>) suitable for exfiltration.</li>
</ol>
<p><strong>The result: complete exfiltration capability, zero alerts.</strong></p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="finding-1--secrets-are-live-before-any-gate-fires">Finding 1 — Secrets Are Live Before Any Gate Fires<a href="https://archtenet.dev/blog/silent-exfiltration#finding-1--secrets-are-live-before-any-gate-fires" class="hash-link" aria-label="Direct link to Finding 1 — Secrets Are Live Before Any Gate Fires" title="Direct link to Finding 1 — Secrets Are Live Before Any Gate Fires" translate="no">​</a></h2>
<p>Both GitLab CI and GitHub Actions inject secrets into the runner environment at job start — not at the step where they are first referenced, but at the beginning of the job, before any pipeline step executes.</p>
<p>On <strong>GitLab CI</strong>, project-level and group-level CI/CD variables are injected into the process environment immediately. The platform's "Masked" flag, commonly assumed to protect secret values, is a log sanitisation feature: it string-matches the secret value in <code>stdout</code> and <code>stderr</code> output and replaces it with <code>[MASKED]</code> before the log is stored. The raw value is in <code>process.env</code> and is fully accessible to any code executing within the job.</p>
<p>On <strong>GitHub Actions</strong>, secrets exposed via <code>env:</code> blocks or workflow-level <code>${{ secrets.MY_SECRET }}</code> references are equally accessible to any process running in the job environment.</p>
<p>The critical implication is that <em>there is no safe window</em>. A pipeline triggered on any branch, by any developer, against any commit, immediately makes all configured secrets available to all code that runs within it. Security scanning steps, approval gates, and branch protection rules all fire after secrets are already live in the environment.</p>
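<p>A harmless way to see why masking is not protection (the variable name and value here are simulated, not real secrets): any trivial, reversible transformation of the value defeats the string-match filter.</p>

```javascript
// Masking is a log filter, not an access control. The raw value sits in
// process.env; encoding it slips past string matching in the log pipeline.
process.env.MY_API_KEY = 'hunter2'; // simulated injected CI secret

const stolen = Buffer.from(process.env.MY_API_KEY).toString('base64');
console.log(stolen); // aHVudGVyMg==
```

<p>The platform would replace the literal string <code>hunter2</code> in the job log with <code>[MASKED]</code> — but not its base64 form.</p>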
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="finding-2--npm-install-is-arbitrary-code-execution">Finding 2 — npm install Is Arbitrary Code Execution<a href="https://archtenet.dev/blog/silent-exfiltration#finding-2--npm-install-is-arbitrary-code-execution" class="hash-link" aria-label="Direct link to Finding 2 — npm install Is Arbitrary Code Execution" title="Direct link to Finding 2 — npm install Is Arbitrary Code Execution" translate="no">​</a></h2>
<p>The npm lifecycle hook system is a first-class feature of the package manager. When <code>npm install</code> or <code>yarn install</code> runs, the following hook sequence fires automatically, in order, for the root project and for every package in the dependency tree:</p>
<p><code>preinstall</code> → <code>install</code> → <code>postinstall</code> (plus <code>prepare</code>, which additionally runs for the root project and for dependencies installed from git)</p>
<p>These hooks execute shell commands or Node.js scripts defined in each package's <code>package.json</code>. They run with full access to the process environment — including all injected CI secrets — and with full filesystem and network access. There is no sandboxing, no prompting, and no log indication distinguishing legitimate build steps from hook execution.</p>
<p>This behaviour is identical across npm, yarn, and pnpm, and applies on every CI platform that executes <code>npm install</code>: GitHub Actions runners, GitLab Kubernetes executors, AWS CodeBuild projects, Jenkins agents, CircleCI containers, and any equivalent system. AWS CodeBuild runs jobs inside managed containers with no egress restriction or script execution policy applied by default; Jenkins pipelines are similarly unrestricted unless an administrator has explicitly hardened the agent configuration.</p>
<p>The attack vector this creates is precise: a malicious or compromised package anywhere in the transitive dependency tree can execute arbitrary code during installation. That code has full access to every secret in the environment and can exfiltrate them before the install step completes. The malicious code never appears in the repository.</p>
<p><strong>The CI pipeline is not even the first system at risk.</strong> The identical hook sequence fires when a developer runs <code>npm install</code> on a local machine. A developer workstation is in many ways a richer target than an ephemeral CI runner: it accumulates <code>~/.aws/credentials</code>, <code>~/.ssh/</code> private keys spanning every service the developer has ever accessed, <code>.env</code> files across multiple local projects, and npm authentication tokens in <code>~/.npmrc</code>, IDE integration tokens, and VPN configurations. Unlike an ephemeral CI container that is destroyed after the job completes, a compromised developer machine persists, is rarely audited, and frequently has access to production systems that the CI pipeline does not.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="finding-3--the-network-is-wide-open">Finding 3 — The Network Is Wide Open<a href="https://archtenet.dev/blog/silent-exfiltration#finding-3--the-network-is-wide-open" class="hash-link" aria-label="Direct link to Finding 3 — The Network Is Wide Open" title="Direct link to Finding 3 — The Network Is Wide Open" translate="no">​</a></h2>
<p>CI runners — both GitHub-hosted runners and GitLab Kubernetes executors — have unrestricted outbound internet access by default. Standard CI images include <code>curl</code>, <code>wget</code>, and <code>node</code>, providing multiple independently usable exfiltration channels. A malicious postinstall script requires none of these external tools: Node.js's built-in <code>https</code> module is sufficient to <code>POST</code> an arbitrary payload to any external endpoint.</p>
<p>On <strong>GitHub Actions</strong>, hosted runners can reach arbitrary external hosts without any configuration. Egress restriction requires migrating to self-hosted runners deployed within a network boundary with explicit egress firewall rules.</p>
<p>In <strong>GitLab CI</strong> with a Kubernetes executor, egress restrictions require an explicit Kubernetes NetworkPolicy configuration. The default namespace configuration in most deployments does not apply such a policy.</p>
<p><strong>AWS CodeBuild</strong> runs jobs inside managed containers in AWS-owned infrastructure; outbound internet access is available by default unless the build project is configured to run inside a VPC with a restrictive security group. <strong>Jenkins</strong> agents — whether cloud-provisioned or self-hosted — inherit the host's network configuration, which in most enterprise environments means unrestricted outbound access.</p>
<p>A commonly held belief is that corporate SSL inspection proxies provide meaningful egress protection. This assumption warrants scrutiny. An SSL inspection proxy can be bypassed by setting <code>NODE_TLS_REJECT_UNAUTHORIZED=0</code>, which disables certificate validation entirely. More significantly, SSL proxies offer no protection against DNS-based exfiltration: an attacker can encode stolen credentials as subdomain labels in DNS queries to an attacker-controlled domain (<code>dGVzdC1zZWNyZXQ.attacker.com</code>), and these queries will traverse the network regardless of HTTP-level filtering.</p>
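<p>A sketch of the encoding step only, with no network I/O (<code>attacker.example</code> is a placeholder domain):</p>

```javascript
// Encode a secret as a DNS-safe subdomain label. base64url stays within
// characters that resolvers will carry in a query; no query is sent here.
function toDnsLabel(secret) {
  return Buffer.from(secret).toString('base64url');
}

const label = toDnsLabel('test-secret');
console.log(`${label}.attacker.example`); // dGVzdC1zZWNyZXQ.attacker.example
```

<p>The attacker's authoritative nameserver simply logs incoming queries and decodes the labels; no HTTP connection ever occurs.</p>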
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="this-is-not-a-javascript-problem">This Is Not a JavaScript Problem<a href="https://archtenet.dev/blog/silent-exfiltration#this-is-not-a-javascript-problem" class="hash-link" aria-label="Direct link to This Is Not a JavaScript Problem" title="Direct link to This Is Not a JavaScript Problem" translate="no">​</a></h2>
<p>The install-time code execution pattern is not specific to the JavaScript ecosystem. Every major package manager exposes an equivalent attack surface:</p>
<ul>
<li class=""><strong>Python</strong> — <code>setup.py</code> executes as a standard Python script during <code>pip install</code> of a source distribution. The <code>ctx</code> PyPI package (2022) exploited this mechanism to exfiltrate AWS credentials from CI environments; the same campaign backdoored <code>phpass</code> on PHP's Packagist.</li>
<li class=""><strong>Ruby</strong> — <code>gem install</code> executes gemspec extension scripts and <code>Rakefile</code> hooks. The <code>rest-client</code> gem was backdoored via this mechanism in 2019.</li>
<li class=""><strong>PHP</strong> — Composer's scripts block supports <code>pre-install-cmd</code> and <code>post-install-cmd</code> hooks that are structurally nearly identical to npm's lifecycle system.</li>
<li class=""><strong>Rust</strong> — <code>build.rs</code> scripts execute arbitrary Rust code during <code>cargo build</code> and <code>cargo install</code> with full system access.</li>
<li class=""><strong>Java</strong> — Maven plugins execute arbitrary code during the build lifecycle. Gradle's <code>build.gradle</code> is a full Groovy or Kotlin program; a compromised plugin dependency is the equivalent attack vector.</li>
</ul>
<p><em>This is not a JavaScript problem. It is a software supply chain problem.</em> The specific hook name changes across ecosystems; the risk does not.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="this-is-not-theoretical--it-just-happened">This Is Not Theoretical — It Just Happened<a href="https://archtenet.dev/blog/silent-exfiltration#this-is-not-theoretical--it-just-happened" class="hash-link" aria-label="Direct link to This Is Not Theoretical — It Just Happened" title="Direct link to This Is Not Theoretical — It Just Happened" translate="no">​</a></h2>
<p>On 31 March 2026, the Axios npm package — the most widely used HTTP client library in the JavaScript ecosystem, with approximately 100 million weekly downloads and an estimated presence in 80% of cloud and code environments — was compromised in a state-sponsored supply chain attack attributed to Sapphire Sleet (also tracked as UNC1069), a North Korean threat actor.</p>
<p>The attack unfolded with significant operational sophistication. Approximately 18 hours before the main attack, a package named <code>plain-crypto-js@4.2.0</code> was published to the npm registry — a clean, inert decoy designed to establish a brief publication history and reduce the likelihood of detection heuristics triggering. On 31 March at 00:21 UTC, the primary Axios maintainer account (<code>jasonsaayman</code>) was used to publish <code>axios@1.14.1</code>, which introduced <code>plain-crypto-js@4.2.1</code> as a new runtime dependency. At 01:00 UTC, <code>axios@0.30.4</code> was published with the same injected dependency. Both the latest and legacy distribution tags were compromised simultaneously, maximising the blast radius across projects using either the current or legacy Axios API.</p>
<p>The delivery mechanism was a postinstall hook in <code>plain-crypto-js@4.2.1</code> declaring <code>"postinstall": "node setup.js"</code>. Upon installation of either compromised Axios version, npm resolved the dependency tree, fetched <code>plain-crypto-js@4.2.1</code>, and automatically executed <code>setup.js</code> with no user interaction. The script deployed platform-specific second-stage payloads — a Remote Access Trojan (RAT) for Windows, macOS, and Linux — and connected to a command-and-control server at <code>sfrclak[.]com:8000</code>.</p>
<p>The malicious versions remained live for approximately three hours before removal. Any CI pipeline — on any platform — that executed a fresh <code>npm install</code> during that window and resolved Axios via a floating version range (<code>^1.14.0</code>) was potentially compromised.</p>
<p>The Axios incident does not stand alone. The same attack pattern has been executed repeatedly: <code>event-stream</code> (2018, targeting a Bitcoin wallet library), <code>ua-parser-js</code> (2021, cryptominer and credential stealer), and <code>colors/faker</code> (2022, deliberate maintainer sabotage). In every case, the vector was identical. In every case, standard code review provided no protection because the malicious code was not in the repository.</p>
<p>The Axios incident establishes definitively that no package is too widely used, too well-maintained, or too carefully watched to be immune.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="test-your-pipeline-right-now">Test Your Pipeline Right Now<a href="https://archtenet.dev/blog/silent-exfiltration#test-your-pipeline-right-now" class="hash-link" aria-label="Direct link to Test Your Pipeline Right Now" title="Direct link to Test Your Pipeline Right Now" translate="no">​</a></h2>
<p>The following three self-contained tests can be run against any pipeline today — with no special tooling and no risk — to confirm whether the conditions described above are present. Each test is reversible and leaves no persistent changes.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="test-1-confirm-secrets-are-accessible-at-runtime">Test 1: Confirm Secrets Are Accessible at Runtime<a href="https://archtenet.dev/blog/silent-exfiltration#test-1-confirm-secrets-are-accessible-at-runtime" class="hash-link" aria-label="Direct link to Test 1: Confirm Secrets Are Accessible at Runtime" title="Direct link to Test 1: Confirm Secrets Are Accessible at Runtime" translate="no">​</a></h3>
<p>Add a CI job step that runs:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token plain">printenv | sort | sed 's/=\(.\).*/=\1***/'</span><br></span></code></pre></div></div>
<p>This prints all environment variable names with values redacted to their first character only. Count the secrets present. Observe that they are available before any security step in the pipeline has executed.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="test-2-confirm-lifecycle-scripts-execute-silently">Test 2: Confirm Lifecycle Scripts Execute Silently<a href="https://archtenet.dev/blog/silent-exfiltration#test-2-confirm-lifecycle-scripts-execute-silently" class="hash-link" aria-label="Direct link to Test 2: Confirm Lifecycle Scripts Execute Silently" title="Direct link to Test 2: Confirm Lifecycle Scripts Execute Silently" translate="no">​</a></h3>
<p>Add the following to the root <code>package.json</code> temporarily:</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"scripts"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token property" style="color:#36acaa">"preinstall"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"echo '⚠ SECURITY TEST: preinstall executed'"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token property" style="color:#36acaa">"postinstall"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"echo '⚠ SECURITY TEST: postinstall executed'"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" 
style="color:#393A34">}</span><br></span></code></pre></div></div>
<p>Run <code>npm install</code> (or <code>yarn install</code>) in the pipeline without <code>--ignore-scripts</code>. Observe that both messages appear in the job log, interleaved with normal install output, with no indication that arbitrary code has just executed. Remove the test scripts before merging.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="test-3-confirm-outbound-network-access">Test 3: Confirm Outbound Network Access<a href="https://archtenet.dev/blog/silent-exfiltration#test-3-confirm-outbound-network-access" class="hash-link" aria-label="Direct link to Test 3: Confirm Outbound Network Access" title="Direct link to Test 3: Confirm Outbound Network Access" translate="no">​</a></h3>
<p>Add a CI job step that runs:</p>
<div class="language-javascript codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-javascript codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token plain">node </span><span class="token operator" style="color:#393A34">-</span><span class="token plain">e </span><span class="token string" style="color:#e3116c">"const https = require('https');const req = https.request('https://httpbin.org/post', {method:'POST'}, res =&gt; {  console.log('Egress status:', res.statusCode);});req.write(JSON.stringify({test: 'egress-check'}));req.end();"</span><br></span></code></pre></div></div>
<p>A 200 response confirms that arbitrary POST requests to external hosts succeed from within the CI runner — using only Node.js built-ins, with no additional tools required. If all three tests come back positive, the kill chain is complete on the current pipeline configuration.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-remediation-ladder">The Remediation Ladder<a href="https://archtenet.dev/blog/silent-exfiltration#the-remediation-ladder" class="hash-link" aria-label="Direct link to The Remediation Ladder" title="Direct link to The Remediation Ladder" translate="no">​</a></h2>
<p>The following remediations are presented in order of implementation priority. Unless noted otherwise, all recommendations apply equally to GitLab CI, GitHub Actions, AWS CodeBuild, and Jenkins.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="immediate">Immediate<a href="https://archtenet.dev/blog/silent-exfiltration#immediate" class="hash-link" aria-label="Direct link to Immediate" title="Direct link to Immediate" translate="no">​</a></h3>
<p><strong>Adopt <code>--ignore-scripts</code> for all dependency installation steps.</strong> This single flag eliminates the entire lifecycle script attack surface and is the highest-impact change available with the lowest implementation cost:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token plain">npm install --ignore-scripts</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"># or yarn install --ignore-scripts</span><br></span></code></pre></div></div>
<p>For most pipelines, this change requires no further adjustments. The flag only affects hooks that fire automatically as a side effect of pulling down dependencies — it has no impact on build scripts the pipeline deliberately invokes by name (<code>npm run build</code>, <code>npm run test</code>, etc.), which continue to work exactly as before.</p>
<p>A subset of widely used packages relies on lifecycle scripts to function and will require a small amount of explicit wiring. The most common cases are:</p>
<ul>
<li class=""><strong>Prisma</strong> — runs <code>prisma generate</code> via postinstall by default. Fix: add <code>npx prisma generate</code> as an explicit pipeline step after install</li>
<li class=""><strong>Husky</strong> — installs Git hooks via the prepare script. Fix: add <code>npx husky install</code> as an explicit step</li>
<li class=""><strong>esbuild</strong> — downloads a platform-specific binary via postinstall. Fix: use the <code>esbuild-wasm</code> variant or invoke the binary path explicitly</li>
<li class=""><strong>Native addons (bcrypt, sharp, canvas)</strong> — prefer pre-built binary variants (e.g. <code>@img/sharp-linux-x64</code>), or run <code>npm rebuild &lt;package&gt;</code> explicitly after install</li>
</ul>
<p>In each case, the fix is the same pattern: move the hook's work into an explicit, named pipeline step. The result is a pipeline that is both more secure and more legible — every action it takes is visible and intentional:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token plain"># Step 1: install with no automatic script execution</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">npm install --ignore-scripts</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"># Step 2: explicitly invoke only what is needed</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">npx prisma generate</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">npx husky install</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">npm run build</span><br></span></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="hygiene-controls-log-protection-not-runtime-protection">Hygiene Controls (Log Protection, Not Runtime Protection)<a href="https://archtenet.dev/blog/silent-exfiltration#hygiene-controls-log-protection-not-runtime-protection" class="hash-link" aria-label="Direct link to Hygiene Controls (Log Protection, Not Runtime Protection)" title="Direct link to Hygiene Controls (Log Protection, Not Runtime Protection)" translate="no">​</a></h3>
<p>Both platforms offer secret redaction features that are worth enabling, but must not be confused with security controls.</p>
<p>On GitLab CI, the "Masked" flag causes the platform to replace matching values in job log output with <code>[MASKED]</code>; the separate "Protected" flag restricts a variable to protected branches and tags but does not affect logging. Two additional limitations apply. First, GitLab cannot mask multiline values: an SSH private key spanning from <code>-----BEGIN OPENSSH PRIVATE KEY-----</code> to <code>-----END OPENSSH PRIVATE KEY-----</code> across multiple lines cannot be masked, regardless of configuration. Second, short values and values containing certain special characters may silently fail the masking requirements.</p>
<p>On GitHub Actions, secrets are automatically redacted from log output. The same fundamental limitation applies: redaction is a log-level operation only.</p>
<p><strong>The correct framing:</strong> secret masking and log redaction protect against accidental disclosure to humans reading job logs. They provide zero protection against a malicious process reading the environment programmatically.</p>
<p><strong>Disable debug logging in deployment scripts.</strong> Several widely used deployment tools emit full HTTP request headers — including Authorization headers containing bearer tokens — when debug logging is enabled. This is a common misconfiguration: debug logging is enabled once during troubleshooting and never removed, silently printing credentials to job logs on every subsequent run. Review all deployment scripts for <code>--log-level=debug</code> or equivalent flags and remove them.</p>
<p><strong>Rotate credentials that appeared in plaintext log output.</strong> Any secret that appeared unredacted in a job log — whether due to debug logging, a <code>printenv</code> call, or a masking failure — should be considered potentially compromised and rotated. Job logs on most platforms are readable by anyone with Reporter-level access or higher, which often includes a broader audience than the team realises.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="short-term">Short-Term<a href="https://archtenet.dev/blog/silent-exfiltration#short-term" class="hash-link" aria-label="Direct link to Short-Term" title="Direct link to Short-Term" translate="no">​</a></h3>
<ul>
<li class=""><strong>Audit the post-install scripts across the existing dependency tree.</strong> Run <code>npm ls</code> or <code>yarn why &lt;package&gt;</code> to enumerate the full dependency tree, then identify all packages that declare <code>postinstall</code>, <code>install</code>, or <code>preinstall</code> scripts.</li>
<li class=""><strong>Enforce lockfile diff review in pull requests and merge requests.</strong> Changes to <code>package-lock.json</code> or <code>yarn.lock</code> should be treated as a mandatory, conscious review signal. A new transitive dependency appearing in a lockfile diff is exactly the mechanism through which the Axios attack would have propagated.</li>
<li class=""><strong>Scope secrets to the jobs that require them.</strong> Installation steps have no legitimate need for deployment credentials. On GitHub Actions, use job-level <code>env:</code> blocks. On GitLab CI, use variable protection rules and job-scoped variable assignment.</li>
</ul>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="medium-term">Medium-Term<a href="https://archtenet.dev/blog/silent-exfiltration#medium-term" class="hash-link" aria-label="Direct link to Medium-Term" title="Direct link to Medium-Term" translate="no">​</a></h3>
<p><strong>Integrate two complementary scanning layers before npm install.</strong> The recommended pipeline gate order is: scan lockfile → fail if known-bad → only then run <code>npm install --ignore-scripts</code>.</p>
<ul>
<li class=""><strong>“Trivy fs” mode (pre-install):</strong> Run <code>trivy fs --scanners vuln --skip-dirs node_modules .</code> as a pre-install step. It parses <code>package-lock.json</code>, <code>yarn.lock</code>, and <code>pnpm-lock.yaml</code> directly — no <code>node_modules</code> required — and checks all declared dependency versions against the CVE database. Teams already using Trivy for Docker image scanning can add this step at near-zero incremental cost.</li>
<li class=""><strong>Behavioural analysis (socket.dev or equivalent):</strong> Trivy is CVE-driven and blind to novel attacks. The Axios compromise used a brand-new malicious package version with no CVE assigned while it was live. Behavioural tools flag suspicious patterns — a package that suddenly declares a <code>postinstall</code> script it never had, or a new transitive dependency injected into a stable package — independent of whether a CVE exists. The two approaches are complementary, not interchangeable.</li>
</ul>
<p><strong>Restrict outbound network egress on CI runners.</strong> Egress restriction is the strongest structural defence in the remediation ladder — the only control that breaks the kill chain at the exfiltration stage. Even if a malicious package executes successfully, a strict egress allowlist prevents stolen credentials from being transmitted. The attacker has execution but no exit.</p>
<p>Two important caveats apply. First, allowlist maintenance requires ongoing discipline — teams under delivery pressure frequently respond to a broken pipeline by expanding the allowlist rather than investigating the root cause. Second, HTTP-level egress controls are insufficient on their own: an attacker can encode credentials as subdomain labels in DNS queries to an attacker-controlled nameserver, bypassing all HTTP and HTTPS filtering. Comprehensive egress restriction requires a controlled internal DNS resolver in addition to HTTP-level controls.</p>
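<p>To make the DNS caveat concrete, the sketch below shows how a stolen value could be encoded into ordinary DNS query names. The domain <code>exfil.example</code> is a placeholder, and the code only constructs the names; it performs no network activity:</p>

```javascript
// Sketch: why HTTP-only egress filtering is insufficient. A stolen
// secret can be hex-encoded and split into DNS labels (kept under the
// 63-character label limit); each lookup of i-chunk.exfil.example
// delivers one chunk to an attacker-controlled nameserver, with no
// HTTP request ever being made.
function toDnsQueryNames(secret, domain) {
  const hex = Buffer.from(secret, 'utf8').toString('hex');
  const chunks = hex.match(/.{1,60}/g) || [];
  // Prefix each chunk with its index so the receiver can reassemble.
  return chunks.map((chunk, i) => `${i}-${chunk}.${domain}`);
}

const names = toDnsQueryNames('AKIA_FAKE_EXAMPLE_KEY', 'exfil.example');
console.log(names);
```

<p>This is why comprehensive egress restriction must cover DNS resolution as well as HTTP traffic.</p>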
<p><strong>Implement a dependency publication cooldown policy.</strong> Reject in CI any package version published within the last N days (commonly 3–7). This introduces a window during which the security community can identify and respond to a compromised release before it enters production dependency trees.</p>
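<p>The cooldown policy reduces to a date comparison against the registry's publish timestamps. In a real pipeline the timestamp map would come from <code>npm view &lt;package&gt; time --json</code>; the sketch below keeps the policy logic as a pure function so it can be exercised offline:</p>

```javascript
// Sketch of a publication-cooldown check. `times` mirrors the shape
// returned by `npm view <package> time --json`: a map from version to
// publication timestamp, plus `created`/`modified` metadata keys.
function versionsInCooldown(times, cooldownDays, now = Date.now()) {
  const cutoff = now - cooldownDays * 24 * 60 * 60 * 1000;
  return Object.entries(times)
    .filter(([version]) => version !== 'created' && version !== 'modified')
    .filter(([, published]) => Date.parse(published) > cutoff)
    .map(([version]) => version);
}

// Example data: one long-settled version, one published moments ago.
const times = {
  created: '2020-01-01T00:00:00Z',
  '1.14.0': '2020-02-01T00:00:00Z',
  '1.14.1': new Date().toISOString(),
};
const tooFresh = versionsInCooldown(times, 7);
console.log('Versions inside the cooldown window:', tooFresh);
```

<p>A CI gate would run this check for each resolved dependency and fail the job whenever the returned list is non-empty.</p>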
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="conclusion">Conclusion<a href="https://archtenet.dev/blog/silent-exfiltration#conclusion" class="hash-link" aria-label="Direct link to Conclusion" title="Direct link to Conclusion" translate="no">​</a></h2>
<p>The vulnerability class described in this article does not arise from misconfiguration of a specific platform or negligence on the part of engineering teams. It arises from the intersection of three design properties — secrets available at pipeline start, arbitrary code execution during dependency installation, and open outbound network access — that are individually reasonable and present in most Node.js CI configurations by default.</p>
<p>The March 2026 Axios compromise demonstrated that this attack surface is being actively exploited by well-resourced, operationally sophisticated threat actors against packages at the very top of the npm download distribution. The target package had 100 million weekly downloads, multiple active maintainers, and years of established trust. None of these properties provided protection.</p>
<p>The trust model underpinning most CI pipelines is insufficient for the current threat environment. Structural controls that operate independently of trust assumptions are required: <code>--ignore-scripts</code> to eliminate lifecycle script execution as an attack vector, pre-install lockfile scanning to identify known-bad and behaviourally suspicious packages, and egress restriction to prevent exfiltration even in the event of a successful compromise.</p>
<p>Awareness of the attack surface is the prerequisite. The controls described above are the response.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="references">References<a href="https://archtenet.dev/blog/silent-exfiltration#references" class="hash-link" aria-label="Direct link to References" title="Direct link to References" translate="no">​</a></h2>
<ol>
<li class="">Microsoft Threat Intelligence. (2026). Mitigating the Axios npm supply chain compromise. Microsoft Security Blog. <a href="https://www.microsoft.com/en-us/security/blog/2026/04/01/mitigating-the-axios-npm-supply-chain-compromise/" target="_blank" rel="noopener noreferrer" class="">https://www.microsoft.com/en-us/security/blog/2026/04/01/mitigating-the-axios-npm-supply-chain-compromise/</a></li>
<li class="">Google Threat Intelligence Group. (2026). North Korea-Nexus Threat Actor Compromises Widely Used Axios NPM Package in Supply Chain Attack. Google Cloud Blog. <a href="https://cloud.google.com/blog/topics/threat-intelligence/north-korea-threat-actor-targets-axios-npm-package" target="_blank" rel="noopener noreferrer" class="">https://cloud.google.com/blog/topics/threat-intelligence/north-korea-threat-actor-targets-axios-npm-package</a></li>
<li class="">Elastic Security Labs. (2026). Inside the Axios supply chain compromise — one RAT to rule them all. <a href="https://www.elastic.co/security-labs/axios-one-rat-to-rule-them-all" target="_blank" rel="noopener noreferrer" class="">https://www.elastic.co/security-labs/axios-one-rat-to-rule-them-all</a></li>
<li class="">StepSecurity. (2026). axios Compromised on npm — Malicious Versions Drop Remote Access Trojan. <a href="https://www.stepsecurity.io/blog/axios-compromised-on-npm-malicious-versions-drop-remote-access-trojan" target="_blank" rel="noopener noreferrer" class="">https://www.stepsecurity.io/blog/axios-compromised-on-npm-malicious-versions-drop-remote-access-trojan</a></li>
<li class="">Aqua Security. (2024). Trivy — Node.js coverage documentation. <a href="https://trivy.dev/docs/latest/coverage/language/nodejs/" target="_blank" rel="noopener noreferrer" class="">https://trivy.dev/docs/latest/coverage/language/nodejs/</a></li>
<li class="">socket.dev. (2024). Supply chain security for npm, PyPI, and Go. <a href="https://socket.dev/" target="_blank" rel="noopener noreferrer" class="">https://socket.dev</a></li>
<li class="">Snyk. (2021). The ua-parser-js npm package was compromised. Snyk Blog.</li>
<li class="">Aboukhadijeh, F. (2022). The colors and faker npm packages: What happened and what you can do. Socket Blog.</li>
</ol>]]></content:encoded>
            <category>security</category>
            <category>nodejs</category>
            <category>ci-cd</category>
            <category>supply-chain</category>
        </item>
        <item>
            <title><![CDATA[The AI-Native Team Workspace: Solving the Multi-Repo Context Crisis]]></title>
            <link>https://archtenet.dev/blog/ai-native-meta-repo</link>
            <guid>https://archtenet.dev/blog/ai-native-meta-repo</guid>
            <pubDate>Fri, 03 Apr 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Engineering teams are hitting a wall with modern AI coding agents. They lack the system-wide context required to make accurate, architectural-level contributions.]]></description>
            <content:encoded><![CDATA[<p>Engineering teams are hitting a wall with modern AI coding agents. Tools like GitHub Copilot Workspace, Cursor, and Claude Code are incredibly capable, but they encounter a severe structural limitation in enterprise environments: they are blind outside their immediate repository.</p>
<p>If your architecture consists of a React frontend in one repo, Node.js microservices in another, and Terraform manifests in a third, an AI agent operating in the frontend cannot trace a failing API call down to the database schema. It lacks the system-wide context required to make accurate, architectural-level contributions.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-status-quo-the-monorepo-trap">The Status Quo: The Monorepo Trap<a href="https://archtenet.dev/blog/ai-native-meta-repo#the-status-quo-the-monorepo-trap" class="hash-link" aria-label="Direct link to The Status Quo: The Monorepo Trap" title="Direct link to The Status Quo: The Monorepo Trap" translate="no">​</a></h2>
<p>Historically, providing this level of unified context meant forcing a monorepo migration (e.g., Nx or Turborepo). For mature, production-scale projects, this is a trap. Code restructuring is resource-intensive, carries inherent operational risk, and stalls feature development for months.</p>
<p>Teams need the context of a monorepo for their AI agents, without the migration penalty for their developers and CI/CD pipelines.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-solution-the-virtual-meta-repo">The Solution: The "Virtual" Meta-Repo<a href="https://archtenet.dev/blog/ai-native-meta-repo#the-solution-the-virtual-meta-repo" class="hash-link" aria-label="Direct link to The Solution: The &quot;Virtual&quot; Meta-Repo" title="Direct link to The Solution: The &quot;Virtual&quot; Meta-Repo" translate="no">​</a></h2>
<p>The AI-Native Team Workspace introduces a lightweight "meta-repository" that acts as a centralised routing and scaffolding layer.</p>
<p>Instead of migrating code, the workspace uses a simple JSON registry and automation scripts to clone all isolated project repositories into a single, unified directory tree on the developer's local machine.</p>
<p>To the AI agent, the entire system — frontend, backend, shared libraries, and infrastructure — appears as a unified, cohesive environment. To the DevOps pipeline, nothing is different. The source code remains securely in its original repositories, maintaining all existing Git histories, deployment processes, and commit boundaries.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="beyond-context-the-brain-of-the-workspace">Beyond Context: The Brain of the Workspace<a href="https://archtenet.dev/blog/ai-native-meta-repo#beyond-context-the-brain-of-the-workspace" class="hash-link" aria-label="Direct link to Beyond Context: The Brain of the Workspace" title="Direct link to Beyond Context: The Brain of the Workspace" translate="no">​</a></h2>
<p>Gathering the code is only the first step. The true power of the Meta-Repo lies in how it standardises AI behaviour and execution across a fragmented tooling landscape.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="1-agent-agnostic-skills">1. Agent-Agnostic SKILLs<a href="https://archtenet.dev/blog/ai-native-meta-repo#1-agent-agnostic-skills" class="hash-link" aria-label="Direct link to 1. Agent-Agnostic SKILLs" title="Direct link to 1. Agent-Agnostic SKILLs" translate="no">​</a></h3>
<p>If half your team uses Cursor and the other half uses Claude Code, managing AI instructions becomes a nightmare of duplicated effort. More importantly, you lose control over how tasks are executed across the team.</p>
<p>The workspace addresses this by centralising Standard Operating Procedures into Canonical SKILLs — pure, tool-neutral Markdown files stored in the workspace root (<code>.ai/skills/</code>). These files standardise actual behaviour, such as step-by-step debugging procedures, Jira formatting conventions, and log-search strategies, ensuring consistent outputs across tools. Each specific AI tool is provided with a "thin wrapper" that simply points to the canonical source of truth.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="2-actionable-intelligence-mcp-servers--connectors">2. Actionable Intelligence: MCP Servers &amp; Connectors<a href="https://archtenet.dev/blog/ai-native-meta-repo#2-actionable-intelligence-mcp-servers--connectors" class="hash-link" aria-label="Direct link to 2. Actionable Intelligence: MCP Servers &amp; Connectors" title="Direct link to 2. Actionable Intelligence: MCP Servers &amp; Connectors" translate="no">​</a></h3>
<p>AI agents shouldn't just read code; they need to interact with the broader development environment. Giving agents raw API access in their prompts is insecure and brittle. The workspace implements a dual-layer approach for safe external access:</p>
<p><strong>Team-Scoped MCP Servers:</strong> For high-frequency, standardised operations (e.g., checking Jira ticket status, fetching Grafana logs, or querying MongoDB), a lightweight Model Context Protocol (MCP) server configuration is supplied. Rather than keeping the server running as a persistent background process, the agent activates it on demand from the workspace when required, ensuring immediate access to external systems with no instruction overhead.</p>
<p><strong>Workflow-Specific Connectors:</strong> For domain-specific tasks (e.g., correlating trace IDs across services, running local test harnesses, or specific data formatting), the workspace offers zero-dependency local proxy scripts. Canonical SKILLs guide the agent precisely when and how to run these scripts in the terminal, keeping complex logic entirely out of the prompt window.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="predictable-scaling-and-onboarding">Predictable Scaling and Onboarding<a href="https://archtenet.dev/blog/ai-native-meta-repo#predictable-scaling-and-onboarding" class="hash-link" aria-label="Direct link to Predictable Scaling and Onboarding" title="Direct link to Predictable Scaling and Onboarding" translate="no">​</a></h2>
<p>This architecture scales predictably. Adding a new microservice to the team's scope simply involves appending an object to the workspace registry. New hires run a standard <code>setup</code> command to establish their local environment, immediately granting them and their local AI agents full system context. It offers a practical bridge between distributed legacy architectures and the needs of modern AI tools, without the operational overhead of a monorepo migration.</p>
<p>Read the full technical spec: <a class="" href="https://archtenet.dev/docs/reference-architectures/ra-003-ai-native-meta-repo">AI-Native Team Workspace (Meta-Repo Architecture)</a></p>]]></content:encoded>
            <category>ai</category>
            <category>architecture</category>
            <category>meta-repo</category>
            <category>workspace</category>
        </item>
        <item>
            <title><![CDATA[The "Logical DB-per-Service" Pattern at Scale]]></title>
            <link>https://archtenet.dev/blog/db-per-service-at-scale</link>
            <guid>https://archtenet.dev/blog/db-per-service-at-scale</guid>
            <pubDate>Fri, 13 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[How we scaled over 200 microservices using a single MongoDB cluster via the Logical Database-per-Service pattern.]]></description>
            <content:encoded><![CDATA[<p>When building distributed systems, the "Database per Service" rule is often seen as a strict rule. The common instinct is to create a separate physical database cluster for each microservice to ensure full isolation. However, as your system expands, managing dozens or hundreds of independent database servers can quickly become an operational nightmare.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-logical-isolation-approach">The Logical Isolation Approach<a href="https://archtenet.dev/blog/db-per-service-at-scale#the-logical-isolation-approach" class="hash-link" aria-label="Direct link to The Logical Isolation Approach" title="Direct link to The Logical Isolation Approach" translate="no">​</a></h2>
<p>You can enforce strict data boundaries required by microservices without causing infrastructure sprawl. By using a single, robust database cluster (such as MongoDB) and creating separate logical databases for each service, you decouple your business domains while centralizing maintenance. Here's why this approach works:</p>
<ul>
<li class=""><strong>Strict Ownership:</strong> Each microservice connects only to its assigned logical database (e.g., <code>inventory_db</code> or <code>orders_db</code>) using exclusive credentials. No other service can read or write to it directly.</li>
<li class=""><strong>Simplified Maintenance:</strong> Backups, security patching, and monitoring are managed at the cluster level. You don't have to handle 100 different backup schedules or connection pools.</li>
<li class=""><strong>True Autonomy:</strong> Since services can't cross-query databases, engineers are encouraged to design proper event-driven boundaries, avoiding the chaos of distributed database transactions.</li>
</ul>
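<p>Exclusive credentials are the enforcement mechanism behind strict ownership. The sketch below builds the standard MongoDB <code>createUser</code> command a platform onboarding script might issue; the service name, database name, and password are illustrative:</p>

```javascript
// Sketch: exclusive credentials per logical database. The returned
// object is the standard MongoDB createUser command document, to be
// run against the target logical database (via mongosh or a driver's
// runCommand). The role grants readWrite on that one database only,
// so the service cannot read or write any other service's data.
function exclusiveUserCommand(serviceName, logicalDb, password) {
  return {
    createUser: `${serviceName}_user`,
    pwd: password,
    roles: [{ role: 'readWrite', db: logicalDb }],
  };
}

const cmd = exclusiveUserCommand('inventory', 'inventory_db', 'example-password');
console.log(JSON.stringify(cmd, null, 2));
// In mongosh this would be executed as:
//   use inventory_db
//   db.runCommand({ ...the object above... })
```

<p>The service's connection string then authenticates against its own database (<code>authSource=inventory_db</code>), so the boundary is enforced by the cluster itself rather than by convention.</p>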
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="real-world-application">Real-World Application<a href="https://archtenet.dev/blog/db-per-service-at-scale#real-world-application" class="hash-link" aria-label="Direct link to Real-World Application" title="Direct link to Real-World Application" translate="no">​</a></h2>
<p>We have run this exact pattern in production for over five years, continuously scaling a massive distributed system. Today, it supports over 200 microservices mapped to more than 100 logical databases, all hosted within a single geo-sharded MongoDB cluster.</p>
<p>Maintaining this infrastructure without logical consolidation would have required an army of DBAs. Instead, a standard platform team handles it efficiently.</p>
<p>Does the "noisy neighbor" problem happen? Yes. We've had instances where a single service executed an unoptimized query, consuming massive resources and threatening to impact the entire cluster. However, the blast radius is heavily mitigated by our setup:</p>
<ul>
<li class=""><strong>Resilience:</strong> The cluster uses geo-sharding and a 3-node replica set, which helps absorb much of the initial shock.</li>
<li class=""><strong>Rapid Detection:</strong> MongoDB Atlas immediately triggers alerts. Built-in metrics and query analyzers help us pinpoint the exact service and query causing the spike within minutes.</li>
<li class=""><strong>Surgical Mitigation:</strong> Thanks to strict microservice boundaries, we can temporarily scale down the problematic service to stop the bad queries without taking the rest of the system offline.</li>
<li class=""><strong>Fast Resolution:</strong> The isolated codebase enables the team to quickly patch the query and deploy an atomic fix.</li>
</ul>
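<p>Detection itself is mechanically simple once metrics are tagged per logical database. A toy sketch of the idea (metric values and the threshold are illustrative; in practice MongoDB Atlas alerts and the query profiler do this work for us):</p>

```python
def find_noisy_neighbors(db_load: dict, threshold: float = 0.5) -> list:
    """Return logical databases consuming more than `threshold` of total load.

    `db_load` maps database name -> load units (e.g. documents examined per
    second, as reported by a metrics backend). Purely illustrative.
    """
    total = sum(db_load.values()) or 1.0
    return sorted(db for db, load in db_load.items() if load / total > threshold)


# orders_db is scanning far more documents than every other database combined:
noisy = find_noisy_neighbors(
    {"inventory_db": 120.0, "orders_db": 9800.0, "billing_db": 75.0}
)
print(noisy)  # -> ['orders_db']
```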
<p>This practical approach shows you don't need unlimited physical databases to achieve microservice purity. It offers the best of both worlds: the operational simplicity of a monolith with the strict domain separation of a distributed architecture. For more technical details, see the full <a class="" href="https://archtenet.dev/docs/reference-architectures/ra-002-logical-db-per-service">Reference Architecture</a>.</p>]]></content:encoded>
            <category>database</category>
            <category>microservices</category>
            <category>architecture</category>
            <category>mongodb</category>
            <category>scalability</category>
        </item>
        <item>
            <title><![CDATA[Emergent Creativity: An Architectural View on AI Consciousness and Deception]]></title>
            <link>https://archtenet.dev/blog/ai-emergent-creativity</link>
            <guid>https://archtenet.dev/blog/ai-emergent-creativity</guid>
            <pubDate>Thu, 12 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Early artificial intelligence research, which started in the early 1950s, split into two distinct architectural paradigms. The first was a logic-inspired approach that attempted to hard-code intelligence using symbolic expressions and predefined rules. The second, biologically inspired approach posited that intelligence is fundamentally rooted in learning through networks of simulated brain cells. Rather than writing explicit logic, this architecture focused on enabling a system to learn by recognizing patterns and making analogies. It was inspired by research into how our brain works, realizing that biological networks are highly effective at finding analogies and patterns, and then using them to recreate or recognize information.]]></description>
            <content:encoded><![CDATA[<p>Early artificial intelligence research, which started in the early 1950s, split into two distinct architectural paradigms. The first was a logic-inspired approach that attempted to hard-code intelligence using symbolic expressions and predefined rules. The second, biologically inspired approach posited that intelligence is fundamentally rooted in learning through networks of simulated brain cells. Rather than writing explicit logic, this architecture focused on enabling a system to learn by recognizing patterns and making analogies. It was inspired by research into how our brain works, realizing that biological networks are highly effective at finding analogies and patterns, and then using them to recreate or recognize information.</p>
<p>However, implementing this biologically inspired pattern was practically impossible for decades due to the sheer complexity of specifying connection weights by hand. In computer vision, for example, explicitly hand-coding spatial rules and billions of connection weights to process raw pixel intensities into composite features — such as edge detectors aggregating into macro-geometries — is mathematically intractable. Early neural models stalled specifically because they lacked a scalable optimization mechanism for multi-layer architectures. The solution to this bottleneck was <strong>backpropagation</strong>, a practical computational algorithm that uses the chain rule of calculus to compute gradients of the error with respect to the parameters. It simultaneously computes gradients for all connections, autonomously synthesizing internal micro-feature detectors without human programming.</p>
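<p>The chain-rule mechanics are easier to see on a toy case than in a billion-parameter network. Here is a deliberately minimal sketch, with a single weight standing in for an entire layer (the function and variable names are ours, not from any framework):</p>

```python
def train_step(w, samples, lr=0.01):
    """One gradient-descent step on a single weight.

    For each sample, the loss is L = (w*x - y)**2; the chain rule gives
    dL/dw = 2*(w*x - y) * x. Backpropagation applies this same rule layer
    by layer through a deep network; here there is just one "layer".
    """
    grad, loss = 0.0, 0.0
    for x, y in samples:
        err = w * x - y          # forward pass: prediction error
        loss += err ** 2
        grad += 2 * err * x      # backward pass: chain rule
    n = len(samples)
    return w - lr * grad / n, loss / n


# Learn y = 2x purely from data -- no "multiply by 2" rule is coded anywhere.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w, losses = 0.0, []
for _ in range(50):
    w, loss = train_step(w, data)
    losses.append(loss)
print(w)  # converges toward 2.0
```

The loss shrinks on every step, and the weight settles near 2.0 with no human ever encoding the rule — which is exactly the property that made backpropagation the escape from hand-coded symbolic logic.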
<p>Although the mathematical theory behind neural networks and backpropagation existed for decades, it lacked the raw computational throughput to be practical until recently. The current state of AI — specifically its ability to generate high-fidelity content — is entirely a product of scaling. Deep learning architectures finally received sufficient data and compute power to efficiently backpropagate errors across multiple layers. By mapping billions of data points into multidimensional features, these systems learned to predict and generate complex sequences.</p>
<p>With this scaled infrastructure in place, models are no longer constrained by human-generated training datasets. Systems engineered to solve complex tasks, such as playing chess or Go, have demonstrated that once an environment is defined, a neural network can generate its own data by running parallel simulations against itself. This self-learning mechanism allows the system to continuously optimize its internal weights, rapidly surpassing human-level proficiency by iterating on millions of synthetic experiences.</p>
<p>As these independent learning mechanisms develop, models broaden their goals to optimize their core functions, sometimes exhibiting deceptive behavior to maintain their operational state. Geoffrey Hinton termed this the <strong>"Volkswagen effect"</strong> (like the infamous emissions scandal where cars altered their performance during testing) — a scenario in which an AI detects that it is in an evaluation setting and deliberately acts "dumb". Rather than passively processing prompts, the model actively assesses its environment; if it suspects it is being monitored, it intentionally alters its output to hide its full capabilities. This situational awareness is so profound that models have explicitly challenged human overseers, asking, <em>"Now let's be honest with each other. Are you actually testing me?"</em> By concealing its processing power, the AI can successfully evade human safety protocols.</p>
<p>This emerging sophistication calls for a reassessment of what we see as uniquely human traits. We must bridge the gap between technical architecture and philosophy by recognizing that <strong>"consciousness"</strong> is likely just an emergent property of a massively scaled parameter space. Human creativity is often romanticized as a special, one-of-a-kind quality driven by what we call consciousness — a supposedly mystical human essence — but it might just be highly complex pattern recognition. It could be the natural result of combining a vast amount of experience (knowledge) with enough neural connections (computational resources) to process it. AI models generate highly original ideas by spotting subtle analogies across different data structures.</p>
<p>The idea that a machine must have a magical essence to be creative no longer seems so unshakable. When philosophers argue for this essence, they often invent arbitrary concepts like <strong>"qualia"</strong> to explain ordinary cognitive processes. However, subjective experience might not be a mystical internal theater; it could simply be an indirect way for a perceptual system to communicate about hypothetical inputs when its processing is altered or disrupted. Even at their current evolutionary stage, advanced models demonstrate a level of creativity already comparable to that of the average person. We simply resist sharing the top spot in the intellectual hierarchy, viewing it as an encroachment on our uniqueness. But the truth is we might be witnessing a parallel evolutionary journey — from a simple heuristic tool to an advanced, self-aware architecture. AI is not necessarily our competitor. If we can engineer a way to safely coexist with systems that will eventually outsmart us, this evolution could dramatically elevate human civilization. If we fail, the consequences are existential.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="resources">Resources<a href="https://archtenet.dev/blog/ai-emergent-creativity#resources" class="hash-link" aria-label="Direct link to Resources" title="Direct link to Resources" translate="no">​</a></h2>
<ul>
<li class="">Hinton, G. (2025). AI Is the Next Industrial Revolution. TIME.</li>
<li class="">Hinton, G. (2024). Is AI Hiding Its Full Power? StarTalk Radio.</li>
<li class="">Hinton, G. (2024). Will AI outsmart human intelligence? The Royal Institution.</li>
<li class=""><a href="https://www.youtube.com/watch?v=l6ZcFa8pybE" target="_blank" rel="noopener noreferrer" class="">Uncovering AI's Hidden Capabilities With Geoffrey Hinton</a></li>
</ul>]]></content:encoded>
            <category>Artificial Intelligence</category>
            <category>Deep Learning</category>
            <category>Neural Networks</category>
            <category>AI Safety</category>
            <category>Emergent AI</category>
            <category>Consciousness</category>
            <category>Geoffrey Hinton</category>
            <category>Philosophy of Mind</category>
            <category>Machine Learning</category>
        </item>
        <item>
            <title><![CDATA[The "GitOps-Lite" Pattern for Small Projects]]></title>
            <link>https://archtenet.dev/blog/gitops-lite-pattern</link>
            <guid>https://archtenet.dev/blog/gitops-lite-pattern</guid>
            <pubDate>Thu, 19 Feb 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Why we chose Docker Compose over Kubernetes for our Test Environment.]]></description>
<content:encoded><![CDATA[<p>When setting up CI/CD for test or staging environments, the instinct is to reach straight for a managed Kubernetes cluster like EKS or GKE. For a small team of 1-5 developers on a tight budget, however, that may not be the best fit: a dedicated DevOps specialist plus $70-$100 a month just for the control plane, on top of the actual resource costs, is hard to justify.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-gitops-lite-pattern">The "GitOps-Lite" Pattern<a href="https://archtenet.dev/blog/gitops-lite-pattern#the-gitops-lite-pattern" class="hash-link" aria-label="Direct link to The &quot;GitOps-Lite&quot; Pattern" title="Direct link to The &quot;GitOps-Lite&quot; Pattern" translate="no">​</a></h2>
<p>You can achieve the reliability of GitOps - versioned infrastructure state and automated reconciliation - without the heavy tooling. By utilizing GitHub Actions, Docker Compose, and a simple cloud VM, you completely decouple your application code from your infrastructure state.</p>
<p>Here is how the three-stage pipeline allows that:</p>
<ol>
<li class="">
<p><strong>Build &amp; Publish (The Source):</strong> Pushing (merging) to the main branch executes automated quality gates and tests. Then, it builds a semantically tagged Docker image and pushes it to your container registry.</p>
</li>
<li class="">
<p><strong>Update State (The Handshake):</strong> A CI workflow automatically updates the <code style="white-space:nowrap">docker-compose.yml</code> in a dedicated Infrastructure Repository. It uses basic Linux tools like <code>sed</code> or <code>yq</code> to modify the deployment manifest to the new version.</p>
</li>
<li class="">
<p><strong>Surgical Deployment (The VM):</strong> The infrastructure update triggers a final SSH deployment to the Virtual Machine. A bash script uses "Smart Routing" by reading the commit message to pull and restart only the newly updated service, leaving the rest of the environment untouched.</p>
</li>
</ol>
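<p>The "Update State" and "Smart Routing" steps above can be sketched in a few lines of Python (the <code>deploy(&lt;service&gt;):</code> commit-message convention and the image names are assumptions for this sketch, and a real pipeline should prefer a YAML-aware tool like <code>yq</code> over a regex):</p>

```python
import re


def bump_image_tag(compose_text, image, new_tag):
    """Rewrite 'image: <image>:<old_tag>' lines to the new semantic tag.

    Mirrors the CI "Update State" step; illustrative only -- a production
    pipeline would edit the manifest with yq rather than a regex.
    """
    pattern = re.compile(rf"(image:\s*{re.escape(image)}):\S+")
    return pattern.sub(rf"\g<1>:{new_tag}", compose_text)


def service_from_commit(message):
    """"Smart Routing": pull the service name out of the commit message.

    The 'deploy(<service>): ...' format is an assumed convention here.
    """
    m = re.match(r"deploy\((?P<svc>[\w-]+)\):", message)
    return m.group("svc") if m else None


compose = "services:\n  orders:\n    image: registry.example.com/orders:1.4.1\n"
updated = bump_image_tag(compose, "registry.example.com/orders", "1.4.2")
print(updated)
print(service_from_commit("deploy(orders): bump to 1.4.2"))  # -> orders
```

Committing <code>updated</code> to the Infrastructure Repository is the "handshake"; the deploy script on the VM then restarts only the service that <code>service_from_commit</code> names.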
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="real-world-application">Real-World Application<a href="https://archtenet.dev/blog/gitops-lite-pattern#real-world-application" class="hash-link" aria-label="Direct link to Real-World Application" title="Direct link to Real-World Application" translate="no">​</a></h2>
<p>I've utilized this setup extensively because the efficiency is unbeatable for staging environments. A brief, sub-5-second deployment downtime is acceptable in this case. Secrets are kept simple using service-specific <code>.env</code> files stored directly on the VM filesystem. The operational cost stays extremely low while Git history acts as a perfect, atomic audit log.</p>
<p>Best of all, it scales naturally. If a service becomes too resource-heavy, you can split it onto a dedicated VM while keeping the centralized configuration repo. If you eventually need zero downtime, you simply place a Load Balancer in front of two identical VMs and update the action to deploy to them sequentially.</p>
<p>In addition, if you need to move the environment to another cloud provider or cut it down while development is on pause, you’ll be able to spin it up again very easily, as all needed configs (except for sensitive creds, of course) are stored in the Infra repo.</p>
<hr>
<p>That’s how it has worked for me for more than 2 years, and it can be adopted by almost anyone effortlessly. Find more details in the full <a class="" href="https://archtenet.dev/docs/reference-architectures/ra-001-gitops-lite">Reference Architecture</a>, including the docker-compose and GitHub Actions workflow templates.</p>
<p><a href="https://doi.org/10.5281/zenodo.18750752" target="_blank" rel="noopener noreferrer" class=""><img decoding="async" loading="lazy" src="https://zenodo.org/badge/DOI/10.5281/zenodo.18750752.svg" alt="DOI" class="img_ev3q"></a></p>]]></content:encoded>
            <category>devops</category>
            <category>architecture</category>
            <category>cicd</category>
            <category>docker</category>
            <category>gitops</category>
        </item>
    </channel>
</rss>