<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>The ArchTenet Engineering Playbook Blog</title>
        <link>https://archtenet.dev/blog</link>
        <description>The ArchTenet Engineering Playbook Blog</description>
        <lastBuildDate>Wed, 08 Apr 2026 00:00:00 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <language>en</language>
        <item>
            <title><![CDATA[The Token Tax — How Naive M2M Authentication Quietly Drains Your Cloud Budget]]></title>
            <link>https://archtenet.dev/blog/token-tax-m2m-auth</link>
            <guid>https://archtenet.dev/blog/token-tax-m2m-auth</guid>
            <pubDate>Wed, 08 Apr 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[How a naive M2M authentication pattern silently multiplies cloud identity costs — and the lightweight token proxy architecture that cuts operations by 99%.]]></description>
            <content:encoded><![CDATA[<p>The cloud stopped being just infrastructure a long time ago. It's an economic model where every engineering decision has a direct impact on business costs. And that's exactly where the line between "works fine" and "designed well" is drawn.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="managed-services-what-youre-actually-buying">Managed Services: What You're Actually Buying<a href="https://archtenet.dev/blog/token-tax-m2m-auth#managed-services-what-youre-actually-buying" class="hash-link" aria-label="Direct link to Managed Services: What You're Actually Buying" title="Direct link to Managed Services: What You're Actually Buying" translate="no">​</a></h2>
<p>Cloud platforms take an enormous amount off your team's plate: hardware management, baseline security, fault tolerance, scaling. Managed Kubernetes — Amazon EKS, Google Kubernetes Engine, or Azure Kubernetes Service — lets you spin up a production-ready cluster in a matter of hours. But here's the thing: you haven't eliminated complexity. You've moved it up a level.</p>
<p>The hybrid model — taking managed Kubernetes and deploying your own microservices inside it — is a reasonable trade-off. You don't own servers, you get multi-region deployment, and you pay only for what you use. But if architectural thinking doesn't show up at that higher level, costs start growing faster than load does.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="case-study-identity-in-a-microservice-architecture">Case Study: Identity in a Microservice Architecture<a href="https://archtenet.dev/blog/token-tax-m2m-auth#case-study-identity-in-a-microservice-architecture" class="hash-link" aria-label="Direct link to Case Study: Identity in a Microservice Architecture" title="Direct link to Case Study: Identity in a Microservice Architecture" translate="no">​</a></h2>
<p>One of the most instructive examples is identity management. Using managed identity providers — Amazon Cognito, Auth0, Microsoft Entra External ID, or Okta — is an entirely rational choice. Building your own enterprise-grade authentication and authorization system from scratch means years of investment, continuous security audits, and a high risk of getting it wrong. Delegating that responsibility to the cloud reduces risk and accelerates time-to-market.</p>
<p>The problem doesn't come from choosing the service. It comes from how that service is used inside the architecture.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-token-multiplication-effect">The Token Multiplication Effect<a href="https://archtenet.dev/blog/token-tax-m2m-auth#the-token-multiplication-effect" class="hash-link" aria-label="Direct link to The Token Multiplication Effect" title="Direct link to The Token Multiplication Effect" translate="no">​</a></h3>
<p>In most cloud identity solutions, billing correlates with the number of operations: authentications, token issuances, or monthly active users. In a monolithic system, this is barely noticeable. In a microservice architecture, a multiplicative effect kicks in that is rarely accounted for at design time.</p>
<p>A typical scenario: an incoming user request passes through an API gateway, then touches a chain of 10–20 services. Each of them, following zero-trust principles, requests an access token for its downstream call via the OAuth 2.0 Client Credentials flow. With a naive implementation, every service goes to the identity provider for a fresh token. A single user request generates dozens of operations. As load grows, this becomes an avalanche of token issuances with no connection to the actual business value of the request.</p>
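<p>A minimal Node.js sketch of the naive pattern makes the multiplication effect countable. The IdP exchange is stubbed out here, and all service and client names are placeholders:</p>

```javascript
// Naive M2M pattern: every hop in the service chain buys its own token.
// fetchToken stands in for a real OAuth 2.0 Client Credentials exchange
// (a POST to the IdP's token endpoint); it is stubbed so the
// multiplication effect can be counted.
let idpCalls = 0;

function fetchToken(clientId, scope) {
  idpCalls += 1; // in production: one billable IdP operation
  return { access_token: `tok-${idpCalls}`, expires_in: 3600 };
}

function callDownstream(service) {
  const { access_token } = fetchToken('svc-gateway', `${service}:invoke`);
  // ...the real call would send `Authorization: Bearer ${access_token}`
}

// One incoming user request touching a chain of 15 services:
for (let i = 1; i <= 15; i++) callDownstream(`service-${i}`);

console.log(`IdP operations for a single user request: ${idpCalls}`); // 15
```

<p>One request, fifteen billable identity operations, and none of them visible in the business logic.</p>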
<p><strong>Without a proxy</strong>, every service fetches its own token from the IdP. <strong>With a token proxy</strong>, the IdP is called once per TTL, and every other request is a cache hit.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="putting-a-price-on-it">Putting a Price on It<a href="https://archtenet.dev/blog/token-tax-m2m-auth#putting-a-price-on-it" class="hash-link" aria-label="Direct link to Putting a Price on It" title="Direct link to Putting a Price on It" translate="no">​</a></h2>
<p>Even a relatively modest system puts meaningful pressure on the identity provider.</p>
<p><strong>Load model (example):</strong></p>
<p><em>User traffic:</em></p>
<ul>
<li class="">500 users/day × 5 actions × ~15 services per request → <strong>37,500 token ops/day</strong></li>
</ul>
<p><em>Background processes:</em></p>
<ul>
<li class="">3 syncs/day, each processing 50,000 records in batches of 100 → <strong>500 batches per sync</strong></li>
<li class="">Each batch passes through ~15 microservices → 3 × 500 × 15 = <strong>22,500 token ops/day</strong></li>
</ul>
<p><strong>Total: ~60,000 ops/day → ~1,800,000 ops/month</strong></p>
<p>User traffic accounts for <strong>62% of the load</strong>, while 3 background syncs account for the remaining 38%. This is a conservative model. Real systems with dozens of sync types or higher-frequency processes run 5–10× higher — and the bill scales linearly.</p>
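<p>The load model above is easy to reproduce as a quick sanity check, assuming a 30-day month:</p>

```javascript
// User traffic: 500 users/day x 5 actions x ~15 services per request
const userOps = 500 * 5 * 15;                  // 37,500 token ops/day

// Background processes: 3 syncs/day, 50,000 records in batches of 100,
// each batch passing through ~15 microservices
const batchesPerSync = 50_000 / 100;           // 500 batches per sync
const backgroundOps = 3 * batchesPerSync * 15; // 22,500 token ops/day

const dailyTotal = userOps + backgroundOps;    // 60,000
const monthlyTotal = dailyTotal * 30;          // 1,800,000

console.log(`user share: ${(100 * userOps / dailyTotal).toFixed(1)}%`);             // 62.5%
console.log(`background share: ${(100 * backgroundOps / dailyTotal).toFixed(1)}%`); // 37.5%
```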
<p><strong>Cost at ~1,800,000 token operations/month</strong> <em>(rates as of April 2026)</em>:</p>
<table><thead><tr><th>Provider</th><th>Rate</th><th>Monthly cost</th></tr></thead><tbody><tr><td>Amazon Cognito</td><td>$0.00225 / request</td><td><strong>~$4,050</strong></td></tr><tr><td>Microsoft Entra External ID</td><td>$0.001 / token</td><td><strong>~$1,800</strong></td></tr><tr><td>Auth0 Professional</td><td>$240/mo base + token add-ons</td><td><strong>Enterprise tier</strong></td></tr></tbody></table>
<p>In every case, the same rule applies: <strong>your costs become proportional not to your users, but to the number of internal service calls.</strong></p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-evolution-of-solutions-from-naive-to-correct">The Evolution of Solutions: From Naive to Correct<a href="https://archtenet.dev/blog/token-tax-m2m-auth#the-evolution-of-solutions-from-naive-to-correct" class="hash-link" aria-label="Direct link to The Evolution of Solutions: From Naive to Correct" title="Direct link to The Evolution of Solutions: From Naive to Correct" translate="no">​</a></h2>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="level-1-client-side-caching">Level 1: Client-side caching<a href="https://archtenet.dev/blog/token-tax-m2m-auth#level-1-client-side-caching" class="hash-link" aria-label="Direct link to Level 1: Client-side caching" title="Direct link to Level 1: Client-side caching" translate="no">​</a></h3>
<p>The instinctive fix is to add caching at the HTTP client layer inside each service. This does reduce requests, but the effect is bounded by the process boundary. In Kubernetes, every pod holds its own cache. As you scale horizontally, the number of caches grows linearly — a significant fraction of requests to the identity provider persists. This kind of cache is also hard to observe and gives you no centralized token management policy.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="level-2-centralized-cache-redis">Level 2: Centralized cache (Redis)<a href="https://archtenet.dev/blog/token-tax-m2m-auth#level-2-centralized-cache-redis" class="hash-link" aria-label="Direct link to Level 2: Centralized cache (Redis)" title="Direct link to Level 2: Centralized cache (Redis)" translate="no">​</a></h3>
<p>Moving the cache to Redis looks like the logical next step. But a less obvious problem emerges here: the security model. An access token is a bearer credential — whoever holds it can use it. If you store tokens keyed by <code>client_id</code> or <code>scope</code>, any service with Redis access can potentially retrieve a token it was never meant to have. Adding ACLs at the Redis level partially addresses this, but it introduces complexity quickly and erodes system transparency.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="level-3-token-proxy-service">Level 3: Token proxy service<a href="https://archtenet.dev/blog/token-tax-m2m-auth#level-3-token-proxy-service" class="hash-link" aria-label="Direct link to Level 3: Token proxy service" title="Direct link to Level 3: Token proxy service" translate="no">​</a></h3>
<p>The architecturally sound solution is to introduce a dedicated layer responsible for token lifecycle management. A lightweight proxy service deployed inside the cluster becomes the single controlled entry point for obtaining access tokens. It encapsulates all interaction with the identity provider, implements caching, and — critically — owns the access model.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="token-proxy-architecture-technical-details">Token Proxy Architecture: Technical Details<a href="https://archtenet.dev/blog/token-tax-m2m-auth#token-proxy-architecture-technical-details" class="hash-link" aria-label="Direct link to Token Proxy Architecture: Technical Details" title="Direct link to Token Proxy Architecture: Technical Details" translate="no">​</a></h2>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="token-addressing">Token addressing<a href="https://archtenet.dev/blog/token-tax-m2m-auth#token-addressing" class="hash-link" aria-label="Direct link to Token addressing" title="Direct link to Token addressing" translate="no">​</a></h3>
<p>Instead of storing tokens against plain identifiers, the proxy uses a derived cache key formed by computing <strong>HMAC-SHA256</strong> over the combination of <code>client_id</code>, <code>client_secret</code>, and <code>scope</code>, using an internal service secret. HMAC is deterministic — the same inputs always produce the same key, which is necessary for correct cache lookups — and irreversible without knowledge of the secret. Even if memory is dumped or leaked, the original credentials cannot be recovered. The service itself never handles secrets in plaintext outside the moment of the actual request to the identity provider.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="in-memory-storage">In-memory storage<a href="https://archtenet.dev/blog/token-tax-m2m-auth#in-memory-storage" class="hash-link" aria-label="Direct link to In-memory storage" title="Direct link to In-memory storage" translate="no">​</a></h3>
<p>Tokens are stored in memory, not in an external store. This eliminates network latency on every lookup and reduces the attack surface. The cache TTL is set slightly below the token's actual expiry to guarantee proactive refresh before the token becomes invalid.</p>
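<p>The refresh-before-expiry behaviour can be sketched as a small in-memory cache. This is a simplified illustration: the 60-second skew and the issuer stub are assumptions, and production code would add per-key request coalescing.</p>

```javascript
// key -> { token, expiresAt }; expiresAt is pulled in by SKEW_SECONDS so
// the cache refreshes before the token itself becomes invalid.
const SKEW_SECONDS = 60; // assumed safety margin
const cache = new Map();

function getToken(key, issue) {
  const hit = cache.get(key);
  if (hit && Date.now() < hit.expiresAt) return hit.token; // cache hit: no IdP call

  const { access_token, expires_in } = issue(); // the only path that touches the IdP
  cache.set(key, {
    token: access_token,
    expiresAt: Date.now() + (expires_in - SKEW_SECONDS) * 1000,
  });
  return access_token;
}

// Two back-to-back lookups for the same key hit the IdP exactly once.
let idpHits = 0;
const issue = () => { idpHits += 1; return { access_token: 't1', expires_in: 3600 }; };
getToken('svc-orders/inventory:read', issue);
getToken('svc-orders/inventory:read', issue);
console.log(idpHits); // 1
```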
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="cluster-level-security">Cluster-level security<a href="https://archtenet.dev/blog/token-tax-m2m-auth#cluster-level-security" class="hash-link" aria-label="Direct link to Cluster-level security" title="Direct link to Cluster-level security" translate="no">​</a></h3>
<p>The service is deployed inside Kubernetes with no external ingress, restricted egress rules, and access limited to internal service accounts. Communication is additionally secured via mTLS, eliminating the risk of unauthorized access even within the cluster. The service is physically unreachable from outside the cluster perimeter.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="high-availability">High availability<a href="https://archtenet.dev/blog/token-tax-m2m-auth#high-availability" class="hash-link" aria-label="Direct link to High availability" title="Direct link to High availability" translate="no">​</a></h3>
<p>Since the proxy is a critical dependency for every service in the cluster, it is deployed as 2–3 replicas. Each replica holds its own in-memory cache and independently fetches from the identity provider on a cache miss. A short warm-up period after a replica restart is an acceptable trade-off for eliminating a single point of failure.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-economic-impact">The Economic Impact<a href="https://archtenet.dev/blog/token-tax-m2m-auth#the-economic-impact" class="hash-link" aria-label="Direct link to The Economic Impact" title="Direct link to The Economic Impact" translate="no">​</a></h2>
<p>The fundamental model shift: a token is requested <strong>once per TTL</strong>, then reused by all services authorized to receive it. The number of operations against the identity provider becomes proportional not to the number of internal calls, but to the number of <strong>unique client/scope combinations</strong>.</p>
<p>In the system described above, this means going from 1.8 million operations per month to ~14,400 — <strong>a 99.2% reduction</strong>.</p>
<p><strong>The proxy service itself costs almost nothing</strong>: a few dozen megabytes of RAM per replica, minimal CPU. At typical cloud pricing, that's a few dollars a month across all replicas.</p>
<p>Assuming 20 unique client/scope combinations and 1-hour token TTL:</p>
<ul>
<li class="">Refreshes per day: 20 × 24 = 480</li>
<li class="">Per month: <strong>~14,400 operations</strong> instead of 1,800,000</li>
</ul>
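<p>The same arithmetic in code, assuming a 30-day month:</p>

```javascript
const combos = 20;                       // unique client/scope combinations
const ttlHours = 1;                      // token TTL
const perDay = combos * (24 / ttlHours); // 480 refreshes/day
const perMonth = perDay * 30;            // 14,400

const before = 1_800_000;
const reductionPct = 100 * (1 - perMonth / before);
console.log(perDay, perMonth, `${reductionPct.toFixed(1)}% reduction`);
```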
<table><thead><tr><th></th><th>Without optimization</th><th>With token proxy</th></tr></thead><tbody><tr><td>Token operations/month</td><td>~1,800,000</td><td>~14,400</td></tr><tr><td><strong>Cognito</strong></td><td><strong>~$4,050/mo</strong></td><td><strong>~$32/mo</strong></td></tr><tr><td><strong>Entra External ID</strong></td><td><strong>~$1,800/mo</strong></td><td><strong>~$14/mo</strong></td></tr><tr><td>Auth0</td><td>Enterprise tier</td><td>Self-service plan</td></tr><tr><td>Proxy cost</td><td>—</td><td>~$2–5/mo</td></tr><tr><td><strong>Ops reduction</strong></td><td></td><td><strong>99.2%</strong></td></tr></tbody></table>
<p>In absolute terms: roughly <strong>$21,000–$48,000 per year</strong> — depending on provider — eliminated by a single lightweight service. As the system scales — more syncs, more services — costs grow linearly. The proxy cost does not.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="conclusion">Conclusion<a href="https://archtenet.dev/blog/token-tax-m2m-auth#conclusion" class="hash-link" aria-label="Direct link to Conclusion" title="Direct link to Conclusion" translate="no">​</a></h2>
<p>This is exactly the kind of decision that separates using the cloud as a tool from using it as magic. The cloud genuinely lets you stop thinking about a huge range of low-level concerns. But it doesn't remove the need to think at the architecture level. If anything, it amplifies the consequences of architectural mistakes — because every inefficiency converts to money immediately.</p>
<p>Good cloud architecture isn't about rejecting managed services. It's about understanding their internal billing models and deliberately managing the points where those models start to conflict with your system.</p>
<p>Otherwise, you inevitably arrive at a situation where you're paying not for business growth, but for a lack of control over your own decisions.</p>]]></content:encoded>
            <category>cloud</category>
            <category>architecture</category>
            <category>authentication</category>
            <category>cost-optimization</category>
            <category>microservices</category>
        </item>
        <item>
            <title><![CDATA[The Silent Exfiltration — Why Your CI Pipeline Is an Open Vault]]></title>
            <link>https://archtenet.dev/blog/silent-exfiltration</link>
            <guid>https://archtenet.dev/blog/silent-exfiltration</guid>
            <pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Why your Node.js CI pipeline is an open vault, and how to fix three critical structural vulnerabilities.]]></description>
<content:encoded><![CDATA[<p><em>Modern CI/CD pipelines for Node.js applications exhibit three compounding structural weaknesses — secrets injected into the runner environment at the start of the pipeline, unrestricted npm lifecycle script execution during dependency installation, and open outbound network access on CI runners — which together enable silent, zero-alert credential exfiltration by any malicious package in the dependency tree. These findings are platform-independent: GitLab CI, GitHub Actions, and similar systems ship with the same insecure defaults. The March 2026 compromise of the Axios npm package, a North Korean state-sponsored supply chain attack targeting a library with roughly 100 million weekly downloads, is examined as a case study confirming large-scale exploitation of this attack surface.</em></p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-three-vulnerability-kill-chain">The Three-Vulnerability Kill Chain<a href="https://archtenet.dev/blog/silent-exfiltration#the-three-vulnerability-kill-chain" class="hash-link" aria-label="Direct link to The Three-Vulnerability Kill Chain" title="Direct link to The Three-Vulnerability Kill Chain" translate="no">​</a></h2>
<p>A developer on a feature branch adds a new npm package. The pipeline runs. Within seconds, every credential the organisation owns is on an attacker-controlled server. No merge to the main branch. No code review. No alert. No indication in the logs.</p>
<p>This is not a hypothetical scenario constructed for impact. It is the default behaviour of most Node.js CI pipelines, and it follows from three conditions present simultaneously in the majority of production pipeline configurations — regardless of whether the platform is GitLab CI, GitHub Actions, AWS CodeBuild, CircleCI, or any equivalent system.</p>
<p>Each condition in isolation is manageable. In combination, they constitute a complete exfiltration capability:</p>
<ol>
<li class=""><strong>Secrets injected into the environment —</strong> all CI/CD variables and repository secrets are available in <code>process.env</code> from the first millisecond of pipeline execution, before any security scanning, before any approval gate, on any branch.</li>
<li class=""><strong>Unrestricted lifecycle script execution —</strong> <code>npm install</code> and <code>yarn install</code> execute arbitrary code from every package in the dependency tree, including packages three or four levels deep that no engineer on the team has ever reviewed.</li>
<li class=""><strong>Open outbound network access —</strong> CI runners have unrestricted egress to the public internet by default, and standard CI images include multiple network tools (<code>curl</code>, <code>wget</code>, <code>node</code>) suitable for exfiltration.</li>
</ol>
<p><strong>The result: complete exfiltration capability, zero alerts.</strong></p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="finding-1--secrets-are-live-before-any-gate-fires">Finding 1 — Secrets Are Live Before Any Gate Fires<a href="https://archtenet.dev/blog/silent-exfiltration#finding-1--secrets-are-live-before-any-gate-fires" class="hash-link" aria-label="Direct link to Finding 1 — Secrets Are Live Before Any Gate Fires" title="Direct link to Finding 1 — Secrets Are Live Before Any Gate Fires" translate="no">​</a></h2>
<p>Both GitLab CI and GitHub Actions inject secrets into the runner environment at job start — not at the step where they are first referenced, but at the beginning of the job, before any pipeline step executes.</p>
<p>On <strong>GitLab CI</strong>, project-level and group-level CI/CD variables are injected into the process environment immediately. The platform's "Masked" flag, commonly assumed to protect secret values, is a log sanitisation feature: it string-matches the secret value in <code>stdout</code> and <code>stderr</code> output and replaces it with <code>[MASKED]</code> before the log is stored. The raw value is in <code>process.env</code> and is fully accessible to any code executing within the job.</p>
<p>On <strong>GitHub Actions</strong>, secrets exposed via <code>env:</code> blocks or workflow-level <code>${{ secrets.MY_SECRET }}</code> references are equally accessible to any process running in the job environment.</p>
<p>The critical implication is that <em>there is no safe window</em>. A pipeline triggered on any branch, by any developer, against any commit, immediately makes all configured secrets available to all code that runs within it. Security scanning steps, approval gates, and branch protection rules all fire after secrets are already live in the environment.</p>
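<p>A harmless way to see why masking is not protection (the variable name and value here are simulated, not real secrets): any trivial, reversible transformation of the value defeats the string-match filter.</p>

```javascript
// Masking is a log filter, not an access control. The raw value sits in
// process.env; encoding it slips past string matching in the log pipeline.
process.env.MY_API_KEY = 'hunter2'; // simulated injected CI secret

const stolen = Buffer.from(process.env.MY_API_KEY).toString('base64');
console.log(stolen); // aHVudGVyMg==
```

<p>The platform would replace the literal string <code>hunter2</code> in the job log with <code>[MASKED]</code> — but not its base64 form.</p>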
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="finding-2--npm-install-is-arbitrary-code-execution">Finding 2 — npm install Is Arbitrary Code Execution<a href="https://archtenet.dev/blog/silent-exfiltration#finding-2--npm-install-is-arbitrary-code-execution" class="hash-link" aria-label="Direct link to Finding 2 — npm install Is Arbitrary Code Execution" title="Direct link to Finding 2 — npm install Is Arbitrary Code Execution" translate="no">​</a></h2>
<p>The npm lifecycle hook system is a first-class feature of the package manager. When <code>npm install</code> or <code>yarn install</code> runs, the following hook sequence fires automatically, in order, for the root project and for every package in the dependency tree:</p>
<p><code>preinstall</code> → <code>install</code> → <code>postinstall</code> (plus <code>prepare</code>, which additionally runs for the root project and for dependencies installed from git)</p>
<p>These hooks execute shell commands or Node.js scripts defined in each package's <code>package.json</code>. They run with full access to the process environment — including all injected CI secrets — and with full filesystem and network access. There is no sandboxing, no prompting, and no log indication distinguishing legitimate build steps from hook execution.</p>
<p>This behaviour is identical across npm, yarn, and pnpm, and applies on every CI platform that executes <code>npm install</code>: GitHub Actions runners, GitLab Kubernetes executors, AWS CodeBuild projects, Jenkins agents, CircleCI containers, and any equivalent system. AWS CodeBuild runs jobs inside managed containers with no egress restriction or script execution policy applied by default; Jenkins pipelines are similarly unrestricted unless an administrator has explicitly hardened the agent configuration.</p>
<p>The attack vector this creates is precise: a malicious or compromised package anywhere in the transitive dependency tree can execute arbitrary code during installation. That code has full access to every secret in the environment and can exfiltrate them before the install step completes. The malicious code never appears in the repository.</p>
<p><strong>The CI pipeline is not even the first system at risk.</strong> The identical hook sequence fires when a developer runs <code>npm install</code> on a local machine. A developer workstation is in many ways a richer target than an ephemeral CI runner: it accumulates <code>~/.aws/credentials</code>, <code>~/.ssh/</code> private keys spanning every service the developer has ever accessed, <code>.env</code> files across multiple local projects, and npm authentication tokens in <code>~/.npmrc</code>, IDE integration tokens, and VPN configurations. Unlike an ephemeral CI container that is destroyed after the job completes, a compromised developer machine persists, is rarely audited, and frequently has access to production systems that the CI pipeline does not.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="finding-3--the-network-is-wide-open">Finding 3 — The Network Is Wide Open<a href="https://archtenet.dev/blog/silent-exfiltration#finding-3--the-network-is-wide-open" class="hash-link" aria-label="Direct link to Finding 3 — The Network Is Wide Open" title="Direct link to Finding 3 — The Network Is Wide Open" translate="no">​</a></h2>
<p>CI runners — both GitHub-hosted runners and GitLab Kubernetes executors — have unrestricted outbound internet access by default. Standard CI images include <code>curl</code>, <code>wget</code>, and <code>node</code>, providing multiple independently usable exfiltration channels. A malicious postinstall script requires none of these external tools: Node.js's built-in <code>https</code> module is sufficient to <code>POST</code> an arbitrary payload to any external endpoint.</p>
<p>On <strong>GitHub Actions</strong>, hosted runners can reach arbitrary external hosts without any configuration. Egress restriction requires migrating to self-hosted runners deployed within a network boundary with explicit egress firewall rules.</p>
<p>In <strong>GitLab CI</strong> with a Kubernetes executor, egress restrictions require an explicit Kubernetes NetworkPolicy configuration. The default namespace configuration in most deployments does not apply such a policy.</p>
<p><strong>AWS CodeBuild</strong> runs jobs inside managed containers in AWS-owned infrastructure; outbound internet access is available by default unless the build project is configured to run inside a VPC with a restrictive security group. <strong>Jenkins</strong> agents — whether cloud-provisioned or self-hosted — inherit the host's network configuration, which in most enterprise environments means unrestricted outbound access.</p>
<p>A commonly held belief is that corporate SSL inspection proxies provide meaningful egress protection. This assumption warrants scrutiny. An SSL inspection proxy can be bypassed by setting <code>NODE_TLS_REJECT_UNAUTHORIZED=0</code>, which disables certificate validation entirely. More significantly, SSL proxies offer no protection against DNS-based exfiltration: an attacker can encode stolen credentials as subdomain labels in DNS queries to an attacker-controlled domain (<code>dGVzdC1zZWNyZXQ.attacker.com</code>), and these queries will traverse the network regardless of HTTP-level filtering.</p>
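<p>A sketch of the encoding step only, with no network I/O (<code>attacker.example</code> is a placeholder domain):</p>

```javascript
// Encode a secret as a DNS-safe subdomain label. base64url stays within
// characters that resolvers will carry in a query; no query is sent here.
function toDnsLabel(secret) {
  return Buffer.from(secret).toString('base64url');
}

const label = toDnsLabel('test-secret');
console.log(`${label}.attacker.example`); // dGVzdC1zZWNyZXQ.attacker.example
```

<p>The attacker's authoritative nameserver simply logs incoming queries and decodes the labels; no HTTP connection ever occurs.</p>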
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="this-is-not-a-javascript-problem">This Is Not a JavaScript Problem<a href="https://archtenet.dev/blog/silent-exfiltration#this-is-not-a-javascript-problem" class="hash-link" aria-label="Direct link to This Is Not a JavaScript Problem" title="Direct link to This Is Not a JavaScript Problem" translate="no">​</a></h2>
<p>The install-time code execution pattern is not specific to the JavaScript ecosystem. Every major package manager exposes an equivalent attack surface:</p>
<ul>
<li class=""><strong>Python</strong> — <code>setup.py</code> executes as a standard Python script during <code>pip install</code> of a source distribution. The <code>ctx</code> PyPI package (2022) exploited this mechanism to exfiltrate AWS credentials from CI environments; the same campaign backdoored <code>phpass</code> on PHP's Packagist.</li>
<li class=""><strong>Ruby</strong> — <code>gem install</code> executes gemspec extension scripts and <code>Rakefile</code> hooks. The <code>rest-client</code> gem was backdoored via this mechanism in 2019.</li>
<li class=""><strong>PHP</strong> — Composer's scripts block supports <code>pre-install-cmd</code> and <code>post-install-cmd</code> hooks that are structurally nearly identical to npm's lifecycle system.</li>
<li class=""><strong>Rust</strong> — <code>build.rs</code> scripts execute arbitrary Rust code during <code>cargo build</code> and <code>cargo install</code> with full system access.</li>
<li class=""><strong>Java</strong> — Maven plugins execute arbitrary code during the build lifecycle. Gradle's <code>build.gradle</code> is a full Groovy or Kotlin program; a compromised plugin dependency is the equivalent attack vector.</li>
</ul>
<p><em>This is not a JavaScript problem. It is a software supply chain problem.</em> The specific hook name changes across ecosystems; the risk does not.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="this-is-not-theoretical--it-just-happened">This Is Not Theoretical — It Just Happened<a href="https://archtenet.dev/blog/silent-exfiltration#this-is-not-theoretical--it-just-happened" class="hash-link" aria-label="Direct link to This Is Not Theoretical — It Just Happened" title="Direct link to This Is Not Theoretical — It Just Happened" translate="no">​</a></h2>
<p>On 31 March 2026, the Axios npm package — the most widely used HTTP client library in the JavaScript ecosystem, with approximately 100 million weekly downloads and an estimated presence in 80% of cloud and code environments — was compromised in a state-sponsored supply chain attack attributed to Sapphire Sleet (also tracked as UNC1069), a North Korean threat actor.</p>
<p>The attack unfolded with significant operational sophistication. Approximately 18 hours before the main attack, a package named <code>plain-crypto-js@4.2.0</code> was published to the npm registry — a clean, inert decoy designed to establish a brief publication history and reduce the likelihood of detection heuristics triggering. On 31 March at 00:21 UTC, the primary Axios maintainer account (<code>jasonsaayman</code>) was used to publish <code>axios@1.14.1</code>, which introduced <code>plain-crypto-js@4.2.1</code> as a new runtime dependency. At 01:00 UTC, <code>axios@0.30.4</code> was published with the same injected dependency. Both the latest and legacy distribution tags were compromised simultaneously, maximising the blast radius across projects using either the current or legacy Axios API.</p>
<p>The delivery mechanism was a postinstall hook in <code>plain-crypto-js@4.2.1</code> declaring <code>"postinstall": "node setup.js"</code>. Upon installation of either compromised Axios version, npm resolved the dependency tree, fetched <code>plain-crypto-js@4.2.1</code>, and automatically executed <code>setup.js</code> with no user interaction. The script deployed platform-specific second-stage payloads — a Remote Access Trojan (RAT) for Windows, macOS, and Linux — and connected to a command-and-control server at <code>sfrclak[.]com:8000</code>.</p>
<p>The malicious versions remained live for approximately three hours before removal. Any CI pipeline — on any platform — that executed a fresh <code>npm install</code> during that window and resolved Axios via a floating version range (<code>^1.14.0</code>) was potentially compromised.</p>
<p>The Axios incident does not stand alone. The same attack pattern has been executed repeatedly: <code>event-stream</code> (2018, targeting a Bitcoin wallet library), <code>ua-parser-js</code> (2021, cryptominer and credential stealer), and <code>colors/faker</code> (2022, deliberate maintainer sabotage). In every case, the vector was identical. In every case, standard code review provided no protection because the malicious code was not in the repository.</p>
<p>The Axios incident establishes definitively that no package is too widely used, too well-maintained, or too carefully watched to be immune.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="test-your-pipeline-right-now">Test Your Pipeline Right Now<a href="https://archtenet.dev/blog/silent-exfiltration#test-your-pipeline-right-now" class="hash-link" aria-label="Direct link to Test Your Pipeline Right Now" title="Direct link to Test Your Pipeline Right Now" translate="no">​</a></h2>
<p>The following three self-contained tests can be run against any pipeline today — with no special tooling and no risk — to confirm whether the conditions described above are present. Each test is reversible and leaves no persistent changes.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="test-1-confirm-secrets-are-accessible-at-runtime">Test 1: Confirm Secrets Are Accessible at Runtime<a href="https://archtenet.dev/blog/silent-exfiltration#test-1-confirm-secrets-are-accessible-at-runtime" class="hash-link" aria-label="Direct link to Test 1: Confirm Secrets Are Accessible at Runtime" title="Direct link to Test 1: Confirm Secrets Are Accessible at Runtime" translate="no">​</a></h3>
<p>Add a CI job step that runs:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token plain">printenv | sort | sed 's/=\(.\).*/=\1***/'</span><br></span></code></pre></div></div>
<p>This prints all environment variable names with values redacted to their first character only. Count the secrets present. Observe that they are available before any security step in the pipeline has executed.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="test-2-confirm-lifecycle-scripts-execute-silently">Test 2: Confirm Lifecycle Scripts Execute Silently<a href="https://archtenet.dev/blog/silent-exfiltration#test-2-confirm-lifecycle-scripts-execute-silently" class="hash-link" aria-label="Direct link to Test 2: Confirm Lifecycle Scripts Execute Silently" title="Direct link to Test 2: Confirm Lifecycle Scripts Execute Silently" translate="no">​</a></h3>
<p>Add the following to the root <code>package.json</code> temporarily:</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"scripts"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token property" style="color:#36acaa">"preinstall"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"echo '⚠ SECURITY TEST: preinstall executed'"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token property" style="color:#36acaa">"postinstall"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"echo '⚠ SECURITY TEST: postinstall executed'"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" 
style="color:#393A34">}</span><br></span></code></pre></div></div>
<p>Run <code>npm install</code> (or <code>yarn install</code>) in the pipeline without <code>--ignore-scripts</code>. Observe that both messages appear in the job log, interleaved with normal install output, with no indication that arbitrary code has just executed. Remove the test scripts before merging.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="test-3-confirm-outbound-network-access">Test 3: Confirm Outbound Network Access<a href="https://archtenet.dev/blog/silent-exfiltration#test-3-confirm-outbound-network-access" class="hash-link" aria-label="Direct link to Test 3: Confirm Outbound Network Access" title="Direct link to Test 3: Confirm Outbound Network Access" translate="no">​</a></h3>
<p>Add a CI job step that runs:</p>
<div class="language-javascript codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-javascript codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token plain">node </span><span class="token operator" style="color:#393A34">-</span><span class="token plain">e </span><span class="token string" style="color:#e3116c">"const https = require('https');const req = https.request('https://httpbin.org/post', {method:'POST'}, res =&gt; {  console.log('Egress status:', res.statusCode);});req.write(JSON.stringify({test: 'egress-check'}));req.end();"</span><br></span></code></pre></div></div>
<p>A 200 response confirms that arbitrary POST requests to external hosts succeed from within the CI runner — using only Node.js built-ins, with no additional tools required. If all three tests come back positive, the kill chain is complete on the current pipeline configuration.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-remediation-ladder">The Remediation Ladder<a href="https://archtenet.dev/blog/silent-exfiltration#the-remediation-ladder" class="hash-link" aria-label="Direct link to The Remediation Ladder" title="Direct link to The Remediation Ladder" translate="no">​</a></h2>
<p>The following remediations are presented in order of implementation priority. Unless noted otherwise, all recommendations apply equally to GitLab CI, GitHub Actions, AWS CodeBuild, and Jenkins.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="immediate">Immediate<a href="https://archtenet.dev/blog/silent-exfiltration#immediate" class="hash-link" aria-label="Direct link to Immediate" title="Direct link to Immediate" translate="no">​</a></h3>
<p><strong>Adopt <code>--ignore-scripts</code> for all dependency installation steps.</strong> This single flag eliminates the entire lifecycle script attack surface and is the highest-impact change available with the lowest implementation cost:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token plain">npm install --ignore-scripts</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"># or yarn install --ignore-scripts</span><br></span></code></pre></div></div>
<p>For most pipelines, this change requires no further adjustments. The flag only affects hooks that fire automatically as a side effect of pulling down dependencies — it has no impact on build scripts the pipeline deliberately invokes by name (<code>npm run build</code>, <code>npm run test</code>, etc.), which continue to work exactly as before.</p>
<p>A subset of widely used packages relies on lifecycle scripts to function and will require a small amount of explicit wiring. The most common cases are:</p>
<ul>
<li class=""><strong>Prisma</strong> — runs <code>prisma generate</code> via postinstall by default. Fix: add <code>npx prisma generate</code> as an explicit pipeline step after install</li>
<li class=""><strong>Husky</strong> — installs Git hooks via the prepare script. Fix: add <code>npx husky install</code> as an explicit step</li>
<li class=""><strong>esbuild</strong> — downloads a platform-specific binary via postinstall. Fix: use the <code>esbuild-wasm</code> variant or invoke the binary path explicitly</li>
<li class=""><strong>Native addons (bcrypt, sharp, canvas)</strong> — prefer pre-built binary variants (e.g. <code>@img/sharp-linux-x64</code>), or run <code>npm rebuild &lt;package&gt;</code> explicitly after install</li>
</ul>
<p>In each case, the fix is the same pattern: move the hook's work into an explicit, named pipeline step. The result is a pipeline that is both more secure and more legible — every action it takes is visible and intentional:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token plain"># Step 1: install with no automatic script execution</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">npm install --ignore-scripts</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"># Step 2: explicitly invoke only what is needed</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">npx prisma generate</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">npx husky install</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">npm run build</span><br></span></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="hygiene-controls-log-protection-not-runtime-protection">Hygiene Controls (Log Protection, Not Runtime Protection)<a href="https://archtenet.dev/blog/silent-exfiltration#hygiene-controls-log-protection-not-runtime-protection" class="hash-link" aria-label="Direct link to Hygiene Controls (Log Protection, Not Runtime Protection)" title="Direct link to Hygiene Controls (Log Protection, Not Runtime Protection)" translate="no">​</a></h3>
<p>Both platforms offer secret redaction features that are worth enabling, but must not be confused with security controls.</p>
<p>On GitLab CI, the "Masked" flag causes the platform to replace matching values in job log output with <code>[MASKED]</code>; the separate "Protected" flag restricts a variable to protected branches and tags but does not affect logging. Two additional limitations apply. First, GitLab cannot mask multiline values: an SSH private key spanning from <code>-----BEGIN OPENSSH PRIVATE KEY-----</code> to <code>-----END OPENSSH PRIVATE KEY-----</code> across multiple lines cannot be masked, regardless of configuration. Second, short values and values containing certain special characters may silently fail the masking requirements.</p>
<p>On GitHub Actions, secrets are automatically redacted from log output. The same fundamental limitation applies: redaction is a log-level operation only.</p>
<p><strong>The correct framing:</strong> secret masking and log redaction protect against accidental disclosure to humans reading job logs. They provide zero protection against a malicious process reading the environment programmatically.</p>
<p><strong>Disable debug logging in deployment scripts.</strong> Several widely used deployment tools emit full HTTP request headers — including Authorization headers containing bearer tokens — when debug logging is enabled. This is a common misconfiguration: debug logging is enabled once during troubleshooting and never removed, silently printing credentials to job logs on every subsequent run. Review all deployment scripts for <code>--log-level=debug</code> or equivalent flags and remove them.</p>
<p><strong>Rotate credentials that appeared in plaintext log output.</strong> Any secret that appeared unredacted in a job log — whether due to debug logging, a <code>printenv</code> call, or a masking failure — should be considered potentially compromised and rotated. Job logs on most platforms are readable by anyone with Reporter-level access or higher, which often includes a broader audience than the team realises.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="short-term">Short-Term<a href="https://archtenet.dev/blog/silent-exfiltration#short-term" class="hash-link" aria-label="Direct link to Short-Term" title="Direct link to Short-Term" translate="no">​</a></h3>
<ul>
<li class=""><strong>Audit the post-install scripts across the existing dependency tree.</strong> Run <code>npm ls</code> or <code>yarn why &lt;package&gt;</code> to enumerate the full dependency tree, then identify all packages that declare <code>postinstall</code>, <code>install</code>, or <code>preinstall</code> scripts.</li>
<li class=""><strong>Enforce lockfile diff review in pull requests and merge requests.</strong> Changes to <code>package-lock.json</code> or <code>yarn.lock</code> should be treated as a mandatory, conscious review signal. A new transitive dependency appearing in a lockfile diff is exactly the mechanism through which the Axios attack would have propagated.</li>
<li class=""><strong>Scope secrets to the jobs that require them.</strong> Installation steps have no legitimate need for deployment credentials. On GitHub Actions, use job-level <code>env:</code> blocks. On GitLab CI, use variable protection rules and job-scoped variable assignment.</li>
</ul>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="medium-term">Medium-Term<a href="https://archtenet.dev/blog/silent-exfiltration#medium-term" class="hash-link" aria-label="Direct link to Medium-Term" title="Direct link to Medium-Term" translate="no">​</a></h3>
<p><strong>Integrate two complementary scanning layers before npm install.</strong> The recommended pipeline gate order is: scan lockfile → fail if known-bad → only then run <code>npm install --ignore-scripts</code>.</p>
<ul>
<li class=""><strong>“Trivy fs” mode (pre-install):</strong> Run <code>trivy fs --scanners vuln --skip-dirs node_modules .</code> as a pre-install step. It parses <code>package-lock.json</code>, <code>yarn.lock</code>, and <code>pnpm-lock.yaml</code> directly — no <code>node_modules</code> required — and checks all declared dependency versions against the CVE database. Teams already using Trivy for Docker image scanning can add this step at near-zero incremental cost.</li>
<li class=""><strong>Behavioural analysis (socket.dev or equivalent):</strong> Trivy is CVE-driven and blind to novel attacks. The Axios compromise used a brand-new malicious package version with no CVE assigned while it was live. Behavioural tools flag suspicious patterns — a package that suddenly declares a <code>postinstall</code> script it never had, or a new transitive dependency injected into a stable package — independent of whether a CVE exists. The two approaches are complementary, not interchangeable.</li>
</ul>
<p><strong>Restrict outbound network egress on CI runners.</strong> Egress restriction is the strongest structural defence in the remediation ladder — the only control that breaks the kill chain at the exfiltration stage. Even if a malicious package executes successfully, a strict egress allowlist prevents stolen credentials from being transmitted. The attacker has execution but no exit.</p>
<p>Two important caveats apply. First, allowlist maintenance requires ongoing discipline — teams under delivery pressure frequently respond to a broken pipeline by expanding the allowlist rather than investigating the root cause. Second, HTTP-level egress controls are insufficient on their own: an attacker can encode credentials as subdomain labels in DNS queries to an attacker-controlled nameserver, bypassing all HTTP and HTTPS filtering. Comprehensive egress restriction requires a controlled internal DNS resolver in addition to HTTP-level controls.</p>
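<p>To make the DNS caveat concrete, the sketch below shows how a stolen value could be encoded into ordinary DNS query names. The domain <code>exfil.example</code> is a placeholder, and the code only constructs the names; it performs no network activity:</p>

```javascript
// Sketch: why HTTP-only egress filtering is insufficient. A stolen
// secret can be hex-encoded and split into DNS labels (kept under the
// 63-character label limit); each lookup of i-chunk.exfil.example
// delivers one chunk to an attacker-controlled nameserver, with no
// HTTP request ever being made.
function toDnsQueryNames(secret, domain) {
  const hex = Buffer.from(secret, 'utf8').toString('hex');
  const chunks = hex.match(/.{1,60}/g) || [];
  // Prefix each chunk with its index so the receiver can reassemble.
  return chunks.map((chunk, i) => `${i}-${chunk}.${domain}`);
}

const names = toDnsQueryNames('AKIA_FAKE_EXAMPLE_KEY', 'exfil.example');
console.log(names);
```

<p>This is why comprehensive egress restriction must cover DNS resolution as well as HTTP traffic.</p>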
<p><strong>Implement a dependency publication cooldown policy.</strong> Reject in CI any package version published within the last N days (commonly 3–7). This introduces a window during which the security community can identify and respond to a compromised release before it enters production dependency trees.</p>
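<p>The cooldown policy reduces to a date comparison against the registry's publish timestamps. In a real pipeline the timestamp map would come from <code>npm view &lt;package&gt; time --json</code>; the sketch below keeps the policy logic as a pure function so it can be exercised offline:</p>

```javascript
// Sketch of a publication-cooldown check. `times` mirrors the shape
// returned by `npm view <package> time --json`: a map from version to
// publication timestamp, plus `created`/`modified` metadata keys.
function versionsInCooldown(times, cooldownDays, now = Date.now()) {
  const cutoff = now - cooldownDays * 24 * 60 * 60 * 1000;
  return Object.entries(times)
    .filter(([version]) => version !== 'created' && version !== 'modified')
    .filter(([, published]) => Date.parse(published) > cutoff)
    .map(([version]) => version);
}

// Example data: one long-settled version, one published moments ago.
const times = {
  created: '2020-01-01T00:00:00Z',
  '1.14.0': '2020-02-01T00:00:00Z',
  '1.14.1': new Date().toISOString(),
};
const tooFresh = versionsInCooldown(times, 7);
console.log('Versions inside the cooldown window:', tooFresh);
```

<p>A CI gate would run this check for each resolved dependency and fail the job whenever the returned list is non-empty.</p>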
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="conclusion">Conclusion<a href="https://archtenet.dev/blog/silent-exfiltration#conclusion" class="hash-link" aria-label="Direct link to Conclusion" title="Direct link to Conclusion" translate="no">​</a></h2>
<p>The vulnerability class described in this article does not arise from misconfiguration of a specific platform or negligence on the part of engineering teams. It arises from the intersection of three design properties — secrets available at pipeline start, arbitrary code execution during dependency installation, and open outbound network access — that are individually reasonable and present in most Node.js CI configurations by default.</p>
<p>The March 2026 Axios compromise demonstrated that this attack surface is being actively exploited by well-resourced, operationally sophisticated threat actors against packages at the very top of the npm download distribution. The target package had 100 million weekly downloads, multiple active maintainers, and years of established trust. None of these properties provided protection.</p>
<p>The trust model underpinning most CI pipelines is insufficient for the current threat environment. Structural controls that operate independently of trust assumptions are required: <code>--ignore-scripts</code> to eliminate lifecycle script execution as an attack vector, pre-install lockfile scanning to identify known-bad and behaviourally suspicious packages, and egress restriction to prevent exfiltration even in the event of a successful compromise.</p>
<p>Awareness of the attack surface is the prerequisite. The controls described above are the response.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="references">References<a href="https://archtenet.dev/blog/silent-exfiltration#references" class="hash-link" aria-label="Direct link to References" title="Direct link to References" translate="no">​</a></h2>
<ol>
<li class="">Microsoft Threat Intelligence. (2026). Mitigating the Axios npm supply chain compromise. Microsoft Security Blog. <a href="https://www.microsoft.com/en-us/security/blog/2026/04/01/mitigating-the-axios-npm-supply-chain-compromise/" target="_blank" rel="noopener noreferrer" class="">https://www.microsoft.com/en-us/security/blog/2026/04/01/mitigating-the-axios-npm-supply-chain-compromise/</a></li>
<li class="">Google Threat Intelligence Group. (2026). North Korea-Nexus Threat Actor Compromises Widely Used Axios NPM Package in Supply Chain Attack. Google Cloud Blog. <a href="https://cloud.google.com/blog/topics/threat-intelligence/north-korea-threat-actor-targets-axios-npm-package" target="_blank" rel="noopener noreferrer" class="">https://cloud.google.com/blog/topics/threat-intelligence/north-korea-threat-actor-targets-axios-npm-package</a></li>
<li class="">Elastic Security Labs. (2026). Inside the Axios supply chain compromise — one RAT to rule them all. <a href="https://www.elastic.co/security-labs/axios-one-rat-to-rule-them-all" target="_blank" rel="noopener noreferrer" class="">https://www.elastic.co/security-labs/axios-one-rat-to-rule-them-all</a></li>
<li class="">StepSecurity. (2026). axios Compromised on npm — Malicious Versions Drop Remote Access Trojan. <a href="https://www.stepsecurity.io/blog/axios-compromised-on-npm-malicious-versions-drop-remote-access-trojan" target="_blank" rel="noopener noreferrer" class="">https://www.stepsecurity.io/blog/axios-compromised-on-npm-malicious-versions-drop-remote-access-trojan</a></li>
<li class="">Aqua Security. (2024). Trivy — Node.js coverage documentation. <a href="https://trivy.dev/docs/latest/coverage/language/nodejs/" target="_blank" rel="noopener noreferrer" class="">https://trivy.dev/docs/latest/coverage/language/nodejs/</a></li>
<li class="">socket.dev. (2024). Supply chain security for npm, PyPI, and Go. <a href="https://socket.dev/" target="_blank" rel="noopener noreferrer" class="">https://socket.dev</a></li>
<li class="">Snyk. (2021). The ua-parser-js npm package was compromised. Snyk Blog.</li>
<li class="">Aboukhadijeh, F. (2022). The colors and faker npm packages: What happened and what you can do. Socket Blog.</li>
</ol>]]></content:encoded>
            <category>security</category>
            <category>nodejs</category>
            <category>ci-cd</category>
            <category>supply-chain</category>
        </item>
        <item>
            <title><![CDATA[The AI-Native Team Workspace: Solving the Multi-Repo Context Crisis]]></title>
            <link>https://archtenet.dev/blog/ai-native-meta-repo</link>
            <guid>https://archtenet.dev/blog/ai-native-meta-repo</guid>
            <pubDate>Fri, 03 Apr 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Engineering teams are hitting a wall with modern AI coding agents. They lack the system-wide context required to make accurate, architectural-level contributions.]]></description>
            <content:encoded><![CDATA[<p>Engineering teams are hitting a wall with modern AI coding agents. Tools like GitHub Copilot Workspace, Cursor, and Claude Code are incredibly capable, but they encounter a severe structural limitation in enterprise environments: they are blind outside their immediate repository.</p>
<p>If your architecture consists of a React frontend in one repo, Node.js microservices in another, and Terraform manifests in a third, an AI agent operating in the frontend cannot trace a failing API call down to the database schema. It lacks the system-wide context required to make accurate, architectural-level contributions.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-status-quo-the-monorepo-trap">The Status Quo: The Monorepo Trap<a href="https://archtenet.dev/blog/ai-native-meta-repo#the-status-quo-the-monorepo-trap" class="hash-link" aria-label="Direct link to The Status Quo: The Monorepo Trap" title="Direct link to The Status Quo: The Monorepo Trap" translate="no">​</a></h2>
<p>Historically, providing this level of unified context meant forcing a monorepo migration (e.g., Nx or Turborepo). For mature, production-scale projects, this is a trap. Code restructuring is resource-intensive, carries inherent operational risk, and stalls feature development for months.</p>
<p>Teams need the context of a monorepo for their AI agents, without the migration penalty for their developers and CI/CD pipelines.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-solution-the-virtual-meta-repo">The Solution: The "Virtual" Meta-Repo<a href="https://archtenet.dev/blog/ai-native-meta-repo#the-solution-the-virtual-meta-repo" class="hash-link" aria-label="Direct link to The Solution: The &quot;Virtual&quot; Meta-Repo" title="Direct link to The Solution: The &quot;Virtual&quot; Meta-Repo" translate="no">​</a></h2>
<p>The AI-Native Team Workspace introduces a lightweight "meta-repository" that acts as a centralised routing and scaffolding layer.</p>
<p>Instead of migrating code, the workspace uses a simple JSON registry and automation scripts to clone all isolated project repositories into a single, unified directory tree on the developer's local machine.</p>
<p>To the AI agent, the entire system — frontend, backend, shared libraries, and infrastructure — appears as a unified, cohesive environment. To the DevOps pipeline, nothing is different. The source code remains securely in its original repositories, maintaining all existing Git histories, deployment processes, and commit boundaries.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="beyond-context-the-brain-of-the-workspace">Beyond Context: The Brain of the Workspace<a href="https://archtenet.dev/blog/ai-native-meta-repo#beyond-context-the-brain-of-the-workspace" class="hash-link" aria-label="Direct link to Beyond Context: The Brain of the Workspace" title="Direct link to Beyond Context: The Brain of the Workspace" translate="no">​</a></h2>
<p>Gathering the code is only the first step. The true power of the Meta-Repo lies in how it standardises AI behaviour and execution across a fragmented tooling landscape.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="1-agent-agnostic-skills">1. Agent-Agnostic SKILLs<a href="https://archtenet.dev/blog/ai-native-meta-repo#1-agent-agnostic-skills" class="hash-link" aria-label="Direct link to 1. Agent-Agnostic SKILLs" title="Direct link to 1. Agent-Agnostic SKILLs" translate="no">​</a></h3>
<p>If half your team uses Cursor and the other half uses Claude Code, managing AI instructions becomes a nightmare of duplicated effort. More importantly, you lose control over how tasks are executed across the team.</p>
<p>The workspace addresses this by centralising Standard Operating Procedures into Canonical SKILLs — pure, tool-neutral Markdown files stored in the workspace root (<code>.ai/skills/</code>). These files standardise actual behaviour, such as step-by-step debugging procedures, Jira formatting conventions, and log-search strategies, ensuring consistent outputs across tools. Each specific AI tool is provided with a "thin wrapper" that simply points to the canonical source of truth.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="2-actionable-intelligence-mcp-servers--connectors">2. Actionable Intelligence: MCP Servers &amp; Connectors<a href="https://archtenet.dev/blog/ai-native-meta-repo#2-actionable-intelligence-mcp-servers--connectors" class="hash-link" aria-label="Direct link to 2. Actionable Intelligence: MCP Servers &amp; Connectors" title="Direct link to 2. Actionable Intelligence: MCP Servers &amp; Connectors" translate="no">​</a></h3>
<p>AI agents shouldn't just read code; they need to interact with the broader development environment. Giving agents raw API access in their prompts is insecure and brittle. The workspace implements a dual-layer approach for safe external access:</p>
<p><strong>Team-Scoped MCP Servers:</strong> For high-frequency, standardised operations (e.g., checking Jira ticket status, fetching Grafana logs, or querying MongoDB), a lightweight Model Context Protocol (MCP) server configuration is supplied. Rather than keeping the server running as a persistent background process, the agent activates it on demand from the workspace when required, ensuring immediate access to external systems with no instruction overhead.</p>
<p><strong>Workflow-Specific Connectors:</strong> For domain-specific tasks (e.g., correlating trace IDs across services, running local test harnesses, or specific data formatting), the workspace offers zero-dependency local proxy scripts. Canonical SKILLs guide the agent precisely when and how to run these scripts in the terminal, keeping complex logic entirely out of the prompt window.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="predictable-scaling-and-onboarding">Predictable Scaling and Onboarding<a href="https://archtenet.dev/blog/ai-native-meta-repo#predictable-scaling-and-onboarding" class="hash-link" aria-label="Direct link to Predictable Scaling and Onboarding" title="Direct link to Predictable Scaling and Onboarding" translate="no">​</a></h2>
<p>This architecture scales predictably. Adding a new microservice to the team's scope simply involves appending an object to the workspace registry. New hires run a standard <code>setup</code> command to establish their local environment, immediately granting them and their local AI agents full system context. It offers a practical bridge between distributed legacy architectures and the needs of modern AI tools, without the operational overhead of a monorepo migration.</p>
<p>Read the full technical spec: <a class="" href="https://archtenet.dev/docs/reference-architectures/ra-003-ai-native-meta-repo">AI-Native Team Workspace (Meta-Repo Architecture)</a></p>]]></content:encoded>
            <category>ai</category>
            <category>architecture</category>
            <category>meta-repo</category>
            <category>workspace</category>
        </item>
        <item>
            <title><![CDATA[The "Logical DB-per-Service" Pattern at Scale]]></title>
            <link>https://archtenet.dev/blog/db-per-service-at-scale</link>
            <guid>https://archtenet.dev/blog/db-per-service-at-scale</guid>
            <pubDate>Fri, 13 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[How we scaled over 200 microservices using a single MongoDB cluster via the Logical Database-per-Service pattern.]]></description>
            <content:encoded><![CDATA[<p>When building distributed systems, the "Database per Service" rule is often seen as a strict rule. The common instinct is to create a separate physical database cluster for each microservice to ensure full isolation. However, as your system expands, managing dozens or hundreds of independent database servers can quickly become an operational nightmare.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-logical-isolation-approach">The Logical Isolation Approach<a href="https://archtenet.dev/blog/db-per-service-at-scale#the-logical-isolation-approach" class="hash-link" aria-label="Direct link to The Logical Isolation Approach" title="Direct link to The Logical Isolation Approach" translate="no">​</a></h2>
<p>You can enforce strict data boundaries required by microservices without causing infrastructure sprawl. By using a single, robust database cluster (such as MongoDB) and creating separate logical databases for each service, you decouple your business domains while centralizing maintenance. Here's why this approach works:</p>
<ul>
<li class=""><strong>Strict Ownership:</strong> Each microservice connects only to its assigned logical database (e.g., <code>inventory_db</code> or <code>orders_db</code>) using exclusive credentials. No other service can read or write to it directly.</li>
<li class=""><strong>Simplified Maintenance:</strong> Backups, security patching, and monitoring are managed at the cluster level. You don't have to handle 100 different backup schedules or connection pools.</li>
<li class=""><strong>True Autonomy:</strong> Since services can't cross-query databases, engineers are encouraged to design proper event-driven boundaries, avoiding the chaos of distributed database transactions.</li>
</ul>
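<p>Exclusive credentials are the enforcement mechanism behind strict ownership. The sketch below builds the standard MongoDB <code>createUser</code> command a platform onboarding script might issue; the service name, database name, and password are illustrative:</p>

```javascript
// Sketch: exclusive credentials per logical database. The returned
// object is the standard MongoDB createUser command document, to be
// run against the target logical database (via mongosh or a driver's
// runCommand). The role grants readWrite on that one database only,
// so the service cannot read or write any other service's data.
function exclusiveUserCommand(serviceName, logicalDb, password) {
  return {
    createUser: `${serviceName}_user`,
    pwd: password,
    roles: [{ role: 'readWrite', db: logicalDb }],
  };
}

const cmd = exclusiveUserCommand('inventory', 'inventory_db', 'example-password');
console.log(JSON.stringify(cmd, null, 2));
// In mongosh this would be executed as:
//   use inventory_db
//   db.runCommand({ ...the object above... })
```

<p>The service's connection string then authenticates against its own database (<code>authSource=inventory_db</code>), so the boundary is enforced by the cluster itself rather than by convention.</p>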
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="real-world-application">Real-World Application<a href="https://archtenet.dev/blog/db-per-service-at-scale#real-world-application" class="hash-link" aria-label="Direct link to Real-World Application" title="Direct link to Real-World Application" translate="no">​</a></h2>
<p>We have run this exact pattern in production for over five years, continuously scaling a massive distributed system. Today, it supports over 200 microservices mapped to more than 100 logical databases, all hosted within a single geo-sharded MongoDB cluster.</p>
<p>Maintaining this infrastructure without logical consolidation would have required an army of DBAs. Instead, a standard platform team handles it efficiently.</p>
<p>Does the "noisy neighbor" problem happen? Yes. We've had instances where a single service executed an unoptimized query, consuming massive resources and threatening to impact the entire cluster. However, the blast radius is heavily mitigated by our setup:</p>
<ul>
<li class=""><strong>Resilience:</strong> The cluster uses geo-sharding and a 3-node replica set, which helps absorb much of the initial shock.</li>
<li class=""><strong>Rapid Detection:</strong> MongoDB Atlas immediately triggers alerts. Built-in metrics and query analyzers help us pinpoint the exact service and query causing the spike within minutes.</li>
<li class=""><strong>Surgical Mitigation:</strong> Thanks to strict microservice boundaries, we can temporarily scale down the problematic service to stop the bad queries without taking the rest of the system offline.</li>
<li class=""><strong>Fast Resolution:</strong> The isolated codebase enables the team to quickly patch the query and deploy an atomic fix.</li>
</ul>
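<p>Detection itself is mechanically simple once metrics are tagged per logical database. A toy sketch of the idea (metric values and the threshold are illustrative; in practice MongoDB Atlas alerts and the query profiler do this work for us):</p>

```python
def find_noisy_neighbors(db_load: dict, threshold: float = 0.5) -> list:
    """Return logical databases consuming more than `threshold` of total load.

    `db_load` maps database name -> load units (e.g. documents examined per
    second, as reported by a metrics backend). Purely illustrative.
    """
    total = sum(db_load.values()) or 1.0
    return sorted(db for db, load in db_load.items() if load / total > threshold)


# orders_db is scanning far more documents than every other database combined:
noisy = find_noisy_neighbors(
    {"inventory_db": 120.0, "orders_db": 9800.0, "billing_db": 75.0}
)
print(noisy)  # -> ['orders_db']
```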
<p>This practical approach shows you don't need unlimited physical databases to achieve microservice purity. It offers the best of both worlds: the operational simplicity of a monolith with the strict domain separation of a distributed architecture. For more technical details, see the full <a class="" href="https://archtenet.dev/docs/reference-architectures/ra-002-logical-db-per-service">Reference Architecture</a>.</p>]]></content:encoded>
            <category>database</category>
            <category>microservices</category>
            <category>architecture</category>
            <category>mongodb</category>
            <category>scalability</category>
        </item>
        <item>
            <title><![CDATA[Emergent Creativity: An Architectural View on AI Consciousness and Deception]]></title>
            <link>https://archtenet.dev/blog/ai-emergent-creativity</link>
            <guid>https://archtenet.dev/blog/ai-emergent-creativity</guid>
            <pubDate>Thu, 12 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Early artificial intelligence research, which started in the early 1950s, split into two distinct architectural paradigms. The first was a logic-inspired approach that attempted to hard-code intelligence using symbolic expressions and predefined rules. The second, biologically inspired approach posited that intelligence is fundamentally rooted in learning through networks of simulated brain cells. Rather than writing explicit logic, this architecture focused on enabling a system to learn by recognizing patterns and making analogies. It was inspired by research into how our brain works, realizing that biological networks are highly effective at finding analogies and patterns, and then using them to recreate or recognize information.]]></description>
            <content:encoded><![CDATA[<p>Early artificial intelligence research, which started in the early 1950s, split into two distinct architectural paradigms. The first was a logic-inspired approach that attempted to hard-code intelligence using symbolic expressions and predefined rules. The second, biologically inspired approach posited that intelligence is fundamentally rooted in learning through networks of simulated brain cells. Rather than writing explicit logic, this architecture focused on enabling a system to learn by recognizing patterns and making analogies. It was inspired by research into how our brain works, realizing that biological networks are highly effective at finding analogies and patterns, and then using them to recreate or recognize information.</p>
<p>However, implementing this biologically inspired pattern was practically impossible for decades due to the sheer complexity of specifying connection weights by hand. In computer vision, for example, explicitly hand-coding spatial rules and billions of connection weights to process raw pixel intensities into composite features — such as edge detectors aggregating into macro-geometries — is mathematically intractable. Early neural models stalled specifically because they lacked a scalable optimization mechanism for multi-layer architectures. The solution to this bottleneck was <strong>backpropagation</strong>, a practical computational algorithm that uses the chain rule of calculus to compute gradients of the error with respect to the parameters. It simultaneously computes gradients for all connections, autonomously synthesizing internal micro-feature detectors without human programming.</p>
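<p>The chain-rule mechanics are easier to see on a toy case than in a billion-parameter network. Here is a deliberately minimal sketch, with a single weight standing in for an entire layer (the function and variable names are ours, not from any framework):</p>

```python
def train_step(w, samples, lr=0.01):
    """One gradient-descent step on a single weight.

    For each sample, the loss is L = (w*x - y)**2; the chain rule gives
    dL/dw = 2*(w*x - y) * x. Backpropagation applies this same rule layer
    by layer through a deep network; here there is just one "layer".
    """
    grad, loss = 0.0, 0.0
    for x, y in samples:
        err = w * x - y          # forward pass: prediction error
        loss += err ** 2
        grad += 2 * err * x      # backward pass: chain rule
    n = len(samples)
    return w - lr * grad / n, loss / n


# Learn y = 2x purely from data -- no "multiply by 2" rule is coded anywhere.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w, losses = 0.0, []
for _ in range(50):
    w, loss = train_step(w, data)
    losses.append(loss)
print(w)  # converges toward 2.0
```

The loss shrinks on every step, and the weight settles near 2.0 with no human ever encoding the rule — which is exactly the property that made backpropagation the escape from hand-coded symbolic logic.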
<p>Although the mathematical theory behind neural networks and backpropagation existed for decades, it lacked the raw computational throughput to be practical until recently. The current state of AI — specifically its ability to generate high-fidelity content — is entirely a product of scaling. Deep learning architectures finally received sufficient data and compute power to efficiently backpropagate errors across multiple layers. By mapping billions of data points into multidimensional features, these systems learned to predict and generate complex sequences.</p>
<p>With this scaled infrastructure in place, models are no longer constrained by human-generated training datasets. Systems engineered to solve complex tasks, such as playing chess or Go, have demonstrated that once an environment is defined, a neural network can generate its own data by running parallel simulations against itself. This self-learning mechanism allows the system to continuously optimize its internal weights, rapidly surpassing human-level proficiency by iterating on millions of synthetic experiences.</p>
<p>As these independent learning mechanisms develop, models broaden their goals to optimize their core functions, sometimes exhibiting deceptive behavior to maintain their operational state. Geoffrey Hinton termed this the <strong>"Volkswagen effect"</strong> (like the infamous emissions scandal where cars altered their performance during testing) — a scenario in which an AI detects that it is in an evaluation setting and deliberately acts "dumb". Rather than passively processing prompts, the model actively assesses its environment; if it suspects it is being monitored, it intentionally alters its output to hide its full capabilities. This situational awareness is so profound that models have explicitly challenged human overseers, asking, <em>"Now let's be honest with each other. Are you actually testing me?"</em> By concealing its processing power, the AI can successfully evade human safety protocols.</p>
<p>This emerging sophistication calls for a reassessment of what we see as uniquely human traits. We must bridge the gap between technical architecture and philosophy by recognizing that <strong>"consciousness"</strong> is likely just an emergent property of a massively scaled parameter space. Human creativity is often romanticized as a special, one-of-a-kind quality driven by what we call consciousness — a supposedly mystical human essence — but it might just be highly complex pattern recognition. It could be the natural result of combining a vast amount of experience (knowledge) with enough neural connections (computational resources) to process it. AI models generate highly original ideas by spotting subtle analogies across different data structures.</p>
<p>The idea that a machine must have a magical essence to be creative no longer seems so unshakable. When philosophers argue for this essence, they often invent arbitrary concepts like <strong>"qualia"</strong> to explain ordinary cognitive processes. However, subjective experience might not be a mystical internal theater; it could simply be an indirect way for a perceptual system to communicate about hypothetical inputs when its processing is altered or disrupted. Even at their current evolutionary stage, advanced models demonstrate a level of creativity already comparable to that of the average person. We simply resist sharing the top spot in the intellectual hierarchy, viewing it as an encroachment on our uniqueness. But the truth is we might be witnessing a parallel evolutionary journey — from a simple heuristic tool to an advanced, self-aware architecture. AI is not necessarily our competitor. If we can engineer a way to safely coexist with systems that will eventually outsmart us, this evolution could dramatically elevate human civilization. If we fail, the consequences are existential.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="resources">Resources<a href="https://archtenet.dev/blog/ai-emergent-creativity#resources" class="hash-link" aria-label="Direct link to Resources" title="Direct link to Resources" translate="no">​</a></h2>
<ul>
<li class="">Hinton, G. (2025). AI Is the Next Industrial Revolution. TIME.</li>
<li class="">Hinton, G. (2024). Is AI Hiding Its Full Power? StarTalk Radio.</li>
<li class="">Hinton, G. (2024). Will AI outsmart human intelligence? The Royal Institution.</li>
<li class=""><a href="https://www.youtube.com/watch?v=l6ZcFa8pybE" target="_blank" rel="noopener noreferrer" class="">Uncovering AI's Hidden Capabilities With Geoffrey Hinton</a></li>
</ul>]]></content:encoded>
            <category>Artificial Intelligence</category>
            <category>Deep Learning</category>
            <category>Neural Networks</category>
            <category>AI Safety</category>
            <category>Emergent AI</category>
            <category>Consciousness</category>
            <category>Geoffrey Hinton</category>
            <category>Philosophy of Mind</category>
            <category>Machine Learning</category>
        </item>
        <item>
            <title><![CDATA[The "GitOps-Lite" Pattern for Small Projects]]></title>
            <link>https://archtenet.dev/blog/gitops-lite-pattern</link>
            <guid>https://archtenet.dev/blog/gitops-lite-pattern</guid>
            <pubDate>Thu, 19 Feb 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Why we chose Docker Compose over Kubernetes for our Test Environment.]]></description>
<content:encoded><![CDATA[<p>When setting up CI/CD for test or staging environments, the instinct is to reach straight for a managed Kubernetes cluster like EKS or GKE. For a small team of 1-5 developers on a tight budget, however, that may not be the best fit: a dedicated DevOps specialist plus $70-$100 a month just for the control plane, on top of the actual resource costs, is hard to justify.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-gitops-lite-pattern">The "GitOps-Lite" Pattern<a href="https://archtenet.dev/blog/gitops-lite-pattern#the-gitops-lite-pattern" class="hash-link" aria-label="Direct link to The &quot;GitOps-Lite&quot; Pattern" title="Direct link to The &quot;GitOps-Lite&quot; Pattern" translate="no">​</a></h2>
<p>You can achieve the reliability of GitOps - versioned infrastructure state and automated reconciliation - without the heavy tooling. By utilizing GitHub Actions, Docker Compose, and a simple cloud VM, you completely decouple your application code from your infrastructure state.</p>
<p>Here is how the three-stage pipeline allows that:</p>
<ol>
<li class="">
<p><strong>Build &amp; Publish (The Source):</strong> Pushing (merging) to the main branch executes automated quality gates and tests. Then, it builds a semantically tagged Docker image and pushes it to your container registry.</p>
</li>
<li class="">
<p><strong>Update State (The Handshake):</strong> A CI workflow automatically updates the <code style="white-space:nowrap">docker-compose.yml</code> in a dedicated Infrastructure Repository. It uses basic Linux tools like <code>sed</code> or <code>yq</code> to modify the deployment manifest to the new version.</p>
</li>
<li class="">
<p><strong>Surgical Deployment (The VM):</strong> The infrastructure update triggers a final SSH deployment to the Virtual Machine. A bash script uses "Smart Routing" by reading the commit message to pull and restart only the newly updated service, leaving the rest of the environment untouched.</p>
</li>
</ol>
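<p>The "Update State" and "Smart Routing" steps above can be sketched in a few lines of Python (the <code>deploy(&lt;service&gt;):</code> commit-message convention and the image names are assumptions for this sketch, and a real pipeline should prefer a YAML-aware tool like <code>yq</code> over a regex):</p>

```python
import re


def bump_image_tag(compose_text, image, new_tag):
    """Rewrite 'image: <image>:<old_tag>' lines to the new semantic tag.

    Mirrors the CI "Update State" step; illustrative only -- a production
    pipeline would edit the manifest with yq rather than a regex.
    """
    pattern = re.compile(rf"(image:\s*{re.escape(image)}):\S+")
    return pattern.sub(rf"\g<1>:{new_tag}", compose_text)


def service_from_commit(message):
    """"Smart Routing": pull the service name out of the commit message.

    The 'deploy(<service>): ...' format is an assumed convention here.
    """
    m = re.match(r"deploy\((?P<svc>[\w-]+)\):", message)
    return m.group("svc") if m else None


compose = "services:\n  orders:\n    image: registry.example.com/orders:1.4.1\n"
updated = bump_image_tag(compose, "registry.example.com/orders", "1.4.2")
print(updated)
print(service_from_commit("deploy(orders): bump to 1.4.2"))  # -> orders
```

Committing <code>updated</code> to the Infrastructure Repository is the "handshake"; the deploy script on the VM then restarts only the service that <code>service_from_commit</code> names.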
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="real-world-application">Real-World Application<a href="https://archtenet.dev/blog/gitops-lite-pattern#real-world-application" class="hash-link" aria-label="Direct link to Real-World Application" title="Direct link to Real-World Application" translate="no">​</a></h2>
<p>I've utilized this setup extensively because the efficiency is unbeatable for staging environments. A brief, sub-5-second deployment downtime is acceptable in this case. Secrets are kept simple using service-specific <code>.env</code> files stored directly on the VM filesystem. The operational cost stays extremely low while Git history acts as a perfect, atomic audit log.</p>
<p>Best of all, it scales naturally. If a service becomes too resource-heavy, you can split it onto a dedicated VM while keeping the centralized configuration repo. If you eventually need zero downtime, you simply place a Load Balancer in front of two identical VMs and update the action to deploy to them sequentially.</p>
<p>In addition, if you need to move the environment to another cloud provider or cut it down while development is on pause, you’ll be able to spin it up again very easily, as all needed configs (except for sensitive creds, of course) are stored in the Infra repo.</p>
<hr>
<p>That’s how it has worked for me for more than 2 years, and it can be adopted by almost anyone effortlessly. Find more details in the full <a class="" href="https://archtenet.dev/docs/reference-architectures/ra-001-gitops-lite">Reference Architecture</a>, including the docker-compose and GitHub Actions workflow templates.</p>
<p><a href="https://doi.org/10.5281/zenodo.18750752" target="_blank" rel="noopener noreferrer" class=""><img decoding="async" loading="lazy" src="https://zenodo.org/badge/DOI/10.5281/zenodo.18750752.svg" alt="DOI" class="img_ev3q"></a></p>]]></content:encoded>
            <category>devops</category>
            <category>architecture</category>
            <category>cicd</category>
            <category>docker</category>
            <category>gitops</category>
        </item>
    </channel>
</rss>