WebGPU: The Future of Web Graphics
Pipelines, bind groups, timestamp queries—and a small triangle example to keep everything concrete.
WebGPU is a modern, low‑overhead graphics and compute API for the web. It exposes explicit control over pipelines, bind groups, command encoders, and compute workloads, bringing browser rendering closer to Vulkan, Metal, and Direct3D 12 in terms of mental model. Hidden driver behavior does not disappear, but far fewer surprises leak into frame times.
In practical terms, WebGPU favors deliberate setup over ad‑hoc state changes. Pipeline objects are created ahead of time and reused; resource bindings are grouped and treated as data; command buffers are constructed explicitly and submitted in batches. The trade‑off is extra boilerplate upfront in exchange for predictable performance and clear ownership of GPU work.
Some of the core shifts compared to WebGL:
- Explicit pipelines: pipeline state is compiled once and kept around, rather than being inferred from mutable global state.
- Bind groups: textures, buffers, and samplers are clustered into bind groups so that whole sets of resources can be swapped with a single call.
- Timestamp queries: GPUQuerySet makes it possible to measure the cost of individual passes instead of guessing from wall‑clock timings.
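As a sketch of how timestamp queries wire up (assuming the device was requested with the optional 'timestamp-query' feature; the helper name makePassTimer and the resolveBuf variable are illustrative, not part of the standard API):

```javascript
// Sketch: measure a pass with a two-slot timestamp query set.
// Requires the optional 'timestamp-query' feature on the device.
function makePassTimer(device) {
  const querySet = device.createQuerySet({ type: 'timestamp', count: 2 });
  const resolveBuf = device.createBuffer({
    size: 16, // two 64-bit timestamps
    usage: GPUBufferUsage.QUERY_RESOLVE | GPUBufferUsage.COPY_SRC,
  });
  return {
    // Spread into a render pass descriptor to record begin/end times:
    //   encoder.beginRenderPass({ colorAttachments, timestampWrites })
    timestampWrites: { querySet, beginningOfPassWriteIndex: 0, endOfPassWriteIndex: 1 },
    // After the pass: encoder.resolveQuerySet(querySet, 0, 2, resolveBuf, 0),
    // then copy resolveBuf into a MAP_READ buffer and diff the two u64 values.
    querySet,
    resolveBuf,
  };
}
```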
The minimal triangle example below shows the end‑to‑end setup without helpers or frameworks. It covers adapter selection, device creation, canvas configuration, and a single render pass that clears the screen and draws three vertices:
// 1) Init
const canvas = document.querySelector('canvas');
const adapter = await navigator.gpu?.requestAdapter();
if (!adapter) throw new Error('WebGPU is not supported in this browser');
const device = await adapter.requestDevice();
const context = canvas.getContext('webgpu');
const format = navigator.gpu.getPreferredCanvasFormat();
context.configure({ device, format });
// 2) Shaders (WGSL)
const shader = device.createShaderModule({ code: `
@vertex fn v_main(@builtin(vertex_index) vi: u32) -> @builtin(position) vec4f {
var p = array<vec2f,3>(
vec2f(0.0, 0.6), vec2f(-0.6, -0.6), vec2f(0.6, -0.6)
);
return vec4f(p[vi], 0.0, 1.0);
}
@fragment fn f_main() -> @location(0) vec4f {
return vec4f(0.9, 0.3, 0.2, 1.0);
}
`});
// 3) Pipeline
const pipeline = device.createRenderPipeline({
layout: 'auto',
vertex: { module: shader, entryPoint: 'v_main' },
fragment: { module: shader, entryPoint: 'f_main', targets: [{ format }] },
primitive: { topology: 'triangle-list' }
});
// 4) Draw
function frame(){
const encoder = device.createCommandEncoder();
const view = context.getCurrentTexture().createView();
const pass = encoder.beginRenderPass({
colorAttachments: [{ view, loadOp: 'clear', storeOp: 'store', clearValue: {r:0.08,g:0.09,b:0.1,a:1} }]
});
pass.setPipeline(pipeline);
pass.draw(3);
pass.end();
device.queue.submit([encoder.finish()]);
requestAnimationFrame(frame);
}
frame();
Even in this small example, the WebGPU style is visible: pipeline and shaders are created once, while each animation frame focuses on encoding commands and submitting work. That separation scales cleanly to scenes with multiple passes and frame‑graph‑style orchestration.
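One way that multi-pass shape can look, sketched as a generic per-frame encoder (the passes array, its descriptor/pipeline/draws fields, and encodeFrame itself are illustrative names, not part of the WebGPU API):

```javascript
// Sketch: one command encoder per frame, several passes, one submit.
// Each entry supplies a prebuilt pass descriptor, pipeline, and draw list.
function encodeFrame(device, passes) {
  const encoder = device.createCommandEncoder();
  for (const { descriptor, pipeline, draws } of passes) {
    const pass = encoder.beginRenderPass(descriptor);
    pass.setPipeline(pipeline);
    for (const d of draws) {
      pass.setBindGroup(0, d.bindGroup); // swap a whole resource set in one call
      pass.draw(d.vertexCount);
    }
    pass.end();
  }
  device.queue.submit([encoder.finish()]); // one submission for the whole frame
}
```

The pipelines and bind groups are built once at load time; only the cheap encoding loop runs per frame.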
WebGPU is not limited to rasterization. Compute pipelines share the same device, queue, and resource model, which makes it straightforward to interleave rendering passes with general‑purpose GPU work. The following snippet performs a prefix sum over a small buffer as a compact demonstration of dispatching compute work and reading the results back to the CPU:
const n = 256;
const input = new Uint32Array(n).map((_,i) => i);
const inBuf = device.createBuffer({ size: input.byteLength, usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST });
const outBuf = device.createBuffer({ size: input.byteLength, usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC });
const readBuf = device.createBuffer({ size: input.byteLength, usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST });
device.queue.writeBuffer(inBuf, 0, input);
const cshader = device.createShaderModule({ code: `
@group(0) @binding(0) var<storage, read> In: array<u32>;
@group(0) @binding(1) var<storage, read_write> Out: array<u32>;
@compute @workgroup_size(64) fn main(@builtin(global_invocation_id) gid: vec3u) {
  let i = gid.x;
  if (i >= arrayLength(&In)) { return; }
  // Naive inclusive scan: each invocation independently sums In[0..i].
  // WGSL has no ternary operator, and reading Out[i-1] written by another
  // invocation would be a data race, so each lane computes its own total.
  var acc = 0u;
  for (var j = 0u; j <= i; j = j + 1u) { acc = acc + In[j]; }
  Out[i] = acc;
}
`});
const cpipe = device.createComputePipeline({ layout: 'auto', compute: { module: cshader, entryPoint:'main' } });
const group = device.createBindGroup({ layout: cpipe.getBindGroupLayout(0), entries:[
{ binding:0, resource:{ buffer: inBuf } },
{ binding:1, resource:{ buffer: outBuf } }
]});
const enc = device.createCommandEncoder();
const passC = enc.beginComputePass();
passC.setPipeline(cpipe); passC.setBindGroup(0, group); passC.dispatchWorkgroups(Math.ceil(n/64));
passC.end(); enc.copyBufferToBuffer(outBuf,0,readBuf,0,input.byteLength);
device.queue.submit([enc.finish()]);
await readBuf.mapAsync(GPUMapMode.READ);
const result = new Uint32Array(readBuf.getMappedRange().slice(0)); // copy before unmap detaches the range
readBuf.unmap();
console.log(result);
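A CPU reference scan is useful for validating the readback; a plain JavaScript sketch (prefixSumCPU is a helper for checking, not part of the GPU example above):

```javascript
// Inclusive prefix sum on the CPU, keeping u32 wraparound semantics so the
// result can be compared element-for-element against the GPU readback.
function prefixSumCPU(input) {
  const out = new Uint32Array(input.length);
  let acc = 0;
  for (let i = 0; i < input.length; i++) {
    acc = (acc + input[i]) >>> 0; // mirror 32-bit unsigned arithmetic
    out[i] = acc;
  }
  return out;
}

// For input 0..255, the last element is 255 * 256 / 2 = 32640.
const ref = prefixSumCPU(Uint32Array.from({ length: 256 }, (_, i) => i));
console.log(ref[0], ref[1], ref[255]); // 0 1 32640
```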
In production workloads, compute stages often handle tasks such as culling, clustering, particle updates, or preprocessing data for later rendering passes. Common patterns include designing pipeline layouts early, minimizing transient buffer allocations, and batching updates through queue.writeBuffer or staging buffers. Timings from GPUQuerySet help track regressions and keep each pass honest.
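One way the writeBuffer batching pattern can look, as a sketch (makeUniformBatcher, its capacity parameter, and the CPU-side staging array are illustrative; only device.createBuffer and queue.writeBuffer are standard API):

```javascript
// Sketch: coalesce many small per-frame uniform updates into a single
// writeBuffer call instead of issuing one tiny upload per object.
function makeUniformBatcher(device, capacityBytes) {
  const gpuBuf = device.createBuffer({
    size: capacityBytes,
    usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST,
  });
  const staging = new ArrayBuffer(capacityBytes); // CPU-side shadow copy
  const view = new Float32Array(staging);
  return {
    buffer: gpuBuf,
    // Accumulate updates into the staging array during the frame...
    set(offsetFloats, values) { view.set(values, offsetFloats); },
    // ...then upload everything in one call before submitting.
    flush() { device.queue.writeBuffer(gpuBuf, 0, staging); },
  };
}
```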