MansfieldPlumbing
github.com/MansfieldPlumbing
★ ZERO-COPY GPU MEMORY TRANSPORT ⚠ NO TERMUX — NO LINUX LAYER — NO CHROOT ★ FIRST NATIVE POWERSHELL RUNSPACE ON ANDROID ⚠ GC REMOVED FROM THE DATA PLANE ★ IN-PROC ROSLYN GATE — FAIL-CLOSED ⚠ NO PYTHON AT RUNTIME
MansfieldPlumbing
// Systems Engineering — Windows · Android · GPU

I Fix
The Pipes.

Subsystem runs PowerShell 7.7 in-process inside a native Android app — no Termux, no Linux layer, no chroot, no root. As far as the record shows, that had not been done before. The runspace is one mounted device inside an NT-Object-Manager-shaped CoreCLR runtime: one namespace of refcounted handles, one push transport, no garbage collector on the data plane. Specification and log.

01 //

Projects

01 — Subsystem

NT-Shaped Runtime

An in-process CoreCLR / .NET 11 runtime — JIT, full reflection, runtime self-compile via Roslyn — hosting PowerShell 7.7, with the garbage collector removed from the data plane. One object namespace of refcounted handles; per-owner quotas; deterministic cascade-kill. As far as the record shows, the first native in-process CoreCLR + PowerShell runspace on Android: no Termux, no Linux layer, no chroot, no root.
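A minimal sketch of the handle model described above: refcounted handles in one namespace, a per-owner quota, and a terminate that cancels first and then revokes depth-first. It is written in Python for brevity and is not Subsystem's code. The names `Owner`, `Namespace`, `QuotaExceeded`, and `demo` are hypothetical, and treating the quota as "maximum distinct handles per owner" is an assumption.

```python
import threading
from typing import Dict, List, Optional, Tuple


class QuotaExceeded(Exception):
    """Raised when an owner tries to hold more distinct handles than its quota."""


class Owner:
    def __init__(self, name: str, quota: int, parent: Optional["Owner"] = None) -> None:
        self.name = name
        self.quota = quota                  # assumption: max distinct handles
        self.parent = parent
        self.children: List["Owner"] = []
        self.handles: Dict[str, int] = {}   # handle name -> refcount
        self.cancelled = threading.Event()  # stand-in for a cancellation token
        if parent is not None:
            parent.children.append(self)


class Namespace:
    def __init__(self) -> None:
        self.owners: Dict[str, Owner] = {}
        self.revoked: List[str] = []        # revocation order, kept for inspection

    def add_owner(self, name: str, quota: int, parent: Optional[Owner] = None) -> Owner:
        owner = Owner(name, quota, parent)
        self.owners[name] = owner
        return owner

    def open(self, owner: Owner, handle: str) -> None:
        if owner.cancelled.is_set():
            raise RuntimeError(f"{owner.name} is terminated")
        if handle not in owner.handles and len(owner.handles) >= owner.quota:
            raise QuotaExceeded(f"{owner.name} is at quota {owner.quota}")
        owner.handles[handle] = owner.handles.get(handle, 0) + 1

    def close(self, owner: Owner, handle: str) -> None:
        owner.handles[handle] -= 1
        if owner.handles[handle] == 0:      # last reference released
            del owner.handles[handle]

    def terminate(self, owner: Owner) -> None:
        self._cancel(owner)                 # step 1: cancel the whole subtree
        self._revoke(owner)                 # step 2: revoke handles, children first
        if owner.parent is not None:
            owner.parent.children.remove(owner)

    def _cancel(self, owner: Owner) -> None:
        owner.cancelled.set()
        for child in owner.children:
            self._cancel(child)

    def _revoke(self, owner: Owner) -> None:
        for child in owner.children:        # depth-first: descendants go first
            self._revoke(child)
        for handle in sorted(owner.handles):
            self.revoked.append(f"{owner.name}/{handle}")
        owner.handles.clear()
        owner.children.clear()
        self.owners.pop(owner.name, None)


def demo() -> Tuple[int, int, List[str]]:
    ns = Namespace()
    app = ns.add_owner("app", quota=2)
    worker = ns.add_owner("worker", quota=1, parent=app)
    task = ns.add_owner("task", quota=1, parent=worker)
    ns.open(app, "tex0")
    ns.open(worker, "buf0")
    ns.open(task, "fence0")
    before = len(ns.owners)
    ns.terminate(app)
    return before, len(ns.owners), ns.revoked
```

Calling `demo()` terminates the root of a three-owner chain. The revocation log is ordered deepest owner first, which is what makes the teardown deterministic in this sketch.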

Specification + Log → Source →
02 — DirectPort-SDK

GPU IPC

Push-model GPU memory transport for Windows. A producer writes a 256-byte-aligned shared texture or buffer exposed through a named NT shared handle, then signals a monotonic timeline fence; consumers wait on the fence value. Zero copies on the same adapter, no polling, no OS scheduler in the hot path. Wait latency is unmeasured; no figure is cited until a benchmark prints one.
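The push model above can be illustrated with a CPU-side analogy. This is a sketch in Python, not the DirectPort API: `TimelineFence`, `align_up`, and `run_demo` are hypothetical names, a `bytearray` stands in for the shared GPU resource, and the second fence used for back-pressure is an assumption that the text does not describe.

```python
import threading
from typing import List, Optional

ALIGNMENT = 256  # byte alignment quoted in the text


def align_up(size: int, alignment: int = ALIGNMENT) -> int:
    """Round size up to the next multiple of a power-of-two alignment."""
    return (size + alignment - 1) & ~(alignment - 1)


class TimelineFence:
    """CPU stand-in for a monotonic timeline fence (hypothetical class)."""

    def __init__(self) -> None:
        self._value = 0
        self._cond = threading.Condition()

    def signal(self, value: int) -> None:
        with self._cond:
            if value <= self._value:
                raise ValueError("fence values must increase monotonically")
            self._value = value
            self._cond.notify_all()  # wake waiters; nobody polls

    def wait(self, value: int, timeout: Optional[float] = None) -> bool:
        """Block until the fence reaches value; False if the timeout expires."""
        with self._cond:
            return self._cond.wait_for(lambda: self._value >= value, timeout)


def run_demo(frames: int = 3) -> List[int]:
    ready = TimelineFence()     # producer -> consumer: "frame n is written"
    consumed = TimelineFence()  # assumption: back-pressure so frames are not overwritten
    shared = bytearray(align_up(100))  # 100 bytes requested, 256 allocated
    seen: List[int] = []

    def producer() -> None:
        for n in range(1, frames + 1):
            if n > 1:
                consumed.wait(n - 1)
            shared[0:4] = n.to_bytes(4, "little")  # write in place, no copy out
            ready.signal(n)

    def consumer() -> None:
        for n in range(1, frames + 1):
            ready.wait(n)  # wait on the fence value, not on the data
            seen.append(int.from_bytes(shared[0:4], "little"))
            consumed.signal(n)

    threads = [threading.Thread(target=consumer), threading.Thread(target=producer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

The consumer never inspects the buffer to find out whether a frame is ready. It blocks on a fence value and reads only after the producer has signalled that value, which is the property the push model relies on.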

DirectPort-SDK →
03 — VirtuaCam

Virtual Camera

A Media Foundation virtual camera over DirectPort. Producer applications run as separate processes and share D3D11 textures with fences through NT handles; a broker multiplexes the feeds into a composited output registered as a system camera. Inter-process frame transfers stay on the GPU.

VirtuaCam →
04 — DPX Interpreter

ONNX Without onnxruntime

An ONNX inference engine written in .NET from the protobuf down: the interpreter walks the graph in topological order and dispatches to hand-rolled kernels. It runs Kokoro-82M text-to-speech end to end — 49 of 49 op-types, all 2,463 graph nodes — at RMSE 2.516e-2 against the onnxruntime oracle. It is the substrate under Subsystem's offline TTS and its local LLM path.
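The walk-and-dispatch loop described above can be sketched in a few lines. This is an illustration in Python, not DPX: the `Node` type, the three toy kernels, and the function names are hypothetical, tensors are flat lists of floats, and each node is assumed to produce exactly one output.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

Tensor = List[float]


@dataclass(frozen=True)
class Node:
    op_type: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


# Dispatch table: one hand-written kernel per op-type (toy elementwise ops).
KERNELS: Dict[str, Callable[..., Tensor]] = {
    "Add": lambda a, b: [x + y for x, y in zip(a, b)],
    "Mul": lambda a, b: [x * y for x, y in zip(a, b)],
    "Relu": lambda a: [max(0.0, x) for x in a],
}


def topo_order(nodes: List[Node], available: Iterable[str]) -> List[Node]:
    """Order nodes so every input is produced before the node that reads it."""
    ready = set(available)
    pending = list(nodes)
    ordered: List[Node] = []
    while pending:
        runnable = [n for n in pending if all(i in ready for i in n.inputs)]
        if not runnable:
            raise ValueError("graph has a cycle or a missing input")
        for n in runnable:
            ordered.append(n)
            ready.update(n.outputs)
        pending = [n for n in pending if n not in runnable]
    return ordered


def run(nodes: List[Node], feeds: Dict[str, Tensor]) -> Dict[str, Tensor]:
    values = dict(feeds)
    for node in topo_order(nodes, feeds):
        kernel = KERNELS.get(node.op_type)
        if kernel is None:
            raise NotImplementedError(f"no kernel for op-type {node.op_type}")
        values[node.outputs[0]] = kernel(*[values[i] for i in node.inputs])
    return values


def rmse(a: Tensor, b: Tensor) -> float:
    """Root-mean-square error between an output and a reference output."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Nodes are listed out of order on purpose; topo_order sorts them.
GRAPH = [
    Node("Relu", ("t2",), ("y",)),
    Node("Mul", ("t1", "w"), ("t2",)),
    Node("Add", ("x", "b"), ("t1",)),
]
FEEDS = {"x": [1.0, -2.0, 3.0], "b": [1.0, 1.0, 1.0], "w": [2.0, 2.0, 2.0]}
```

`run(GRAPH, FEEDS)["y"]` evaluates Relu((x + b) * w). An op-type with no entry in the table raises instead of being skipped, which is how a "49 of 49" coverage count can be checked mechanically, and `rmse` is the comparison used against a reference output.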

Part of Subsystem →
05 — Demucs_v4_TRT

Kernel-Fused Source Separation

Meta's HTDemucs 6-stem separator exported as a single ONNX graph — the STFT runs inside the export, so TensorRT fuses the full dataflow. Prior public ports hand-wrote kernels around an externalized STFT; keeping it in the graph makes those kernels unnecessary. Compiled FP16 for a native Windows executable: C# host, C++ bridge, no Python at runtime. 118.7 ms mean GPU compute per 7.8-second chunk on an RTX 3090 (trtexec receipt in-repo).

Demucs_v4_TRT → HuggingFace →
06 — RIFE 4.9

TensorRT + Hexagon NPU

RIFE 4.9 frame interpolation, compiled two ways. The first is TensorRT for native Windows: Media Foundation hardware decode, C# host, no Python at runtime. The second is a Qualcomm QNN context binary for the Hexagon V73 NPU (Snapdragon 8 Gen 2), a model with no prior QNN port, run on a stock phone without root. Device output is bit-exact against the host reference: max|diff| = 0.000. Pure NPU execute is 81 ms per 256×256 frame, a ceiling of about 12 fps (1000 / 81 ≈ 12.3); the identified lever is W8A16 quantization. Depth Anything V2 is ported to TensorRT the same way.

RIFE_TRT → Depth_TRT →
02 //

Receipts

System | Measurement | Mechanism | Receipt
Subsystem gate | 573 standing findings, 0 new permitted | 26 Roslyn analyzers, in-process, fail-closed ratchet | GREEN 2026-07-01
Kokoro-82M on DPX | 49/49 op-types, all 2,463 graph nodes | ONNX interpreter in .NET, no onnxruntime | RMSE 2.516e-2 vs oracle
MatMulNBits SIMD | 16-token decode: 88.4 s → 17.1 s | Hand-rolled SIMD q4 kernel | 5.2×
Demucs_v4_TRT | Mean GPU compute per chunk, RTX 3090 | Single-graph FP16 engine, fused STFT | 118.7 ms
RIFE 4.9 on Hexagon V73 | 81 ms/frame pure NPU execute at 256×256; device vs host reference | QNN context binary, HTP backend, on-device profile | max|diff| = 0.000
VOM cascade-kill | 3 owners → 0, native memory reclaimed | Terminate: token cancel → depth-first handle revoke | Selftest GREEN

A number appears here only after a benchmark prints it.
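The "fail-closed ratchet" in the Subsystem gate row can be sketched as follows. This is a Python illustration under stated assumptions, not the Roslyn gate itself: findings are modelled as plain strings, `gate` and `ratchet` are hypothetical names, and letting the baseline shrink but never grow is an assumption about what "ratchet" means here.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def gate(baseline: FrozenSet[str],
         run_analyzers: Callable[[], Iterable[str]]) -> Tuple[bool, List[str]]:
    """Pass only if the analyzers ran and reported nothing outside the baseline."""
    try:
        findings = set(run_analyzers())
    except Exception as exc:
        # Fail-closed: an analyzer that cannot run blocks the change.
        return False, [f"analyzer error: {exc}"]
    new = sorted(findings - baseline)
    return (not new), new


def ratchet(baseline: FrozenSet[str], findings: Iterable[str]) -> FrozenSet[str]:
    """Assumption: the baseline may shrink as findings are fixed, never grow."""
    return baseline & frozenset(findings)


def broken_analyzer() -> Iterable[str]:
    raise RuntimeError("analyzer failed to load")


BASELINE = frozenset({"A001:Foo.cs", "A002:Bar.cs"})
```

With this shape, `gate(BASELINE, lambda: ["A001:Foo.cs"])` passes, any finding outside the baseline fails, and `gate(BASELINE, broken_analyzer)` also fails rather than passing by default.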

03 //

Public Service

CoolPro: An image editor in the browser
CoolProMobile: The same editor, shaped for phones
MansfieldTeachesTyping: A typing tutor
ArlineArcade: Arcade games
Emulator: An emulator in the browser
art4quinn: Generative remixes

Free static web apps. No accounts, no ads, no tracking, no backend. They run in the browser and install as PWAs.