Est. 2019 · New City, New York · Studio · Lab · Ops

AI systems,
implemented & operated.

Practice
AI implementation, generative pipelines, and infrastructure orchestration — across studio work and a private compute environment.
Based
New City, New York — operating remote, with private compute set up for generation-heavy work.
Currently
Shipping content tracks and quietly improving the operating layer.
Booking
Q3 · 2026
Accepting brief-stage engagements. 20-min intro, no decks required.
01 — 09
Selected work

Nine projects, three stacks, one workshop.

A snapshot of the work currently moving through the studio. The first three cards are stacks — collections of tools I implement against. The remaining six are projects shipping out of those stacks. Click any card for the full read.

AI Stack — agentic routing topology
AI Stack
01 · Stack
Claude · Hermes Agent · Paperclip · GPT-5.5

The agentic stack I implement against — a Claude / GPT-5.5 reasoning layer fronted by Hermes (the agent runtime) and Paperclip (the gateway / cron / delegation tier). One default agent, many models, all calls audited at the edge.

What I'm learning
  • Centralizing access through one gateway makes cost, audit, and rate limits enforceable. Handing each tool its own direct key is the slow drift into chaos.
  • "Agent" is a verb (a thing that decides what to call) more than a noun. Picking one default keeps the verb consistent.
  • Tool inventory belongs in the agent layer, not in the prompt — keeps prompts small and tools swappable.
  • One agent, many models is easier to operate than many agents, one model.
  • Cron-driven agent delegation is dramatically simpler than event-driven for small teams. Most "real-time" requirements aren't.
  • Mass-update workflows require idempotency from day one — or the rollback story gets ugly fast.
  • Audit logging at the gateway saves arguing about what an agent actually called when something breaks at 2am.
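A minimal sketch of the single-gateway idea from the list above, assuming an in-memory audit log and a simple per-minute call cap. The `Gateway` class, its field names, and the backend callables are hypothetical and are not the Hermes or Paperclip API. The only point is that every call passes through one place that can refuse it and record it.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Gateway:
    """Single entry point for model calls (hypothetical, for illustration)."""
    backends: Dict[str, Callable[[str], str]]  # model name -> callable
    max_calls_per_minute: int = 60
    audit_log: List[dict] = field(default_factory=list)
    _recent: List[float] = field(default_factory=list)  # timestamps of accepted calls

    def call(self, caller: str, model: str, prompt: str) -> str:
        now = time.time()
        # Keep only timestamps from the last 60 seconds, then enforce the cap.
        self._recent = [t for t in self._recent if now - t < 60]
        if len(self._recent) >= self.max_calls_per_minute:
            self._audit(caller, model, "rate_limited")
            raise RuntimeError("rate limit exceeded")
        if model not in self.backends:
            self._audit(caller, model, "unknown_model")
            raise KeyError(model)
        self._recent.append(now)
        result = self.backends[model](prompt)
        self._audit(caller, model, "ok")
        return result

    def _audit(self, caller: str, model: str, status: str) -> None:
        # Every outcome is recorded, including refusals.
        self.audit_log.append(
            {"ts": time.time(), "caller": caller, "model": model, "status": status}
        )
```

Swapping a model then means changing one entry in `backends`, and the 2am question becomes a filter over `audit_log`.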
Tools
Claude · Hermes · Paperclip · GPT-5.5
Posture
One default agent
Many models
Audit at the edge
Generation Stack — image, video, audio tools
Generation Stack
02 · Stack
ComfyUI · GPT-Image2 · LTX 2.3 · AceStep 1.5 XL

The media-generation stack covering image, video, and audio. ComfyUI is the workflow editor; GPT-Image2 handles batch persona renders; LTX 2.3 is the video diffusion runner; AceStep 1.5 XL covers music. Each tool is wired to the same farm and gateway.

What I'm learning
  • A saved workflow .json is more reproducible than any prompt — treat it like code, commit it, diff it.
  • Custom nodes age fast; pin the ones you depend on or accept the breakage.
  • Splitting the workflow at the latent stage (load, sample, decode separately) is the cheapest debugging move.
  • Persona enrichment as structured JSON beats freeform prompt rewrites every time — and survives reruns.
  • Visual-DNA maps make characters consistent across hundreds of frames without re-prompting from scratch.
  • Video models reward thinking in shots, not seconds — pacing is a prompt input, not a render setting.
  • Music generation is most useful when the visual cut already exists — composition lives in the video, not the audio.
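One way to picture "persona as structured JSON" from the list above is sketched below. The `Persona` fields, the `enrich` function, and the example values are all hypothetical; this is not the schema used in the studio, only an illustration of building every prompt from persisted data instead of rewriting it by hand.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class Persona:
    """Persisted character traits (hypothetical fields, for illustration)."""
    name: str
    hair: str
    wardrobe: str
    palette: Tuple[str, ...]


def enrich(persona: Persona, shot: str) -> str:
    """Build a prompt from the stored persona plus a per-shot description."""
    traits = (
        f"{persona.hair} hair, {persona.wardrobe}, "
        f"palette: {', '.join(persona.palette)}"
    )
    return f"{persona.name}, {traits}. Shot: {shot}"


def to_json(persona: Persona) -> str:
    """Serialize with sorted keys so the file diffs cleanly in version control."""
    return json.dumps(asdict(persona), sort_keys=True, indent=2)
```

Because the same record feeds every call, a rerun with the same shot text yields the same prompt, and a change to the character shows up as a diff in one file.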
Tools
ComfyUI · GPT-Image2 · LTX 2.3 · AceStep
Coverage
Image · Video
Audio · Music
LoRAs · Fine-tunes
Workflow as code
Hosted Stack — self-hosted mail and edge
Hosted Stack
03 · Stack
Mail Operations · Web Operations

The self-hosted operations layer — managed mail with proper authentication and an outbound relay, plus a portfolio-wide edge / DNS / proxy setup with structured contact-form sweeping. Both managed as data, both built to roll back per zone.

What I'm learning
  • Reputation is the metric that matters; everything else is upstream of it.
  • An outbound relay for transactional mail is a much better story than sending direct from the mail server.
  • Authentication reports are noisy until you act on them — read them weekly or don't enable reporting at all.
  • The dry-run output IS the rollout plan. Anything you can't reproduce in dry-run won't behave the same on apply.
  • Parked domains are a contact-form attack surface most teams forget exists. Sweep them like any other input.
  • Edge rules outlive any single deployment script — write them as data, not as one-shot commands.
  • Phased rollout per zone is slower but spares you the "everything broke at once" debugging.
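The "rules as data" and "dry-run is the rollout plan" points above can be sketched as a plan/apply pair. The rule names and values here are hypothetical, and no real DNS or edge provider API is called; the assumption is simply that a zone's rules can be represented as a name-to-value mapping.

```python
from typing import Dict, List, Tuple

Rules = Dict[str, str]          # rule name -> rule value, for one zone
Change = Tuple[str, str, str]   # (action, rule name, value)


def plan(current: Rules, desired: Rules) -> List[Change]:
    """Return the ordered changes needed to move `current` to `desired`."""
    changes: List[Change] = []
    for name in sorted(desired):
        if name not in current:
            changes.append(("add", name, desired[name]))
        elif current[name] != desired[name]:
            changes.append(("update", name, desired[name]))
    for name in sorted(current):
        if name not in desired:
            changes.append(("remove", name, current[name]))
    return changes


def apply(current: Rules, changes: List[Change]) -> Rules:
    """Apply a plan to a copy of the zone state; the input is left untouched."""
    result = dict(current)
    for action, name, value in changes:
        if action == "remove":
            result.pop(name, None)
        else:
            result[name] = value
    return result
```

Printing the output of `plan` is the dry run, and `apply` consumes that same list, so what was reviewed is what runs. Calling the pair once per zone gives the phased rollout, and `plan(new, old)` gives the rollback.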
Approach
Mail · Web · DNS · Edge
Stack
SES · AWS
Posture
Self-hosted
Managed deliverability
Phased rollout
AI Image/Video Farm
04 · Live
Generation & LoRAs · Distributed Inference

A private compute environment for image and video generation, with LoRA fine-tuning baked into the pipeline. Centralized orchestration handles model routing, throughput shaping, and node-level health.

What I'm learning
  • VRAM is the binding constraint on multi-model serving — utilization without VRAM tracking is a misleading green dashboard.
  • LoRA training stays cheap if the base model is locked and only the adapter cycles — full retrains rarely earn their cost.
  • Auto-load/unload is the difference between "real cluster" and "one big model that won't move."
  • Cost-per-generation only becomes a real number once node-level telemetry is wired in.
  • Long-running generations should checkpoint to disk; "cluster goes down at 80%" should not erase the work.
  • Throughput shaping per-tenant is the difference between "shared farm" and "one user starves everyone else."
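A rough sketch of scheduling on VRAM headroom with auto-load/unload, as described in the list above. The `Node` class, `pick_node`, and the model sizes are hypothetical; a real farm would read free memory from the GPU driver instead of tracking it in a dictionary, and eviction policy here is assumed to be least-recently-used.

```python
from collections import OrderedDict
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One GPU node with a fixed VRAM budget (hypothetical, for illustration)."""
    name: str
    vram_gb: float
    # model name -> size in GB, ordered from least to most recently used
    loaded: OrderedDict = field(default_factory=OrderedDict)

    def free_gb(self) -> float:
        return self.vram_gb - sum(self.loaded.values())

    def ensure_loaded(self, model: str, size_gb: float) -> List[str]:
        """Load `model`, unloading least-recently-used models until it fits.

        Returns the names of any models that were unloaded.
        """
        if size_gb > self.vram_gb:
            raise ValueError(f"{model} cannot fit on {self.name}")
        evicted: List[str] = []
        if model in self.loaded:
            self.loaded.move_to_end(model)  # mark as most recently used
            return evicted
        while self.free_gb() < size_gb:
            old, _ = self.loaded.popitem(last=False)
            evicted.append(old)
        self.loaded[model] = size_gb
        return evicted


def pick_node(nodes: List[Node], model: str, size_gb: float) -> Node:
    """Prefer a node that already holds the model, else the most free VRAM."""
    candidates = [n for n in nodes if n.vram_gb >= size_gb]
    if not candidates:
        raise ValueError(f"no node can hold {model}")
    for n in candidates:
        if model in n.loaded:
            return n
    return max(candidates, key=Node.free_gb)
```

The scheduler never looks at utilization; it only asks how much memory is free and what is already resident, which is the headroom signal the bullets argue for.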
Approach
Linux · CUDA · LoRA · Local
Topology
Multi-node
Internal-only
Auto-routing
AI Attendant and Assistant
05 · Alpha
Voice-driven Attendant · Always-on Assistant

A speech-first assistant on a self-hosted speech pipeline — text-to-speech, speech-to-text, and multi-language voice generation, wired to a turn-taking attendant and an always-on assistant interface.

What I'm learning
  • Model selection matters more than parameter tuning for TTS quality — the wrong base model can't be fixed downstream.
  • Subtitle alignment is its own problem class. STT timestamps need post-processing per language.
  • MOS-style scoring is the only way to compare voices without taste arguments creeping in.
  • Multi-language coverage cascades into engine choice — the language list is locked early.
  • Turn-taking is harder than transcription — the assistant that interrupts itself loses every conversation.
  • Latency budget is real; sub-second response is the floor, sub-300ms first-syllable is the goal.
  • A voice persona is a brand asset — it should be consistent across calls, sessions, even years.
Approach
TTS · STT · Multi-lang · Turn-taking
Coverage
Voice-first
Multi-language
Sub-second target
AI Music Idols — albums and music videos
AI Music Idols
06 · Live
Albums & Music Videos · Character-led Acts

A small label of in-house AI idols — characters with discographies, not producers with stage names. Currently QKeyV (synth · dark pop) and DvYnT (electronic · cinematic). Releases ship as albums plus visual companions cut from the same generative pipeline.

What I'm learning
  • Visuals tied to the audio cut from day one beat post-hoc music videos — every time.
  • Album shape forces a story; single-track-only acts lose continuity fast.
  • Distribution is its own production pass — budget it as a stage.
  • An idol is a character with a discography, not a producer's stage name — write the persona, then the songs.
  • Voice consistency across releases is what keeps an act feeling like one act, not many.
  • The fan loop (release → social → response → next release) is the actual product, not any single song.
Acts
QKeyV · DvYnT
Output
Albums + videos
Visual companions
DSP distribution
AI Persona, Models, UGC roster
AI Persona, Models, UGC
07 · Live
Persona Roster · Brand-safe Media · UGC Feeds

A locked roster of AI personas used as brand-safe media assets. Each persona ships with a defined aesthetic, a social tone, a UGC-ready feed format, and album/track structure for ongoing campaigns. Output runs through manual QA before anything ships.

What I'm learning
  • A persona is a brand asset — versioned, signed off, retired the same way logos are.
  • Visual DNA is what separates a real character from a generated face — without it, every shoot starts at zero.
  • UGC feel comes from imperfection on purpose, not from prompt-perfect renders.
  • Cross-platform formats (story, reel, square) cost less when designed up front than when retrofitted.
  • Album/track structure forces editorial rhythm into otherwise scattered AI output.
  • DOCX as a deliverable beats HTML for client review — comments and markup land where reviewers already work.
  • Quality assurance has to live in the loop, not bolted on at the end.
Approach
Roster · UGC · Social · Brand-safe
Use
Campaign assets
Social feeds
UGC seeding
AI Learning Course — one foundation, four modality tracks
AI Learning Course
08 · 1 Foundation · 4 Tracks
Text · Image · Video · Voice · All built on the same transformer math

A from-the-ground-up curriculum on transformer-based AI, structured by output modality rather than module number. One Common Foundation covers representation, math, architecture, training, and inference: the layers that apply to every transformer regardless of what it generates. Four modality tracks then specialize: Text, Image, Video, Voice. The same attention mechanism powers GPT, Stable Diffusion, Sora, and ElevenLabs, and seeing them side by side is the curriculum's organizing idea. Authored twice in parallel with two AI co-authors (Gemini + Antigravity, Sonnet 4.6) and folded into a single canonical narrative. 125 audio narrations, two interactive demos, one pre-rendered 3D embedding video. No login. No paywall.

Open the course →

§1 Common Foundation · the five layers every modality builds on
  • 1A Representation — tokens, embeddings, the prequel methods (3 modules).
  • 1B Math underneath — softmax, cross-entropy, why next-token prediction works (2 modules).
  • 1C Architecture — attention, Q/K/V, the transformer block, encoder-decoder (6 modules).
  • 1D Training — pretraining, optimizations, RLHF, DPO/KTO/ORPO, PEFT/LoRA (6 modules).
  • 1E Inference — the decoding loop, the highway, offline engines, the HF registry (4 modules).
§2-§5 Four modality tracks · current depth + future slots visible
  • §2 Text — RAG · chat UI · latent space. Inherits all of §1. Most complete track.
  • §3 Image — Vision Transformers, cross-attention, VAE+Diffusion, ControlNet. 4 lessons + 5 future slots.
  • §4 Video — AI Film Studios pioneers (Sora, Runway, Luma, Kling, Higgsfield). 1 lesson + 4 future slots.
  • §5 Voice — Speech synthesis and recognition (ElevenLabs, Whisper). 1 lesson + 4 future slots.
What I learned shipping the course
  • Two AI co-authors covering the same material in parallel surfaces what each tool understands and what it papers over — the diff is the lesson.
  • The original module numbering hid the modality structure — module 04 ("vision-speech") is actually pure Vision Transformers; module 26 ("multimodal-film") covers both video AND speech and had to be split into 26v + 26s.
  • A common foundation that all modality tracks inherit beats duplicating the basics per track — mirrors how the systems actually work.
  • Visible "Coming soon" slots in thin tracks (Video, Voice) set honest expectations and show where future content lands — better than hiding the asymmetry.
  • Three.js scrollytelling looks great in a demo and harms reading flow on a real lesson — the revamp drops the 3D shell and keeps the canonical content.
  • Curriculum survives best when anchored on principles rather than specific products — Llama 3 and FLUX.2 examples will date; the residual stream and the diffusion process won't.
Foundation
21 modules
5 layers
(1A-1E)
Modality tracks
Text · 3 modules
Image · 4 + 5 future
Video · 1 + 4 future
Voice · 1 + 4 future
Media
125 mp3 clips
2 interactive demos
1 pre-rendered video
3 hero diagrams
Format
Self-paced · No login · Audio · Demos
Source
github.com/ryzenx570/LLM-Understanding
Security, IDS, Hardening
09 · Live
SOC Posture · Intrusion Detection · Runbooks

Continuous security work across access policy, an ongoing threat-tracking notebook, an intrusion-detection layer, versioned hardening checklists, and a monitoring runbook. Real attack analysis feeds back into the runbook instead of drifting into an archive.

What I'm learning
  • A threat database without a monitoring runbook is a graveyard. Alerting is what makes the data load-bearing.
  • Hardening checklists need versioning the same way code does — each revision should explicitly subsume the last, with a diff.
  • Brute-force at the perimeter is still the floor of what you'll see, and rate-limit tooling is still the cheapest mitigation.
  • Snapshot completion does not equal recovery. Restore drills uncover gaps the snapshot job hides.
  • IDS signal-to-noise ratio drives whether you'll act on it — tune for false-positive cost, not detection rate.
  • Patch cadence is a posture choice, not a calendar item — pick monthly, weekly, or "as critical drops" and commit.
  • The audit log is only useful if someone reads it — schedule the review, or the log isn't security, it's storage.
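As an illustration of the rate-limit point above, here is a small sliding-window limiter for failed logins. The class name, thresholds, and the idea of keying on a source address are assumptions for the sketch, not a description of the tooling actually deployed.

```python
from collections import defaultdict, deque


class FailureLimiter:
    """Block a source after too many failures inside a sliding time window."""

    def __init__(self, max_failures: int = 5, window_s: float = 60.0):
        self.max_failures = max_failures
        self.window_s = window_s
        self._failures = defaultdict(deque)  # source -> failure timestamps

    def record_failure(self, source: str, now: float) -> None:
        self._failures[source].append(now)

    def is_blocked(self, source: str, now: float) -> bool:
        q = self._failures[source]
        # Drop failures that have aged out of the window.
        while q and now - q[0] >= self.window_s:
            q.popleft()
        return len(q) >= self.max_failures
```

Timestamps are passed in explicitly so the behavior can be checked without waiting on a clock; in use, `now` would come from `time.monotonic()`.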
Approach
Hardening · IDS · Rate Limits · Runbook
Stack
Cloudflare · Mail · Certs
Cadence
Daily review
Weekly audit
Monthly retro
9
Active Projects
3
Stacks Implemented
Private Compute
2
Idols Shipping
10
Process

How a project actually moves.

There's no proprietary methodology here — just the order of operations that has stopped me losing weekends to preventable rework.

01
Discover

Understand what the system actually does today. Talk to the operators, read the logs, run a baseline. The job before this is usually wrong about what's broken.

02
Architect

Design around the binding constraint, not the most exciting one. Most often that's memory, retry rate, blast radius, or audit. Pick before implementing.

03
Implement

Ship in dry-run first. Wire monitoring before features. Make rollback a button, not a story you tell at the post-mortem.

04
Operate

Cron, runbook, on-call. The most expensive part of any system is what happens after launch — design for the second year, not the first week.

11
Cross-project insights
Nine projects in flight teach you the same six lessons over and over — until you finally write them down.
i / 01

Route, don't proliferate

Centralizing access through one gateway makes cost, audit, and rate limits enforceable. Handing each tool its own direct key is the slow drift into chaos, and it costs you the audit trail you'll wish you had.

From: AI Stack
i / 02

One-shot rate is the cost line

Token spend follows retry rate. Lowering retries — through tighter prompts, smaller diffs, and read-before-edit — saves more than picking a cheaper model ever will.

From: AI Stack · Generation Stack
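The arithmetic behind the retry claim can be sketched under a simplifying assumption: each attempt costs the same and succeeds independently with the same probability (the one-shot rate). Under that assumption the expected number of attempts is 1 divided by the rate. The function name and the numbers in the usage note are hypothetical.

```python
def expected_cost(cost_per_attempt: float, one_shot_rate: float) -> float:
    """Expected spend per finished task when failed attempts are retried.

    Assumes independent attempts with a constant success probability, so the
    expected number of attempts is 1 / one_shot_rate.
    """
    if not 0 < one_shot_rate <= 1:
        raise ValueError("one_shot_rate must be in (0, 1]")
    return cost_per_attempt / one_shot_rate
```

With made-up numbers: a model at 1.00 per attempt and a 50% one-shot rate costs 2.00 per finished task. A model that is 20% cheaper at the same rate costs 1.60. Keeping the original model and raising the one-shot rate to 80% costs 1.25.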
i / 03

Dry-run is the spec

Whether it's edge config, a fine-tune, or a workflow graph, anything you can't reproduce in dry-run won't behave in production. Treat dry-run output as the deliverable, not a sanity check.

From: Hosted Stack · Generation Stack
i / 04

Persona = structured DNA

A consistent character across a hundred frames, a year of releases, or a five-minute call isn't about a better prompt. It's about persisting the persona as data and enriching every call from it.

From: Generation Stack · AI Music Idols · AI Persona/Models/UGC
i / 05

Memory is the binding constraint

On a multi-model compute environment, utilization without memory tracking is a misleading green dashboard. Schedule on memory headroom; the throughput follows.

From: AI Image/Video Farm
i / 06

The runbook is the alert

A threat database without a monitoring runbook is a graveyard. Alerting + a written response procedure is what makes any operational data load-bearing — security, ops, or otherwise.

From: Security, IDS, Hardening · Hosted Stack
Let's build quietly.
Based

New City, New York

Working Remote

New Brief?

20-min intro call.

No decks required.