// ~/services/ai-ml
custom AI & ML.
Fine-tuned LLMs, RAG systems, agents, vision models. Built end-to-end, deployed to your stack, and yours to own. Beyond off-the-shelf APIs.
| price | custom-quoted by scope (fixed, not hourly) |
|---|---|
| turnaround | typically 2-8 weeks |
| delivery | code + weights in your repo / cloud |
| nda | happy to sign — yours or mine |
| models | Llama, Mistral, Qwen (open) · Claude, GPT, Gemini (hosted) |
// capabilities
seven capabilities. most builds combine two or three.
- [01]llm fine-tuning// teach an open-weights base your domain, tone, and tasks
- [02]RAG & retrieval// vector search, hybrid, citations — grounded in your docs
- [03]agents & orchestration// tool use, planners, guardrails, multi-step workflows
- [04]computer vision// detect, segment, classify — photos, scans, live video
- [05]api & deployment// FastAPI / Modal / Vercel · auth, observability, rate limits
- [06]evals & quality// regression tracking, online + offline eval pipelines
- [07]cost & latency tuning// right-sized models, quantization, caching
// process
four phases · weekly demos · no mystery.
[01] discovery
map the problem, success criteria, data audit. one-page brief out at the end.
[02] data & design
collect, clean, label. pick architecture and the eval that says we're done.
[03] build & train
iterate on data/prompts, fine-tune, run evals weekly until quality crosses bar.
[04] deploy & monitor
ship to your stack with logging and eval-on-prod. clean handoff included.
// faq
things people ask before we start.
Q: bring my own data?
A: helpful, not required. if you have it, we'll use it. if not — open dataset, synthesis, or a small collection phase.
Q: timeline?
A: 2-8 weeks end-to-end. focused RAG ships in 2; fine-tune with custom evals is 4-6.
Q: hosting & ongoing costs?
A: I deploy to your cloud so you own runtime. inference cost depends on model size + traffic — right-sized before shipping.
Q: open vs hosted models?
A: open-weights (Llama, Mistral, Qwen) when you want to self-host. hosted (Claude, GPT, Gemini) when latency/quality demands. often both.
Q: do I own everything?
A: yes. code, weights, fine-tuned artifacts ship to a repo + storage bucket you control.
// start a project
have a concrete problem an AI could solve?
Send a paragraph. I'll respond within 24 hours.