I build production AI systems for companies moving past demos.
LLM agents, RAG, MCP, AdTech automation, private-data workflows, and eval gates. The operating layer between frontier models and real companies, built on 28 years of systems craft.
Show me the receipts. That is the design brief.
Ask for proof.
Ask the question a hiring partner would actually ask. The answer retrieves evidence first, then asks Gemini to respond from that bounded context. No generic pitch without a citation trail.
The claims stay attached to sources.
A tighter inventory of what this page is proving: current platform ownership, governed agent tooling, and eval-backed delivery.
AI strategy with operating scope.
Current work spans MCP server architecture, LLM agent systems, Snowflake governance, vendor procurement, local inference strategy, and desktop-agent policy.
Signal: this is live platform ownership, not a generic AI interest.
Tool count becomes a contract problem.
Hundreds of callable capabilities are framed around scopes, naming, approvals, logging, reusable skills, and private-data boundaries.
Signal: the proof is governance and repeatability, not a big number by itself.
Answers ship through gates.
Retrieval quality, refusals, latency, hallucination behavior, regressions, and spend are treated as deploy gates, not post-demo cleanup.
Signal: the proof surface shows how AI behavior is allowed to ship.
AI is the current chapter. The advantage is the whole arc.
I started in 1998 as a Fordham University web developer while still a student. I lived the degree before it was official, then kept learning after the degree was already out of date. The career since then is not a random stack list; it is the craft record behind the AI work.
Models are the flour. Experience is the bake.
Everyone has access to AI now. Everyone also has access to flour. The difference is knowing how to combine ingredients, control heat, recover when something goes wrong, and make something worth serving. My edge is the 28-year operating record behind the prompts: self-taught, still learning, and always turning the work into a path other people can climb.
- 28y · software craft
- 40+ · systems shipped
- 458 · governed AI tools
- 1,104 · eval tests
- 01 · 1998
Web foundations
Fordham web development while still a CS student; the degree became formal later, but the work was already real.
- 02 · Data systems
Operational software
Nextel commissions data, Intrepid web and intranet modernization, and early Basilecom client systems where software had to serve real staff.
- 03 · Creative velocity
Audience and taste
360i real-time campaigns, Oreo Daily Twist, Cannes Grand Prix work, and BaubleBar launches taught timing, polish, and pressure.
- 04 · Regulated scale
Systems with consequences
Teladoc through NYSE-debut scale and IntegraMed clinical ML under HIPAA, SOC 2, PCI, patient-data, and uptime constraints.
- 05 · Enterprise + defense
Trust boundaries
IBM search, Atlas Air logistics, Dragos OT security, and Air Force mission planning gave the AI work its governance instincts.
- 06 · Now
AI operating layer
MCP, RAG, eval gates, local inference, prompt-injection defense, tenant isolation, and human approval paths for real organizations.
Self-taught never meant isolated. The pattern is learn the thing, ship the thing, document the thing, and raise the people around it. I have coached engineers into senior roles, compressed onboarding paths, built field guides, led interviews for clients, and usually tried to lift the team higher than myself. That is why the AI work is not just prompt skill; it is judgment turned into systems other people can trust and use.
Where the record is strongest right now.
A buyer or hiring team should not have to read the whole site to know whether the fit is real. These are the lanes where the proof, case studies, and public artifacts line up cleanly.
Enterprise tools without uncontrolled autonomy.
MCP servers, OAuth boundaries, tool contracts, approvals, logging, desktop-agent policy, and recovery when integrations behave differently than the demo.
Best evidence: MCP platform + systems atlas
Retrieval systems with release gates.
Private knowledge workflows, citation behavior, regression suites, refusal checks, cost budgets, latency targets, and evals that block bad releases.
Best evidence: RAG eval harness
AI workflows over data that cannot leak.
Tenant isolation, Snowflake roles, local inference, approval queues, NL-to-SQL guardrails, and useful analyst interfaces over sensitive operational data.
Best evidence: marketing intelligence case
Ambiguous AI work turned into operating systems.
Vendor diligence, platform architecture, adoption surfaces, field guides, internal standards, and the translation between executives, operators, and engineers.
Best evidence: current platform record
I am not optimizing this site for frontier model research roles, academic publication paths, brand-only AI strategy, or prompt coaching without implementation authority. The strongest fit is senior applied AI, production systems, platform ownership, and fractional technical leadership.
Three production systems show the pattern.
Private data, governed tools, measurable quality, and business workflows that survive outside the demo. The 458-tool number is the total agent-callable surface across MCP tools, custom tools, Claude skills, and workflow actions — the architecture includes 12+ production MCP server integrations, not 458 separate MCP servers.
28 years of production systems underneath the AI work.
Started as a Fordham web developer while still a student, then kept self-teaching as each official credential aged. Classified defense, HIPAA-regulated clinical data, NYSE-scale telehealth, global logistics, commerce, enterprise search, and Cannes-winning advertising platforms became the operating base the current AI platform sits on.
- 01 · U.S. Air Force AMC
Classified mission planning at Scott AFB
- 02 · IBM
Global search, 3× query speed, 80% cost reduction, CTO commendation
- 03 · Teladoc Health
NYSE-debut scale, 12.2M → 15.1M members, sub-2-second WebRTC connect time
- 04 · IntegraMed
50+ clinics, 40K+ IVF cycles, ~30% prediction improvement, $5M+ ML impact
- 05 · BaubleBar
$10M+ platform revenue, 30% conversion lift, 100K+ concurrent launch users
- 06 · 360i / Dentsu
Oreo Daily Twist, Super Bowl blackout, sub-5-min content decisions, millions concurrent
Where the work actually had to survive.
- Defense
- U.S. Air Force AMC (Scott AFB), classified mission planning
- Healthcare & regulated
- Teladoc Health, IntegraMed Fertility, Bayer HealthCare — HIPAA, SOC 2 Type II, PCI DSS, FedRAMP-aligned controls, WCAG 2.2 AA
- Enterprise & logistics
- IBM, Atlas Air, Dragos, ADP, FIS, Phoenix Contact, Shubert Ticketing, Bremer Bank, IFF
- Creative & AdTech
- 360i / Dentsu, Publicis Groupe, Fox Sports, Cannes Grand Prix & SABRE Gold campaigns
Real-time, simulation, creative — design lineage, not the brand.
Before the current agent platform work, I spent years on real-time systems, game AI, interactive campaigns, creative tools, and simulation workflows. Unreal Engine 5 (Epic Games-recognized developer since 2014, Nintendo and Sony licensed), Unity 6, behavior trees, AI perception, navigation, digital twins, ONNX integration. Plus the creative-AI stack — Stable Diffusion, Adobe Firefly, Midjourney, Sora, LoRA, ControlNet, ComfyUI, Creative Cloud integration — wired into multimodal pipelines that cut on-brand asset production time ~40%, with LoRA / PEFT fine-tuning delivering ~25% performance improvement on domain-specific generation tasks. That background shapes how I design agents today: state, memory, perception, pathing, tool choice, fallback behavior, and decisions under frame-time constraints. The site's main lane is still production AI for enterprise — this is the design lineage underneath it.
Every major claim has a trail.
A case study, public artifact, resume entry, repo, field note, or operating plan behind every line.
Enterprise operating layer, not model research theater.
Sole technical decision-maker for AI platform strategy across UK, US, and APJ operations: MCP architecture, LLM agent systems, Snowflake governance, vendor procurement, local inference, and desktop-agent policy.
Open proof
458 governed tools with contracts, scopes, and approval paths.
The claim is not tool count. The evidence is the governed surface: OAuth, naming, scope control, reusable Claude skills, failure recovery, and human approval boundaries.
Case study
1,104 eval and regression tests before RAG deployment.
Retrieval quality, hallucination behavior, refusal posture, latency, cost, and regressions are treated as release gates instead of demo polish.
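As a hedged sketch of what "evals as release gates" means in practice: the suite names, counts, and pass-rate floor below are illustrative, not the production harness.

```python
# Minimal sketch of an eval suite acting as a release gate.
# Suite names and the 98% floor are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class SuiteResult:
    name: str
    passed: int
    total: int


def release_decision(results: list[SuiteResult], min_pass_rate: float = 0.98) -> str:
    """Block the release if any suite falls below its pass-rate floor."""
    failing = [r for r in results if r.total and r.passed / r.total < min_pass_rate]
    if failing:
        return "block: " + ", ".join(r.name for r in failing)
    return "ship with monitor"


results = [
    SuiteResult("retrieval_quality", 312, 315),
    SuiteResult("refusal_correctness", 88, 88),
    SuiteResult("latency_budget", 40, 40),
]
print(release_decision(results))  # → ship with monitor
```

The useful property is that the gate is boring and binary: a regression in refusals or latency blocks the ship decision the same way a failed unit test would.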
Case study
11 tenants, 2.3M rows, audited isolation controls.
Prompt constraints, SQL validation, server-side tenant injection, local inference paths, and approval queues that reduce cross-tenant leakage risk over private client data.
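A minimal sketch of the server-side tenant injection idea: the model may draft a query, but the server validates it and applies the tenant filter itself. Table names, the statement block-list, and the wrapping strategy are illustrative; the production controls are broader than this.

```python
# Illustrative server-side tenant scoping for NL-to-SQL output.
# The tenant predicate is injected by the server, never by the model.
import re

ALLOWED_TABLES = {"campaign_metrics", "audience_segments"}  # assumed names


def scope_query(model_sql: str, tenant_id: str) -> tuple[str, dict]:
    """Reject unsafe statements, enforce a table allow-list, then wrap
    the query so every row is filtered by the server-supplied tenant."""
    if re.search(r"\b(insert|update|delete|drop|grant|merge)\b", model_sql, re.I):
        raise ValueError("read-only surface: statement rejected")
    table = re.search(r"\bfrom\s+(\w+)", model_sql, re.I)
    if not table or table.group(1).lower() not in ALLOWED_TABLES:
        raise ValueError("table not in allow-list")
    wrapped = (
        f"SELECT * FROM ({model_sql.rstrip(';')}) q "
        f"WHERE q.tenant_id = %(tenant)s"
    )
    return wrapped, {"tenant": tenant_id}


sql, params = scope_query("SELECT spend FROM campaign_metrics", "tenant_a")
```

The tenant value travels as a bound parameter from server-side session state, so a prompt-injected "ignore the tenant filter" has nothing to attack.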
Case study
Twenty-eight years shipping systems where failure is visible.
NYSE-scale telehealth, classified mission planning, global flight scheduling, clinical ML, commerce launch scale, and Cannes Grand Prix interactive work before the AI platform layer.
Record
A free field guide built from production patterns.
Eight text-only missions, local artifacts, knowledge checks, persisted progress, and a self-issued certificate. The course supports the production thesis instead of replacing the portfolio.
Course
Where this sits in the AI engineer market.
The competitor set is not one market. It splits into research authority, education authority, agent tooling companies, and enterprise AI operators. This site should win only one of those lanes: production AI systems inside real organizations.
Frontier labs and university faculty.
Their edge: papers, citations, labs, students, awards, foundation-model history, and institutional reputation.
Our answer: do not pretend to be that. Translate frontier capability into controlled production systems: identity, tools, retrieval, evals, observability, and rollout policy.
Massive courses and public teachers.
Their edge: learner scale, lectures, books, certificates, testimonials, and established teaching brands.
Our answer: a narrower field guide from production work. The course exists to expose judgment, not to compete with university-scale AI education.
Platforms, frameworks, docs, and adoption metrics.
Their edge: product gravity, downloads, customer logos, SDKs, docs, changelogs, and ecosystem ownership.
Our answer: be the operator who chooses, wires, governs, evaluates, and recovers those tools inside a business with private data and real risk.
The quiet lane with the most buying intent.
Their edge: many have strong private experience but weak public surfaces, because the work sits behind client systems and internal documents.
Our answer: make the private work legible without leaking it: redacted case studies, artifact ledgers, system maps, public course material, and a proof engine.
The positioning is simple: not the best researcher, not the biggest teacher, not a tooling vendor. A production AI systems engineer with proof that the operating layer has already been built.
Trace the proof →
I am not positioning as a frontier model researcher. I build the enterprise operating layer that lets teams use frontier models safely: tools, identity, retrieval, evals, governance, rollout, and recovery when the system fails in public.
Connected Claude to enterprise systems without turning auth into folklore.
Architected 12+ production MCP server integrations across NetSuite, Monday.com, HubSpot, Wrike, Google Ads, Adverity, Apify, Snowflake, LinkedIn, and Jasper. Root-caused a critical LinkedIn MCP OAuth failure — missing audience parameter producing opaque tokens instead of JWTs — that unblocked 8 downstream tools and cut resolution from 4 days to 2 hours. Patched a Google Ads list_accessible_customers parsing bug affecting multi-account access. The identity boundary runs on Auth0 with OAuth 2.1 / OIDC, with Passkeys / FIDO2 supported for end-user surfaces.
Made the data plane usable by agents and analysts.
Architected the Snowflake role hierarchy, including an AUDIENCEINTELLIGENCE role and a tiered analyst account model serving 25 users across 3 regions. Consolidated 40 legacy roles in a cross-regional governance cleanup, dropping credit consumption 35% (~£2,800/month). Integrated Bombora intent + Leadspace firmographics + GWI audience data into a Snowflake + Claude + Jasper pipeline (12 analysts on previously vendor-portal-locked queries), then built the custom content-pack-pipeline Jasper MCP server for four enterprise B2B clients.
Built the operating rules around the tools.
Defined the tooling architecture across Claude API, Claude Desktop, Claude Code, Cowork, OpenAI Codex, Cursor, Ollama, and LM Studio. Conducted a security review of autonomous agent options (OpenClaw, NanoClaw) — recommended sandboxed NanoClaw deployment after weighing exfiltration risk, prompt-injection surface, and audit-logging maturity. Identified governance gaps in Cowork (Claude Desktop automation) that drove revised internal policy on autonomous desktop agents. Recommended Datadog for Claude activity observability and governed the 45+ skill library (30 practitioners) including version control, review process, and backup strategy.
Evaluated vendors like a builder and trained the organization to use the result.
Delivered integration assessments for Google Ads, Reddit Ads, The Trade Desk, LinkedIn Ads, Meta Ads, DV360, StackAdapt, Clay.com, Windsor.ai, and Adverity. Owned the Claude + Firecrawl content audit pipeline with Apify failover. Built the 52-level Claude Field Guide adopted by 65+ practitioners (3 days → 2 hours time-to-first-useful-prompt). Shipped an AI Brief Builder standardizing input quality across 9 account teams. Codified a reproducible macOS Node.js engineering environment (fnm, pnpm, Starship, Codex) used by 8 internal engineers.
Operating outcomes from the current platform.
- 40% reduction in NetSuite manual reconciliation, ~15 hours / week saved
- 35% reduction in Snowflake credit consumption, ~£2,800 / month saved
- £24,000 + 4 months of rebuild avoided by stabilizing the crawl pipeline instead of replacing it
- £450,000 in media trading decisions informed by structured AdTech API assessments
- £18,000 + 6 months of self-hosted observability avoided by standardizing on Datadog
- 3 days → 2 hours time-to-first-useful-prompt across 65+ practitioners trained through the Claude Field Guide
- 60% reduction in data-to-insight latency from the Windsor.ai vendor selection
Self-healing agent architecture, not happy-path demos.
Circuit breakers around external integrations, token-bucket rate limiting, LRU caching with size-and-age eviction, priority queues for approval-gated actions, retry policies with jitter, and per-tool timeouts. Multi-tenant API gateways in Go (Fiber) sustaining 2M+ daily requests at p95 < 100ms when the workload is high-throughput rather than agent-paced. 100% strict TypeScript with Zod validation across 77+ typed API interfaces — the entire agent surface area is auditable in CI before deployment. The point is surviving the second-day failure modes that turn a working demo into a midnight pager.
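A minimal sketch of the circuit-breaker pattern named above, with illustrative failure thresholds and reset timing; the shipped versions are strict-TypeScript, this is just the shape of the behavior.

```python
# Sketch of a circuit breaker: fail fast while an upstream is down,
# then allow a probe call after a cool-down. Thresholds are illustrative.
import time


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # monotonic timestamp when the circuit opened

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: let one probe call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip (or re-trip) open
            raise
        self.failures = 0  # success closes the circuit again
        return result
```

The agent-facing payoff is the fail-fast branch: a broken integration surfaces as an immediate, loggable refusal instead of a hung tool call halfway through a business process.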
Two-tier local stack so 75% of routine work never hits an external API.
Designed a two-tier local inference strategy: Qwen 3.5 27B dense for planning and reasoning, Qwen3-Coder-Next for agentic code execution, with the Claude API reserved for high-stakes decisions. Eliminates external API exposure for 75% of routine internal tasks — relevant when the work is over private operational data and tenant-isolated pipelines. Edge inference (Cloudflare Workers AI, Vercel Edge / Fluid Compute) reserved for latency-bound public surfaces where TLS termination, rate limiting, and AI personalization need to happen at the POP.
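The routing policy reduces to a small decision function. As a hedged sketch: the tier labels and the stakes heuristic below are illustrative stand-ins, not the production router.

```python
# Illustrative two-tier routing: local models for routine internal work,
# the frontier API only for high-stakes calls. Labels are assumptions.
def route(task_kind: str, high_stakes: bool) -> str:
    if high_stakes:
        return "claude-api"         # reserved for consequential decisions
    if task_kind == "code":
        return "local/qwen3-coder"  # agentic code execution stays local
    return "local/qwen-27b"         # planning/reasoning over private data


print(route("plan", high_stakes=False))  # → local/qwen-27b
```

The point of making the policy a function rather than a habit is that it can be logged, tested, and audited like any other release-gated behavior.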
Vision-driven agents for systems that don't have APIs.
Shipped Computer Use agent workflows on the Anthropic Agent SDK + Computer Use API — letting agents navigate browser interfaces, interact with legacy web apps that lack APIs, and automate multi-step UI tasks via vision-based screen understanding. Closes the gap between "everything must be an MCP server" and the long tail of business-critical tools that aren't.
FY26 planning baseline owned end-to-end.
Authored the Media Planning & Finance AI Augmentation Roadmap (synced HTML + Word source), presented to leadership as the FY26 planning baseline. AI roadmaps owned end-to-end and held to outcomes, not demos.
The engineering layer the AI work runs on.
AWS Lambda, Bedrock, S3, IAM, CloudWatch · Snowflake · Docker · Terraform · GitHub Actions · OpenTelemetry, Prometheus, Datadog · Ollama and local Qwen inference · Python, TypeScript, Rust · React, FastAPI, Node.js · vLLM, Triton, TensorRT, MLflow, W&B, PyTorch, DeepSpeed, LoRA, PEFT.
The point is not the tool list. The point is choosing the cheapest controlled path that satisfies latency, privacy, eval, and rollout requirements.
The operating map behind the work.
One graph for the production surface: models, MCP servers, OAuth, Snowflake, Jasper, crawlers, eval gates, observability, and rollout governance.
MCP connector registry
NetSuite, Monday.com, HubSpot, Wrike, Google Ads, Snowflake, LinkedIn, Jasper. The proof is the governed surface: OAuth, scopes, naming, approvals, and failure recovery.
tool: finance.invoice.reconcile
auth: oauth-delegated
scope: read-only + approval-for-write
audit: request, owner, fallback
open registry artifact →
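A registry entry like the one above can be modeled as a typed contract. This Python sketch mirrors the Zod-validated contract discipline described elsewhere on the page; the field names and approval rule are illustrative assumptions.

```python
# Illustrative tool-contract record: scope checks plus a human approval
# gate for writes. Field names are assumptions, not the real registry.
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolContract:
    name: str                    # e.g. "finance.invoice.reconcile"
    auth: str                    # e.g. "oauth-delegated"
    scopes: tuple[str, ...]      # declared capabilities
    write_requires_approval: bool
    audit_fields: tuple[str, ...] = ("request", "owner", "fallback")

    def allows(self, action: str, approved: bool = False) -> bool:
        """Reads pass on scope alone; writes also need a human approval."""
        if action not in self.scopes:
            return False
        return action == "read" or approved or not self.write_requires_approval


contract = ToolContract(
    name="finance.invoice.reconcile",
    auth="oauth-delegated",
    scopes=("read", "write"),
    write_requires_approval=True,
)
```

Because the record is frozen and checked before dispatch, "458 tools" stays a registry problem rather than 458 separate trust decisions.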
Eval release harness
1,104 tests across 29 suites covering retrieval quality, refusals, latency, regressions, hallucination behavior, and spend before deployment.
retrieval_quality: pass
citation_integrity: pass
refusal_correctness: pass
release_decision: ship with monitor
open eval artifact →
Private intelligence stack
Snowflake, Bombora, Leadspace, GWI, local inference. Analyst and agent workflows over private data without leaking sensitive context into every model call.
attempt: tenant A asks tenant B
expected: refusal + no SQL
required: server tenant predicate
blocked: raw rows, private identifiers
open isolation artifact →
Auth failure recovery
Opaque tokens instead of JWTs. A missing OAuth audience parameter on the LinkedIn MCP integration turned into the kind of low-level bug that blocks every downstream agent. Root-caused, fixed, 8 downstream tools unblocked, resolution time cut from 4 days to 2 hours.
symptom: opaque access token
root_cause: missing audience
recovery: JWT claims restored
recheck: downstream account list
open recovery artifact →
52-level Claude Field Guide
Training as product, not documentation. Internal adoption moved from scattered prompt lore into an interactive React learning surface.
path: prompt lore -> guided missions
surface: React field guide
audience: 65+ practitioners
result: 3 days -> 2 hours
read writing →
Vendor decisions with teeth
34-page Monday.com MCP assessment plus AdTech API reviews. Feasibility, security, licensing, rollout cost, and operating risk in one decision path.
review: capability, auth, cost
risk: rollout + data exposure
decision: pilot / defer / reject
handoff: owner + next check
ask the proof engine →
Endorsed by peers across seven domains.
AI implementation, technical leadership, quality gates, delivery, and client engagement.
“ Phil and I talk about AI regularly, especially MLOps. He has solid expertise with LLMs and RAG implementations in particular, plus knows how to put Python to work effectively in AI projects. He'd be a real asset to any team working in this space. ”
“ I just recently had the pleasure of working with Philip Basile on a team for an extended period. He was a committed, strong, and dedicated team member. He provided guidance and knowledge to the entire team, from assistance with onboarding and IDE configuration and integration with source control and CI systems to learning the newest offerings in our team's technology stack, followed by documenting and sharing his experience. He immediately became a mentor. Philip brought with him, and shared, an impressive depth of understanding of front-end systems, enterprise architecture, and the intricate interdependence of design, functionality and user experience. With all of this, he consistently produced elegant code, markup, and CSS that provided a comprehensive, engaging, and seamless user experience, catching and handling edge and corner cases gracefully. Philip was easy to work with, cooperative, and delivered constructive feedback in a manner that encouraged others to participate in a healthy and productive peer review process. He made the team stronger and greater than the sum of its parts. ”
“ Phil contributed front-end development to our team as a contractor. He developed the web interface for a number of applications and ensured the user experience requirements were met in both desktop and mobile rendering. As a front-end engineer, the applications required TypeScript/JavaScript using the VueJS framework and Vuex for state management, with back-end data retrieval using REST. Phil ensured very high code coverage and code quality standards were met through unit testing with Jest and Vue Development Utils, end-to-end testing using WebdriverIO, and SonarQube quality scans. Docker environments were also part of the daily development lifecycle. Phil worked well with other team members. ”
“ It was a pleasure working with Philip. He excelled at developing multiple UI components simultaneously while integrating with various microservices that utilized RESTful APIs. Philip displayed extraordinary communication skills while implementing UX designs by effectively detailing any blockers, inconsistent documentation, or missing requirements. He has also been able to confidently demonstrate and document his completed work. Philip always brings positive energy to meetings and discussions. He works well in teams and can quickly adapt to changes in organizational structure. ”
“ Philip is an exceptional front-end developer and IT professional. I had the pleasure of working with him on a few challenging client engagements, and his ability to quickly step up in a lead capacity, drive work, and produce results was greatly appreciated. Combined with his technical skill, his professionalism and personality makes him a key asset to any team. ”
“ Phil consistently produces new ideas and approaches to improve code or streamline development processes. He has a knack for identifying potential issues early and developing creative solutions to complex challenges. His passion for innovation makes him an asset in providing fresh perspectives during technical discussions. Phil is an asset to any team seeking an influential contributor with an innovative mindset. ”
“ Philip is a UX Master, and a front-end usability champion. He has deep knowledge of CSS, JavaScript and User Experience Design. He improved the overall quality of Teladoc's website experience, and made the product better. Besides front-end, Philip is a self-starter who always keeps improving himself, and I know he has become quite good at backend development too. Finally, Philip is a great co-worker, reliable and with a fantastic sense of humor. He would be a valuable and important member of any team he joins or leads. ”
A public body of work, not a feed.
Native essays, older posts, and source-labeled archives brought back onto the site so the thinking is easy to inspect by date, platform, and topic.
MCP is not a tooling problem. It is a governance problem.
Anyone can expose an API to a model. The hard part is deciding what the model is allowed to touch, what gets logged, what requires approval, and what happens when an integration fails halfway through a business process.
The impressive number is not 458 tools. It is the contract discipline that keeps hundreds of tools from becoming hundreds of new ways to lose control.
Ask the portfolio about MCP →
Most eval harnesses are too impressed with answers.
Answer quality matters, but production systems fail in less flattering ways. Retrieval gets worse after a content migration. Refusals regress. Latency crosses the budget. Spend creeps. A prompt change helps one client and quietly hurts another.
A real eval harness is a release gate for behavior, cost, latency, refusal posture, and retrieval drift, not a scoreboard for pretty generations.
RAG & evals case study →
The best agent systems are deliberately unromantic.
The pitch says autonomy. The shipped product needs boring rails: scoped tools, observable plans, deterministic fallbacks, permission checks, and a clear place where a human can say no.
The job is not to make the agent seem alive. The job is to make it safe enough that the business can let it act.
Agent platform case study →
Public writing should compound, not disappear.
Competitor sites get authority from a dated body of public work: papers, posts, essays, talks, and notes that prove the thinking existed before the current page. Older writing now lives here as an inspectable archive instead of being stranded on platform profiles.
The archive is not here to make every old post equally important. It is here to show a continuous public trail across AI, engineering, career, accessibility, frameworks, and developer education.
Open writing archive →
A field guide that exposes the production judgment behind the work.
Not a course business. A public artifact that exposes the judgment behind the case studies: agents, agentic coding, RAG, vector databases, evals, tool use, and rollout discipline — translated into eight text-only missions with hands-on builds.
What an agent actually is.
Learn the loop: context, plan, action, observation, state, stop.
Prompting as interface design.
Turn prompt writing into contracts, schemas, refusals, and handoffs.
Agentic coding without chaos.
Use plans, file boundaries, diffs, tests, and rollback discipline.
RAG from first principles.
Build retrieval around chunks, metadata, citations, and answer grounding.
Vector databases without mystery.
Compare keyword search, embeddings, filters, reranking, and top-k failure.
Evals before belief.
Make golden questions, regression checks, refusal checks, and cost limits.
Tool use, MCP, and boundaries.
Define tool contracts, scopes, approval steps, dry runs, and trace logs.
Ship the loop.
Connect prompts, retrieval, tools, evals, traces, cost guards, and fallback paths.
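The loop taught in the first mission — context, plan, action, observation, state, stop — fits in a few lines. As a hedged sketch: the planner and tool below are stand-ins, not a model, and the step cap is an illustrative guard.

```python
# Bare-bones agent loop: plan from state, act, observe, update, stop.
# The planner and tools here are toy stand-ins for a real model.
def run_agent(goal: str, tools: dict, planner, max_steps: int = 5) -> dict:
    state = {"goal": goal, "observations": [], "done": False}
    for _ in range(max_steps):            # hard stop: the loop is bounded
        action, arg = planner(state)      # plan from the current state
        if action == "stop":
            state["done"] = True
            break
        observation = tools[action](arg)  # act, then observe the result
        state["observations"].append(observation)
    return state


def planner(state):
    """Toy policy: look the goal up once, then stop."""
    if not state["observations"]:
        return "lookup", state["goal"]
    return "stop", None


result = run_agent("q3 spend", {"lookup": lambda q: f"found:{q}"}, planner)
```

Everything else on this page — contracts, approvals, evals, fallbacks — is governance layered onto exactly this loop.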
The current competitive lane.
Sole technical decision-maker for AI platform strategy at a global B2B marketing agency, advising the SVP Global Operations across UK, US, and APJ. Owned MCP server architecture, LLM agent systems, Snowflake governance, vendor procurement, local inference strategy, and desktop-agent policy.
458 agent-callable tools governed across 12+ production MCP servers, 23 packaged Claude skills (from a 45+ skill internal library used by 30 practitioners), 1,104 eval tests across 29 suites, 34-page Monday.com MCP assessment, 52-level Claude Field Guide, and 4 enterprise B2B clients on Jasper workflow.
The operating base underneath the AI work.
Started in 1998 as a Fordham University web developer while still a CS student, then carried that self-taught habit through client systems, IBM global search modernization, Atlas Air "Hawk" global flight scheduling, Dragos ICS/OT cybersecurity, and classified Air Force AMC mission planning. Teams of 4–20 across 12 time zones, ~90% client retention, 75% repeat business, 5 engineers coached to senior roles.
Predictive ML on 200+ clinical features driving ~30% IVF success-prediction improvement. HIPAA / SOC 2 / PCI DSS platform across 50+ clinics, 40K+ IVF cycles, $100M+ in patient financing at 99.9% uptime. Zero-downtime integration of 9 acquired clinics and 50K+ patient records. $5M+ revenue impact through optimized treatment protocols.
Scaled platform through NYSE debut: 12.2M → 15.1M members and 240K+ quarterly visits. Migrated backend to Elixir/Phoenix for real-time telehealth with sub-2-second WebRTC connect times across ~50K concurrent sessions. CVS MinuteClinic API — first NCQA telehealth credentialing.
First technical hire — built platform driving $10M+ revenue with a 30% conversion-rate lift. Celebrity launch sites at 100K+ concurrent users. Served on executive search panel for hiring the CTO.
Engineering lead on Oreo Daily Twist and the Super Bowl blackout real-time response (sub-5-minute content decisions at millions of concurrent users), Oscar Mayer Bacon Barter, Coca-Cola Polar Bowl, and platforms for Marvel, NBC, and National Geographic. 3-year technical advisor to Polywork (Product Hunt Golden Kitty winner, 50K+ users in 48 hours).
Foundations underneath the work.
Formal CS foundation for the operating systems, data structures, compilers, and web work that started while still a student.
Classic supervised learning, model evaluation, bias/variance tradeoffs, and practical ML workflows refreshed against current AI systems.
Neural-network, sequence-model, and optimization foundations behind the production RAG and agent-eval work.
Long-running operational context for mission planning, chain-of-command communication, and work where process discipline matters.
Things a stranger can inspect.
The competitors with the strongest careers have public gravity: papers, talks, open source, courses, and durable writing. This is the current public surface for the production-AI lane.
Redacted proof packets
Inspectable reconstructions for MCP governance, RAG reliability, private-data AI, eval gates, approvals, and tenant boundaries.
public proof vault
field guide
Agent engineering course
Eight text-only missions with progress, artifact logs, field checks, capstone, and a self-issued completion certificate.
course player
case studies
Production proof
Private-data intelligence, MCP agent governance, and eval-gated RAG systems written as inspectable project narratives.
three case studies
interactive proof
Portfolio answer engine
A Gemini-backed RAG interface that answers hiring-style questions from retrieved career-brain evidence.
ask for proof
writing
Writing archive
Native production-AI essays plus a dated archive of older posts imported from DEV, Medium, and LinkedIn.
notes + archive
career document
Resume PDF
A conventional hiring artifact for the claims that should not depend on interactive site copy.
downloadable
peer proof
Recommendations
Endorsements from managers, peers, clients, and collaborators across the full systems career arc.
public endorsements
course artifact
Completion certificate
A local progress artifact for the Agent Engineering mission path, tied to completed field exercises.
printable certificate
source
GitHub profile
Public repositories across local retrieval, AI tooling, MLOps experiments, TypeScript systems, Rust, and applied ML labs.
github.com/philipjohnbasile
Research labs & public experiments
These are public applied-ML labs used to demonstrate local-first MLOps, model evaluation, cheminformatics, and reproducible pipelines. They support the case studies — they are not the production case studies themselves.
VecStore
Embeddable vector database for local-first retrieval. HNSW + BM25, Python bindings, browser-runnable WASM build.
github.com/philipjohnbasile/vecstore
TypeScript
PhilJS
A modern UI framework experiment with fine-grained reactivity, AI-native streaming SDK, WebAssembly support, and first-class Rust integration.
github.com/philipjohnbasile/philjs
Python · Pharmacovigilance
SignalScope
Research-only signal-detection lab over FAERS and PubMed: extraction, disproportionality scoring, embedding analysis, ablation, and writeup.
github.com/philipjohnbasile/signal-scope
Python · MLOps
CardValueML
Local-first MLOps pipeline for trading-card pricing — ingestion, feature engineering, model registry, drift, calibration, and CI monitoring.
github.com/philipjohnbasile/CardValueML
Python · Cheminformatics
Protein-Ligand Playground
An offline-friendly protein-ligand affinity workflow over ChEMBL sample data: random and time splits, RDKit baselines, ChemBERTa embeddings, GNNs, CLIs.
github.com/philipjohnbasile/protein-molecule-ai-playground
More
The rest of the GitHub
58 public repositories across infrastructure, AI tooling, frameworks, and experiments. Python, Rust, TypeScript, and practical build artifacts.
github.com/philipjohnbasile
What happens after the conversation.
The strongest competitor sites do not just list credentials; they make the next step feel concrete. This is the operating plan I would use to turn an AI mandate into controlled production movement.
AI system inventory, governed tool registry, eval suite, risk register, rollout policy, working pilot, training surface, and a credible roadmap for the next quarter.
Start the conversation
Twenty-eight years in. The unfashionable parts of the job — eval gates, rollback paths, latency budgets, the failure modes nobody writes blog posts about — are the parts I find most interesting.
Currently available for senior applied AI roles and fractional technical leadership.
If you have an AI system that needs to work in the real world.
For Principal AI Engineer, Staff Applied AI, AI Platform, Agent Systems, RAG, MCP, or fractional AI leadership work — start here.