LLM-based AI assistants are becoming increasingly capable, but they remain prone to hallucination, sycophancy, overconfidence, and laziness. How can these flawed and non-deterministic tools ever be useful for conducting rigorous data analysis?
I'm very glad you asked!
Enter DAAF: the Data Analyst Augmentation Framework. DAAF is a free and open-source instruction framework for Claude Code that helps skilled researchers rapidly scale their expertise and accelerate data analysis across any domain with AI assistance -- without sacrificing the transparency, rigor, or reproducibility that good science demands. DAAF sits between you and Claude Code, automatically and consistently nudging the standard Claude AI to think and work more like a responsible researcher.
Think of DAAF like your personal lab manager for an AI-powered research lab, informed and guided by a richly detailed library of bespoke reference material to ground everything it does in real scientific best practices.
Just like a highly skilled colleague, DAAF can help in many different ways depending on your own workflows and your comfort level with AI: everything from serving as a data documentation oracle for your hyper-specific data definition questions (e.g., "Which collection years do we have in common across these eight datasets, again?"), to handling one-off vibe coding requests (e.g., "Can you help me review this diff-in-diff regression specification I wrote?"), to producing entire data analytic pipelines and reports from a starting research question, to verifying the empirical reproducibility of entire past analyses, and much more.
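To make that second kind of request concrete, here is a minimal sketch of the sort of difference-in-differences specification a researcher might ask DAAF to review. Everything in it -- the simulated panel, column names, and the true effect size of 2.0 -- is hypothetical and purely illustrative:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical panel: units observed before/after a policy change,
# split into treated and control groups.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),  # 1 = treatment group
    "post": rng.integers(0, 2, n),     # 1 = observed after the policy change
})
# Simulate an outcome with a true treatment effect of 2.0 in the treated-post cell.
df["y"] = (
    1.0
    + 0.5 * df["treated"]
    + 0.3 * df["post"]
    + 2.0 * df["treated"] * df["post"]
    + rng.normal(0, 1, n)
)

# Canonical two-group, two-period diff-in-diff: the interaction coefficient
# estimates the treatment effect.
model = smf.ols("y ~ treated * post", data=df).fit()
print(model.params["treated:post"])  # estimate should land near the true effect of 2.0
```

A real review would go further -- parallel-trends checks, clustered standard errors, staggered adoption -- but this is exactly the kind of specification-level sanity check you could hand off in a single prompt.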
...Okay, that all sounds pretty useful to me. But what does that all mean in practice? What does that actually look like?
Great question! Let's take a deep dive together into a real project with DAAF to see how it all really works from start to finish.
Because DAAF logs and traces everything it does on your behalf, I've developed this page as a transparent walk-through of the key steps and processes involved in working through a full end-to-end analysis with DAAF: from a single natural-language prompt to a fully reproducible data analytic pipeline complete with a consolidated and cleaned analytic dataset, several thoughtfully-constructed data visualizations, supplementary regression analyses, and an in-depth data analysis report pulling it all together for your review and ready to extend in any direction you can imagine. Everything you'll see here is pulled from raw log files generated by an actual run with DAAF: no cherry-picking or hiding here.
To start, you can inspect the initial data analysis report (right-hand panel on desktop, or the "View Analytic Report" button at the bottom of your screen on mobile) that DAAF produces by default in its "Full Pipeline Mode": the full end-to-end analytic workflow. The goal of this document is to walk the human researcher through the key findings of DAAF's analysis, after which you can make revisions, pursue extensions, or translate it into publication-level products for venues like journals and policymaker briefs.
As you scroll down this page, you'll see exactly how DAAF takes that initial prompt and methodically steps through an extremely deliberate research and data analysis pipeline. For each step of that workflow, I explain what the step is for and show what it actually looks like in conversation with DAAF. If you want to dig deeper into any step, you can expand it to see (a) what each specialized assistant is doing at that point in the workflow, (b) which reference files each assistant reads to guide its work, and (c) what each assistant produces in terms of analytic code, data interpretations, or research artifacts for downstream use. Every single artifact can be viewed in the right-hand file viewer panel, as well as in the full GitHub sample project folder.
Altogether, DAAF allows researchers to massively kickstart an analytic project like this one -- bringing together eight datasets from two different data providers to answer a high-level research question with in-depth data visualizations, regression analyses, and interpretation of limitations -- in all of ~30 minutes of raw human time. And from there, the researcher can use DAAF to produce additional analyses, data visualizations, policymaker briefs, interactive dashboards, press releases, academic paper drafts, and more -- all just another prompt or two away. Nothing DAAF produces should be treated uncritically, and everything absolutely needs review by a human expert, but it nonetheless represents an enormous value-add for rapidly accelerating research in alignment with our core scientific principles.
The goal of DAAF is ultimately to be a force-multiplying exoskeleton for human researchers: a way to extend and expand their expertise to produce more rigorous and impactful research for the betterment of our society. Made by researchers, for researchers. And perhaps most importantly: DAAF and all accompanying educational materials are open-source and will forever be free to all.
Full Pipeline Mode is just one of the many ways researchers can use DAAF to extend, enhance, and support various research workflows and tasks. Learn more about DAAF at the GitHub repos and tutorial videos linked below, or begin the walkthrough to see how complex AI-empowered research workflows actually look in practice.