I Built a CRM Without a Developer. Here's What It Actually Cost.
You can build a CRM without a developer. The tools exist: Replit, Claude Code, Cursor. They work well enough to get a production system live. What they do not do is tell you upfront what it will cost in time, in trade-offs, and in technical debt you will not discover until the wrong moment. This post is that account, from someone who built one and runs it.
What it actually cost — the time
MentorOS is the operations software I built for Australian Mentoring Services, the NDIS direct support provider I co-founded in Blacktown, NSW. AMS has been running since 2020 and supports close to 200 participants. MentorOS handles participant records, support worker rostering, progress note workflows, and incident tracking. It is not a side project. It runs a real organisation.
I built it working from 5pm to 10pm on weeknights, early mornings before my daughter woke up, and across weekends when the work needed uninterrupted hours. That pattern ran for months. It still does.
Almost everything written about AI coding tools undersells the time commitment. The tools accelerate the build. They do not replace the thinking, the debugging, the decision-making, or the discipline required to keep going when a session produces nothing useful. A rough count of the hours it took to get MentorOS to a state where I trusted it with real participant data: somewhere between 300 and 400 hours across the first year.
That is not a weekend project. It is a sustained commitment, done in the margins of running two organisations.
I did it because buying off-the-shelf NDIS software meant buying systems built for organisations bigger than ours, with assumptions baked in that did not match our service model. Building meant we got exactly what AMS needed. It also meant I carried the risk of building something brittle. At points, I did.
What AI-assisted building is genuinely good for
AI coding tools are good at a specific set of things. Understanding that set is the difference between using them effectively and burning time on problems they cannot solve.
Describing problems clearly. The most useful thing I learned was to write precisely. When I described a problem in vague terms ("the rostering isn't working right"), the output was useless. When I described it specifically ("the rostering component is not filtering out workers whose active hours overlap with the proposed shift start and end time, even though the overlap check function exists in the codebase"), the output was immediately useful. AI coding tools reward precision. They punish vagueness at speed.
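To make the specific version of that problem concrete, here is a minimal sketch of the kind of overlap filter the precise description points at. It is not the MentorOS code: `Shift`, `overlaps`, and `available_workers` are hypothetical names, and the roster is assumed to be a simple mapping from worker to existing shifts.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Shift:
    # Hypothetical shape for a rostered shift; a real system would carry more fields.
    start: datetime
    end: datetime


def overlaps(a: Shift, b: Shift) -> bool:
    # Two time ranges overlap when each one starts before the other ends.
    # Back-to-back shifts (one ends exactly when the next starts) do not overlap.
    return a.start < b.end and b.start < a.end


def available_workers(roster: dict[str, list[Shift]], proposed: Shift) -> list[str]:
    # Keep only workers with no existing shift that overlaps the proposed one.
    return [
        worker
        for worker, shifts in roster.items()
        if not any(overlaps(existing, proposed) for existing in shifts)
    ]


roster = {
    "worker_a": [Shift(datetime(2025, 3, 3, 9), datetime(2025, 3, 3, 12))],
    "worker_b": [Shift(datetime(2025, 3, 3, 12), datetime(2025, 3, 3, 15))],
}
proposed = Shift(datetime(2025, 3, 3, 12), datetime(2025, 3, 3, 14))
free = available_workers(roster, proposed)
```

The bug described in the prompt is the case where a function like `overlaps` exists but the component that builds the list never calls it. Naming both the function and the missing call is what makes the prompt answerable.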
Iterating on logic. Testing a different approach to a workflow, like how a progress note gets flagged for review or how a participant's funding period is tracked against service bookings, used to take hours. With AI tools, I could generate a working alternative, test it against real data, and reject it inside a single session.
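As an illustration of the kind of logic that is cheap to iterate on, here is a rough sketch of tracking a funding period against service bookings. Everything in it is an assumption made for the example: the `Booking` shape, the `remaining_funding` function, and the rule that a booking counts if its service date falls inside the period. It is not how MentorOS does it.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Booking:
    # Hypothetical booking record: when the service happened and what it cost.
    service_date: date
    cost: Decimal


def remaining_funding(
    budget: Decimal,
    period_start: date,
    period_end: date,
    bookings: list[Booking],
) -> Decimal:
    # Sum only the bookings whose service date falls inside the funding period
    # (both ends inclusive), then subtract that from the budget.
    spent = sum(
        (b.cost for b in bookings if period_start <= b.service_date <= period_end),
        Decimal("0"),
    )
    return budget - spent


bookings = [
    Booking(date(2025, 7, 10), Decimal("200.00")),
    Booking(date(2025, 8, 2), Decimal("300.00")),
    Booking(date(2026, 7, 5), Decimal("150.00")),  # outside the period below
]
left = remaining_funding(
    Decimal("1000.00"), date(2025, 7, 1), date(2026, 6, 30), bookings
)
```

An alternative rule, such as counting bookings by invoice date instead of service date, is a one-line change to the filter. That is the sort of variation that can be generated, run against real data, and discarded in one sitting.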
Generating boilerplate. Authentication scaffolding, form components, API route structures, database schema drafts: the mechanical work that underpins every application. AI tools handle this competently and quickly. I did not spend meaningful time on structural plumbing. I spent time on the logic specific to NDIS service delivery, which is where it belonged.
Testing ideas before committing. Before building anything significant, I could describe a feature and generate a rough implementation to probe. Half the time, the act of probing revealed that the idea had a flaw I had not seen. That saved weeks of building things that would have needed pulling apart.
What AI-assisted building handles — and where it stops
Category | Handles well | Fails here | Bring a developer in when
Speed | Boilerplate, scaffolding, first drafts of logic | Complex architectural decisions spanning multiple systems | The system grows past what one person can hold in their head
Quality | Working code for isolated, well-described problems | Consistent style and coherent rationale across the codebase | The codebase needs handing to someone else
Security | Basic patterns when explicitly prompted | Authentication, authorisation, API security, exposed secrets | Before anything touches real user data
Integration | Simple API calls with clear documentation | External systems with complex auth flows | You are integrating with financial systems, health records, or government APIs
Accessibility | Basic semantic HTML when requested | WCAG compliance across dynamic interfaces | Before any public-facing product goes live
Scalability | Functional code for current load | Performance under 10x traffic or concurrent users | Before onboarding enterprise or government customers
Where AI-built code breaks
This is the part the AI coding tool vendors will not lead with.
Architecture. A codebase built across hundreds of AI-assisted sessions does not have a coherent design. Each session generates code that works. The sessions do not accumulate into a system with a consistent rationale. The problem compounds as the application grows. David Linthicum put it precisely in InfoWorld in March 2026:
"AI-generated systems create a different kind of debt: debt without authorship. There is no shared memory. There is no consistent style. There is no coherent rationale spanning the codebase. There is only an output that 'passed tests.'"
I felt this. At around twelve months, the MentorOS codebase had components written in four different ways, some conflicting with each other. Nothing was broken in a visible sense. But it was not a system. It was an accumulation of solutions to isolated problems. Cleaning that up took as long as some of the original build phases.
Security. Veracode's 2025 GenAI Code Security Report found 45 percent of AI-generated code contains security flaws. A 2026 Forbes analysis found AI-generated code carries 2.74 times more security vulnerabilities than human-written code. Checkmarx has identified six recurring risk categories: injection vulnerabilities and broken authentication, unverified dependencies, hard-coded secrets (API keys and credentials embedded directly in the code), over-trust in AI output, reduced auditability, and change volume that outpaces security review.
"AI can introduce insecure logic, unsafe dependencies, weak access controls, or exposed secrets at a pace that quickly outstrips manual review." — Checkmarx, March 2026
When MentorOS was handling data about real NDIS participants (names, contact details, support records), I needed to be certain the security was right. I was not certain. I brought a developer in specifically to audit the authentication layer and database access controls. That is not a failure of the AI tools. That is the correct use of them.
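Of the risk categories above, hard-coded secrets is the easiest to show in a few lines. The sketch below is one common way to keep credentials out of source code, by reading them from environment variables and failing loudly when they are missing. The `load_secret` helper and the variable name `EXAMPLE_API_KEY` are made up for illustration; this is not a description of what the audit found or changed in MentorOS.

```python
import os

# The pattern to avoid: a credential written directly into the code,
# where it ends up in version control and in every copy of the repository.
#   API_KEY = "sk-live-..."


def load_secret(name: str) -> str:
    # Read a secret from the environment instead of embedding it in source code.
    value = os.environ.get(name)
    if not value:
        # Fail at startup rather than running with a missing or empty credential.
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Usage (assumes EXAMPLE_API_KEY has been set by the hosting platform):
#   api_key = load_secret("EXAMPLE_API_KEY")
```

This covers only one of the six categories. It does nothing for authentication, access controls, or dependency checks, which is why the audit still needed a developer.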
The handoff problem. I have attempted to hand parts of the MentorOS codebase to another developer on two occasions. Both times, the first question was some version of "what is the rationale for this?" In an AI-built codebase, the answer is often: "there isn't one. It's the output that worked." That is not an answer a developer can build on. A full rewrite of a vibe-coded MVP takes three to four months, according to Hubql's 2026 analysis. A targeted audit and remediation takes four to six weeks. Both are real costs that do not appear on the original build timeline.
As Linthicum also wrote: "Software is not merely produced; it is stewarded." An AI-built codebase has no steward. That becomes a problem the moment anyone else needs to work on it.
When you actually need a developer
The question is not whether to use AI tools or hire a developer. Most real builds will involve both. The question is timing.
Hubql's 2026 analysis identified five signals that the handoff point has arrived:
1. You have stopped being able to explain what the application does under the hood.
2. You are handling sensitive data from real users: names, emails, payment information, health data.
3. An investor or enterprise customer has asked to review your infrastructure.
4. You have had your first unexplained outage or data incident.
5. You have tried to bring in a developer and they described the codebase as "a mess."
For organisations working with NDIS participants, signals two and four apply earlier than most founders expect. Participant data is sensitive health and disability information. An unexplained data incident involving a participant is not a startup war story. It is a potential NDIS Commission notification and a breach of the Privacy Act 1988. The threshold for when you need professional oversight of the security layer is lower in disability services than in almost any other context.
Research from Seedium (2025) offers some reassurance alongside the warning: 70 to 80 percent of AI-built applications are structurally sound. The 20 to 30 percent that needs work clusters around authentication and authorisation, database security, API design, and deployment infrastructure. Targeted rather than wholesale remediation is usually possible, but it requires a developer who can read what the AI generated and work with it.
What to look for in a developer. Technical prestige is the wrong filter. The question is not which tools they use or what their GitHub portfolio looks like. It is whether they understand the operating environment their software will run in. A developer who does not know what an NDIS service agreement is, or what incident reporting obligations a registered provider carries, will build the right technical solution to the wrong problem.
I would take a mid-level developer who has worked in community services over a senior developer who has not, every time.
Three questions to ask before hiring:
1. "What would break first under ten times our current traffic?"
2. "If a sophisticated attacker looked at this codebase, what would they target?"
3. "What would it cost to bring this to a standard that a mid-size enterprise would accept?"
The answers tell you whether the developer understands production systems, security thinking, and honest cost estimation. Any answer to question two that does not name a specific area is a red flag.
What building MentorOS taught us
What we got wrong. We built too fast in the early phases without documenting the decisions we were making. When a session produced working code, we moved on. We did not record why a particular approach was chosen, what alternatives were considered, or what the assumptions were. That documentation debt compounded. By the time the codebase was large enough to need a developer's involvement, the rationale for dozens of decisions was gone. We now document decisions as they are made. This should have been a practice from day one.
We also delayed the security audit longer than we should have. The argument I made to myself was that the system was small enough that the exposure was limited. That argument does not hold when the data belongs to people with disability who are already in a position of vulnerability relative to the organisations that hold their records.
What we got right. We built a system that matches how AMS actually operates, not how a generic NDIS software vendor imagines a direct support provider operates. The rostering logic reflects our service model. The progress note workflow reflects our quality review process. The incident tracking reflects our actual reporting obligations. No off-the-shelf system gets that right for a provider our size without expensive customisation that puts you at the vendor's mercy.
We also learned, faster than we would have through any other method, what our own operational processes actually were. Describing a workflow to an AI coding tool well enough for it to generate useful code forces clarity. Every ambiguity in the description produces ambiguity in the output. Building MentorOS surfaced process gaps in AMS that we had not articulated before. That has value independent of the software.
Where it has positioned AMS. The trial and error of building MentorOS (the months of sessions, the architectural mistakes, the security audit, the handoff attempts) is now institutional knowledge. We know what AI coding tools can do, what they cannot do, where to use them, when to stop, and what a developer engagement looks like when it follows an AI-assisted build phase. Other community organisations are at the start of that same process. They do not need to run the same experiments to learn the same lessons.
Frequently asked questions
Can I really build a CRM without a developer?
Yes. Tools like Replit, Claude Code, and Cursor make it possible to get a functioning CRM live without writing code from scratch. Replit is the fastest path to a working prototype: browser-based, zero setup, provisions the server and database in the same flow. The constraint is not the tooling. It is time, discipline, and knowing when to bring professional oversight in.
What does AI-assisted software development actually cost?
Tool pricing is low. Most AI coding tools run between $20 and $100 per month. The real cost is time. Expect hundreds of hours across the build phase for anything beyond a simple prototype. Add a developer engagement for security review before handling real user data: budget four to six weeks for a targeted audit, three to four months if the codebase has grown enough to need a full rewrite.
Is AI-generated code secure enough for an NDIS provider?
Not without review. Veracode's 2025 research found 45 percent of AI-generated code contains security flaws. For an NDIS provider handling participant data (names, contact details, support records, incident logs), a security audit of the authentication layer, database access controls, and API design is not optional. It is a condition of responsible operation under the NDIS Practice Standards and the Privacy Act 1988.
What is the difference between Replit, Claude Code, and Cursor for non-technical founders?
Replit runs in the browser with zero setup. It is the fastest path from idea to working prototype, and best for non-technical founders who need something live quickly. Claude Code runs in the terminal with a one-million-token context window and the highest benchmark scores for complex multi-file architecture work. It has a steeper learning curve for non-technical users, but handles large codebases well. Cursor is a VS Code fork used by over 800,000 active developers monthly. It is professional-grade, and best once the build has moved beyond the prototype phase.
When should a nonprofit or community organisation hire a developer?
When you are handling real user data, especially health or disability information, bring a developer in for at least a security audit before going live. Other triggers: the codebase has grown past what you can explain to someone else, you have had an unexplained outage, or a partner organisation has asked to review your infrastructure. Sector understanding matters in who you hire. A developer who does not know your operating environment will solve the wrong problems.
What is technical debt and why does it matter for AI-built software?
Technical debt is the accumulated cost of shortcuts taken during development: code that works now but will need rebuilding later. AI-assisted builds accumulate a specific kind: code without a coherent rationale, inconsistent style, and decisions that cannot be explained because they were generated rather than designed. This debt is invisible until the moment someone else needs to work on the system. Budget for it from the beginning.
The trial and error AMS went through is now part of how we work. If your organisation needs to go through this process (building an internal tool, automating a workflow, or figuring out when to get a developer involved), book a discovery call. We will map out what makes sense for your situation before any work starts.