What AI-Assisted Coding & GAS Can’t Do


I have spent the last year or so writing and talking about why educators and districts should consider building their own tools. The argument is one I still believe in: if you can describe a workflow clearly, AI can help you turn it into a working automation using something like Google Apps Script, and that gives you a degree of agency over your technology that vendor-dependent solutions never will. I have built tools this way for communication, rostering, course management, and file tracking, and I have shared them as open-source resources for others to adapt.

But I have also hit walls. Repeatedly. And if I am going to keep encouraging people to try this approach, I owe them an honest accounting of where it breaks down, what it cannot do, and what you need to plan for before you start building. This is that accounting.

A Note on What We’re Actually Talking About

You have probably heard the term “vibe coding” by now. Coined by AI researcher Andrej Karpathy in early 2025, it describes an approach where you describe what you want in plain language, let AI generate the code, and accept the output without really examining what it produced. The idea is to “forget that the code even exists” and just work from the results.

That is not quite what I do, and it is not what I recommend.

What I advocate for is closer to what programmer Simon Willison describes as using AI as a coding assistant. You describe the problem. AI generates a solution. But then you review it, test it, try to understand what it is doing, and iterate when something does not work. You are not ignoring the code. You are collaborating with a tool that handles the syntax while you handle the logic, the context, and the judgment calls about whether the output actually does what your users need.

The distinction matters because the limits look very different depending on which approach you take. If you are truly vibing, accepting AI output without review and deploying it to your district, the risks around security, reliability, and maintainability are significant. Research from late 2025 found that AI co-authored code contained roughly 1.7 times more major issues than human-written code, with security vulnerabilities appearing at nearly three times the rate. In an educational context where you are handling student data and institutional processes, that is not a risk you can afford to ignore.

If you are taking the more deliberate approach, reviewing and understanding the code as you go, the risks shift. They become less about hidden vulnerabilities and more about the hard constraints of the platform you are building on and the practical limits of what a non-developer can reasonably manage. Those are the limits I want to focus on here, because they are the ones I have actually encountered in my own work.

The Platform Has a Ceiling

Google Apps Script is remarkably capable for what it is: a free, integrated scripting environment that runs inside the Google Workspace ecosystem most schools already use. But it was designed for lightweight automation, not for building production-grade software, and the constraints reflect that.

The most common wall you will hit is the six-minute execution limit. Every time a script runs, it has six minutes to finish before Google shuts it down. For simple tasks, like sending a batch of emails or updating a spreadsheet, this is plenty. For anything that needs to process a large amount of data, it becomes a serious design constraint.

I ran into this when building a tool to track when files and folders had last been updated across a large Google Drive. The Drive contained thousands of files, and the script needed to walk through the entire folder structure, check modification dates, and log the results. There was no way to do that in six minutes. The solution was to build what amounted to a working memory using a Google Sheet: the script would log which sections of the Drive it had already checked and which still needed processing, then a time-based trigger would restart the script every ten minutes to pick up where it left off. It works, but it is a workaround for a hard platform limitation, and it adds significant complexity to something that would be straightforward in a more capable environment.

This pattern, breaking a long process into chunks and using a sheet to track progress between runs, comes up constantly in Apps Script development. AI tools will often generate code that ignores the limit entirely, producing scripts that look correct but will die partway through once they hit the six-minute wall on real data. It is one of the first things I have learned to check for in any AI-generated script, and one of the first things I mention when helping someone else get started.
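The pattern can be sketched roughly like this. It is written in plain JavaScript so the logic runs anywhere; in a real Apps Script project, the checkpoint functions would read and write a tracking sheet (or the Properties Service), and a time-based trigger would call runBatch again every ten minutes. All names here are illustrative, not lifted from an actual tool.

```javascript
// Sketch of the chunk-and-checkpoint pattern. The in-memory checkpoint
// below stands in for sheet-backed storage; in Apps Script it would be
// read from and written to a tracking sheet between runs.

const TIME_BUDGET_MS = 5 * 60 * 1000; // stop well before the six-minute limit

let savedCheckpoint = { nextIndex: 0, done: false };
function loadCheckpoint() { return savedCheckpoint; }
function saveCheckpoint(checkpoint) { savedCheckpoint = checkpoint; }

function runBatch(items, processItem) {
  const startedAt = Date.now();
  const checkpoint = loadCheckpoint();
  while (checkpoint.nextIndex < items.length) {
    if (Date.now() - startedAt > TIME_BUDGET_MS) break; // out of time: stop cleanly
    processItem(items[checkpoint.nextIndex]);
    checkpoint.nextIndex += 1;
  }
  checkpoint.done = checkpoint.nextIndex >= items.length;
  saveCheckpoint(checkpoint); // the next triggered run resumes from here
  return checkpoint;
}
```

Each run processes as much as fits inside the time budget, records where it stopped, and exits on its own terms instead of being killed mid-operation.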

Beyond execution time, there are daily quotas on almost everything: how many emails you can send, how many URL requests you can make, how many times you can read from or write to a spreadsheet. Consumer Gmail accounts have significantly lower limits than Google Workspace accounts, and the quotas can change at any time without notice. If you are building a tool that will be used by many people or that processes large volumes of data, you need to understand these limits before you start, not after your tool stops working on a busy Monday morning.
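For email in particular, Apps Script does expose the remaining daily allowance through MailApp.getRemainingDailyQuota(), which makes a simple guard possible. Here is a minimal sketch of that guard; the helper name is mine, and the remaining quota is passed in as a number so the logic itself is testable anywhere.

```javascript
// Sketch of a quota guard for a batch email send. In Apps Script, the real
// number would come from MailApp.getRemainingDailyQuota(); splitByQuota is
// a hypothetical helper, not a built-in.
function splitByQuota(recipients, remainingQuota) {
  const sendNow = recipients.slice(0, Math.max(0, remainingQuota));
  const deferred = recipients.slice(sendNow.length);
  return { sendNow, deferred }; // deferred addresses wait for tomorrow's quota
}
```

Checking before the send, rather than catching a failure mid-batch, is what keeps a Monday-morning mail merge from stopping halfway through the roster.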

The platform is also single-threaded and stateless. Each time a script runs, it starts from scratch with no memory of previous runs unless you have explicitly stored that state somewhere (usually in a spreadsheet or the Properties Service). There is no way to run multiple operations in parallel. For the kinds of tools most educators need, these constraints are manageable. But they mean that certain categories of tool, anything requiring real-time responsiveness, high concurrency, or complex multi-step workflows, are simply not a good fit for Apps Script, regardless of how well you prompt your AI assistant.
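Persisting state between stateless runs usually looks something like the sketch below. Apps Script's built-in key-value store is PropertiesService.getScriptProperties(), which holds only strings; the in-memory stand-in here mimics its getProperty/setProperty shape so the pattern runs anywhere, and the property name and state fields are illustrative.

```javascript
// In-memory stand-in for PropertiesService.getScriptProperties(). The real
// service has the same getProperty/setProperty interface but persists
// across executions.
const scriptProperties = (() => {
  const store = {};
  return {
    getProperty: (key) => (key in store ? store[key] : null),
    setProperty: (key, value) => { store[key] = String(value); },
  };
})();

// Because values are plain strings, structured state gets JSON round-tripped.
function loadState() {
  const raw = scriptProperties.getProperty('toolState');
  return raw === null ? { runsCompleted: 0 } : JSON.parse(raw);
}

function saveState(state) {
  scriptProperties.setProperty('toolState', JSON.stringify(state));
}
```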

Scope Creep Is the Default, Not the Exception

One of the less-discussed limits of AI-assisted coding is that AI tools are, by nature, eager to build. Describe a problem, and the AI will generate a solution. Describe a bigger problem, and it will generate a bigger solution. It does not naturally push back and say “that is too complex for this platform” or “you should break this into three separate tools instead of one.” That restraint has to come from you.

I learned this the hard way with an accessibility tool I was building for Canvas courses. The original idea was straightforward: a script that would go through course pages and clean up common accessibility issues that could be somewhat automated, things like header levels being out of order. But as I started describing edge cases to Claude, the project kept expanding. What about images without alt text? What about tables without headers? What about links that say “click here”? Each addition seemed reasonable on its own, but together they turned a focused utility into something sprawling and brittle.

The problem was not that AI could not write the code. It could, and it did. The problem was that I had not spent enough time upfront thinking through what was actually feasible to automate reliably versus what required human judgment. Some accessibility issues have clear, rule-based fixes. Others require contextual understanding that no script can provide. By the time I realized I had let the scope balloon, I had invested significant time in a tool that tried to do too much and did none of it particularly well.

What I would do differently, and what I now recommend to anyone starting a project like this, is to spend time planning with the AI before any code gets written. Describe the problem, ask the AI to help you map out what is automatable and what is not, identify the edge cases, and agree on a scope before a single function gets defined. This is a form of chain-of-thought prompting, but applied to project planning rather than code generation. It takes longer upfront and saves enormous amounts of time later.

The Sharing and Scaling Question

When you build a tool with Google Apps Script bound to a spreadsheet, you face a fundamental architectural choice about how it will be used and shared. The approach I have settled on, and the one I use for all of my WebTools, is to make the spreadsheet itself the configuration interface. All the settings, branding, text content, and behavioral options live in a config tab on the sheet. When someone else wants to use the tool, they make a copy of the sheet, adjust the config tab for their context, authorize the script, and they are up and running with their own independent instance.

This approach has real strengths. It is genuinely open source in the most practical sense: anyone with a Google account can copy and run the tool without asking permission, paying a subscription, or depending on my infrastructure to stay online. Each instance is independent, which means one school’s data never touches another’s. And the config-tab pattern means people can customize the tool’s behavior without ever opening the script editor. We are using this approach right now with a Danielson domain walkthrough tool, where all the branding, domain content, and deployment settings live in the config sheet so we can roll it out across multiple sites without modifying the underlying code.
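The reading side of the config-tab pattern is simple enough to sketch. Assume each row of a Config tab holds a setting name in column A and its value in column B; in Apps Script the rows would come from something like SpreadsheetApp.getActive().getSheetByName('Config').getDataRange().getValues(), while here the 2D array is passed in directly so the parsing is testable anywhere. The setting names shown are hypothetical.

```javascript
// Sketch: turn a Config tab's rows (key in column A, value in column B)
// into a plain settings object.
function readConfig(rows) {
  const config = {};
  rows.forEach((row) => {
    const key = String(row[0] || '').trim();
    if (key) config[key] = row[1]; // skip blank rows and stray empty cells
  });
  return config;
}

// Example rows as they might come off a sheet:
const config = readConfig([
  ['schoolName', 'Example Middle School'],
  ['', ''],
  ['primaryColor', '#003366'],
]);
```

Every behavioral decision in the script then reads from that object, which is what lets someone customize a copied tool without ever opening the script editor.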

At the same time, this architecture has limits that are worth naming honestly. Because each copy of the tool is a separate, independent instance, there is no built-in way to aggregate data across sites. If ten schools are each running their own copy of a walkthrough tool, and you want a district-level summary of how the tool is being used, you have to build that reporting layer separately. In some cases, this data independence is actually a feature rather than a limitation; it means each entity owns its own data and there are no privacy concerns about information crossing organizational boundaries. But it does mean that certain use cases, particularly those requiring centralized reporting or cross-site analytics, are not a natural fit for this model.

The scalability ceiling is also real. If a tool runs as the developer (meaning it uses your Google account’s permissions to execute), everything counts against your quotas. In practical terms, that means a tool deployed to a hundred or more users under a single developer account is going to start hitting quota limits. Running the tool as the user (where each person’s own Google account handles the execution) scales much better, but it requires each user to go through the authorization process, which is its own barrier. That authorization screen, the one that says “This app isn’t verified” and asks users to click through multiple warnings, is the single biggest point where people abandon the setup process for Apps Script tools. It is safe, and the warnings are standard for any script that has not gone through Google’s formal review process, but it looks alarming if you do not know what to expect.
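For web app deployments, that run-as choice is declared in the project's appsscript.json manifest. A minimal fragment might look like the following; the values shown are one possible configuration, not a recommendation.

```json
{
  "webapp": {
    "executeAs": "USER_ACCESSING",
    "access": "DOMAIN"
  }
}
```

"USER_ACCESSING" runs the tool as each visitor (scales better, but every user hits the authorization screen), while "USER_DEPLOYING" runs it as the developer, with everything counting against that one account's quotas.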

What You Cannot Skip

Even with the more careful, AI-assisted approach I advocate for, there are things that this workflow genuinely cannot replace.

You still need to understand what your code is doing at a conceptual level. You do not need to be able to write JavaScript from scratch, but you do need to be able to read through a script and follow its logic well enough to spot when something does not match what you asked for. AI is good at generating plausible code. It is not always good at generating correct code, especially when the requirements involve nuances that were implicit in your description but not explicit in your prompt. The more you build, the better you get at this kind of review, but it is a skill that develops over time rather than something you can skip.

You also need to think about what happens when you are not around. If you build a tool that your team relies on and then you change jobs, someone needs to be able to maintain it. This means documentation matters, the config-tab pattern matters, and keeping the code as simple as possible matters. One of the risks of AI-generated code is that it can be structurally complex in ways that are hard for someone else (or even for you, six months later) to follow. Being intentional about asking for well-commented, clearly structured code helps, but it does not eliminate the maintenance question entirely.

And you cannot skip thinking about data privacy and security. Google Apps Script tools that handle student information need to comply with FERPA and any relevant state privacy laws, just like commercial tools do. The difference is that commercial vendors (ideally) have security teams reviewing their code. When you build your own tool, that responsibility falls on you. This does not mean you should not build your own tools. It means you should be thoughtful about what data your tools access, how they store it, and who can see it. For most of the sheet-bound tools I build, the data stays entirely within the user’s own Google Workspace, which keeps the privacy picture relatively simple. But if your tool sends data to external APIs or stores information outside of Google’s ecosystem, the considerations get more complex.

Where This Leaves Us

None of what I have described here is a reason not to build your own tools. It is a reason to build them with clear eyes about what the approach can and cannot do.

AI-assisted coding with Google Apps Script is genuinely good at producing focused, single-purpose tools that automate repetitive workflows within the Google ecosystem. It is good at giving educators and districts agency over their own technology rather than depending entirely on vendors. And it is a legitimate way for people without traditional programming backgrounds to create solutions that actually fit their specific context.

What it is not good at is replacing professional software development for complex, high-stakes, or large-scale systems. It is not a substitute for IT infrastructure, security review, or the kind of careful architecture that comes from deep technical expertise. And it requires more planning, more testing, and more ongoing attention than the most enthusiastic accounts of vibe coding might suggest.

The approach I keep coming back to is: start simple, plan before you build, understand what you have made, and be honest about the constraints. If a tool needs to do more than the platform supports, that is useful information, not a failure. Sometimes the right answer is a simple script. Sometimes it is a commercial product. And sometimes it is knowing the difference.

If you have been building tools this way and have run into limits of your own, or if you are considering getting started and want to think through what is feasible, I would like to hear from you. You can reach me at licht.education@gmail.com, and there are more tools, articles, and resources at bradylicht.com.
