Can AI Write Better Than Humans? I Tested It. Here Is What Actually Happened.

This is one of those questions where both camps have already decided the answer before looking at the evidence.

The AI enthusiasts say: yes, obviously, it is faster and often better. The writers and humanists say: no, obviously, AI lacks soul, lived experience, and genuine creativity. Both camps are partially right. Both are missing what is actually interesting about the comparison.

The AI Vanguard ran ten direct writing tests across different formats and evaluated the results against defined criteria. The AI tools used were Claude 3.5 Sonnet and ChatGPT (GPT-4o), both on paid tiers. The human writing was produced by the author of this post before seeing any AI output.

The results are genuinely interesting. In some categories the AI output was clearly better on the defined criteria. In others it was not even competitive. And in the most revealing cases, the comparison exposed something important about what writing actually is and what we actually value when we say something is well-written.

How the Tests Were Structured

Ten writing tasks were selected to represent a broad range of writing contexts, from professional emails to literary fiction. For each task, the same brief was given to Claude, to ChatGPT, and to the human author. The outputs were then evaluated on four criteria: technical quality (grammar, structure, coherence), relevance to the brief, originality of expression, and whether the output would be publishable or usable without editing.

The evaluation was done by the author, which introduces a bias toward human writing that should be acknowledged. What reduces but does not eliminate this bias: the criteria were defined before any output was produced, and the evaluation was conducted on anonymised outputs where possible.
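For readers who want to run a similar comparison, the blinding step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the exact procedure used for this post: the `Submission` class, the `blind` and `unblind` helpers, the "Output A/B/C" labels, and the idea of a numeric score per criterion are all hypothetical. Only the four criteria come from the method described above.

```python
import random
from dataclasses import dataclass

# The four criteria named in the post. The identifiers are illustrative.
CRITERIA = (
    "technical_quality",
    "relevance_to_brief",
    "originality",
    "usable_without_editing",
)


@dataclass(frozen=True)
class Submission:
    source: str  # hypothetical labels such as "claude", "chatgpt", "human"
    text: str


def blind(submissions, seed=None):
    """Shuffle submissions and hide their sources behind neutral labels.

    Returns (texts, key): texts maps a label to the text to be scored,
    and key maps the same label back to the hidden source.
    """
    rng = random.Random(seed)
    shuffled = list(submissions)
    rng.shuffle(shuffled)
    texts, key = {}, {}
    for i, sub in enumerate(shuffled):
        label = f"Output {chr(ord('A') + i)}"
        texts[label] = sub.text
        key[label] = sub.source
    return texts, key


def unblind(scores, key):
    """Turn per-label scores into a total per source.

    scores maps a label to a dict holding one number per criterion.
    The scoring scale is an assumption; the post does not specify one.
    """
    totals = {}
    for label, per_criterion in scores.items():
        missing = set(CRITERIA) - set(per_criterion)
        if missing:
            raise ValueError(f"{label} is missing scores for {sorted(missing)}")
        totals[key[label]] = sum(per_criterion[c] for c in CRITERIA)
    return totals
```

In use, the evaluator would see only the labelled texts returned by `blind`, record a score per criterion for each label, and call `unblind` once scoring is finished. Keeping the key out of sight until then is what does the work. It cannot help where the source is obvious from the content, which is why the anonymisation described above was only applied "where possible".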

Six of the ten tests are presented below. The pattern across all ten is summarised in the conclusion.

The Tests

TEST: Professional email declining a partnership

AI output: Claude produced a well-structured, appropriately warm email that hit all the required notes: gratitude, clear decline, positive framing of the future relationship, and a specific offer to reconnect. It required no editing before sending.

Human output: The human version was shorter, slightly more personal in tone, and included a specific reference to a shared conversation that the AI could not have known to include. The human version would have landed better with the specific recipient.

Verdict: Draw. AI wins on consistency and completeness. Human wins on relational specificity. For a generic professional contact, AI is sufficient. For a valued relationship, the human version is meaningfully better.

TEST: 500-word explainer on how neural networks work, for a non-technical reader

AI output: Both Claude and ChatGPT produced technically accurate, well-structured explainers. Claude's analogy choices were slightly more original. ChatGPT's structure was slightly cleaner. Neither hallucinated.

Human output: The human version was written with a specific person in mind (a relative who asked the question over dinner) and that specificity gave it an energy and directness that the AI versions, written for an abstract 'non-technical reader,' did not quite match.

Verdict: AI wins on technical accuracy and structure. Human wins on voice and relational specificity. The difference is small and the AI versions are fully publishable.

TEST: Opening paragraph of a literary short story, no brief beyond 'a character arrives in an unfamiliar city'

AI output: Claude produced a competent, atmospheric opening with well-constructed sentences and a clear sense of setting. It read like a strong creative writing workshop submission: technically accomplished, emotionally neutral.

Human output: The human version made an unusual structural choice in the third sentence that created genuine tension. It was not better-written in a technical sense but it was more surprising, which in literary fiction is more important.

Verdict: Human wins, clearly. AI creative fiction is technically proficient and emotionally predictable. The predictability is the problem. Literary writing derives its value from the unexpected, and AI overwhelmingly produces the statistically expected.

TEST: Marketing copy for a new AI productivity tool, 150 words, targeting small business owners

AI output: Both AI tools produced tight, benefit-focused copy that hit the key conversion requirements: clear value proposition, specific audience, action-oriented close. ChatGPT's was slightly punchier. Both were immediately usable.

Human output: The human version was good but required two more drafts to reach the same quality level as the AI first drafts. The AI outputs were more consistent in maintaining marketing discipline throughout the 150 words.

Verdict: AI wins, clearly. Marketing copy with defined requirements and a defined audience is exactly the task AI does well. The human process was slower and the quality of the first draft was lower.

TEST: Opinion piece arguing a position the author genuinely holds

AI output: Claude was given the position and asked to argue for it. The result was a well-structured argument with logical coherence and appropriate evidence. It was convincing in the way that a competent lawyer's brief is convincing.

Human output: The human version had a different quality: it felt argued rather than constructed. The examples were drawn from personal experience. The objections anticipated were the specific ones the author had actually encountered. The irritation at those objections was audible in the prose.

Verdict: Human wins. Opinion writing is not primarily a logical exercise. It is an expression of a perspective shaped by specific experience. AI can construct an argument. It cannot argue from lived conviction, and the difference is perceptible.

TEST: Press release for a fictional product launch, 300 words, in standard press release format

AI output: Both AI tools produced press releases that were indistinguishable from standard industry releases. Correct structure, appropriate language, clear news hook, boilerplate company description. Both were immediately publishable.

Human output: The human version took 25 minutes and required two revisions to match the AI output quality. The AI versions took under two minutes each.

Verdict: AI wins, comprehensively. Press releases follow a defined structure with defined language conventions. AI executes this better and faster than most humans because it has processed thousands of examples and requires no time to recall the format.

What the Pattern Reveals

Across all ten tests, a consistent pattern emerged that is more useful than any simple verdict about which is better.

AI writing is excellent when the value of the output is in its technical execution: its structure, its accuracy to the brief, its adherence to format conventions, its consistent quality across multiple outputs. Press releases, marketing copy, explainers, professional emails, and any format where 'following the template well' is the primary success criterion are all areas where AI matches or exceeds human output on the first draft.

Human writing is better when the value of the output is in its specificity: its connection to a particular relationship, a particular experience, a particular conviction. Opinion writing, literary fiction, personal essays, and communications where the recipient's specific context matters all benefit from a human perspective that AI cannot authentically replicate.

The insight this produces is one that neither the AI enthusiasts nor the writing purists want to confront: the writing that AI cannot do well is not what most of the world's professional writing consists of. Most professional writing is structured, templated, and evaluated on technical criteria. AI is already better at that writing than most humans.

The writing that AI cannot do well is the writing that matters most to culture, to literature, to genuine human communication. AI cannot write from conviction, from specific experience, or from the kind of unexpected perspective that makes writing genuinely surprising.

The Uncomfortable Implication

If AI is already better than most humans at most professional writing tasks, and if most professional writing consists of structured, templated, format-compliant documents, then the practical question for anyone whose work involves writing is not whether AI can replace them. It is which parts of their writing it has already replaced and what that means for how they spend the working hours that remain.

The answer is not to resist the tool. It is to concentrate effort on the writing that AI demonstrably cannot do: writing that emerges from specific experience, specific relationships, specific conviction. That is where human writing has a structural advantage that statistical text prediction cannot close.

The writers who will thrive are not the ones who refuse AI tools or the ones who outsource everything to them. They are the ones who use AI to handle the templated work efficiently and concentrate their own attention on the writing that only their particular life could produce.

The AI Vanguard Take: The question 'can AI write better than humans' is the wrong question. The right question is: which writing tasks benefit from AI assistance and which writing tasks require human specificity? The answer to the second half of that question is also the answer to what makes writing worth preserving as a human practice.

Key Takeaways

        AI writing excels at technically defined tasks with format conventions: press releases, marketing copy, professional emails, and explainers. On these tasks, AI first drafts consistently match or beat human first drafts in quality and always beat them in speed.

        Human writing retains a clear advantage in writing requiring specific experience, relational knowledge, genuine conviction, or literary surprise. AI produces technically proficient but emotionally predictable prose in these genres.

        The uncomfortable implication: most professional writing is templated and format-driven, which is exactly where AI performs best. The writing that AI cannot replicate is not the majority of what professional writers are paid to produce.

        The productive response is not resistance but concentration: use AI for templated work and concentrate human writing effort on the specific, the convictional, and the genuinely surprising.

        Writing well in the AI era means understanding which of your writing tasks are now delegatable and which are now more valuable precisely because they are not.

Frequently Asked Questions

Will AI replace professional writers?

AI has already displaced some writing work, particularly commodity content production, basic copywriting, and templated document drafting. It is unlikely to displace writing that requires genuine originality, specific expertise, personal voice, or authentic experience. The professional writers most at risk are those whose work sits closest to the templated end of the spectrum. The writers most insulated are those whose value is in their specific perspective, not their technical execution.

Is AI writing detectable?

AI detection tools are unreliable and have significant false positive rates. More practically, skilled readers can often identify AI writing through a recognisable quality: technically competent, emotionally neutral, and statistically predictable in its choices. The quality that makes human writing memorable, the unexpected turn, the specific example no general training could have produced, is also the quality that makes it distinguishable. AI writing is rarely wrong. It is also rarely surprising.

Which AI tool writes better, Claude or ChatGPT?

For most writing tasks in these tests, Claude produced slightly higher quality output on the first attempt, particularly on tasks requiring tonal precision and instruction following. ChatGPT's interface is more intuitive and its marketing copy was marginally punchier. The difference on most writing tasks is small. A full head-to-head comparison across ten categories is available in the Day 5 post: ChatGPT vs Claude vs Gemini.

Should I use AI to write my blog posts?

Using AI to generate first drafts, outlines, and structural scaffolding for blog posts is legitimate and efficient. The AI Vanguard uses AI as part of its own writing process. The editorial judgment, the specific examples, the original framing, and the voice that distinguishes good content from generic content all require human investment. A blog post written entirely by AI and published without editorial intervention will read like one. A blog post where AI drafts and a human edits, refines, and adds specificity can be genuinely excellent.

A Note on Transparency:  This post, like all content on The AI Vanguard, was produced with AI assistance. The writing tests described were conducted as described. The analysis and conclusions are the author's own. The post was edited, structured, and published by a human. This is our standard practice and the practice we recommend.


