Fear Not ChatGPT: Only Humans Can Judge Writing Contests (And Stories)

June 5, 2024: Contest, Evaluation and Revision

While attending the AWP conference, we were fortunate enough to chat with numerous writers about Bardsy's anthology contests. One question kept popping up: "Do you use AI to judge the entries?"
That writers even need to ask whether a human will judge their entries is grim. Unfortunately, with the increasing use of AI in so-called creative spaces, they're wise to do so. The answer is no: we have never used AI to judge Bardsy's anthology contests, and we never will. Identifying good stories (and providing actionable feedback) is a job for humans. So-called language models don't understand stories. To put it simply: I know current AI can't replace me, and so should you.
Quick disclaimer: there's already plenty of discussion about AI, how it works, and its consequences. The following stays high level and avoids technical detail in order to focus on what matters to us, serious authors. If there's interest, we'll continue this series with further explanation, analysis, and reflection.

We are all born as storytellers. Our inner voice tells the first story we ever hear. -Kamand Kojouri


Background: Humans vs. Robots

Humans are hardwired for stories. To see for yourself, give anyone three completely random facts, and they'll easily create a narrative. From childhood onward, our storytelling ability grows to form the basis of our communication and, crucially, gives us the power to assign meanings, large and small, to life. ChatGPT and similar language models don't compare. The clearest way to think about them is as "stochastic parrots": they eavesdrop on enormous bodies of human text (and other content) and spit back statistical recombinations in response to prompts. Recent innovations, beyond the sheer size of the corpus, center on the math that compresses all that text into a much smaller, more accessible memory footprint and that matches prompts against this stored summation.
At first glance, entering a prompt produces a human-like response. Behind the scenes, that response essentially averages whatever responses humans have made to similar queries. This ability leads many, including myself, to conclude that ChatGPT passes the Turing test. That isn't as significant as it sounds; it more likely underscores the test's primitivity. After all, it debuted in the 1950s, when digital memory was millions of times more expensive. ChatGPT's flaws become more apparent once you examine large numbers of responses across many prompts. Anyone familiar with the subject can list them: "hallucinations," formulaic writing, and a tendency to produce word salad. Note that these are fundamental, and likely irresolvable, problems with this technology. The insight: ChatGPT has no idea what it's talking about. Put another way, it has no ability to assign meanings.
At Bardsy, we define a story as a coherent combination of five elements: plot, character, conflict, theme, and world. Our definition hinges on coherence; thrown-together elements do not a story make. Words need careful orchestration, at every level, to rise above an unsatisfying story or, worse, outright nonsense. Nonsense being, literally, the absence of meaning.
Novels begin in authors' minds and end in readers'. Reading is a co-creative process, one that succeeds only if a satisfying story emerges in the reader, in line with the author's intentions. This requires mutual understanding, i.e. transferring meaning. As a judge, then, my job is to see whether a story has meaning and whether that meaning was conveyed effectively. You can see where this is headed: ChatGPT and its ilk can't assign meaning, which means they can't do my job.

Judging Criteria

Of course, blanket statements about meaning only go so far, whether for assessing ChatGPT or for ranking stories and providing feedback. At Bardsy, we use our Publishability Index™ (PI), a 23-part checklist for stories, to spell out the nuts and bolts of effectively transferring meaning, which for this purpose is synonymous with great storytelling.
Here's an example of how it works: the PI's "character/backstory" dimension has a specific threshold for character development; it calls for characters to have "relevant histories that spur individual development and action." True, it's a yes/no question (by design), but one that ChatGPT can't answer correctly. Our editorial team, by contrast, agrees on it more than ninety percent of the time (and, presumably, they're "correct" as well).
To use the PI effectively, a human must recreate the story in their mind before assessing whether the criteria are met. When judging, I read each story to capture its totality before performing the 23 evaluations. A story's success thus emerges from the subtle interplay of its words as they unfold across the pages and take on meaning in my appreciation. (This method also helps me provide precise feedback.) Unpacking this conclusion requires more space than this blog allows; suffice to say, extant forms of AI can never understand a story in this way. Evidence suggesting they do is merely a trick of the stochastic parrot, easily rebutted case by case. Let's look more closely at three prerequisites of winning and how ChatGPT falls short.

You're never going to kill storytelling, because it's built into the human plan. We come with it. -Margaret Atwood

Three Storytelling Aspects AI Can’t Judge

Believable characters and storyworlds
Winning contest entries have characters who come to imagined life and storyworlds I can picture moving through: reasonable requirements for immersion, at the least. Successful authors achieve this by tricking readers into applying to fictional characters the same mechanism humans use to know real people.
Psychological research shows we form a holistic model for everyone we meet. We can mentally interact with these simulacra—predicting likely responses, for example—and in this way "know" them in greater or lesser detail. Nearly the same process carries over to storyworlds.
Machines cannot mentally recreate your storyworld and characters, nor use these simulacra the way a human reader does. Gauging whether characters are authentic, and storyworlds sufficiently robust, requires co-creation. ChatGPT can regurgitate a composite of stored data, but the uniqueness and specificity that true recreation requires exceed its capabilities. It therefore has no way to interact imaginatively with your characters and world, let alone judge them.
A synergistic plot and conflict
Most fiction writers are familiar with the universal story structure: inciting incident, rising action, climax, and so on. Often overlooked is how conflict and theme connect these plot points. For example, events in the story should steadily intensify the stakes (the rising action), eventually culminating in the climax. In a great story, plot and conflict work in synergy toward a satisfying resolution.
Research, as illustrated above, has found that study participants spontaneously make up narratives when given an assortment of random events. We continually try to make sense of events as we automatically seek meaning. To satisfy this craving and elevate your story from good to great, give readers what they (perhaps subconsciously) want: a thematized conflict that works synergistically with the plot.
AI programs can’t appreciate this synergy as they’re incapable of picking up the meaningful, abstract nuances of a skillfully integrated plot and conflict. Machines can’t relate to your story’s conflict, so they miss critical aspects that satisfy readers. Human judges can tell when every element has this significance—and a story has meaning.
That something special: cohesion
Cohesion is a crucial component of a satisfying story, though more obvious elements, such as plot and conflict, typically pull focus. Great stories are more than a collection of carefully developed elements: when every word, line, and punctuation mark unites to serve a greater purpose, a story becomes unforgettable. Cohesive stories resonate with the reader, lingering in the mind days or even weeks later.
AI's biggest difficulty will probably lie in separating great stories from merely good ones, and those are precisely the stories it can't evaluate accurately or give useful feedback on. While AI may excel at identifying basic patterns, grasping the meticulous construction of a cohesive story, itself a meaningful, underlying pattern, is beyond its reach. In short, it's our humanity that enables us to catch the nuances that give a story meaning and to assess whether a story has survived the co-creative process. So until ChatGPT learns how to read between the lines, my job is safe.
Receive human feedback on your short story by entering it in our Spring Anthology Contest. Our robot-free editing team will complete a PI™ on your work, and then you’ll have the opportunity to revise. Once you have, our (all-human) judging team will get to work. One skilled writer will win $500, and all published finalists will receive $50. Click here to learn more or email joinus@bardsy.com.