perplexity and burstiness explained (without the jargon)

every article about ai detection mentions perplexity and burstiness. most of them explain both badly. let's fix that.

perplexity = how surprising your word choices are

imagine you're writing "i went to the ___."

common answers: store, park, doctor, gym. these are low-perplexity choices. predictable. expected.

surprising answers: observatory, courthouse, wrong address. higher perplexity. less expected.

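ai usually picks a low-perplexity option. that's a side effect of how it works: the model assigns a probability to every possible next token, then picks from the likely ones. it doesn't always take the single most probable token (most setups sample with a bit of randomness), but the output still leans hard toward the predictable.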

humans are weirder. you might write "i went to the store" in one sentence and "i dragged myself to that overpriced deli on 5th" in the next. the second one is way less predictable. higher perplexity.

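detectors measure this across your whole text. typically they run it through a language model of their own and check how probable each word was given the words before it. if nearly every word choice is the predictable one, the text looks ai-generated.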
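
if you want the actual math, here's a minimal sketch in python. perplexity is the exponential of the average negative log-probability per token. the function name and the probabilities below are made up for illustration; a real detector would get the per-token probabilities from a language model, which this sketch skips.

```python
import math

def perplexity(token_probs):
    """exp of the average negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

# made-up probabilities, for illustration only.
# each number is how likely the model thought that word was.
predictable = [0.5, 0.5, 0.5, 0.5]   # every word was an easy guess
surprising = [0.5, 0.5, 0.5, 0.01]   # the last word was a curveball

print(perplexity(predictable))  # about 2
print(perplexity(surprising))   # about 5.3
```

one unlikely word is enough to pull the score up. that's the "overpriced deli on 5th" effect in numbers.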

burstiness = how much your writing rhythm varies

look at the paragraph you're reading right now. some sentences are long. some short. that's burstiness.

ai writes in this steady rhythm. medium sentence. medium sentence. medium sentence. occasionally a short one. back to medium. the variation is minimal.

human writing is all over the place. you might write a 40-word sentence followed by a 3-word sentence followed by a parenthetical aside that doesn't really need to be there but you included it anyway because it felt right. that's high burstiness.

detectors measure sentence length variation and structural complexity across the document. low variation = probably ai.
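
here's a rough way to put a number on that, sketched in python. it's an assumption, not how any particular detector does it: burstiness here is just the standard deviation of sentence lengths divided by the mean (the coefficient of variation). the sentence splitter is naive on purpose and will trip on abbreviations and decimals, and it ignores the structural-complexity part entirely.

```python
import re
import statistics

def sentence_lengths(text):
    """word count of each sentence, splitting naively on . ! ?"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def burstiness(text):
    """stdev of sentence lengths divided by their mean (0.0 = perfectly even)."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # can't measure variation with fewer than two sentences
    return statistics.stdev(lengths) / statistics.mean(lengths)

steady = "the cat sat down. the dog ran off. the bird flew away."
varied = "stop. i walked all the way across town in the rain to find it closed. typical."

print(burstiness(steady))  # 0.0, every sentence is four words
print(burstiness(varied))  # well above 1, the lengths are 1, 14, 1
```

the steady sample is the metronome. the varied one is closer to how people actually write.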

why these matter together

high perplexity + high burstiness = probably human. the word choices are surprising and the rhythm is varied.

low perplexity + low burstiness = probably ai. predictable words, steady rhythm.

most detectors combine these signals (plus others) into a single score. the specifics vary but the core idea is the same.
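
to make "a single score" concrete, here's a toy sketch in python. everything in it is hypothetical: the function name, the weights, and the midpoints are invented for illustration, and real detectors use more signals and tune their own numbers. it just squashes a weighted sum of the two signals into a 0-to-1 score with a logistic function.

```python
import math

def human_likeness(perplexity, burstiness,
                   ppl_mid=20.0, burst_mid=0.5,
                   ppl_weight=0.1, burst_weight=4.0):
    """toy score between 0 and 1. higher = reads more human.

    all four default values are invented for this sketch,
    not taken from any real detector.
    """
    z = ppl_weight * (perplexity - ppl_mid) + burst_weight * (burstiness - burst_mid)
    return 1.0 / (1.0 + math.exp(-z))

print(human_likeness(60.0, 1.2))  # surprising words, varied rhythm: close to 1
print(human_likeness(8.0, 0.1))   # predictable words, steady rhythm: close to 0
```

the point isn't the exact numbers. it's that both signals push the same score, so fixing only one of them gets you partway.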

what you can do about it

manually: vary your sentence length deliberately. use unexpected words sometimes. break up the rhythm. start a sentence with "and" or "but." throw in a fragment.

with tools: paraai's paraphrase naturally handles both signals. the fine-tuned models trained on human-text corpora learned these patterns from real writing. the output has natural perplexity variation and burstiness because the training data did.

the cheat sheet

if your text reads like a metronome (steady, even, predictable), it's likely to get flagged.

if it reads like a conversation (varied, surprising, a little messy), it's much less likely to.

untraceable ai writing has high perplexity and high burstiness. that's the whole game.