De-Generate 🔎 🤖

Detect AI-generated text sections and tell them apart from human-written text, using an open-source LLM as a critic.
This app essentially asks an LLM: "would you have generated the same response?". It then probes the internals of the model for its opinion. The more surprised the LLM is, the more "human" the text is scored.

Some texts that were originally human-written are considered "AI" here:
- very well-known samples like Wikipedia articles, classic books in the public domain, etc.
- generic and predictable phrases like idioms, formal style, etc.

So "take this with a grain of salt"! I.e. anything that is highly predictable is deemed "AI": more precisely, this app asks the question "what is *original* human work in this text?". Believe it or not, this entire introduction was written by hand, so you can see the limits of my tool!

More than individual scores, the "aspect" of whole sections is reliable. Use your judgment to assess where your sample sits between the two extremes:
- a sentence with several green tokens is likely human, or at least reworked by a human to some extent
- a sentence that is all red is highly generic, which often indicates templated or LLM writing

People already wrote generic copypasta like "sincerely yours" before LLMs, and IMHO they deserve to be in red.
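The per-token "surprise" idea above is commonly measured as *surprisal*: the negative log of the probability the model assigned to the token that actually appeared. Here is a minimal, self-contained sketch of that computation; the real app probes an actual LLM's output distribution, so the probabilities below are purely illustrative.

```python
import math

def surprisal_bits(prob: float) -> float:
    """Surprisal in bits for a token the model assigned probability `prob`.

    Low surprisal = predictable token (leans "AI"/generic);
    high surprisal = unexpected token (leans "human"/original).
    """
    return -math.log2(prob)

# A highly predictable token (model gave it p = 1/2) -> 1 bit of surprisal.
print(surprisal_bits(0.5))       # 1.0
# A token the model found unlikely (p = 1/1024) -> 10 bits of surprisal.
print(surprisal_bits(1 / 1024))  # 10.0
```

Note that the scale is logarithmic: a token ten times less likely only adds a constant amount of surprisal, which keeps scores comparable across long texts.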
The final probability score is a combination of several metrics, computed token by token (word by word):
- the `unicode` metric outlines invisible or relatively rare glyphs like the infamous em-dash ("—") or the curly quotes (""")
- the `surprisal` metric measures how individual tokens diverge from the distribution estimated by the model
- the `perplexity` metric rates how surprising a sequence of tokens is according to the model
- the `sampling` metric evaluates how much more likely a token is under a typical sampling policy
- the `ramping` ratio is used to dampen the scores at the beginning, where the model lacks context

The fixed recipe used to combine these metrics into the final score for each token is obviously insufficient to detect the ever-improving outputs of LLMs. It is highly likely that you will make better use of them, so the indicators are plotted in the "graphs" tab. You will find a few samples there to train your eye and spot LLM patterns. The details of the scoring recipe are available in the "docs" tab.
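To make two of the metrics above concrete, here is a hedged sketch of `perplexity` (derived from per-token surprisals) and a `ramping` ratio. The exact formulas and constants used by the app live in the "docs" tab; the warmup length below is an invented placeholder, not the app's actual value.

```python
def perplexity(surprisals_bits: list[float]) -> float:
    """Perplexity of a token sequence: 2 to the power of the mean per-token surprisal (in bits)."""
    return 2 ** (sum(surprisals_bits) / len(surprisals_bits))

def ramping(index: int, warmup: int = 8) -> float:
    """Dampening ratio for the first tokens, where the model lacks context.

    The warmup length of 8 tokens is illustrative only.
    """
    return min(1.0, (index + 1) / warmup)

# A sequence that is uniformly "one bit of surprise per token" has perplexity 2.
print(perplexity([1.0, 1.0, 1.0]))  # 2.0
# The first token is heavily dampened; by the end of the warmup the ramp is fully open.
print(ramping(0), ramping(10))      # 0.125 1.0
```

A score multiplied by this ramp is near zero for the opening tokens, which matches the intent described above: the model has seen almost no context yet, so its "opinion" there is unreliable.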