Kashif Ullah
2024 Sole developer

AI Text Detection System

NLP pipeline that trains classical ML models on stylometric features to distinguish AI-generated text from human writing, with a Streamlit demo UI.

The problem

Teachers, editors, and platforms want a quick screen for AI-generated text without sending content to a third-party API. The goal was an explainable classifier that runs on-device, not a black box.

The approach

  • Feature engineering: type-token ratio, burstiness, sentence-length variance, punctuation density, function-word frequency.
  • Trained Logistic Regression, Random Forest, and Gradient Boosting models and compared them using cross-validated ROC-AUC.
  • Streamlit demo shows per-feature contributions so users can see why a passage was flagged.
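
As a rough illustration of the feature-engineering step, the sketch below computes the five listed features with the Python standard library only. The exact definitions used in the project are not stated, so everything here is an assumption: the regex tokenizer, the tiny `FUNCTION_WORDS` set, the function name `stylometric_features`, and in particular the burstiness formula (σ − μ) / (σ + μ) over sentence lengths, which is one common definition among several.

```python
import re
import statistics
import string

# Hypothetical, deliberately tiny function-word list; a real one would be longer.
FUNCTION_WORDS = frozenset({
    "the", "a", "an", "and", "or", "but", "of", "to",
    "in", "on", "for", "with", "is", "was", "it", "that",
})

WORD_RE = re.compile(r"[a-z']+")


def stylometric_features(text: str) -> dict[str, float]:
    """Return the five stylometric features for one passage (assumed definitions)."""
    tokens = WORD_RE.findall(text.lower())
    n_tokens = len(tokens)

    # Naive sentence split on terminal punctuation; lengths are counted in tokens.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(WORD_RE.findall(s.lower())) for s in sentences]

    mean_len = statistics.fmean(lengths) if lengths else 0.0
    std_len = statistics.pstdev(lengths) if lengths else 0.0
    denom = std_len + mean_len

    return {
        # Distinct tokens divided by total tokens.
        "type_token_ratio": len(set(tokens)) / n_tokens if n_tokens else 0.0,
        # Assumed definition: (sigma - mu) / (sigma + mu) of sentence lengths.
        "burstiness": (std_len - mean_len) / denom if denom else 0.0,
        # Population variance of sentence lengths.
        "sentence_length_variance": statistics.pvariance(lengths) if lengths else 0.0,
        # Share of characters that are ASCII punctuation.
        "punctuation_density": (
            sum(c in string.punctuation for c in text) / len(text) if text else 0.0
        ),
        # Share of tokens that appear in the function-word list.
        "function_word_frequency": (
            sum(t in FUNCTION_WORDS for t in tokens) / n_tokens if n_tokens else 0.0
        ),
    }


if __name__ == "__main__":
    for name, value in stylometric_features("The cat sat. The cat sat on the mat.").items():
        print(f"{name}: {value:.3f}")
```

Each passage becomes one fixed-length row of named numbers, which is what lets a classical model's output be traced back to individual features later.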

Outcome

  • Best model reached ~0.92 ROC-AUC on the held-out test set.
  • Explainability was the headline feature: users trusted a transparent score over a single confidence number.
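
The page does not say how the per-feature contributions were computed or which model produced them. One simple possibility, sketched here for a logistic-regression-style model, is to report each feature's term in the logit: its weight times its standardized value. The function name `explain_linear` and all weights, means, and standard deviations in the usage example are made-up illustrative values, not the project's trained parameters.

```python
import math


def explain_linear(
    features: dict[str, float],
    weights: dict[str, float],
    bias: float,
    means: dict[str, float],
    stds: dict[str, float],
) -> tuple[float, list[tuple[str, float]]]:
    """Return P(AI-generated) and per-feature logit contributions, largest first."""
    # Each feature's contribution is weight * standardized value.
    contributions = {
        name: weights[name] * (features[name] - means[name]) / stds[name]
        for name in weights
    }
    logit = bias + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    # Rank by absolute size so the strongest drivers appear first.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return probability, ranked


if __name__ == "__main__":
    # Hypothetical parameters for two features, for illustration only.
    prob, ranked = explain_linear(
        features={"type_token_ratio": 0.55, "burstiness": -0.50},
        weights={"type_token_ratio": -1.2, "burstiness": -0.8},
        bias=0.1,
        means={"type_token_ratio": 0.60, "burstiness": -0.30},
        stds={"type_token_ratio": 0.10, "burstiness": 0.20},
    )
    print(f"P(AI-generated) = {prob:.2f}")
    for name, value in ranked:
        print(f"  {name}: {value:+.2f}")
```

Positive contributions push the score toward "AI-generated" and negative ones toward "human", so a flagged passage can be shown alongside the features that drove the decision. Tree-based models such as Random Forest or Gradient Boosting would need a different attribution method.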
