evalution

Here are 10 public repositories matching this topic...

gomate-community / rageval

Evaluation tools for Retrieval-augmented Generation (RAG) methods.

rag llm evalution

Updated Nov 18, 2024
Python

lemon07r / SanityBoard

Home of the SanityHarness Leaderboard website.

agent website benchmark leaderboard coding eval llm evalution coding-agent

Updated Feb 28, 2026
HTML

ZSYNOTZSH / FactualBench

The official repository for the FactualBench dataset, introduced in the paper "Exploring the Generalizability of Factual Hallucination Mitigation via Enhancing Precise Knowledge Utilization".

nlp deep-learning dataset chinese evalution

Updated Dec 30, 2025
Python

sandy1990418 / Flask-Celery-Template

A Flask-Celery integration template for distributed task processing with progress tracking, including an example that uses MMMLU for evaluation.

flask async celery jinja2-templates evalution

Updated Nov 25, 2024
Python

luckyabner / MusicRoast

An AI-powered application that provides different reviews of users' playlists across various platforms.

music ai evalution

Updated Apr 2, 2025
TypeScript

Mr-TalhaIlyas / seizurekit

A Python toolkit for the robust evaluation and analysis of seizure detection models.

detection epilepsy timeseries-analysis seizure epilepsy-monitoring epilepsy-prediction evalution timeseries-evaluation epilepsy-evaluation seizure-evaluation

Updated Jun 24, 2025
Python

Naidu-DS-2026 / News-Article-Classification-NLP

Multi-class classification of news articles using NLP techniques, TF-IDF, and Naive Bayes

nlp machine-learning logistic-regression tfidf news-classification text-classification-python navies-bayes-classifer evalution

Updated Jul 10, 2025
Jupyter Notebook

pascalrink / pppms

R package for multiplicity-adjusted bootstrap tilting confidence limits for prediction performance after model selection

bootstrap machine-learning statistics prediction biostatistics r-package evalution post-selection-performance

Updated Mar 9, 2026
R

Deepak-Manian / Multiclass-Fish-Image-Classification

Automated fish species classification using deep learning (CNN + transfer learning). Includes model training, evaluation metrics, and a Streamlit app for real-time predictions.

machine-learning deep-learning transfer-learning cnn-classification streamlit evalution

Updated Nov 25, 2025
Jupyter Notebook

ZhengtongYan / opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

llm evalution

Updated Apr 11, 2025
Python
