Add AI Evaluation #3923
|
unicorn 🦄 |
|
This entire section should be deleted. |
Thanks, fixed. |
|
The topic fills a genuine gap in the awesome ecosystem: no existing list focuses specifically on AI/LLM evaluation. The content depth is impressive. A few issues to address before this can be merged:
Non-standard list item format. Every entry uses an inline shields.io badge prefix and bold link text:
The required awesome format is simply:
Inline badges in list items fall in the same category as CI badges: they add visual noise, are a maintenance burden (star counts change constantly), and break the consistent formatting expected across awesome lists. All ~130 entries would need to be reformatted.
Dead links. A few entries point to repositories that return 404, at minimum
Entry placement. The diff shows the entry inserted between "AI in Finance" and "JAX" (alphabetically) rather than at the bottom of the Machine Learning sub-section, as required by the checklist.
Horizontal rules. The
Minor: The tagline uses "A curated list of…" which is redundant for an awesome list. Also, the
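The list-item format issue raised in this review can be checked mechanically. What follows is a minimal sketch, not the awesome-lint implementation: it assumes the plain `- [Name](url) - Description.` shape that the existing entries in this PR's diff follow, and the function name `check_item`, the regular expressions, and the `example.com` / `img.shields.io` sample URLs are all illustrative assumptions.

```python
import re
from typing import List

# Assumed shape of a plain entry: "- [Name](url) - Description."
ITEM_RE = re.compile(
    r"^- \[(?P<name>[^\]]+)\]\((?P<url>https?://[^)\s]+)\) - (?P<desc>.+)$"
)
# Inline image syntax, which is how shields.io badges are embedded: ![alt](url)
BADGE_RE = re.compile(r"!\[[^\]]*\]\([^)]*\)")
# Bold link text, written either as **[Name](url)** or [**Name**](url)
BOLD_LINK_RE = re.compile(
    r"\*\*\[[^\]]+\]\([^)]*\)\*\*|\[\*\*[^\]]+\*\*\]\([^)]*\)"
)


def check_item(line: str) -> List[str]:
    """Return the problems found in one list-item line (empty list if none)."""
    problems = []
    stripped = line.strip()
    if BADGE_RE.search(stripped):
        problems.append("inline badge")
    if BOLD_LINK_RE.search(stripped):
        problems.append("bold link text")
    match = ITEM_RE.match(stripped)
    if not problems and match is None:
        problems.append("not in '- [Name](url) - Description.' form")
    elif match and not match.group("desc").rstrip().endswith("."):
        problems.append("description does not end with a period")
    return problems


# An entry copied from this PR's diff, and a hypothetical badge-style entry.
plain = (
    "- [AI in Finance](https://github.com/georgezouq/awesome-ai-in-finance#readme)"
    " - Solving problems in finance with machine learning."
)
badged = (
    "- ![stars](https://img.shields.io/github/stars/example/tool)"
    " **[Tool](https://example.com/tool)** - Does things."
)
```

Running `check_item` over every line that starts with `- ` in a README would give a rough count of entries still to be reformatted; awesome-lint remains the authoritative check.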
AndrejOrsula
left a comment
This is a relevant topic, and your list provides great depth. While the project and repository activity/star badges are a nice visual addition and I personally favor them, they unfortunately go against the simple formatting guidelines required for list items. Additionally, the description at the very top of your README.md says it is a curated list, while the guidelines state that you should only describe the topic and not the list itself. Finally, please make sure your entry in the main repository pull request is placed at the very bottom of the category instead of the middle. Thank you.
- [H2O](https://github.com/h2oai/awesome-h2o#readme) - Open source distributed machine learning platform written in Java with APIs in R, Python, and Scala.
- [Software Engineering for Machine Learning](https://github.com/SE-ML/awesome-seml#readme) - From experiment to production-level machine learning.
- [AI in Finance](https://github.com/georgezouq/awesome-ai-in-finance#readme) - Solving problems in finance with machine learning.
+ - [AI Evaluation](https://github.com/Vvkmnn/awesome-ai-eval#readme) - Measuring reliability, accuracy, and safety of LLMs, RAG pipelines, and AI agents.
As per the requirement that you ticked but did not meet:
Your entry should be added at the bottom of the appropriate category.
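The placement requirement quoted above can be sketched as a small check. This is only an illustration under stated assumptions: sibling entries are taken to be contiguous lines at the same indentation, the helper name `is_last_in_block` is hypothetical, and the JAX line is a placeholder standing in for the entry that the review says follows the new one.

```python
def is_last_in_block(lines, entry_name):
    """Return True if the named entry is the final item of its contiguous list block."""
    marker = f"- [{entry_name}]("
    for i, line in enumerate(lines):
        body = line.lstrip()
        if not body.startswith(marker):
            continue
        indent = len(line) - len(body)
        if i + 1 < len(lines):
            following = lines[i + 1]
            following_body = following.lstrip()
            following_indent = len(following) - len(following_body)
            # Another item at the same indentation means the entry is not last.
            if following_body.startswith("- [") and following_indent == indent:
                return False
        return True
    raise ValueError(f"entry {entry_name!r} not found")


# Entries taken from the diff above; the JAX line is a placeholder, not the
# real entry text.
as_submitted = [
    "- [AI in Finance](https://github.com/georgezouq/awesome-ai-in-finance#readme) - Solving problems in finance with machine learning.",
    "- [AI Evaluation](https://github.com/Vvkmnn/awesome-ai-eval#readme) - Measuring reliability, accuracy, and safety of LLMs, RAG pipelines, and AI agents.",
    "- [JAX](https://example.com/placeholder) - Placeholder description.",
]
moved_to_bottom = [as_submitted[0], as_submitted[2], as_submitted[1]]
```

With these inputs, the submitted ordering fails the check and the reordered list passes it. A real README may nest sub-lists more deeply than this sketch handles.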
|
Joseluisantigusrosa73
…On Thursday, March 5, 2026, Andrej Orsula ***@***.***> wrote:
***@***.**** requested changes on this pull request.
This is a relevant topic, and your list provides great depth. While the
project and repository activity/star badges are a nice visual addition and
I personally favor them, they unfortunately go against the simple
formatting guidelines required for list items. Additionally, the
description at the very top of your README.md says it is a curated list,
while the guidelines state that you should only describe the topic and not
the list itself. Finally, please make sure your entry in the main
repository pull request is placed at the very bottom of the category
instead of the middle. Thank you.
------------------------------
In readme.md <#3923 (comment)>:
> @@ -416,6 +416,7 @@
- [H2O](https://github.com/h2oai/awesome-h2o#readme) - Open source distributed machine learning platform written in Java with APIs in R, Python, and Scala.
- [Software Engineering for Machine Learning](https://github.com/SE-ML/awesome-seml#readme) - From experiment to production-level machine learning.
- [AI in Finance](https://github.com/georgezouq/awesome-ai-in-finance#readme) - Solving problems in finance with machine learning.
+ - [AI Evaluation](https://github.com/Vvkmnn/awesome-ai-eval#readme) - Measuring reliability, accuracy, and safety of LLMs, RAG pipelines, and AI agents.
As per the requirement that you ticked but did not meet:
Your entry should be added at the bottom of the appropriate category.
|
|
LMArena has been renamed to Arena.ai: https://arena.ai/blog/lmarena-is-now-arena/ Otherwise most things look good. Also, a really cool and helpful list. Will add a few more to the list.
Got it, thanks for catching. Would love any and all PRs, looking forward to it. |
|
There are a few broken links: |
|
|
Dead links (404):
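A dead-link report like the one above could be reproduced with a short script. The sketch below is an assumption about how such a check might be done, not how the reviewers did it: `http_status` and `find_dead_links` are hypothetical names, it sends HEAD requests through the standard-library `urllib`, and some servers answer HEAD differently from GET, so a reported 404 is worth confirming by hand.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10.0):
    """Return the HTTP status of a HEAD request, or 0 if the host is unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a deleted repository
    except OSError:
        return 0  # DNS failure, refused connection, timeout, etc.


def find_dead_links(urls, get_status=http_status):
    """Return the URLs whose status is 404, preserving input order."""
    return [url for url in urls if get_status(url) == 404]
```

The `get_status` parameter allows a stub to be passed in for testing, so the filtering logic can be exercised without network access.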
|
|
Reviewed What looks good:
Issues to address:
Good topic with clear demand — worth fixing these before it'll be accepted. |
https://github.com/Vvkmnn/awesome-ai-eval#readme
Measuring reliability, accuracy, and safety of LLMs, RAG pipelines, and AI agents in production environments.
PRs reviewed:
By submitting this pull request I confirm I've read and complied with the below requirements 🖖
Requirements for your pull request
Add Name of List. It should not contain the word Awesome.
#readme.
Requirements for your Awesome list
awesome-lint on your list and fix the reported issues. ✔ Linting passed.
main, not master.
awesome-ai-eval.
# Awesome AI Eval.
awesome-list & awesome as GitHub topics.
Contents.