【BugFix】Fix model configuration compatibility in datasets and postprocessors#190
GaoHuaZhang wants to merge 2 commits into AISBench:master
Conversation
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request addresses compatibility issues between dataset loading and model postprocessing by updating the list of supported multimodal APIs and performing a significant cleanup of deprecated post-processing code. The changes ensure that model configurations are handled consistently, preventing failures and incorrect outputs under various settings, while also improving code maintainability.
Code Review
This pull request aims to fix compatibility issues with model configurations by updating the list of multimodal APIs and refactoring the model post-processing logic. The changes include adding a new stream API to MM_APIS and removing deprecated naive and xfinder post-processing functions. While the code removal simplifies the module, it appears to break existing unit tests which have not been updated. I've also suggested a minor formatting improvement for better code style consistency.
ais_bench/benchmark/utils/model_postprocessors.py (31-140)
The removal of naive_model_postprocess and xfinder_postprocess functions will cause existing unit tests in tests/UT/utils/test_model_postprocessors.py to fail (specifically test_naive_model_postprocess and test_xfinder_postprocess). The PR checklist indicates that tests have been updated, but this seems to have been overlooked. Please update or remove the corresponding tests to reflect these changes and ensure the test suite passes.
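A minimal sketch of the kind of guard the updated test file could use after the deletions (the stand-in module below is hypothetical; a real test would import ais_bench.benchmark.utils.model_postprocessors itself):

```python
import types

# Hypothetical stand-in for the cleaned-up module; in the real test suite
# you would instead do:
#   from ais_bench.benchmark.utils import model_postprocessors
postprocessors = types.ModuleType("model_postprocessors")

def deprecated_helpers_removed(mod) -> bool:
    """Return True when the deleted helpers are no longer exported."""
    removed = ("naive_model_postprocess", "xfinder_postprocess")
    return not any(hasattr(mod, name) for name in removed)

print(deprecated_helpers_removed(postprocessors))  # prints True
```

Such a check documents the intentional removal, while the old test_naive_model_postprocess and test_xfinder_postprocess cases should simply be deleted along with the functions they exercised.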
ais_bench/benchmark/datasets/utils/datasets.py (26-28)
The formatting of this list is inconsistent and does not follow common Python style guides like PEP 8. For better readability and consistency, when a list is split across multiple lines, it's recommended to place each item on a new line, indented.
MM_APIS = [
"ais_bench.benchmark.models.VLLMCustomAPIChat",
"ais_bench.benchmark.models.VLLMCustomAPIChatStream",
]
References
- According to PEP 8, for multi-line constructs, you can use hanging indents. When doing so, it is conventional to place the first element on a new line and indent subsequent lines to distinguish them as continuation lines. This improves readability. (link)
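For context, a registry list like MM_APIS is typically consumed via a simple membership check. The helper below is an illustrative sketch (the function name is an assumption, not the project's actual code):

```python
# Known multimodal API model types, one entry per line per PEP 8.
MM_APIS = [
    "ais_bench.benchmark.models.VLLMCustomAPIChat",
    "ais_bench.benchmark.models.VLLMCustomAPIChatStream",
]

def is_multimodal_api(model_type: str) -> bool:
    # True when the configured model type is one of the known multimodal APIs.
    return model_type in MM_APIS

print(is_multimodal_api("ais_bench.benchmark.models.VLLMCustomAPIChatStream"))  # prints True
```

Keeping each dotted path on its own line also makes future additions (such as the new stream API in this PR) show up as clean one-line diffs.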
Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.
PR Type / PR类型
Related Issue | 关联 Issue
Fixes #(issue ID / issue 编号) / Relates to #(issue ID / issue 编号)
🔍 Motivation / 变更动机
The current model configuration handling causes compatibility issues between dataset loading and model postprocessing, which leads to incorrect or failed runs under some configurations.
📝 Modification / 修改内容
- ais_bench/benchmark/datasets/utils/datasets.py: align model-configuration field usage with the latest configuration schema, avoiding key-name and default-value mismatches.
- ais_bench/benchmark/utils/model_postprocessors.py: prune legacy logic, removing deprecated or unused branches that depended on old configuration formats, and unify how the configuration is read.
📐 Associated Test Results / 关联测试结果
(If there is a CI link or a specific command, it can be added here: CI pipeline link, test commands, etc.)
This change is not expected to introduce backward-incompatible behavior for normal users, since it mainly cleans up deprecated paths and aligns with the current configuration schema.
If a downstream project directly depends on the removed legacy fields or legacy post-processing branches, it may need to update its calls into model_postprocessors to use the new compatible interface.
No known performance regressions. The removal of redundant logic in model_postprocessors.py may slightly simplify the runtime path.
🌟 Use cases (Optional) / 使用案例(可选)
✅ Checklist / 检查列表
Before PR:
After PR:
👥 Collaboration Info / 协作信息
🌟 Useful CI Command / 实用的CI命令
/gemini review
/gemini summary
/gemini help
/readthedocs build