This is a bilingual snapshot of https://read.readwise.io/new/read/01j9zv47q22z759tdw5jgbahfa, saved by the user on 2024-10-12 15:49, with bilingual support provided by Immersive Translate (沉浸式翻译).

  1. 🚨 Medical AI Research Alert!
    CliMedBench is a new benchmark designed to evaluate large language models (LLMs) in real clinical scenarios in China, using over 33,000 medical questions. It focuses on important areas like medical reasoning and knowledge application, highlighting the need for improvements in existing models. This initiative aims to enhance the practical use of AI in healthcare, moving beyond theoretical applications.
    x.com · Open Life Science AI · 2 mins
    3:47 pm

  2. 🪩 The @stateofaireport 2024 has landed!
    The @stateofaireport 2024 provides a comprehensive overview of AI research, industry trends, and safety issues. It highlights the growing competition between open-source and proprietary models, as well as significant advancements in robotics and biotech. The report also discusses regulatory challenges and the impact of AI investments on the market.
    x.com · Nathan Benaich · 5 mins
    Oct 10th
  3. ⚠️ Cubox privacy risk warning
    Cubox, a popular bookmarking tool, has serious privacy risks as it uploads users' browsing history in plain text, including all URL parameters. This means sensitive information can be exposed, making it much more dangerous than typical data monitoring. Users are advised to stop using the Cubox browser extension to protect their privacy.
    x.com · Hzao · 1 min
    Oct 6th
  4. I used Cubox for a few months before Readwise Reader came out. Once Reader launched, though, the gap in experience was huge, and I quickly switched over to Reader completely.
    The author used Cubox for a few months but switched to Readwise Reader after it was released, noting a significant improvement in experience. They believe Readwise Reader greatly enhances internet reading and improves quality of life. The author recommends trying Readwise Reader, offering a link for new users to get a free trial.
    x.com · howie.serious · 1 min left
    Oct 6th

  5. Alignment and preference-optimization techniques for large language models in 2024: PPO, DPO, SimPO, KTO, Step-DPO, MCTS-DPO, SPO
    The author discusses recent advancements in alignment techniques for large language models, focusing on methods like DPO and its variants. These methods aim to improve model performance by optimizing on preference data without the need for a separate reward model, making training more efficient. The article also highlights the growing importance of alignment in ensuring model safety and enhancing agent capabilities.
    mp.weixin.qq.com · 是念 · 2 mins
    Oct 6th

  6. If you haven’t been living under a rock, you…
    If you haven’t been living under a rock, you have already noticed that the developer community is currently super excited about DSPy. DSPy is a framework developed by Stanford NLP researchers (@lateinteraction) to help you build LLM-based applications. In contrast to similar frameworks, DSPy aims to tackle the fragility problem of developing LLM-based applications by prioritizing programming over prompting. DSPy does this by introducing the following concepts:
    - Hand-written prompts and fine-tuning are abstracted and replaced by signatures
    - Prompting techniques, such as Chain of Thought or ReAct, are abstracted and replaced by modules
    - Manual prompt engineering is automated with optimizers (teleprompters) and a DSPy Compiler
    Below, you can see what the code and information flow of a DSPy program for a naive RAG pipeline would look like. Read more on @TDataScience: https://t.co/AnrpyPaPku
    twitter.com · Leonie · 1 min
    Feb 28th

  7. How to get started with Reader
    Reader is productivity software that allows users to read, highlight, and annotate without using a mouse. It is built with power users in mind and includes many keyboard shortcuts. Reader supports documents of all kinds, including web articles, RSS feeds, email newsletters, PDFs, EPUBs, Twitter threads, Twitter Lists, and YouTube videos. It also has a browser extension that allows users to save documents and highlight the native web page. The app is updated every day based on feedback from users.
    blog.readwise.io · Daniel Doyon · 7 mins
    Feb 28th


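🚨 Medical AI Research Alert! 🚨

Can large language models revolutionize clinical practice in China?

@ECNUER presents: CliMedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models in Clinical Scenarios

Authors: Zetian Ouyang, Yishuai Qiu, Linlin Wang, @gdm3000, Ya Zhang, Yanfeng Wang, Liang He

Spotify: https://t.co/yLLgYO3cMa

YouTube: https://t.co/Yci9wts4qU

Here is why it is a game-changer: 👇🧵

#AI #MedicalAI #NLP #Bioinformatics #HealthTech #MachineLearning #Healthcare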

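2/ What is CliMedBench?

- A comprehensive benchmark containing 33,735 real-world medical questions

- Evaluates LLMs across 14 clinical scenarios

- Focuses on key aspects such as medical reasoning, factual consistency, and knowledge application

This ensures a thorough evaluation of medical LLMs in actual medical practice.

#MedicalBenchmark #ClinicalQA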


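3/ Key features of CliMedBench

- 14 real-world scenarios drawn from real electronic health records (EHRs)

- 7 pivotal dimensions, including clinical question answering (QA), reasoning, knowledge application, and summarization

- Data from top-tier hospitals in China

This benchmark raises the bar for evaluating medical LLMs!

#ClinicalAI #EHR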


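4/ How do the models perform?

- Evaluated models include GPT-4, ChatGPT, ERNIE-Bot, Qwen, and others.

- Even GPT-4 scores only 69.2%, which highlights how challenging medical reasoning tasks are.

- Chinese LLMs underperform, which points to broad room for improvement

#LLMPerformance #AIinHealthcare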
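
A score such as the 69.2% above is typically an accuracy aggregated over many questions. The following is a minimal Python sketch of per-scenario and overall accuracy, assuming multiple-choice items only. The record keys (`scenario`, `gold`, `pred`), the scenario names, and the function name are hypothetical and are not taken from CliMedBench, whose actual data format and metrics may differ.

```python
from collections import defaultdict


def accuracy_by_scenario(records):
    """Return {scenario: accuracy} plus an "overall" entry.

    Each record is a dict with hypothetical keys (not the benchmark's schema):
      "scenario": name of the clinical scenario
      "gold":     reference answer label, e.g. "A"
      "pred":     the model's answer label
    """
    if not records:
        return {}

    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        scenario = record["scenario"]
        total[scenario] += 1
        # Compare labels case-insensitively and ignore surrounding whitespace.
        if record["pred"].strip().upper() == record["gold"].strip().upper():
            correct[scenario] += 1

    scores = {name: correct[name] / total[name] for name in total}
    # Overall accuracy is weighted by question count, not averaged per scenario.
    scores["overall"] = sum(correct.values()) / sum(total.values())
    return scores


# Made-up sample records, for illustration only.
sample = [
    {"scenario": "diagnosis", "gold": "A", "pred": "A"},
    {"scenario": "diagnosis", "gold": "B", "pred": "C"},
    {"scenario": "summary", "gold": "D", "pred": " d "},
]

for name, score in accuracy_by_scenario(sample).items():
    print(f"{name}: {score:.1%}")
```

Reporting the per-scenario breakdown alongside the overall number makes it easier to see which clinical scenarios pull a model's score down. Free-text tasks such as summarization would need a different metric than exact label matching.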


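5/ Why does CliMedBench matter?

- It is the first benchmark based on real clinical practice rather than exam data

- Covers 19 medical departments, including neurosurgery and gastroenterology

- Helps LLMs move from theoretical models to practical medical applications

#ClinicalMedicine #AIApplications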


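6/ University collaboration

- Developed jointly by East China Normal University and Shanghai Jiao Tong University, in collaboration with the Hasso Plattner Institute

- Brings together expertise from China's top medical institutions

#UniversityCollaboration #MedicalResearch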

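7/ What's next?

- Medical LLMs need better input capacity and reasoning abilities

- CliMedBench opens the door to future improvements in LLM-based clinical diagnosis and medical decision support

- A step forward for AI-driven healthcare innovation

#FutureOfAI #MedicalDiagnosis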
