KBLaM: Microsoft's Breakthrough in Scalable, Knowledge-Integrated Large Language Models

Large language models (LLMs) are powerful, but they struggle to integrate external knowledge efficiently. Microsoft Research has introduced the Knowledge Base-Augmented Language Model (KBLaM), a novel approach that integrates structured knowledge bases directly into pre-trained LLMs, offering a scalable, efficient, and interpretable alternative to traditional methods like fine-tuning, Retrieval-Augmented Generation (RAG), and in-context learning. This blog post summarizes the key features and benefits of KBLaM.

Overcoming the Limitations of Existing Knowledge Integration Methods

Existing approaches for integrating external knowledge into LLMs have significant drawbacks:

Fine-tuning: Requires costly retraining of the entire model for every knowledge update.
Retrieval-Augmented Generation (RAG): Introduces separate retrieval modules, increasing complexity and preventing end-to-end training. RAG also appends retrieved document chunks to prompts, which lengthens the context the model must process.
In-Context Learning: Places the knowledge directly in the prompt, so computational cost grows quadratically with context length, making it inefficient for large knowledge bases.

KBLaM is designed to overcome these limitations by directly embedding structured knowledge within the model’s architecture.
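To make the quadratic scaling of in-context learning concrete, the following is a minimal arithmetic sketch. It assumes one attention head in one layer and counts every query-key pair; causal masking roughly halves the count but leaves the growth quadratic. The function name and the token counts are illustrative choices, not figures from the source.

```python
def attention_score_count(question_tokens: int, knowledge_tokens: int) -> int:
    """Query-key scores computed when the knowledge text is pasted into the prompt.

    Assumption: full self-attention over the whole prompt, one head, one layer.
    """
    total_tokens = question_tokens + knowledge_tokens
    return total_tokens * total_tokens


# Illustrative sizes only: a 50-token question with growing amounts of knowledge text.
for knowledge_tokens in (1_000, 10_000, 100_000):
    scores = attention_score_count(50, knowledge_tokens)
    print(f"{knowledge_tokens:>7} knowledge tokens -> {scores:,} attention scores")
```

Each tenfold increase in knowledge text multiplies the work by roughly one hundred, which is why pasting a large knowledge base into the prompt quickly becomes impractical.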

KBLaM: A Novel Approach to Knowledge Integration

KBLaM uses a structured knowledge base made up of (entity, property, value) triples to consolidate and represent the data. The process comprises a three-step pipeline:

Knowledge Encoding: Each knowledge triple is transformed into a key-value vector pair using a pre-trained sentence encoder with lightweight linear adapters.
- The key vector (entity name and property) acts as “index information.”
- The value vector captures the corresponding property value.
Integration with LLMs: The key-value pairs are injected into the model’s attention layers using a specialized rectangular attention structure.
- This structure allows language tokens (user questions) to attend to all knowledge tokens.
- Knowledge tokens do not attend to each other or back to the language tokens.
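- Because the attention matrix is rectangular rather than square, computational cost grows linearly with the number of knowledge tokens instead of quadratically.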
Efficient Knowledge Retrieval: The model learns to dynamically retrieve relevant knowledge tokens during inference, eliminating the need for a separate retrieval step.
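The pipeline above can be sketched in a few lines of NumPy. This is a toy, single-head illustration under stated assumptions, not Microsoft's implementation: `toy_embed` is a hypothetical stand-in for the pre-trained sentence encoder, the adapter matrices `w_key` and `w_value` are random rather than trained, and all dimensions are arbitrary. What it does show faithfully is the shape of the computation: N prompt tokens attend to M knowledge tokens plus earlier prompt tokens, giving an N × (M + N) score matrix, while knowledge tokens never act as queries.

```python
import zlib
import numpy as np


def toy_embed(text, dim):
    """Hypothetical stand-in for a sentence encoder: a fixed pseudo-random vector per string."""
    seed = zlib.crc32(text.encode("utf-8"))
    return np.random.default_rng(seed).standard_normal(dim)


def encode_triples(triples, w_key, w_value):
    """Turn each (entity, property, value) triple into one key vector and one value vector."""
    embed_dim = w_key.shape[0]
    keys = np.stack([toy_embed(f"{e} {p}", embed_dim) @ w_key for e, p, _ in triples])
    values = np.stack([toy_embed(v, embed_dim) @ w_value for _, _, v in triples])
    return keys, values


def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


def rectangular_attention(q, k_prompt, v_prompt, k_kb, v_kb):
    """Prompt queries attend to all knowledge tokens and, causally, to prompt tokens.

    q, k_prompt, v_prompt: (N, d); k_kb, v_kb: (M, d).
    Returns the attended output (N, d) and the weights (N, M + N).
    """
    n, d = q.shape
    m = k_kb.shape[0]
    keys = np.concatenate([k_kb, k_prompt], axis=0)
    values = np.concatenate([v_kb, v_prompt], axis=0)
    scores = q @ keys.T / np.sqrt(d)  # (N, M + N): linear in M
    # Every prompt token sees every knowledge token; prompt-to-prompt is causal.
    causal = np.tril(np.ones((n, n), dtype=bool))
    mask = np.concatenate([np.ones((n, m), dtype=bool), causal], axis=1)
    weights = softmax(np.where(mask, scores, -np.inf))
    return weights @ values, weights


rng = np.random.default_rng(0)
embed_dim, d_model, n_prompt = 16, 8, 4
triples = [
    ("KBLaM", "developer", "Microsoft Research"),
    ("KBLaM", "attention structure", "rectangular attention"),
]
w_key = rng.standard_normal((embed_dim, d_model))    # untrained adapter (assumption)
w_value = rng.standard_normal((embed_dim, d_model))  # untrained adapter (assumption)
kb_keys, kb_values = encode_triples(triples, w_key, w_value)

q = rng.standard_normal((n_prompt, d_model))
k_prompt = rng.standard_normal((n_prompt, d_model))
v_prompt = rng.standard_normal((n_prompt, d_model))
out, weights = rectangular_attention(q, k_prompt, v_prompt, kb_keys, kb_values)
print(weights.shape)  # (4, 6): 4 prompt tokens by 2 knowledge + 4 prompt tokens
```

Adding another triple adds one column to the score matrix rather than a row and a column, which is where the linear scaling in knowledge base size comes from.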

Key Advantages of KBLaM

KBLaM offers several advantages over existing methods:

Scalability: Achieves linear scaling with the size of the knowledge base, enabling it to handle much larger knowledge repositories than traditional in-context learning. KBLaM can process over 10,000 knowledge triples on a single GPU, equivalent to ~200,000 text tokens.
Efficiency: Rectangular attention significantly reduces computational cost compared to quadratic scaling approaches. Time to first token is significantly lower than with RAG-like approaches.
Dynamic Updates: Allows for dynamic updates to the knowledge base without requiring retraining or re-computation of the entire knowledge base.
Interpretability: Provides insights into how the model utilizes knowledge tokens through attention weights, making the process more transparent than in-context learning.
Reliability: Reduces hallucinations by learning when not to answer. If the information a question requires is missing from the knowledge base, the model declines to answer.
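The dynamic-update and interpretability points can be illustrated with a small sketch. It rests on several assumptions: `toy_encode` is a hypothetical stand-in for the sentence encoder and adapters (it returns a fixed unit-length vector per string), `KnowledgeStore` is an illustrative class rather than part of KBLaM's released code, the sample facts are made up, and the query is simply the key of the fact being asked about, whereas in the real model queries come from the trained LLM.

```python
import zlib
import numpy as np


def toy_encode(text, dim=8):
    """Hypothetical encoder stand-in: a fixed unit-length vector per string."""
    seed = zlib.crc32(text.encode("utf-8"))
    vec = np.random.default_rng(seed).standard_normal(dim)
    return vec / np.linalg.norm(vec)


class KnowledgeStore:
    """Holds one (key vector, value vector) pair per (entity, property)."""

    def __init__(self):
        self.entries = {}
        self.encode_calls = 0

    def upsert(self, entity, prop, value):
        # Each triple is encoded on its own, so a change touches a single entry.
        key_vec = toy_encode(f"{entity} {prop}")
        value_vec = toy_encode(value)
        self.entries[(entity, prop)] = (key_vec, value_vec)
        self.encode_calls += 1

    def attention_over_knowledge(self, query_vec):
        """Softmax attention weights of one query over all knowledge keys."""
        names = list(self.entries)
        keys = np.stack([self.entries[name][0] for name in names])
        scores = keys @ query_vec
        weights = np.exp(scores - scores.max())
        return names, weights / weights.sum()


store = KnowledgeStore()
store.upsert("printer-42", "location", "room 3B")   # made-up sample facts
store.upsert("printer-42", "status", "online")
store.upsert("badge reader", "location", "lobby")
untouched = store.entries[("badge reader", "location")]

# Update one fact: only that entry is re-encoded, the rest are left as they were.
store.upsert("printer-42", "status", "offline")

# Inspect which knowledge token receives the most attention for a query.
query = toy_encode("printer-42 status")
names, weights = store.attention_over_knowledge(query)
top = names[int(np.argmax(weights))]
print(top, np.round(weights, 3))
```

Because the weights form a distribution over individual triples, the highest-weighted entry can be read off directly, which is the kind of transparency that is harder to get when knowledge is buried in a long prompt.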

The Future of Knowledge-Augmented AI

KBLaM represents a significant advancement in knowledge-augmented AI, paving the way for AI systems that can provide more accurate, adaptable, and deeply integrated knowledge-driven responses. This approach has the potential to transform various fields, including medicine, finance, and scientific research. Microsoft Research has released KBLaM’s code and datasets to the research community and plans to integrate it with the Hugging Face transformers library.

While the current model is trained primarily on factual question-answer pairs, future research will focus on expanding its capabilities across more complex reasoning tasks and diverse knowledge domains.

Source: Microsoft Research Blog
