Mixture of Experts (MoE) Model

From GM-RKB

A Mixture of Experts (MoE) Model is a machine learning model that uses multiple trained experts (learners) to divide the problem space into homogeneous regions, with a gating network selecting which expert to use for each input region.



References

2024-12-27

[1] https://www.ibm.com/think/topics/mixture-of-experts
[2] https://developer.nvidia.com/blog/applying-mixture-of-experts-in-llm-architectures/
[3] https://cameronrwolfe.substack.com/p/conditional-computation-the-birth
[4] https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-mixture-of-experts
[5] https://smt.readthedocs.io/en/latest/_src_docs/applications/moe.html
[6] https://www.techtarget.com/searchenterpriseai/feature/Mixture-of-experts-models-explained-What-you-need-to-know
[7] https://datasciencedojo.com/blog/mixture-of-experts/
[8] https://huggingface.co/blog/moe
[9] https://en.wikipedia.org/wiki/Mixture_of_experts
[10] https://zilliz.com/learn/what-is-mixture-of-experts

2024

2022

  • (Wikipedia, 2022) ⇒ https://en.wikipedia.org/wiki/Mixture_of_experts Retrieved:2022-3-4.
    • Mixture of experts (MoE) refers to a machine learning technique where multiple experts (learners) are used to divide the problem space into homogeneous regions. An example from the computer vision domain is combining a neural network model for human detection with another for pose estimation. If the output is conditioned on multiple levels of probabilistic gating functions, the mixture is called a hierarchical mixture of experts.

      A gating network decides which expert to use for each input region. Learning thus consists of 1) learning the parameters of individual learners and 2) learning the parameters of the gating network.
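
The two sets of parameters described above (those of the individual learners and those of the gating network) can be illustrated with a minimal sketch. It is not taken from the quoted source: the linear experts, the softmax gate, the class name `MixtureOfExperts`, and the methods `gate`, `predict`, and `hard_predict` are all assumptions chosen for illustration, and the weights are random rather than learned.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dot product of weights and x, plus a bias term."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


class MixtureOfExperts:
    """Toy MoE (hypothetical): linear experts mixed by a softmax gate."""

    def __init__(self, n_experts, n_features, seed=0):
        rng = random.Random(seed)

        def init():
            return [rng.uniform(-1.0, 1.0) for _ in range(n_features)]

        # 1) Parameters of the individual learners (experts).
        self.expert_w = [init() for _ in range(n_experts)]
        self.expert_b = [0.0] * n_experts
        # 2) Parameters of the gating network.
        self.gate_w = [init() for _ in range(n_experts)]
        self.gate_b = [0.0] * n_experts

    def gate(self, x):
        """Return one non-negative weight per expert, summing to 1."""
        scores = [linear(w, b, x) for w, b in zip(self.gate_w, self.gate_b)]
        return softmax(scores)

    def expert_outputs(self, x):
        """Evaluate every expert on the same input."""
        return [linear(w, b, x) for w, b in zip(self.expert_w, self.expert_b)]

    def predict(self, x):
        """Soft mixture: gate-weighted sum of all expert outputs."""
        g = self.gate(x)
        return sum(gi * oi for gi, oi in zip(g, self.expert_outputs(x)))

    def hard_predict(self, x):
        """Hard routing: use only the expert with the largest gate weight."""
        g = self.gate(x)
        k = max(range(len(g)), key=g.__getitem__)
        return self.expert_outputs(x)[k]


moe = MixtureOfExperts(n_experts=3, n_features=2)
x = [0.5, -1.2]
gate_weights = moe.gate(x)
soft = moe.predict(x)
hard = moe.hard_predict(x)
```

In this sketch `predict` blends all experts, while `hard_predict` matches the reading in which the gate picks a single expert per input region. Training both parameter sets (for example by gradient descent or EM) is left out.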

2021

2018

2017

2016

2014