DeepSeek

DeepSeek is a Chinese AI company that develops large language models.

Context:
- It can perform Large Language Model Development through model architecture research, efficient training methods, and resource optimization.
- It can enable Open Source Model Distribution through model releases and code sharing.
- It can support Artificial General Intelligence Research through foundational research and algorithmic innovation.
- It can maintain Competitive Model Performance through parameter optimization and benchmark testing.
- It can handle Resource Efficient Training through computing infrastructure and training optimization.
- ...
- It can range from being a Research Organization to being a Technology Provider, depending on its operational focus.
- It can range from being a Model Developer to being an AGI Research Institute, depending on its research scope.
- ...
- It can integrate with High-Flyer Company for computational resources and funding support.
- It can connect to Academic Research Community for benchmark evaluation and performance validation.
- It can support Open Source Community for model distribution and code collaboration.
- ...
Examples:
- DeepSeek (2023-05), during company formation from High-Flyer Company.
- DeepSeek (2023-11-02), during first model release with DeepSeek Coder.
- DeepSeek (2023-11-29), during language model launch with DeepSeek LLM and DeepSeek Chat.
- DeepSeek (2024-05), during market disruption with DeepSeek-V2.
- DeepSeek (2024-11), during reasoning model release with DeepSeek R1-Lite-Preview.
- DeepSeek (2024-12), during efficient training achievement with DeepSeek-V3.
- DeepSeek (2025-01-20), during competitive milestone with DeepSeek-R1.
- ...
Counter-Examples:
- MiniMax, which focuses on multimodal capabilities.
- SenseTime, which maintains closed source models.
- OpenAI, which charges 7.50 USD per million input tokens, versus 0.14 USD for DeepSeek.
See: High-Flyer Company, Chinese Technology Company, AI Research Organization, Open Source AI, Large Language Model Developer, Hangzhou Technology Company.
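
The price gap cited in the OpenAI counter-example can be made concrete with a little arithmetic. The sketch below uses only the two per-million-token input prices quoted there; the function and variable names are illustrative, and the 10-million-token workload is an assumed figure chosen for the example, not a number from the source.

```python
# Prices quoted in the Counter-Examples (USD per million input tokens).
OPENAI_USD_PER_M_INPUT = 7.50
DEEPSEEK_USD_PER_M_INPUT = 0.14


def input_cost_usd(tokens: int, usd_per_million: float) -> float:
    """Cost of processing `tokens` input tokens at a per-million-token price."""
    return tokens / 1_000_000 * usd_per_million


# How many times more expensive the higher quoted price is.
price_ratio = OPENAI_USD_PER_M_INPUT / DEEPSEEK_USD_PER_M_INPUT

# Hypothetical workload size, chosen only for illustration.
example_tokens = 10_000_000
openai_cost = input_cost_usd(example_tokens, OPENAI_USD_PER_M_INPUT)
deepseek_cost = input_cost_usd(example_tokens, DEEPSEEK_USD_PER_M_INPUT)

print(f"price ratio: {price_ratio:.1f}x")
print(f"10M input tokens: {openai_cost:.2f} USD vs {deepseek_cost:.2f} USD")
```

At the quoted prices the ratio comes out to roughly 54 to 1. The comparison covers input tokens only, since that is the only price the entry states.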

References

2024

(Wikipedia, 2024) ⇒ https://en.wikipedia.org/wiki/High-Flyer_(company)#DeepSeek Retrieved:2024-12-27.
- In April 2023, High-Flyer announced it would form a new research body to explore the essence of artificial general intelligence. However it would not be used to perform stock trading.^[1] This organization would be called DeepSeek.^[2]
- In late 2023, DeepSeek released an open source LLM named DeepSeek after the organization name.^[3]
- In June 2024, DeepSeek V2 was launched. Financial Times reported that it was cheaper than its peers with a price of 2 RMB for every million output tokens. University of Waterloo Tiger Lab’s leaderboard ranked DeepSeek-V2 seventh on its LLM ranking.^[4]
- In November 2024, a preview of DeepSeek R1-Lite was released which claimed to have exceeded the performance of OpenAI o1.^[5]
- In December 2024, DeepSeek V3 was launched. It came with 671 billion parameters and trained in around two months at a cost of US$5.58 million using significantly less resources compared to its peers. It was trained on a dataset of 14.8 trillion tokens. Benchmark tests showed it outperformed Llama 3.1 and Qwen 2.5 while matching GPT-4o and Claude 3.5 Sonnet.^[6]^[7]^[8]

[1] "[Exclusive] Chinese Quant Hedge Fund High-Flyer Won't Use AGI to Trade Stocks, MD Says" (in en). https://www.yicaiglobal.com/news/exclusive-chinese-quant-fund-high-flyer-will-not-use-agi-to-trade-stocks-managing-director-says. Retrieved 2023-12-31.

[2] Ottinger, Lily (December 9, 2024). "Deepseek: From Hedge Fund to Frontier Model Maker" (in en). https://www.chinatalk.media/p/deepseek-from-hedge-fund-to-frontier. Retrieved 2024-12-27.

[3] Se, Ksenia (August 28, 2024). "Inside DeepSeek Models" (in en). https://www.turingpost.com/p/deepseek. Retrieved 2024-11-26.

[4] Financial Times (citation text missing in the source).

[5] Franzen, Carl (2024-11-20). "DeepSeek’s first reasoning model R1-Lite-Preview turns heads, beating OpenAI o1 performance" (in en-US). https://venturebeat.com/ai/deepseeks-first-reasoning-model-r1-lite-preview-turns-heads-beating-openai-o1-performance/. Retrieved 2024-11-26.

[6] Jiang, Ben (2024-12-27). "Chinese start-up DeepSeek’s new AI model outperforms Meta, OpenAI products" (in en). https://www.scmp.com/tech/tech-trends/article/3292507/chinese-start-deepseek-launches-ai-model-outperforms-meta-openai-products. Retrieved 2024-12-27.

[7] Wiggers, Kyle (26 December 2024). "DeepSeek's new AI model appears to be one of the best 'open' challengers yet". https://techcrunch.com/2024/12/26/deepseeks-new-ai-model-appears-to-be-one-of-the-best-open-challengers-yet/.

[8] Sharma, Shubham (26 December 2024). "DeepSeek-V3, ultra-large open-source AI, outperforms Llama and Qwen on launch". https://venturebeat.com/ai/deepseek-v3-ultra-large-open-source-ai-outperforms-llama-and-qwen-on-launch/.