VentureBeat AI

Liquid AI's smallest model yet LFM2.5-230M beats models 4X its size at data extraction, can run 'anywhere'

6 min read
#llm #deployment #inference
Level: Intermediate
For: AI Engineers
TL;DR

Liquid AI has released LFM2.5-230M, its smallest language model to date: a 230-million-parameter foundation model designed for on-device agentic workflows. It outperforms models 4X its size at data extraction and can run on smartphones, laptops, and robots. Built on the LFM2 architecture, it reaches high inference speeds without a large memory overhead. With a memory footprint under 400MB, it decodes at 213 tokens per second on a Samsung Galaxy S25 Ultra and 42 tokens per second on a Raspberry Pi 5. For engineers building AI systems, this means complex workflows can run on edge devices without heavy compute or a persistent cloud connection.

⚡ Key Takeaways

  • LFM2.5-230M, a 230-million-parameter model, outperforms larger models such as Alibaba's Qwen3.5-0.8B and Google's Gemma 3 1B at data extraction.
  • The LFM2 architecture enables high inference speeds without massive memory overhead, making it suitable for edge devices.
  • The model has a memory footprint of under 400MB and achieves decode speeds of 213 tokens per second on a Samsung Galaxy S25 Ultra and 42 tokens per second on a Raspberry Pi 5.
  • The model supports a 32K context window, enough to ingest substantial documents or continuous streams of robotic telemetry.
  • The model operates under a dual-use commercial license, free for individuals and companies generating less than $10 million in annual revenue.
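
The footprint and speed figures above can be sanity-checked with simple arithmetic. The sketch below is a back-of-envelope calculation, not a measurement: the parameter count and decode speeds come from this summary, while the bytes-per-parameter options and the 256-token output length are assumptions chosen for illustration. The summary does not say which numeric precision the sub-400MB footprint refers to.

```python
# Back-of-envelope estimates only. PARAMS and the decode speeds are taken
# from the summary; the precisions listed here are illustrative assumptions.
PARAMS = 230_000_000

BYTES_PER_PARAM = {"fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}

DECODE_TOKENS_PER_SECOND = {
    "Samsung Galaxy S25 Ultra": 213.0,
    "Raspberry Pi 5": 42.0,
}


def weight_megabytes(params: int, bytes_per_param: float) -> float:
    """Approximate size of the weights alone, in MB (10^6 bytes)."""
    return params * bytes_per_param / 1e6


def decode_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady decode rate (ignores prefill)."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    for name, size in BYTES_PER_PARAM.items():
        print(f"{name:>9}: ~{weight_megabytes(PARAMS, size):.0f} MB of weights")
    for device, tps in DECODE_TOKENS_PER_SECOND.items():
        print(f"{device}: ~{decode_seconds(256, tps):.1f} s to decode 256 tokens")
```

At 2 bytes per parameter the weights alone would come to about 460 MB, so a footprint under 400MB would imply a lower-precision format. The decode estimate leaves out prompt processing, which matters when a long document fills much of the 32K context.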
💡 Why It Matters

LFM2.5-230M lets engineers run complex workflows on edge devices without heavy compute or a persistent cloud connection. That can make AI deployments cheaper and more efficient, particularly in industries where data extraction and local processing are critical.

✅ Practical Steps

  1. Evaluate the LFM2.5-230M model for use in data extraction and local deployment on edge devices.
  2. Consider the LFM2 architecture for building lightweight data extraction pipelines and autonomous edge systems.
  3. Assess the model's performance on your target devices, such as smartphones, laptops, and robots, to determine its suitability for particular use cases.
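
One way to approach the evaluation steps above is a small harness that scores extracted fields against known answers and times each call. This is a minimal sketch under stated assumptions: `generate` is a hypothetical callable that wraps whatever runtime you use to run the model, `stub_generate` is a stand-in so the example runs without a model, and the invoice cases are made-up test data, not results from LFM2.5-230M.

```python
import json
import time
from typing import Callable


def parse_json_or_empty(text: str) -> dict:
    """Parse model output as a JSON object; treat anything else as no fields."""
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return {}
    return value if isinstance(value, dict) else {}


def field_accuracy(predicted: dict, expected: dict) -> float:
    """Fraction of expected fields whose predicted value matches exactly."""
    if not expected:
        return 1.0
    hits = sum(1 for k, v in expected.items() if k in predicted and predicted[k] == v)
    return hits / len(expected)


def evaluate(generate: Callable[[str], str], cases: list[tuple[str, dict]]) -> dict:
    """Run every (prompt, expected) case and report accuracy and latency."""
    scores = []
    start = time.perf_counter()
    for prompt, expected in cases:
        predicted = parse_json_or_empty(generate(prompt))
        scores.append(field_accuracy(predicted, expected))
    elapsed = time.perf_counter() - start
    count = len(cases)
    return {
        "mean_field_accuracy": sum(scores) / count if count else 0.0,
        "seconds_per_case": elapsed / count if count else 0.0,
    }


def stub_generate(prompt: str) -> str:
    # Hypothetical stand-in for a real on-device model call.
    return '{"invoice_id": "A-17", "total": "42.00"}'


# Made-up cases for illustration only.
CASES = [
    ("Invoice A-17, total 42.00", {"invoice_id": "A-17", "total": "42.00"}),
    ("Invoice B-2, total 42.00", {"invoice_id": "B-2", "total": "42.00"}),
]

if __name__ == "__main__":
    print(evaluate(stub_generate, CASES))
```

Running the same cases on each target device, with `generate` pointed at the real model, gives comparable accuracy and latency numbers per device. Exact-match scoring is strict; a real evaluation may need normalization for dates, currency, and whitespace.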

