RxQwen-Micro-2B Chat (Supervised) Demo

Demo for the RxQwen Micro dense model, based on Qwen3-1.7B combined with the RxT Attention-Based Memory System and trained for conversations.

Limitations

The supervised version of the model is still at an intermediate stage and will be further improved in the Direct Memory and Preference Optimization (DMPO) stage (the demo will be updated continuously).

RxQwen-Nano Chat

Query

Thinking Mode: auto, extended, fast

Temperature: 0.1 to 2

Top-p: 0.1 to 1

Pricing / Cost Calculation

As a reference, consider the OpenRouter pricing for Qwen3-8B (relevant for the upcoming bigger models, since the smallest models such as 0.6B are generally not hosted):

Input Tokens: $0.05 / 1M
Output Tokens: $0.4 / 1M
Cached Input Tokens: not available, but assume $0.02 / 1M (it's full GQA, so I think the cache is too big for $0.01)

For our RxQwen version, it would be:

Input Tokens: free
Output Tokens: could be $0.3 / 1M or even $0.2 / 1M, since the computational cost is lower: the KV-cache covers only a single interaction. Let's take $0.3 as a compromise.
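
To make the comparison concrete, here is a minimal Python sketch that applies the prices above to a multi-turn conversation. It rests on assumptions that are not stated in the text: the stateless Qwen3-8B baseline is assumed to resend the full conversation history as input on every turn (billed at the hypothetical $0.02 cached rate, or at the full input rate if caching is disabled), while RxQwen is billed for output tokens only. The function names and the turn and token counts are hypothetical and chosen for illustration.

```python
# Prices in USD per 1M tokens, taken from the lists above.
QWEN_INPUT = 0.05
QWEN_CACHED_INPUT = 0.02  # assumed value from the text, not a listed price
QWEN_OUTPUT = 0.40
RXQWEN_INPUT = 0.0        # "free" in the text
RXQWEN_OUTPUT = 0.30      # the compromise value from the text


def stateless_cost(turns, query_tokens, answer_tokens, use_cache=True):
    """Cost in USD when every turn resends the whole history (assumption)."""
    history_price = QWEN_CACHED_INPUT if use_cache else QWEN_INPUT
    total = 0.0
    history = 0  # tokens accumulated from previous turns
    for _ in range(turns):
        total += history * history_price      # re-processed history
        total += query_tokens * QWEN_INPUT    # new query is never cached
        total += answer_tokens * QWEN_OUTPUT  # generated answer
        history += query_tokens + answer_tokens
    return total / 1_000_000


def rxqwen_cost(turns, query_tokens, answer_tokens):
    """Cost in USD when only the current interaction is billed."""
    per_turn = query_tokens * RXQWEN_INPUT + answer_tokens * RXQWEN_OUTPUT
    return turns * per_turn / 1_000_000


if __name__ == "__main__":
    # Hypothetical conversation: 50 turns, 200-token queries, 500-token answers.
    for cached in (True, False):
        cost = stateless_cost(50, 200, 500, use_cache=cached)
        print(f"Qwen3-8B (cache={cached}): ${cost:.6f}")
    print(f"RxQwen: ${rxqwen_cost(50, 200, 500):.6f}")
```

Under these assumptions the baseline cost grows quadratically with the number of turns, because the history is billed again on each turn, while the RxQwen cost grows linearly.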