License: mit
Key Capabilities & Highlights
- Deep Chain-of-Thought (CoT) Reasoning: Automatically processes queries inside <reasoning> tags before formulating its final output inside <answer> tags, mirroring modern advanced reasoning models.
- Robust Agentic Tool Calling: Native capability to parse complex system prompts, invoke multiple external APIs/functions in structured native patterns, and smoothly digest responses to fulfill user objectives.
- Premium Trilingual Support: Engineered to understand native Sinhala and Tamil characters and switch effortlessly between English and native languages.
- Lightweight & High Efficiency: Fine-tuned with Unsloth's optimized kernels, preserving high numerical precision while keeping the memory footprint low.
Technical Architecture & Training Details
- Base Model: Chat2Find-CPT (Continued Pre-trained Model based on Qwen3.5-7B)
- Maximum Sequence Length: 262,144 tokens (Native Context Window)
- Vocabulary: Highly optimized multilingual vocabulary supporting South Asian Unicode blocks.
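The context window above can be turned into a simple token budget check. This is a minimal sketch, assuming the 262,144-token limit covers the prompt and the newly generated tokens together (the usual convention for decoder-only models, but not stated in this card). The helper names `max_prompt_tokens` and `fits_context` are hypothetical.

```python
# Native context window stated in the model card (262,144 = 2**18 tokens).
MAX_SEQUENCE_LENGTH = 262_144


def max_prompt_tokens(max_new_tokens: int, limit: int = MAX_SEQUENCE_LENGTH) -> int:
    """Return how many prompt tokens remain after reserving room for generation."""
    if max_new_tokens < 0 or max_new_tokens > limit:
        raise ValueError("max_new_tokens must be between 0 and the context limit")
    return limit - max_new_tokens


def fits_context(prompt_tokens: int, max_new_tokens: int,
                 limit: int = MAX_SEQUENCE_LENGTH) -> bool:
    """Check that prompt plus requested generation stays within the window."""
    return prompt_tokens + max_new_tokens <= limit


if __name__ == "__main__":
    # With max_new_tokens=1024, the prompt may use up to 261,120 tokens.
    print(max_prompt_tokens(1024))
```

In practice, `prompt_tokens` would come from the length of the tokenized prompt.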
Training Dataset
The model was fine-tuned on the premium Chat2Find Unified Reasoning & Tool Dataset (comprising 279,260 curated records):
Language distribution:
- Tamil: 45%
- Sinhala: 36%
- English: 18%

Task distribution:
- General SFT Tasks: 78.4%
- Multi-Turn Agentic Chat: 21.6%
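The percentages can be converted into approximate record counts. The sketch below assumes the first three figures are a language split and the last two a task-type split of the same 279,260 records. Because the published percentages are rounded, the counts are estimates only. The helper name `approximate_counts` is hypothetical.

```python
TOTAL_RECORDS = 279_260  # dataset size stated in the model card

# Shares as published; treated here as two independent breakdowns (an assumption).
language_share = {"Tamil": 0.45, "Sinhala": 0.36, "English": 0.18}
task_share = {"General SFT Tasks": 0.784, "Multi-Turn Agentic Chat": 0.216}


def approximate_counts(shares: dict[str, float], total: int = TOTAL_RECORDS) -> dict[str, int]:
    """Convert fractional shares into rounded record counts."""
    return {name: round(total * share) for name, share in shares.items()}


if __name__ == "__main__":
    for label, shares in (("language", language_share), ("task", task_share)):
        print(label, approximate_counts(shares))
```

The language shares add up to 99%, so roughly 1% of the records are not attributed to any language by these figures, presumably because of rounding.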
Prompt Format
To get the best out of Chat2Find-Instruct, use the native ChatML template structure. Below is the standard template formatting for system instructions, reasoning, and answers:
```html
<|im_start|>system
You are a helpful assistant. Today's date is 2026-05-30. The user is located in Sri Lanka. Provide your reasoning inside <reasoning> tags and the final answer inside <answer> tags. You have access to tools. For any question that would benefit from up-to-date, real-time, or detailed external information, or for information that can change over time, prefer using the search_internet tool. When the user asks about specific individuals, companies, products, organizations, brands, or named entities, always use the search_internet tool to get accurate and current information. If no other specific tool is relevant to the user's query, always fall back to the search_internet tool.<|im_end|>
<|im_start|>user
Compare the capital cities of Sri Lanka and India.<|im_end|>
<|im_start|>assistant
<reasoning>
The user is asking for a comparison between the capital cities of Sri Lanka and India.
Sri Lanka has a dual-capital system: Sri Jayawardenepura Kotte (administrative) and Colombo (commercial). India has a single capital: New Delhi.
I should compare their governance roles, historical context, and economic functions in a structured format.
</reasoning>
<answer>
Here is a comprehensive comparison between the capital cities of Sri Lanka and India, detailing their unique administrative, political, and commercial roles.

### 1. Administrative vs. Commercial Capitals

#### Sri Lanka: Sri Jayawardenepura Kotte & Colombo
Sri Lanka utilizes a dual-capital system to separate commercial congestion from legislative and administrative governance:
* **Sri Jayawardenepura Kotte (Administrative & Legislative Capital):** Located as a suburb just east of the commercial center, Kotte has been the official capital since 1982. It houses the Parliament of Sri Lanka, major ministries, and administrative bodies.
* **Colombo (Commercial & Judicial Capital):** Colombo remains the largest city, the economic backbone, and the judicial center housing the Supreme Court. It remains the face of Sri Lankan trade, finance, and diplomacy, hosting most foreign embassies.

#### India: New Delhi
India uses a unified system centered in a single metropolitan area:
* **New Delhi (Official Federal Capital):** New Delhi is a planned city situated within the larger National Capital Territory (NCT) of Delhi. It serves as the seat of all three branches of the Government of India, containing the Parliament House (Sansad Bhavan), Rashtrapati Bhavan (President's Palace), and the Supreme Court of India.

---

### 2. Historical & Planning Context
* **Sri Jayawardenepura Kotte:** Once the historic capital of the ancient Kingdom of Kotte in the 14th–16th centuries, the city was selected in the late 20th century to relieve the high density of urban Colombo. It was designed to build a modern, spacious administrative center centered around the parliamentary complex on Diyawanna Oya lake.
* **New Delhi:** Inaugurated in 1931, New Delhi was designed by British architects Edwin Lutyens and Herbert Baker as a grand, planned imperial capital. It is characterized by wide avenues, symmetrical gardens, and iconic stone edifices like India Gate and the Secretariat buildings.

---

### 3. Structural Comparison

| Feature | Sri Lanka (Sri Jayawardenepura Kotte / Colombo) | India (New Delhi / NCT of Delhi) |
| :--- | :--- | :--- |
| **Status** | Dual-Capital System (Kotte: Administrative / Colombo: Commercial) | Unified National Capital |
| **Establishment** | Designated in 1982 | Inaugurated in 1931 |
| **Governance seat** | Parliament in Kotte; Supreme Court in Colombo | Parliament, President, and Supreme Court in New Delhi |
| **Core Economy** | Colombo leads in maritime trade, tourism, and financial services | Driven by government administration, IT, service sector, and manufacturing |

In summary, while India operates its vast governmental machinery directly out of the planned district of New Delhi, Sri Lanka distributes its national functions between the legislative suburb of Sri Jayawardenepura Kotte and the historical commercial center of Colombo.
</answer><|im_end|>
```
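Since the model separates its reasoning from its final answer with tags, client code will usually want to split the two. The following is a minimal sketch, assuming the <reasoning> and <answer> tags appear as plain text in the decoded output and each occurs at most once. `ParsedResponse` and `parse_response` are hypothetical names, not part of any Chat2Find tooling.

```python
import re
from typing import NamedTuple, Optional


class ParsedResponse(NamedTuple):
    reasoning: Optional[str]
    answer: Optional[str]


def _extract(tag: str, text: str) -> Optional[str]:
    """Return the stripped contents of the first <tag>...</tag> pair, if any."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
    return match.group(1).strip() if match else None


def parse_response(text: str) -> ParsedResponse:
    """Split a completion into its reasoning and answer parts."""
    return ParsedResponse(_extract("reasoning", text), _extract("answer", text))


if __name__ == "__main__":
    sample = "<reasoning>\nCompare roles.\n</reasoning>\n<answer>\nKotte and New Delhi.\n</answer>"
    parsed = parse_response(sample)
    print(parsed.answer)
```

A missing tag yields `None` rather than an exception, so callers can decide how to handle truncated generations (for example, when `max_new_tokens` cuts the answer short).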
Tool Calling Workflow
When the model determines that a query requires external or real-time information, it will output a tool call instead of a direct answer. To complete the interaction, you need to:
- Detect the tool call in the model's output (wrapped in <tool_call> tags or <function=name> tags).
- Execute the tool on your end (e.g., call your search API, fetch stock data, etc.).
- Append the tool response to the conversation history as a message with role: "tool".
- Re-run generation so the model produces a final answer incorporating the tool result.
Below is a complete example showing the full multi-turn flow:
```html
<|im_start|>system
You are a helpful assistant. Today's date is 2026-05-30. The user is located in Sri Lanka. Provide your reasoning inside <reasoning> tags and the final answer inside <answer> tags. You have access to tools. For any question that would benefit from up-to-date, real-time, or detailed external information, or for information that can change over time, prefer using the search_internet tool. When the user asks about specific individuals, companies, products, organizations, brands, or named entities, always use the search_internet tool to get accurate and current information. If no other specific tool is relevant to the user's query, always fall back to the search_internet tool.<|im_end|>
<|im_start|>user
What is the current price of gold in Sri Lanka?<|im_end|>
<|im_start|>assistant
<reasoning>
The user is asking about the current gold price in Sri Lanka. This is real-time information that changes daily, so I need to use the search_internet tool to get accurate data.
</reasoning>
<tool_call>
{"name": "search_internet", "arguments": {"query": "current gold price in Sri Lanka today"}}
</tool_call><|im_end|>
<|im_start|>tool
{"results": "As of May 30, 2026, the price of 24K gold in Sri Lanka is approximately LKR 32,500 per gram. The price of 22K gold is approximately LKR 29,800 per gram. Prices are based on the Colombo bullion market rates."}<|im_end|>
<|im_start|>assistant
<reasoning>
The tool has returned the current gold prices in Sri Lanka. I can now present this information clearly to the user.
</reasoning>
<answer>
As of today (May 30, 2026), the gold prices in Sri Lanka are as follows:

- **24K Gold**: approximately LKR 32,500 per gram
- **22K Gold**: approximately LKR 29,800 per gram

These prices are based on the Colombo bullion market rates and may vary slightly between jewellers and dealers.
</answer><|im_end|>
```
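The detection step can be sketched as a small parser for the <tool_call> payload shown in the example. It assumes the payload is a single JSON object with "name" and "arguments" keys, as in that example. The <function=name> variant is not handled, because its exact layout is not specified here. `extract_tool_call` is a hypothetical helper name.

```python
import json
import re
from typing import Any, Optional

TOOL_CALL_PATTERN = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_call(text: str) -> Optional[dict[str, Any]]:
    """Return the first tool call as {"name": ..., "arguments": ...}, or None."""
    match = TOOL_CALL_PATTERN.search(text)
    if match is None:
        return None
    try:
        call = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None  # malformed payload: treat as "no usable tool call"
    if not isinstance(call, dict) or "name" not in call:
        return None
    call.setdefault("arguments", {})
    return call


if __name__ == "__main__":
    output = (
        "<reasoning>Needs live data.</reasoning>"
        '<tool_call>{"name": "search_internet", '
        '"arguments": {"query": "current gold price in Sri Lanka today"}}</tool_call>'
    )
    print(extract_tool_call(output))
```

Returning `None` for both "no tool call" and "unparseable tool call" keeps the caller simple. A stricter client might log the malformed case separately.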
In code, the tool response message should be appended to your messages list like this:
```python
# After detecting a tool call in the assistant's response, execute the tool and append:
messages.append({
    "role": "tool",
    "name": "search_internet",  # must match the function name from the tool call
    "content": '{"results": "your tool output here"}'
})
# Then re-apply the chat template and call model.generate() again
```
The model will then use the tool response to formulate a complete, grounded answer for the user.
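Putting the four steps together, the whole exchange can be sketched as a loop. This is an illustration under assumptions, not the Chat2Find reference client. `run_with_tools`, `find_tool_call`, the `generate` callable and the `tools` registry are all hypothetical names. `generate` stands in for "apply the chat template and call model.generate()", and only the JSON <tool_call> form is parsed.

```python
import json
import re
from typing import Callable, Optional

TOOL_CALL_PATTERN = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def find_tool_call(text: str) -> Optional[dict]:
    """Parse the first <tool_call> JSON payload in `text`, if there is one."""
    match = TOOL_CALL_PATTERN.search(text)
    if match is None:
        return None
    try:
        call = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    return call if isinstance(call, dict) and "name" in call else None


def run_with_tools(
    messages: list[dict],
    generate: Callable[[list[dict]], str],
    tools: dict[str, Callable[..., object]],
    max_rounds: int = 3,
) -> str:
    """Generate, execute any requested tool, append its result, and repeat."""
    for _ in range(max_rounds):
        reply = generate(messages)
        messages.append({"role": "assistant", "content": reply})
        call = find_tool_call(reply)
        if call is None:
            return reply  # no tool requested: this is the final answer
        tool = tools.get(call["name"])
        if tool is None:
            result = {"error": f"unknown tool: {call['name']}"}
        else:
            result = tool(**call.get("arguments", {}))
        messages.append({
            "role": "tool",
            "name": call["name"],  # must match the function name from the tool call
            "content": json.dumps(result, ensure_ascii=False),
        })
    raise RuntimeError("tool-call limit reached without a final answer")
```

The `max_rounds` cap is a design choice to stop a model that keeps requesting tools. Whether the chat template expects the assistant's tool call as inline text, as stored here, is an assumption worth checking against the tokenizer's template.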
Quickstart Usage (Hugging Face Transformers)
You can easily run inference using standard Hugging Face tools. Ensure you have transformers and torch installed:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Chat2Find/chat2find-instruct-v1"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Setup your message list
messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant. Today's date is 2026-05-30. The user is located in Sri Lanka. Provide your reasoning inside <reasoning> tags and the final answer inside <answer> tags. You have access to tools. For any question that would benefit from up-to-date, real-time, or detailed external information, or for information that can change over time, prefer using the search_internet tool. When the user asks about specific individuals, companies, products, organizations, brands, or named entities, always use the search_internet tool to get accurate and current information. If no other specific tool is relevant to the user's query, always fall back to the search_internet tool."
    },
    {
        "role": "user",
        "content": "What is the difference between a solar eclipse and a lunar eclipse?"
    }
]

# Apply chat template
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Move inputs to wherever device_map="auto" placed the model instead of hardcoding "cuda"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate response
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        do_sample=True,
        max_new_tokens=1024,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.1,
        eos_token_id=tokenizer.eos_token_id
    )

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
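The system prompt in the quickstart hardcodes a date and a location. A small helper can fill these in at request time. This is a sketch only: `SYSTEM_TEMPLATE` reuses the wording of the system prompt from the quickstart, while `build_messages` and its parameters are hypothetical names introduced for illustration.

```python
from datetime import date
from typing import Optional

# System prompt wording taken from the quickstart; only date and location are parameterized.
SYSTEM_TEMPLATE = (
    "You are a helpful assistant. Today's date is {today}. "
    "The user is located in {location}. "
    "Provide your reasoning inside <reasoning> tags and the final answer inside <answer> tags. "
    "You have access to tools. "
    "For any question that would benefit from up-to-date, real-time, or detailed external "
    "information, or for information that can change over time, prefer using the "
    "search_internet tool. "
    "When the user asks about specific individuals, companies, products, organizations, "
    "brands, or named entities, always use the search_internet tool to get accurate and "
    "current information. "
    "If no other specific tool is relevant to the user's query, always fall back to the "
    "search_internet tool."
)


def build_messages(user_query: str, location: str = "Sri Lanka",
                   today: Optional[date] = None) -> list[dict]:
    """Build the system + user message list with an ISO-formatted date."""
    today = today or date.today()
    system = SYSTEM_TEMPLATE.format(today=today.isoformat(), location=location)
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_query},
    ]


if __name__ == "__main__":
    for message in build_messages("What is the capital of Sri Lanka?"):
        print(message["role"], "->", message["content"][:60])
```

The returned list can be passed to `tokenizer.apply_chat_template` in place of the hand-written `messages` list.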
License & Commercial Support
This model is licensed under the MIT License. The underlying training dataset is managed commercially by the Chat2Find Team.
For enterprise integration, tailored multilingual modeling, or custom data licensing queries, visit us at chat2find.com or reach out to our research leads.
Model provider: Chat2Find
Modalities
- Input: Video, Text, Image
- Output: Text