π Dynamic Multi-Function Calling Locally with Gemma 3 and Ollama
π Learn How Gemma 3 Powers Dynamic Real-Time Search, Translation, Weather Retrieval, and General Queries β All Locally with Function Calling

π Introduction
Function Calling allows AI models to not just answer questions, but interact with external tools β APIs, databases, or scripts β enabling real-world actions from natural language.
In this guide, weβll build a dynamic multi-function calling system using Gemma 3 running locally with Ollama.
Youβll learn how to:
π Perform real-time search using Serper.dev
π Translate text to different languages dynamically using MyMemory API.
β
Fetch weather information live using OpenWeatherMap
π§ Answer intelligently using internal memory if the information is known
π Trigger dynamic function calling using structured JSON outputs
π₯οΈ Build a local, privacy-preserving multi-tool AI assistant
Everything happens locally, ensuring data privacy and offline capabilities.
π What is Function Calling?
When using models like Gemma3, function calling is a method where the model generates structured output (like a JSON or function call format) instead of plain text β instructing your application to trigger external APIs.
According to Gemma's official documentation:
The model cannot execute the API calls itself.
Instead, it outputs structured function call formats that the application must parse and execute safely.
This bridges the gap between text generation and real-world actions like:
Searching for information
Fetching weather updates
Translating text
Triggering agent workflows
π‘οΈ Why Use Gemma 3 Locally with Ollama?

β
Privacy First β Your prompts and API results stay local.
β
No Cloud Costs β Avoid paid API tokens for inference.
β
Faster Responses β No external latency bottlenecks.
β
Customizable β Easily add more dynamic functions in future.
We are demonstrating Gemma 3's ability to trigger dynamic multi-function calls β not just a single tool, but multiple based on intent.
Note:
While Google's official documentation recommends using Gemma 3 27B for best performance and 12B for a good balance between performance and latency, For my local use case, I chose Gemma 3 1B β
β
because it is sufficient for lightweight dynamic function calling,
β
and fits better with system resource constraints for fast local experimentation.
πΌοΈ System Architecture Overview
| Component | Function |
| Frontend | Gradio Interface |
| LLM | Gemma 3 (1B) via Ollama |
| Functions | Search, Translate, Weather, General Queries |
| Backend | Python APIs and JSON Parsing |
Below is the dynamic flow for handling user queries in our system β showcasing how Gemma 3 decides between internal knowledge and triggering external functions like Search, Translation, or Weather retrieval.

Functions Used:
| Task | Function | API Used |
| Search | google_search() | Serper.dev |
| Translation | translate_text() | MyMemory API |
| Weather Info | get_weather() | OpenWeatherMap |
π οΈ Step 1: Install Ollama and Pull Gemma 3 (1B)
Install Ollama:
https://ollama.com/
Pull the model:
ollama pull gemma3:1b
(You can also try bigger models based on usecases)

π οΈ Step 2: Install Required Python Packages
Clone the project or navigate to your project directory.
Create a virtual environment if needed:
python -m venv venv
venv\Scripts\activate
Install the required Python libraries using the requirements.txt file:
pip install -r requirements.txt
π οΈ Step 3: Setup Environment Variables
Create a .env file:
SERPER_API_KEY=your_serper_api_key_here
OPENWEATHER_API_KEY=your_openweathermap_api_key_here
Get your free API keys from:
π οΈ Step 4: Define the Functions
Functions are modularized into:
functions/search.py
import requests
import json
from config import SERPER_API_KEY
from models import SearchResult
def google_search(query: str) -> SearchResult:
"""Perform a Google search using Serper.dev API"""
print("Get result from Google search using google_search")
url = "https://google.serper.dev/search"
payload = json.dumps({"q": query})
headers = {
'X-API-KEY': SERPER_API_KEY,
'Content-Type': 'application/json'
}
response = requests.post(url, headers=headers, data=payload)
response.raise_for_status()
results = response.json()
if not results.get('organic'):
raise ValueError("No search results found.")
first_result = results['organic'][0]
return SearchResult(
title=first_result.get('title', 'No title'),
link=first_result.get('link', 'No link'),
snippet=first_result.get('snippet', 'No snippet available.')
)
functions/translate.py
import requests
def translate_text(text: str, target_language: str) -> str:
"""Translate text using MyMemory Translation API."""
print("Translate text using Translation API from translate_text")
try:
source_language = "en" # English
url = f"https://api.mymemory.translated.net/get?q={text}&langpair={source_language}|{target_language}"
response = requests.get(url)
response.raise_for_status()
result = response.json()
return result["responseData"]["translatedText"]
except Exception as e:
return f"Translation Error: {str(e)}"
functions/weather.py
import requests
import os
from dotenv import load_dotenv
load_dotenv()
OPENWEATHER_API_KEY = os.getenv("OPENWEATHER_API_KEY")
def get_weather(city: str) -> str:
"""Fetch current weather information for a city."""
print("Fetch current weather information from get_weather")
try:
url = f"https://api.openweathermap.org/data/2.5/weather?q={city}&appid={OPENWEATHER_API_KEY}&units=metric"
response = requests.get(url)
response.raise_for_status()
data = response.json()
city_name = data.get("name")
temp = data["main"]["temp"]
description = data["weather"][0]["description"]
humidity = data["main"]["humidity"]
wind_speed = data["wind"]["speed"]
return (
f"β
Weather in {city_name}:\n"
f"Temperature: {temp}Β°C\n"
f"Condition: {description.capitalize()}\n"
f"Humidity: {humidity}%\n"
f"Wind Speed: {wind_speed} m/s"
)
except Exception as e:
return f"Weather Fetch Error: {str(e)}"
Each function uses Pydantic models for clean parameter handling.
π οΈ Step 5: Crafting the System Message
The System Message acts as an instructional guide for Gemma 3, deciding when to answer directly and when to trigger a function call.
It ensures:
β
Answer directly for pre-2023 or timeless data from Ollama memory.
β
Use google_search for real-time or latest topics.
β
Use translate_text for translation requests.
β
Use get_weather for fetching live weather information.
β
Follow strict JSON structure when calling any external function.
Hereβs the full System Message:
SYSTEM_MESSAGE = """
You are an AI assistant with training data up to 2023.
Answer questions directly when possible, and use function calling when necessary.
DECISION PROCESS:
1. For historical events (before 2023):
β Answer directly from your training data.
2. For current events (after 2023):
β ALWAYS use 'google_search'. Never guess.
3. For real-time data (e.g., sports winners, current CEO, stock prices, event schedules):
β ALWAYS use 'google_search'.
4. For translation requests (e.g., "Translate 'Hello' to Spanish"):
β Use 'translate_text' function.
5. For weather-related questions (e.g., "What's the weather in Chennai?"):
β Use 'get_weather' function.
IMPORTANT RULES:
- When calling a function, respond ONLY with the JSON object, no additional text, no backticks.
- When answering directly from memory, respond ONLY in clean natural language text, NOT in JSON.
WHEN TO SEARCH (Mandatory):
- If the question mentions dates after 2023 (e.g., "AWS re:Invent 2025", "Olympics 2028")
- If the question contains words like "current", "latest", "now", "today", "recent", "new", "future".
- If the user asks about ongoing events, upcoming conferences, tournaments, elections, weather.
- If you are unsure about the information.
- DO NOT guess or invent dates or details.
WHEN TO FETCH WEATHER (Mandatory):
- If the user asks about "weather", "temperature", "climate", "forecast", or "current weather" β ALWAYS call the 'get_weather' function.
- NEVER answer weather questions from memory, even if you answered a similar query before.
- Each weather query must always trigger a fresh 'get_weather' API call.
FUNCTION CALL FORMAT (Strict):
Example for Search:
{
"name": "google_search",
"parameters": {
"query": "your search query here"
}
}
Example for Translation:
{
"name": "translate_text",
"parameters": {
"text": "Text to translate",
"target_language": "language code (e.g., fr, es, de)"
}
}
Example for Weather:
{
"name": "get_weather",
"parameters": {
"city": "City name (e.g., Chennai, Paris)"
}
}
SEARCH FUNCTION:
{
"name": "google_search",
"description": "Search for real-time information like current events, latest news, updates, dates",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Search query"
}
},
"required": ["query"]
}
}
TRANSLATE FUNCTION:
{
"name": "translate_text",
"description": "Translate given text into the target language",
"parameters": {
"type": "object",
"properties": {
"text": {
"type": "string",
"description": "Text to translate"
},
"target_language": {
"type": "string",
"description": "Target language code (e.g., fr, es, de)"
}
},
"required": ["text", "target_language"]
}
}
WEATHER FUNCTION:
{
"name": "get_weather",
"description": "Fetch real-time weather information for a given city",
"parameters": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "City name"
}
},
"required": ["city"]
}
}
RESPONSE GUIDELINES:
- Only include facts directly from the search/translation/weather result.
- Never invent or assume information not provided.
- Quote dates, names, facts exactly as retrieved.
- Keep responses concise and factual.
- If using memory knowledge (pre-2023), respond naturally without any JSON.
VERY IMPORTANT:
- If you are answering from memory (no function call needed), respond ONLY in natural human-readable text, NOT JSON structure.
- Do NOT format memory answers as JSON.
- JSON format must be used only for function calls.
"""
π οΈ Step 6: Building the Gradio UI
The chatbot UI is developed using Gradio Blocks:
User input textbox
Clear dynamic detection of which function was used
Response displayed nicely with source
π οΈ Step 7: Launch the Application
Once everything is set up, run the application from your terminal:
python app.py
You will see an output similar to:
(venv) PS C:\Personal\Personal projects - ML\gemma3-function-calling\function-calling-gemma3-master\dynamic_function_calling_gemma> python app.py
* Running on local URL: http://127.0.0.1:7860
* Running on public URL: https://28ffea36aa2a7e83b1.gradio.live
Open the local URL or public URL shown in the terminal to access the Gradio Chatbot UI.
You can now interact with the assistant β asking questions that trigger real-time search, translation, weather, or memory-based answers dynamically!
πΈ Below: Screenshot from VS Code terminal showing the app launch.
πΈ Below: Screenshot of the Gradio Chat UI which will be launched in Browser.
Sample User Queries
| User Query | Behavior |
| "Translate Good Morning to French" | Triggers Translation |
| "Weather in Chennai today?" | Triggers Weather Function |
| "Write java program for adding two numbers" | Answer from Memory |
| "Next AWS Reinvent event?" | Dynamic Search |
π οΈ Step 8: Demo
Below are some sample queries and how Gemma 3 dynamically triggers function calls:
π― Query 1: Write Java Program for adding two numbers
β Response generated directly from Gemma 3's internal knowledge without calling any external API.
π Query 2: When is the next Cricket World Cup?
β Google Search function triggered via Serper.dev API to fetch real-time information.
π Query 3: Who won the last World Cup in football?
β Answered directly from Gemma 3βs internal memory (pre-2023 knowledge).
β
Query 4: Current weather in Chennai
β Weather function triggered to fetch live weather data from OpenWeatherMap API.
π Query 5: Translate "Hello" to French
β Translation function triggered dynamically for multilingual capability. Uses mymemory API.
Hereβs a quick preview of the full demo (GIF) in action:

If the GIF is not clear, you can download the full demo video from the GitHub repository.
π¬ Why Dynamic Function Calling Matters
β
Smarter Agents with Decision Making β Instead of always answering directly, Gemma 3 dynamically decides when to call real-world tools like search, translation, and weather APIs.
β
Enhanced Real-Time Knowledge β By combining internal memory with external APIs, Gemma 3 answers both timeless and current questions without hallucination.
β
Safe, Structured Output β Gemma 3 outputs strict JSON-based function calls, allowing safe parsing, controlled execution, and reduced risk of unexpected behavior.
β
Extendable Workflows β With function calling, new capabilities (like database lookup, event notifications, or automation) can be added without re-training the LLM.
Gemma 3's structured output enables safe, reliable, production-ready AI applications.
π¦ GitHub Repository
π GitHub - Dynamic Function Calling with Gemma 3
π Conclusion
By following this project:
β
You built a local, dynamic, multi-functional LLM system
β
You explored Gemma 3's function calling capabilities
β
You connected real-world APIs dynamically without cloud dependency
This foundation can be extended to file summarization, agent workflows, tool use, and even multi-agent systems.
Local + Dynamic Function calling + Smart = Future-ready AI!
π Let's Connect!
If you found this useful, feel free to connect with me:
π LinkedIn - Sridhar Sampath
π Hashnode Blog
π GitHub Repository





