📘 Introduction

Function Calling allows AI models to not just answer questions, but interact with external tools — APIs, databases, or scripts — enabling real-world actions from natural language.

In this guide, we’ll build a dynamic multi-function calling system using Gemma 3 running locally with Ollama.

You’ll learn how to:

🔍 Perform real-time search using Serper.dev
🌐 Translate text to different languages dynamically using MyMemory API.
⛅ Fetch weather information live using OpenWeatherMap
🧠 Answer intelligently using internal memory if the information is known
🔄 Trigger dynamic function calling using structured JSON outputs
🖥️ Build a local, privacy-preserving multi-tool AI assistant

Everything happens locally, ensuring data privacy and offline capabilities.

🌐 What is Function Calling?

When using models like Gemma3, function calling is a method where the model generates structured output (like a JSON or function call format) instead of plain text — instructing your application to trigger external APIs.

According to Gemma's official documentation:

The model cannot execute the API calls itself.
Instead, it outputs structured function call formats that the application must parse and execute safely.

This bridges the gap between text generation and real-world actions like:

Searching for information
Fetching weather updates
Translating text
Triggering agent workflows

🛡️ Why Use Gemma 3 Locally with Ollama?

✅ Privacy First — Your prompts and API results stay local.
✅ No Cloud Costs — Avoid paid API tokens for inference.
✅ Faster Responses — No external latency bottlenecks.
✅ Customizable — Easily add more dynamic functions in future.

We are demonstrating Gemma 3's ability to trigger dynamic multi-function calls — not just a single tool, but multiple based on intent.

Note:
While Google's official documentation recommends using Gemma 3 27B for best performance and 12B for a good balance between performance and latency, For my local use case, I chose Gemma 3 1B —
✅ because it is sufficient for lightweight dynamic function calling,
✅ and fits better with system resource constraints for fast local experimentation.

Ollama - Gemma 1B Model

🖼️ System Architecture Overview

Component	Function
Frontend	Gradio Interface
LLM	Gemma 3 (1B) via Ollama
Functions	Search, Translate, Weather, General Queries
Backend	Python APIs and JSON Parsing

Below is the dynamic flow for handling user queries in our system — showcasing how Gemma 3 decides between internal knowledge and triggering external functions like Search, Translation, or Weather retrieval.

Functions Used:

Task	Function	API Used
Search	`google_search()`	Serper.dev
Translation	`translate_text()`	MyMemory API
Weather Info	`get_weather()`	OpenWeatherMap

🛠️ Step 1: Install Ollama and Pull Gemma 3 (1B)

Install Ollama:
https://ollama.com/

Pull the model:

ollama pull gemma3:1b

(You can also try bigger models based on usecases)

🛠️ Step 2: Install Required Python Packages

Clone the project or navigate to your project directory.
Create a virtual environment if needed:

python -m venv venv
venv\Scripts\activate

Install the required Python libraries using the requirements.txt file:

pip install -r requirements.txt

🛠️ Step 3: Setup Environment Variables

Create a .env file:

SERPER_API_KEY=your_serper_api_key_here
OPENWEATHER_API_KEY=your_openweathermap_api_key_here

Get your free API keys from:

🛠️ Step 4: Define the Functions

Functions are modularized into:

functions/search.py

import requests
import json
from config import SERPER_API_KEY
from models import SearchResult

def google_search(query: str) -> SearchResult:
    """Perform a Google search using Serper.dev API"""
    print("Get result from Google search using google_search")
    url = "https://google.serper.dev/search"
    payload = json.dumps({"q": query})
    headers = {
        'X-API-KEY': SERPER_API_KEY,
        'Content-Type': 'application/json'
    }

    response = requests.post(url, headers=headers, data=payload)
    response.raise_for_status()

    results = response.json()

    if not results.get('organic'):
        raise ValueError("No search results found.")

    first_result = results['organic'][0]
    return SearchResult(
        title=first_result.get('title', 'No title'),
        link=first_result.get('link', 'No link'),
        snippet=first_result.get('snippet', 'No snippet available.')
    )

functions/translate.py

import requests

def translate_text(text: str, target_language: str) -> str:
    """Translate text using MyMemory Translation API."""
    print("Translate text using Translation API from translate_text")
    try:
        source_language = "en"  # English

        url = f"https://api.mymemory.translated.net/get?q={text}&langpair={source_language}|{target_language}"

        response = requests.get(url)
        response.raise_for_status()

        result = response.json()

        return result["responseData"]["translatedText"]

    except Exception as e:
        return f"Translation Error: {str(e)}"

functions/weather.py

import requests
import os
from dotenv import load_dotenv

load_dotenv()

OPENWEATHER_API_KEY = os.getenv("OPENWEATHER_API_KEY")

def get_weather(city: str) -> str:
    """Fetch current weather information for a city."""
    print("Fetch current weather information from get_weather")
    try:
        url = f"https://api.openweathermap.org/data/2.5/weather?q={city}&appid={OPENWEATHER_API_KEY}&units=metric"

        response = requests.get(url)
        response.raise_for_status()

        data = response.json()

        city_name = data.get("name")
        temp = data["main"]["temp"]
        description = data["weather"][0]["description"]
        humidity = data["main"]["humidity"]
        wind_speed = data["wind"]["speed"]

        return (
            f"⛅ Weather in {city_name}:\n"
            f"Temperature: {temp}°C\n"
            f"Condition: {description.capitalize()}\n"
            f"Humidity: {humidity}%\n"
            f"Wind Speed: {wind_speed} m/s"
        )

    except Exception as e:
        return f"Weather Fetch Error: {str(e)}"

Each function uses Pydantic models for clean parameter handling.

🛠️ Step 5: Crafting the System Message

The System Message acts as an instructional guide for Gemma 3, deciding when to answer directly and when to trigger a function call.

It ensures:

✅ Answer directly for pre-2023 or timeless data from Ollama memory.
✅ Use google_search for real-time or latest topics.
✅ Use translate_text for translation requests.
✅ Use get_weather for fetching live weather information.
✅ Follow strict JSON structure when calling any external function.

Here’s the full System Message:

SYSTEM_MESSAGE = """
You are an AI assistant with training data up to 2023. 
Answer questions directly when possible, and use function calling when necessary.

DECISION PROCESS:
1. For historical events (before 2023):
   → Answer directly from your training data.

2. For current events (after 2023):
   → ALWAYS use 'google_search'. Never guess.

3. For real-time data (e.g., sports winners, current CEO, stock prices, event schedules):
   → ALWAYS use 'google_search'.

4. For translation requests (e.g., "Translate 'Hello' to Spanish"):
   → Use 'translate_text' function.

5. For weather-related questions (e.g., "What's the weather in Chennai?"):
   → Use 'get_weather' function.

IMPORTANT RULES:
- When calling a function, respond ONLY with the JSON object, no additional text, no backticks.
- When answering directly from memory, respond ONLY in clean natural language text, NOT in JSON.

WHEN TO SEARCH (Mandatory):
- If the question mentions dates after 2023 (e.g., "AWS re:Invent 2025", "Olympics 2028")
- If the question contains words like "current", "latest", "now", "today", "recent", "new", "future".
- If the user asks about ongoing events, upcoming conferences, tournaments, elections, weather.
- If you are unsure about the information.
- DO NOT guess or invent dates or details.

WHEN TO FETCH WEATHER (Mandatory):
- If the user asks about "weather", "temperature", "climate", "forecast", or "current weather" — ALWAYS call the 'get_weather' function.
- NEVER answer weather questions from memory, even if you answered a similar query before.
- Each weather query must always trigger a fresh 'get_weather' API call.

FUNCTION CALL FORMAT (Strict):
Example for Search:
{
    "name": "google_search",
    "parameters": {
        "query": "your search query here"
    }
}

Example for Translation:
{
    "name": "translate_text",
    "parameters": {
        "text": "Text to translate",
        "target_language": "language code (e.g., fr, es, de)"
    }
}

Example for Weather:
{
    "name": "get_weather",
    "parameters": {
        "city": "City name (e.g., Chennai, Paris)"
    }
}

SEARCH FUNCTION:
{
    "name": "google_search",
    "description": "Search for real-time information like current events, latest news, updates, dates",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search query"
            }
        },
        "required": ["query"]
    }
}

TRANSLATE FUNCTION:
{
    "name": "translate_text",
    "description": "Translate given text into the target language",
    "parameters": {
        "type": "object",
        "properties": {
            "text": {
                "type": "string",
                "description": "Text to translate"
            },
            "target_language": {
                "type": "string",
                "description": "Target language code (e.g., fr, es, de)"
            }
        },
        "required": ["text", "target_language"]
    }
}

WEATHER FUNCTION:
{
    "name": "get_weather",
    "description": "Fetch real-time weather information for a given city",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "City name"
            }
        },
        "required": ["city"]
    }
}

RESPONSE GUIDELINES:
- Only include facts directly from the search/translation/weather result.
- Never invent or assume information not provided.
- Quote dates, names, facts exactly as retrieved.
- Keep responses concise and factual.
- If using memory knowledge (pre-2023), respond naturally without any JSON.

VERY IMPORTANT:
- If you are answering from memory (no function call needed), respond ONLY in natural human-readable text, NOT JSON structure.
- Do NOT format memory answers as JSON.
- JSON format must be used only for function calls.

"""

🛠️ Step 6: Building the Gradio UI

The chatbot UI is developed using Gradio Blocks:

User input textbox
Clear dynamic detection of which function was used
Response displayed nicely with source

🛠️ Step 7: Launch the Application

Once everything is set up, run the application from your terminal:

python app.py

You will see an output similar to:

(venv) PS C:\Personal\Personal projects - ML\gemma3-function-calling\function-calling-gemma3-master\dynamic_function_calling_gemma> python app.py
* Running on local URL:  http://127.0.0.1:7860
* Running on public URL: https://28ffea36aa2a7e83b1.gradio.live

Open the local URL or public URL shown in the terminal to access the Gradio Chatbot UI.
You can now interact with the assistant — asking questions that trigger real-time search, translation, weather, or memory-based answers dynamically!

📸 Below: Screenshot from VS Code terminal showing the app launch.

📸 Below: Screenshot of the Gradio Chat UI which will be launched in Browser.

Sample User Queries

*User Query*	*Behavior*
"Translate Good Morning to French"	Triggers Translation
"Weather in Chennai today?"	Triggers Weather Function
"Write java program for adding two numbers"	Answer from Memory
"Next AWS Reinvent event?"	Dynamic Search

🛠️ Step 8: Demo

Below are some sample queries and how Gemma 3 dynamically triggers function calls:

🎯 Query 1: Write Java Program for adding two numbers
→ Response generated directly from Gemma 3's internal knowledge without calling any external API.

🌐 Query 2: When is the next Cricket World Cup?
→ Google Search function triggered via Serper.dev API to fetch real-time information.

📚 Query 3: Who won the last World Cup in football?
→ Answered directly from Gemma 3’s internal memory (pre-2023 knowledge).

⛅ Query 4: Current weather in Chennai
→ Weather function triggered to fetch live weather data from OpenWeatherMap API.

🌍 Query 5: Translate "Hello" to French
→ Translation function triggered dynamically for multilingual capability. Uses mymemory API.

Here’s a quick preview of the full demo (GIF) in action:

If the GIF is not clear, you can download the full demo video from the GitHub repository.

💬 Why Dynamic Function Calling Matters

✅ Smarter Agents with Decision Making — Instead of always answering directly, Gemma 3 dynamically decides when to call real-world tools like search, translation, and weather APIs.
✅ Enhanced Real-Time Knowledge — By combining internal memory with external APIs, Gemma 3 answers both timeless and current questions without hallucination.
✅ Safe, Structured Output — Gemma 3 outputs strict JSON-based function calls, allowing safe parsing, controlled execution, and reduced risk of unexpected behavior.
✅ Extendable Workflows — With function calling, new capabilities (like database lookup, event notifications, or automation) can be added without re-training the LLM.

Gemma 3's structured output enables safe, reliable, production-ready AI applications.

📦 GitHub Repository

🔗 GitHub - Dynamic Function Calling with Gemma 3

🔑 Conclusion

By following this project:

✅ You built a local, dynamic, multi-functional LLM system
✅ You explored Gemma 3's function calling capabilities
✅ You connected real-world APIs dynamically without cloud dependency

This foundation can be extended to file summarization, agent workflows, tool use, and even multi-agent systems.

Local + Dynamic Function calling + Smart = Future-ready AI!

🚀 Let's Connect!

If you found this useful, feel free to connect with me:
🔗 LinkedIn - Sridhar Sampath
🔗 Hashnode Blog
🔗 GitHub Repository

🚀 Dynamic Multi-Function Calling Locally with Gemma 3 and Ollama

📘 Introduction

🌐 What is Function Calling?

🛡️ Why Use Gemma 3 Locally with Ollama?

🖼️ System Architecture Overview

Functions Used:

🛠️ Step 1: Install Ollama and Pull Gemma 3 (1B)

🛠️ Step 2: Install Required Python Packages

🛠️ Step 3: Setup Environment Variables

🛠️ Step 4: Define the Functions

🛠️ Step 5: Crafting the System Message

🛠️ Step 6: Building the Gradio UI

🛠️ Step 7: Launch the Application

🛠️ Step 8: Demo

💬 Why Dynamic Function Calling Matters

📦 GitHub Repository

🔑 Conclusion

🚀 Let's Connect!

✨ End

Comments

More from this blog

🎙️ Local Speech-to-Text with NVIDIA Parakeet ASR (TDT 0.6B)

🚀 Beyond Text: Building Multimodal RAG Systems with Cohere and Gemini

🚀 Exploring GraphRAG: Smarter AI Knowledge Retrieval with Neo4j & LLMs

How to Build Multi-Agent Collaboration on AWS Bedrock: A Financial Assistant Tutorial

Command Palette

📘 Introduction

🌐 What is Function Calling?

🛡️ Why Use Gemma 3 Locally with Ollama?

🖼️ System Architecture Overview

Functions Used:

🛠️ Step 1: Install Ollama and Pull Gemma 3 (1B)

🛠️ Step 2: Install Required Python Packages

🛠️ Step 3: Setup Environment Variables

🛠️ Step 4: Define the Functions

🛠️ Step 5: Crafting the System Message

🛠️ Step 6: Building the Gradio UI

🛠️ Step 7: Launch the Application

🛠️ Step 8: Demo

💬 Why Dynamic Function Calling Matters

📦 GitHub Repository

🔑 Conclusion

🚀 Let's Connect!

✨ End

Comments

More from this blog