Exercise Class 6

Author

Jonas Skjold Raaschou-Pedersen

Published

October 9, 2025

This week we start from this repository.


For this week’s exercises you will need a Google account. We will use Google’s Gemini models through their API while it has sufficient free credits for our purposes.

Exercise 0 - Setting up Gemini API

  1. Create a Google account if you don’t already have one.

  2. Read the first couple of lines of the quickstart in the Gemini API documentation.

  3. Go to Google AI Studio and create an API key; this can be done by doing the following steps:

    • click on the Create API key button in the upper right corner
    • name your project something relevant, for instance ai4h
    • in the Choose an imported project dropdown menu, select Create project
    • click Create key
    • copy the API key to your clipboard and continue to the next exercise
  4. Read the documentation on setting your API key as an environment variable. The documentation shows how to do it for Mac/Linux and Windows. Don’t set the API key explicitly in your code; hard-coding secrets is bad practice, see here.

  5. Go back to the quickstart. Add google-genai as a project dependency using uv, i.e. run uv add google-genai in your shell.

  6. Make your first request to the API by trying out their example snippet:

    from google import genai
    
    # The client gets the API key from the environment variable `GEMINI_API_KEY`.
    client = genai.Client()
    
    response = client.models.generate_content(
        model="gemini-2.5-flash", contents="Explain how AI works in a few words"
    )
    print(response.text)
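Before running the snippet, the key from step 4 has to be available as an environment variable. A minimal sketch for macOS/Linux (bash/zsh); the key value is a placeholder, and on Windows the documentation shows the equivalent setx command:

```shell
# Set the key for the current shell session (macOS/Linux, bash/zsh).
# Replace the placeholder with the key you copied from Google AI Studio.
export GEMINI_API_KEY="your-api-key-here"

# Verify that the variable is set (prints the key):
echo "$GEMINI_API_KEY"
```

To make the setting persistent across sessions, add the export line to your shell profile (e.g. ~/.zshrc or ~/.bashrc).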

Note: If you are interested in learning about the technical details of the Gemini models, check out the paper by the Gemini Team here.

Exercise 1 – Exploring temperatures and the System Prompt

  1. Recall from lecture 5 the role of the temperature parameter and how it influences generation. This parameter can be varied as shown here. We will use this in the following.

  2. Write a function that does the following:

    • Creates a genai.Client() instance.
    • Loops over a range of temperature values, e.g. [0.1, 0.5, 0.9].
    • For each temperature, sends the prompt "Explain how AI works in a few words" to the model gemini-2.5-flash with the chosen temperature.
    • Optionally prints the output.
    • Saves the output for the given temperature to a markdown file named example-<temp>.md (for example, example-1.md, example-5.md, and example-9.md for the temperatures 0.1, 0.5, and 0.9). You could for instance save these files in a folder named output/gemini in your project directory.
  3. Run your function and compare the outputs.

    • Which temperature produces more predictable answers?
    • Which produces more variation or creativity?
  4. Fix the temperature to 0.5. Extend your function by setting three different system instructions. See the documentation here for how to do that. For example, you could try these:

    system_prompts = [
        "You are Geoffrey Hinton, the 'Godfather of Deep Learning'. Explain AI as if you were telling a bedtime story to children.",
        "You are Andrej Karpathy, former Tesla AI lead. Explain AI as if you were narrating a YouTube coding tutorial.",
        "You are Yann LeCun, Meta's Chief AI Scientist. Explain AI passionately while defending it against critics.",
    ]

    Save the output for each system prompt i to a markdown file named e.g. example-system-prompt-{i}.md. How do the model’s answers differ depending on the persona?
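Steps 2 and 4 could be sketched along these lines. This is only one possible layout, assuming the google-genai SDK's `types.GenerateContentConfig` with `temperature` and `system_instruction` fields; the folder, helper, and function names (`temp_filename`, `explore_temperatures`, `explore_system_prompts`) are illustrative choices, not part of the exercise. The SDK is imported inside the functions so the file-naming helper also works without it installed:

```python
from pathlib import Path

OUT_DIR = Path("output/gemini")
PROMPT = "Explain how AI works in a few words"
TEMPS = [0.1, 0.5, 0.9]


def temp_filename(temp: float) -> Path:
    # 0.1 -> example-1.md, 0.5 -> example-5.md, 0.9 -> example-9.md
    return OUT_DIR / f"example-{round(temp * 10)}.md"


def explore_temperatures() -> None:
    # Imported lazily; requires GEMINI_API_KEY in the environment.
    from google import genai
    from google.genai import types

    client = genai.Client()
    OUT_DIR.mkdir(parents=True, exist_ok=True)
    for temp in TEMPS:
        response = client.models.generate_content(
            model="gemini-2.5-flash",
            contents=PROMPT,
            config=types.GenerateContentConfig(temperature=temp),
        )
        print(response.text)
        temp_filename(temp).write_text(response.text or "")


def explore_system_prompts(system_prompts: list[str]) -> None:
    from google import genai
    from google.genai import types

    client = genai.Client()
    OUT_DIR.mkdir(parents=True, exist_ok=True)
    for i, sp in enumerate(system_prompts):
        response = client.models.generate_content(
            model="gemini-2.5-flash",
            contents=PROMPT,
            config=types.GenerateContentConfig(
                temperature=0.5, system_instruction=sp
            ),
        )
        (OUT_DIR / f"example-system-prompt-{i}.md").write_text(response.text or "")
```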

Exercise 2 – (Let the AIs) Chat

  1. Read the documentation on multi-turn conversations (i.e. chatting) to understand how they work.

  2. Write a function gemini_chat() that creates two chat sessions:

    • one with model "gemini-2.0-flash"
    • one with model "gemini-2.5-flash"
  3. Make Gemini 2.0 start the conversation by sending the message:

    start_message = """Suggest one cost-efficient policy change that could significantly improve Danish society.
    Keep your answer brief."""

    to Gemini 2.5.

  4. Capture Gemini 2.5’s reply and send it back to Gemini 2.0.

  5. Continue the conversation for a fixed number of turns (e.g. 10). After each reply, make sure the other model answers back.

  6. Print each message to the terminal with a clear header (e.g. print(" Gemini 2.0 ".center(80, "-")) before Gemini 2.0 and print(" Gemini 2.5 ".center(80, "-")) before Gemini 2.5).

  7. Save the entire conversation into a file e.g. output/gemini-chat.md. Inspect the conversation and read through the exchange. How does the conversation develop? Do the models eventually fall into a repetitive loop, such as endlessly thanking each other?

  8. Bonus: Try setting the system prompt for Gemini 2.0 to "You are a Danish politician." and for Gemini 2.5 to "You are a junior consultant at a Big 3 consulting firm." and redo the above.
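One way the chat loop in steps 2–7 might look, assuming the SDK's `client.chats.create(model=...)` / `chat.send_message(...)` interface; the `header` helper and the `transcript` bookkeeping are illustrative additions, and the number of turns is a parameter rather than a fixed value:

```python
from pathlib import Path


def header(name: str) -> str:
    # " Gemini 2.0 " padded with dashes to 80 characters, as in step 6.
    return f" {name} ".center(80, "-")


def gemini_chat(n_turns: int = 10) -> None:
    # Imported lazily; requires GEMINI_API_KEY in the environment.
    from google import genai

    client = genai.Client()
    chat_20 = client.chats.create(model="gemini-2.0-flash")
    chat_25 = client.chats.create(model="gemini-2.5-flash")

    start_message = (
        "Suggest one cost-efficient policy change that could significantly "
        "improve Danish society.\nKeep your answer brief."
    )

    transcript: list[str] = []
    # Gemini 2.0 "speaks" first: its opening message goes to Gemini 2.5.
    message = start_message
    transcript.append(header("Gemini 2.0") + "\n" + message)
    print(transcript[-1])

    for _ in range(n_turns):
        reply_25 = chat_25.send_message(message).text or ""
        transcript.append(header("Gemini 2.5") + "\n" + reply_25)
        print(transcript[-1])

        message = chat_20.send_message(reply_25).text or ""
        transcript.append(header("Gemini 2.0") + "\n" + message)
        print(transcript[-1])

    out = Path("output/gemini-chat.md")
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text("\n\n".join(transcript))
```

For the bonus in step 8, the two chat sessions would each get their persona via a `GenerateContentConfig(system_instruction=...)` passed to `chats.create`.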

Exercise 3 - Structured Output Generation

In this exercise, you will use the Gemini 2.5 model, "gemini-2.5-flash", to extract a table from an image. We provide an image of a table here for you to use.

Note: The PNG is from this Danish Parliament annual report (2010–2011). The data is not interesting; it is only used as an example.

Note: In practice, you could pass the PDF file of the table directly to the model. Here, we’ll make the task more challenging by passing a screenshot (.png) of the table instead.

  1. Download the image and save it to a folder in your project directory; e.g. images/table.png.

  2. Read the documentation on passing images to the model and on structured output generation.

  3. Define a Pydantic model Table with the fields:

    • title: str
    • columns: list[str]
    • data: list[list[str]]
  4. Write a function img_parse() that:

    • Opens an image file and loads it as bytes; follow the example in the docs.

    • Sends the image to the model "gemini-2.5-flash", together with the instruction:

      Extract the tables from this image and return the data as a json
    • Configures the response to return JSON that matches the Table Pydantic model from above.

  5. After receiving the JSON response:

    • Parse it into Python objects using the json module.
    • Convert each table into a Polars or Pandas DataFrame.
    • Optionally replace commas with dots in numeric strings and convert the numeric strings to numeric values.
  6. Print each parsed DataFrame.

    Question: Does the model correctly capture the table’s title, headers, and rows from the image? If not, what errors or inconsistencies do you notice?

  7. Save the raw JSON output to data/ft.json and optionally the dataframes as well.