The VLM is fed three inputs: the cropped RGB image, its corresponding Reinhard-tonemapped envmap, and the brightest-region information extracted by the algorithm above. The VLM is first instructed to generate a detailed one-paragraph lighting description. Based on this detailed paragraph, it is then instructed to produce two summary variants: a two-sentence version and a few-words version.
You are an expert in analyzing {scene_type (indoor/outdoor)} scene lighting. Your task is to describe the lighting in the image with technical accuracy.
Based on the provided images (cropped view, panorama, and coordinate map), write a concise, single-paragraph description of the lighting as seen from the perspective of the cropped image. \n
The following is the content of this paragraph:
(for indoor scenes) Start directly, begin the paragraph by immediately describing the most significant light source. Do not use introductory sentences like "The scene is illuminated by..." or "There are several light sources."
1. Identify Key Light Sources and Describe Each Source: Use the direct light source position and brightness information provided below, and use the panorama for full context, identify the dominant light sources that directly illuminate the scene in the cropped image. These can include windows, lamps, strip lights, or other fixtures. For each significant and dominant light source, describe its type (e.g., window, recessed strip light), position relative to the view (similar to which in the direct light information), color (e.g., warm yellow, neutral white), and brightness (e.g., bright, soft). Only one concise sentence should be used for each light source.
2. Use one short and concise sentence to describe the overall color of the scene.
(for outdoor scenes) Start directly, begin the paragraph by immediately describing the most significant light source. Do not use introductory sentences like "The scene is illuminated by..." or "There are several light sources."
1. Describe the Primary Natural Light: Identify the main source of natural light (e.g., the sun) with the light information provided below. In a single sentence, describe its direction relative to the view (similar to which in the direct light information), its color/hue (e.g., "warm golden," "cool blue"), and its brightness (e.g., "bright and direct," "soft and diffused").
2. Detail Any Artificial Lights: If any artificial lights are active and visible (like streetlights or building lights), briefly describe their type, location, and color.
3. Use one short and concise sentence to describe the overall color of the scene.
Important formatting requirements:
- Must give the correct and faithful description based on the lighting conditions of the scene
- Do not mention the coordinate colors in your final output.
- Make sure this paragraph flows naturally, and avoid redundancy
- Write in complete sentences without using bullet points, dashes, or numbered lists
- Do not use bullet points, dashes (-), or numbered lists
- Provide concise and brief descriptions
- Do not use words expressing uncertainty like 'appears to be', 'seems to', 'likely', or 'suggests'. State the lighting conditions as fact
- Do not use words like 'cropped image', 'cropped view', 'panorama' in your final output
You analysis should:
Use the Panorama for Context: the panorama provides a complete 360-degree view of all light sources. Use this to understand the lighting, but focus your description only on the lights that directly and strongly illuminate the scene in the cropped image.
Use the direct light source position and brightness information below (very important, the most precise information) to understand the lighting conditions
Here is some auxiliary information about the light sources in the scene to help you better understand the lighting conditions:\n
"Light {i}: maximum brightness {light['max_brightness']}, position description: {light['position_description']}, theta (elevation angle on envmap, center is 0): {light['theta_deg']}, phi (azimuthal angle on envmap, center is 0): {light['phi_deg']} \n"
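As a rough illustration of how the auxiliary lines above could be assembled from the detected light entries, the sketch below fills the template once per light. The function name `format_light_info` and the use of the list index as `{i}` are assumptions made for illustration; the source does not specify them.

```python
from typing import Dict, List


def format_light_info(lights: List[Dict[str, object]]) -> str:
    """Render one auxiliary-information line per detected light source.

    Assumption: `lights` is ordered so that the list index can serve as `{i}`.
    """
    lines = []
    for i, light in enumerate(lights):
        lines.append(
            f"Light {i}: maximum brightness {light['max_brightness']}, "
            f"position description: {light['position_description']}, "
            f"theta (elevation angle on envmap, center is 0): {light['theta_deg']}, "
            f"phi (azimuthal angle on envmap, center is 0): {light['phi_deg']} \n"
        )
    # The lines are concatenated and appended to the prompt text.
    return "".join(lines)
```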
You are an expert in analyzing {scene_type} scene lighting. Given existing lighting description, your task is to summarize current descriptions according to the provided images (cropped view, panorama, and coordinate map).
Important requirements:
- Must give the correct and faithful description based on the lighting conditions of the scene
- Use in total of two sentences, one for direct lighting that dominates the scene lighting, must describe what where these light sources are and their positions. And the other one describe the overall lighting, focus on the color (must have) and brightness
- Do not include any additional information or context beyond the lighting description
- These two sentences should be clearly separated, and very short and concise, like what humans will say
- Make sure them flows naturally, and avoid redundancy
- Write in complete sentences without using bullet points, dashes, or numbered lists
- Do not use words expressing uncertainty like 'appears to be', 'seems to', 'likely', or 'suggests'. State the lighting conditions as fact
Current light description: {cur_light_description}
You are an expert in analyzing {scene_type} scene lighting. Given existing lighting description, your task is to summarize current descriptions to a few words according to the provided images (cropped view and panorama) and the direct light source position and brightness information.
Important requirements:
- Must give the correct and faithful description based on the lighting conditions of the scene
- Use a few phrases (not complete sentences) to summarize the scene lighting condition
- Describe predominant direct light sources (like "a bright sun from the upper right", etc.) and overall scene lighting.
- Do not include any additional information or context beyond the lighting description
- Separate the phrases with commas
- Do not use words expressing uncertainty like 'appears to be', 'seems to', 'likely', or 'suggests'. State the lighting conditions as fact
Current light description: {cur_light_description}
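The three prompts above are chained: the detailed paragraph is generated first, and its text is substituted into `{cur_light_description}` for the two summary prompts. A minimal sketch of that flow follows. The `vlm` callable, its `(prompt, images)` signature, and the function name `caption_lighting` are hypothetical stand-ins, and passing the same image list to every stage is a simplification.

```python
from typing import Callable, Dict, Sequence


def caption_lighting(
    vlm: Callable[[str, Sequence[object]], str],  # hypothetical VLM wrapper
    images: Sequence[object],
    detailed_prompt: str,
    two_sentence_template: str,
    few_words_template: str,
) -> Dict[str, str]:
    """Generate the detailed description, then both summaries derived from it."""
    # Stage 1: detailed one-paragraph description (prompt is already filled in).
    detailed = vlm(detailed_prompt, images)

    # Stages 2 and 3: plain string replacement avoids str.format errors when
    # the templates or the description contain other braces.
    def fill(template: str) -> str:
        return template.replace("{cur_light_description}", detailed)

    two_sentences = vlm(fill(two_sentence_template), images)
    few_words = vlm(fill(few_words_template), images)
    return {
        "detailed": detailed,
        "two_sentences": two_sentences,
        "few_words": few_words,
    }
```

Both summaries are derived from the same detailed paragraph rather than from each other, which matches the description of the two variants as independent summaries of that paragraph.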
We use the following algorithm to locate the dominant light sources in an HDR envmap:
For position descriptions, we map elevation (θ) and azimuth (φ) to short phrases as follows.
| Elevation θ range (degrees) | Description |
|---|---|
| (-90, -45] | low down |
| (-45, -22.5] | down |
| (-22.5, 22.5] | (horizontal) |
| (22.5, 45] | up |
| (45, 90] | high up |

| Azimuth φ range (degrees) | Description |
|---|---|
| (-22.5, 22.5] | in the front |
| (22.5, 67.5] | on the front-right |
| (67.5, 112.5] | on the right |
| (112.5, 157.5] | on the back-right |
| (157.5, 180] or (-180, -157.5] | in the back |
| (-157.5, -112.5] | on the back-left |
| (-112.5, -67.5] | on the left |
| (-67.5, -22.5] | on the front-left |
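
The two lookup tables can be expressed as a small function. This is a sketch under stated assumptions, not the authors' code: the name `describe_position` is hypothetical, the parenthesized "(horizontal)" entry is assumed to mean that the elevation phrase is omitted, and the capitalization and comma separator are inferred from the sample string "High up, in the back".

```python
def elevation_phrase(theta_deg: float) -> str:
    """Map elevation in degrees to a phrase; intervals are (low, high]."""
    if theta_deg <= -45.0:
        return "low down"
    if theta_deg <= -22.5:
        return "down"
    if theta_deg <= 22.5:
        return ""  # assumption: "(horizontal)" means no elevation phrase
    if theta_deg <= 45.0:
        return "up"
    return "high up"


def azimuth_phrase(phi_deg: float) -> str:
    """Map azimuth in degrees to a phrase after wrapping into (-180, 180]."""
    phi = -((-phi_deg + 180.0) % 360.0 - 180.0)
    upper_bounds = [
        (-157.5, "in the back"),
        (-112.5, "on the back-left"),
        (-67.5, "on the left"),
        (-22.5, "on the front-left"),
        (22.5, "in the front"),
        (67.5, "on the front-right"),
        (112.5, "on the right"),
        (157.5, "on the back-right"),
        (180.0, "in the back"),
    ]
    for upper, phrase in upper_bounds:
        if phi <= upper:
            return phrase
    return "in the back"  # not reached after wrapping; kept as a safe default


def describe_position(theta_deg: float, phi_deg: float) -> str:
    """Join the non-empty phrases and capitalize the first letter."""
    parts = [p for p in (elevation_phrase(theta_deg), azimuth_phrase(phi_deg)) if p]
    text = ", ".join(parts)
    return text[0].upper() + text[1:]
```

For example, `describe_position(56.25, 178.59375)` returns "High up, in the back".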
{
  "ulaval_outdoor": {
    "9C4A0006": {
      "000": [
        {
          "rank": 0,
          "total_flux": 86642.76549932986,
          "max_brightness": 13811.497809960936,
          "area_size": 4926,
          "u": 0.99609375,
          "v": 0.1875,
          "pixel_x": 1020,
          "pixel_y": 96,
          "theta_deg": 56.25,
          "phi_deg": 178.59375,
          "theta_rad": 0.9817477042468103,
          "phi_rad": 3.117048960983623,
          "position_description": "High up, in the back"
        },
        ...
      ]
    }
  }
}
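
The coordinate fields in this entry are mutually consistent with a standard equirectangular mapping, which the sketch below reproduces. The 1024×512 resolution is inferred from `u = pixel_x / width` and `v = pixel_y / height` in the sample, and the formulas are an assumption that happens to match these numbers, not a confirmed description of the authors' implementation. The function name `pixel_to_angles` is hypothetical.

```python
import math
from typing import Dict


def pixel_to_angles(pixel_x: int, pixel_y: int, width: int, height: int) -> Dict[str, float]:
    """Convert an equirectangular pixel position to envmap angles.

    Assumed convention: the image center is (theta, phi) = (0, 0), theta
    increases toward the top row, and phi increases toward the right edge.
    """
    u = pixel_x / width    # normalized horizontal coordinate in [0, 1)
    v = pixel_y / height   # normalized vertical coordinate in [0, 1)
    theta_deg = 90.0 - 180.0 * v   # elevation in degrees
    phi_deg = 360.0 * (u - 0.5)    # azimuth in degrees
    return {
        "u": u,
        "v": v,
        "theta_deg": theta_deg,
        "phi_deg": phi_deg,
        "theta_rad": math.radians(theta_deg),
        "phi_rad": math.radians(phi_deg),
    }


# Reproduces the sample entry under the inferred 1024x512 resolution.
angles = pixel_to_angles(1020, 96, width=1024, height=512)
```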
The primary natural light source is the sun, positioned high in the sky to the right, casting a bright and direct light with a cool blue hue. There are no visible artificial lights in the scene. The overall color of the scene is a mix of cool blues and greens, reflecting the sunlight on the water and the surrounding landscape.
A bright sun from above to the right illuminates the scene with a cool blue hue, reflecting off the water and casting a serene light on the landscape.
Bright sun from the upper right, cool blue hue, sunlight reflecting on water, cool blues and greens.
A bright window on the left side of the scene provides natural light, casting a warm yellow hue across the room. A recessed strip light on the ceiling, positioned above and slightly to the right, emits a soft, warm light, contributing to the overall illumination. The scene has a warm, yellowish color tone, enhancing the cozy atmosphere.
A bright window on the left and a recessed strip light on the ceiling provide warm, yellowish illumination, creating a cozy atmosphere.
Bright window on the left, recessed strip light on the ceiling, warm yellow hue, cozy atmosphere.
You are an embedding model. Encode the scene lighting description for similarity search and image generation conditioning. The embeddings must capture the position (left, right, above, back, etc.) and the color of the dominant light sources (very important). And include the overall brightness, color temperature, and mood of the scene.
We fine-tune Stable Diffusion 3.5 Medium to output an LDR environment map and repurpose its text-conditioning branch to accept our lighting embedding.