Scene Detection

Sieve's Scene Detection API automatically identifies scene transitions in video content.

Parameters

  • video: Input video file to process (minimum 3 seconds; maximum 30 minutes for backend="turbo")
  • backend (default: "base"): Either "base" (high quality) or "turbo" (faster processing). For details, see Backends.
  • start_time (default: 0): Start time in seconds to begin processing
  • end_time (default: -1.0): End time in seconds to stop processing (-1 for end of video)
  • return_scenes (default: false): Whether to return individual video clips for each detected scene
  • min_scene_duration (default: 0.1): Minimum length in seconds for a clip to be considered a scene and included in the output (must be >0)
  • threshold (default: 1.0): Adjusts detection sensitivity. Higher values result in fewer detected scene transitions. We recommend adjusting in increments of 0.1; each increment corresponds to a change of roughly 10% in the sensitivity of the scene detection
  • transition_merge_gap (default: 0.1): Maximum gap in seconds between detected transition frames for them to be considered part of the same scene transition (for example, with transition_merge_gap=0.5, detected transitions less than 0.5 s apart are grouped together)
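
To make the interaction between transition_merge_gap and min_scene_duration more concrete, here is a minimal sketch in plain Python. It is not Sieve's implementation: the function names merge_transitions and build_scenes are hypothetical, and the choices to chain neighbouring timestamps and to keep the first timestamp of each group are assumptions made only for illustration.

```python
from typing import List, Tuple


def merge_transitions(times: List[float], merge_gap: float) -> List[float]:
    """Group transition timestamps that lie less than merge_gap apart.

    Assumption: each timestamp is compared with the previous one in its
    group, and a group is represented by its first timestamp.
    """
    groups: List[List[float]] = []
    for t in sorted(times):
        if groups and t - groups[-1][-1] < merge_gap:
            groups[-1].append(t)  # close enough: same transition
        else:
            groups.append([t])  # far enough: a new transition starts
    return [group[0] for group in groups]


def build_scenes(
    cuts: List[float],
    video_end: float,
    min_scene_duration: float,
    video_start: float = 0.0,
) -> List[Tuple[float, float]]:
    """Turn cut points into (start, end) scenes and drop scenes that are too short."""
    if min_scene_duration <= 0:
        raise ValueError("min_scene_duration must be > 0")
    inner = [c for c in sorted(cuts) if video_start < c < video_end]
    bounds = [video_start] + inner + [video_end]
    scenes = list(zip(bounds, bounds[1:]))
    return [(a, b) for a, b in scenes if b - a >= min_scene_duration]


# Made-up timestamps, for illustration only.
detected = [2.0, 2.3, 2.6, 10.0, 10.05]
cuts = merge_transitions(detected, merge_gap=0.5)
scenes = build_scenes(cuts, video_end=10.05, min_scene_duration=0.1)
```

With these made-up timestamps, the three detections around 2 s collapse into one cut, the two around 10 s collapse into another, and the very short trailing clip is excluded because it is shorter than min_scene_duration.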

Important Notes

  • Scene detection sensitivity is automatically adjusted based on video characteristics; threshold further manipulates this sensitivity
  • The algorithms struggle to identify slow fade transitions between structurally similar scenes; backend="base" will identify fades more accurately
  • Non-H.264 videos are automatically re-encoded to H.264 using hardware acceleration, preserving the original resolution and aspect ratio, though slight quality loss from compression may occur

Output

If return_scenes is False:

  • A single dictionary of scenes containing the following for each detected scene:
    • scene_index: Index of the scene
    • start_time: Start time of the scene in seconds
    • end_time: End time of the scene in seconds
    • start_frame: The starting frame number of the scene
    • end_frame: The ending frame number of the scene
    • duration: The length (in seconds) of the scene
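
As a sketch of how the per-scene fields above might be handled on the client side, the following plain-Python example parses scene dictionaries into typed records. The exact shape of the returned object is not specified here, so the code assumes you already have an iterable of per-scene dictionaries with the six keys listed above; the Scene class, the helper names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List


@dataclass(frozen=True)
class Scene:
    scene_index: int
    start_time: float  # seconds
    end_time: float  # seconds
    start_frame: int
    end_frame: int
    duration: float  # seconds


def parse_scenes(raw_scenes: Iterable[Dict[str, Any]]) -> List[Scene]:
    """Convert per-scene dictionaries into Scene records, ordered by start time."""
    scenes = [
        Scene(
            scene_index=int(raw["scene_index"]),
            start_time=float(raw["start_time"]),
            end_time=float(raw["end_time"]),
            start_frame=int(raw["start_frame"]),
            end_frame=int(raw["end_frame"]),
            duration=float(raw["duration"]),
        )
        for raw in raw_scenes
    ]
    return sorted(scenes, key=lambda s: s.start_time)


def duration_mismatches(scenes: List[Scene], tolerance: float = 0.05) -> List[Scene]:
    """Return scenes whose duration differs from end_time - start_time by more than tolerance."""
    return [
        s for s in scenes
        if abs((s.end_time - s.start_time) - s.duration) > tolerance
    ]


def longest_scene(scenes: List[Scene]) -> Scene:
    """Return the scene with the largest duration."""
    return max(scenes, key=lambda s: s.duration)


# Made-up sample data, not real API output.
SAMPLE_SCENES = [
    {"scene_index": 1, "start_time": 4.0, "end_time": 9.5,
     "start_frame": 120, "end_frame": 285, "duration": 5.5},
    {"scene_index": 0, "start_time": 0.0, "end_time": 4.0,
     "start_frame": 0, "end_frame": 120, "duration": 4.0},
]

parsed = parse_scenes(SAMPLE_SCENES)
```

Converting to a small record type early keeps later code (sorting, filtering by duration, sanity checks) independent of the raw dictionary layout.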

If return_scenes is True:

  • A tuple (clip, metadata) where:
    • clip: A sieve.File object representing the rendered scene
    • metadata: A dictionary with the keys listed above for each scene

Backends

  • base: Sieve's premier quality scene detection solution. Accurately identifies scene transitions including fades and complex transitions.
    • Speed and Quality (combined): ⭐️⭐️⭐️⭐️⭐️⭐️
    • Price (per minute)*, up to 4K: $0.05 with return_scenes=false, $0.10 with return_scenes=true
    • Price (per minute)*, 4K+: $0.10 with return_scenes=false, $0.20 with return_scenes=true
  • turbo: A fast solution that works well for most videos with clear scene transitions. Approximately 3x faster than base.
    • Speed and Quality (combined): ⭐️⭐️⭐️⭐️⭐️⭐️⭐️
    • Price (per minute)*, up to 4K: $0.02 with return_scenes=false, $0.07 with return_scenes=true
    • Price (per minute)*, 4K+: $0.04 with return_scenes=false, $0.14 with return_scenes=true

*Pricing assumes 30 fps video content.
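
The per-minute prices listed under Backends can be turned into a rough cost estimate. The sketch below is only an approximation under stated assumptions: it pro-rates linearly by the second (the actual billing granularity is not stated here), assumes 30 fps content, and ignores volume discounts. The name estimate_cost and its parameters are hypothetical.

```python
# USD per minute, keyed by (backend, above_4k, return_scenes).
# Values are copied from the prices listed under Backends.
PRICE_PER_MINUTE = {
    ("base", False, False): 0.05,
    ("base", False, True): 0.10,
    ("base", True, False): 0.10,
    ("base", True, True): 0.20,
    ("turbo", False, False): 0.02,
    ("turbo", False, True): 0.07,
    ("turbo", True, False): 0.04,
    ("turbo", True, True): 0.14,
}


def estimate_cost(
    duration_seconds: float,
    backend: str = "base",
    above_4k: bool = False,
    return_scenes: bool = False,
) -> float:
    """Estimate the processing cost in USD.

    Assumption: cost scales linearly with duration; real billing may round
    differently.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    key = (backend, above_4k, return_scenes)
    if key not in PRICE_PER_MINUTE:
        raise ValueError(f"unknown backend: {backend!r}")
    return PRICE_PER_MINUTE[key] * duration_seconds / 60.0


# Example: a 10-minute 1080p video on each backend, without returning clips.
base_cost = estimate_cost(600, backend="base")
turbo_cost = estimate_cost(600, backend="turbo")
```

Under these assumptions the 10-minute example comes to roughly $0.50 on base and $0.20 on turbo.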

[Side-by-side example: backend="base" vs. backend="turbo"]

For longer (>30 min) videos, or videos with subtle transitions or fade effects, we recommend using base. If processing speed is more important than detecting every transition, we recommend using turbo.
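
The recommendation above can be expressed as a small helper. This is a sketch of the stated guidance rather than anything the API does for you: the function name choose_backend and the speed_over_completeness flag are hypothetical, and the limits are taken from the video parameter description (minimum 3 seconds; maximum 30 minutes for turbo).

```python
MIN_VIDEO_SECONDS = 3.0  # minimum input length
TURBO_MAX_SECONDS = 30 * 60  # turbo accepts videos up to 30 minutes


def choose_backend(duration_seconds: float, speed_over_completeness: bool = False) -> str:
    """Pick "base" or "turbo" following the guidance in this section.

    speed_over_completeness: True when faster processing matters more than
    catching every transition (including subtle fades).
    """
    if duration_seconds < MIN_VIDEO_SECONDS:
        raise ValueError("video must be at least 3 seconds long")
    if duration_seconds > TURBO_MAX_SECONDS:
        return "base"  # too long for turbo
    return "turbo" if speed_over_completeness else "base"


# Example: a 45-minute video always falls back to base.
backend_for_long_video = choose_backend(45 * 60, speed_over_completeness=True)
```

Because base is the default backend and handles fades more accurately, the helper only returns turbo when the caller explicitly trades completeness for speed and the video fits within the 30-minute limit.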

Notes

  • Discounts are available for high volume users. Please reach out to sales@sievedata.com or via Discord for more information.