RTSP APIs
The rtsp component provides a flexible, multi-codec RTSP streaming framework
for ESP32 devices. It supports MJPEG, H.264, and generic audio codecs through
an extensible packetizer/depacketizer architecture. The component handles RTP
packet splitting and reassembly; encoding and decoding of media data is handled
externally by the application.
How RTSP Works
The component uses a split control-plane / media-plane design:
RTSP over TCP handles session control such as OPTIONS, DESCRIBE, SETUP, PLAY, PAUSE, and TEARDOWN.
SDP returned from DESCRIBE tells the client what tracks exist, how they are encoded, and which per-track control URLs must be used for SETUP.
RTP/UDP carries encoded media packets after playback starts.
RTCP/UDP sockets are created alongside RTP sockets, but the current ESPP implementation keeps RTCP support lightweight and does not yet implement a full control/feedback plane.
sequenceDiagram
participant App as Application
participant Server as RtspServer / RtspSession
participant Client as RtspClient
App->>Server: add_track() / send_frame()
Client->>Server: OPTIONS
Server-->>Client: 200 OK
Client->>Server: DESCRIBE
Server-->>Client: SDP with session + track control paths
Client->>Server: SETUP(trackID=n, client_port=RTP-RTCP)
Server-->>Client: Session + Transport headers
Client->>Server: PLAY
Server-->>Client: 200 OK
Server-->>Client: RTP/UDP packets for each active track
Client-->>App: on_jpeg_frame() or on_frame(track_id, data)
Client->>Server: TEARDOWN
Server-->>Client: 200 OK
In ESPP, the server generates one SDP description per session, with one
m=... section and one a=control:.../trackID=N entry per registered
track. The client parses those lines during describe() and then issues
SETUP once per discovered track before calling PLAY.
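As a rough illustration of the track discovery described above, the following sketch extracts one entry per m= section together with its a=control: value. It is not the parser used by RtspClient; the SdpTrack struct and the parse_sdp_tracks helper are hypothetical names introduced only for this example, and the sample SDP in the usage note is made up.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-track metadata a client keeps after DESCRIBE.
struct SdpTrack {
  std::string media;     // "video" or "audio", taken from the m= line
  int payload_type = -1; // first format number listed on the m= line
  std::string control;   // value of the track's a=control: attribute
};

// Walk the SDP line by line: each m= line opens a new track, and a following
// a=control: line is attached to the most recently opened track. Session-level
// a=control: lines (before the first m= line) are ignored.
std::vector<SdpTrack> parse_sdp_tracks(const std::string &sdp) {
  std::vector<SdpTrack> tracks;
  std::istringstream stream(sdp);
  std::string line;
  while (std::getline(stream, line)) {
    if (!line.empty() && line.back() == '\r') {
      line.pop_back(); // SDP lines end in CRLF
    }
    if (line.rfind("m=", 0) == 0) {
      // m=<media> <port> <proto> <fmt> ...
      std::istringstream fields(line.substr(2));
      SdpTrack track;
      std::string port, proto;
      fields >> track.media >> port >> proto >> track.payload_type;
      tracks.push_back(track);
    } else if (line.rfind("a=control:", 0) == 0 && !tracks.empty()) {
      tracks.back().control = line.substr(10); // skip "a=control:"
    }
  }
  return tracks;
}
```

Feeding it an SDP body with a video section (payload type 26) and an audio section (payload type 97) yields two entries whose control strings are the URLs a client would pass to SETUP.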
Packetization Pipeline
The codec-specific logic is intentionally separated from the RTSP core:
flowchart LR
Frame["Encoded frame bytes"] --> Packetizer["Codec packetizer"]
Packetizer --> Chunks["RTP payload chunks"]
Chunks --> Header["RtspServer adds RTP headers"]
Header --> Session["RtspSession sends UDP packets"]
Session --> ClientRtp["RtspClient RTP socket"]
ClientRtp --> Depacketizer["Codec depacketizer"]
Depacketizer --> Callback["Application callback"]
RtspServer::send_frame(track_id, data) asks the selected packetizer to split
the encoded frame into MTU-sized chunks, adds RTP headers with track-specific
SSRC and sequence numbers, and leaves the resulting packets queued for active
sessions to transmit. On the client side, RtspClient::handle_rtp_packet()
parses the RTP header, uses the payload type to find the matching depacketizer,
and emits a completed frame through either on_jpeg_frame or the generic
on_frame(track_id, data) callback.
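To make the "adds RTP headers" and "parses the RTP header" steps concrete, here is a minimal sketch of the 12-byte fixed RTP header from RFC 3550, written and read back without CSRC entries or extensions. RtpHeaderFields, write_rtp_header, and read_rtp_header are illustrative names, not the ESPP RtpPacket API.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Illustrative container for the fields this example cares about.
struct RtpHeaderFields {
  bool marker;              // set on the last packet of a frame
  uint8_t payload_type;     // 7-bit payload type, e.g. 26 for MJPEG
  uint16_t sequence_number; // increments by one per packet
  uint32_t timestamp;       // in units of the codec clock rate
  uint32_t ssrc;            // identifies the stream (track)
};

// Serialize the fixed RTP header in network (big-endian) byte order.
std::array<uint8_t, 12> write_rtp_header(const RtpHeaderFields &h) {
  assert(h.payload_type < 128);
  std::array<uint8_t, 12> out{};
  out[0] = 0x80; // version 2, no padding, no extension, zero CSRCs
  out[1] = static_cast<uint8_t>((h.marker ? 0x80 : 0x00) | h.payload_type);
  out[2] = static_cast<uint8_t>(h.sequence_number >> 8);
  out[3] = static_cast<uint8_t>(h.sequence_number & 0xFF);
  for (int i = 0; i < 4; ++i) {
    out[4 + i] = static_cast<uint8_t>(h.timestamp >> (24 - 8 * i));
    out[8 + i] = static_cast<uint8_t>(h.ssrc >> (24 - 8 * i));
  }
  return out;
}

// Parse the same fields back out of a received header.
RtpHeaderFields read_rtp_header(const std::array<uint8_t, 12> &in) {
  RtpHeaderFields h;
  h.marker = (in[1] & 0x80) != 0;
  h.payload_type = static_cast<uint8_t>(in[1] & 0x7F);
  h.sequence_number = static_cast<uint16_t>((in[2] << 8) | in[3]);
  h.timestamp = (static_cast<uint32_t>(in[4]) << 24) | (static_cast<uint32_t>(in[5]) << 16) |
                (static_cast<uint32_t>(in[6]) << 8) | static_cast<uint32_t>(in[7]);
  h.ssrc = (static_cast<uint32_t>(in[8]) << 24) | (static_cast<uint32_t>(in[9]) << 16) |
           (static_cast<uint32_t>(in[10]) << 8) | static_cast<uint32_t>(in[11]);
  return h;
}
```

The payload type read from the second byte is the value a receiver would use to pick a depacketizer, and the marker bit is what typically signals the last packet of a frame.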
Legacy MJPEG Compatibility
For backward compatibility, the component still preserves the older MJPEG-only behavior:
RtspServer::send_frame(std::span<const uint8_t>) lazily creates a default track 0 and uses the legacy RFC 2435-compatible MJPEG wire format.
RtspClient automatically creates an MjpegDepacketizer when a JPEG callback is registered and payload type 26 is discovered in SDP.
This means older single-track MJPEG integrations can keep working while newer
multi-track applications use add_track() plus codec-specific packetizers.
RTSP Client
The RtspClient class connects to an RTSP server and receives media streams
over RTP/UDP. It dispatches incoming RTP packets to codec-specific
depacketizers based on payload type.
For backward compatibility, setting the on_jpeg_frame callback
automatically creates an MjpegDepacketizer for MJPEG streams (payload type
26). For generic multi-track use, applications can use the on_frame callback
and inspect parsed SDP metadata through tracks().
The client now supports:
generic on_frame(track_id, data) callbacks for multi-track sessions
parsed SDP track metadata including media type, payload type, codec name, sample rate, channel count, and resolved control path
automatic depacketizer selection for MJPEG, H.264, and generic payloads discovered during DESCRIBE
an on_connection_lost callback for reconnect / rediscovery workflows when the RTSP control socket or RTP stream disappears after playback starts
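The payload-type dispatch described in this section can be pictured as a small lookup table from payload type to handler. The sketch below is only a model of that idea under simplified assumptions: PayloadDispatcher and its methods are hypothetical, and a plain std::function stands in for a depacketizer object.

```cpp
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical model of "find the matching depacketizer by payload type".
class PayloadDispatcher {
public:
  using Handler = std::function<void(const std::vector<uint8_t> &payload)>;

  // Associate a handler with an RTP payload type (e.g. 26 for MJPEG).
  void register_handler(int payload_type, Handler handler) {
    handlers_[payload_type] = std::move(handler);
  }

  // Forward the payload to the registered handler.
  // Returns false when no handler is registered for this payload type.
  bool dispatch(int payload_type, const std::vector<uint8_t> &payload) const {
    auto it = handlers_.find(payload_type);
    if (it == handlers_.end() || !it->second) {
      return false;
    }
    it->second(payload);
    return true;
  }

private:
  std::unordered_map<int, Handler> handlers_;
};
```

Returning false for unknown payload types lets the caller log or count packets that belong to a track it did not set up, rather than silently dropping them.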
RTSP Server
The RtspServer class accepts RTSP connections and streams media over
RTP/UDP. It supports multiple tracks, each with its own codec-specific
packetizer, SSRC, and sequence numbering.
For backward compatibility, calling send_frame(const JpegFrame&) lazily
creates a default MJPEG track. For other codecs, register tracks via
add_track() and send frames with send_frame(track_id, data).
The server also exposes helpers that are useful for embedded capture loops:
configurable accept, session-dispatch, and per-session control task stack sizes
has_active_sessions() to avoid capturing when no client is actively playing
get_capture_cooldown() and get_recommended_capture_period() so an application can slow capture when RTP backpressure is observed
a legacy MJPEG send_frame(std::span<const uint8_t>) path that preserves the older wire format for existing MJPEG-only users
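One way a capture loop might combine these helpers is sketched below. The three method names and return types match the RtspServer reference on this page, but the decision policy, the CaptureDecision struct, the plan_capture function, and the FakeServer stand-in are assumptions made for this example so that it compiles without the ESPP headers.

```cpp
#include <algorithm>
#include <chrono>

using std::chrono::milliseconds;

// Hypothetical result type: whether to capture now, and how long to wait next.
struct CaptureDecision {
  bool capture;
  milliseconds wait;
};

// Decide what the capture loop should do on this iteration.
// Server only needs has_active_sessions(), get_capture_cooldown(), and
// get_recommended_capture_period().
template <typename Server>
CaptureDecision plan_capture(Server &server, milliseconds base_period, milliseconds idle_poll) {
  if (!server.has_active_sessions()) {
    return {false, idle_poll}; // nobody is playing: skip capture, poll again later
  }
  const milliseconds cooldown = server.get_capture_cooldown();
  if (cooldown > milliseconds::zero()) {
    return {false, cooldown}; // backpressure: wait out the cooldown first
  }
  // Capture now, then pace the loop at the slower of the two periods.
  return {true, std::max(base_period, server.get_recommended_capture_period())};
}

// Stand-in used only to exercise the sketch; not part of ESPP.
struct FakeServer {
  bool active;
  milliseconds cooldown;
  milliseconds recommended;
  bool has_active_sessions() { return active; }
  milliseconds get_capture_cooldown() { return cooldown; }
  milliseconds get_recommended_capture_period() { return recommended; }
};
```

In a real loop, a true capture flag would be followed by grabbing a frame and calling send_frame(), and the wait value would feed the task's sleep.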
RTP Packetizers & Depacketizers
The packetizer/depacketizer abstraction allows the server and client to support multiple media codecs without changing the RTSP core. Concrete implementations are provided for:
MJPEG (MjpegPacketizer / MjpegDepacketizer): RFC 2435 JPEG over RTP
H.264 (H264Packetizer / H264Depacketizer): RFC 6184 with FU-A fragmentation
Generic (GenericPacketizer / GenericDepacketizer): MTU chunking for audio or other pre-encoded payloads, with frame reconstruction based on RTP marker / timestamp boundaries
Custom packetizers can be created by subclassing RtpPacketizer or
RtpDepacketizer.
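For readers unfamiliar with the FU-A fragmentation mentioned in the H.264 entry, the following sketch shows the RFC 6184 scheme for a single NAL unit: the original NAL header byte is replaced by an FU indicator and an FU header carrying start and end bits. It is a standalone illustration, not the H264Packetizer implementation; fragment_nal_fu_a is a hypothetical name, and the input is assumed to be one NAL unit without an Annex B start code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Split one NAL unit into RTP payloads of at most max_payload bytes.
std::vector<std::vector<uint8_t>> fragment_nal_fu_a(const std::vector<uint8_t> &nal,
                                                    size_t max_payload) {
  assert(max_payload > 2); // FU indicator + FU header + at least one data byte
  std::vector<std::vector<uint8_t>> payloads;
  if (nal.empty()) {
    return payloads;
  }
  if (nal.size() <= max_payload) {
    payloads.push_back(nal); // small enough: send as a single NAL unit packet
    return payloads;
  }
  const uint8_t nal_header = nal[0];
  // FU indicator keeps the F and NRI bits and uses type 28 (FU-A).
  const uint8_t fu_indicator = static_cast<uint8_t>((nal_header & 0xE0) | 28);
  const uint8_t nal_type = static_cast<uint8_t>(nal_header & 0x1F);
  const size_t chunk = max_payload - 2;
  size_t offset = 1; // the original NAL header byte is not repeated in fragments
  while (offset < nal.size()) {
    const size_t len = std::min(chunk, nal.size() - offset);
    uint8_t fu_header = nal_type;
    if (offset == 1) {
      fu_header |= 0x80; // S bit: first fragment
    }
    if (offset + len == nal.size()) {
      fu_header |= 0x40; // E bit: last fragment
    }
    std::vector<uint8_t> payload;
    payload.reserve(len + 2);
    payload.push_back(fu_indicator);
    payload.push_back(fu_header);
    const auto first = nal.begin() + static_cast<std::ptrdiff_t>(offset);
    payload.insert(payload.end(), first, first + static_cast<std::ptrdiff_t>(len));
    payloads.push_back(std::move(payload));
    offset += len;
  }
  return payloads;
}
```

A receiver reverses the process by rebuilding the NAL header from the F/NRI bits of the FU indicator and the type bits of the FU header, then appending fragment data until it sees the E bit.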
Relevant Specifications
These are the main standards to keep beside the code when working on this component:
| Specification | Why it matters here |
|---|---|
|  | Primary control-plane reference for the RTSP/1.0 request and response flow implemented by |
|  | Useful background for newer RTSP deployments; informative here because the current component speaks RTSP/1.0 on the wire. |
|  | Defines RTP headers, timestamps, sequence numbers, SSRC handling, and the RTCP control protocol model used by the transport layer. |
|  | Describes the SDP |
|  | Defines common RTP payload-type and clock-rate conventions used alongside dynamic payloads. |
|  | Reference for the MJPEG packetization and depacketization path. |
|  | Reference for the H.264 FU-A fragmentation and reassembly path. |
Testing and Utilities
There are several ways to exercise the RTSP stack:
ESPP Python library: espp/lib contains the build scripts and bindings used to expose the RTSP client / server classes to Python.
Python harness scripts: espp/python contains wrapper and multitrack scripts for exercising legacy MJPEG flows, generic multi-track flows, live microphone audio, and end-to-end host validation.
Embedded examples and downstream apps: the component example plus repositories such as camera-streamer and camera-display cover practical server/client integrations.
See python/README.md in the repository root for more information on the
host-side scripts.
Example
The RTSP Example page demonstrates several RTSP usage patterns selected via menuconfig, including:
legacy MJPEG server + client behavior
server-only MJPEG streaming
client-only MJPEG reception
API-level packetizer / depacketizer exercises
multi-track streaming with MJPEG video plus generic audio
For more complete integrations, see the camera-streamer and camera-display repositories.
API Reference
Header File
Classes
-
class RtspClient : public espp::BaseComponent
A class for interacting with an RTSP server using RTP and RTCP over UDP
This class is used to connect to an RTSP server and receive JPEG frames over RTP. It uses the TCP socket to send RTSP requests and receive RTSP responses. It uses the UDP socket to receive RTP and RTCP packets.
The RTSP client is designed to be used with the RTSP server in the camera-streamer (https://github.com/esp-cpp/camera-streamer) project, but it should work with any RTSP server that sends JPEG frames over RTP.
RtspClient Example
// Create the client; on_jpeg_frame is called for every reassembled JPEG frame.
espp::RtspClient rtsp_client({
    .server_address = ip_address,
    .rtsp_port = CONFIG_RTSP_SERVER_PORT,
    .path = "/mjpeg/1",
    .on_jpeg_frame = [](std::shared_ptr<espp::JpegFrame> jpeg_frame) {
      fmt::print("Got JPEG frame of size {}x{}\n", jpeg_frame->get_width(),
                 jpeg_frame->get_height());
    },
    .log_level = espp::Logger::Verbosity::ERROR,
});

std::error_code ec;

// Keep retrying the connection once per second until it succeeds.
do {
  ec.clear();
  rtsp_client.connect(ec);
  if (ec) {
    logger.error("Error connecting to server: {}", ec.message());
    logger.info("Retrying in 1s...");
    std::this_thread::sleep_for(1s);
  }
} while (ec);

// Fetch the SDP description of the stream.
rtsp_client.describe(ec);
if (ec) {
  logger.error("Error describing server: {}", ec.message());
}

// Set up the RTP/RTCP transport with the default ports.
rtsp_client.setup(ec);
if (ec) {
  logger.error("Error setting up server: {}", ec.message());
}

// Start playback; frames now arrive through on_jpeg_frame.
rtsp_client.play(ec);
if (ec) {
  logger.error("Error playing server: {}", ec.message());
}
Public Types
-
typedef std::function<void(std::shared_ptr<espp::JpegFrame> jpeg_frame)> jpeg_frame_callback_t
Function type for the callback to call when a JPEG frame is received.
-
using frame_callback_t = std::function<void(int track_id, std::vector<uint8_t> &&data)>
Generic frame callback — called for any track/codec with raw frame data.
-
using disconnect_callback_t = std::function<void(void)>
Callback invoked when the RTSP server disappears after playback starts.
Public Functions
-
explicit RtspClient(const espp::RtspClient::Config &config)
Constructor
- Parameters:
config – The configuration for the RTSP client
-
~RtspClient()
Destructor. Disconnects from the RTSP server.
-
std::string send_request(const std::string &method, const std::string &path, const std::unordered_map<std::string, std::string> &extra_headers, std::error_code &ec)
Send an RTSP request to the server
Note
This is a blocking call
Note
This will parse the response and set the session ID if it is present in the response. If the response is not a 200 OK, then an error code will be set and the response will be returned. If the response is a 200 OK, then the response will be returned and the error code will be set to success.
- Parameters:
method – The method to use for connecting. Options are “OPTIONS”, “DESCRIBE”, “SETUP”, “PLAY”, and “TEARDOWN”
path – The path to the RTSP stream on the server.
extra_headers – Any extra headers to send with the request, added after the CSeq and Session headers. The key is the header name and the value is the header value; for example, {“Accept”: “application/sdp”} adds “Accept: application/sdp” to the request. The “User-Agent”, “CSeq”, “Session”, and “Accept” headers are added automatically, as is the “Transport” header for the “SETUP” method. Defaults to an empty map.
ec – The error code to set if an error occurs
- Returns:
The response from the server
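The header ordering described for send_request can be illustrated with a small request builder. This is a sketch of the RTSP/1.0 text format rather than the client's actual implementation: build_rtsp_request is a hypothetical helper, it takes the CSeq and session ID as explicit arguments, it omits the automatically added User-Agent header, and it uses std::map instead of std::unordered_map so the header order is deterministic.

```cpp
#include <map>
#include <string>

// Assemble an RTSP/1.0 request: request line, CSeq, optional Session,
// then any extra headers, terminated by an empty line.
std::string build_rtsp_request(const std::string &method, const std::string &uri, int cseq,
                               const std::string &session_id,
                               const std::map<std::string, std::string> &extra_headers) {
  std::string request = method + " " + uri + " RTSP/1.0\r\n";
  request += "CSeq: " + std::to_string(cseq) + "\r\n";
  if (!session_id.empty()) {
    request += "Session: " + session_id + "\r\n"; // only known after SETUP
  }
  for (const auto &header : extra_headers) {
    request += header.first + ": " + header.second + "\r\n";
  }
  request += "\r\n"; // blank line ends the header section
  return request;
}
```

For a DESCRIBE request the session ID is still empty, so only the CSeq and Accept headers follow the request line.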
-
void connect(std::error_code &ec)
Connect to the RTSP server. Connects to the RTSP server and sends the OPTIONS request.
- Parameters:
ec – The error code to set if an error occurs
-
void disconnect(std::error_code &ec)
Disconnect from the RTSP server. Disconnects from the RTSP server and sends the TEARDOWN request.
- Parameters:
ec – The error code to set if an error occurs
-
void describe(std::error_code &ec)
Describe the RTSP stream. Sends the DESCRIBE request to the RTSP server and parses the response.
- Parameters:
ec – The error code to set if an error occurs
-
void setup(std::error_code &ec)
Setup the RTSP stream
Note
Starts the RTP and RTCP threads. Sends the SETUP request to the RTSP server and parses the response.
Note
The default ports are 5000 and 5001 for RTP and RTCP respectively.
Note
The default receive timeout is 5 seconds.
- Parameters:
ec – The error code to set if an error occurs
-
void setup(size_t rtp_port, size_t rtcp_port, const std::chrono::duration<float> &receive_timeout, std::error_code &ec)
Setup the RTSP stream. Sends the SETUP request to the RTSP server and parses the response.
Note
Starts the RTP and RTCP threads.
- Parameters:
rtp_port – The RTP client port
rtcp_port – The RTCP client port
receive_timeout – The timeout for receiving RTP and RTCP packets
ec – The error code to set if an error occurs
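The RTP and RTCP client ports travel in the Transport header of the SETUP request. The sketch below builds such a header value and reads a port pair back out of a response, following the RFC 2326 syntax. The exact header string the ESPP client sends is an assumption here, and make_transport_header and parse_port_pair are hypothetical helpers.

```cpp
#include <exception>
#include <string>

// Build a unicast RTP/AVP Transport header value for the given client ports.
std::string make_transport_header(int rtp_port, int rtcp_port) {
  return "RTP/AVP;unicast;client_port=" + std::to_string(rtp_port) + "-" +
         std::to_string(rtcp_port);
}

// Look for "<key>=<a>-<b>" in a Transport header value.
// Returns true and fills first/second when the pair is present and numeric.
bool parse_port_pair(const std::string &transport, const std::string &key, int &first,
                     int &second) {
  const std::string needle = key + "=";
  std::string::size_type pos = transport.find(needle);
  if (pos == std::string::npos) {
    return false;
  }
  pos += needle.size();
  const std::string::size_type end = transport.find(';', pos);
  const std::string value =
      transport.substr(pos, end == std::string::npos ? std::string::npos : end - pos);
  const std::string::size_type dash = value.find('-');
  if (dash == std::string::npos) {
    return false;
  }
  try {
    first = std::stoi(value.substr(0, dash));
    second = std::stoi(value.substr(dash + 1));
  } catch (const std::exception &) {
    return false; // non-numeric or out-of-range port text
  }
  return true;
}
```

With the default ports noted for setup(), the request value would read RTP/AVP;unicast;client_port=5000-5001, and a server that echoes a server_port pair can be parsed with the same helper.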
Register a depacketizer for a specific RTP payload type. When RTP packets with this payload type are received, they are dispatched to the registered depacketizer.
- Parameters:
payload_type – The RTP payload type (e.g., 26 for MJPEG, 96 for H264)
depacketizer – The depacketizer to handle packets of this type
-
void play(std::error_code &ec)
Play the RTSP stream. Sends the PLAY request to the RTSP server and parses the response.
- Parameters:
ec – The error code to set if an error occurs
-
void pause(std::error_code &ec)
Pause the RTSP stream. Sends the PAUSE request to the RTSP server and parses the response.
- Parameters:
ec – The error code to set if an error occurs
-
void teardown(std::error_code &ec)
Teardown the RTSP stream. Sends the TEARDOWN request to the RTSP server and parses the response.
- Parameters:
ec – The error code to set if an error occurs
-
inline const std::vector<TrackInfo> &tracks() const
Get the parsed SDP track descriptions from the most recent DESCRIBE call.
- Returns:
The ordered set of discovered media tracks.
-
inline const std::string &get_name() const
Get the name of the component
Note
This is the tag of the logger
- Returns:
A const reference to the name of the component
-
inline void set_log_tag(const std::string_view &tag)
Set the tag for the logger
- Parameters:
tag – The tag to use for the logger
-
inline espp::Logger::Verbosity get_log_level() const
Get the log level for the logger
- Returns:
The verbosity level of the logger
-
inline void set_log_level(espp::Logger::Verbosity level)
Set the log level for the logger
- Parameters:
level – The verbosity level to use for the logger
-
inline void set_log_verbosity(espp::Logger::Verbosity level)
Set the log verbosity for the logger
Note
This is a convenience method that calls set_log_level
- Parameters:
level – The verbosity level to use for the logger
-
inline espp::Logger::Verbosity get_log_verbosity() const
Get the log verbosity for the logger
Note
This is a convenience method that calls get_log_level
- Returns:
The verbosity level of the logger
-
inline void set_log_rate_limit(std::chrono::duration<float> rate_limit)
Set the rate limit for the logger
Note
Only calls to the logger that have _rate_limit suffix will be rate limited
- Parameters:
rate_limit – The rate limit to use for the logger
-
struct Config
Configuration for the RTSP client.
Public Members
-
std::string server_address
The server IP Address to connect to.
-
int rtsp_port = {8554}
The port of the RTSP server.
-
std::string path = {"/mjpeg/1"}
The path to the RTSP stream on the server. Will be appended to the server address and port to form the full path of the form “rtsp://<server_address>:<rtsp_port><path>”
-
frame_callback_t on_frame = {nullptr}
Generic frame callback for any codec (track_id, raw frame data)
-
jpeg_frame_callback_t on_jpeg_frame = {nullptr}
JPEG-specific frame callback (backward compatible). If set and no depacketizer is registered for PT 26, an MjpegDepacketizer is automatically created.
-
disconnect_callback_t on_connection_lost = {nullptr}
Called once if the client loses the server after playback starts. This callback is intended for applications that want to stop playback and re-enter service discovery or reconnect logic automatically.
-
struct TrackInfo
Header File
Classes
-
class RtspServer : public espp::BaseComponent
Class for streaming MJPEG data from a camera using RTSP + RTP. Starts a TCP socket to listen for RTSP connections, and then spawns off a new RTSP session for each connection.
See also
RtspServer example
const std::string server_uri =
    fmt::format("rtsp://{}:{}/mjpeg/1", ip_address, CONFIG_RTSP_SERVER_PORT);
logger.info("Starting RTSP Server on port {}", CONFIG_RTSP_SERVER_PORT);
logger.info("RTSP URI: {}", server_uri);

// Create and start the server; sessions are spawned as clients connect.
espp::RtspServer rtsp_server({
    .server_address = ip_address,
    .port = CONFIG_RTSP_SERVER_PORT,
    .path = "/mjpeg/1",
    .log_level = espp::Logger::Verbosity::INFO,
});
rtsp_server.start();

// Wrap the raw JPEG bytes in a JpegFrame, which parses the JPEG header.
std::span<const uint8_t> frame_data(reinterpret_cast<const uint8_t *>(jpeg_data),
                                    sizeof(jpeg_data));
espp::JpegFrame jpeg_frame(frame_data);
logger.info("Parsed JPEG image, num bytes: {}", jpeg_frame.get_data().size());
logger.info("Created frame of size {}x{}", jpeg_frame.get_width(), jpeg_frame.get_height());

// Queue the frame on the default MJPEG track.
rtsp_server.send_frame(jpeg_frame);
Note
This class does not currently send RTCP packets
Public Functions
-
explicit RtspServer(const espp::RtspServer::Config &config)
Construct an RTSP server.
- Parameters:
config – The configuration for the RTSP server
-
~RtspServer()
Destroy the RTSP server.
-
void set_session_log_level(espp::Logger::Verbosity log_level)
Sets the log level for the RTSP sessions created by this server.
Note
This does not affect the log level of the RTSP server itself
Note
This does not change the log level of any sessions that have already been created
- Parameters:
log_level – The log level to set
-
bool start(const std::chrono::duration<float> &accept_timeout = std::chrono::seconds(5))
Start the RTSP server. Starts the accept task, session task, and binds the RTSP socket.
- Parameters:
accept_timeout – The timeout for accepting new connections
- Returns:
True if the server was started successfully, false otherwise
-
void stop()
Stop the RTSP server. Stops the accept task, session task, and closes the RTSP socket.
-
void add_track(const TrackConfig &config)
Register a media track with the server. Each track has its own packetizer, SSRC, and sequence number.
- Parameters:
config – Track configuration including the packetizer.
-
bool has_active_sessions()
Returns true when at least one session is actively playing.
- Returns:
True if an active RTSP session is ready to receive RTP packets.
-
std::chrono::milliseconds get_capture_cooldown()
Returns how long capture should wait before queueing another frame.
- Returns:
Remaining RTP backpressure cooldown, or zero if sending may resume.
-
std::chrono::milliseconds get_recommended_capture_period()
Returns the minimum recommended period between captured frames.
- Returns:
Recommended capture period based on recent RTP backpressure history.
-
void send_frame(int track_id, std::span<const uint8_t> frame_data)
Send a frame on a specific track. The track’s packetizer splits the frame into RTP payload chunks, which are then wrapped with RTP headers and queued for delivery.
Note
Overwrites any existing pending packets for this track.
- Parameters:
track_id – The track to send on.
frame_data – Raw encoded frame data.
-
void send_frame(const espp::JpegFrame &frame)
Send a JPEG frame over the RTSP connection (backward compatible). If no tracks have been added, lazily creates a default MJPEG track on track 0. Uses the legacy RtpJpegPacket packetization to preserve the exact wire format for existing MJPEG users.
Note
Overwrites any existing frame that has not been sent.
- Parameters:
frame – The frame to send.
-
void send_frame(std::span<const uint8_t> frame_data)
Send raw JPEG bytes over the default MJPEG track. Uses the legacy MJPEG RTP packetization path without copying the frame into an intermediate JpegFrame object.
Note
Overwrites any existing frame that has not been sent.
- Parameters:
frame_data – Complete JPEG bytes, including header and EOI marker.
-
inline const std::string &get_name() const
Get the name of the component
Note
This is the tag of the logger
- Returns:
A const reference to the name of the component
-
inline void set_log_tag(const std::string_view &tag)
Set the tag for the logger
- Parameters:
tag – The tag to use for the logger
-
inline espp::Logger::Verbosity get_log_level() const
Get the log level for the logger
- Returns:
The verbosity level of the logger
-
inline void set_log_level(espp::Logger::Verbosity level)
Set the log level for the logger
- Parameters:
level – The verbosity level to use for the logger
-
inline void set_log_verbosity(espp::Logger::Verbosity level)
Set the log verbosity for the logger
Note
This is a convenience method that calls set_log_level
- Parameters:
level – The verbosity level to use for the logger
-
inline espp::Logger::Verbosity get_log_verbosity() const
Get the log verbosity for the logger
Note
This is a convenience method that calls get_log_level
- Returns:
The verbosity level of the logger
-
inline void set_log_rate_limit(std::chrono::duration<float> rate_limit)
Set the rate limit for the logger
Note
Only calls to the logger that have _rate_limit suffix will be rate limited
- Parameters:
rate_limit – The rate limit to use for the logger
-
struct Config
Configuration for the RTSP server.
Public Members
-
std::string server_address
The ip address of the server.
-
int port
The port to listen on.
-
std::string path
The path to the RTSP stream.
-
size_t max_data_size = 1000
The maximum size of RTP packet data for the MJPEG stream. Frames will be broken up into multiple packets if they are larger than this. It seems that 1500 works well for sending, but is too large for the esp32 (camera-display) to receive properly.
-
espp::Logger::Verbosity log_level = espp::Logger::Verbosity::WARN
The log level for the RTSP server.
-
size_t accept_task_stack_size_bytes = default_accept_task_stack_size_bytes
RTSP accept-task stack size, in bytes.
-
size_t session_task_stack_size_bytes = default_session_task_stack_size_bytes
RTSP session-dispatch task stack size, in bytes.
-
size_t control_task_stack_size_bytes = RtspSession::Config::default_control_task_stack_size_bytes
Per-session RTSP control-task stack size, in bytes
-
struct TrackConfig
Configuration for a media track to be registered with the server.
Public Members
-
int track_id = {0}
Track identifier.
-
std::shared_ptr<espp::RtpPacketizer> packetizer
Codec-specific packetizer.
Header File
Classes
-
class RtspSession : public espp::BaseComponent
Class that represents an RTSP session, which is uniquely identified by a session ID and sends frame data over RTP and RTCP to the client.
Public Functions
Construct a new RtspSession object.
- Parameters:
control_socket – The control socket of the session
config – The configuration of the session
-
~RtspSession()
Destroy the RtspSession object. Stops the session task.
-
uint32_t get_session_id() const
Get the session id.
- Returns:
The session id
-
bool is_closed() const
Check if the session is closed.
- Returns:
True if the session is closed, false otherwise
-
bool is_connected() const
Get whether the session is connected
- Returns:
True if the session is connected, false otherwise
-
bool is_active() const
Get whether the session is active
- Returns:
True if the session is active, false otherwise
-
void play()
Mark the session as active. This will cause the server to start sending frames to the client.
-
void pause()
Pause the session. This will cause the server to stop sending frames to the client.
Note
This does not stop the session, it just pauses it
Note
This is useful for when the client is buffering
-
void teardown()
Teardown the session. This will cause the server to stop sending frames to the client and close the connection.
-
bool send_rtp_packet(int track_id, const espp::RtpPacket &packet)
Send an RTP packet on a specific track
- Parameters:
track_id – The track to send on
packet – The RTP packet to send
- Returns:
True if the packet was sent successfully, false otherwise
-
bool send_rtp_packet(int track_id, std::span<const uint8_t> packet_data)
Send a serialized RTP packet on a specific track.
- Parameters:
track_id – The track to send on
packet_data – Serialized RTP packet bytes
- Returns:
True if the packet was sent successfully, false otherwise
-
bool send_rtp_packet(const espp::RtpPacket &packet)
Send an RTP packet to the client (backward compat — sends on default track 0)
- Parameters:
packet – The RTP packet to send
- Returns:
True if the packet was sent successfully, false otherwise
-
bool send_rtp_packet(std::span<const uint8_t> packet_data)
Send a serialized RTP packet to the client (default track 0).
- Parameters:
packet_data – Serialized RTP packet bytes
- Returns:
True if the packet was sent successfully, false otherwise
-
bool send_rtcp_packet(int track_id, const espp::RtcpPacket &packet)
Send an RTCP packet on a specific track
- Parameters:
track_id – The track to send on
packet – The RTCP packet to send
- Returns:
True if the packet was sent successfully, false otherwise
-
bool send_rtcp_packet(const espp::RtcpPacket &packet)
Send an RTCP packet to the client (backward compat — sends on default track 0)
- Parameters:
packet – The RTCP packet to send
- Returns:
True if the packet was sent successfully, false otherwise
-
inline const std::string &get_name() const
Get the name of the component
Note
This is the tag of the logger
- Returns:
A const reference to the name of the component
-
inline void set_log_tag(const std::string_view &tag)
Set the tag for the logger
- Parameters:
tag – The tag to use for the logger
-
inline espp::Logger::Verbosity get_log_level() const
Get the log level for the logger
- Returns:
The verbosity level of the logger
-
inline void set_log_level(espp::Logger::Verbosity level)
Set the log level for the logger
- Parameters:
level – The verbosity level to use for the logger
-
inline void set_log_verbosity(espp::Logger::Verbosity level)
Set the log verbosity for the logger
Note
This is a convenience method that calls set_log_level
- Parameters:
level – The verbosity level to use for the logger
-
inline espp::Logger::Verbosity get_log_verbosity() const
Get the log verbosity for the logger
Note
This is a convenience method that calls get_log_level
- Returns:
The verbosity level of the logger
-
inline void set_log_rate_limit(std::chrono::duration<float> rate_limit)
Set the rate limit for the logger
Note
Only calls to the logger that have _rate_limit suffix will be rate limited
- Parameters:
rate_limit – The rate limit to use for the logger
-
struct Config
Configuration for the RTSP session.
Public Members
-
std::string server_address
The address of the server.
-
std::string rtsp_path
The RTSP path of the session.
-
std::chrono::duration<float> receive_timeout = std::chrono::seconds(5)
The timeout for receiving data. Should be > 0.
-
size_t control_task_stack_size_bytes = default_control_task_stack_size_bytes
RTSP control-task stack size, in bytes
-
std::function<std::string(const std::string &session_path, uint32_t session_id, const std::string &server_address)> sdp_generator
SDP generator callback. If set, called during DESCRIBE to produce the SDP body. If not set, a default MJPEG SDP is generated for backward compatibility.
- Param session_path:
Full RTSP path (e.g., “rtsp://ip:port/path”)
- Param session_id:
The session ID
- Param server_address:
The server address with port
-
struct Track
Represents one media track within an RTSP session.
Header File
Classes
-
class RtpPacketizer : public espp::BaseComponent
Abstract base class for splitting media frames into RTP payload chunks. Concrete packetizers (e.g. MJPEG, H.264) override the pure-virtual methods to produce codec-specific payloads. The RTSP server wraps each returned RtpPayloadChunk with an RTP header before sending.
Subclassed by espp::GenericPacketizer, espp::H264Packetizer, espp::MjpegPacketizer
Public Functions
-
inline explicit RtpPacketizer(const Config &config, const std::string &name)
Construct an RtpPacketizer.
- Parameters:
config – The configuration for this packetizer.
name – A human-readable name used for logging.
-
virtual ~RtpPacketizer() = default
Destructor.
-
virtual std::vector<RtpPayloadChunk> packetize(std::span<const uint8_t> frame_data) = 0
Packetize a complete media frame into RTP payload chunks.
- Parameters:
frame_data – The raw frame bytes to packetize.
- Returns:
A vector of RtpPayloadChunk ready to be wrapped in RTP packets.
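As a simplified picture of what a packetize() implementation for a generic payload might do, the sketch below cuts a frame into fixed-size pieces and flags the last one. The fields of the real RtpPayloadChunk are not shown on this page, so PayloadChunk and chunk_frame are hypothetical stand-ins, and the sketch copies bytes into vectors rather than using std::span.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical chunk type: payload bytes plus the RTP marker flag.
struct PayloadChunk {
  std::vector<uint8_t> data;
  bool marker; // true on the final chunk of the frame
};

// Split a frame into chunks of at most max_payload_size bytes.
std::vector<PayloadChunk> chunk_frame(const std::vector<uint8_t> &frame,
                                      size_t max_payload_size) {
  assert(max_payload_size > 0);
  std::vector<PayloadChunk> chunks;
  for (size_t offset = 0; offset < frame.size(); offset += max_payload_size) {
    const size_t len = std::min(max_payload_size, frame.size() - offset);
    const auto first = frame.begin() + static_cast<std::ptrdiff_t>(offset);
    PayloadChunk chunk;
    chunk.data.assign(first, first + static_cast<std::ptrdiff_t>(len));
    chunk.marker = (offset + len == frame.size());
    chunks.push_back(std::move(chunk));
  }
  return chunks;
}
```

With a 1000-byte limit, a 2500-byte frame becomes three chunks of 1000, 1000, and 500 bytes, and only the last chunk carries the marker.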
-
virtual int get_payload_type() const = 0
Get the RTP payload type number for this codec.
- Returns:
The RTP payload type (e.g. 26 for MJPEG, 96 for dynamic).
-
virtual uint32_t get_clock_rate() const = 0
Get the RTP clock rate for timestamp calculation.
- Returns:
The clock rate in Hz (e.g. 90000 for video, 8000 for audio).
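To show how a clock rate turns into RTP timestamps, the sketch below converts elapsed media time into clock ticks. The rtp_timestamp helper is hypothetical and is not how the server necessarily derives its timestamps; RTP timestamps are 32 bits wide, so the result wraps modulo 2^32.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Convert elapsed media time to RTP timestamp ticks for a given clock rate.
// initial_offset models the arbitrary starting timestamp of a stream.
uint32_t rtp_timestamp(uint32_t clock_rate_hz, std::chrono::microseconds elapsed,
                       uint32_t initial_offset = 0) {
  assert(elapsed.count() >= 0);
  const uint64_t ticks = static_cast<uint64_t>(elapsed.count()) * clock_rate_hz / 1000000ULL;
  return static_cast<uint32_t>(ticks + initial_offset); // truncation wraps at 2^32
}
```

At 90000 Hz, one second of video advances the timestamp by 90000 ticks, while a 20 ms audio frame at 8000 Hz advances it by 160 ticks.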
-
virtual std::string get_sdp_media_attributes() const = 0
Generate the SDP media-level attributes for this codec.
- Returns:
A string containing SDP a= lines (without trailing CRLF).
-
virtual std::string get_sdp_media_line() const = 0
Generate the SDP m= line for this codec.
- Returns:
A string containing the SDP m= line (without trailing CRLF).
-
inline const std::string &get_name() const
Get the name of the component
Note
This is the tag of the logger
- Returns:
A const reference to the name of the component
-
inline void set_log_tag(const std::string_view &tag)
Set the tag for the logger
- Parameters:
tag – The tag to use for the logger
-
inline espp::Logger::Verbosity get_log_level() const
Get the log level for the logger
- Returns:
The verbosity level of the logger
-
inline void set_log_level(espp::Logger::Verbosity level)
Set the log level for the logger
- Parameters:
level – The verbosity level to use for the logger
-
inline void set_log_verbosity(espp::Logger::Verbosity level)
Set the log verbosity for the logger
Note
This is a convenience method that calls set_log_level
- Parameters:
level – The verbosity level to use for the logger
-
inline espp::Logger::Verbosity get_log_verbosity() const
Get the log verbosity for the logger
Note
This is a convenience method that calls get_log_level
- Returns:
The verbosity level of the logger
-
inline void set_log_rate_limit(std::chrono::duration<float> rate_limit)
Set the rate limit for the logger
Note
Only calls to the logger that have _rate_limit suffix will be rate limited
- Parameters:
rate_limit – The rate limit to use for the logger
-
struct Config
Configuration for RtpPacketizer.
Header File
Classes
-
class RtpDepacketizer : public espp::BaseComponent
Abstract base class for reassembling media frames from incoming RTP packets. Concrete depacketizers (e.g. MJPEG, H.264) override process_packet() to accumulate payload data and invoke the frame callback when a complete frame has been assembled.
Subclassed by espp::GenericDepacketizer, espp::H264Depacketizer, espp::MjpegDepacketizer
Public Types
-
using frame_callback_t = std::function<void(std::vector<uint8_t>&&)>
Callback type invoked when a complete frame has been reassembled. The frame data is moved into the callback to avoid copies.
Public Functions
-
inline explicit RtpDepacketizer(const Config &config, const std::string &name)
Construct an RtpDepacketizer.
- Parameters:
config – The configuration for this depacketizer.
name – A human-readable name used for logging.
-
virtual ~RtpDepacketizer() = default
Destructor.
-
virtual void process_packet(const RtpPacket &packet) = 0
Process an incoming RTP packet, accumulating payload data. When a complete frame is assembled the frame callback is invoked.
- Parameters:
packet – The RTP packet to process.
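A process_packet() implementation for a simple payload could look roughly like the sketch below, which accumulates payload bytes and hands the frame to a callback when the marker bit arrives. It takes the relevant header fields as plain arguments instead of an RtpPacket, MarkerReassembler is a hypothetical class, and the choice to drop a partial frame when the timestamp changes before a marker is an assumption of this sketch; the ESPP depacketizers may handle that case differently.

```cpp
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical reassembler keyed on the RTP marker bit and timestamp.
class MarkerReassembler {
public:
  using FrameCallback = std::function<void(std::vector<uint8_t> &&)>;

  explicit MarkerReassembler(FrameCallback callback) : callback_(std::move(callback)) {}

  // Feed one packet's timestamp, marker bit, and payload bytes.
  void process(uint32_t timestamp, bool marker, const std::vector<uint8_t> &payload) {
    if (!buffer_.empty() && timestamp != current_timestamp_) {
      // A new frame started before the previous one was completed
      // (its marker packet was probably lost): discard the partial data.
      buffer_.clear();
    }
    current_timestamp_ = timestamp;
    buffer_.insert(buffer_.end(), payload.begin(), payload.end());
    if (marker) {
      std::vector<uint8_t> frame;
      frame.swap(buffer_); // leaves buffer_ empty for the next frame
      if (callback_ && !frame.empty()) {
        callback_(std::move(frame)); // moved out to avoid a copy
      }
    }
  }

private:
  FrameCallback callback_;
  std::vector<uint8_t> buffer_;
  uint32_t current_timestamp_ = 0;
};
```

Moving the completed frame into the callback mirrors the frame_callback_t signature above, which takes the data by rvalue reference.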
-
inline void set_frame_callback(frame_callback_t cb)
Set the callback for completed frames.
- Parameters:
cb – The callback to invoke when a full frame is ready.
-
inline const std::string &get_name() const
Get the name of the component
Note
This is the tag of the logger
- Returns:
A const reference to the name of the component
-
inline void set_log_tag(const std::string_view &tag)
Set the tag for the logger
- Parameters:
tag – The tag to use for the logger
-
inline espp::Logger::Verbosity get_log_level() const
Get the log level for the logger
- Returns:
The verbosity level of the logger
-
inline void set_log_level(espp::Logger::Verbosity level)
Set the log level for the logger
- Parameters:
level – The verbosity level to use for the logger
-
inline void set_log_verbosity(espp::Logger::Verbosity level)
Set the log verbosity for the logger
Note
This is a convenience method that calls set_log_level
- Parameters:
level – The verbosity level to use for the logger
-
inline espp::Logger::Verbosity get_log_verbosity() const
Get the log verbosity for the logger
Note
This is a convenience method that calls get_log_level
- Returns:
The verbosity level of the logger
-
inline void set_log_rate_limit(std::chrono::duration<float> rate_limit)
Set the rate limit for the logger
Note
Only calls to the logger that have _rate_limit suffix will be rate limited
- Parameters:
rate_limit – The rate limit to use for the logger
-
struct Config
Configuration for RtpDepacketizer.
Header File
Header File
Classes
-
class MjpegPacketizer : public espp::RtpPacketizer
MJPEG packetizer that fragments JPEG frames into RFC 2435 RTP payloads.
This class takes complete JPEG frames and produces RTP payload chunks suitable for MJPEG streaming. Each chunk contains an RFC 2435 MJPEG header, and the first chunk additionally includes quantization tables.
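The RFC 2435 main JPEG header that precedes the scan data in each chunk is 8 bytes long. The sketch below packs one under that layout. `make_jpeg_main_header` and its parameter names are hypothetical and are not part of espp, whose internal packing code may differ.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an espp API): pack the 8-byte RFC 2435 main header.
inline std::array<uint8_t, 8> make_jpeg_main_header(uint8_t type_specific,
                                                    uint32_t fragment_offset, uint8_t type,
                                                    uint8_t q, int width_px, int height_px) {
  // The fragment offset is a 24-bit field. Width and height are stored in
  // units of 8 pixels, one byte each.
  assert(fragment_offset < (1u << 24));
  assert(width_px % 8 == 0 && width_px / 8 <= 255);
  assert(height_px % 8 == 0 && height_px / 8 <= 255);
  return {type_specific,
          static_cast<uint8_t>((fragment_offset >> 16) & 0xFF),
          static_cast<uint8_t>((fragment_offset >> 8) & 0xFF),
          static_cast<uint8_t>(fragment_offset & 0xFF),
          type,
          q,
          static_cast<uint8_t>(width_px / 8),
          static_cast<uint8_t>(height_px / 8)};
}
```

For a 320x240 frame the last two bytes are 40 and 30. Per RFC 2435, a Q value of 128 or above signals that the first chunk of the frame also carries a quantization table header.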
Public Functions
-
inline explicit MjpegPacketizer(const Config &config)
Construct an MJPEG packetizer.
- Parameters:
config – Configuration for the packetizer.
-
virtual std::vector<RtpPayloadChunk> packetize(std::span<const uint8_t> frame_data) override
Packetize a complete JPEG frame into RFC 2435 RTP payload chunks.
- Parameters:
frame_data – Raw JPEG data including the JPEG header.
- Returns:
Vector of payload chunks ready to be wrapped in RTP packets.
-
virtual int get_payload_type() const override
Get the RTP payload type for MJPEG.
- Returns:
26 (static JPEG payload type).
-
virtual uint32_t get_clock_rate() const override
Get the RTP clock rate for MJPEG.
- Returns:
90000 Hz.
-
virtual std::string get_sdp_media_attributes() const override
Get the SDP media attributes for MJPEG.
- Returns:
SDP rtpmap attribute string.
-
virtual std::string get_sdp_media_line() const override
Get the SDP media line for MJPEG.
- Returns:
SDP media description line.
-
inline const std::string &get_name() const
Get the name of the component
Note
This is the tag of the logger
- Returns:
A const reference to the name of the component
-
inline void set_log_tag(const std::string_view &tag)
Set the tag for the logger
- Parameters:
tag – The tag to use for the logger
-
inline espp::Logger::Verbosity get_log_level() const
Get the log level for the logger
- Returns:
The verbosity level of the logger
-
inline void set_log_level(espp::Logger::Verbosity level)
Set the log level for the logger
- Parameters:
level – The verbosity level to use for the logger
-
inline void set_log_verbosity(espp::Logger::Verbosity level)
Set the log verbosity for the logger
Note
This is a convenience method that calls set_log_level
- Parameters:
level – The verbosity level to use for the logger
-
inline espp::Logger::Verbosity get_log_verbosity() const
Get the log verbosity for the logger
Note
This is a convenience method that calls get_log_level
- Returns:
The verbosity level of the logger
-
inline void set_log_rate_limit(std::chrono::duration<float> rate_limit)
Set the rate limit for the logger
Note
Only calls to the logger that have _rate_limit suffix will be rate limited
- Parameters:
rate_limit – The rate limit to use for the logger
-
struct Config
Configuration for the MJPEG packetizer.
Header File
Classes
-
class MjpegDepacketizer : public espp::RtpDepacketizer
MJPEG depacketizer that reassembles JPEG frames from RTP packets.
This class receives individual RTP packets containing RFC 2435 MJPEG payloads, reassembles the scan data fragments, reconstructs the JPEG header from the MJPEG header fields, and delivers complete JPEG frames through callbacks.
Public Types
-
using jpeg_frame_callback_t = std::function<void(std::shared_ptr<JpegFrame>)>
Callback type for receiving complete JPEG frames as JpegFrame objects.
-
using frame_callback_t = std::function<void(std::vector<uint8_t>&&)>
Callback type invoked when a complete frame has been reassembled. The frame data is moved into the callback to avoid copies.
Public Functions
-
inline explicit MjpegDepacketizer(const Config &config)
Construct an MJPEG depacketizer.
- Parameters:
config – Configuration for the depacketizer.
-
virtual void process_packet(const RtpPacket &packet) override
Process an incoming RTP packet containing MJPEG data.
Note
Packets are parsed as RtpJpegPacket. When a complete frame is assembled (marker bit set and no missing sequence numbers), both the generic frame callback and the JPEG frame callback are invoked.
- Parameters:
packet – The RTP packet to process.
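The "no missing sequence numbers" condition can be sketched as a contiguity check over the 16-bit RTP sequence numbers seen for one frame. `sequence_is_contiguous` is a hypothetical helper, not an espp function, and the component may track gaps differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an espp API): true when every sequence number is
// exactly one greater than its predecessor, modulo 2^16.
inline bool sequence_is_contiguous(const std::vector<uint16_t> &seqs) {
  for (std::size_t i = 1; i < seqs.size(); ++i) {
    // Truncating the difference to 16 bits makes 65535 -> 0 count as a step of one.
    const uint16_t step = static_cast<uint16_t>(seqs[i] - seqs[i - 1]);
    if (step != 1) {
      return false;
    }
  }
  return true;
}
```

The cast back to `uint16_t` is what keeps the check correct across sequence number wraparound.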
-
void set_jpeg_frame_callback(jpeg_frame_callback_t cb)
Set callback for receiving complete JPEG frames.
- Parameters:
cb – Callback receiving a shared pointer to the completed JpegFrame.
-
inline void set_frame_callback(frame_callback_t cb)
Set the callback for completed frames.
- Parameters:
cb – The callback to invoke when a full frame is ready.
-
inline const std::string &get_name() const
Get the name of the component
Note
This is the tag of the logger
- Returns:
A const reference to the name of the component
-
inline void set_log_tag(const std::string_view &tag)
Set the tag for the logger
- Parameters:
tag – The tag to use for the logger
-
inline espp::Logger::Verbosity get_log_level() const
Get the log level for the logger
- Returns:
The verbosity level of the logger
-
inline void set_log_level(espp::Logger::Verbosity level)
Set the log level for the logger
- Parameters:
level – The verbosity level to use for the logger
-
inline void set_log_verbosity(espp::Logger::Verbosity level)
Set the log verbosity for the logger
Note
This is a convenience method that calls set_log_level
- Parameters:
level – The verbosity level to use for the logger
-
inline espp::Logger::Verbosity get_log_verbosity() const
Get the log verbosity for the logger
Note
This is a convenience method that calls get_log_level
- Returns:
The verbosity level of the logger
-
inline void set_log_rate_limit(std::chrono::duration<float> rate_limit)
Set the rate limit for the logger
Note
Only calls to the logger that have _rate_limit suffix will be rate limited
- Parameters:
rate_limit – The rate limit to use for the logger
-
struct Config
Configuration for the MJPEG depacketizer.
Header File
Classes
-
class H264Packetizer : public espp::RtpPacketizer
RTP packetizer for H.264 video per RFC 6184.
Accepts H.264 access units in Annex B byte-stream format (NAL units separated by 0x00000001 or 0x000001 start codes) and produces a sequence of RTP payload chunks suitable for transmission.
Supports two NAL-unit packetization strategies:
Single NAL unit mode: used when the NAL fits within max_payload_size.
FU-A fragmentation: used when the NAL exceeds max_payload_size (requires packetization_mode >= 1).
Example
    // Synthetic SPS and PPS (minimal valid-ish NAL units)
    std::vector<uint8_t> sps = {0x67, 0x42, 0xC0, 0x1E, 0xD9, 0x00, 0xA0, 0x47, 0xFE, 0xC8};
    std::vector<uint8_t> pps = {0x68, 0xCE, 0x38, 0x80};
    espp::H264Packetizer h264_packer({
        .max_payload_size = 1400,
        .payload_type = 96,
        .profile_level_id = "42C01E",
        .packetization_mode = 1,
        .sps = sps,
        .pps = pps,
    });
    check(h264_packer.get_payload_type() == 96, "H264Packetizer: payload type is 96");
    check(h264_packer.get_clock_rate() == 90000, "H264Packetizer: clock rate is 90000");
    auto sdp_attrs = h264_packer.get_sdp_media_attributes();
    check(sdp_attrs.find("H264/90000") != std::string::npos,
          "H264Packetizer: SDP contains H264/90000");
    check(sdp_attrs.find("profile-level-id=42C01E") != std::string::npos,
          "H264Packetizer: SDP contains profile-level-id");
    check(sdp_attrs.find("sprop-parameter-sets=") != std::string::npos,
          "H264Packetizer: SDP contains SPS/PPS base64");
    // Create a synthetic H.264 access unit in Annex B format:
    // Start code + SPS + Start code + PPS + Start code + small IDR slice
    std::vector<uint8_t> annex_b_frame;
    // SPS NAL
    annex_b_frame.insert(annex_b_frame.end(), {0x00, 0x00, 0x00, 0x01});
    annex_b_frame.insert(annex_b_frame.end(), sps.begin(), sps.end());
    // PPS NAL
    annex_b_frame.insert(annex_b_frame.end(), {0x00, 0x00, 0x00, 0x01});
    annex_b_frame.insert(annex_b_frame.end(), pps.begin(), pps.end());
    // IDR slice NAL (type 5), filled with dummy data
    annex_b_frame.insert(annex_b_frame.end(), {0x00, 0x00, 0x00, 0x01});
    annex_b_frame.push_back(0x65); // NAL header: type=5 (IDR)
    for (int i = 0; i < 100; i++) {
      annex_b_frame.push_back(static_cast<uint8_t>(i & 0xFF));
    }
    auto chunks = h264_packer.packetize(
        std::span<const uint8_t>(annex_b_frame.data(), annex_b_frame.size()));
    check(!chunks.empty(), "H264Packetizer: produced chunks from Annex B data");
    check(chunks.back().marker, "H264Packetizer: last chunk has marker bit");
    // With small NALs, all should be single NAL mode (no FU-A needed)
    check(chunks.size() == 3, "H264Packetizer: 3 chunks for SPS+PPS+IDR");
Note
This class does not manage RTP headers (sequence numbers, timestamps, SSRC). The caller wraps each returned chunk into an RtpPacket.
Public Functions
-
explicit H264Packetizer(const Config &config)
Construct an H264Packetizer.
- Parameters:
config – The configuration for the packetizer.
-
~H264Packetizer() override = default
Destructor.
-
virtual std::vector<RtpPayloadChunk> packetize(std::span<const uint8_t> frame_data) override
Packetize a complete H.264 access unit (Annex B format).
The input may contain multiple NAL units separated by 3-byte or 4-byte start codes. Each NAL is individually packetized (single NAL or FU-A). The marker bit is set on the last chunk of the last NAL unit in the access unit.
- Parameters:
frame_data – Raw Annex B byte-stream of one access unit.
- Returns:
Vector of RTP payload chunks ready for transmission.
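Splitting an Annex B access unit into NAL units comes down to scanning for 3-byte start codes and treating any zero bytes immediately before one as padding. The sketch below illustrates that scan. `split_annex_b` is a hypothetical helper, not an espp function, and the actual parser inside H264Packetizer may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an espp API): split an Annex B byte stream into
// NAL units with start codes removed.
inline std::vector<std::vector<uint8_t>> split_annex_b(const std::vector<uint8_t> &data) {
  std::vector<std::vector<uint8_t>> nals;
  const std::size_t n = data.size();
  std::size_t nal_start = n; // n means "no NAL unit opened yet"
  std::size_t i = 0;
  while (i + 3 <= n) {
    if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
      if (nal_start != n) {
        std::size_t end = i;
        // Zero bytes right before a start code belong to a 4-byte start code
        // or to trailing padding, not to the NAL unit.
        while (end > nal_start && data[end - 1] == 0) {
          --end;
        }
        if (end > nal_start) {
          nals.emplace_back(data.begin() + nal_start, data.begin() + end);
        }
      }
      nal_start = i + 3;
      i += 3;
    } else {
      ++i;
    }
  }
  if (nal_start < n) {
    nals.emplace_back(data.begin() + nal_start, data.end());
  }
  return nals;
}
```

Each returned vector starts with the NAL header byte, which is the form packetize_nal() expects according to its parameter description.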
-
std::vector<RtpPayloadChunk> packetize_nal(std::span<const uint8_t> nal_data, bool is_last_nal = true)
Packetize a single pre-parsed NAL unit (no start code prefix).
- Parameters:
nal_data – The raw NAL unit bytes (including NAL header byte).
is_last_nal – If true, the marker bit is set on the last chunk.
- Returns:
Vector of RTP payload chunks for this NAL.
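The choice between single NAL unit mode and FU-A fragmentation can be sketched as follows, using the FU indicator and FU header layout from RFC 6184. `Chunk` and `fragment_nal` are hypothetical stand-ins for RtpPayloadChunk and packetize_nal(); the sketch assumes FU-A is allowed (packetization_mode >= 1) and is not the espp implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for RtpPayloadChunk.
struct Chunk {
  std::vector<uint8_t> data;
  bool marker = false;
};

// Hypothetical helper (not an espp API): single NAL if it fits, else FU-A.
inline std::vector<Chunk> fragment_nal(const std::vector<uint8_t> &nal, std::size_t max_payload,
                                       bool is_last_nal) {
  assert(!nal.empty() && max_payload > 2);
  std::vector<Chunk> out;
  if (nal.size() <= max_payload) {
    out.push_back({nal, is_last_nal}); // single NAL unit packet
    return out;
  }
  // FU indicator keeps F and NRI from the NAL header and uses type 28 (FU-A).
  const uint8_t indicator = static_cast<uint8_t>((nal[0] & 0xE0) | 28);
  const uint8_t nal_type = static_cast<uint8_t>(nal[0] & 0x1F);
  const std::size_t frag_size = max_payload - 2; // room left after the 2 FU bytes
  std::size_t pos = 1;                           // the NAL header byte is not repeated
  while (pos < nal.size()) {
    const std::size_t len = std::min(frag_size, nal.size() - pos);
    const bool first = (pos == 1);
    const bool last = (pos + len == nal.size());
    Chunk c;
    c.data.push_back(indicator);
    // FU header: S bit on the first fragment, E bit on the last, then the type.
    c.data.push_back(static_cast<uint8_t>((first ? 0x80 : 0) | (last ? 0x40 : 0) | nal_type));
    c.data.insert(c.data.end(), nal.begin() + pos, nal.begin() + pos + len);
    c.marker = last && is_last_nal;
    out.push_back(std::move(c));
    pos += len;
  }
  return out;
}
```

An 11-byte IDR NAL with a 6-byte payload limit yields three fragments of 6, 6, and 4 bytes, with the marker set only on the last one.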
-
void set_sps_pps(std::span<const uint8_t> sps, std::span<const uint8_t> pps)
Update the SPS and PPS used for SDP generation.
- Parameters:
sps – Sequence Parameter Set raw bytes.
pps – Picture Parameter Set raw bytes.
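In SDP, SPS and PPS are advertised in the `sprop-parameter-sets` parameter as comma-separated base64 strings (RFC 6184). The sketch below builds that parameter value. `base64_encode` and `make_sprop_parameter_sets` are hypothetical helpers, not espp functions, and the exact fmtp string espp emits may be laid out differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: standard base64 with '=' padding.
inline std::string base64_encode(const std::vector<uint8_t> &in) {
  static const char kAlphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  for (std::size_t i = 0; i < in.size(); i += 3) {
    const std::size_t remaining = in.size() - i; // at least 1 here
    const uint32_t b0 = in[i];
    const uint32_t b1 = remaining > 1 ? in[i + 1] : 0;
    const uint32_t b2 = remaining > 2 ? in[i + 2] : 0;
    const uint32_t triple = (b0 << 16) | (b1 << 8) | b2;
    out.push_back(kAlphabet[(triple >> 18) & 0x3F]);
    out.push_back(kAlphabet[(triple >> 12) & 0x3F]);
    out.push_back(remaining > 1 ? kAlphabet[(triple >> 6) & 0x3F] : '=');
    out.push_back(remaining > 2 ? kAlphabet[triple & 0x3F] : '=');
  }
  return out;
}

// Hypothetical helper (not an espp API): build the sprop-parameter-sets value.
inline std::string make_sprop_parameter_sets(const std::vector<uint8_t> &sps,
                                             const std::vector<uint8_t> &pps) {
  return "sprop-parameter-sets=" + base64_encode(sps) + "," + base64_encode(pps);
}
```

Both inputs are raw NAL bytes without start codes, matching the sps and pps fields of the Config struct.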
-
virtual int get_payload_type() const override
Get the RTP payload type.
- Returns:
The dynamic payload type configured for H.264.
-
virtual uint32_t get_clock_rate() const override
Get the RTP clock rate for H.264 video.
- Returns:
90000 (fixed for H.264).
-
virtual std::string get_sdp_media_attributes() const override
Get the SDP attribute lines for H.264.
- Returns:
SDP a= lines (rtpmap and fmtp) without trailing CRLF.
-
virtual std::string get_sdp_media_line() const override
Get the SDP m= media line for H.264.
- Returns:
SDP m= line without trailing CRLF.
-
inline const std::string &get_name() const
Get the name of the component
Note
This is the tag of the logger
- Returns:
A const reference to the name of the component
-
inline void set_log_tag(const std::string_view &tag)
Set the tag for the logger
- Parameters:
tag – The tag to use for the logger
-
inline espp::Logger::Verbosity get_log_level() const
Get the log level for the logger
- Returns:
The verbosity level of the logger
-
inline void set_log_level(espp::Logger::Verbosity level)
Set the log level for the logger
- Parameters:
level – The verbosity level to use for the logger
-
inline void set_log_verbosity(espp::Logger::Verbosity level)
Set the log verbosity for the logger
Note
This is a convenience method that calls set_log_level
- Parameters:
level – The verbosity level to use for the logger
-
inline espp::Logger::Verbosity get_log_verbosity() const
Get the log verbosity for the logger
Note
This is a convenience method that calls get_log_level
- Returns:
The verbosity level of the logger
-
inline void set_log_rate_limit(std::chrono::duration<float> rate_limit)
Set the rate limit for the logger
Note
Only calls to the logger that have _rate_limit suffix will be rate limited
- Parameters:
rate_limit – The rate limit to use for the logger
-
struct Config
Configuration for the H264Packetizer.
Public Members
-
size_t max_payload_size = {1400}
Maximum payload bytes per RTP packet.
-
int payload_type = {96}
Dynamic RTP payload type (typically 96–127).
-
std::string profile_level_id
H.264 profile-level-id hex string, e.g. “42C01E”.
-
int packetization_mode = {1}
0 = single NAL only, 1 = non-interleaved (FU-A allowed).
-
std::vector<uint8_t> sps
Sequence Parameter Set raw bytes (without start code).
-
std::vector<uint8_t> pps
Picture Parameter Set raw bytes (without start code).
Header File
Classes
-
class H264Depacketizer : public espp::RtpDepacketizer
RTP depacketizer for H.264 video per RFC 6184.
Reassembles H.264 access units from incoming RTP packets. Supports:
Single NAL unit packets (NAL type 1–23)
STAP-A aggregation packets (NAL type 24)
FU-A fragmentation packets (NAL type 28)
When the RTP marker bit is set, the accumulated NAL units are delivered as one Annex B byte-stream (each NAL prefixed with 0x00 0x00 0x00 0x01) via the frame callback set with set_frame_callback().
Example
    espp::H264Depacketizer h264_depacker(espp::H264Depacketizer::Config{});
    bool frame_received = false;
    size_t frame_size = 0;
    h264_depacker.set_frame_callback([&](std::vector<uint8_t> &&data) {
      frame_received = true;
      frame_size = data.size();
      logger.info("H264Depacketizer: got frame of {} bytes", data.size());
      // Verify Annex B start codes are present
      if (data.size() >= 4) {
        bool has_start_code =
            (data[0] == 0x00 && data[1] == 0x00 && data[2] == 0x00 && data[3] == 0x01);
        logger.info("H264Depacketizer: Annex B start code present: {}", has_start_code);
      }
    });
    // Create synthetic single NAL packets and feed them
    std::vector<uint8_t> sps = {0x67, 0x42, 0xC0, 0x1E, 0xD9};
    std::vector<uint8_t> pps = {0x68, 0xCE, 0x38, 0x80};
    std::vector<uint8_t> idr = {0x65, 0x01, 0x02, 0x03, 0x04};
    auto make_rtp = [](const std::vector<uint8_t> &payload, int pt, uint16_t seq, bool marker) {
      espp::RtpPacket pkt(payload.size());
      pkt.set_version(2);
      pkt.set_payload_type(pt);
      pkt.set_sequence_number(seq);
      pkt.set_timestamp(0);
      pkt.set_ssrc(54321);
      pkt.set_marker(marker);
      pkt.set_payload(std::span<const uint8_t>(payload));
      pkt.serialize();
      return pkt;
    };
    h264_depacker.process_packet(make_rtp(sps, 96, 0, false));
    h264_depacker.process_packet(make_rtp(pps, 96, 1, false));
    h264_depacker.process_packet(make_rtp(idr, 96, 2, true)); // marker = end of AU
    check(frame_received, "H264Depacketizer: frame callback invoked");
    // Expected: 3 NALs with start codes = 3*(4) + 5+4+5 = 26 bytes
    check(frame_size > 0, "H264Depacketizer: frame has data");
Public Types
-
using frame_callback_t = std::function<void(std::vector<uint8_t>&&)>
Callback type invoked when a complete frame has been reassembled. The frame data is moved into the callback to avoid copies.
Public Functions
-
explicit H264Depacketizer(const Config &config)
Construct an H264Depacketizer.
- Parameters:
config – The configuration for the depacketizer.
-
~H264Depacketizer() override = default
Destructor.
-
virtual void process_packet(const RtpPacket &packet) override
Process an incoming RTP packet containing H.264 payload.
Handles single NAL, STAP-A, and FU-A packet types. NAL units are buffered until the RTP marker bit indicates the end of an access unit, at which point the complete Annex B frame is delivered via the callback.
- Parameters:
packet – The RTP packet to process.
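Of the three packet types, STAP-A is the one that carries several NAL units in a single payload: after the one-byte STAP-A header, each NAL is preceded by a 16-bit big-endian size (RFC 6184). The sketch below unpacks such a payload into Annex B form. `stap_a_to_annex_b` is a hypothetical helper, not an espp function, and its handling of truncated input is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an espp API): expand a STAP-A payload into
// Annex B bytes, prefixing each contained NAL with 00 00 00 01.
inline std::vector<uint8_t> stap_a_to_annex_b(const std::vector<uint8_t> &payload) {
  std::vector<uint8_t> out;
  if (payload.empty() || (payload[0] & 0x1F) != 24) {
    return out; // not a STAP-A packet
  }
  std::size_t pos = 1; // skip the STAP-A header byte
  while (pos + 2 <= payload.size()) {
    const std::size_t len =
        (static_cast<std::size_t>(payload[pos]) << 8) | payload[pos + 1];
    pos += 2;
    if (len == 0 || pos + len > payload.size()) {
      break; // assumption: stop at the first malformed or truncated entry
    }
    out.insert(out.end(), {0x00, 0x00, 0x00, 0x01});
    out.insert(out.end(), payload.begin() + pos, payload.begin() + pos + len);
    pos += len;
  }
  return out;
}
```

A STAP-A packet carrying a 2-byte SPS and a 1-byte PPS expands to 11 bytes: two 4-byte start codes plus the 3 NAL bytes.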
-
inline void set_frame_callback(frame_callback_t cb)
Set the callback for completed frames.
- Parameters:
cb – The callback to invoke when a full frame is ready.
-
inline const std::string &get_name() const
Get the name of the component
Note
This is the tag of the logger
- Returns:
A const reference to the name of the component
-
inline void set_log_tag(const std::string_view &tag)
Set the tag for the logger
- Parameters:
tag – The tag to use for the logger
-
inline espp::Logger::Verbosity get_log_level() const
Get the log level for the logger
- Returns:
The verbosity level of the logger
-
inline void set_log_level(espp::Logger::Verbosity level)
Set the log level for the logger
- Parameters:
level – The verbosity level to use for the logger
-
inline void set_log_verbosity(espp::Logger::Verbosity level)
Set the log verbosity for the logger
Note
This is a convenience method that calls set_log_level
- Parameters:
level – The verbosity level to use for the logger
-
inline espp::Logger::Verbosity get_log_verbosity() const
Get the log verbosity for the logger
Note
This is a convenience method that calls get_log_level
- Returns:
The verbosity level of the logger
-
inline void set_log_rate_limit(std::chrono::duration<float> rate_limit)
Set the rate limit for the logger
Note
Only calls to the logger that have _rate_limit suffix will be rate limited
- Parameters:
rate_limit – The rate limit to use for the logger
-
struct Config
Configuration for the H264Depacketizer.
Header File
Classes
-
class GenericPacketizer : public espp::RtpPacketizer
A generic RTP packetizer suitable for audio codecs (PCM, G.711, Opus, etc.) or any pre-formatted data that simply needs MTU-based chunking. It splits frame data into chunks of at most max_payload_size bytes and marks the last chunk with the RTP marker bit.
Example
    espp::GenericPacketizer generic_packer({
        .max_payload_size = 500,
        .payload_type = 97,
        .clock_rate = 48000,
        .encoding_name = "opus",
        .channels = 2,
        .fmtp = {},
        .media_type = espp::MediaType::AUDIO,
    });
    check(generic_packer.get_payload_type() == 97, "GenericPacketizer: payload type is 97");
    check(generic_packer.get_clock_rate() == 48000, "GenericPacketizer: clock rate is 48000");
    auto sdp_line = generic_packer.get_sdp_media_line();
    check(sdp_line.find("m=audio") != std::string::npos,
          "GenericPacketizer: SDP media line is audio");
    auto sdp_attrs = generic_packer.get_sdp_media_attributes();
    check(sdp_attrs.find("opus/48000/2") != std::string::npos,
          "GenericPacketizer: SDP has encoding/rate/channels");
    // Packetize 1200 bytes of synthetic audio
    std::vector<uint8_t> audio_data(1200, 0xAB);
    auto chunks = generic_packer.packetize(
        std::span<const uint8_t>(audio_data.data(), audio_data.size()));
    check(chunks.size() == 3, "GenericPacketizer: 1200 bytes @ 500 MTU = 3 chunks");
    check(chunks.back().marker, "GenericPacketizer: last chunk has marker");
    check(!chunks.front().marker, "GenericPacketizer: first chunk has no marker");
Public Functions
-
explicit GenericPacketizer(const Config &config)
Construct a GenericPacketizer.
- Parameters:
config – The configuration for this packetizer.
-
~GenericPacketizer() override = default
Destructor.
-
virtual std::vector<RtpPayloadChunk> packetize(std::span<const uint8_t> frame_data) override
Split frame data into RTP payload chunks of at most max_payload_size. The last (or only) chunk has its marker flag set.
- Parameters:
frame_data – The raw frame bytes to packetize.
- Returns:
A vector of RtpPayloadChunk ready to be wrapped in RTP packets.
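The chunking rule described above (fixed-size slices, marker on the final one) can be sketched in a few lines. `PayloadChunk` and `chunk_frame` are hypothetical stand-ins for RtpPayloadChunk and packetize(); returning no chunks for an empty frame is an assumption of this sketch, not documented espp behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for RtpPayloadChunk.
struct PayloadChunk {
  std::vector<uint8_t> data;
  bool marker = false;
};

// Hypothetical helper (not an espp API): slice a frame into chunks of at most
// max_payload_size bytes and set the marker on the last chunk only.
inline std::vector<PayloadChunk> chunk_frame(const std::vector<uint8_t> &frame,
                                             std::size_t max_payload_size) {
  assert(max_payload_size > 0);
  std::vector<PayloadChunk> chunks;
  for (std::size_t offset = 0; offset < frame.size(); offset += max_payload_size) {
    const std::size_t len = std::min(max_payload_size, frame.size() - offset);
    PayloadChunk chunk;
    chunk.data.assign(frame.begin() + offset, frame.begin() + offset + len);
    chunk.marker = (offset + len == frame.size());
    chunks.push_back(std::move(chunk));
  }
  return chunks;
}
```

With 1200 bytes and a 500-byte limit this produces chunks of 500, 500, and 200 bytes, which matches the chunk count checked in the example above.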
-
virtual int get_payload_type() const override
Get the RTP payload type number.
- Returns:
The configured RTP payload type.
-
virtual uint32_t get_clock_rate() const override
Get the RTP clock rate.
- Returns:
The configured clock rate in Hz.
-
virtual std::string get_sdp_media_attributes() const override
Generate the SDP media-level attribute lines for this codec. Produces an a=rtpmap line and, if fmtp is non-empty, an a=fmtp line.
- Returns:
A string containing the SDP a= lines.
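A sketch of how the a=rtpmap and optional a=fmtp lines can be assembled from the Config fields, following the SDP rtpmap syntax `<payload type> <encoding name>/<clock rate>[/<channels>]`. `make_sdp_attributes` is a hypothetical helper, not an espp function. Omitting the channel count for mono and joining lines with CRLF without a trailing CRLF are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not an espp API): build SDP media-level attribute lines.
inline std::string make_sdp_attributes(int payload_type, const std::string &encoding_name,
                                       uint32_t clock_rate, int channels,
                                       const std::string &fmtp) {
  std::string out = "a=rtpmap:" + std::to_string(payload_type) + " " + encoding_name + "/" +
                    std::to_string(clock_rate);
  if (channels > 1) {
    // Assumption: the channel count is appended only for multi-channel audio.
    out += "/" + std::to_string(channels);
  }
  if (!fmtp.empty()) {
    out += "\r\na=fmtp:" + std::to_string(payload_type) + " " + fmtp;
  }
  return out;
}
```

For payload type 97, encoding "opus", 48000 Hz, and 2 channels, the rtpmap line contains `opus/48000/2`, the substring checked in the GenericPacketizer example.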
-
virtual std::string get_sdp_media_line() const override
Generate the SDP m= line for this codec.
- Returns:
A string such as “m=audio 0 RTP/AVP 96”.
-
inline const std::string &get_name() const
Get the name of the component
Note
This is the tag of the logger
- Returns:
A const reference to the name of the component
-
inline void set_log_tag(const std::string_view &tag)
Set the tag for the logger
- Parameters:
tag – The tag to use for the logger
-
inline espp::Logger::Verbosity get_log_level() const
Get the log level for the logger
- Returns:
The verbosity level of the logger
-
inline void set_log_level(espp::Logger::Verbosity level)
Set the log level for the logger
- Parameters:
level – The verbosity level to use for the logger
-
inline void set_log_verbosity(espp::Logger::Verbosity level)
Set the log verbosity for the logger
Note
This is a convenience method that calls set_log_level
- Parameters:
level – The verbosity level to use for the logger
-
inline espp::Logger::Verbosity get_log_verbosity() const
Get the log verbosity for the logger
Note
This is a convenience method that calls get_log_level
- Returns:
The verbosity level of the logger
-
inline void set_log_rate_limit(std::chrono::duration<float> rate_limit)
Set the rate limit for the logger
Note
Only calls to the logger that have _rate_limit suffix will be rate limited
- Parameters:
rate_limit – The rate limit to use for the logger
-
struct Config
Configuration for GenericPacketizer.
Public Members
-
size_t max_payload_size = {1400}
Maximum payload bytes per RTP packet.
-
int payload_type = {96}
RTP payload type number.
-
uint32_t clock_rate = {48000}
Clock rate in Hz for RTP timestamps.
-
std::string encoding_name = {"L16"}
Encoding name for SDP rtpmap line.
-
int channels = {1}
Number of audio channels.
-
std::string fmtp
Optional format parameters for SDP fmtp line.
-
espp::MediaType media_type = {espp::MediaType::AUDIO}
Media type for the SDP m= line.
Header File
Classes
-
class GenericDepacketizer : public espp::RtpDepacketizer
A generic RTP depacketizer that reassembles media frames from incoming RTP packets. It accumulates payload data until a packet with the marker bit set is received, then delivers the complete frame via the frame callback. If a packet arrives with a different RTP timestamp than the current accumulation buffer, the old buffer is discarded and a new one is started.
This is suitable for audio codecs (PCM, G.711, Opus, etc.) or any payload format that uses simple marker-based framing.
Example
    espp::GenericDepacketizer generic_depacker(espp::GenericDepacketizer::Config{});
    bool audio_frame_received = false;
    size_t audio_frame_size = 0;
    generic_depacker.set_frame_callback([&](std::vector<uint8_t> &&data) {
      audio_frame_received = true;
      audio_frame_size = data.size();
    });
    // Packetize and depacketize audio data
    espp::GenericPacketizer generic_packer({.max_payload_size = 500,
                                            .payload_type = 97,
                                            .clock_rate = 48000,
                                            .encoding_name = "L16",
                                            .channels = 1,
                                            .fmtp = {},
                                            .media_type = espp::MediaType::AUDIO});
    std::vector<uint8_t> audio_data(1200, 0xCD);
    auto chunks = generic_packer.packetize(
        std::span<const uint8_t>(audio_data.data(), audio_data.size()));
    uint16_t seq = 0;
    for (auto &chunk : chunks) {
      espp::RtpPacket pkt(chunk.data.size());
      pkt.set_version(2);
      pkt.set_payload_type(97);
      pkt.set_sequence_number(seq++);
      pkt.set_timestamp(1000);
      pkt.set_ssrc(77777);
      pkt.set_marker(chunk.marker);
      pkt.set_payload(std::span<const uint8_t>(chunk.data));
      pkt.serialize();
      generic_depacker.process_packet(pkt);
    }
    check(audio_frame_received, "GenericDepacketizer: frame callback invoked");
    check(audio_frame_size == 1200, "GenericDepacketizer: round-trip frame size matches");
Public Types
-
using frame_callback_t = std::function<void(std::vector<uint8_t>&&)>
Callback type invoked when a complete frame has been reassembled. The frame data is moved into the callback to avoid copies.
Public Functions
-
explicit GenericDepacketizer(const Config &config)
Construct a GenericDepacketizer.
- Parameters:
config – The configuration for this depacketizer.
-
~GenericDepacketizer() override = default
Destructor.
-
virtual void process_packet(const RtpPacket &packet) override
Process an incoming RTP packet. Payload data is accumulated until a packet with the marker bit set is received. At that point the assembled frame is delivered via the frame callback and the buffer is reset.
- Parameters:
packet – The RTP packet to process.
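The accumulation rule, including the documented discard of a partial frame when the RTP timestamp changes, can be sketched as below. `MarkerReassembler` and `demo_frame_sizes` are hypothetical names and not part of espp; the real class takes an RtpPacket rather than separate fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a marker-based depacketizer (not an espp class).
class MarkerReassembler {
public:
  using Callback = std::function<void(std::vector<uint8_t> &&)>;
  explicit MarkerReassembler(Callback cb) : cb_(std::move(cb)) {}

  void process(uint32_t timestamp, bool marker, const std::vector<uint8_t> &payload) {
    // A new timestamp while data is pending means the old frame lost its
    // final packet, so the stale bytes are dropped.
    if (!buffer_.empty() && timestamp != timestamp_) {
      buffer_.clear();
    }
    timestamp_ = timestamp;
    buffer_.insert(buffer_.end(), payload.begin(), payload.end());
    if (marker) {
      if (cb_) {
        cb_(std::move(buffer_));
      }
      buffer_.clear(); // leave the buffer empty for the next frame
    }
  }

private:
  Callback cb_;
  std::vector<uint8_t> buffer_;
  uint32_t timestamp_ = 0;
};

// Usage: one frame is abandoned mid-way, the next two are delivered.
inline std::vector<std::size_t> demo_frame_sizes() {
  std::vector<std::size_t> sizes;
  MarkerReassembler r([&](std::vector<uint8_t> &&frame) { sizes.push_back(frame.size()); });
  r.process(1000, false, {1, 2, 3}); // never completed
  r.process(2000, false, {4, 5});
  r.process(2000, true, {6, 7, 8, 9});
  r.process(3000, true, {10});
  return sizes;
}
```

Only the 6-byte and 1-byte frames reach the callback; the three bytes received at timestamp 1000 are discarded when the timestamp changes to 2000.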
-
inline void set_frame_callback(frame_callback_t cb)
Set the callback for completed frames.
- Parameters:
cb – The callback to invoke when a full frame is ready.
-
inline const std::string &get_name() const
Get the name of the component
Note
This is the tag of the logger
- Returns:
A const reference to the name of the component
-
inline void set_log_tag(const std::string_view &tag)
Set the tag for the logger
- Parameters:
tag – The tag to use for the logger
-
inline espp::Logger::Verbosity get_log_level() const
Get the log level for the logger
- Returns:
The verbosity level of the logger
-
inline void set_log_level(espp::Logger::Verbosity level)
Set the log level for the logger
- Parameters:
level – The verbosity level to use for the logger
-
inline void set_log_verbosity(espp::Logger::Verbosity level)
Set the log verbosity for the logger
Note
This is a convenience method that calls set_log_level
- Parameters:
level – The verbosity level to use for the logger
-
inline espp::Logger::Verbosity get_log_verbosity() const
Get the log verbosity for the logger
Note
This is a convenience method that calls get_log_level
- Returns:
The verbosity level of the logger
-
inline void set_log_rate_limit(std::chrono::duration<float> rate_limit)
Set the rate limit for the logger
Note
Only calls to the logger that have _rate_limit suffix will be rate limited
- Parameters:
rate_limit – The rate limit to use for the logger
-
struct Config
Configuration for GenericDepacketizer.
Header File
Classes
-
class RtpPacket
RtpPacket parses and serializes RTP packets. The RTP header fields are stored in the class and can be modified. The payload is stored in the packet_ vector and can also be modified.
Subclassed by espp::RtpJpegPacket
Public Functions
-
RtpPacket()
Construct an empty RtpPacket. The packet_ vector is empty and the header fields are set to 0.
-
explicit RtpPacket(size_t payload_size)
Construct an RtpPacket with a payload of size payload_size. The packet_ vector is resized to RTP_HEADER_SIZE + payload_size.
-
explicit RtpPacket(std::span<const uint8_t> data)
Construct an RtpPacket from a span of bytes. Stores the bytes in the packet_ vector and parses the header.
- Parameters:
data – The span of bytes to parse.
-
~RtpPacket()
Destructor.
-
int get_version() const
Get the RTP version.
- Returns:
The RTP version.
-
bool get_padding() const
Get the padding flag.
- Returns:
The padding flag.
-
bool get_extension() const
Get the extension flag.
- Returns:
The extension flag.
-
int get_csrc_count() const
Get the CSRC count.
- Returns:
The CSRC count.
-
bool get_marker() const
Get the marker flag.
- Returns:
The marker flag.
-
int get_payload_type() const
Get the payload type.
- Returns:
The payload type.
-
int get_sequence_number() const
Get the sequence number.
- Returns:
The sequence number.
-
int get_timestamp() const
Get the timestamp.
- Returns:
The timestamp.
-
int get_ssrc() const
Get the SSRC.
- Returns:
The SSRC.
-
void set_version(int version)
Set the RTP version.
- Parameters:
version – The RTP version to set.
-
void set_padding(bool padding)
Set the padding flag.
- Parameters:
padding – The padding flag to set.
-
void set_extension(bool extension)
Set the extension flag.
- Parameters:
extension – The extension flag to set.
-
void set_csrc_count(int csrc_count)
Set the CSRC count.
- Parameters:
csrc_count – The CSRC count to set.
-
void set_marker(bool marker)
Set the marker flag.
- Parameters:
marker – The marker flag to set.
-
void set_payload_type(int payload_type)
Set the payload type.
- Parameters:
payload_type – The payload type to set.
-
void set_sequence_number(int sequence_number)
Set the sequence number.
- Parameters:
sequence_number – The sequence number to set.
-
void set_timestamp(int timestamp)
Set the timestamp.
- Parameters:
timestamp – The timestamp to set.
-
void set_ssrc(int ssrc)
Set the SSRC.
- Parameters:
ssrc – The SSRC to set.
-
void serialize()
Serialize the RTP header.
Note
This method should be called after modifying the RTP header fields.
Note
This method does not serialize the payload. To set the payload, use set_payload(). To get the payload, use get_payload().
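For reference, the fixed RTP header defined by RFC 3550 is 12 bytes when there are no CSRC entries. The sketch below writes the header fields into that layout. `RtpHeaderFields` and `serialize_rtp_header` are hypothetical and only illustrate the wire format; they are not the RtpPacket implementation.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical plain struct mirroring the header fields RtpPacket exposes.
struct RtpHeaderFields {
  int version = 2;
  bool padding = false;
  bool extension = false;
  int csrc_count = 0;
  bool marker = false;
  int payload_type = 0;
  uint16_t sequence_number = 0;
  uint32_t timestamp = 0;
  uint32_t ssrc = 0;
};

// Hypothetical helper (not an espp API): pack the fixed 12-byte RTP header.
// Multi-byte fields are written in network (big-endian) byte order.
inline std::array<uint8_t, 12> serialize_rtp_header(const RtpHeaderFields &h) {
  std::array<uint8_t, 12> out{};
  out[0] = static_cast<uint8_t>(((h.version & 0x3) << 6) | (h.padding ? 0x20 : 0) |
                                (h.extension ? 0x10 : 0) | (h.csrc_count & 0xF));
  out[1] = static_cast<uint8_t>((h.marker ? 0x80 : 0) | (h.payload_type & 0x7F));
  out[2] = static_cast<uint8_t>(h.sequence_number >> 8);
  out[3] = static_cast<uint8_t>(h.sequence_number & 0xFF);
  for (int i = 0; i < 4; ++i) {
    out[4 + i] = static_cast<uint8_t>((h.timestamp >> (24 - 8 * i)) & 0xFF);
    out[8 + i] = static_cast<uint8_t>((h.ssrc >> (24 - 8 * i)) & 0xFF);
  }
  return out;
}
```

With version 2, the marker set, and payload type 96, the first two bytes are 0x80 and 0xE0.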
-
std::span<const uint8_t> get_data() const
Get a span view of the whole packet.
Note
The span is valid as long as the packet_ vector is not modified.
Note
If you manually build the packet_ vector, you should make sure that you call serialize() before calling this method.
- Returns:
A span of the whole packet.
-
size_t get_rtp_header_size() const
Get the size of the RTP header.
- Returns:
The size of the RTP header.
-
std::span<const uint8_t> get_rtp_header() const
Get a span of bytes of the RTP header.
- Returns:
A span of bytes of the RTP header.
-
std::vector<uint8_t> &get_packet()
Get a reference to the packet_ vector.
- Returns:
A reference to the packet_ vector.
-
std::span<const uint8_t> get_payload() const
Get a span of bytes of the payload.
- Returns:
A span of bytes of the payload.
-
void set_payload(std::span<const uint8_t> payload)
Set the payload.
- Parameters:
payload – The payload to set.
Header File
Classes
-
class RtpJpegPacket : public espp::RtpPacket
RTP packet for JPEG video. The RTP payload for JPEG is defined in RFC 2435.
Public Functions
-
inline explicit RtpJpegPacket(std::span<const uint8_t> data)
Construct an RTP packet from a buffer.
- Parameters:
data – The buffer containing the RTP packet.
-
inline explicit RtpJpegPacket(const int type_specific, const int frag_type, const int q, const int width, const int height, std::span<const uint8_t> q0, std::span<const uint8_t> q1, std::span<const uint8_t> scan_data)
Construct an RTP packet from fields
This will construct a packet with quantization tables, so it can only be used for the first packet in a frame.
- Parameters:
type_specific – The type-specific field.
frag_type – The fragment type field.
q – The q field.
width – The width field.
height – The height field.
q0 – The first quantization table.
q1 – The second quantization table.
scan_data – The scan data.
-
inline explicit RtpJpegPacket(const int type_specific, const int offset, const int frag_type, const int q, const int width, const int height, std::span<const uint8_t> scan_data)
Construct an RTP packet from fields
This will construct a packet without quantization tables, so it cannot be used for the first packet in a frame.
- Parameters:
type_specific – The type-specific field.
offset – The offset field.
frag_type – The fragment type field.
q – The q field.
width – The width field.
height – The height field.
scan_data – The scan data.
-
inline int get_type_specific() const
Get the type-specific field.
- Returns:
The type-specific field.
-
inline int get_offset() const
Get the offset field.
- Returns:
The offset field.
-
inline int get_q() const
Get the q field.
- Returns:
The q field.
-
inline int get_width() const
Get the width field.
- Returns:
The width field.
-
inline int get_height() const
Get the height field.
- Returns:
The height field.
-
inline std::span<const uint8_t> get_mjpeg_header() const
Get the MJPEG header.
- Returns:
The MJPEG header.
-
inline bool has_q_tables() const
Get whether the packet contains quantization tables.
Note
The quantization tables are optional. If they are present, the number of quantization tables is always 2.
Note
This check is based on the value of the q field. If the q field is in the range 128-255, the packet contains quantization tables.
- Returns:
Whether the packet contains quantization tables.
-
inline int get_num_q_tables() const
Get the number of quantization tables.
Note
The quantization tables are optional. If they are present, the number of quantization tables is always 2.
Note
Only the first packet in a frame contains quantization tables.
- Returns:
The number of quantization tables.
-
inline std::span<const uint8_t> get_q_table(int index) const
Get the quantization table at the specified index.
- Parameters:
index – The index of the quantization table.
- Returns:
The quantization table at the specified index.
-
inline void set_q_table(int index, std::span<const uint8_t> q_table)
Set the quantization table at the specified index.
Note
This will not change the size of the packet. If the index is out of bounds, the quantization table will not be set.
- Parameters:
index – The index of the quantization table.
q_table – The quantization table to set.
-
inline std::span<const uint8_t> get_jpeg_data() const
Get the JPEG data. The JPEG data is the payload minus the MJPEG header and quantization tables.
- Returns:
The JPEG data.
-
int get_version() const
Get the RTP version.
- Returns:
The RTP version.
-
bool get_padding() const
Get the padding flag.
- Returns:
The padding flag.
-
bool get_extension() const
Get the extension flag.
- Returns:
The extension flag.
-
int get_csrc_count() const
Get the CSRC count.
- Returns:
The CSRC count.
-
bool get_marker() const
Get the marker flag.
- Returns:
The marker flag.
-
int get_payload_type() const
Get the payload type.
- Returns:
The payload type.
-
int get_sequence_number() const
Get the sequence number.
- Returns:
The sequence number.
-
int get_timestamp() const
Get the timestamp.
- Returns:
The timestamp.
-
int get_ssrc() const
Get the SSRC.
- Returns:
The SSRC.
-
void set_version(int version)
Set the RTP version.
- Parameters:
version – The RTP version to set.
-
void set_padding(bool padding)
Set the padding flag.
- Parameters:
padding – The padding flag to set.
-
void set_extension(bool extension)
Set the extension flag.
- Parameters:
extension – The extension flag to set.
-
void set_csrc_count(int csrc_count)
Set the CSRC count.
- Parameters:
csrc_count – The CSRC count to set.
-
void set_marker(bool marker)
Set the marker flag.
- Parameters:
marker – The marker flag to set.
-
void set_payload_type(int payload_type)
Set the payload type.
- Parameters:
payload_type – The payload type to set.
-
void set_sequence_number(int sequence_number)
Set the sequence number.
- Parameters:
sequence_number – The sequence number to set.
-
void set_timestamp(int timestamp)
Set the timestamp.
- Parameters:
timestamp – The timestamp to set.
-
void set_ssrc(int ssrc)
Set the SSRC.
- Parameters:
ssrc – The SSRC to set.
-
void serialize()
Serialize the RTP header.
Note
This method should be called after modifying the RTP header fields.
Note
This method does not serialize the payload. To set the payload, use set_payload(). To get the payload, use get_payload().
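For orientation, the sketch below packs the same header fields into the 12-byte fixed RTP header defined in RFC 3550, section 5.1. It is a standalone illustration, not the espp implementation: the struct and function names are hypothetical, and CSRC entries and header extensions are left out.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical plain struct for illustration only; it is not the espp::RtpPacket layout.
struct RtpHeaderFields {
  int version = 2;
  bool padding = false;
  bool extension = false;
  int csrc_count = 0;
  bool marker = false;
  int payload_type = 26; // static payload type assigned to JPEG in RFC 3551
  uint16_t sequence_number = 0;
  uint32_t timestamp = 0;
  uint32_t ssrc = 0;
};

// Pack the fields into the 12-byte fixed RTP header (RFC 3550, section 5.1).
// Multi-byte fields are written in network (big-endian) byte order.
std::array<uint8_t, 12> serialize_rtp_header(const RtpHeaderFields &f) {
  assert(f.version >= 0 && f.version <= 3);
  assert(f.csrc_count >= 0 && f.csrc_count <= 15);
  assert(f.payload_type >= 0 && f.payload_type <= 127);
  std::array<uint8_t, 12> out{};
  // Byte 0: V (2 bits), P (1 bit), X (1 bit), CC (4 bits).
  out[0] = static_cast<uint8_t>((f.version << 6) | (f.padding ? 0x20 : 0) |
                                (f.extension ? 0x10 : 0) | f.csrc_count);
  // Byte 1: M (1 bit), PT (7 bits).
  out[1] = static_cast<uint8_t>((f.marker ? 0x80 : 0) | f.payload_type);
  out[2] = static_cast<uint8_t>(f.sequence_number >> 8);
  out[3] = static_cast<uint8_t>(f.sequence_number & 0xFF);
  for (int i = 0; i < 4; ++i) {
    out[4 + i] = static_cast<uint8_t>((f.timestamp >> (24 - 8 * i)) & 0xFF);
    out[8 + i] = static_cast<uint8_t>((f.ssrc >> (24 - 8 * i)) & 0xFF);
  }
  return out;
}
```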
-
std::span<const uint8_t> get_data() const
Get a span view of the whole packet.
Note
The span is valid as long as the packet_ vector is not modified.
Note
If you manually build the packet_ vector, you should make sure that you call serialize() before calling this method.
- Returns:
A span of the whole packet.
-
size_t get_rtp_header_size() const
Get the size of the RTP header.
- Returns:
The size of the RTP header.
-
std::span<const uint8_t> get_rtp_header() const
Get a span of bytes of the RTP header.
- Returns:
A span of bytes of the RTP header.
-
std::vector<uint8_t> &get_packet()
Get a reference to the packet_ vector.
- Returns:
A reference to the packet_ vector.
-
std::span<const uint8_t> get_payload() const
Get a span of bytes of the payload.
- Returns:
A span of bytes of the payload.
-
void set_payload(std::span<const uint8_t> payload)
Set the payload.
- Parameters:
payload – The payload to set.
-
Header File
Classes
-
class RtcpPacket
A class to represent an RTCP packet.
It is intended as a base class for all RTCP packet types.
Note
At the moment, this class is not used.
Header File
Classes
-
class JpegHeader
A class to generate a JPEG header for a given image size and quantization tables. The header is generated once and then cached for future use. The header is generated according to the JPEG standard and is compatible with the ESP32 camera driver.
Public Functions
-
inline explicit JpegHeader(int width, int height, std::span<const uint8_t> q0_table, std::span<const uint8_t> q1_table)
Create a JPEG header for a given image size and quantization tables.
- Parameters:
width – The image width in pixels.
height – The image height in pixels.
q0_table – The quantization table for the Y channel.
q1_table – The quantization table for the Cb and Cr channels.
-
inline explicit JpegHeader(std::span<const uint8_t> data)
Create a JPEG header from existing JPEG header data.
-
inline ~JpegHeader()
Destructor.
-
inline int get_width() const
Get the image width.
- Returns:
The image width in pixels.
-
inline int get_height() const
Get the image height.
- Returns:
The image height in pixels.
-
inline size_t size() const
Get the size of the JPEG header data.
Note
This is the size of the serialized JPEG header, not the image size.
- Returns:
The size of the JPEG header data in bytes.
-
inline bool is_valid() const
Returns whether this header parsed or serialized successfully.
-
inline std::span<const uint8_t> get_data() const
Get the JPEG header data.
- Returns:
The JPEG header data.
-
inline std::span<const uint8_t> get_quantization_table(int index) const
Get the quantization table at the specified index.
- Parameters:
index – The index of the quantization table.
- Returns:
The quantization table.
-
Header File
Classes
-
class JpegFrame
A class that represents a complete JPEG frame.
This class is used to collect the JPEG scans that are received in RTP packets and to serialize them into a complete JPEG frame.
Public Functions
-
inline explicit JpegFrame(const espp::RtpJpegPacket &packet)
Construct a JpegFrame from a RtpJpegPacket.
This constructor will parse the header of the packet and add the JPEG data to the frame.
- Parameters:
packet – The packet to parse.
-
inline explicit JpegFrame(const std::vector<uint8_t> &data)
Construct a JpegFrame from a vector of jpeg data.
Note
The vector must contain the complete JPEG data, including the JPEG header and EOI marker.
- Parameters:
data – The vector containing the jpeg data.
-
inline explicit JpegFrame(std::span<const uint8_t> data)
Construct a JpegFrame from a span of jpeg data.
Note
The span must contain the complete JPEG data, including the JPEG header and EOI marker.
- Parameters:
data – The span containing the jpeg data.
-
inline explicit JpegFrame(const uint8_t *data, size_t size)
Construct a JpegFrame from a buffer of JPEG data.
- Parameters:
data – The buffer containing the jpeg data.
size – The size of the buffer.
-
inline const espp::JpegHeader &get_header() const
Get a reference to the header.
- Returns:
A reference to the header.
-
inline int get_width() const
Get the width of the frame.
- Returns:
The width of the frame.
-
inline int get_height() const
Get the height of the frame.
- Returns:
The height of the frame.
-
inline bool is_complete() const
Check if the frame is complete.
- Returns:
True if the frame is complete, false otherwise.
-
inline void append(const espp::RtpJpegPacket &packet)
Append a RtpJpegPacket to the frame. This will add the JPEG data to the frame.
- Parameters:
packet – The packet containing the scan to append.
-
inline void add_scan(const espp::RtpJpegPacket &packet)
Append a JPEG scan to the frame. This will add the JPEG data to the frame.
Note
If the packet contains the EOI marker, the frame will be finalized, and no further scans can be added.
- Parameters:
packet – The packet containing the scan to append.
-
inline std::span<const uint8_t> get_data() const
Get the serialized frame data.
- Returns:
The serialized data.
-
inline std::span<const uint8_t> get_scan_data() const
Get the scan data.
- Returns:
The scan data.
-