Wireshark Plugins for nDisplay Traffic

Hi! When I try to use Wireshark to look at nDisplay traffic, I find that the traffic doesn’t quite match what’s in Epic’s docs. For example, ports 41003 and 41004 are only for cluster events from nodes to the Primary; cluster events from the Primary to the other nodes are on port 41001, in a different format, and intermixed with other traffic.

I built Wireshark plugins in Lua to display this, and to decode Cosm’s own binary and JSON cluster events, but if Epic / Pixela have Wireshark plugins already I’d love to switch over to something official.

Hi Christopher,

We don’t have any Wireshark plugins for nDisplay. Feel free to ask me anything related to networking in nDisplay.

Thank you Andrey. I’m also comfortable reading the source code to sort it out.

The Lua code we’re using so far, for the cluster sync traffic on port 41001, is based on inspection of the wire traffic rather than the source code. In a few places it simply skips a byte without my knowing why, or reads several int32 fields and interprets their pattern in a way that matches the data, again without my knowing why.

The main thing I want to be sure of is that we’re looking in the right spot for each of the events: Binary events from nodes on port 41004, JSON events from nodes on port 41003, and both types of events from the Primary, multiplexed onto port 41001 (along with other sync traffic and events sourced by other nodes).

cosm_lib = require "cosm_lib"
 
local p_cosm_cep = Proto("cosm_cep", "Unreal Cluster Sync Events v1.0.0")
local cep_table = DissectorTable.new("cosm_cep", "Unreal Cluster Sync Events", ftypes.INT, base.DEC, p_cosm_cep)
 
local f_class = ProtoField.string("cosm_cep.class", "Event Class")
local f_type = ProtoField.string("cosm_cep.type", "Event Type")
local f_source = ProtoField.string("cosm_cep.source", "Source")
local f_destination = ProtoField.string("cosm_cep.destination", "Destination")
local f_jce_count = ProtoField.uint32("cosm_cep.jce_count", "JSON Event Count", base.DEC)
local f_bce_count = ProtoField.uint32("cosm_cep.bce_count", "Binary Event Count", base.DEC)
 
p_cosm_cep.fields = { f_class, f_type, f_source, f_destination, f_jce_count, f_bce_count }
 
local data_dis = Dissector.get("data")
 
function p_cosm_cep.dissector(buf, pkt, tree)
 
	local subtree = tree:add(p_cosm_cep, buf)
	local ip_src = pkt.net_src -- Source IP address from the network layer
	local ip_dst = pkt.net_dst -- Destination IP address from the network layer
	local ip_source_port = pkt.src_port
 
	local payload_len = buf(0,4):le_uint()
	local cursor = 4
 
	-- e.g. GetEventsData
	local type_len = buf(cursor + 0, 4):le_uint()
	local type_value = buf(cursor + 4, type_len):stringz()
	cursor = cursor + (4 + type_len)
 
	-- e.g. response
	local subtype_len = buf(cursor + 0, 4):le_uint()
	local subtype_value = buf(cursor + 4, subtype_len):stringz()
	cursor = cursor + (4 + subtype_len)
 
	-- e.g. ClusterSync
	local family_len = buf(cursor + 0, 4):le_uint()
	local family_value = buf(cursor + 4, family_len):stringz()
	cursor = cursor + (4 + family_len)
 
	-- Add the class, type, and the source and destination GPs
	subtree:add(f_class, family_value)
	subtree:add(f_type, type_value)
	subtree:add(f_source, cosm_lib.ip_to_gp(ip_src))
	subtree:add(f_destination, cosm_lib.ip_to_gp(ip_dst))
 
	if ip_source_port == 41001 then
	
		-- TODO: Factor this out into its own dissector for ease of filtering
		if (type_value == "GetEventsData") and (string.lower(subtype_value) == "response") and (family_value == "ClusterSync") then
 
			-- ???
			cursor = cursor + 1
 
			-- ??? Three int32 values
			-- (0,0,1) followed by:
			--   (3, CS) followed by:
			--     int32 event count (or 1)
			--     json cluster events in a special format
			--       int32 length (from first byte after length)
			--       "category:type:name:system:discard:v5=(k=v,...);" nul-terminated; v5 is "Settings"
			--     if buffer left then int32 1
			--       then (3, CS)
			--         int32 1
			--           binary cluster events
			-- (0,0,0) followed by:
			--   int32 1 followed by
			--     (3, CS) followed by:
			--       int32 event count
			--       binary cluster events, encrypted
			--   OR int32 0 followed by other zeros, meaning no events
 
			local field_tbd3 = buf(cursor, 4):le_uint()
			cursor = cursor + 4
 
			local field_tbd4 = buf(cursor, 4):le_uint()
			cursor = cursor + 4
 
			local field_tbd5 = buf(cursor, 4):le_uint()
			cursor = cursor + 4
 
			if (field_tbd3 == 0) and (field_tbd4 == 0) then
				if (field_tbd5 == 1) then
					-- Handle reformatted JSON cluster events
					-- Get the CEPJ dissector inside this function to pick up registrations from other scripts
					local cepj_dissector = DissectorTable.get("cosm_cep"):get_dissector(0)
					-- CS
					local flag_len = buf(cursor, 4):le_uint()
					local flag_value = buf(cursor + 4, flag_len):stringz()
					cursor = cursor + (4 + flag_len)
					-- Count
					local jce_count = buf(cursor + 0, 4):le_uint()
					cursor = cursor + 4
					subtree:add(f_jce_count, jce_count)
					for i = 1, jce_count do
						local jce_len = buf(cursor + 0, 4):le_uint()
						if cepj_dissector ~= nil then
							-- Dissector was found, invoke subdissector with a new Tvb,
							-- created from the current buffer.
							cepj_dissector:call(buf(cursor, jce_len + 4):tvb(), pkt, tree)
						else
							-- fallback dissector that just shows the raw data.
							data_dis:call(buf(cursor + 4, jce_len):tvb(), pkt, tree)
						end
						cursor = cursor + (4 + jce_len)
					end
					-- If buffer remaining, it's binary cluster events.
					-- Trick the code below into parsing them
					if cursor < buf:len() then
						field_tbd5 = 0
					end
				else
					subtree:add(f_jce_count, 0)
				end
				if field_tbd5 == 0 then
					-- Handle binary cluster events
					local field_tbd6 = buf(cursor + 0, 4):le_uint()
					cursor = cursor + 4
					if field_tbd6 > 0 then
						-- Get the CEB dissector inside this function to pick up registrations from other scripts
						local ceb_dissector = DissectorTable.get("wtap_encap"):get_dissector(wtap.USER15)
						-- CS
						local flag_len = buf(cursor, 4):le_uint()
						local flag_value = buf(cursor + 4, flag_len):stringz()
						cursor = cursor + (4 + flag_len)
						-- Count
						local bce_count = buf(cursor + 0, 4):le_uint()
						cursor = cursor + 4
						subtree:add(f_bce_count, bce_count)
						for i = 1, bce_count do
							local bce_len = buf(cursor + 0, 4):le_uint()
							if ceb_dissector ~= nil then
								-- Dissector was found, invoke subdissector with a new Tvb,
								-- created from the current buffer.
								ceb_dissector:call(buf(cursor, bce_len + 4):tvb(), pkt, tree)
							else
								-- fallback dissector that just shows the raw data.
								data_dis:call(buf(cursor + 4, bce_len):tvb(), pkt, tree)
							end
							cursor = cursor + (4 + bce_len)
						end
					else
						subtree:add(f_bce_count, 0)
					end
				else
					subtree:add(f_bce_count, 0)
				end
			end
		else
			-- Don't understand this subformat. Fallback dissector that just shows the raw data.
			subtree:add(f_jce_count, 0)
			subtree:add(f_bce_count, 0)
			data_dis:call(buf(cursor):tvb(), pkt, tree)
		end
	else
		-- Event type we don't care about. fallback dissector that just shows the raw data.
		subtree:add(f_jce_count, 0)
		subtree:add(f_bce_count, 0)
		data_dis:call(buf(cursor):tvb(), pkt, tree)
	end
end
 
local wtap_encap_table = DissectorTable.get("wtap_encap")
wtap_encap_table:add(wtap.USER14, p_cosm_cep)
wtap_encap_table:add(wtap.USER11, p_cosm_cep)
 
local tcp_encap_table = DissectorTable.get("tcp.port")
tcp_encap_table:add(41001, p_cosm_cep)

First, I should mention that in version 5.6, the internal communication topology was changed from a “star” to a “mesh”. This was one of the key steps toward implementing the new failover solution. Now, every node hosts a full set of internal and external servers, along with its own set of clients and their persistent connections. At any given time, the primary node determines the effective “center” of the active star. Because of this change, some online documentation may now be outdated.

Each cluster node runs several internal servers (bound to port 41001) and two external servers for events: JSON (41003) and Binary (41004).

  • External servers act as the public interface for anything outside the cluster. You can send events from anywhere to these ports.
  • Internal servers are used strictly for communication between cluster nodes and are not intended for external access (and therefore are not documented). These include: ClusterSync, RenderSync, GenericBarrier, JSON events, Binary events, InternalComm. This list may evolve in the future.

Since you’re specifically interested in cluster events, I’ll focus on that aspect of communication.

Sending Events

There are two common use cases:

  1. Send a cluster event from outside the cluster → This is what the external servers are for. Ports are defined in the config file (41003/41004 by default), and the JSON/Binary message formats are still correctly described in the online documentation.
  2. Emit an event from inside the cluster → Any cluster node can do this using the C++/Blueprint API.

How the event is handled depends on whether it originates externally or internally, and whether it’s sent to a primary or secondary node.
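As a concrete illustration of case 1, here is a minimal Python sketch that frames an event as JSON and pushes it to a node’s external JSON server over TCP. The field names in the payload are placeholders of mine, not the official schema; for the real JSON/Binary message formats, use the ones described in the online documentation.

```python
import json
import socket

# Default external JSON event port from the nDisplay config (41003).
NDISPLAY_JSON_PORT = 41003

def build_event_payload(category: str, type_: str, name: str,
                        parameters: dict) -> bytes:
    """Encode a cluster event as UTF-8 JSON. The field names below are
    placeholders; substitute the exact schema from the online docs."""
    event = {
        "Category": category,
        "Type": type_,
        "Name": name,
        "Parameters": parameters,
    }
    return json.dumps(event).encode("utf-8")

def send_event(host: str, payload: bytes,
               port: int = NDISPLAY_JSON_PORT) -> None:
    """Open a TCP connection to a node's external JSON server and send."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(payload)
```

Because the external servers run on every node (post-5.6 mesh topology), this can target any node; per the pipelines below, a secondary node will forward the event to the primary for synchronization.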

Event Propagation Pipelines

Here’s how events flow in different scenarios (notation: P-node = primary node, S-node = secondary node):

External → S-node

  • S-node::ExternalEventServer receives the event
  • S-node::InternalEventClient forwards it to P-node::InternalEventServer
  • P-node::InternalEventServer schedules it for synchronization on the next frame
  • P-node::ClusterSyncServer propagates it to every S-node via GetEventsData

External → P-node

  • P-node::ExternalEventServer receives the event
  • P-node schedules it for synchronization on the next frame
  • P-node::ClusterSyncServer propagates it to every S-node via GetEventsData

Internal (S-node emit)

  • S-node::InternalEventClient sends the event to P-node::InternalEventServer
  • P-node schedules it for synchronization on the next frame
  • P-node::ClusterSyncServer propagates it to every S-node via GetEventsData

Internal (P-node emit)

  • P-node schedules it for synchronization on the next frame
  • P-node::ClusterSyncServer propagates it to every S-node via GetEventsData

So the whole pipeline looks like this:

    Anywhere            Anywhere
       |                   |
       v                   v
    S-node  --------->  P-node  --------->  Propagate internally on GetEventsData

Monitoring Events

You mentioned monitoring GetEventsData events. That’s exactly the right place to catch all events that will be processed in the current frame, regardless of origin.

  • If your system uses only external events, you can monitor just the external ports (41003/41004).
  • Otherwise, GetEventsData is the single point where all events converge, both external and internal.
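When monitoring everything at once, a single Wireshark display filter covering the three default ports is handy; a trivial helper to generate it (port numbers are the defaults quoted above):

```python
def ndisplay_filter(ports=(41001, 41003, 41004)) -> str:
    """Build a Wireshark display filter that matches the internal
    cluster-sync port (41001) plus the external JSON (41003) and
    Binary (41004) event ports."""
    return " || ".join("tcp.port == %d" % p for p in ports)
```

For example, `ndisplay_filter()` yields `tcp.port == 41001 || tcp.port == 41003 || tcp.port == 41004`, which can be pasted straight into the filter bar; adjust the port tuple if your config file overrides the defaults.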

Because we use TCP only, no events can be dropped — every event will eventually be delivered and processed on each cluster node.

A Note on Serialization

The external events follow the formats documented online (JSON or Binary).

  • S-node → P-node transfer: uses Unreal Engine’s internal serializers, which may add extra bytes depending on implementation.
  • Propagation via GetEventsData: both JSON and Binary events are multiplexed into the same message, again using UE’s internal serializers. Additional int32 fields likely describe things like counts of JSON events, property count, binary event count, event lengths, etc.
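To make that framing concrete outside Wireshark, here is a Python sketch of the header layout my Lua dissector assumes for the port 41001 messages: a little-endian int32 total length, then three length-prefixed, nul-terminated strings (type, subtype, family). This mirrors the reverse-engineered reading above, not an official specification.

```python
import struct

def read_prefixed_string(buf: bytes, cursor: int):
    """Read a little-endian int32 length, then that many bytes of
    nul-terminated text, mirroring the Lua dissector above.
    Returns (string, new_cursor)."""
    (length,) = struct.unpack_from("<i", buf, cursor)
    raw = buf[cursor + 4 : cursor + 4 + length]
    return raw.split(b"\x00", 1)[0].decode("ascii"), cursor + 4 + length

def parse_get_events_header(buf: bytes) -> dict:
    """Parse the message header: total payload length, then the
    type / subtype / family strings (e.g. GetEventsData / response /
    ClusterSync)."""
    (payload_len,) = struct.unpack_from("<I", buf, 0)
    cursor = 4
    type_value, cursor = read_prefixed_string(buf, cursor)
    subtype_value, cursor = read_prefixed_string(buf, cursor)
    family_value, cursor = read_prefixed_string(buf, cursor)
    return {"payload_len": payload_len, "type": type_value,
            "subtype": subtype_value, "family": family_value,
            "cursor": cursor}
```

The remaining bytes after `cursor` are where the int32 counts and the multiplexed JSON/Binary event blocks begin, exactly the region the Lua code walks with its `field_tbd*` reads.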

To Summarize

The main thing I want to be sure of is that we’re looking in the right spot for each of the events: Binary events from nodes on port 41004, JSON events from nodes on port 41003, and both types of events from the Primary, multiplexed onto port 41001 (along with other sync traffic and events sourced by other nodes).

Your understanding is correct. All events eventually go through GetEventsData, so monitoring this is sufficient to catch everything. If you know your use case relies only on external events, you can stick to monitoring the external ports instead.

Hope this clears it up — let me know if you need me to dive deeper into any part of the flow or serialization details.

Thank you, this is awesome and helps a lot!

Hey Christopher, do you have any other questions? Otherwise, I’d close this ticket. Thank you.

No more questions - OK with closing the ticket!