The Perfectly Secure Data Diode Firewall

How to build the Perfectly Secure Data Diode Firewall

With respect to "only allow one-way communication", it's actually not that hard to build an absolutely secure firewall, mathematically provably impossible to break. Just make sure that information can only flow in one direction and never, ever passes in reverse. Simple, absolutely secure and easy to implement - but not very useful in the general case.

Use cases

That said, there are a few use cases where this is quite desirable:

Logging to a log-server where we must be able to guarantee the integrity of the logs no matter what.
Mailing into a secure network that must never be exposed to the Internet.
Monitoring of Internet connected resources from a secure, isolated, network.
Data replication (repositories, wsus, web-sites, ...) to a secure developer network.
Backups to an untouchable backup server.

But alas, you will not be able to surf the Internet via this kind of firewall - not unless you decide to replicate the Internet as a whole to your private network ...

The Implementation

Make sure you implement a simple, simplex transfer mechanism at the lowest level of transmission. I will use an optical isolator, since it is an established way of realizing this kind of isolation and is easy to understand and implement.

Please note that this kind of firewall is sometimes referred to as a "diode". I find this quite misleading, since you can easily transfer information in both directions over a diode with no problems whatsoever (using voltage for the forward and current for the reverse direction). An optocoupler does not have this issue. An optocoupler works by sending information as light from an LED to a photo receiver, much in the same way a sailor sends Morse code to passing ships using a signal lamp.

A better analogy would be a wormhole, a black/white hole pair, where anything entering the black hole is forever lost from that universe. A black hole firewall..?

Data integrity and flow control

When a true one-way (simplex) transfer is used at the lowest level to guarantee one-way communication, common networking mechanisms such as data acknowledgement, flow control, re-transmission etc. cannot be used. This rules out Ethernet and the IP protocol family, among others, since they use deliberate packet drops and feedback as means of flow control, which makes data integrity impossible to guarantee without a return path.

Not using the IP protocol family, you say? Well, yes, over this simplex low level transmission link that is. We'll fix this using software proxies, described later in this paper.

The Hardware

We can never trust software. All software contains bugs and can be manipulated, changed, disabled or replaced. We must base the security of our solution on hardware.

We can view this device as our "event horizon" - there is no way anything, no information, no nothing, can ever return once it has passed this "point of no return". This is our guarantee. Even if there are flaws in the software implementation, no software can ever compromise this hardware constraint.

Here we are using a USB device, again because we are not able to utilize Ethernet technology, per above. We bridge the USB payload data stream to a single-ended bit stream, send it over the optocoupler and then back to USB again - all at an effective rate of about 5 Mbps. Since the transmitting side owns the transmission link device, it has perfect control over transmission speed, output queues etc., in contrast to a downstream Ethernet device. This way our software can control data rates from sending hosts via standard TCP sliding window flow control.
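
To illustrate the last point, here is a minimal Python sketch of sender-side pacing. It is not taken from the actual bridge software; the Pacer class and its method names are made up for this example, and the 5 Mbps figure is the effective rate quoted above. The idea is that the relay only reads more data from a TCP client once the link has room for it, so the client's TCP window closes by itself when it sends too fast.

```python
class Pacer:
    """Book-keeps link time so the average send rate stays under a cap.

    Hypothetical helper for illustration; not part of the bridge software.
    """

    def __init__(self, bits_per_second: float, now: float = 0.0) -> None:
        self.bytes_per_second = bits_per_second / 8
        self.next_free = now  # earliest time at which the link is idle again

    def delay_for(self, nbytes: int, now: float) -> float:
        """Return how many seconds to wait before sending nbytes at time now.

        The time the chunk occupies on the link is booked immediately, so
        consecutive calls queue up behind each other.
        """
        start = max(now, self.next_free)
        self.next_free = start + nbytes / self.bytes_per_second
        return start - now


if __name__ == "__main__":
    pacer = Pacer(5_000_000)               # assumed effective link rate
    print(pacer.delay_for(625_000, 0.0))   # link idle: send at once
    print(pacer.delay_for(625_000, 0.0))   # link busy for one second
```

In a relay loop one would sleep for the returned delay, write the chunk to the serial device, and only then read from the client socket again. While the relay is not reading, the kernel's receive buffer fills up and ordinary TCP flow control slows the sender down.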

Schematics

The core of the circuit is very simple and trivial to validate with respect to simplex communication and galvanic isolation:

To the left and right, this circuit is fed from and feeds into a pair of Prolific PL2303 USB-to-serial bridge controllers. Please try to get hold of the genuine chip, since this chip is pirated beyond belief. When using our software below, make sure to use a chip version handling 6 Mbps or better - I think most versions in production nowadays do (but with chips from the PRC, all bets are off).

Make sure to keep the signal path PL2303 <-> optocoupler short, i.e. centimeters, not meters. Keep it on the PCB, i.e. no wires.

Decoupling caps, primarily for the PL2303 modules; the optocoupler fed through 220 ohm; a 1k RX pull-up; and a 1N4148 to boost the turn-off speed somewhat - all done! Note that fake PL2303 chips may not handle the sink current as well as the originals, making the 220 ohm resistor too large. If so, check the current through the optocoupler - or simply get genuine chips. :-)

Effective data rate

The rate limitation of this interface, ~5 Mbps effective speed, is a blessing in disguise. Keep in mind that the final destination system of the data sent over this firewall must be able to keep up with this rate, sustained, since there must be no mechanism to propagate flow control back over the one-way interface. Any implementation of such a mechanism would completely invalidate the whole concept, creating a back-channel that could not only signal flow control but also leak out any kind of information. Such a mechanism must never be implemented. But even a low-performance system can easily handle a 5 Mbps data rate. For strict 24/7 requirements such as a log system, though, a redundant high-availability solution may be needed at the final destination to handle downtime for maintenance and such - but that is outside the scope of this paper.
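
As a rough sizing aid, the arithmetic behind "can the destination keep up" can be sketched in Python. This assumes that 5 Mbps means 5 * 10^6 payload bits per second, and the 200-byte average log line is a number picked purely for illustration.

```python
LINK_BPS = 5_000_000  # effective link rate from the text, taken as 5 * 10**6 bit/s


def bytes_per_second(bits_per_second: int = LINK_BPS) -> float:
    """Sustained payload bytes per second the destination must absorb."""
    return bits_per_second / 8


def bytes_per_day(bits_per_second: int = LINK_BPS) -> float:
    """Worst-case volume per day if the link runs saturated around the clock."""
    return bytes_per_second(bits_per_second) * 86_400


def lines_per_second(avg_line_bytes: int, bits_per_second: int = LINK_BPS) -> float:
    """Log lines per second for an assumed average line length."""
    return bytes_per_second(bits_per_second) / avg_line_bytes


if __name__ == "__main__":
    print(bytes_per_second())      # 625000.0 bytes/s
    print(bytes_per_day() / 1e9)   # 54.0 GB per day
    print(lines_per_second(200))   # 3125.0 lines/s at 200 bytes per line
```

So a saturated link delivers at most about 54 GB per day, which also gives an upper bound for the spool space needed on the receiving side.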

Also, for highly redundant textual data such as log files, a compression algorithm at the application level may increase the effective rate by perhaps 10x.
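
A quick way to get a feeling for the gain is to compress a block of log-like text and compare sizes. The Python sketch below uses zlib from the standard library. The log lines are synthetic and made up for this example, so the ratio it prints says nothing about your own logs. Each block is compressed independently, on the assumption that a damaged block on a one-way link should not poison the blocks that follow it.

```python
import zlib


def compress_block(block: bytes, level: int = 6) -> bytes:
    """Compress one self-contained block (no shared state between blocks)."""
    return zlib.compress(block, level)


def decompress_block(blob: bytes) -> bytes:
    """Inverse of compress_block."""
    return zlib.decompress(blob)


def compression_ratio(block: bytes) -> float:
    """Uncompressed size divided by compressed size."""
    return len(block) / len(compress_block(block))


def synthetic_log(lines: int = 2000) -> bytes:
    """Made-up, highly repetitive log lines, for illustration only."""
    return b"".join(
        b"Jun  7 07:56:56 loghost sshd[%d]: Accepted publickey for user%d from 10.0.0.%d\n"
        % (1000 + i, i % 7, i % 250)
        for i in range(lines)
    )


if __name__ == "__main__":
    print(f"ratio: {compression_ratio(synthetic_log()):.1f}x")
```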

The Software

Now we need some software to make this useful. Basically, we need to:

Accept data on TCP/IP and/or UDP/IP from the origin systems
Manage flow control to the origin system so as not to exceed link capacity
Terminate IP protocol family and extract application level payload
Encapsulate the payload in a link level protocol suitable for our hardware interface
Transmit it over the simplex USB bridge
Receive on internal firewall host
Verify data integrity
Repack and re-transmit over the IP protocol suite to the final destinations

The important step here is that we get rid of Ethernet technology before relaying the pure payload over our simplex USB transmission. Without ACK/retransmission, we cannot afford the packet loss these standards rely heavily upon. We do run TCP/IP and/or UDP/IP to our peers on the outside and inside, but between blackhole and whitehole we must use our own link-level protocol that does not use packet drops or any form of feedback for flow control and error correction.
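
To make the idea of such a link-level protocol concrete, here is a minimal Python sketch of a frame format for a one-way byte stream. The layout (sync marker, bridge number, connection number, sequence number, length, payload, CRC-32) is an assumption made for this example; the text does not describe the actual wire format, and all names below are hypothetical.

```python
import struct
import zlib

SYNC = b"\xAA\x55"                  # assumed sync marker, lets a receiver find frame starts
HEADER = struct.Struct(">2sBHIH")   # sync, bridge, connection, sequence, payload length
TRAILER = struct.Struct(">I")       # CRC-32 over header and payload


def encode_frame(bridge: int, connection: int, seq: int, payload: bytes) -> bytes:
    """Wrap an application payload (at most 65535 bytes) in a checksummed frame."""
    header = HEADER.pack(SYNC, bridge, connection, seq, len(payload))
    return header + payload + TRAILER.pack(zlib.crc32(header + payload))


def decode_frame(frame: bytes):
    """Return (bridge, connection, seq, payload), or None if the frame is damaged.

    There is no way to ask for a retransmission, so a bad frame can only be
    dropped and reported.
    """
    if len(frame) < HEADER.size + TRAILER.size:
        return None
    sync, bridge, connection, seq, length = HEADER.unpack_from(frame)
    if sync != SYNC or len(frame) != HEADER.size + length + TRAILER.size:
        return None
    body_end = HEADER.size + length
    (crc,) = TRAILER.unpack_from(frame, body_end)
    if zlib.crc32(frame[:body_end]) != crc:
        return None
    return bridge, connection, seq, frame[HEADER.size:body_end]


if __name__ == "__main__":
    frame = encode_frame(bridge=1, connection=7, seq=42, payload=b"hello")
    print(decode_frame(frame))
```

A checksum only detects damage. To repair it without a return path, an error-correcting code would have to be layered on top, which this sketch leaves out.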

The software is also responsible for logging the traffic to ensure data integrity. For example, if the receiving host (a loghost, say) is down, there is no way to inform the sender about this. In these circumstances data is inevitably lost, and the only thing to do is to log the event. The software also uses checksums and error-correcting codes to ensure the integrity of the transmission link and the data.
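
One common way for a receiver to notice loss on a one-way link is a running sequence number. The Python sketch below assumes that every frame carries a per-bridge sequence number that wraps at 2^32. That is an assumption for illustration, not a description of the real software, and GapTracker is a hypothetical name.

```python
class GapTracker:
    """Counts frames missing between consecutive sequence numbers, per bridge."""

    def __init__(self, modulus: int = 2**32) -> None:
        self.modulus = modulus
        self.expected = {}  # bridge number -> next expected sequence number

    def observe(self, bridge: int, seq: int) -> int:
        """Record a received frame; return how many frames were lost before it."""
        expected = self.expected.get(bridge)
        self.expected[bridge] = (seq + 1) % self.modulus
        if expected is None:
            return 0  # first frame seen on this bridge, nothing to compare against
        return (seq - expected) % self.modulus


if __name__ == "__main__":
    tracker = GapTracker()
    for seq in (0, 1, 4):
        lost = tracker.observe(bridge=1, seq=seq)
        if lost:
            # The sender cannot be told, so the loss can only be logged.
            print(f"bridge 1: {lost} frame(s) lost before sequence {seq}")
```

Since a serial link does not reorder data, any gap means frames were dropped. The receiver records the event, for instance via syslog, and carries on.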

We can handle multiple logical bridges over the same USB interface from different network sources and protocols to multiple destinations using this software.

Download

GNU/Linux binaries: Linux/erbridge-2021-10-17.txz
sha256 checksum: ef2f8d7460ad1cd41be490a2bf5eaa1611e48744ba5ce12e79958b145c17ed52
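
The checksum can be verified with sha256sum, or with a few lines of Python as sketched here. The expected digest is the one listed above; the local file name is an assumption about where the download was saved.

```python
import hashlib

EXPECTED = "ef2f8d7460ad1cd41be490a2bf5eaa1611e48744ba5ce12e79958b145c17ed52"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str = EXPECTED) -> bool:
    """True if the file's SHA-256 digest matches the published one."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the archive was saved.
    print(verify("erbridge-2021-10-17.txz"))
```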

Do not trust this software

You should not trust this software, nor do you have to:

All non-trivial software contains bugs
The security of the firewall isolation is guaranteed by the simplex opto-isolated interface hardware
The security of the firewall hosts must be implemented using netfilter/iptables and possibly SELinux

Please report back any issues found using this software. Contact information is found at the end of this page.

Examples and usage

syslog

Host "blackhole" is on the public network, "whitehole" on the private. Both can be any GNU/Linux system, preferably a dedicated machine (e.g. a small SoC), but the server generating and/or receiving the traffic may itself be used as well. Host "whitehole" must not be a virtual machine unless real-time behaviour can be guaranteed (no, it can't). Host "blackhole" may be virtual, though. The receiving loghost must be able to accept syslog at the lower of the rate generated and the 5 Mbps cap set by the USB bridge (the latter is recommended and trivially achieved).

Origin rsyslog server configuration

rsyslogd configuration of transmitting host:

*.info @blackhole
	action(type="omfwd" target="blackhole" port="514" protocol="tcp")

The internal loghost needs no special syslog configuration beyond accepting external syslog over TCP from host "whitehole".

Firewall host "blackhole" configuration

To make the external host listen on TCP (-t) port 514 for bridge #1 plus UDP (-u) port 514 for bridge #2:

blackhole# black -t 1:514 -u 2:514

The USB isolation device defaults to /dev/ttyUSB0 but can be specified using option -d, e.g.:

blackhole# black -d /dev/ttyUSB1 -t 1:514 -u 2:514

Firewall host "whitehole" configuration

The internal host then relays both bridge #1 and bridge #2 to logserver:514/tcp.

whitehole# white -t 1:logserver:514 -t 2:logserver:514

Please note that we get a protocol conversion here, from UDP on the external side to TCP on the internal side. This is possible since this is not about tunneling.

In a real-world implementation, we would run a local syslog server on "whitehole" spooling the received logs before sending them via RELP to the final destination. This way we would not lose any logs in the case of downtime of the real logserver.

whitehole# white -t 1:localhost:514 -t 2:localhost:514

Since the receiver thread of white runs with real-time scheduling, the user needs rtprio 99 (/etc/security/limits.conf).
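
Whether the limit is actually in effect for the account running white can be checked from that account with a small Python sketch like the one below. It relies on the Linux-only RLIMIT_RTPRIO resource limit. The account name in the comment is a placeholder, not something the software prescribes.

```python
import resource

# A matching /etc/security/limits.conf entry has the form
#   <user> - rtprio 99
# where <user> is a placeholder for the account that runs white.


def rtprio_limit_ok(required: int = 99):
    """True/False if the soft real-time priority limit allows `required`.

    Returns None on platforms that do not expose RLIMIT_RTPRIO.
    """
    limit_id = getattr(resource, "RLIMIT_RTPRIO", None)
    if limit_id is None:
        return None
    soft, _hard = resource.getrlimit(limit_id)
    return soft == resource.RLIM_INFINITY or soft >= required


if __name__ == "__main__":
    print(rtprio_limit_ok())
```

Limits from limits.conf are applied at login, so a new session is needed after editing the file.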

Firewall software logs

Jun  7 07:56:19 loghost black: Starting ER-bridge
Jun  7 07:56:36 loghost white: Starting ER-bridge
Jun  7 07:56:56 loghost black: Client opened tcp:127.0.0.1:39474 -> 514 on bridge 1 connection 1
Jun  7 07:56:56 loghost black: Client closed tcp:127.0.0.1:39474 -> 514 on bridge 1 connection 1 after 5 bytes
Jun  7 07:56:56 loghost black: Client opened tcp:127.0.0.1:39476 -> 514 on bridge 1 connection 2
Jun  7 07:56:56 loghost white: Opened tcp:127.0.0.1:23233 on bridge 1 connection 1
Jun  7 07:56:56 loghost white: Closed tcp:127.0.0.1:23233 on bridge 1 connection 1 after 5 bytes
Jun  7 07:56:56 loghost white: Opened tcp:127.0.0.1:23233 on bridge 1 connection 2
Jun  7 07:57:20 loghost white: Opened tcp:127.0.0.1:23233 on bridge 2 connection 0
Jun  7 07:57:28 loghost black: Client opened tcp:127.0.0.1:39484 -> 514 on bridge 1 connection 3
Jun  7 07:57:28 loghost black: Client closed tcp:127.0.0.1:39484 -> 514 on bridge 1 connection 3 after 5 bytes
Jun  7 07:57:28 loghost white: Opened tcp:127.0.0.1:23233 on bridge 1 connection 3
Jun  7 07:57:28 loghost white: Closed tcp:127.0.0.1:23233 on bridge 1 connection 3 after 5 bytes
Jun  7 08:44:30 loghost black: Client closed tcp:127.0.0.1:39476 -> 514 on bridge 1 connection 2 after 9405 bytes
Jun  7 08:44:30 loghost white: Closed tcp:127.0.0.1:23233 on bridge 1 connection 2 after 9405 bytes
Jun  7 08:44:30 loghost black: Client opened tcp:127.0.0.1:39488 -> 514 on bridge 1 connection 4
Jun  7 08:44:30 loghost white: Opened tcp:127.0.0.1:23233 on bridge 1 connection 4
Jun  7 09:08:16 loghost white: Starting ER-bridge
Jun  7 09:08:16 loghost white: Lost synchronization on USB EvH
Jun  7 09:08:16 loghost white: Regained synchronization on USB EvH
Jun  7 09:08:16 loghost white: Opened tcp:127.0.0.1:23233 on bridge 1 connection 4
Jun  7 09:08:22 loghost white: Opened tcp:127.0.0.1:23233 on bridge 2 connection 0

Note that in the example above we send the firewall logs over the firewall itself as well, hence both "black" and "white" appear in one and the same log. Please don't do this while enabling debug logging on the "black" firewall software itself, at least not more than once. You'll figure out why. ;-)

Firewall logging is performed on facility local1.
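
Because both sides log a byte count when a connection closes, the two can be compared as a simple end-to-end sanity check. The Python sketch below parses lines in the format shown in the log excerpt above and reports connections where the counts of "black" and "white" disagree or where one side is missing. It is an illustration only, and it assumes connection numbers are not reused within the lines examined (a restart may break that assumption).

```python
import re
from collections import defaultdict

# Matches the "... closed ... after N bytes" lines of both black and white.
CLOSE_RE = re.compile(
    r"\b(black|white): (?:Client closed|Closed) .* "
    r"on bridge (\d+) connection (\d+) after (\d+) bytes"
)


def byte_counts(lines):
    """Map (bridge, connection) to {"black": bytes, "white": bytes}."""
    counts = defaultdict(dict)
    for line in lines:
        match = CLOSE_RE.search(line)
        if match:
            side = match.group(1)
            bridge, connection, nbytes = (int(g) for g in match.group(2, 3, 4))
            counts[(bridge, connection)][side] = nbytes
    return dict(counts)


def mismatches(lines):
    """Connections whose byte counts differ or were logged by one side only."""
    return sorted(
        key
        for key, sides in byte_counts(lines).items()
        if sides.get("black") != sides.get("white")
    )


if __name__ == "__main__":
    sample = [
        "Jun  7 07:56:56 loghost black: Client closed tcp:127.0.0.1:39474 -> 514 on bridge 1 connection 1 after 5 bytes",
        "Jun  7 07:56:56 loghost white: Closed tcp:127.0.0.1:23233 on bridge 1 connection 1 after 5 bytes",
    ]
    print(mismatches(sample))
```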

File transfer and general application proxies

In a similar manner we can implement general file transfer implementing proxies for mail/smtp, program-to-program communication etc:

hostA$ ncat blackhole 12345 < file-to-send

hostB$ ncat -l 12345 -o file-to-receive

The blackhole/whitehole hosts are configured in a similar way as in the syslog example, but for port 12345/tcp and hosts A and B.

Using this file replication mechanism, it is simple to implement more advanced application proxies. For mail/smtp just replicate the mail spool files and then feed them into the internal mail system. To replicate a file structure (e.g. web server, repository, backup etc), send a tar archive over the firewall, etc. Using simple script programming, many application proxies can be implemented this way in just minutes.
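
As an example of such a script, the Python sketch below packs a directory tree into a compressed tar archive, sends it to a TCP port, and unpacks it on the far side. The host name and port follow the ncat example above; the function names and the directory path are made up for illustration. Unpacking uses tarfile's "data" extraction filter where the running Python provides it, since the archive arrives from the less trusted side.

```python
import io
import socket
import tarfile


def pack_tree(root: str) -> bytes:
    """Pack a directory tree into an in-memory gzip-compressed tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        archive.add(root, arcname=".")
    return buffer.getvalue()


def send_bytes(data: bytes, host: str, port: int) -> None:
    """Send the archive over TCP, e.g. to the port the outer host listens on."""
    with socket.create_connection((host, port)) as connection:
        connection.sendall(data)


def unpack_tree(data: bytes, destination: str) -> None:
    """Unpack a received archive, refusing unsafe members where supported."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
        if hasattr(tarfile, "data_filter"):
            archive.extractall(destination, filter="data")
        else:
            archive.extractall(destination)  # older Python without extraction filters


if __name__ == "__main__":
    # Sending side (host A); the directory path is an assumed example.
    send_bytes(pack_tree("/srv/repository"), "blackhole", 12345)
```

On the inside, the received bytes would be handed to unpack_tree once the connection has closed.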

A word of caution

Care must still be taken when implementing application proxies. If you accept mail into a secure network, the network is vulnerable to a malware attack via this channel. Even if the malware has no possibility to "call home", it could still execute a ransomware attack, encrypting a disk or similar. Likewise, if you replicate a WSUS server to your internal network, you still have to trust the content of that server.

Questions

Feel free to contact me for questions or other feedback.

/By Mikael Q Kuisma