[This was originally posted at https://research.nccgroup.com/2020/12/15/an-adventure-in-contingency-debugging-ruby-ioread-iowrite-considered-harmful/.]
Recently, I was working on weaponizing a particular bug with a colleague. For reasons unfathomable to me, we’ve been implementing our exploit in Ruby. As part of this, we wrote a rinky-dink port scanner that attempts to find instances of the service on a given host because it has a propensity to pick random ports to listen on. This was necessary because one can only really fingerprint the service by sending it specific malformed messages and detecting whether a specific error message was returned. So color me surprised when my colleague messages me and tells me that they have the scanner working, but with the caveat that “when I run it in a Docker container it hangs on read and when I run it in a VM it works.” And so began a day of stark raving incredulity.
require 'socket'
def test_target(host, port)
begin
socket = TCPSocket.new host, port
status = :open
rescue Errno::ECONNREFUSED
status = :closed
rescue Errno::ETIMEDOUT
status = :filtered
end
service = nil
if status == :open
socket.write "<REDACTED>"
begin
line = socket.read (1024 * 8)
if line.include? "<REDACTED>"
service = :thething
end
rescue IO::WaitReadable
IO.select([socket])
retry
end
end
return [host, port, status, service]
end
puts test_target('localhost', '<port>').inspect
The above code is a simplified version of the already pretty simple scanner. So let’s start with the failing case. We’ll run the code in a Docker container and claim our error.
root@970ccbf76e01:/mnt# ruby scan.rb
Traceback (most recent call last):
2: from scan.rb:30:in `<main>'
1: from scan.rb:17:in `test_target'
scan.rb:17:in `read': Connection reset by peer (Errno::ECONNRESET)
So that checks out. Next, we should confirm that the server is actually replying to us.
A quick tcpdump will confirm that the server gets our broken message, replies with
its own and then closes the connection.
root@970ccbf76e01:/mnt# tcpdump -i lo -n -X 'port <port>'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
10:12:50.611092 IP 127.0.0.1.58562 > 127.0.0.1.PORT: Flags [S], ...
...
10:12:50.611108 IP 127.0.0.1.PORT > 127.0.0.1.58562: Flags [S.], ...
...
10:12:50.611120 IP 127.0.0.1.58562 > 127.0.0.1.PORT: Flags [.], ...
...
10:12:50.611241 IP 127.0.0.1.58562 > 127.0.0.1.PORT: Flags [P.], ...
(<REDACTED>)
10:12:50.611653 IP 127.0.0.1.PORT > 127.0.0.1.58562: Flags [P.], ...
(<ERROR MESSAGE>)
10:12:50.611684 IP 127.0.0.1.58562 > 127.0.0.1.PORT: Flags [.], ...
...
10:12:50.611904 IP 127.0.0.1.PORT > 127.0.0.1.58562: Flags [R.], ...
(RST)
At this point, I’m starting to get suspicious of potential eldritch Ruby-isms
creeping there way in from the depths of R’lyeh. We know for a fact that the
remote side sent us some data and then closed the connection with a RST packet
after the client ACK’d the data. So it should not have been possible for an
error to have been received on the socket before the data. The Berkeley sockets
API is hot garbage, but it’s not that bad. My first thought to attempt to shake
out any Ruby horrors is Ruby’s TracePoint API.
Using this API, we can put together a fairly simple tracer that executes around a block.
def start_trace
trace = TracePoint.new(:call, :return, :line,
:c_call, :c_return, :b_call, :b_return,
:raise, :thread_begin, :thread_end) { |tp|
if [:call, :c_call].include?(tp.event)
p ["#{tp.path}:#{tp.lineno}", tp.event,
"#{tp.defined_class}.#{tp.method_id}",
tp.self.method(tp.method_id), tp.self.method(tp.method_id).parameters,
tp.parameters]
else
p ["#{tp.path}:#{tp.lineno}", tp.event,
"#{tp.defined_class}.#{tp.method_id}", tp.self.method(tp.method_id)]
end
}
trace.enable
yield
ensure
trace.disable
end
With this, we can trace through the entirety of the the socket.read call, and
even hoist the tracer around whatever pieces of code we want.
start_trace {
line = socket.read (1024 * 8)
}
root@970ccbf76e01:/mnt# ruby scan-tpex.rb
["<internal:prelude>:137", :c_return, "TracePoint.__enable", #<Method: TracePoint#__enable>]
["<internal:prelude>:138", :return, "TracePoint.enable", #<Method: TracePoint#enable>]
["scan-tpex.rb:15", :line, "Object.start_trace", #<Method: main.start_trace>]
["scan-tpex.rb:34", :b_call, "Object.test_target", #<Method: main.test_target>]
["scan-tpex.rb:35", :line, "Object.test_target", #<Method: main.test_target>]
["scan-tpex.rb:35", :c_call, "IO.read", #<Method: TCPSocket(IO)#read>, [[:rest]], [[:rest]]]
["scan-tpex.rb:35", :c_call, "SystemCallError.initialize", #<Method: Errno::ECONNRESET(SystemCallError)#initialize>, [[:rest]], [[:rest]]]
["scan-tpex.rb:35", :c_call, "Exception.initialize", #<Method: Errno::ECONNRESET(SystemCallError)#initialize>, [[:rest]], [[:rest]]]
["scan-tpex.rb:35", :c_return, "Exception.initialize", #<Method: Errno::ECONNRESET(SystemCallError)#initialize>]
["scan-tpex.rb:35", :c_return, "SystemCallError.initialize", #<Method: Errno::ECONNRESET(SystemCallError)#initialize>]
["scan-tpex.rb:35", :c_call, "Exception.exception", #<Method: Errno::ECONNRESET(Exception)#exception>, [[:rest]], [[:rest]]]
["scan-tpex.rb:35", :c_return, "Exception.exception", #<Method: Errno::ECONNRESET(Exception)#exception>]
["scan-tpex.rb:35", :c_call, "Exception.backtrace", #<Method: Errno::ECONNRESET(Exception)#backtrace>, [], []]
["scan-tpex.rb:35", :c_return, "Exception.backtrace", #<Method: Errno::ECONNRESET(Exception)#backtrace>]
["scan-tpex.rb:35", :raise, "IO.read", #<Method: TCPSocket(IO)#read>]
["scan-tpex.rb:35", :c_return, "IO.read", #<Method: TCPSocket(IO)#read>]
["scan-tpex.rb:35", :b_return, "Object.test_target", #<Method: main.test_target>]
["scan-tpex.rb:17", :line, "Object.start_trace", #<Method: main.start_trace>]
["scan-tpex.rb:17", :c_call, "TracePoint.disable", #<Method: TracePoint#disable>, [], []]
scan-tpex.rb:35:in `read': Connection reset by peer (Errno::ECONNRESET)
from scan-tpex.rb:35:in `block in test_target'
from scan-tpex.rb:15:in `start_trace'
from scan-tpex.rb:34:in `test_target'
from scan-tpex.rb:50:in `<main>'
The first and last couple of lines are typical of one of the things tracer is
hooking on, once it’s actually enabled internally, it can see the the inner
workings of the TracePoint API itself, from the remaining parts of it being
enabled to the beginning parts of it being disabled per the trace.enable and
trace.disable calls from start_trace. However, from our perspective, the only
interesting lines are the following:
["scan-tpex.rb:35", :c_call, "IO.read", #<Method: TCPSocket(IO)#read>, [[:rest]], [[:rest]]]- The group of lines from
:c_call, "SystemCallError.initialize"to:c_return, "Exception.backtrace" ["scan-tpex.rb:35", :raise, "IO.read", #<Method: TCPSocket(IO)#read>]["scan-tpex.rb:35", :c_return, "IO.read", #<Method: TCPSocket(IO)#read>]
From this we can tell that once the call heads into the native C function for
IO.read, the TracePoint API can only see it start constructing and raising an
exception, which then causes it to return from IO.read. This is not an ideal
amount of granularity, but at least it gives us a place to start looking. An
interesting thing to note here is that calling the read method of a TCPSocket
object is a direct call to the read method of TCPSocket‘s superclass, IO.
Therefore, the place to start looking is in the io.c source file within the
main Ruby codebase, where we can find that the IO#read method is registered
to the io_read C function.
To save readers the trouble of attempting to make
sense of this file, the call path we are looking for is
is_read -> io_fread
-> bufread_call
-> io_bufread.
This last function has a loop around rb_read_internal,
which is a wrapper for internal_read_func,
which directly calls read(2).
static long
io_bufread(char *ptr, long len, rb_io_t *fptr)
{
long offset = 0;
long n = len;
long c;
if (READ_DATA_PENDING(fptr) == 0) {
while (n > 0) {
again:
c = rb_read_internal(fptr->fd, ptr+offset, n);
if (c == 0) break;
if (c < 0) {
if (fptr_wait_readable(fptr))
goto again;
return -1;
}
offset += c;
if ((n -= c) <= 0) break;
}
return len - n;
}
while (n > 0) {
c = read_buffered_data(ptr+offset, n, fptr);
if (c > 0) {
offset += c;
if ((n -= c) <= 0) break;
}
rb_io_check_closed(fptr);
if (io_fillbuf(fptr) < 0) {
break;
}
}
return len - n;
}
The important thing about this code is the first while loop that calls rb_read_internal.
Essentially, io_bufread tries to read(2) from the socket until the amount requested has
been reached, or an EOL is returned. We can quickly confirm that is the case with strace.
root@970ccbf76e01:/mnt# strace -f -s 200 ruby scan.rb
...
socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = 5
fcntl(5, F_GETFD) = 0x1 (flags FD_CLOEXEC)
connect(5, {sa_family=AF_INET, sin_port=htons(<port>), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
fstat(5, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
write(5, "<REDACTED>"..., 10) = 10
read(5, "<REDACTED>"..., 8192) = 695
read(5, 0x5587027e4297, 7497) = -1 ECONNRESET (Connection reset by peer)
write(2, "\33[1mTraceback\33[m (most recent call last):\n", 42Traceback (most recent call last):
) = 42
...
Two read(2) syscalls, with the second one performed immediately after the first
and attempting to read the 1024*8 byte size minus the amount read by the first syscall.
As the service being scanned closes the connection immediately after returning
its response, any subsequent attempts to read on the socket to that service will
fail. However, because of the quirky nature of io_bufread, which appears to
collapse multiple read(2) calls into a single operation that tries to read the
entire buffer, the second read for the remaining data will fail, which causes
the entire TcpSocket#read (really IO#read) operation to fail, even though data
was, in fact, read.
Per the documentation
for io_read, this is actually the expected behavior:
If length is a positive integer,
readtries to read length bytes without any conversion (binary mode). It returnsnilif an EOF is encountered before anything can be read. Fewer than length bytes are returned if an EOF is encountered during the read.
So what is the right way to write this code? Much like with C, the answer is not
to use generic file IO functions (i.e. read(2) and write(2)) for sockets, but
instead the socket-specific functions (i.e. send(2) and recv(2)). Much like
everything else in Ruby, there is a bit of indirection around TcpSocket#recv
(really BasicSocket#recv, which wraps recvfrom(2)), but it doesn’t exhibit the
error collapsing behavior of IO#read and will do what you would have expected
from calling read on a TcpSocket.
Truth be told, one of the very first things I tried when my colleague raised this
issue was calling recv instead of read and using strace to discern between the
two, which very quickly yielded the two read(2) calls. But from there, the
journey to find why there was a difference consumed disproportionately more
time given that the TracePoint API ended up being completely useless, especially
because it can’t really access method parameters, especially those for native
functions. Because of that, most of this analysis was done with a mix of source
code trawling and very hacky Frida-based
hooking of the few functions within Ruby/libruby that actually have symbols.
read_inteceptor = Interceptor.attach(mod.getExportByName('read'), {
onEnter: function(args) {
console.log(" read(" + args[0] + ", ..., " + args[2] + ")")
console.log(" " +
Thread.backtrace(this.context, Backtracer.FUZZY)
.map(DebugSymbol.fromAddress).join('\n')
);
},
onLeave: function(retval) {
console.log(" read() -> " + retval + "\n")
}
});
io_bufread_inteceptor = Interceptor.attach(sym_to_addr_map['io_bufread'].address, {
onEnter: function(args) {
console.log("io_bufread(..., " + args[1] + ", ...)")
},
onLeave: function(retval) {
console.log("io_bufread() -> " + retval)
}
});
rb_io_wait_readable_inteceptor = Interceptor.attach(sym_to_addr_map['rb_io_wait_readable'].address, {
onEnter: function(args) {
console.log(" rb_io_wait_readable(...)");
},
onLeave: function(retval) {
console.log(" rb_io_wait_readable() -> " + retval)
}
});
io_fillbuf_inteceptor = Interceptor.attach(sym_to_addr_map['io_fillbuf.part.39'].address, {
onEnter: function(args) {
console.log("io_fillbuf(...)")
},
onLeave: function(retval) {
console.log("io_fillbuf() -> " + retval + "\n")
}
});
Conclusion
Along this adventure, I learned that, much like how glibc syscall stubs
don’t have a one-to-one mapping to the syscalls
you would expect them to call,
Ruby’s POSIX syscall-like methods have a mind of their own. I also learned
that Ruby’s TracePoint API doesn’t help much once you get to the C parts of
(C)Ruby/MRI, and that vanilla syscall tracing remains an invaluable part of
the Ruby debugger’s toolkit (though I’m sure one can get more throughput with
eBPF-based syscall tracers). As it turns out, TracePoint, Ruby’s own tracing
feature, can’t even trace Ruby’s underlying virtual machine opcodes
(i.e. those printed by RubyVM::InstructionSequence.disasm). Admittedly, the
nature of this part of Ruby is implementation specific, but the lack of deeper
granularity is still felt. In a later post, I’ll discuss the Ruby tracer I’ve
been working on, which can trace (C)Ruby/MRI’s internal bytecode VM (and more!).