[This was originally posted at https://research.nccgroup.com/2020/12/15/an-adventure-in-contingency-debugging-ruby-ioread-iowrite-considered-harmful/.]

Recently, I was working on weaponizing a particular bug with a colleague. For reasons unfathomable to me, we’ve been implementing our exploit in Ruby. As part of this, we wrote a rinky-dink port scanner that attempts to find instances of the service on a given host because it has a propensity to pick random ports to listen on. This was necessary because one can only really fingerprint the service by sending it specific malformed messages and detecting whether a specific error message was returned. So color me surprised when my colleague messages me and tells me that they have the scanner working, but with the caveat that “when I run it in a Docker container it hangs on read and when I run it in a VM it works.” And so began a day of stark raving incredulity.

require 'socket'

def test_target(host, port)
  begin
    socket = TCPSocket.new host, port
    status = :open
  rescue Errno::ECONNREFUSED
    status = :closed
  rescue Errno::ETIMEDOUT
    status = :filtered
  end

  service = nil
  if status == :open
    socket.write "<REDACTED>"
    begin
      line = socket.read (1024 * 8)
      if line.include? "<REDACTED>"
        service = :thething
      end
    rescue IO::WaitReadable
      IO.select([socket])
      retry
    end
  end

  return [host, port, status, service]
end

puts test_target('localhost', '<port>').inspect

The above code is a simplified version of the already pretty simple scanner. So let’s start with the failing case. We’ll run the code in a Docker container and claim our error.

root@970ccbf76e01:/mnt# ruby scan.rb
Traceback (most recent call last):
  2: from scan.rb:30:in `<main>'
  1: from scan.rb:17:in `test_target'
scan.rb:17:in `read': Connection reset by peer (Errno::ECONNRESET)

So that checks out. Next, we should confirm that the server is actually replying to us. A quick tcpdump will confirm that the server gets our broken message, replies with its own and then closes the connection.

root@970ccbf76e01:/mnt# tcpdump -i lo -n -X 'port <port>'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
10:12:50.611092 IP 127.0.0.1.58562 > 127.0.0.1.PORT: Flags [S], ...
  ...
10:12:50.611108 IP 127.0.0.1.PORT > 127.0.0.1.58562: Flags [S.], ...
  ...
10:12:50.611120 IP 127.0.0.1.58562 > 127.0.0.1.PORT: Flags [.], ...
  ...
10:12:50.611241 IP 127.0.0.1.58562 > 127.0.0.1.PORT: Flags [P.], ...
  (<REDACTED>)
10:12:50.611653 IP 127.0.0.1.PORT > 127.0.0.1.58562: Flags [P.], ...
  (<ERROR MESSAGE>)
10:12:50.611684 IP 127.0.0.1.58562 > 127.0.0.1.PORT: Flags [.], ...
  ...
10:12:50.611904 IP 127.0.0.1.PORT > 127.0.0.1.58562: Flags [R.], ...
  (RST)

At this point, I’m starting to get suspicious of potential eldritch Ruby-isms creeping there way in from the depths of R’lyeh. We know for a fact that the remote side sent us some data and then closed the connection with a RST packet after the client ACK’d the data. So it should not have been possible for an error to have been received on the socket before the data. The Berkeley sockets API is hot garbage, but it’s not that bad. My first thought to attempt to shake out any Ruby horrors is Ruby’s TracePoint API.

Using this API, we can put together a fairly simple tracer that executes around a block.

def start_trace
  trace = TracePoint.new(:call, :return, :line,
                         :c_call, :c_return, :b_call, :b_return,
                         :raise, :thread_begin, :thread_end) { |tp|
    if [:call, :c_call].include?(tp.event)
      p ["#{tp.path}:#{tp.lineno}", tp.event,
         "#{tp.defined_class}.#{tp.method_id}",
         tp.self.method(tp.method_id), tp.self.method(tp.method_id).parameters,
         tp.parameters]
    else
      p ["#{tp.path}:#{tp.lineno}", tp.event,
         "#{tp.defined_class}.#{tp.method_id}", tp.self.method(tp.method_id)]
    end
  }
  trace.enable
  yield
ensure
  trace.disable
end

With this, we can trace through the entirety of the the socket.read call, and even hoist the tracer around whatever pieces of code we want.

start_trace {
  line = socket.read (1024 * 8)
}

root@970ccbf76e01:/mnt# ruby scan-tpex.rb
["<internal:prelude>:137", :c_return, "TracePoint.__enable", #<Method: TracePoint#__enable>]
["<internal:prelude>:138", :return, "TracePoint.enable", #<Method: TracePoint#enable>]
["scan-tpex.rb:15", :line, "Object.start_trace", #<Method: main.start_trace>]
["scan-tpex.rb:34", :b_call, "Object.test_target", #<Method: main.test_target>]
["scan-tpex.rb:35", :line, "Object.test_target", #<Method: main.test_target>]
["scan-tpex.rb:35", :c_call, "IO.read", #<Method: TCPSocket(IO)#read>, [[:rest]], [[:rest]]]
["scan-tpex.rb:35", :c_call, "SystemCallError.initialize", #<Method: Errno::ECONNRESET(SystemCallError)#initialize>, [[:rest]], [[:rest]]]
["scan-tpex.rb:35", :c_call, "Exception.initialize", #<Method: Errno::ECONNRESET(SystemCallError)#initialize>, [[:rest]], [[:rest]]]
["scan-tpex.rb:35", :c_return, "Exception.initialize", #<Method: Errno::ECONNRESET(SystemCallError)#initialize>]
["scan-tpex.rb:35", :c_return, "SystemCallError.initialize", #<Method: Errno::ECONNRESET(SystemCallError)#initialize>]
["scan-tpex.rb:35", :c_call, "Exception.exception", #<Method: Errno::ECONNRESET(Exception)#exception>, [[:rest]], [[:rest]]]
["scan-tpex.rb:35", :c_return, "Exception.exception", #<Method: Errno::ECONNRESET(Exception)#exception>]
["scan-tpex.rb:35", :c_call, "Exception.backtrace", #<Method: Errno::ECONNRESET(Exception)#backtrace>, [], []]
["scan-tpex.rb:35", :c_return, "Exception.backtrace", #<Method: Errno::ECONNRESET(Exception)#backtrace>]
["scan-tpex.rb:35", :raise, "IO.read", #<Method: TCPSocket(IO)#read>]
["scan-tpex.rb:35", :c_return, "IO.read", #<Method: TCPSocket(IO)#read>]
["scan-tpex.rb:35", :b_return, "Object.test_target", #<Method: main.test_target>]
["scan-tpex.rb:17", :line, "Object.start_trace", #<Method: main.start_trace>]
["scan-tpex.rb:17", :c_call, "TracePoint.disable", #<Method: TracePoint#disable>, [], []]
scan-tpex.rb:35:in `read': Connection reset by peer (Errno::ECONNRESET)
  from scan-tpex.rb:35:in `block in test_target'
  from scan-tpex.rb:15:in `start_trace'
  from scan-tpex.rb:34:in `test_target'
  from scan-tpex.rb:50:in `<main>'

The first and last couple of lines are typical of one of the things tracer is hooking on, once it’s actually enabled internally, it can see the the inner workings of the TracePoint API itself, from the remaining parts of it being enabled to the beginning parts of it being disabled per the trace.enable and trace.disable calls from start_trace. However, from our perspective, the only interesting lines are the following:

["scan-tpex.rb:35", :c_call, "IO.read", #<Method: TCPSocket(IO)#read>, [[:rest]], [[:rest]]]
The group of lines from :c_call, "SystemCallError.initialize" to :c_return, "Exception.backtrace"
["scan-tpex.rb:35", :raise, "IO.read", #<Method: TCPSocket(IO)#read>]
["scan-tpex.rb:35", :c_return, "IO.read", #<Method: TCPSocket(IO)#read>]

From this we can tell that once the call heads into the native C function for IO.read, the TracePoint API can only see it start constructing and raising an exception, which then causes it to return from IO.read. This is not an ideal amount of granularity, but at least it gives us a place to start looking. An interesting thing to note here is that calling the read method of a TCPSocket object is a direct call to the read method of TCPSocket‘s superclass, IO. Therefore, the place to start looking is in the io.c source file within the main Ruby codebase, where we can find that the IO#read method is registered to the io_read C function. To save readers the trouble of attempting to make sense of this file, the call path we are looking for is is_read -> io_fread -> bufread_call -> io_bufread. This last function has a loop around rb_read_internal, which is a wrapper for internal_read_func, which directly calls read(2).

static long
io_bufread(char *ptr, long len, rb_io_t *fptr)
{
    long offset = 0;
    long n = len;
    long c;

    if (READ_DATA_PENDING(fptr) == 0) {
        while (n > 0) {
          again:
            c = rb_read_internal(fptr->fd, ptr+offset, n);
            if (c == 0) break;
            if (c < 0) {
                if (fptr_wait_readable(fptr))
                    goto again;
                return -1;
            }
            offset += c;
            if ((n -= c) <= 0) break;
        }
        return len - n;
    }

    while (n > 0) {
        c = read_buffered_data(ptr+offset, n, fptr);
        if (c > 0) {
            offset += c;
            if ((n -= c) <= 0) break;
        }
        rb_io_check_closed(fptr);
        if (io_fillbuf(fptr) < 0) {
            break;
        }
    }
    return len - n;
}

The important thing about this code is the first while loop that calls rb_read_internal. Essentially, io_bufread tries to read(2) from the socket until the amount requested has been reached, or an EOL is returned. We can quickly confirm that is the case with strace.

root@970ccbf76e01:/mnt# strace -f -s 200 ruby scan.rb
...
socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = 5
fcntl(5, F_GETFD)                       = 0x1 (flags FD_CLOEXEC)
connect(5, {sa_family=AF_INET, sin_port=htons(<port>), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
fstat(5, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
write(5, "<REDACTED>"..., 10)              = 10
read(5, "<REDACTED>"..., 8192) = 695
read(5, 0x5587027e4297, 7497)           = -1 ECONNRESET (Connection reset by peer)
write(2, "\33[1mTraceback\33[m (most recent call last):\n", 42Traceback (most recent call last):
) = 42
...

Two read(2) syscalls, with the second one performed immediately after the first and attempting to read the 1024*8 byte size minus the amount read by the first syscall.

As the service being scanned closes the connection immediately after returning its response, any subsequent attempts to read on the socket to that service will fail. However, because of the quirky nature of io_bufread, which appears to collapse multiple read(2) calls into a single operation that tries to read the entire buffer, the second read for the remaining data will fail, which causes the entire TcpSocket#read (really IO#read) operation to fail, even though data was, in fact, read.

Per the documentation for io_read, this is actually the expected behavior:

If length is a positive integer, read tries to read length bytes without any conversion (binary mode). It returns nil if an EOF is encountered before anything can be read. Fewer than length bytes are returned if an EOF is encountered during the read.

So what is the right way to write this code? Much like with C, the answer is not to use generic file IO functions (i.e. read(2) and write(2)) for sockets, but instead the socket-specific functions (i.e. send(2) and recv(2)). Much like everything else in Ruby, there is a bit of indirection around TcpSocket#recv (really BasicSocket#recv, which wraps recvfrom(2)), but it doesn’t exhibit the error collapsing behavior of IO#read and will do what you would have expected from calling read on a TcpSocket.

Truth be told, one of the very first things I tried when my colleague raised this issue was calling recv instead of read and using strace to discern between the two, which very quickly yielded the two read(2) calls. But from there, the journey to find why there was a difference consumed disproportionately more time given that the TracePoint API ended up being completely useless, especially because it can’t really access method parameters, especially those for native functions. Because of that, most of this analysis was done with a mix of source code trawling and very hacky Frida-based hooking of the few functions within Ruby/libruby that actually have symbols.

    read_inteceptor = Interceptor.attach(mod.getExportByName('read'), {
      onEnter: function(args) {
        console.log("    read(" + args[0] + ", ..., " + args[2] + ")")
        console.log("    " +
          Thread.backtrace(this.context, Backtracer.FUZZY)
                .map(DebugSymbol.fromAddress).join('\n')
        );
      },
      onLeave: function(retval) {
        console.log("    read() -> " + retval + "\n")
      }
    });
    io_bufread_inteceptor = Interceptor.attach(sym_to_addr_map['io_bufread'].address, {
      onEnter: function(args) {
        console.log("io_bufread(..., " + args[1] + ", ...)")
      },
      onLeave: function(retval) {
        console.log("io_bufread() -> " + retval)
      }
    });
    rb_io_wait_readable_inteceptor = Interceptor.attach(sym_to_addr_map['rb_io_wait_readable'].address, {
      onEnter: function(args) {
        console.log("  rb_io_wait_readable(...)");
      },
      onLeave: function(retval) {
        console.log("  rb_io_wait_readable() -> " + retval)
      }
    });

    io_fillbuf_inteceptor = Interceptor.attach(sym_to_addr_map['io_fillbuf.part.39'].address, {
      onEnter: function(args) {
        console.log("io_fillbuf(...)")
      },
      onLeave: function(retval) {
        console.log("io_fillbuf() -> " + retval + "\n")
      }
    });

Conclusion

Along this adventure, I learned that, much like how glibc syscall stubs don’t have a one-to-one mapping to the syscalls you would expect them to call, Ruby’s POSIX syscall-like methods have a mind of their own. I also learned that Ruby’s TracePoint API doesn’t help much once you get to the C parts of (C)Ruby/MRI, and that vanilla syscall tracing remains an invaluable part of the Ruby debugger’s toolkit (though I’m sure one can get more throughput with eBPF-based syscall tracers). As it turns out, TracePoint, Ruby’s own tracing feature, can’t even trace Ruby’s underlying virtual machine opcodes (i.e. those printed by RubyVM::InstructionSequence.disasm). Admittedly, the nature of this part of Ruby is implementation specific, but the lack of deeper granularity is still felt. In a later post, I’ll discuss the Ruby tracer I’ve been working on, which can trace (C)Ruby/MRI’s internal bytecode VM (and more!).

unspo.nso.red

An Adventure in Contingency Debugging: Ruby IO#read/IO#write Considered Harmful

15 Dec 2020 - Jeff Dileo // #Ruby #Linux #Sycalls #Tracing

Conclusion

« Some Musings on Common (eBPF) Linux Tracing Bugs