Tom Purl's Blog

cowsayseries

Tags: #cowsayseries

(This blog post was originally published on 2013/11/29 and is part 3 of 3 of my Cowsay Series of articles.)

This is the third post in a series of articles about writing my first application that uses sockets. For more information about why I’m doing this or how, please see my first article.

Now With Rspec And STDERR!

Wow, that is not a sexy heading :–)

When I left off last time, I had a server that worked pretty well as long as it could parse everything that you sent to it. However, once things got a little funny, the client or server would simply fail.

There’s a lot that I want to change about the socket-oriented aspects of the server (i.e. how it handles EOF’s), but it was bugging the heck out of me that this thing was so brittle. So I had to fix that first.

Also, I got tired of running a bunch of functional tests by hand every time I added a new feature or refactored something, so I decided to try this computer automation thing that all of the kids are doing. I’ll talk more about how I used RSpec to do this later in the article.

Oh, and since my “project” has 3 whole files now and, like, dozens of lines of code, I’ve decided to actually host it as a project on Github. You can see it here:

Using Popen3 To Improve Security and Error-Handling

Fixing My Command Injection Bug

In my last iteration, I executed cowsay using the following line of code:

`cowsay -f #{commands[:body]} "#{commands[:message]}"`

One of the problems with this code is that it makes it very easy to “inject” commands that have nothing to do with cowsay.

For example, here’s a simple way to invoke cowsay using a heredoc:

cat <<EOF | nc localhost 4481
MESSAGE Hi
BODY hellokitty
EOF

This would give us the following:

STATUS 0

 ----
< Hi >
 ----
 \
  \
     /\_)o<
     |      \
     | O . O|
      \_____/

In this example, the line of code above would interpolate to this:

`cowsay -f hellokitty "Hi"`

Everything looks good so far, but what if someone sent the following string to netcat:

cat <<EOF | nc localhost 4481
MESSAGE Hi"; sleep "5
BODY hellokitty
EOF

It’s possible that the line of code could interpolate to this:

`cowsay -f hellokitty "Hi"; sleep "5"`

This actually works. If you run the netcat command above against this version of the server.rb file, then it will sleep for 5 seconds before it returns the output of cowsay.
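To see the problem without running anything, here is a minimal sketch that only builds the string the backtick one-liner would hand to the shell. The helper name shell_command_for is made up for this illustration and is not part of server.rb.

```ruby
# Hypothetical helper: reproduces the interpolation from the backtick
# one-liner, but returns the string instead of executing it.
def shell_command_for(commands)
  %(cowsay -f #{commands[:body]} "#{commands[:message]}")
end

benign  = shell_command_for(body: 'hellokitty', message: 'Hi')
hostile = shell_command_for(body: 'hellokitty', message: 'Hi"; sleep "5')

puts benign   # cowsay -f hellokitty "Hi"
puts hostile  # cowsay -f hellokitty "Hi"; sleep "5"
```

The second string is two shell commands, because the quote inside the message closes the quote that the server opened.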

Of course, sleeping for 5 seconds isn’t really the worst case scenario. An attacker could inject a shell command that does things like delete important files or install malicious code.

The solution to this problem is simple and time-tested – parameterize your input. Here’s my new version of the code that executes the cowsay command:

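require 'open3'

def process(commands)
  output = nil
  err_msg = nil
  exit_status = nil

  # Passing the command and each argument as separate strings bypasses the shell.
  Open3.popen3('/usr/games/cowsay', '-f', commands[:body], commands[:message]) do |stdin, stdout, stderr, wait_thr|
    stdin.close # cowsay gets its message from the arguments, not from STDIN
    output = stdout.read
    err_msg = stderr.read
    # wait_thr.value blocks until the process exits, so no extra wait is needed.
    exit_status = wait_thr.value.exitstatus
  end

  # Any non-zero status means cowsay failed, so return its STDERR text instead.
  if exit_status != 0
    output = "ERROR #{err_msg}"
  end

  return exit_status, output
end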

This is a bit more complex than the previous one-liner, so here’s a quick summary of what I’m doing:

  • I use the popen3 method to execute the cowsay command.
  • I parameterize my options and arguments by separating them with commas. By doing so, I’m no longer passing my command to the shell, which means significantly fewer options for command injection.
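Here is a small sketch of why the comma-separated form is safer. It assumes nothing about cowsay: the child process is just another Ruby interpreter (found via RbConfig.ruby) that prints its first argument, and it uses Open3.capture3, a convenience wrapper from the same Open3 module, rather than popen3.

```ruby
require 'open3'
require 'rbconfig'

# Stand-in for cowsay so the sketch runs anywhere Ruby does: a child Ruby
# process that prints its first argument verbatim.
child = [RbConfig.ruby, '-e', 'print ARGV[0]']

# Each argument is its own list element, so no shell ever parses the message.
stdout, _stderr, status = Open3.capture3(*child, 'Hi"; sleep "5')

puts stdout            # the message arrives intact, quotes and semicolon included
puts status.exitstatus
```

Since the quote and the semicolon are never seen by a shell, they are just characters in the message.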

Now let’s try my “sleepy” version of the netcat command above with the new version of server.rb:

cat <<EOF | nc localhost 4481
MESSAGE Hi; sleep 5
BODY hellokitty
EOF

...which would give you this:

STATUS 0

  _____________
 < Hi; sleep 5 >
  -------------
  \
   \
      /\_)o<
      |      \
      | O . O|
      \_____/

Hooray! No more shell games.

Handling Non-Fatal Errors

The last version of my server.rb file did a really poor job handling really rudimentary parsing errors. For example, if you didn’t pass the MESSAGE heading properly, the server would write a message to the STDERR and then freeze. Also, if you messed up your BODY heading, the server would simply write a message to its console. This is not terribly helpful for your client.

I needed a way to convey error messages to the client. I therefore decided on the following conventions:

  • I would always return a STATUS heading. If everything was processed properly, this code would always be 0. Otherwise, it would be some number greater than 0.

  • If the STATUS is 0, then an ascii art picture would be returned. Otherwise, an error message would be returned.

Now when the MESSAGE heading is malformed I can simply send an error message back to the client with the appropriate status from the parse method.

Grabbing the status code and error message from the cowsay command is easily accomplished using the popen3 method in the code example above. This command makes it easy to read the STDOUT and STDERR file handles along with the status code returned by the cowsay process. All I have to do then is test if the status code is > 0, and if it is, return the contents of STDERR.
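The convention can be sketched roughly like this. The helper names run_command and build_response are made up, the child processes are stand-ins rather than cowsay, and the exact layout of the response (a STATUS line, a blank line, then the body) is my assumption based on the sample output earlier in this post.

```ruby
require 'open3'
require 'rbconfig'

# Run a command without a shell and return [exit_status, output], where
# output is STDOUT on success and "ERROR <stderr>" otherwise.
def run_command(*argv)
  stdout, stderr, status = Open3.capture3(*argv)
  return [0, stdout] if status.success?

  [status.exitstatus, "ERROR #{stderr}"]
end

# Assumed wire format: a STATUS line, a blank line, then the body.
def build_response(exit_status, output)
  "STATUS #{exit_status}\n\n#{output}"
end

# Stand-in child processes (not cowsay): one succeeds, one fails.
ok     = run_command(RbConfig.ruby, '-e', 'print "moo"')
failed = run_command(RbConfig.ruby, '-e', 'warn "no such cowfile"; exit 1')

puts build_response(*ok)
puts build_response(*failed)
```

The client only has to look at the first line to know whether the rest is a cow or a complaint.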

Automated Functional Testing Made Simple

Now that my little script is actually starting to flirt with the idea of usefulness, I found that I was running a lot of manual tests against it. Of course, running these tests was error prone and labor intensive, so I finally tried to find some way to test the code in an automated way.

The solution was writing a half-dozen RSpec tests, which was much easier than I thought it would be. As a matter of fact, it only took half an hour to cover all of the tests that I needed, which will probably save me at least an hour this week alone.

Here’s the current version of cowsay-spec.rb. To run the tests, this is all that I have to type:

rspec cowsay-spec.rb

One nice thing about RSpec is that it’s very easy to read. Even if you’re not a programmer, you can probably infer what I’m doing.

Also, please note that I’m not using the cowsay client.rb file to drive these tests. I figured that if any network client written in any language can interact with the cowsay server, then it makes the most sense to test it using “raw” sockets. And the easiest way for me to do that is to shell out a call to netcat.
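To give a feel for the "raw sockets" idea, here is a dependency-free sketch. It is not the contents of cowsay-spec.rb: it swaps netcat for Ruby's TCPSocket, swaps RSpec for a plain raise, and talks to a made-up stand-in server instead of the real server.rb so that it can run on its own.

```ruby
require 'socket'

# Stand-in server (an assumption for this sketch, not the real server.rb):
# answers one request with a STATUS line followed by the text it received.
server = TCPServer.new('127.0.0.1', 0) # port 0 lets the OS pick a free port
port = server.local_address.ip_port

server_thread = Thread.new do
  connection = server.accept
  request = connection.read            # blocks until the client sends EOF
  connection.write("STATUS 0\n\n#{request}")
  connection.close
end

# The "test client": raw sockets only, the same job netcat does.
def send_raw(port, payload)
  socket = TCPSocket.new('127.0.0.1', port)
  socket.write(payload)
  socket.close_write                   # signal EOF so the server stops reading
  socket.read
ensure
  socket&.close
end

response = send_raw(port, "MESSAGE Hi\nBODY hellokitty\n")
server_thread.join
server.close

# A plain check; in RSpec this would live inside an `it` block.
raise 'expected a zero status' unless response.start_with?('STATUS 0')
puts response
```

Because the check only looks at bytes on the wire, it would work the same against a server written in any language.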

Seriously, I should have done this at the beginning. It’s already saving me a ton of time, and it’s so easy to use.

Conclusion

I finally feel like I’m getting close to something that is actually useful. I can handle errors in a robust and intuitive way, and I can now test any new or updated features very quickly and easily.

Next, I’m going to focus on improving the way that streams are read and written by the client and server. Once that’s done, I believe that I will have developed this project as much as I can.

Tags: #cowsayseries

This blog post was originally published on 2013/11/27

(This article is part 2 of 3 of my Cowsay Series of articles.)

This is the second post in a series of articles about writing my first application that uses sockets. For more information about why I’m doing this or how, please see my first article.

More Functional Requirements

I have a working server, but there are two things that bug me about it:

  1. I have to test it using netcat, which is good for simple stuff but things would be much easier with an actual client.
  2. Right now, the server just processes a “raw” string of commands. I would rather have the server interpret parameters.

I figure that I’m going to need some type of “message format” to make requirement #2 work, so I first try to define that.

My Message Format

Since I’m familiar with HTTP, I decided to use a message format that is very similar. Right now, I simply want to be able to pass a message and cow body format to the cowsay server. I therefore decided to send messages that look something like this:

MESSAGE This SUCKS!
BODY beavis.zen

That’s it. Just plain old text (unicode?) over the wire with two properties. In the future, I’ll probably want to use return codes and more header options.
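As a quick illustration of the format, here is a minimal round-trip sketch. The helper names build_request and parse_request are made up for this example and are not the ones used in the client and server later in this post.

```ruby
# Hypothetical helpers illustrating the two-line message format.
def build_request(message, body = 'default')
  "MESSAGE #{message}\nBODY #{body}\n"
end

# Turn "KEY value" lines into a hash such as { message: ..., body: ... }.
def parse_request(document)
  document.each_line.with_object({}) do |line, fields|
    key, value = line.chomp.split(' ', 2) # split on the first run of whitespace
    fields[key.downcase.to_sym] = value if key
  end
end

request = build_request('This SUCKS!', 'beavis.zen')
puts request
p parse_request(request)
```

Splitting on the first run of whitespace means the value can contain spaces, and extra padding after the key is ignored.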

The Client

Here’s my first stab at a very simple client:

Github Gist

require 'socket'

module CowSay
    class Client
        class << self
            attr_accessor :host, :port
        end

        # Convert our arguments into a document that we can send to the cowsay
        # server.
        #
        # Options:
        #   message: The message that you want the cow to say
        #   body: The cowsay body that you want to use
        def self.say(options)

            if !options[:message]
                raise "ERROR: Missing message argument"
            end

            if !options[:body]
                options[:body] = "default"
            end

            request <<EOF
MESSAGE #{options[:message]}
BODY    #{options[:body]}
EOF
        end

        def self.request(string)
            # Create a new connection for each operation
            @client = TCPSocket.new(host, port)
            @client.write(string)

            # Send EOF after writing the request
            @client.close_write

            # Read until EOF to get the response
            @client.read
        end
    end
end

CowSay::Client.host = 'localhost'
CowSay::Client.port = 4481

puts CowSay::Client.say message: 'this is cool!'
puts CowSay::Client.say message: 'This SUCKS!', body: 'beavis.zen'
puts CowSay::Client.say message: 'Moshi moshi!', body: 'hellokitty'

This is really a very simple socket client. I have one real method called say which understands two keys, message and body. I then take those values, drop them in a heredoc, and then send that to the server.

Of course, now that I’m using a new message format, I’m going to need to make some changes on the server too.

The Server, Part Two

Here’s my stab at creating a server that can read the new message format:

Github Gist

require 'socket'

module CowSay
    class Server
        def initialize(port)
            # Create the underlying socket server
            @server = TCPServer.new(port)
            puts "Listening on port #{@server.local_address.ip_port}"
        end

        def start
            # TODO Currently this server can only accept one connection at a
            # time. Do I want to change that so I can process multiple requests
            # at once?
            Socket.accept_loop(@server) do |connection|
                handle(connection)
                connection.close
            end
        end

        # Find a value in a line for a given key
        def find_value_for_key(key, document)

            retval = nil

            re = /^#{key} (.*)/
            md = re.match(document)

            if md != nil
                retval = md[1]
            end

            retval
        end

        # Parse the document that is sent by the client and convert it into a
        # hash table.
        def parse(document)
            commands = Hash.new

            message_value = find_value_for_key("MESSAGE", document)
            if message_value == nil then
                $stderr.puts "ERROR: Empty message"
            end
            commands[:message] = message_value

            body_value = find_value_for_key("BODY", document)
            if body_value == nil then
                commands[:body] = "default"
            else
                commands[:body] = body_value
            end

            commands
        end

        def handle(connection)
            # TODO Read is going to block until EOF. I need to use something
            # different that will work without an EOF.
            request = connection.read

            # The current API will accept a message only from netcat. This
            # message is what the cow will say. Soon I will add support for
            # more features, like choosing your cow.

            # Write back the result of the hash operation
            connection.write process(parse(request))
        end

        def process(commands)
            # TODO Currently I can't capture STDERR output. This is
            # definitely a problem when someone passes a bogus
            # body file name.
            `cowsay -f #{commands[:body]} "#{commands[:message]}"`
        end
    end
end

server = CowSay::Server.new(4481)
server.start

There’s a few things that I added to this code:

  • Before sending the message to the process method, I now have to parse it.
  • The parse method simply grabs the MESSAGE and BODY values with some help from the find_value_for_key method and then performs some very simple validation.
  • The process method now does some very rudimentary parameterization. Eventually I would like some more safeguards in place to ensure that bad input cannot be passed to the cowsay executable, but for now this will do.

Testing

First, let’s take a look at some “happy path” testing. In your first window, execute the following command:

ruby server.rb
# Returns 'Listening on port 4481'

Great. Now in another window, execute the following command:

ruby client.rb
 _______________
< this is cool! >
 ---------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
 _____________
< This SUCKS! >
 -------------
   \         __------~~-,
    \      ,'            ,
          /               \
         /                :
        |                  '
         _| =-.     .-.   ||
         o|/o/       _.   |
         /  ~          \ |
       (____@)  ___~    |
          |_===~~~.`    |
       _______.--~     |
       \________       |
                \      |
              __/-___-- -__
             /            _ \
 ______________
< Moshi moshi! >
 --------------
  \
   \
      /\_)o<
     |      \
     | O . O|
      \_____/

Nice. Let’s also try a quick test using netcat:

echo "MESSAGE Oh YEAH\nBODY milk" | nc localhost 4481

...which should return:

 _________
< Oh YEAH >
 ---------
 \     ____________
  \    |__________|
      /           /\
     /           /  \
    /___________/___/|
    |          |     |
    |  ==\ /== |     |
    |   O   O  | \ \ |
    |     <    |  \ \|
   /|          |   \ \
  / |  \_____/ |   / /
 / /|          |  / /|
/||\|          | /||\/
    -------------|
       |  |  |  |
      <__/    \__>

And now for the unhappy path. What happens if I pass a “body type” that the cowsay server doesn’t recognize?

echo "MESSAGE Boom goes the dynamite\nBODY bogus" | nc localhost 4481

The client exits normally, but I see the following error message in the console window in which the server is running:

cowsay: Could not find bogus cowfile!

It looks like the STDERR from the cowsay process is only being written to the console. In the future, I’ll need to capture that and make the server respond appropriately.

What if I don’t pass a message?

echo "BODY default" | nc localhost 4481

In this case, the client freezes. I then see the following error in the server console window:

ERROR: Empty message

The server then becomes unresponsive. This is definitely the first bug that I will need to fix in my next revision.

Conclusion

I’m happy with the progress of my little socket server and client. In my next revision I am going to focus on the following:

  • Having the server handle bad input gracefully
  • Making sure that the server is able to respond in a predictable, informative way when it experiences issues
  • Finally ditching the backticks and executing the cowsay process in a more robust way.

Tags: #cowsayseries

This blog post was originally published on 2013/11/12

(This article is part 1 of 3 of my Cowsay Series of articles.)

I’ve read through Working With TCP Sockets a few times to improve my socket programming knowledge. I’ve administered software systems for a while now, so I know most of the basics, but there are definitely some gaps I should fill in. This book has been a great tool for helping me identify those gaps.

However, there is only so much I can learn by reading about other people’s code – I needed something that I could create and break and fix again to really understand the lessons from the book. I therefore decided to rip off Avdi Grimm and create my own cowsay server.

I always learn more when I write about what I’m learning, so I’m also going to blog about it. This post is the first in a series that will record the evolution of this script from a naive toy to something that someone else would actually consider using some day.

Requirements – Iteration 1

First, I need to point out that I’m not creating a web application. I’m creating a lower-level server that communicates with its client using plain old sockets. This example is designed to teach me about networking in general, not HTTP programming.

So what does that mean? Well, it means that I need to write my own server and client. Writing them both is a pretty tall order, and I’ve never even written one of these things before. What I need is some sort of naive “scaffold” that works well enough to provide feedback while I turn it into a “real” program.

I therefore think that my first requirement is to only write a server. All client communication will be performed by the netcat program. I can worry about the client in a future iteration.

My second and final requirement is that the server just work. I will put my ego on the bench for a little while and just write working code that I know has plenty of flaws and anti-patterns. I’m not writing the next Nginx here – I’m having fun and learning something new. Besides, there will be plenty of time to turn this into something that I can show off.

Code

Github gist

require 'socket'

module CowSay
    class Server
        def initialize(port)
            # Create the underlying socket server
            @server = TCPServer.new(port)
            puts "Listening on port #{@server.local_address.ip_port}"
        end

        def start
            # TODO Currently this server can only accept one connection at a
            # time. Do I want to change that so I can process multiple requests
            # at once?
            Socket.accept_loop(@server) do |connection|
                handle(connection)
                connection.close
            end
        end

        def handle(connection)
            # TODO Read is going to block until EOF. I need to use something
            # different that will work without an EOF.
            request = connection.read

            # The current API will accept a message only from netcat. This
            # message is what the cow will say. Soon I will add support for
            # more features, like choosing your cow.
            # TODO - Parse the request

            # Write back the result of the hash operation
            connection.write process(request)
        end

        def process(request)
            # TODO This is just painfully naive. I'll use a different
            # interface eventually.
            `cowsay "#{request}"`
        end
    end
end

server = CowSay::Server.new(4481)
server.start

The low-level details of this script are out of the scope of this blog post. If you’re curious, then I do recommend the Working With TCP Sockets book. It’s an excellent introduction.

Thankfully, even if you don’t know a bunch about socket programming, it’s pretty simple to read Ruby code. Here’s basically what is happening:

  1. A new server process is created in the initialize method.
  2. When the start method is called, the server waits for a client to try to connect. When that happens, we enter the accept_loop block and do something about it.
  3. In the handle method we read the contents of the request and then forward them on to the process method.
  4. Here, we “shell out” a call to the cowsay program that is on the server, passing it the contents of the request.
  5. Finally, the output of the cowsay program is sent back to the client by the connection.write call in the handle method.
  6. Oh wait, one more step. The program goes back to the accept_loop call in the start method and waits for another request. The server will block until that happens.

Testing

Like I said earlier, a proper client is out of the scope of this iteration, so we will test the script using netcat. Here’s how everything works on my system.

First, let’s start the server:

ruby cowsays_server/server.rb

...which outputs:

Listening on port 4481

Next, let’s connect with our client:

echo "I like coffee" | nc localhost 4481

...which should show you this:

 ________________
< I like coffee  >
 ----------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Hooray! Working code.

So What’s Wrong

Lots, it turns out. Here’s some of the biggies.

EOF’s

If the client only sends part of a message and never signals EOF (which it does by closing its side of the connection), then my server will just block, waiting for it. If another request comes along while it’s blocking, then that request will also wait until the first one is done, which will be never. Typically you don’t want to make it possible for one malformed request to DOS your server :–)

Here’s what I mean. Start your server using the commands above and then try typing this:

(echo -n "Made you break"; cat) | nc localhost 4481

You may notice that nothing will happen. This command sends a string and then, because cat keeps waiting for input from your terminal, holds the connection open, which means no EOF for the server. The read call in the handle method will therefore wait for that EOF forever.

Now type CTRL-z to stop that command and then type the following:

bg
echo "Message 1" | nc localhost 4481

Still nothing happens. Your first command is still being handled by the server, so this second command will just sit patiently in the queue and wait. To prove everything that I’ve said so far, try killing the first blocking command. Press CTRL-z again and then type the following commands:

bg
kill %1

You should see something like the following:

[1]  + 31288 terminated  ( echo -n "Made you break"; cat; ) |
       31289 terminated  nc localhost 4481

$  ____________
< Message 1  >
 ------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

[2]  + 31356 done       echo "Message 1" |
       31357 done       nc localhost 4481

What you just did was kill the first “job”, which was the message that was missing an EOF. Our server is finally free to respond to our second request.
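The blocking behavior can be reproduced in a few lines of Ruby. This is only a sketch: it uses UNIXSocket.pair to get two connected sockets in one process instead of a real TCP client and server, and "EOF" here means the writer closing its side with close_write.

```ruby
require 'socket'

# A connected pair of sockets stands in for the client and server ends.
client, server = UNIXSocket.pair

# Like the server's handle method: read with no length blocks until EOF.
reader = Thread.new { server.read }

client.write('Made you break')
still_blocked = reader.join(0.2).nil?   # join returns nil when it times out

client.close_write                      # half-close: this is the "EOF"
received = reader.value                 # now the read can finish

puts "blocked before close_write: #{still_blocked}"
puts "received: #{received}"
```

The data arrives right away, but read does not return it until the writing side is closed.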

Command Injection Attacks

Here’s another fun way to break your server. Try sending the following command:

echo "--bogus" | nc localhost 4481

Your server should write something like this to your STDOUT:

 nknown option: -
 nknown option: o
 nknown option: u
 nknown option:

Obviously, my code has no idea how to handle command line options that are disguised as a message. Also, now I won’t be able to use the server again until I restart it. Lame.

In a future iteration, I’ll actually need to parse request input and handle error codes and messages sent to STDERR. Backticks just aren’t going to cut it.
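One possible first step toward parsing the input is a guard like the one below. It is only a sketch: the method name check_message and its return values are made up, and rejecting every message that starts with a dash is a blunt rule I am assuming for illustration, not something the server does.

```ruby
# Hypothetical guard (not part of server.rb): reject anything that cowsay
# could mistake for a command-line option before shelling out.
def check_message(raw)
  text = raw.to_s.strip
  return [:error, 'empty message'] if text.empty?
  return [:error, 'message may not start with a dash'] if text.start_with?('-')

  [:ok, text]
end

p check_message("--bogus\n")
p check_message("I like coffee\n")
```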

Performance

Performance isn’t super important for a server like this, but it’s still useful to see how a server like this performs when more than one person is actually trying to use it at the same time. But how do you performance test a server like this?

for num in $(seq 5); do echo "Test #$num" | nc localhost 4481 &; done

This command may be a little scary looking since it’s an inline loop. Here’s how that command is actually expanded by the shell:

echo "Test #1" | nc localhost 4481 &
echo "Test #2" | nc localhost 4481 &
echo "Test #3" | nc localhost 4481 &
echo "Test #4" | nc localhost 4481 &
echo "Test #5" | nc localhost 4481 &

There are two key things to notice about these commands:

  • Each command has its own unique identifier. That will be important eventually.
  • Each command is “backgrounded” by the ampersand (&) sign. This means that the shell will not wait for the command to finish executing before it moves on to the next command. This simple trick allows us to send the five requests to the server in very quick succession, which makes them nearly simultaneous.

So anywho, if you run the inline loop above, you should see 5 cows printed in quick succession. Great! Our server can handle 5 nearly-simultaneous requests.

At this point though, you may be wondering if the requests were handled in order. Let’s filter out everything but the “Test” message with this command:

for num in $(seq 5); do echo "Test #$num" | nc localhost 4481 &; done | grep Test

You should see output that looks something like this:

< Test #1  >
< Test #2  >
< Test #3  >
< Test #4  >
< Test #5  >

Cool. Every command was executed in order. What if I were to double the number of near-simultaneous requests? Since we are running our test with an inline loop, all you have to do is change the “5” to a “10” like this:

for num in $(seq 10); do echo "Test #$num" | nc localhost 4481 &; done | grep Test

...which will output something similar to (but probably different from) this:

< Test #1  >
< Test #2  >
< Test #4  >
< Test #3  >
< Test #5  >
< Test #6  >
< Test #7  >
< Test #10  >
< Test #8  >
< Test #9  >

Interesting. I have to assume that “Test #10” was actually executed after “Test #9”, but apparently it was popped off of the accept queue first.

Of course it’s no fun to stress test something if you can’t find a way to break it. So how many requests does it take? Well, by default Ruby’s listen queue size is 5. This is the queue from which the accept_loop block grabs requests. I would imagine that 6 requests would cause at least one of my requests to fail. However, as we just saw above my server was easily able to handle 10 near-simultaneous requests.

The other possibility is that the accept_loop method actually sets the listen queue size to the SOMAXCONN value, which is 128 on my system. So how would my server handle 129 requests? To find out, simply change the “10” to “129” in the previous command.

On my system, the command executed without any errors. Granted, it took a few minutes to run, and you could definitely see some long pauses. But I guess the lesson learned is that even when we exceed the size of the listen queue, there seems to be enough idiot-proofing built into the Ruby runtime and Linux kernel to still make everything work eventually. Also, the long default TCP timeouts probably help.

I even tried running the loop above with 10,000 requests, but the only error I got was that I filled my shell’s job table. I really did not expect that. It looks like I need to find a better way to stress test this server.
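One option that avoids the shell's job table entirely is to drive the requests from Ruby threads. The sketch below is an assumption-heavy illustration rather than a real benchmark: it talks to a stand-in echo server on a random local port instead of the cowsay server, and the request count of 10 is arbitrary.

```ruby
require 'socket'

# Stand-in for the cowsay server: handles one connection at a time, like
# the accept loop in this post, but echoes instead of calling cowsay.
server = TCPServer.new('127.0.0.1', 0)
port = server.local_address.ip_port
server_thread = Thread.new do
  loop do
    connection = server.accept
    connection.write(connection.read)
    connection.close
  end
end

# One thread per request replaces one backgrounded shell job per request.
request_count = 10
clients = (1..request_count).map do |num|
  Thread.new do
    socket = TCPSocket.new('127.0.0.1', port)
    socket.write("Test ##{num}")
    socket.close_write
    response = socket.read
    socket.close
    response
  end
end

responses = clients.map(&:value)  # waits for every client thread to finish
server_thread.kill
server.close
puts responses
```

To point this at the real server, drop the stand-in and set the port to 4481.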

Conclusion

There’s a lot more that I want to do with this server. Here’s some stuff that I haven’t mentioned yet:

  • Protocol Definition – Eventually, I need to create a client and I should define some type of protocol that it can use to talk to the server.
  • Concurrency – I would like to eventually make this a preforking server.
  • Support For Most Cowsay Features – You should be able to use a different cow.

I hope I was able to help someone else learn a little bit about socket programming. Thanks for reading!