2012-03-13 08:55 am

IPv6 not the peer connectivity panacea that people think

IPv6 is supposed to solve all of the peer connectivity issues introduced by NAT. And, on the surface, it seems to do just that by making it possible to assign a unique, globally routable IP address to every conceivable device that could possibly want one.

But this doesn't really solve the problem of peer connectivity.

My cell phone, for example, may be assigned an address by my carrier. But my carrier may be unwilling to let me have any more addresses. This means that any devices I want to connect to the Internet through my cell phone will not be able to have globally routable addresses because my ISP/cell carrier won't route them. And, of course, under IPv6, nobody is ever supposed to do NAT.

So, peer connectivity is still constrained by network topology. Whoever has the power to decide who gets to be a router also decides what gets to connect. And this is broken.

IMHO, the solution is to have addresses assigned to things that have nothing to do with routing, and allow a routing layer on top of the network layer that can route things to those addresses regardless of the actual topology of the network. Tor is an example of this sort of thing. Tor is basically a routing layer on top of TCP/IP that's designed to obscure which routes any given piece of information takes.

But Tor is a specific example of a larger issue. Ultimate control over routing cannot be left to anybody except the network end-points. Anything else creates failure modes, both physical and political, that leave us significantly short of the best we can do.

Which is one of the biggest advantages to a protocol like CAKE. :-) It divorces routing from addressing and expects end-nodes to have a hand in making routing decisions.

2011-11-08 01:59 pm

Working on a small library, what should I name it?

I'm working on a small library to express computations in terms of composable trees of dependencies. These dependencies can cross thread boundaries allowing one thread to depend on a result generated in another thread. This is sort of a riff on the whole promise and future concept, but the idea is that you have chains of these with a potential fanout in the chain greater than 1. Kind of like the venerable make utility in which you express what things need to be finished before starting on the particular thing you're talking about.
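To make the idea concrete, here is a minimal sketch in Python of what such a dependency tree might look like. It is not the library itself: the `Node` class, its `start` method, and the `build` helper are hypothetical names invented for this illustration, and the sketch leans on the standard `concurrent.futures` thread pool rather than anything purpose-built.

```python
from concurrent.futures import ThreadPoolExecutor


class Node:
    """One computation plus the nodes whose results it needs (hypothetical API)."""

    def __init__(self, func, *deps):
        self.func = func
        self.deps = deps
        self._future = None

    def start(self, pool):
        # Submit each node at most once, so a dependency shared by several
        # dependents (fanout greater than 1) is only computed once.
        if self._future is None:
            # Dependencies are submitted before the node that needs them.
            dep_futures = [dep.start(pool) for dep in self.deps]
            self._future = pool.submit(self._run, dep_futures)
        return self._future

    def _run(self, dep_futures):
        # Runs in a worker thread and waits for results that may have been
        # produced in other threads.
        return self.func(*[f.result() for f in dep_futures])


def build(root, max_workers=4):
    """Evaluate `root` and everything it depends on, like `make` with a target."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return root.start(pool).result()


a = Node(lambda: 2)
b = Node(lambda: 3)
total = Node(lambda x, y: x + y, a, b)
product = Node(lambda x, y: x * y, a, b)
answer = Node(lambda s, p: (s, p), total, product)
result = build(answer)
```

Here `a` and `b` each feed two dependents, and `answer` waits on both of those. A worker that is waiting on a dependency sits blocked in this sketch; a real library would probably want to chain callbacks instead of tying up a thread.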

But I'm not sure what I should call it. Maybe Teleo, because it encourages you to express your program in terms of a teleology.

I'm writing this basically because I've encountered the same problem on at least two different projects now, and it occurs to me that it would be really good to have a well-defined standard way of launching things in other threads and waiting for the results that suggested an overall program architecture. The projects I worked on were all set to develop a huge mishmash of different techniques that wouldn't necessarily play well together or be easy to debug.

2011-06-23 12:05 pm
2011-05-30 05:01 am

Session properties

I've been puzzling over a minimal and orthogonal set of properties for a session. At first I thought there were three:

Message boundaries preserved
Whether or not your messages are delivered in discrete units, or whether they are delivered as a stream of bytes in which the original sizes of the send calls bear no relevance to how the bytes are chunked together on the other end.
Ordered
Whether or not data arrives in the order you sent it.
Reliable
Well, this has a tricky definition. For TCP it means that failure to deliver is considered a failure of the underlying connection. But after such a failure you can't really be sure about exactly which bytes were delivered and which weren't.

But, as is evidenced by my description of 'reliable', these properties are not as hard-edged as they might seem. I also thought about latency. For example, a connection via email is relatively high latency, and a connection between memory and the CPU is generally pretty low latency. But I'm looking for hard-edged, yes/no type properties that are in some sense fundamental. Latency seems like a property that's rather fuzzy. It exists on a continuum, and isn't really a defining feature of a connection, the kind of thing that would drastically alter how you wrote programs that used the connection. In an object model, it would be an object property, not something you'd make a different class for.

But I find TCP's notion of 'reliability' very curious. It isn't really, in any sense, particularly reliable. I've had ssh connections die, and when I reconnected to my screen session, I discovered that a whole bunch of the stuff I had been typing made it through. It just wasn't echoed back.

It also interacts with 'ordered' in an odd way. It might make sense to have an unordered connection that was 'reliable', but what does that really mean then? If it's a TCP notion of reliability, you could just deliver the last message and have the connection drop. Also, what would it mean to have an unreliable, but ordered connection? Would that mean you could send a bunch of messages and have only the first and last ones delivered? And would it make any sense at all to have an unordered, unreliable connection in which message boundaries were not preserved?

So I've come up with a different division...

Message boundaries preserved
Whether or not your messages are delivered in discrete units, or whether they are delivered as a stream of bytes in which the original sizes of the send calls bear no relevance to how the bytes are chunked together on the other end.
Ordered
Whether or not data arrives in the order you sent it.
Must not drop
This means that if a message does not make it through, the connection is considered to be in an unrecoverable error state, and no further messages may be sent. Though you may not know which message didn't make it through.
Delivery notification
Whether or not you can know that a message made it to the other side.

These are not fully orthogonal. For example, if message boundaries are not preserved, then, for a connection to be at all sensible, it must also have the 'ordered' and 'must not drop' properties. Also, if you must not drop messages, I'm not sure that it would then be sensible to have out-of-order delivery.

One of the rules of the system I'm designing is that any property that is not required may be provided anyway. This makes non-orthogonality much easier to deal with. So the prior cases aren't really a problem.
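As a rough sketch of how that rule might look in code, the four properties can be modelled as booleans, with a check that a session provides at least what was asked for. Everything here (`SessionProperties`, `satisfies`, `is_sensible`, and the example sessions) is a hypothetical name made up for illustration, not part of any existing design.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SessionProperties:
    """The four yes/no session properties (hypothetical representation)."""

    boundaries_preserved: bool = False
    ordered: bool = False
    must_not_drop: bool = False
    delivery_notification: bool = False

    def satisfies(self, required):
        """True if every property `required` asks for is provided here.

        Extra properties that were not asked for are allowed.
        """
        return all(
            getattr(self, f.name) or not getattr(required, f.name)
            for f in fields(self)
        )

    def is_sensible(self):
        # A byte stream without message boundaries only makes sense if it is
        # also ordered and must not drop.
        return self.boundaries_preserved or (self.ordered and self.must_not_drop)


# Roughly TCP-shaped: an ordered byte stream that fails rather than drops.
stream_like = SessionProperties(ordered=True, must_not_drop=True)
# Roughly datagram-shaped: discrete messages and nothing else promised.
datagram_like = SessionProperties(boundaries_preserved=True)

wants_order_only = SessionProperties(ordered=True)
ok_for_order = stream_like.satisfies(wants_order_only)
ok_for_datagrams = stream_like.satisfies(datagram_like)
```

Here `ok_for_order` is true, because the stream provides ordering plus an extra property nobody asked for, and `ok_for_datagrams` is false, because the stream does not preserve message boundaries.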

Can any of you think of a better set of properties, or important properties that I left out?

Some good discussion also happens in this Google Buzz post that mirrors this entry.

2011-03-28 08:25 am

CAKE has reached a small milestone

CAKE reached a new milestone early this morning. It now successfully both generates and parses messages that use the new protocol. It also successfully detected a re-used session id. I think the code that does this is also a lot better designed than the old code was. It's easier to see how to put it in the context of a larger system that implements a node that speaks the protocol.

It's also much more extensively tested at a deeper level with tests that are designed to document the inner workings of the system.

Overall, it's in much better shape than when I more or less stopped working on it in 2004. I'm going to handle the hard problems first: how to maintain the relationship between sessions and transports, and how to have two-way realtime conversations between nodes. I'd rather do that than concentrate on the messages that will be traded back and forth at a higher level (which will be done using protobuf). That can come later, especially since I'm not likely to get it right the first time anyway.

I also need to think about getting nodes to participate in a DHT to share assertions (like how to reach a particular node) in a distributed way.

Lastly, the protocol has something of a problem with 'liveness' because I designed it with the idea that conversations could be initiated without any round trips. There is some mitigation for this problem in session ids, but that mitigation is somewhat problematic because it requires the recipient of a conversation initiation to keep track of some state for everybody who tries to talk to it.

I'm not really sure how to handle the 'liveness' problem, though, and still preserve the lack-of-round-trips property. I could require that session ids contain an 'hour number' or something similar. That does introduce a requirement for at least very coarse-grained time synchronization across all nodes.
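A minimal sketch of the 'hour number' idea, under assumptions of my own: a session id is taken to be an (hour number, nonce) pair, and the recipient tolerates one hour of clock skew. The `SessionIdTracker` class and its parameters are hypothetical and are not how CAKE actually encodes session ids.

```python
import time

SECONDS_PER_HOUR = 3600


class SessionIdTracker:
    """Rejects stale or re-used session ids (hypothetical sketch)."""

    def __init__(self, max_skew_hours=1):
        self.max_skew_hours = max_skew_hours
        self._seen = {}  # hour number -> set of nonces seen for that hour

    def accept(self, hour, nonce, now=None):
        """Return True if this (hour, nonce) pair is fresh and unseen."""
        seconds = time.time() if now is None else now
        current = int(seconds // SECONDS_PER_HOUR)
        # Forget hours that are too old to be accepted again anyway. This is
        # what keeps the remembered state from growing without bound.
        for old in [h for h in self._seen if current - h > self.max_skew_hours]:
            del self._seen[old]
        # Only coarse clock agreement is needed: within max_skew_hours.
        if abs(current - hour) > self.max_skew_hours:
            return False
        nonces = self._seen.setdefault(hour, set())
        if nonce in nonces:
            return False  # re-used session id
        nonces.add(nonce)
        return True


tracker = SessionIdTracker()
ten_oclock = 10 * SECONDS_PER_HOUR
first = tracker.accept(10, b"nonce-a", now=ten_oclock)
replayed = tracker.accept(10, b"nonce-a", now=ten_oclock)
stale = tracker.accept(7, b"nonce-b", now=ten_oclock)
```

The recipient still has to remember something for everyone who talks to it, but only for the last hour or two, and no round trip is needed to reject a replay of an old initiation.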

2011-02-02 02:36 pm

Interesting design problem with serialization and deserialization

I have been working on a serialization framework for Python that I'm happy with. I want to be able to describe CAKE protocol messages clearly and succinctly. This will make it easier to tweak the messages without having to rip apart difficult-to-understand code. It will also make it easier to understand if I drop the project again and then come back to it years later, or if (by some miracle) someone else decides to help me with it.

(This is a very long post.)

2010-12-04 03:26 pm

Protocol buffers?

I have a problem for which protocol buffers seem like a good solution, but I'm reluctant to use them. First, protocol buffers include facilities for handling the addition of new fields in the future. This adds a small amount to a typical protocol buffer message, but it's a facility I do not need.

Also, I feel the variable sized number encoding is less efficient than it could be, though this is a very minor issue. I also feel like I have a number of special purpose data types that are not adequately represented.
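For reference, the variable sized encoding in question is the base-128 'varint': seven bits of payload per byte, least significant group first, with the high bit of each byte set when more bytes follow. The sketch below covers unsigned integers only, and `encode_varint` and `decode_varint` are names chosen for this illustration rather than functions from the protobuf library.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)  # high bit clear: this is the last byte
            return bytes(out)


def decode_varint(data):
    """Decode one varint from the front of `data`.

    Returns (value, number of bytes consumed).
    """
    result = 0
    shift = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, index + 1
        shift += 7
    raise ValueError("truncated varint")


encoded = encode_varint(300)  # b'\xac\x02'
decoded, consumed = decode_varint(encoded)
```

One bit in every byte goes to the continuation flag, so any value from 128 through 255 already takes two bytes.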

I'm also not completely pleased with the C++ and/or Python APIs. I think they contain too many googlisms. I would like to see public APIs published that were free of adherence to Google coding standards like do-nothing constructors and no exceptions.

I think, maybe, I will be using protocol buffers for some messages that are sent by applications using CAKE as a transport/session layer. These include some of the sub-protocols that are required to be implemented by a conforming CAKE implementation.

On a different note, I think Google's C++ coding standards are lowering the overall quality of Open Source C++ code. This isn't a huge effect, but it's there.

It happens because Google's good name is associated with a set of published standards for C++ coding that include advice that, while possibly good for Google internally, is of dubious quality as general purpose advice. It also happens because when Google releases code for their internal tools to the Open Source community, these tools follow Google's standards. And some of these standards have the effect of making it hard to use code that doesn't comply with those standards in conjunction with code that does.