How to do Interprocess Communication in k (k3 that is)
Tuesday, October 15th, 2002
Start Listening
To start a k server listening on a port, start a k session and invoke
\m i 2001
which will set it to listen on port 2001.
Stop Listening
To stop listening, invoke
\m i 0
Amend
As Stevan Apter has mentioned on the k listbox, the key to k ipc is to see that the default message processing operation is the dot operator
.
So assume we have a vector a which contains !10, e.g.
a:!10
Using the familiar assignment operator : we can assign the value of a to another variable v in the same instance of k as follows
v:a
We can also achieve that using the dot operator as the amend verb (see Amend, K Reference Manual)
.[`v;();:;a]
or
.(`v;();:;a)
And so to set the value v on a remote server, using the local value of a, we can write
h:3:`"localhost",2001 / Opens a connection to localhost, port 2001
h 3:(`v;();:;a) / Assign remote v the value of local a
3:h / Close the connection
Note that we used the 3: operator to send the data. This is an asynchronous send mode, and will not block the local instance of k. We do not expect a result from this execution, so there is no need to wait for the remote invocation to complete.
Note that this can also be achieved with
h 3:"v:",5:a
i.e. send the request over as text, but this can be horribly inefficient.
Suppose we only want to set an element of v to be a, e.g.
v[i]:a
This can be achieved via
.[`v;i;:;a]
or
.(`v;i;:;a)
And to do this on a remote server, using the local values i and a, assuming a valid connection handle h
h 3:(`v;i;:;a)
Suppose we would like to do
v +:a
which can be achieved via
.[`v;();+;a]
or
.(`v;();+;a)
And to do this on a remote server, using the local value of a, assuming a valid connection handle h
h 3:(`v;();+;a)
+ in this case can be replaced by any primitive, e.g. -, *, % etc.
And to conclude the dot section, consider
v[i]+:a
which can be achieved via
.[`v;i;+;a]
or
.(`v;i;+;a)
And to do this on a remote server, using the local value of i and a, assuming a valid connection handle h
h 3:(`v;i;+;a)
This style of call, using the amend verb, will only work with k servers. Unfortunately, k and kdb have different message handlers (.m.g), because k and ksql are different languages. This means that we cannot get at the amend verb when talking to a kdb server. We can, however, issue ksql, e.g.
r:h 4:(":";(`v;a))
which will set the value v remotely to the local value of a. More complex ksql IPC can be thought up, e.g.
tablename:`trade
r:h 4:(`.d.r;(". select count $ from ?";,`tablename))
What about function calls? Consider the local invocation of a function f, with the parameters a, b and c, returning a value and assigning that to the variable r, i.e.
r:`f[a;b;c]
This can be written as
r:.(`f;(a;b;c))
And to execute the remote function f, using local parameters a,b and c, assuming a valid connection handle h
r:h 4:(`f;(a;b;c))
Function f must exist on the remote instance.
Notice that in this case, as we are expecting a result to be returned from the function, we use the synchronous send mode, 4:. This will cause the local instance of k to block until the remote instance has completed the invocation of the function and returned the result.
What type of server are you?
If you are unsure what kind of server you are connected to, you can issue a command that will work on both systems, and return different answers depending on whether it is k or kdb. e.g.
h 4:"1%1"
returns a float if the remote server is k, and returns an int if the remote server is kdb. This is because % means divide in k, and means mod in ksql. It is worth noting that the k ticker plants have the features of kdb but actually have a k message processor (instead of ksql) on the IPC.
Error Trapping
Error trapping in k involves wrapping up the k to be executed using the Apply verb. The result is a 2 item list, the first item being 0 (success) or 1 (failure). The second item contains either the expected result, as per normal execution outside of the error trap, or the failure message as a char vector. For further info on k error trapping please see the K Reference Manual, Apply verb.
IPC requests can be error trapped as follows
Attempt to open a connection
r:@[3::;(`localhost;2001);:]
Attempt to get a table list
r:.[4:;(handle;"tables");:]
Attempt to get the rowcount of tablename
r:.[4:;(h;(`.d.r;,("select count $ from ?";,tablename)));:]
Authentication and Access Control
In k there is an Authorization Vector, .m.u that contains the names of users that are permitted to connect to the process in which .m.u is defined.
e.g. on the server, define
.m.u:,`charlie
which means that only processes identifying themselves as charlie will be allowed to connect. Then on the client (assuming that you are not actually running as user charlie!), try to connect and get a table listing
h:3:`,2001
h 4:"tables"
index error
h 4:"tables"
^
What has actually happened here is that when the client connected, it implicitly sent the username that the client process was running under to the server. The server checked whether this name was in .m.u, and as it was not, it disconnected the client immediately (you can see this through detecting the closing of connections, below). However, the disconnect was not detected until we tried to use the handle again by trying to get the list of tables.
In kdb there is an additional level of access control. It is configured through 2 tables
user:([user]password)
access:([access,var,user])
e.g. start with an empty table
user:([user:()]password:())
then
'user'insert('admin','pw310')
If there is no user table everyone is allowed. If there is no access table everything is allowed.
Clients can authenticate through
http://user:password@host:port/?query
KDBC: h:3:`host,port; h 4:"user:password"
ODBC: DBQ=//host:port;UID=…;PWD=…
JDBC: Properties p=new Properties();p.put("user",…);p.put("password",…);
Connection c=DriverManager.getConnection("jdbc:kx://host:port",p);
If a server is expecting a client to authenticate, then the client should send the following as the first kdb command
username:password
as a char vector. e.g.
h:3:`"localhost",2001
h 4:"admin:pw310"
If the user is not permissioned, a `user error will be thrown.
If this does not give sufficient access control, one can always insert access control in the message filters on the server. See Tracing IPC and Message Filters, below.
Dropped/Closed Connection Detection
When a connection is dropped or gracefully closed by the remote party, the character string .m.c is automatically executed. The handle associated with the dropped connection is available as _w. e.g. define your dropped connection handler as
.m.c:"`0:\"Closed \",($_w),\"\\n\""
and then setup a connection from a client, and close the connection from the client.
What is the context?
When a remote user is executing code on your server, the username used to establish the IPC connection is available as a symbol in _u. e.g.
_u
`cskelton
The IP address of the remote side of the connection is also available. It is held in _a and can be formatted as
`$1_,/".",'$256 _vs _a
`"192.168.1.5"
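For readers more familiar with Java, the formatting step above can be sketched as follows. This is only an illustration, assuming _a holds the address as a single 32-bit integer with the most significant byte first (which is what the base-256 split by 256 _vs suggests). The class and method names are made up for this example.

```java
public class DottedQuad {
    // Render a 32-bit address as a dotted quad, most significant byte first.
    // Assumption: this mirrors splitting the integer into base-256 digits
    // and joining the digits with dots, as the k expression does.
    static String format(int address) {
        StringBuilder sb = new StringBuilder();
        for (int shift = 24; shift >= 0; shift -= 8) {
            if (sb.length() > 0) {
                sb.append('.');
            }
            sb.append((address >>> shift) & 0xFF); // unsigned value of one byte
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int address = (192 << 24) | (168 << 16) | (1 << 8) | 5;
        System.out.println(format(address)); // prints 192.168.1.5
    }
}
```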
And the handle of the connection is also available, in _w, e.g on the server one might see this
\m i 2001
fn:{`0:$_w}
1840
when the client runs this
h:3:`,2001
h 4:(`fn;)
Tracing IPC and Message Filters
One can trace incoming IPC messages in kdb by setting
.d.DF:1
IPC can also be traced by intercepting the messages by overriding the message filters, .m.g and .m.s.
.m.g is invoked when a 4: request is received. .m.s is invoked when a 3: request is received. If you override .m.g as
oldmg:.m.g
.m.g:{[x] `0:"Received:",(5:x),"\n";oldmg[x]}
then you should see all incoming 4: requests being printed in the console. One advantage of this is that you will see all incoming messages regardless of whether they are about to cause an error or not. This can be very handy when debugging rogue clients.
IPC Message Structure
The IPC message structure is simply the k data types in a serialised form, with a short header describing the message type and endian system used for encoding.
The message header format is as follows
byte offset   contents
0             endianness
1             0
2             0
3             message type
4-7           message length
Endianness can be either 0 (big endian) or 1 (little endian).
Message type can be either 0 (async), 1 (sync) or 2 (response). In k, these message types arise as follows
handle 3:"a:!10" / set a to !10 on the remote server, using async mode, 3:
In that case, the outgoing msg has msg type 0 (async)
handle 4:"!10" / evaluate !10 on the remote server, send the results back, using sync mode, 4:
In that case, the outgoing message has msg type 1 (sync), and the resulting incoming message has msg type 2 (response)
In Java the IPC message can be read as follows
public synchronized Object k() throws IOException, KServerException
{
// Assumption: buffer, architecture, msgOffset and socket are fields of the enclosing class;
// readInteger() and readMessage() are its helpers (not shown) that decode from buffer at msgOffset
buffer = new byte[8];
DataInputStream dis = new DataInputStream(socket.getInputStream());
dis.readFully(buffer); // read the 8 byte header from the stream
architecture = buffer[0] == 1; // little endian if buffer[0] is 1. Big endian if buffer[0] is 0
msgOffset = 4; // skip the next 3 bytes - don't worry about message type
int msgLength = readInteger(); // Now read integer contained in buffer[4..7]
buffer = new byte[msgLength]; // Allocate array to hold complete message payload
dis.readFully(buffer); // Read msgLength bytes from input stream
msgOffset = 0;
return readMessage(); // And deserialise the message payload
}
So the complete message format is header+serialised data.
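As a rough illustration of the 8 byte header layout described above, the sketch below packs and unpacks a header with java.nio.ByteBuffer. It is not taken from any k client library. The class name, method names and constants are invented for the example, and it assumes the length field is written in the byte order announced by byte 0.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class KHeader {
    // Message types as listed in the text
    static final int ASYNC = 0, SYNC = 1, RESPONSE = 2;

    // Build the 8 byte header: endianness, 0, 0, message type, 4 byte length
    static byte[] encode(boolean littleEndian, int msgType, int length) {
        ByteBuffer bb = ByteBuffer.allocate(8);
        bb.order(littleEndian ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN);
        bb.put((byte) (littleEndian ? 1 : 0)); // byte 0: endianness flag
        bb.put((byte) 0);                      // byte 1: always 0
        bb.put((byte) 0);                      // byte 2: always 0
        bb.put((byte) msgType);                // byte 3: message type
        bb.putInt(length);                     // bytes 4-7: message length
        return bb.array();
    }

    static int messageType(byte[] header) {
        return header[3];
    }

    // Assumption: the length uses the byte order named in byte 0
    static int messageLength(byte[] header) {
        ByteOrder order = header[0] == 1 ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
        return ByteBuffer.wrap(header, 4, 4).order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] header = encode(true, SYNC, 24);
        System.out.println(Arrays.toString(header)); // prints [1, 0, 0, 1, 24, 0, 0, 0]
        System.out.println(messageType(header));     // prints 1
        System.out.println(messageLength(header));   // prints 24
    }
}
```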
The serialised data is simply encoded according to whether it is atomic or a vector. Atoms are encoded as type followed by data, e.g. for an integer of value 12005, it is encoded as int[]{1,12005}. It is possible to see what the type number means by using the 4: operator in k. e.g. 4:100 results in type 1, an integer. 4:100.0 results in type 2, a float.
An int is encoded as 4 bytes, so the encoding of this integer actually takes up 8 bytes - 4 for the type, and 4 for the actual int value.
The data is 8 byte aligned, which is fine for the encoding of an int - it takes up 8 bytes so it is aligned by default.
However, an encoding of a k float of value 12005.0 results in int[]{2,0} followed by 8 bytes representing the double 12005. The encoding used for k floats is the IEEE Standard 754 Floating-Point. This is 8 byte aligned so no further padding is required.
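The int and float atom layouts just described can be sketched in Java as below. This is a hedged illustration only: the class and method names are invented, and it assumes big endian byte order (endianness 0 in the header), which is ByteBuffer's default.

```java
import java.nio.ByteBuffer;

public class KAtoms {
    // Int atom: type 1, then the 4 byte value, 8 bytes in total
    static byte[] encodeInt(int value) {
        return ByteBuffer.allocate(8).putInt(1).putInt(value).array();
    }

    // Float atom: type 2, a zero int to reach 8 byte alignment,
    // then the 8 byte IEEE 754 double, 16 bytes in total
    static byte[] encodeFloat(double value) {
        return ByteBuffer.allocate(16).putInt(2).putInt(0).putDouble(value).array();
    }

    public static void main(String[] args) {
        System.out.println(encodeInt(12005).length);     // prints 8
        System.out.println(encodeFloat(12005.0).length); // prints 16
    }
}
```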
The encoding of a k char is an integer representing the type, in this case 3 (try 4:"a" at the k prompt), followed by a single byte representing the char. The charset encoding used is plain ASCII. This is 8 byte aligned so no further padding is required.
The encoding of a k symbol is an integer representing the type, in this case 4 (try 4:`"hello" at the k prompt), followed by a number of bytes, each representing one char from the symbol. It is null-terminated, i.e. the end of the symbol is signalled by a value of 0. The charset encoding used is plain ASCII. As this data is not necessarily 8 byte aligned (the symbol can be of any length), we must pad the remaining space to the next 8 byte border to regain alignment.
Encoding of vectors is much the same, except the length of the vector is inserted between the type and the data. e.g.
a vector of the symbols `"MSFT" `"IBM" would be encoded as
-4,2,77,83,70,84,0,73,66,77,0,0
Note the extra 0 on the end to realign to an 8 byte boundary.
There is a bit of handshaking that goes on when a connection is established. It is simply that the side which starts the connection sends an 8 byte message which is the username to be associated with the connection. In the server, the username for the context can be seen via _u, and restricted by the .m.u vector.
