emacs-jupyter/README.org

An interface to communicate with Jupyter kernels in Emacs.

Installation

If you would like to try this package out, in your Emacs configuration add

(add-to-list 'load-path "<path>")
(require 'jupyter)

where <path> is the path to the directory containing this README.org file.

Dependencies

Jupyter REPL

To start a new kernel on the localhost and connect a REPL client to it, run the command jupyter-run-repl. Alternatively you can connect to an existing kernel by supplying the kernel's connection file to jupyter-connect-repl.

The REPL supports most of the rich output that a kernel may send to a client. If the kernel sends image data, the image will be displayed in the REPL buffer. If LaTeX is sent, it will be compiled (using org-mode) and displayed. The currently available mimetypes and their dependencies are:

| Mimetype      | Dependency                |
|---------------+---------------------------|
| text/html     | Emacs built with libxml2  |
| text/markdown | markdown-mode             |
| text/latex    | org-mode                  |
| image/png     | none                      |
| image/svg+xml | Emacs built with librsvg2 |
| text/plain    | none                      |

Inspection

To send an inspect request to the kernel, press C-c C-f when the cursor is at the location of the code you would like to inspect.

Completion

Completion currently requires company-mode, since this is the completion framework that I use. Pull requests for support of other completion frameworks are welcome.

REPL history

When a new REPL connects to a kernel it sends a request to get the last jupyter-repl-history-maximum-length REPL inputs. You can navigate through this history using C-n and C-p. You can also search through this history using C-s to search forward through the history or C-r to search backward in history.
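As a configuration sketch, the amount of history requested can be changed before a REPL is started. This assumes that jupyter-repl-history-maximum-length holds an integer count of inputs; the value 200 is only an illustration.

```elisp
;; Assumption: the variable holds an integer number of REPL inputs.
;; Set it before a REPL connects, since the history request is sent
;; when the REPL connects to the kernel.
(setq jupyter-repl-history-maximum-length 200)
```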

Associating other buffers with a REPL

After starting a REPL, it is possible to associate the REPL with other buffers if they pass certain criteria. Currently, the buffer must have the major-mode that corresponds to the REPL's kernel language. To associate a buffer with a REPL you can run the command jupyter-repl-associate-buffer.

jupyter-repl-associate-buffer will ask you for the REPL you would like to associate with the current-buffer and enable the minor mode jupyter-repl-interaction-mode. This minor mode populates the following keybindings for interacting with the REPL:

| Key binding | Command                          |
|-------------+----------------------------------|
| C-c C-c     | jupyter-repl-eval-line-or-region |
| C-c C-f     | jupyter-inspect-at-point         |
| C-c C-z     | jupyter-repl-pop-to-buffer       |
| C-c C-i     | jupyter-repl-interrupt-kernel    |
| C-c C-r     | jupyter-repl-restart-kernel      |

Integration with org-mode

For users of org-mode, integration with org-babel is provided through the ob-jupyter library. To enable Jupyter support for source code blocks add jupyter to org-babel-load-languages. After ob-jupyter has been loaded, new source code blocks with names of the form jupyter-LANG will be available. LANG can be any one of the kernel languages that were found on your system by jupyter-available-kernelspecs.
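A minimal configuration sketch for this step, assuming the package directory is already on load-path as described under Installation. It uses Org's own org-babel-do-load-languages; the emacs-lisp entry is only there to show that existing entries are kept in the same list.

```elisp
(require 'org)

;; Enable Jupyter source blocks alongside Emacs Lisp ones.
;; `org-babel-do-load-languages' sets `org-babel-load-languages' and
;; loads the matching ob-* libraries, here ob-jupyter.
(org-babel-do-load-languages
 'org-babel-load-languages
 '((emacs-lisp . t)
   (jupyter . t)))
```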

Every Jupyter source code block requires the :session parameter, since all interaction with a Jupyter kernel goes through a REPL connected to the kernel. For example, to interact with a Python kernel, create a source block like this:

#+BEGIN_SRC jupyter-python :session py
x = 'foo'
y = 'bar'
x + ' ' + y
#+END_SRC

By default, source blocks are executed synchronously. To execute a source block asynchronously set the :async parameter to yes.

#+BEGIN_SRC jupyter-python :session py :async yes
x = 'foo'
y = 'bar'
x + ' ' + y
#+END_SRC

Since a particular language may have multiple kernels available, the default kernel used for a language is the first kernelspec found by jupyter-available-kernelspecs that has the corresponding language. To change the kernel, set the :kernel parameter:

#+BEGIN_SRC jupyter-python :session py :async yes :kernel python2
x = 'foo'
y = 'bar'
x + ' ' + y
#+END_SRC

Any of the defaults for a language can be changed by setting org-babel-default-header-args:jupyter-LANG to an appropriate value. For example, to change the default header arguments of the julia kernel, set org-babel-default-header-args:jupyter-julia to something like:

(setq org-babel-default-header-args:jupyter-julia '((:async . "yes")
                                                    (:session . "jl")
                                                    (:kernel . "julia-0.6")))

Rich kernel output

All of the mimetypes available when using the REPL are also available using ob-jupyter. If image data is received from the kernel, the image is saved to a file and an image link becomes the result of the source block. For text/latex, text/markdown, text/org, and text/html, the results are wrapped in a source block with the appropriate language. For text/plain, the results are inserted as scalar data.

For images sent by the kernel, if no :file parameter is provided to the source block, a file name is automatically generated and the image data written to file in org-babel-jupyter-resource-directory. Otherwise, if a :file parameter is given, the image data is written to the file specified.
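As a configuration sketch, the directory used for automatically named image files can be pointed elsewhere. The location below is hypothetical, and the trailing slash is an assumption about how the variable is expected to be written.

```elisp
;; Hypothetical location under the user's Emacs directory; any
;; writable directory should do.
(setq org-babel-jupyter-resource-directory
      (expand-file-name "jupyter-images/" user-emacs-directory))
```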

Editing the contents of a code block

When editing the code of a Jupyter source block, i.e. by pressing C-c ' when at a code block, jupyter-repl-interaction-mode is automatically enabled in the edit buffer and the buffer will be associated with the REPL session of the code block (see jupyter-repl-associate-buffer).

You may also bind the command org-babel-jupyter-scratch-buffer to an appropriate key in org-mode to display a scratch buffer in the code block's major-mode and connected to the code block's session.
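One way to set up such a binding is sketched here. The key C-c j is an arbitrary choice from the range reserved for users, not something the package prescribes.

```elisp
(with-eval-after-load 'org
  ;; "C-c j" is an arbitrary key chosen for illustration.
  (define-key org-mode-map (kbd "C-c j")
    #'org-babel-jupyter-scratch-buffer))
```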

Connecting to an existing kernel

You may also connect to an existing kernel by passing the kernel's connection file as the value of the :session parameter. In this case, a new REPL connected to the kernel will be created. The file must have a .json suffix for this to work.

If the file name supplied is a remote file name, i.e. has a prefix like /host:, the kernel's ports are assumed to live on host. Before attempting to connect to the kernel, the necessary ssh tunnels for the connection are created. So if you had a remote kernel on a host named ec2 whose connection file is /run/user/1000/jupyter/kernel-julia-0.6.json on that host, you would specify the :session as

#+BEGIN_SRC jupyter-julia :session /ec2:/run/user/1000/jupyter/kernel-julia-0.6.json
...
#+END_SRC

Currently there is no password handling, so if your ssh connection requires a password, I suggest you use key-based authentication instead. If you are connecting to a server using a pem file, add something like

Host ec2
    User <user>
    HostName <host>
    IdentityFile <identity>.pem

to your ~/.ssh/config file.

API

Method/message naming conventions

The message type strings as defined in the Jupyter spec become message type symbols, more specifically properties, with underscores replaced by hyphens. So an "execute_request" becomes an :execute-request.

Methods that send messages to a kernel are named jupyter-send-<msg-type> where <msg-type> is an appropriate message type. So to send an :execute-request you would call jupyter-send-execute-request. Similarly, methods that receive messages from a kernel are named jupyter-handle-<msg-type>.

The exception to the above rule is the :input-reply message. Although it sends a message to the kernel it has a handler method, jupyter-handle-input-reply, instead of a send method.
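The naming rule can be illustrated with two small helpers. Both functions are hypothetical and written only for this example; they are not part of the package and only mirror the convention described above.

```elisp
(defun my/jupyter-message-type-keyword (type)
  "Convert a Jupyter spec message TYPE string into a keyword.
For example, \"execute_request\" becomes `:execute-request'."
  (intern (concat ":" (replace-regexp-in-string "_" "-" type))))

(defun my/jupyter-method-name (prefix type)
  "Return the method symbol for message TYPE.
PREFIX is either \"send\" or \"handle\"."
  (intern (format "jupyter-%s-%s"
                  prefix
                  ;; Drop the leading colon of the keyword's name.
                  (substring (symbol-name
                              (my/jupyter-message-type-keyword type))
                             1))))

(my/jupyter-message-type-keyword "execute_request") ; => :execute-request
(my/jupyter-method-name "send" "execute_request")   ; => jupyter-send-execute-request
(my/jupyter-method-name "handle" "execute_result")  ; => jupyter-handle-execute-result
```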

jupyter-kernel-client

Represents a client connected to a Jupyter kernel.

Initializing a connection

jupyter-initialize-connection takes a client and a connection file as arguments and configures the client to communicate with the kernel whose connection information is contained in the connection file. After initializing a connection, to begin communicating with a kernel you will need to call jupyter-start-channels.

(let ((client (jupyter-kernel-client)))
  (jupyter-initialize-connection client "kernel1234.json")
  (jupyter-start-channels client))

jupyter-initialize-connection is mainly useful when initializing a remote connection. The normal pathway to obtain a client on the localhost is to use jupyter-start-new-kernel like so

(cl-destructuring-bind (manager client info)
    (jupyter-start-new-kernel "python")
  BODY)

where manager will be a jupyter-kernel-manager which can be used to manage the lifetime of the local kernel process, client will be a newly connected jupyter-kernel-client connected to manager's kernel, and info will be the kernel info obtained from the initial :kernel-info-request to the kernel. If multiple clients connected to the kernel of manager are required, use jupyter-make-client. After the call to jupyter-start-new-kernel, client's channels will already be open.

How messages are sent to and received from the kernel

To free up Emacs from having to process messages sent to and received from a kernel, an Emacs subprocess is created for every client. This subprocess is responsible for polling the client's channels for messages and taking care of message signing, encoding, and decoding. The parent Emacs process is only responsible for supplying the message property lists (the representation used for Jupyter messages in Emacs) when sending a message and will receive the decoded message property list when receiving a message. The exception to this is the heartbeat channel which is implemented using timers in the parent Emacs process.

Also see Making requests to a kernel.

Starting/stopping channels

To start a client's channels, use jupyter-start-channels; to stop a client's channels, jupyter-stop-channels; and to determine if at least one channel is alive, jupyter-channels-running-p.

You may access each individual channel by accessing the corresponding slot of a client. So to get the shell channel of a client you would do

(oref client shell-channel)

This gives you the jupyter-channel object of the shell channel. Individual channels may be started or stopped through these channel slots.

Making requests to a kernel

Sending and receiving messages is centered around the jupyter-kernel-client class. Each message sent or received has a corresponding method in jupyter-kernel-client. As stated previously, request messages have method names like jupyter-send-<msg-type> where <msg-type> is the request message type. So an :execute-request message has the corresponding method jupyter-send-execute-request.

(jupyter-send-execute-request client :code "1 + 2") ; Returns a `jupyter-request'

All requests sent to a kernel return a jupyter-request which encapsulates the current state of the request with the kernel and how the jupyter-kernel-client should handle messages received from the kernel in response to the request.

Handling received messages

The handler methods of a jupyter-kernel-client are intended to be overridden by subclasses that would like to execute arbitrary code in response to a received message. They have the following method signature

(cl-defmethod jupyter-handle-<msg-type> ((client jupyter-kernel-client) req arg1 arg2 ...)
  BODY)

where <msg-type> is the type of the message, e.g. the :execute-result handler has the method name jupyter-handle-execute-result. req will be the jupyter-request object that generated the message. arg1, arg2, … will be the unwrapped message contents passed to the handler; the number of arguments and their order are dependent on <msg-type>.

Whenever a message is received on a client, the corresponding handler method is called. The default implementations of the handler methods in jupyter-kernel-client do nothing with the exception of the :input-reply handler which gets input from the user and sends it to the kernel. See Evaluating code when a message is received for an alternative way of handling received messages.
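A minimal sketch of overriding a handler in a subclass follows. The class my-logging-client and the function my-logging-client-connect are hypothetical names. Because the handler arguments after req depend on the message type, they are collected with &rest here instead of being named, and the connection step simply reuses the jupyter-initialize-connection pattern shown earlier.

```elisp
(require 'eieio)
(require 'cl-lib)
(require 'jupyter)

;; Hypothetical subclass; the name is illustrative only.
(defclass my-logging-client (jupyter-kernel-client)
  ()
  :documentation "A kernel client that logs each :execute-result.")

;; REQ is the `jupyter-request' that generated the message.  The
;; unwrapped message contents are gathered in ARGS because their
;; number and order depend on the message type.
(cl-defmethod jupyter-handle-execute-result
  ((_client my-logging-client) _req &rest args)
  (message "execute-result contents: %S" args))

(defun my-logging-client-connect (connection-file)
  "Return a `my-logging-client' connected via CONNECTION-FILE."
  (let ((client (my-logging-client)))
    (jupyter-initialize-connection client connection-file)
    (jupyter-start-channels client)
    client))
```

Handlers for message types that the subclass does not override keep the default behavior described above.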

Client local variables

Some variables which are used internally by jupyter-kernel-client have client local values. For example the variable jupyter-include-other-output tells a jupyter-kernel-client to handle IOPub messages originating from a different client and defaults to nil, i.e. do not handle IOPub messages from other clients. To modify a client local variable you would use jupyter-set

(jupyter-set client 'jupyter-include-other-output t)

Internally, this just sets the buffer local value of jupyter-include-other-output in a private buffer used by the client. To retrieve the client local value use jupyter-get

(jupyter-get client 'jupyter-include-other-output)

You may also work with these buffer local variables directly by using the jupyter-with-client-buffer macro. Just be sure to use setq-local when setting a client local variable to a new value.

(jupyter-with-client-buffer client
  (message "jupyter-include-other-output: %s" jupyter-include-other-output)
  (setq-local jupyter-include-other-output (not jupyter-include-other-output)))

jupyter-kernel-manager

Manage the lifetime of a kernel on the localhost.

Kernelspecs

To get a list of kernelspecs on your system, as represented in Emacs, use jupyter-available-kernelspecs which processes the output of the shell command

jupyter kernelspec list

to construct the list of kernelspecs. To find kernelspecs that match a prefix of a kernel name, use jupyter-find-kernelspecs. jupyter-find-kernelspecs will return the subset of the available kernelspecs which have kernel names that begin with the prefix. Most likely you know the exact name of the kernel you want to use. In this case, use jupyter-get-kernelspec.

You may also use jupyter-completing-read-kernelspec in an interactive spec to ask the user to select a kernel. This is what is done in jupyter-run-repl.
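A short sketch of a prefix lookup, assuming that jupyter-find-kernelspecs takes the prefix as a single string and returns a list, as the description above suggests. The internal representation of a kernelspec is not relied upon.

```elisp
(require 'jupyter)

;; Assumption: one string argument (the name prefix), list result.
(let ((matches (jupyter-find-kernelspecs "python")))
  (message "Found %d kernelspec(s) whose name starts with \"python\""
           (length matches)))
```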

Managing the lifetime of a local kernel

Starting a kernel

As mentioned previously, to start a new kernel on the localhost and create a connected client, use jupyter-start-new-kernel. It takes a kernel name and returns a jupyter-kernel-manager, which manages the lifetime of the kernel, a connected jupyter-kernel-client, and the kernel info obtained from the initial :kernel-info-request.

(cl-destructuring-bind (manager client info)
    (jupyter-start-new-kernel "python")
  BODY)

Instead of supplying an exact kernel name, you may also supply the prefix of one. Then the first available kernel that has the same prefix will be started. See jupyter-find-kernelspecs.

Stopping a kernel

To shut down a kernel, use jupyter-shutdown-kernel. To check if a kernel is alive, use jupyter-kernel-alive-p.

Interrupting a kernel

To interrupt a kernel, use jupyter-interrupt-kernel.

Making clients connected to a local kernel

Once you have a kernel manager you can make new jupyter-kernel-client (or a subclass of one) instances using jupyter-make-client.

Evaluating code when a message is received

As mentioned previously, to evaluate code in response to a received message, you may subclass jupyter-kernel-client and override the handler methods. Alternatively you can add message callbacks to the jupyter-request objects returned by the jupyter-send-* methods. In both cases, when a message of a certain type is received for a request, the appropriate handler method or callback runs. If both methods are used in parallel, the message callbacks will run before the handler methods.

jupyter-request callbacks

To add callbacks to a request, use jupyter-add-callback. jupyter-add-callback accepts a jupyter-request object as its first argument and alternating (message type, callback) pairs as the remaining arguments. The callbacks are registered with the request object to run whenever a message of the appropriate type is received. For example, to do something with the client's kernel info you would do the following:

(jupyter-add-callback (jupyter-send-kernel-info-request client)
  :kernel-info-reply (lambda (msg)
                       (let ((info (jupyter-message-content msg)))
                         BODY)))

To print out the results of an execute request:

(jupyter-add-callback (jupyter-send-execute-request client :code "1 + 2")
  :execute-result (lambda (msg)
                    (message (jupyter-message-data msg :text/plain))))

To add multiple callbacks to a request:

(jupyter-add-callback (jupyter-send-execute-request client :code "1 + 2")
  :execute-result (lambda (msg)
                    (message (jupyter-message-data msg :text/plain)))
  :status (lambda (msg)
            (when (jupyter-message-status-idle-p msg)
              (message "DONE!"))))

There is also the possibility of running the same handler for different message types:

(jupyter-add-callback (jupyter-send-execute-request client :code "1 + 2")
  '(:status :execute-result :execute-reply)
  (lambda (msg)
    (pcase (jupyter-message-type msg)
      (:status ...)
      (:execute-reply ...)
      (:execute-result ...))))

Note, this can also be achieved by adding the same function to each message type.

Channel hooks

Hook variables are available for each channel: jupyter-iopub-message-hook, jupyter-stdin-message-hook, and jupyter-shell-message-hook. Unless you want to run a channel hook for every client, use jupyter-add-hook to add a function to one of the channel hooks. jupyter-add-hook only adds to the client local value of the hook variables.

(jupyter-add-hook
 client 'jupyter-iopub-message-hook
 (lambda (msg)
   (when (jupyter-message-status-idle-p msg)
     (message "Kernel idle."))))

There is also the function jupyter-remove-hook to remove a client local hook.
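Removing a hook function requires a reference to the same function that was added, so a named function is easier to manage than a lambda. The sketch below assumes that jupyter-remove-hook takes its arguments in the same order as jupyter-add-hook; the my/ names are hypothetical.

```elisp
(require 'jupyter)

(defun my/jupyter-report-idle (msg)
  "Echo a note when MSG is an idle status message."
  (when (jupyter-message-status-idle-p msg)
    (message "Kernel idle.")))

(defun my/jupyter-toggle-idle-report (client enable)
  "Add the idle report to CLIENT's IOPub hook if ENABLE, else remove it."
  (if enable
      (jupyter-add-hook
       client 'jupyter-iopub-message-hook #'my/jupyter-report-idle)
    ;; Assumed to mirror the argument order of `jupyter-add-hook'.
    (jupyter-remove-hook
     client 'jupyter-iopub-message-hook #'my/jupyter-report-idle)))
```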

Suppressing handler methods

To prevent a client from running its handler methods for some requests, you may bind jupyter-inhibit-handlers to an appropriate value before a request is made. For example, to prevent a client from running its stream handler for a request you would do the following:

(let ((jupyter-inhibit-handlers '(:stream)))
  (jupyter-send-execute-request client :code "print(\"foo\")\n1 + 2"))

jupyter-inhibit-handlers can be a list of message types or t, the latter meaning inhibit handlers for all message types. This variable should be locally bound. If you set the global value of this variable, all new requests will prevent the handlers from running. The less intrusive way to prevent handlers from running for individual requests is to let bind jupyter-inhibit-handlers as in the above code.
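Inhibiting all handlers can be combined with request callbacks, as in the sketch below. It assumes that callbacks added with jupyter-add-callback still run while handler methods are inhibited, which the text implies but does not state outright. The function name is hypothetical.

```elisp
(require 'jupyter)

(defun my/jupyter-eval-quietly (client code)
  "Send CODE to CLIENT's kernel with all handler methods inhibited.
The plain-text result, if any, is echoed by a request callback."
  ;; The binding only needs to be in effect while the request is made.
  (let ((jupyter-inhibit-handlers t))
    (jupyter-add-callback
        (jupyter-send-execute-request client :code code)
      :execute-result
      (lambda (msg)
        (message "%s" (jupyter-message-data msg :text/plain))))))
```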

Waiting for messages

All message receiving happens asynchronously, therefore we need primitives which will block until we know for sure that a message of a certain type has been received. The following functions all wait for different conditions to be met on the received messages of a request and return the message that caused the function to stop waiting or nil if no message was received within a timeout period. The default timeout is jupyter-default-timeout seconds.

To wait until an idle message is received for a request:

(let ((timeout 4))
  (jupyter-wait-until-idle
   (jupyter-send-execute-request
    client :code "import time\ntime.sleep(3)")
   timeout))

To wait until a message of a specific type is received for a request:

(jupyter-wait-until-received :execute-reply
  (jupyter-send-execute-request client :code "[i*10 for i in range(100000)]"))

The most general form of the blocking functions is jupyter-wait-until which takes a message type and a function of a single argument. Whenever a message is received that matches the message type, the message is passed to the function. If the function returns non-nil, jupyter-wait-until returns the message which caused the function to return non-nil. If the function never returns a non-nil value within timeout, jupyter-wait-until returns nil.

(defun stream-prints-50-p (msg)
  (let ((text (jupyter-message-get msg :text)))
    (cl-loop for line in (split-string text "\n")
             thereis (equal line "50"))))

(let ((timeout 2))
  (jupyter-wait-until
      (jupyter-send-execute-request client :code "[print(i) for i in range(100)]")
      :stream #'stream-prints-50-p
    timeout))

The above code runs stream-prints-50-p for every stream message received from a kernel (here assumed to be a python kernel) for an execute request that prints the numbers 0 to 99 and waits until the kernel has printed the number 50 before returning from the jupyter-wait-until call. If the number 50 is not printed before the two second timeout, jupyter-wait-until returns nil. Otherwise it returns the stream message whose content contains the number 50.
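The blocking functions combine naturally with request callbacks to build a synchronous evaluation helper. The sketch below is one possible arrangement, not a function provided by the package: my/jupyter-eval-sync is a hypothetical name, and the code should live in a file with lexical-binding enabled.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'jupyter)

(defun my/jupyter-eval-sync (client code &optional timeout)
  "Evaluate CODE on CLIENT's kernel and return its text/plain result.
Return nil if the kernel does not become idle within TIMEOUT
seconds or if the request produces no :execute-result message."
  (let* ((result nil)
         (req (jupyter-send-execute-request client :code code)))
    ;; Remember the plain-text representation of the result, if any.
    (jupyter-add-callback req
      :execute-result
      (lambda (msg)
        (setq result (jupyter-message-data msg :text/plain))))
    ;; Block until the idle message for REQ arrives or the wait times out.
    (when (jupyter-wait-until-idle
           req (or timeout jupyter-default-timeout))
      result)))
```

With a Python kernel, a call such as (my/jupyter-eval-sync client "1 + 2") would be expected to return the kernel's plain-text representation of the result as a string.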

Message property lists

The jupyter-send-* methods already construct messages from their arguments, and the jupyter-handle-* methods receive the contents of the message as their arguments. There is therefore no need to work with message property lists directly unless you are using message callbacks, which pass the message property list directly to the callback function. In that case, the following functions will be of use:

;; Get the `:content' property of MSG
(jupyter-message-content msg)
;; Get the message type (one of the keys in `jupyter-message-types')
(jupyter-message-type msg)
;; Get the value of KEY in the MSG contents
(jupyter-message-get msg key)
;; Get the value of the MIMETYPE in MSG's :data property
;; MIMETYPE should be one of `:image/png', `:text/plain', ...
(jupyter-message-data msg mimetype)