Use Cases

This page lists some examples of tasks that RPyC excels at solving.

Remote (“Web”) Services

Starting with RPyC 3.00, the library is service-oriented. This makes implementing secure remote services trivial: a service is basically a class that exposes a well-defined set of remote functions and objects. Clients of the service can invoke these exposed functions to obtain results. For example, a UPS-like company may expose a TrackYourPackage service with the following functions:

get_current_location(pkgid)
get_location_history(pkgid)
get_delivery_status(pkgid)
report_package_as_lost(pkgid, info)
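
The four functions above could be exposed roughly as follows. This is a minimal sketch rather than a reference implementation: the PackageTracker class, its in-memory data, the status strings, the serve() helper, and port 18861 are all assumptions made for illustration. It relies on the RPyC convention that only service methods carrying the exposed_ prefix can be called by clients.

```python
class PackageTracker:
    """Hypothetical in-memory tracking logic; a real service would query a database."""

    def __init__(self):
        self._history = {}  # pkgid -> list of locations, oldest first
        self._lost = {}     # pkgid -> free-form loss report

    def record_location(self, pkgid, location):
        self._history.setdefault(pkgid, []).append(location)

    def get_current_location(self, pkgid):
        return self._history[pkgid][-1]

    def get_location_history(self, pkgid):
        # A tuple of strings is sent to the client by value.
        return tuple(self._history[pkgid])

    def get_delivery_status(self, pkgid):
        if pkgid in self._lost:
            return "lost"
        at_destination = self.get_current_location(pkgid) == "destination"
        return "delivered" if at_destination else "in transit"

    def report_package_as_lost(self, pkgid, info):
        self._lost[pkgid] = info


def serve(tracker, port=18861):
    """Expose `tracker` over RPyC; blocks until the server is stopped."""
    # Imported here so the tracking logic above can be used without rpyc installed.
    import rpyc
    from rpyc.utils.server import ThreadedServer

    class TrackYourPackageService(rpyc.Service):
        # Only methods with the exposed_ prefix are callable by clients.
        def exposed_get_current_location(self, pkgid):
            return tracker.get_current_location(pkgid)

        def exposed_get_location_history(self, pkgid):
            return tracker.get_location_history(pkgid)

        def exposed_get_delivery_status(self, pkgid):
            return tracker.get_delivery_status(pkgid)

        def exposed_report_package_as_lost(self, pkgid, info):
            tracker.report_package_as_lost(pkgid, info)

    ThreadedServer(TrackYourPackageService, port=port).start()
```

A client would then connect with rpyc.connect("localhost", 18861) and call, for example, conn.root.get_current_location(pkgid); the exposed_ prefix is dropped on the client side.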

RPyC is configured (by default) to prevent the use of getattr on remote objects, except for “allowed attributes”; the rest of the security model is based on passing capabilities. Passing capabilities is explicit and fine-grained: for instance, instead of allowing the other party to call open() and attempting to block disallowed calls at the file-name level (which is weak), you can pass an open file object to the other party. The other party can manipulate the file (calling read/write/seek on it), but it has no access to the rest of the file system.
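
Handing out an open file as a capability might look like the sketch below. The LogService name, the serve_log() and count_lines() helpers, and port 18862 are hypothetical. One assumption matters in particular: read and seek are not among the attributes RPyC allows by default, so the sketch turns on the allow_public_attrs protocol option for this server, which widens access to the public attributes of every object that server hands out.

```python
def count_lines(log_file):
    """Runs on the party that receives the capability.

    It only touches the file object it was handed, so the same code works
    on a local file or on an RPyC proxy to a remote one.
    """
    log_file.seek(0)
    return len(log_file.read().splitlines())


def serve_log(path, port=18862):
    """Hand out one open file instead of exposing open(); blocks while serving."""
    # Imported here so count_lines() stays usable without rpyc installed.
    import rpyc
    from rpyc.utils.server import ThreadedServer

    class LogService(rpyc.Service):
        def exposed_open_log(self):
            # The path is fixed by the server; the client cannot choose it.
            return open(path, "r")

    # Assumption: allow_public_attrs is enabled so the client may call
    # read/seek on the returned file object.
    server = ThreadedServer(
        LogService,
        port=port,
        protocol_config={"allow_public_attrs": True},
    )
    server.start()
```

On the client side, count_lines(conn.root.open_log()) would read the remote file through the proxy without ever naming a path.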

Administration and Central Control

Efficient system administration is quite difficult: you have to control a variety of platforms with different endianness (big/little) and bit widths (32/64), different administration tools, and different shell languages (sh, tcsh, batch files, WMI, etc.). Moreover, you have to work across numerous transport protocols (telnet, ftp, ssh, etc.). Most system tools are domain-specific (awk, grep), quite limited (operating on lines of text), and difficult to extend or compose together. System administration today is a mishmash of technologies.

Why not use Python for that? It’s a cross-platform, powerful, and succinct programming language with loads of libraries and great support. All you have to do is pip install rpyc on all of your machines, set them up to start an RPyC server on boot (over SSH or SSL), and there you go! You can control every machine from a single place, using a unified set of tools and libraries.
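
As a rough illustration of controlling every machine from a single place, the sketch below queries each machine for basic platform facts. It assumes every host runs a classic-mode RPyC server; collect_inventory() and connect_all() are hypothetical helper names. For brevity it uses the plain, unencrypted rpyc.classic.connect(); the SSH or SSL setup described above would use rpyc.classic.ssh_connect() or rpyc.classic.ssl_connect() instead.

```python
def collect_inventory(conns):
    """Ask every connected machine for basic platform facts.

    `conns` maps a label to a classic-mode connection, whose `modules`
    attribute gives access to the modules of the remote interpreter.
    """
    inventory = {}
    for label, conn in conns.items():
        remote_platform = conn.modules.platform  # runs on the remote machine
        inventory[label] = {
            "system": remote_platform.system(),
            "machine": remote_platform.machine(),
            "python": remote_platform.python_version(),
        }
    return inventory


def connect_all(hosts):
    """Open a classic-mode connection to each host name in `hosts`."""
    # Imported here so collect_inventory() can be exercised without rpyc installed.
    import rpyc

    return {host: rpyc.classic.connect(host) for host in hosts}
```

Calling collect_inventory(connect_all(["host-a", "host-b"])) (host names made up) returns one dictionary per machine, regardless of each machine's operating system or shell.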

Hardware Resources

You will often need to use the hardware (“physical”) resources of one machine from another. For instance, some test gear or device can only connect to Solaris SPARC machines, but you’re more comfortable developing on your Windows workstation. Assuming your device comes with C bindings, some command-line tool, or accepts commands via ioctl to some device node, you can just run an RPyC server on that machine, connect to it from your workstation, and access the device programmatically with ease (using ctypes or popen remotely).
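
The remote popen case can be sketched as follows, assuming a classic-mode RPyC server runs on the machine the device is attached to. The run_device_tool() helper and any tool name passed to it are hypothetical.

```python
def run_device_tool(conn, argv):
    """Run a command-line tool on the machine behind `conn` and return its output.

    `conn` is a classic-mode connection; `argv` is the command and its
    arguments, e.g. a made-up ("devtool", "--read-register", "0x10").
    """
    # A tuple of strings is sent by value; a list would be passed by
    # reference and accessed remotely one element at a time.
    raw_output = conn.modules.subprocess.check_output(tuple(argv))
    return raw_output.decode(errors="replace").strip()
```

On the workstation, conn would come from rpyc.classic.connect() with the device host's name; the tool itself executes on the remote machine and only its output travels back.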

Parallel Execution

In CPython, the GIL prevents multiple threads from executing Python bytecode at once. This simplifies the design of the interpreter, but as a consequence a single CPython process cannot run Python bytecode on multiple CPUs or cores at the same time. The only way to achieve scalable, CPU-bound Python programs is to use multiple processes instead of threads. The bright side of using processes over threads is that it reduces the synchronization problems inherent to multithreading; but without an easy way to communicate between your processes, threads are more appealing.

Using RPyC, multiprocessing becomes very easy, since we can think of RPyC-connected processes as “one big process”. Another modus operandi is having the “master” process spawn multiple worker processes and distribute workload between them.
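
The master/worker pattern could look roughly like this. The sketch assumes RPyC 4 or later (for rpyc.async_) and that each worker process already runs a classic-mode server at a known (host, port) address; sum_of_squares, split_round_robin() and parallel_sum_of_squares() are hypothetical names chosen for illustration.

```python
# Source code sent to each worker; the workload itself is made up.
WORKER_SOURCE = """
def sum_of_squares(numbers):
    return sum(n * n for n in numbers)
"""


def split_round_robin(items, n_workers):
    """Deal items out into n_workers chunks, like dealing cards."""
    return [tuple(items[i::n_workers]) for i in range(n_workers)]


def parallel_sum_of_squares(worker_addrs, numbers):
    """Distribute the work over classic-mode servers at (host, port) addresses."""
    # Imported here so the helpers above can be used without rpyc installed.
    import rpyc

    conns = [rpyc.classic.connect(host, port) for host, port in worker_addrs]
    try:
        pending = []
        chunks = split_round_robin(numbers, len(conns))
        for conn, chunk in zip(conns, chunks):
            conn.execute(WORKER_SOURCE)  # define the function in the worker
            remote_func = conn.namespace["sum_of_squares"]
            # async_ makes the call return immediately with a result handle,
            # so all workers compute at the same time.
            pending.append(rpyc.async_(remote_func)(chunk))
        # .value blocks until that worker's result has arrived.
        return sum(result.value for result in pending)
    finally:
        for conn in conns:
            conn.close()
```

Each chunk is sent as a tuple so that the numbers travel by value, and the master only blocks when it collects the results.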

Distributed Computation Platform

RPyC forms a powerful foundation for distributed computations and clustering: it is architecture and platform agnostic, supports synchronous and asynchronous invocation, and clients and servers are symmetric. On top of these features, it is easy to develop distributed-computing frameworks; for instance, such a framework will need to:

Take care of nodes joining or leaving the cluster
Handle workload balancing and node failures
Collect results from workers
Migrate objects and code based on runtime profiling

Note

RPyC itself is only a mechanism for distributed computing; it is not a distributed computing framework.

Distributed algorithms could then be built on top of this framework to make computations faster.

Testing

The first and foremost use case of RPyC is in testing environments, where the concept of the library was conceived (initially as pyinvoke).

Classic-mode RPyC is the ideal tool for centralized testing across multiple machines and platforms: control your heterogeneous testing environment (simulators, devices and other test equipment) and test procedure from the comfort of your workstation. Since RPyC integrates so well with Python, it is very easy to have your test logic run on machine A, while the side-effects happen on machine B.
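
A small sketch of that split, assuming a classic-mode server runs on machine B: the check below executes on machine A, while the temporary directory and file it manipulates live on machine B. The function name and the probe file name are hypothetical.

```python
def check_remote_file_roundtrip(conn, text="hello from machine A"):
    """Write a file on the machine behind `conn`, read it back, and compare.

    The test logic runs locally; every `conn.modules.*` call executes on
    the remote machine.
    """
    workdir = conn.modules.tempfile.mkdtemp()  # directory created remotely
    try:
        probe = conn.modules.pathlib.Path(workdir, "probe.txt")
        probe.write_text(text)
        return probe.read_text() == text
    finally:
        # Clean up on the remote machine even if the check fails.
        conn.modules.shutil.rmtree(workdir)
```

A test runner on machine A could simply assert the return value, with conn obtained from rpyc.classic.connect() pointing at machine B.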

There is no need to copy files and keep them synchronized across several machines, or to work on remote file system mounts. Also, since RPyC requires a lot of network “ping-pongs”, and because of the inherent security risks of the classic mode, this mode works best on secure, fast local networks (which is usually the case in testing environments).