Friday, February 12, 2016

A few notes about network namespaces in Linux

For some time now I've been working with network namespaces as implemented in the Linux kernel. Here I'll collect notes about the implementation, behavior, and usage, and anything else I learn while using network namespaces.

Kernel API for NETNS


The kernel offers two system calls that allow management of network namespaces. The first one, unshare(2), creates a new network namespace. Actually, this system call allows other types of namespaces to be created as well, but here we are interested only in network namespaces. So, to create a new network namespace you call the function like this:
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
...
    if (unshare(CLONE_NEWNET) == -1)
        perror("unshare");
...
That call creates a new network namespace and makes it the active one for the calling process.

There are two ways other processes can now use that network namespace. The first approach is for the process that created the new network namespace to fork other processes; each forked process shares and inherits the parent process's network namespace. The same is true if exec is used.
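For illustration, here is a minimal example of the first approach; running ip link show in the child is just my choice to demonstrate that the child sees the new namespace (only loopback should be listed):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    if (unshare(CLONE_NEWNET) == -1) {  /* create and enter a new netns */
        perror("unshare");
        return 1;
    }
    pid_t pid = fork();
    if (pid == 0) {
        /* child: inherits the parent's (new) network namespace */
        execlp("ip", "ip", "link", "show", (char *)NULL);
        perror("execlp");
        return 1;
    }
    return 0;
}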

The second system call the kernel offers is setns(2). To use this system call you have to have a file descriptor that is somehow related to the network namespace you want to use. There are two approaches to obtaining that file descriptor.

The first approach is to know a process that currently lives in the required network namespace. Let's say that the PID of the given process is $PID. To obtain the file descriptor you simply open the file /proc/$PID/ns/net and pass the descriptor to the setns(2) system call to switch network namespaces. This approach always works.

The second approach works only for iproute2-compatible tools. Namely, when the ip command creates a new network namespace, it creates a file in the /var/run/netns directory and bind mounts the new network namespace onto that file. So, if you know the name of the network namespace you want to access (let's say the name is NAME), to obtain the file descriptor you just need to open(2) the related file, i.e. /var/run/netns/NAME.
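Both approaches end the same way: open the appropriate file and pass the descriptor to setns(2). A minimal sketch, assuming a namespace called NAME was created earlier with the ip tool:

#define _GNU_SOURCE
#include <sched.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* either "/proc/$PID/ns/net" or an iproute2-style path works here */
    int fd = open("/var/run/netns/NAME", O_RDONLY);
    if (fd == -1) {
        perror("open");
        return 1;
    }
    if (setns(fd, CLONE_NEWNET) == -1) {  /* switch to that namespace */
        perror("setns");
        return 1;
    }
    close(fd);
    /* sockets created from now on belong to the new namespace */
    return 0;
}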

Note that there is no system call that would allow you to remove an existing network namespace. Each network namespace exists as long as there is at least one process that uses it, an open file descriptor that refers to it, or a mount point (such as the bind mount the ip command creates).

Two remarks for the end of this section. First, there is no system call that would allow one process to move some other process into another network namespace! And second, you need appropriate privileges (CAP_SYS_ADMIN) to use the mentioned system calls, i.e. regular user processes can't switch namespaces.

Socket API behavior


The next question is how the Socket API behaves when network namespaces are used, and things here are quite interesting.

First, each socket you create is bound to whatever network namespace was active at the time the socket was created. That means you can make one network namespace active (say NS1), create a socket, and then immediately make another network namespace active (NS2). The socket remains bound to NS1 no matter which network namespace is currently active, and it can be used normally. In other words, when doing some operation with the socket (bind, connect, anything) you don't need to activate the socket's own network namespace first!
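Here is a sketch of that behavior; the namespace files for NS1 and NS2 are assumed to exist (created with the ip tool, as described earlier) and error handling is omitted for brevity:

#define _GNU_SOURCE
#include <sched.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int fd_ns1 = open("/var/run/netns/NS1", O_RDONLY);
    int fd_ns2 = open("/var/run/netns/NS2", O_RDONLY);

    setns(fd_ns1, CLONE_NEWNET);                 /* NS1 becomes active   */
    int sock = socket(AF_INET, SOCK_DGRAM, 0);   /* sock is bound to NS1 */

    setns(fd_ns2, CLONE_NEWNET);                 /* NS2 becomes active   */
    /* bind(), connect(), send() etc. on sock still operate inside NS1 */
    return 0;
}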

Also note that the active network namespace is a per-thread setting: if you switch to a certain network namespace in one thread, this won't have any impact on the other threads in the process.
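For example (a sketch; the path /var/run/netns/NAME is a placeholder for a real namespace file), only the thread that calls setns(2) switches:

#define _GNU_SOURCE
#include <sched.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg)
{
    int fd = *(int *)arg;
    if (setns(fd, CLONE_NEWNET) == -1)   /* switches only this thread */
        perror("setns");
    /* sockets created by this thread belong to the new namespace */
    return NULL;
}

int main(void)
{
    int fd = open("/var/run/netns/NAME", O_RDONLY);
    pthread_t t;
    pthread_create(&t, NULL, worker, &fd);
    pthread_join(t, NULL);
    /* the main thread is still in its original network namespace */
    return 0;
}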

Command line tools


There are two command line tools available for manipulating network namespaces. The first one is nsenter(1), which isn't specific to networking; it allows one to start a process within a predefined namespace. The second tool is the ip command from the iproute2 package. It allows management of network namespaces and also allows network interfaces to be moved between namespaces.

NETLINK behavior

To move a device from one network namespace to another, NETLINK must be used. I found references somewhere to /sys files for this, but at least on my system they don't appear to exist.
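A rough sketch of such a NETLINK (rtnetlink) request, roughly what ip link set DEV netns PID does; it is simplified (it skips reading the kernel's acknowledgment), and the interface index and PID are placeholders:

#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_link.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>

#define IF_INDEX   7      /* placeholder: index of the interface to move */
#define TARGET_PID 1234   /* placeholder: a PID living in the target netns */

int main(void)
{
    struct {
        struct nlmsghdr  nh;
        struct ifinfomsg ifi;
        char             attrbuf[64];
    } req;
    struct rtattr *rta;

    int sk = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (sk == -1) {
        perror("socket");
        return 1;
    }

    memset(&req, 0, sizeof(req));
    req.nh.nlmsg_len   = NLMSG_LENGTH(sizeof(req.ifi));
    req.nh.nlmsg_type  = RTM_NEWLINK;
    req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
    req.ifi.ifi_family = AF_UNSPEC;
    req.ifi.ifi_index  = IF_INDEX;

    /* append an IFLA_NET_NS_PID attribute carrying the target PID */
    rta = (struct rtattr *)((char *)&req + NLMSG_ALIGN(req.nh.nlmsg_len));
    rta->rta_type = IFLA_NET_NS_PID;
    rta->rta_len  = RTA_LENGTH(sizeof(unsigned int));
    *(unsigned int *)RTA_DATA(rta) = TARGET_PID;
    req.nh.nlmsg_len = NLMSG_ALIGN(req.nh.nlmsg_len) + rta->rta_len;

    if (send(sk, &req, req.nh.nlmsg_len, 0) == -1)
        perror("send");
    close(sk);
    return 0;
}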

One interesting fact is that an interface's ID is global across all network namespaces (except for the loopback interface): if you create an interface in one network namespace and it gets ID N, and then you move it to another network namespace, it will keep ID N.

TBD.

Tuesday, February 2, 2016

NetworkManager architecture

After figuring out how VPN establishment works in NetworkManager, the next big question is about the architecture of NetworkManager and its run-time behavior. This post tries to provide an answer to that question, and it consists of two parts. The first part, Source code organization, describes how the NetworkManager source tree is organized. The second part, Run-time initialization behavior, describes the initialization of NetworkManager and its key parts. Note that I'll update this post as I learn more about NetworkManager.

The last update time of this post was: February 2nd, 2016.

Source code organization

So, let us start with the source tree organization. If you take a look at the top-level directory of the NetworkManager git tree, you'll find some files and a set of directories. The files are build scripts, configuration files, and some documentation. But in the end, all the interesting functionality is placed within the top-level directories, which are the following:
  • callouts

    TBD
     
  • clients

    Clients to control NM like nmcli or nmtui.
     
  • introspection

    Contains XML descriptions of the DBus interfaces provided by NetworkManager and its components. For example, VPN plugins are external to NetworkManager but use DBus to communicate with it, and there is an XML file that describes a VPN plugin's interfaces. All the descriptions are converted into C code stubs and skeletons using dbus-binding-tool. There is a lot of magic stuff happening here.
     
  • libnm

    Libraries intended to be used by clients accessing and using services provided by NetworkManager. This also includes applications that control NetworkManager itself (like nmcli and nmtui).
     
  • libnm-core

    Core NM libraries used by NetworkManager itself and all the other components, like command line utilities.
     
  • libnm-glib, libnm-util

    Old NM libraries that are deprecated starting with the 1.0 release of NetworkManager. Still, it seems that at least some functionality from those libraries is used, so they will probably stay until all the pieces of NetworkManager are converted to the new libraries. Note that in my posts I ignore the content of those libraries as much as possible and concentrate on the previous two libraries, libnm and libnm-core.
     
  • src

    The directory with the code specific to NetworkManager itself, unlike the previous directories that contain code used by different programs. The src/ directory contains files and a number of subdirectories:
     
    • devices/

      Classes and subclasses used to represent the different networking devices, real and virtual, that can be managed by NetworkManager.
       
    • dhcp-manager/

      Manager class that manages DHCP clients and interfaces them to NetworkManager.
       
    • dns-manager/


       
    • dnsmasq-manager/


       
    • platform/

      Platform-specific code that interfaces NetworkManager to the particular operating system it is running on. For the moment there is a base class and a derived Linux-specific class.
       
    • ppp-manager/


       
    • rdisc/

      Code to monitor and parse IPv6 router advertisement messages.
       
    • settings/

      Code to keep track of connections. Since connections are read from configuration files that depend on the particular distribution, this directory also contains plugins, one for each supported distribution.
       
    • supplicant-manager/


       
    • systemd/


       
    • vpn-manager/

      VPN-specific code. Here is a manager class used to manage all the VPN connections, and also a class used to represent each active VPN connection.
       


Run-time initialization behavior


When NM is started, execution begins in the main() function that can be found in src/main.c. The goal of the main function is to parse command line options, check some prerequisites for running NM, perform the initialization steps, and start the GLib main loop. The most interesting part is the initialization steps, which are the following:
  • Initialize logging.

  • Create the configuration object singleton, NM_TYPE_CONFIG (defined in src/nm-config.c).

  • Create the singleton for managing DBus connections and communication, NM_TYPE_BUS_MANAGER (defined in src/nm-bus-manager.c).

  • Create the platform-dependent singleton used for management of the network subsystem, NM_TYPE_LINUX_PLATFORM (defined in src/platform/nm-linux-platform.c).

  • Create the main singleton object of class NM_TYPE_MANAGER (defined in src/nm-manager.c).

  • Initialize the loopback interface.
So, while NetworkManager is running, it basically consists of a set of singleton objects that communicate with each other and together make up what is named NetworkManager. The sketch below illustrates the overall startup structure.
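A minimal structural sketch of that startup pattern (not NetworkManager's actual code; the singleton creation is only indicated by a comment):

#include <glib.h>

int main(void)
{
    /* parse options, initialize logging, create the singleton objects
       (configuration, bus manager, platform, manager) ... */

    GMainLoop *loop = g_main_loop_new(NULL, FALSE);
    g_main_loop_run(loop);    /* blocks; everything else is event-driven */
    g_main_loop_unref(loop);
    return 0;
}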
