Multicast

ArticleCategory: []

System Administration

AuthorImage:[]

[Photo of the Author]

TranslationInfo:[]

original in es Angel Lopez
es to en Javier Palacios

AboutTheAuthor:[]

Angel is finishing his studies in Computer Engineering. He currently works as an instructor for Sun Microsystems, teaching Solaris and network administration. He recently co-authored the book Internet protocols. Design and implementation on Unix systems, published by Ra-Ma. His main interests are networking, security, Unix system and network programming and, very recently, Linux kernel hacking, which is cutting into his sleep ;)

Abstract:[]

This article is an introduction to multicast technologies on TCP/IP networks. It covers the theoretical concepts of multicast communication and details the Linux API available for programming multicast applications. The kernel functions implementing this technology are also shown, to complete the overall view of multicast support under Linux. The article finishes with a simple C socket programming example that illustrates how to create a multicast application.

ArticleIllustration:[]

[Illustration]

ArticleBody:[]

Introduction

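When you try to reach a host (interface) within a network, you can use three different kinds of addresses:

Unicast address: identifies a single host (interface) in the network.
Broadcast address: reaches every host in the subnet.
Multicast address: reaches a group of hosts, namely those that have joined the corresponding multicast group.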

Multicast addresses are useful when the information is intended for more than one host and we do not want to produce a network broadcast. This scenario is typical of situations that require sending multimedia information (real-time audio or video, for example) to several hosts. In terms of bandwidth, sending a separate unicast copy to every client that wants to receive the multimedia stream is a poor choice. A broadcast is not the best solution either, especially if any of the clients are located outside the local subnet where the stream originates.

Multicast Address

As the reader probably knows, the unicast IPv4 address space is divided into three address classes: A, B and C. A fourth class (D) is reserved for multicast addresses. IPv4 addresses between 224.0.0.0 and 239.255.255.255 belong to class D.

The 4 most significant bits of a class D address are fixed to 1110, so the first byte takes values between 224 and 239. The remaining 28 bits hold the multicast group identifier, as shown in the figure below:

[grouped bits]


At the network level, IPv4 multicast addresses must be mapped onto the physical addresses of the type of network we are working with. With unicast addresses, the associated physical address is obtained using the ARP protocol. For multicast addresses ARP cannot be used, and the physical address must be obtained in a different way. Several RFC documents describe how to perform this mapping; RFC 1112, for example, covers Ethernet networks.

In Ethernet networks, the most widespread ones, the mapping fixes the 24 most significant bits of the Ethernet address to 01:00:5E. The next bit is set to 0, and the 23 least significant bits are copied from the 23 least significant bits of the multicast IPv4 address. This process is shown in the figure below:
[transform to Ethernet]

For example, the multicast IPv4 address 224.0.0.5 will correspond to the physical ethernet address 01:00:5E:00:00:05.

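There are some special multicast IPv4 addresses, for example:

224.0.0.1: all hosts in the subnet with multicast support.
224.0.0.2: all multicast routers in the subnet.
224.0.0.5: all OSPF routers.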

Many more multicast addresses have been allocated besides those shown here. The complete list can be found in the latest available version of the "Assigned Numbers" RFC.

The table below shows the full multicast address space, with the usual name of each address range and its associated TTL (the time-to-live counter in the IP packet). Under IPv4 multicast, the TTL has a double meaning. As the reader probably knows, it controls the lifetime of a datagram in the network, to prevent infinite loops in case of misconfigured routing tables. With multicast, the TTL value also defines the scope of the datagram, i.e., how far it will travel in the network. This allows a scope definition based on the datagram category.

Scope         TTL    Address range                  Description
Node          0      -                              The datagram is restricted to the local host. It will not reach any of the network interfaces.
Link          1      224.0.0.0 - 224.0.0.255        The datagram is restricted to the subnet of the sending host, and will not progress beyond any router.
Department    < 32   239.255.0.0 - 239.255.255.255  Restricted to one department of the organization.
Organization  < 64   239.192.0.0 - 239.195.255.255  Restricted to a specific organization.
Global        < 255  224.0.1.0 - 238.255.255.255    No restriction, global application.
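
The TTL thresholds in the table can be turned into a small lookup, shown here as a sketch. The enum and function names are invented for this example, and treating every TTL of 64 or more as global is an assumption made for simplicity.

```c
#include <assert.h>

/* Hypothetical scope names, ordered from narrowest to widest. */
enum mcast_scope {
    SCOPE_INVALID = -1,
    SCOPE_NODE,
    SCOPE_LINK,
    SCOPE_DEPARTMENT,
    SCOPE_ORGANIZATION,
    SCOPE_GLOBAL
};

/* Map a TTL value to the widest scope it is meant to reach. */
enum mcast_scope scope_for_ttl(int ttl)
{
    if (ttl < 0 || ttl > 255)
        return SCOPE_INVALID;   /* the TTL is an 8-bit field */
    if (ttl == 0)
        return SCOPE_NODE;
    if (ttl == 1)
        return SCOPE_LINK;
    if (ttl < 32)
        return SCOPE_DEPARTMENT;
    if (ttl < 64)
        return SCOPE_ORGANIZATION;
    return SCOPE_GLOBAL;        /* assumption: 64..255 is unrestricted */
}

/* Returns 1 if a datagram sent with `ttl` can reach scope `wanted`. */
int ttl_reaches(int ttl, enum mcast_scope wanted)
{
    assert(wanted != SCOPE_INVALID);
    return scope_for_ttl(ttl) >= wanted;
}
```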


Multicast at work

Within a LAN, a network interface passes to the upper layers only those packets addressed to the host: packets whose destination address is the physical address of the interface, or the broadcast address.
If the host has joined a multicast group, the network interface also accepts the packets destined to that group, that is, those whose destination address corresponds to a multicast group the host is a member of.

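Therefore, if the host interface has the physical address 80:C0:F6:A0:4A:B1 and has joined the multicast group 224.0.1.10, the packets recognized as belonging to the host will be those with one of the following destination addresses:

The interface address: 80:C0:F6:A0:4A:B1
The broadcast address: FF:FF:FF:FF:FF:FF
The physical address associated with the multicast group: 01:00:5E:00:01:0A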

To work with multicast over a WAN, the routers must support multicast routing. When a process running on a host joins a multicast group, the host sends an IGMP (Internet Group Management Protocol) message to every multicast router in the subnet, to inform them that messages sent to that multicast group must be forwarded to the local subnet so that the subscribed process can receive them. The routers in turn inform every other multicast router, letting them know which multicast messages must be routed into the subnet.

The routers also send IGMP messages to the group 224.0.0.1, asking every host for information about the groups it is subscribed to. A host, after receiving such a message, sets a counter to a random value and replies when the counter reaches zero. This prevents all hosts from replying at the same time and overloading the network. When the host replies, it sends the message to the multicast address of the group, so every other host with membership in that group sees the reply and does not reply itself, since one subscribed host is enough for the subnet router to handle multicast messages for that group.

If all the hosts subscribed to a group have left it, no one will reply, and the router will conclude that no host is interested in that group any longer and will stop routing the corresponding messages into the subnet. Another option, implemented in IGMPv2, is for the host to announce that it is leaving by sending a message to the address 224.0.0.2.
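
The effect of the random reply timers can be illustrated with a deliberately simplified model. This is not an IGMP implementation: it assumes that every host sees a report the instant it is sent, and the function name reports_sent() is invented for the example. Given the timer value each group member drew, it counts how many reports actually go out on the wire.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of report suppression for ONE group.
 * delay[i] is the timer value host i picked after the router's query.
 * Hosts whose timer expires first send a report; every other host sees
 * that report before its own timer expires and stays silent. */
size_t reports_sent(const unsigned *delay, size_t n)
{
    size_t i, sent = 0;
    unsigned min;

    assert(delay != NULL && n > 0);

    min = delay[0];
    for (i = 1; i < n; i++)
        if (delay[i] < min)
            min = delay[i];

    /* Only hosts that drew the smallest value reply; ties reply together. */
    for (i = 0; i < n; i++)
        if (delay[i] == min)
            sent++;
    return sent;
}
```

With timers 7, 3, 9 and 5, only the host that drew 3 replies, so the router receives one report instead of four. In a real host the timer values would be drawn at random rather than passed in.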

The Application Programming Interface (API)

A reader with previous experience in socket programming will find only five new socket options for dealing with multicast. The functions setsockopt() and getsockopt() are used to set or read the values of these five options. The table below shows the available multicast options, with the data types they use and a brief description:

IPv4 option         Data type       Description
IP_ADD_MEMBERSHIP   struct ip_mreq  Join the multicast group.
IP_DROP_MEMBERSHIP  struct ip_mreq  Leave the multicast group.
IP_MULTICAST_IF     struct in_addr  Specify the interface used to send multicast messages.
IP_MULTICAST_TTL    u_char          Specify the TTL used to send multicast messages.
IP_MULTICAST_LOOP   u_char          Enable or disable the loopback of multicast messages.


The ip_mreq struct is defined in the header file <linux/in.h> as described below:

struct ip_mreq {
   struct in_addr imr_multiaddr; /* IP multicast address of group */
   struct in_addr imr_interface; /* local IP address of interface */
};
And the multicast options in that file are:
#define IP_MULTICAST_IF  32
#define IP_MULTICAST_TTL 33
#define IP_MULTICAST_LOOP 34
#define IP_ADD_MEMBERSHIP 35
#define IP_DROP_MEMBERSHIP 36


IP_ADD_MEMBERSHIP

A process can join a multicast group by setting this option on a socket with the function setsockopt(). The parameter is an ip_mreq struct. The first field, imr_multiaddr, contains the multicast address of the group we want to join. The second field, imr_interface, contains the IPv4 address of the interface we will use.

IP_DROP_MEMBERSHIP

Using this option a process can leave a multicast group. The fields of the ip_mreq struct are used in the same manner as in the previous case.
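
The two membership options can be wrapped in a pair of small helpers, sketched below. The names fill_mreq(), join_group() and leave_group() are invented for this illustration. Passing NULL as the local interface address is taken here to mean INADDR_ANY, leaving the choice of interface to the kernel.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* makes struct ip_mreq visible on glibc */
#endif
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: build an ip_mreq from text addresses.
 * local_if == NULL selects INADDR_ANY. Returns 0, or -1 on bad input. */
static int fill_mreq(struct ip_mreq *mreq, const char *group,
                     const char *local_if)
{
    memset(mreq, 0, sizeof(*mreq));
    if (inet_pton(AF_INET, group, &mreq->imr_multiaddr) != 1)
        return -1;
    if (!IN_MULTICAST(ntohl(mreq->imr_multiaddr.s_addr)))
        return -1;                        /* not a class D address */
    if (local_if == NULL)
        mreq->imr_interface.s_addr = htonl(INADDR_ANY);
    else if (inet_pton(AF_INET, local_if, &mreq->imr_interface) != 1)
        return -1;
    return 0;
}

/* Join `group` on socket `s`. Returns 0 on success, -1 on error. */
int join_group(int s, const char *group, const char *local_if)
{
    struct ip_mreq mreq;

    assert(group != NULL);
    if (fill_mreq(&mreq, group, local_if) < 0)
        return -1;
    return setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP,
                      &mreq, sizeof(mreq));
}

/* Leave `group` on socket `s`. Returns 0 on success, -1 on error. */
int leave_group(int s, const char *group, const char *local_if)
{
    struct ip_mreq mreq;

    assert(group != NULL);
    if (fill_mreq(&mreq, group, local_if) < 0)
        return -1;
    return setsockopt(s, IPPROTO_IP, IP_DROP_MEMBERSHIP,
                      &mreq, sizeof(mreq));
}
```

A receiver would typically call join_group(s, "224.0.1.1", NULL) once after creating and binding its UDP socket, and leave_group() with the same arguments before closing it. Whether the join succeeds depends on the host having a multicast-capable interface.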

IP_MULTICAST_IF

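This option allows us to fix the network interface that the socket will use to send multicast messages. The interface is identified by its local IPv4 address, passed in a struct in_addr; Linux also accepts a struct ip_mreqn and, in recent kernels, a struct ip_mreq filled in as in the previous cases.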

IP_MULTICAST_TTL

Sets the TTL (Time To Live) of the datagrams carrying the multicast messages sent through the socket. The default value is 1, meaning that the datagram will not go beyond the local subnet.

IP_MULTICAST_LOOP

When a process sends a message to a multicast group, it will receive that message itself if its interface has joined the group, just as it would if the message had originated anywhere else in the network. This option enables or disables this behavior.

A practical example

To test the ideas presented in this article, we will look at a simple example in which one process sends messages to a multicast group, and several processes that have joined the group receive the messages and show them on the screen.

The following code implements a server that sends everything arriving on its standard input to the multicast group 224.0.1.1. As can be seen, no special action is needed to send information to a multicast group. The destination group address is enough.
The loopback and TTL options could be changed if their default values were not appropriate for the application under development.

Server

The standard input is sent to multicast group 224.0.1.1

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>

#define MAXBUF 256
#define PUERTO 5000
#define GRUPO "224.0.1.1"

int main(void) {
  int s;
  struct sockaddr_in srv;
  char buf[MAXBUF];

  /* Destination: the multicast group address and the UDP port */
  memset(&srv, 0, sizeof(srv));
  srv.sin_family = AF_INET;
  srv.sin_port = htons(PUERTO);
  /* inet_aton() returns 0 on failure and does not set errno */
  if (inet_aton(GRUPO, &srv.sin_addr) == 0) {
    fprintf(stderr, "inet_aton: invalid address %s\n", GRUPO);
    return 1;
  }
  if ((s = socket(AF_INET, SOCK_DGRAM, 0)) < 0) {
    perror("socket");
    return 1;
  }

  /* Send every line read from standard input to the group */
  while (fgets(buf, MAXBUF, stdin)) {
    if (sendto(s, buf, strlen(buf), 0,
              (struct sockaddr *)&srv, sizeof(srv)) < 0) {
      perror("sendto");
    } else {
      fprintf(stdout, "Sent to %s: %s\n", GRUPO, buf);
    }
  }
  close(s);
  return 0;
}


Client

The code below is the client side, which receives the information sent to the multicast group by the server. The received messages are shown on standard output. The only peculiarity of this code is the setting of the IP_ADD_MEMBERSHIP option. The remaining code is the standard one for a process that needs to receive UDP messages.

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <stdio.h>

#define MAXBUF 256
#define PUERTO 5000
#define GRUPO "224.0.1.1"

int main(void) {
  int s;
  ssize_t r;
  socklen_t n;
  struct sockaddr_in srv, cli;
  struct ip_mreq mreq;
  char buf[MAXBUF];

  /* Local address to bind: the group address and the UDP port */
  memset(&srv, 0, sizeof(srv));
  srv.sin_family = AF_INET;
  srv.sin_port = htons(PUERTO);
  /* inet_aton() returns 0 on failure and does not set errno */
  if (inet_aton(GRUPO, &srv.sin_addr) == 0) {
    fprintf(stderr, "inet_aton: invalid address %s\n", GRUPO);
    return 1;
  }

  if ((s = socket(AF_INET, SOCK_DGRAM, 0)) < 0) {
    perror("socket");
    return 1;
  }

  if (bind(s, (struct sockaddr *)&srv, sizeof(srv)) < 0) {
    perror("bind");
    return 1;
  }

  /* Join the group on the interface chosen by the kernel */
  memset(&mreq, 0, sizeof(mreq));
  if (inet_aton(GRUPO, &mreq.imr_multiaddr) == 0) {
    fprintf(stderr, "inet_aton: invalid address %s\n", GRUPO);
    return 1;
  }
  mreq.imr_interface.s_addr = htonl(INADDR_ANY);

  if (setsockopt(s,IPPROTO_IP,IP_ADD_MEMBERSHIP,&mreq,sizeof(mreq))
      < 0) {
    perror("setsockopt");
    return 1;
  }

  while (1) {
    n = sizeof(cli);  /* reset on every call: recvfrom() overwrites it */
    /* Leave one byte free for the terminating '\0' */
    if ((r = recvfrom(s, buf, MAXBUF - 1, 0, (struct sockaddr *)
         &cli, &n)) < 0) {
      perror("recvfrom");
    } else {
      buf[r] = 0;
      fprintf(stdout, "Message from %s: %s\n",
              inet_ntoa(cli.sin_addr), buf);
    }
  }
}
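
To run several receiving processes on the same host, each one has to bind the same UDP port, and a plain bind() normally fails for the second process with "Address already in use". A possible way around this, sketched here as a suggestion rather than as part of the original example, is to set SO_REUSEADDR before binding. The helper names open_receiver() and bound_port() are invented for this illustration, and the socket is bound to INADDR_ANY instead of the group address for simplicity.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: create a UDP socket that can share `port` with
 * other receivers on the same host. Returns the descriptor, or -1. */
int open_receiver(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    if (s < 0)
        return -1;

    /* Must be set BEFORE bind(), and by every process sharing the port */
    if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0) {
        close(s);
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(s);
        return -1;
    }
    return s;
}

/* Return the local port the socket is bound to, or -1 on error. */
int bound_port(int s)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    if (getsockname(s, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

Each receiver would then join the group on the returned socket with IP_ADD_MEMBERSHIP, exactly as the client above does.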


Kernel and Multicast

As we showed above, when a process wants to join a multicast group, it uses the setsockopt() function to set the option IP_ADD_MEMBERSHIP at the IP level. The actual implementation of this function can be found in /usr/src/linux/net/ipv4/ip_sockglue.c. The code executed within the function to set this option, or the IP_DROP_MEMBERSHIP one, is:

struct ip_mreqn mreq;

if (optlen < sizeof(struct ip_mreq)) 
  return -EINVAL; 
if (optlen >= sizeof(struct ip_mreqn)) { 
  if(copy_from_user(&mreq,optval,sizeof(mreq))) 
    return -EFAULT; 
} else { 
  memset(&mreq, 0, sizeof(mreq)); 
  if (copy_from_user(&mreq,optval,sizeof(struct ip_mreq))) 
    return -EFAULT; 
} 
if (optname == IP_ADD_MEMBERSHIP) 
  return ip_mc_join_group(sk,&mreq); 
else 
  return ip_mc_leave_group(sk,&mreq); 


The very first lines of code check that the input parameter has a correct length (at least that of an ip_mreq struct), and that it can be copied from user space to kernel space. If the caller passed the shorter ip_mreq struct, the remaining fields of the kernel's ip_mreqn are zeroed first. Once we have the parameter value, the function ip_mc_join_group() is called to join a multicast group, or ip_mc_leave_group() if we want to leave it.
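
The longer ip_mreqn struct accepted by this code is also available to applications on Linux, and lets a program name the interface by index instead of by address. The sketch below only fills in the struct; the helper name make_mreqn() is invented for the example, and if_nametoindex() is used to translate an interface name such as "eth0" into its index.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes struct ip_mreqn on glibc */
#endif
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <net/if.h>

/* Hypothetical helper: prepare an ip_mreqn for `group`.
 * ifname == NULL leaves imr_ifindex at 0, so the kernel falls back to
 * imr_address (here INADDR_ANY). Returns 0, or -1 on bad input. */
int make_mreqn(struct ip_mreqn *m, const char *group, const char *ifname)
{
    assert(m != NULL && group != NULL);

    memset(m, 0, sizeof(*m));   /* imr_address = INADDR_ANY, ifindex = 0 */
    if (inet_pton(AF_INET, group, &m->imr_multiaddr) != 1)
        return -1;

    if (ifname != NULL) {
        unsigned int idx = if_nametoindex(ifname);

        if (idx == 0)
            return -1;          /* no such interface */
        m->imr_ifindex = (int)idx;
    }
    return 0;
}
```

The filled struct is passed to setsockopt() with IP_ADD_MEMBERSHIP and sizeof(struct ip_mreqn), which takes the first branch of the length check shown above.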

The code for these functions is found at /usr/src/linux/net/ipv4/igmp.c. To join a group, the source code is commented below:

int ip_mc_join_group(struct sock *sk , struct ip_mreqn *imr)
{ 
  int err; 
  u32 addr = imr->imr_multiaddr.s_addr; 
  struct ip_mc_socklist *iml, *i; 
  struct in_device *in_dev; 
  int count = 0; 


At the very beginning we check, using the MULTICAST macro, that the group address lies within the range reserved for multicast addresses. It is enough to check that the four most significant bits of the IP address are 1110, that is, that the most significant byte lies between 224 and 239.

  if (!MULTICAST(addr)) 
    return -EINVAL; 

  rtnl_shlock(); 


After the verification, a network interface is chosen to deal with the multicast group. If the interface cannot be looked up by index (the method used under IPv6), the function ip_mc_find_dev() is called to find the device associated with the specified IP address. We will assume for the remainder of the article that this is the case, because we are working with IPv4. If the address is INADDR_ANY, the kernel must find the network interface itself, consulting the routing table to choose the best interface for the group address.

  if (!imr->imr_ifindex) 
    in_dev = ip_mc_find_dev(imr); 
  else 
    in_dev = inetdev_by_index(imr->imr_ifindex);

  if (!in_dev) { 
    iml = NULL; 
    err = -ENODEV; 
    goto done; 
  }


Then we reserve memory for an ip_mc_socklist struct, and the new request is compared with each group address and interface already associated with the socket. If a previous entry matches, we jump out of the function, because it does not make sense to associate the socket twice with the same group and interface. If the interface address in the request is INADDR_ANY, the reference counter of the matching entry is incremented and the call succeeds; otherwise the function ends with the error EADDRINUSE.

  iml = (struct ip_mc_socklist *)sock_kmalloc(sk, sizeof(*iml), 
    GFP_KERNEL); 
  err = -EADDRINUSE; 
  for (i=sk->ip_mc_list; i; i=i->next) { 
    if (memcmp(&i->multi, imr, sizeof(*imr)) == 0) { 
      /* New style additions are reference counted */ 
      if (imr->imr_address.s_addr == 0) { 
        i->count++; 
        err = 0; 
      } 
      goto done; 
    } 
    count++; 
  }
  err = -ENOBUFS; 
  if (iml == NULL || count >= sysctl_igmp_max_memberships) 
    goto done; 


If we arrive at this point, the socket is going to be linked to a new group, so a new entry must be created and added to the list of groups belonging to the socket. The memory was reserved in advance, and we only need to set the correct values in the fields of the structures involved.

  memcpy(&iml->multi,imr, sizeof(*imr)); 
  iml->next = sk->ip_mc_list; 
  iml->count = 1; 
  sk->ip_mc_list = iml; 
  ip_mc_inc_group(in_dev,addr); 
  iml = NULL; 
  err = 0; 
done: 
  rtnl_shunlock(); 
  if (iml) 
    sock_kfree_s(sk, iml, sizeof(*iml)); 
  return err; 
}


The function ip_mc_leave_group() is in charge of leaving a multicast group, and is much simpler than the previous function. It takes the interface address and the group, and searches for them among the entries related to the current socket. Once they have been found, the number of references is decremented, as there is one reference fewer to the group. If the new value is zero, the entry itself is removed from the list and freed.

int ip_mc_leave_group(struct sock *sk, struct ip_mreqn *imr) 
{ 
  struct ip_mc_socklist *iml, **imlp;
  for (imlp=&sk->ip_mc_list;(iml=*imlp)!=NULL; imlp=&iml->next) { 
    if (iml->multi.imr_multiaddr.s_addr==imr->imr_multiaddr.s_addr 
     && iml->multi.imr_address.s_addr==imr->imr_address.s_addr &&
     (!imr->imr_ifindex || iml->multi.imr_ifindex==imr->imr_ifindex)) { 
      struct in_device *in_dev; 
      if (--iml->count) 
        return 0; 

      *imlp = iml->next; 
      synchronize_bh();

      in_dev = inetdev_by_index(iml->multi.imr_ifindex); 
      if (in_dev) 
        ip_mc_dec_group(in_dev, imr->imr_multiaddr.s_addr); 
      sock_kfree_s(sk, iml, sizeof(*iml)); 
      return 0; 
    } 
  } 
  return -EADDRNOTAVAIL; 
}
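
The reference-counting behaviour of these two kernel functions can be reproduced in user space with a small model. This is a simplified sketch, not kernel code: the names membership, model_join() and model_leave() are invented, the interface index and the sysctl_igmp_max_memberships limit are left out, and malloc() stands in for sock_kmalloc().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* One (group, interface) pair attached to a socket. */
struct membership {
    uint32_t group;            /* multicast group address */
    uint32_t iface;            /* interface address, 0 = INADDR_ANY */
    int count;                 /* number of joins on this entry */
    struct membership *next;
};

/* Model of the join logic: 0 on success, negative errno on failure. */
int model_join(struct membership **list, uint32_t group, uint32_t iface)
{
    struct membership *m;

    assert(list != NULL);
    for (m = *list; m != NULL; m = m->next) {
        if (m->group == group && m->iface == iface) {
            if (iface != 0)
                return -EADDRINUSE;  /* duplicate join on a fixed interface */
            m->count++;              /* INADDR_ANY joins are counted */
            return 0;
        }
    }

    m = malloc(sizeof(*m));
    if (m == NULL)
        return -ENOBUFS;
    m->group = group;
    m->iface = iface;
    m->count = 1;
    m->next = *list;                 /* link at the head of the list */
    *list = m;
    return 0;
}

/* Model of the leave logic: 0 on success, -EADDRNOTAVAIL if not joined. */
int model_leave(struct membership **list, uint32_t group, uint32_t iface)
{
    struct membership **mp, *m;

    assert(list != NULL);
    for (mp = list; (m = *mp) != NULL; mp = &m->next) {
        if (m->group == group && m->iface == iface) {
            if (--m->count)
                return 0;            /* still referenced */
            *mp = m->next;           /* unlink and free the entry */
            free(m);
            return 0;
        }
    }
    return -EADDRNOTAVAIL;
}
```

Joining the same group twice through INADDR_ANY therefore requires two leave calls before the entry disappears, while a third leave fails with EADDRNOTAVAIL.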


The other multicast options listed above are very simple, because they just set some values in the data fields of the internal structure associated with the socket we are working with. These assignments are performed directly by the function ip_setsockopt().