Sending and processing ARP requests/responses using BPF (updated)

ARP BPF OpenBSD FreeBSD Networking

By Marco W. Soijer — November 2020

The Address Resolution Protocol (ARP) is handled by the operating system. With the command-line tool arp you can view the cache and clear entries, but you can neither trigger a request to refresh an entry or check its validity, nor resolve an IPv4 address without sending anything on the Internet layer (3) or above; even a ping involves ICMP.

So you may want to control the sending of ARP queries from userland and process the incoming responses yourself. The required interface to the link layer on Linux and BSD systems is the Berkeley Packet Filter (BPF). Under BSD, it appears as the device /dev/bpf and can be addressed through normal read and write operations, plus a few ioctl requests. This post describes how, by building an ARP scanner that queries all addresses in the local network. You can find the full C code at the end of the page.

The Berkeley Packet Filter (BPF)

The Berkeley Packet Filter is a pseudo-device, included with basically all Linux distributions and all BSD systems; it is also part of BSD-based macOS. BPF provides a raw interface to link-layer network functions and is mainly used to build firewalls, sniffers and the like. How you interact with BPF differs between systems, however. You can find details in the documentation for OpenBSD, FreeBSD, NetBSD, the Linux kernel, or macOS, although the latter also allows more high-level ARP control.

The BSD variants all seem to share the same /dev/bpf access, so what we do here with FreeBSD can probably be applied to the other BSDs without change (except for the octet names in struct ether_addr, as noted below).

The main process goes as follows: open the BPF device for reading and writing, attach it to the network interface on which you want to ARP scan, write the Ethernet frame with the ARP query, read the ARP responses that come back, clean up and close the device. There are two things that require a bit of additional effort: finding your own protocol (IP) and hardware (MAC) address for the chosen interface — so you can fill the Ethernet frame header correctly — and activating a packet filter, in order to receive only the ARP responses you are interested in and not everything that is on the network. Now you know where the name BPF comes from.

Opening the device

In our arpscan.c, we use the first command line argument argv[1] to pass the name of the interface we want to scan to the main function. So the outermost structure looks like this:

Excerpt from arpscan.c
//...
int main(int argc, char *argv[]) {

    int fd;
//...
    if (argc == 2) {
        if ((fd = open("/dev/bpf", O_RDWR)) >= 0) {
//...
        }
        else
            fprintf(stderr, "%s (open)\n", strerror(errno));
    }
    else
        fprintf(stdout, "Usage:\tarpscan if\n");

    return -1;
}

With the file descriptor, BPF can be bound to the interface passed as argv[1]; the ioctl request to do so is BIOCSETIF, which takes a pointer to a struct ifreq. Furthermore, we need to create a buffer to receive the incoming frames, so we need BPF's buffer length, requested with BIOCGBLEN. We also want BPF to return any ARP frame it receives immediately, which is set with BIOCIMMEDIATE; for this we borrow the int buflen, set to 1, before it receives the buffer size. If binding the interface is successful, it is good to check whether the interface is indeed an Ethernet one (BIOCGDLT). So inside main, we add the following:

Excerpt from arpscan.c
    int buflen;
    int dlt;
    struct ifreq ir;
//...
            strncpy(ir.ifr_name, argv[1], IFNAMSIZ);
            buflen = 1;
            if (ioctl(fd, BIOCSETIF, &ir) != -1
                    && ioctl(fd, BIOCIMMEDIATE, &buflen) != -1
                    && ioctl(fd, BIOCGBLEN, &buflen) != -1) {
                if (ioctl(fd, BIOCGDLT, &dlt) != -1
                        && dlt == DLT_EN10MB) {
//...
                }
                else
                    fprintf(stderr, "Link type unknown or not Ethernet\n");
            }
            else
                fprintf(stderr, "%s (ioctl)\n", strerror(errno));

Setting the packet filter

Now for the most interesting part: setting the filter. The ioctl request for this is BIOCSETF, which takes a pointer to a struct bpf_program, which in turn consists of an integer with the length of the programme, followed by a pointer to an array of instructions (struct bpf_insn *).

The filter language is described with some examples on the OpenBSD BPF manual page. Think machine code, with simple instructions like loading a byte (from the frame) into the accumulator, comparing, jumping (forward only) and returning. Working at the link layer, we need to take care of the whole Ethernet frame, with two exceptions that are added for us upon sending: any required padding, which is needed here because ARP messages fall short of the minimum frame length of 64 octets, and the CRC32-based frame check sequence. So we are left with the 14 octets of the frame header, which contain the destination and source addresses as six-octet hardware addresses each plus the length or type word, and the 28 octets of the ARP message.
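To make these numbers concrete, here is a minimal sketch that mirrors the two structures with hand-written types. The names demo_ether_header, demo_ether_arp, demo_frame, demo_check_layout and demo_padding are hypothetical stand-ins for illustration only; the real program uses struct ether_header and struct ether_arp from the system headers. The sketch assumes that the compiler inserts no padding between the members, which the assertions verify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MIN_FRAME_LEN 64  /* minimum Ethernet frame length, including FCS */
#define DEMO_FCS_LEN        4  /* CRC32-based frame check sequence */

/* Hypothetical stand-in for struct ether_header: 14 octets */
struct demo_ether_header {
    uint8_t  dhost[6];   /* destination hardware address */
    uint8_t  shost[6];   /* source hardware address */
    uint16_t type;       /* length or type word */
};

/* Hypothetical stand-in for struct ether_arp: 28 octets */
struct demo_ether_arp {
    uint16_t hrd;        /* hardware type */
    uint16_t pro;        /* protocol type */
    uint8_t  hln;        /* hardware address length */
    uint8_t  pln;        /* protocol address length */
    uint16_t op;         /* opcode: request or reply */
    uint8_t  sha[6];     /* sender hardware address */
    uint8_t  spa[4];     /* sender protocol address */
    uint8_t  tha[6];     /* target hardware address */
    uint8_t  tpa[4];     /* target protocol address */
};

/* The frame as we write it: header followed by the ARP message */
struct demo_frame {
    struct demo_ether_header eth;
    struct demo_ether_arp    arp;
};

/* Aborts if the layout differs from the numbers quoted in the text */
void demo_check_layout(void) {
    assert(sizeof(struct demo_ether_header) == 14);
    assert(sizeof(struct demo_ether_arp) == 28);
    assert(sizeof(struct demo_frame) == 42);
    assert(offsetof(struct demo_frame, eth.type) == 12);
    assert(offsetof(struct demo_frame, arp.op) == 20);
}

/* Padding octets needed for a frame of framelen octets (without FCS) */
size_t demo_padding(size_t framelen) {
    size_t minlen = DEMO_MIN_FRAME_LEN - DEMO_FCS_LEN;
    return framelen < minlen ? minlen - framelen : 0;
}
```

Calling demo_padding(sizeof(struct demo_frame)) yields the 18 octets of padding that bring the 42-octet frame up to 60 octets, to which the four-octet frame check sequence is appended.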

To have BPF filter out ARP responses, we look at two words:

  • First, within the 14-octet frame header, bytes 12 and 13 (length or type) must equal 0x0806 (ETHERTYPE_ARP), to check that the payload is an ARP message; and
  • second, within the 28-octet ARP message, bytes 6 and 7 (opcode) must equal 2 (ARPOP_REPLY), to ensure that the message is an ARP response.

Thus we add to our main function:

Excerpt from arpscan.c
    // BPF rule
    struct bpf_insn insns[] = {
        // Load half-word at octet 12
        BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 12),
        // If not ETHERTYPE_ARP, skip next 3 (and return nothing)
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ETHERTYPE_ARP, 0, 3),
        // Load half-word at octet 20
        BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 20),
        // If not ARPOP_REPLY, skip next 1 (and return nothing)
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ARPOP_REPLY, 0, 1),
        // Valid ARP reply received, return message
        BPF_STMT(BPF_RET | BPF_K, sizeof(struct ether_arp) + sizeof(struct ether_header)),
        // Return nothing
        BPF_STMT(BPF_RET | BPF_K, 0),
    };
    struct bpf_program filter = {
        sizeof insns / sizeof(insns[0]),
        insns
    };
//...
                    if (ioctl(fd, BIOCSETF, &filter) != -1) {
//...
                    }
                    else
                        fprintf(stderr, "Cannot set BPF rule\n");
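If the jump offsets look cryptic, it may help to read the six instructions as ordinary C. The following sketch is not part of arpscan.c and does not use the BPF headers; demo_filter and demo_load_half are hypothetical names, and the function only mirrors what the filter program above decides for a given frame. It assumes, as I understand the BPF machine, that half-words are read in network byte order and that a load beyond the end of the frame rejects it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ETHERTYPE_ARP 0x0806
#define DEMO_ARPOP_REPLY   2
#define DEMO_SNAPLEN       42   /* 14-octet header + 28-octet ARP message */

/* Read a big-endian half-word at offset off, like BPF_LD | BPF_H | BPF_ABS.
 * Sets *ok to 0 when the half-word lies outside the frame. */
static uint16_t demo_load_half(const uint8_t *frame, size_t len, size_t off, int *ok) {
    if (off + 2 > len) {
        *ok = 0;
        return 0;
    }
    *ok = 1;
    return (uint16_t)((frame[off] << 8) | frame[off + 1]);
}

/* Returns the number of octets to hand to the reader, or 0 to drop the frame */
unsigned int demo_filter(const uint8_t *frame, size_t len) {
    int ok;
    uint16_t acc;

    assert(frame != NULL);

    acc = demo_load_half(frame, len, 12, &ok);   /* instruction 0 */
    if (!ok || acc != DEMO_ETHERTYPE_ARP)        /* instruction 1: on false, skip 3 */
        return 0;                                /* instruction 5 */

    acc = demo_load_half(frame, len, 20, &ok);   /* instruction 2 */
    if (!ok || acc != DEMO_ARPOP_REPLY)          /* instruction 3: on false, skip 1 */
        return 0;                                /* instruction 5 */

    return DEMO_SNAPLEN;                         /* instruction 4 */
}
```

Both conditional jumps have a true offset of 0, so a match simply falls through to the next instruction, while the false offsets of 3 and 1 both land on the final "return nothing".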

Retrieving the interface addresses

Having to create an Ethernet frame ourselves, we need to know the hardware (MAC) and protocol (IP) addresses of the chosen interface. While we are at it, we also collect the address mask for the network, so we can determine what range of addresses to scan. The library function getifaddrs() provides a linked list of struct ifaddrs with everything we need; we only need to look for the right chunks.

Excerpt from arpscan.c
// Find own protocol (IP) and hardware (MAC) addresses
// Returns true iff both were found

bool findownaddresses(char *interface, struct ether_addr *ownmac,
        struct sockaddr_in *saip, struct sockaddr_in *samask) {

    struct ifaddrs *ifap, *ifa;
    struct sockaddr_dl *sdl;
    unsigned int success = 0;

    if (!getifaddrs(&ifap)) {

        printf("Self\n");

        for (ifa = ifap; ifa; ifa = ifa->ifa_next) {

            if (!strcmp(ifa->ifa_name, interface)) {

                sdl = (struct sockaddr_dl *)ifa->ifa_addr;

                if (sdl->sdl_family == AF_LINK
                        && sdl->sdl_type == IFT_ETHER
                        && sdl->sdl_alen == ETHER_ADDR_LEN) {
                    memcpy((u_int8_t *)ownmac, (u_int8_t *)LLADDR(sdl), sizeof(struct ether_addr));
                    printf("MAC: %02x:%02x:%02x:%02x:%02x:%02x\n",
                        ownmac->octet[0],  // ownmac->ether_addr_octet[...] on OpenBSD
                        ownmac->octet[1],
                        ownmac->octet[2],
                        ownmac->octet[3],
                        ownmac->octet[4],
                        ownmac->octet[5]);
                    success |= 0x01;
                }
                else if (sdl->sdl_family == AF_INET) {
                    saip->sin_addr.s_addr = ((struct sockaddr_in *)ifa->ifa_addr)->sin_addr.s_addr;
                    samask->sin_addr.s_addr = ((struct sockaddr_in *)ifa->ifa_netmask)->sin_addr.s_addr;
                    printf("%s, ", inet_ntoa(saip->sin_addr));
                    printf("netmask %s\n", inet_ntoa(samask->sin_addr));
                    success |= 0x02;
                }
            }
        }
        freeifaddrs(ifap);
    }
    else
        fprintf(stderr, "%s (getifaddrs)\n", strerror(errno));

    return (success == 0x03);
}

//...
    struct ether_addr ownmac;
    struct sockaddr_in saip;
    struct sockaddr_in samask;
//...
                        if (findownaddresses(argv[1], &ownmac, &saip, &samask)) {
//...
                        else
                            fprintf(stderr, "Missing address for interface\n");
                    }
}

Writing the queries — or: flood them!

Filling the ARP request frame is straightforward. The broadcast destination address ff:ff:ff:ff:ff:ff, our own IP and MAC address, and the opcode and length indicators are the same for all queries:

Excerpt from arpscan.c
// Construct ethernet frame header and ARP request

void prepareframe(struct ether_addr *ownmac, struct sockaddr_in *saip,
        struct ether_header *ethhdr, struct ether_arp *etharp) {

    memset((unsigned char *)&ethhdr->ether_dhost, 0xff, ETHER_ADDR_LEN);
    memcpy((unsigned char *)&ethhdr->ether_shost, (unsigned char *)ownmac, ETHER_ADDR_LEN);
    ethhdr->ether_type = htons(ETHERTYPE_ARP);

    etharp->arp_hrd = htons(ARPHRD_ETHER);
    etharp->arp_pro = htons(ETHERTYPE_IP);
    etharp->arp_hln = ETHER_ADDR_LEN;
    etharp->arp_pln = 4;
    etharp->arp_op = htons(ARPOP_REQUEST);
    memcpy((u_int8_t *)etharp->arp_sha, (u_int8_t *)ownmac, sizeof(struct ether_addr));
    memcpy((u_int8_t *)etharp->arp_spa, (u_int8_t *)&(saip->sin_addr.s_addr), 4*sizeof(u_int8_t));
    memset((u_int8_t *)etharp->arp_tha, 0, ETHER_ADDR_LEN);

    return;
}

The structures ether_header and ether_arp are defined in the calling function, which also loops through the protocol addresses to query for. The range covers everything between the network address (our own address ANDed with the network mask) and the local broadcast address, both excluded, less our own address. For a /24 network, this leaves 253 addresses to query for. You may not want to apply this arpscan as it is for larger networks…

Excerpt from arpscan.c
// Write ARP request to all unicast addresses in the network
// (all less network, broadcast, and self)

void writequeries(int fd, struct ether_addr *ownmac, struct sockaddr_in *saip, struct sockaddr_in *samask) {

    unsigned char msg[sizeof(struct ether_header) + sizeof(struct ether_arp)];
    struct ether_header *ethhdr;
    struct ether_arp *etharp;
    struct sockaddr_in sat;
    uint32_t addr, addrnw, addrbc, addrown;
    int len, addlen;

    ethhdr = (struct ether_header *)msg;
    etharp = (struct ether_arp *)(ethhdr + 1);

    prepareframe(ownmac, saip, ethhdr, etharp);

    addrown = ntohl(saip->sin_addr.s_addr);
    addrnw = ntohl(saip->sin_addr.s_addr & samask->sin_addr.s_addr);
    addrbc = addrnw + (0xffffffff - ntohl(samask->sin_addr.s_addr));

    printf("\nWho has? for %u IP addresses\n", addrbc - addrnw - 2);

    for (addr = addrnw + 1; addr < addrbc; addr++) {
        if (addr != addrown) {
            sat.sin_addr.s_addr = htonl(addr);
            memcpy((u_int8_t *)etharp->arp_tpa, (u_int8_t *)&(sat.sin_addr.s_addr), 4*sizeof(u_int8_t));
            len = 0;
            while (len < (int)(sizeof(struct ether_header) + sizeof(struct ether_arp))
                    && (addlen = write(fd, (char *)ethhdr + len,
                        sizeof(struct ether_header) + sizeof(struct ether_arp) - len)) >= 0)
                len += addlen;
        }
    }

    return;
}

Did I mention, you may not want to apply this on any network that is not yours? Harmless as such scans may be, people may not like it.

Receiving the responses — or: reel them in!

All we need to do now is listen to BPF and print out the hardware and protocol addresses that come in. ARP is a simple, connectionless protocol that only works on the local network, so answers arrive quickly. What is not there within half a second (actually a lot less) will not arrive at all. So we stop polling and receiving when no further reply has come in for 500 milliseconds:

Excerpt from arpscan.c
// Collect filter outputs until no response for half a second

void collectresponses(int fd, int buflen) {

    unsigned char *buffer;
    struct bpf_hdr *bpf;
    int len;
    struct timeval timeout;

    if ((buffer = (unsigned char *)malloc(buflen))) {

        bpf = (struct bpf_hdr *)buffer;

        timeout.tv_sec = 0;
        timeout.tv_usec = 500000;

        if (ioctl(fd, BIOCSRTIMEOUT, &timeout) != -1) {
            while ((len = read(fd, buffer, buflen)) > 0)
                if (len >= (int)sizeof(struct bpf_hdr)
                        && len >= bpf->bh_hdrlen + 0x2a
                        && buffer[bpf->bh_hdrlen + 0x12] == 0x06
                        && buffer[bpf->bh_hdrlen + 0x13] == 0x04)
                    printf("\r%s is at %02x:%02x:%02x:%02x:%02x:%02x\n",
                        inet_ntoa(*(struct in_addr *)(buffer + bpf->bh_hdrlen + 0x1c)),
                        buffer[bpf->bh_hdrlen + 0x16],
                        buffer[bpf->bh_hdrlen + 0x17],
                        buffer[bpf->bh_hdrlen + 0x18],
                        buffer[bpf->bh_hdrlen + 0x19],
                        buffer[bpf->bh_hdrlen + 0x1a],
                        buffer[bpf->bh_hdrlen + 0x1b]);
        }

        free(buffer);
    }

    return;
}
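The hexadecimal offsets in collectresponses are positions within the Ethernet frame, counted from the end of the BPF header. As a reading aid, the sketch below names them and applies the same checks to a bare frame, that is, to the bytes that start at buffer + bpf->bh_hdrlen. The function demo_parse_reply and the DEMO_ constants are hypothetical and not part of arpscan.c; the sketch also copies the address with memcpy and uses inet_ntop instead of the pointer cast and inet_ntoa, which is my own choice to sidestep alignment concerns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

enum {
    DEMO_OFF_HLN   = 0x12,  /* hardware address length, must be 6 */
    DEMO_OFF_PLN   = 0x13,  /* protocol address length, must be 4 */
    DEMO_OFF_SHA   = 0x16,  /* sender hardware (MAC) address, 6 octets */
    DEMO_OFF_SPA   = 0x1c,  /* sender protocol (IP) address, 4 octets */
    DEMO_FRAME_LEN = 0x2a   /* 14-octet header + 28-octet ARP message */
};

/* Extracts the sender's MAC and IP address from an Ethernet/ARP frame.
 * ip should have room for INET_ADDRSTRLEN characters.
 * Returns 1 on success, 0 if the frame is too short or not Ethernet/IPv4. */
int demo_parse_reply(const uint8_t *frame, size_t len,
        uint8_t mac[6], char *ip, size_t iplen) {

    struct in_addr spa;

    assert(frame != NULL && mac != NULL && ip != NULL);

    if (len < DEMO_FRAME_LEN
            || frame[DEMO_OFF_HLN] != 6
            || frame[DEMO_OFF_PLN] != 4)
        return 0;

    memcpy(mac, frame + DEMO_OFF_SHA, 6);
    memcpy(&spa, frame + DEMO_OFF_SPA, sizeof spa);  /* already in network order */

    return inet_ntop(AF_INET, &spa, ip, (socklen_t)iplen) != NULL;
}
```

The offsets follow from the layout: the ARP message starts at octet 14 (0x0e), so the two length fields at ARP offsets 4 and 5 end up at 0x12 and 0x13, the sender hardware address at ARP offset 8 ends up at 0x16, and the sender protocol address at ARP offset 14 ends up at 0x1c.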

You can now almost piece together all of arpscan.c. The only thing that is missing, apart from the includes, is the core of our main:

Excerpt from arpscan.c
int main(int argc, char *argv[]) {
//...
                            writequeries(fd, &ownmac, &saip, &samask);
                            collectresponses(fd, buflen);
                            exit(0);
//...
}

There you go. Address resolution under full control from userland.

Full code

The following C99 source was developed on FreeBSD 12.2 (patched through November 2020), compiled with clang, and run on an amd64 system with Intel NICs.

arpscan.c
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>

#include <fcntl.h>
#include <poll.h>
#include <ifaddrs.h>
#include <arpa/inet.h>
#include <net/bpf.h>
#include <net/if.h>
#include <net/if_dl.h>
#include <net/if_types.h>
#include <netinet/in.h>
#include <netinet/if_ether.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <sys/time.h>


// Find own protocol (IP) and hardware (MAC) addresses
// Returns true iff both were found

bool findownaddresses(char *interface, struct ether_addr *ownmac,
        struct sockaddr_in *saip, struct sockaddr_in *samask) {

    struct ifaddrs *ifap, *ifa;
    struct sockaddr_dl *sdl;
    unsigned int success = 0;

    if (!getifaddrs(&ifap)) {

        printf("Self\n");

        for (ifa = ifap; ifa; ifa = ifa->ifa_next) {

            if (!strcmp(ifa->ifa_name, interface)) {

                sdl = (struct sockaddr_dl *)ifa->ifa_addr;

                if (sdl->sdl_family == AF_LINK
                        && sdl->sdl_type == IFT_ETHER
                        && sdl->sdl_alen == ETHER_ADDR_LEN) {
                    memcpy((u_int8_t *)ownmac, (u_int8_t *)LLADDR(sdl), sizeof(struct ether_addr));
                    printf("MAC: %02x:%02x:%02x:%02x:%02x:%02x\n",
                        ownmac->octet[0],  // ownmac->ether_addr_octet[...] on OpenBSD
                        ownmac->octet[1],
                        ownmac->octet[2],
                        ownmac->octet[3],
                        ownmac->octet[4],
                        ownmac->octet[5]);
                    success |= 0x01;
                }
                else if (sdl->sdl_family == AF_INET) {
                    saip->sin_addr.s_addr = ((struct sockaddr_in *)ifa->ifa_addr)->sin_addr.s_addr;
                    samask->sin_addr.s_addr = ((struct sockaddr_in *)ifa->ifa_netmask)->sin_addr.s_addr;
                    printf("%s, ", inet_ntoa(saip->sin_addr));
                    printf("netmask %s\n", inet_ntoa(samask->sin_addr));
                    success |= 0x02;
                }
            }
        }
        freeifaddrs(ifap);
    }
    else
        fprintf(stderr, "%s (getifaddrs)\n", strerror(errno));

    return (success == 0x03);
}


// Construct ethernet frame header and ARP request

void prepareframe(struct ether_addr *ownmac, struct sockaddr_in *saip,
        struct ether_header *ethhdr, struct ether_arp *etharp) {

    memset((unsigned char *)&ethhdr->ether_dhost, 0xff, ETHER_ADDR_LEN);
    memcpy((unsigned char *)&ethhdr->ether_shost, (unsigned char *)ownmac, ETHER_ADDR_LEN);
    ethhdr->ether_type = htons(ETHERTYPE_ARP);

    etharp->arp_hrd = htons(ARPHRD_ETHER);
    etharp->arp_pro = htons(ETHERTYPE_IP);
    etharp->arp_hln = ETHER_ADDR_LEN;
    etharp->arp_pln = 4;
    etharp->arp_op = htons(ARPOP_REQUEST);
    memcpy((u_int8_t *)etharp->arp_sha, (u_int8_t *)ownmac, sizeof(struct ether_addr));
    memcpy((u_int8_t *)etharp->arp_spa, (u_int8_t *)&(saip->sin_addr.s_addr), 4*sizeof(u_int8_t));
    memset((u_int8_t *)etharp->arp_tha, 0, ETHER_ADDR_LEN);

    return;
}


// Collect filter outputs until no response for half a second

void collectresponses(int fd, int buflen) {

    unsigned char *buffer;
    struct bpf_hdr *bpf;
    int len;
    struct timeval timeout;

    if ((buffer = (unsigned char *)malloc(buflen))) {

        bpf = (struct bpf_hdr *)buffer;

        timeout.tv_sec = 0;
        timeout.tv_usec = 500000;

        if (ioctl(fd, BIOCSRTIMEOUT, &timeout) != -1) {
            while ((len = read(fd, buffer, buflen)) > 0)
                if (len >= (int)sizeof(struct bpf_hdr)
                        && len >= bpf->bh_hdrlen + 0x2a
                        && buffer[bpf->bh_hdrlen + 0x12] == 0x06
                        && buffer[bpf->bh_hdrlen + 0x13] == 0x04)
                    printf("\r%s is at %02x:%02x:%02x:%02x:%02x:%02x\n",
                        inet_ntoa(*(struct in_addr *)(buffer + bpf->bh_hdrlen + 0x1c)),
                        buffer[bpf->bh_hdrlen + 0x16],
                        buffer[bpf->bh_hdrlen + 0x17],
                        buffer[bpf->bh_hdrlen + 0x18],
                        buffer[bpf->bh_hdrlen + 0x19],
                        buffer[bpf->bh_hdrlen + 0x1a],
                        buffer[bpf->bh_hdrlen + 0x1b]);
        }

        free(buffer);
    }

    return;
}


// Write ARP request to all unicast addresses in the network
// (all less network, broadcast, and self)

void writequeries(int fd, struct ether_addr *ownmac, struct sockaddr_in *saip, struct sockaddr_in *samask) {

    unsigned char msg[sizeof(struct ether_header) + sizeof(struct ether_arp)];
    struct ether_header *ethhdr;
    struct ether_arp *etharp;
    struct sockaddr_in sat;
    uint32_t addr, addrnw, addrbc, addrown;
    int len, addlen;

    ethhdr = (struct ether_header *)msg;
    etharp = (struct ether_arp *)(ethhdr + 1);

    prepareframe(ownmac, saip, ethhdr, etharp);

    addrown = ntohl(saip->sin_addr.s_addr);
    addrnw = ntohl(saip->sin_addr.s_addr & samask->sin_addr.s_addr);
    addrbc = addrnw + (0xffffffff - ntohl(samask->sin_addr.s_addr));

    printf("\nWho has? for %u IP addresses\n", addrbc - addrnw - 2);

    for (addr = addrnw + 1; addr < addrbc; addr++) {
        if (addr != addrown) {
            sat.sin_addr.s_addr = htonl(addr);
            memcpy((u_int8_t *)etharp->arp_tpa, (u_int8_t *)&(sat.sin_addr.s_addr), 4*sizeof(u_int8_t));
            len = 0;
            while (len < (int)(sizeof(struct ether_header) + sizeof(struct ether_arp))
                    && (addlen = write(fd, (char *)ethhdr + len,
                        sizeof(struct ether_header) + sizeof(struct ether_arp) - len)) >= 0)
                len += addlen;
        }
    }

    return;
}


int main(int argc, char *argv[]) {

    int fd;
    int buflen;
    int dlt;
    struct ifreq ir;
    struct ether_addr ownmac;
    struct sockaddr_in saip;
    struct sockaddr_in samask;

    // BPF rule
    struct bpf_insn insns[] = {
        // Load half-word at octet 12
        BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 12),
        // If not ETHERTYPE_ARP, skip next 3 (and return nothing)
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ETHERTYPE_ARP, 0, 3),
        // Load half-word at octet 20
        BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 20),
        // If not ARPOP_REPLY, skip next 1 (and return nothing)
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ARPOP_REPLY, 0, 1),
        // Valid ARP reply received, return message
        BPF_STMT(BPF_RET | BPF_K, sizeof(struct ether_arp) + sizeof(struct ether_header)),
        // Return nothing
        BPF_STMT(BPF_RET | BPF_K, 0),
    };
    struct bpf_program filter = {
        sizeof insns / sizeof(insns[0]),
        insns
    };

    if (argc == 2) {
        if ((fd = open("/dev/bpf", O_RDWR)) >= 0) {
            strncpy(ir.ifr_name, argv[1], IFNAMSIZ);
            buflen = 1;
            if (ioctl(fd, BIOCSETIF, &ir) != -1
                    && ioctl(fd, BIOCIMMEDIATE, &buflen) != -1
                    && ioctl(fd, BIOCGBLEN, &buflen) != -1) {
                if (ioctl(fd, BIOCGDLT, &dlt) != -1
                        && dlt == DLT_EN10MB) {
                    if (ioctl(fd, BIOCSETF, &filter) != -1) {

                        if (findownaddresses(argv[1], &ownmac, &saip, &samask)) {
                            writequeries(fd, &ownmac, &saip, &samask);
                            collectresponses(fd, buflen);
                            exit(0);
                        }
                        else
                            fprintf(stderr, "Missing address for interface\n");
                    }
                    else
                        fprintf(stderr, "Cannot set BPF rule\n");
                }
                else
                    fprintf(stderr, "Link type unknown or not Ethernet\n");
            }
            else
                fprintf(stderr, "%s (ioctl)\n", strerror(errno));
        }
        else
            fprintf(stderr, "%s (open)\n", strerror(errno));
    }
    else
        fprintf(stdout, "Usage:\tarpscan if\n");

    return -1;
}