Gravwell has officially supported Netflow v5 and IPFIX for some time. As of Gravwell 3.3.3, we're happy to announce that we now support Netflow v9 as well! This post will talk about the essential differences between Netflow v9 and IPFIX, how we implemented support, and how to get up and running with Netflow v9 ingest. We'll also talk about some pretty serious efficiency improvements we made in our IPFIX/Netflow v9 parsing module.

The Netflow v9 Protocol

Netflow v5 has been around for a long time, but over the years it's begun to show its age. For starters, NFv5 only supports IPv4 addresses! It also has a single format for flow records; if you need to know anything beyond what's in the format, you're out of luck.

Cisco developed Netflow v9 to address this. In RFC 3954, they described a large set of fields (denoted by a numeric ID) including IPv4 and IPv6 source and destination addresses, a variety of byte and packet counters, routing information, MPLS info, MAC addresses, and more. Rather than shipping every field with every flow record--which would take a LOT of bytes--they implemented a template system.

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Template ID 256        |          Field Count          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Field Type 1          |        Field Length 1         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Field Type 2          |        Field Length 2         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              ...              |              ...              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Field Type N          |        Field Length N         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Netflow v9 template definition (RFC 3954)

The switch (or other monitoring system) determines which fields it wants to export for flows. It then defines a template, identified by a template ID number, which describes that list of desired fields. For example, template #257 might contain IPv4 source and destination addresses, source ports, etc., while template #258 defines IPv6 address fields. Flow records then need only specify the appropriate template ID at the start of the record; if a flow record begins with "258", it'll be parsed as an IPv6 record in this example.
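To make the template layout concrete, here is a minimal Go sketch that decodes a single template record laid out as in the diagram above. The names `FieldSpec`, `Template`, and `parseTemplate` are made up for illustration and are not taken from Gravwell's library; the sketch also assumes the caller has already stripped the enclosing set header.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// FieldSpec is one (type, length) pair from a template record.
type FieldSpec struct {
	Type   uint16
	Length uint16
}

// Template is a decoded Netflow v9 template record (hypothetical type).
type Template struct {
	ID     uint16
	Fields []FieldSpec
}

// parseTemplate decodes the template ID, the field count, and then
// that many (type, length) pairs. All values are big-endian 16-bit.
func parseTemplate(b []byte) (Template, error) {
	if len(b) < 4 {
		return Template{}, errors.New("template record too short")
	}
	t := Template{ID: binary.BigEndian.Uint16(b[0:2])}
	count := int(binary.BigEndian.Uint16(b[2:4]))
	if len(b) < 4+4*count {
		return Template{}, errors.New("template record truncated")
	}
	for i := 0; i < count; i++ {
		off := 4 + 4*i
		t.Fields = append(t.Fields, FieldSpec{
			Type:   binary.BigEndian.Uint16(b[off : off+2]),
			Length: binary.BigEndian.Uint16(b[off+2 : off+4]),
		})
	}
	return t, nil
}

func main() {
	// Template 257 with two fields: IPV4_SRC_ADDR (type 8, 4 bytes)
	// and L4_SRC_PORT (type 7, 2 bytes).
	raw := []byte{0x01, 0x01, 0x00, 0x02, 0x00, 0x08, 0x00, 0x04, 0x00, 0x07, 0x00, 0x02}
	t, err := parseTemplate(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("template %d has %d fields\n", t.ID, len(t.Fields))
}
```

A data record that references template 257 would then be read as 4 bytes of address followed by 2 bytes of port, with no per-record field labels on the wire.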

The important thing to understand about Netflow v9 is that it is stateful, but it is frequently sent over a stateless protocol. When the switch first begins sending Netflow, it sends a set of templates, then begins sending flow records which refer to those templates. Because Netflow is transported over UDP, the switch can't know exactly where the Netflow collector started consuming records, so it periodically re-sends the template definitions. This means that if you simply start listening for Netflow v9 traffic, you may receive dozens of Netflow records for which you have no templates! And, since switches are allowed to re-define templates, you cannot safely apply a freshly-received template definition to earlier messages, even if the template ID is the same.
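One way a collector can cope with this statefulness is a small template cache. The sketch below is an assumption about how such a cache might look, not a description of Gravwell's code: `TemplateCache`, `templateKey`, and the choice to key on exporter address plus Source ID are all hypothetical. Records whose template is unknown are simply skipped, and a re-defined template only affects records decoded after it arrives.

```go
package main

import "fmt"

// templateKey scopes a template to the exporter that defined it.
type templateKey struct {
	Exporter   string // exporter address as seen on the UDP socket
	SourceID   uint32 // Source ID from the packet header
	TemplateID uint16
}

// TemplateCache remembers the most recent definition of each template.
type TemplateCache struct {
	lengths map[templateKey][]uint16
}

func NewTemplateCache() *TemplateCache {
	return &TemplateCache{lengths: make(map[templateKey][]uint16)}
}

// Learn stores or replaces a template. It is never applied to
// records that were received before the definition.
func (c *TemplateCache) Learn(k templateKey, fieldLengths []uint16) {
	c.lengths[k] = append([]uint16(nil), fieldLengths...)
}

// RecordLength returns the size in bytes of one data record for the
// template, or false if no definition has been seen yet.
func (c *TemplateCache) RecordLength(k templateKey) (int, bool) {
	fl, ok := c.lengths[k]
	if !ok {
		return 0, false
	}
	total := 0
	for _, l := range fl {
		total += int(l)
	}
	return total, true
}

func main() {
	cache := NewTemplateCache()
	k := templateKey{Exporter: "192.0.2.1:2055", SourceID: 1, TemplateID: 257}

	// Data arrives before the template: nothing we can do but skip it.
	if _, ok := cache.RecordLength(k); !ok {
		fmt.Println("no template yet, skipping data records")
	}

	// The exporter eventually re-sends its templates.
	cache.Learn(k, []uint16{4, 4, 2, 2, 1})
	n, _ := cache.RecordLength(k)
	fmt.Println("record length in bytes:", n)
}
```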

The IPFIX Protocol

Although Netflow v9 defined many new field types, it only officially supported the field types Cisco listed in the specification. There was no mechanism for defining custom fields. If you made a firewall, you might want to send flow records reporting the number of blocked connections, but there were no fields defined for this purpose. IPFIX, specified in RFC 7011, was designed by the IETF to take the best parts of Netflow v9 and expand them for greater flexibility. IPFIX messages are very similar to Netflow v9, with a few differences we'll discuss later.

In addition to the numeric field IDs used by Netflow v9, IPFIX introduced the enterprise ID. The enterprise ID specifies a "namespace" for the field ID numbers. IANA maintains a registry of default fields, which encompasses the predefined Netflow v9 fields and may be considered enterprise ID 0. By specifying a non-zero enterprise ID, you can include arbitrary fields in your IPFIX templates. This means the hypothetical firewall appliance we mentioned earlier could define its own set of fields, provided it uses an enterprise ID to distinguish them from the default fields.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|E|  Information Element ident. |         Field Length          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Enterprise Number                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

IPFIX Field Specifier Format (RFC 7011)

Where Netflow v9 templates just include a field type and a field length, IPFIX templates use a field type ("information element identifier") and a field length, plus an enterprise number that is present ONLY if the first bit ('E') is set.
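The conditional enterprise number is easy to get wrong, so here is a short Go sketch of reading one IPFIX field specifier. `FieldSpecifier` and `parseFieldSpecifier` are illustrative names, not the library's API; the function also returns how many bytes it consumed, since that is 4 or 8 depending on the 'E' bit.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// FieldSpecifier is one entry of an IPFIX template (hypothetical type).
type FieldSpecifier struct {
	ID         uint16 // information element identifier, E bit removed
	Length     uint16
	Enterprise uint32 // 0 when the E bit is clear
}

// parseFieldSpecifier decodes one field specifier and reports how
// many bytes it occupied: 4 normally, 8 when the E bit is set.
func parseFieldSpecifier(b []byte) (FieldSpecifier, int, error) {
	if len(b) < 4 {
		return FieldSpecifier{}, 0, errors.New("field specifier too short")
	}
	raw := binary.BigEndian.Uint16(b[0:2])
	fs := FieldSpecifier{
		ID:     raw & 0x7fff, // low 15 bits are the element ID
		Length: binary.BigEndian.Uint16(b[2:4]),
	}
	if raw&0x8000 == 0 {
		return fs, 4, nil
	}
	if len(b) < 8 {
		return FieldSpecifier{}, 0, errors.New("enterprise number missing")
	}
	fs.Enterprise = binary.BigEndian.Uint32(b[4:8])
	return fs, 8, nil
}

func main() {
	// A standard field: sourceIPv4Address (ID 8), 4 bytes long.
	std := []byte{0x00, 0x08, 0x00, 0x04}
	// An enterprise-specific field: ID 1, 4 bytes, made-up enterprise number 12345.
	ent := []byte{0x80, 0x01, 0x00, 0x04, 0x00, 0x00, 0x30, 0x39}

	for _, b := range [][]byte{std, ent} {
		fs, n, err := parseFieldSpecifier(b)
		fmt.Println(fs, n, err)
	}
}
```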

We're Not So Different, You And I

IPFIX and Netflow v9 are not the same protocol. That said, they are ridiculously similar. Let's take a look at the basics.

First, the headers. Both IPFIX and Netflow v9 send templates and flow records in batches. Each batch, a single UDP packet, has a header. Netflow's headers look like this:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Version Number         |             Count             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           sysUpTime                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           UNIX Secs                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Source ID                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Netflow v9 Packet Header (RFC 3954)

The IPFIX header is identical except that it omits the "sysUpTime" field. Yes, you heard right, a later protocol version actually reduced message size! There is also a small difference in the semantics of the "Count" field: in Netflow v9, it lists the number of records in the message, while in IPFIX it lists the length in bytes of the message.

Luckily for us, the first two bytes of the header tell us how to parse. If they're '0x0009', we know it's Netflow v9. If they're '0x000a', it's IPFIX.
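As a rough illustration of version-dependent header handling, the Go sketch below reads either header into one struct. The `Header` type and `parseHeader` function are hypothetical and only loosely modeled on what a parser needs; the field names follow the Netflow v9 diagram above, and the 20-byte and 16-byte header sizes follow from that layout with and without sysUpTime.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Header holds the fields shared by Netflow v9 and IPFIX headers.
type Header struct {
	Version   uint16
	Count     uint16 // v9: number of records; IPFIX: message length in bytes
	SysUpTime uint32 // v9 only; left at zero for IPFIX
	UnixSecs  uint32
	Sequence  uint32
	SourceID  uint32
	Size      int // header length in bytes: 20 for v9, 16 for IPFIX
}

// parseHeader picks the layout based on the version in the first two bytes.
func parseHeader(b []byte) (Header, error) {
	if len(b) < 4 {
		return Header{}, errors.New("message too short")
	}
	h := Header{
		Version: binary.BigEndian.Uint16(b[0:2]),
		Count:   binary.BigEndian.Uint16(b[2:4]),
	}
	switch h.Version {
	case 9:
		h.Size = 20
	case 10:
		h.Size = 16
	default:
		return Header{}, fmt.Errorf("unsupported version %d", h.Version)
	}
	if len(b) < h.Size {
		return Header{}, errors.New("header truncated")
	}
	rest := b[4:h.Size]
	if h.Version == 9 {
		// Only Netflow v9 carries sysUpTime.
		h.SysUpTime = binary.BigEndian.Uint32(rest[0:4])
		rest = rest[4:]
	}
	h.UnixSecs = binary.BigEndian.Uint32(rest[0:4])
	h.Sequence = binary.BigEndian.Uint32(rest[4:8])
	h.SourceID = binary.BigEndian.Uint32(rest[8:12])
	return h, nil
}

func main() {
	v9 := []byte{0, 9, 0, 2, 0, 0, 0, 100, 0x5e, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 1}
	ipfix := []byte{0, 10, 0, 16, 0x5e, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 1}
	for _, msg := range [][]byte{v9, ipfix} {
		h, err := parseHeader(msg)
		fmt.Printf("%+v %v\n", h, err)
	}
}
```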

Both IPFIX and Netflow v9 define three types of records: templates, options templates, and data records. Every record indicates its type in the first two bytes of the record. In Netflow v9, templates have a type of 0, options templates have a type of 1, and data records have a type of 256 or greater, corresponding to a template ID. In IPFIX, templates are type 2, options templates are type 3, and data records again have a type of 256 or greater, corresponding to a template ID.
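The type IDs can be captured in a small lookup. This Go sketch is illustrative only (`SetKind` and `classifySet` are made-up names); it treats any ID from 256 upward as a data record, since lower values are reserved in both protocols.

```go
package main

import "fmt"

// SetKind says how the bytes following a type ID should be interpreted.
type SetKind int

const (
	KindUnknown SetKind = iota
	KindTemplate
	KindOptionsTemplate
	KindData
)

// classifySet maps a header version (9 or 10) and a type ID to a kind.
func classifySet(version, id uint16) SetKind {
	switch {
	case id >= 256:
		return KindData // the ID doubles as the template ID
	case version == 9 && id == 0:
		return KindTemplate
	case version == 9 && id == 1:
		return KindOptionsTemplate
	case version == 10 && id == 2:
		return KindTemplate
	case version == 10 && id == 3:
		return KindOptionsTemplate
	}
	return KindUnknown
}

func main() {
	fmt.Println(classifySet(9, 0) == KindTemplate)
	fmt.Println(classifySet(10, 2) == KindTemplate)
	fmt.Println(classifySet(9, 258) == KindData)
}
```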

Besides the different type IDs, the actual formats of records are compatible, meaning an IPFIX parser will happily consume a Netflow v9 record. The set of pre-defined IPFIX field IDs is a superset of the Netflow v9 field IDs. It's therefore really easy to adapt an IPFIX parser into an IPFIX + Netflow v9 parser: just make the code that reads the header do slightly different things based on the version number, and then handle the different type IDs when parsing records. The rest just carries over!

A Quick Diversion About Field Types

If you look at the predefined fields, found in RFC 3954 for Netflow v9 and in the IANA registry for IPFIX, you'll see that each field defines a type. There are a variety of unsigned integer types, IPv4 and IPv6 addresses, strings, etc. For instance, IPFIX's "octetDeltaCount" field (ID 1) is defined to be an unsigned 64-bit integer, while the corresponding Netflow field "IN_BYTES" is defined as a variable-length integer of 1-10 bytes.

We've found that in practice, the length given in the spec cannot be trusted. The only thing you can trust is the field length given in the template. Most of our testing has been done with the ipt-netflow kernel module (available on GitHub), which can emit Netflow v5 and v9, plus IPFIX. We found that this module will send 32-bit integers for things like the "octetDeltaCount" counter, rather than 64-bit integers as given in the spec. Of course, this means if a flow transfers over 4 GB of data between reporting intervals, the counter will overflow (we've observed this)! The moral: we now only trust the field lengths as reported in the template.
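In code, trusting the template means decoding an unsigned integer from however many bytes the template says the field occupies. The helper below is a sketch with a made-up name (`decodeUint`), not the library's function; it accepts any width from 1 to 8 bytes, so a 32-bit octetDeltaCount decodes just as well as a 64-bit one.

```go
package main

import (
	"errors"
	"fmt"
)

// decodeUint reads a big-endian unsigned integer whose width is
// whatever the template declared, rather than what the spec says.
func decodeUint(b []byte) (uint64, error) {
	if len(b) == 0 || len(b) > 8 {
		return 0, errors.New("unsupported integer width")
	}
	var v uint64
	for _, x := range b {
		v = v<<8 | uint64(x)
	}
	return v, nil
}

func main() {
	// The same counter field as a 4-byte and as an 8-byte value.
	short, _ := decodeUint([]byte{0x00, 0x00, 0x01, 0x00})
	long, _ := decodeUint([]byte{0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00})
	fmt.Println(short, long)
}
```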

Library Changes

Gravwell's IPFIX support comes courtesy of Jakob Borg's library (github). Unfortunately, it seems Jakob no longer has the time (or perhaps the need) to maintain it, so he's archived the repository. As a result, we're now using our own fork (ipfix) exclusively. This has the advantage of much quicker turnarounds when we want to introduce a change!

The base library was quite effective at parsing IPFIX, and we found that with small changes we could also make it read Netflow v9. If you care to investigate, you'll find that parser.go contains a few places where we check the Version field in the message header and behave differently based on the result--this is essentially all it took to get Netflow v9 parsing!

We also imported the set of Netflow v9 field names into the internal dictionary and added functions to look up a Netflow or IPFIX field name given enterprise + field IDs, or vice versa. This makes it easy to walk over a template and list out the fields it contains, for instance.

The biggest change was driven purely by efficiency concerns. The ipfix module in Gravwell originally used a library function which would parse out an IPFIX message into a very verbose structure, attaching human-friendly field names and everything. Unfortunately, this causes a ton of allocations, making the ipfix module one of the slowest modules in the system. We implemented new library code which allows the programmer to register callbacks for certain fields, then walks the IPFIX or Netflow message, calling the callback whenever one of the registered fields is encountered. This has led to a massive speed improvement, making our module on the order of a thousand times faster!
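The callback idea can be sketched in a few lines of Go. To be clear, this is not the API of our fork: `Walker`, `Register`, and `WalkRecord` are hypothetical names, and the sketch ignores IPFIX variable-length fields. It shows the general shape, though: the walker steps through a data record using the template's field lengths and hands registered fields to their callbacks as slices of the original buffer, without building an intermediate structure.

```go
package main

import (
	"errors"
	"fmt"
)

// FieldSpec is one template entry: a field ID and its length in bytes.
type FieldSpec struct {
	ID     uint16
	Length uint16
}

// Callback receives the raw bytes of one field, sliced from the record.
type Callback func(id uint16, value []byte)

// Walker dispatches registered fields to callbacks (hypothetical type).
type Walker struct {
	callbacks map[uint16]Callback
}

func NewWalker() *Walker {
	return &Walker{callbacks: make(map[uint16]Callback)}
}

// Register asks to be called whenever field id appears in a record.
func (w *Walker) Register(id uint16, cb Callback) {
	w.callbacks[id] = cb
}

// WalkRecord steps through one data record according to its template,
// skipping over fields nobody registered for.
func (w *Walker) WalkRecord(tmpl []FieldSpec, rec []byte) error {
	off := 0
	for _, f := range tmpl {
		end := off + int(f.Length)
		if end > len(rec) {
			return errors.New("record shorter than template")
		}
		if cb, ok := w.callbacks[f.ID]; ok {
			cb(f.ID, rec[off:end])
		}
		off = end
	}
	return nil
}

func main() {
	// Template: source address (8), source port (7), protocol (4).
	tmpl := []FieldSpec{{8, 4}, {7, 2}, {4, 1}}
	rec := []byte{10, 0, 0, 1, 0x01, 0xbb, 6}

	w := NewWalker()
	w.Register(7, func(id uint16, v []byte) { fmt.Println("port bytes:", v) })
	w.Register(4, func(id uint16, v []byte) { fmt.Println("protocol bytes:", v) })
	if err := w.WalkRecord(tmpl, rec); err != nil {
		fmt.Println("error:", err)
	}
}
```

Because unregistered fields are skipped with simple offset arithmetic, a query that only needs two fields never pays for decoding the rest of the record.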

Ingesting Netflow v9

"Ok, ok," you're saying, "Enough dry details about RFCs and protocol formats. How do I get Netflow v9 into Gravwell?"

It's super easy! Because Netflow v9 and IPFIX use the same underlying parsing library, starting in Gravwell 3.3.3 we can essentially treat them the same in both the ingester and the query language. Below is an example snippet from netflow_capture.conf which sets up a listener for IPFIX and a listener for Netflow v9; note that Flow-Type=ipfix IS NOT A TYPO--we use the same Flow-Type for both Netflow v9 and IPFIX!

[Collector "ipfix"]
Tag-Name=ipfix
Bind-String="0.0.0.0:6343"
Flow-Type=ipfix

[Collector "netflowv9"]
Tag-Name=v9
Bind-String="0.0.0.0:2055"
Flow-Type=ipfix

When querying, you can use the existing ipfix module to parse Netflow v9 messages. You can use Netflow field names (see RFC 3954) or IPFIX field names (see the IANA registry), or you can mix and match--all three examples below will return the same results!

tag=v9 ipfix PROTOCOL IPV4_SRC_ADDR | table

tag=v9 ipfix protocolIdentifier sourceIPv4Address | table

tag=v9 ipfix PROTOCOL sourceIPv4Address | table

See the ipfix module documentation for more information on syntax, tips and tricks, and examples.

A Bit About Use Cases

It seems like every time we write a blog post, the act of digging around for interesting examples leads us down a rabbit hole. This time, Kris got interested in the flows captured during a game of "PlayerUnknown's Battlegrounds". We'll be posting a follow-up entry soon detailing his investigation: what he did and what he learned.

Conclusion

With Gravwell 3.3.3 adding Netflow v9 support, we've finally bridged the gap between Netflow v5 and IPFIX. Regardless of your particular network hardware, you should now be able to ship flow records directly to Gravwell, where you can meld them with all your *other* data and find out about all the weird stuff that's going on in your network! As always, feel free to email support@gravwell.io if you have questions or would like to schedule a demo, or just drop us a line about any particularly interesting things you've learned using Gravwell. If this all sounds great but you don't have Gravwell yet, you can request our free Trial license and start playing with it immediately.