If your enterprise is using Office 365, your users are generating log entries every time they log in, upload files to OneDrive, or send an email; the logging is pretty extensive! You can analyze these log events in the O365 console, but wouldn't it be nice to pull them into Gravwell and correlate them with other data sources? Thanks to the new Office 365 ingester, you can.
In this blog post we'll talk about the technical challenges involved in pulling logs from Office 365, then turn to less depressing topics for a quick description of the new ingester. If you are new to Gravwell or do not have an active Gravwell installation, request a Free Trial and then view the Quick Start Guide to get started once you have received your license.
The O365 Management API
Not to put too fine a point on it, the Office 365 Management API is a pain if you want to fetch all events as they come in.
The REST API deals in "content blobs", which are collections of events. In order to start reading these content blobs, you must create a subscription for the desired content type. With the subscription created, content blobs will start to show up after an indeterminate delay; officially, within 24 hours. That is a strikingly long time for a Microsoft Azure product to start gathering logs from other Microsoft Azure products, but speed has never been a Microsoft strong point!
To actually fetch events, you request a list of all available content in a given time range. You'll get back a potentially paginated list of content blobs. Each content blob must then be fetched with a separate API call, and each contains one or more events. The trick is that the same event may appear in multiple distinct content blobs!
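To make the listing step concrete, here is a minimal Go sketch of walking a paginated content listing. It is not the ingester's actual code: the ContentBlob, Page, and PageFetcher types and the ListAllContent function are hypothetical names invented for illustration, and the HTTP call is hidden behind a function value so the sketch runs without talking to Office 365.

```go
package main

import (
	"errors"
	"fmt"
)

// ContentBlob is a hypothetical, trimmed-down view of one entry in the
// "available content" listing. Field names are assumptions for this sketch.
type ContentBlob struct {
	ID  string // identifier of the blob
	URI string // where the blob's events would be fetched from
}

// Page is one page of the listing. Next is empty on the last page.
type Page struct {
	Blobs []ContentBlob
	Next  string
}

// PageFetcher stands in for the real HTTP request that returns one page.
type PageFetcher func(uri string) (Page, error)

// ListAllContent follows the chain of pages starting at first and returns
// every blob it finds. It refuses to visit the same page twice.
func ListAllContent(first string, fetch PageFetcher) ([]ContentBlob, error) {
	var all []ContentBlob
	visited := map[string]bool{}
	for uri := first; uri != ""; {
		if visited[uri] {
			return nil, fmt.Errorf("pagination loop at %q", uri)
		}
		visited[uri] = true
		page, err := fetch(uri)
		if err != nil {
			return nil, fmt.Errorf("listing %q: %w", uri, err)
		}
		all = append(all, page.Blobs...)
		uri = page.Next
	}
	return all, nil
}

// fakeFetcher serves pages from memory so the sketch is runnable offline.
func fakeFetcher(pages map[string]Page) PageFetcher {
	return func(uri string) (Page, error) {
		p, ok := pages[uri]
		if !ok {
			return Page{}, errors.New("no such page")
		}
		return p, nil
	}
}

func main() {
	pages := map[string]Page{
		"page1": {Blobs: []ContentBlob{{ID: "a", URI: "blob/a"}, {ID: "b", URI: "blob/b"}}, Next: "page2"},
		"page2": {Blobs: []ContentBlob{{ID: "c", URI: "blob/c"}}},
	}
	blobs, err := ListAllContent("page1", fakeFetcher(pages))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, b := range blobs {
		fmt.Println(b.ID, b.URI) // each blob would then need its own fetch
	}
}
```

In a real client, the fetcher would issue the authenticated request and work out the next page from however the API signals it; every blob returned here still costs one more API call to retrieve its events.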
Luckily, we were able to find some code written by Chris Hendricks (github.com/counteractive/o365beat) which takes some of the hassle out of dealing with the API itself. We broke this out into a standalone Go library which we used to implement the ingester.
Because the logs API doesn't sequence events (you can't say "give me everything after event 0xFF0A"), we keep track of every content blob and every event we see. If we see a new content blob in the list of available content, we fetch it, then walk every event in it and make sure it has not already been seen before we actually ingest it. This means maintaining a fairly complex state file, but it prevents duplication.
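The bookkeeping described above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the ingester's real state handling: the Event, Blob, and Tracker types are hypothetical, every event is assumed to carry a unique ID, and the state lives in memory instead of in a state file.

```go
package main

import "fmt"

// Event is a hypothetical event with a unique ID (an assumption of this sketch).
type Event struct {
	ID   string
	Body string
}

// Blob is a hypothetical content blob holding one or more events.
type Blob struct {
	ID     string
	Events []Event
}

// Tracker remembers every blob ID and event ID it has been shown.
type Tracker struct {
	seenBlobs  map[string]bool
	seenEvents map[string]bool
}

// NewTracker returns an empty Tracker.
func NewTracker() *Tracker {
	return &Tracker{
		seenBlobs:  map[string]bool{},
		seenEvents: map[string]bool{},
	}
}

// Filter returns only the events in b that have not been seen before and
// records them as seen. A blob that was already processed yields nothing.
func (t *Tracker) Filter(b Blob) []Event {
	if t.seenBlobs[b.ID] {
		return nil
	}
	t.seenBlobs[b.ID] = true
	var fresh []Event
	for _, ev := range b.Events {
		if t.seenEvents[ev.ID] {
			continue // same event arrived earlier in a different blob
		}
		t.seenEvents[ev.ID] = true
		fresh = append(fresh, ev)
	}
	return fresh
}

func main() {
	tr := NewTracker()
	first := Blob{ID: "blob-1", Events: []Event{{ID: "e1"}, {ID: "e2"}}}
	second := Blob{ID: "blob-2", Events: []Event{{ID: "e2"}, {ID: "e3"}}}

	fmt.Println(len(tr.Filter(first)))  // events to ingest from blob-1
	fmt.Println(len(tr.Filter(second))) // e2 is skipped as a duplicate
	fmt.Println(len(tr.Filter(first)))  // blob-1 was already processed
}
```

A production version would also need to persist both maps across restarts and age out old entries, which is where the complexity of the state file comes from.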
The O365 Ingester
The ingester is available on our downloads page or in the Gravwell Debian package repository. Installation is straightforward: if you are installing on the same machine as the Gravwell indexer, it will automatically detect appropriate settings; otherwise, you'll be prompted for the indexer address and the ingest auth secret. Once installed, you'll need to edit /opt/gravwell/etc/o365_ingest.conf to set up authentication with Office 365.
The ingester needs an Azure Active Directory application registration in order to access the logs. Luckily, Microsoft provides directions for registering an application with appropriate permissions. Once the application is registered, the portal will display an Application ID and a Directory ID; copy these into the Client-ID and Directory-ID fields in o365_ingest.conf. Click the 'Certificates and Secrets' menu to create a client secret, which should be pasted into the Client-Secret field. Finally, set Tenant-Domain to your Azure domain, e.g. "mycorp.onmicrosoft.com".
The default config file defines ContentType blocks to fetch all available O365 log types. If you wish to exclude, say, Exchange logs, comment out the block with Content-Type="Audit.Exchange".
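For orientation, the relevant parts of o365_ingest.conf might look something like the fragment below. Treat it as a sketch, not a copy of the shipped file: the field names and the Audit.Exchange value come from the text above, but the [Global] section, the "exchange" block label, and the angle-bracket placeholders are assumptions, and other settings are omitted.

```
# Fragment of /opt/gravwell/etc/o365_ingest.conf (other settings omitted).
# ASSUMPTION: the four authentication fields live in a [Global] section.
[Global]
Client-ID=<Application ID from the portal>
Directory-ID=<Directory ID from the portal>
Client-Secret=<client secret you created>
Tenant-Domain=mycorp.onmicrosoft.com

# ASSUMPTION: the block label "exchange" is a placeholder name.
# Commenting out a block like this one excludes Exchange logs:
#[ContentType "exchange"]
#	Content-Type="Audit.Exchange"
```

Check the default file that ships with the ingester for the exact section and block names before editing.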
Once configured and started (systemctl start gravwell_o365_ingester), the ingester will create subscriptions to all the requested content types and start polling for events. Note that it may take several hours (officially, up to 24) before events begin to arrive, due to the way O365 event subscriptions work! However, once events start rolling in, you'll be able to start building dashboards around them and fusing them with your other data sources.
(Please forgive the blanking out of some fields in the image above; we prefer to play it safe and obfuscate IDs when possible, even if they may be innocuous.)
Summary
The Office 365 Log Ingester is new and experimental. We hope that by getting it out in front of a wide group of users, we can start to gather feedback and tweak it; Azure is a new thing for us, and we still have plenty to learn! If you'd like to set up the ingester with your Gravwell system, email support@gravwell.io with any questions you may have. If you haven't set up Gravwell yet, well, try out Gravwell for free! Our Trial version includes unlimited ingestion, retention, ingester endpoints, searches, users, and more. The power to extract insights from your data is just an 'apt install' away.
John's been writing Go since before it was cool and developing distributed systems for almost as long.