
An excellent introduction to threat hunting in Splunk was posted recently. The information in Sysmon EID 1 and Windows EID 4688 process execution events is invaluable for this task.

Depending on your environment, however, you might find these searches frustratingly slow, especially if you are trying to look at a large time window. You may also have noticed that although these logs concern the same underlying event, you are using two different searches to find the same thing. Is there anything we can do to improve our searches? Spoiler: yes!

One of the biggest advantages Splunk grants is in the way it turns the traditional model of indexing SIEM events on its head. Instead of parsing all the fields from every event as they arrive for insertion into a behemoth of a SQL database, they decided it was far more efficient to just sort them by the originating host, source type, and time, and extract everything else on the fly when you search.

It’s a superb model, but it does come with some drawbacks. Some searches that might have been fast in a database are not so rapid here. Again, because there is no database, you are not constrained to predefined fields set by the SIEM vendor – but there is nothing to keep fields with similar data having the same name, so every type of data has its own naming conventions. Putting together a search that covers three different sources for similar data can mean having to know three different field names, event codes specific to the products… it can get to be quite a hassle!

The answer to these problems is datamodels, and in particular, Splunk’s Common Information Model (CIM). Datamodels allow you to define a schema to address similar events across diverse sources. For example, instead of searching

    index=wineventlog EventCode=4688 New_Process_Name="*powershell.exe"

and

    index=sysmon EventCode=1 Image=*powershell.exe

separately, you can search the Endpoint.Processes datamodel for process_name=powershell.exe and get results for both.

The CIM is a set of predefined datamodels for, as the name implies, types of information that are common. Once you have defined a datamodel and mapped a sourcetype to it, you can “accelerate” it, which generates indexes of the fields in the model. This process carries a storage, CPU and RAM cost and is not on by default, so you need to understand the implications before enabling it.

Let’s take this example, based on the fourth search in that blog post. In my (very limited) data set, according to the Job Inspector, it took 10.232 seconds to search 30 days’ worth of data.

[Figure: Splunk Job Inspector showing search time and cost breakdown]

That’s not so bad, but I only have a few thousand events here, and you might be searching millions, or tens of millions – or more! What happens if we try searching an accelerated datamodel instead? Is there much of a difference?

[Figure: Splunk Job Inspector information for accelerated datamodel search]

This search returned in 0.038 seconds; that’s nearly 270x faster! What sorcery is this? Well, the command used was:

    | tstats summariesonly=true count from datamodel=Endpoint.Processes where Processes.process_name=powershell.exe by Processes.process_path Processes.process

What’s going on in this command? First of all, instead of going to a Splunk index and running all events that match the time range through filters to find “*powershell.exe”, my tstats command is telling Splunk to search just the tsidx files – the accelerated indexes mentioned earlier – related to the Endpoint datamodel. Part of the indexing operation has broken out the process name into a separate field, so we can search for an explicit name rather than wildcarding the path.

The statistics argument count and the by clause work similarly to the traditional stats command, but you will note that the search specifies Processes.process_name. A quirk of the structure of data models means that where you are searching a subset of a datamodel (a dataset in Splunk parlance), you need to specify your search in the form

    datamodel=DatamodelName.DatasetName where DatasetName.field_name=somevalue by DatasetName.field2_name DatasetName.field3_name

The DatasetName components are not always needed – it depends whether you’re searching fields that are part of the root datamodel or not (it took me ages to get the hang of this so please don’t feel stupid if you’re struggling with it).

Just as in the Hurricane Labs blog, you can filter and manipulate tstats output with the same operations:

    | tstats summariesonly=true count from datamodel=Endpoint.Processes where Processes.process_name=powershell.exe NOT Processes.parent_process_name IN ("code.exe", "officeclicktorun.exe") by Processes.process_path Processes.process | `drop_dm_object_name("Processes")`

You can filter on any of the fields present in the data model, and also by time, and by the original index and sourcetype.
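
To make the filtering options concrete, here is a rough sketch of a tstats search against the Endpoint.Processes datamodel that narrows by time and by the original index, then buckets the results per day. It is an illustration rather than a tuned hunt: the index name sysmon is borrowed from the raw search example and may differ in your environment, the seven-day window is arbitrary, and it assumes the CIM field Processes.dest (the host the process ran on) is populated by your data.

```spl
| tstats summariesonly=true count from datamodel=Endpoint.Processes
    where Processes.process_name=powershell.exe
      index=sysmon
      earliest=-7d@d latest=now
    by _time span=1d Processes.dest
| `drop_dm_object_name("Processes")`
| sort - count
```

The index and time terms sit in the same where clause as the datamodel fields but take no dataset prefix, because they describe the original events rather than fields defined in the model. After the drop_dm_object_name macro runs, Processes.dest is simply dest, so anything further down the pipeline can use the short name.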
