Data sources are a new feature, which will be the base of allowing access to
dynamic data stores in signature writing (currently only available in golang).
Data sources are currently an experimental feature and in active development,
and usage is opt-in.
Signatures should opt for data sources when they need access to data beyond what
is provided by the events they process.
For instance, a signature may need access to data about the container where the
event being processed was generated. With Tracee's integrated container data
source, this can be achieved without the signature having to separately monitor
container lifecycle events.
Tracee offer three built-in data sources out of the box.
There is also support for plugging in external data sources through the golang
plugin mechanism, similar to how signatures are currently supplied (see here).
However, there are known technical limitation to this approach, and the aim is to replace it
in the future.
Currently, the following data source are provided out of the box:
Containers: Provides metadata about containers given a container id.
Process Tree: Provides access to a tree of ever existing processes and threads.
DNS Cache: Provides access to relaated DNS queries of a given address (IP or domain).
This list will be expanded as other features are developed.
In order to use a data source in a signature you must request access to it in
the Init stage. This can be done through the SignatureContext passed at that
stage as such:
func(sig*mySig)Init(ctxdetect.SignatureContext)error{...containersData,ok:=ctx.GetDataSource("tracee","containers")if!ok{returnerrors.New("containers data source not registered")}ifcontainersData.Version()>1{returnfmt.Errorf("containers data source version not supported, please update this signature")}sig.containersData=containersData}
As you can see, access to the data source has been requested using two keys: a
namespace and a data source ID. Namespaces are employed to prevent name
conflicts in the future when integrating custom data sources. All built-in data
sources from Tracee will be available under the "tracee" namespace.
After verifying the data source's availability, it's suggested to include a
version check against the data source. This approach ensures that outdated
signatures aren't run with a newer data source schema.
Now, in the OnEvent function, you may use the data source like so:
container,err:=sig.containersData.Get(containerId)if!ok{returnfmt.Errorf("failed to find container in data source: %v",err)}containerName:=container["container_name"].(string)
Each Data source provides a querying method Get(key any) map[string]any. In
the provided example, type validation is omitted during key verification. This
omission is safe when adhering to the schema (provided by the Schema()
method), considering the JSON representation of the returned map, and after an
initial check of the data source version.