Work with String Data Using KQL Statements

 



Extract Data from Unstructured String Fields

Unstructured string fields often contains the Security log data and requires parsing to extract data. There are multiple ways of pulling information from string fields in KQL and the two primary operators used are extract and parse. 

Extract

It gets a match for a regular expression from a text string. You can convert the extracted substring to the indicated type. 
Arguments
  • regex- A regular expression.
  • captureGroup- A positive int constant indicating the capture group to extract. 0 stands for the entire match, 1 for the value matched by the first '('parenthesis')' in the regular expression, 2 or more for subsequent parenthesis. 
  • text- A string to search.
  • typeLiteral- An optional type literal. If provided, the extracted substring is converted to this type. 

Returns

If regex finds a match in text- the substring matched against the indicated capture group captureGroup, optionally converted to typeLiteral.

If there's no match, or the type conversion fails- null.

Parse

It evaluates a string expression and parses its value into one or more calculated columns. The computed columns will have nulls for unsuccessfully parsed strings.
Arguments
  • T- the input table.
  •  Kind:

       Simple (the default)- StringConstant is a regular string value and the match is strict. All string                 delimiters should appear in the parsed string, and all extended columns must match the required             types.

       regex- StringConstant may be a regular expression and the match is strict. All string delimiters,               which can be a regex for this mode, should appear in the parsed string, and all extended columns             must match the required types. 

      flags- Flags to be used in regex mode like U (Ungreedy), m (multi-line mode), s (match new line            \n), i (case-sensitive) in RE2 flags. 

  • relaxed- StringConstant is a regular string value and the match is relaxed. All string delimiters should appear in the parsed string, but extended columns may partially match the required types while those who didn't match will get the value null.
  • Expression- An expression that evaluates to a string.
  • ColumnName- The name of a column to assign a value to, extracted from the string expression.
  • ColumnType- (optional) The scalar value that indicates teh type top convert the value. The default is the string type.

Returns
The input table extended according to the list of the columns taht are provided to the operator.

Extend Data from Structured String Data

String fields may also contain structured data like JSON or Key-Value pairs. KQL provides easy access to these values for further analysis. 

JSON

KQL offers functions to manipulate JSON stored in string fields and as many logs submit data in JSON format, it is generally required for you to know how to transform JSON data to queryable fields.

Integrate External Data

The external data operator returns a table whose schema is defined in the query itself and whose data is read from an external storage artifact, such as a blob in Azure Blob Storage or an Azure Data Lake Storage File.
Arguments
  • ColumnName, ColumnType- The arguments define the schema of the table. The syntax is the same as the syntax used when defining a table in .create table. 
  • StorageConnectionString- Storage connection strings that describes the storage artifacts holding the data to return. 
  • PropertyName, PropertyValue,...- Additional properties that describes how to interpret the data retrieved from storage.

Currently, supported properties are:

Property

Type

Description

format

string

Data format. If not specified, an attempt is made to detect the data format from file extension. Any of the ingestion data formats are supported.

ignoreFirstRecord

bool

If set to true. Indicates that the first record in every file is ignored. This property is useful when querying CSV files with headers.

ingestionMapping

string

A string value that indicates how to map data from the source file to the actual columns in the operator result set.

 Returns

The externaldata operator returns a data table of the given schema with data parsed from the specified storage artifact, indicated ny the storage connection string.  







Comments

Popular posts from this blog

Deployment (Part 3)

Deployment (Part 2)

Deployment (Part 1)