1
00:00:00,000 --> 00:00:10,320
A Security Information and Event Management (SIEM) system is a comprehensive tool that

2
00:00:10,320 --> 00:00:15,200
acts as the nerve center of an organization's security operations.

3
00:00:15,200 --> 00:00:20,800
It collects logs and security-related data from a wide array of sources across the

4
00:00:20,800 --> 00:00:27,920
digital infrastructure, including endpoints, servers, network devices, cloud services,

5
00:00:27,920 --> 00:00:32,960
and various security tools like firewalls or antivirus systems.

6
00:00:32,960 --> 00:00:38,480
The main advantage of a SIEM is its ability to centralize this data, which is crucial

7
00:00:38,480 --> 00:00:43,840
for effective monitoring and analysis. This allows security teams to have a unified

8
00:00:43,840 --> 00:00:49,680
view of all security events and logs, enabling faster detection,

9
00:00:49,680 --> 00:00:53,760
investigation, and response to potential threats.

10
00:00:53,760 --> 00:00:58,640
One of the primary capabilities of a SIEM is centralized log management.

11
00:00:58,640 --> 00:01:04,560
It aggregates logs from different sources such as Syslog, Windows event logs,

12
00:01:04,560 --> 00:01:11,360
and cloud-based services, streamlining the process of managing large volumes

13
00:01:11,360 --> 00:01:14,320
of data. This centralized approach not only

14
00:01:14,320 --> 00:01:18,640
simplifies data storage but also enhances the efficiency

15
00:01:18,640 --> 00:01:24,160
of security operations as analysts can quickly access and correlate data from

16
00:01:24,160 --> 00:01:28,640
various systems. SIEM systems also offer significant threat

17
00:01:28,640 --> 00:01:32,400
visibility. By correlating events from multiple

18
00:01:32,400 --> 00:01:37,040
sources, they can detect suspicious patterns that indicate potential

19
00:01:37,040 --> 00:01:41,120
threats such as lateral movement within a network

20
00:01:41,120 --> 00:01:44,880
or privilege escalation attempts on a host.

21
00:01:44,960 --> 00:01:48,560
For example, if a user with limited access rights

22
00:01:48,560 --> 00:01:54,000
suddenly tries to access sensitive data, a SIEM would flag this as a potential

23
00:01:54,000 --> 00:01:58,480
security incident. Moreover, SIEM systems play a crucial

24
00:01:58,480 --> 00:02:03,440
role in helping organizations maintain security compliance.

25
00:02:03,440 --> 00:02:07,680
They can be configured to meet various regulatory standards

26
00:02:07,680 --> 00:02:16,000
like PCI DSS, NIST, or ISO 27001, ensuring that security practices

27
00:02:16,000 --> 00:02:20,560
align with industry requirements.

28
00:02:20,560 --> 00:02:24,400
These requirements often mandate detailed logging

29
00:02:24,400 --> 00:02:30,080
and real-time monitoring, both of which are core functions of SIEM systems.

30
00:02:30,080 --> 00:02:35,040
For example, a retail company may use a SIEM system to monitor its payment

31
00:02:35,040 --> 00:02:39,840
processing systems. The system could be configured to send an alert

32
00:02:39,840 --> 00:02:45,280
if login attempts are made outside of business hours

33
00:02:45,280 --> 00:02:50,880
from a foreign IP address, signaling a potential security breach.

34
00:02:50,880 --> 00:02:55,440
This automated alert allows the security team to investigate the

35
00:02:55,440 --> 00:03:01,120
situation and take action before a full-scale attack occurs.

36
00:03:01,120 --> 00:03:05,760
SIEMs begin their operation by collecting logs from a variety of data

37
00:03:05,760 --> 00:03:09,200
sources within an organization's infrastructure.

38
00:03:09,200 --> 00:03:12,560
These sources include endpoints, servers, switches,

39
00:03:12,560 --> 00:03:16,160
routers, firewalls, and security appliances,

40
00:03:16,160 --> 00:03:20,240
all of which generate logs in different formats.

41
00:03:20,240 --> 00:03:26,640
To make sense of this disparate data, SIEMs employ a process of log

42
00:03:26,640 --> 00:03:32,400
normalization. This process involves converting logs from various formats

43
00:03:32,400 --> 00:03:36,240
into a unified structure, allowing them to be analyzed and

44
00:03:36,240 --> 00:03:41,040
correlated efficiently. The technical process of log collection

45
00:03:41,040 --> 00:03:46,000
typically involves collectors or agents such as NXLog,

46
00:03:46,000 --> 00:03:51,840
Winlogbeat and other Beats from the ELK Stack, Fluentd, or others

47
00:03:51,840 --> 00:03:57,600
that collect data from these sources. These agents are responsible for gathering

48
00:03:57,600 --> 00:04:01,440
the logs and sending them to the SIEM for processing.

49
00:04:01,440 --> 00:04:06,480
Once collected, the logs are normalized using predefined parsing rules and

50
00:04:06,480 --> 00:04:11,200
field mappings. This step converts the data into a consistent

51
00:04:11,200 --> 00:04:16,240
schema or format, such as the Common Event Format (CEF) or JSON.

52
00:04:16,240 --> 00:04:20,240
These formats standardize the structure of the logs, making them

53
00:04:20,240 --> 00:04:24,160
easier to analyze and search across the entire system,

54
00:04:24,160 --> 00:04:29,360
regardless of the original source. Another critical aspect of normalization

55
00:04:29,360 --> 00:04:34,880
is timestamp unification. Since logs

56
00:04:34,880 --> 00:04:40,560
might be generated in different time zones or may experience clock drift,

57
00:04:40,560 --> 00:04:44,560
it's essential to align the timestamps to ensure that events are

58
00:04:44,560 --> 00:04:47,680
accurately correlated across the systems.
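
As a rough illustration of timestamp unification, here is a minimal Python sketch. It assumes each source reports ISO 8601 timestamps with a UTC offset; the source names and timestamps are hypothetical, and clock drift correction is not covered.

```python
from datetime import datetime, timezone

def to_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp that carries a UTC offset and convert it to UTC."""
    return datetime.fromisoformat(timestamp).astimezone(timezone.utc)

# Hypothetical events, each reported in the local time zone of its source.
events = [
    ("firewall", "2024-05-01T09:00:05+02:00"),
    ("linux-server", "2024-05-01T03:00:01-04:00"),
]

# Sorting on the UTC value restores the real order of events.
ordered = sorted(events, key=lambda event: to_utc(event[1]))
```

Sorted by their raw local times, the firewall event would appear to come hours after the server event, while in UTC the two are only four seconds apart.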

59
00:04:47,680 --> 00:04:52,880
Without timestamp unification, logs from different sources might appear out of

60
00:04:52,880 --> 00:04:56,480
sequence, making it difficult to track the flow of

61
00:04:56,480 --> 00:05:02,000
events and identify security incidents. For example, imagine a SIEM

62
00:05:02,000 --> 00:05:06,080
system that receives logs from a Cisco ASA

63
00:05:06,080 --> 00:05:12,240
firewall and a set of Linux servers. Each of these sources may generate logs in

64
00:05:12,240 --> 00:05:15,520
different formats. The SIEM system will standardize

65
00:05:15,520 --> 00:05:20,000
these logs so a single search can look for SSH

66
00:05:20,000 --> 00:05:24,960
login attempts across both systems regardless of whether the logs came

67
00:05:24,960 --> 00:05:30,640
from a firewall or the servers. This normalization enables analysts to

68
00:05:30,640 --> 00:05:36,000
quickly and efficiently correlate events, leading to faster

69
00:05:36,000 --> 00:05:40,080
identification of potential security threats.
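
The firewall and Linux server example could be sketched in Python roughly as follows. The two line formats and the field names of the shared schema are simplified assumptions for illustration; real Cisco ASA and sshd messages carry more fields, and a real SIEM ships its own parsing rules.

```python
import re

# Simplified, hypothetical line formats; real Cisco ASA and sshd messages differ.
FIREWALL_RE = re.compile(r"action=(?P<action>\w+) src=(?P<src_ip>\S+) dst_port=(?P<port>\d+)")
SSHD_RE = re.compile(r"(?P<result>Accepted|Failed) password for (?P<user>\S+) from (?P<src_ip>\S+)")

def normalize(source, line):
    """Map a raw line to a shared schema, or return None if it is not an SSH login event."""
    if source == "firewall":
        m = FIREWALL_RE.search(line)
        if m and m["port"] == "22":
            outcome = "success" if m["action"] == "permit" else "failure"
            return {"event": "ssh_login", "src_ip": m["src_ip"], "outcome": outcome, "source": source}
    elif source == "sshd":
        m = SSHD_RE.search(line)
        if m:
            outcome = "success" if m["result"] == "Accepted" else "failure"
            return {"event": "ssh_login", "src_ip": m["src_ip"], "outcome": outcome, "source": source}
    return None

records = [
    normalize("firewall", "action=deny src=203.0.113.7 dst_port=22"),
    normalize("sshd", "Failed password for root from 203.0.113.7 port 51122 ssh2"),
]

# One search over the shared schema now covers both sources.
ssh_failures = [r for r in records if r and r["event"] == "ssh_login" and r["outcome"] == "failure"]
```

Once both sources share the same field names, the search at the end does not need to know where a record came from.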

70
00:05:40,080 --> 00:05:45,280
In summary, SIEMs rely on log collection and normalization

71
00:05:45,280 --> 00:05:49,200
to transform diverse data into a structured format

72
00:05:49,200 --> 00:05:53,840
that can be effectively analyzed and correlated. This process ensures that

73
00:05:53,840 --> 00:05:58,480
security teams can quickly identify and respond to incidents

74
00:05:58,480 --> 00:06:03,360
regardless of the systems generating the data.

75
00:06:03,360 --> 00:06:07,440
The technical capabilities of SIEMs enable them to perform

76
00:06:07,440 --> 00:06:12,240
sophisticated analysis. Correlation engines

77
00:06:12,240 --> 00:06:17,600
are at the heart of this functionality. These engines match patterns using

78
00:06:17,600 --> 00:06:21,360
predefined rules or by establishing behavioral

79
00:06:21,360 --> 00:06:26,480
baselines. In doing so, the SIEM can detect anomalies

80
00:06:26,480 --> 00:06:30,560
that deviate from the normal behavior of a system

81
00:06:30,560 --> 00:06:35,280
or a user, such as an unusually high number of failed logins in a short

82
00:06:35,280 --> 00:06:39,680
period. Thresholds and heuristics are then

83
00:06:39,760 --> 00:06:45,680
applied to determine when these patterns should trigger an alert,

84
00:06:45,680 --> 00:06:50,880
helping to reduce noise and focus attention on the events that are most

85
00:06:50,880 --> 00:06:56,480
likely to indicate a security incident. SIEMs also offer the flexibility to

86
00:06:56,480 --> 00:07:01,360
customize alert rules for specific needs.

87
00:07:01,360 --> 00:07:06,800
For example, you can create rules based on known attack signatures

88
00:07:06,800 --> 00:07:11,280
like specific network traffic patterns associated with malware

89
00:07:11,280 --> 00:07:15,280
or on behavior anomalies such as unusual login times

90
00:07:15,280 --> 00:07:18,960
or activity outside of normal working hours.

91
00:07:18,960 --> 00:07:23,360
Additionally, many SIEM platforms support custom rule scripting

92
00:07:23,360 --> 00:07:30,480
in languages such as Lucene, SPL (the Search Processing Language used in

93
00:07:30,480 --> 00:07:34,480
Splunk), Sigma, or

94
00:07:34,560 --> 00:07:39,040
Kibana Query Language (KQL), among others, allowing security teams to tailor their

95
00:07:39,040 --> 00:07:43,120
detection capabilities to their specific environment.

96
00:07:43,120 --> 00:07:48,560
An example of how SIEM correlation works in practice could be when a user logs in

97
00:07:48,560 --> 00:07:52,080
from two geographically distant locations such as New York

98
00:07:52,080 --> 00:07:55,840
and Singapore within a matter of minutes.

99
00:07:55,840 --> 00:08:00,960
This event might be flagged by the SIEM system using a custom geolocation

100
00:08:00,960 --> 00:08:05,360
velocity rule. This rule compares the user's login patterns

101
00:08:05,360 --> 00:08:10,880
against the known geographical locations and flags a rapid change as a

102
00:08:10,880 --> 00:08:16,880
potential indication of credential compromise or account takeover.

103
00:08:16,880 --> 00:08:20,880
By correlating these events, the SIEM system can provide security

104
00:08:20,880 --> 00:08:26,240
analysts with actionable alerts that require further investigation.
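
A geolocation velocity rule of this kind could look roughly like the Python sketch below. It assumes logins have already been resolved to coordinates; the 1000 km/h threshold and the function names are illustrative assumptions, not settings from any particular SIEM.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_SPEED_KMH = 1000.0  # assumed threshold, roughly the speed of an airliner

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def is_impossible_travel(login_a, login_b):
    """Each login is a tuple of (latitude, longitude, unix_seconds)."""
    distance = haversine_km(login_a[0], login_a[1], login_b[0], login_b[1])
    hours = abs(login_b[2] - login_a[2]) / 3600
    if hours == 0:
        return distance > 0
    return distance / hours > MAX_SPEED_KMH

new_york = (40.7128, -74.0060, 0)
singapore = (1.3521, 103.8198, 600)  # ten minutes later
flagged = is_impossible_travel(new_york, singapore)
```

In practice, IP geolocation is imprecise and VPNs move users around, so a rule like this usually feeds an alert for review rather than an automatic block.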

105
00:08:26,240 --> 00:08:32,240
In summary, SIEMs use data correlation to link events and uncover complex

106
00:08:32,240 --> 00:08:36,720
attack patterns. By combining predefined rules, behavioral

107
00:08:36,720 --> 00:08:41,600
analysis and customizable alert conditions, SIEM systems can help

108
00:08:41,600 --> 00:08:45,520
organizations detect and respond to security incidents more

109
00:08:45,520 --> 00:08:49,200
effectively and efficiently.

110
00:08:49,200 --> 00:08:53,360
SIEM dashboards are essential tools for security analysts, providing

111
00:08:53,360 --> 00:08:57,360
real-time visualization of network and system activities.

112
00:08:57,360 --> 00:09:00,880
These dashboards help security teams to monitor key performance

113
00:09:00,880 --> 00:09:04,320
indicators such as failed login attempts,

114
00:09:04,320 --> 00:09:09,760
firewall denials and virus detections. By displaying this information in an

115
00:09:09,760 --> 00:09:13,520
easy-to-understand format, SIEM dashboards allow

116
00:09:13,520 --> 00:09:16,560
analysts to quickly identify and act on

117
00:09:16,560 --> 00:09:21,280
potential security threats, reducing the time it takes to respond to

118
00:09:21,280 --> 00:09:24,960
incidents. One of the key features of SIEM dashboards

119
00:09:24,960 --> 00:09:28,880
is the ability to display event volume,

120
00:09:28,880 --> 00:09:32,880
severity levels and active alerts in real time.

121
00:09:32,880 --> 00:09:38,320
This dynamic view helps analysts track the flow of security data and

122
00:09:38,320 --> 00:09:42,880
prioritize the response based on the criticality of each event.

123
00:09:42,880 --> 00:09:47,600
For instance, if there is a surge in failed login attempts or a sudden

124
00:09:47,600 --> 00:09:51,600
spike in suspicious network activity, it will immediately be

125
00:09:51,600 --> 00:09:55,520
visible on the dashboard, prompting further

126
00:09:55,520 --> 00:09:59,600
investigation. Additionally, many SIEM

127
00:09:59,600 --> 00:10:03,520
dashboards include drill-down views.

128
00:10:03,520 --> 00:10:09,600
This allows analysts to click on specific anomalies or alerts and examine

129
00:10:09,600 --> 00:10:13,600
the details such as the source of the activity,

130
00:10:13,600 --> 00:10:17,280
the associated user or device and other contextual

131
00:10:17,280 --> 00:10:21,520
information. This capability speeds up the investigation

132
00:10:21,520 --> 00:10:26,240
process, enabling analysts to make informed decisions quickly.

133
00:10:26,240 --> 00:10:30,000
SIEM systems also provide reporting capabilities,

134
00:10:30,000 --> 00:10:34,960
offering pre-built templates that can be customized to meet organizational

135
00:10:34,960 --> 00:10:41,040
needs or regulatory requirements. These reports help ensure compliance with

136
00:10:41,040 --> 00:10:47,200
standards such as PCI DSS or GDPR and they are often used for audits

137
00:10:47,200 --> 00:10:52,800
or executive summaries. The ability to schedule these reports ensures that

138
00:10:52,800 --> 00:10:58,000
stakeholders stay informed about security posture and that compliance

139
00:10:58,000 --> 00:11:03,920
is continuously monitored. For example, a Security Operations Center

140
00:11:03,920 --> 00:11:08,720
(SOC) team might use the SIEM's dashboard to monitor SSH login

141
00:11:08,720 --> 00:11:14,000
attempts across production servers. If more than 10 failed login attempts occur

142
00:11:14,000 --> 00:11:18,800
within five minutes on any server, the system can trigger an alert.
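
The threshold rule just described (more than 10 failed logins within five minutes on one server) can be sketched in Python with a sliding window. The class and method names are hypothetical, and the sketch assumes events arrive in time order as Unix timestamps.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 5 * 60
THRESHOLD = 10

class FailedLoginRule:
    def __init__(self):
        # server name -> timestamps of its recent failed logins
        self._failures = defaultdict(deque)

    def observe(self, server, timestamp):
        """Record one failed login; return True when the alert condition is met."""
        window = self._failures[server]
        window.append(timestamp)
        # Drop failures that fell out of the five-minute window.
        while window and timestamp - window[0] > WINDOW_SECONDS:
            window.popleft()
        return len(window) > THRESHOLD

rule = FailedLoginRule()
# Eleven failures, ten seconds apart, on the same hypothetical server.
results = [rule.observe("web-01", t) for t in range(0, 110, 10)]
```

Only the eleventh failure triggers the alert, because the rule fires on more than 10 failures inside the window.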

143
00:11:18,800 --> 00:11:22,720
This allows the SOC teams to investigate potential brute force attack

144
00:11:22,720 --> 00:11:26,080
attempts, quickly taking action to prevent a

145
00:11:26,080 --> 00:11:30,880
successful intrusion. In summary, SIEM dashboards are crucial

146
00:11:30,880 --> 00:11:34,720
for providing real-time visibility into the security events

147
00:11:34,720 --> 00:11:39,680
and streamlining the investigative process. By combining live monitoring

148
00:11:39,680 --> 00:11:45,120
features, drill-down analysis and automated reporting, SIEMs empower

149
00:11:45,120 --> 00:11:50,080
security teams to respond faster and maintain compliance with

150
00:11:50,080 --> 00:11:54,560
organizational and regulatory standards.

151
00:11:54,560 --> 00:11:58,640
So, as we can see, one of the key capabilities of the SIEM

152
00:11:58,640 --> 00:12:03,440
is the detection of unusual behavior. For example, if a user logs in

153
00:12:03,440 --> 00:12:09,920
at odd hours, like 3 o'clock in the morning, or from an unfamiliar

154
00:12:09,920 --> 00:12:14,000
geolocation, the SIEM can flag this activity as suspicious.

155
00:12:14,000 --> 00:12:19,360
This is similar to User Behavior Analytics (UBA). These anomalies can be indicative of

156
00:12:19,360 --> 00:12:24,080
compromised credentials or unauthorized access attempts.

157
00:12:24,080 --> 00:12:28,800
Another powerful feature is the ability to correlate multiple events into a

158
00:12:28,880 --> 00:12:35,360
single consolidated alert. For example, a SIEM may detect a series of

159
00:12:35,360 --> 00:12:40,400
failed login attempts, followed by a successful login and subsequent

160
00:12:40,400 --> 00:12:45,280
privilege escalation. When these events occur in close succession,

161
00:12:45,280 --> 00:12:49,760
the SIEM links them together, triggering an alert that helps security teams

162
00:12:49,760 --> 00:12:53,760
identify potential attacks like brute force login attempts

163
00:12:53,840 --> 00:12:58,800
or lateral movement within the network. Additionally, SIEMs can

164
00:12:58,800 --> 00:13:03,680
integrate with threat intelligence feeds to match activity against

165
00:13:03,680 --> 00:13:08,240
known indicators of compromise, such as IP addresses,

166
00:13:08,240 --> 00:13:11,680
file hashes or domain names associated with

167
00:13:11,680 --> 00:13:16,560
malicious behavior. This enables them to flag traffic

168
00:13:16,560 --> 00:13:19,920
or actions that align with known attack patterns,

169
00:13:19,920 --> 00:13:24,960
adding another layer of context and precision to the detection process.
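
At its simplest, matching activity against indicators of compromise is a lookup, as in this Python sketch. The indicator values are hypothetical placeholders (the IP addresses come from documentation-only ranges), and a real feed would be loaded from a threat intelligence source instead of being hardcoded.

```python
# Hypothetical indicators; a real deployment would load these from a feed.
IOC_IPS = {"203.0.113.7", "198.51.100.23"}
IOC_DOMAINS = {"malicious.example"}

def match_iocs(event):
    """Return the (kind, value) indicators that a normalized event matches."""
    hits = []
    if event.get("src_ip") in IOC_IPS:
        hits.append(("ip", event["src_ip"]))
    if event.get("domain") in IOC_DOMAINS:
        hits.append(("domain", event["domain"]))
    return hits

hits = match_iocs({"src_ip": "203.0.113.7", "domain": "intranet.example"})
```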

170
00:13:24,960 --> 00:13:29,040
SIEMs can also extract threat intelligence and

171
00:13:29,040 --> 00:13:34,080
indicators of compromise to help other SIEMs respond

172
00:13:34,080 --> 00:13:40,000
to such activities. For example, imagine a SIEM that detects a

173
00:13:40,000 --> 00:13:43,040
user account login at 3 o'clock in the morning

174
00:13:43,040 --> 00:13:47,040
from a foreign IP address. If the user then accesses

175
00:13:47,040 --> 00:13:50,640
sensitive files, the system can trigger an alert based on

176
00:13:50,640 --> 00:13:54,400
preconfigured correlation rules that link the

177
00:13:54,400 --> 00:14:00,000
unusual login behavior with the subsequent access to sensitive data.
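
A correlation rule like the one in this example might be sketched in Python as follows. The event fields, the business hours, the home country and the one-hour window are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

BUSINESS_HOURS = range(8, 18)             # assumed: 08:00 to 17:59
HOME_COUNTRY = "US"                       # assumed home location of the users
CORRELATION_WINDOW = timedelta(hours=1)   # assumed link window

def correlate(events):
    """Link an unusual login to a later sensitive file access by the same user.

    `events` is a time-ordered list of dicts with 'time', 'user' and 'type',
    plus 'country' on login events.
    """
    unusual_logins = {}  # user -> time of the most recent unusual login
    alerts = []
    for event in events:
        if event["type"] == "login":
            off_hours = event["time"].hour not in BUSINESS_HOURS
            foreign = event["country"] != HOME_COUNTRY
            if off_hours and foreign:
                unusual_logins[event["user"]] = event["time"]
        elif event["type"] == "sensitive_file_access":
            login_time = unusual_logins.get(event["user"])
            if login_time is not None and event["time"] - login_time <= CORRELATION_WINDOW:
                alerts.append((event["user"], login_time, event["time"]))
    return alerts

events = [
    {"time": datetime(2024, 5, 1, 3, 0), "user": "alice", "type": "login", "country": "SG"},
    {"time": datetime(2024, 5, 1, 3, 10), "user": "alice", "type": "sensitive_file_access"},
    {"time": datetime(2024, 5, 1, 10, 0), "user": "bob", "type": "login", "country": "US"},
    {"time": datetime(2024, 5, 1, 10, 5), "user": "bob", "type": "sensitive_file_access"},
]
alerts = correlate(events)
```

Neither event is necessarily alarming on its own; the alert comes from linking the two for the same user within the window.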

178
00:14:00,000 --> 00:14:04,480
This sequence resembles a kill chain, which the SIEM can identify.

179
00:14:04,480 --> 00:14:09,040
This early detection helps security teams respond swiftly,

180
00:14:09,040 --> 00:14:13,120
minimizing the risk of a data breach.

181
00:14:13,120 --> 00:14:16,560
Modern SIEMs have advanced capabilities that go beyond

182
00:14:16,560 --> 00:14:20,960
just detecting threats. They can also trigger predefined actions

183
00:14:20,960 --> 00:14:26,640
in response to specific alerts, which significantly reduces the need for

184
00:14:26,640 --> 00:14:30,160
manual intervention and accelerates the incident

185
00:14:30,160 --> 00:14:34,080
containment process. This automation allows security

186
00:14:34,080 --> 00:14:38,320
teams to respond to potential threats faster and more efficiently,

187
00:14:38,320 --> 00:14:43,840
minimizing the impact of security breaches. For example, a SIEM can automatically

188
00:14:43,920 --> 00:14:48,720
block IP addresses involved in malicious activity by sending a rule to the relevant

189
00:14:48,720 --> 00:14:52,320
firewall, for example in response to

190
00:14:52,320 --> 00:14:57,680
brute-force login attempts or DDoS attacks.

191
00:14:57,680 --> 00:15:02,240
When multiple failed login attempts are detected, the SIEM can trigger an

192
00:15:02,240 --> 00:15:05,440
action to block the suspicious IP address,

193
00:15:05,440 --> 00:15:10,320
preventing further unauthorized access attempts.
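
The automated blocking action could be sketched in Python along these lines. The `block_ip` function is only a stand-in, since the real call depends on the firewall product, and the threshold of five failures is an assumption.

```python
from collections import Counter

BLOCK_THRESHOLD = 5  # assumed number of failures before an IP is blocked

def block_ip(ip, blocklist):
    """Stand-in for a real firewall API call, which is product-specific."""
    blocklist.add(ip)

def respond(failed_login_ips, blocklist):
    """Block every source IP that reached the threshold; return the newly blocked IPs."""
    newly_blocked = []
    for ip, count in Counter(failed_login_ips).items():
        if count >= BLOCK_THRESHOLD and ip not in blocklist:
            block_ip(ip, blocklist)
            newly_blocked.append(ip)
    return newly_blocked

blocklist = set()
failures = ["203.0.113.7"] * 6 + ["198.51.100.23"] * 2
newly_blocked = respond(failures, blocklist)
```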

194
00:15:10,320 --> 00:15:15,840
Another action SIEMs take is to automatically disable compromised accounts.

195
00:15:15,840 --> 00:15:21,120
If the system detects unusual login behavior, such as logins from

196
00:15:21,120 --> 00:15:26,880
unfamiliar locations or at unusual hours, it can instantly disable

197
00:15:26,880 --> 00:15:31,760
the affected account to prevent any potential data breach or escalation.

198
00:15:31,760 --> 00:15:37,280
The SIEM can therefore be integrated with firewalls or identity and access controls

199
00:15:37,280 --> 00:15:41,680
in order to revoke rights and actively respond to

200
00:15:41,680 --> 00:15:47,920
the alert. In addition, SIEMs can send real-time alerts to security

201
00:15:47,920 --> 00:15:52,000
teams via various channels such as email, Slack,

202
00:15:52,000 --> 00:15:55,600
or integrations with Security Orchestration, Automation,

203
00:15:55,600 --> 00:16:01,680
and Response (SOAR) tools. This ensures that the right people

204
00:16:01,680 --> 00:16:07,040
are notified immediately, allowing them to take swift action.

205
00:16:07,040 --> 00:16:11,680
Let's give an example. Imagine a SIEM detects multiple failed SSH login

206
00:16:11,680 --> 00:16:15,280
attempts, followed by a successful login from an

207
00:16:15,280 --> 00:16:18,960
unfamiliar source. Upon detecting this pattern, the SIEM

208
00:16:18,960 --> 00:16:23,440
could automatically block the suspicious IP address

209
00:16:23,440 --> 00:16:27,440
through the firewall to prevent further access attempts.

210
00:16:27,440 --> 00:16:32,400
Simultaneously, it could send an email notification to the Security Operations

211
00:16:32,400 --> 00:16:36,720
Center (SOC) team with the incident details for further

212
00:16:36,720 --> 00:16:41,920
investigation. Let's look at a specific tool, Splunk.

213
00:16:41,920 --> 00:16:46,640
Splunk is a commercial SIEM solution known for its scalability,

214
00:16:46,640 --> 00:16:50,960
rich visualization and powerful search capabilities.

215
00:16:50,960 --> 00:16:56,400
It ingests logs and machine data from a wide variety of sources, enabling

216
00:16:56,400 --> 00:17:01,760
real-time monitoring, threat detection and compliance reporting.

217
00:17:01,760 --> 00:17:06,400
One of the key features of Splunk is its Search Processing Language,

218
00:17:06,400 --> 00:17:11,200
or SPL. SPL is a domain-specific

219
00:17:11,200 --> 00:17:16,480
query language that enables complex searches, statistical analysis and the

220
00:17:16,480 --> 00:17:22,800
creation of visual dashboards. Analysts use SPL

221
00:17:22,800 --> 00:17:27,680
to dive into data and identify potential threats.

222
00:17:27,680 --> 00:17:31,120
It provides a highly customizable search experience,

223
00:17:31,120 --> 00:17:35,600
helping analysts identify threats and outliers

224
00:17:35,600 --> 00:17:39,200
in large datasets, such as repeated failed login attempts

225
00:17:39,200 --> 00:17:44,480
or unusual network traffic patterns. Another powerful tool in Splunk is the

226
00:17:44,480 --> 00:17:49,440
Machine Learning Toolkit, which allows users to build predictive models

227
00:17:49,440 --> 00:17:53,520
for anomaly detection and User and Entity Behavior Analytics

228
00:17:53,520 --> 00:17:59,600
(UEBA). This toolkit helps detect unknown threats by identifying

229
00:17:59,600 --> 00:18:04,720
unusual patterns in data, which might not be flagged by traditional

230
00:18:04,720 --> 00:18:10,480
rule-based detection methods. It enables Splunk to go beyond basic searches

231
00:18:10,480 --> 00:18:15,600
and provide insights into future behavior, helping the analysts

232
00:18:15,600 --> 00:18:20,640
to detect advanced threats like insider threats or zero-day attacks.

233
00:18:20,720 --> 00:18:25,520
Splunk's Enterprise Security module adds an additional layer of

234
00:18:25,520 --> 00:18:28,960
functionality tailored to security operations.

235
00:18:28,960 --> 00:18:34,400
It includes security-specific dashboards, threat intelligence integration

236
00:18:34,400 --> 00:18:40,000
and incident review features. This module streamlines the detection and

237
00:18:40,000 --> 00:18:44,720
management of security incidents by offering specialized tools for threat

238
00:18:44,720 --> 00:18:50,160
hunting and incident response, making it easier for analysts

239
00:18:50,240 --> 00:18:55,840
to investigate and mitigate potential threats across the infrastructure.

240
00:18:55,840 --> 00:18:59,840
For example, a security analyst might use SPL

241
00:18:59,840 --> 00:19:05,280
to search for failed login attempts across various systems in the network.

242
00:19:05,280 --> 00:19:09,440
The query could be something like:

243
00:19:09,440 --> 00:19:15,440
index=auth sourcetype=linux_secure action=failed

244
00:19:15,440 --> 00:19:19,600
| stats count by user, source
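
For anyone who does not have Splunk at hand, the counting step of that query can be approximated in plain Python. This sketch only mirrors the "stats count by user and source" idea on a few hypothetical events; it is not how Splunk runs the search.

```python
from collections import Counter

# Hypothetical, already normalized authentication events.
events = [
    {"user": "root", "source": "203.0.113.7", "action": "failed"},
    {"user": "root", "source": "203.0.113.7", "action": "failed"},
    {"user": "alice", "source": "198.51.100.23", "action": "success"},
    {"user": "admin", "source": "203.0.113.7", "action": "failed"},
]

def count_failed(events):
    """Count failed logins per (user, source) pair."""
    return Counter((e["user"], e["source"]) for e in events if e["action"] == "failed")

counts = count_failed(events)
```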

245
00:19:19,600 --> 00:19:24,080
This allows the analyst to quickly spot abnormal behavior such as a large

246
00:19:24,080 --> 00:19:28,720
number of failed logins from a single user or IP address,

247
00:19:28,720 --> 00:19:33,520
which could indicate a brute force attack. By using SPL, the analyst can

248
00:19:33,520 --> 00:19:36,560
immediately take actions such as locking accounts or

249
00:19:36,560 --> 00:19:41,040
blocking malicious IP addresses to prevent further escalation

250
00:19:41,040 --> 00:19:45,760
of the threat. Splunk's combination of advanced querying, machine

251
00:19:45,760 --> 00:19:49,760
learning and tailored security modules makes it an

252
00:19:49,760 --> 00:19:53,360
essential tool for large-scale security operations.

253
00:19:53,360 --> 00:19:57,120
It enables organizations to detect and respond to threats

254
00:19:57,120 --> 00:20:00,640
faster, providing visibility and actionable

255
00:20:00,640 --> 00:20:05,360
insights into the security posture of the network.

256
00:20:05,360 --> 00:20:10,720
Another tool used as a SIEM is the ELK Stack, which is built on Elasticsearch,

257
00:20:10,720 --> 00:20:16,000
Logstash and Kibana. This is an open-source alternative that has gained

258
00:20:16,000 --> 00:20:19,920
significant traction in SIEM deployments lately.

259
00:20:19,920 --> 00:20:24,240
While it demands more setup and configuration compared to commercial

260
00:20:24,240 --> 00:20:27,520
solutions like Splunk, it provides organizations with

261
00:20:27,520 --> 00:20:33,280
flexibility and cost control. This makes it particularly

262
00:20:33,280 --> 00:20:38,560
attractive for businesses with in-house DevOps or SecOps teams.

263
00:20:38,560 --> 00:20:43,600
The ELK Stack offers powerful capabilities for log aggregation, analysis and

264
00:20:43,600 --> 00:20:47,280
visualization, which is key for maintaining effective

265
00:20:47,280 --> 00:20:52,320
security monitoring. At the heart of the ELK Stack is Elasticsearch,

266
00:20:52,320 --> 00:20:57,840
a fast and distributed search engine designed to index and search vast

267
00:20:57,840 --> 00:21:02,960
amounts of log data. It allows near real-time querying,

268
00:21:02,960 --> 00:21:09,120
making it ideal for quickly retrieving and analyzing log data.

269
00:21:09,120 --> 00:21:15,120
Elasticsearch excels in handling large datasets, ensuring rapid response

270
00:21:15,120 --> 00:21:18,880
times even when dealing with complex searches.

271
00:21:18,880 --> 00:21:22,880
This ensures that security teams can query data efficiently

272
00:21:22,880 --> 00:21:27,040
and identify patterns or anomalies in a timely manner.

273
00:21:27,040 --> 00:21:30,800
Next, Logstash handles the data ingestion

274
00:21:30,800 --> 00:21:38,320
and transformation. It pulls log data from various sources,

275
00:21:38,320 --> 00:21:43,440
parses it, and then sends the processed data to Elasticsearch.

276
00:21:43,440 --> 00:21:48,560
Logstash's role is critical because logs are often generated in different

277
00:21:48,560 --> 00:21:54,320
formats across multiple systems. It ensures that data is standardized,

278
00:21:54,320 --> 00:21:59,440
parsed correctly and then enriched with additional context where necessary,

279
00:21:59,440 --> 00:22:05,680
making it ready for analysis. Without this step, logs would be difficult to work

280
00:22:05,680 --> 00:22:10,560
with and would hinder any meaningful investigation.

281
00:22:10,560 --> 00:22:15,440
Finally, Kibana serves as the visualization layer of the ELK Stack.

282
00:22:15,440 --> 00:22:21,920
It enables users to create interactive dashboards and run queries using

283
00:22:21,920 --> 00:22:28,000
Lucene-based syntax. It supports Kibana Query Language (KQL), Elastic Query

284
00:22:28,000 --> 00:22:34,080
Language and so on. Kibana allows security analysts to visualize

285
00:22:34,080 --> 00:22:37,520
data trends, identify potential threats and track

286
00:22:37,520 --> 00:22:43,520
key metrics in real time. Through customizable dashboards, users can

287
00:22:43,520 --> 00:22:48,160
easily monitor anomalies, create reports and respond to emerging security

288
00:22:48,160 --> 00:22:53,120
events with a clear visual overview that Kibana provides.

289
00:22:53,120 --> 00:22:57,280
For example, when monitoring for SSH brute force attacks,

290
00:22:57,280 --> 00:23:01,440
Logstash can be set up to parse the SSH logs

291
00:23:01,440 --> 00:23:07,920
from the file /var/log/auth.log where login attempts are recorded on a Linux

292
00:23:07,920 --> 00:23:11,760
system. These logs are processed and indexed by

293
00:23:11,760 --> 00:23:17,840
Elasticsearch, a NoSQL data store, making them easy to search and query.

294
00:23:17,840 --> 00:23:22,800
Kibana can then visualize the failed login attempts over time,

295
00:22:22,800 --> 00:22:26,880
displaying trends and mapping the source IPs.

296
00:23:26,880 --> 00:23:32,320
This setup allows security teams to quickly identify abnormal patterns,

297
00:23:32,320 --> 00:23:36,720
such as multiple failed attempts from a single IP, which may signal

298
00:23:36,720 --> 00:23:39,680
a brute force attack.
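
To show the parsing step outside of Logstash, here is a small Python sketch that extracts failed SSH password attempts from auth.log style lines and counts them per source IP. The sample lines are hypothetical, and the regular expression covers only the common "Failed password" message.

```python
import re
from collections import Counter

# Matches sshd lines such as "Failed password for [invalid user] NAME from IP port N".
FAILED_RE = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) from (?P<ip>\S+) port \d+"
)

def failed_by_ip(lines):
    """Count failed SSH password attempts per source IP."""
    return Counter(m["ip"] for m in map(FAILED_RE.search, lines) if m)

# Hypothetical sample lines in the style of /var/log/auth.log.
sample = [
    "May  1 03:00:01 web-01 sshd[1234]: Failed password for invalid user admin from 203.0.113.7 port 51122 ssh2",
    "May  1 03:00:03 web-01 sshd[1234]: Failed password for root from 203.0.113.7 port 51130 ssh2",
    "May  1 08:15:42 web-01 sshd[2201]: Accepted password for alice from 198.51.100.23 port 40022 ssh2",
]
counts = failed_by_ip(sample)
```

In the ELK Stack this extraction would normally be done by a Logstash filter, with Kibana charting the counts over time.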

299
00:23:40,000 --> 00:23:44,800
When comparing Splunk and the ELK Stack, there are several key differences in

300
00:23:44,800 --> 00:23:49,120
terms of features, licensing, and scalability.

301
00:23:49,120 --> 00:23:52,320
Understanding these differences can help organizations

302
00:23:52,320 --> 00:23:56,160
choose the right solution for their security monitoring and log management

303
00:23:56,160 --> 00:23:59,280
needs. Licensing is one of the most

304
00:23:59,280 --> 00:24:04,720
significant contrasts between the two. Splunk operates on a commercial

305
00:24:04,720 --> 00:24:09,600
tiered pricing model, meaning that organizations must pay based on the

306
00:24:09,600 --> 00:24:14,160
amount of data ingested. While this model offers robust enterprise

307
00:24:14,160 --> 00:24:19,280
support, it can become very costly as data volume increases.

308
00:24:19,360 --> 00:24:24,640
On the other hand, the ELK Stack is open source, which means it's free to use.

309
00:24:24,640 --> 00:24:28,160
However, for organizations that require additional support,

310
00:24:28,160 --> 00:24:31,040
Elastic, the company behind the ELK Stack,

311
00:24:31,040 --> 00:24:34,800
offers paid support options, providing the flexibility

312
00:24:34,800 --> 00:24:38,960
to scale as needed without a significant upfront

313
00:24:38,960 --> 00:24:45,120
investment in licensing. In terms of query language, Splunk uses its own

314
00:24:45,120 --> 00:24:51,680
proprietary Search Processing Language, SPL. SPL is highly

315
00:24:51,680 --> 00:24:56,240
flexible and powerful, allowing users to perform complex searches,

316
00:24:56,240 --> 00:25:00,240
statistical analysis, and even create custom dashboards.

317
00:25:00,240 --> 00:25:05,760
While SPL offers advanced capabilities, it can also have a steeper

318
00:25:05,760 --> 00:25:08,960
learning curve for users unfamiliar with its

319
00:25:08,960 --> 00:25:12,160
syntax. The ELK Stack, in contrast, uses

320
00:25:12,160 --> 00:25:17,120
Lucene and Kibana Query Language, both of which are more standardized

321
00:25:17,120 --> 00:25:22,080
and easier to learn, but less advanced compared to SPL.

322
00:25:22,080 --> 00:25:25,360
For those who are already familiar with SQL-like

323
00:25:25,360 --> 00:25:30,560
query languages, Kibana Query Language offers a more accessible entry point.

324
00:25:30,560 --> 00:25:35,200
It feels similar to SQL, but it

325
00:25:35,200 --> 00:25:40,640
might not support the level of complex querying that SPL does.

326
00:25:40,640 --> 00:25:46,000
When it comes to setup complexity, Splunk is known for being easy to deploy,

327
00:25:46,000 --> 00:25:50,320
especially with its enterprise support options.

328
00:25:50,320 --> 00:25:53,680
This makes it ideal for organizations that need a quick

329
00:25:53,680 --> 00:25:58,160
out-of-the-box solution with minimal configuration.

330
00:25:58,160 --> 00:26:04,080
The ELK Stack, however, requires more manual configuration and tuning.

331
00:26:04,080 --> 00:26:08,480
It can be set up to meet specific requirements, but this also means more

332
00:26:08,480 --> 00:26:12,640
effort is needed for deployment, especially when scaling it

333
00:26:12,640 --> 00:26:18,160
across multiple environments. Integration capabilities also vary between the two

334
00:26:18,160 --> 00:26:22,480
solutions. Splunk offers extensive built-in

335
00:26:22,480 --> 00:26:27,920
integrations and add-ons, making it easier to incorporate into an existing

336
00:26:27,920 --> 00:26:31,440
environment. These pre-built integrations with

337
00:26:31,440 --> 00:26:36,800
popular systems and tools help streamline the deployment process.

338
00:26:36,800 --> 00:26:40,000
Meanwhile, the ELK Stack is highly extensible,

339
00:26:40,000 --> 00:26:44,320
offering the flexibility to integrate with many systems through plugins

340
00:26:44,320 --> 00:26:49,200
and Beats, lightweight data shippers. The agents in the ELK

341
00:26:49,200 --> 00:26:55,280
Stack are called Beats. This extensibility allows the ELK Stack to

342
00:26:55,280 --> 00:27:00,240
be customized and tailored for specific use cases.

343
00:27:00,240 --> 00:27:04,080
However, it might require additional configurations

344
00:27:04,080 --> 00:27:07,520
to ensure seamless integration with the systems.

345
00:27:07,520 --> 00:27:12,400
Finally, scalability is an area where the two solutions differ.

346
00:27:12,400 --> 00:27:17,360
Splunk is cloud-native, offering scaling capabilities that are managed by

347
00:27:17,360 --> 00:27:21,600
Splunk Cloud itself. This makes it easier for organizations

348
00:27:21,600 --> 00:27:27,440
to scale up as needed without the burden of managing the infrastructure.

349
00:27:27,440 --> 00:27:32,560
On the other hand, the ELK Stack requires more manual cluster management for

350
00:27:32,640 --> 00:27:36,640
scaling. While it can scale to handle large datasets,

351
00:27:36,640 --> 00:27:41,920
this process demands more hands-on effort, including managing

352
00:27:41,920 --> 00:27:47,680
nodes, handling data distribution, and ensuring high availability of the

353
00:27:47,680 --> 00:27:53,200
system itself. Graylog is an open-source log management

354
00:27:53,200 --> 00:27:57,920
platform commonly used for mid-sized SIEM deployments.

355
00:27:57,920 --> 00:28:02,000
It is built on top of Elasticsearch and MongoDB,

356
00:28:02,080 --> 00:28:06,240
offering a user-friendly, flexible, and extensible solution

357
00:28:06,240 --> 00:28:11,360
for log management and analysis. Known for its simplicity, Graylog is

358
00:28:11,360 --> 00:28:16,480
particularly suited for environments where ease of setup and customization

359
00:28:16,480 --> 00:28:20,720
are crucial, yet it maintains powerful features that meet the

360
00:28:20,720 --> 00:28:25,920
demands of security teams. One of the standout features of Graylog

361
00:28:25,920 --> 00:28:31,200
is its modular architecture. It uses components like inputs,

362
00:28:31,200 --> 00:28:36,400
extractors, and pipelines to process, filter, and enrich logs,

363
00:28:36,400 --> 00:28:42,480
which provides flexibility in how log data is ingested, parsed, and analyzed.

364
00:28:42,480 --> 00:28:47,520
This architecture allows organizations to tailor their log processing pipelines

365
00:28:47,520 --> 00:28:51,520
to fit specific needs, making Graylog an adaptable

366
00:28:51,520 --> 00:28:56,880
solution for various use cases. Graylog also provides

367
00:28:56,880 --> 00:29:02,240
stream-based filtering, which helps users route logs into specific streams

368
00:29:02,240 --> 00:29:07,120
based on their content. This enables customized alerting and

369
00:29:07,120 --> 00:29:11,280
dashboarding based on the log types that are most important

370
00:29:11,280 --> 00:29:16,560
to the security team. For example, if there are certain logs associated

371
00:29:16,560 --> 00:29:20,000
with critical systems or particular threats,

372
00:29:20,000 --> 00:29:23,920
Graylog can be configured to prioritize these events and trigger

373
00:29:23,920 --> 00:29:29,680
alerts accordingly. Alerting and correlation are core components

374
00:29:29,680 --> 00:29:33,840
of Graylog, offering real-time monitoring capabilities.

375
00:29:33,840 --> 00:29:39,040
Users can configure alert conditions based on thresholds or specific patterns

376
00:29:39,040 --> 00:29:43,920
within the stream of logs. For instance, if a predefined threshold

377
00:29:43,920 --> 00:29:47,360
is exceeded, such as multiple failed login attempts

378
00:29:47,360 --> 00:29:50,880
in a short period of time, Graylog can trigger an alert,

379
00:29:50,880 --> 00:29:54,560
notifying the security team of potential threats.

380
00:29:54,560 --> 00:30:00,400
For example, a network administrator sets up a stream to monitor failed SSH

381
00:30:00,400 --> 00:30:04,960
logins. The stream filters logs where

382
00:30:04,960 --> 00:30:10,640
event.action equals ssh_login_failed. If more than 10

383
00:30:10,640 --> 00:30:15,760
failures occur within one minute from the same IP address, Graylog triggers an

384
00:30:15,760 --> 00:30:20,800
alert, helping identify potential brute-force attacks.
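As a rough sketch outside of Graylog itself, the same threshold logic can be written in a few lines of Python. The event tuples, the ssh_login_failed value, and the function name are assumptions for illustration, not Graylog's API or configuration format.
```python
from collections import defaultdict, deque
# Thresholds taken from the example above: more than 10 failures within 60 seconds.
THRESHOLD = 10
WINDOW_SECONDS = 60
def find_brute_force_ips(events, threshold=THRESHOLD, window=WINDOW_SECONDS):
    """Return the IPs with more than `threshold` failures inside any `window`-second span.
    `events` is an iterable of (timestamp_seconds, source_ip, action) tuples sorted by time."""
    recent = defaultdict(deque)  # source_ip -> timestamps of recent failures
    flagged = set()
    for ts, ip, action in events:
        if action != "ssh_login_failed":  # assumed event.action value
            continue
        q = recent[ip]
        q.append(ts)
        # Drop failures that have fallen out of the sliding window.
        while q and ts - q[0] >= window:
            q.popleft()
        if len(q) > threshold:
            flagged.add(ip)
    return flagged
# Hypothetical data: 11 failures in 11 seconds from one IP, 3 spread-out failures from another.
sample = [(t, "203.0.113.5", "ssh_login_failed") for t in range(11)]
sample += [(t, "198.51.100.7", "ssh_login_failed") for t in (0, 20, 40)]
sample.sort()
print(find_brute_force_ips(sample))
```
Only the first address crosses the threshold here, which mirrors the alert condition described for the stream.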

385
00:30:20,880 --> 00:30:23,920
From a technical perspective, Graylog supports several

386
00:30:23,920 --> 00:30:27,920
ingestion methods, including Syslog,

387
00:30:27,920 --> 00:30:34,000
GELF, the Graylog Extended Log Format, and REST APIs,

388
00:30:34,000 --> 00:30:39,920
making Graylog compatible with a wide range of log sources.

389
00:30:39,920 --> 00:30:44,080
The platform uses Elasticsearch again for indexing,

390
00:30:44,080 --> 00:30:47,840
which provides fast search and query capabilities across

391
00:30:47,920 --> 00:30:52,320
large datasets. Additionally, Graylog's plugin ecosystem is

392
00:30:52,320 --> 00:30:56,240
another highlight, offering integrations for geolocation,

393
00:30:56,240 --> 00:30:59,920
threat intelligence, and other advanced use cases.

394
00:30:59,920 --> 00:31:04,560
This flexibility allows organizations to extend Graylog's functionality

395
00:31:04,560 --> 00:31:10,080
and adapt to their specific security and operational needs.

396
00:31:10,080 --> 00:31:15,200
IBM QRadar is a comprehensive enterprise-level SIEM platform,

397
00:31:15,280 --> 00:31:20,000
widely used in regulated industries, particularly where robust

398
00:31:20,000 --> 00:31:23,360
security and compliance requirements are essential.

399
00:31:23,360 --> 00:31:27,360
Known for its advanced correlation engine and integration capabilities,

400
00:31:27,360 --> 00:31:31,520
QRadar is designed to handle complex security environments with ease,

401
00:31:31,520 --> 00:31:35,600
providing deep insights and real-time detection of threats.

402
00:31:35,600 --> 00:31:38,960
One of QRadar's most notable features is its

403
00:31:38,960 --> 00:31:43,440
auto-correlation engine, which uses rule-based

404
00:31:43,440 --> 00:31:48,240
logic to detect security patterns across multiple log sources.

405
00:31:48,240 --> 00:31:51,920
This engine automatically correlates events from various systems,

406
00:31:51,920 --> 00:31:57,920
such as firewalls, endpoints, and servers, to identify suspicious activities.

407
00:31:57,920 --> 00:32:02,400
Another key feature of QRadar is its asset modeling capability.

408
00:32:02,400 --> 00:32:06,960
The platform dynamically builds an asset inventory based on network traffic

409
00:32:06,960 --> 00:32:10,480
observations, giving security teams a clear view

410
00:32:10,480 --> 00:32:14,720
of their assets' vulnerabilities. This asset modeling helps them to

411
00:32:14,720 --> 00:32:18,240
identify critical systems that are being targeted

412
00:32:18,240 --> 00:32:22,800
and enables organizations to prioritize responses accordingly.

413
00:32:22,800 --> 00:32:27,600
Integrated threat feeds are another significant advantage of QRadar.

414
00:32:27,600 --> 00:32:31,600
It connects seamlessly with external threat intelligence

415
00:32:31,600 --> 00:32:36,800
providers like IBM X-Force and other third-party feeds,

416
00:32:36,800 --> 00:32:40,320
enriching the detection and correlation process.

417
00:32:40,320 --> 00:32:45,280
This integration allows QRadar to cross-reference incoming log data

418
00:32:45,280 --> 00:32:49,120
with known attack indicators and evolving threat intelligence,

419
00:32:49,120 --> 00:32:54,160
improving the accuracy of alerts and helping security teams respond more

420
00:32:54,160 --> 00:32:59,280
proactively. Let's look at an example use case. QRadar

421
00:32:59,280 --> 00:33:04,960
might correlate repeated failed logins from multiple users on the same subnet,

422
00:33:04,960 --> 00:33:09,440
followed by a successful login from one of those users.

423
00:33:09,440 --> 00:33:13,680
This sequence suggests a credential stuffing attack, where attackers use

424
00:33:13,680 --> 00:33:18,880
stolen credentials to gain unauthorized access. QRadar's auto-correlation

425
00:33:18,880 --> 00:33:21,680
engine will generate a high-severity offense,

426
00:33:21,680 --> 00:33:26,480
alerting the security team to potential credential compromise.

427
00:33:26,480 --> 00:33:30,160
From a technical perspective, QRadar is highly scalable,

428
00:33:30,160 --> 00:33:34,320
supporting log ingestion from thousands of sources, including firewalls,

429
00:33:34,320 --> 00:33:38,880
endpoints, and cloud platforms. This makes QRadar well suited for

430
00:33:38,960 --> 00:33:43,040
large, complex environments that require centralized log management and threat

431
00:33:43,040 --> 00:33:47,440
detection. Additionally, QRadar has built-in flow

432
00:33:47,440 --> 00:33:51,920
analytics through QFlow that allows it to inspect

433
00:33:51,920 --> 00:33:56,720
Layer 7 traffic, providing visibility into application-level activity,

434
00:33:56,720 --> 00:34:00,480
which is essential for detecting more advanced attacks.

435
00:34:00,480 --> 00:34:05,200
QRadar also leverages the Ariel database for high-speed searching across

436
00:34:05,200 --> 00:34:09,840
terabytes of log data, enabling fast query performance,

437
00:34:09,840 --> 00:34:14,400
even in large-scale deployments. This allows security analysts to quickly

438
00:34:14,400 --> 00:34:18,960
search through extensive logs and identify critical security events,

439
00:34:18,960 --> 00:34:24,640
which is vital for maintaining real-time monitoring capabilities.

440
00:34:24,640 --> 00:34:29,760
Microsoft Sentinel is a cloud-native SIEM solution that is tightly

441
00:34:29,760 --> 00:34:32,800
integrated with Azure, offering scalability,

442
00:34:32,800 --> 00:34:37,600
powerful analytics, and automated response capabilities.

443
00:34:37,600 --> 00:34:41,520
It leverages Microsoft's AI, machine learning,

444
00:34:41,520 --> 00:34:45,360
and security graph intelligence to deliver smart threat detection

445
00:34:45,360 --> 00:34:50,000
and streamlined security operations, making it particularly suited for

446
00:34:50,000 --> 00:34:55,840
enterprises already using Azure or other Microsoft services.

447
00:34:55,840 --> 00:34:59,280
One of Sentinel's key strengths is its use of

448
00:34:59,280 --> 00:35:04,800
Kusto Query Language, KQL, a fast and flexible query language

449
00:35:04,800 --> 00:35:07,920
optimized for large-scale log analysis.

450
00:35:07,920 --> 00:35:12,560
KQL allows security analysts to quickly search and analyze vast

451
00:35:12,560 --> 00:35:16,720
amounts of log data, making it an essential tool for identifying

452
00:35:16,720 --> 00:35:20,320
threats in real time. The language is powerful enough

453
00:35:20,320 --> 00:35:23,920
to handle complex queries while being efficient enough

454
00:35:23,920 --> 00:35:27,280
to work at the scale of enterprise environments.

455
00:35:27,280 --> 00:35:30,800
Another important capability of Sentinel

456
00:35:30,800 --> 00:35:35,440
is its behavior analytics, which includes User and Entity Behavior Analytics,

457
00:35:35,440 --> 00:35:39,760
UEBA. This feature analyzes user and

458
00:35:39,760 --> 00:35:44,960
system behavior to find anomalies that might indicate malicious activity.

459
00:35:44,960 --> 00:35:48,720
By continuously monitoring user and entity actions,

460
00:35:48,720 --> 00:35:52,800
Sentinel can identify deviations from normal behavior,

461
00:35:52,800 --> 00:35:57,840
such as unusual login times, atypical access patterns,

462
00:35:57,840 --> 00:36:02,800
or unauthorized resource access, which are often the signs of insider threats

463
00:36:02,800 --> 00:36:07,760
or compromised accounts. Playbooks in Microsoft Sentinel,

464
00:36:07,760 --> 00:36:12,640
which are based on SOAR, Security Orchestration, Automation and Response,

465
00:36:12,640 --> 00:36:16,720
enable organizations to automate incident response.

466
00:36:16,720 --> 00:36:21,920
Using Azure Logic Apps, playbooks can execute predefined

467
00:36:21,920 --> 00:36:26,240
workflows to respond to security incidents automatically.

468
00:36:26,240 --> 00:36:30,240
These playbooks help reduce the response time to incidents,

469
00:36:30,240 --> 00:36:35,040
ensuring a quick and consistent reaction to common security threats.

470
00:36:35,040 --> 00:36:39,920
Let's see an example. One of Sentinel's threat detection capabilities

471
00:36:39,920 --> 00:36:44,080
is its ability to flag impossible logins.

472
00:36:44,080 --> 00:36:49,360
Suppose a user logs in from Germany and then from the US within a two-minute

473
00:36:49,360 --> 00:36:53,280
window. Sentinel would use KQL queries on

474
00:36:53,280 --> 00:36:58,000
sign-in logs to flag this as an anomaly. By querying the

475
00:36:58,000 --> 00:37:04,400
GeoIP and timestamp data, Sentinel can immediately identify and alert

476
00:37:04,400 --> 00:37:09,600
on the impossible scenario, indicating a potential credential compromise

477
00:37:09,600 --> 00:37:14,160
or an unauthorized access attempt. From a technical perspective, Microsoft

478
00:37:14,160 --> 00:37:17,520
Sentinel provides seamless integration with

479
00:37:17,520 --> 00:37:23,120
Microsoft services, like Microsoft 365, Defender and Azure

480
00:37:23,120 --> 00:37:26,640
activity logs. This tight integration helps security

481
00:37:26,640 --> 00:37:30,320
teams gain comprehensive visibility across their

482
00:37:30,320 --> 00:37:34,080
cloud and on-premises environments. Additionally,

483
00:37:34,080 --> 00:37:37,920
Sentinel supports a wide range of third-party

484
00:37:37,920 --> 00:37:42,960
connectors such as AWS, Cisco and Palo Alto,

485
00:37:42,960 --> 00:37:46,720
which allow organizations to ingest

486
00:37:46,720 --> 00:37:50,800
data from multiple sources into a centralized platform

487
00:37:50,800 --> 00:37:55,680
for more complete threat detection and analysis.

488
00:37:56,160 --> 00:38:01,680
To begin utilizing Splunk, for example, for log data analysis, it's

489
00:38:01,680 --> 00:38:05,520
important to understand each step of the process for ingesting and

490
00:38:05,520 --> 00:38:09,920
configuring log data. The first stage is to navigate to

491
00:38:09,920 --> 00:38:13,760
Settings and then Add Data, which opens a simple and

492
00:38:13,840 --> 00:38:17,120
intuitive workflow for adding your data sources.

493
00:38:17,120 --> 00:38:21,360
This process is critical as it ensures that your Splunk instance

494
00:38:21,360 --> 00:38:25,360
can properly collect data from a variety of sources,

495
00:38:25,360 --> 00:38:30,320
be it from files or real-time monitoring. The first action

496
00:38:30,320 --> 00:38:34,320
here is to select the option to monitor files and directories.

497
00:38:34,320 --> 00:38:37,760
This allows you to specify which log files to track for

498
00:38:37,760 --> 00:38:41,440
real-time collection. For instance, if you want to

499
00:38:41,440 --> 00:38:44,960
monitor authentication logs from a Linux system, you might

500
00:38:44,960 --> 00:38:51,760
choose files like /var/log/auth.log or /var/log/secure, the kernel log, or

501
00:38:51,760 --> 00:38:56,480
similar files. By choosing the files and directories

502
00:38:56,480 --> 00:39:00,560
option, you are telling Splunk to look for new log entries

503
00:39:00,560 --> 00:39:03,680
that appear within these files or directories,

504
00:39:03,680 --> 00:39:07,280
which will then be indexed for later searching.

505
00:39:07,360 --> 00:39:13,440
This is key because it means Splunk is constantly aware of new events,

506
00:39:13,440 --> 00:39:19,120
making it easier to detect issues as they arise in real time.

507
00:39:19,120 --> 00:39:24,640
Next, one of the most crucial decisions in the setup process

508
00:39:24,640 --> 00:39:28,320
is to select a source type for the log data.

509
00:39:28,320 --> 00:39:31,920
The source type defines how Splunk should interpret

510
00:39:31,920 --> 00:39:36,000
and parse the raw log data. Without a proper source type,

511
00:39:36,000 --> 00:39:39,760
Splunk wouldn't be able to extract meaningful fields

512
00:39:39,760 --> 00:39:44,640
or structure the data correctly. For example, Linux authentication logs

513
00:39:44,640 --> 00:39:48,080
typically use the linux_secure source type.

514
00:39:48,080 --> 00:39:51,600
By assigning the appropriate source type,

515
00:39:51,600 --> 00:39:55,360
you ensure that Splunk understands the log format

516
00:39:55,360 --> 00:39:59,600
and can correctly break down the logs into fields such as user names,

517
00:39:59,600 --> 00:40:05,440
timestamps, login success or failure statuses, and so on.

518
00:40:05,520 --> 00:40:10,400
After defining the source type, the next step is to configure an index.

519
00:40:10,400 --> 00:40:14,480
The index acts like a container for logs

520
00:40:14,480 --> 00:40:18,320
and is essential for organizing the large amounts of data

521
00:40:18,320 --> 00:40:24,000
that may be ingested into Splunk. For example, security logs could be placed

522
00:40:24,000 --> 00:40:28,320
in an index called "security_logs", which would make it easy

523
00:40:28,320 --> 00:40:32,640
to later filter and search through only the security-related events in your

524
00:40:32,640 --> 00:40:37,920
system. By grouping logs into different indexes,

525
00:40:37,920 --> 00:40:42,720
Splunk can more efficiently retrieve the data needed for analysis without

526
00:40:42,720 --> 00:40:46,160
sifting through irrelevant information, improving

527
00:40:46,160 --> 00:40:49,200
performance and ensuring that search queries

528
00:40:49,200 --> 00:40:54,160
are faster and more relevant. Once you've completed the source type and

529
00:40:54,160 --> 00:40:58,000
index configuration, Splunk will automatically begin

530
00:40:58,000 --> 00:41:02,800
ingesting the log data from the specified files into the database.

531
00:41:02,800 --> 00:41:08,160
As it ingests those logs, Splunk parses the raw log entries into

532
00:41:08,160 --> 00:41:11,680
structured events according to the source type.

533
00:41:11,680 --> 00:41:16,080
Each event contains important metadata fields such as _time,

534
00:41:16,080 --> 00:41:21,360
host, source, and source type, which help categorize the data and make it

535
00:41:21,360 --> 00:41:24,960
searchable. For example, the timestamp field

536
00:41:25,120 --> 00:41:29,680
_time will allow you to search logs by date and time.

537
00:41:29,680 --> 00:41:33,520
Host will tell you which machine the log originated from

538
00:41:33,520 --> 00:41:36,960
and source type will help you identify the type of log,

539
00:41:36,960 --> 00:41:41,920
whether it's a system log, application log, security log, and so on.

540
00:41:41,920 --> 00:41:46,080
These fields make the data far more usable and allow you to quickly find

541
00:41:46,080 --> 00:41:50,800
relevant entries in a large dataset. Once the logs have been ingested

542
00:41:50,800 --> 00:41:55,840
and parsed into events, the final step is to validate the data by

543
00:41:55,840 --> 00:41:59,360
running searches. This is where the power of Splunk

544
00:41:59,360 --> 00:42:04,240
really shines. By querying specific indexes and source types,

545
00:42:04,240 --> 00:42:09,200
you can rapidly filter through logs to uncover valuable insights.

546
00:42:09,200 --> 00:42:14,000
For instance, to track failed login attempts, you could search for all failed

547
00:42:14,000 --> 00:42:17,280
password entries in the security logs using a

548
00:42:17,280 --> 00:42:22,640
query like index=security_logs sourcetype=linux_secure

549
00:42:22,640 --> 00:42:26,960
and then the message "failed password".

550
00:42:26,960 --> 00:42:31,520
This query will return all entries matching that search string

551
00:42:31,520 --> 00:42:36,560
which would be incredibly helpful for identifying potential security

552
00:42:36,560 --> 00:42:42,960
breaches, brute force attacks or misconfigurations in the system.

553
00:42:42,960 --> 00:42:46,480
Once your logs are ingested into Splunk, the next step

554
00:42:46,480 --> 00:42:50,320
is to analyze them using SPL, the Search Processing

555
00:42:50,320 --> 00:42:54,640
Language. One of the most fundamental queries in SPL

556
00:42:54,640 --> 00:42:59,520
is using the stats command to calculate metrics across different fields in your

557
00:42:59,520 --> 00:43:02,800
logs. For instance, if you want to analyze how

558
00:43:02,800 --> 00:43:06,240
many events are coming from different log sources,

559
00:43:06,240 --> 00:43:12,000
you can use the following SPL query: index=security_logs

560
00:43:12,000 --> 00:43:18,880
| stats count by source. This will provide a breakdown of event

561
00:43:18,880 --> 00:43:26,000
counts from each log source helping to identify which sources are most active.

562
00:43:26,000 --> 00:43:30,720
This analysis is important for security teams to monitor systems

563
00:43:30,720 --> 00:43:35,040
as unusual spikes in events from specific sources could indicate a

564
00:43:35,040 --> 00:43:40,080
potential security incident. Another key feature of SPL is the

565
00:43:40,080 --> 00:43:44,320
top command which identifies the most frequent

566
00:43:44,320 --> 00:43:49,440
values in a specific field. For example, if you want to see which source

567
00:43:49,440 --> 00:43:53,600
types are most common in your logs, you could use the query

568
00:43:53,600 --> 00:43:59,520
index=security_logs | top sourcetype.

569
00:43:59,520 --> 00:44:04,480
Source types are essential categories that Splunk assigns to logs based on

570
00:44:04,480 --> 00:44:07,920
their format. By identifying the most

571
00:44:08,880 --> 00:44:13,600
frequent source types you can assess whether the log sources are

572
00:44:13,600 --> 00:44:17,920
appropriately categorized and if there are any unexpected

573
00:44:17,920 --> 00:44:22,480
log types present. These unexpected source types might point

574
00:44:22,480 --> 00:44:27,600
to misconfigured systems or in the worst case potential

575
00:44:27,600 --> 00:44:32,880
security risks like a new undetected source of data.

576
00:44:32,880 --> 00:44:36,720
With these commands you can easily dive into your log data and start

577
00:44:36,720 --> 00:44:42,000
uncovering valuable insights. SPL also supports

578
00:44:42,000 --> 00:44:45,600
more advanced features such as subsearches,

579
00:44:45,600 --> 00:44:49,440
field extractions and time-based analytics.

580
00:44:49,440 --> 00:44:53,520
For example, if you want to focus on logs from a

581
00:44:53,520 --> 00:44:58,800
specific time range or filter out specific events that aren't relevant

582
00:44:58,800 --> 00:45:03,120
to your investigation, SPL allows you to refine

583
00:45:03,120 --> 00:45:08,160
your queries by time range. This makes it a versatile tool not only for

584
00:45:08,160 --> 00:45:12,320
security incident response but also for ongoing operational

585
00:45:12,320 --> 00:45:16,880
monitoring. To detect brute force attempts, one

586
00:45:16,880 --> 00:45:20,480
effective method using Splunk is to examine logs for repeated

587
00:45:20,480 --> 00:45:24,480
failed login messages. Brute force attacks often involve an

588
00:45:24,480 --> 00:45:28,240
attacker repeatedly trying different passwords in an attempt to gain unauthorized

589
00:45:28,240 --> 00:45:32,480
access to the system. These failed attempts are typically

590
00:45:32,480 --> 00:45:36,400
recorded in logs such as SSH or authentication logs

591
00:45:36,400 --> 00:45:40,080
which makes them essential for detecting malicious activity.

592
00:45:40,080 --> 00:45:44,400
By analyzing those logs you can identify patterns that indicate brute force

593
00:45:44,400 --> 00:45:48,160
attacks in progress. One way to detect a brute force attack is by

594
00:45:48,160 --> 00:45:53,840
using queries in log analysis, such as index=security_logs "failed

595
00:45:53,840 --> 00:45:57,840
password", which is the message that should appear in the logs,

596
00:45:57,840 --> 00:46:02,400
followed by | stats count by source. This query is specifically designed

597
00:46:02,400 --> 00:46:08,800
to find instances where login attempts fail. Here's how it works.

598
00:46:08,800 --> 00:46:13,280
index=security_logs specifies that the query should search within the

599
00:46:13,280 --> 00:46:17,280
security logs index which likely contains entries

600
00:46:17,280 --> 00:46:22,160
related to the login attempts. "failed password" searches for events

601
00:46:22,160 --> 00:46:25,520
in the logs that contain the phrase "failed password"

602
00:46:25,520 --> 00:46:29,840
which is commonly used to log unsuccessful login attempts.

603
00:46:29,920 --> 00:46:34,320
| stats count by source is a statistical command that counts

604
00:46:34,320 --> 00:46:39,360
how many times the matching events occurred,

605
00:46:39,360 --> 00:46:44,000
grouped by source, which here is treated as the source IP address. The output is

606
00:46:44,000 --> 00:46:48,160
an IP address followed by the number of failed password events

607
00:46:48,160 --> 00:46:53,200
for that specific IP. Essentially this part of the command aggregates the

608
00:46:53,200 --> 00:46:56,640
data to show how many failed login attempts came from

609
00:46:56,640 --> 00:47:00,480
each unique IP address. The result from this query could look

610
00:47:00,480 --> 00:47:05,440
like this: an IP address, followed by

611
00:47:05,440 --> 00:47:08,960
the number of failed login attempts that appeared,

612
00:47:08,960 --> 00:47:16,720
20 for example. The query index=security_logs "failed password" | top user

613
00:47:16,720 --> 00:47:21,200
works slightly differently but still aims to identify

614
00:47:21,200 --> 00:47:26,160
the brute force attack. Again, security_logs is the

615
00:47:26,160 --> 00:47:30,240
index and "failed password" is the phrase inside the logs.

616
00:47:30,240 --> 00:47:34,000
The | top user part is a command that returns the most frequently

617
00:47:34,000 --> 00:47:38,640
targeted user names. It aggregates the data based on the

618
00:47:38,640 --> 00:47:42,320
number of times each username appears in the failed login attempts

619
00:47:42,320 --> 00:47:46,160
essentially ranking the user names based on how many times they were

620
00:47:46,160 --> 00:47:49,920
targeted, so it lists the

621
00:47:49,920 --> 00:47:53,280
user names that were tried most often in these login attempts.

622
00:47:53,360 --> 00:47:59,120
So that's the top command. The result will look like this:

623
00:47:59,120 --> 00:48:05,440
username root, which is very common, 15 attempts; username admin, 10

624
00:48:05,440 --> 00:48:09,440
attempts; username superuser, some other

625
00:48:09,440 --> 00:48:13,840
number of attempts. This means that the user names root and admin were

626
00:48:13,840 --> 00:48:18,400
specifically targeted. Such results suggest that attackers

627
00:48:18,400 --> 00:48:22,320
are focusing on high-value accounts such as root and admin

628
00:48:22,320 --> 00:48:30,160
and are actively attempting brute force attacks on the systems.
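To make the two aggregations concrete outside of Splunk, here is a minimal Python sketch that counts failed-password lines per source IP and per user name. The sample lines and the OpenSSH-style message pattern are assumptions for illustration; real logs vary by distribution and version, and this is not how Splunk implements stats or top.
```python
import re
from collections import Counter
# Assumed OpenSSH-style message format; adjust the pattern for your own logs.
FAILED_RE = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+)")
def summarize_failures(lines):
    """Count failed-password events per source IP and per targeted user name."""
    by_ip, by_user = Counter(), Counter()
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            user, ip = match.groups()
            by_user[user] += 1
            by_ip[ip] += 1
    return by_ip, by_user
# Hypothetical sample data.
sample_lines = [
    "Jan 10 10:00:01 host sshd[101]: Failed password for root from 203.0.113.5 port 50122 ssh2",
    "Jan 10 10:00:03 host sshd[101]: Failed password for root from 203.0.113.5 port 50124 ssh2",
    "Jan 10 10:00:05 host sshd[102]: Failed password for invalid user admin from 198.51.100.7 port 40000 ssh2",
    "Jan 10 10:00:09 host sshd[103]: Accepted password for alice from 192.0.2.10 port 51000 ssh2",
]
by_ip, by_user = summarize_failures(sample_lines)
print(by_ip.most_common())      # rough analogue of counting failures by source IP
print(by_user.most_common(10))  # rough analogue of: | top user
```
The successful login is ignored, so only the failed attempts contribute to either ranking.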

629
00:48:30,160 --> 00:48:34,560
Visualizing log data is essential for quickly spotting suspicious activities

630
00:48:34,560 --> 00:48:39,760
or abnormal behavior in real time. Dashboards in SIEM tools like Splunk

631
00:48:39,760 --> 00:48:43,280
provide an interactive interface to track and analyze trends

632
00:48:43,280 --> 00:48:46,960
such as spikes in failed login attempts or repeated attacks.

633
00:48:46,960 --> 00:48:51,200
These visualizations help security analysts monitor potential threats

634
00:48:51,280 --> 00:48:55,600
without needing to sift through the raw log data manually,

635
00:48:55,600 --> 00:48:58,800
allowing for more efficient and effective response.

636
00:48:58,800 --> 00:49:03,120
To track brute force attempts or other login failures

637
00:49:03,120 --> 00:49:07,920
over time, you can use a query like the one we explained: index=security_logs

638
00:49:07,920 --> 00:49:11,920
"failed password" followed by | timechart span=1h count by source.

639
00:49:11,920 --> 00:49:16,640
This query generates a time chart that

640
00:49:16,640 --> 00:49:22,560
tracks failed login attempts by source IP address over a specific time period

641
00:49:22,560 --> 00:49:28,320
in this case, hourly. By breaking down the data into time-based

642
00:49:28,320 --> 00:49:33,840
increments you can visualize how login attempts evolve and identify sudden

643
00:49:33,840 --> 00:49:38,720
spikes or patterns that might indicate an ongoing attack.
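
To make the idea concrete, here is a minimal Python sketch of what a one-hour time chart counted by source computes. The sample events, the IP addresses, and the function name hourly_failed_logins are assumptions for illustration only; this is not how Splunk implements timechart.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample events: (timestamp, source IP) of failed logins.
EVENTS = [
    ("2024-01-01 10:05:00", "203.0.113.5"),
    ("2024-01-01 10:40:00", "203.0.113.5"),
    ("2024-01-01 10:59:00", "198.51.100.7"),
    ("2024-01-01 11:02:00", "203.0.113.5"),
]

def hourly_failed_logins(events):
    """Count failed logins per (hour bucket, source IP), like a 1h time chart."""
    counts = Counter()
    for stamp, source in events:
        ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        bucket = ts.replace(minute=0, second=0)  # truncate to the hour
        counts[(bucket.strftime("%Y-%m-%d %H:00"), source)] += 1
    return counts

# Print one row per hour and source, in chronological order.
for (hour, source), n in sorted(hourly_failed_logins(EVENTS).items()):
    print(hour, source, n)
```

A sudden jump in the count for one source within a single hour bucket is the kind of spike the chart makes visible.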

644
00:49:38,720 --> 00:49:43,120
To create a visual representation of this query in Splunk

645
00:49:43,120 --> 00:49:48,080
follow these steps: navigate to the dashboard section and select create new

646
00:49:48,080 --> 00:49:51,360
dashboard. Choose the type of chart you want to use

647
00:49:51,360 --> 00:49:56,000
such as a line chart or a column chart. Use the query

648
00:49:56,000 --> 00:50:00,640
index=security_logs "failed password"

649
00:50:00,640 --> 00:50:06,240
| timechart span=1h count by source as the data source for your chart.

650
00:50:06,240 --> 00:50:10,720
Set a meaningful time range, such as the last 24 hours, to provide a relevant

651
00:50:10,720 --> 00:50:15,920
view of the activity you are monitoring. The expected output is a time-based

652
00:50:15,920 --> 00:50:19,120
graph where you can track failed login attempts by

653
00:50:19,120 --> 00:50:24,400
source IP. The graph will display which IP addresses

654
00:50:24,400 --> 00:50:29,200
are attempting logins at certain times and you can easily spot any IP addresses

655
00:50:29,200 --> 00:50:32,960
with an unusually high frequency of failed attempts.

656
00:50:32,960 --> 00:50:36,240
These are likely to be sources of brute force attacks.

657
00:50:36,240 --> 00:50:40,320
With this visualization you can quickly identify the most active and

658
00:50:40,320 --> 00:50:44,720
potentially malicious IP addresses allowing you to

659
00:50:44,720 --> 00:50:47,840
take immediate action such as blocking the IP

660
00:50:47,840 --> 00:50:53,360
or further investigating the activity.

661
00:50:53,360 --> 00:50:58,720
Logstash is a powerful tool in the ELK Stack designed for ingesting, processing,

662
00:50:58,720 --> 00:51:01,840
and forwarding log data from multiple sources.

663
00:51:01,840 --> 00:51:05,200
When you configure Logstash to handle logs from a firewall

664
00:51:05,200 --> 00:51:08,880
it plays a vital role in structuring

665
00:51:08,880 --> 00:51:13,200
raw log data into meaningful, searchable formats. The process starts with the

666
00:51:13,200 --> 00:51:16,960
installation of Logstash on your server which can be done

667
00:51:16,960 --> 00:51:22,880
using a package manager or by directly downloading the binaries.

668
00:51:22,880 --> 00:51:26,800
Once installed, you move on to configuring the Logstash pipeline,

669
00:51:26,800 --> 00:51:31,200
which consists of three main sections: input, filter, and output.

670
00:51:31,200 --> 00:51:36,400
In the input section Logstash is set up to pull logs from your firewall's

671
00:51:36,400 --> 00:51:42,000
log file or syslog output. For example firewall logs are often stored in

672
00:51:42,000 --> 00:51:46,320
files like /var/log/firewall.log

673
00:51:46,320 --> 00:51:49,760
and Logstash needs to be told where to find them.

674
00:51:49,760 --> 00:51:55,520
The start_position => "beginning" option ensures that Logstash processes logs

675
00:51:55,520 --> 00:51:59,040
starting from the beginning of the log file.

676
00:51:59,040 --> 00:52:04,400
Once Logstash has ingested the logs, the filter section comes into play.

677
00:52:04,400 --> 00:52:08,880
Here you can use filters like Grok to parse and extract

678
00:52:08,880 --> 00:52:14,160
specific data fields from the raw logs. For example you might use the Grok

679
00:52:14,160 --> 00:52:18,960
filter to pull out the source IP, destination IP, and action taken by

680
00:52:18,960 --> 00:52:21,760
the firewall. This step is essential for

681
00:52:21,760 --> 00:52:25,280
transforming unstructured log data into structured

682
00:52:25,280 --> 00:52:29,680
fields that can be indexed and analyzed. Finally the output section

683
00:52:29,680 --> 00:52:33,920
of the configuration defines where the parsed data will go.

684
00:52:33,920 --> 00:52:38,000
In this case, you can configure Logstash to forward the processed data to

685
00:52:38,000 --> 00:52:42,000
Elasticsearch where it will be indexed and made available for

686
00:52:42,000 --> 00:52:46,000
search and analysis. The logs are usually stored in an

687
00:52:46,000 --> 00:52:51,760
index named with a pattern such as "firewall_logs-date", or whatever name you choose,

688
00:52:51,760 --> 00:52:56,480
allowing for easy organization and querying based on the date the

689
00:52:56,480 --> 00:53:01,280
logs were created, or based on a different scheme.

690
00:53:01,280 --> 00:53:06,480
Once the setup is complete and Logstash is actively processing the logs

691
00:53:06,480 --> 00:53:10,400
the output will consist of structured data that can be queried and

692
00:53:10,400 --> 00:53:15,040
visualized using visualization tools like Kibana.

693
00:53:15,040 --> 00:53:19,120
Each firewall entry, such as one with IP addresses for

694
00:53:19,120 --> 00:53:23,680
destination and source and an action of accept, will be broken into

695
00:53:23,680 --> 00:53:28,400
different fields making it easy to search for trends, identify patterns

696
00:53:28,400 --> 00:53:32,320
and gain insights into the network security events.
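
As a rough illustration of the field extraction a Grok filter performs, here is a Python sketch that uses a regular expression with named groups. The log line layout (src=, dst=, action=), the sample line, and the function name parse_firewall_line are assumptions; real firewall formats differ by vendor, and this is not Logstash's Grok engine.

```python
import re

# Assumed log layout: "... src=<ip> dst=<ip> action=<word>".
LINE_PATTERN = re.compile(
    r"src=(?P<source_ip>\S+)\s+dst=(?P<destination_ip>\S+)\s+action=(?P<action>\w+)"
)

def parse_firewall_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LINE_PATTERN.search(line)
    return match.groupdict() if match else None

# Hypothetical raw log line, using documentation-range IP addresses.
sample = "Jan 01 10:00:00 fw01 src=192.0.2.10 dst=198.51.100.20 action=ACCEPT"
print(parse_firewall_line(sample))
```

Once each line is a set of named fields like these, it can be indexed and searched field by field instead of as free text.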

697
00:53:32,320 --> 00:53:37,440
This integration of Logstash and Elasticsearch enables you to create a robust

698
00:53:37,440 --> 00:53:41,200
log management system that enhances your ability to monitor and

699
00:53:41,200 --> 00:53:45,920
secure your network. Once the logs are indexed into Elasticsearch,

700
00:53:45,920 --> 00:53:49,520
you can use its powerful querying capabilities to detect

701
00:53:49,520 --> 00:53:54,320
potential security threats. By constructing targeted queries you

702
00:53:54,320 --> 00:53:58,880
can identify issues such as unauthorized access

703
00:53:58,880 --> 00:54:02,480
and brute force attempts. These queries allow for efficient

704
00:54:02,480 --> 00:54:06,240
monitoring of network activity making it easier to identify

705
00:54:06,240 --> 00:54:10,080
security incidents. One way to detect suspicious activity is

706
00:54:10,080 --> 00:54:14,240
searching for traffic from a specific source IP address.

707
00:54:14,240 --> 00:54:18,160
For example using the query source_ip:

708
00:54:18,160 --> 00:54:22,160
and then the IP address AND action:deny

709
00:54:22,240 --> 00:54:28,080
will filter the logs where traffic from

710
00:54:28,080 --> 00:54:31,280
the specific address was denied by the firewall.

711
00:54:31,280 --> 00:54:35,760
This can help you identify whether a particular source is attempting

712
00:54:35,760 --> 00:54:40,880
unauthorized access or repeatedly trying to breach the firewall.

713
00:54:40,880 --> 00:54:45,600
A typical output might show logs indicating that this IP address

714
00:54:45,600 --> 00:54:50,400
has been denied access multiple times to different destinations

715
00:54:50,480 --> 00:54:56,080
suggesting potential malicious activity. These types of logs are

716
00:54:56,080 --> 00:54:59,520
useful for detecting early signs of an attack, allowing timely

717
00:54:59,520 --> 00:55:04,400
intervention. Another important query focuses on detecting unusual access

718
00:55:04,400 --> 00:55:08,880
patterns for specific ports. For example destination_port:80

719
00:55:08,880 --> 00:55:13,840
AND action:allow searches for allowed traffic targeting port 80.

720
00:55:13,840 --> 00:55:17,920
Since port 80 is commonly used for web traffic monitoring it for

721
00:55:17,920 --> 00:55:21,680
unusual activity can reveal if there are abnormal access patterns

722
00:55:21,680 --> 00:55:26,160
such as frequent connections that may not match regular web browsing

723
00:55:26,160 --> 00:55:30,800
behavior. The results might show a list of allowed

724
00:55:30,800 --> 00:55:35,200
traffic entries but it's the frequency and context of these connections that

725
00:55:35,200 --> 00:55:39,600
will help determine whether the activity is legitimate or

726
00:55:39,600 --> 00:55:44,640
indicative of an exploit attempt. By regularly reviewing those patterns

727
00:55:44,720 --> 00:55:48,720
network administrators can spot irregularities and respond.

728
00:55:48,720 --> 00:55:52,000
To detect brute force attacks you can search for multiple

729
00:55:52,000 --> 00:55:59,440
login failures, as we saw in Splunk, with a query like source_ip AND action:deny

730
00:55:59,440 --> 00:56:04,560
| stats count by source_ip. This query helps identify repeated

731
00:56:04,560 --> 00:56:08,640
denied login attempts from a specific IP address, a common sign of a

732
00:56:08,640 --> 00:56:11,840
brute force attack. The results might show that the IP

733
00:56:11,840 --> 00:56:15,280
address has made a high number of failed login attempts

734
00:56:15,280 --> 00:56:19,600
indicating an automated attack or a security misconfiguration that requires

735
00:56:19,600 --> 00:56:23,360
attention. Such queries are particularly useful

736
00:56:23,360 --> 00:56:27,520
for spotting high-risk activities and of course

737
00:56:27,520 --> 00:56:31,600
the ELK Stack can be customized

738
00:56:31,600 --> 00:56:38,400
to use the best queries that are possible for the infrastructure.
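
The counting step in that last query, grouping denied events by source IP, can be sketched in plain Python. The pipe-and-stats form as spoken resembles Splunk's syntax; in Elasticsearch the grouping is typically expressed as an aggregation, so treat this as the logic only. The sample events, the threshold, and the function names denied_counts and suspects are assumptions.

```python
from collections import Counter

# Hypothetical parsed events using the field names from the queries above.
EVENTS = [
    {"source_ip": "203.0.113.5", "action": "deny"},
    {"source_ip": "203.0.113.5", "action": "deny"},
    {"source_ip": "203.0.113.5", "action": "deny"},
    {"source_ip": "192.0.2.44", "action": "deny"},
    {"source_ip": "192.0.2.44", "action": "allow"},
]

def denied_counts(events):
    """Count denied events per source IP."""
    return Counter(e["source_ip"] for e in events if e["action"] == "deny")

def suspects(events, threshold):
    """Return source IPs whose denied count meets the (assumed) threshold."""
    return sorted(ip for ip, n in denied_counts(events).items() if n >= threshold)

print(denied_counts(EVENTS).most_common())
print(suspects(EVENTS, threshold=3))
```

The threshold is a tuning choice: set it too low and normal mistyped passwords look like attacks, too high and a slow brute force attempt slips through.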

739
00:56:38,400 --> 00:56:43,520
As we said, Kibana is a powerful tool for visualizing the data stored in

740
00:56:43,520 --> 00:56:47,680
Elasticsearch in the ELK Stack, and it allows you to create

741
00:56:47,680 --> 00:56:51,360
interactive dashboards to track and analyze the events.

742
00:56:51,360 --> 00:56:56,240
By using Kibana, security teams can easily identify trends and anomalies

743
00:56:56,240 --> 00:57:00,000
and monitor key security metrics in real time.

744
00:57:00,000 --> 00:57:04,320
To begin you can open Kibana and navigate to the dashboard section.

745
00:57:04,320 --> 00:57:08,000
From there you can create a new dashboard specifically for your security

746
00:57:08,000 --> 00:57:12,800
event monitoring needs. Adding various visualizations to the dashboard

747
00:57:12,800 --> 00:57:16,160
will help you to track important firewall events and gain

748
00:57:16,160 --> 00:57:21,040
different insights into network activity. One of the first visualizations

749
00:57:21,040 --> 00:57:25,280
you might want to add is a bar chart for top

750
00:57:25,280 --> 00:57:29,840
denied IP addresses. This chart will show you the source IP

751
00:57:29,840 --> 00:57:32,560
addresses that have been most frequently

752
00:57:32,560 --> 00:57:37,760
denied access by the firewall. To configure this, you will aggregate

753
00:57:37,760 --> 00:57:43,360
data by the source IP field and display the count of deny actions.

754
00:57:43,360 --> 00:57:47,760
This will give you a quick view of which IP addresses are attempting access

755
00:57:47,760 --> 00:57:52,160
and being blocked most often, potentially highlighting sources of

756
00:57:52,160 --> 00:57:55,680
suspicious activity. Another useful visualization is the

757
00:57:55,680 --> 00:57:59,920
pie chart for action breakdown. This will break down the actions taken

758
00:57:59,920 --> 00:58:05,040
by the firewall such as allow versus deny. The pie chart can be configured

759
00:58:05,040 --> 00:58:09,680
to aggregate data by the action field, providing an easy-to-read representation

760
00:58:09,680 --> 00:58:13,680
of the distribution of traffic types. This visualization is particularly

761
00:58:13,680 --> 00:58:17,360
useful for understanding the general behavior of the network, or the

762
00:58:17,360 --> 00:58:21,680
firewall activity for example. Finally you can create a line graph

763
00:58:21,680 --> 00:58:26,320
for daily denied access. This visualization will allow you to

764
00:58:26,320 --> 00:58:29,520
track the number of denied connections per day

765
00:58:29,520 --> 00:58:32,400
which can help you to spot trends over time.

766
00:58:32,480 --> 00:58:37,760
Configuring this line graph with a time range of 24 hours, or five or

767
00:58:37,760 --> 00:58:42,560
seven days will allow you to monitor if there are any spikes in denied traffic

768
00:58:42,560 --> 00:58:46,080
that might indicate ongoing or recurring attempts to breach

769
00:58:46,080 --> 00:58:49,760
the firewall or the systems.
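
The numbers behind the pie chart and the line graph can be sketched in Python as two small aggregations. The sample events, the timestamp format, and the function names action_breakdown and denied_per_day are assumptions for illustration; in Kibana the aggregations are configured in the visualization editor, not written by hand like this.

```python
from collections import Counter

# Hypothetical parsed firewall events with ISO-style timestamps.
EVENTS = [
    {"timestamp": "2024-01-01T09:15:00", "action": "deny"},
    {"timestamp": "2024-01-01T17:30:00", "action": "allow"},
    {"timestamp": "2024-01-02T08:00:00", "action": "deny"},
    {"timestamp": "2024-01-02T08:05:00", "action": "deny"},
]

def action_breakdown(events):
    """Share of each action: the slices of the pie chart."""
    counts = Counter(e["action"] for e in events)
    total = sum(counts.values())
    return {action: n / total for action, n in counts.items()}

def denied_per_day(events):
    """Denied connections per calendar day: the points of the line graph."""
    return Counter(e["timestamp"][:10] for e in events if e["action"] == "deny")

print(action_breakdown(EVENTS))
print(sorted(denied_per_day(EVENTS).items()))
```

A day whose denied count sits well above the others is the spike you would look for on the line graph.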